Frontend Masters Boost RSS Feed

Drizzle Database Migrations

Adam Rackis — Mon, 09 Dec 2024 15:23:12 +0000

Drizzle ORM is an incredibly impressive object-relational mapper (ORM). Like traditional ORMs, it offers a domain-specific language (DSL) for querying entire object graphs. Imagine grabbing some “tasks”, along with “comments” on those tasks from your database. But unlike traditional ORMs, it also exposes SQL itself via a thin, strongly typed API. This allows you to write complex queries using things like MERGE, UNION, CTEs, and so on, but in a strongly typed API that looks incredibly similar to the SQL you already know (and hopefully love).

I wrote about Drizzle previously. That post focused exclusively on the typed SQL API. This post will look at another drizzle feature: database migrations. Not only will Drizzle allow you to query your database via a strongly typed API, but it will also keep your object model and database in sync. Let’s get started!

Our Database

Drizzle supports Postgres, MySQL, and SQLite. For this post we’ll be using Postgres, but the idea is the same for all of them. If you’d like to follow along at home, I urge you to use Docker to spin up a Postgres database (or MySQL, if that’s your preference). If you’re completely new to Docker, it’s not terribly hard to get it installed. Once you have it installed, run this:

docker container run -e POSTGRES_USER=docker -e POSTGRES_PASSWORD=docker -p 5432:5432 postgres:17.2-alpine3.20

That should get you a Postgres instance up and running that you can connect to on localhost, with a username and password of docker / docker. When you stop that process, your database will vanish into the ether. Restarting that same process will create a brand new Postgres instance with a completely clean slate, making this especially convenient for the type of testing we’re about to do: database migrations.

Incidentally, if you’d like to run a database that actually persists its data on your machine, you can mount a volume.

docker container run -e POSTGRES_USER=docker -e POSTGRES_PASSWORD=docker -p 5432:5432 -v /Users/arackis/Documents/pg-data:/var/lib/postgresql/data postgres:17.2-alpine3.20

That does the same thing, while telling Docker to alias the directory in its image of /var/lib/postgresql/data (where Postgres stores its data) onto the directory on your laptop at /Users/arackis/Documents/pg-data. Adjust the latter path as desired. (The other path isn’t up for debate, as that’s what Postgres uses.)

Setting Up

We’ll get an empty app up (npm init is all we need), and then install a few things.

npm i drizzle-orm drizzle-kit pg

The drizzle-orm package is the main ORM that handles querying your database. The drizzle-kit package is what handles database migrations, which will be particularly relevant for this post. Lastly, the pg package is the Node Postgres drivers.

Configuring Drizzle

Let’s start by adding a drizzle.config.ts to the root of our project.

import { defineConfig } from "drizzle-kit";

export default defineConfig({
  dialect: "postgresql",
  out: "./drizzle-schema",
  dbCredentials: {
    database: "jira",
    host: "localhost",
    port: 5432,
    user: "docker",
    password: "docker",
    ssl: false,
  },
});

We tell Drizzle what kind of database we’re using (Postgres), where to put the generated schema code (the drizzle-schema folder), and then the database connection info.

Wanna see more real-world use cases of Drizzle in action? Check out Scott Moss’ course Intermediate Next.js which uses it and gets into it when the project gets into data fetching needs.

The Database First Approach

Say we already have a database and want to generate a Drizzle schema from it. (If you want to go in the opposite direction, stay tuned.)

To create our initial database, I’ve put together a script, which I’ll put here in its entirety.

CREATE DATABASE jira;

\c jira

CREATE TABLE users (
    id SERIAL PRIMARY KEY,
    username VARCHAR(50),
    name VARCHAR(250),
    avatar VARCHAR(500)
);

CREATE TABLE tasks (
    id SERIAL PRIMARY KEY,
    name VARCHAR(250),
    epic_id INT,
    user_id INT
);

CREATE TABLE epics (
    id SERIAL PRIMARY KEY,
    name VARCHAR(250),
    description TEXT,
    due DATE
);

CREATE TABLE tags (
    id SERIAL PRIMARY KEY,
    name VARCHAR(250)
);

CREATE TABLE tasks_tags (
    id SERIAL PRIMARY KEY,
    task INT,
    tag INT
);

ALTER TABLE tasks
    ADD CONSTRAINT fk_task_user
    FOREIGN KEY (user_id)
    REFERENCES users (id);

ALTER TABLE tasks
    ADD CONSTRAINT fk_task_epic
    FOREIGN KEY (epic_id)
    REFERENCES epics (id);

ALTER TABLE tasks_tags
    ADD CONSTRAINT fk_tasks_tags_tag
    FOREIGN KEY (tag)
    REFERENCES tags (id);

ALTER TABLE tasks_tags
    ADD CONSTRAINT fk_tasks_tags_task
    FOREIGN KEY (task)
    REFERENCES tasks (id);

This will construct a basic database for an hypothetical Jira clone. We have tables for users, epics, tasks and tags, along with various foreign keys connecting them. Assuming you have psql installed (can be installed via libpq), you can execute that script from the command line like this:

PGPASSWORD=docker psql -h localhost -p 5432 -U docker -f database-creation-script.sql

Now run this command:

npx drizzle-kit pull

This tells Drizzle to look at our database and generate a schema from it.

Files generated

Inside the drizzle-schema folder there’s now a schema.ts file with our Drizzle schema. Here’s a small sample of it.

import { pgTable, serial, varchar, foreignKey, integer } from "drizzle-orm/pg-core";
import { sql } from "drizzle-orm";

export const users = pgTable("users", {
  id: serial().primaryKey().notNull(),
  username: varchar({ length: 50 }),
  name: varchar({ length: 250 }),
  avatar: varchar({ length: 500 }),
});

export const tasks = pgTable(
  "tasks",
  {
    id: serial().primaryKey().notNull(),
    name: varchar({ length: 250 }),
    epicId: integer("epic_id"),
    userId: integer("user_id"),
  },
  table => {
    return {
      fkTaskUser: foreignKey({
        columns: [table.userId],
        foreignColumns: [users.id],
        name: "fk_task_user",
      }),
      fkTaskEpic: foreignKey({
        columns: [table.epicId],
        foreignColumns: [epics.id],
        name: "fk_task_epic",
      }),
    };
  }
);

The users entity is a table with some columns. The tasks entity is a bit more interesting. It’s also a table with some columns, but we can also see some foreign keys being defined.

In Postgres, foreign keys merely create a constraint that’s checked on inserts and updates to verify that a valid value is set, corresponding to a row in the target table. But it has no effect on application code, so you might wonder why Drizzle saw fit to bother creating it. Essentially, Drizzle will allow us to subsequently modify our schema in code, and generate an SQL file that will make equivalent changes in the database. For this to work, Drizzle needs to be aware of things like foreign keys, indexes, etc, so the schema in code, and the database are always truly in sync, and Drizzle knows what’s missing, and needs to be created.

Relations

The other file Drizzle created is relations.ts. Here’s a bit of it:

import { relations } from "drizzle-orm/relations";

export const tasksRelations = relations(tasks, ({ one, many }) => ({
  user: one(users, {
    fields: [tasks.userId],
    references: [users.id],
  }),
  epic: one(epics, {
    fields: [tasks.epicId],
    references: [epics.id],
  }),
  tasksTags: many(tasksTags),
}));

export const usersRelations = relations(users, ({ many }) => ({
  tasks: many(tasks),
}));

This defines the relationships between tables (and is closely related to foreign keys). If you choose to use the Drizzle query API (the one that’s not SQL with types), Drizzle is capable of understanding that some tables have foreign keys into other tables, and allows you to pull down objects, with related objects in one fell swoop. For example, the tasks table has a user_id column in it, representing the user it’s assigned to. With the relationship set up, we can write queries like this:

const tasks = await db.query.tasks.findMany({
  with: {
    user: true,
  },
});

This will pull down all tasks, along with the user each is assigned to.

Making Changes (Migrations)

With the code generation above, we’d now be capable of using Drizzle. But that’s not what this post is about. See my last post on Drizzle, or even just the Drizzle docs for guides on using it. This post is all about database migrations. So far, we took an existing database, and scaffolded a valid Drizzle schema. Now let’s run a script to add some things to the database, and see about updating our Drizzle schema.

We’ll add a new column to tasks called importance, and we’ll also add an index on the tasks table, on the epic_id column. This is unrelated to the foreign key we already have on this column. This is a traditional database index that would assist us in querying the tasks table on the epic_id column.

Here’s the SQL script we’ll run:

CREATE INDEX idx_tasks_epic ON tasks (epic_id);

ALTER TABLE tasks
    ADD COLUMN importance INT;

After running that script on our database, we’ll now run:

npx drizzle-kit pull

Our terminal should look like this:

We can now see our schema updates in the git diffs:

Note the new columns being added, and the new index being created. Again, the index will not affect our application code; it will make our Drizzle schema a faithful representation of our database, so we can make changes on either side, and generate updates to the other. To that end, let’s see about updating our code, and generating SQL to match those changes.

The Code First Approach

Let’s go the other way. Let’s start with a Drizzle schema, and generate an SQL script from it. In order to get a Drizzle schema, let’s just cheat and grab the schema.ts and relations.ts files Drizzle created above. We’ll paste them into the drizzle-schema folder, and remove anything else Drizzle created: any snapshots, and anything in the meta folder Drizzle uses to track our history.

Next, since we want Drizzle to read our schema files, rather than just generate them, we need to tell Drizzle where they are. We’ll go back into our drizzle.config.ts file, and add this line:

schema: ["./drizzle-schema/schema.ts", "./drizzle-schema/relations.ts"],

Now run:

npx drizzle-kit generate

Voila! We have database assets being created.

The resulting sql file is huge. Mine is named 0000_quick_wild_pack.sql (Drizzle will add these silly names to make the files stand out) and looks like this, in part.

CREATE TABLE IF NOT EXISTS "epics" (
	"id" serial PRIMARY KEY NOT NULL,
	"name" varchar(250),
	"description" text,
	"due" date
);
--> statement-breakpoint
CREATE TABLE IF NOT EXISTS "tags" (
	"id" serial PRIMARY KEY NOT NULL,
	"name" varchar(250)
);
--> statement-breakpoint
CREATE TABLE IF NOT EXISTS "tasks" (
	"id" serial PRIMARY KEY NOT NULL,
	"name" varchar(250),
	"epic_id" integer,
	"user_id" integer
);

Making a schema change

Now let’s make some changes to our schema. Let’s add that same importance column to our tasks table, add that same index on epicId, and then, for fun, let’s tell Drizzle that our foreign key on userId should have an ON DELETE CASCADE rule, meaning that if we delete a user, the database will automatically delete all tasks assigned to that user. This would probably be an awful rule to add to a real issue tracking software, but it’ll help us see Drizzle in action.

Here are the changes:

And now we’ll run npx drizzle-kit generate and you should see:

As before, Drizzle generated a new sql file, this time called 0001_curved_warhawk.sql which looks like this:

ALTER TABLE "tasks" DROP CONSTRAINT "fk_task_user";
--> statement-breakpoint
ALTER TABLE "users" ADD COLUMN "importance" integer;--> statement-breakpoint
DO $$ BEGIN
 ALTER TABLE "tasks" ADD CONSTRAINT "fk_task_user" FOREIGN KEY ("user_id") REFERENCES "public"."users"("id") ON DELETE cascade ON UPDATE no action;
EXCEPTION
 WHEN duplicate_object THEN null;
END $$;
--> statement-breakpoint
CREATE INDEX IF NOT EXISTS "idx_tasks_epicId" ON "tasks" USING btree ("epic_id");

It added a column, overwrote the foreign key constraint we already had to add our CASCADE rule, and created our index on epic_id.

Mixing & Matching Approaches

Make no mistake, you do not have to go all in on code-first, or database-first. You can mix and match approaches. You can scaffold a Drizzle schema from a pre-existing database using drizzle-kit pull, and then make changes to the code, and generate sql files to patch your database with the changes using drizzle-kit generate. Try it and see!

Going Further

Believe it or not, we’re only scratching the surface of what drizzle-kit can do. If you like what you’ve seen so far, be sure to check out the docs.

Concluding Thoughts

Drizzle is an incredibly exciting ORM. Not only does it manage to add an impressive layer of static typing on top of SQL, allowing you to enjoy the power and flexibility of SQL with the type safety you already expect from TypeScript. But it also provides an impressive suite of commands for syncing your changing database with your ORM schema.

Introducing Drizzle

Adam Rackis — Mon, 17 Jun 2024 11:28:41 +0000

This is a post about an exciting new ORM tool (that’s “object relational mapper”) that is different than any ORM I’ve used before—and I’ve used quite a few! Spoiler: it’s Drizzle.

Wait, what’s an ORM?

Even if you’re using a non-relational database now (think MongoDB or Redis), sooner or later you’ll likely need a relational DB.

Knowing SQL is an essential skill for any software engineer, but writing SQL directly can be tricky! The tooling is usually primitive, with only minimal auto-complete to guide you, and you invariably go through a process of running your query, correcting errors, and repeating until you get it right.

ORMs try to help you with this process of crafting SQL. Typically, you tell the ORM about the shape of your DB and it exposes APIs to do typical things. If you have a books table in your DB, an ORM will give you an API for it where you can do stuff like:

const longBooks = books.find({ pages: { gt: 500 } });

Behind the scenes, the SQL created might be something like:

SELECT * FROM books WHERE pages > 500;

That may not look like a massive simplification, but as the parameters get more complex or strung together, the SQL can get a bit mindbending. Not to mention the ORM keeps that code in a language you’re likely already using, like JavaScript.

This ease of use may seem nice, but it can cause other problems. For example, you might struggle figuring out how to do non-trivial queries. And there are performance foot-guns, such as the infamous Select N + 1 problem, which you might cause without realizing it due to the abstracted away syntax.

Why Drizzle is Different

Drizzle takes a novel approach. Drizzle does provide you a traditional ORM querying API, like we saw above. But in addition to that, it also provides an API that is essentially a layer of typing on top of SQL itself. So rather than what we saw before, we might query our books table like this

const longBooks = await db
  .select()
  .from(books)
  .where(gt(books.pages, 500));

It’s more lines, but it’s closer to actual SQL, which provides us some nice benefits: it’s easier to learn, more flexible, and avoids traditional ORM footguns.

Let’s dive in and look closer. This post will take a brief overview of setting up Drizzle, and querying, and then do a deeper dive showing off some of its powerful abilities with this typed SQL querying API. The docs are here if you’d like to look closer at anything.

Using Drizzle in general, and some of the advanced things we’ll cover in this post requires a decent knowledge of SQL. If you’ve never, ever used SQL, you might struggle with a few of the things we discuss later on. That’s expected. Skim and jump over sections as needed. If nothing else, hopefully this post will motivate you to look at SQL.

Setting up the Schema

Drizzle can’t do much of anything if it doesn’t know about your database. There’s lots of utilities for showing Drizzle the structure (or schema) of your tables. We’ll take a very brief look, but a more complete example can be found here.

Drizzle supports Postgres, MySQL, and SQLite. The ideas are the same either way, but we’ll be using MySQL.

Let’s start to set up a table.

import { int, json, mysqlTable, varchar } from "drizzle-orm/mysql-core";

export const books = mysqlTable("books", {
  id: int("id").primaryKey().autoincrement(),
  userId: varchar("userId", { length: 50 }).notNull(),
  isbn: varchar("isbn", { length: 25 }),
  pages: int("pages"),
});

We tell Drizzle about our columns (we won’t show all of them here), and their data types.

Now we can run queries:

const result = await db
  .select()
  .from(books)
  .orderBy(desc(books.id))
  .limit(1);

This query returns an array of items which match the schema we provided Drizzle for this table.

Alternatively, as expected, we can also narrow our select list.

const result = await db
  .select({ id: books.id, isbn: books.isbn })
  .from(books)
  .orderBy(desc(books.id))
  .limit(1);

Note that the types of the columns match whatever we define in the schema. We won’t go over every possible column type (check the docs), but let’s briefly look at the JSON type:

export const books = mysqlTable("books", {
  id: int("id").primaryKey().autoincrement(),
  userId: varchar("userId", { length: 50 }).notNull(),
  isbn: varchar("isbn", { length: 25 }),
  pages: int("pages"),
  authors: json("authors"),
});

This adds an authors field to each book. But the type might not be what you want. Right now it’s unknown. This makes sense: JSON can have just about any structure. Fortunately, if you know your json column will have a predictable shape, you can specify it, like this:

export const books = mysqlTable("books", {
  id: int("id").primaryKey().autoincrement(),
  userId: varchar("userId", { length: 50 }).notNull(),
  isbn: varchar("isbn", { length: 25 }),
  pages: int("pages"),
  authors: json("authors").$type(),
});

And now, when we check, the authors property is of type string[] | null.

If you were to mark the authors column as notNull() it would be typed as string[]. As you might expect, you can pass any type you’d like into the $type helper.

Query Whirlwind Tour

Let’s run a non-trivial, but still basic query to see what Drizzle looks like in practice. Let’s say we’re looking to find some nice beach reading for the summer. We want to find books that belong to you (userId == “123”), and is either less than 150 pages, or was written by Stephan Jay Gould. We want the first ten, and we want them sort from most recently added to least recently added (the id key is auto-numbered, so we can sort on that for the same effect)

In SQL we’d do something like this:

SELECT *
FROM books
WHERE userId = '123' AND (pages < 150 OR authors LIKE '%Stephen Jay Gould%')
ORDER BY id desc
LIMIT 10

With Drizzle we’d write this:

const result = await db
  .select()
  .from(books)
  .where(
    and(
      eq(books.userId, userId),
      or(lt(books.pages, 150), like(books.authors, "%Stephen Jay Gould%"))
    )
  )
  .orderBy(desc(books.id))
  .limit(10);

Which works!

[
  {
    "id": 1088,
    "userId": "123",
    "authors": ["Siry, Steven E"],
    "title": "Greene: Revolutionary General (Military Profiles)",
    "isbn": "9781574889130",
    "pages": 144
  },
  {
    "id": 828,
    "userId": "123",
    "authors": ["Morton J. Horwitz"],
    "title": "The Warren Court and the Pursuit of Justice",
    "isbn": "0809016257",
    "pages": 144
  },
  {
    "id": 506,
    "userId": "123",
    "authors": ["Stephen Jay Gould"],
    "title": "Bully for Brontosaurus: Reflections in Natural History",
    "isbn": "039330857X",
    "pages": 544
  },
  {
    "id": 412,
    "userId": "123",
    "authors": ["Stephen Jay Gould"],
    "title": "The Flamingo's Smile: Reflections in Natural History",
    "isbn": "0393303756",
    "pages": 480
  },
  {
    "id": 356,
    "userId": "123",
    "authors": ["Stephen Jay Gould"],
    "title": "Hen's Teeth and Horse's Toes: Further Reflections in Natural History",
    "isbn": "0393311031",
    "pages": 416
  },
  {
    "id": 319,
    "userId": "123",
    "authors": ["Robert J. Schneller"],
    "title": "Cushing: Civil War SEAL (Military Profiles)",
    "isbn": "1574886967",
    "pages": 128
  }
]

The Drizzle version was actually a little bit longer. But we’re not optimizing for fewest possible lines of code. The Drizzle version is typed, with autocomplete to guide you toward a valid query, and TypeScript to warn you when you miss. The query is also a lot more composable. What do I mean by that?

Composability: Putting Queries Together

Let’s write something slightly more advanced and slightly more realistic. Let’s code up a function that takes any number of search filters, and puts together a query. Here’s what the filters look like

type SearchPacket = Partial<{
  title: string;
  author: string;
  maxPages: number;
  subjects?: number[];
}>;

Note the Partial type. We’re taking in any number of these filters—possibly none of them. Whichever filters are passed, we want them to be additive; we want them combined with and. We’ve seen and already, and it can take the result of calls to eq, lt, and lots of others. We’ll need to create an array of all of these filters, and Drizzle gives us a parent type that can hold any of them: SQLWrapper.

Let’s get started.

async function searchBooks(args: SearchPacket) {
  const searchConditions: SQLWrapper[] = [];
}

We’ve got our array of filters. Now let’s start filling it up.

async function searchBooks(args: SearchPacket) {
  const searchConditions: SQLWrapper[] = [];
  if (args.title) {
    searchConditions.push(like(books.title, `%${args.title}%`));
  }
}

Nothing new, yet. This is the same filter we saw before with authors.

Speaking of authors, let’s add that query next. But let’s make the author check a little more realistic. It’s not a varchar column, it holds JSON values, which themselves are strings of arrays. MySQL gives us a way to search JSON: the ->> operator. This takes a JSON column, and evaluates a path on it. So if you had objects in there, you’d pass string paths to get properties out. We just have an array of strings, so our path is $, which is the actual values in the array. And the string comparrisons when we’re filtering on JSON columns like this is no longer case insensitive, so we’ll want to use the LOWER function in MySQL.

Typically, with traditional ORM’s you’d scramble to the docs to look for an equivalent to the ->> operator, as well as the LOWER function. Drizzle does something better, and gives us a nice escape hatch to just write SQL directly in situations like this. Let’s implement our authors filter.

async function searchBooks(args: SearchPacket) {
  const searchConditions: SQLWrapper[] = [];
  if (args.title) {
    searchConditions.push(like(books.title, `%${args.title}%`));
  }
  if (args.author) {
    searchConditions.push(
      sql`LOWER(${books.authors}->>"$") LIKE ${`%${args.author.toLowerCase()}%`}`
    );
  }
}

Note the sql tagged template literal. It lets us put arbitrary SQL in for one-off operations that may not be implemented in the ORM. Before moving on, let’s take a quick peak at the SQL generated by this:

{
  "sql": "select `id`, `userId`, `authors`, `title`, `isbn`, `pages` from `books` where (`books`.`userId` = ? and LOWER(`books`.`authors`->>\"$\") LIKE ?) order by `books`.`id` desc limit ?",
  "params": ["123", "%gould%", 10]
}

Let’s zoom in on the authors piece. What we entered as…

sql`LOWER(${books.authors}->>"$") LIKE ${`%${args.author.toLowerCase()}%`}`

… gets transformed as:

LOWER(`books`.`authors`->>"$") LIKE ?

Our search term was parameterized; however, Drizzle was smart enough to not parameterize our column. I’m continuously impressed by small details like this. The maxPages piece is the same as before

if (args.maxPages) {
  searchConditions.push(lte(books.pages, args.maxPages));
}

Nothing new or interesting. Now let’s look at the subjects filter. We can pass in an array of subject ids, and we want to filter books that have that subject. The relationship between books and subjects is stored in a separate table, booksSubjects. This table simply has rows with an id, a book id, and a subject id (and also the userId for that book, to make other queries easier).

So if book 12 has subject 34, there’ll be a row with bookId of 12, and subjectId of 34.

In SQL when we want to see if a given row exists in some table, we use the exists keyword, and Drizzle has an exists function for this very purpose. Let’s move on with our function

async function searchBooks(args: SearchPacket) {
  const searchConditions: SQLWrapper[] = [];
  if (args.title) {
    searchConditions.push(like(books.title, `%${args.title}%`));
  }
  if (args.author) {
    searchConditions.push(
      sql`LOWER(${books.authors}->>"$") LIKE ${`%${args.author.toLowerCase()}%`}`,
    );
  }
  if (args.maxPages) {
    searchConditions.push(lte(books.pages, args.maxPages));
  }
  if (args.subjects?.length) {
    searchConditions.push(
      exists(
        db
          .select({ _: sql`1` })
          .from(booksSubjects)
          .where(
            and(
              eq(books.id, booksSubjects.book),
              inArray(booksSubjects.subject, args.subjects),
            ),
          ),
      ),
    );
  }

We can pass an exists() call right into our list of filters, just like with real SQL. This bit:

_: sql`1`

… is curious, but that’s just us saying SELECT 1 which is a common way of putting something into a SELECT list, even though we’re not pulling back any data; we’re just checking for existence. Lastly, the inArray Drizzle helper is how we generate an IN query. Here’s what the generated SQL looks like for this subjects query:

select `id`, `userId`, `authors`, `title`, `isbn`, `pages`
from `books`
where (`books`.`userId` = ? and exists (select 1
                                        from `books_subjects`
                                        where (`books`.`id` = `books_subjects`.`book` and
                                               `books_subjects`.`subject` in (?, ?))))
order by `books`.`id` desc
limit ?

That was our last filter. Now we can pipe our filters in to execute the query we put together.

async function searchBooks(args: SearchPacket) {
  const searchConditions: SQLWrapper[] = [];
  if (args.title) {
    searchConditions.push(like(books.title, `%${args.title}%`));
  }
  if (args.author) {
    searchConditions.push(
      sql`LOWER(${books.authors}->>"$") LIKE ${`%${args.author.toLowerCase()}%`}`
    );
  }
  if (args.maxPages) {
    searchConditions.push(lte(books.pages, args.maxPages));
  }
  if (args.subjects?.length) {
    searchConditions.push(
      exists(
        db
          .select({ _: sql`1` })
          .from(booksSubjects)
          .where(
            and(
              eq(books.id, booksSubjects.book),
              inArray(booksSubjects.subject, args.subjects)
            )
          )
      )
    );
  }

  const result = await db
    .select()
    .from(books)
    .where(and(eq(books.userId, userId), ...searchConditions))
    .orderBy(desc(books.id))
    .limit(10);
}

The ability to treat SQL queries as typed function calls that can be combined arbitratily is what makes Drizzle shine.

Digging deeper

We could end the post here, but let’s go further and see how Drizzle handles something fairly complex. You might never need (or want to) write queries like this. My purpose in including this section is to show that you can, if you ever need to.

With that out of the way, let’s write a query to get aggregate info about our books. We want our most and least popular subject(s), and how many books we have with those subjects. We also want to know any unused subjects, as well as that same info about tags (which we haven’t talked about). And also the total number of books we have overall. This data might be displayed in a screen like this.

Screenshot

To keep this section manageable we’ll just get the book counts and the most and least subjects. The other pieces are variations on that theme. You can see the finished product here.

Let’s look at some of the SQL for this and how to write it with Drizzle.

Number of books per subject

In SQL we can group things together with GROUP BY.

SELECT
    subject,
    count(*)
FROM books_subjects
GROUP BY subject

Now our SELECT list, rather than pulling items from a table, is now pulling from a (conceptual) lookup table. We (conceptually) have a bunch of buckets stored by subject id. So we can select those subject id’s, as well as aggregate info from the buckets themselves, which we do with the count(*). This selects each subject, and the number of books under that subject.

And it works:

But we want the most, and least popular subjects. SQL also has what are called window functions. We can, on the fly, sort these buckets in some order, and then ask questions about the data, sorted in that way. We basically want the subject(s) with the highest, or lowest number of books, including ties. It turns out RANK is exactly what we want. Let’s see how this works

SELECT
    subject,
    count(*) as count,
    RANK() OVER (ORDER BY count(*) DESC) MaxSubject,
    RANK() OVER (ORDER BY count(*) ASC) MinSubject
FROM books_subjects
WHERE userId = '123'
GROUP BY subject

We ask for the rank of each row, when the whole result set is sorted in whatever way we describe.

Subjects 79, 137 and 150 all have a minSubject rank of 1, which means they are the least used subject, which makes sense since there’s only one book with that subject.

It’s a little mind bendy at first, so don’t worry if this looks a little weird. The point is to show how well Drizzle can simplify SQL for us, not to be a deep dive into SQL, so let’s move on.

We want the subjects with a MaxSubject of 1, or a MinSubject of 1. We can’t use WHERE for this, at least not directly. The solution in SQL is to turn this query into a virtual table, and query that. It looks like this:

SELECT
    t.subject id,
    CASE WHEN t.MinSubject = 1 THEN 'MinSubject' ELSE 'MaxSubject' END as label,
    t.count
FROM (
    SELECT
        subject,
        count(*) as count,
        RANK() OVER (ORDER BY count(*) DESC) MaxSubject,
        RANK() OVER (ORDER BY count(*) ASC) MinSubject
    FROM books_subjects
    WHERE userId = '123'
    GROUP BY subject
) t
WHERE t.MaxSubject = 1 OR t.MinSubject = 1

And it works.

Moving this along

We won’t show tags, since it’s basically identical except we hit a books_tags table, instead of books_subjects. We also won’t show unused subjects (or tags), which is also very similar, except we use a NOT EXISTS query.

The query to get the total number of books looks like this:

SELECT count(*) as count
FROM books
WHERE userId = '123'

Let’s add some columns to get it in the same structure as our subjects queries:

SELECT
    0 id,
    'Books Count' as label,
    count(*) as count
FROM books
WHERE userId = '123'

Now, since these queries return the same structure, let’s combine them into one big query. We use UNION for this.

SELECT *
FROM (
    SELECT
        t.subject id,
        CASE WHEN t.MinSubject = 1 THEN 'MinSubject' ELSE 'MaxSubject' END as label,
        t.count
    FROM (
        SELECT
            subject,
            count(*) as count,
            RANK() OVER (ORDER BY count(*) DESC) MaxSubject,
            RANK() OVER (ORDER BY count(*) ASC) MinSubject
        FROM books_subjects
        GROUP BY subject
    ) t
    WHERE t.MaxSubject = 1 OR t.MinSubject = 1
) subjects
UNION
    SELECT
        0 id,
        'Books Count' as label,
        count(*) as count
    FROM books
    WHERE userId = '123';

And it works! Phew!

But this is gross to write manually, and even grosser to maintain. There’s a lot of pieces here and there’s no (good) way to break this apart and manage separately. SQL is ultimately text, and you can, of course, generate these various pieces of text with different functions in your code, and then concatenate them together.

But that’s fraught with difficulty too. It’s easy to get small details wrong when you’re pasting strings of code together. And believe it or not, this query is much simpler than much of what I’ve seen.

The Drizzle Way

Let’s see what this looks like in Drizzle. Remember that initial query to get each subject, with its count, and rank? Here it is in Drizzle

const subjectCountRank = () =>
  db
    .select({
      subject: booksSubjects.subject,
      count: sql`COUNT(*)`.as("count"),
      rankMin: sql`RANK() OVER (ORDER BY COUNT(*) ASC)`.as("rankMin"),
      rankMax: sql`RANK() OVER (ORDER BY COUNT(*) DESC)`.as("rankMax"),
    })
    .from(booksSubjects)
    .where(eq(booksSubjects.userId, userId))
    .groupBy(booksSubjects.subject)
    .as("t");

Drizzle supports grouping, and it even has an as function to alias a query, and enable it to be queried from. Let’s do that next.

const subjectsQuery = () => {
  const subQuery = subjectCountRank();

  return db
    .select({
      label:
        sql`CASE WHEN t.rankMin = 1 THEN 'MIN Subjects' ELSE 'MAX Subjects' END`.as(
          "label"
        ),
      count: subQuery.count,
      id: subQuery.subject,
    })
    .from(subQuery)
    .where(or(eq(subQuery.rankMin, 1), eq(subQuery.rankMax, 1)));
};

We stuck our query to get the ranks in a function, and then we just called that function, and queried from its result. SQL is feeling a lot more like normal coding!

The query for the total book count is simple enough.

db
  .select({ label: sql`'All books'`, count: sql`COUNT(*)`, id: sql`0` })
  .from(books)
  .where(eq(books.userId, userId)),

Hopefully we won’t be too surprised to learn that Drizzle has a union function, to union queries together. Let’s see it all together:

const dataQuery = union(
  db
    .select({
      label: sql`'All books'`,
      count: sql`COUNT(*)`,
      id: sql`0`,
    })
    .from(books)
    .where(eq(books.userId, userId)),
  subjectsQuery()
);

Which generates this SQL for us:

(select 'All books', COUNT(*), 0 from `books` where `books`.`userId` = ?)
union
(select CASE WHEN t.rankMin = 1 THEN 'MIN Subjects' ELSE 'MAX Subjects' END as `label`, `count`, `subject`
 from (select `subject`,
              COUNT(*)                             as `count`,
              RANK() OVER (ORDER BY COUNT(*) ASC)  as `rankMin`,
              RANK() OVER (ORDER BY COUNT(*) DESC) as `rankMax`
       from `books_subjects`
       where `books_subjects`.`userId` = ?
       group by `books_subjects`.`subject`) `t`
 where (`rankMin` = ? or `rankMax` = ?))

Basically the same thing we did before, but with a few more parens, plus some userId filtering I left off for clarity.

I left off the tags queries, and the unused subjects/tags queries, but if you’re curious what they look like, the code is all here and the final union looks like this:

const dataQuery = union(
  db
    .select({
      label: sql`'All books'`,
      count: sql`COUNT(*)`,
      id: sql`0`,
    })
    .from(books)
    .where(eq(books.userId, userId)),
  subjectsQuery(),
  unusedSubjectsQuery(),
  tagsQuery(),
  unusedTagsQuery()
);

Just more function calls thrown into the union.

Flexibility

Some of you might wince seeing that many large queries all union‘d together. Those queries are actually run one after the other on the MySQL box. But, for this project it’s a small amount of data and there’s not multiple round trips over the network to do it. Our MySQL engine executes those queries one after the other.

But let’s say you decide you’re better off breaking that union apart, and sending N queries, with each piece, and putting it all together in application code. These queries are already separate function calls. It would be fairly easy to remove those calls from the union, and instead invoke them in isolation (and then modify your application code).

This kind of flexibility is what I love the most about Drizzle. Refactoring large, complex stored procedure has always been a pain with SQL. When you code it through Drizzle, it becomes much more like refactoring a typed programming language like TypeScript or C#.

Debugging Queries

Before we wrap up, let’s take a look at how easily Drizzle let’s you debug your queries. Let’s say the query from earlier didn’t return what we expected, and we want to see the actual SQL being run. We can do that by removing the await from the query, and then calling toSQL on the result.

import { and, desc, eq, like, lt, or } from "drizzle-orm";

const result = db
  .select()
  .from(books)
  .where(
    and(
      eq(books.userId, userId),
      or(lt(books.pages, 150), like(books.authors, "%Stephen Jay Gould%"))
    )
  )
  .orderBy(desc(books.id))
  .limit(10);

console.log(result.toSQL());

This displays the following:

{
  "sql": "select `id`, `userId`, `authors`, `title`, `isbn`, `pages` from `books` where (`books`.`userId` = ? and (`books`.`pages` < ? or `books`.`authors` like ?)) order by `books`.`id` desc limit ?",
  "params": ["123", 150, "%Stephen Jay Gould%", 10]
}

result.toSQL() returned an object, with a sql field with our query, and a params field with the parameters. As any ORM would, Drizzle parameterized our query, so fields with invalid characters wouldn’t break anything. You can now run this query directly against your database to see what went wrong.

Wrapping Up

I hope you’ve enjoyed this introduction to Drizzle. If you’re not afraid of a little SQL, it can make your life a lot easier.