Seeding your Database with Thousands of Users using Knex.js and Faker.js

Prototype more efficiently by loading your database with realistic data points.

Mario Hoyos
Bits and Pieces

Being able to quickly whip up prototypes and change them on the fly is part of what makes software so much fun to build.

Before your project has any users, however, it can be hard to see how your application will function with real data. Today, we will solve this problem by seeding a database with thousands of data points using Knex.js and Faker.js.

For the sake of staying on topic, this post assumes that you have a Node.js environment and a database already set up. You can download Node from nodejs.org, and if you need help setting up a database, I would highly recommend following a setup guide for your database before continuing.

Once you have that out of the way, strap yourself in and let’s get to it.

Part 1: Setting Up Knex

https://knexjs.org

There are many ways to go about interacting with a database, but that’s a discussion for another day. For this example, we will be using Knex.js, which calls itself “a batteries-included SQL query builder for Postgres, MSSQL, MySQL, MariaDB, SQLite3, Oracle, and Amazon Redshift designed to be flexible, portable, and fun to use.”

In particular, I will be using Knex with a Postgres database, but what makes Knex cool is that you can swap Postgres out with any flavor they support and your code should still work just the same.

The reason we are using Knex is that it abstracts away a lot of the complexity that comes with maintaining and querying a database. I am far from a database guru and the more help I can get, the happier I am.

Let’s go ahead and install Knex.

npm install --save knex

Because I have a Postgres database, I am going to install the “pg” client, which Knex will use under the hood. If you are using a different flavor of SQL, make sure to install the corresponding client.

npm install --save pg

Awesome! Now I want to set up a couple of directories that will be useful to us in the near future. I’m going to call mine “migrations” and “seeds”, and both sit at the root of the project.

These are the directories in which we will store the files which describe our migrations (which you can think of as small changes to a database schema which are reversible) and our seed files (which will allow us to populate our database tables with data).

Now let’s go ahead and create a file called “knexfile.js” in the root directory, where we will configure Knex. This file is important because it allows you to configure different databases to use in different environments, such as development and production.

For this example, we will use a development environment. Your knexfile.js should look something like the following:

Basically, this file says that if we are using our development environment, Knex should connect to the database you specified in the connection section, and look in the migrations and seeds directories we created earlier when we want to run either of those.

There are many more configuration options available, but I’ll let you refer to the docs for that if you would like more control!

Part 2: Create A Database Table

If you already have a table set up that you want to seed with data, feel free to skip to Part 3!

For this example, we are going to create a basic “users” table which we will then seed with thousands of fake users.

If you followed along through Part 1, then you should remember that we created a directory called “migrations”. Each migration file we make should contain instructions for making a change to our database schema as well as instructions for undoing that change. Knex provides us with a system to do just that.

To make our first migration file, run the following command (if Knex is only installed locally, as it is here, prefix the command with npx):

knex migrate:make initUsers --env development

What we are doing here is having Knex create a file in our “migrations” directory which should be called something like “20181029094559_initUsers.js” and should contain this:

Knex creates a file beginning with a timestamp so that when you run multiple migrations, it knows which order to execute them in. In the file, anything you put into “exports.up” can be thought of as the new changes you want, while “exports.down” should contain instructions for how to undo those changes.

Sometimes code is worth a thousand words, so let me show you how my initUsers migration looks.

We are creating a table called “users” where each user has an email, first name, and last name. This should be enough for us to get the basics of what we are trying to do.

Now that our migration file is ready to run, let’s create our “users” table! In the command line, execute the following:

knex migrate:latest --env development

If all went well, your database should now have a users table with the schema we defined above!

Now that we have a table, let’s learn how to stuff it full of users that we can use in our prototype!

Part 3: Seed Our Database With 1000 Users

https://github.com/Marak/faker.js

If you have been following along, you have now set up Knex to interface with your database and created a users table.

If you skipped Part 2, you will need to adapt this section to fit your current schema, which should be easy-peasy.

Let’s go ahead and have Knex whip us up a seed file.

knex seed:make addUsers --env development

If everything is set up correctly, you should now have an “addUsers.js” file in your seeds directory that looks like this:

In this file, you can programmatically add in whatever data you want to whatever table you want. The generated example shows how you could add three items to a table.

We could, of course, hard code 1000 user objects, making up names and emails as we go, but that’s not why we are here.

Enter Faker.js!

Faker is a super-cool library which will generate tons of real-looking fake data for cases like this. Let’s see how we can use this!

We know that we don’t want to hard-code each user object to insert into our database, so let’s create a function that will return a random user object for us! My seed file now looks like this:
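Here is a sketch of that function. It assumes the faker package is installed (for example with npm install --save faker@5) and that your columns are named email, firstName, and lastName, so match them to your own schema. If you use the community fork @faker-js/faker instead, some generators have different names (for example faker.person.firstName()).

```javascript
// seeds/addUsers.js (sketch; column names are assumptions)
// The generator is a parameter so it can be swapped or stubbed in tests;
// when omitted, the `faker` package is loaded on first use.
const createFakeUser = (faker = require('faker')) => ({
  email: faker.internet.email(),
  firstName: faker.name.firstName(),
  lastName: faker.name.lastName(),
});
```

Calling createFakeUser() with no arguments returns one plain object whose keys line up with the table columns, which is exactly the shape Knex expects for an insert.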

As you can see, we brought in the Faker.js library, and used it in a function which will return us a user object with real-looking data!

Now that we have a function that will return a fake user object, let’s go ahead and create 1000 of these and have Knex insert them in our database!

As you can see, in our seed function, we are creating an empty “fakeUsers” array, and then using a for-loop to push 1000 user objects into it. Once we have that array, we just pass it to Knex to insert into our “users” table.

Now that the code is written, we just need to execute it!

knex seed:run --env development

If all went well, your “users” table should now have 1000 users! You can scale this to whatever numbers you want with whatever other tables you have!

Conclusion

I have only scratched the surface of what Knex.js and Faker.js have to offer. What is important is that you understand the concepts here, and adapt them to fit your use-case. Both have great docs!

I hope that you have learned something new today and I wish you happy prototyping!

I am a passionate pharmacist-turned-web-developer who wants to help others make the career change.