Ravi Talks Tech

Building Reddit*

API Server Guide Pt. 1

*Read: a locally-hosted API with link aggregation, commenting, and most importantly no voting. That should ensure no toxicity, right?

May 7, 2023

Do you want to build the next new social media app to take the world by storm (or go the way of MySpace, Google Plus, or heaven forbid, Twitter)? It’s actually not that hard to do, as long as you don’t want to serve more than around 10,000 concurrent users. The technologies that support the breathtaking scale of websites like Facebook, visited by literally billions of users daily, are among the greatest engineering feats of humanity. But those websites all started small, likely running on a single computer. With a bit of technical background, hacking together a website running on a single computer is not that hard. However, building it to be reliable, maintainable, and easy to work on does require a degree of engineering prowess.

This is the first part in a three-part series on building an API server for a Reddit-like application. This guide will explain not only how to implement the server, but also the types of design and architecture decisions that go into that implementation. Hopefully this will equip you to make informed design decisions for both social-media and non-social-media projects in the future. To focus more on the design aspect, in some parts I may gloss over implementation details if I’ve covered a similar implementation before. The full implementation is always available in this GitHub repository, which I will link to throughout the guide.

Additionally, I’ve written a part 0 which details the series’s motivation, the API specification, and the rationale behind the tech stack. This post will pick up from where that left off, starting with installing the tech stack. I’d recommend reading through the prelude first for context on the design, but as a brief rationale-less summary, we will be implementing the LinkDump API on Node.js / Express with MongoDB as the data store. The full series is linked here:

1. Installing the tech stack

The commands I use in this series will be applicable to most *nix distributions (e.g. Linux, macOS). You can reference distribution-specific installation instructions in the linked official documentation. If you’re using Windows, you can either try to find the corresponding commands for Windows or install a Linux distribution for free on your machine (via the Windows Subsystem for Linux or a virtual machine manager like VirtualBox). Where possible, I will also give the exact versions of the software I used while creating this article. You will likely be able to install newer versions without issue, but if you do encounter problems, you can fall back to the specified ones.

1.1 MongoDB

Installation

To install MongoDB, follow their instructions for installing the Community Edition based on your operating system. Once you’ve completed the installation, you should have access to the mongod and mongosh commands.

Set up

To access our database, we first need to start it up. You can either set it up as a service, so that it automatically starts when your computer starts, or you can manually run it. You should be able to find OS-specific instructions for how to set up a service in MongoDB’s installation docs (e.g. for Ubuntu or macOS).

If you’d prefer to manually run it, from an interactive shell run the following command:

$ mongod --config <mongod-config-file> [--fork]

Replace <mongod-config-file> with the path to the default config file. For macOS, the documentation gives its location in the run instructions. For Linux, it should be /etc/mongod.conf. The program runs in the foreground and can be stopped with an interrupt (Ctrl-C), so you will need to open a new terminal to run additional commands. You can also pass the --fork flag to make mongod run in the background, but you will then need to reference its PID to kill it. Make sure that your database is running whenever you run the API server.

With the MongoDB instance running, we can now connect to it by simply running the command mongosh. No additional parameters are necessary because we have not changed the default port, 27017, and we have not enabled user authentication. Since the database will only be running on and accessible from your local machine, we’ll skip enabling user authentication for now (see the access control documentation for more information if you’re curious).

We’ll still go through the process of creating an application user for our API server, so that it is ready if and when you do enable access control. In the mongosh shell, run the following commands:

> use admin
> db.createUser({
    user: "linkdumpApi",
    pwd:  passwordPrompt(),
    roles: [ { role: "readWrite", db: "linkdump" } ]
  })

The second command should prompt you to enter a password in a hidden input. You can choose whatever password you want, even something simple like “password123” (unless that’s one of your real passwords; it’s okay to admit it, this is a safe space). Make sure you remember it as we’ll need it later. After running these commands, you will have created the linkdumpApi user in MongoDB’s pre-existing admin database. The database in which a user is stored does not have to be one of the databases that user accesses, and the pre-existing admin database is a good default place to put users.

And that’s MongoDB entirely configured! Note that we didn’t have to create our application’s database or any of the collections we will use in our application. In the write queries we will run later, if we specify the name of a new database or collection, MongoDB will automatically create it.

1.2 Node.js

Installation

This guide was written using the current LTS version of Node, version 18 codenamed Hydrogen, though likely any version newer than that will work too. The LTS version, which stands for long-term support, is generally preferable over the “Current” version because of its prioritization of security and stability, though feel free to try the Current version if there’s a new feature you’d like to try out. See Node’s release cycle documentation for more details.
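
If you’d like to confirm from code that the Node version you ended up with is at least the one this guide targets, a minimal sketch might look like the following. The helper name isSupportedNode and its default minimum of 18 are my own choices for illustration; they are not part of the project.

```javascript
// Hypothetical helper (not part of the LinkDump repository): check that a
// Node.js version string such as "18.16.0" meets a minimum major version.
function isSupportedNode(version, minMajor = 18) {
    // Accept both "18.16.0" and "v18.16.0".
    const major = Number.parseInt(version.replace(/^v/, "").split(".")[0], 10);
    return Number.isInteger(major) && major >= minMajor;
}

// process.versions.node holds the running version without a leading "v".
if (!isSupportedNode(process.versions.node)) {
    console.warn(`Node ${process.versions.node} is older than this guide's target, v18.`);
}
```

Running node --version in your terminal gives you the same information without any code.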

The Node.js documentation provides installation instructions via a few different methods. The easiest way to install Node is likely via package manager installation, though often your package manager will not have the latest version (unless you’re using a third-party package manager like Homebrew). For the latest version, you can download an executable installer from Node’s website, which also includes links to installers for past versions. If you want to be able to easily switch between versions of Node, you can instead install it via nvm, a third-party open-source version manager for Node.

Set up

Now that we have Node installed, we can set up our project structure. When you install Node, it comes with npm’s CLI tool, which we can use to initialize our project. Create a folder for your project and then within it, run the command npm init -y. You can do so with the following commands:

$ cd ~
$ mkdir -p workspace/linkdump
$ cd workspace/linkdump
$ npm init -y

This will autogenerate a file named package.json in the folder you just created, which should look like this:

// package.json
{
    "name": "linkdump",
    "version": "1.0.0",
    "description": "",
    "author": "",
    "license": "ISC",
    "main": "index.js",
    "keywords": [],
    "scripts": {
        "test": "echo \"Error: no test specified\" && exit 1"
    }
}

Modify your package.json to look like the following:

 // package.json
 {
    "name": "linkdump",
    "version": "1.0.0",
    "description": "",
    "author": "",
    "license": "ISC",
-    "main": "index.js",
-    "keywords": [],
+    "type": "module",
     "scripts": {
+        "start": "node src/app/run.js",
         "test": "echo \"Error: no test specified\" && exit 1"
     }
 }

The package.json file defines certain settings for your project, like the name, the version, common commands, etc. The two keys we removed, "main" and "keywords", are only relevant if we wish to publish our project to npm, but since this project is an API server and not a library, we won’t be publishing it. The "type": "module" addition tells Node to treat our .js files as ES modules, which most importantly lets us use the import / export syntax. The start script command will be how we run our server, though it won’t work right now since we haven’t implemented anything yet.1

That’s the basic project set up, but the next few installations will be npm packages, which will automatically update the package.json file.

You can reference my code up to this point here.

1.3 Node.js packages

The next couple of tools we will install will be Node packages downloaded via npm. These are some of the core modules we will be making use of, but we will add others to our project later in the series.

Express

We will install Express via npm. Any version 4.X.X should work, but this guide was written with Express version 4.18.2, which you can specify if you do happen to encounter any strange errors.2 As of writing, Express 5 is still in beta; if you wish to use it, beware there may be breaking changes.

Install Express via npm using one of these commands:

$ npm install express     # install latest released version, may install v5
$ npm install express@4   # install latest 4.X.X version (recommended)
$ npm install --save-exact express@4.18.2  # install 4.18.2

If you look in your package.json file, you should now see a "dependencies" key containing the installed Express version. Your folder should now also have a sub-folder called node_modules.

There are no set-up instructions as we will configure Express in the code.

The MongoDB Node Driver

We will also install the MongoDB driver via npm. This guide was developed using version 4.13.0. Install as follows:

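$ npm install mongodb                     # install latest released version, may be a newer major version
$ npm install mongodb@4                   # install latest 4.X.X version (recommended)
$ npm install --save-exact mongodb@4.13.0 # install 4.13.0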

Again, the driver configuration will be in the code, so there is no set up.

If you need to reference the documentation for the driver, the MongoDB website has a developer guide and full API documentation.

You can reference my code up to this point here.

Other helpful tools

  • Git: Git is a version control manager which is useful even for an individual, allowing you to create save-points you can return to when (not if) you f*** something up. You can also pair it with a remote version control service like GitHub to back up your work to the cloud. Atlassian provides helpful guides for getting started with Git.
  • Visual Studio Code: Visual Studio Code is a pseudo-IDE which can make development easier. It will do syntax highlighting and code completion, but most importantly, it has a built-in Node debugger, which I recommend familiarizing yourself with and using throughout the guide to diagnose bugs.
  • curl: curl is a command-line program that allows you to make HTTP requests (and many other types too). This will be the primary tool we’ll use to test our server, but if you prefer a GUI application, you can always use the next tool, Postman. curl comes standard with macOS, but for other OSes, you may need to install it from your package manager.
  • Postman: Postman is a powerful GUI application that allows you to make HTTP requests. I usually prefer the simplicity of the command line, but Postman has a lot of useful features not present in curl like saving and sharing requests you make.

2. Creating the server

2.1 Code structure set up

After all that set up, we are finally ready to start creating an application. To start with, create the folder src/app, and within it, create two files: run.js and server.js. After that’s done, your project structure should look like this (with the contents of node_modules omitted):

.
├── node_modules
├── package-lock.json
├── package.json
└── src
    └── app
        ├── run.js
        └── server.js

In run.js, copy the following code:

// run.js
/**
 * @file Run the application server
 *
 * @module run
 * */

import { main } from "./server.js";

try {
    await main();
} catch (err) {
    console.error(err);
    process.exit(1);
}

And in server.js copy the following:

// server.js
/**
 * @file The application server
 *
 * @module server
 * */

/**
 * Start the server.
 *
 * @returns {Promise<any>} The promise running the server
 */
export async function main() {
    console.log("hello world!");
}

Once that’s done, try running npm start from your root directory. You should see the following:

$ npm start

> linkdump@1.0.0 start ...
> node src/app/run.js

hello world!

Congrats, you’ve run the first piece of code for your server!

The main() function in server.js will eventually contain all the logic of running our API server. As the name implies, the run.js file exists simply to run our server.3 Marking the main function as async makes it return a Promise, which will become necessary when we asynchronously create our MongoDB connection (now would be a good time to brush up on Promises and async / await if you aren’t familiar with them). run.js calls the main() function and exits with an error code if it encounters an error. You can try testing this by adding throw new Error("test error"); to your main function and seeing what happens. I’ve included documentation for the files and functions here,4 but I will mostly exclude it in further code examples for the sake of succinctness. You can decide for yourself if you think documenting your code is worthwhile, but you can always reference my documented code in the GitHub repository.
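
To make the Promise behavior a bit more concrete, here is a small sketch that is separate from the project code. The names failingMain, workingMain, and runAndReport are made up for illustration; runAndReport mirrors the try / catch in run.js, but returns the exit code instead of calling process.exit() so that it is easy to experiment with.

```javascript
// Stand-in for main() with the suggested test error added.
async function failingMain() {
    throw new Error("test error");
}

// Stand-in for a main() that completes normally.
async function workingMain() {
    return "ok";
}

// Mirrors the try / catch in run.js, but returns the exit code rather than
// calling process.exit(), so it can be called more than once.
async function runAndReport(mainFn) {
    try {
        await mainFn();
        return 0;
    } catch (err) {
        console.error(err.message);
        return 1;
    }
}

// An async function always returns a Promise, even when it throws: the
// error becomes a rejection, which `await` re-throws inside the try block.
runAndReport(failingMain).then((code) => console.log(`failingMain -> exit code ${code}`));
runAndReport(workingMain).then((code) => console.log(`workingMain -> exit code ${code}`));
```

The point to take away is that the throw inside the async function never escapes synchronously; it only surfaces where the Promise is awaited, which is why run.js awaits main() inside its try block.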

You can reference my code up to this point here.

2.2 Adding Express

To make your program respond to HTTP requests, we will now create an Express application. We’ll first import Express and then create a new function createApp() to create the server object:

// server.js
   * @module server
   * */
+
+ import express from "express";
+
+ export function createApp() {
+     const app = express();
+
+     app.get("/health", (req, res, next) => {
+         console.log("GET /health");
+         return res.send("hello world!\n");
+     });
+
+     return app;
+ }

  /**
   * Start the server.

Breaking down the body of the function, we start by creating the actual Express application. Then we tell the application that when it receives an HTTP GET request to the path /health, it should log the request to the console and return the response “hello world!”.

Though we’ve implemented the logic for creating the application, we have not yet started running it. We will do so in the main() method:

// server.js
  export async function main() {
-     console.log("hello world!");
+     const app = createApp();
+
+     const port = 2000;
+     const server = app.listen(port);
+     server.on("listening", () => {
+         console.log(`Listening on port ${server.address().port}.`);
+     });
  }

Now, run npm start to start running your server. If everything worked correctly, you should see Listening on port 2000. in your console.

To make a request to the server, open a new terminal window and run the command below. If you prefer to make the request via Postman, make a GET request to the same URL.

$ curl -i "http://localhost:2000/health"

You should get output that looks like:

HTTP/1.1 200 OK
X-Powered-By: Express
Content-Type: text/html; charset=utf-8
Content-Length: 13
ETag: W/"d-+VGxAZibLDt0cXELTnj8Tb36DKY"
Date: Thu, 29 Dec 2022 21:45:14 GMT
Connection: keep-alive
Keep-Alive: timeout=5

hello world!

And simple as that, you’ve made a functional web server! Well, you’ve mostly just called a library that implements a web server, but a win is a win.

You can reference my code up to this point, with documentation, here.

2.3 Connecting to your database

We will make a new module to store the code for connecting with the MongoDB server. Make folders for the path src/lib/util/mongodb and make a new file mongoConnectionUtils.js there. A quick note on the project directory structure: all of the code related to routing and endpoint functionality will live in the src/app directory, and all the other utility code will live in src/lib. In the new file, we will make a function that uses the Node MongoDB driver to connect to the database:

// mongoConnectionUtils.js
import { MongoClient } from "mongodb";

export async function getConnection({ user, password, host, port, authenticationDb }) {
    const uri = `mongodb://${user}:${password}@${host}:${port}/${authenticationDb}`;
    const client = new MongoClient(uri);
    await client.connect();

    return client;
}

In the first line, we create the MongoDB URI, the standard way MongoDB uses to designate the address of and access to a database. We then create a client for that URI and return it.
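
One caveat with interpolating the credentials directly: a password containing characters such as @, :, or / would produce an invalid URI unless it is percent-encoded first. The sketch below shows one way to handle that. buildMongoUri is a hypothetical helper of my own, not something from the repository or the driver, and the password in the example is made up.

```javascript
// Hypothetical helper: build a MongoDB connection string, percent-encoding
// the credentials so that reserved characters cannot break the URI.
function buildMongoUri({ user, password, host, port, authenticationDb }) {
    const credentials = user
        ? `${encodeURIComponent(user)}:${encodeURIComponent(password)}@`
        : ""; // no credentials: fine while access control is disabled
    return `mongodb://${credentials}${host}:${port}/${authenticationDb}`;
}

console.log(buildMongoUri({
    user: "linkdumpApi",
    password: "p@ss:word/123",
    host: "localhost",
    port: 27017,
    authenticationDb: "admin",
}));
// prints "mongodb://linkdumpApi:p%40ss%3Aword%2F123@localhost:27017/admin"
```

If your password only uses letters and digits, the plain template string in getConnection() behaves identically, so this only matters once you pick something more adventurous than “password123”.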

We will also add a new method to test the latency of the connection:

// mongoConnectionUtils.js
  export async function getConnection({ user, password, host, port, authenticationDb }) {
      const uri = `mongodb://${user}:${password}@${host}:${port}/${authenticationDb}`;
      const client = new MongoClient(uri);
      await client.connect();
+
+     await timePing(client); // test connection

      return client;
  }
+
+ export async function timePing(client) {
+     const start = Date.now();
+
+     return client
+         .db("admin")
+         .command({ ping: 1 })
+         .then(() => Date.now() - start);
+ }

This function returns a Promise that resolves to the number of milliseconds the ping command took to run.
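
Because timePing() only relies on client.db(...).command(...), you can try it out without a running database by handing it a stand-in object. This is a sketch for experimentation only: fakeClient is invented for this example, and timePing() is repeated (without the export) just so the snippet runs on its own.

```javascript
// Copy of timePing() so that this snippet is self-contained.
async function timePing(client) {
    const start = Date.now();

    return client
        .db("admin")
        .command({ ping: 1 })
        .then(() => Date.now() - start);
}

// Invented stand-in for a MongoClient: its "ping" resolves after delayMs.
function fakeClient(delayMs) {
    return {
        db: () => ({
            command: () =>
                new Promise((resolve) => setTimeout(() => resolve({ ok: 1 }), delayMs)),
        }),
    };
}

timePing(fakeClient(20)).then((ms) => {
    console.log(`ping took about ${ms} ms`); // exact value varies per run
});
```

The same trick of passing in a simple object with the right shape is what makes functions like this straightforward to unit test later on.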

Now, we will start the connection to our database in our server module. In server.js, add the following code:

// server.js
  import express from "express";
+
+ import * as mongoConnectionUtils from "../lib/util/mongodb/mongoConnectionUtils.js";
+
+ // TODO: REMOVE SENSITIVE INFORMATION
+ const MONGO_CONFIG = {
+     user: "linkdumpApi",
+     password: "<your-password-here>",
+     host: "localhost",
+     port: 27017,
+     authenticationDb: "admin",
+ };

// server.js
- export function createApp() {
+ export function createApp({ mongoClient }) {
      const app = express();

-     app.get("/health", (req, res, next) => {
+     app.get("/health", async (req, res, next) => {
          console.log("GET /health");
-         return res.send("hello world!\n");
+         const mongodbPingTimeMs = await mongoConnectionUtils.timePing(mongoClient);
+         return res.send({ mongodbPingTimeMs });
      });

// server.js
  export async function main() {
+     const mongoClient = await mongoConnectionUtils.getConnection(MONGO_CONFIG);
-     const app = createApp();
+     const app = createApp({ mongoClient });

We’ve done a few things here. At the beginning of the file, we committed a cardinal sin by putting the database credentials in the code itself, which is a very bad idea. Addressing this issue will add a bit more complexity to our project, and this post is running quite long already, so we will take up fixing this in the beginning of the next part. For now though, this is not a big issue, largely because we have not actually enabled access control on our database anyway.

In the createApp() method, we are now taking in a database client, and we’ve made our health endpoint slightly more functional by returning the latency for our database connection. In our main() function, we now initialize the MongoDB connection and pass it to the app creation.
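
One thing to be aware of with the async handler: Express 4 does not pass a rejected Promise from a route handler to its error handling, so if timePing() fails (say, because the database went down), the error becomes an unhandled rejection, which on Node 15 and later crashes the process by default. A common workaround is a small wrapper that forwards rejections to next(). The sketch below is my own illustration rather than the repository’s code; asyncHandler is a name I’ve chosen, and the empty req / res objects exist only so the snippet runs without Express installed.

```javascript
// Wrap an async route handler so that a rejection is forwarded to next(),
// where Express's error-handling middleware can turn it into a response.
function asyncHandler(handler) {
    return (req, res, next) => {
        Promise.resolve(handler(req, res, next)).catch(next);
    };
}

// Intended usage inside createApp() (left as a comment, since it needs Express):
//   app.get("/health", asyncHandler(async (req, res) => { ... }));

// Demonstration with stand-in objects instead of a real request.
const failingHandler = asyncHandler(async () => {
    throw new Error("database unreachable");
});

failingHandler({}, {}, (err) => {
    console.log(`next() received: ${err.message}`);
});
```

Whether you use a wrapper like this or a try / catch inside each handler is a matter of taste; the important part is that the error reaches next() one way or another.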

Let’s try it out! Make sure that your MongoDB instance is running and then make the same request as before. You should now see something like the following:

$ curl -i "http://localhost:2000/health"
HTTP/1.1 200 OK
X-Powered-By: Express
Content-Type: application/json; charset=utf-8
Content-Length: 21
ETag: W/"15-edxzyQWjuQWgXNIGe+fy7CXks+M"
Date: Fri, 30 Dec 2022 04:04:52 GMT
Connection: keep-alive
Keep-Alive: timeout=5

{"mongodbPingTimeMs":1}

Since your database is running on the same machine as the server, the ping time should be very low.

And that’s it for this part! You’ve created a server that tells you about its connection to a database. Admittedly not very functional, but a good foundation for implementing the rest of the API.

You can reference my implementation up to the end of this part here.

What’s next?

As I mentioned earlier, we will start the next part by fixing our credential management by moving the MongoDB configuration out of the code and into a separate file. Following that, we will create our first two endpoints, user registration and user fetching. Additionally, we will touch on how to test our codebase, with both unit and integration tests.

See you in the next part!

Footnotes

  1. For a better developer experience, you can add another script that automatically restarts your server any time you change a file. You can do this by installing a Node package called nodemon as a development dependency (via npm install --save-dev nodemon) and then adding another script: "start-dev": "nodemon --ignore test/ src/app/run.js". Whenever you start the server, you can then use npm run start-dev instead of npm start to start it in watch mode. Node 18 also comes with experimental built-in support for file watching, which should allow you to get the same behavior by running node --watch-path=./src src/app/run.js, but I was encountering intermittent issues with it.

  2. Most npm packages will follow semantic versioning (SemVer). By using this system, if you install a newer patch version (i.e. 4.18.X > 4.18.2), it should have the exact same functionality as version 4.18.2, but maybe with some bug fixes or security patches. If you install a newer minor version (i.e. 4.X.X > 4.18.2), it should have all the same functionality as version 4.18.2 with maybe some new functionality too. If you install a new major version (i.e. 5.X.X+), some functionality in version 4.18.2 may be missing or the APIs may have been modified. My recommendation is to use the latest minor version for the same major version I used, but you can also install just the latest patch for my minor version. npm explains how to specify installing based on semantic versioning in their docs.

    Relatedly, for this repository, you can see that the project version was set to “1.0.0” automatically in the package.json file. If you wanted to properly follow semantic versioning, until we finish building out the implementation, you can set the version to something of the form “0.*.*”, because major version 0 is considered pre-release, so non-backwards-compatible changes can be made without issue. Doing so is completely optional though, as no one will really be looking at the version number.
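
    To illustrate the upgrade rule from this footnote, the sketch below checks whether a candidate version should be a compatible upgrade from a base version (same major version, and not older). It is a simplification I wrote for this footnote, not npm’s actual range-matching logic, which also deals with pre-release tags and treats 0.x versions differently; isCompatibleUpgrade is a made-up name.

```javascript
// Simplified compatibility check for plain "MAJOR.MINOR.PATCH" strings.
function isCompatibleUpgrade(base, candidate) {
    const [baseMajor, baseMinor, basePatch] = base.split(".").map(Number);
    const [major, minor, patch] = candidate.split(".").map(Number);

    if (major !== baseMajor) return false;             // new major: may break
    if (minor !== baseMinor) return minor > baseMinor; // newer minor: additive
    return patch >= basePatch;                         // same minor: fixes only
}

console.log(isCompatibleUpgrade("4.18.2", "4.18.5")); // true  (newer patch)
console.log(isCompatibleUpgrade("4.18.2", "4.19.0")); // true  (newer minor)
console.log(isCompatibleUpgrade("4.18.2", "5.0.0"));  // false (new major)
console.log(isCompatibleUpgrade("4.18.2", "4.17.3")); // false (older minor)
```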

  3. Currently, when using ES6 modules (the import syntax) there is no good way in Node to tell if a program is being run as an application, or is simply being imported. Consequently, we will put all of the functionality in our server.js file, but run it from the run.js file. By doing so, we can import the server.js module in our tests without inadvertently running the server. We won’t be able to properly test run.js, but its logic is so simple that it’s unnecessary.

  4. The documentation I write mostly follows the JSDoc syntax, though I will stray from it if it causes the code to be overly verbose. The parameter and return types can be helpful for understanding how the code works and can even allow your IDE to catch type errors.
