Saving to disk

Our key-value store will need a way to read and write data to and from persistent storage.

There are many ways to persist structured data to disk; some are optimized for reading, some for writing, and some for a balance of the two. Each approach has its own advantages and trade-offs. Two common approaches are B+ trees and log-structured merge-trees (LSM trees).

We’re going to eventually implement a log-structured merge-tree system to persist our key-value store. The details are relatively easy to grasp and implement compared to some other approaches. Popular database solutions that use LSM trees include LevelDB, RocksDB, and Apache Cassandra.

Before we jump into LSM trees, let’s try out an obvious approach that many programmers independently dream up at some point during their adventures. This approach involves storing data on the filesystem where the directory and file name (or just the file name) comprise the key, and the file content is the value. I’ll refer to this as filesystem-as-a-database.
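
To make the idea concrete, here’s a minimal sketch of the naive approach (the naiveSet/naiveGet helpers are hypothetical, and notice that it uses the raw key as the file name, which we’re about to see is a problem):

import fs from 'fs'
import path from 'path'

// Hypothetical sketch: the file name is the key, the file content is the value.
const dbPath = './naive-db'
fs.mkdirSync(dbPath, { recursive: true })

function naiveSet(key, value) {
  fs.writeFileSync(path.resolve(dbPath, key), value)
}

function naiveGet(key) {
  return fs.readFileSync(path.resolve(dbPath, key)).toString()
}

naiveSet('greeting', 'hello')
naiveGet('greeting') // => 'hello'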

The filesystem-as-a-database approach is pretty straightforward, and works well on a small scale. It quickly breaks down, however, when the number of keys gets too high. This is because modern filesystems typically aren’t optimized for the usage pattern of a key-value store, and some even have a limit on the number of files and data blocks they can represent (search for inodes to learn more).

File names as keys

We want to be flexible in the kinds of data we support for keys. Ideally, we can guarantee that any arbitrary data will work as a key, including strings, numbers, dates, binary data, or any combination of those.

Filesystem restrictions on file names

Since we’re going to use file names as the keys for our filesystem-as-a-database implementation, we need some way to represent any arbitrary key as a valid file name. Most common filesystems in use today restrict both the length of file names and the characters they may contain. Windows/NTFS doesn’t allow file names to contain any of these characters: / ? < > \ : * | ". UNIX-like operating systems, including Linux and macOS, typically don’t allow / or NUL characters in file names (even when the underlying filesystem might support them).

Also, interestingly, Windows does not permit a file to have any of the following names, regardless of extension: com1, com2, com3, com4, com5, com6, com7, com8, com9, lpt1, lpt2, lpt3, lpt4, lpt5, lpt6, lpt7, lpt8, lpt9, con, nul, and prn. This is “backwards-compatible” handling of reserved device names that dates back to the 1970s.

Given these restrictions, we likely wouldn’t be able to use keys like the following as file names: nul, my/key:$, person/bob, asdfNUL. We need an alternative way to represent keys in file names.

Hashing saves the day

We’re going to run our keys through a hash function and use the output in the file names for our filesystem-as-a-database. A hash function takes input of any length and maps it to a (typically) fixed-length output that appears pseudo-random. Hashing is ubiquitous in software and enables or is related to features like data fingerprinting, checksums, digests, encryption, authentication, and secure password storage. Common hash functions, algorithms, and derived tools include MD5, SHA, CRC, Java’s hashCode(), HMAC, and bcrypt.
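
For example, here’s a quick sketch using Node’s built-in crypto module, showing that inputs of wildly different lengths map to same-size outputs:

import crypto from 'crypto'

// SHA-256 always produces a 256-bit (32-byte) digest, regardless of input length.
const short = crypto.createHash('sha256').update('a').digest('hex')
const long = crypto.createHash('sha256').update('a'.repeat(100000)).digest('hex')

console.log(short.length) // 64 hex characters
console.log(long.length) // 64 hex characters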

Hash functions come in many varieties and it’s critical to choose one with the correct properties for its intended use. Some of the properties to consider include:

  • Is it cryptographically secure?
  • Does it generate output that is uniformly distributed or balanced among the possible outputs?
  • What size output does it produce?
  • What are the performance characteristics? Some algorithms, such as those used for password hashing (like bcrypt), intentionally use expensive computations that can be tuned to hinder brute-force password cracking (see the sketch after this list).
  • Is it designed for a particular type of input?
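
To illustrate that tunable-cost point, here’s a small sketch using the bcrypt npm package (purely illustrative; it isn’t a dependency of this project). The rounds argument controls how expensive each hash is:

import bcrypt from 'bcrypt'

// Each additional round roughly doubles the work required per hash,
// which is exactly what slows down brute-force password cracking.
const cheap = bcrypt.hashSync('hunter2', 4) // low cost: fast
const costly = bcrypt.hashSync('hunter2', 12) // higher cost: noticeably slower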

Since we’re just going to use a hash function to get a normalized file name, we don’t need a cryptographically secure algorithm. For our implementation, I’ve chosen a very simple algorithm provided by the string-hash npm package. It takes an arbitrary string and produces an unsigned 32-bit integer as output. We’ll just use that output as the file name for each key.
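
Here’s what that looks like in practice (you’ll see this exact hash again in the file format example below):

import stringHash from 'string-hash'

stringHash('person:1') // => 2750769851, an unsigned 32-bit integer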

File contents as values

We’re going to store the value associated with each key as the content in its associated file. Since we want to support values that include arrays, objects, numbers, and strings, we’ll represent values as JSON strings.

In order to store some metadata alongside the value, we’ll wrap the value in an array, with the value as the first element. That lets us extend our file format later without breaking backwards compatibility, and using an array gives us more control over the file layout than an object would.

The files we create will have the .json extension to make them easy to view in an editor.

To demonstrate the file format, if we have the key person:1 with an object for the value that includes the properties first_name = "Joe" and last_name = "McLearny", we’ll end up with a file named 2750769851.json (the output of stringHash('person:1') plus the json extension) with its content being:

[{ "first_name": "Joe", "last_name": "McLearny" }, "person:1"]

We need to store the key alongside the value so that when we read a file back, we can verify it really belongs to the requested key and is not the result of a hash collision. In the future we’ll add safeguards so that hash collisions won’t cause problems such as overwriting data.

Tests

In your copy of the learndb repo, check out the key-value-store-fsdb_before branch if you haven’t already and install dependencies by running npm install.

The first thing we need to do is update our tests to pass options to the key-value store so it knows where to write on the filesystem. We also want to clear out the files created during tests so the filesystem doesn’t get cluttered.

I typically do all cleanup only at the beginning of a test suite. This makes it easier to debug a failing test by inspecting the state it leaves behind.

Open tests/key-value-store.js and add the following import lines after the existing import lines at the top:

import fs from 'fs'
import path from 'path'
import shell from 'shelljs'

The shelljs package gives us a convenient, cross-platform, Unix-like interface (including cp, mv, cat, and rm) for performing file operations.
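
For a feel for the API, here are the two shelljs calls this chapter actually uses:

import shell from 'shelljs'

shell.mkdir('-p', 'db_temp/nested') // create a directory tree, like mkdir -p
shell.rm('-rf', 'db_temp') // recursively remove it, like rm -rf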

Next, replace the top part of the describe block so that it looks like the following:

describe('KeyValueStore', () => {
  // Test keys and values we'll use in the new tests
  const testKey1 = 'test-key-1'
  const testValue1 = 'test-value-1'
  const testValue2 = 'test-value-2'

  // The path where temporary db files will be written, in the project root.
  const dbTempPath = path.resolve(__dirname, '../db_temp')

  // Contains a fresh instance of the key-value store for each test.
  let keyValueStore = null

  // Contains the path for the db files for the currently executing test.
  let dbPath = null

  // Allows us to give each test a unique directory.
  let testId = 1

  // Functions passed to before() run only once, before any of the tests run.
  before(() => {
    // Safety check so we don't delete the wrong files
    assert.endsWith(dbTempPath, 'db_temp')
    shell.rm('-rf', dbTempPath)
  })

  // Functions passed to beforeEach will run before every test.
  beforeEach(() => {
    // Generate a unique path in the project root to hold the db files for this test.
    dbPath = path.resolve(
      dbTempPath,
      process.pid.toString() + '_' + (testId++).toString()
    )
    shell.mkdir('-p', dbPath)

    // Before each test, create a new instance of the key-value store.
    keyValueStore = new KeyValueStore({ dbPath })
    keyValueStore.init()
  })

  it('get() returns value that was set()', () => {

The three changes for this block of code can be summarized as:

  1. Clear out all db files from previous test runs.
  2. Generate a unique path inside the project to store db files for each test.
  3. Pass the unique path to the key-value store in its constructor.

This is all we need to change in our test suite for now. The tests should still pass as long as our new implementation behaves the same way the old one did.

Implementation

If you’re following along and making changes to your copy of the learndb repo, you’ll need to go ahead and install the string-hash package before continuing.

npm install --save string-hash

Imports and constants

Open src/key-value-store.js and add the following code above the class export:

import path from 'path'
import fs from 'fs'
import shell from 'shelljs'
import stringHash from 'string-hash'

const VALUE_INDEX = 0
const KEY_INDEX = 1

This brings in some packages we’ll be using and sets a couple of constants. These constants name the array indexes where we’ll read and write the parts of each entry: the value and its associated key.
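
In other words, every entry we write will be an array shaped like this:

// An entry pairs a value (index 0) with the key it was stored under (index 1).
const entry = ['value-1', 'test-1']

entry[VALUE_INDEX] // => 'value-1'
entry[KEY_INDEX] // => 'test-1'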

constructor({ dbPath })

Next is the constructor. Previously we had been using it to initialize the in-memory store. The new constructor will accept the dbPath argument and save it for later. Replace the constructor with this:

constructor({ dbPath }) {
  this.dbPath = dbPath
}

Not too complicated.

init()

The init() method is pretty light as well. It’s just going to ensure the directory exists for dbPath.

init() {
  shell.mkdir('-p', this.dbPath)
}
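
As a quick sanity check, constructing and initializing a store looks like this (the path is just an illustrative placeholder):

// Hypothetical usage example; './my-db' is a placeholder path.
const store = new KeyValueStore({ dbPath: './my-db' })
store.init() // ensures the ./my-db directory exists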

set(key, value)

Now we get to the implementation of the set method. Any time this method is called, we’re going to hash the key and write the JSON string representing the key and value to a file named after the hashed key.

set(key, value) {
  const keyHash = stringHash(key)
  const fileName = `${keyHash}.json`
  // Creates the file if it does not exist or overwrites it if it does exist.
  // Sets the file content to the JSON for the entry.
  fs.writeFileSync(
    path.resolve(this.dbPath, fileName),
    JSON.stringify([value, key])
  )
}

Notice that we’re storing the value at index 0 and the key at index 1. These indexes correspond to the constants we set at the top of the file. Recall that we are using arrays to store the entries to keep the file layout predictable; objects don’t always serialize to JSON with a consistent ordering of their property names.
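
You can see the issue directly: JSON.stringify follows property insertion order, so two logically identical objects can serialize to different byte sequences, while an array’s layout never varies.

// Same logical object, different insertion order, different bytes:
JSON.stringify({ first_name: 'Joe', last_name: 'McLearny' })
// => '{"first_name":"Joe","last_name":"McLearny"}'
JSON.stringify({ last_name: 'McLearny', first_name: 'Joe' })
// => '{"last_name":"McLearny","first_name":"Joe"}'

// With an array, index 0 is always the value and index 1 is always the key:
JSON.stringify(['value-1', 'test-1']) // => '["value-1","test-1"]'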

Here’s an example of calling the set method and the corresponding files that will be written to disk:

set('test-1', 'value-1')
set('test-2', 'value-2')
set('test-3', null)

3804000015.json:

["value-1", "test-1"]

3847802988.json:

["value-2", "test-2"]

3882405965.json:

[null, "test-3"]

delete(key)

Let’s skip ahead to the delete method, since it’s pretty simple. For this implementation, a delete call will just delete the file named with the given key’s hash if it exists. The term unlink is just a weird, ancient incantation that means “delete”.

delete(key) {
  const keyHash = stringHash(key)
  const fileName = `${keyHash}.json`
  if (fs.existsSync(path.resolve(this.dbPath, fileName))) {
    fs.unlinkSync(path.resolve(this.dbPath, fileName))
  }
}
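
Continuing the earlier set example (and assuming a store instance like the hypothetical one above), deleting a key removes its file, and deleting a key that was never set is a no-op:

store.delete('test-1') // removes 3804000015.json from the db directory
store.delete('never-set') // no file exists for this key's hash, so nothing happens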

get(key)

The get method mirrors the set method in that it also hashes the key and performs a file operation. In this case, it will read the file (if it exists) and return the value for the given key after parsing the JSON and verifying the key.

get(key) {
  const keyHash = stringHash(key)
  const fileName = `${keyHash}.json`

  if (!fs.existsSync(path.resolve(this.dbPath, fileName))) {
    return undefined // If the file doesn't exist, there's no value set for the key.
  }

  // readFileSync returns a Buffer object that represents binary data.
  const buffer = fs.readFileSync(path.resolve(this.dbPath, fileName))

  // Convert the buffer to a string so we can parse it as JSON.
  const bufferString = buffer.toString()

  // Parse the JSON into an array representing an entry.
  const entry = JSON.parse(bufferString)

  // Verify the key matches
  if (entry[KEY_INDEX] !== key) {
    throw new Error(
      `Keys do not match: '${
        entry[KEY_INDEX]
      }' !== '${key}' -- probably a hash collision.`
    )
  }

  // Return the value for the key.
  return entry[VALUE_INDEX]
}

First we hash the key to figure out what file to check for its value. A quick check to see if the file exists handles the case where no value has been set for a key yet, in which case we can just return undefined.

Next we read the file content (which should be JSON) and parse it into an entry array. Now we can double-check that the key stored with the value is really the one we want, and handle the rare situation where two keys hash to the same file name.

Finally, if all has gone well, we just return the value associated with the key.

That’s it for the implementation.
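
Putting it all together, here’s an end-to-end sketch of the public API we’ve built (again with a placeholder path):

const store = new KeyValueStore({ dbPath: './my-db' })
store.init()

store.set('person:1', { first_name: 'Joe', last_name: 'McLearny' })
store.get('person:1') // => { first_name: 'Joe', last_name: 'McLearny' }

store.delete('person:1')
store.get('person:1') // => undefined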

Benchmarks

You might be wondering about performance, since having to read and write a file for every get and set seems like it would be slow. That’s true, but keep in mind that we’re not going for performance yet. The filesystem-as-a-database implementation keeps things simple. In the future we’ll tackle increasingly complex and interesting ways to store data, with corresponding improvements in performance.

Let’s use our benchmarks to look at the performance and compare it to the in-memory implementation.

We have to make some small changes to the benchmarks to manage the path for the database files, similar to how we had to modify the tests.

Open src/benchmarks.js and add the following imports near the top of the file:

import path from 'path'
import shell from 'shelljs'

Next, update the first part of the run method so it looks like this:

function run() {
  const dbPath = path.resolve(__dirname, '../db_bench')

  if (!dbPath.includes('db_bench')) {
    throw new Error(
      `Refusing to run benchmarks for db path not containing 'db_bench': ${dbPath}`
    )
  }

  shell.rm('-rf', dbPath)
  shell.mkdir('-p', dbPath)

  const keyValueStore = new KeyValueStore({ dbPath })
  const suite = new Benchmark.Suite()

This code generates a path where the database files for the benchmark will be stored, performs a safety check on that path, and then recreates the directory to ensure it’s empty.

You can now run the benchmarks with the command npm run benchmarks. Here’s the output when I run the benchmarks against the new implementation on my local machine:

> @learndb/learndb@0.1.1 benchmarks C:\projects\learndb
> node --require @babel/register src/benchmarks.js

Starting RSS memory usage: 47.426 MB
KeyValueStore#set x 1,106 ops/sec ±1.12% (86 runs sampled)
KeyValueStore#get x 9,631 ops/sec ±4.52% (77 runs sampled)
KeyValueStore#delete x 9,730 ops/sec ±4.82% (69 runs sampled)
Ending RSS memory usage: 76.145 MB
Difference: 28.719 MB

Compare this to the benchmark results for the simple in-memory implementation:

> @learndb/learndb@0.1.1 benchmarks C:\projects\learndb
> node --require @babel/register src/benchmarks.js

Starting RSS memory usage: 57.676 MB
KeyValueStore#set x 1,774,185 ops/sec ±2.53% (84 runs sampled)
KeyValueStore#get x 11,633,332 ops/sec ±3.46% (83 runs sampled)
KeyValueStore#delete x 11,518,884 ops/sec ±3.51% (85 runs sampled)
Ending RSS memory usage: 82.695 MB
Difference: 25.020 MB

The new results are quite a bit slower than the in-memory implementation. That’s expected, though; we’re just getting started with disk-based persistence. We should see performance improvements throughout the upcoming chapters.

Remember, you can check out the corresponding key-value-store-fsdb_after branch to see the result of the changes made here.