String tables

We now have a persistence layer for our key-value store that scales in size better than the filesystem-as-a-database approach. It’s currently fairly straightforward but slow, so we’re going to start making some optimizations while learning a few new concepts.

The next thing we’ll do is introduce a buffer so we can write batches of entries to disk using sorted string tables.

In-memory buffer

The in-memory buffer is where our entries will initially be stored with set() calls, and it’s the first place we’ll look for keys passed to get(). When the buffer reaches a certain size, we’ll remove duplicate entries, sort the buffer, and flush it to disk as a sorted string table. Then we’ll clear the buffer to make room for new entries.
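
For example, with a maxBufferLength of 3 (the value our tests will use), a series of set() calls behaves roughly like this sketch of what we're about to build, using our [key, value, isDeleted] entry format:

keyValueStore.set('b', 1) // buffer: [['b', 1, false]]
keyValueStore.set('a', 2) // buffer: [['b', 1, false], ['a', 2, false]]
keyValueStore.set('b', 3) // buffer is full: it gets de-duplicated, sorted, and flushed to disk
                          // buffer: []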

String tables

A string table is a collection of strings, separated in a predictable way. An array of strings is an easy example of a string table. A CSV file, which contains a collection of comma-separated values, one row per line, can also be considered a string table.

A string table might be a log if it only has new data appended to the end. It might instead be a sorted string table, or SST, which makes it easy for us to perform a binary search for a particular string prefix. Conveniently, we write the key for each entry first on its line, so we'll be able to do a binary search for a given key.

Our file format is going to be similar to before, a log of JSON strings, one on each line. The difference, however, is that we will sort the entries before writing them to disk, resulting in files containing sorted string tables.

We’ll continue to use the .json file extension to make it easy to view the files in an editor.

Here’s an example of what one of our files might contain:

[["animal",1],{"name":"Hobgoblin","description":"An unsightly little elf"}]\n
[["person",1],{"first_name":"Joe","last_name":"McLearny"}]\n
[["person",2],{"first_name":"Jimmy","last_name":"Long"}]\n
[["person",3],{"first_name":"Ed","last_name":"Ward"}]\n

The main difference in the file format from before is that it will always be sorted. There will also be multiple files, one for each time the in-memory buffer fills up and gets flushed to disk.

Tests

In your copy of the learndb repo, check out the key-value-store-sst_before branch if you haven’t already and install dependencies by running npm install.

The first thing we need to do is update our tests to pass an extra option to the key-value store to define the size of the in-memory buffer.

Open tests/key-value-store.js and add the following variable declaration after the line declaring dbPath:

let maxBufferLength = 3

We also need to pass this argument to the key-value store when it’s created, so update the corresponding line for that:

keyValueStore = new KeyValueStore({ dbPath, maxBufferLength })

The rest of the changes are adding new tests to make sure our in-memory buffer gets flushed to disk properly. After the last test, but before the end of the describe() block, add the following code:

it('flush() flushes buffer to disk', () => {
  assert.equal(shell.ls(dbPath).length, 0, 'no SST files should exist in dbPath yet')
  keyValueStore.set('test-key-3', 'test-value-3')

  assert.equal(shell.ls(dbPath).length, 0, 'no SST files should exist in dbPath yet')
  keyValueStore.flush()

  assert.equal(shell.ls(dbPath).length, 1, 'buffer should be flushed to disk as sorted_string_table_0001.json')
  assert.lengthOf(keyValueStore.buffer, 0, 'the buffer should be emptied after flushing to disk')
})

it('set() flushes buffer to disk after maxBufferLength (3) entries', () => {
  assert.equal(shell.ls(dbPath).length, 0, 'no SST files should exist in dbPath yet')
  keyValueStore.set('test-key-2', 'test-value-2')

  assert.equal(shell.ls(dbPath).length, 0, 'no SST files should exist in dbPath yet')
  keyValueStore.set('test-key-1', 'test-value-1')

  assert.equal(shell.ls(dbPath).length, 0, 'no SST files should exist in dbPath yet')
  keyValueStore.set('test-key-3', 'test-value-3')

  assert.equal(shell.ls(dbPath).length, 1, 'buffer should be flushed to disk as sorted_string_table_0001.json')
  assert.lengthOf(keyValueStore.buffer, 0, 'the buffer should be emptied after flushing to disk')

  assert.equal(shell.ls(dbPath).length, 1, 'first SST file should exist in dbPath')
  keyValueStore.set('test-key-2', 'test-value-2')

  assert.equal(shell.ls(dbPath).length, 1, 'first SST file should exist in dbPath')
  keyValueStore.set('test-key-1', 'test-value-1')

  assert.equal(shell.ls(dbPath).length, 1, 'first SST file should exist in dbPath')
  keyValueStore.set('test-key-3', 'test-value-3')

  assert.equal(shell.ls(dbPath).length, 2, 'buffer should be flushed to disk as sorted_string_table_0002.json')
  assert.lengthOf(keyValueStore.buffer, 0, 'the buffer should be emptied after flushing to disk')

  const expectedEntries = [
    ['test-key-1', 'test-value-1', false],
    ['test-key-2', 'test-value-2', false],
    ['test-key-3', 'test-value-3', false]
  ]

  const expectedSortedStringTableContent = expectedEntries.map(entry => JSON.stringify(entry)).join('\n') + '\n'

  const actualSortedStringTableContent1 = shell.cat(path.resolve(dbPath, 'sorted_string_table_0001.json')).stdout
  assert.equal(actualSortedStringTableContent1, expectedSortedStringTableContent)

  const actualSortedStringTableContent2 = shell.cat(path.resolve(dbPath, 'sorted_string_table_0002.json')).stdout
  assert.equal(actualSortedStringTableContent2, expectedSortedStringTableContent)
})

Implementation

Imports and constants

Open src/key-value-store.js and add the following line above the class export:

const SST_FILE_NAME_REGEXP = /^sorted_string_table_(\d+)[.]json$/

This regular expression lets us match the filenames for our sorted string tables and extract the digits representing which file in the sequence we’re looking at.
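
Here's a quick look at how it behaves against a few illustrative filenames:

SST_FILE_NAME_REGEXP.test('sorted_string_table_0001.json')     // => true
SST_FILE_NAME_REGEXP.test('sorted_string_table_0001.json.bak') // => false, the $ anchor rejects trailing text
SST_FILE_NAME_REGEXP.exec('sorted_string_table_0042.json')[1]  // => '0042', the captured digits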

constructor({ dbPath, maxBufferLength })

Our constructor needs to take in an additional parameter that defines the size limit for the buffer. It also needs to initialize the buffer to an empty array:

constructor({ dbPath, maxBufferLength }) {
  this.dbPath = dbPath
  this.maxBufferLength = maxBufferLength
  this.buffer = []
}

set(key, value, isDeleted)

The set method gets simplified a bit since we’ll be primarily storing entries in the buffer. Once the buffer is full, though, we’ll call a new method named flush that will write the buffer to disk and clear it.

Update the set method so it looks like the following:

set(key, value, isDeleted = false) {
  this.buffer.push([key, value, isDeleted])

  if (this.buffer.length < this.maxBufferLength) {
    // The buffer isn't full yet, so we're done
    return
  }

  this.flush()
}

flush()

flush() {
  if (this.buffer.length === 0) {
    return
  }

  // De-duplicate buffer entries for the same key, preserving only the last entry.
  const bufferObject = {}
  for (const entry of this.buffer) {
    bufferObject[entry[KEY_INDEX]] = entry
  }

  const sstFileName = this._generateNextSstFileName()

  // Flush the buffer to disk
  fs.writeFileSync(
    path.resolve(this.dbPath, sstFileName),
    // Stringify the de-duplicated entries, sort the JSON strings, and join
    // them into one string separated by newlines, plus a final trailing newline.
    Object.keys(bufferObject)
      .map(key => JSON.stringify(bufferObject[key]))
      .sort()
      .join('\n') + '\n'
  )

  this.buffer = []
}

The flush() method is now where the magic happens for storing new values. The first thing we do is de-duplicate the buffer. Since set() stores every operation in an array, it’s possible to set the value for the same key multiple times, or delete it. We want to collapse all these operations so only the last one remains for each key. The rest no longer matter.

We do this by iterating through the buffer entries, oldest to newest, and applying them to a temporary object keyed by the entry key. This ensures that newer entries for the same key overwrite the older ones, and leaves us with exactly one entry per key. You could say that we’re applying the buffer operations to a temporary structure before we flush it to disk.

This idea of applying operations in-memory will become more important when we start to deal with transactions and have to keep track of multiple versions of data for each running transaction.

The next thing we do is call the helper method _generateNextSstFileName(), which is explained below. It generates a unique filename that we can write the buffer to.

Finally, we flush the buffer to disk. First, we stringify each de-duplicated entry to a JSON string, which is the format we'll be storing them in. Next, we sort the JSON strings; because each entry's key is serialized first, sorting the strings effectively sorts the entries by key, which is what yields a sorted string table. Lastly, we join the sorted JSON strings together with newlines and add a final newline (so the last line is terminated).

One last thing we also do is clear the buffer so it can start recording new entries again.

Here’s an example of some set() operations, the buffer that will be generated, and how it will be de-duplicated and written to disk.

set('test-1', 'value-1') // 3rd argument isDeleted defaults to false
set('test-2', 'value-2', false)
set('test-3', 'value-3', false)
set('test-3', null, true) // isDeleted is true for this entry
set('abc', 'asdf')
set('test-1', 'new value')
set('defg', 'foo')
set('000000000', '0 no')

Results in the following in-memory buffer just before it is flushed:

const buffer = [
  ['test-1', 'value-1', false],
  ['test-2', 'value-2', false],
  ['test-3', 'value-3', false],
  ['test-3', null, true],
  ['abc', 'asdf', false],
  ['test-1', 'new value', false],
  ['defg', 'foo', false],
  ['000000000', '0 no', false]
]

When flush() is called, the buffer is de-duplicated into the following object. Note that JavaScript objects keep keys in insertion order, and overwriting a key's value (as happens for test-1 and test-3 here) doesn't move the key; the order doesn't matter anyway, since we sort the entries next:

{
  'test-1': ['test-1', 'new value', false],
  'test-2': ['test-2', 'value-2', false],
  'test-3': ['test-3', null, true],
  'abc': ['abc', 'asdf', false],
  'defg': ['defg', 'foo', false],
  '000000000': ['000000000', '0 no', false]
}

Finally, we stringify each entry to JSON, sort them, and write the following output to the SST file:

["000000000","0 no",false]\n
["abc","asdf",false]\n
["defg","foo",false]\n
["test-1","new value",false]\n
["test-2","value-2",false]\n
["test-3",null,true]\n

This results in the SST file capturing exactly the latest change for each key that passed through the buffer.

_generateNextSstFileName()


const SST_FILE_NAME_REGEXP = /^sorted_string_table_(\d+)[.]json$/

// ...

_generateNextSstFileName() {
  const existingSstFileNames = shell.ls(this.dbPath).filter(fileName => SST_FILE_NAME_REGEXP.test(fileName))

  if (existingSstFileNames.length === 0) {
    return 'sorted_string_table_0001.json'
  }

  // By default, ls returns a file list sorted by name.  So we can use pop() to
  // get the last filename, which will end up being the one with the highest index.
  const lastSstFileName = existingSstFileNames.pop()

  // This regex only matches the format of SST file names and extracts the index.
  const lastSstIndexString = SST_FILE_NAME_REGEXP.exec(lastSstFileName)[1]

  // We need to explicitly parse it to an Int (base 10) to do math on it.  Thanks JavaScript.
  const lastSstIndex = parseInt(lastSstIndexString, 10)
  const nextSstIndex = lastSstIndex + 1

  // E.g. 1 becomes '0001' and 123 becomes '0123'
  const nextSstIndexPaddedString = nextSstIndex.toString().padStart(4, '0')

  const nextSstFileName = `sorted_string_table_${nextSstIndexPaddedString}.json`
  return nextSstFileName
}

This is a private helper method that generates the next filename needed to flush the buffer to a new SST file. Essentially, it looks for the last SST file created, increments the index from the filename, and returns a new filename with the new index in it.

The first time this method is called, when the buffer is flushed for the first time, it’s going to return a default filename with a starting index of 1: sorted_string_table_0001.json. The next call to _generateNextSstFileName() will return sorted_string_table_0002.json, and so on. The index numbers are padded with 0 to ensure all the filenames are the same length and sort appropriately.
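
Here's an illustrative trace of successive calls:

// dbPath contains no SST files yet
keyValueStore._generateNextSstFileName() // => 'sorted_string_table_0001.json'

// dbPath contains sorted_string_table_0001.json
keyValueStore._generateNextSstFileName() // => 'sorted_string_table_0002.json'

// dbPath contains files 0001 through 0009
keyValueStore._generateNextSstFileName() // => 'sorted_string_table_0010.json'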

get(key)

get(key) {
  // First, check the buffer for the newest entry with the key.
  const latestBufferEntryValue = this._findLatestBufferEntryValue(key)

  if (latestBufferEntryValue !== undefined) {
    // The key was found in an entry in the buffer, so we're done.
    return latestBufferEntryValue
  }

  // The key wasn't found in the buffer, so now we search the SST files.
  const sstFileNames = shell.ls(this.dbPath).filter(fileName => SST_FILE_NAME_REGEXP.test(fileName))

  if (sstFileNames.length === 0) {
    // If there are no SST files, the key can't exist.
    return undefined
  }

  // We want to search the newest SSTs first so that we get the newest entry for the key.
  sstFileNames.reverse()

  // Search through the SST files, newest to oldest.
  for (const sstFileName of sstFileNames) {
    // Parse the SST file into an array of entries.  It's the same structure as the buffer, but it's sorted.
    const entries = this._loadEntriesFromSstFile(sstFileName)
    const entryValue = this._findEntryValue(key, entries)

    if (entryValue !== undefined) {
      // The key was found in an entry in the current SST file, so we're done.
      return entryValue
    }
  }

  // The key was not found.
  return undefined
}

The get() method is the most complex and has a few things going on, some of which are delegated to private helper methods. Our goal is to find the newest entry matching the given key, so it’s important that we check the buffer first. If we find an entry with the given key in the buffer, we’re done.

If there isn’t an entry in the buffer with the given key, we need to then search the SST files for one. The important thing here is that we search the SST files from newest to oldest (highest index to lowest index) in case multiple files contain an entry for the key.
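
For example, given an illustrative listing of two SST files:

const sstFileNames = ['sorted_string_table_0001.json', 'sorted_string_table_0002.json']
sstFileNames.reverse()
// => ['sorted_string_table_0002.json', 'sorted_string_table_0001.json']
// We now search 0002 (the newest) before 0001 (the oldest).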

Keep in mind that each SST file can contain at most one entry for a given key because of the de-duplication step we perform in the flush() method.

After we have a list of sorted SST file names, we iterate through each one and delegate to other private helper methods the work of loading the SST file content and searching its entries. If we find an entry with the given key, we can immediately return its value and we’re done.

Finally, if none of our work yielded an entry with the key, we return undefined, indicating that it was not found.

Next we’ll step through the private helper methods for get() in the order they are used.

_findLatestBufferEntryValue(key)

_findLatestBufferEntryValue(key) {
  // Search the entries from most recent to oldest.
  for (let i = this.buffer.length - 1; i >= 0; i--) {
    if (this.buffer[i][KEY_INDEX] === key) {
      // We found the entry with the key in the buffer, so we're done.
      return this.buffer[i][IS_DELETED_INDEX]
        ? undefined // The isDeleted flag is set to true.
        : this.buffer[i][VALUE_INDEX] // Return the value for the key.
    }
  }

  // The key was not found in the buffer.
  return undefined
}

The first thing get() needs to do is search the in-memory buffer for an entry with the given key and return the newest if there are any matches.

Since entries are appended to the buffer, we have to iterate through it in reverse order to ensure we look at the newest ones first. Note that we can’t use reverse() here without making a copy of the buffer, since JavaScript’s reverse() method modifies arrays in-place. So we just use a slightly tweaked for loop.
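
If you prefer a more functional style, a copying alternative would look like the following sketch. It allocates a new array on every lookup, which is why we stick with the for loop:

// Copy the buffer, reverse the copy, and take the first (i.e. newest) matching entry.
const latestEntry = [...this.buffer].reverse().find(entry => entry[KEY_INDEX] === key)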

If we do find an entry with the given key, we have to be careful not to just blindly return its value. The entry needs to be checked to see if it was deleted, and if so, we should return undefined to indicate that the key no longer has a value. Since JSON doesn’t support the value undefined, we’ve been putting null in the value for deleted keys and have to explicitly return undefined in that case.
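
For example, with the buffer from the flush() walkthrough above still in memory (and assuming no SST file also contains these keys):

keyValueStore.get('test-1') // => 'new value', from the newest buffer entry for the key
keyValueStore.get('test-3') // => undefined, because its newest entry, ['test-3', null, true], has isDeleted set to true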

_loadEntriesFromSstFile(sstFileName)

_loadEntriesFromSstFile(sstFileName) {
  // readFileSync returns a Buffer object that represents binary data.
  const buffer = fs.readFileSync(path.resolve(this.dbPath, sstFileName))

  // Convert the Buffer's binary data to a string so we can split it into lines.
  const bufferString = buffer.toString()

  // Split the buffer into lines, each representing an entry in JSON format.
  const lines = bufferString.trim().split('\n')

  // Parse the JSON in each line into an array representing an entry.
  const entries = lines.map(jsonLine => JSON.parse(jsonLine))
  return entries
}

This method deserializes the SST file written by the flush() method. Recall that SST files contain multiple JSON strings, one on each line. This makes it a little bit more complex to deserialize since we have to process each line individually.
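
For example, assuming the de-duplicated buffer from the flush() walkthrough above was written to sorted_string_table_0001.json, loading it back would yield:

keyValueStore._loadEntriesFromSstFile('sorted_string_table_0001.json')
// => [
//      ['000000000', '0 no', false],
//      ['abc', 'asdf', false],
//      ['defg', 'foo', false],
//      ['test-1', 'new value', false],
//      ['test-2', 'value-2', false],
//      ['test-3', null, true]
//    ]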

_findEntryValue(key, entries)

_findEntryValue(key, entries) {
  let entry = undefined
  let left = 0
  let right = entries.length - 1

  while (left <= right) {
    const mid = left + Math.floor((right - left) / 2)

    // We found the key.
    if (entries[mid][KEY_INDEX] === key) {
      entry = entries[mid]
      break
    }

    if (entries[mid][KEY_INDEX] > key) {
      // The key might exist in an entry before this entry.
      right = mid - 1
    } else {
      // The key might exist in an entry after this entry.
      left = mid + 1
    }
  }

  if (entry) {
    // We found the entry with the key in the sst file, so we're done.
    return entry[IS_DELETED_INDEX]
      ? undefined // The isDeleted flag is set to true.
      : entry[VALUE_INDEX] // Return the value for the key.
  }

  // The key was not found in the given entries.
  return undefined
}

This method consists mostly of a binary search for a specific key in the given entries. It’s a bit inefficient since we have to load the entire SST into memory to get its entries, but we’ll work on that more later.

Binary search is a fundamental algorithm in computer science. It allows you to search a sorted set of items for a specific item much more quickly than checking each item sequentially. The full details are outside the scope of this section, but binary search is covered in any introductory algorithms resource.
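
To make the mechanics concrete, here's a trace of _findEntryValue() searching for 'test-1' in the six entries from our example SST file above (indices 0 through 5):

// entries by key: 000000000 (0), abc (1), defg (2), test-1 (3), test-2 (4), test-3 (5)
// left=0, right=5 -> mid=2, 'defg'   < 'test-1', so left becomes 3
// left=3, right=5 -> mid=4, 'test-2' > 'test-1', so right becomes 3
// left=3, right=3 -> mid=3, 'test-1' found after only three comparisons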

Now it’s time to use our benchmarks to look at the performance and compare it to the previous implementations.

Benchmarks

We need to make a few small tweaks to our benchmarks to support the new maxBufferLength parameter and to do a final flush of the buffer at the end of each benchmark. Make the following changes to src/benchmarks.js (a combined sketch follows the list):

  • Update maxItems to 5000 so we have fewer duplicate keys.
  • Update the KeyValueStore constructor call to match: new KeyValueStore({ dbPath, maxBufferLength: 10000 })
  • After the line .on('complete', function() {, add a new line containing keyValueStore.flush()
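
Taken together, the relevant parts of src/benchmarks.js would look roughly like this after the changes (a sketch; the surrounding code follows the existing file and may differ slightly):

const maxItems = 5000 // fewer duplicate keys across benchmark iterations

keyValueStore = new KeyValueStore({ dbPath, maxBufferLength: 10000 })

// ...
.on('complete', function() {
  keyValueStore.flush() // write any remaining buffered entries to disk
  // ... existing completion handling ...
})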

Now you can run the benchmarks with the command npm run benchmarks. Here's the output when I run the benchmarks against the new implementation on my local machine:

> @learndb/learndb@0.1.1 benchmarks C:\projects\learndb
> node --require @babel/register src/benchmarks.js

Starting RSS memory usage: 83.375 MB
KeyValueStore#set x 391,603 ops/sec ±3.00% (67 runs sampled)
KeyValueStore#get x 129 ops/sec ±3.08% (6 runs sampled)
KeyValueStore#delete x 696,304 ops/sec ±1.21% (58 runs sampled)
Ending RSS memory usage: 96.215 MB
Difference: 12.840 MB

Compare this to the benchmark results for the previous log-structured implementation:

> @learndb/learndb@0.1.1 benchmarks C:\projects\learndb
> node --require @babel/register src/benchmarks.js

Starting RSS memory usage: 63.109 MB
KeyValueStore#set x 2,137 ops/sec ±2.88% (64 runs sampled)
KeyValueStore#get x 62.86 ops/sec ±1.98% (63 runs sampled)
KeyValueStore#delete x 2,352 ops/sec ±0.66% (88 runs sampled)
Ending RSS memory usage: 83.066 MB
Difference: 19.957 MB

And the benchmark results for the simple in-memory implementation:

> @learndb/learndb@0.1.1 benchmarks C:\projects\learndb
> node --require @babel/register src/benchmarks.js

Starting RSS memory usage: 57.676 MB
KeyValueStore#set x 1,774,185 ops/sec ±2.53% (84 runs sampled)
KeyValueStore#get x 11,633,332 ops/sec ±3.46% (83 runs sampled)
KeyValueStore#delete x 11,518,884 ops/sec ±3.51% (85 runs sampled)
Ending RSS memory usage: 82.695 MB
Difference: 25.020 MB

And the benchmark results for the filesystem-as-a-database implementation:

> @learndb/learndb@0.1.1 benchmarks C:\projects\learndb
> node --require @babel/register src/benchmarks.js

Starting RSS memory usage: 47.426 MB
KeyValueStore#set x 1,106 ops/sec ±1.12% (86 runs sampled)
KeyValueStore#get x 9,631 ops/sec ±4.52% (77 runs sampled)
KeyValueStore#delete x 9,730 ops/sec ±4.82% (69 runs sampled)
Ending RSS memory usage: 76.145 MB
Difference: 28.719 MB

So we've increased our set() performance by almost 200x by buffering entries and writing them out in batches. We also doubled our get() performance, but it's still quite slow. That's likely because we read in every SST file and search it until we find a given key; if the key doesn't exist, we end up reading through every single SST file.

In the next section we’ll learn about bloom filters, which will let us skip expensive searches for keys we know don’t exist, both on the global level and for each SST file. In the section after that we’ll learn how to add indexes so we only need to search one part of an SST file to find a key after we know it exists there.

Remember, you can check out the corresponding key-value-store-sst_after branch to see the end result of the changes made in this section.