databasedata-structuresb-treestorage-enginesb-tree-index

How do I index variable length strings, integers, binaries in b-tree?


I am creating a database storage engine (for fun).

I know it uses b-trees (and stuff), but in all of b-tree base examples, it shows that we need to sort keys and then store it for indexing, not for integers.

I can understand sorting, but how to do it for strings, if I have string as a key for indexing?

Ex : I want to index all email addresses in btree , how would I do that ??


Solution

  • It does not matter, what type of data you are sorting. For a B-Tree you only need a comparator. The first value you put into your db is the root. The second value gets compared to the root. If smaller, then continue down left, else right. Inserting new values often requires to restructure your tree.

    A comparator for a string could use the length of the string or compare it alphabetically or count the dots in an email behind the at-sign.