Posts

Showing posts from February, 2023

Data Structure: log structure merge tree(LSM)

Image
Data Structure: log structure merge tree(LSM): LSM Tree is a data structure that restricts your datastore to append only operations, whereas your datastore is simply a log containing totally ordered key-value pairs. The sort order is by a key which can be any byte sequence. The values too can be any arbitrary byte sequence. For example, in Cassandra's data model, a column family (akin to a table in a relational database) contains a row key (the key) and any sequence of arbitrary key values that can vary in cardinality (the value). 1. Fast write 2. Data stored in relational database a. backed by data structure B-tree B-tree: it's optimized for reads updating is relatively more expensive as it involves random I/o, updating multiple pages on the disk this limits how fast a B-tree can ingest data 3. LSM works differently writes are batched in memory as they arrive in a structure called Memtable (memory) 1. Memtable is ordered by object key and is usually implemented as a