
What does Cassandra do during compaction?


I know that Cassandra merges SSTables and row keys, removes tombstones, and so on.

  1. But I am really interested to know how it performs compaction.

  2. As SSTables are immutable, does it copy all the relevant data to a new file, and discard the tombstone-marked data while writing to this new file?

I know what compaction does, but I want to know how it makes this happen.


Solution

  • I hope this thread helps, provided you follow all the posts and comments in it:

    http://comments.gmane.org/gmane.comp.db.cassandra.user/10577

    AFAIK

    Whenever a memtable is flushed from memory to disk, its contents are written sequentially [not updated in place] to a newly created SSTable, sorted by row key.
    SSTables are merged [updated] only during compaction.
    Until then, the read path reads from every SSTable that contains the key you look up and merges the results into the reply.
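To make the read-path merge concrete, here is a minimal sketch. It assumes a simplified, hypothetical model in which each SSTable is a dict of `{row_key: {column: (timestamp, value)}}`; the name `read_row` is made up for illustration, and tombstones are ignored here. It is not Cassandra's actual code or on-disk format.

```python
def read_row(sstables, row_key):
    """Merge every SSTable's fragment of row_key; the newest timestamp wins per column."""
    merged = {}
    for sstable in sstables:
        # An SSTable that does not contain the key contributes nothing.
        for column, (ts, value) in sstable.get(row_key, {}).items():
            if column not in merged or ts > merged[column][0]:
                merged[column] = (ts, value)
    # Strip the timestamps before replying to the client.
    return {column: value for column, (ts, value) in merged.items()}


# Two flushes of the same row: the later one overwrites only "city".
older = {"user1": {"name": (1, "Tamil"), "city": (1, "Chennai")}}
newer = {"user1": {"city": (2, "Bangalore")}}
result = read_row([older, newer], "user1")
```

The more SSTables hold fragments of a row, the more files a read like this has to consult, which is one reason compaction matters.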
    
    There are two types of compaction: minor and major.
    
    Minor compaction is triggered automatically whenever a new SSTable is created.
    It may remove tombstones, but only those older than gc_grace_seconds.
    It compacts SSTables of equal size [initially the memtable flush size] into one when the minor compaction threshold [4 by default] is reached.
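The selection rule for minor compaction can be sketched roughly as follows. This is only an illustration: `pick_buckets` is a hypothetical name, and the 0.5x to 1.5x tolerance used to decide that two SSTables are of "equal" size is an assumption of this sketch, not something stated above.

```python
MIN_THRESHOLD = 4  # the minor compaction threshold mentioned above


def pick_buckets(sizes, min_threshold=MIN_THRESHOLD):
    """Group SSTable sizes into buckets of similar size and
    return only the buckets large enough to be compacted."""
    buckets = []  # each bucket is a list of SSTable sizes
    for size in sorted(sizes):
        for bucket in buckets:
            avg = sum(bucket) / len(bucket)
            # Assumed tolerance: within 50% of the bucket's average size.
            if 0.5 * avg <= size <= 1.5 * avg:
                bucket.append(size)
                break
        else:
            # No existing bucket is close enough in size: start a new one.
            buckets.append([size])
    return [b for b in buckets if len(b) >= min_threshold]


# Four freshly flushed 64 MB SSTables plus one older 256 MB SSTable.
candidates = pick_buckets([64, 64, 64, 64, 256])
```

With these inputs only the four 64 MB SSTables qualify; the 256 MB one waits until three more SSTables of about its size exist.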
    
    Major compaction is triggered manually using nodetool.
    It can be applied to one column family at a time.
    It compacts all the SSTables of a CF into one.
    
    Compaction writes the merged data to a new SSTable and marks the now-unneeded SSTables for deletion. GC takes care of freeing up that space.
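As a rough answer to point 2 of the question, the merge can be pictured like this. The sketch assumes a simplified model: each SSTable is a list of `(row_key, column, timestamp, value)` tuples sorted by `(row_key, column)`, a value of `None` stands for a tombstone, and every SSTable holding a given key takes part in the compaction. The names `compact`, `now` and `gc_grace` are hypothetical; this is not Cassandra's real implementation.

```python
import heapq

TOMBSTONE = None  # hypothetical marker: a cell whose value is None is a delete


def compact(sstables, now, gc_grace):
    """Stream-merge sorted SSTables into one new sorted list.

    The inputs are only read, never modified, which mirrors the fact
    that SSTables are immutable.
    """
    # Order by (row_key, column) and, for the same cell, newest timestamp first.
    merged = heapq.merge(*sstables, key=lambda c: (c[0], c[1], -c[2]))
    output = []
    last_cell = None
    for row_key, column, ts, value in merged:
        if (row_key, column) == last_cell:
            continue  # older version shadowed by a newer write or delete
        last_cell = (row_key, column)
        if value is TOMBSTONE and now - ts > gc_grace:
            continue  # tombstone is past its grace period: drop it as well
        output.append((row_key, column, ts, value))
    return output


old = [("a", "x", 1, "1"), ("b", "y", 1, "2")]
new = [("a", "x", 5, "9"), ("b", "y", 3, TOMBSTONE)]
after_grace = compact([old, new], now=100, gc_grace=10)
within_grace = compact([old, new], now=5, gc_grace=10)
```

In the first call the deleted cell and its tombstone both disappear from the output; in the second the tombstone is still within its grace period, so it is carried over to the new file while the shadowed value is dropped.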
    

    Regards, Tamil