nosql tokyo-cabinet tokyo-tyrant

How to merge Tokyo Cabinet hash-table DBs (.tch files) with no duplicate keys


Is this possible? I couldn't find an answer anywhere.

Basically I'm looking at a setup where I have multiple workers (boxes) which must all eventually store their data in a single Tokyo Cabinet index/db. (I'm using Tokyo Tyrant over the memcached protocol, btw. Not that it matters, but still.)

I tried pushing the data directly to another box which runs Tokyo Tyrant, but TT can't keep up after a while. Inserts get really slow, and workers sit idle waiting to offload data to the TT server. I tried all sorts of things to improve performance (more RAM, RAID configs, multiple TT servers on the box, etc.), but the major drop in insert throughput (inserts/sec) comes sooner or later.

Now I'm looking at the option of letting each worker store its own data in a local Tokyo Tyrant db and merging the dbs of all workers afterwards (no duplicate keys across workers, guaranteed).

Any help is appreciated, as are suggestions for other ways to distribute load on TT.

Btw, the config for TT is: #bnum=20000000#opts=l#xmsiz=162000000. I set bnum to the upper bound of the number of items expected: 20 million.

Thanks, Geert-Jan


Solution

  • Check out kchashmgr, the hash database manager that ships with Kyoto Cabinet (Tokyo Cabinet's successor). You could dump each database out into a data file and then load those files into a new .kch file created with a bigger bnum.
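
The dump-and-load idea comes down to copying every record from each worker's database into one fresh database. Here is a minimal sketch of that loop in Python. It is not tied to any particular Tokyo Cabinet or Kyoto Cabinet binding: `merge_stores` is a hypothetical helper, and plain dicts stand in for the per-worker databases. It assumes each source can be iterated as (key, value) pairs and that the destination behaves like a mapping.

```python
from typing import Dict, Iterable, MutableMapping, Tuple

Record = Tuple[bytes, bytes]


def merge_stores(sources: Iterable[Iterable[Record]],
                 dest: MutableMapping[bytes, bytes]) -> int:
    """Copy every (key, value) record from each source into dest.

    Raises KeyError on the first key that is already in dest, because the
    workers are supposed to produce disjoint key sets. Returns the number
    of records copied.
    """
    copied = 0
    for source in sources:
        for key, value in source:
            if key in dest:
                raise KeyError(f"duplicate key across workers: {key!r}")
            dest[key] = value
            copied += 1
    return copied


if __name__ == "__main__":
    # Plain dicts stand in for the per-worker databases (an assumption
    # made so the sketch runs without any database library installed).
    worker_a = {b"a1": b"x", b"a2": b"y"}
    worker_b = {b"b1": b"z"}
    merged: Dict[bytes, bytes] = {}
    count = merge_stores([worker_a.items(), worker_b.items()], merged)
    print(count, sorted(merged))
```

With real files, each source would be whatever key/value iteration your binding or dump format gives you, and the destination would be the new database, created up front with a bnum sized for the combined record count. The duplicate check costs one extra lookup per record, so you can drop it if you fully trust the no-duplicates guarantee.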