I need to load millions of keys, each with a 1kB value, into a Redis cluster. Assume every value is a string. I know of two options. The first is DEBUG POPULATE, which generates a given number of keys but does not let me set the value size.
127.0.0.1:6379> DEBUG POPULATE 1000000
OK
The second is to call redis-cli from a shell loop, but I don't know how to generate 1kB values:
for i in `seq 1000000`;
do
redis-cli SET key$i val$i ;
done
I am new to this. How can I meet this requirement? I would really appreciate any help.
I tried the solution based on Mark Setchell's answer:
#!/bin/bash
# Generate around 32kB (+ around 33% base64 overhead) of random characters
stuff=$(head -c 32000 /dev/urandom | base64)
# Set 100 keys to 1kB strings, e.g. SET key32 A87H34..PHNQZ
for ((i=0;i<100;i++)) ; do
echo SET key$i ${stuff:RANDOM:1024}
done | redis-cli -p 6371 -c --pipe
Running the code above produces the following errors:
sh fake_data_test.sh
All data transferred. Waiting for the last reply...
MOVED 13252 172.20.0.33:6379
MOVED 9189 172.20.0.32:6379
ERR syntax error
ERR syntax error
MOVED 13120 172.20.0.33:6379
MOVED 9057 172.20.0.32:6379
ERR syntax error
ERR syntax error
...
ERR syntax error
Last reply received from server.
errors: 100, replies: 100
I then wondered whether it was a value formatting issue, so I put the value in double quotes: echo SET key$i "${stuff:RANDOM:1024}"
sh fake_data_test.sh
All data transferred. Waiting for the last reply...
MOVED 13252 172.20.0.33:6379
ERR unknown command `kpshETtdvDBpL1BYimJl3FkpuJMom/heyj02qJwUGUCQvSZODHXHwNGodfVyIR6sWSv8agjlGMtl`, with args beginning with:
...
ERR unknown command `UmBAaiwqgB25mSDhsK7qrveXhJV0cJCBRaz`, with args beginning with:
MOVED 9189 172.20.0.32:6379
ERR unknown command ERR unknown command `gRolxGVLUVbnU5I/ykaXPCA+0Nev`, with args beginning with:
Last reply received from server.
errors: 1397, replies: 1428
for ((i=0;i<100;i++)) ; do
redis-cli -p 6371 -c SET key$i "${stuff:RANDOM:1024}"
done
// All output ok
I don't know whether I'm using --pipe incorrectly.
Note: the OS is CentOS 7. The Redis cluster was created with docker-compose, using the redis:4.0.11-alpine image.
Updated Answer
If you are doing this in order to just generate test data, there's another much faster way. You could:
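empty the database with FLUSHALL,
populate it the way you want (per my original answer),
write it to disk with SAVE,
find out where the dump file was written with CONFIG GET DIR.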
So, essentially, you empty Redis and set it up how you want it (per my original answer) and back it up. Then, before each test, just replace the main database with the backup file and restart.
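A minimal sketch of the "replace the main database with the backup file" step. It assumes the snapshot uses the default file name dump.rdb, that the data directory is the one reported by CONFIG GET DIR, and that the server is stopped while the file is swapped. The function name restore_snapshot and the paths in the usage note are hypothetical.

```shell
#!/bin/bash
# Hypothetical helper: copy a saved snapshot over the live RDB file.
# Assumes the Redis server is stopped while the file is replaced.
#   $1 = backup file
#   $2 = Redis data directory (as reported by CONFIG GET DIR)
#   $3 = snapshot file name (optional, defaults to dump.rdb)
restore_snapshot() {
    local backup=$1 datadir=$2 name=${3:-dump.rdb}
    [ -f "$backup" ]  || { echo "missing backup: $backup" >&2; return 1; }
    [ -d "$datadir" ] || { echo "missing data dir: $datadir" >&2; return 1; }
    cp -- "$backup" "$datadir/$name"
}
```

You would stop the server (for example with redis-cli SHUTDOWN NOSAVE, so it does not overwrite the file on exit), run something like restore_snapshot /backups/test.rdb /data, and then start the server again however you normally do.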
Original Answer
There are probably better ways, but (before my morning coffee) here's a method...
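First, base64-encode 40kB of random bytes near the start of your script, stripping the newlines that GNU base64 inserts every 76 characters so the text is one unbroken run:
stuff=$(head -c 40000 /dev/urandom | base64 | tr -d '\n')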
Now, inside your loop, go to a random offset of 0..32767 in the text and take the following 1024 bytes:
val=${stuff:RANDOM:1024}
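Here is a small sketch that puts the two steps together and checks the result. It assumes GNU base64, which wraps its output at 76 columns by default, so the newlines are stripped to keep every slice free of whitespace. The three-iteration loop is only for illustration.

```shell
#!/bin/bash
# Build the pool of random text once, outside the loop.
# 40000 random bytes become about 53,000 base64 characters, so any offset
# in 0..32767 still leaves at least 1024 characters to slice.
stuff=$(head -c 40000 /dev/urandom | base64 | tr -d '\n')

# Slice with parameter expansion only: no subshell, no external command.
for ((i = 0; i < 3; i++)); do
    val=${stuff:RANDOM:1024}
    echo "key$i -> ${#val} characters"
done
```

Each line it prints should report 1024 characters.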
In case you wonder, I am trying to avoid the expensive creation of processes inside your big loop. The line val=${...} is a bash "internal" (a parameter expansion) that doesn't create a new process.
Note that if you take a million random samples starting at offsets 0..32767, there will inevitably be repetitions. You could reduce this by taking multiple smaller chunks from different offsets and appending them together. Or you could guarantee unique values by prefixing each value with a sequential number and making the strings slightly over 1024 bytes.
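As a rough sketch of the first idea, the value can be assembled from four 256-character slices taken at independent random offsets. The helper name make_value is hypothetical, and the newline stripping assumes GNU base64. This makes exact repeats far less likely, though not impossible.

```shell
#!/bin/bash
# Pool of random base64 text; newlines stripped so slices hold no whitespace.
stuff=$(head -c 40000 /dev/urandom | base64 | tr -d '\n')

# Hypothetical helper: build a 1024-character value from four 256-character
# slices. Each RANDOM is evaluated separately, giving four different offsets.
# The result is left in $val, so no subshell is needed per call.
make_value() {
    val=${stuff:RANDOM:256}${stuff:RANDOM:256}${stuff:RANDOM:256}${stuff:RANDOM:256}
}

make_value
echo "${#val}"
```

Inside the big loop you would call make_value and then use "$val" in place of the single slice.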
As an aside, I think you'd be better off pipelining some of this, or using Python or some form of bulk loading, to speed it up.
For example, this code does 100,000 insertions of 1024-byte strings in around 49 seconds:
#!/bin/bash
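# Generate around 32kB (+ around 33% base64 overhead) of random characters.
# GNU base64 wraps its output at 76 columns, so strip the newlines;
# otherwise a value can break across lines and corrupt the command.
stuff=$(head -c 32000 /dev/urandom | base64 | tr -d '\n')
# Set 100,000 keys to 1kB strings, e.g. SET key32 A87H34..PHNQZ
for ((i=0;i<100000;i++)) ; do
    echo SET key$i "${stuff:RANDOM:1024}"
done | redis-cli --pipe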
If you want to ensure the values are unique, and don't mind making each value just over 1024 bytes, replace the line in the loop with:
echo SET key$i "${i}-${stuff:RANDOM:1024}"
If you require exactly 1024 unique bytes, you can use the following at a 10% time penalty:
# Generate value: 8 digits of sequence number, a dash and 1015 random characters
printf -v val "%08d-%s" "$i" "${stuff:RANDOM:1015}"
echo SET key$i "$val"
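The loops above feed plain-text commands to redis-cli --pipe. A sketch of the alternative is to emit the Redis wire protocol directly, where every argument is length-prefixed, so spaces or line breaks inside a value cannot split the command. The helper names resp_set and gen_commands are hypothetical, and the length calculation assumes ASCII keys and values.

```shell
#!/bin/bash
# Hypothetical helper: print one SET command in the Redis wire protocol.
# The $<n> prefixes are byte lengths, so this sketch assumes ASCII keys and
# values (true for base64 text), where ${#var} equals the byte count.
resp_set() {
    local key=$1 val=$2
    printf '*3\r\n$3\r\nSET\r\n$%d\r\n%s\r\n$%d\r\n%s\r\n' \
        "${#key}" "$key" "${#val}" "$val"
}

# Hypothetical generator: print $1 SET commands with 1024-character values.
gen_commands() {
    local n=$1 i stuff
    stuff=$(head -c 32000 /dev/urandom | base64 | tr -d '\n')
    for ((i = 0; i < n; i++)); do
        resp_set "key$i" "${stuff:RANDOM:1024}"
    done
}
```

You could then run gen_commands 100000 | redis-cli --pipe. Judging by the MOVED replies in the question, --pipe seems to send everything to the single node it connects to without following cluster redirections, so against a cluster this form would presumably hit the same MOVED errors for keys that node does not own.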