I want to generate a list of strings via a bash script.
The desired output should be something like this:
0.00.0
0.00.00
0.00.01
...
1.26.0
1.26.00
1.26.01
1.26.02
...
0.00.0_a
...
0.00.0_z
0.00.00_a
...
0.00.01_a
...
9.99.99_z
...
0.00.0_aa
...
0.00.00_aa
...
1.26.99_zz
...
9.99.99_zz
I found this:
printf "%03d\n" {0..999}
But the output of this script is:
000
001
002
...
997
998
999
So, how to modify this script to get my desired output?
Concatenate multiple brace expansions to build their cartesian product. That is, to generate 00 01 ... 99 you can write {0..9}{0..9}. Since bash 4.0 you can also write {00..99}. This only works for numbers. For letters, you still have to write {a..z}{a..z}.
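As a small illustration of the cartesian product, here is a sketch with shortened ranges. It assumes bash 4.0 or newer for the zero-padded form, and the variable names are only for illustration:

```bash
#!/usr/bin/env bash
# Two adjacent brace expansions are combined pairwise (cartesian product).
digits=$(echo {0..1}{0..2})     # 00 01 02 10 11 12
# Zero-padded ranges (bash >= 4.0) produce padded numbers directly.
padded=$(echo {00..03})         # 00 01 02 03
# Letters have no padded form, so the product is written out explicitly.
letters=$(echo {a..b}{a..b})    # aa ab ba bb
printf '%s\n' "$digits" "$padded" "$letters"
```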
For the single 0 in 0 00 01 02 ... 99 you can nest brace expansions like so: {0,{00..99}}. Same goes for the missing letters where we use the empty string: {,{a..z}}.
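The nesting can be tried out on shortened ranges first. This is only a sketch; the x prefix and the variable names are assumptions added so that the empty alternative stays visible in the output:

```bash
#!/usr/bin/env bash
# A nested expansion mixes a literal with a range: the single 0, then 00..03.
last_field=$(echo {0,{00..03}})   # 0 00 01 02 03
# An empty alternative makes the letter optional: nothing, a, or b.
# The x prefix is only there so the empty case shows up as a word of its own.
optional=$(echo x{,{a..b}})       # x xa xb
printf '%s\n' "$last_field" "$optional"
```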
WARNING: The following commands take up a lot of memory. The output might be "only" around 750 MB on disk, but the running bash processes used more than 16 GB of memory for me. If you have insufficient memory or swap, the command might just get killed (if you are lucky) or your system might freeze, requiring a hard reboot.
For a better solution, see the end of this answer.
Now let's put everything together:
printf %s\\n {0..9}.{00..99}.{0,{00..99}}{,_{,{a..z}}{a..z}} > outputFile
This brace expansion generates 71'003'000 lines. Printing them to stdout would take ages, so we redirect the output to the file outputFile instead. You can confirm that this generates at least the lines from your example by running grep -Fxf exampleAsAFile outputFile. Alternatively, run this simplified command, where we replaced 0..9 by 0..1 and a..z by a..b, then inspect the result manually:
printf %s\\n {0..1}.{0..1}{0..1}.{0,{0..1}{0..1}}{,_{,{a..b}}{a..b}}
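The line count can also be checked by arithmetic, since a concatenation of brace expansions multiplies the number of alternatives. The following sketch does this for the simplified command and, without generating anything, for the full one. The variable names are illustrative:

```bash
#!/usr/bin/env bash
# Expected count for the simplified command: 2 * 4 * 5 * 7 = 280
#   {0..1}              -> 2
#   {0..1}{0..1}        -> 2 * 2 = 4
#   {0,{0..1}{0..1}}    -> 1 + 4 = 5
#   {,_{,{a..b}}{a..b}} -> 1 + (1 + 2) * 2 = 7
expected_small=$(( 2 * 4 * 5 * 7 ))
actual_small=$(printf '%s\n' {0..1}.{0..1}{0..1}.{0,{0..1}{0..1}}{,_{,{a..b}}{a..b}} | wc -l)
# The same arithmetic for the full command (0..9, 00..99, a..z):
expected_full=$(( 10 * 100 * 101 * (1 + 27 * 26) ))
echo "small: expected $expected_small, got $(( actual_small ))"
echo "full:  expected $expected_full lines"
```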
Even though we just generated all the required lines, the order is different from your example. To adapt the order you could run the result through a Schwartzian transform sort, but that would be a waste of resources. Instead you can use multiple brace expansions such that everything is generated in the right order:
printf %s\\n \
{0..9}.{00..99}.{0,{00..99}} \
{0..9}.{00..99}.{0,{00..99}}_{a..z} \
{0..9}.{00..99}.{0,{00..99}}_{a..z}{a..z} \
> outputFile
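To convince yourself that both variants produce the same lines and differ only in order, you can compare them on the reduced ranges (0..1 and a..b). This is a sketch for checking, not part of the solution itself, and the variable names are made up:

```bash
#!/usr/bin/env bash
# Single expansion: suffix variants are interleaved with the plain strings.
unordered=$(printf '%s\n' {0..1}.{0..1}{0..1}.{0,{0..1}{0..1}}{,_{,{a..b}}{a..b}})
# Three expansions: plain strings first, then one letter, then two letters.
ordered=$(printf '%s\n' \
  {0..1}.{0..1}{0..1}.{0,{0..1}{0..1}} \
  {0..1}.{0..1}{0..1}.{0,{0..1}{0..1}}_{a..b} \
  {0..1}.{0..1}{0..1}.{0,{0..1}{0..1}}_{a..b}{a..b})
# After sorting, both lists should be identical.
if [ "$(sort <<< "$unordered")" = "$(sort <<< "$ordered")" ]; then
  same_set=yes
else
  same_set=no
fi
echo "same set of lines: $same_set"
echo "first lines, single expansion:"; head -n 3 <<< "$unordered"
echo "first lines, three expansions:"; head -n 3 <<< "$ordered"
```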
To reduce the memory footprint you can split off a prefix into a for loop. Where exactly to split depends on your preference and system. Fewer braces in the loop means more memory but faster execution (as long as you have enough memory). More braces in the loop means slower execution but less memory (as long as the prefix is shorter than half of the brace expansion; making it longer will have only negative effects).
# use only if order doesn't matter.
# takes 1m30s and 24 MB of memory
for prefix in {0..9}.{00..99}; do
  printf "$prefix.%s\n" {0,{00..99}}{,_{,{a..z}}{a..z}}
done > outputFile
or
# takes 2m and 24 MB of memory
# part1..part3 must not exist beforehand, because >> appends to them
for prefix in {0..9}.{00..99}; do
  printf "$prefix.%s\n" {0,{00..99}} >> part1
  printf "$prefix.%s\n" {0,{00..99}}_{a..z} >> part2
  printf "$prefix.%s\n" {0,{00..99}}_{a..z}{a..z} >> part3
done
cat part{1..3} > outputFile
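Before running the full loop, it may be worth checking on the reduced ranges that the loop version yields exactly the same lines, in the same order, as the single ordered command. The sketch below does that. The temporary directory from mktemp and the variable names are my own additions, chosen so that leftover part files from an earlier run cannot interfere:

```bash
#!/usr/bin/env bash
# Reduced ranges (0..1, a..b); part files live in a fresh temporary directory.
tmpdir=$(mktemp -d)
for prefix in {0..1}.{0..1}{0..1}; do
  printf "$prefix.%s\n" {0,{0..1}{0..1}}              >> "$tmpdir/part1"
  printf "$prefix.%s\n" {0,{0..1}{0..1}}_{a..b}       >> "$tmpdir/part2"
  printf "$prefix.%s\n" {0,{0..1}{0..1}}_{a..b}{a..b} >> "$tmpdir/part3"
done
looped=$(cat "$tmpdir"/part{1..3})
# The ordered single command, for comparison.
direct=$(printf '%s\n' \
  {0..1}.{0..1}{0..1}.{0,{0..1}{0..1}} \
  {0..1}.{0..1}{0..1}.{0,{0..1}{0..1}}_{a..b} \
  {0..1}.{0..1}{0..1}.{0,{0..1}{0..1}}_{a..b}{a..b})
rm -r "$tmpdir"
if [ "$looped" = "$direct" ]; then echo "identical"; else echo "different"; fi
```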