Assuming that I created a file with 10 lines:
yes "foo bar" | head -n 10 > foobar.txt
[out]:
foo bar
foo bar
foo bar
foo bar
foo bar
foo bar
foo bar
foo bar
foo bar
foo bar
And I want to randomly replace 30% of the lines with empty lines, so that it looks like this:
foo bar
foo bar
foo bar
foo bar
foo bar
foo bar
I could technically use Python to generate a random number and do the replacement in a shell script, ratio-foobar.sh:
#!/bin/bash
ratio=$1
numlines=$2
coinflip() {
    randnum=$(bc -l <<< $(python -S -c "import random; print(int(random.random() * 100))"))
    if [ $randnum -gt $ratio ]
    then
        return 1
    else
        return 0
    fi
}
for i in $(seq 1 $numlines);
do
    if coinflip
    then
        echo "foo bar"
    else
        echo ""
    fi
done
Usage:
bash ratio-foobar.sh 33 10 > foobar.txt
[out]:
foo bar
foo bar
foo bar
foo bar
foo bar
foo bar
But is there a simpler way to generate the empty lines (maybe with yes) a certain percentage of the time?
I tried to use @renaud-pacalet's solution, but reading a float into the shell turned out to be messy and bc got involved again. Somehow this didn't work:
ratio=$1
lines=$2
ratio=$(echo "scale=3; $ratio/100" | bc)
yes "foo bar" | head -n $2 | awk 'BEGIN {srand()} {print rand() < $ratio ? "" : $0}' > output
cat output
Usage: bash flip.sh 33 10 for 33% and 10 lines of foo bar.
But when the ratio is hard-coded, it worked:
ratio=$1
lines=$2
ratio=$(echo "scale=3; $ratio/100" | bc)
yes "foo bar" | head -n $2 | awk 'BEGIN {srand()} {print rand() < 0.3? "" : $0}' > output
cat output
Is there a way to read the percentage from the arguments and make the yes | head | awk pipeline work properly?
As you can apparently use Python, you could use only that:
from random import randrange as rnd

def foo(n, r, s):
    for i in range(n):
        print("" if rnd(100) < r else s)

foo(10, 33, "foo bar")
Where n is the number of lines to print, r is the percentage of empty lines and s is the string to print. See the argparse module if you want to pass arguments to a Python script.
You could do the same with any POSIX awk (tested with GNU awk):
awk -v n=10 -v r=33 -v s="foo bar" '
END {srand(); for(i=1; i<=n; i++) print rand() < r/100 ? "" : s}' /dev/null
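As for why the flip.sh attempt from the question misbehaves: as far as I can tell, the awk program is in single quotes, so the shell never expands $ratio. awk then sees an unset variable ratio, reads $ratio as $0 (the whole line), and compares rand() with the string "foo bar", which is true every time, so every line comes out empty. A minimal sketch of the same pipeline with the percentage passed through -v, which also makes the bc step unnecessary (the function name flip and the default values are only placeholders for illustration):

```shell
#!/bin/bash
# flip.sh (sketch): print LINES copies of "foo bar", blanking each one
# with probability PERCENT/100. "flip" is a hypothetical helper name.
flip() {
    local percent=$1 lines=$2
    yes "foo bar" | head -n "$lines" |
        # -v hands the shell value to awk as the awk variable r;
        # the division by 100 happens inside awk, so no bc is needed
        awk -v r="$percent" 'BEGIN {srand()} {print rand() < r/100 ? "" : $0}'
}

# Defaults (33% and 10 lines) are assumptions for running without arguments
flip "${1:-33}" "${2:-10}"
```

Usage is unchanged: bash flip.sh 33 10 > output. Note that each line is decided independently, so the share of empty lines is only approximately the requested percentage on a short file.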
Or, with plain bash:
n=10; r=33; s="foo bar"
for (( i=1; i<=n; i++ )) ; do
(( SRANDOM % 100 < r )) && echo "" || echo "$s"
done
The SRANDOM special variable (available since bash 5.1) expands to a 32-bit random number. You may therefore not get exactly 33% of empty lines (2 to the power of 32 is not a multiple of 100), but the difference should be very small.
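To put a number on "very small", here is a rough sketch that counts how many of the 2^32 possible values leave a remainder below r when divided by 100. It assumes that SRANDOM is uniform over all 32-bit values, which I have not verified; the variable names are only illustrative.

```shell
#!/bin/bash
# Sketch: size of the modulo bias of "SRANDOM % 100 < r".
# Assumption: every 32-bit value is equally likely.
r=33
total=$(( 1 << 32 ))      # number of distinct 32-bit values
full=$(( total / 100 ))   # complete runs of the remainders 0..99
rest=$(( total % 100 ))   # leftover values, covering remainders 0..rest-1
extra=$(( r < rest ? r : rest ))
hits=$(( full * r + extra ))   # values for which an empty line is printed

echo "$hits of $total values give an empty line"
awk -v h="$hits" -v t="$total" 'BEGIN {printf "effective ratio: %.12f\n", h / t}'
```

Under that assumption, 1417339209 of the 4294967296 values print an empty line, so the effective ratio exceeds 33% by roughly 3 parts in 10 billion, far less than the run-to-run variation on any realistic file.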