I'm trying to download a Polish dictionary. Unfortunately, the existing files contain all inflections (not sure what the proper English word is). I found out that the command
lynx --dump 'https://sjp.pl/slownik/lp.phtml?f_vl=2&page=1' > file.txt
can download a single dictionary webpage. I would then have to somehow extract only the dictionary entries from the block of text, but at least it's a start.
Unfortunately, I'm a Linux noob and don't know how to iterate through all 3067 pages.
Untested, but you should be able to do this quickly and easily with GNU Parallel:
parallel -qk lynx --dump 'https://sjp.pl/slownik/lp.phtml?f_vl=2&page={}' ::: {1..3067} > file.txt
The URL is in single quotes so that your shell does not treat the & as "run in the background", and -q stops the shell that Parallel starts from doing the same. If it doesn't work, try dropping the -q and putting a backslash before the & instead. Sorry, I don't have any way to test at the moment.
The slow way is:
for ((i=1; i<=3067; i++)); do
    lynx --dump "...page=$i"
done > file.txt
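
For reference, here is roughly how the loop could look written out in full. Treat it as a sketch, not something I have run against the site: the URL is the one from the question, while the names page_url, fetch_all and DELAY are made up for illustration, and the one-second pause between requests is my own assumption to keep the load on the server low.

```shell
#!/bin/sh
# Base URL from the question; single quotes keep the shell from interpreting & and ?
BASE='https://sjp.pl/slownik/lp.phtml?f_vl=2&page='

# Assumed polite pause between requests, in seconds (not from the original post)
DELAY=1

# Print the full URL for a given page number
page_url() {
    printf '%s%s\n' "$BASE" "$1"
}

# Dump pages $1 through $2 (inclusive) to standard output, one after another
fetch_all() {
    i=$1
    while [ "$i" -le "$2" ]; do
        lynx --dump "$(page_url "$i")"
        sleep "$DELAY"
        i=$((i + 1))
    done
}
```

After loading these definitions, running fetch_all 1 3067 > file.txt should collect every page into one file. It is worth trying a short range such as fetch_all 1 3 first to check that the output looks right.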