I am trying to use wget -m <address> to download the contents of an FTP server. A lot of the content is Icelandic, so it contains a bunch of non-ASCII characters that I think are causing issues, as I keep seeing:
Incomplete or invalid multibyte sequence encountered
I have tried adding flags such as --restrict-file-names=nocontrol, but to no avail.
I have also tried using lftp, but it doesn't seem to make any difference.
According to the wget manual:
If you specify ‘nocontrol’, then the escaping of the control characters is also switched off.
That is, nocontrol is actually more permissive than the default. The unusual characters suggest that the encoding is not being handled correctly, so ascii seems to be the best fit for your use case:
The ‘ascii’ mode is used to specify that any bytes whose values are outside the range of ASCII characters (that is, greater than 127) shall be escaped. This can be useful when saving filenames whose encoding does not match the one used locally.
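Applied to the command from the question, that would look roughly like the sketch below. The address ftp://ftp.example.org/ and the function name mirror_ascii are placeholders I made up for illustration, and this is untested against a server with Icelandic file names.

```shell
# Hypothetical placeholder address; substitute the real FTP server.
FTP_URL="ftp://ftp.example.org/"

# Mirror the given URL, asking wget to escape every byte above 127
# in the file names it saves locally.
# mirror_ascii is an illustrative helper name, not part of wget.
mirror_ascii() {
    wget -m --restrict-file-names=ascii "$1"
}

# Usage (commented out so that sourcing this file does not start a download):
# mirror_ascii "$FTP_URL"
```

The files would then be saved under escaped names rather than the original Icelandic ones, so they may need renaming afterwards if the original spelling matters.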
As I am not able to test this myself, please try it and report the result.