I have several Windows applications that read a file path from command-line arguments. Everything works flawlessly, except when passing paths with non-ANSI characters. I expected this, but I don't know how to deal with it. It is probably an entry-level question, but it is driving me crazy.
My current code looks like:
#include <boost/program_options.hpp>
#include <iostream>
#include <string>

int main(int argc, char* argv[]) {
    namespace po = boost::program_options;
    po::options_description po_desc("Allowed options");
    po_desc.add_options()
        ("file", po::value<std::string>(), "path to file");
    po::variables_map po_vm;
    try {
        po::store(po::parse_command_line(argc, argv, po_desc), po_vm);
        po::notify(po_vm);
    } catch (...) {
        // Parsing failed: print the usage text and exit with an error code.
        std::cout << po_desc << std::endl;
        return 1;
    }
    const std::string file_path = po_vm["file"].as<std::string>();
    // ...
}
I've found that if I change the type of file_path from std::string to boost::filesystem::path, some paths are now read. I don't know exactly why, but I suspect it has to do with a conversion from the Latin-1 charset.
For example, given the following files:
malaga.txt
málaga.txt
mąlaga.txt
The first is always read correctly, while the second one fails when using std::string file_path but not boost::filesystem::path file_path. The third one always fails.
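One plausible reading of these three cases, offered as a sketch rather than a diagnosis: a narrow char* argv[] on Windows arrives in the process's ANSI code page, and on a Western European system (Windows-1252, which is close to Latin-1) á has a single-byte encoding while ą does not. The helper fits_in_latin1 below is hypothetical, and the code page actually in use is an assumption; it only checks whether every code point is at most U+00FF.

```cpp
#include <string>

// Hypothetical helper: true when every code point of `name` is <= U+00FF,
// i.e. when the name could be stored one byte per character in Latin-1.
// Windows-1252 differs from Latin-1 in the 0x80-0x9F range, so this is
// only an approximation of what an ANSI conversion can represent.
bool fits_in_latin1(const std::u32string& name) {
    for (char32_t code_point : name) {
        if (code_point > 0xFF) {
            return false;
        }
    }
    return true;
}

// The three file names from the example, written with escapes so the
// encoding of the source file does not matter.
const std::u32string plain    = U"malaga.txt";       // ASCII only
const std::u32string accented = U"m\u00E1laga.txt";  // U+00E1, a with acute
const std::u32string ogonek   = U"m\u0105laga.txt";  // U+0105, a with ogonek
```

Under that assumption, only the third name loses information before the program ever sees it, which would be consistent with it failing regardless of the type used for file_path.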
I've tried switching the main function to int main(int argc, wchar_t* argv[]) and using std::wstring for the argument type, but that is not compatible with the boost::program_options parser.
How can I correctly read such Unicode file names?
Thanks to everyone who contributed comments; with their help I managed to solve my problem.
Here is the fixed code:
#include <boost/filesystem.hpp>
#include <boost/program_options.hpp>
#include <iostream>
#include <string>

int wmain(int argc, wchar_t* argv[]) { // <<<
    namespace po = boost::program_options;
    po::options_description po_desc("Allowed options");
    po_desc.add_options()
        ("file", po::wvalue<std::wstring>(), "path to file") // <<<
        ("ansi", po::value<std::string>(), "an ANSI string")
    ;
    po::variables_map po_vm;
    try {
        po::store(po::wcommand_line_parser(argc, argv) // <<<
                      .options(po_desc)
                      .run(),
                  po_vm);
        po::notify(po_vm);
    } catch (...) {
        // Parsing failed: print the usage text and exit with an error code.
        std::cout << po_desc << std::endl;
        return 1;
    }
    const boost::filesystem::path file_path = po_vm["file"].as<std::wstring>(); // <<<
    // ...
}
First, switch to wmain and wchar_t* argv: as mentioned by @erik-sun, it is necessary to switch the entry point to a Unicode-aware function. Important note: it is possible to use int main(int, wchar_t*[]) (in the sense that it will compile), but it won't receive the arguments in the correct encoding and the parser will fail; you have to use wmain.
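A likely explanation for that note, stated as an assumption about the runtime rather than something taken from its documentation: the startup code picks narrow or wide arguments based on the entry point's name, so a function named main still receives narrow strings even if its signature says wchar_t*. The sketch below uses only standard C++ (the helper name reinterpret_as_utf16 is hypothetical) to show what narrow bytes look like when they are read as 16-bit units.

```cpp
#include <cstring>
#include <string>

// Hypothetical helper: copies the bytes of a narrow string into 16-bit
// units without any conversion, mimicking code that treats a char buffer
// as if it were a UTF-16 buffer. A trailing odd byte is dropped.
std::u16string reinterpret_as_utf16(const std::string& narrow) {
    std::u16string units(narrow.size() / 2, u'\0');
    if (!units.empty()) {
        std::memcpy(&units[0], narrow.data(), units.size() * sizeof(char16_t));
    }
    return units;
}

// Example: the 4 bytes of "file" collapse into 2 units, each packing two
// ASCII characters together, so no unit is the letter 'f' any more.
//   std::u16string garbled = reinterpret_as_utf16("file");
```

A parser looking for "--file" in such data would not find anything it recognizes, which fits the observed failure.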
Then, the Unicode support link provided by @richard-critten was very useful for understanding the compilation errors:
- Use boost::program_options::wvalue when the type is wide-char. The internal implementation uses a string stream, and the default one only works with 8-bit chars.
- Use boost::program_options::wcommand_line_parser to accept wchar_t* arguments. Unfortunately, this class doesn't have an all-in-one constructor, so you must use the long form for parsing the command line.
- Use std::wstring when needed.
I've extended the code snippet to show it is still compatible with std::string inputs.
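The remark about string streams can be illustrated without Boost. The sketch below assumes nothing about the library's internals beyond what is said above; parse_wide_value is a hypothetical stand-in showing that a wide stream (std::wistringstream) hands wchar_t data back unchanged, which is what a wide-character option value needs.

```cpp
#include <sstream>
#include <string>

// Hypothetical stand-in for a value parser: reads the whole content of a
// wide string stream into a std::wstring. No narrowing happens at any
// point, so characters outside the 8-bit range survive.
std::wstring parse_wide_value(const std::wstring& raw) {
    std::wistringstream stream(raw);
    std::wstring value;
    std::getline(stream, value);
    return value;
}

// Example: U+0105 (a with ogonek) does not fit in 8 bits but passes
// through intact.
//   std::wstring parsed = parse_wide_value(L"m\u0105laga.txt");
```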
My complete solution requires instantiating a Qt QApplication at some point. The QApplication constructor is incompatible with the wide-char argv. As no command-line interaction is needed with the Qt part (everything is processed long before by Boost), it can be rewritten to receive fake arguments:
int fake_argc = 1;
char fake_name[] = "ApplicationName"; // writable buffer; a string literal cannot bind to char*
char* fake_argv[] = {fake_name, nullptr};
QApplication a(fake_argc, fake_argv);
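Two details of this workaround are easy to get wrong: string literals are not convertible to char* in modern C++, and Qt documents that the argc/argv data must stay valid for the whole lifetime of the QApplication object. A small holder type can take care of both. This is only a sketch: FakeArgs is a hypothetical name, and the Qt call appears in a comment so the block compiles without Qt.

```cpp
#include <string>
#include <utility>
#include <vector>

// Hypothetical holder that owns writable, null-terminated argument storage.
// Keep the object alive for as long as the consumer (e.g. QApplication) lives.
class FakeArgs {
public:
    explicit FakeArgs(std::vector<std::string> args) : storage_(std::move(args)) {
        for (std::string& arg : storage_) {
            pointers_.push_back(&arg[0]);  // writable pointer into owned storage
        }
        argc_ = static_cast<int>(pointers_.size());
        pointers_.push_back(nullptr);      // argv[argc] is conventionally null
    }

    // Non-copyable: a copy's pointers would still aim at the original storage.
    FakeArgs(const FakeArgs&) = delete;
    FakeArgs& operator=(const FakeArgs&) = delete;

    int& argc() { return argc_; }          // QApplication takes argc by reference
    char** argv() { return pointers_.data(); }

private:
    std::vector<std::string> storage_;
    std::vector<char*> pointers_;
    int argc_ = 0;
};

// Intended use (requires Qt, not compiled here):
//   FakeArgs fake({"ApplicationName"});
//   QApplication app(fake.argc(), fake.argv());
```

Declaring the FakeArgs object before the QApplication in the same scope means it is destroyed after it, which keeps the pointers valid for as long as Qt may look at them.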