Isolate Wikidata query output
curl https://query.wikidata.org/#SELECT%20DISTINCT%20%3Fitem%20%3FitemLabel%20WHERE%20%7B%0A%20%20SERVICE%20wikibase%3Alabel%20%7B%20bd%3AserviceParam%20wikibase%3Alanguage%20%22%5BAUTO_LANGUAGE%5D%2Cmul%2Cen%22.%20%7D%0A%20%20%7B%0A%20%20%20%20SELECT%20DISTINCT%20%3Fitem%20WHERE%20%7B%0A%20%20%20%20%20%20%3Fitem%20p%3AP1417%20%3Fstatement0.%0A%20%20%20%20%20%20%3Fstatement0%20ps%3AP1417%20%22topic%2FJacobs-Room%22.%0A%20%20%20%20%7D%0A%20%20%20%20LIMIT%20100%0A%20%20%7D%0A%7D > tmp.txt && xmllint tmp.txt --html --xpath "/html/body/div[2]/div[4]/div/div[1]/div[2]/div[2]/table/tbody/tr/td[1]/a[2]"
% Total % Received % Xferd Average Speed Time Time Time Current
Dload Upload Total Spent Left Speed
100 18033 0 18033 0 0 35974 0 --:--:-- --:--:-- --:--:-- 35994
tmp.txt:1: HTML parser error : Tag nav invalid
ueryservice container-fluid"><div class="row"><nav class="navbar navbar-default"
^
tmp.txt:1: HTML parser error : htmlParseEntityRef: expecting ';'
="https://www.mediawiki.org/w/index.php?title=Talk:Wikidata_Query_Service&action
^
tmp.txt:1: HTML parser error : htmlParseEntityRef: expecting ';'
.mediawiki.org/w/index.php?title=Talk:Wikidata_Query_Service&action=edit&section
^
tmp.txt:1: HTML parser error : Tag nav invalid
/div></div></noscript><div class="row"><nav class="navbar navbar-default result"
^
tmp.txt:1: element button: validity error : ID open-example already defined
btn-default" id="open-example" data-toggle="modal" data-target="#QueryExamples"
^
tmp.txt:1: element span: validity error : ID examples-label already defined
r-open-o"></span> <span data-i18n="wdqs-app-button-examples" id="examples-label"
^
XPath set is empty
alinuxchap@libertus-desktop:~ $ hostnamectl
Static hostname: libertus-desktop
Icon name: computer
Machine ID: ########
Boot ID: ########
Operating System: Debian GNU/Linux 12 (bookworm)
Kernel: Linux 6.12.25+rpt-rpi-v8
Architecture: arm64
alinuxchap@libertus-desktop:~ $ xmllint --version
xmllint: using libxml version 20914
compiled with: Threads Tree Output Push Reader Patterns Writer SAXv1 FTP HTTP DTDValid HTML Legacy C14N Catalog XPath XPointer XInclude Iconv ICU ISO8859X Unicode Regexps Automata Schemas Schematron Modules Debug Zlib Lzma
alinuxchap@libertus-desktop:~ $ curl --version
curl 7.88.1 (aarch64-unknown-linux-gnu) libcurl/7.88.1 OpenSSL/3.0.16 zlib/1.2.13 brotli/1.0.9 zstd/1.5.4 libidn2/2.3.3 libpsl/0.21.2 (+libidn2/2.3.3) libssh2/1.10.0 nghttp2/1.52.0 librtmp/2.3 OpenLDAP/2.5.13
Release-Date: 2023-02-20, security patched: 7.88.1-10+deb12u12
Protocols: dict file ftp ftps gopher gophers http https imap imaps ldap ldaps mqtt pop3 pop3s rtmp rtsp scp sftp smb smbs smtp smtps telnet tftp
Features: alt-svc AsynchDNS brotli GSS-API HSTS HTTP2 HTTPS-proxy IDN IPv6 Kerberos Largefile libz NTLM NTLM_WB PSL SPNEGO SSL threadsafe TLS-SRP UnixSockets zstd
alinuxchap@libertus-desktop:~ $ bash --version
GNU bash, version 5.2.15(1)-release (aarch64-unknown-linux-gnu)
Copyright (C) 2022 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software; you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.
Thanks so much, hope I didn't miss anything obvious :>
Firefox is correct, but xmllint isn't. xmllint uses the HTML parser of the libxml2 library, which was written 20+ years ago and never supported HTML5. Simply don't use xmllint or libxml2 to parse HTML.
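One way to avoid HTML parsing altogether is to send the query to the service's SPARQL endpoint and ask for CSV. Note also that everything after `#` in the original URL is a fragment, which is never sent to the server, so curl only downloaded the empty query editor page. Below is a minimal sketch, not a definitive solution. The helper names `run_query` and `first_column` and the User-Agent string are made up for illustration, and `[AUTO_LANGUAGE]` is replaced with `en` on the assumption that this placeholder is only filled in by the web UI.

```shell
#!/usr/bin/env bash
# Sketch: query the SPARQL endpoint directly instead of scraping the HTML UI.

ENDPOINT='https://query.wikidata.org/sparql'

# Same query as in the question; "[AUTO_LANGUAGE]" swapped for "en"
# (assumption: that placeholder is substituted by the web UI only).
QUERY='SELECT DISTINCT ?item ?itemLabel WHERE {
  SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }
  {
    SELECT DISTINCT ?item WHERE {
      ?item p:P1417 ?statement0.
      ?statement0 ps:P1417 "topic/Jacobs-Room".
    }
    LIMIT 100
  }
}'

# run_query (hypothetical helper): send the query via GET and request CSV.
# -G turns the --data-urlencode payload into a URL query string.
run_query() {
  curl -sS -G "$ENDPOINT" \
    -H 'Accept: text/csv' \
    -A 'wikidata-example/0.1 (replace with your contact details)' \
    --data-urlencode "query=$1"
}

# first_column (hypothetical helper): strip carriage returns, skip the
# CSV header line, and keep only the first column (the ?item URI).
first_column() {
  tr -d '\r' | tail -n +2 | cut -d, -f1
}
```

With these defined, `run_query "$QUERY" | first_column` should print one entity URI per line. Taking the first column with `cut` is safe here only because entity URIs contain no commas; labels in the second column may be quoted and would need a real CSV parser.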