python python-requests proxy pycharm rdflib

Forcing libraries that use the requests library to use my proxy, as explicitly defined in a get request


I'm trying to parse an RDF file using RDFLib. However, I have to use a proxy when making requests and I don't know how to get RDFLib to use my proxy.

import rdflib
g = rdflib.Graph()
g.parse(url)

results in:

Traceback (most recent call last):
  File "C:\Users\d72704\PycharmProjects\Ontology\main.py", line 113, in <module>
    Read_RDF()
  File "C:\Users\d72704\PycharmProjects\Ontology\main.py", line 21, in Read_RDF
    g.parse(r'https://spec.edmcouncil.org/fibo/ontology/LOAN/LoansGeneral/Loans/Loan')
  File "C:\Users\d72704\AppData\Roaming\Python\Python39\site-packages\rdflib\graph.py", line 1234, in parse
    source = create_input_source(
  File "C:\Users\d72704\AppData\Roaming\Python\Python39\site-packages\rdflib\parser.py", line 326, in create_input_source
    ) = _create_input_source_from_location(
  File "C:\Users\d72704\AppData\Roaming\Python\Python39\site-packages\rdflib\parser.py", line 375, in _create_input_source_from_location
    input_source = URLInputSource(absolute_location, format)
  File "C:\Users\d72704\AppData\Roaming\Python\Python39\site-packages\rdflib\parser.py", line 218, in __init__
    file = _urlopen(req)
  File "C:\Users\d72704\AppData\Roaming\Python\Python39\site-packages\rdflib\parser.py", line 206, in _urlopen
    return urlopen(req)
  File "C:\Program Files\Python39\lib\urllib\request.py", line 214, in urlopen
    return opener.open(url, data, timeout)
  File "C:\Program Files\Python39\lib\urllib\request.py", line 517, in open
    response = self._open(req, data)
  File "C:\Program Files\Python39\lib\urllib\request.py", line 534, in _open
    result = self._call_chain(self.handle_open, protocol, protocol +
  File "C:\Program Files\Python39\lib\urllib\request.py", line 494, in _call_chain
    result = func(*args)
  File "C:\Program Files\Python39\lib\urllib\request.py", line 1389, in https_open
    return self.do_open(http.client.HTTPSConnection, req,
  File "C:\Program Files\Python39\lib\urllib\request.py", line 1349, in do_open
    raise URLError(err)
urllib.error.URLError: <urlopen error [WinError 10054] An existing connection was forcibly closed by the remote host>

I can see that the requests library is being called by RDFLib ("C:\Program Files\Python39\lib\urllib\request.py"), but I don't know how to pass my proxy values as I would with a get request, which does work:

proxies = {myproxyhere}
r = requests.get(myurlhere, proxies=proxies, timeout=5)

I cannot alter my Windows proxy settings due to admin constraints, and if I try to set up the PyCharm proxy it fails with this error:

Problem with connection: Received fatal alert: protocol_version


Solution

  • If you read the traceback more closely, you'll see that rdflib is not using the Requests library but rather the urllib.request module from the standard library.

    Its documentation states that it uses the proxies defined by the http_proxy and https_proxy environment variables, if those are set.

    So you can set them in your script before calling g.parse:

    import os
    os.environ['http_proxy'] = 'http://proxy.example:8899'
    os.environ['https_proxy'] = 'http://proxy.example:8899'
    

    (Requests also relies on those same environment variables, so the same solution will work with libraries that do use Requests.)
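
If you would rather pass the proxy explicitly than rely on environment variables, a rough alternative is to install a global opener for urllib.request. This is only a sketch: the proxy address and the helper name install_proxy_opener are placeholders, and it assumes rdflib reaches the network through urllib.request.urlopen, as the traceback above suggests.

```python
import urllib.request

# Hypothetical proxy address; replace it with your own.
PROXY_URL = "http://proxy.example:8899"


def install_proxy_opener(proxy_url):
    """Route later urllib.request.urlopen() calls in this process through proxy_url."""
    # Map each URL scheme to the proxy that should handle it.
    handler = urllib.request.ProxyHandler({"http": proxy_url, "https": proxy_url})
    opener = urllib.request.build_opener(handler)
    # install_opener makes this opener the default used by urlopen().
    urllib.request.install_opener(opener)
    return opener


opener = install_proxy_opener(PROXY_URL)

# From here on, code that calls urllib.request.urlopen (for example
# rdflib.Graph().parse(url), under the assumption above) should use the proxy.
```

Unlike the environment variables, an installed opener only affects urllib.request. Libraries built on Requests will not see it, so they still need the proxies argument or the environment variables.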