nlp, python-3.7, stanford-nlp, stanza

How can you ensure a viable endpoint for a stanza CoreNLPClient?


I would like to use the stanza CoreNLPClient to extract noun phrases, similar to this method.

However, I cannot seem to find a good port to start the server on. The default is 9000, but this is often occupied, as indicated by the error message:

PermanentlyFailedException: Error: unable to start the CoreNLP server on port 9000 (possibly something is already running there)

EDIT: Port 9000 is in use by python.exe, which is why I can't just shut the process down to make space for the CoreNLPClient.

Then, when I select other ports such as 7999, 8000, or 8080, the server keeps listening indefinitely and never executes the subsequent lines of code, showing only the following:

2021-07-19 12:05:55 INFO: Starting server with command: java -Xmx8G -cp C:\Users\timjo\stanza_corenlp* edu.stanford.nlp.pipeline.StanfordCoreNLPServer -port 7998 -timeout 60000 -threads 5 -maxCharLength 100000 -quiet True -serverProperties corenlp_server-2e15724b8064491b.props -preload -outputFormat serialized

I have the latest version of stanza installed, and am running the following code from an .ipynb file in VS Code:

from stanza.server import CoreNLPClient

# sample sentence
sentence = "Albert Einstein was a German-born theoretical physicist."

# start the client as indicated in the docs
# (note the http:// scheme: the start command above passes no SSL option, so the server speaks plain HTTP)
with CoreNLPClient(properties='corenlp_server-2e15724b8064491b.props', endpoint='http://localhost:7998', memory='8G', be_quiet=True) as client:
    matches = client.tregex(text=sentence, pattern='NP')

# extract the noun phrases and their indices
noun_phrases = [[match['spanString'], match['characterOffsetBegin'], match['characterOffsetEnd']]
                for sent in matches['sentences'] for match in sent.values()]

Main question: How can I ensure that the server starts on an open port and shuts down afterwards? I would prefer a semi-automatic way of finding an open port, or of freeing an occupied one, for the client to run on.


Solution

  • In general, it is sufficient to choose another port number that nothing else is using, maybe 9017? There are lots of numbers to choose from! The more careful choice would be to create the CoreNLPClient in a while loop with a try/except, incrementing the port number until you find one that is open.
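
The "increment until one is open" idea can be sketched roughly as follows. This is a variation on the answer, not its exact method: instead of catching the client's exception, it probes each port with the standard-library socket module before starting the client. The helper names is_port_free, find_free_port and extract_noun_phrases are made up for illustration, and the CoreNLPClient arguments are simply the ones from the question, with an http:// endpoint.

```python
import socket


def is_port_free(port, host="localhost"):
    """Return True if we can bind to (host, port), i.e. nothing else holds it."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        try:
            sock.bind((host, port))
        except OSError:
            return False
        return True


def find_free_port(start=9000, max_tries=100, host="localhost"):
    """Try start, start + 1, ... and return the first port that can be bound."""
    for port in range(start, min(start + max_tries, 65536)):
        if is_port_free(port, host):
            return port
    raise RuntimeError(f"no free port found in {max_tries} tries from {start}")


def extract_noun_phrases(text, start_port=9000):
    """Hypothetical helper: start CoreNLP on the first free port and run tregex."""
    # Imported here so the port helpers above work without stanza installed.
    from stanza.server import CoreNLPClient

    port = find_free_port(start_port)
    # Leaving the with-block is what stops the server the client started.
    with CoreNLPClient(endpoint=f"http://localhost:{port}",
                       memory="8G", be_quiet=True) as client:
        return client.tregex(text=text, pattern="NP")
```

Another process could still grab the port between the check and the server start, so a retry around the client start may still be worth having on a busy machine.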