Client Basic Usage
Table of contents
After CoreNLP has been properly set up, you can start using the client functions to obtain CoreNLP annotations in Stanza. Below are some basic examples of starting a server, making requests, and accessing various annotations from the returned Document object. By default, CoreNLP Client uses protobuf
for message passing. A full definition of our protocols (a.k.a., our supported annotations) can be found here.
Apart from the following example code, we have also prepared an interactive Jupyter notebook tutorial to get you started with the CoreNLP client functionality.
It is highly advised to start the server in a context manager (e.g. with CoreNLPClient(...) as client:
) to ensure the server is properly shut down when your Python application finishes.
Importing the client
Importing the client from Stanza is as simple as a one-liner:
from stanza.server import CoreNLPClient
Starting a client-server communication and running annotation
Here we are going to run CoreNLP annotation on some example sentences. We start by first instantiating a CoreNLPClient
object, and then pass the text into the client with the annotate
function. Note that here we use the recommended Python with
statement to start the client, which makes sure that the client and server are properly closed after the annotation:
text = "Chris Manning is a nice person. Chris wrote a simple sentence. He also gives oranges to people."
with CoreNLPClient(
annotators=['tokenize','ssplit','pos','lemma','ner', 'parse', 'depparse','coref'],
timeout=30000,
memory='6G') as client:
ann = client.annotate(text)
The CoreNLP server will be automatically started in the background upon the instantiation of the client, so normally you don’t need to worry about it. If you see an error message about port 9000 already in use, you need to choose a different port; see Server Start Options.
Accessing basic annotation results
The returned annotation object contains various annotations for sentences, tokens, and the entire document that can be accessed as native Python objects. For instance, the following code shows how to access various syntactic information of the first sentence in the piece of text in our example above:
# get the first sentence
sentence = ann.sentence[0]
# get the constituency parse of the first sentence
constituency_parse = sentence.parseTree
print(constituency_parse)
This prints the constituency parse of the sentence, where the first child and its value can be accessed through constituency_parse.child[0]
and constituency_parse.child[0].value
, respectively
child {
child {
child {
child {
value: "Chris"
}
value: "NNP"
score: -9.281864166259766
}
...
}
...
value: "S"
score: -50.052059173583984
}
value: "ROOT"
score: -50.20326614379883
Similarly, we can access the dependency parse of the first sentence as follows
print(sentence.basicDependencies)
which prints output like the following
node {
sentenceIndex: 0
index: 1
}
...
edge {
source: 2
target: 1
dep: "compound"
isExtra: false
sourceCopy: 0
targetCopy: 0
language: UniversalEnglish
}
...
root: 6
Here is an example to access token information, where we inspect the textual value of the token, its part-of-speech tag and named entity tag
# get the first token of the first sentence
token = sentence.token[0]
print(token.value, token.pos, token.ner)
Chris NNP PERSON
Last but not least, we can examine the entity mentions in the first sentence and the coreference chain in the input text as follows
# get an entity mention from the first sentence
print(sentence.mentions[0].entityMentionText)
# access the coref chain in the input text
print(ann.corefChain)
This gives us the mention text of the first entity mention in the first sentence, as well as a coref chain between entity mentions in the original text (the three mentions are “Chris Manning”, “Chris”, and “He”, respectively, where CoreNLP has identified “Chris Manning” as the canonical mention of the cluster)
Chris Manning
[
mention {
mentionID: 0
mentionType: "PROPER"
...
}
mention {
mentionID: 2
mentionType: "PROPER"
...
}
mention {
mentionID: 5
mentionType: "PRONOMINAL"
...
}
representative: 0
...
]