Client Basic Usage

After CoreNLP has been properly set up, you can start using the client functions to obtain CoreNLP annotations in Stanza. Below are some basic examples of starting a server, making requests, and accessing various annotations from the returned Document object. By default, the CoreNLP client uses protobuf for message passing. A full definition of our protocols (i.e., the supported annotations) can be found here.

Apart from the following example code, we have also prepared an interactive Jupyter notebook tutorial to get you started with the CoreNLP client functionality.

Importing the client

Importing the client from Stanza is as simple as a one-liner:

from stanza.server import CoreNLPClient

Starting a client-server communication and running annotation

Here we are going to run CoreNLP annotation on some example sentences. We start by first instantiating a CoreNLPClient object, and then pass the text into the client with the annotate function. Note that here we use the recommended Python with statement to start the client, which makes sure that the client and server are properly closed after the annotation:

text = "Chris Manning is a nice person. Chris wrote a simple sentence. He also gives oranges to people."
with CoreNLPClient(
        annotators=['tokenize','ssplit','pos','lemma','ner', 'parse', 'depparse','coref'],
        memory='6G') as client:
    ann = client.annotate(text)

The CoreNLP server will be automatically started in the background upon the instantiation of the client, so normally you don’t need to worry about it. If you see an error message about port 9000 already in use, you need to choose a different port; see Server Start Options.
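
If port 9000 is taken, one option is to ask the operating system for a free port and hand it to the client through its endpoint argument. The following is only a sketch: find_free_port and endpoint_for are hypothetical helpers that are not part of Stanza, and the commented usage assumes CoreNLP is installed as described earlier.

```python
import socket


def find_free_port():
    """Ask the OS for a currently unused local TCP port (hypothetical helper)."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.bind(("127.0.0.1", 0))  # port 0 lets the OS pick any free port
        return sock.getsockname()[1]


def endpoint_for(port, host="localhost"):
    """Build the endpoint URL string for a given port (hypothetical helper)."""
    return f"http://{host}:{port}"


# Usage sketch (needs a working CoreNLP installation, so it is left commented out):
# from stanza.server import CoreNLPClient
# with CoreNLPClient(annotators=['tokenize', 'ssplit'],
#                    endpoint=endpoint_for(find_free_port())) as client:
#     ann = client.annotate("A short test sentence.")
```

Note that the port is released before the server starts, so another process could in principle claim it in between; for a fixed setup, choosing a known unused port by hand is just as reasonable.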

Accessing basic annotation results

The returned annotation object contains various annotations for sentences, tokens, and the entire document that can be accessed as native Python objects. For instance, the following code shows how to access various syntactic information of the first sentence in the piece of text in our example above:

# get the first sentence
sentence = ann.sentence[0]

# get the constituency parse of the first sentence
constituency_parse = sentence.parseTree
print(constituency_parse)

This prints the constituency parse of the sentence. The first child and its value can be accessed through constituency_parse.child[0] and constituency_parse.child[0].value, respectively:

child {
  child {
    child {
      child {
        value: "Chris"
      value: "NNP"
      score: -9.281864166259766
  value: "S"
  score: -50.052059173583984
value: "ROOT"
score: -50.20326614379883
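
Because every node exposes child and value, the tree can be walked recursively. As an illustration, the sketch below renders a tree as a bracketed string. The function to_bracketed and the node helper are hypothetical names introduced here, and the toy tree is a hand-built stand-in (not CoreNLP output) so that the snippet runs without a server.

```python
from types import SimpleNamespace


def to_bracketed(tree):
    """Render a parse tree node (with .value and .child) as a bracketed string."""
    if len(tree.child) == 0:
        return tree.value  # leaf node: the word itself
    children = " ".join(to_bracketed(c) for c in tree.child)
    return f"({tree.value} {children})"


def node(value, *children):
    """Build a stand-in node with the same attribute names as the parse tree."""
    return SimpleNamespace(value=value, child=list(children))


# Hand-built toy tree, only to demonstrate the traversal
toy = node("ROOT", node("NP", node("NNP", node("Chris")), node("NNP", node("Manning"))))
print(to_bracketed(toy))  # (ROOT (NP (NNP Chris) (NNP Manning)))
```

With a real annotation, the same function should be applicable as to_bracketed(constituency_parse), since it relies only on the child and value attributes.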

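Similarly, we can access the dependency parse of the first sentence as follows:

# get the dependency parse of the first sentence
dependency_parse = sentence.basicDependencies
print(dependency_parse)

which prints output like the following: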

node {
  sentenceIndex: 0
  index: 1
edge {
  source: 2
  target: 1
  dep: "compound"
  isExtra: false
  sourceCopy: 0
  targetCopy: 0
  language: UniversalEnglish
root: 6
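
The edges refer to tokens by position, so a common next step is to map them back to words. The sketch below does this under the assumption that source and target are 1-based token positions within the sentence (consistent with the compound edge from token 2 to token 1 above) and that each token carries a word field. dependency_triples is a hypothetical helper, and the toy objects are hand-built stand-ins so the snippet runs without a server.

```python
from types import SimpleNamespace


def dependency_triples(sentence, graph):
    """Return (head word, relation, dependent word) for each edge in the graph.

    Assumes edge.source and edge.target are 1-based token positions.
    """
    words = [tok.word for tok in sentence.token]
    return [(words[e.source - 1], e.dep, words[e.target - 1]) for e in graph.edge]


# Hand-built stand-ins that mimic the attribute names used above
toy_sentence = SimpleNamespace(
    token=[SimpleNamespace(word="Chris"), SimpleNamespace(word="Manning")])
toy_graph = SimpleNamespace(
    edge=[SimpleNamespace(source=2, target=1, dep="compound")])

print(dependency_triples(toy_sentence, toy_graph))  # [('Manning', 'compound', 'Chris')]
```

With a real annotation, the call would look like dependency_triples(sentence, sentence.basicDependencies).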

Here is an example of accessing token information, where we inspect the textual value of the token, its part-of-speech tag, and its named entity tag:

# get the first token of the first sentence
token = sentence.token[0]
print(token.value, token.pos, token.ner)
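
The same three fields can be collected for every token in the document by looping over sentences and tokens. The sketch below uses a hypothetical helper, token_table, and a hand-built stand-in document whose tags are made up for illustration rather than produced by CoreNLP.

```python
from types import SimpleNamespace


def token_table(document):
    """Collect (value, pos, ner) for every token of every sentence."""
    return [(tok.value, tok.pos, tok.ner)
            for sent in document.sentence
            for tok in sent.token]


# Hand-built stand-in document; the tags are illustrative, not real CoreNLP output
toy_doc = SimpleNamespace(sentence=[
    SimpleNamespace(token=[
        SimpleNamespace(value="Chris", pos="NNP", ner="PERSON"),
        SimpleNamespace(value="Manning", pos="NNP", ner="PERSON"),
    ]),
])

for value, pos, ner in token_table(toy_doc):
    print(value, pos, ner)
```

With a real annotation, token_table(ann) should return one tuple per token, in document order.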

Last but not least, we can examine the entity mentions in the first sentence and the coreference chain in the input text as follows:

# get an entity mention from the first sentence
print(sentence.mentions[0].entityMentionText)

# access the coref chain in the input text
print(ann.corefChain)
This gives us the mention text of the first entity mention in the first sentence, as well as a coref chain between entity mentions in the original text. The three mentions are “Chris Manning”, “Chris”, and “He”, and CoreNLP has identified “Chris Manning” as the canonical mention of the cluster:

Chris Manning
mention {
  mentionID: 0
  mentionType: "PROPER"
mention {
  mentionID: 2
  mentionType: "PROPER"
mention {
  mentionID: 5
  mentionType: "PRONOMINAL"
representative: 0
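
The chain stores mentions by position rather than by text. To recover the mention strings, a sketch like the one below can be used. It rests on assumptions that are not visible in the abbreviated output above: that each mention carries sentenceIndex, beginIndex, and endIndex fields, that the token span is 0-based and end-exclusive, and that representative is an index into the chain's mention list. chain_mentions is a hypothetical helper, and the toy objects are hand-built stand-ins so the snippet runs without a server.

```python
from types import SimpleNamespace


def chain_mentions(document, chain):
    """Return the text of each mention in a chain, plus the representative text.

    Assumes beginIndex/endIndex form a 0-based, end-exclusive token span
    inside the sentence selected by sentenceIndex.
    """
    texts = []
    for m in chain.mention:
        tokens = document.sentence[m.sentenceIndex].token[m.beginIndex:m.endIndex]
        texts.append(" ".join(t.word for t in tokens))
    return texts, texts[chain.representative]


def toy_sentence(*words):
    """Build a stand-in sentence whose tokens only carry a word field."""
    return SimpleNamespace(token=[SimpleNamespace(word=w) for w in words])


# Hand-built stand-ins, not CoreNLP output
toy_doc = SimpleNamespace(sentence=[
    toy_sentence("Chris", "Manning", "is", "nice", "."),
    toy_sentence("He", "smiles", "."),
])
toy_chain = SimpleNamespace(
    mention=[
        SimpleNamespace(sentenceIndex=0, beginIndex=0, endIndex=2),
        SimpleNamespace(sentenceIndex=1, beginIndex=0, endIndex=1),
    ],
    representative=0,
)

print(chain_mentions(toy_doc, toy_chain))  # (['Chris Manning', 'He'], 'Chris Manning')
```

With a real annotation, the call would look like chain_mentions(ann, ann.corefChain[0]), provided the field assumptions above hold for your CoreNLP version.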