Using CoreNLP within other programming languages and packages


Below are interfaces and packages for running Stanford CoreNLP from other languages or within other packages. They have been written by many other people (thanks!). In general you should contact these people directly if you have problems with these packages.

C#/F#/.NET

Previously

Clojure

  • DataLinguist by Simon Gray wraps most of CoreNLP with an idiomatic Clojure API. As of 2022, this is the most complete and up-to-date Clojure API for CoreNLP.
  • org.clojurenlp.core extended the earlier https://github.com/damienstanton/stanford-corenlp. It incorporates work by Cory Giles, Hans Engel, Damien Stanton, Andrew McCloud, Leon Talbot, and Marek Owsikowski. It covers tokenization, POS tagging, NER, and parsing, but is currently (2021) not very actively maintained.
  • Clojure wrapper for CoreNLP by Nils Gruenwald. Very partial, currently only wrapping the tagger and TokensRegex, and not being developed.

Docker

Okay, Docker isn’t a language, but you know what we mean…

And there are about 200 others – it’s not so hard to build a Dockerfile! Here’s a list, which includes a number of Dockerfiles set up to run CoreNLP with different human languages:

Note on running the CoreNLP server under Docker: the container’s port 9000 has to be published to the host. For example, give a command like: docker run -p 9000:9000 -itd --name CoreNLP graham3333/corenlp-complete. If, when going to localhost:9000/, you see the error “This site can’t be reached. localhost refused to connect”, then publishing the port is what you failed to do!
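As a quick way to tell a port-publishing problem apart from other failures, you can check from the host whether anything accepts TCP connections on port 9000. The following is a minimal sketch using only the Python standard library; the helper name port_is_open is made up for this illustration, and the host and port simply mirror the docker run example above.

```python
import socket


def port_is_open(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port can be established.

    Hypothetical helper for illustration; it only checks that something is
    listening, not that the listener is actually a CoreNLP server.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers "connection refused", timeouts, and name resolution errors.
        return False


if __name__ == "__main__":
    if port_is_open("localhost", 9000):
        print("Something is listening on localhost:9000.")
    else:
        print("Nothing reachable on localhost:9000; "
              "was the container started with -p 9000:9000?")
```

A False result while the container is running suggests the port was not published. A True result only shows that some process is listening, so a follow-up request to the server is still needed to confirm it is CoreNLP.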

Elixir

  • corenlp by Robert Bates provides a thin client interface in Elixir to a CoreNLP server. GitHub.

Go (golang)

  • go-corenlp is a Golang wrapper for CoreNLP by Hironobu Saito.

  • corenlp-golang is another wrapper by Peter Bi written in 2022.

Java

JavaScript (node.js)

Caution: Disrecommended

  • stanford-corenlp (github site) is a simple node.js wrapper by hiteshjoshi.
  • stanford-corenlp-node (github site) is a webservice interface to CoreNLP in node.js by Mike Hewett. No recent development.
  • stanford-simple-nlp (github site) is a node.js CoreNLP wrapper by Taeho Kim (xissy). This doesn’t seem to have been updated lately. You’re better off with something else.
  • corenlp-js-interface was a (too) simple interface to a CoreNLP server in node.js. It is deprecated, suffers from a command injection vulnerability and the GitHub site is no longer available.
  • corenlp-js-prefab was a simple interface to the CoreNLP server with a prefab function so you only have to send text and no extra parameters with each call by Noah Dessauer. It is deprecated and the GitHub site is no longer available.

Lua

Perl

PHP

Python

Official Stanza Package by the Stanford NLP Group

We are actively developing a Python package called Stanza, with state-of-the-art NLP performance enabled by deep learning. The package also includes an API for starting and making requests to a Stanford CoreNLP server. It is the recommended way to use Stanford CoreNLP in Python.

  • Stanza: Official Stanford NLP Python package, covering 70+ human languages, as well as biomedical English text.

Packages using the Stanford CoreNLP server

These packages use the Stanford CoreNLP server that we’ve developed over the last couple of years.

  • stanfordcorenlp by Lynten Guo. A Python wrapper to Stanford CoreNLP server, version 3.9.1. PyPI page: pip install stanfordcorenlp
  • pycorenlp, A Python wrapper for Stanford CoreNLP by Smitha Milli that uses the new CoreNLP v3.6+ server. Available on PyPI.
  • corenlp-pywrap by Sherin Thomas also uses the new CoreNLP v3.6+ server. Python 3.x (only). Also: PyPI page.
  • Stanford CoreNLP Python Interface: A reference implementation of a Python interface to the Stanford CoreNLP server. By Arun Chaganty. PyPI page: pip install stanford-corenlp
  • pynlp, a (Pythonic) Python wrapper for Stanford CoreNLP by Sina. PyPI page.
  • NLTK since version 3.2.3 (from mid-2018) has a new interface to Stanford CoreNLP using the StanfordCoreNLPServer: nltk.parse.corenlp.CoreNLPParser. Please use it. There is a nice wiki page of instructions. See also: the API for the dependency and constituency parsers (with many examples) and the code for this module. Here’s a friendly introduction on how to get started by District Data Labs. Much of this work was done by Dmitrijs Milajevs. NLTK also includes an older generation of interfaces to Stanford NLP tools. Unfortunately, the NLTK developers do not want to remove them until version 4 for compatibility reasons and, for some other reason that we don’t understand, the documentation doesn’t even warn you against using them. You should totally avoid using the old Stanford tokenizer/segmenter/NER/parser (unless stuck on a very old version of NLTK): these classes are very slow, since they call Java via the command line for each invocation. That is, you should avoid: nltk.tag.stanford.StanfordTagger, nltk.tag.StanfordNERTagger, nltk.tokenize.stanford.StanfordTokenizer, nltk.tokenize.stanford_segmenter.StanfordSegmenter, and nltk.parse.stanford.StanfordParser.
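
The packages above all talk to a running CoreNLP server over HTTP. To show roughly what such a wrapper does under the hood, here is a minimal sketch using only the Python standard library. It assumes a server is listening at http://localhost:9000/ and that JSON output is wanted; the function names build_request and annotate are invented for this illustration and do not come from any of the listed packages.

```python
import json
import urllib.parse
import urllib.request


def build_request(text, annotators="tokenize,ssplit,pos",
                  base_url="http://localhost:9000/"):
    """Build a POST request for a CoreNLP server (hypothetical helper).

    The annotation settings travel as a JSON string in the `properties`
    query parameter; the text to annotate is the request body.
    """
    props = {"annotators": annotators, "outputFormat": "json"}
    query = urllib.parse.urlencode({"properties": json.dumps(props)})
    return urllib.request.Request(
        base_url + "?" + query,
        data=text.encode("utf-8"),
        headers={"Content-Type": "text/plain; charset=utf-8"},
        method="POST",
    )


def annotate(text, **kwargs):
    """Send text to the server and return the parsed JSON response.

    Assumes a CoreNLP server is actually running at the given base_url.
    """
    request = build_request(text, **kwargs)
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.loads(response.read().decode("utf-8"))


# Example use (needs a running server, so it is left as a comment):
#   doc = annotate("Stanford is in California.")
#   for sentence in doc["sentences"]:
#       print([(tok["word"], tok["pos"]) for tok in sentence["tokens"]])
```

For real work, one of the maintained packages above is a better choice than this sketch, since they deal with server startup, timeouts, and richer output formats.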

Miscellaneous Python packages

These packages are miscellaneous utilities or other frameworks that use Stanford CoreNLP.

  • python-corenlp-protobuf: Stanford CoreNLP Python Bindings by Arun Chaganty. This package contains Python bindings for Stanford CoreNLP’s protobuf specifications, as generated by protoc. These bindings can be used to parse binary data produced by, e.g., the Stanford CoreNLP server. PyPI page.
  • PyStanfordDependencies, a Python interface for converting Penn Treebank trees to Stanford Dependencies by David McClosky (see also: PyPI page). Last we checked, it is at Stanford CoreNLP v3.5.2 and can do Universal and Stanford dependencies (though it’s currently missing Universal POS tags and features).
  • corenlp-xml, a library for handling interactions with CoreNLP’s XML output by Robert Elwell. Available on PyPI. Documentation.
  • corpkit, a sophisticated corpus linguistics toolkit with GUI by Daniel McDonald. Interfaces with CoreNLP v3.6.0 to parse documents, and uses Tregex/CoreNLP XML to find patterns in corpora. Available on PyPI. A graphical interface is also available.
  • corenlp-xml-reader by Edward Newell, available on GitHub and as a PyPI package. He also has corenlpy, which runs Java in a subprocess; see GitHub repository.

Older Python packages

These are previous-generation Python interfaces to Stanford CoreNLP, using a subprocess or their own server. They are generally no longer being developed and are obsolete. (But thanks a lot to the people who wrote them in the early days!)

R (CRAN)

Ruby

  • Stanford CoreNLP Ruby bindings by Louis Mullie (see also: Ruby Gems page). (Updated in Feb 2017 to CoreNLP 3.5.0.)
  • The larger TREAT NLP toolkit by Louis Mullie also makes available Stanford CoreNLP.
  • corenlp by Lengio Corp. is another interface to CoreNLP (last updated for CoreNLP 3.4).
  • stanford-core-nlp by Will Hayworth is another older interface to CoreNLP (also for CoreNLP 3.4).

Scala

Thrift server

XQuery

ZeroMQ/ØMQ servers

  • stanford-0mq by Diane Napolitano. An implementation of a server for Stanford’s CoreNLP suite using ØMQ and a basic client/server/JSON requests configuration. Last commit: Oct 2015.
  • stanford-corenlp-zeromq by URXtech. Basic JSON wrapper around CoreNLP.
  • corenlp-zmq by Thom Neale. A Dockerfile and Ansible provisioning script to build and run a Stanford CoreNLP server process with a single ZMQ broker front-end that proxies incoming requests to one or more back-end Scala workers. Last commit: 2015.
  • corenlp-server by Eric Kow. Simple Java server communicating with clients via XML through ZeroMQ. Example Python client included. Last commit: 2014.