About

This is the software package used by the Stanford team in the CoNLL 2018 Shared Task on Universal Dependency Parsing. It contains tools to convert a string of text into lists of sentences and words, to generate the base forms of those words, their parts of speech, and their morphological features, and to produce a syntactic dependency structure that is designed to be parallel across more than 70 languages.
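
For orientation, a minimal usage sketch is shown below. The corenlpy module name, the Pipeline class, and the word attributes (lemma, upos, feats, head) are assumptions made for illustration, not a documented interface.

    # Hypothetical sketch: module, class, and attribute names below are
    # assumptions for illustration, not a documented CoreNLPy interface.
    import corenlpy  # assumed package name

    # Build an English pipeline covering tokenization, lemmatization,
    # POS/morphological tagging, and dependency parsing.
    nlp = corenlpy.Pipeline(lang="en")

    doc = nlp("Barack Obama was born in Hawaii.")

    for sentence in doc.sentences:
        for word in sentence.words:
            # Each word carries its surface form, base form (lemma),
            # universal POS tag, morphological features, and the index
            # of its syntactic head in the dependency tree.
            print(word.text, word.lemma, word.upos, word.feats, word.head)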

This package is built with highly accurate neural network components that enable efficient training and evaluation with your own annotated data. The modules are built on top of PyTorch.

Choose CoreNLPy if you need:

  • An integrated NLP toolkit with a broad range of grammatical analysis tools
  • A fast, robust annotator for arbitrary texts, widely used in production
  • A modern, regularly updated package, with the overall highest quality text analytics
  • Support for a number of major (human) languages
  • Available APIs for most major modern programming languages
  • Ability to run as a simple web service

Download

Stanford CoreNLPy can be downloaded from the link below. This will download (1) all the code needed to run the models, which is itself largely language-agnostic, and (2) the model files for English to get you started.

Download CoreNLPy 0.1

Alternatively, you can find the source code on GitHub. For more information about CoreNLPy, see the download page.

To use CoreNLPy, you also need to download model files for the languages and models you want to use. Some commonly used languages are listed below.

Language    Model files    Version
Arabic      download       0.1
Chinese     download       0.1
English     download       0.1
French      download       0.1
German      download       0.1
Spanish     download       0.1
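
As a rough sketch of how downloaded model files might be used, the snippet below assumes a download helper and a lang argument on the pipeline constructor; both names are illustrative, not part of a documented interface.

    # Hypothetical sketch: the download helper and constructor argument are
    # assumptions for illustration, not a documented CoreNLPy interface.
    import corenlpy

    corenlpy.download("fr")             # assumed one-time fetch of the French model files
    nlp = corenlpy.Pipeline(lang="fr")  # assumed constructor argument selecting those files

    doc = nlp("Le chat dort sur le canapé.")
    for word in doc.sentences[0].words:
        print(word.text, word.lemma, word.upos)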

License

Actual license TBD.

Stanford CoreNLP is licensed under the GNU General Public License (v3 or later; in general Stanford NLP code is GPL v2+, but CoreNLP uses several Apache-licensed libraries, so the composite is v3+). Note that the license is the full GPL, which allows many free uses, but does not allow its use in proprietary software that is distributed to others. For distributors of proprietary software, CoreNLP is also available from Stanford under a commercial license. You can contact us at java-nlp-support@lists.stanford.edu. If you don’t need a commercial license, but would like to support maintenance of these tools, we welcome gift funding: use this form and write “Stanford NLP Group open source software” in the Special Instructions.

Citing CoreNLPy in papers

If you use CoreNLPy in your work, please cite this paper:

Peng Qi, Timothy Dozat, Yuhao Zhang, and Christopher D. Manning. 2018. Universal Dependency Parsing from Scratch. In Proceedings of the CoNLL 2018 Shared Task: Multilingual Parsing from Raw Text to Universal Dependencies, pp. 160-170. [pdf] [bib]