Parser

About

The Stanford Parser can be used to generate constituency and dependency parses of sentences for a variety of languages. The package includes PCFG, Shift Reduce, and Neural Dependency parsers. To make full use of the parser, also download the models jar for the specific language you are interested in. Links to the models jars are provided in the History section below, or here.
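Constituency parses are commonly written as Penn Treebank-style bracketed strings. As a rough illustration of what such a parse looks like to a downstream program, the sketch below reads a bracketed string into nested tuples and recovers its words. The example string, the function names `parse_bracketed` and `leaves`, and the tuple layout are assumptions made for illustration; none of them is part of the parser's API, and the string is not actual parser output.

```python
def parse_bracketed(text):
    """Parse a bracketed tree string into nested (label, children) tuples.

    Children are either nested tuples (subtrees) or plain strings (words).
    """
    # Pad brackets with spaces so a simple split yields clean tokens.
    tokens = text.replace("(", " ( ").replace(")", " ) ").split()
    pos = 0

    def read_node():
        nonlocal pos
        if tokens[pos] != "(":
            raise ValueError(f"expected '(' at token {pos}, got {tokens[pos]!r}")
        pos += 1
        label = tokens[pos]
        pos += 1
        children = []
        while tokens[pos] != ")":
            if tokens[pos] == "(":
                children.append(read_node())   # nested constituent
            else:
                children.append(tokens[pos])   # leaf word
                pos += 1
        pos += 1  # consume the closing ")"
        return (label, children)

    tree = read_node()
    if pos != len(tokens):
        raise ValueError("unexpected input after the end of the tree")
    return tree


def leaves(tree):
    """Return the words of a parsed tree in left-to-right order."""
    _, children = tree
    words = []
    for child in children:
        if isinstance(child, str):
            words.append(child)
        else:
            words.extend(leaves(child))
    return words


# Hypothetical example string, hand-written for this sketch.
example = "(ROOT (S (NP (DT The) (NN parser)) (VP (VBZ works)) (. .)))"
tree = parse_bracketed(example)
print(tree[0])
print(leaves(tree))
```

The sketch splits on literal parentheses only, so it assumes that parentheses occurring in the sentence itself have been escaped (Penn Treebank text conventionally writes them as `-LRB-` and `-RRB-`).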

Download Stanford Parser 4.2.1

You can consult this legacy FAQ for more info.

Demo

You can see demonstrations of the various parsers here.

Differences between Standalone and CoreNLP

If you are using Stanford NLP software for non-commercial purposes, you should use the full CoreNLP package.

Parsing requires tokenization and, in some cases, part-of-speech tagging. The Stanford Parser distribution includes English tokenization, but does not provide the tokenization used for French, German, and Spanish. Access to that tokenization requires using the full CoreNLP package. Likewise, use of the part-of-speech tagging models requires a license for the Stanford POS tagger or the full CoreNLP distribution.
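To make the idea of tokenized, tagged input concrete, here is a minimal sketch that writes and reads a sentence as space-separated `word/TAG` items. The `word/TAG` convention, the tags shown, and the helper names `to_tagged_line` and `from_tagged_line` are assumptions chosen for illustration; check the parser documentation for the input format it actually accepts.

```python
def to_tagged_line(tagged_tokens, separator="/"):
    """Join (word, tag) pairs into one line of word<separator>tag items."""
    return " ".join(f"{word}{separator}{tag}" for word, tag in tagged_tokens)


def from_tagged_line(line, separator="/"):
    """Split a line of word<separator>tag items back into (word, tag) pairs."""
    pairs = []
    for item in line.split():
        # Split on the LAST separator so words that contain it (e.g. "1/2") survive.
        word, sep, tag = item.rpartition(separator)
        if not sep or not word or not tag:
            raise ValueError(f"not a tagged token: {item!r}")
        pairs.append((word, tag))
    return pairs


# Hypothetical, hand-tagged sentence used only for this sketch.
sentence = [("The", "DT"), ("osprey", "NN"), ("flies", "VBZ"), (".", ".")]
line = to_tagged_line(sentence)
print(line)
print(from_tagged_line(line) == sentence)
```

Splitting on the last separator is a deliberate choice here: it keeps tokens such as fractions or dates intact at the cost of disallowing the separator inside a tag.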

License

The parser code is dual licensed (in a similar manner to MySQL, etc.). Open source licensing is under the full GPL, which allows many free uses. For distributors of proprietary software, commercial licensing is available. (Fine print: The traditional (dynamic programmed) Stanford Parser does part-of-speech tagging as it works, but the newer shift-reduce constituency and neural network dependency parsers require pre-tagged input. For convenience, we include the part-of-speech tagger code, but not its models, with the parser download. However, if you want to use these parsers under a commercial license, then you need a license to both the Stanford Parser and the Stanford POS tagger. Or you can get the whole bundle of Stanford CoreNLP.)

If you don’t need a commercial license, but would like to support maintenance of these tools, we welcome gift funding: use this form and write “Stanford NLP Group open source software” in the Special Instructions.

History

Version	Date	Changes	Models
4.2.1	2020-05-05	Reduce size of srparser models	arabic, chinese, english, english (kbp), french, german, spanish
4.2.0	2020-11-17	Retrain English models with treebank fixes	arabic, chinese, english, english (kbp), french, german, spanish
4.0.0	2020-04-19	Model tokenization updated to UDv2.0	arabic, chinese, english, english (kbp), french, german, spanish
3.9.2	2018-10-17	Updated for compatibility	arabic, chinese, english, english (kbp), french, german, spanish
3.9.1	2018-02-27	new French and Spanish UD models, misc. UD enhancements, bug fixes	arabic, chinese, english, english (kbp), french, german, spanish
3.8.0	2017-06-09	Updated for compatibility	arabic, chinese, english, english (kbp), french, german, spanish
3.7.0	2016-10-31	new UD models	arabic, chinese, english, english (kbp), french, german, spanish
3.6.0	2015-12-09	Updated for compatibility	chinese, english, french, german, spanish
3.5.2	2015-04-20	Switch to universal dependencies	caseless, chinese, shift reduce parser, spanish
3.5.0	2014-10-31	Upgrade to Java 8; add neural-network dependency parser	caseless, chinese, shift reduce parser, spanish
3.4.1	2014-08-27	Spanish models added.	caseless, chinese, shift reduce parser, spanish
3.4	2014-06-16	Shift-reduce parser, dependency improvements, French parser uses CC tagset	caseless, chinese, shift reduce parser
3.3.1	2014-01-04	English dependency “infmod” and “partmod” combined into “vmod”, other minor dependency improvements
3.3.0	2013-11-12	English dependency “attr” removed, other dependency improvements, imperative training data added
3.2.0	2013-06-20	New CVG based English model with higher accuracy
2.0.5	2013-04-05	Dependency improvements, -nthreads option, ctb7 model
2.0.4	2012-11-12	Improved dependency code extraction efficiency, other dependency changes
2.0.3	2012-07-09	Minor bug fixes
2.0.2	2012-05-22	Some models now support training with extra tagged, non-tree data
2.0.1	2012-03-09	Caseless English model included, bugfix for enforced tags
2.0	2012-02-03	Threadsafe!
1.6.9	2011-09-14	Improved recognition of imperatives, dependencies now explicitly include a root, parser knows osprey is a noun
1.6.8	2011-08-04	New French model, improved foreign language models, bug fixes
1.6.7	2011-05-18	Minor bug fixes.
1.6.6	2011-04-20	Internal code and API changes (ArrayLists rather than Sentence; use of CoreLabel objects) to match tagger and CoreNLP.
1.6.5	2010-11-30	Further improvements to English Stanford Dependencies and other minor changes
1.6.4	2010-08-20	More minor bug fixes and improvements to English Stanford Dependencies and question parsing
1.6.3	2010-07-09	Improvements to English Stanford Dependencies and question parsing, minor bug fixes
1.6.2	2010-02-26	Improvements to Arabic parser models, and to English and Chinese Stanford Dependencies
1.6.1	2008-10-26	Slightly improved Arabic and German parsing, and Stanford Dependencies
1.6	2007-08-19	Added Arabic, k-best PCFG parsing; improved English grammatical relations
1.5.1	2006-06-11	Improved English and Chinese grammatical relations; fixed UTF-8 handling
1.5	2005-07-21	Added grammatical relations output; fixed bugs introduced in 1.4
1.4	2004-03-24	Made PCFG faster again (by FSA minimization); added German support
1.3	2003-09-06	Made parser over twice as fast; added tokenization options
1.2	2003-07-20	Halved PCFG memory usage; added support for Chinese
1.1	2003-03-25	Improved parsing speed; included GUI, improved PCFG grammar
1.0	2002-12-05	Initial release