Document Date

Description

Provides several methods for setting the date of documents. Use either the standalone docdate annotator or the ner.docdate sub-annotator contained in the ner annotator. If you use the sub-annotator in ner, do not also use the standalone annotator.

| Property name | Annotator class name | Generated Annotation |
| --- | --- | --- |
| docdate | DocDateAnnotator | DocDateAnnotation |

Example Usage

Command Line

# as a standalone annotator
java -Xmx5g edu.stanford.nlp.pipeline.StanfordCoreNLP -annotators tokenize,ssplit,docdate -docdate.useFixedDate 2019-01-01 -file example.txt
# as a sub-annotator of ner
java -Xmx5g edu.stanford.nlp.pipeline.StanfordCoreNLP -annotators tokenize,ssplit,pos,lemma,ner -ner.docdate.useFixedDate 2019-01-01 -file example.txt
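
Java

The command-line flags above correspond to properties of the same name, so a Java program can set up the same two configurations through a java.util.Properties object. The following is a minimal sketch that builds only the properties. The class and method names (DocDatePropsSketch, standalone, viaNer) are hypothetical, and passing the result to the StanfordCoreNLP constructor is left as a comment because it needs CoreNLP on the classpath.

```java
import java.util.Properties;

// Hypothetical helper class; not part of CoreNLP.
final class DocDatePropsSketch {

    // Mirrors: -annotators tokenize,ssplit,docdate -docdate.useFixedDate <date>
    static Properties standalone(String fixedDate) {
        Properties props = new Properties();
        props.setProperty("annotators", "tokenize,ssplit,docdate");
        props.setProperty("docdate.useFixedDate", fixedDate);
        return props;
    }

    // Mirrors: -annotators tokenize,ssplit,pos,lemma,ner -ner.docdate.useFixedDate <date>
    // Note that the standalone docdate annotator is deliberately absent here.
    static Properties viaNer(String fixedDate) {
        Properties props = new Properties();
        props.setProperty("annotators", "tokenize,ssplit,pos,lemma,ner");
        props.setProperty("ner.docdate.useFixedDate", fixedDate);
        return props;
    }

    public static void main(String[] args) {
        // With CoreNLP on the classpath, either object could be handed to
        // new StanfordCoreNLP(props) to build the pipeline.
        System.out.println(standalone("2019-01-01"));
        System.out.println(viaNer("2019-01-01"));
    }
}
```

Keeping the two configurations in separate methods makes it harder to enable both the standalone annotator and the ner sub-annotator by accident.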

Options

| Option name | Type | Default | Description |
| --- | --- | --- | --- |
| docdate.useFixedDate | String | - | Set every document to have a fixed date (e.g. 2019-01-01). |
| docdate.useMappingFile | file, classpath, or URL | - | Use a tab-delimited file to specify doc dates. The first column is the document ID, the second column is the date. |
| docdate.usePresent | - | - | Set every document to have the present date as the date. |
| docdate.useRegex | String | - | Specify a regular expression matching file names. The first group will be extracted as the date (e.g. NYT-([0-9]{4}-[0-9]{2}-[0-9]{2}).xml). |
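
To make the useRegex and useMappingFile descriptions concrete, here is a small sketch written with only the Java standard library. It is not CoreNLP's implementation: the class and helper names are hypothetical, and it assumes the regular expression must match the whole file name and that mapping lines with fewer than two columns are skipped. Note that the unescaped `.` in the example pattern matches any character, not just a literal dot.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical illustration; not part of CoreNLP.
final class DocDateOptionsSketch {

    // Returns the first capture group when the whole file name matches the regex.
    // Assumes the regex defines at least one group.
    static Optional<String> dateFromFileName(String regex, String fileName) {
        Matcher m = Pattern.compile(regex).matcher(fileName);
        return m.matches() ? Optional.ofNullable(m.group(1)) : Optional.empty();
    }

    // Parses "documentID<TAB>date" lines into a lookup table keyed by document ID.
    static Map<String, String> parseMapping(List<String> lines) {
        Map<String, String> dates = new LinkedHashMap<>();
        for (String line : lines) {
            String[] cols = line.split("\t");
            if (cols.length >= 2) {            // skip malformed lines (assumption)
                dates.put(cols[0].trim(), cols[1].trim());
            }
        }
        return dates;
    }

    public static void main(String[] args) {
        String regex = "NYT-([0-9]{4}-[0-9]{2}-[0-9]{2}).xml";
        // prints 2019-01-01
        System.out.println(dateFromFileName(regex, "NYT-2019-01-01.xml").orElse("no match"));

        Map<String, String> dates =
                parseMapping(List.of("doc1\t2019-01-01", "doc2\t2018-06-30"));
        // prints 2018-06-30
        System.out.println(dates.get("doc2"));
    }
}
```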