This annotator generates a list of the entity mentions, identified by NER, found in each sentence of a document. Rather than labeling individual tokens, it produces whole entity mentions; for example, “New York City” is identified as a single entity mention. The “sentences” element of a document annotation additionally contains an “entitymentions” element, whose value is a list of individual entity mentions. Each mention includes its text, its span in token and character offsets, its NER tag, and, for quantities, its normalized value and TIMEX form, as appropriate. This detail is available to API users. At present, entity mentions are only included in the output of the JSON outputter.
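As a rough illustration of the structure described above, the following Python sketch walks the “sentences” list of a JSON document and collects each entry under “entitymentions”. The sample document is hand-written for this example rather than real annotator output, and the key names other than “sentences” and “entitymentions” (`text`, `ner`, `tokenBegin`, `tokenEnd`, `normalizedNER`) are assumptions about the JSON outputter's field names that should be checked against actual output.

```python
import json

# Hand-written sample shaped like the structure described above.
# NOT real annotator output; key names inside each mention are assumed.
sample = """
{
  "sentences": [
    {
      "index": 0,
      "entitymentions": [
        {"text": "New York City", "ner": "LOCATION",
         "tokenBegin": 3, "tokenEnd": 6,
         "characterOffsetBegin": 13, "characterOffsetEnd": 26},
        {"text": "2015", "ner": "DATE", "normalizedNER": "2015",
         "tokenBegin": 7, "tokenEnd": 8,
         "characterOffsetBegin": 30, "characterOffsetEnd": 34}
      ]
    }
  ]
}
"""


def list_mentions(doc):
    """Return (text, ner, tokenBegin, tokenEnd, normalized) for every mention."""
    rows = []
    for sentence in doc.get("sentences", []):
        # A sentence without mentions may lack the key entirely.
        for mention in sentence.get("entitymentions", []):
            rows.append((
                mention["text"],
                mention["ner"],
                mention.get("tokenBegin"),
                mention.get("tokenEnd"),
                mention.get("normalizedNER"),  # only present for some mention types
            ))
    return rows


mentions = list_mentions(json.loads(sample))
for row in mentions:
    print(row)
```

Optional fields are read with `.get` so that mentions without a normalized value do not raise an error.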
||Annotator class name
||Class to use as key to look up NER value.
||Class to use as key to look up normalized named entity value.
||Class to use as key to look up mentions.
||If true, heuristically search for organization acronyms, even if they are not explicitly marked with an NER tag. That is, the annotator looks for putative acronyms of an organization identified elsewhere in the document. In some work this has proved very useful (+20% recall).
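The acronym heuristic can be pictured with a small Python sketch. This is only an illustration of the general idea (derive candidate acronyms from organization mentions already found, then look for matching tokens elsewhere in the document); it is not the annotator's actual rule, and the function names `acronym` and `find_org_acronyms` are hypothetical.

```python
def acronym(name):
    """Build a candidate acronym from the initials of capitalized words.

    Lowercase function words such as "of" or "the" are skipped.
    """
    return "".join(word[0] for word in name.split() if word[0].isupper())


def find_org_acronyms(tokens, org_mentions):
    """Return (index, token) pairs for tokens matching a known organization's acronym.

    tokens       -- the document's tokens, in order
    org_mentions -- texts of mentions already tagged as organizations
    """
    candidates = {acronym(name) for name in org_mentions}
    # Ignore one-letter candidates, which would match far too many tokens.
    candidates = {c for c in candidates if len(c) >= 2}
    return [(i, tok) for i, tok in enumerate(tokens) if tok in candidates]


tokens = ("The World Health Organization issued guidance . "
          "WHO officials agreed .").split()
hits = find_org_acronyms(tokens, ["World Health Organization"])
print(hits)
```

A real implementation would likely need extra safeguards (for example, handling punctuation inside names or acronyms that collide with ordinary words), which this sketch leaves out.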