Gazetteer
Gazetteer is a user-supplied dictionary mapping labels to terms. Attach one to a Tagger to make the tagger recognize domain-specific vocabulary — product names, code identifiers, internal jargon — that the system model wouldn't normally catch.
Construction
- dictionary: { label: [term, term, ...] }. Each label can have many terms; the gazetteer maps any matching term back to that label.
- language?: Language. Optional hint for when the terms are language-specific.
The constructor throws if the dictionary is malformed (e.g. empty terms or duplicate terms under different labels).
Properties
Methods
label(term: string): string | null
Returns the label term maps to, or null for a miss.
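The construction rules and the label lookup described above can be tried in isolation with a small stand-in. This is only a sketch of the documented semantics, not the library's implementation: SketchGazetteer and Dictionary are hypothetical names, "empty terms" is read here as an empty term string, and matching is assumed to be exact and case-sensitive, which the text above does not specify.

```typescript
// Hypothetical stand-in, NOT the library's Gazetteer. It only mirrors the
// behavior described in this section so the semantics can be tested alone.
type Dictionary = { [label: string]: string[] };

class SketchGazetteer {
  // Reverse index built once at construction time: term -> label.
  private readonly index = new Map<string, string>();

  constructor(dictionary: Dictionary) {
    for (const label of Object.keys(dictionary)) {
      for (const term of dictionary[label]) {
        // Assumption: "empty terms" means an empty term string.
        if (term.length === 0) {
          throw new Error(`empty term under label "${label}"`);
        }
        // The same term under two different labels is ambiguous, so reject it.
        const existing = this.index.get(term);
        if (existing !== undefined && existing !== label) {
          throw new Error(
            `term "${term}" appears under both "${existing}" and "${label}"`
          );
        }
        this.index.set(term, label);
      }
    }
  }

  // Assumption: exact, case-sensitive matching.
  label(term: string): string | null {
    const hit = this.index.get(term);
    return hit === undefined ? null : hit;
  }
}

const gz = new SketchGazetteer({
  Product: ["WidgetPro", "GadgetMax"],
  Identifier: ["parseConfig"],
});

const productLabel = gz.label("WidgetPro"); // a term listed under Product
const missLabel = gz.label("unlisted");     // a term in no list
```

Building the reverse index up front keeps each label() call to a single map lookup and surfaces a malformed dictionary at construction time rather than on first use.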
Using a gazetteer with Tagger
Gazetteer hits override the system model for that scheme. For example, if the tagger is constructed with the "nameType" scheme, calling setGazetteers([gz], "nameType") replaces the standard OrganizationName / PersonalName labels with your custom labels for the words that match.
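The override rule can be illustrated without the real Tagger. The sketch below is an assumption-laden model of the behavior, not the library's code: TaggedToken, Lookup, and applyGazetteers are hypothetical names, the "Vendor" label is invented for the example, and the rule that the first matching gazetteer in the list wins is an assumption the text above does not confirm.

```typescript
// Hypothetical model of "gazetteer hits override the system model".
// None of these names come from the library.
interface TaggedToken {
  token: string;
  tag: string;
}

// A gazetteer reduced to its lookup: term -> label, or null for a miss.
type Lookup = (term: string) => string | null;

function applyGazetteers(
  modelTags: TaggedToken[],
  gazetteers: Lookup[]
): TaggedToken[] {
  return modelTags.map(({ token, tag }) => {
    // Assumption: gazetteers are consulted in order and the first hit wins.
    for (const lookup of gazetteers) {
      const hit = lookup(token);
      if (hit !== null) {
        return { token, tag: hit };
      }
    }
    // No gazetteer matched, so the system model's tag is kept.
    return { token, tag };
  });
}

// Tags as a "nameType" tagger might produce them before any gazetteer is attached.
const modelTags: TaggedToken[] = [
  { token: "Acme", tag: "OrganizationName" },
  { token: "Alice", tag: "PersonalName" },
];

// A one-term gazetteer mapping "Acme" to the custom label "Vendor".
const vendors: Lookup = (term) => (term === "Acme" ? "Vendor" : null);

const overridden = applyGazetteers(modelTags, [vendors]);
```

In this model only the matching word changes: "Acme" takes the custom label, while "Alice" keeps the tag the system model assigned.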
Notes
- This is the lightweight, in-script way to extend tagging. The heavier alternative, out of scope here, is NLModel, which trains a full classifier from labeled data.
- Multiple gazetteers can be attached to the same scheme via setGazetteers([gz1, gz2, ...], scheme).
