How Spelling Correction Works
To enable spell checking, set the parameters SpellCheckMaxCheckTerms, SpellCheckIncorrectMaxDocOccs, and UnstemmedMinDocOccs in the [Server] section of the configuration file before you index content. When you perform a query that includes SpellCheck=True, the Content component uses these settings in the spell checking process, as follows:
- Content determines if the query is eligible for spell checking.
  Content checks how many terms the query text contains (it ignores stop words, proper-name terms, and hyphenated terms). If the number does not exceed the specified SpellCheckMaxCheckTerms, the query is eligible for spell checking.
- Content determines which terms are misspelled.
  Content checks how many times each query term occurs in its data index. If a term occurs fewer times than the specified SpellCheckIncorrectMaxDocOccs, Content assumes that the term is misspelled.
- Content finds correct spellings and suggests them.
  Content uses a proprietary term-distancing algorithm to find terms in its data index that are closest to the misspelled terms. It then checks how many times these terms occur. If a term occurs at least the specified number of UnstemmedMinDocOccs times, Content uses it as a spell check suggestion.
  Content returns the corrected terms as a comma-separated list in an <autn:spelling> field. It also returns a corrected version of the query text in an <autn:spellingquery> field.
- When you shut down the Content component, it creates a spelling correction file.
  The spelling correction file stores the corrections that you make. You can add further corrections to the file or amend existing corrections.
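
The eligibility, detection, and suggestion steps above can be illustrated with a minimal Python sketch. It is not Content's implementation: the threshold values and the stop word list are hypothetical, proper-name terms are approximated as capitalized words, and the standard-library difflib.get_close_matches function stands in for the proprietary term-distancing algorithm. The doc_occs argument is an assumed mapping from each indexed term to the number of documents that contain it.

```python
from difflib import get_close_matches

# Hypothetical values; in Content these are set in the [Server] section.
SPELL_CHECK_MAX_CHECK_TERMS = 5
SPELL_CHECK_INCORRECT_MAX_DOC_OCCS = 2
UNSTEMMED_MIN_DOC_OCCS = 10

# Hypothetical stop word list, for illustration only.
STOP_WORDS = {"the", "a", "of", "in"}


def is_checkable(word):
    """Stop words, hyphenated terms, and proper names are ignored.

    Treating a capitalized word as a proper name is an assumption of this sketch.
    """
    return (
        word.lower() not in STOP_WORDS
        and "-" not in word
        and not word[0].isupper()
    )


def spell_check(query, doc_occs):
    """Return (suggestions, corrected_query) for the given query text."""
    words = query.split()

    # Step 1: the query is eligible only if the number of checkable
    # terms does not exceed SpellCheckMaxCheckTerms.
    terms = [w for w in words if is_checkable(w)]
    if len(terms) > SPELL_CHECK_MAX_CHECK_TERMS:
        return [], query

    # Only terms that occur at least UnstemmedMinDocOccs times
    # can be offered as suggestions.
    trusted = [t for t, n in doc_occs.items() if n >= UNSTEMMED_MIN_DOC_OCCS]

    suggestions = []
    corrected = []
    for word in words:
        replacement = word
        # Step 2: a term that occurs fewer times than
        # SpellCheckIncorrectMaxDocOccs is assumed to be misspelled.
        occs = doc_occs.get(word.lower(), 0)
        if is_checkable(word) and occs < SPELL_CHECK_INCORRECT_MAX_DOC_OCCS:
            # Step 3: find the closest trusted term (difflib is a stand-in
            # for the proprietary term-distancing algorithm).
            match = get_close_matches(word.lower(), trusted, n=1)
            if match:
                replacement = match[0]
                suggestions.append(replacement)
        corrected.append(replacement)
    return suggestions, " ".join(corrected)
```

For example, with doc_occs = {"weather": 120, "forecast": 80, "wether": 1}, calling spell_check("wether forecast", doc_occs) treats "wether" as misspelled because it occurs only once, and suggests "weather". The first returned value corresponds to the contents of the <autn:spelling> field and the second to the <autn:spellingquery> field.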