When you compile a grammar by using edktool, you can add an optional configuration file to specify additional options for compilation.
This configuration file allows you to specify character expansions, which cause Eduction to detect certain characters as if they were a different character. For example, you can use expansions so that different varieties of a punctuation character match the standard form that you use in your grammar files.
The compilation configuration file is a JSON file that contains your character expansions.
You specify an expansions array, which contains a list of your expansions. Each array item has a src element and a dest element:
src. The source character. Eduction detects the destination characters as if they are this source character. In the output text, Eduction normalizes all the destination characters to the source character.
dest. An array of destination characters that you want to detect as the source character.
The following example configuration matches the letters b and c as if they are the letter a:
{
"expansions": [
{ "src": "a", "dest": ["b", "c"] }
]
}
When your grammar includes the following pattern:
<pattern>ade</pattern>
And your input contains the following text:
ade bde cde dde
Eduction matches ade, bde, and cde as if they are all ade. It does not match dde, because d is not a destination character for a. Eduction produces the following normalized matches:
ade ade ade
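The behavior in this example can be mimicked outside Eduction with a short Python sketch. It is an illustration only, not how Eduction is implemented: it assumes that replacing every destination character with its source character before searching gives the same result as the expansion, and it treats the pattern as a literal string. The function names build_translation and simulate_matches are hypothetical.

```python
import re


def build_translation(config):
    """Map each destination character to its source character."""
    table = {}
    for expansion in config["expansions"]:
        for dest_char in expansion["dest"]:
            table[ord(dest_char)] = expansion["src"]
    return table


def simulate_matches(config, literal_pattern, text):
    """Approximate the expansion: normalize the text, then search it.

    Assumption: normalizing the whole input first is equivalent to what
    Eduction does for a simple literal pattern like the one above.
    """
    normalized = text.translate(build_translation(config))
    return re.findall(re.escape(literal_pattern), normalized)


config = {"expansions": [{"src": "a", "dest": ["b", "c"]}]}
matches = simulate_matches(config, "ade", "ade bde cde dde")
print(" ".join(matches))  # ade ade ade
```

The sketch reports three matches because b and c are both rewritten to a, while dde is left unchanged and therefore does not match.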
You add a configuration file to your compilation by setting the -c command-line option in the compile command. For more information, see Compile.
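For the punctuation use case described earlier, the configuration file can be generated rather than written by hand. The following Python sketch builds an expansions array that maps curly quotation marks to their straight forms and writes it as JSON. The file name compile_config.json and the helper names are hypothetical, and the check that every src and dest value is a single character is an assumption based on the descriptions above, not a documented rule.

```python
import json


def make_expansion(src, dest):
    """Build one item for the expansions array.

    Assumption: src and every dest entry are single characters.
    """
    for char in [src, *dest]:
        if len(char) != 1:
            raise ValueError(f"expected a single character, got {char!r}")
    return {"src": src, "dest": list(dest)}


def build_config(expansions):
    """Serialize the configuration; non-ASCII characters become \\uXXXX escapes."""
    return json.dumps({"expansions": expansions}, indent=2)


config_text = build_config([
    # Curly single quotes are detected as a straight apostrophe.
    make_expansion("'", ["\u2018", "\u2019"]),
    # Curly double quotes are detected as a straight double quote.
    make_expansion('"', ["\u201c", "\u201d"]),
])

if __name__ == "__main__":
    # Hypothetical file name; pass this path with the -c option.
    with open("compile_config.json", "w", encoding="ascii") as handle:
        handle.write(config_text + "\n")
```

Leaving json.dumps at its default of escaping non-ASCII characters keeps the file pure ASCII, so the result does not depend on which text encoding the file is read with.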
When you compile a grammar by using the Eduction SDK, you can specify the path to a compilation configuration file by using one of the following options:
C API: the EdkLoadResourceFileWithCompileConfig and EdkLoadResourceBufferWithCompileConfig functions.
Java API: the loadResourceFile, loadResourceFiles, and loadResourceBuffer methods in the TextExtractionEngine interface.
.NET API: the GetCompiler method on the EDKFactory class.
For more information, refer to the API documentation.