Optical Character Recognition
Optical Character Recognition (OCR) recognizes text in media. This includes text that appears in images and video, and text embedded in PDF files and Office document file formats.
Configuration Parameter | Description |
---|---|
CharacterTypes | The types of characters to include in the character set used for recognition. |
ContextCheck | Specifies whether to use context checking to improve OCR results. |
DetectAlphabet | Specifies whether to detect the alphabet for each image or page. |
DisabledCharacters | Characters to exclude from the character set used for recognition. |
ExtraEnabledCharacters | Extra characters to add to the character set. |
FontType | The basic character type of the text that you want to recognize. |
GPUDeviceID | The device ID of the GPU to use. |
HollowText | Specifies whether to look for outlined text. |
Input | The image track to process. |
KeepOnly | Keep only particular types of words and discard all others. |
Languages | The languages to use, which affects the character set and dictionaries used. |
MaxInputQueueLength | The maximum length of the input queue. Can be used to place a limit on latency. |
NumParallel | The maximum number of video frames to analyze simultaneously. |
OcrMode | The OCR mode to use when you ingest images or documents. |
Orientation | The orientation of text in the ingested media. |
OutputTablesByColumn | Specifies how to order the records produced when OCR encounters a table. |
ProcessTextElements | Specifies whether to merge the content of text elements into the OCR results. |
Region | A region of the image or video frame to restrict processing to. |
SampleInterval | The interval at which frames are selected to be analyzed. |
SceneAlgorithmBias | Specifies whether to prioritize accuracy or processing speed when finding text in scene mode. |
SceneImageSizeLimit | The maximum image size for text detection in scene mode. |
Spacing | Specifies whether to allow multiple spaces between words in the output from OCR. |
Type | The analysis engine to use. Set this parameter to OCR. |
UserDictionary | A comma-separated list of dictionaries to use in addition to the standard dictionaries. |
WordRejectThreshold | The minimum confidence level required to include a word in the output. |
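
As an illustration of how these parameters are set, the following is a minimal sketch of an OCR task section in a session configuration. The section name `OCRTask` and the `Languages` value are assumptions chosen for illustration; only `Type=OCR` comes from the table above. Check each parameter's reference page for its permitted values and defaults before relying on any of this.

```ini
// Hypothetical task section; the name "OCRTask" is an assumption.
[OCRTask]
// Selects the OCR analysis engine, as described in the table above.
Type=OCR
// Assumed example value: use the English character set and dictionaries.
Languages=en
```

Any other parameter from the table would be added to the same section as a further `Name=Value` line.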
Output Tracks
The following table describes the tracks that are generated by this engine. The Output column indicates whether the information contained in the track is included by default in the output created by an output task (when you don't set the Input parameter for the output task).
Output track | Description | Output |
---|---|---|
Data | Contains one record, describing the analysis results, per line of text, per video frame. | No |
DataWithSource | The same as the | No |
Result | Contains one record, describing the analysis results, for each line of text. When a line of text appears in many consecutive frames, Media Server produces a single result. | Yes |
ResultWithSource | The same as the | No |
CharResult | (Image/document ingest only) Contains one record, describing the analysis results, for each line of text. However, the records in this track also provide detail about individual characters. | No |
PageResult | (Image/document ingest only) Contains one record for each page, describing the orientation of the page, and the alphabet(s) and OCR mode that were used. | No |
TableResult | (Image/document ingest only) Contains one record for each table that is detected. | No |
WordData | Contains one record for each word, describing the analysis results, per video frame. | No |
WordResult | Contains one record for each word, describing the analysis results. | No |
Start | The same as the | No |
End | The same as the | No |
For more information, see OCR Results or use the GetExampleRecord action.