Speech-To-Text
Speech-To-Text transcribes words spoken in audio into text.
TIP: To use speech-to-text, you must install a speech-to-text language pack. Because of their size, language packs are not included with Media Server by default. For more information, see Install Speech-to-Text Language Packs.
| Configuration Parameter | Description |
|---|---|
| AlternativeWordsThreshold | A threshold that alternative words must meet to be included in the output. |
| CustomLanguageModel | The identifier and interpolation weight of each custom language model to use. |
| CustomWordDatabase | The name of a custom word database to use. |
| FilterMusic | Specifies whether to include speech-to-text results for audio segments identified as music or noise. |
| GPUDeviceID | The device ID of the GPU to use. |
| Input | The audio track to process. |
| LanguagePack | The language pack to use. |
| MatchWords | A comma-separated list of words to tag in the speech-to-text output. |
| MatchWordsAddUnknown | Specifies whether to add any unknown MatchWords to the language resource. |
| MatchWordsCaseSensitive | Specifies whether matches between MatchWords and the speech-to-text output are case-sensitive. |
| MatchWordsThreshold | The minimum score that is necessary for an alternative word to be considered a match to one of the words specified by MatchWords. |
| ModelVersion | The model to use to convert speech into text. |
| NumParallel | The maximum number of audio segments to process concurrently. |
| SpeedBias | Specifies whether to prioritize processing accuracy or speed. |
| SyncDatabase | Specifies whether to synchronize with the training database before beginning the analysis task. |
| Type | The analysis engine to use. Set this parameter to SpeechToText. |
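
As a rough illustration of how the parameters above might be combined, the following is a minimal sketch of a task section in a Media Server configuration file. The section name `SpeechToText`, the language pack name `ENUK`, the `MatchWords` values, and the `NumParallel` value are assumptions chosen for the example, as is the `//` comment syntax. Only `Type=SpeechToText` is taken directly from the table. Substitute the name of a language pack that you have installed.

```ini
// Hypothetical section name; it must match the name used to register the task.
[SpeechToText]
// Selects the speech-to-text analysis engine (from the table above).
Type=SpeechToText
// Assumed language pack name; use one that is installed on your server.
LanguagePack=ENUK
// Illustrative comma-separated list of words to tag in the output.
MatchWords=invoice,refund
// Illustrative limit on the number of audio segments processed concurrently.
NumParallel=2
```

Parameters that are omitted from the section are left at their defaults. For example, no `Input` is set in this sketch, so it relies on the default audio track selection.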
Output Tracks
| Output track | Description |
|---|---|
| Result | Contains a record for each word. |
For more information, see Speech-to-Text Results, or use the action GetExampleRecord.