The IvSpkIdTrainWav task takes a single audio file containing speech data from the speaker to be trained, and creates a new iVector speaker template file.
To process streamed audio, use the IvSpkIdTrainStream task.
| Parameter | Description | Required |
|---|---|---|
| Type | The task name. Set to IvSpkIdTrainWav. | Yes |
| FrameDupl | An integer value that improves processing speed without significantly changing recognition accuracy. | |
| DiagFile | The file to write the diagnostic information to. | |
| DiagLevel | The level of detail to include in the diagnostic information. | |
| File | The audio file that contains sample speech from one person. | Yes |
| LabFile | A single label file to use. | |
| LabType | The type of labels to use. | |
| Out | The name of the speaker template file to create. You must include the audio template file extension (.iv). | Yes |
| Sfreq | The sample frequency of the audio file to process. | |
| SugdInputChannels | The channel layout of the input media file. | |
| SugdInputFrequency | The sampling rate of the input media file. | |

http://localhost:15000/action=AddTask&Type=IvSpkIdTrainWav&File=C:/Data/BrownSpeech.wav&Out=Brown.iv

This action uses port 15000 to instruct HPE IDOL Speech Server, which is located on the local machine, to create the Brown.iv template file by using the BrownSpeech.wav file.
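
The request above can also be assembled programmatically. The following is a minimal Python sketch, not part of the product: the function name `build_train_url` and its arguments are hypothetical, and it only builds the URL string from the parameters in the table without sending it. It assumes the server accepts the query in the same form as the example, so `:` and `/` are left unescaped.

```python
from urllib.parse import urlencode


def build_train_url(audio_file, template_out, host="localhost", port=15000, **optional):
    """Build an AddTask URL for the IvSpkIdTrainWav task (hypothetical helper)."""
    # The Out parameter must include the .iv template file extension.
    if not template_out.lower().endswith(".iv"):
        raise ValueError("Out must end with the .iv template extension")

    params = {
        "action": "AddTask",
        "Type": "IvSpkIdTrainWav",
        "File": audio_file,
        "Out": template_out,
    }
    # Optional parameters from the table (for example Sfreq) pass through unchanged.
    params.update(optional)

    # Keep ':' and '/' unescaped so the result matches the documented example.
    query = urlencode(params, safe=":/")
    return f"http://{host}:{port}/{query}"


print(build_train_url("C:/Data/BrownSpeech.wav", "Brown.iv"))
```

With the default host and port, the printed URL is the same as the example request. Checking the .iv extension before building the URL catches the most likely mistake with the Out parameter early, on the client side.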