Text-to-Speech

1. Description

The Text-to-Speech component generates speech audio from a given text.

This component is present in flow types like:

  • Voice.

The block (Fig. 1(1)) is placed in the workspace where the Flow is built. Clicking the block opens the settings panel for this component (Fig. 1(2)).

Fig. 1. Text-to-Speech component

1.1. Text-to-Speech block

The block consists of the following elements:

  1. The name of the component;

  2. In branch — receiving (connecting with the previous block) branch;

  3. Out branch — outgoing (connecting with the following block) branch. When hovering over the Out field, a switch appears. The switch allows this component to be connected to a component that already has a connection.

An icon on the Out field indicates that the switch is on and that this component can be connected to a component that already has a connection.

Changing the switch position removes the outgoing branch of this component.

The delete button removes the block from the Flow Schema.

1.2. Text-to-Speech settings panel

It consists of the following elements:

  1. Provider;

  2. Key;

  3. Token;

  4. Language;

  5. Voice;

  6. Region;

  7. Add your custom flags here;

  8. Text type;

  9. Text;

  10. Get speech;

  11. Get digits;

  12. Break;

  13. Limit;

  14. Add description.

1.2.1. Provider

Here you can choose a voice generation service provider.

The following options are available:

  • google;

  • microsoft;

  • yandex.

1.2.2. Key

When you select microsoft, an additional Key field appears in which you must enter the key.

1.2.3. Token

When you select yandex, an additional Token field appears, in which you must enter the authorization token.

1.2.4. Language

Here, you can select the language in which you want to voice the text.

1.2.5. Voice

Here, you can select the voice that will be used to speak the text.

1.2.6. Region

The field is present when microsoft is selected in the Provider field.

Use it to select a region.
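The provider-dependent fields described in 1.2.2, 1.2.3, and 1.2.6 can be summarized as a small validation sketch. The dictionary keys (`provider`, `key`, `token`, `region`) and the function name are hypothetical and chosen only for illustration; they are not the product's actual configuration schema. Treating Region as mandatory for microsoft is also an assumption, since the page only says the field is present.

```python
# Hypothetical mapping of providers to the extra fields they need,
# following sections 1.2.2 (Key), 1.2.3 (Token), and 1.2.6 (Region).
# Assumption: Region is treated as required for microsoft.
REQUIRED_FIELDS = {
    "google": [],
    "microsoft": ["key", "region"],
    "yandex": ["token"],
}


def missing_fields(settings: dict) -> list:
    """Return the names of provider-specific fields that are empty or absent."""
    provider = settings.get("provider")
    if provider not in REQUIRED_FIELDS:
        raise ValueError(f"unknown provider: {provider!r}")
    return [name for name in REQUIRED_FIELDS[provider] if not settings.get(name)]
```

For example, `missing_fields({"provider": "microsoft", "key": "abc"})` reports that `region` is still empty.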

1.2.7. Add your custom flags here

The field is present when google is selected in the Provider field.

Use this field to add custom flags, written in JSON format.
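Because the custom flags are entered as JSON, it can help to check the text before saving it. The sketch below assumes the flags form a single JSON object; the flag names are placeholders, since this page does not list the supported flags.

```python
import json


def parse_custom_flags(raw: str) -> dict:
    """Parse the custom flags text; an empty field is treated as no flags."""
    if not raw.strip():
        return {}
    flags = json.loads(raw)  # raises json.JSONDecodeError on malformed JSON
    if not isinstance(flags, dict):
        # Assumption: flags are expected as one JSON object of name/value pairs.
        raise ValueError("custom flags must be a JSON object")
    return flags


# The flag names here are placeholders, not documented options.
example = parse_custom_flags('{"exampleFlag": 1.0, "anotherFlag": "value"}')
```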

1.2.8. Text type

The field for selecting the type of text. The following options are available:

  • ssml: Speech Synthesis Markup Language, an XML-based markup language for speech synthesis applications;

  • text: regular text.
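To illustrate the difference between the two text types, the following sketch wraps regular text in a minimal SSML document. The `<speak>` and `<break>` elements come from the SSML standard; the function name and the `pause_ms` parameter are hypothetical, and a given provider may require additional attributes or elements that are not shown here.

```python
from xml.sax.saxutils import escape


def to_ssml(text: str, pause_ms: int = 0) -> str:
    """Wrap plain text in a <speak> element, escaping XML special characters."""
    body = escape(text)  # converts &, < and > to XML entities
    if pause_ms > 0:
        # <break time="..."/> is the standard SSML element for a pause.
        body += f'<break time="{pause_ms}ms"/>'
    return f"<speak>{body}</speak>"
```

With the text type set to text, the same content would be entered as is, without markup or escaping.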

1.2.9. Text

Here, you enter the text you want to convert to audio.

1.2.10. Get speech

This group of settings recognizes what the subscriber says while listening to the sounds configured in this component.

It consists of the following elements:

  1. Get speech;

  2. Timeout.

Recognized speech is stored in the google_transcript variable.

1.2.10.1. Get speech

This switch enables or disables recognition of what the subscriber says while listening to the sounds configured in this component.

1.2.10.2. Timeout

Here, you can enter the number of milliseconds after the end of the speech during which the subscriber's speech is still recognized.

Only one of the Get speech and Get digits switches can be enabled at a time.
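The rule that Get speech and Get digits cannot both be enabled can be expressed as a simple check. This is a minimal sketch; the function name and the returned mode labels are hypothetical.

```python
def input_mode(get_speech: bool, get_digits: bool) -> str:
    """Return the active input mode, rejecting the combination that is not allowed."""
    if get_speech and get_digits:
        raise ValueError("Get speech and Get digits cannot both be enabled")
    if get_speech:
        return "speech"
    if get_digits:
        return "digits"
    return "none"  # neither switch is enabled
```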

1.2.11. Get digits

This group of settings captures what the subscriber presses on the phone while listening to the sounds configured in this component.

It consists of the following elements:

  1. Get digits;

  2. Min;

  3. Max;

  4. Set result to variable;

  5. Timeout;

  6. Digit timeout (ms);

  7. Tries;

  8. Terminators;

  9. Flush DTMF.

1.2.11.1. Get digits

This switch enables or disables capturing what the subscriber presses on the phone while listening to the sounds configured in this component.

1.2.11.2. Min

Here, you enter the minimum number of digits we expect to receive from the subscriber.

1.2.11.3. Max

Here, you enter the maximum number of digits we expect to receive from a subscriber.

1.2.11.4. Set result to variable

Here, you enter the name of the variable in which the subscriber's input is recorded while listening to the sounds configured in this component.

When the Get speech switch is enabled, the recognized speech of the subscriber is written to the variable in the form of text.

When the Get digits switch is enabled, the digits pressed by the subscriber are written to the variable.

1.2.11.5. Timeout

Here, you enter the waiting time for the subscriber to enter the required number of digits.

1.2.11.6. Digit timeout (ms)

Here, you enter the waiting time between digit entries: how many milliseconds to wait for the next digit. If this field is not filled in, the value is taken from the Timeout field.

1.2.11.7. Tries

Here, you enter the number of attempts the subscriber has to enter the required number of digits.

1.2.11.8. Terminators

If the field contains the “-” symbol, pressing the “#” button on the phone will be recorded into the variable. This is used to enable the “#” button in the voice menu.

If the field contains “#”, pressing the “#” button on the phone will exit the playback.
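The way Min, Max, Timeout, Digit timeout (ms), and Terminators interact can be sketched as a simulation of a single attempt. This is an illustration under stated assumptions, not the component's implementation: it assumes Timeout applies to the wait for the first digit and is expressed in milliseconds, that Digit timeout (ms) applies between later digits, and that an attempt fails when fewer than Min digits are collected. The function and parameter names are hypothetical.

```python
from typing import List, Optional, Tuple


def collect_digits(
    presses: List[Tuple[str, int]],  # (key, ms waited before this key press)
    min_digits: int,
    max_digits: int,
    timeout_ms: int,
    digit_timeout_ms: Optional[int] = None,
    terminators: str = "#",
) -> Optional[str]:
    """Simulate one attempt; return the collected keys, or None if too few."""
    # An empty Digit timeout falls back to the Timeout value (section 1.2.11.6).
    inter_digit_ms = timeout_ms if digit_timeout_ms is None else digit_timeout_ms
    collected = ""
    for index, (key, waited_ms) in enumerate(presses):
        limit_ms = timeout_ms if index == 0 else inter_digit_ms
        if waited_ms > limit_ms:
            break  # the subscriber took too long
        if key in terminators:
            break  # a terminator ends the input and is not recorded
        collected += key
        if len(collected) == max_digits:
            break  # the maximum number of digits has been reached
    return collected if len(collected) >= min_digits else None
```

With `terminators="#"`, the presses 1, 2, # yield `"12"`. With `terminators="-"`, the # key is not a terminator, so the same presses yield `"12#"`, which matches the behavior described for the “-” symbol.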

1.2.11.9. Flush DTMF

This switch controls whether the digits entered by the subscriber before the recording starts playing are canceled. Depending on its position, the switch either cancels those digits or leaves them in place.

1.2.12. Break

1.2.13. Limit

Fig. 2. Setting a limit

1.2.14. Add description

Fig. 3. Add description

 
