viewerhoogl.blogg.se - Acapela infovox alyona

#Acapela infovox alyona drivers#

A text-to-speech (TTS) engine has no built-in controls, so an engine alone is not enough to make your computer speak. To use TTS, you also need a reader program (TextAloud, Cool Reader, Balabolka, etc.) that provides the interface: it lets you work with the engine, change its settings, adjust the sound and timbre of the speech, and control other features.

Acapela, the developer of the popular Russian-language voice Nikolai, has released a new Russian female voice engine called Alyona. It runs on SAPI 5 at a sampling rate of 22 kHz. Alyona is far ahead of Nikolai in the quality of the synthesized speech, and according to users its timbre and intonation are more pleasant than those of the Katerina engine from ScanSoft RealSpeak.

Acapela Alyona works well with programs such as KooBAudio 0.7.0.7, mp3book2005, Balabolka, and Cool Reader. For example, with KooBAudio, mp3book2005, and this voice engine, a 4-hour novel can be voiced and converted to mp3 in 10 minutes.

The engine comes with Lexicon Manager, a dictionary editor that lets you change the pronunciation of words both alphabetically (by respelling them) and phonetically (by transcription).
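The idea behind a pronunciation dictionary can be sketched in a few lines of Python. This is only an illustration of alphabetic respelling applied to text before it reaches the engine. It is not Lexicon Manager's actual format or API, and the entries in `USER_LEXICON` and the function name `apply_lexicon` are made up for the example.

```python
import re

# Hypothetical user lexicon: written form -> respelling the voice reads correctly.
USER_LEXICON = {
    "kHz": "kilohertz",
    "mp3": "em pee three",
    "TTS": "tee tee ess",
}


def apply_lexicon(text, lexicon):
    """Replace whole-word, case-sensitive matches with their respellings."""
    if not lexicon:
        return text
    # Longest keys first, so a longer entry wins over a shorter one.
    keys = sorted(lexicon, key=len, reverse=True)
    alternatives = "|".join(re.escape(key) for key in keys)
    # The lookarounds keep "mp3" inside "mp3book2005" untouched.
    pattern = re.compile(r"(?<!\w)(" + alternatives + r")(?!\w)")
    return pattern.sub(lambda match: lexicon[match.group(1)], text)


print(apply_lexicon("A 22 kHz voice saved as mp3", USER_LEXICON))
```

A phonetic entry would work the same way, except that the replacement would be a transcription in whatever phoneme notation the engine accepts.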

Information: A text-to-speech (TTS) engine, or speech synthesis engine, is a program similar to a driver that converts text into a sound wave.

Acapela ELAN NIKOLAI Tempo Multimedia (Acapela ELAN Tempo Multimedia) V5.1.0.0 Russian (255 channels)

Acapela ALYONA Multimedia (Acapela Multimedia Alyona).

Automatic speech synthesis, the process of generating a speech signal, is a technology that makes it possible to read a text (document, letter, SMS) in a voice close to natural. For the synthesized speech to sound natural, a whole range of tasks has to be solved: ensuring a natural voice timbre, smooth sound, and intonation, as well as placing stresses correctly and expanding abbreviations, numbers, and special characters.

Synthesis technology can be in demand in a narrow subject area as well as in a wide or unlimited one. For a narrow area, the sound quality can be brought close to natural by assembling pre-recorded long speech fragments related to that area. An example of such synthesis (called macrosynthesis) is the train announcement systems used at stations in large Russian cities. It is much more difficult to make a speech synthesizer for unlimited text from any subject area, where the user can ask the synthesis system to pronounce any phrase or sentence.

Today, there are three main directions of synthesis: the diphonic approach (a diphone is the sound segment from the middle of one phoneme to the middle of the adjacent phoneme), the allophone approach (an allophone is the realization of a phoneme in its left and right context), and Unit Selection technology (selection of sound elements from a speech base). Each of them has its drawbacks:

Diphonic approach: produces a speech signal that is intelligible but unnatural in timbre. The timbre of the donor speaker is not recognizable in the synthesized speech.

Allophone approach: the voice is somewhat more natural than with the diphonic approach thanks to a larger set of sound elements. However, as in diphonic synthesis, the voice turns out quite robotic, and it is difficult to recognize the donor speaker in it.

Unit Selection: the timbre is highly natural, and the synthesized voice retains the timbre coloring of the donor speaker. However, because the size of the voice base is limited, some texts (words and their combinations) are pronounced with noticeable distortions, up to the complete loss of individual sounds.
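The Unit Selection idea described above, picking sound elements from a speech base, can be illustrated with a toy Python sketch. It uses a simple greedy longest-match lookup, which is a deliberate simplification: real unit selection systems typically score candidate units with target and concatenation costs. The base contents and the names `VOICE_BASE` and `select_units` are assumptions made for the example, and units are represented as tuples of phoneme labels rather than audio.

```python
def select_units(phonemes, voice_base):
    """Cover `phonemes` with the longest units found in `voice_base`.

    Returns (units, missing): the chosen units as tuples of phoneme labels,
    and the phonemes for which the base holds no unit at all.
    """
    max_len = max((len(unit) for unit in voice_base), default=0)
    units, missing = [], []
    i = 0
    while i < len(phonemes):
        # Try the longest candidate first, then progressively shorter ones.
        for size in range(min(max_len, len(phonemes) - i), 0, -1):
            candidate = tuple(phonemes[i:i + size])
            if candidate in voice_base:
                units.append(candidate)
                i += size
                break
        else:
            # Nothing in the base starts here: this sound would be lost.
            missing.append(phonemes[i])
            i += 1
    return units, missing


# Hypothetical toy base: two long recorded units plus a few single sounds.
VOICE_BASE = {
    ("p", "r", "i"), ("v", "e", "t"),
    ("p",), ("r",), ("i",), ("v",), ("e",),
}

print(select_units(list("privet"), VOICE_BASE))
print(select_units(list("tri"), VOICE_BASE))
```

The first call is covered entirely by two long units, which is the case where the result sounds most natural. The second call has no unit for "t", mirroring the loss of individual sounds that a limited voice base can cause.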