Speech recognition is a technology that lets a computer understand spoken language and carry out the requested task. It is also known as automatic speech recognition (ASR), computer speech recognition, or speech-to-text.
A machine that converses by voice relies on three technologies:
First, speech recognition, which allows the machine to catch the words, phrases and sentences we speak;
Second, natural language processing, which allows the machine to understand what we say; and
Third, speech synthesis, which allows the machine to speak.
Recognizing the speaker, as opposed to recognizing the words, can simplify the task of translating speech in systems that have been trained on a specific person’s voice. It can also be used to authenticate or verify a speaker’s identity as part of a security process.
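The verification step can be illustrated with a minimal sketch. It assumes that each recording has already been reduced to a fixed-length list of numbers (a "voiceprint"); the three-number vectors, the function names, and the 0.8 threshold below are made up for illustration and do not come from any real system. The sketch compares an enrolled voiceprint with a new sample using cosine similarity and accepts the speaker when the similarity is high enough.

```python
import math


def cosine_similarity(a, b):
    """Return the cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # a zero vector carries no voice information
    return dot / (norm_a * norm_b)


def verify_speaker(enrolled, sample, threshold=0.8):
    """Accept the sample if it is close enough to the enrolled voiceprint.

    The threshold is an illustrative assumption, not a recommended value.
    """
    return cosine_similarity(enrolled, sample) >= threshold


# Hypothetical voiceprints; real systems use much longer feature vectors.
enrolled_voiceprint = [0.9, 0.1, 0.4]
same_speaker_sample = [0.8, 0.2, 0.5]
other_speaker_sample = [0.1, 0.9, 0.2]

print(verify_speaker(enrolled_voiceprint, same_speaker_sample))   # True
print(verify_speaker(enrolled_voiceprint, other_speaker_sample))  # False
```

In practice the threshold trades off false accepts against false rejects, so it would be tuned on recorded data rather than fixed by hand.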
Analog to Digital Recognition:
There are two recognition models:
1) Acoustic model:
An acoustic model is created from audio recordings of speech and their text transcriptions, using software to build statistical representations of the sounds that make up each word. A speech recognition engine uses it to recognize speech.
2) Language model:
A language model is used in many natural language processing applications, such as speech recognition. It tries to capture the properties of a language and to predict the next word in a speech sequence.
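The idea of predicting the next word can be sketched with a very small bigram model, which estimates the next word only from the word before it. This is a simplified illustration, not how any particular recognition engine works; the three-sentence corpus and the function names are invented for the example.

```python
from collections import Counter, defaultdict


def train_bigram_model(sentences):
    """Count how often each word follows each other word."""
    follower_counts = defaultdict(Counter)
    for sentence in sentences:
        words = sentence.lower().split()
        for previous, current in zip(words, words[1:]):
            follower_counts[previous][current] += 1
    return follower_counts


def next_word_probabilities(model, previous):
    """Return {word: probability} for words seen after `previous`."""
    counts = model.get(previous.lower())
    if not counts:
        return {}  # the word never appeared with a follower in training
    total = sum(counts.values())
    return {word: count / total for word, count in counts.items()}


def predict_next_word(model, previous):
    """Return the most likely next word, or None if nothing is known."""
    probabilities = next_word_probabilities(model, previous)
    if not probabilities:
        return None
    return max(probabilities, key=probabilities.get)


# Tiny made-up training corpus.
corpus = [
    "recognize speech with a computer",
    "recognize speech from a microphone",
    "recognize speakers by voice",
]
model = train_bigram_model(corpus)

print(predict_next_word(model, "recognize"))  # speech
```

Here "speech" follows "recognize" in two of the three sentences, so it gets probability 2/3 and is chosen over "speakers". A recognition engine can use such probabilities to prefer word sequences that are likely in the language when the acoustic evidence alone is ambiguous.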
Types of Voice Recognition:
Speaker Dependent:
Speaker-dependent software is designed to recognize the unique characteristics of a single person’s voice, so it must first be trained on that voice.
Speaker Independent:
Speaker-independent software is designed to recognize anyone’s voice, so the user does not need to train it.