Speech APIs and Natural Language Processing (NLP) are emerging trends in user-interface design, promising interfaces that need no clicking or typing in desktop software or mobile applications. The Speech API has two aspects: speech recognition, and speech synthesis, the latter also known as text-to-speech (TTS).
In speech recognition, input is captured by a microphone and sent to a speech service, which checks it against a predefined grammar engine or grammar database. Once the input is successfully matched, a response is returned to the calling API and further actions can be taken. On the web, the main controller interface is SpeechRecognition, which also handles the SpeechRecognitionEvent objects sent from the recognition service. By default, the device's own speech recognition system is used, such as Dictation on macOS, Siri on iOS, Cortana on Windows 10, or Android Speech.
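The flow above can be sketched as follows. This is a minimal, hedged example assuming a browser page: the color list and the grammar name are invented for illustration, while the constructor and method names (including the prefixed Chrome variants) come from the Web Speech API.

```javascript
// Illustrative grammar in JSGF format; the word list is made up.
const colors = ["red", "green", "blue"];
const grammar =
  `#JSGF V1.0; grammar colors; public <color> = ${colors.join(" | ")};`;

// Feature-detect the (possibly prefixed) constructors before using them,
// so this also runs harmlessly outside a supporting browser.
const Recognition =
  globalThis.SpeechRecognition || globalThis.webkitSpeechRecognition;
const GrammarList =
  globalThis.SpeechGrammarList || globalThis.webkitSpeechGrammarList;

if (Recognition && GrammarList) {
  const recognition = new Recognition();
  const grammarList = new GrammarList();
  grammarList.addFromString(grammar, 1); // weight between 0 and 1
  recognition.grammars = grammarList;    // checked against incoming audio
}
```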
Note: In some browsers, such as Chrome, speech recognition on an HTML page uses a server-based recognition engine: audio is sent to a web service for processing, so recognition won't work offline.
Browser and OS Support
Web Speech API speech recognition is currently limited to Chrome for Desktop and Android — Chrome has supported it since version 33 but with prefixed interfaces, so you need to include prefixed versions of them, e.g. webkitSpeechRecognition.
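A common hedged pattern is to alias the prefixed Chrome interfaces to the standard names up front, so the rest of the code can stay prefix-free; the `supported` flag below is an illustrative variable, not part of the API.

```javascript
// Alias the prefixed interfaces (Chrome) to the unprefixed names.
const SpeechRecognition =
  globalThis.SpeechRecognition || globalThis.webkitSpeechRecognition;
const SpeechRecognitionEvent =
  globalThis.SpeechRecognitionEvent || globalThis.webkitSpeechRecognitionEvent;

const supported = typeof SpeechRecognition === "function";
if (!supported) {
  console.log("Speech recognition is not available in this environment.");
}
```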
SpeechRecognition()
Creates a new SpeechRecognition object.
Properties of Speech Recognition
SpeechRecognition.grammars
Returns and sets a collection of SpeechGrammar objects that represent the grammars that will be understood by the current SpeechRecognition.
SpeechRecognition.lang
Returns and sets the language of the current SpeechRecognition. If not specified, this defaults to the HTML lang attribute value, or the user agent's language setting if that isn't set either.
SpeechRecognition.continuous
Controls whether continuous results are returned for each recognition, or only a single result. Defaults to single (false).
SpeechRecognition.interimResults
Controls whether interim results should be returned (true) or not (false). Interim results are results that are not yet final (e.g. the SpeechRecognitionResult.isFinal property is false).
SpeechRecognition.maxAlternatives
Sets the maximum number of SpeechRecognitionAlternatives provided per result. The default value is 1.
SpeechRecognition.serviceURI
Specifies the location of the speech recognition service used by the current SpeechRecognition to handle the actual recognition. The default is the user agent's default speech service.
Methods of Speech Recognition
SpeechRecognition.abort()
Stops the speech recognition service from listening to incoming audio, and doesn't attempt to return a SpeechRecognitionResult.
SpeechRecognition.start()
Starts the speech recognition service listening to incoming audio with intent to recognize grammars associated with the current SpeechRecognition.
SpeechRecognition.stop()
Stops the speech recognition service from listening to incoming audio, and attempts to return a SpeechRecognitionResult using the audio captured so far.
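The properties and methods above can be sketched together in one hedged example, guarded so it also runs outside a supporting browser. The settings object and its values are illustrative; the property and method names are from the interface.

```javascript
// Hypothetical settings for a dictation-style session.
const recognitionSettings = {
  lang: "en-US",        // recognition language
  continuous: true,     // keep returning results until stop()
  interimResults: true, // deliver not-yet-final results too
  maxAlternatives: 1,   // one SpeechRecognitionAlternative per result
};

const Recognition =
  globalThis.SpeechRecognition || globalThis.webkitSpeechRecognition;
if (Recognition) {
  const recognition = Object.assign(new Recognition(), recognitionSettings);
  recognition.start();   // begin listening
  // Later, one of:
  // recognition.stop();  // stop and try to return a result from captured audio
  // recognition.abort(); // stop without returning a result
}
```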
Listen to these events using addEventListener() or by assigning an event listener to the oneventname property of this interface.
audiostart
Fired when the user agent has started to capture audio. Also available via the onaudiostart property.
audioend
Fired when the user agent has finished capturing audio. Also available via the onaudioend property.
end
Fired when the speech recognition service has disconnected. Also available via the onend property.
error
Fired when a speech recognition error occurs. Also available via the onerror property.
nomatch
Fired when the speech recognition service returns a final result with no significant recognition. This may involve some degree of recognition that doesn't meet or exceed the confidence threshold. Also available via the onnomatch property.
result
Fired when the speech recognition service returns a result — a word or phrase has been positively recognized and this has been communicated back to the app. Also available via the onresult property.
soundstart
Fired when any sound — recognisable speech or not — has been detected. Also available via the onsoundstart property.
soundend
Fired when any sound — recognisable speech or not — has stopped being detected. Also available via the onsoundend property.
speechstart
Fired when sound that is recognised by the speech recognition service as speech has been detected. Also available via the onspeechstart property.
speechend
Fired when speech recognised by the speech recognition service has stopped being detected. Also available via the onspeechend property.
start
Fired when the speech recognition service has begun listening to incoming audio with intent to recognize grammars associated with the current SpeechRecognition. Also available via the onstart property.
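Wiring these events to handlers might look like the sketch below, guarded for non-browser runs. `bestTranscript()` is a hypothetical helper, not part of the API; the event names and handler properties are from the interface.

```javascript
// Hypothetical helper: take the newest result and its first
// (highest-confidence) alternative, and return its transcript.
function bestTranscript(event) {
  const result = event.results[event.results.length - 1];
  return result[0].transcript;
}

const Recognition =
  globalThis.SpeechRecognition || globalThis.webkitSpeechRecognition;
if (Recognition) {
  const recognition = new Recognition();
  recognition.onstart = () => console.log("Listening...");
  recognition.onresult = (event) =>
    console.log("Heard:", bestTranscript(event));
  recognition.onnomatch = () =>
    console.log("Speech detected, but below the confidence threshold.");
  recognition.onerror = (event) => console.error("Error:", event.error);
  recognition.onend = () => console.log("Recognition service disconnected.");
  recognition.start();
}
```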
Speech synthesis is the reverse process: input is typed with a keyboard (or selected with a mouse), and the output is spoken through an audio speaker. This concept is also known as text-to-speech, or TTS. The main controller interface for this part of the Web Speech API is SpeechSynthesis, along with a number of closely related interfaces representing the text to be synthesized (known as utterances) and the different voices that can be used to speak it.
Properties of Speech Synthesis
SpeechSynthesis.paused Read only
A Boolean that returns true if the SpeechSynthesis object is in a paused state.
SpeechSynthesis.pending Read only
A Boolean that returns true if the utterance queue contains as-yet-unspoken utterances.
SpeechSynthesis.speaking Read only
A Boolean that returns true if an utterance is currently in the process of being spoken — even if SpeechSynthesis is in a paused state.
Methods of Speech Synthesis
SpeechSynthesis.cancel()
Removes all utterances from the utterance queue.
SpeechSynthesis.getVoices()
Returns a list of SpeechSynthesisVoice objects representing all the available voices on the current device.
SpeechSynthesis.pause()
Puts the SpeechSynthesis object into a paused state.
SpeechSynthesis.resume()
Puts the SpeechSynthesis object into a non-paused state: resumes it if it was already paused.
SpeechSynthesis.speak()
Adds an utterance to the utterance queue; it will be spoken when any other utterances queued before it have been spoken.
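The queue-control calls above can be sketched in one short, hedged example, guarded so it also runs outside a supporting browser; the sample sentence is invented for illustration.

```javascript
// Illustrative text to hand to the synthesizer.
const sampleText = "The Web Speech API can read this sentence aloud.";

const synth = globalThis.speechSynthesis;
if (synth && globalThis.SpeechSynthesisUtterance) {
  const utterance = new SpeechSynthesisUtterance(sampleText);
  synth.speak(utterance); // queued; spoken after earlier utterances finish
  synth.pause();          // synth.paused becomes true
  synth.resume();         // back to a non-paused state
  synth.cancel();         // empties the utterance queue
}
```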
Listen to this event using addEventListener() or by assigning an event listener to the oneventname property of this interface.
voiceschanged
Fired when the list of SpeechSynthesisVoice objects that would be returned by the SpeechSynthesis.getVoices() method has changed. Also available via the onvoiceschanged property.
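Because voices often load asynchronously, getVoices() can return an empty list on first call; a common hedged pattern is to wait for this event before choosing a voice. `pickVoice()` is a hypothetical helper, not part of the API.

```javascript
// Hypothetical helper: prefer an exact language match,
// else fall back to the first available voice, else null.
function pickVoice(voices, lang) {
  return voices.find((v) => v.lang === lang) || voices[0] || null;
}

const synth = globalThis.speechSynthesis;
if (synth) {
  synth.addEventListener("voiceschanged", () => {
    const voice = pickVoice(synth.getVoices(), "en-US");
    if (voice) console.log("Will use voice:", voice.name);
  });
}
```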