Advanced Speech SDK Controls
  • 07 Jun 2024

Advanced Controls

The Vuzix Speech Recognition engine provides the advanced controls described here, which have been expanded since the initial SDK release. The features here require the Vuzix Speech SDK version 1.4, and M300 or M300XL running version 1.3 or higher. Sections that name a later SDK version require that version instead.

Enabling and Disabling Speech Recognition

The Vuzix Speech SDK listens for the trigger phrase "Hello Vuzix" whenever Vuzix Speech Recognition is enabled in the Settings menu. When Speech Recognition is disabled, the microphone icon in the notification bar is grayed out. When Speech Recognition is enabled, the microphone icon becomes outlined.

The speech recognizer has global commands, such as "go home" and "flashlight on", that are processed in any application. The recognizer also supports custom vocabulary that is processed by each individual application.

An application may rely on custom voice commands to perform essential tasks. In this scenario, requiring the user to navigate to the system Settings menu would be an unwanted burden. Instead, Vuzix Speech Recognition may be programmatically enabled from within the application.

import com.vuzix.sdk.speechrecognitionservice.VuzixSpeechClient;
try {
	VuzixSpeechClient.EnableRecognizer(getApplicationContext(), true);
}
catch(NoClassDefFoundError e) {
	// This device does not implement the Vuzix Speech SDK
	// todo: Implement error recovery
}

This method is static. Passing the optional context parameter allows the proper user permissions to be applied, and is recommended for robustness.

The recognizer may be similarly disabled via code during times when false detection would impair the application behavior.

VuzixSpeechClient.EnableRecognizer(getApplicationContext(), false);

Once Vuzix Speech Recognition is disabled, the notification bar icon is grayed out, and the phrase "Hello Vuzix" will no longer trigger speech recognition.
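
The enable and disable calls can be wrapped so that a device without the speech SDK degrades gracefully. The following is a minimal sketch, not part of the Vuzix SDK: SpeechSwitch and SafeSpeechToggle are hypothetical names, and the SpeechSwitch interface stands in for VuzixSpeechClient.EnableRecognizer() so the example compiles without the SDK.

```java
// Hypothetical stand-in for the SDK call; on a device this would delegate to
// VuzixSpeechClient.EnableRecognizer(getApplicationContext(), enabled).
interface SpeechSwitch {
    void setEnabled(boolean enabled);
}

// Hypothetical helper: applies the desired state and reports whether the
// speech SDK was available, instead of letting the error propagate.
final class SafeSpeechToggle {
    private final SpeechSwitch speechSwitch;

    SafeSpeechToggle(SpeechSwitch speechSwitch) {
        this.speechSwitch = speechSwitch;
    }

    // Returns true if the call completed, false if the SDK classes are missing.
    boolean apply(boolean enabled) {
        try {
            speechSwitch.setEnabled(enabled);
            return true;
        } catch (NoClassDefFoundError e) {
            // This device does not implement the Vuzix Speech SDK.
            return false;
        }
    }
}
```

An activity could call apply(true) when it needs voice commands and use the returned value to decide whether to show a non-voice fallback.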

It is safe to set Speech Recognition to its existing state, so there is no need to query the state before enabling or disabling Vuzix Speech Recognition. Simply specify the desired state. However, if you want to display the current enabled/disabled state, you can query it using isRecognizerEnabled(). This value is not changed by the system while your application is active, so the appropriate place for this query is your activity's onResume().

boolean mSpeechEnabled;
@Override protected void onResume() {
	super.onResume();
	mSpeechEnabled = VuzixSpeechClient.isRecognizerEnabled(this);
	// todo: update status to user showing state of mSpeechEnabled
}

Triggering the Speech Recognizer

When Speech Recognition is enabled, the recognizer remains in a low-power mode listening only for the trigger phrase, "Hello Vuzix". Once this is heard, the recognizer wakes and becomes active. This state is indicated by the microphone icon in the notification bar becoming fully filled. While active, all audio data is scanned for known phrases.

An application can programmatically trigger the recognizer to wake and become active, rather than relying on the "Hello Vuzix" trigger phrase. This can be tied to a button press or a fragment opening.

import com.vuzix.sdk.speechrecognitionservice.VuzixSpeechClient;
try {
	VuzixSpeechClient.TriggerVoiceAudio(getApplicationContext(), true);
}
catch(NoClassDefFoundError e) {
	// This device does not implement the Vuzix Speech SDK
	// todo: Implement error recovery
}

The recognizer has a timeout that can be modified in the system Settings menu. The active recognizer will return to idle mode after that duration has elapsed since the most recent phrase was recognized. This state is again indicated by the microphone icon in the notification bar returning to the unfilled outline icon, and the recognizer will only respond to the trigger phrase "Hello Vuzix."

Some workflows are best served by returning the active recognizer to idle at a specific time, for example while recording a voice memo. This prevents phrases such as "go back" and "go home" from being recognized and acted upon.

The recognizer may be programmatically un-triggered to the idle state with the same method.

VuzixSpeechClient.TriggerVoiceAudio(getApplicationContext(), false);
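
The voice memo case can be sketched as follows. This is an illustration under stated assumptions rather than SDK code: RecognizerTrigger and VoiceMemoSession are hypothetical names, and RecognizerTrigger stands in for VuzixSpeechClient.TriggerVoiceAudio() so the example compiles without the SDK.

```java
// Hypothetical stand-in for the SDK call; on a device this would delegate to
// VuzixSpeechClient.TriggerVoiceAudio(getApplicationContext(), active).
interface RecognizerTrigger {
    void setActive(boolean active);
}

// Hypothetical voice memo workflow that idles the recognizer before recording,
// so phrases such as "go back" are not acted upon while the memo is captured.
final class VoiceMemoSession {
    private final RecognizerTrigger trigger;
    private boolean recording;

    VoiceMemoSession(RecognizerTrigger trigger) {
        this.trigger = trigger;
    }

    void startRecording() {
        trigger.setActive(false); // un-trigger: return the recognizer to idle
        recording = true;         // placeholder for starting the real audio capture
    }

    void stopRecording() {
        recording = false;        // the recognizer stays idle until triggered again
    }

    boolean isRecording() {
        return recording;
    }
}
```

The sketch does not re-trigger the recognizer when recording stops. Whether to do so is a design choice for the application.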

Trigger State Notification

Since the Speech Recognition engine may be triggered externally and may time out internally, applications that wish to control this behavior will likely need to know the state of the recognizer.

The same Speech Recognition Intent that broadcasts phrases also broadcasts state change updates. Simply check for the presence of the extra boolean RECOGNIZER_ACTIVE_BOOL_EXTRA.

boolean mSpeechTriggered;
@Override public void onReceive(Context context, Intent intent) {
	if (VuzixSpeechClient.ACTION_VOICE_COMMAND.equals(intent.getAction())) {
		Bundle extras = intent.getExtras();
		if (extras != null) {
			// We will determine what type of message this is based upon the extras provided
			if (extras.containsKey(VuzixSpeechClient.RECOGNIZER_ACTIVE_BOOL_EXTRA)) {
				// if we get a recognizer active bool extra, it means the recognizer was
				// activated or stopped
				mSpeechTriggered = extras.getBoolean(VuzixSpeechClient.RECOGNIZER_ACTIVE_BOOL_EXTRA, false);
				// todo: Implement behavior based upon the recognizer being changed to active or idle
			}
		}
	}
}

Since the state may also change while your application is not running, if you display the state using these notifications, you should also query the current state in your onResume().

boolean mSpeechTriggered;
@Override protected void onResume() {
	super.onResume();
	mSpeechTriggered = VuzixSpeechClient.isRecognizerTriggered(this);
	// todo: Implement behavior based upon the recognizer being changed to active or idle
}
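
The broadcast updates and the onResume() query can feed a single tracker so the user interface is refreshed only on a real change. The following is a minimal sketch under assumptions: TriggerStateTracker is a hypothetical class, a Map stands in for the Intent extras Bundle, and ACTIVE_EXTRA is a placeholder string, not the actual value of VuzixSpeechClient.RECOGNIZER_ACTIVE_BOOL_EXTRA.

```java
import java.util.Map;
import java.util.function.Consumer;

// Hypothetical tracker that merges broadcast updates with the onResume() query
// and notifies a listener only when the triggered state actually changes.
final class TriggerStateTracker {
    // Placeholder key; real code would use VuzixSpeechClient.RECOGNIZER_ACTIVE_BOOL_EXTRA.
    static final String ACTIVE_EXTRA = "recognizer_active_placeholder";

    private final Consumer<Boolean> listener;
    private Boolean triggered; // null until the first state is known

    TriggerStateTracker(Consumer<Boolean> listener) {
        this.listener = listener;
    }

    // Call from onReceive(); the Map stands in for the Intent extras Bundle.
    void onBroadcastExtras(Map<String, ?> extras) {
        if (extras != null && extras.get(ACTIVE_EXTRA) instanceof Boolean) {
            update((Boolean) extras.get(ACTIVE_EXTRA));
        }
    }

    // Call from onResume() with the result of isRecognizerTriggered(this).
    void onResumeQuery(boolean currentlyTriggered) {
        update(currentlyTriggered);
    }

    private void update(boolean newState) {
        if (triggered == null || triggered != newState) {
            triggered = newState;
            listener.accept(newState);
        }
    }
}
```

Broadcasts that carry a recognized phrase rather than a state change contain no active extra, so the tracker ignores them.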

Startup Timing Concerns

Applications that launch automatically with the operating system may be initialized before the speech engine has come online. This is true for launcher applications, among others. Any speech queries or commands issued at startup will fail, and must be retried after the speech engine comes online. In such applications, you should guard initialization logic with a check such as:

if( VuzixSpeechClient.isRecognizerInitialized(this) ) {
	//todo perform your speech customizations here
}

Even if the initialization code cannot be run at startup, you should still register the broadcast receiver for the trigger state, as described in the preceding section. When the engine becomes initialized, it sends out an initial trigger state. The receipt of this trigger state can cause your application to retry the speech initialization. This allows you to create an application that starts before the speech engine and can interact with the speech engine as soon as it becomes available, without any unnecessary polling.
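
The retry-on-broadcast pattern described above can be sketched as a small run-once helper. This is an illustration, not SDK code: DeferredSpeechInit is a hypothetical class, the BooleanSupplier stands in for VuzixSpeechClient.isRecognizerInitialized(this), and the Runnable stands in for your own speech customizations.

```java
import java.util.function.BooleanSupplier;

// Hypothetical helper that runs speech setup once, either at startup or on the
// first trigger-state broadcast received after the engine comes online.
final class DeferredSpeechInit {
    private final BooleanSupplier engineReady; // stands in for isRecognizerInitialized(context)
    private final Runnable setup;              // the application's speech customizations
    private boolean done;

    DeferredSpeechInit(BooleanSupplier engineReady, Runnable setup) {
        this.engineReady = engineReady;
        this.setup = setup;
    }

    // Call at startup and again from the broadcast receiver's onReceive().
    // Returns true once setup has run.
    boolean tryInit() {
        if (!done && engineReady.getAsBoolean()) {
            setup.run();
            done = true;
        }
        return done;
    }
}
```

Because tryInit() is idempotent, it can be called from every trigger-state broadcast without repeating the setup.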

Canceling Repeating Characters

Certain commands like "scroll up" and "scroll down" initiate repeating key presses. This allows the user interface to continue to scroll in the selected direction. The repeating key presses stop when the engine detects any other phrase, such as "select this". The default phrase "stop" is recognized by the speech engine and has no behavior other than to terminate the scrolling.

You may wish to stop repeating key presses programmatically without requiring the user to say another phrase. This is useful when reaching the first or last item in a list. To do this, simply call StopRepeatingKeys().

try {
	// sc is assumed to be an existing VuzixSpeechClient instance
	sc.StopRepeatingKeys();
}
catch(NoClassDefFoundError e) {
	// The ability to stop repeating keys was added in Speech SDK v1.6 which
	// was released on M400 v1.1.4. Earlier versions will not support this.
}
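
The list-boundary case can be sketched as follows. This is a minimal illustration under assumptions: ScrollBoundaryWatcher is a hypothetical class, and the Runnable stands in for the sc.StopRepeatingKeys() call, including its try/catch for older SDK versions.

```java
// Hypothetical helper that stops key repetition when the selection reaches
// either end of a list. The Runnable stands in for sc.StopRepeatingKeys().
final class ScrollBoundaryWatcher {
    private final int itemCount;
    private final Runnable stopRepeatingKeys;

    ScrollBoundaryWatcher(int itemCount, Runnable stopRepeatingKeys) {
        this.itemCount = itemCount;
        this.stopRepeatingKeys = stopRepeatingKeys;
    }

    // Call whenever the selected position changes, e.g. from a selection listener.
    void onSelectionChanged(int position) {
        boolean atFirst = position <= 0;
        boolean atLast = position >= itemCount - 1;
        if (atFirst || atLast) {
            stopRepeatingKeys.run();
        }
    }
}
```

With a five-item list, selections at positions 0 and 4 stop the repetition, while positions 1 through 3 let scrolling continue.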

Get the Maximum Recognizer Timeout Time

Beginning with SDK v1.91, you can query the maximum recognizer timeout.

int recognizedMaxTimeoutTime = sc.getRecognizerTimeoutMax();

Getting and Setting the Recognizer Timeout Config

Beginning with SDK v1.91, you can retrieve and set the recognizer timeout config.

int recognizerTimeoutConfig = sc.getRecognizerTimeoutConfig();
...
sc.setRecognizerTimeoutConfig(30); // in seconds
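
The two calls can be combined so that a requested timeout never exceeds the reported maximum. The following is a sketch under assumptions: RecognizerTimeouts is a hypothetical helper, the maximum is assumed to use the same units as the config value, and this article does not state how the setter itself treats out-of-range values.

```java
// Hypothetical helper: limits a requested timeout to the reported maximum.
final class RecognizerTimeouts {
    private RecognizerTimeouts() {}

    // requestedSeconds: the value the application would like to set.
    // maxSeconds: the value returned by sc.getRecognizerTimeoutMax()
    //             (assumed here to be in the same units).
    static int capToMax(int requestedSeconds, int maxSeconds) {
        return Math.min(requestedSeconds, maxSeconds);
    }
}
```

On a device, the result would be passed on as sc.setRecognizerTimeoutConfig(RecognizerTimeouts.capToMax(30, sc.getRecognizerTimeoutMax())).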

Sample Project

A sample application for Android Studio demonstrating the Vuzix Speech SDK is available to download here.

