Advanced Speech SDK Controls
  • 11 Jun 2024

Advanced Controls

The Vuzix Speech Command engine has advanced controls described here. These have been expanded since the initial SDK was released.

Enabling and Disabling Speech Recognition

The Vuzix Speech SDK will listen for the trigger phrases "Hello Vuzix" or "Hello Blade" whenever the Vuzix Speech Command engine is enabled in the Settings menu (unless explicitly removed by an application).

When the Speech Command engine is enabled, the microphone icon becomes outlined. When the Speech Command engine is disabled, the microphone icon in the notification bar is not present.

The Speech Command engine has global commands, such as "go home" and "start recording", that are processed in any application. The Speech Command engine also supports custom vocabulary that is processed by each individual application.

It is possible for an application to rely on custom voice commands to perform essential tasks. In this scenario, it would be an unwanted burden to require the user to navigate to the system Settings menu. Instead, the Speech Command engine may be programmatically enabled from within an application.

import com.vuzix.sdk.speechrecognitionservice.VuzixSpeechClient;
try {
	VuzixSpeechClient.EnableRecognizer(getApplicationContext(), true);
}
catch(NoClassDefFoundError e) {
	// This device does not implement the Vuzix Speech SDK
	// TODO: Implement error recovery
}

This method is static. Passing the optional context parameter allows the proper user permissions to be applied and is recommended for robustness.

The command engine may be similarly disabled via code during times when false detection would impair the application behavior.

VuzixSpeechClient.EnableRecognizer(getApplicationContext(), false);

Once the Speech Command engine is disabled, the notification bar icon is grayed-out, and the phrase "Hello Vuzix" will no longer trigger speech recognition.
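Both the enable and disable calls need the same NoClassDefFoundError guard. One way to avoid repeating the try/catch at every call site is a small wrapper. The sketch below is only an illustration: SpeechSdkGuard and runIfAvailable are hypothetical names, not part of the Vuzix SDK, and the Runnable stands in for whichever SDK call you want to protect.

```java
/** Hypothetical helper, not part of the Vuzix Speech SDK. */
final class SpeechSdkGuard {
    private SpeechSdkGuard() {}

    /**
     * Runs a block of code that touches the Speech SDK.
     * Returns true if the block completed, or false if the SDK
     * classes could not be found on this device.
     */
    static boolean runIfAvailable(Runnable sdkCall) {
        try {
            sdkCall.run();
            return true;
        } catch (NoClassDefFoundError e) {
            // This device does not implement the Vuzix Speech SDK
            return false;
        }
    }
}
```

In an activity this might be called as SpeechSdkGuard.runIfAvailable(() -> VuzixSpeechClient.EnableRecognizer(getApplicationContext(), false)), with the return value deciding whether to fall back to a non-speech interface.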

It is safe to set the Speech Command engine to its existing state, so there is no need to query the state before enabling or disabling the engine. Simply specify the desired state. However, if you want to display the current enabled/disabled state, you can query it using isRecognizerEnabled(). The system does not change this value while your application is active, so the appropriate place for this query is your activity's onResume().

boolean mSpeechEnabled;
@Override protected void onResume() {
	super.onResume();
	mSpeechEnabled = VuzixSpeechClient.isRecognizerEnabled(this);
	// TODO: Update status to user showing state of mSpeechEnabled
}

Triggering the Speech Command Engine

When the Speech Command engine is enabled, the engine remains in a low-power mode listening only for the trigger phrase, "Hello Vuzix". Once this is heard, the engine wakes and becomes active. This state is indicated by the microphone icon in the notification bar becoming fully filled. While active, all audio data is scanned for known phrases.

It is possible for an application to programmatically trigger the recognizer to wake and become active, rather than relying on the "Hello Vuzix" trigger phrase. This can be tied to a button press or a fragment opening.

import com.vuzix.sdk.speechrecognitionservice.VuzixSpeechClient;
try {
	VuzixSpeechClient.TriggerVoiceAudio(getApplicationContext(), true);
}
catch(NoClassDefFoundError e) {
	// This device does not implement the Vuzix Speech SDK
	// TODO: Implement error recovery
}

The Speech Command engine has a timeout that can be modified in the system Settings menu. The active engine will return to idle mode after that duration has elapsed since the most recent phrase was recognized. This state is again indicated by the microphone icon in the notification bar returning to the unfilled outline icon, and the engine will only respond to the trigger phrase "Hello Vuzix."
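The timeout behavior can be pictured with a small model. This is not SDK code and the real timing lives inside the Speech Command engine. RecognizerTimeoutModel is a hypothetical class, and it assumes the countdown starts when the engine is triggered and restarts on every recognized phrase.

```java
/** Illustrative model only; not part of the Vuzix Speech SDK. */
final class RecognizerTimeoutModel {
    private final long timeoutMillis;
    private long lastActivityMillis;
    private boolean triggered;

    RecognizerTimeoutModel(long timeoutMillis) {
        this.timeoutMillis = timeoutMillis;
    }

    /** The trigger phrase was heard, or the engine was triggered in code. */
    void trigger(long nowMillis) {
        triggered = true;
        lastActivityMillis = nowMillis;
    }

    /** A phrase was recognized; only an active engine restarts its countdown. */
    void phraseRecognized(long nowMillis) {
        if (isActive(nowMillis)) {
            lastActivityMillis = nowMillis;
        }
    }

    /** True while the timeout has not yet elapsed since the last activity. */
    boolean isActive(long nowMillis) {
        if (triggered && nowMillis - lastActivityMillis >= timeoutMillis) {
            triggered = false; // timed out: back to idle
        }
        return triggered;
    }
}
```

With a 10 second timeout, a phrase recognized 8 seconds after triggering keeps the model active until the 18 second mark rather than the 10 second mark.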

Some workflows are best served by returning the active engine to idle at a specific time, for example while recording a voice memo. This prevents phrases such as "go back" and "go home" from being recognized and acted upon.

The Speech Command engine may be programmatically un-triggered to idle state with the same method.

VuzixSpeechClient.TriggerVoiceAudio(getApplicationContext(), false);

Trigger State Notification

Since the Speech Command engine may be triggered externally and may time out internally, applications that wish to control this behavior will likely need to know the state of the engine.

The same Speech Command Intent that broadcasts phrases also broadcasts state change updates. Simply check for the presence of the extra boolean RECOGNIZER_ACTIVE_BOOL_EXTRA.

boolean mSpeechTriggered;
@Override public void onReceive(Context context, Intent intent) {
	if (VuzixSpeechClient.ACTION_VOICE_COMMAND.equals(intent.getAction())) {
		Bundle extras = intent.getExtras();
		if (extras != null) {
			// We will determine what type of message this is based upon the extras provided
			if (extras.containsKey(VuzixSpeechClient.RECOGNIZER_ACTIVE_BOOL_EXTRA)) {
				// if we get a recognizer active bool extra, it means the recognizer was
				// activated or stopped
				mSpeechTriggered = extras.getBoolean(VuzixSpeechClient.RECOGNIZER_ACTIVE_BOOL_EXTRA, false);
				// TODO: Implement behavior based upon the engine being changed to active or idle
			}
		}
	}
}

Since the state may also change while your application is not running, if you display the state using these notifications you should also query the current state in your onResume().

boolean mSpeechTriggered;
@Override protected void onResume() {
	super.onResume();
	mSpeechTriggered = VuzixSpeechClient.isRecognizerTriggered(this);
	// TODO: Implement behavior based upon the recognizer being changed to active or idle
}
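Because the state can arrive from two places (the broadcast and the onResume() query), it can help to route both through one object so the user interface is refreshed only when the value actually changes. The following is a minimal sketch of that idea. TriggerStateTracker is a hypothetical helper, not part of the Vuzix SDK, and the Consumer callback stands in for whatever status update your application performs.

```java
import java.util.function.Consumer;

/** Hypothetical helper, not part of the Vuzix Speech SDK. */
final class TriggerStateTracker {
    private final Consumer<Boolean> onChange;
    private Boolean lastKnown; // null until the first update arrives

    TriggerStateTracker(Consumer<Boolean> onChange) {
        this.onChange = onChange;
    }

    /**
     * Call with the RECOGNIZER_ACTIVE_BOOL_EXTRA value in onReceive()
     * and with the queried value in onResume().
     */
    void update(boolean active) {
        if (lastKnown == null || lastKnown != active) {
            lastKnown = active;
            onChange.accept(active); // notify only on a real change
        }
    }
}
```

Feeding both sources into update() means a broadcast that repeats the value already seen in onResume() does not cause a redundant redraw.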

Startup Timing Concerns

It is possible for applications that automatically launch with the operating system to be initialized before the speech engine has come online. This is true for launcher applications, among others. Any speech queries or commands issued at startup will fail, and must be retried after the speech engine comes online. In such applications, you should surround initialization logic with a call such as:

if (VuzixSpeechClient.isRecognizerInitialized(this)) {
	// TODO: Perform your speech customizations here
}

Even if the initialization code cannot be run at startup, you should still register the broadcast receiver for the trigger state, as described in the preceding section. When the engine becomes initialized, it will send out an initial trigger state. The receipt of this trigger state can cause your application to retry the speech initialization. This allows you to create an application that starts before the speech engine and can interact with the speech engine as soon as it becomes available, without any unnecessary polling.
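The retry pattern described above can be sketched as a small run-once helper. DeferredSpeechSetup is a hypothetical class, not part of the Vuzix SDK. The BooleanSupplier is assumed to wrap isRecognizerInitialized(), and the Runnable is assumed to hold your speech customizations.

```java
import java.util.function.BooleanSupplier;

/** Hypothetical helper, not part of the Vuzix Speech SDK. */
final class DeferredSpeechSetup {
    private final BooleanSupplier engineReady; // stand-in for isRecognizerInitialized()
    private final Runnable setup;              // your speech customizations
    private boolean done;

    DeferredSpeechSetup(BooleanSupplier engineReady, Runnable setup) {
        this.engineReady = engineReady;
        this.setup = setup;
    }

    /**
     * Call once at startup and again whenever a trigger state
     * broadcast arrives. Returns true once the setup has run.
     */
    boolean attempt() {
        if (!done && engineReady.getAsBoolean()) {
            setup.run();
            done = true; // never run the customizations twice
        }
        return done;
    }
}
```

Calling attempt() from both the startup path and the broadcast receiver keeps the two entry points identical, and later broadcasts become no-ops once the setup has succeeded.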

Canceling Repeating Characters

Certain commands, like "scroll up" and "scroll down", initiate repeating key presses. This allows the user interface to continue to scroll in the selected direction. The repeating key presses stop when the engine detects any other phrase, such as "select this". The default phrase "stop" is recognized by the speech engine and has no behavior other than to terminate the scrolling.

You may wish to stop repeating key presses programmatically without requiring the user to say another phrase. This is useful when reaching the first or last item in a list. To do this, simply call StopRepeatingKeys().

try {
	// sc is assumed to be an existing VuzixSpeechClient instance
	sc.StopRepeatingKeys();
}
catch(NoClassDefFoundError e) {
	// The ability to stop repeating keys was added in Speech SDK v1.6 which
	// was released on M400 v1.1.4. Earlier versions will not support this.
}
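For the list case mentioned above, the boundary check might look like the following sketch. ScrollBoundaryWatcher is a hypothetical class, not part of the Vuzix SDK. The Runnable is assumed to wrap the guarded StopRepeatingKeys() call, and how your application learns about position changes (for example, a selection listener) is left open.

```java
/** Hypothetical helper, not part of the Vuzix Speech SDK. */
final class ScrollBoundaryWatcher {
    private final Runnable stopRepeatingKeys; // stand-in for sc.StopRepeatingKeys()

    ScrollBoundaryWatcher(Runnable stopRepeatingKeys) {
        this.stopRepeatingKeys = stopRepeatingKeys;
    }

    /**
     * Call whenever the selected position changes.
     * Returns true if the selection is on the first or last item
     * and a stop was requested.
     */
    boolean onPositionChanged(int position, int itemCount) {
        boolean atEdge = itemCount <= 1 || position <= 0 || position >= itemCount - 1;
        if (atEdge) {
            stopRepeatingKeys.run();
        }
        return atEdge;
    }
}
```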

Get the Maximum Recognizer Timeout Time

Beginning with SDK v1.91, you can read the maximum recognizer timeout.

int recognizedMaxTimeoutTime = sc.getRecognizerTimeoutMax();

Getting and Setting the Recognizer Timeout Config

Beginning with SDK v1.91, you can retrieve and set the recognizer timeout configuration.

int recognizerTimeoutConfig = sc.getRecognizerTimeoutConfig();
...
sc.setRecognizerTimeoutConfig(30); // in seconds
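If the requested timeout comes from user input or a stored preference, it may be worth capping it before passing it on. The sketch below is a defensive assumption, not documented SDK behavior: it assumes getRecognizerTimeoutMax() reports seconds, like the setter, and the source does not say how the engine treats values above that maximum. RecognizerTimeouts and clampToMax are hypothetical names.

```java
/** Hypothetical helper, not part of the Vuzix Speech SDK. */
final class RecognizerTimeouts {
    private RecognizerTimeouts() {}

    /**
     * Caps a requested timeout at the reported maximum.
     * Both values are assumed to be in seconds.
     */
    static int clampToMax(int requestedSeconds, int maxSeconds) {
        if (requestedSeconds < 0) {
            throw new IllegalArgumentException("timeout must not be negative");
        }
        return Math.min(requestedSeconds, maxSeconds);
    }
}
```

A call site could then read sc.setRecognizerTimeoutConfig(RecognizerTimeouts.clampToMax(45, sc.getRecognizerTimeoutMax())).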

Sample Project

A sample application for Android Studio demonstrating the Vuzix Speech SDK is available to download here.

