Getting Started Code
  • 07 Jun 2024

Creating the Speech Client

To work with the Vuzix Speech SDK, first create a VuzixSpeechClient and pass it your Activity.

import com.vuzix.sdk.speechrecognitionservice.VuzixSpeechClient;
Activity myActivity = this;
VuzixSpeechClient sc = new VuzixSpeechClient(myActivity);

Handling Exceptions

It is possible for a user to attempt to run code compiled against the Vuzix Speech SDK on non-Vuzix hardware. This causes a RuntimeException with the message "Stub!" to be thrown. It is also possible to write an application against the latest Vuzix Speech SDK and have customers attempt to run it on older devices. Any call to an unsupported interface will cause a NoClassDefFoundError. For this reason, all SDK calls should be inside try/catch blocks.

// Surround the creation of the VuzixSpeechClient with a try/catch for non-Vuzix hardware
VuzixSpeechClient sc = null;
try {
    sc = new VuzixSpeechClient(myActivity);
} catch (RuntimeException e) {
    // Compare from the literal side so a null message cannot throw
    if ("Stub!".equals(e.getMessage())) {
        // This is not being run on Vuzix hardware (or the Proguard rules are incorrect)
        // Alert the user, or insert recovery here.
    } else {
        // Other RuntimeException to be handled
    }
}
// Surround all speech client commands with try/catch
try {
    // sc.anySdkCommandHere();
} catch (NoClassDefFoundError e) {
    // The hardware does not support the specific command expected by the Vuzix Speech SDK.
    // Alert the user, or insert recovery here.
}

For brevity, this article may omit the try/catch blocks, but creating a robust application requires they be present.
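
One way to avoid repeating these try/catch blocks is to route SDK calls through a small helper. The following is a minimal sketch, not part of the Vuzix Speech SDK: the names SdkGuard and SdkCallResult are hypothetical, and the sketch assumes only the two failure modes described above.

```java
/** Outcome of a guarded SDK call (hypothetical type, not part of the Vuzix SDK). */
enum SdkCallResult { OK, NOT_VUZIX_HARDWARE, UNSUPPORTED_ON_DEVICE }

/** Hypothetical helper that centralizes the try/catch pattern described above. */
final class SdkGuard {
    private SdkGuard() {}

    static SdkCallResult run(Runnable sdkCall) {
        try {
            sdkCall.run();
            return SdkCallResult.OK;
        } catch (NoClassDefFoundError e) {
            // The device does not support an interface the SDK expects.
            return SdkCallResult.UNSUPPORTED_ON_DEVICE;
        } catch (RuntimeException e) {
            if ("Stub!".equals(e.getMessage())) {
                // Non-Vuzix hardware, or the Proguard rules are incorrect.
                return SdkCallResult.NOT_VUZIX_HARDWARE;
            }
            throw e; // Unrelated RuntimeException: let the caller decide.
        }
    }
}
```

A call such as SdkGuard.run(() -> sc.deletePhrase("torch on")) then returns a status the application can use to alert the user or recover.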

Removing Existing Phrases

Removing existing phrases may reduce the likelihood that the speech recognizer resolves the incorrect phrase. This is especially true if your phrases sound similar to default phrases. For example, a game control of "roll right" might be confused with the default phrase "scroll right."

Your application may want to control the navigation itself, in which case you could remove default navigation commands to prevent confusion.

The default vocabulary may be modified by removing individual phrases with commands such as:

sc.deletePhrase("torch on"); 
sc.deletePhrase("torch off");

Or the entire default vocabulary may be removed from your activity using an asterisk as a wildcard.

sc.deletePhrase("*");

Note: on the M300XL and M300, the default wake word "Hello Vuzix" is not deleted.

The results of the modified recognizer map can be viewed during debugging with the dump command.

Log.i(LOG_TAG, sc.dump());

Additionally, note that when the speech recognizer is given a phrase that is already in the list, the previous entry is overwritten. There is no need to delete the original entry first. This is useful, for example, if you want to implement your own navigation methods using "go up", "go down", "go left", etc.
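
The delete and overwrite rules above can be illustrated with a toy in-memory table. This is only a model of the behavior described in this section, not the recognizer itself. The class PhraseTableModel and its action strings are hypothetical, and the model ignores the wake word exception noted above.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Toy in-memory model of a phrase table; it is NOT the Vuzix recognizer. */
final class PhraseTableModel {
    private final Map<String, String> actions = new LinkedHashMap<>();

    /** Inserting an existing phrase overwrites its previous action. */
    void insert(String phrase, String action) {
        actions.put(phrase, action);
    }

    /** "*" clears every phrase; any other value removes that one phrase. */
    void delete(String phrase) {
        if ("*".equals(phrase)) {
            actions.clear();
        } else {
            actions.remove(phrase);
        }
    }

    String actionFor(String phrase) {
        return actions.get(phrase);
    }

    int size() {
        return actions.size();
    }
}
```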

Adding Custom Trigger Phrases

When the speech recognition engine is enabled but idle it is listening only for a trigger phrase, also known as a "wake word". Once the trigger phrase is recognized, the engine transitions to the active state where it listens for the full vocabulary (such as "go home" and "select this"). The M300XL and M300 use localized wake words based on which language is currently selected. For each language the wake word will be a translation of "Hello Vuzix".

The speech engine will time out after the period configured in the system settings and return to the idle state, in which it listens only for the trigger phrases. The operator can circumvent the timeout and immediately return to idle by saying a voice-off phrase. By default the voice-off phrase is "voice off". You can insert custom voice-off phrases using the following commands.

// Add-back the default phrase for consistency
sc.insertVoiceOffPhrase("voice off");
// Add application specific stop listening phrase
sc.insertVoiceOffPhrase("privacy please");
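
The idle and active states described in this section can be pictured as a two-state machine. The sketch below is a simplified model for reasoning about the behavior, not the engine's implementation. ListeningStateModel and its methods are hypothetical names, and the timeout is represented as an explicit call rather than a real timer.

```java
import java.util.HashSet;
import java.util.Set;

/** Toy model of the idle/active listening states; NOT the SDK's implementation. */
final class ListeningStateModel {
    enum State { IDLE, ACTIVE }

    private final Set<String> wakeWords = new HashSet<>();
    private final Set<String> voiceOffPhrases = new HashSet<>();
    private State state = State.IDLE;

    void addWakeWord(String phrase) {
        wakeWords.add(phrase);
    }

    void addVoiceOffPhrase(String phrase) {
        voiceOffPhrases.add(phrase);
    }

    /** Feeds one recognized phrase into the model and returns the resulting state. */
    State onPhrase(String phrase) {
        if (state == State.IDLE && wakeWords.contains(phrase)) {
            state = State.ACTIVE; // Wake word heard: listen for the full vocabulary.
        } else if (state == State.ACTIVE && voiceOffPhrases.contains(phrase)) {
            state = State.IDLE;   // Voice-off phrase: return to idle immediately.
        }
        return state;
    }

    /** The configured timeout elapsed without a voice-off phrase. */
    State onTimeout() {
        state = State.IDLE;
        return state;
    }
}
```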

Adding Phrases to Receive Keycodes

You can register for a spoken command that will generate a keycode. This keycode will behave exactly the same as if a USB keyboard were present and generated that key. This capability is implemented by mapping phrases to Android key events (android.view.KeyEvent).

sc.insertKeycodePhrase("toggle caps lock", KeyEvent.KEYCODE_CAPS_LOCK);
Log.i(LOG_TAG, sc.dump());

Keycodes added by your application are processed in addition to the keycodes in the base vocabulary set:

  • K_DPAD_LEFT "move left"

  • K_DPAD_RIGHT "move right"

  • K_DPAD_UP "move up"

  • K_DPAD_DOWN "move down"

  • K_NAVIGATE_IN "move in"

  • K_NAVIGATE_OUT "move out"

  • K_FORWARD "move forward"

  • K_BACK "move back"

  • K_DPAD_LEFT "scroll left" (repeats)

  • K_DPAD_RIGHT "scroll right" (repeats)

  • K_DPAD_UP "scroll up" (repeats)

  • K_DPAD_DOWN "scroll down" (repeats)

  • K_DPAD_LEFT "go left"

  • K_DPAD_RIGHT "go right"

  • K_DPAD_UP "go up"

  • K_DPAD_DOWN "go down"

  • K_NAVIGATE_IN "go in"

  • K_NAVIGATE_OUT "go out"

  • K_FORWARD "go forward"

  • K_BACK "go back"

  • K_HOME "go home"

  • K_ENTER "select this"

  • K_ENTER "pick this"

  • K_HOME "quit"

  • K_ENTER "okay"

  • K_ENTER "confirm"

  • K_NAVIGATE_NEXT "next"

  • K_NAVIGATE_PREVIOUS "previous"

  • K_ESCAPE "cancel"

  • K_MENU "show menu"

Note that speaking any valid phrase will terminate a previous repeating keycode. The phrase "stop" terminates the repeating keycodes and has no further behavior.
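
Because spoken keycodes arrive like ordinary key presses, an application can handle them with its normal key handling. The following is a minimal sketch of a dispatch table that an Activity's onKeyDown() could delegate to. KeyActionDispatcher is a hypothetical helper, and the integer constants are local stand-ins so the sketch compiles without Android. In an application, use the android.view.KeyEvent constants directly.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical dispatcher that an Activity's onKeyDown() could delegate to. */
final class KeyActionDispatcher {
    // Local stand-ins so the sketch compiles without Android;
    // in an app, use the android.view.KeyEvent.KEYCODE_* constants instead.
    static final int KEYCODE_DPAD_UP = 19;
    static final int KEYCODE_DPAD_DOWN = 20;
    static final int KEYCODE_ENTER = 66;

    private final Map<Integer, Runnable> actions = new HashMap<>();

    /** Associates an action with a keycode, replacing any earlier action. */
    void on(int keyCode, Runnable action) {
        actions.put(keyCode, action);
    }

    /** Returns true when the key was consumed, mirroring onKeyDown()'s contract. */
    boolean dispatch(int keyCode) {
        Runnable action = actions.get(keyCode);
        if (action == null) {
            return false;
        }
        action.run();
        return true;
    }
}
```

With this in place, onKeyDown(keyCode, event) could return dispatcher.dispatch(keyCode) and fall back to the superclass when the result is false.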

Adding Phrases to Receive Intents

The most common use for the speech recognition is to receive intents which can trigger any custom actions, rather than simply receiving keycodes. To do this you must have a broadcast receiver in your application such as:

public class VoiceCmdReceiver extends BroadcastReceiver { .... }

The broadcast receiver must register with Android for the Vuzix speech intent. This can be done in the constructor as shown here.

public class VoiceCmdReceiver extends BroadcastReceiver {
	public VoiceCmdReceiver(MainActivity iActivity) {
		iActivity.registerReceiver(this, new IntentFilter(VuzixSpeechClient.ACTION_VOICE_COMMAND));
		...
	}
}

The phrases you want to receive can be inserted in the same constructor. This is done using insertPhrase() which registers a phrase for the speech SDK intent. The parameter is a string containing the phrase for which you want the device to listen.

public class VoiceCmdReceiver extends BroadcastReceiver {
    public VoiceCmdReceiver(MainActivity iActivity) {
        iActivity.registerReceiver(this, new IntentFilter(VuzixSpeechClient.ACTION_VOICE_COMMAND));
        VuzixSpeechClient sc = new VuzixSpeechClient(iActivity);
        sc.insertPhrase("testing");
        Log.i(LOG_TAG, sc.dump());
    }
}

Now handle the speech SDK intent VuzixSpeechClient.ACTION_VOICE_COMMAND in your onReceive() method. Whatever phrase you used in insertPhrase() will be provided in the received intent as a string extra named VuzixSpeechClient.PHRASE_STRING_EXTRA.

public class VoiceCmdReceiver extends BroadcastReceiver {
	public VoiceCmdReceiver(MainActivity iActivity) {
		...
	}
	@Override public void onReceive(Context context, Intent intent) {
		// All phrases registered with insertPhrase() match ACTION_VOICE_COMMAND
		if (intent.getAction().equals(VuzixSpeechClient.ACTION_VOICE_COMMAND)) {
			String phrase = intent.getStringExtra(VuzixSpeechClient.PHRASE_STRING_EXTRA);
			if (phrase != null ) {
				if (phrase.equals("testing")) {
					// todo: add test behavior
				}
			}
		}
	}
}

With that, you will be able to say "Hello Vuzix" to activate the recognizer, followed by "testing", and the code you inserted in place of the todo comment will execute.

The recognizer always broadcasts the phrase that was registered in insertPhrase(). Note: If the phrase contains spaces, they will be replaced by underscores.
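
When comparing a received phrase against a multi-word phrase you registered, it can help to derive the expected string in one place. This is a small sketch under the assumption, taken from the note above, that spaces are the only characters altered. PhraseKeys is a hypothetical helper, not an SDK class.

```java
/** Hypothetical helper for matching multi-word phrases in onReceive(). */
final class PhraseKeys {
    private PhraseKeys() {}

    /** Converts a registered phrase to the form described above (spaces become underscores). */
    static String broadcastForm(String registeredPhrase) {
        return registeredPhrase.replace(' ', '_');
    }
}
```

For example, a phrase registered as "start call" would be compared against PhraseKeys.broadcastForm("start call") in onReceive().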

Replacement Text

As mentioned above, the string that was recognized is returned, with spaces replaced by underscores. That can be somewhat cumbersome to the developer, especially since we expect recognized spoken phrases to be localized into many languages.

To make this easier, insertPhrase() can take an optional substitution string parameter. When this is supplied, the substitution string is returned in place of the spoken text.

This example updates the original by replacing the hard-coded strings properly. Notice insertPhrase() is given two parameters, and it is the second that is used by the onReceive() method.

Note: The substitution string may not contain spaces.

This now gives us a complete solution to receive a custom phrase and handle it properly.

public class VoiceCmdReceiver extends BroadcastReceiver {
    final String MATCH_TESTING = "Phrase_Testing";
    public VoiceCmdReceiver(MainActivity iActivity) {
        iActivity.registerReceiver(this, new IntentFilter(VuzixSpeechClient.ACTION_VOICE_COMMAND));
        VuzixSpeechClient sc = new VuzixSpeechClient(iActivity);
        // strings.xml contains: testing my voice application
        sc.insertPhrase(iActivity.getResources().getString(R.string.spoken_phrase_testing), MATCH_TESTING);
        Log.i(LOG_TAG, sc.dump());
    }
    @Override public void onReceive(Context context, Intent intent) {
        // All phrases registered with insertPhrase() match ACTION_VOICE_COMMAND
        if (intent.getAction().equals(VuzixSpeechClient.ACTION_VOICE_COMMAND)) {
            String phrase = intent.getStringExtra(VuzixSpeechClient.PHRASE_STRING_EXTRA);
            if (phrase != null) {
                if (phrase.equals(MATCH_TESTING)) {
                    // Todo: add test behavior
                }
            }
        }
    }
}

The substitution parameter also allows us to create multiple phrases that perform the same action. Phrases in the recognizer must be unique, but substitution text need not be.

We could have multiple insertPhrase() calls with different phrase parameters and identical substitutions. Use this technique to simplify your code in situations where you do not need to differentiate between phrases. For example, the phrases "start call" and "make a call" can have the same substitution.
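
The idea of several phrases sharing one substitution can be sketched as a small routing table. This is an illustrative design, not an SDK facility. SubstitutionRouter is a hypothetical class, and its registrar argument stands in for a call such as sc.insertPhrase(phrase, substitution) so that the sketch runs without the SDK.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.BiConsumer;

/** Hypothetical router that maps substitution strings to handlers. */
final class SubstitutionRouter {
    private final Map<String, Runnable> handlers = new HashMap<>();
    private final BiConsumer<String, String> registrar;

    /** registrar stands in for a call such as sc.insertPhrase(phrase, substitution). */
    SubstitutionRouter(BiConsumer<String, String> registrar) {
        this.registrar = registrar;
    }

    /** Registers every spoken phrase under one substitution and one handler. */
    void route(String substitution, Runnable handler, String... spokenPhrases) {
        if (substitution.contains(" ")) {
            throw new IllegalArgumentException("Substitution may not contain spaces");
        }
        handlers.put(substitution, handler);
        for (String phrase : spokenPhrases) {
            registrar.accept(phrase, substitution);
        }
    }

    /** Call from onReceive() with the PHRASE_STRING_EXTRA value; true if handled. */
    boolean handle(String receivedExtra) {
        Runnable handler = handlers.get(receivedExtra);
        if (handler == null) {
            return false;
        }
        handler.run();
        return true;
    }
}
```

With this arrangement, "start call" and "make a call" can both be routed to a single substitution such as "Start_Call", and onReceive() needs only one lookup.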

Adding Phrases to Receive Custom Intents

To add even more flexibility, the speech SDK can send any intent you define, rather than only sending its own ACTION_VOICE_COMMAND. This is especially useful for creating multiple broadcast receivers and directing the intents properly.

Note: this example differs from the one above in that CUSTOM_SDK_INTENT is used in place of ACTION_VOICE_COMMAND.

public class VoiceCmdReceiver extends BroadcastReceiver {
    public final String CUSTOM_SDK_INTENT = "com.your_company.CustomIntent";
    final String CUSTOM_EVENT = "my_event";
    public VoiceCmdReceiver(MainActivity iActivity) {
        iActivity.registerReceiver(this, new IntentFilter(CUSTOM_SDK_INTENT));
        VuzixSpeechClient sc = new VuzixSpeechClient(iActivity);
        // Define the custom intent and give it a label the SDK can refer to
        Intent customSdkIntent = new Intent(CUSTOM_SDK_INTENT);
        sc.defineIntent(CUSTOM_EVENT, customSdkIntent);
        // strings.xml contains: testing my voice application
        sc.insertIntentPhrase(iActivity.getResources().getString(R.string.spoken_phrase_testing), CUSTOM_EVENT);
        Log.i(LOG_TAG, sc.dump());
    }
    @Override public void onReceive(Context context, Intent intent) {
        // Since we only registered one phrase to this intent, we don't need any further switching. We know we got our CUSTOM_EVENT
        // todo: add test behavior
    }
}

The system can support multiple broadcast receivers. Each receiver simply registers for the intents it expects to receive. They do not need to be in the same class that creates the VuzixSpeechClient.

Deleting a Custom Intent

Beginning with SDK v1.91 you can now delete a custom intent. Similar to inserting an intent, call the deleteIntent() method and supply the label of the intent you wish to delete.

// Voice command custom intent names
final String TOAST_EVENT = "other_toast";
...
sc.deleteIntent(TOAST_EVENT);

Listing all Intent Labels

Beginning with SDK v1.91 you can list all intent labels. The labels are returned as a List.

List intentLabels = sc.getIntentLabels();

Checking the Engine Version

As mentioned above, it is possible for the SDK to expose newer calls than a given device OS version supports. You can query getEngineVersion() to determine the version of the engine on the device, which lets you protect newer SDK calls with conditional logic and avoid a possible NoClassDefFoundError. For example, if you know the device is running SDK v1.8, you would not attempt calls introduced in v1.9.

Because getEngineVersion() is a newer SDK call, it should itself be protected.

// Default to the first stable SDK, released with M300 v1.2.6
float version = 1.4f;
try {
	version = sc.getEngineVersion();
	Log.d(mMainActivity.LOG_TAG, "Device is running SDK v" + version);
}
catch (NoSuchMethodError e) {
	Log.d(mMainActivity.LOG_TAG, "Device is running SDK prior to v1.8. Assuming version " + version);
}
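
Once the version is known, newer calls can be wrapped in a simple gate. The following is a minimal sketch, assuming the float returned by getEngineVersion() can be compared directly against the version in which a call was introduced. EngineFeatureGate is a hypothetical helper, not part of the SDK.

```java
/** Hypothetical helper that skips SDK calls the device engine is too old for. */
final class EngineFeatureGate {
    private final float engineVersion;

    EngineFeatureGate(float engineVersion) {
        this.engineVersion = engineVersion;
    }

    /** Runs the call only if the engine is new enough; returns whether it ran successfully. */
    boolean runIfSupported(float minVersion, Runnable sdkCall) {
        if (engineVersion < minVersion) {
            return false; // Too old: do not attempt the call at all.
        }
        try {
            sdkCall.run();
            return true;
        } catch (LinkageError e) {
            // LinkageError covers both NoClassDefFoundError and NoSuchMethodError.
            return false;
        }
    }
}
```

For example, new EngineFeatureGate(version).runIfSupported(1.91f, () -> sc.deleteIntent(TOAST_EVENT)) would attempt the delete only on engines that report v1.91 or later.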

Sample Project

A sample application for Android Studio demonstrating the Vuzix Speech SDK is available to download here.

