Commit c7986d9

Update on-device explainer with changes + languages
1 parent 4d6e11e commit c7986d9

1 file changed: +63 -29 lines changed

explainers/on-device-speech-recognition.md

Lines changed: 63 additions & 29 deletions
Original file line number | Diff line number | Diff line change
@@ -31,30 +31,61 @@ Some websites would only adopt the Web Speech API if it meets strict performance
3131
### 3. Educational website (e.g. khanacademy.org)
3232
Applications that need to function in unreliable or offline network conditions—such as voice-based productivity tools, educational software, or accessibility features—benefit from on-device speech recognition. This enables uninterrupted functionality during flights, remote travel, or in areas with limited connectivity. When on-device recognition is unavailable, a website can choose to hide the UI or gracefully degrade functionality to maintain a coherent user experience.
3333

34-
## New Methods
34+
## New API Components
3535

36-
### 1. `Promise<boolean> availableOnDevice(DOMString lang)`
37-
This method checks if on-device speech recognition is available for a specific language. Developers can use this to determine whether to enable features that require on-device speech recognition.
36+
This enhancement introduces one new attribute to the `SpeechRecognition` interface and two new static methods for managing on-device capabilities.
37+
38+
### 1. `processLocally` Attribute
39+
The `processLocally` boolean attribute on a `SpeechRecognition` instance allows developers to require that speech recognition be performed locally on the user's device.
40+
41+
- When set to `true`, the recognition session **must** be processed locally. If on-device recognition is not available for the specified language, the session will fail with a `service-not-allowed` error.
42+
- When `false` (the default), the user agent is free to use either local or cloud-based recognition.
3843

3944
#### Example Usage
4045
```javascript
41-
const lang = 'en-US';
42-
SpeechRecognition.availableOnDevice(lang).then((available) => {
43-
if (available) {
44-
console.log(`On-device speech recognition is available for ${lang}.`);
45-
} else {
46-
console.log(`On-device speech recognition is not available for ${lang}.`);
47-
}
46+
const recognition = new SpeechRecognition();
47+
recognition.lang = 'en-US';
48+
recognition.processLocally = true; // Require on-device speech recognition.
49+
50+
recognition.onerror = (event) => {
51+
if (event.error === 'service-not-allowed') {
52+
console.error('On-device recognition is not available for the selected language, or the request was denied.');
53+
}
54+
};
55+
56+
recognition.start();
57+
```
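As a rough sketch of the graceful degradation described above, a site that is willing to use cloud recognition could retry once without the local-only requirement when the on-device attempt fails with `service-not-allowed`. The helper name `startWithCloudFallback` is hypothetical, and the constructor is passed in as a parameter (an assumption made here so the logic can be exercised with a stub). Because the same error may also mean the request was denied, this pattern only suits applications where cloud processing is acceptable.

```javascript
// Hypothetical helper, not part of the proposed API.
// `Recognition` is the SpeechRecognition constructor (or a stub with the same shape).
function startWithCloudFallback(Recognition, lang, onResult) {
  const start = (processLocally) => {
    const recognition = new Recognition();
    recognition.lang = lang;
    recognition.processLocally = processLocally;
    recognition.onresult = onResult;
    recognition.onerror = (event) => {
      // Retry only when the local-only attempt was rejected.
      if (processLocally && event.error === 'service-not-allowed') {
        start(false);
      }
    };
    recognition.start();
    return recognition;
  };
  // First attempt: require on-device processing.
  return start(true);
}

// Possible browser usage:
// startWithCloudFallback(SpeechRecognition, 'en-US', (event) => {
//   console.log(event.results[0][0].transcript);
// });
```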
58+
59+
### 2. `Promise<AvailabilityStatus> available(SpeechRecognitionOptions options)`
60+
The static `SpeechRecognition.available(options)` method allows developers to check the availability of speech recognition for a given set of languages and processing preferences. It returns a `Promise` that resolves with an `AvailabilityStatus` string.
61+
62+
#### Example Usage
63+
```javascript
64+
const options = {
65+
langs: ['en-US'],
66+
processLocally: true // Check for on-device availability
67+
};
68+
69+
SpeechRecognition.available(options).then((status) => {
70+
console.log(`On-device availability for ${options.langs.join(', ')}: ${status}`);
71+
if (status === 'available') {
72+
console.log('Ready to use on-device recognition.');
73+
} else if (status === 'downloadable') {
74+
console.log('On-device recognition can be installed.');
75+
}
4876
});
4977
```
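One way to act on the returned status is to translate it into a UI decision in a single place. The sketch below is illustrative only: `planForStatus` and the `showMic` / `offerInstall` fields are hypothetical names, and only the `'available'` and `'downloadable'` values shown in the example above are handled explicitly. Any other value is treated conservatively as "not ready", which is an assumption of this sketch.

```javascript
// Hypothetical helper: map an availability status string to a UI plan.
function planForStatus(status) {
  switch (status) {
    case 'available':
      // Recognition can start right away.
      return { showMic: true, offerInstall: false };
    case 'downloadable':
      // Resources exist but must be installed first.
      return { showMic: false, offerInstall: true };
    default:
      // Assumption: anything else means on-device recognition cannot be used now.
      return { showMic: false, offerInstall: false };
  }
}

// Possible browser usage:
// SpeechRecognition.available({ langs: ['en-US'], processLocally: true })
//   .then((status) => console.log(planForStatus(status)));
```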
5078

51-
### 2. `Promise<boolean> installOnDevice(DOMString[] lang)`
79+
### 3. `Promise<boolean> install(SpeechRecognitionOptions options)`
5280
This method installs the resources required for on-device speech recognition for the given BCP-47 language codes. The installation process may download and configure necessary language models.
5381

5482
#### Example Usage
5583
```javascript
56-
const lang = 'en-US';
57-
SpeechRecognition.installOnDevice([lang]).then((success) => {
84+
const options = {
85+
langs: ['en-US'],
86+
processLocally: true
87+
};
88+
SpeechRecognition.install(options).then((success) => {
5889
if (success) {
5990
console.log('On-device speech recognition resources installed successfully.');
6091
} else {
@@ -63,22 +94,25 @@ SpeechRecognition.installOnDevice([lang]).then((success) => {
6394
});
6495
```
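The two static methods can be combined so that installation is attempted only when it is needed. This is a minimal sketch under stated assumptions: `ensureOnDeviceReady` is a hypothetical helper, and `api` is any object exposing `available(options)` and `install(options)` with the behaviour described in this explainer (in a browser, `SpeechRecognition` itself would be passed).

```javascript
// Hypothetical helper: resolves to true when on-device recognition is ready to use.
async function ensureOnDeviceReady(api, options) {
  const status = await api.available(options);
  if (status === 'available') {
    return true; // Nothing to install.
  }
  if (status === 'downloadable') {
    // install() resolves with a boolean indicating success.
    return api.install(options);
  }
  // Assumption: any other status means on-device recognition cannot be used now.
  return false;
}

// Possible browser usage:
// ensureOnDeviceReady(SpeechRecognition, { langs: ['en-US'], processLocally: true })
//   .then((ready) => console.log(ready ? 'Ready.' : 'Not available on-device.'));
```

Checking availability first avoids triggering a download when the resources are already present.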
6596

66-
## New Attribute
67-
68-
### 1. `mode` attribute in the `SpeechRecognition` interface
69-
The `mode` attribute in the `SpeechRecognition` interface defines how speech recognition should behave when starting a session.
70-
71-
#### `SpeechRecognitionMode` Enum
72-
73-
- **"on-device-preferred"**: Use on-device speech recognition if available. If not, fall back to cloud-based speech recognition.
74-
- **"on-device-only"**: Only use on-device speech recognition. If it's unavailable, throw an error.
75-
76-
#### Example Usage
77-
```javascript
78-
const recognition = new SpeechRecognition();
79-
recognition.mode = "on-device-only"; // Only use on-device speech recognition.
80-
recognition.start();
81-
```
97+
## Supported languages
98+
The availability of on-device speech recognition languages is user-agent dependent. As an example, Google Chrome supports the following languages for on-device recognition:
99+
* de-DE (German, Germany)
100+
* en-US (English, United States)
101+
* es-ES (Spanish, Spain)
102+
* fr-FR (French, France)
103+
* hi-IN (Hindi, India)
104+
* id-ID (Indonesian, Indonesia)
105+
* it-IT (Italian, Italy)
106+
* ja-JP (Japanese, Japan)
107+
* ko-KR (Korean, South Korea)
108+
* pl-PL (Polish, Poland)
109+
* pt-BR (Portuguese, Brazil)
110+
* ru-RU (Russian, Russia)
111+
* th-TH (Thai, Thailand)
112+
* tr-TR (Turkish, Turkey)
113+
* vi-VN (Vietnamese, Vietnam)
114+
* zh-CN (Chinese, Mandarin, Simplified)
115+
* zh-TW (Chinese, Mandarin, Traditional)
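
Because the list is user-agent dependent, a page may prefer to query availability at runtime rather than hard-code it. The following sketch assumes a hypothetical helper, `firstOnDeviceLanguage`, and an `api` object exposing the `available(options)` method described in this explainer (in a browser, `SpeechRecognition` would be passed). It returns the first candidate language that is already usable on-device.

```javascript
// Hypothetical helper: pick the first language with on-device recognition ready.
async function firstOnDeviceLanguage(api, candidateLangs) {
  for (const lang of candidateLangs) {
    const status = await api.available({ langs: [lang], processLocally: true });
    if (status === 'available') {
      return lang;
    }
  }
  return null; // None of the candidates is ready on-device.
}

// Possible browser usage:
// firstOnDeviceLanguage(SpeechRecognition, navigator.languages)
//   .then((lang) => console.log(lang ?? 'No on-device language available.'));
```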
82116

83117
## Privacy considerations
84118
To reduce the risk of fingerprinting, user agents must implement privacy-preserving countermeasures. The Web Speech API will employ the same masking techniques used by the [Web Translation API](https://github.com/webmachinelearning/writing-assistance-apis/pull/47).
