explainers/on-device-speech-recognition.md
63 additions & 29 deletions
@@ -31,30 +31,61 @@ Some websites would only adopt the Web Speech API if it meets strict performance
### 3. Educational website (e.g. khanacademy.org)
Applications that need to function in unreliable or offline network conditions—such as voice-based productivity tools, educational software, or accessibility features—benefit from on-device speech recognition. This enables uninterrupted functionality during flights, remote travel, or in areas with limited connectivity. When on-device recognition is unavailable, a website can choose to hide the UI or gracefully degrade functionality to maintain a coherent user experience.
This method checks if on-device speech recognition is available for a specific language. Developers can use this to determine whether to enable features that require on-device speech recognition.
+This enhancement introduces one new attribute to the `SpeechRecognition` interface and two new static methods for managing on-device capabilities.
+
+### 1. `processLocally` Attribute
+The `processLocally` boolean attribute on a `SpeechRecognition` instance allows developers to require that speech recognition be performed locally on the user's device.
+
+- When set to `true`, the recognition session **must** be processed locally. If on-device recognition is not available for the specified language, the session will fail with a `service-not-allowed` error.
+- When `false` (the default), the user agent is free to use either local or cloud-based recognition.
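The two bullets above can be sketched as follows. This is a minimal illustration, not the explainer's own example: the helper name `startLocalRecognition`, the injected `SpeechRecognitionCtor` parameter, and the `onResult` / `onUnavailable` callbacks are all hypothetical, and a `service-not-allowed` error may also be raised for reasons other than a missing on-device model.

```javascript
// Sketch: require on-device processing and react when the session fails.
// `SpeechRecognitionCtor` is injected (hypothetical parameter) so the helper
// can be exercised with a stub outside a browser.
function startLocalRecognition(SpeechRecognitionCtor, lang, { onResult, onUnavailable }) {
  const recognition = new SpeechRecognitionCtor();
  recognition.lang = lang;
  recognition.processLocally = true; // never fall back to cloud recognition

  recognition.onresult = (event) => onResult(event);
  recognition.onerror = (event) => {
    // Per the text above, a missing on-device model surfaces as this error.
    if (event.error === 'service-not-allowed') {
      onUnavailable(lang); // e.g. hide the microphone button
    }
  };

  recognition.start();
  return recognition;
}
```

In a browser, the first argument would typically be the global `SpeechRecognition` constructor, and `onUnavailable` is where a page could hide or degrade its voice UI.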
The static `SpeechRecognition.available(options)` method allows developers to check the availability of speech recognition for a given set of languages and processing preferences. It returns a `Promise` that resolves with an `AvailabilityStatus` string.
+
+#### Example Usage
+```javascript
+const options = {
+  langs: ['en-US'],
+  processLocally: true // Check for on-device availability
This method installs the resources required for on-device speech recognition for the given BCP-47 language codes. The installation process may download and configure the necessary language models.
### 1. `mode` attribute in the `SpeechRecognition` interface
-The `mode` attribute in the `SpeechRecognition` interface defines how speech recognition should behave when starting a session.
-
-#### `SpeechRecognitionMode` Enum
-
-- **"on-device-preferred"**: Use on-device speech recognition if available. If not, fall back to cloud-based speech recognition.
-- **"on-device-only"**: Only use on-device speech recognition. If it's unavailable, throw an error.
-
-#### Example Usage
-```javascript
-const recognition = new SpeechRecognition();
-recognition.mode = "ondevice-only"; // Only use on-device speech recognition.
-recognition.start();
-```
+## Supported languages
+The availability of on-device speech recognition languages is user-agent dependent. As an example, Google Chrome supports the following languages for on-device recognition:
+* de-DE (German, Germany)
+* en-US (English, United States)
+* es-ES (Spanish, Spain)
+* fr-FR (French, France)
+* hi-IN (Hindi, India)
+* id-ID (Indonesian, Indonesia)
+* it-IT (Italian, Italy)
+* ja-JP (Japanese, Japan)
+* ko-KR (Korean, South Korea)
+* pl-PL (Polish, Poland)
+* pt-BR (Portuguese, Brazil)
+* ru-RU (Russian, Russia)
+* th-TH (Thai, Thailand)
+* tr-TR (Turkish, Turkey)
+* vi-VN (Vietnamese, Vietnam)
+* zh-CN (Chinese, Mandarin, Simplified)
+* zh-TW (Chinese, Mandarin, Traditional)
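Because the list is user-agent dependent, a page may prefer to probe at runtime rather than hard-code it. The sketch below does this with `SpeechRecognition.available(options)` as described earlier. The helper name `findLocalLanguages` and the injected `SpeechRecognitionCtor` parameter are hypothetical, and the `'available'` status string is an assumption about the `AvailabilityStatus` values, which this section does not enumerate.

```javascript
// Sketch: find which candidate languages are usable on-device right now.
// `SpeechRecognitionCtor` is injected (hypothetical parameter) so the helper
// can be run against a stub outside a browser.
async function findLocalLanguages(SpeechRecognitionCtor, candidateLangs) {
  const supported = [];
  for (const lang of candidateLangs) {
    let status;
    try {
      status = await SpeechRecognitionCtor.available({
        langs: [lang],
        processLocally: true, // ask about on-device recognition only
      });
    } catch (err) {
      // Treat a rejected promise as "not usable" and move on.
      continue;
    }
    // ASSUMPTION: 'available' is the status meaning "ready to use now".
    if (status === 'available') {
      supported.push(lang);
    }
  }
  return supported;
}
```

Querying one language per call keeps the result per-language; passing several languages in a single `langs` array would instead yield one combined status.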
## Privacy considerations
To reduce the risk of fingerprinting, user agents must implement privacy-preserving countermeasures. The Web Speech API will employ the same masking techniques used by the [Web Translation API](https://github.com/webmachinelearning/writing-assistance-apis/pull/47).