Commit 444404b

Author: DavidQ
Clean up Text to Speech V2 browser queue behavior and voice summary - PR_26130_017-text-to-speech-v2-queue-behavior-cleanup
1 parent 2a362e0 commit 444404b

10 files changed

Lines changed: 213 additions & 168 deletions

Lines changed: 106 additions & 0 deletions
@@ -0,0 +1,106 @@
# PR_26130_017-text-to-speech-v2-queue-behavior-cleanup

## Summary

Updated Text to Speech V2 to match browser `speechSynthesis` queue behavior instead of modeling fake concurrent speakers. Removed Auto Speak from the visible UI, schema, defaults, queue item data, and hydration path while safely stripping legacy `autoSpeak` from older saved queue items.

## Scope

- Tool fixed: Text to Speech V2.
- Failing behavior before: browser speech queue behavior was represented as active independent speakers, and Auto Speak remained in schema/default/control paths.
- Remaining failures after this PR: none known in the PR scope.
- Out of scope: non-browser speech backends, true concurrent speech, sample JSON alignment, and full samples smoke validation.

## Changes

- `src/engine/audio/TextToSpeechEngine.js`
  - Replaced active speaker tracking with queued speech item tracking.
  - `append` calls `speechSynthesis.speak()` without global cancel.
  - `replace` explicitly calls `speechSynthesis.cancel()` and clears tracked queued items before speaking.
  - Stop now reflects browser-global queue cancellation and logs cleared queued item count.
- `src/engine/audio/TextToSpeechDefaults.js`
  - Removed `autoSpeak` from required fields, default options, and default queue data.
- `tools/schemas/tools/text2speach-V2.schema.json`
  - Removed `autoSpeak` from schema required fields and properties.
- `tools/text2speach-V2/index.html`
  - Removed Auto Speak and Stop All controls.
- `tools/text2speach-V2/js/TextToSpeechToolApp.js`
  - Removed auto-speak triggering and Stop All wiring.
  - Migrates legacy `autoSpeak`, `repeatCount`, and `delayBetweenRepeatsMs` fields out of saved queue items without rendering removed controls.
  - Logs browser queue-oriented speech status.
- `tools/text2speach-V2/js/bootstrap.js`
  - Removed Auto Speak and Stop All element wiring.
- `tools/text2speach-V2/js/controls/ActionNavControl.js`
  - Removed Stop All button plumbing.
- `tools/text2speach-V2/js/controls/SpeechOptionsControl.js`
  - Removed Auto Speak state from control values.
  - Changed voice match details to count/filter heading plus language-prefixed bullet list.
- `tests/playwright/tools/WorkspaceManagerV2.spec.mjs`
  - Updated Text to Speech V2 coverage for queue behavior, Auto Speak removal, schema/default cleanup, voice match details, neutral/unknown voice helper buckets, and workspace launch controls.
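
The legacy-field migration described for `TextToSpeechToolApp.js` is not shown in this commit view. As a rough sketch only, stripping the three named fields from a saved queue item could look like the following. The helper name `stripLegacyQueueItemFields`, the constant `LEGACY_QUEUE_ITEM_FIELDS`, and the sample item are assumptions for illustration, not the tool's actual code.

```javascript
// Hypothetical helper: the real migration lives in TextToSpeechToolApp.js and may differ.
// The field names come from the Changes list above.
const LEGACY_QUEUE_ITEM_FIELDS = ["autoSpeak", "repeatCount", "delayBetweenRepeatsMs"];

function stripLegacyQueueItemFields(savedItem) {
  // Copy first so the saved object is never mutated in place.
  const migrated = { ...savedItem };
  for (const field of LEGACY_QUEUE_ITEM_FIELDS) {
    delete migrated[field];
  }
  return migrated;
}

// Example saved item with made-up values for the legacy fields.
const legacyItem = {
  id: "hero-ready",
  queueMode: "append",
  autoSpeak: true,
  repeatCount: 2,
  delayBetweenRepeatsMs: 250
};
const migratedItem = stripLegacyQueueItemFields(legacyItem);
console.log(migratedItem);
```

Copying before deleting keeps the stored object untouched, which makes the step safe to run repeatedly on already-migrated items.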

## Playwright

Playwright impacted: Yes.

Validated behavior:
- Text to Speech V2 no longer exposes Auto Speak or Stop All controls.
- Schema/default queue data no longer require or emit `autoSpeak`.
- Browser queue append can queue multiple utterances without canceling existing queued speech.
- Replace mode uses global browser cancel and reports the current queue.
- Stop uses browser-global queue cancellation and reports cleared queued items.
- Visible text does not claim multi-threaded or concurrent speech support.
- Voice details render as:
  - `N voices match <filter>:`
  - one `- <language>: <voice name>` line per matching voice.
- Neutral/unknown voices remain available under Any, Male Preferred, and Female Preferred filters.
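
The voice details format above can be sketched as a small formatter. This is an illustration under assumptions: the function name `formatVoiceMatchDetails` is hypothetical, the voice names are made up, and the voice objects are assumed to expose `lang` and `name` the way browser `SpeechSynthesisVoice` objects do. The sketch keeps the literal `N voices match` wording and does not handle singular/plural.

```javascript
// Hypothetical formatter; the actual logic in SpeechOptionsControl.js may differ.
function formatVoiceMatchDetails(voices, filterLabel) {
  // Heading: "<count> voices match <filter>:"
  const heading = `${voices.length} voices match ${filterLabel}:`;
  // One language-prefixed bullet per matching voice.
  const bullets = voices.map((voice) => `- ${voice.lang}: ${voice.name}`);
  return [heading, ...bullets].join("\n");
}

// Made-up voices shaped like SpeechSynthesisVoice (only lang and name are used).
const sampleVoices = [
  { lang: "en-US", name: "Example Voice A" },
  { lang: "en-GB", name: "Example Voice B" }
];
const voiceDetails = formatVoiceMatchDetails(sampleVoices, "Any");
console.log(voiceDetails);
```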

Expected pass behavior:
- Queue append keeps previous queued utterances and does not call `cancel()`.
- Queue replace clears the tracked queue through browser cancel before speaking.
- Auto Speak is absent from UI, schema, and default queue data.
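
The append, replace, and stop rules above can be exercised without a browser by swapping in a recording stand-in for `speechSynthesis`. The sketch below is a simplified model, not the engine itself: `FakeSpeechSynthesis` and `QueueTracker` are hypothetical names, and the fake accepts plain objects where a real browser would require `SpeechSynthesisUtterance` instances.

```javascript
// Minimal stand-in for window.speechSynthesis that only records calls.
class FakeSpeechSynthesis {
  constructor() {
    this.pending = [];
    this.cancelCalls = 0;
  }
  speak(utterance) {
    this.pending.push(utterance);
  }
  cancel() {
    // Browser cancel is global: it drops every queued utterance.
    this.cancelCalls += 1;
    this.pending = [];
  }
}

// Hypothetical tracker mirroring the append/replace/stop rules described above.
class QueueTracker {
  constructor(synth) {
    this.synth = synth;
    this.sequence = 0;
    this.items = new Map();
  }
  speak({ id, text, queueMode = "append" }) {
    if (queueMode === "replace") {
      // Replace: cancel globally and forget tracked items before speaking.
      this.synth.cancel();
      this.items.clear();
    }
    // A per-call sequence keeps repeated items distinct in the tracked queue.
    this.sequence += 1;
    const queuedId = `${id}::${this.sequence}`;
    this.items.set(queuedId, { id: queuedId, status: "queued", text });
    this.synth.speak({ text });
    return Array.from(this.items.keys());
  }
  stop() {
    const stoppedCount = this.items.size;
    this.synth.cancel();
    this.items.clear();
    return { ok: true, stoppedCount };
  }
}

const synth = new FakeSpeechSynthesis();
const tracker = new QueueTracker(synth);
tracker.speak({ id: "hero-ready", text: "Ready." });
const afterAppend = tracker.speak({ id: "alert-warning", text: "Warning." });
const cancelsAfterAppend = synth.cancelCalls;
const afterReplace = tracker.speak({ id: "hero-ready", text: "Again.", queueMode: "replace" });
const stopResult = tracker.stop();
console.log(afterAppend, cancelsAfterAppend, afterReplace, stopResult);
```

Injecting the synthesizer through the constructor follows the same idea as the engine's `speechSynthesisRef` option, which is what makes this kind of check possible in a test.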

Expected fail behavior:
- Missing voices keep Speak disabled with actionable status.
- Removed controls are not available for interaction.

## Validation

- PASS: `node --check src/engine/audio/TextToSpeechDefaults.js`
- PASS: `node --check src/engine/audio/TextToSpeechEngine.js`
- PASS: `node --check tools/text2speach-V2/js/TextToSpeechToolApp.js`
- PASS: `node --check tools/text2speach-V2/js/bootstrap.js`
- PASS: `node --check tools/text2speach-V2/js/controls/ActionNavControl.js`
- PASS: `node --check tools/text2speach-V2/js/controls/SpeechOptionsControl.js`
- PASS: `node --check tests/playwright/tools/WorkspaceManagerV2.spec.mjs`
- PASS: `tools/schemas/tools/text2speach-V2.schema.json` parsed with `JSON.parse`.
- PASS: `npx playwright test tests/playwright/tools/WorkspaceManagerV2.spec.mjs --project=playwright --workers=1 --reporter=list -g "Text to Speech V2|text2speach-V2"` passed 5 tests.
- PASS: `npm run test:workspace-v2` passed 28 tests.
- PASS: `git diff --check HEAD -- .` passed with only Windows line-ending warnings.
- PASS: Text to Speech V2 HTML scan found no inline event handlers, no inline styles, and only allowed external module scripts.
- PASS: Forbidden-scope scan found no `tools/shared`, `imageDataUrl`, or `start_of_day` matches in changed implementation/test scopes.

## Coverage

Playwright V8 coverage was generated by the required Workspace V2 run.

- `(78%) src/engine/audio/TextToSpeechEngine.js - changed JS file with browser V8 coverage`
- `(100%) src/engine/audio/TextToSpeechDefaults.js - changed JS file with browser V8 coverage`
- `(100%) tools/text2speach-V2/js/bootstrap.js - changed JS file with browser V8 coverage`
- `(100%) tools/text2speach-V2/js/controls/ActionNavControl.js - changed JS file with browser V8 coverage`
- `(100%) tools/text2speach-V2/js/controls/SpeechOptionsControl.js - changed JS file with browser V8 coverage`
- `(100%) tools/text2speach-V2/js/TextToSpeechToolApp.js - changed JS file with browser V8 coverage`

## Full Samples Smoke Test

Skipped. This PR is limited to Text to Speech V2 browser speech queue behavior, Auto Speak removal, voice match display, and Workspace V2 Playwright coverage. It does not modify shared sample loading, sample JSON, broad game launch behavior, or shared runtime paths that require the full samples smoke test.

## Manual Validation

1. Open `tools/text2speach-V2/index.html`.
2. Confirm the Speech Options panel has no Auto Speak control and Named Sentences has Speak, Pause, Resume, and Stop only.
3. Confirm voice details show a heading like `4 voices match Any:` followed by one bullet per voice.
4. Select Queue Mode `Append to queue`, speak two different named sentences, and confirm the status reports queued item counts without claiming concurrent speakers.
5. Select Queue Mode `Replace current speech`, speak, and confirm status reports the replacement speech item.
6. Use Stop and confirm status reports the number of queued items cleared.

Expected outcome: Text to Speech V2 uses browser queue semantics, removed controls stay absent, and no UI copy claims true concurrent speech.

src/engine/audio/TextToSpeechDefaults.js

Lines changed: 0 additions & 4 deletions
@@ -88,13 +88,11 @@ const TEXT_TO_SPEECH_QUEUE_ITEM_REQUIRED_FIELDS = Object.freeze([
   "rate",
   "pitch",
   "queueMode",
-  "autoSpeak",
   "characterPreset",
   "ssmlLikePreset"
 ]);
 
 const TEXT_TO_SPEECH_DEFAULT_OPTIONS = Object.freeze({
-  autoSpeak: false,
   characterPreset: "manual",
   gender: "any",
   language: "en-US",
@@ -123,7 +121,6 @@ const TEXT_TO_SPEECH_DEFAULT_QUEUE = Object.freeze([
   Object.freeze({
     ...TEXT_TO_SPEECH_DEFAULT_OPTIONS,
     ...TEXT_TO_SPEECH_CHARACTER_PRESET_DEFAULTS.dramatic,
-    autoSpeak: false,
     characterPreset: "dramatic",
     gender: "male-preferred",
     id: "hero-ready",
@@ -139,7 +136,6 @@ const TEXT_TO_SPEECH_DEFAULT_QUEUE = Object.freeze([
   Object.freeze({
     ...TEXT_TO_SPEECH_DEFAULT_OPTIONS,
     ...TEXT_TO_SPEECH_CHARACTER_PRESET_DEFAULTS.alert,
-    autoSpeak: false,
     characterPreset: "alert",
     gender: "female-preferred",
     id: "alert-warning",

src/engine/audio/TextToSpeechEngine.js

Lines changed: 36 additions & 51 deletions
@@ -18,7 +18,8 @@ class TextToSpeechEngine {
     speechSynthesisRef = globalThis.speechSynthesis,
     utteranceCtor = globalThis.SpeechSynthesisUtterance
   } = {}) {
-    this.activeSpeakers = new Map();
+    this.queueSequence = 0;
+    this.queuedSpeechItems = new Map();
     this.speechSynthesis = speechSynthesisRef;
     this.Utterance = utteranceCtor;
   }
@@ -95,8 +96,8 @@ class TextToSpeechEngine {
     };
   }
 
-  activeSpeakerList() {
-    return Array.from(this.activeSpeakers.values()).map((speaker) => ({ ...speaker }));
+  queuedSpeechItemList() {
+    return Array.from(this.queuedSpeechItems.values()).map((item) => ({ ...item }));
   }
 
   createUtterance({
@@ -146,15 +147,14 @@ class TextToSpeechEngine {
   }
 
   speak({
-    autoSpeak = TEXT_TO_SPEECH_DEFAULTS.autoSpeak,
     characterPreset = TEXT_TO_SPEECH_DEFAULTS.characterPreset,
     gender = TEXT_TO_SPEECH_DEFAULTS.gender,
     language = TEXT_TO_SPEECH_DEFAULTS.language,
     pitch = TEXT_TO_SPEECH_DEFAULTS.pitch,
     queueMode = TEXT_TO_SPEECH_DEFAULTS.queueMode,
     rate = TEXT_TO_SPEECH_DEFAULTS.rate,
-    speakerId = "",
-    speakerName = "",
+    speechItemId = "",
+    speechItemName = "",
     ssmlLikePreset = TEXT_TO_SPEECH_DEFAULTS.ssmlLikePreset,
     text = TEXT_TO_SPEECH_DEFAULTS.sampleText,
     voice = TEXT_TO_SPEECH_DEFAULTS.voice,
@@ -170,47 +170,54 @@ class TextToSpeechEngine {
       return { message: `Unsupported ${TEXT_TO_SPEECH_DISPLAY_NAME} queueMode: ${queueMode}.`, ok: false };
     }
 
-    const activeSpeakerId = String(speakerId || speakerName || `speaker-${Date.now().toString(36)}`);
-    const activeSpeakerName = String(speakerName || activeSpeakerId);
-    const activeSpeaker = {
-      id: activeSpeakerId,
+    if (queueMode === "replace") {
+      this.speechSynthesis.cancel();
+      this.queuedSpeechItems.clear();
+    }
+
+    this.queueSequence += 1;
+    const selectedSpeechItemId = String(speechItemId || speechItemName || `speech-item-${Date.now().toString(36)}`);
+    const queuedSpeechItemId = `${selectedSpeechItemId || "speech-item"}::${this.queueSequence}`;
+    const queuedSpeechItemName = String(speechItemName || selectedSpeechItemId);
+    const queuedSpeechItem = {
+      id: queuedSpeechItemId,
       language: firstUtterance.utterance.lang,
-      name: activeSpeakerName,
+      name: queuedSpeechItemName,
       pitch: firstUtterance.utterance.pitch,
       queueMode,
       rate: firstUtterance.utterance.rate,
+      speechItemId: selectedSpeechItemId,
       status: "queued",
       text: firstUtterance.text,
       voiceName: firstUtterance.voiceName,
       voiceURI: firstUtterance.voiceURI,
       volume: firstUtterance.utterance.volume
     };
     firstUtterance.utterance.onstart = () => {
-      const speaker = this.activeSpeakers.get(activeSpeakerId);
-      if (speaker) {
-        this.activeSpeakers.set(activeSpeakerId, { ...speaker, status: "speaking" });
+      const item = this.queuedSpeechItems.get(queuedSpeechItemId);
+      if (item) {
+        this.queuedSpeechItems.set(queuedSpeechItemId, { ...item, status: "speaking" });
       }
     };
-    const clearSpeaker = () => {
-      this.activeSpeakers.delete(activeSpeakerId);
+    const clearQueuedSpeechItem = () => {
+      this.queuedSpeechItems.delete(queuedSpeechItemId);
     };
-    firstUtterance.utterance.onend = clearSpeaker;
-    firstUtterance.utterance.onerror = clearSpeaker;
-    this.activeSpeakers.set(activeSpeakerId, activeSpeaker);
+    firstUtterance.utterance.onend = clearQueuedSpeechItem;
+    firstUtterance.utterance.onerror = clearQueuedSpeechItem;
+    this.queuedSpeechItems.set(queuedSpeechItemId, queuedSpeechItem);
     this.speechSynthesis.speak(firstUtterance.utterance);
 
     return {
-      activeSpeakers: this.activeSpeakerList(),
-      autoSpeak: autoSpeak === true,
       characterPreset,
       gender,
       language: firstUtterance.utterance.lang,
       ok: true,
       pitch: firstUtterance.utterance.pitch,
+      queuedSpeechItems: this.queuedSpeechItemList(),
       queueMode,
       rate: firstUtterance.utterance.rate,
-      speakerId: activeSpeakerId,
-      speakerName: activeSpeakerName,
+      speechItemId: selectedSpeechItemId,
+      speechItemName: queuedSpeechItemName,
       ssmlLikePreset,
       text: firstUtterance.text,
       voiceAge,
@@ -236,41 +243,19 @@ class TextToSpeechEngine {
     return { ok: true };
   }
 
-  stop({ speakerId = "" } = {}) {
-    if (!this.isSupported()) {
-      return { message: "SpeechSynthesis is unavailable in this browser.", ok: false };
-    }
-    const selectedSpeakerId = String(speakerId || "");
-    const speaker = selectedSpeakerId
-      ? this.activeSpeakers.get(selectedSpeakerId)
-      : this.activeSpeakerList()[0];
-    if (!speaker) {
-      return { message: `No active ${TEXT_TO_SPEECH_DISPLAY_NAME} speaker is tracked for ${selectedSpeakerId || "the current selection"}.`, ok: false };
-    }
-    if (this.activeSpeakers.size > 1) {
-      return {
-        activeSpeakers: this.activeSpeakerList(),
-        message: `Cannot stop only ${speaker.name}: browser SpeechSynthesis exposes global cancel only. No global cancel was called while other speakers are active.`,
-        ok: false
-      };
-    }
-    this.speechSynthesis.cancel();
-    this.activeSpeakers.delete(speaker.id);
-    return { activeSpeakers: this.activeSpeakerList(), ok: true, speakerName: speaker.name };
-  }
-
-  stopAll() {
+  stop() {
     if (!this.isSupported()) {
       return { message: "SpeechSynthesis is unavailable in this browser.", ok: false };
     }
-    const stoppedCount = this.activeSpeakers.size;
+    const stoppedCount = this.queuedSpeechItems.size;
     this.speechSynthesis.cancel();
-    this.activeSpeakers.clear();
+    this.queuedSpeechItems.clear();
     return { ok: true, stoppedCount };
   }
 
-  resetActiveSpeakers() {
-    this.activeSpeakers.clear();
+  resetQueuedSpeechItems() {
+    this.queueSequence = 0;
+    this.queuedSpeechItems.clear();
     return { ok: true };
   }
 }
