Bug: iOS Chrome — voice input (dictation) sends duplicate/repeated text to terminal #70

@buallen

Environment

  • Device: iPhone (iOS 16+)
  • Browser: Chrome 117.x (CriOS)
  • Reproduced on: iOS 16.x + Chrome 117.0.5938.108

Problem

When using iOS voice dictation (microphone button on keyboard) to input text, the terminal receives duplicate or repeated characters. For example, saying "hello" may result in hellohello or helloello being sent.

Root Cause

iOS 16 voice dictation fires InputEvent with inputType = 'insertFromVoice' on the xterm helper textarea. The xterm internal _inputEvent handler also processes this event as normal keyboard input, causing double-sending.

Additionally, iOS accumulates all dictation phrases in textarea.value across a session — each new phrase appends to the existing value rather than replacing it. Without tracking the previous value, the entire accumulated string gets re-sent on every phrase.
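To make the accumulation problem concrete, here is a minimal sketch of the suffix extraction. It assumes `textarea.value` grows as described above. The helper name `newSuffix` and the sample snapshots are hypothetical and do not come from xterm.js or claude-code-web.

```javascript
// Hypothetical helper: return the part of `current` that was not already in `previous`.
// If `current` does not extend `previous` (e.g. the textarea was cleared), return all of it.
function newSuffix(previous, current) {
  return current.startsWith(previous) ? current.slice(previous.length) : current;
}

// Simulated textarea.value after each of three dictated phrases.
const snapshots = ['hello', 'hello world', 'hello world again'];

let last = '';
const sent = [];
for (const value of snapshots) {
  sent.push(newSuffix(last, value)); // only the new text is forwarded
  last = value;
}

console.log(sent); // [ 'hello', ' world', ' again' ]
```

Without the `last` bookkeeping, each phrase would re-send the full accumulated value (`hello`, `hello world`, `hello world again`).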

Steps to Reproduce

  1. Open claude-code-web on iPhone with Chrome
  2. Tap the terminal to focus it
  3. Use the microphone button on the iOS keyboard to dictate text (e.g. say "hello world")
  4. Observe that the terminal receives the text multiple times or with extra characters

Expected Behavior

Each dictated phrase should be sent exactly once to the terminal.

Technical Details

The fix requires:

  1. Listen for insertFromVoice InputEvent on the xterm helper textarea and intercept it in the capture phase, before xterm processes it:
const ta = terminal.textarea; // xterm helper textarea
let _lastTaValue = '';        // textarea value after the previous phrase
document.addEventListener('input', (ev) => {
  if (ev.target !== ta) return;
  if (ev.inputType !== 'insertFromVoice' && ev.inputType !== 'insertText') return;
  ev.stopImmediatePropagation();
  // extract only the new suffix since the last phrase
  const newText = ta.value.startsWith(_lastTaValue)
    ? ta.value.slice(_lastTaValue.length)
    : ta.value;
  _lastTaValue = ta.value;
  // _core is a private xterm.js API and may change between versions
  if (newText.trim()) terminal._core.coreService.triggerDataEvent(newText, true);
}, true); // capture phase
  2. Guard insertText events: only intercept them during an active voice session (after an insertFromVoice event has been seen), to avoid breaking normal swipe/autocorrect keyboard input:
let _inVoiceSession = false; // declared once, outside the handler
// Inside the input handler, before stopImmediatePropagation():
// only block insertText if we're in a voice session
if (ev.inputType === 'insertText' && !_inVoiceSession) return;
if (ev.inputType === 'insertFromVoice') _inVoiceSession = true;
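  3. Reset the voice session after an idle timeout (e.g. 2s) to handle session end. Restart the timer on every intercepted event:
let _voiceIdleTimer; // declared once, outside the handler
// Inside the input handler, after each intercepted event:
clearTimeout(_voiceIdleTimer);
_voiceIdleTimer = setTimeout(() => {
  ta.value = '';
  _inVoiceSession = false;
  _lastTaValue = '';
}, 2000);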

This approach was tested and works on iOS 16.x with Chrome 117.
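The three steps can be combined into a single handler. The following is a sketch under stated assumptions, not the project's actual patch: `createVoiceInputFilter`, its option names, and the boolean return value are hypothetical, and the `insertFromVoice` behavior is taken from this report rather than verified independently. The logic is kept free of DOM globals so it can be exercised outside a browser.

```javascript
// Sketch only: createVoiceInputFilter and its options are hypothetical names,
// not part of xterm.js or claude-code-web.
function createVoiceInputFilter({ textarea, send, idleMs = 2000 }) {
  let inVoiceSession = false; // true after an insertFromVoice event has been seen
  let lastValue = '';         // textarea value after the previous phrase
  let idleTimer = null;

  // Returns true when the event was consumed and should not reach xterm.
  return function handleInput(ev) {
    if (ev.target !== textarea) return false;

    // Step 2: insertText is only intercepted during an active voice session.
    if (ev.inputType === 'insertFromVoice') {
      inVoiceSession = true;
    } else if (ev.inputType !== 'insertText' || !inVoiceSession) {
      return false; // ordinary typing, swipe, autocorrect: leave to xterm
    }
    ev.stopImmediatePropagation();

    // Step 1: forward only the text added since the previous phrase.
    const value = textarea.value;
    const newText = value.startsWith(lastValue)
      ? value.slice(lastValue.length)
      : value;
    lastValue = value;
    if (newText.trim()) send(newText);

    // Step 3: restart the idle timer; when it fires, the session is over.
    clearTimeout(idleTimer);
    idleTimer = setTimeout(() => {
      textarea.value = '';
      inVoiceSession = false;
      lastValue = '';
    }, idleMs);
    return true;
  };
}

// Browser wiring (assumes an xterm.js instance named `terminal`):
// const handle = createVoiceInputFilter({
//   textarea: terminal.textarea,
//   send: (text) => terminal._core.coreService.triggerDataEvent(text, true),
// });
// document.addEventListener('input', handle, true); // capture phase
```

Passing `send` in as a callback keeps the private `_core` call in one place, so it can be swapped for a public API if the xterm.js version in use provides one.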
