
speechmarkdown-js

Speech Markdown grammar, parser, and formatters for use with JavaScript.

Supported platforms:

  • amazon-alexa
  • amazon-polly
  • amazon-polly-neural
  • apple-avspeechsynthesizer
  • google-assistant
  • ibm-watson
  • microsoft-azure
  • microsoft-sapi
  • w3c
  • samsung-bixby
  • elevenlabs

Find the architecture here

Platform-specific SSML notes are tracked in docs/platforms. Use npm run docs:update-voices to refresh the auto-generated voice maps in src/formatters/data when vendor credentials are available.

Quick start

SSML - Amazon Alexa

Convert Speech Markdown to SSML for Amazon Alexa

const smd = require('speechmarkdown-js');

const markdown = `Sample [3s] speech [250ms] markdown`;
const options = {
  platform: 'amazon-alexa',
};

const speech = new smd.SpeechMarkdown();
const ssml = speech.toSSML(markdown, options);

The resulting SSML is:

<speak>
Sample <break time="3s"/> speech <break time="250ms"/> markdown
</speak>

SSML - Google Assistant

Convert Speech Markdown to SSML for Google Assistant

const smd = require('speechmarkdown-js');

const markdown = `Sample [3s] speech [250ms] markdown`;
const options = {
  platform: 'google-assistant',
};

const speech = new smd.SpeechMarkdown();
const ssml = speech.toSSML(markdown, options);

The resulting SSML is:

<speak>
Sample <break time="3s"/> speech <break time="250ms"/> markdown
</speak>

SSML - Microsoft Azure

Convert Speech Markdown to SSML for Microsoft Azure with automatic MSTTS namespace injection

const smd = require('speechmarkdown-js');

const markdown = `(This is exciting news!)[excited:"1.5"] The new features are here.`;
const options = {
  platform: 'microsoft-azure',
};

const speech = new smd.SpeechMarkdown();
const ssml = speech.toSSML(markdown, options);

The resulting SSML is:

<speak xmlns:mstts="https://www.w3.org/2001/mstts">
<mstts:express-as style="excited" styledegree="1.5">This is exciting news!</mstts:express-as> The new features are here.
</speak>

Azure supports 27 express-as styles including emotional styles (excited, disappointed, friendly, cheerful, sad, angry, etc.) and scenario-specific styles (newscaster, customerservice, chat, etc.). See Azure platform documentation for complete details.
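As a small illustration of the express-as syntax used in the example above, the sketch below assembles the Speech Markdown string for a styled phrase. The helper name `expressAs` is hypothetical and is not part of speechmarkdown-js. It only builds the `(text)[style:"degree"]` form shown above, and it assumes the degree may be omitted.

```javascript
// Hypothetical helper, NOT part of speechmarkdown-js.
// Builds a Speech Markdown span such as (text)[excited:"1.5"].
function expressAs(text, style, degree) {
  // Assumption: when no degree is given, the bare style name is used.
  const modifier = degree === undefined ? style : `${style}:"${degree}"`;
  return `(${text})[${modifier}]`;
}

const styled = expressAs('This is exciting news!', 'excited', 1.5);
const markdown = `${styled} The new features are here.`;
console.log(markdown);
```

The resulting string can then be passed to toSSML with platform set to 'microsoft-azure', as in the example above.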

Plain Text

Convert Speech Markdown to Plain Text

const smd = require('speechmarkdown-js');

const markdown = `Sample [3s] speech [250ms] markdown`;
const options = {};

const speech = new smd.SpeechMarkdown();
const text = speech.toText(markdown, options);

The resulting text is:

Sample speech markdown

More

Options

You can pass options into the constructor:

const smd = require('speechmarkdown-js');

const markdown = `Sample [3s] speech [250ms] markdown`;
const options = {
  platform: 'amazon-alexa',
};

const speech = new smd.SpeechMarkdown(options);
const ssml = speech.toSSML(markdown);

Or in the methods toSSML and toText:

const smd = require('speechmarkdown-js');

const markdown = `Sample [3s] speech [250ms] markdown`;
const options = {
  platform: 'amazon-alexa',
};

const speech = new smd.SpeechMarkdown();
const ssml = speech.toSSML(markdown, options);
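Passing options per call is convenient when the same markdown has to be rendered for several platforms. The following is a minimal sketch of that pattern. The function name `renderForPlatforms` is hypothetical, and `speech` is assumed to be an instance created with `new smd.SpeechMarkdown()`, as in the examples above.

```javascript
// Hypothetical helper, NOT part of speechmarkdown-js.
// Renders one markdown string once per platform using per-call options.
function renderForPlatforms(speech, markdown, platforms) {
  const results = {};
  for (const platform of platforms) {
    // Options are passed to toSSML rather than to the constructor.
    results[platform] = speech.toSSML(markdown, { platform });
  }
  return results;
}

// Usage (requires the speechmarkdown-js package to be installed):
// const smd = require('speechmarkdown-js');
// const out = renderForPlatforms(
//   new smd.SpeechMarkdown(),
//   'Sample [3s] speech [250ms] markdown',
//   ['amazon-alexa', 'google-assistant']
// );
```

Because the instance is passed in as an argument, the helper can be exercised in tests with a stub object that implements toSSML.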

Available options are:

  • platform (string) - Determines the formatter to use to render SSML. Valid values are:

    • "amazon-alexa"
    • "amazon-polly"
    • "amazon-polly-neural"
    • "apple-avspeechsynthesizer"
    • "google-assistant"
    • "ibm-watson"
    • "microsoft-azure"
    • "microsoft-sapi"
    • "w3c"
    • "samsung-bixby"
    • "elevenlabs"
  • includeFormatterComment (boolean) - Adds an XML comment to the SSML output indicating the formatter used. Default is false.

  • includeSpeakTag (boolean) - Determines if the <speak> tag will be rendered in the SSML output. Default is true.

  • includeParagraphTag (boolean) - Determines if the <p> tag will be rendered in the SSML output. Default is false.

  • preserveEmptyLines (boolean) - Keeps empty lines from the markdown in the SSML output. Default is true.

  • escapeXmlSymbols (boolean) - Escapes XML special characters in text. Currently supported only for amazon-alexa and microsoft-azure. Default is false.

  • voices (object) - Maps custom names to platform voices so you can use those names in your markdown:

    {
      "platform": "amazon-alexa",
      "voices": {
        "Scott": { "voice": { "name": "Brian" } },
        "Sarah": { "voice": { "name": "Kendra" } }
      }
    }

    {
      "platform": "google-assistant",
      "voices": {
        "Brian": {
          "voice": { "gender": "male", "variant": 1, "language": "en-US" }
        },
        "Sarah": {
          "voice": { "gender": "female", "variant": 3, "language": "en-US" }
        }
      }
    }
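The option list above can be turned into a small pre-flight check that runs before an options object is handed to the library. This is a sketch only: the function `checkOptions` and both constant names are hypothetical, and whether speechmarkdown-js performs any validation of its own is not stated here.

```javascript
// Hypothetical pre-check, NOT part of speechmarkdown-js.
// Platform names and option names are copied from the list above.
const PLATFORMS = [
  'amazon-alexa', 'amazon-polly', 'amazon-polly-neural',
  'apple-avspeechsynthesizer', 'google-assistant', 'ibm-watson',
  'microsoft-azure', 'microsoft-sapi', 'w3c', 'samsung-bixby', 'elevenlabs',
];
const BOOLEAN_OPTIONS = [
  'includeFormatterComment', 'includeSpeakTag', 'includeParagraphTag',
  'preserveEmptyLines', 'escapeXmlSymbols',
];

// Returns a list of human-readable problems; an empty list means no issue was found.
function checkOptions(options) {
  const problems = [];
  if (options.platform !== undefined && !PLATFORMS.includes(options.platform)) {
    problems.push(`unknown platform: ${options.platform}`);
  }
  for (const key of BOOLEAN_OPTIONS) {
    if (key in options && typeof options[key] !== 'boolean') {
      problems.push(`${key} must be a boolean`);
    }
  }
  if ('voices' in options && (typeof options.voices !== 'object' || options.voices === null)) {
    problems.push('voices must be an object');
  }
  return problems;
}

console.log(checkOptions({ platform: 'amazon-alexa', includeSpeakTag: false }));
```

Returning a list instead of throwing lets the caller decide whether a problem should be fatal or only logged.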

Working on this project?

Grammar

The area where we need the most help right now is completing the grammar and formatters.

Short Format

  • break
  • emphasis - strong
  • emphasis - moderate
  • emphasis - none
  • emphasis - reduced
  • ipa
  • sub

Short-form examples:

  • (pecan)/'pi.kæn/ → <phoneme alphabet="ipa" ph="'pi.kæn">pecan</phoneme>
  • (Al){aluminum} → <sub alias="aluminum">Al</sub>
  • /ˈdeɪtə/ → <phoneme alphabet="ipa" ph="ˈdeɪtə">ipa</phoneme>
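To make the short-format mapping concrete, here is a deliberately simplified illustration that rewrites two of the short forms (time breaks and sub) with regular expressions. It is not how speechmarkdown-js works internally, since the library uses a grammar and parser. The function name `toyShortForm` is hypothetical, and the sketch ignores nesting, escaping, and every other construct.

```javascript
// Toy illustration only, NOT the library's parser.
// Handles just two short forms: [3s] / [250ms] breaks and (text){alias} subs.
function toyShortForm(markdown) {
  return markdown
    // [3s] or [250ms] becomes <break time="..."/>
    .replace(/\[(\d+(?:ms|s))\]/g, '<break time="$1"/>')
    // (Al){aluminum} becomes <sub alias="aluminum">Al</sub>
    .replace(/\(([^)]+)\)\{([^}]+)\}/g, '<sub alias="$2">$1</sub>');
}

console.log(toyShortForm('Sample [3s] speech [250ms] markdown'));
console.log(toyShortForm('(Al){aluminum}'));
```

A regex pass like this breaks down quickly on real input, which is one reason a proper grammar is needed for the formats listed in this section.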

Standard Format

  • address
  • audio
  • break (time)
  • break (strength)
  • characters / chars
  • date
  • defaults (section)
  • disappointed
  • disappointed (section)
  • dj (section)
  • emphasis
  • excited
  • excited (section)
  • expletive / bleep
  • fraction
  • interjection
  • ipa
  • lang
  • lang (section)
  • mark
  • newscaster (section)
  • number
  • ordinal
  • telephone / phone
  • pitch
  • rate
  • sub
  • time
  • unit
  • voice
  • voice (section)
  • volume / vol
  • whisper

Available scripts

  • clean - remove coverage data, Jest cache, and transpiled files
  • build - perform all build tasks
  • build:ts - transpile TypeScript to ES5
  • build:browser - create the single file ./dist.browser/speechmarkdown.js for use in the browser
  • build:minify - create the single minified file ./dist.browser/speechmarkdown.min.js for use in the browser
  • watch - interactive watch mode to automatically transpile source files
  • lint - lint source files and tests
  • test - run tests
  • test:watch - interactive watch mode to automatically re-run tests

License

Licensed under the MIT License. See the LICENSE file for details.