Speechmatics offers a suite of APIs for converting speech into text transcripts. Its cloud-based deep learning models are designed to stay accurate across accents, dialects, and audio quality levels. With these APIs you can add features such as:

  • Speech-to-Text Transcription: Convert audio and video files into accurate text transcripts.
  • Real-time Speech Recognition: Enable real-time transcriptions for applications like live captioning and voice assistants.
  • Customizable Vocabularies: Tailor speech recognition for specific domains or industries by supplying custom words and phrases.

Authentication:

Speechmatics uses API keys for authentication. You can obtain a free API key with limited usage by creating an account on the Speechmatics platform. Paid plans offer higher usage limits, additional features, and priority processing.

Here's a breakdown of some key Speechmatics speech-to-text API endpoints, with JavaScript examples where applicable:

1. Batch Speech-to-Text Transcription (Transcript)

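  • This endpoint submits an audio or video file for asynchronous transcription. The request creates a job and returns its ID; the transcript is retrieved separately once the job has finished.
  • Type: POST
  • JavaScript Example (the URL and request shape follow the v2 jobs API; check them against the current API reference):
const apiKey = 'YOUR_API_KEY';
const baseUrl = 'https://asr.api.speechmatics.com/v2/jobs';

// Job configuration, sent as a JSON string in a multipart form field named "config"
const config = {
    type: 'transcription',
    transcription_config: {
        language: 'en', // Change language code as needed
    },
    fetch_data: {
        url: 'https://www.example.com/audio.mp3', // Replace with your audio/video URL
    },
};

const form = new FormData();
form.append('config', JSON.stringify(config));

// Do not set Content-Type yourself; fetch adds the multipart boundary for FormData bodies
const headers = {
    Authorization: `Bearer ${apiKey}`,
};

fetch(`${baseUrl}/`, { method: 'POST', headers, body: form })
    .then((response) => {
        if (!response.ok) throw new Error(`Job submission failed: HTTP ${response.status}`);
        return response.json();
    })
    .then((job) => {
        // The transcript is not part of this response; request it once the job is done
        console.log('Job submitted, ID:', job.id);
    })
    .catch((error) => console.error(error));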
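
Because transcription is asynchronous, the job usually has to be polled before a transcript exists. The following is a minimal sketch of that step, assuming the v2 jobs API exposes the job status at /v2/jobs/<id> and the transcript at /v2/jobs/<id>/transcript. The helper names (transcriptUrl, waitForTranscript), the polling interval, and the attempt limit are illustrative choices, not part of the Speechmatics API.

```javascript
// Assumed base URL for the batch jobs API; confirm it in the API reference.
const JOBS_URL = 'https://asr.api.speechmatics.com/v2/jobs';

// Build the transcript URL for a job. 'txt' requests plain text.
function transcriptUrl(jobId, format = 'txt') {
    return `${JOBS_URL}/${encodeURIComponent(jobId)}/transcript?format=${encodeURIComponent(format)}`;
}

const sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

// Poll the job until it is done, then return the plain-text transcript.
// intervalMs and maxAttempts are illustrative defaults, not recommended values.
async function waitForTranscript(jobId, apiKey, { intervalMs = 5000, maxAttempts = 60 } = {}) {
    const headers = { Authorization: `Bearer ${apiKey}` };

    for (let attempt = 0; attempt < maxAttempts; attempt++) {
        const statusResponse = await fetch(`${JOBS_URL}/${encodeURIComponent(jobId)}`, { headers });
        if (!statusResponse.ok) {
            throw new Error(`Status check failed: HTTP ${statusResponse.status}`);
        }

        // Assumed response shape: { job: { status: 'running' | 'done' | 'rejected', ... } }
        const { job } = await statusResponse.json();
        if (job.status === 'done') {
            const transcriptResponse = await fetch(transcriptUrl(jobId), { headers });
            if (!transcriptResponse.ok) {
                throw new Error(`Transcript fetch failed: HTTP ${transcriptResponse.status}`);
            }
            return transcriptResponse.text();
        }
        if (job.status === 'rejected') {
            throw new Error(`Job ${jobId} was rejected`);
        }

        await sleep(intervalMs);
    }
    throw new Error(`Job ${jobId} did not finish after ${maxAttempts} checks`);
}
```

A call such as waitForTranscript(job.id, apiKey).then(console.log) could follow the job submission. For long recordings, a notification callback (if your plan and setup support one) may be preferable to polling.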

2. Real-time Speech Recognition (Streaming)

  • This endpoint enables real-time transcription of audio streams.
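  • Type: WebSocket (a persistent, two-way connection rather than a one-off fetch request)
  • JavaScript Example: this requires more setup than a basic fetch request. The client opens a socket, sends a start message describing the audio format and language, streams audio chunks as binary frames, and reads transcript messages as they arrive.

Refer to Speechmatics' documentation for detailed instructions on implementing real-time speech recognition using WebSockets.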
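
As a rough illustration of the WebSocket flow, the sketch below assumes the real-time API accepts a StartRecognition message, binary audio frames, and an EndOfStream message, and that it replies with AddPartialTranscript and AddTranscript messages. The endpoint host, the jwt query parameter for a temporary key, and the helper names (buildStartRecognition, readTranscript, startSession) are assumptions to verify against the official documentation.

```javascript
// Assumed real-time endpoint; the host is region-specific, so confirm it in the documentation.
const RT_URL = 'wss://eu2.rt.speechmatics.com/v2';

// First message sent after the socket opens: describes the audio and the language.
function buildStartRecognition(language = 'en', sampleRate = 16000) {
    return {
        message: 'StartRecognition',
        audio_format: { type: 'raw', encoding: 'pcm_f32le', sample_rate: sampleRate },
        transcription_config: { language, enable_partials: true },
    };
}

// Extract transcript text from a server message, or return null for other message types.
function readTranscript(rawMessage) {
    const msg = JSON.parse(rawMessage);
    if (msg.message === 'AddTranscript' || msg.message === 'AddPartialTranscript') {
        return { final: msg.message === 'AddTranscript', text: msg.metadata?.transcript ?? '' };
    }
    return null;
}

// Open a session. temporaryKey is a short-lived token (hypothetical parameter name);
// onText is called with each partial or final transcript.
function startSession(temporaryKey, onText) {
    const socket = new WebSocket(`${RT_URL}?jwt=${encodeURIComponent(temporaryKey)}`);
    let chunksSent = 0;

    socket.onopen = () => socket.send(JSON.stringify(buildStartRecognition()));
    socket.onmessage = (event) => {
        if (typeof event.data !== 'string') return; // ignore non-text frames
        const result = readTranscript(event.data);
        if (result) onText(result);
    };
    socket.onerror = (event) => console.error('WebSocket error', event);

    return {
        // Send one chunk of raw audio (ArrayBuffer or typed array) as a binary frame.
        sendAudio(chunk) {
            socket.send(chunk);
            chunksSent += 1;
        },
        // Tell the server that no more audio will follow.
        end() {
            socket.send(JSON.stringify({ message: 'EndOfStream', last_seq_no: chunksSent }));
        },
    };
}
```

In practice, audio should only be sent after the server confirms that recognition has started, and the audio chunks would typically come from a microphone capture pipeline. Both details are omitted here to keep the sketch short.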

3. Custom Vocabularies (CreateVocabulary) (Paid Plans)

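  • This feature lets you supply domain-specific words and phrases so the recognizer is more likely to transcribe them correctly. It biases recognition toward your terminology rather than training a new language model.
  • Type: POST (in the current API the word list appears to be sent as part of a transcription job's configuration rather than to a dedicated endpoint; confirm this in the API reference)
  • Note: Requires a paid plan

Speechmatics' documentation offers guidance on creating custom vocabularies through their developer portal.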
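
To make the idea concrete, here is a small sketch that builds a job configuration containing custom vocabulary. It assumes the vocabulary is passed as an additional_vocab list inside transcription_config, where each entry has a content field and an optional sounds_like list. The helper name buildConfigWithVocab and its input shape are hypothetical.

```javascript
// Build a batch job config that includes custom vocabulary entries.
// Each term is either a plain string or { content, soundsLike } (hypothetical input shape).
function buildConfigWithVocab(language, terms) {
    const additionalVocab = terms.map((term) => {
        if (typeof term === 'string') return { content: term };
        const entry = { content: term.content };
        // Only include pronunciation hints when some were provided
        if (term.soundsLike && term.soundsLike.length > 0) {
            entry.sounds_like = term.soundsLike;
        }
        return entry;
    });

    return {
        type: 'transcription',
        transcription_config: { language, additional_vocab: additionalVocab },
    };
}

const exampleConfig = buildConfigWithVocab('en', [
    'Speechmatics',
    { content: 'gnocchi', soundsLike: ['nyohki', 'nokey'] },
]);
console.log(JSON.stringify(exampleConfig, null, 2));
```

The resulting object can be serialized and sent as the config field of a batch job request.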

Explore More with Speechmatics

Speechmatics offers a free tier with limited usage, along with paid plans that raise processing quotas and unlock real-time speech recognition and custom vocabulary creation. Their documentation covers each API endpoint, advanced features, pricing options, and real-time implementation in more detail, and is the best place to confirm request formats before building on them.