AssemblyAI's Speech API lets developers and businesses turn speech data into usable text and insights. It uses machine learning models to transcribe audio and video recordings and to extract information from them, such as who is speaking and the sentiment of what was said.

Authentication:

AssemblyAI uses API keys for authentication. You can get a free API key by creating an account on the AssemblyAI platform, and you send that key in the Authorization header of every request.
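As a minimal sketch of how the key is attached to a request, the helper below builds the headers for a JSON call. It assumes the key is sent as-is in the Authorization header, without a "Bearer" prefix. The helper name `buildHeaders` and the environment variable name `ASSEMBLYAI_API_KEY` are illustrative choices, not part of the API.

```javascript
// Hypothetical helper: builds the headers for a JSON request to AssemblyAI.
// Assumption: the API key goes into the Authorization header as-is.
function buildHeaders(apiKey) {
    if (typeof apiKey !== 'string' || apiKey.trim() === '') {
        throw new Error('An AssemblyAI API key is required');
    }
    return {
        Authorization: apiKey.trim(),
        'Content-Type': 'application/json',
    };
}

// In Node.js, prefer an environment variable over a hard-coded key.
// ASSEMBLYAI_API_KEY is an assumed variable name; use whatever your setup defines.
const configuredKey =
    (typeof process !== 'undefined' && process.env.ASSEMBLYAI_API_KEY) || 'YOUR_API_KEY';

console.log(Object.keys(buildHeaders(configuredKey)));
```

Keeping the key out of source code this way makes it harder to leak through version control.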

Here's a breakdown of some key AssemblyAI Speech API features, along with JavaScript examples:

1. Speech-to-Text Transcription (Transcription)

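  • About: This endpoint transcribes audio and video recordings into text. You submit the URL of a media file, and the response contains a transcript ID that you use to check the job's status and fetch the result.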
  • Type: POST
  • JavaScript Example:
const apiKey = 'YOUR_API_KEY';
// Transcription jobs are created at /v2/transcript.
// (/v2/upload is a separate endpoint for uploading local file bytes.)
const url = 'https://api.assemblyai.com/v2/transcript';

// The request body is JSON, not multipart form data.
const body = JSON.stringify({
    audio_url: 'https://www.example.com/audio.mp3', // Replace with your audio URL
});

const headers = {
    Authorization: apiKey, // The key is sent as-is, without a "Bearer" prefix
    'Content-Type': 'application/json',
};

fetch(url, { method: 'POST', headers, body })
    .then((response) => {
        if (!response.ok) throw new Error(`Request failed with status ${response.status}`);
        return response.json();
    })
    .then((data) => {
        console.log('Transcript ID:', data.id, 'Status:', data.status);
        // Use the transcript ID to poll GET /v2/transcript/{id} until the status is "completed"
    })
    .catch((error) => console.error(error));
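Transcription is asynchronous, so the ID returned by the POST request has to be polled until the job finishes. The following is a rough sketch of that loop, not an official client. It assumes the transcript is read from GET /v2/transcript/{id} and that its `status` field moves through "queued" and "processing" to "completed" or "error". The helper name `waitForTranscript`, the polling interval, and the attempt limit are illustrative choices.

```javascript
const sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

// Hypothetical helper: polls a transcript until it completes or fails.
// fetchFn is injectable so the loop can be exercised without network access.
async function waitForTranscript(transcriptId, apiKey, options = {}) {
    const { fetchFn = fetch, intervalMs = 3000, maxAttempts = 100 } = options;
    const url = `https://api.assemblyai.com/v2/transcript/${transcriptId}`;

    for (let attempt = 0; attempt < maxAttempts; attempt++) {
        const response = await fetchFn(url, { headers: { Authorization: apiKey } });
        if (!response.ok) {
            throw new Error(`Polling failed with status ${response.status}`);
        }

        const transcript = await response.json();
        if (transcript.status === 'completed') {
            return transcript; // transcript.text holds the finished text
        }
        if (transcript.status === 'error') {
            throw new Error(`Transcription failed: ${transcript.error}`);
        }

        await sleep(intervalMs); // assumed to be "queued" or "processing"; wait and retry
    }
    throw new Error('Timed out waiting for the transcript');
}
```

A typical call would look like `waitForTranscript(data.id, apiKey).then((t) => console.log(t.text))`, where `data.id` is the ID returned when the job was created.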

2. Real-time Speech-to-Text (Streaming) (Paid Plans)

  • About: This endpoint allows for real-time transcription of live audio streams.
  • Type: WebSocket connection
  • Note: Real-time Speech-to-Text requires a paid subscription. Refer to the AssemblyAI documentation for details on paid plans and features.
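To give a feel for the WebSocket flow, here is a hedged sketch of a browser-side client. Everything specific in it is an assumption to verify against the current documentation: the v2 real-time URL, the `sample_rate` and `token` query parameters, and the `message_type` values "PartialTranscript" and "FinalTranscript". The function names are hypothetical, and the temporary session token is assumed to come from your own backend so that the API key never reaches the browser.

```javascript
// Assumption: v2 real-time endpoint. Check the documentation for the current URL.
const REALTIME_URL = 'wss://api.assemblyai.com/v2/realtime/ws';

// Builds the connection URL for a sample rate and a temporary session token.
function buildRealtimeUrl(sampleRate, token) {
    const params = new URLSearchParams({ sample_rate: String(sampleRate), token });
    return `${REALTIME_URL}?${params.toString()}`;
}

// Interprets one incoming message. Assumed fields: message_type and text.
function handleRealtimeMessage(rawMessage) {
    const message = JSON.parse(rawMessage);
    switch (message.message_type) {
        case 'PartialTranscript':
            return { final: false, text: message.text };
        case 'FinalTranscript':
            return { final: true, text: message.text };
        default:
            return null; // session events and anything unrecognized
    }
}

// Opens the socket and logs transcripts as they arrive (uses the browser WebSocket API).
function startRealtimeSession(sampleRate, token) {
    const socket = new WebSocket(buildRealtimeUrl(sampleRate, token));
    socket.onmessage = (event) => {
        const result = handleRealtimeMessage(event.data);
        if (result && result.text) {
            console.log(result.final ? 'Final:' : 'Partial:', result.text);
        }
    };
    socket.onerror = (event) => console.error('WebSocket error', event);
    return socket;
}
```

Separating message parsing from the socket setup keeps the transcript handling easy to test without a live connection. Sending microphone audio over the socket is left out here because the expected audio format depends on the API version.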

3. Speaker Diarization (Paid Plans)

  • About: This feature identifies and labels the different speakers within a recording.
  • Type: POST (enabled as an option on the transcription request rather than through a separate endpoint)
  • Note: Speaker Diarization requires a paid subscription. Refer to the AssemblyAI documentation for details on paid plans and features.
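As a small sketch of how this might look in practice, the code below builds a transcription request with speaker labels turned on and formats the result. It assumes the feature is enabled with `speaker_labels: true` in the JSON body and that a completed transcript carries an `utterances` array whose items have `speaker` and `text` fields. The function names and the sample transcript are made up for illustration.

```javascript
// Request body for POST /v2/transcript with speaker labels turned on.
function buildDiarizationRequest(audioUrl) {
    return { audio_url: audioUrl, speaker_labels: true };
}

// Turns the utterances of a completed transcript into "Speaker A: ..." lines.
function formatUtterances(transcript) {
    return (transcript.utterances || []).map(
        (utterance) => `Speaker ${utterance.speaker}: ${utterance.text}`
    );
}

// Hand-written sample shaped like a completed transcript, for illustration only.
const sampleTranscript = {
    status: 'completed',
    utterances: [
        { speaker: 'A', text: 'Hi, thanks for calling.' },
        { speaker: 'B', text: 'Hello, I have a question about my order.' },
    ],
};

console.log(JSON.stringify(buildDiarizationRequest('https://www.example.com/audio.mp3')));
console.log(formatUtterances(sampleTranscript).join('\n'));
```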

4. Custom Vocabularies (Paid Plans)

  • About: This feature lets you supply your industry-specific terms so that they are recognized more accurately in specialized fields.
  • Type: Various (typically passed as an option on the transcription request; refer to documentation)
  • Note: Custom Vocabularies require a paid subscription. Refer to the AssemblyAI documentation for details on paid plans and features.
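One way to pass such terms is sketched below. It assumes the transcription request accepts a `word_boost` array of terms and a `boost_param` of "low", "default", or "high"; newer API versions may offer other mechanisms, so treat these parameter names as assumptions to check. The helper name `buildBoostedRequest` and the example terms are hypothetical.

```javascript
const BOOST_LEVELS = ['low', 'default', 'high'];

// Hypothetical helper: builds a transcription request body with boosted terms.
function buildBoostedRequest(audioUrl, terms, boostLevel = 'default') {
    if (!BOOST_LEVELS.includes(boostLevel)) {
        throw new Error(`boostLevel must be one of: ${BOOST_LEVELS.join(', ')}`);
    }
    // Trim whitespace, drop empty entries, and remove duplicates.
    const cleanedTerms = [
        ...new Set(terms.map((term) => term.trim()).filter((term) => term.length > 0)),
    ];
    return {
        audio_url: audioUrl,
        word_boost: cleanedTerms,
        boost_param: boostLevel,
    };
}

const boostedBody = buildBoostedRequest(
    'https://www.example.com/audio.mp3',
    ['tachycardia', 'metoprolol', 'echocardiogram'],
    'high'
);
console.log(JSON.stringify(boostedBody, null, 2));
```

The resulting object is sent as the JSON body of the same POST request used for a regular transcription.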

5. Sentiment Analysis (Beta)

  • About: This feature analyzes the emotional tone of speech within a recording.
  • Type: POST (enabled as an option on the transcription request)
  • Note: Sentiment Analysis is currently in Beta. Refer to the AssemblyAI documentation for details and availability.
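The sketch below shows how sentiment results might be requested and summarized. It assumes the feature is enabled with `sentiment_analysis: true` in the transcription request and that a completed transcript includes a `sentiment_analysis_results` array whose items carry a `sentiment` of "POSITIVE", "NEUTRAL", or "NEGATIVE". The function names and the sample data are illustrative only.

```javascript
// Request body for POST /v2/transcript with sentiment analysis turned on.
function buildSentimentRequest(audioUrl) {
    return { audio_url: audioUrl, sentiment_analysis: true };
}

// Counts how many analyzed sentences fall into each sentiment class.
function countSentiments(transcript) {
    const counts = { POSITIVE: 0, NEUTRAL: 0, NEGATIVE: 0 };
    for (const result of transcript.sentiment_analysis_results || []) {
        if (Object.prototype.hasOwnProperty.call(counts, result.sentiment)) {
            counts[result.sentiment] += 1;
        }
    }
    return counts;
}

// Hand-written sample shaped like a completed transcript, for illustration only.
const sampleSentimentTranscript = {
    status: 'completed',
    sentiment_analysis_results: [
        { text: 'The setup was quick and painless.', sentiment: 'POSITIVE' },
        { text: 'The invoice arrived on Tuesday.', sentiment: 'NEUTRAL' },
        { text: 'Support never answered my ticket.', sentiment: 'NEGATIVE' },
        { text: 'I love the new dashboard.', sentiment: 'POSITIVE' },
    ],
};

console.log(countSentiments(sampleSentimentTranscript));
```

A summary like this is a simple starting point, for example to flag recordings where negative sentences outnumber positive ones.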

AssemblyAI's Speech API offers a versatile toolkit for speech processing. See the AssemblyAI documentation for in-depth information on each endpoint, additional features, and pricing options.