Feedback: guides-gladia_to_aai

Documentation Feedback

Original URL: https://www.assemblyai.com/docs/guides/gladia_to_aai
Category: guides
Generated: 05/08/2025, 4:40:44 pm

Claude Sonnet 4 Feedback

Generated: 05/08/2025, 4:40:43 pm

Technical Documentation Analysis: Migration Guide - Gladia to AssemblyAI

Overall Assessment

This migration guide provides a solid foundation but has several areas for improvement in clarity, completeness, and user experience. Below is my detailed analysis with actionable recommendations.

🔴 Critical Issues

1. Missing Prerequisites & Setup Information

Problem: The guide jumps directly into code without proper environment setup.

Fix: Add a comprehensive prerequisites section:

## Prerequisites

Before starting this migration, ensure you have:

### Required Tools
- Python 3.7+ installed
- `requests` library (`pip install requests`)
- Your existing Gladia API key (for reference)
- A new AssemblyAI API key ([get one here](https://assemblyai.com/dashboard/signup))

### Environment Setup
```bash
# Install required dependencies
pip install requests

# Set your API key as an environment variable (recommended)
export ASSEMBLYAI_API_KEY="your_api_key_here"
```

### Test Audio File

Download a sample audio file to test with:

```bash
# -L follows the redirect GitHub issues for /raw/ URLs
curl -L -O https://github.com/AssemblyAI-Examples/audio-examples/raw/main/20220913_120000.mp3
```

2. Incomplete Error Handling

Problem: Basic error handling without specific guidance for common migration issues.

Fix: Add comprehensive error handling section:
```markdown
## Error Handling & Troubleshooting

### Common Migration Issues

| Error | Cause | Solution |
|-------|-------|----------|
| `401 Unauthorized` | Invalid or missing API key | Ensure you're using your AssemblyAI key, not your Gladia key |
| `400 Bad Request` | Invalid audio URL | Check file format and accessibility |
| `429 Too Many Requests` | Rate limit exceeded | Implement exponential backoff |

### Robust Error Handling Example
```

```python
import os
import random
import time

import requests

def transcribe_with_retry(audio_url, max_retries=3):
    # Endpoint and header match the client example in this review
    url = "https://api.assemblyai.com/v2/transcript"
    headers = {"authorization": os.environ["ASSEMBLYAI_API_KEY"]}
    data = {"audio_url": audio_url}

    for attempt in range(max_retries):
        try:
            response = requests.post(url, json=data, headers=headers)

            if response.status_code == 429:  # Rate limited
                # Exponential backoff with jitter
                wait_time = (2 ** attempt) + random.uniform(0, 1)
                time.sleep(wait_time)
                continue

            response.raise_for_status()
            return response.json()

        except requests.exceptions.RequestException as e:
            # Network errors and other non-2xx responses are retried as well
            if attempt == max_retries - 1:
                raise RuntimeError(f"Failed after {max_retries} attempts: {e}") from e
            time.sleep(2 ** attempt)

    # Every attempt was rate limited
    raise RuntimeError(f"Still rate limited after {max_retries} attempts")
```
---

🟡 Structure & Organization Issues

3. Improve Information Architecture

Problem: The current structure is confusing.

Fix: Reorganize as:

```markdown
# Migration Guide: Gladia to AssemblyAI

## Table of Contents
1. [Prerequisites](#prerequisites)
2. [Quick Migration Checklist](#quick-migration-checklist)
3. [Step-by-Step Migration](#step-by-step-migration)
4. [Feature Mapping Reference](#feature-mapping-reference)
5. [Testing Your Migration](#testing-your-migration)
6. [Troubleshooting](#troubleshooting)
7. [Next Steps](#next-steps)

## Quick Migration Checklist
- [ ] Set up AssemblyAI account and get API key
- [ ] Install dependencies
- [ ] Update authentication headers
- [ ] Change API endpoints
- [ ] Update response field names
- [ ] Test with sample audio
- [ ] Migrate advanced features

## Step-by-Step Migration
[Detailed steps here...]
```

4. Add Feature Mapping Reference Table

Problem: Users need to quickly reference equivalent features.

Fix: Add comprehensive mapping table:

## Feature Mapping Reference

| Feature | Gladia Parameter | AssemblyAI Parameter | Notes |
|---------|------------------|---------------------|--------|
| Speaker Diarization | `diarization: true` | `speaker_labels: true` | Response structure differs |
| Auto Chapters | `chapterization: true` | `auto_chapters: true` | Similar functionality |
| Entity Detection | `named_entity_recognition: true` | `entity_detection: true` | Different entity types available |
| Language Detection | `language_detection: true` | `language_detection: true` | ✅ Same parameter name |
| Custom Vocabulary | `custom_vocabulary: [...]` | `word_boost: [...]` | Different format required |
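
To make the parameter table more concrete, here is a minimal sketch of how it could be applied in code. The parameter names are copied from the table as given and have not been verified against either API here. The helper name `to_assemblyai_options` and the choice to pass unknown keys through unchanged are assumptions made for illustration only.

```python
from typing import Any, Dict

# Parameter names copied from the mapping table above (Gladia -> AssemblyAI).
GLADIA_TO_ASSEMBLYAI = {
    "diarization": "speaker_labels",
    "chapterization": "auto_chapters",
    "named_entity_recognition": "entity_detection",
    "language_detection": "language_detection",
    "custom_vocabulary": "word_boost",
}


def to_assemblyai_options(gladia_options: Dict[str, Any]) -> Dict[str, Any]:
    """Rename known Gladia option keys; unknown keys are kept unchanged (assumption).

    Values are copied as-is. The table notes that the custom vocabulary
    format differs, so that value may still need manual conversion.
    """
    return {GLADIA_TO_ASSEMBLYAI.get(key, key): value
            for key, value in gladia_options.items()}


options = to_assemblyai_options({"diarization": True, "custom_vocabulary": ["AssemblyAI"]})
print(options)  # {'speaker_labels': True, 'word_boost': ['AssemblyAI']}
```

A lookup table like this keeps the renaming in one place, which makes it easier to review against the documentation than scattered string edits.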

### Response Field Mapping

| Data | Gladia Path | AssemblyAI Path |
|------|-------------|-----------------|
| Full transcript | `result.transcription.full_transcript` | `text` |
| Utterances | `result.transcription.utterances` | `utterances` |
| Speaker labels | `utterance.speaker` | `utterance.speaker` |
| Chapters | `result.chapterization.results` | `chapters` |
| Entities | `result.named_entity_recognition.results` | `entities` |
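
The response mapping could be exercised with a small adapter, sketched below. It reads the Gladia paths listed in the table from a nested dict and returns them under the AssemblyAI top-level names, so downstream code only has to handle one shape during the migration. The helper names `get_path` and `normalize_gladia` are hypothetical, and the paths are taken from the table without verification.

```python
from typing import Any, Dict, Optional

# Dotted paths copied from the response mapping table above,
# keyed by the AssemblyAI top-level field name.
GLADIA_PATHS = {
    "text": "result.transcription.full_transcript",
    "utterances": "result.transcription.utterances",
    "chapters": "result.chapterization.results",
    "entities": "result.named_entity_recognition.results",
}


def get_path(payload: Dict[str, Any], dotted_path: str) -> Optional[Any]:
    """Walk a nested dict along a dotted path; return None if any key is missing."""
    current: Any = payload
    for key in dotted_path.split("."):
        if not isinstance(current, dict) or key not in current:
            return None
        current = current[key]
    return current


def normalize_gladia(payload: Dict[str, Any]) -> Dict[str, Any]:
    """Reshape a Gladia-style payload to use AssemblyAI's top-level field names."""
    return {name: get_path(payload, path) for name, path in GLADIA_PATHS.items()}


sample = {"result": {"transcription": {"full_transcript": "Hello world"}}}
print(normalize_gladia(sample)["text"])  # Hello world
```

Fields that are absent from the payload come back as `None` rather than raising, which is a design choice for this sketch and not something either API prescribes.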

🟡 Code Quality Issues

5. Improve Code Examples

Problem: Code lacks proper structure and real-world considerations.

Fix: Provide production-ready examples:

```python
import os
import time
from typing import Any, Dict, Optional

import requests


class AssemblyAIClient:
    def __init__(self, api_key: Optional[str] = None):
        self.api_key = api_key or os.getenv('ASSEMBLYAI_API_KEY')
        if not self.api_key:
            raise ValueError("API key required. Set ASSEMBLYAI_API_KEY environment variable.")

        self.base_url = "https://api.assemblyai.com/v2"
        self.headers = {"authorization": self.api_key}

    def upload_file(self, file_path: str) -> str:
        """Upload local audio file and return upload URL."""
        try:
            with open(file_path, "rb") as f:
                response = requests.post(
                    f"{self.base_url}/upload",
                    headers=self.headers,
                    data=f
                )
            response.raise_for_status()
            return response.json()["upload_url"]
        except FileNotFoundError:
            raise FileNotFoundError(f"Audio file not found: {file_path}")
        except requests.RequestException as e:
            # Chain the original exception so the root cause stays visible
            raise RuntimeError(f"Upload failed: {e}") from e

    def transcribe(self, audio_url: str, **options) -> Dict[str, Any]:
        """Submit transcription request with options."""
        data = {"audio_url": audio_url, **options}

        response = requests.post(
            f"{self.base_url}/transcript",
            json=data,
            headers=self.headers
        )
        response.raise_for_status()
        return response.json()

    def wait_for_completion(self, transcript_id: str, polling_interval: int = 3) -> Dict[str, Any]:
        """Poll until the transcript is completed or has failed."""
        url = f"{self.base_url}/transcript/{transcript_id}"

        while True:
            response = requests.get(url, headers=self.headers)
            response.raise_for_status()
            result = response.json()

            if result['status'] == 'completed':
                return result
            elif result['status'] == 'error':
                raise RuntimeError(f"Transcription failed: {result.get('error', 'Unknown error')}")

            time.sleep(polling_interval)


# Usage example
def main():
    client = AssemblyAIClient()

    # Upload and transcribe with features
    upload_url = client.upload_file("./my-audio.mp3")

    transcript_request = client.transcribe(
        upload_url,
        speaker_labels=True,
        auto_chapters=True,
        entity_detection=True
    )

    transcript = client.wait_for_completion(transcript_request['id'])

    # Process results
    print(transcript["text"])


if __name__ == "__main__":
    main()
```
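
The `# Process results` step is left open in the example above. As one possible shape for it, the sketch below turns a completed transcript dict into printable lines. The top-level keys `text`, `utterances`, `chapters`, and `entities` come from the Response Field Mapping table in this review. The per-item keys (`speaker`, `text`, `headline`, `entity_type`) and the function name `summarize_transcript` are assumptions and should be checked against the current API reference before use.

```python
from typing import Any, Dict, List


def summarize_transcript(transcript: Dict[str, Any]) -> List[str]:
    """Build human-readable lines from a completed transcript dict.

    Item-level keys are assumptions; missing keys render as None
    instead of raising, and missing or null sections are skipped.
    """
    lines = [f"Transcript: {transcript.get('text') or ''}"]

    for utterance in transcript.get("utterances") or []:
        lines.append(f"Speaker {utterance.get('speaker')}: {utterance.get('text')}")

    for chapter in transcript.get("chapters") or []:
        lines.append(f"Chapter: {chapter.get('headline')}")

    for entity in transcript.get("entities") or []:
        lines.append(f"Entity ({entity.get('entity_type')}): {entity.get('text')}")

    return lines


# Hand-written sample data, not real API output
example = {
    "text": "Hi there.",
    "utterances": [{"speaker": "A", "text": "Hi there."}],
    "chapters": None,
    "entities": [{"entity_type": "greeting", "text": "Hi"}],
}
for line in summarize_transcript(example):
    print(line)
```

Returning a list of strings instead of printing directly keeps the function easy to test and lets the caller decide where the output goes.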


---