Feedback: guides-automatic-language-detection-route-nano-model

Original URL: https://www.assemblyai.com/docs/guides/automatic-language-detection-route-nano-model
Category: guides
Generated: 05/08/2025, 4:43:20 pm

Technical Documentation Analysis & Recommendations
This documentation provides a useful workflow but has several clarity, completeness, and user experience issues that need addressing.

Problem: Key details are absent or unclear.

Specific Issues:

  • No explanation of what “language confidence” actually means
  • No list of the languages each model supports (17 for Universal vs 99 for Nano)
  • No error handling for non-language-related failures
  • Billing/cost implications not clearly explained
  • No mention of performance differences between models

Recommendations:

## Language Model Comparison
| Feature | Universal Model | Nano Model |
|---------|----------------|------------|
| Languages Supported | 17 | 99 |
| Cost | Higher | Lower |
| Accuracy | Higher for supported languages | Good for all languages |
| Processing Speed | Faster | Standard |
### Supported Languages
- **Universal**: [link to complete list]
- **Nano**: [link to complete list]

Problem: Technical concepts lack proper context.

Issues:

  • “language_confidence_threshold” concept introduced without explanation
  • Workflow logic is confusing (why does it “error out”?)
  • The relationship between confidence and model selection is unclear

Improved Explanation:

## How Language Confidence Works
Language confidence represents how certain the automatic language detection is about the identified language (0.0 = no confidence, 1.0 = completely certain).
When you set a `language_confidence_threshold`:
- **Above threshold**: Transcription proceeds with Universal model
- **Below threshold**: Request fails with error containing detected language
- **Your code**: Catches error and retries with Nano model using detected language
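The routing decision described above can be sketched as a small pure function (the name and signature are illustrative, not part of the SDK):

```python
def route_model(language_confidence: float, threshold: float) -> str:
    """Decide which speech model handles the request, mirroring the
    threshold behavior described above (illustrative helper, not SDK API)."""
    if language_confidence >= threshold:
        return "universal"  # confidence meets the bar: proceed with Universal
    return "nano"  # below the bar: the caller retries with Nano
```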

Problem: Current example lacks real-world context and error handling.

Improved Complete Example:

```python
import assemblyai as aai
import re
import logging

# Setup logging for better debugging
logging.basicConfig(level=logging.INFO)
logger = logging.getLogger(__name__)


def transcribe_with_fallback(audio_url, confidence_threshold=0.8):
    """
    Transcribe audio with automatic fallback to Nano model if
    language confidence is too low for Universal model.

    Args:
        audio_url (str): URL or local path to audio file
        confidence_threshold (float): Minimum confidence (0.0-1.0)

    Returns:
        dict: Transcription result with metadata
    """
    aai.settings.api_key = "YOUR_API_KEY"
    transcriber = aai.Transcriber()

    # First attempt: Universal model with language detection
    universal_config = aai.TranscriptionConfig(
        speech_model=aai.SpeechModel.universal,
        language_detection=True,
        language_confidence_threshold=confidence_threshold,
    )

    try:
        logger.info("Attempting transcription with Universal model...")
        transcript = transcriber.transcribe(audio_url, universal_config)
        if transcript.error:
            raise Exception(transcript.error)
        return {
            'text': transcript.text,
            'model_used': 'universal',
            'language': transcript.language_code,
            'confidence': getattr(transcript, 'language_confidence', None),
        }
    except Exception as e:
        error_msg = str(e)
        # Check if this is a language confidence error
        if "below the requested confidence threshold" in error_msg:
            logger.info("Language confidence too low, falling back to Nano model...")
            # Extract detected language from error message
            match = re.search(r"detected language '(\w+)'", error_msg)
            if not match:
                raise Exception(f"Could not parse language from error: {error_msg}")
            detected_language = match.group(1)
            logger.info(f"Detected language: {detected_language}")
            # Retry with Nano model
            nano_config = aai.TranscriptionConfig(
                speech_model=aai.SpeechModel.nano,
                language_code=detected_language,
            )
            transcript = transcriber.transcribe(audio_url, nano_config)
            if transcript.error:
                raise Exception(f"Nano model transcription failed: {transcript.error}")
            return {
                'text': transcript.text,
                'model_used': 'nano',
                'language': detected_language,
                'fallback_reason': 'low_language_confidence',
            }
        else:
            # Re-raise non-language-confidence errors
            raise Exception(f"Transcription failed: {error_msg}")


# Usage examples
if __name__ == "__main__":
    # Example 1: English audio (likely to use Universal)
    try:
        result = transcribe_with_fallback("https://example.com/english-audio.mp3")
        print(f"Transcription: {result['text']}")
        print(f"Model used: {result['model_used']}")
    except Exception as e:
        print(f"Error: {e}")

    # Example 2: Less common language (likely to fall back to Nano)
    try:
        result = transcribe_with_fallback("https://example.com/swedish-audio.mp3")
        print(f"Transcription: {result['text']}")
        print(f"Model used: {result['model_used']}")
    except Exception as e:
        print(f"Error: {e}")
```

Current structure issues:

  • Information scattered without logical flow
  • No clear sections for different user needs
  • Missing troubleshooting section

Recommended Structure:

# Route to Nano Speech Model for Low Language Confidence
## Overview
Brief explanation of the workflow and when to use it
## Prerequisites
- API key setup
- Supported audio formats
- SDK installation
## Understanding the Models
- Universal vs Nano comparison table
- Language support details
- Cost implications
## Implementation Guide
### Basic Setup
### Error Handling Strategy
### Complete Working Example
## Troubleshooting
### Common Issues
### Error Message Reference
### Performance Optimization
## Best Practices
## Related Guides

Major Pain Points Identified:

  1. Error message parsing is fragile - relies on string matching

    Solution: Request a structured error response or provide an SDK method:

    ```python
    # Suggest this API improvement (hypothetical structured error fields)
    if transcript.error and transcript.error.type == 'LANGUAGE_CONFIDENCE_LOW':
        detected_language = transcript.error.detected_language
        confidence = transcript.error.confidence
    ```
  2. No guidance on choosing confidence threshold

    Solution: Add a decision matrix:

    ## Choosing Your Confidence Threshold
    | Threshold | Use Case | Trade-offs |
    |-----------|----------|------------|
    | 0.9-1.0 | High accuracy critical | More fallbacks to Nano |
    | 0.7-0.9 | Balanced approach | Recommended for most use cases |
    | 0.5-0.7 | Minimize fallbacks | May get lower accuracy |
  3. No cost estimation

    Solution: Add cost calculator or examples:

    ## Cost Implications
    - Universal model failure: No charge
    - Nano fallback: Lower rate than Universal
    - Estimated savings: 30-50% when fallback is used
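The decision matrix above can be folded into a small helper that maps a use case to a starting threshold (the function name and preset values are illustrative, drawn from the table's ranges):

```python
def recommended_threshold(use_case: str) -> float:
    """Return a starting language_confidence_threshold for a given use case,
    following the decision matrix above (illustrative presets, tune per workload)."""
    presets = {
        "high_accuracy": 0.9,       # accuracy critical; expect more Nano fallbacks
        "balanced": 0.8,            # recommended for most use cases
        "minimize_fallbacks": 0.6,  # fewer fallbacks; may get lower accuracy
    }
    try:
        return presets[use_case]
    except KeyError:
        raise ValueError(f"Unknown use case: {use_case!r}")
```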
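To make the cost comparison concrete, a rough estimator can be sketched; the per-hour rates below are placeholder values for illustration only, not published pricing:

```python
def estimate_cost(duration_hours, model, rates=None):
    """Estimate transcription cost in dollars for a given model.
    Default rates are placeholders, not real AssemblyAI pricing."""
    rates = rates or {"universal": 0.40, "nano": 0.20}
    return round(duration_hours * rates[model], 4)
```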
## Try It Yourself
Use our [interactive demo](link) to test different confidence thresholds with sample audio files.
```python
# Add metrics tracking
from datetime import datetime

def track_model_usage(result):
    """Track which model was used for analytics"""
    metrics = {
        'model_used': result['model_used'],
        'language': result['language'],
        'timestamp': datetime.now().isoformat(),
    }
    # Log metrics to your analytics system here
    return metrics
```
## See Also