
Feedback: speech-to-text-pre-recorded-audio-supported-languages

Original URL: https://www.assemblyai.com/docs/speech-to-text/pre-recorded-audio/supported-languages
Category: speech-to-text
Generated: 05/08/2025, 4:24:08 pm



Technical Documentation Analysis: Supported Languages


The documentation covers the essential information but has several areas for improvement in clarity, completeness, and user experience. Here’s my detailed analysis:

  • No actual language codes displayed: The embedded Airtable iframes are not accessible to all users and don’t show the actual language_code values mentioned in the introduction
  • Missing feature availability matrix: The intro mentions “features available for that language” but this information is not visible
  • No fallback content: Users with JavaScript disabled or accessibility needs cannot access the embedded tables
  • Performance expectations: No guidance on expected processing times for different languages
  • Audio quality requirements: No mention of audio quality standards needed for optimal results per language
## Language Codes Quick Reference
| Language | Code | Slam-1 | Universal |
|----------|------|---------|-----------|
| English (US) | `en_us` | ✓ | ✓ |
| Spanish | `es` | ✓ | ✓ |
| French | `fr` | ✓ | ✓ |
[Continue with all supported languages...]
  • WER terminology: “Word Error Rate (WER)” is used without definition
  • Model selection guidance: Insufficient explanation of when to choose Slam-1 vs Universal
  • Accuracy categories: The accordion groupings use technical WER ranges without explaining practical implications
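### Understanding Accuracy Levels
- **Word Error Rate (WER)**: Share of words transcribed incorrectly, counted as substituted, deleted, and inserted words divided by the number of words in the reference transcript (lower is better)
- **High accuracy (≤ 10% WER)**: At most about 1 error per 10 words; excellent for production use and suitable for automated workflows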
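To make the WER definition concrete, the following is a minimal sketch of how WER is commonly computed: word-level edit distance divided by the number of reference words. It assumes whitespace tokenization and lowercasing, which real evaluations usually refine, and `word_error_rate` is a hypothetical helper name, not part of any AssemblyAI SDK.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Return WER = (substitutions + deletions + insertions) / reference word count."""
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference transcript must contain at least one word")

    # prev[j] is the edit distance between the reference words seen so far
    # (before the current one) and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]  # i deletions turn i reference words into an empty hypothesis
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(
                prev[j] + 1,         # deletion: reference word missing from hypothesis
                curr[j - 1] + 1,     # insertion: extra word in hypothesis
                prev[j - 1] + cost,  # match or substitution
            ))
        prev = curr
    return prev[-1] / len(ref)
```

Because insertions count as errors, the result can exceed 1.0 (100%) when the hypothesis contains many extra words, so WER is a ratio rather than a strict percentage of the reference.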
  • No code examples showing language specification
  • No real-world use cases
  • Missing error handling examples
import assemblyai as aai

aai.settings.api_key = "YOUR_API_KEY"
transcriber = aai.Transcriber()

# Specify Spanish audio
config = aai.TranscriptionConfig(language_code="es")
transcript = transcriber.transcribe("spanish_audio.mp3", config=config)

# When language is unknown
config = aai.TranscriptionConfig(language_detection=True)
transcript = transcriber.transcribe("multilingual_audio.mp3", config=config)
print(f"Detected language: {transcript.language_code}")

# Handle an invalid language code
# (SDK names here are as given in this review; confirm them against the installed SDK version)
try:
    config = aai.TranscriptionConfig(language_code="invalid_code")
    transcript = transcriber.transcribe("audio.mp3", config=config)
except aai.TranscriptionError as e:
    print(f"Error: {e}")
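Rather than relying only on the API to reject an unsupported code, a client can check the code locally before submitting audio. The sketch below is illustrative and rests on assumptions: `KNOWN_LANGUAGE_CODES` holds only the three codes shown in this review's quick-reference table, and `normalize_language_code` is a hypothetical helper, not an SDK function. A real list would have to be taken from the official supported-languages page.

```python
# Partial, illustrative set: only the codes shown in this review's quick-reference table.
KNOWN_LANGUAGE_CODES = {"en_us", "es", "fr"}


def normalize_language_code(raw: str) -> str:
    """Return a cleaned-up language code, or raise ValueError if it is not in the known set."""
    # Accept common variations such as surrounding spaces, upper case, and "en-US".
    code = raw.strip().lower().replace("-", "_")
    if code not in KNOWN_LANGUAGE_CODES:
        known = ", ".join(sorted(KNOWN_LANGUAGE_CODES))
        raise ValueError(f"Unsupported language code {raw!r}; known codes: {known}")
    return code
```

Calling `normalize_language_code("en-US")` returns `en_us`, while `normalize_language_code("invalid_code")` raises `ValueError` before any request is sent.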
  • Information is fragmented across embedded tables
  • No logical flow from overview to implementation
  • Missing decision-making framework
# Supported Languages
## Quick Start
[Language selection decision tree]
## Language Support Overview
[Consolidated table with all information]
## Model-Specific Support
### Slam-1 Languages
### Universal Languages
## Implementation Guide
### Setting Language Codes
### Automatic Detection
### Error Handling
## Performance Expectations
[Accuracy and processing time by language]
## Troubleshooting
[Common issues and solutions]
  1. Accessibility: Embedded iframes can exclude users who rely on screen readers or other assistive technology
  2. Mobile experience: Iframes may not render well on mobile devices
  3. Copy-paste friction: Can’t easily copy language codes from embedded tables
  4. Decision paralysis: Unclear guidance on model selection
  5. No offline reference: Can’t access language codes without internet
## Choose Your Model
| Your Need | Recommended Model | Why |
|-----------|------------------|-----|
| English-only, high accuracy | Slam-1 | Optimized for English |
| Multiple languages | Universal | Broader language support |
| Unknown language | Universal + Auto-detect | Built-in language detection |
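The table above can also be expressed as a small decision helper. This is a sketch of the three rows only, not an official selection rule: `recommend_model` is a hypothetical function, the returned strings are the labels from the table, and treating any code that starts with `en` as English is an assumption.

```python
from typing import Optional, Sequence


def recommend_model(languages: Optional[Sequence[str]]) -> str:
    """Map the expected audio languages to a recommendation from the table above.

    Pass None or an empty sequence when the spoken language is unknown.
    """
    if not languages:
        return "Universal + Auto-detect"
    unique = {lang.strip().lower() for lang in languages}
    # Assumption: every English variant uses a code beginning with "en".
    if all(lang.startswith("en") for lang in unique):
        return "Slam-1"
    return "Universal"
```

For example, `recommend_model(["en_us"])` yields `Slam-1`, `recommend_model(["en_us", "es"])` yields `Universal`, and `recommend_model(None)` yields `Universal + Auto-detect`.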

Replace iframes with proper HTML tables that work with screen readers and allow text selection.
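One way to produce such tables is to render the language data as static HTML at build time. The following is a minimal sketch under stated assumptions: `render_language_table` is a hypothetical helper, and the sample rows are the three languages from this review's quick-reference table rather than the full supported list.

```python
from html import escape


def render_language_table(rows):
    """Render (language, code) pairs as a static HTML table.

    <caption> and <th scope="col"> give screen readers the table structure,
    and plain <code> text lets readers select and copy language codes.
    """
    lines = [
        "<table>",
        "  <caption>Supported languages</caption>",
        "  <thead>",
        '    <tr><th scope="col">Language</th><th scope="col">Code</th></tr>',
        "  </thead>",
        "  <tbody>",
    ]
    for language, code in rows:
        # Escape cell text so characters such as "&" or "<" cannot break the markup.
        lines.append(
            f"    <tr><td>{escape(language)}</td><td><code>{escape(code)}</code></td></tr>"
        )
    lines += ["  </tbody>", "</table>"]
    return "\n".join(lines)


html_table = render_language_table(
    [("English (US)", "en_us"), ("Spanish", "es"), ("French", "fr")]
)
```

Because the output is plain markup, it needs no JavaScript to display and works when the page is saved for offline use.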

## Common Issues
### Language not detected correctly
- Ensure audio quality is sufficient
- Check if language is in supported list
- Consider manual language specification
### Poor transcription quality
- Verify language code matches audio
- Check accuracy tier for your language
- Consider audio preprocessing
  1. Add language-specific tips: Special considerations for tonal languages, RTL languages, etc.
  2. Include sample accuracy: Show before/after examples for different WER levels
  3. Performance benchmarks: Expected processing times by language and model
  4. Feature compatibility matrix: Which features work with which languages
  1. Search functionality: Allow users to quickly find their language
  2. Language code validator: Interactive tool to verify codes
  3. Regional variants: Clear explanation of dialect support
  4. Migration guide: Help users transition between models
  1. API response examples: Show what successful/failed responses look like
  2. Rate limiting info: Language-specific processing limits
  3. Batch processing: Handling multiple languages in one request
  4. Webhook considerations: Language-specific callback handling

This documentation would benefit significantly from making the embedded content accessible, providing comprehensive examples, and creating a more user-centric structure that guides users from decision-making through implementation.