Feedback: guides-aws_to_aai

Original URL: https://www.assemblyai.com/docs/guides/aws_to_aai
Category: guides
Generated: 05/08/2025, 4:43:15 pm


Technical Documentation Analysis: AWS to AssemblyAI Migration Guide

This migration guide provides a good foundation but has several critical gaps that could frustrate users during migration. The content needs better organization, more comprehensive coverage, and clearer explanations.

  • AWS code has syntax errors: the body of the while loop is not indented, and the entry-point guard is written as if name == "main" instead of if __name__ == "__main__"
  • Installation section misleading: Shows imports but not actual installation commands
  • Missing error handling: No guidance on handling common migration pitfalls
  • No mention of AWS credentials setup for comparison
  • Missing Python version requirements
  • No guidance on SDK installation commands
# Suggested New Structure:
1. Prerequisites & Setup
2. Key Differences Overview
3. Step-by-Step Migration Process
4. Feature Mapping Table
5. Common Migration Issues
6. Performance & Cost Comparison
7. Testing & Validation

1. Add Installation Section:

# AWS
pip install boto3
# AssemblyAI
pip install assemblyai

2. Add Prerequisites Checklist:

  • Python 3.7+ installed
  • AssemblyAI API key obtained
  • AWS credentials configured (for S3 access)
  • Test audio files prepared
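
The checklist above could be backed by a small preflight script. The following is a minimal sketch and not part of the original guide: `check_prerequisites` is a hypothetical helper, the minimum version simply mirrors the checklist, and keeping the key in an `ASSEMBLYAI_API_KEY` environment variable is an assumption. AWS credentials are deliberately not checked here, because boto3 can resolve them from several sources.

```python
import importlib.util
import os
import sys
from pathlib import Path

MIN_PYTHON = (3, 7)  # mirrors the checklist above


def check_prerequisites(env=None, test_audio=None):
    """Return a list of human-readable problems; an empty list means ready."""
    env = os.environ if env is None else env
    problems = []

    if sys.version_info < MIN_PYTHON:
        problems.append("Python %d.%d+ is required" % MIN_PYTHON)

    # Assumption: the API key is stored in this environment variable.
    if not env.get("ASSEMBLYAI_API_KEY"):
        problems.append("ASSEMBLYAI_API_KEY is not set")

    # find_spec returns None when a top-level package is not installed.
    for package in ("assemblyai", "boto3"):
        if importlib.util.find_spec(package) is None:
            problems.append(f"package '{package}' is not installed")

    # Only checked when the caller names a test file.
    if test_audio is not None and not Path(test_audio).is_file():
        problems.append(f"test audio file not found: {test_audio}")

    return problems
```

Calling `check_prerequisites(test_audio="path/to/short-test.mp3")` before starting the migration and printing each returned problem gives a quick readiness report.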

3. Include Feature Mapping Table:

| AWS Transcribe Feature | AssemblyAI Equivalent | Notes |
| --- | --- | --- |
| Speaker Labels | speaker_labels=True | Similar functionality |
| Custom Vocabulary | word_boost | Different implementation |
| Job Names | Not required | Jobs auto-managed |
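
To make the mapping concrete, the guide could show how an existing AWS `Settings` dictionary translates into AssemblyAI options. This is a rough sketch covering only the rows in the table: `aws_settings_to_aai_kwargs` and its `vocabulary_phrases` parameter are hypothetical names, and the result is meant to be passed to `aai.TranscriptionConfig(**kwargs)`.

```python
def aws_settings_to_aai_kwargs(aws_settings, vocabulary_phrases=None):
    """Translate an AWS Transcribe `Settings` dict into keyword arguments
    intended for aai.TranscriptionConfig(**kwargs).

    Covers only the features listed in the mapping table.
    """
    kwargs = {}

    # Speaker Labels -> speaker_labels=True
    if aws_settings.get("ShowSpeakerLabels"):
        kwargs["speaker_labels"] = True

    # AWS refers to a stored vocabulary by name, while word_boost takes
    # the phrases themselves, so the caller has to supply them.
    if aws_settings.get("VocabularyName") and vocabulary_phrases:
        kwargs["word_boost"] = list(vocabulary_phrases)

    # Job names have no equivalent: nothing to carry over.
    return kwargs
```

Because the function returns a plain dictionary, the translation can be unit-tested without network access or an API key.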

1. Fix AWS Code:

# Corrected AWS example with proper indentation and syntax
import time
import boto3


def transcribe_file(job_name, file_uri, transcribe_client):
    transcribe_client.start_transcription_job(
        TranscriptionJobName=job_name,
        Media={"MediaFileUri": file_uri},
        MediaFormat="wav",
        LanguageCode="en-US",
    )

    max_tries = 60
    while max_tries > 0:
        max_tries -= 1
        job = transcribe_client.get_transcription_job(
            TranscriptionJobName=job_name
        )
        job_status = job["TranscriptionJob"]["TranscriptionJobStatus"]
        if job_status in ["COMPLETED", "FAILED"]:
            print(f"Job {job_name} is {job_status}.")
            if job_status == "COMPLETED":
                print(f"Download the transcript from\n"
                      f"\t{job['TranscriptionJob']['Transcript']['TranscriptFileUri']}")
            break
        else:
            print(f"Waiting for {job_name}. Current status is {job_status}.")
        time.sleep(10)


if __name__ == "__main__":  # Fixed syntax
    transcribe_client = boto3.client("transcribe")
    file_uri = "s3://test-transcribe/answer2.wav"
    transcribe_file("Example-job", file_uri, transcribe_client)

2. Add Error Handling Examples:

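# AssemblyAI with proper error handling
import sys

import assemblyai as aai

aai.settings.api_key = "YOUR-API-KEY"
audio_file = "./audio.mp3"  # placeholder: local path or publicly accessible URL

try:
    transcriber = aai.Transcriber()
    transcript = transcriber.transcribe(audio_file)
except Exception as e:
    # Catches exceptions raised by the SDK or the network layer
    print(f"Error during transcription: {e}")
    sys.exit(1)

# A finished request can still report a failed transcription
if transcript.status == aai.TranscriptStatus.error:
    print(f"Transcription failed: {transcript.error}")
    sys.exit(1)

print(transcript.text)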

1. Authentication Comparison:

# AWS - Multiple authentication methods
import boto3

# Method 1: Default credential chain (environment variables, shared credentials file, or IAM role)
client = boto3.client('transcribe')

# Method 2: Explicit credentials
client = boto3.client(
    'transcribe',
    aws_access_key_id='YOUR_ACCESS_KEY',
    aws_secret_access_key='YOUR_SECRET_KEY'
)

# AssemblyAI - Simple API key
import assemblyai as aai

aai.settings.api_key = "YOUR-API-KEY"  # or set the ASSEMBLYAI_API_KEY env var

2. Add Migration Checklist:

## Migration Checklist
- [ ] Install AssemblyAI SDK
- [ ] Obtain and configure API key
- [ ] Test with sample audio file
- [ ] Update file input handling (S3 → pre-signed URLs)
- [ ] Remove job management code
- [ ] Update feature configuration syntax
- [ ] Test error handling
- [ ] Validate output format changes
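
For the "Validate output format changes" item, it can help to pull the plain text out of a saved AWS Transcribe result file and compare it with `transcript.text` from AssemblyAI. The sketch below assumes the standard AWS result layout (`results.transcripts[].transcript`); the helper names `aws_transcript_text` and `normalize` are hypothetical.

```python
def aws_transcript_text(document):
    """Join the transcript segments of a parsed AWS Transcribe result file."""
    segments = document.get("results", {}).get("transcripts", [])
    return " ".join(segment["transcript"] for segment in segments)


def normalize(text):
    """Lowercase, drop punctuation, and collapse whitespace so that
    output from two different engines can be compared loosely."""
    kept = "".join(ch for ch in text.lower() if ch.isalnum() or ch.isspace())
    return " ".join(kept.split())


# Example with an inline document shaped like an AWS result file
aws_doc = {"results": {"transcripts": [{"transcript": "Hello, world."}]}}
aai_text = "Hello world"  # stand-in for transcript.text

print(normalize(aws_transcript_text(aws_doc)) == normalize(aai_text))
```

Exact equality is a strict check; in practice the two services will differ in punctuation and occasional words, so treat a mismatch as a prompt for manual review rather than a failure.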

3. Common Issues Section:

## Common Migration Issues
### Issue: S3 Access
**Problem:** AWS Transcribe directly accesses private S3 files
**Solution:** Generate pre-signed URLs or make files public
### Issue: Job Management
**Problem:** AWS requires manual job status polling
**Solution:** AssemblyAI SDK handles this automatically
### Issue: Output Format
**Problem:** AWS returns URL to JSON file
**Solution:** AssemblyAI returns structured object directly
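
For the S3 access issue, the guide could show how an existing `s3://` URI becomes a pre-signed URL. This is a minimal sketch: `split_s3_uri` and `presign_s3_uri` are hypothetical helpers, the one-hour expiry is an arbitrary choice, and `s3_client` is expected to be a boto3 S3 client (its `generate_presigned_url` method does the actual signing).

```python
def split_s3_uri(s3_uri):
    """Split 's3://bucket/key' into (bucket, key)."""
    prefix = "s3://"
    if not s3_uri.startswith(prefix):
        raise ValueError(f"not an S3 URI: {s3_uri!r}")
    bucket, _, key = s3_uri[len(prefix):].partition("/")
    if not bucket or not key:
        raise ValueError(f"S3 URI needs both bucket and key: {s3_uri!r}")
    return bucket, key


def presign_s3_uri(s3_client, s3_uri, expires_in=3600):
    """Return a time-limited HTTPS URL for a private S3 object.

    s3_client is assumed to be boto3.client("s3").
    """
    bucket, key = split_s3_uri(s3_uri)
    return s3_client.generate_presigned_url(
        "get_object",
        Params={"Bucket": bucket, "Key": key},
        ExpiresIn=expires_in,
    )
```

With boto3 installed, `presign_s3_uri(boto3.client("s3"), "s3://test-transcribe/answer2.wav")` yields a URL that can be passed to `transcriber.transcribe(...)` in place of a local path. The expiry should be long enough for the transcription service to download the file.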

Start with the simplest case, then add complexity:

import assemblyai as aai

aai.settings.api_key = "YOUR-API-KEY"
transcriber = aai.Transcriber()

# Step 1: Basic transcription
transcript = transcriber.transcribe("audio.mp3")

# Step 2: Add one feature
config = aai.TranscriptionConfig(speaker_labels=True)
transcript = transcriber.transcribe("audio.mp3", config)

# Step 3: Multiple features
config = aai.TranscriptionConfig(
    speaker_labels=True,
    auto_chapters=True,
    sentiment_analysis=True
)
transcript = transcriber.transcribe("audio.mp3", config)

Set expectations with time estimates, for example:

  • “Migration typically takes 30-60 minutes”
  • “Simple projects: 15 minutes”
  • “Complex projects with multiple features: 2+ hours”

# Test your migration
import assemblyai as aai

aai.settings.api_key = "YOUR-API-KEY"
transcriber = aai.Transcriber()

def test_migration():
    # Test with short audio file first
    test_file = "path/to/short-test.mp3"
    transcript = transcriber.transcribe(test_file)
    assert transcript.status == aai.TranscriptStatus.completed, transcript.error
    print("✅ Migration successful!")

  1. Add troubleshooting section with common error codes
  2. Include performance comparison (speed, accuracy)
  3. Add cost comparison calculator or examples
  4. Provide rollback instructions if migration fails
  5. Link to community forum or support channels
  6. Add next steps after successful migration
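
For the accuracy part of the performance comparison, one common measure is word error rate (WER) against a human-verified reference transcript. The following is a minimal sketch using a standard word-level edit distance; `word_error_rate` is a hypothetical helper, and the simple lowercase-and-split tokenization is an assumption that a real comparison would likely refine.

```python
def word_error_rate(reference, hypothesis):
    """Word-level edit distance divided by the reference length."""
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference transcript is empty")

    # previous[j] holds the edit distance between the reference words
    # processed so far and the first j hypothesis words.
    previous = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        current = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            current.append(min(
                previous[j] + 1,          # reference word missing (deletion)
                current[j - 1] + 1,       # extra hypothesis word (insertion)
                previous[j - 1] + cost,   # match or substitution
            ))
        previous = current
    return previous[-1] / len(ref)
```

Running the same reference against both the AWS output and the AssemblyAI output on identical audio gives two comparable numbers; lower is better. Results on a handful of files are indicative only and should not be presented as a benchmark.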

This documentation has good bones but needs these improvements to truly serve users migrating from AWS Transcribe to AssemblyAI effectively.