Feedback: speech-to-text-pre-recorded-audio-custom-spelling

Documentation Feedback

Original URL: https://www.assemblyai.com/docs/speech-to-text/pre-recorded-audio/custom-spelling
Category: speech-to-text
Generated: 05/08/2025, 4:26:00 pm

Claude Sonnet 4 Feedback

Generated: 05/08/2025, 4:25:59 pm

Analysis & Recommendations for Custom Spelling Documentation

1. Critical Issues - High Priority

A. Inconsistent API Structure Documentation

Problem: The Python SDK uses a different custom spelling structure than every other language example, and the page does not explain the difference, which invites confusion.

Fix: Add a clear explanation of the structural differences:

## API Structure Differences

**Python SDK**: Uses dictionary format where:
- Key = desired output spelling
- Value = array of input words to replace

**All other APIs**: Use object/array format where:
- `from` = array of input words to replace
- `to` = desired output spelling
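
Because the two shapes carry the same information, converting between them is mechanical. The sketch below shows one possible conversion in each direction; the function names `sdk_to_rest` and `rest_to_sdk` are hypothetical helpers for illustration and are not part of any AssemblyAI SDK.

```python
from typing import Dict, List


def sdk_to_rest(spelling: Dict[str, List[str]]) -> List[dict]:
    """Turn {"output": ["input", ...]} into [{"from": [...], "to": "output"}, ...]."""
    return [{"from": list(inputs), "to": output} for output, inputs in spelling.items()]


def rest_to_sdk(rules: List[dict]) -> Dict[str, List[str]]:
    """Turn [{"from": [...], "to": "output"}, ...] into {"output": ["input", ...]}."""
    spelling: Dict[str, List[str]] = {}
    for rule in rules:
        # Rules that share the same output spelling are merged under one key.
        spelling.setdefault(rule["to"], []).extend(rule["from"])
    return spelling


# Hypothetical helper names; the data mirrors the examples in this section.
python_style = {"SQL": ["Sequel", "sequel"], "DeCarlo": ["decarlo", "Decarlo"]}
rest_style = sdk_to_rest(python_style)
print(rest_style)
```

A round trip through both helpers returns the original dictionary, which is a quick way to confirm that neither format loses information.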

### Python SDK Format
```python
{
  "SQL": ["Sequel", "sequel"],  # Output: Input variations
  "DeCarlo": ["decarlo", "Decarlo"]
}
```

### REST API Format
```json
[
  {
    "from": ["Sequel", "sequel"],
    "to": "SQL"
  }
]
```

B. Contradictory Examples

Problem: Examples show conflicting mappings (SQL→Sequel vs Sequel→SQL) without explanation.

Fix: Use consistent, logical examples throughout:

## Consistent Example Set
- Company names: "decarlo" → "DeCarlo"
- Technical terms: "sequel" → "SQL"
- Brand names: "goo gle" → "Google"

2. Missing Critical Information

A. Add Limitations & Constraints Section

## Limitations & Constraints

- **Word limits**: Maximum X words per `from` array
- **Mapping limits**: Maximum X custom spelling rules per request
- **Character limits**: `to` value limited to X characters
- **Language considerations**: How custom spelling interacts with different languages
- **Processing priority**: How custom spelling interacts with other features

B. Add Use Cases & Best Practices

## Common Use Cases

### Technical Terms
- Database terminology: "sequel" → "SQL", "my sequel" → "MySQL"
- Programming languages: "java script" → "JavaScript"

### Proper Nouns
- Company names: "micro soft" → "Microsoft"
- Personal names: "o'connor" → "O'Connor"
- Geographic locations: "wooster" → "Worcester"

### Acronyms & Abbreviations
- "C E O" → "CEO"
- "A P I" → "API"

## Best Practices
- Test with sample audio before processing large batches
- Use consistent capitalization in `to` values
- Include common variations in `from` arrays

A. Add Quick Reference Section

## Quick Reference

| Language | Structure | Key→Value Mapping |
|----------|-----------|-------------------|
| Python SDK | Dictionary | `"output": ["input1", "input2"]` |
| REST APIs | Object Array | `{"from": ["input1"], "to": "output"}` |

## Parameter Summary
- **Case sensitivity**: `to` values are case-sensitive, `from` values are not
- **Word count**: `to` must be single word, `from` can be multiple words
- **Array format**: `from` always expects an array, even for single words

B. Improve Code Examples

Current issues:

- Inconsistent highlighting
- Missing error handling in some examples
- No output examples

Improvements:

### Before/After Output Examples

**Input audio**: "The sequel database and decarlo both work well"

**Without custom spelling**:

"The sequel database and decarlo both work well"

**With custom spelling**:
```python
config.set_custom_spelling({
    "SQL": ["sequel"],
    "DeCarlo": ["decarlo"]
})
```

**Output**:

"The SQL database and DeCarlo both work well"
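
To let readers preview a rule set without submitting audio, the docs could include a local approximation like the sketch below. It is only a text-level simulation using case-insensitive whole-phrase replacement; the function name `preview_custom_spelling` is hypothetical, and the real service applies rules to transcribed words, so results may differ.

```python
import re
from typing import Dict, List


def preview_custom_spelling(text: str, spelling: Dict[str, List[str]]) -> str:
    """Approximate custom spelling locally (assumption: not the service's actual logic)."""
    # Flatten {"output": ["input", ...]} into (input, output) pairs.
    pairs = [(phrase, output) for output, inputs in spelling.items() for phrase in inputs]
    # Replace longer phrases first so "my sequel" is handled before "sequel".
    for phrase, output in sorted(pairs, key=lambda p: len(p[0]), reverse=True):
        pattern = r"\b" + re.escape(phrase) + r"\b"
        # A function replacement keeps the output text literal.
        text = re.sub(pattern, lambda _m, out=output: out, text, flags=re.IGNORECASE)
    return text


sample = "The sequel database and decarlo both work well"
rules = {"SQL": ["sequel"], "DeCarlo": ["decarlo"]}
print(preview_custom_spelling(sample, rules))
```

Because replacements run one after another, a later rule can in principle match text produced by an earlier one; the sketch does not guard against that.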

4. User Experience Pain Points

A. Add Troubleshooting Section

## Troubleshooting

### Common Issues

**Custom spelling not applied**
- ✅ Verify exact spelling matches (case-insensitive for `from`)
- ✅ Check that `to` value contains only one word
- ✅ Ensure `from` is an array format

**Unexpected results**
- ✅ Test with shorter audio clips first
- ✅ Verify JSON structure matches your language's requirements
- ✅ Check for conflicting custom spelling rules

### Debugging Tips
1. Start with simple, single-word replacements
2. Use the transcript confidence scores to verify audio quality
3. Test common variations of your target words
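
The "conflicting custom spelling rules" check above can be automated before a request is sent. The following is a minimal sketch, assuming rules in the REST `from`/`to` shape; `find_conflicts` is a hypothetical helper, and it compares `from` phrases in lowercase because `from` values are described as case-insensitive.

```python
from collections import defaultdict
from typing import Dict, List, Set


def find_conflicts(rules: List[dict]) -> Dict[str, Set[str]]:
    """Return each lowercased `from` phrase that maps to more than one `to` value."""
    targets: Dict[str, Set[str]] = defaultdict(set)
    for rule in rules:
        for phrase in rule["from"]:
            # Lowercase so "Sequel" and "sequel" count as the same input.
            targets[phrase.lower()].add(rule["to"])
    return {phrase: outs for phrase, outs in targets.items() if len(outs) > 1}


# Illustrative rule set with one deliberate conflict on "sequel".
rules = [
    {"from": ["sequel"], "to": "SQL"},
    {"from": ["Sequel", "my sequel"], "to": "MySQL"},
]
conflicts = find_conflicts(rules)
print(conflicts)
```

An empty result means no input phrase is claimed by two different output spellings.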

B. Add Validation Guidelines

## Input Validation

### Valid Examples ✅
```json
{"from": ["hello world"], "to": "HelloWorld"}     // Multi-word to single word
{"from": ["c e o"], "to": "CEO"}                  // Spelled out acronym
{"from": ["decarlo", "de carlo"], "to": "DeCarlo"} // Multiple variations
```

### Invalid Examples ❌
```json
{"from": "hello", "to": "hi"}           // Missing array brackets
{"from": ["hi"], "to": "hello world"}   // Multi-word output
{"from": [], "to": "test"}              // Empty input array
```
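
The valid and invalid cases above could also be shown as a client-side check. This is a sketch under the assumption that the listed rules are the ones that matter; `validate_rule` is a hypothetical helper, and whether the API itself rejects each invalid case has not been verified here.

```python
from typing import Any, List


def validate_rule(rule: Any) -> List[str]:
    """Return a list of problems with one {"from": [...], "to": "..."} rule (empty if valid)."""
    if not isinstance(rule, dict):
        return ["rule must be an object with `from` and `to` keys"]
    errors: List[str] = []
    sources = rule.get("from")
    target = rule.get("to")
    if not isinstance(sources, list):
        errors.append("`from` must be an array, even for a single word")
    elif not sources:
        errors.append("`from` must contain at least one entry")
    elif not all(isinstance(s, str) and s.strip() for s in sources):
        errors.append("every `from` entry must be a non-empty string")
    if not isinstance(target, str) or not target.strip():
        errors.append("`to` must be a non-empty string")
    elif len(target.split()) != 1:
        errors.append("`to` must be a single word")
    return errors


print(validate_rule({"from": ["hi"], "to": "hello world"}))
```

Returning a list of messages rather than raising on the first problem lets a caller report every issue with a rule at once.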

5. Additional Recommendations

A. Add Integration Examples

Show how custom spelling works with other features:

## Integration with Other Features

Custom spelling can be combined with:
- Speaker diarization
- Custom vocabulary
- Punctuation formatting
- Language detection
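
A combined request body would make this list concrete. The sketch below only builds and serializes a payload; it sends nothing. The audio URL is a placeholder, and every field name other than `custom_spelling` is an assumption about the request schema that should be checked against the API reference before publishing.

```python
import json

# Assumed field names (verify against the API reference): audio_url,
# speaker_labels, word_boost, punctuate, language_detection.
payload = {
    "audio_url": "https://example.com/meeting.mp3",  # placeholder URL
    "custom_spelling": [
        {"from": ["sequel"], "to": "SQL"},
        {"from": ["decarlo", "de carlo"], "to": "DeCarlo"},
    ],
    "speaker_labels": True,      # speaker diarization
    "word_boost": ["DeCarlo"],   # custom vocabulary
    "punctuate": True,           # punctuation formatting
    "language_detection": True,  # language detection
}

body = json.dumps(payload, indent=2)
print(body)
```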

B. Add Performance Notes

## Performance Considerations

- Custom spelling adds minimal processing time
- Large numbers of rules (>100) may impact processing speed
- Consider batching similar audio files with shared custom spelling rules

6. Content Organization Improvements

Recommended new structure:

1. Overview & quick examples
2. API differences explanation
3. Common use cases
4. Code examples (organized by complexity)
5. Limitations & best practices
6. Troubleshooting
7. Integration notes

This restructuring would significantly improve user comprehension and reduce implementation errors.