Feedback: speech-to-text-pre-recorded-audio-custom-spelling
Documentation Feedback
Original URL: https://www.assemblyai.com/docs/speech-to-text/pre-recorded-audio/custom-spelling
Category: speech-to-text
Generated: 05/08/2025, 4:26:00 pm
Claude Sonnet 4 Feedback
Generated: 05/08/2025, 4:25:59 pm
Analysis & Recommendations for Custom Spelling Documentation
1. Critical Issues - High Priority
A. Inconsistent API Structure Documentation
Problem: The Python SDK takes custom spelling rules in a different structure from every other language example on the page, and the page does not explain the difference. This is a likely source of confusion.
Fix: Add a clear explanation of the structural differences:
````markdown
## API Structure Differences

**Python SDK**: Uses a dictionary format where:
- Key = desired output spelling
- Value = array of input words to replace

**All other APIs**: Use an object/array format where:
- `from` = array of input words to replace
- `to` = desired output spelling

### Python SDK Format
```python
{
    "SQL": ["Sequel", "sequel"],  # output spelling: input variations
    "DeCarlo": ["decarlo", "Decarlo"]
}
```

### REST API Format
```json
[
  { "from": ["Sequel", "sequel"], "to": "SQL" }
]
```
````
B. Contradictory Examples
Problem: Examples show conflicting mappings (SQL→Sequel vs Sequel→SQL) without explanation.
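One way to keep the direction of each rule unambiguous is to derive one format from the other instead of writing both by hand. The following is a minimal sketch, assuming only the two shapes described above (a dictionary keyed by output spelling, and a list of `from`/`to` objects). The helper names `sdk_to_rest` and `rest_to_sdk` are hypothetical and are not part of any AssemblyAI SDK.

```python
from typing import Dict, List, TypedDict

# "from" is a Python keyword, so the functional TypedDict syntax is used here.
RestRule = TypedDict("RestRule", {"from": List[str], "to": str})


def sdk_to_rest(spelling: Dict[str, List[str]]) -> List[RestRule]:
    """Convert {"output": ["input", ...]} into [{"from": [...], "to": "output"}]."""
    return [{"from": list(inputs), "to": output} for output, inputs in spelling.items()]


def rest_to_sdk(rules: List[RestRule]) -> Dict[str, List[str]]:
    """Convert [{"from": [...], "to": "output"}] into {"output": ["input", ...]}."""
    spelling: Dict[str, List[str]] = {}
    for rule in rules:
        # Rules that share an output spelling are merged under one key.
        spelling.setdefault(rule["to"], []).extend(rule["from"])
    return spelling


# Hypothetical example data, reusing the mappings from the section above.
sdk_rules = {"SQL": ["Sequel", "sequel"], "DeCarlo": ["decarlo", "Decarlo"]}
rest_rules = sdk_to_rest(sdk_rules)
```

In both shapes the spoken or misrecognized form is the input and the desired spelling is the output, so a round trip through the two helpers should return the original dictionary.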
Fix: Use consistent, logical examples throughout:
```markdown
## Consistent Example Set
- Company names: "decarlo" → "DeCarlo"
- Technical terms: "sequel" → "SQL"
- Brand names: "goo gle" → "Google"
```
2. Missing Critical Information
A. Add Limitations & Constraints Section
Section titled “A. Add Limitations & Constraints Section”## Limitations & Constraints
- **Word limits**: Maximum X words per `from` array- **Mapping limits**: Maximum X custom spelling rules per request- **Character limits**: `to` value limited to X characters- **Language considerations**: How custom spelling interacts with different languages- **Processing priority**: How custom spelling interacts with other featuresB. Add Use Cases & Best Practices
Section titled “B. Add Use Cases & Best Practices”## Common Use Cases
### Technical Terms- Database terminology: "sequel" → "SQL", "my sequel" → "MySQL"- Programming languages: "java script" → "JavaScript"
### Proper Nouns- Company names: "micro soft" → "Microsoft"- Personal names: "o'connor" → "O'Connor"- Geographic locations: "new york" → "New York"
### Acronyms & Abbreviations- "C E O" → "CEO"- "A P I" → "API"
## Best Practices- Test with sample audio before processing large batches- Use consistent capitalization in `to` values- Include common variations in `from` arrays3. Structure & Navigation Issues
Section titled “3. Structure & Navigation Issues”A. Add Quick Reference Section
Section titled “A. Add Quick Reference Section”## Quick Reference
| Language | Structure | Key→Value Mapping ||----------|-----------|-------------------|| Python SDK | Dictionary | `"output": ["input1", "input2"]` || REST APIs | Object Array | `{"from": ["input1"], "to": "output"}` |
## Parameter Summary- **Case sensitivity**: `to` values are case-sensitive, `from` values are not- **Word count**: `to` must be single word, `from` can be multiple words- **Array format**: `from` always expects an array, even for single wordsB. Improve Code Examples
Section titled “B. Improve Code Examples”Current issues:
- Inconsistent highlighting
- Missing error handling in some examples
- No output examples
Improvements:
````markdown
### Before/After Output Examples

**Input audio**: "The sequel database and decarlo both work well"

**Without custom spelling**:
"The sequel database and decarlo both work well"

**With custom spelling**:
```python
config.set_custom_spelling({
    "SQL": ["sequel"],
    "DeCarlo": ["decarlo"]
})
```

Output:
"The SQL database and DeCarlo both work well"
````
4. User Experience Pain Points
A. Add Troubleshooting Section
```markdown
## Troubleshooting

### Common Issues

**Custom spelling not applied**
- ✅ Verify exact spelling matches (case-insensitive for `from`)
- ✅ Check that `to` value contains only one word
- ✅ Ensure `from` is an array format

**Unexpected results**
- ✅ Test with shorter audio clips first
- ✅ Verify JSON structure matches your language's requirements
- ✅ Check for conflicting custom spelling rules

### Debugging Tips
1. Start with simple, single-word replacements
2. Use the transcript confidence scores to verify audio quality
3. Test common variations of your target words
```
B. Add Validation Guidelines
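The rules this section proposes, together with the conflicting-rule check from the troubleshooting list, could be run locally before a request is sent. The following is a minimal sketch under stated assumptions: rules are in the REST `from`/`to` shape, `to` must be a single word, `from` must be a non-empty array, and `from` matching is case-insensitive. The function name `validate_rules` is hypothetical, and whether the service itself rejects each of these inputs is not confirmed here.

```python
from typing import Any, Dict, List


def validate_rules(rules: List[Dict[str, Any]]) -> List[str]:
    """Return human-readable problems; an empty list means none were found."""
    problems: List[str] = []
    seen: Dict[str, str] = {}  # lower-cased `from` term -> `to` value that first claimed it

    for i, rule in enumerate(rules):
        sources = rule.get("from")
        target = rule.get("to")

        if not isinstance(sources, list):
            problems.append(f"rule {i}: `from` must be an array")
            sources = []
        elif not sources:
            problems.append(f"rule {i}: `from` is empty")

        # A single word means exactly one whitespace-separated token.
        if not isinstance(target, str) or len(target.split()) != 1:
            problems.append(f"rule {i}: `to` must be a single word")
            continue

        for term in sources:
            if not isinstance(term, str):
                problems.append(f"rule {i}: `from` entries must be strings")
                continue
            key = term.lower()  # assumption: `from` is matched case-insensitively
            if key in seen and seen[key] != target:
                problems.append(f"rule {i}: '{term}' is already mapped to '{seen[key]}'")
            seen.setdefault(key, target)

    return problems


# Hypothetical rule sets for illustration.
good = [{"from": ["decarlo", "de carlo"], "to": "DeCarlo"}]
conflicting = [{"from": ["sequel"], "to": "SQL"}, {"from": ["Sequel"], "to": "Sequel"}]
```

Returning a list of messages instead of raising on the first failure lets a caller report every problem in a rule set at once.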
Section titled “B. Add Validation Guidelines”## Input Validation
### Valid Examples ✅```json{"from": ["hello world"], "to": "HelloWorld"} // Multi-word to single word{"from": ["c e o"], "to": "CEO"} // Spelled out acronym{"from": ["decarlo", "de carlo"], "to": "DeCarlo"} // Multiple variationsInvalid Examples ❌
Section titled “Invalid Examples ❌”{"from": "hello", "to": "hi"} // Missing array brackets{"from": ["hi"], "to": "hello world"} // Multi-word output{"from": [], "to": "test"} // Empty input array### 5. **Additional Recommendations**
A. Add Integration Examples
Show how custom spelling works with other features:
```markdown
## Integration with Other Features

Custom spelling can be combined with:
- Speaker diarization
- Custom vocabulary
- Punctuation formatting
- Language detection
```
B. Add Performance Notes
Section titled “B. Add Performance Notes”## Performance Considerations
- Custom spelling adds minimal processing time- Large numbers of rules (>100) may impact processing speed- Consider batching similar audio files with shared custom spelling rules6. Content Organization Improvements
Recommended new structure:
- Overview & quick examples
- API differences explanation
- Common use cases
- Code examples (organized by complexity)
- Limitations & best practices
- Troubleshooting
- Integration notes
This restructuring would significantly improve user comprehension and reduce implementation errors.