Feedback: speech-to-text-pre-recorded-audio-filler-words

Documentation Feedback

Original URL: https://www.assemblyai.com/docs/speech-to-text/pre-recorded-audio/filler-words
Category: speech-to-text
Generated: 05/08/2025, 4:25:26 pm

Claude Sonnet 4 Feedback

Generated: 05/08/2025, 4:25:25 pm

Technical Documentation Analysis: Filler Words Feature

Overall Assessment

This documentation covers a single feature (filler word detection and removal) but lacks context, before/after output examples, and user guidance. The code examples are comprehensive across multiple languages, but the conceptual explanation is minimal.

Critical Issues & Recommendations

1. Missing Context and Purpose

Problem: The documentation jumps straight into technical details without explaining what filler words are or why this feature matters.

Recommendations:

# Filler Words Detection

Filler words (also called disfluencies) are speech hesitations like "um," "uh," and "hmm" that speakers use while thinking or pausing. AssemblyAI automatically removes these from transcripts by default to provide cleaner, more readable text.

## When to use this feature
- **Keep filler words** when analyzing speech patterns, creating subtitles, or studying natural speech
- **Remove filler words** (default) for clean transcripts, meeting notes, or content creation

## How it works
AssemblyAI's speech recognition model identifies and filters out common filler words during transcription. You can control this behavior using the `disfluencies` parameter.

2. Incomplete Filler Words List

Problem: The list appears incomplete and doesn’t explain language variations or detection confidence.

Recommendations:

## Detected Filler Words
The following filler words are automatically removed by default:

- **Hesitations:** "um", "uh", "ah"
- **Confirmations:** "mhm", "uh-huh", "hm"
- **Thinking sounds:** "hmm", "huh"
- **Minimal responses:** "m"

> **Note:** Detection accuracy may vary based on audio quality, speaker accent, and context. Some instances might be missed or incorrectly identified as filler words.

3. Missing Before/After Examples

Problem: Users can’t see the actual impact of enabling/disabling this feature.

Recommendations:

## Example Output

**With filler words removed (default):**

I think we should go to the store and buy some groceries for tonight’s dinner.

**With filler words included (`disfluencies: true`):**

Um, I think we should, uh, go to the store and buy some groceries for, hmm, tonight’s dinner.
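
To make the relationship between the two outputs concrete, the sketch below derives the default-style sentence from the verbatim one using a local regular expression. It is an illustration only and not AssemblyAI's implementation; `strip_filler_words` and `FILLER_WORDS` are hypothetical names, and the word list is the one quoted in this review.

```python
import re

# Filler words as listed in this review. Longer entries come first so that
# "uh-huh" is tried before "uh" and "hmm" before "hm".
FILLER_WORDS = ("uh-huh", "mhm", "hmm", "huh", "um", "uh", "ah", "hm", "m")

# A filler word as a whole token, together with the commas that set it off.
# The lookarounds prevent matches inside words such as "I'm" or "some".
_FILLER_PATTERN = re.compile(
    r"(?:,\s*)?(?<![\w'-])(?:"
    + "|".join(re.escape(word) for word in FILLER_WORDS)
    + r")(?![\w'-]),?",
    flags=re.IGNORECASE,
)


def strip_filler_words(text: str) -> str:
    """Return text with the listed filler words removed (illustrative only)."""
    cleaned = _FILLER_PATTERN.sub("", text)
    cleaned = re.sub(r"\s{2,}", " ", cleaned).strip()
    # Restore sentence-initial capitalization if a leading filler was removed.
    return cleaned[:1].upper() + cleaned[1:]


verbatim = (
    "Um, I think we should, uh, go to the store and buy some groceries "
    "for, hmm, tonight's dinner."
)
print(strip_filler_words(verbatim))
```

Run on the verbatim sentence, this prints the same text as the default output shown above. A real page would show actual API output rather than a local approximation.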

4. Inadequate Language Support Information

Problem: The accordion is easy to miss, and there’s no explanation of why only English variants are supported.

Recommendations:

## Language Support
Currently available for English variants only:
- Global English (`en`)
- Australian English (`en_au`)
- British English (`en_uk`)
- US English (`en_us`)

> **Other languages:** Filler word detection is not currently available for other languages. [Request a language →](link-to-feedback)
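
A client could guard against unsupported languages before enabling the option. The sketch below is a minimal example under that assumption; `supports_disfluencies` and `SUPPORTED_LANGUAGE_CODES` are hypothetical names, and the codes are copied from the list above.

```python
# Language codes this review lists as supporting filler word detection.
SUPPORTED_LANGUAGE_CODES = frozenset({"en", "en_au", "en_uk", "en_us"})


def supports_disfluencies(language_code: str) -> bool:
    """Return True if the code is in the list above (hypothetical helper)."""
    return language_code.strip().lower() in SUPPORTED_LANGUAGE_CODES


for code in ("en_us", "es"):
    print(code, supports_disfluencies(code))
```

Keeping the codes in one constant means the check needs only a one-line change if the supported list grows.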

5. Code Examples Need Improvement

Problems:

- No explanation of the `disfluencies` parameter
- Missing error handling context
- No guidance on when to use each approach

Recommendations:

Add parameter documentation:

## Configuration

| Parameter | Type | Default | Description |
|-----------|------|---------|-------------|
| `disfluencies` | boolean | `false` | When `true`, includes filler words in transcript. When `false` (default), removes filler words for cleaner text. |

Improve code example introduction:

## Implementation

Set `disfluencies: true` in your transcription configuration to preserve filler words in the output:
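
As one possible shape for such an example, the sketch below builds a transcription request with the Python standard library and does not send it. The endpoint URL, the `authorization` header, and the `audio_url` field are assumptions based on AssemblyAI's public REST API and should be checked against the API reference; `build_transcript_request` is a hypothetical helper.

```python
import json
import urllib.request

# Assumed endpoint for submitting a transcription job; verify against the API reference.
TRANSCRIPT_ENDPOINT = "https://api.assemblyai.com/v2/transcript"


def build_transcript_request(
    api_key: str, audio_url: str, keep_filler_words: bool
) -> urllib.request.Request:
    """Build, but do not send, a transcription request (hypothetical helper)."""
    payload = {
        "audio_url": audio_url,
        # True keeps filler words in the transcript; False removes them.
        "disfluencies": keep_filler_words,
    }
    return urllib.request.Request(
        TRANSCRIPT_ENDPOINT,
        data=json.dumps(payload).encode("utf-8"),
        headers={"authorization": api_key, "content-type": "application/json"},
        method="POST",
    )


request = build_transcript_request(
    "YOUR_API_KEY", "https://example.com/audio.mp3", keep_filler_words=True
)
print(request.get_method(), request.full_url)
print(request.data.decode("utf-8"))
```

Passing the request to `urllib.request.urlopen` would submit the job. Building it separately keeps the example runnable without an API key or network access.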

6. Missing Practical Guidance

Problem: No guidance on best practices, limitations, or troubleshooting.

Recommendations:

## Best Practices

- **For meeting transcripts:** Keep default (`disfluencies: false`) for cleaner notes
- **For speech analysis:** Enable (`disfluencies: true`) to preserve natural speech patterns
- **For accessibility:** Consider your audience; some readers prefer verbatim transcripts that keep filler words for context

## Limitations

- Detection accuracy depends on audio quality and speaker clarity
- Some words may be incorrectly classified as filler words in certain contexts
- Very quiet or unclear filler words might not be detected

## Troubleshooting

**Filler words appearing when they shouldn't?**
- Check that `disfluencies` is set to `false` (default)
- Verify your audio quality meets [minimum requirements](link)

**Missing expected filler words?**
- Ensure `disfluencies` is set to `true`
- Consider that very quiet hesitations may not be detected
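
A troubleshooting section could also show a quick check of whether the setting took effect. The sketch below counts filler words in returned transcript text; `count_filler_words` is a hypothetical helper, and the word list is the one quoted in this review, not an authoritative list from the API.

```python
import re
from collections import Counter

# Filler words as listed in this review.
FILLER_WORDS = frozenset(
    {"um", "uh", "ah", "mhm", "uh-huh", "hm", "hmm", "huh", "m"}
)

# Tokens keep internal apostrophes and hyphens, so "I'm" is not split into "i" and "m".
_TOKEN_PATTERN = re.compile(r"[a-z]+(?:['-][a-z]+)*")


def count_filler_words(transcript_text: str) -> Counter:
    """Count occurrences of each listed filler word in the text."""
    tokens = _TOKEN_PATTERN.findall(transcript_text.lower())
    return Counter(token for token in tokens if token in FILLER_WORDS)


counts = count_filler_words("Um, I'm sure we should, uh, wait. Hmm, uh-huh.")
print(dict(counts))
```

A count of zero with `disfluencies` set to `true` would suggest the option was not applied or that the audio contains no detectable filler words. A nonzero count with the default setting would be worth reporting.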

7. Structural Improvements

Current structure issues:

- Abrupt start without introduction
- Language support buried in accordion
- No logical flow from concept to implementation

Recommended structure:

1. Feature introduction and purpose
2. How it works (brief technical explanation)
3. Language support (prominent placement)
4. Configuration options
5. Before/after examples
6. Code implementation
7. Best practices and limitations
8. Troubleshooting
9. Related features (cross-links)

8. Missing Cross-References

Add related documentation links:

## Related Features
- [Speech Recognition Accuracy](link) - Improve overall transcription quality
- [Custom Vocabulary](link) - Handle domain-specific terms
- [Punctuation & Formatting](link) - Additional text cleaning options

Quick Wins

- Add a 2-3 sentence introduction explaining what filler words are
- Include a simple before/after example at the top
- Move language support out of the accordion to make it more visible
- Add a troubleshooting section with common issues
- Include cross-links to related features

These changes should improve user understanding and may reduce support requests, while keeping the existing multi-language code examples intact.