Feedback: guides-transcribe_system_audio

Original URL: https://www.assemblyai.com/docs/guides/transcribe_system_audio
Category: guides
Generated: 05/08/2025, 4:35:03 pm


Technical Documentation Analysis: Transcribe System Audio in Real-Time (macOS)

This documentation provides a comprehensive code example but has significant gaps in user guidance, setup instructions, and error handling explanations. The structure needs improvement to better serve users with different experience levels.

### 1. Missing Prerequisites & Setup Information

Issues:

  • No clear system requirements
  • Missing BlackHole installation instructions
  • No mention of required Python version or dependencies
  • Audio routing setup is relegated to troubleshooting

Recommendations:

## Prerequisites
### System Requirements
- macOS 10.15 or later
- Python 3.7 or later
- AssemblyAI API key ([get one here](https://assemblyai.com/dashboard/signup))
### Required Dependencies
```bash
pip install pyaudio websocket-client
```
  1. Install BlackHole:

    brew install blackhole-2ch

    Or download the installer from the ExistentialAudio/BlackHole repository on GitHub.

  2. Configure Audio Routing (Critical Step):

    • Open “Audio MIDI Setup” (Spotlight search: “Audio MIDI Setup”)
    • Click ”+” → “Create Multi-Output Device”
    • Check both “BlackHole 2ch” and your speakers/headphones
    • Name it “BlackHole + Speakers”
    • Set this as your system output device in System Settings > Sound (System Preferences > Sound on macOS 12 and earlier)
  3. Verify installation by running the script - you should see BlackHole in the device list
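
The verification in step 3 could be sketched as follows. This is a minimal sketch, not the guide's actual script: `find_blackhole_index` and `list_input_devices` are hypothetical helper names, and the match assumes the device name contains "BlackHole". The PyAudio calls (`get_device_count`, `get_device_info_by_index`, `terminate`) are part of PyAudio's standard API.

```python
from typing import Iterable, Optional


def find_blackhole_index(devices: Iterable[dict]) -> Optional[int]:
    """Return the index of the first input-capable device named like BlackHole."""
    for info in devices:
        name = str(info.get("name", ""))
        has_inputs = info.get("maxInputChannels", 0) > 0
        if "blackhole" in name.lower() and has_inputs:
            return int(info["index"])
    return None


def list_input_devices() -> list:
    """Ask PyAudio for every audio device (requires `pip install pyaudio`)."""
    import pyaudio  # imported lazily so the pure helper above works without it

    audio = pyaudio.PyAudio()
    try:
        count = audio.get_device_count()
        return [audio.get_device_info_by_index(i) for i in range(count)]
    finally:
        audio.terminate()  # always release PortAudio resources


# Typical use on a machine with PyAudio installed:
#   index = find_blackhole_index(list_input_devices())
#   if index is None:
#       print("BlackHole audio device not found")
```

Keeping the search separate from the PyAudio query means the matching logic can be checked against a hand-written device list, without audio hardware.
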

### 2. Poor Information Architecture
**Issues:**
- Quickstart comes before essential setup information
- Step-by-step guide repeats code without adding value
- Missing logical flow for different user types
**Recommended Structure:**
```markdown
# Transcribe System Audio in Real-Time (macOS)
## Overview
[Brief description and use cases]
## Prerequisites
[All setup requirements]
## Quick Start
[Minimal working example - 20 lines max]
## Complete Implementation
[Full code with detailed explanations]
## Configuration Options
[API parameters, audio settings]
## Troubleshooting
[Common issues and solutions]
## Advanced Usage
[Customizations, optimizations]
```

Issues:

  • Unused variables (recorded_frames, recording_lock)
  • Missing error handling explanations
  • No code commenting for complex sections
  • Hardcoded values without explanation

Recommendations:

# Remove the unused recording variables, or document their purpose.
# If they are kept for future use, add comments such as:
import threading
# WAV recording variables (reserved for a future file-output feature)
recorded_frames = []  # Audio frames to be written to a WAV file
recording_lock = threading.Lock()  # Guards access to recorded_frames
# Add configuration explanations:
FRAMES_PER_BUFFER = 800 # 50ms audio chunks (optimal for real-time streaming)
SAMPLE_RATE = 16000 # Required by AssemblyAI Streaming API
CHANNELS = 1 # Mono audio (stereo will be converted)
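
The "50ms" figure in the comment above follows directly from the buffer size and sample rate, and the guide could show that arithmetic. This is a minimal sketch: the helper names are hypothetical, and the 2 bytes per sample assumes 16-bit PCM capture, which the source does not state.

```python
SAMPLE_RATE = 16000       # samples per second
FRAMES_PER_BUFFER = 800   # frames read per chunk
CHANNELS = 1              # mono
BYTES_PER_SAMPLE = 2      # assumption: 16-bit PCM capture


def chunk_duration_ms(frames: int, sample_rate: int) -> float:
    """Length of one audio chunk in milliseconds."""
    return frames * 1000 / sample_rate


def chunk_size_bytes(frames: int, channels: int, bytes_per_sample: int) -> int:
    """Number of bytes sent over the WebSocket per chunk."""
    return frames * channels * bytes_per_sample


print(chunk_duration_ms(FRAMES_PER_BUFFER, SAMPLE_RATE))                   # 50.0
print(chunk_size_bytes(FRAMES_PER_BUFFER, CHANNELS, BYTES_PER_SAMPLE))     # 1600
```

Doubling `FRAMES_PER_BUFFER` to 1600 doubles both values: each chunk covers 100 ms and carries 3200 bytes.
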

Issues:

  • No explanation of common error scenarios
  • Missing API error code meanings
  • No guidance for debugging connection issues

Add Error Handling Section:

## Common Errors & Solutions
### "BlackHole audio device not found"
**Cause:** BlackHole not installed or not properly configured
**Solution:** Follow the Prerequisites section installation steps
### "WebSocket Error: [Errno 61] Connection refused"
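**Cause:** The TCP connection to the API endpoint was refused, which points to a network problem (firewall, proxy, VPN, or a wrong endpoint URL). An invalid API key is normally rejected during the WebSocket handshake with an authentication error, not with Errno 61.
**Solution:**
- Check your internet connection and any firewall, proxy, or VPN settings
- Confirm the WebSocket endpoint URL is correct
- If you see an authentication error instead, verify your API key and that you have API credits remaining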
### "Error streaming audio: [Errno -9981] Input overflowed"
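**Cause:** The input buffer overflowed because the script did not read audio from the stream quickly enough
**Solution:** Increase `FRAMES_PER_BUFFER`, avoid blocking work inside the audio loop, or close other CPU-heavy applications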
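
The error table above could be backed by a small helper that turns an exception into one of these hints. This is a minimal sketch: `explain_error` is a hypothetical name, and the value -9981 is PortAudio's `paInputOverflowed` code. It assumes the error reaches the caller as an `OSError` (or subclass), which should be confirmed against the script's actual callbacks.

```python
import errno

PA_INPUT_OVERFLOWED = -9981  # PortAudio's paInputOverflowed error code


def explain_error(exc: BaseException) -> str:
    """Map a known failure to a short troubleshooting hint."""
    refused = getattr(exc, "errno", None) == errno.ECONNREFUSED
    if isinstance(exc, ConnectionRefusedError) or refused:
        return "Network problem: check connectivity, firewall/proxy, and the endpoint URL."
    # PyAudio versions differ in argument order, so look for the code anywhere in args.
    if isinstance(exc, OSError) and PA_INPUT_OVERFLOWED in exc.args:
        return "Input overflow: increase FRAMES_PER_BUFFER or reduce work in the audio loop."
    return f"Unrecognized error: {exc!r}"


print(explain_error(OSError(PA_INPUT_OVERFLOWED, "Input overflowed")))
```

PyAudio's `Stream.read` also accepts `exception_on_overflow=False`, which drops overflowed frames instead of raising. Whether silently losing audio is acceptable depends on the application.
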

Issues:

  • No explanation of what users should expect to see
  • Missing setup verification steps
  • No guidance for testing the setup

Add User Experience Section:

## Testing Your Setup
1. **Verify BlackHole Detection:**
Run the script. You should see output like:

    Available audio devices:
    0: Built-in Microphone (inputs: 1)
    1: BlackHole 2ch (inputs: 2)
    -> Found BlackHole device at index 1

2. **Test Audio Routing:**
- Play audio/video content
- You should see real-time transcription appearing
- If no transcription appears, check your audio routing setup
3. **Expected Output:**

    WebSocket connection opened.
    Session began: ID=abc123, ExpiresAt=2024-01-01 15:30:00
    Hello, this is a test of the transcription system…
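
When no transcription appears, it helps to know whether any audio is reaching the script at all. One way to check the routing, sketched below, is to measure the loudness of a captured chunk. The function names and the threshold of 50 are illustrative assumptions, and the code assumes 16-bit little-endian mono PCM whose length is a multiple of 2 bytes.

```python
import array
import math
import sys


def rms_int16(chunk: bytes) -> float:
    """Root-mean-square amplitude of a 16-bit PCM chunk (0.0 for an empty chunk)."""
    samples = array.array("h")
    samples.frombytes(chunk)  # raises ValueError if len(chunk) is odd
    if sys.byteorder == "big":
        samples.byteswap()  # PCM data is assumed to be little-endian
    if not samples:
        return 0.0
    return math.sqrt(sum(s * s for s in samples) / len(samples))


def is_silent(chunk: bytes, threshold: float = 50.0) -> bool:
    """True when the chunk is quieter than the (arbitrary) threshold."""
    return rms_int16(chunk) < threshold


# 800 frames of digital silence, as captured when audio is not routed to BlackHole
print(is_silent(bytes(1600)))  # True
```

If every chunk is silent while audio is playing, the Multi-Output Device is probably not selected as the system output.
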

Issues:

  • Mentions Windows support but provides no implementation
  • Creates false expectations for Windows users

Recommendations:

  • Either provide Windows implementation or clearly state “macOS only”
  • If keeping Windows mention, add:
## Windows Implementation (Coming Soon)
This guide currently supports macOS only. For Windows users:
- Use [Virtual Audio Cable (VAC)](https://vac.muzychenko.net/en/)
- The code structure remains similar - main difference is device detection
- A Windows implementation guide is planned

Issues:

  • No explanation of API parameters
  • Missing performance optimization guidance
  • No customization options

Add Configuration Section:

## Configuration Options
### API Parameters
```python
CONNECTION_PARAMS = {
    "sample_rate": 16000,        # Required: 16 kHz for optimal accuracy
    "format_turns": True,        # Optional: get formatted final transcripts
    "word_boost": ["specific"],  # Optional: boost specific words
    "language_code": "en",       # Optional: specify language (auto-detect by default)
}
```
### Audio Settings
- `FRAMES_PER_BUFFER`: Lower = more responsive, higher = more stable
- `CHANNELS`: Keep at 1 (mono) for best performance
- `SAMPLE_RATE`: Must match the `sample_rate` API parameter (16000)
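
The requirement that the capture rate match the API parameter could be enforced where the connection URL is built. This is a minimal sketch: `build_streaming_url` is a hypothetical helper, and `BASE_URL` is a placeholder, not the real endpoint, which should be taken from the API reference.

```python
from urllib.parse import urlencode

SAMPLE_RATE = 16000  # rate used when opening the audio input stream

# Placeholder endpoint (assumption): substitute the URL from the API reference.
BASE_URL = "wss://example.invalid/ws"


def build_streaming_url(base_url: str, params: dict, capture_rate: int) -> str:
    """Append query parameters, refusing a sample-rate mismatch."""
    if params.get("sample_rate") != capture_rate:
        raise ValueError(
            f"sample_rate {params.get('sample_rate')} does not match "
            f"capture rate {capture_rate}"
        )
    return f"{base_url}?{urlencode(params)}"


url = build_streaming_url(
    BASE_URL, {"sample_rate": 16000, "format_turns": True}, SAMPLE_RATE
)
print(url)  # wss://example.invalid/ws?sample_rate=16000&format_turns=True
```

Failing at URL construction surfaces the mismatch before any audio is sent, which is easier to debug than degraded transcripts.
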
## Additional Recommendations
1. **Add Visual Aids:** Include screenshots of Audio MIDI Setup and System Preferences
2. **Provide Minimal Example:** Create a 20-line quick start version
3. **Add Logging:** Replace print statements with proper logging
4. **Include Performance Notes:** Expected latency, CPU usage, etc.
5. **Add FAQ Section:** Address common user questions
6. **Version Information:** Specify which versions of dependencies were tested
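
Recommendation 3 could look like the following. This is a minimal sketch using only the standard `logging` module. The logger name `system_audio_transcriber` and the message texts are illustrative, not taken from the guide's script.

```python
import logging


def configure_logging(level: int = logging.INFO) -> logging.Logger:
    """Set up console logging once and return the script's logger."""
    logging.basicConfig(
        level=level,
        format="%(asctime)s %(levelname)s %(name)s: %(message)s",
    )
    logger = logging.getLogger("system_audio_transcriber")  # hypothetical name
    logger.setLevel(level)
    return logger


logger = configure_logging()
logger.info("WebSocket connection opened")           # replaces print(...)
logger.warning("BlackHole audio device not found")   # problems stand out by level
logger.debug("Sent audio chunk")                     # hidden unless level is DEBUG
```

With levels in place, per-chunk diagnostics can stay in the code at `DEBUG` without flooding the transcript output during normal use.
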
This documentation has good technical depth but needs significant improvements in user onboarding, structure, and practical guidance to become truly user-friendly.
---