Overview
TranscriptionManager handles all transcription-related functionality within a user session. It provides provider abstraction for multiple transcription services (Azure and Soniox), manages stream lifecycle, handles multi-language support, implements VAD (Voice Activity Detection) audio buffering, and maintains transcript history.
File: packages/cloud/src/services/session/transcription/TranscriptionManager.ts
Key Features
- Provider Abstraction: Supports Azure and Soniox transcription services
- Multi-Language Support: Handles transcription for various languages
- VAD Audio Buffering: Prevents speech loss during stream startup
- Automatic Provider Failover: Falls back to alternative providers on failure
- Stream Health Monitoring: Detects and replaces unhealthy streams
- Transcript History: Maintains per-language transcript segments
- Local Transcription Support: Can receive transcripts from device-side processing
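The VAD audio buffering feature listed above can be illustrated with a minimal sketch. This is not the code in TranscriptionManager.ts: the class name `VadAudioBuffer`, its methods, and the `maxChunks` cap are hypothetical, chosen only to show the idea of holding audio while a stream starts and replaying it once the stream is ready.

```typescript
// Hypothetical sketch: none of these names are taken from TranscriptionManager.ts.
type AudioChunk = Uint8Array;

class VadAudioBuffer {
  private chunks: AudioChunk[] = [];
  private buffering = false;

  // maxChunks is an assumed cap that keeps memory bounded during slow startups.
  constructor(private readonly maxChunks: number = 50) {}

  // Called when VAD detects speech but the provider stream is not ready yet.
  start(): void {
    this.chunks = [];
    this.buffering = true;
  }

  // Returns true if the chunk was buffered, false if the caller should
  // send it directly to the live stream instead.
  push(chunk: AudioChunk): boolean {
    if (!this.buffering) return false;
    this.chunks.push(chunk);
    // Drop the oldest chunk once the cap is exceeded.
    if (this.chunks.length > this.maxChunks) this.chunks.shift();
    return true;
  }

  // Called once the stream is ready: replays buffered audio in arrival order,
  // then stops buffering so later chunks bypass the buffer.
  flush(write: (chunk: AudioChunk) => void): number {
    const pending = this.chunks;
    this.chunks = [];
    this.buffering = false;
    for (const chunk of pending) write(chunk);
    return pending.length;
  }
}
```

In this sketch the caller would invoke `start()` when speech is detected, route incoming audio through `push()`, and call `flush()` with the stream's write function once the provider reports ready. Dropping the oldest chunks on overflow is one possible policy; an implementation could equally drop the newest or grow without a cap.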
Architecture
Provider Management
Provider Configuration
Provider Initialization
Stream Lifecycle Management
Subscription Updates
Stream Creation
VAD Audio Buffering
Buffer Management
Buffer Operations
Stream Health Management
Health Checking
Stream Recovery
Error Handling and Recovery
Provider Failover
Transcript History
History Management
History Pruning
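As a rough illustration of pruning per-language history, the sketch below removes segments older than a cutoff. The `TranscriptSegment` shape, the `pruneHistory` function, and the age-based policy are all assumptions made for this example; the actual data structures and retention rules in TranscriptionManager.ts may differ.

```typescript
// Hypothetical sketch: the segment shape and function name are illustrative only.
interface TranscriptSegment {
  text: string;
  timestamp: number; // milliseconds since epoch (assumed)
  isFinal: boolean;
}

// Removes segments older than maxAgeMs from every language's history.
// Returns the number of segments removed.
function pruneHistory(
  history: Map<string, TranscriptSegment[]>,
  maxAgeMs: number,
  now: number = Date.now()
): number {
  let removed = 0;
  history.forEach((segments, language) => {
    const kept = segments.filter((s) => now - s.timestamp <= maxAgeMs);
    removed += segments.length - kept.length;
    if (kept.length === 0) {
      // Drop languages with no remaining segments so the map does not grow.
      history.delete(language);
    } else {
      history.set(language, kept);
    }
  });
  return removed;
}
```

A function like this could run on a timer or after each new segment is appended. Pruning by age is only one option; a cap on segment count per language would bound memory just as well.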
Local Transcription Support
Integration Methods
Audio Feed
Token Finalization
Metrics and Monitoring
Configuration
Timing Constants
Lifecycle Management
Disposal
Best Practices
- Always buffer audio during VAD startup to prevent speech loss
- Monitor stream health and replace unhealthy streams automatically
- Use appropriate timeouts, with shorter ones for VAD scenarios
- Implement smart provider failover based on error types
- Prune transcript history to prevent memory growth
- Handle local transcription for device-side processing
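The failover practice above, choosing a response based on the type of error, might look like the following sketch. The error categories, the `decideFailover` function, and the retry limit are assumptions for illustration and are not taken from the TranscriptionManager source; only the provider names Azure and Soniox come from this page.

```typescript
// Hypothetical sketch: error kinds and the decision shape are illustrative only.
type ProviderName = "azure" | "soniox";
type ErrorKind = "network" | "auth" | "rate_limit" | "unsupported_language";

interface FailoverDecision {
  action: "retry" | "switch" | "fail";
  provider?: ProviderName;
}

function decideFailover(
  current: ProviderName,
  kind: ErrorKind,
  available: ProviderName[],
  attempts: number,
  maxRetries: number = 2
): FailoverDecision {
  // Transient network errors are retried on the same provider first.
  if (kind === "network" && attempts < maxRetries) {
    return { action: "retry", provider: current };
  }
  // Other errors (or exhausted retries) move to a different provider if one exists.
  const alternative = available.find((p) => p !== current);
  if (alternative !== undefined) {
    return { action: "switch", provider: alternative };
  }
  // No alternative left: surface the failure to the caller.
  return { action: "fail" };
}
```

For example, under these assumptions an authentication error on Azure switches to Soniox immediately, while a first network error on Azure is retried on Azure.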
Integration Points
- AudioManager: Receives audio data for transcription
- TranslationManager: Works in parallel for translation needs
- SubscriptionService: Determines which apps receive transcripts
- Provider Classes: Azure and Soniox implementations
- VAD Detection: Triggers stream lifecycle events
Related Documentation
- AudioManager: Audio source
- TranslationManager: Translation services
- SubscriptionService: App subscriptions
- Message Types: Transcript data formats