Get Recent Audio
Retrieve the last 10 seconds of audio data from a user session.
This endpoint is restricted to the com.augmentos.shazam package only.
Endpoint
Production
GET https://api.mentra.glass/api/audio/:userId
Note: the route is defined in code as /api/audio/:userId (line 47 of audio.routes.ts). Because the router is already mounted at /api, the definition should be just /audio/:userId.
Parameters
| Parameter | Type | Description |
| --- | --- | --- |
| userId | string | Target user ID (in URL) |
Query Parameters
| Parameter | Type | Required | Description |
| --- | --- | --- | --- |
| apiKey | string | Yes | App API key |
| packageName | string | Yes | Must be com.augmentos.shazam |
| userId | string | Yes | Target user ID (same as URL parameter) |
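As a rough illustration of how the path and query parameters above fit together, the sketch below assembles a request URL. The helper name `buildAudioUrl`, the `AudioRequest` shape, and the sample values are hypothetical and not part of the API; only the parameter names and the package name come from this page.

```typescript
// Hypothetical helper (not part of the API): builds the recent-audio request URL.
interface AudioRequest {
  baseUrl: string; // e.g. "https://api.mentra.glass"
  userId: string;  // sent both in the path and in the query string
  apiKey: string;
}

// The only package this endpoint accepts, per the documentation above.
const SHAZAM_PACKAGE = "com.augmentos.shazam";

function buildAudioUrl(req: AudioRequest): string {
  const pairs: Array<[string, string]> = [
    ["apiKey", req.apiKey],
    ["packageName", SHAZAM_PACKAGE],
    ["userId", req.userId],
  ];
  // Percent-encode every value so IDs containing "@" or spaces stay valid.
  const query = pairs
    .map(([key, value]) => `${key}=${encodeURIComponent(value)}`)
    .join("&");
  return `${req.baseUrl}/api/audio/${encodeURIComponent(req.userId)}?${query}`;
}
```

The user ID appears twice on purpose, since the endpoint expects it in both the URL path and the query string.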
Response
Success (200):
Binary audio data stream
Content-Type: application/octet-stream
Format: PCM audio buffer (concatenated audio chunks)
Error (401):
{
  "success": false,
  "message": "Invalid API key." // or "Authentication required. Provide apiKey, packageName, and userId."
}
Error (403):
{
  "success": false,
  "message": "Unauthorized package name"
}
Error (404):
{
  "error": "Session not found" // or "No audio available", "No decodable audio available"
}
Error (500):
{
  "error": "Error fetching audio"
}
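The error bodies above come in two shapes: 401 and 403 responses carry `success` and `message`, while 404 and 500 responses carry a single `error` field. A client may want to normalize them. The following is a minimal sketch; the type `AudioErrorBody` and the function `errorText` are hypothetical names introduced only for illustration.

```typescript
// Hypothetical types modelling the two documented error shapes.
type AudioErrorBody =
  | { success: false; message: string } // 401 and 403 responses
  | { error: string };                  // 404 and 500 responses

// Returns the human-readable text regardless of which shape was received.
function errorText(body: AudioErrorBody): string {
  return "message" in body ? body.message : body.error;
}
```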
Implementation
File: packages/cloud/src/routes/audio.routes.ts:47-91
Middleware: shazamAuthMiddleware (validates package name and API key)
Service: AudioManager.getRecentAudioBuffer()
Authorization
Only the com.augmentos.shazam package is allowed
A valid API key for that package is required
The target user ID must be given in both the URL and the query parameters
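The rules above, combined with the documented 401 and 403 messages, could be checked roughly as follows. This is not the actual shazamAuthMiddleware, whose source is not shown here. The function name `checkShazamAuth`, the `isValidKey` callback, and the order of the checks are all assumptions.

```typescript
// Hypothetical shapes for the incoming query and a failed check.
interface AuthQuery {
  apiKey?: string;
  packageName?: string;
  userId?: string;
}

interface AuthFailure {
  status: 401 | 403;
  message: string;
}

// Returns null when the request passes, otherwise the status and message to send.
// `isValidKey` stands in for whatever key lookup the real middleware performs.
function checkShazamAuth(
  query: AuthQuery,
  isValidKey: (packageName: string, apiKey: string) => boolean,
): AuthFailure | null {
  if (!query.apiKey || !query.packageName || !query.userId) {
    return {
      status: 401,
      message: "Authentication required. Provide apiKey, packageName, and userId.",
    };
  }
  if (query.packageName !== "com.augmentos.shazam") {
    return { status: 403, message: "Unauthorized package name" };
  }
  if (!isValidKey(query.packageName, query.apiKey)) {
    return { status: 401, message: "Invalid API key." };
  }
  return null;
}
```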
Audio Processing
Returns buffered audio from userSession.audioManager.getRecentAudioBuffer()
Audio chunks are concatenated into a single buffer
LC3 codec support is currently commented out in the code but planned for the future
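To make the concatenation step concrete, here is a small sketch of joining audio chunks into one contiguous buffer. The real getRecentAudioBuffer() implementation is not shown in this document, so `concatChunks` is a hypothetical stand-in that only illustrates the idea.

```typescript
// Hypothetical illustration: joins PCM chunks into one contiguous byte buffer.
function concatChunks(chunks: Uint8Array[]): Uint8Array {
  // Compute the total size first so the output is allocated exactly once.
  const total = chunks.reduce((sum, chunk) => sum + chunk.length, 0);
  const out = new Uint8Array(total);
  let offset = 0;
  for (const chunk of chunks) {
    out.set(chunk, offset); // copy this chunk at the current write position
    offset += chunk.length;
  }
  return out;
}
```

Allocating once and copying with `set` avoids repeatedly growing the buffer as chunks are appended.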
Text-to-Speech
Convert text to speech using ElevenLabs API.
Endpoint
Production
GET https://api.mentra.glass/api/tts
Note: the route is defined in code as /api/tts (line 94 of audio.routes.ts). Because the router is already mounted at /api, the definition should be just /tts.
Query Parameters
| Parameter | Type | Required | Description |
| --- | --- | --- | --- |
| text | string | Yes | Text to convert to speech |
| voice_id | string | No | ElevenLabs voice ID (uses default if not provided) |
| model_id | string | No | TTS model (defaults to eleven_flash_v2_5) |
| voice_settings | JSON string | No | Voice customization settings |
Response
Success (200):
Audio stream
Content-Type: audio/mpeg
Streaming MP3 audio data
Connection: keep-alive
Error (400):
{
  "success": false,
  "message": "Text parameter is required and must be a string" // or other validation errors
}
Error (500):
{
  "success": false,
  "message": "TTS service not configured" // or "Internal server error"
}
Voice Settings Example
{
  "stability": 0.5,
  "similarity_boost": 0.5,
  "style": 0.5,
  "use_speaker_boost": true
}
Implementation
File: packages/cloud/src/routes/audio.routes.ts:94-223
Service: Proxies to the ElevenLabs API
Streaming: Streams the response directly to the client using the fetch API
Configuration
Requires environment variables:
ELEVENLABS_API_KEY: Your ElevenLabs API key
ELEVENLABS_DEFAULT_VOICE_ID: Default voice to use (optional if voice_id provided)
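One way to reason about these two variables is sketched below: the API key is always needed, while the default voice ID only matters when the request carries no voice_id. The function `resolveTtsConfig` and its return shape are hypothetical. How the real handler reports a missing voice ID is not documented here, so the sketch only lists what is missing.

```typescript
// Environment passed in explicitly so the sketch stays free of Node-specific globals.
type Env = Record<string, string | undefined>;

// Hypothetical helper: picks the voice ID and reports which variables are missing.
function resolveTtsConfig(env: Env, requestedVoiceId?: string) {
  const apiKey = env["ELEVENLABS_API_KEY"];
  // A voice_id from the request takes precedence over the configured default.
  const voiceId = requestedVoiceId || env["ELEVENLABS_DEFAULT_VOICE_ID"];
  const missing: string[] = [];
  if (!apiKey) missing.push("ELEVENLABS_API_KEY");
  if (!voiceId) missing.push("ELEVENLABS_DEFAULT_VOICE_ID");
  return { apiKey, voiceId, missing };
}
```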
Example Request
GET /api/tts?text=Hello%20world&voice_id=21m00Tcm4TlvDq8ikWAM
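A request like the one above can be assembled programmatically. In the sketch below, the helper `buildTtsUrl` and the `TtsParams` and `VoiceSettings` interfaces are hypothetical names; the parameter names come from the query parameter table, and voice_settings is serialized to a JSON string as that table describes.

```typescript
// Field names mirror the voice settings example in this document.
interface VoiceSettings {
  stability?: number;
  similarity_boost?: number;
  style?: number;
  use_speaker_boost?: boolean;
}

interface TtsParams {
  text: string;
  voice_id?: string;
  model_id?: string;
  voice_settings?: VoiceSettings;
}

// Hypothetical helper: builds the TTS request URL, skipping unset optional parameters.
function buildTtsUrl(baseUrl: string, params: TtsParams): string {
  const pairs: Array<[string, string]> = [["text", params.text]];
  if (params.voice_id) pairs.push(["voice_id", params.voice_id]);
  if (params.model_id) pairs.push(["model_id", params.model_id]);
  if (params.voice_settings) {
    // The endpoint expects voice_settings as a JSON string.
    pairs.push(["voice_settings", JSON.stringify(params.voice_settings)]);
  }
  const query = pairs
    .map(([key, value]) => `${key}=${encodeURIComponent(value)}`)
    .join("&");
  return `${baseUrl}/api/tts?${query}`;
}
```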
ElevenLabs Integration
API endpoint: https://api.elevenlabs.io/v1/text-to-speech/{voice_id}/stream
Requires xi-api-key header for authentication
Supports streaming response for low latency
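For orientation, the upstream call implied by the points above might be described as follows. The function `buildElevenLabsRequest` and the `UpstreamRequest` shape are hypothetical. The POST method and the JSON body fields (`text`, `model_id`) are assumptions based on ElevenLabs' public API rather than on this document, which only gives the URL and the xi-api-key header.

```typescript
// Hypothetical description of the request the proxy would send upstream.
interface UpstreamRequest {
  url: string;
  method: "POST"; // assumption: not stated in this document
  headers: Record<string, string>;
  body: string;
}

function buildElevenLabsRequest(
  apiKey: string,
  voiceId: string,
  text: string,
  modelId: string = "eleven_flash_v2_5", // documented default model
): UpstreamRequest {
  return {
    url: `https://api.elevenlabs.io/v1/text-to-speech/${encodeURIComponent(voiceId)}/stream`,
    method: "POST",
    headers: {
      "xi-api-key": apiKey, // authentication header named in this document
      "Content-Type": "application/json",
    },
    // Assumed body field names; verify against the ElevenLabs API reference.
    body: JSON.stringify({ text, model_id: modelId }),
  };
}
```

Returning a plain description instead of calling fetch keeps the sketch free of network access; the real handler streams the upstream response back to the client.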
Error Codes
| Code | Description |
| --- | --- |
| 400 | Invalid parameters or voice settings |
| 401 | Authentication required or invalid API key |
| 403 | Unauthorized package name (audio endpoint only) |
| 404 | Session not found or no audio available |
| 500 | Internal server error or TTS service not configured |
Notes
The audio endpoint is restricted to the Shazam app for music recognition
The TTS endpoint is publicly accessible but requires ElevenLabs configuration
Audio is buffered in and retrieved from AudioManager
TTS responses are streamed for low latency
Both route definitions include the /api prefix even though the router is already mounted at /api; they should be /audio/:userId and /tts
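The effect of the duplicated prefix can be seen with a simplified model of route mounting. This assumes an Express-style router in which the mount path is prepended to each route path; the function `effectivePath` is a hypothetical illustration, not code from the repository.

```typescript
// Hypothetical model: the path a client must request is mount path + route path.
function effectivePath(mountPath: string, routePath: string): string {
  const mount = mountPath.replace(/\/+$/, ""); // drop trailing slashes
  const route = routePath.charAt(0) === "/" ? routePath : `/${routePath}`;
  return mount + route;
}

// Route as currently defined: the prefix is applied twice.
const current = effectivePath("/api", "/api/audio/:userId");
// Route as it should be defined: the prefix is applied once.
const corrected = effectivePath("/api", "/audio/:userId");
```

Under this model `current` is `/api/api/audio/:userId` and `corrected` is `/api/audio/:userId`, which matches the documented URL.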