Get Recent Audio

Retrieve the last 10 seconds of audio data from a user session.
This endpoint is restricted to the com.augmentos.shazam package only.

Endpoint

GET https://api.mentra.glass/api/audio/:userId
Note: the route is defined in code as /api/audio/:userId (line 47). Because the router is already mounted at /api, the definition should be just /audio/:userId.

Parameters

Parameter  Type    Description
userId     string  Target user ID (in URL)

Query Parameters

Parameter    Type    Required  Description
apiKey       string  Yes       App API key
packageName  string  Yes       Must be com.augmentos.shazam
userId       string  Yes       Target user ID (same as URL parameter)
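
As a minimal sketch of how a client might assemble this request, the helper below builds the full URL with the user ID repeated in both the path and the query string. The function name buildRecentAudioUrl and the constants are hypothetical and not part of the API.

```typescript
// Hypothetical helper: builds the request URL for the recent-audio endpoint.
// The base URL and package name come from this page; the function itself is illustrative.
const BASE_URL = "https://api.mentra.glass";
const SHAZAM_PACKAGE = "com.augmentos.shazam";

function buildRecentAudioUrl(userId: string, apiKey: string): string {
  // The endpoint expects userId in the path AND in the query string.
  const params: Array<[string, string]> = [
    ["apiKey", apiKey],
    ["packageName", SHAZAM_PACKAGE],
    ["userId", userId],
  ];
  const query = params
    .map(([key, value]) => `${key}=${encodeURIComponent(value)}`)
    .join("&");
  return `${BASE_URL}/api/audio/${encodeURIComponent(userId)}?${query}`;
}
```

The resulting string can be passed to any HTTP client; the response body should be read as binary data rather than text.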

Response

Success (200):
  • Binary audio data stream
  • Content-Type: application/octet-stream
  • Format: PCM audio buffer (concatenated audio chunks)
Error (401):
{
  "success": false,
  "message": "Invalid API key." // or "Authentication required. Provide apiKey, packageName, and userId."
}
Error (403):
{
  "success": false,
  "message": "Unauthorized package name"
}
Error (404):
{
  "error": "Session not found" // or "No audio available", "No decodable audio available"
}
Error (500):
{
  "error": "Error fetching audio"
}
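
Note that the error bodies above use two different shapes: 401 and 403 carry a message field, while 404 and 500 carry an error field. A client-side sketch that normalizes both shapes might look like the following; the type and function names are hypothetical.

```typescript
// Error bodies as documented above:
//   401 / 403 -> { success: false, message: "..." }
//   404 / 500 -> { error: "..." }
type AudioErrorBody = {
  success?: boolean;
  message?: string;
  error?: string;
};

// Hypothetical helper: produces one human-readable string for either shape.
function describeAudioError(status: number, body: AudioErrorBody): string {
  const detail = body.message ?? body.error ?? "Unknown error";
  return `HTTP ${status}: ${detail}`;
}
```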

Implementation

  • File: packages/cloud/src/routes/audio.routes.ts:47-91
  • Middleware: shazamAuthMiddleware - Validates package and API key
  • Service: Uses AudioManager.getRecentAudioBuffer()

Authorization

  • Only com.augmentos.shazam package is allowed
  • Requires valid API key for the package
  • Must specify target user ID in both URL and query parameters

Audio Processing

  • Returns buffered audio from userSession.audioManager.getRecentAudioBuffer()
  • Audio chunks are concatenated into a single buffer
  • LC3 codec support is currently commented out but planned for a future release
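
The concatenation step can be illustrated with a small sketch. This is not the actual AudioManager code, which is not shown on this page; concatChunks is a hypothetical stand-in that operates on plain Uint8Array chunks.

```typescript
// Hypothetical illustration of joining buffered audio chunks into one buffer.
function concatChunks(chunks: Uint8Array[]): Uint8Array {
  // First pass: compute the total size so the output is allocated once.
  const total = chunks.reduce((sum, chunk) => sum + chunk.length, 0);
  const out = new Uint8Array(total);

  // Second pass: copy each chunk at its running offset, preserving order.
  let offset = 0;
  for (const chunk of chunks) {
    out.set(chunk, offset);
    offset += chunk.length;
  }
  return out;
}
```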

Text-to-Speech

Convert text to speech using ElevenLabs API.

Endpoint

GET https://api.mentra.glass/api/tts
Note: the route is defined in code as /api/tts (line 94). Because the router is already mounted at /api, the definition should be just /tts.

Query Parameters

Parameter       Type         Required  Description
text            string       Yes       Text to convert to speech
voice_id        string       No        ElevenLabs voice ID (uses default if not provided)
model_id        string       No        TTS model (defaults to eleven_flash_v2_5)
voice_settings  JSON string  No        Voice customization settings

Response

Success (200):
  • Audio stream
  • Content-Type: audio/mpeg
  • Streaming MP3 audio data
  • Connection: keep-alive
Error (400):
{
  "success": false,
  "message": "Text parameter is required and must be a string" // or other validation errors
}
Error (500):
{
  "success": false,
  "message": "TTS service not configured" // or "Internal server error"
}

Voice Settings Example

{
  "stability": 0.5,
  "similarity_boost": 0.5,
  "style": 0.5,
  "use_speaker_boost": true
}
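
Because voice_settings is passed as a JSON string in the query, the object has to be serialized and URL-encoded by the caller. The sketch below shows one way to do that; buildTtsUrl and the TtsOptions interface are hypothetical names, and only the query parameter names come from this page.

```typescript
interface VoiceSettings {
  stability?: number;
  similarity_boost?: number;
  style?: number;
  use_speaker_boost?: boolean;
}

// Hypothetical options bag; maps onto the documented query parameters.
interface TtsOptions {
  voiceId?: string;
  modelId?: string;
  voiceSettings?: VoiceSettings;
}

function buildTtsUrl(text: string, options: TtsOptions = {}): string {
  const params: Array<[string, string]> = [["text", text]];
  if (options.voiceId) params.push(["voice_id", options.voiceId]);
  if (options.modelId) params.push(["model_id", options.modelId]);
  if (options.voiceSettings) {
    // voice_settings travels as a JSON string, so serialize before encoding.
    params.push(["voice_settings", JSON.stringify(options.voiceSettings)]);
  }
  const query = params
    .map(([key, value]) => `${key}=${encodeURIComponent(value)}`)
    .join("&");
  return `https://api.mentra.glass/api/tts?${query}`;
}
```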

Implementation

  • File: packages/cloud/src/routes/audio.routes.ts:94-223
  • Service: Proxies to ElevenLabs API
  • Streaming: Streams response directly to client using fetch API

Configuration

Requires environment variables:
  • ELEVENLABS_API_KEY: Your ElevenLabs API key
  • ELEVENLABS_DEFAULT_VOICE_ID: Default voice to use (optional if voice_id provided)
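
A rough sketch of how these variables might be resolved per request is shown below. It assumes that a missing API key, or a missing voice ID with no default, is treated as "not configured"; the actual route may distinguish these cases differently. The names Env, TtsConfig, and resolveTtsConfig are hypothetical.

```typescript
// Environment is passed in explicitly (e.g. process.env) to keep the sketch testable.
type Env = Record<string, string | undefined>;

interface TtsConfig {
  apiKey: string;
  voiceId: string;
}

// Returns null when the service cannot be used. Assumption: the route would
// answer such a case with a 500 "TTS service not configured".
function resolveTtsConfig(env: Env, requestedVoiceId?: string): TtsConfig | null {
  const apiKey = env["ELEVENLABS_API_KEY"];
  // A voice_id from the request takes priority over the configured default.
  const voiceId = requestedVoiceId || env["ELEVENLABS_DEFAULT_VOICE_ID"];
  if (!apiKey || !voiceId) return null;
  return { apiKey, voiceId };
}
```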

Example Request

GET /api/tts?text=Hello%20world&voice_id=21m00Tcm4TlvDq8ikWAM

ElevenLabs Integration

  • API endpoint: https://api.elevenlabs.io/v1/text-to-speech/{voice_id}/stream
  • Requires xi-api-key header for authentication
  • Supports streaming response for low latency
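
The upstream call can be sketched as a plain request descriptor. This follows the public ElevenLabs API shape (POST with a JSON body and an xi-api-key header); how audio.routes.ts actually assembles the request is not shown here, and buildElevenLabsRequest is a hypothetical name.

```typescript
interface ElevenLabsRequest {
  url: string;
  method: "POST";
  headers: Record<string, string>;
  body: string;
}

// Hypothetical helper: describes the upstream request without sending it.
function buildElevenLabsRequest(
  apiKey: string,
  voiceId: string,
  text: string,
  modelId: string = "eleven_flash_v2_5" // default model named on this page
): ElevenLabsRequest {
  return {
    url: `https://api.elevenlabs.io/v1/text-to-speech/${encodeURIComponent(voiceId)}/stream`,
    method: "POST",
    headers: {
      "xi-api-key": apiKey,
      "Content-Type": "application/json",
      Accept: "audio/mpeg",
    },
    body: JSON.stringify({ text, model_id: modelId }),
  };
}
```

The descriptor can be handed to fetch as fetch(req.url, req), and the response body piped to the client as it arrives.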

Error Codes

Code  Description
400   Invalid parameters or voice settings
401   Authentication required or invalid API key
403   Unauthorized package name (audio endpoint only)
404   Session not found or no audio available
500   Internal server error or TTS service not configured

Notes

  • Audio endpoint is restricted to Shazam app for music recognition
  • TTS endpoint is publicly accessible but requires ElevenLabs configuration
  • Audio is buffered and retrieved from AudioManager
  • TTS responses are streamed for low latency
  • Both route definitions include the /api prefix even though the router is already mounted at /api; they should be defined as /audio/:userId and /tts