Audio

Get Recent Audio

Retrieve the last 10 seconds of audio data from a user session.

This endpoint is restricted to the com.augmentos.shazam package.

Endpoint

GET https://api.mentra.glass/api/audio/:userId

Note: the route is currently defined in code as /api/audio/:userId (line 47). Because the router is already mounted at /api, the definition should be just /audio/:userId.

Parameters

Parameter	Type	Description
`userId`	string	Target user ID (in URL)

Query Parameters

Parameter	Type	Required	Description
`apiKey`	string	Yes	App API key
`packageName`	string	Yes	Must be `com.augmentos.shazam`
`userId`	string	Yes	Target user ID (same as URL parameter)
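The path and query parameters above can be combined as in the following minimal sketch. The helper name `buildRecentAudioUrl` and the two constants are hypothetical and used only for illustration; the sketch only assembles the URL and does not perform the request.

```typescript
// Hypothetical constants for illustration; the values come from this page.
const AUDIO_BASE_URL = "https://api.mentra.glass/api/audio";
const SHAZAM_PACKAGE = "com.augmentos.shazam";

// Hypothetical helper: builds the request URL for Get Recent Audio.
function buildRecentAudioUrl(userId: string, apiKey: string): string {
  // userId is sent twice: once in the path and once in the query string.
  const pairs: Array<[string, string]> = [
    ["apiKey", apiKey],
    ["packageName", SHAZAM_PACKAGE],
    ["userId", userId],
  ];
  const query = pairs
    .map(([key, value]) => `${key}=${encodeURIComponent(value)}`)
    .join("&");
  return `${AUDIO_BASE_URL}/${encodeURIComponent(userId)}?${query}`;
}
```

The resulting string can be passed to any HTTP client. A 200 response should be read as raw bytes, for example with `response.arrayBuffer()` when using `fetch`, rather than parsed as JSON.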

Response

Success (200):

Binary audio data stream
Content-Type: application/octet-stream
Format: PCM audio buffer (concatenated audio chunks)

Error (401):

{
  "success": false,
  "message": "Invalid API key."
}

The message may also be "Authentication required. Provide apiKey, packageName, and userId."

Error (403):

{
  "success": false,
  "message": "Unauthorized package name"
}

Error (404):

{
  "error": "Session not found"
}

The error value may also be "No audio available" or "No decodable audio available".

Error (500):

{
  "error": "Error fetching audio"
}
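The error bodies above use two different shapes: 401 and 403 responses carry `success` and `message`, while 404 and 500 responses carry `error`. A client may want to normalize them. The following is a minimal sketch; the `ErrorBody` type and `extractErrorMessage` helper are hypothetical names, and the "Unknown error" fallback is an assumption for bodies that match neither shape.

```typescript
// Covers both documented shapes: { success, message } and { error }.
type ErrorBody = { success?: boolean; message?: string; error?: string };

// Hypothetical helper: returns one human-readable string for any error body.
function extractErrorMessage(status: number, body: ErrorBody): string {
  // Prefer "message" (401/403), then "error" (404/500), then a fallback.
  const detail = body.message ?? body.error ?? "Unknown error";
  return `HTTP ${status}: ${detail}`;
}
```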

Implementation

File: packages/cloud/src/routes/audio.routes.ts:47-91
Middleware: shazamAuthMiddleware - Validates package and API key
Service: Uses AudioManager.getRecentAudioBuffer()

Authorization

Only the com.augmentos.shazam package is allowed
Requires valid API key for the package
Must specify target user ID in both URL and query parameters
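The authorization rules above can be sketched as a pure function. This is not the actual `shazamAuthMiddleware`: the function name `checkShazamAuth`, the `isValidKey` callback, and the order of the checks are assumptions made for illustration. Only the status codes and messages are taken from the error responses documented on this page.

```typescript
interface AuthQuery {
  apiKey?: string;
  packageName?: string;
  userId?: string;
}

interface AuthResult {
  status: number;
  message?: string;
}

const ALLOWED_PACKAGE = "com.augmentos.shazam";

// isValidKey is a hypothetical stand-in for the real API key lookup.
function checkShazamAuth(
  query: AuthQuery,
  isValidKey: (packageName: string, apiKey: string) => boolean
): AuthResult {
  const { apiKey, packageName, userId } = query;
  // All three query parameters are required.
  if (!apiKey || !packageName || !userId) {
    return {
      status: 401,
      message: "Authentication required. Provide apiKey, packageName, and userId.",
    };
  }
  // Only the Shazam package may call this endpoint.
  if (packageName !== ALLOWED_PACKAGE) {
    return { status: 403, message: "Unauthorized package name" };
  }
  // The key must be valid for that package.
  if (!isValidKey(packageName, apiKey)) {
    return { status: 401, message: "Invalid API key." };
  }
  return { status: 200 };
}
```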

Audio Processing

Returns buffered audio from userSession.audioManager.getRecentAudioBuffer()
Audio chunks are concatenated into a single buffer
LC3 codec support is currently commented out in the code but planned for the future
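Concatenating audio chunks into a single buffer can be sketched as follows. This illustrates the general technique only; it is not the body of `getRecentAudioBuffer()`, and the function name `concatAudioChunks` and the use of `Uint8Array` for chunks are assumptions.

```typescript
// Hypothetical helper: joins PCM chunks end to end, preserving their order.
function concatAudioChunks(chunks: Uint8Array[]): Uint8Array {
  // Size the output once so each chunk is copied exactly one time.
  const totalLength = chunks.reduce((sum, chunk) => sum + chunk.length, 0);
  const combined = new Uint8Array(totalLength);
  let offset = 0;
  for (const chunk of chunks) {
    combined.set(chunk, offset);
    offset += chunk.length;
  }
  return combined;
}
```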

Text-to-Speech

Convert text to speech using the ElevenLabs API.

Endpoint

GET https://api.mentra.glass/api/tts

Note: the route is currently defined in code as /api/tts (line 94). Because the router is already mounted at /api, the definition should be just /tts.

Query Parameters

Parameter	Type	Required	Description
`text`	string	Yes	Text to convert to speech
`voice_id`	string	No	ElevenLabs voice ID (uses default if not provided)
`model_id`	string	No	TTS model (defaults to `eleven_flash_v2_5`)
`voice_settings`	JSON string	No	Voice customization settings

Response

Success (200):

Audio stream
Content-Type: audio/mpeg
Streaming MP3 audio data
Connection: keep-alive

Error (400):

{
  "success": false,
  "message": "Text parameter is required and must be a string"
}

Other validation failures return a different message.

Error (500):

{
  "success": false,
  "message": "TTS service not configured"
}

The message may also be "Internal server error".

Voice Settings Example

{
  "stability": 0.5,
  "similarity_boost": 0.5,
  "style": 0.5,
  "use_speaker_boost": true
}

Implementation

File: packages/cloud/src/routes/audio.routes.ts:94-223
Service: Proxies to ElevenLabs API
Streaming: Streams response directly to client using fetch API

Configuration

Requires environment variables:

ELEVENLABS_API_KEY: Your ElevenLabs API key
ELEVENLABS_DEFAULT_VOICE_ID: Default voice to use (optional if voice_id provided)
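The relationship between the two variables and the `voice_id` query parameter can be sketched as below. The function `resolveTtsConfig` is hypothetical. Mapping a missing API key to the "TTS service not configured" message is an assumption based on the documented 500 response, and the message for a missing voice is invented for illustration.

```typescript
type Env = Record<string, string | undefined>;

// Hypothetical helper: picks the API key and the voice to use for a request.
function resolveTtsConfig(
  env: Env,
  requestedVoiceId?: string
): { apiKey: string; voiceId: string } {
  const apiKey = env["ELEVENLABS_API_KEY"];
  if (!apiKey) {
    throw new Error("TTS service not configured");
  }
  // A voice_id from the request takes precedence over the configured default.
  const voiceId = requestedVoiceId || env["ELEVENLABS_DEFAULT_VOICE_ID"];
  if (!voiceId) {
    // Illustrative message only; the real route's wording is not documented here.
    throw new Error("No voice_id provided and no default voice configured");
  }
  return { apiKey, voiceId };
}
```

In a Node.js service, `process.env` could be passed as the `env` argument.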

Example Request

GET /api/tts?text=Hello%20world&voice_id=21m00Tcm4TlvDq8ikWAM
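A request like the one above, optionally with `voice_settings`, could be assembled as in this minimal sketch. The names `buildTtsUrl`, `TtsOptions`, and `VoiceSettings` are hypothetical. The field names follow the query parameter table and the voice settings example on this page.

```typescript
interface VoiceSettings {
  stability?: number;
  similarity_boost?: number;
  style?: number;
  use_speaker_boost?: boolean;
}

interface TtsOptions {
  voice_id?: string;
  model_id?: string;
  voice_settings?: VoiceSettings;
}

const TTS_URL = "https://api.mentra.glass/api/tts";

// Hypothetical helper: builds the GET URL for the Text-to-Speech endpoint.
function buildTtsUrl(text: string, options: TtsOptions = {}): string {
  const params: Array<[string, string]> = [["text", text]];
  if (options.voice_id) params.push(["voice_id", options.voice_id]);
  if (options.model_id) params.push(["model_id", options.model_id]);
  if (options.voice_settings) {
    // voice_settings is sent as a JSON string, so serialize before encoding.
    params.push(["voice_settings", JSON.stringify(options.voice_settings)]);
  }
  const query = params
    .map(([key, value]) => `${key}=${encodeURIComponent(value)}`)
    .join("&");
  return `${TTS_URL}?${query}`;
}
```

`encodeURIComponent` is used instead of `URLSearchParams` so that spaces are encoded as `%20`, matching the example request.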

ElevenLabs Integration

API endpoint: https://api.elevenlabs.io/v1/text-to-speech/{voice_id}/stream
Requires xi-api-key header for authentication
Supports streaming response for low latency
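The upstream call described above can be sketched as a plain request descriptor. This is not the route's actual code: `buildElevenLabsRequest` and `UpstreamRequest` are hypothetical names, and the POST method and JSON body reflect the public ElevenLabs streaming API as generally documented rather than anything shown on this page.

```typescript
interface UpstreamRequest {
  url: string;
  method: "POST";
  headers: Record<string, string>;
  body: string;
}

// Hypothetical helper: describes the request a proxy would send to ElevenLabs.
function buildElevenLabsRequest(
  apiKey: string,
  voiceId: string,
  text: string,
  modelId: string = "eleven_flash_v2_5" // default model named on this page
): UpstreamRequest {
  return {
    url: `https://api.elevenlabs.io/v1/text-to-speech/${encodeURIComponent(voiceId)}/stream`,
    method: "POST",
    headers: {
      "xi-api-key": apiKey, // authentication header required by ElevenLabs
      "Content-Type": "application/json",
    },
    body: JSON.stringify({ text, model_id: modelId }),
  };
}
```

The descriptor can be passed to `fetch(request.url, request)`. The response body can then be piped to the client as it arrives, which is what keeps latency low.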

Error Codes

Code	Description
400	Invalid parameters or voice settings
401	Authentication required or invalid API key
403	Unauthorized package name (audio endpoint only)
404	Session not found or no audio available
500	Internal server error or TTS service not configured

Notes

Audio endpoint is restricted to Shazam app for music recognition
TTS endpoint is publicly accessible but requires ElevenLabs configuration
Audio is buffered and retrieved from AudioManager
TTS responses are streamed for low latency
Both route definitions currently include the /api prefix, which they should not, because the router is already mounted at /api
