Hackathon Showcase

Sambanova Voice AI Productivity System

Enterprise-grade Sambanova Voice AI Productivity System: LangGraph agents, voice/call-center integration, team collaboration, Redis audio stream and composio; hackathon-ready.

1 member

Project Description

🎯 Project Overview

Project Name: Sambanova Voice AI Productivity System
Theme: AI-Powered Productivity Assistant
Duration: October 25th 2025 Hackathon at Composio with Redis
Team: Sambanova (HJ Lee)

📋 Project Description

The Sambanova Voice AI Productivity System is an advanced AI-powered voice assistant that integrates Redis for session management and caching, Composio for external tool orchestration, and real-time audio processing to provide seamless productivity management through natural voice interactions. The system features WebRTC voice communication, Redis-based session persistence, Composio tool integrations (Slack, GitHub, Gmail, Notion, Jira), and comprehensive audio stream processing with WebM format support for enterprise-grade voice AI applications.

🚀 Demo Instructions & Expected Outputs

Demo Setup (5 minutes)

Access the System: Navigate to https://hjlees.com
Launch WebRTC Voice Assistant: Click “🌐 WebRTC Voice Assistant” → “Launch Voice →”
Authentication: Enter PIN 1234 when prompted
Voice Interaction: Speak naturally to the AI assistant

Demo Scenarios & Expected Outputs

Scenario 1: Todo Management

Input: “Create a todo for grocery shopping with high priority”
Expected Output:
- AI confirms: “I’ve created a high-priority todo for grocery shopping”
- Redis stores session data with audio buffer
- Database records the new todo item
- TTS plays confirmation audio

Scenario 2: External Tool Integration

Input: “Send a message to my team on Slack about the project update”
Expected Output:
- AI processes request through Composio integration
- Slack message sent via API
- Confirmation: “Message sent to your team on Slack”
- Redis logs the activity

Scenario 3: Audio Stream Analysis

Input: “Show me my recent audio sessions”
Expected Output:
- Audio Stream Player dashboard displays active sessions
- WebM audio files available for download
- Real-time session monitoring with Redis data

🏆 Judging Criteria Satisfaction

1. Running Code

Live Demo: Fully functional system deployed at https://hjlees.com
End-to-End Flow: Complete voice-to-action pipeline working
Error Handling: Robust error recovery and timeout management
Performance: Sub-2 second response times for voice processing
Monitoring: Sentry integration for real-time error tracking

2. Use of the Stack (Redis/Composio)

Redis Implementation:

Session Management: User sessions stored with TTL expiration
Audio Buffer Storage: Base64-encoded audio data persistence
Pub/Sub Notifications: Real-time team updates and notifications
Rate Limiting: Request throttling (60 requests/minute per user)
Caching: User data and activity tracking with cache invalidation
Analytics: User behavior tracking and performance metrics

Composio Integration:

Slack: Message sending, channel management, team notifications
OAuth2 Authentication: Secure platform connections
Robust Method Discovery: Fallback handling for API compatibility

3. Innovation & Creativity

WebRTC Voice Processing: Real-time audio capture and processing
Multi-Format Audio Support: WebM, WAV, MP3 format detection and conversion
Redis-Powered Session Management: Scalable session handling with audio persistence
Composio Tool Orchestration: Seamless external platform integration
Audio Stream Player: Real-time audio debugging and analysis dashboard
LangGraph Agent Architecture: Advanced AI agent with 38+ tools
Voice-to-Action Pipeline: Natural language to executable commands

4. Real-world Impact

Enterprise Productivity: Voice-driven task management for busy professionals
Contact Center Integration: FreePBX call transfer capabilities
Team Collaboration: Slack/GitHub/Gmail integration for distributed teams
Accessibility: Voice-first interface for hands-free productivity
Scalability: Redis-based architecture supports multiple concurrent users
Monitoring: Production-grade error tracking and performance analytics

5. Theme Alignment

AI-Powered: Advanced LangGraph agent with 38+ productivity tools
Productivity Focus: Todo management, calendar integration, team collaboration
Voice Interface: Natural language processing for intuitive interactions
Integration Ecosystem: Seamless connection to popular productivity platforms
Real-time Processing: Instant voice-to-action conversion

🛠️ Technology Stack & Implementation Details

Core Technologies

Backend: Flask, Python 3.12, Gunicorn
AI/ML: LangGraph, LangChain, OpenAI GPT-4, Whisper STT, TTS
Database: PostgreSQL with SQLAlchemy ORM
Caching: Redis 7.x with session management
Voice: WebRTC, JsSIP, MediaRecorder API
External APIs: Composio, Slack, GitHub, Gmail, Notion, Jira
Monitoring: Sentry, Flask-SocketIO
Deployment: Render.com with ASGI support

Redis Implementation Details

# Session Management  
session_data = {  
    'user_id': 'user-123',  
    'audio_buffer': 'base64_encoded_audio',  
    'created_at': timestamp,  
    'expires_at': ttl,  
    'activity_log': []  
}  
  
# Pub/Sub Notifications  
redis_client.publish('team_updates', {  
    'user_id': user_id,  
    'action': 'todo_created',  
    'timestamp': now()  
})  

Composio Integration Details

# Tool Loading with Fallback  
if hasattr(toolset, 'get_tools'):  
    tools = toolset.get_tools(apps=["slack"])  
elif hasattr(toolset, 'get_actions'):  
    tools = toolset.get_actions(apps=["slack"])  
  
# OAuth2 Authentication  
composio_client = ComposioToolSet(api_key=COMPOSIO_API_KEY)  
slack_tools = composio_client.get_tools(apps=["slack"])  

Performance Metrics

Voice Processing Latency: 1.2-2.5 seconds end-to-end
Redis Response Time: <50ms for session operations
Composio API Calls: 200-500ms per external tool execution
Audio Buffer Processing: Real-time with WebM format support
Concurrent Users: Supports 100+ simultaneous sessions
Memory Usage: Optimized for 512MB deployment constraints

Multi-Agent Behavior

LangGraph Agent: 38+ tools including MCP database operations
Tool Orchestration: Automatic tool selection based on user intent
Error Recovery: Thread reset and fallback mechanisms
Context Awareness: Conversation history and user state management
Parallel Processing: Async tool execution with timeout handling

Integrations & APIs

OpenAI: GPT-4 for natural language understanding
Whisper: Speech-to-text conversion
TTS: Text-to-speech with Nova voice
Google Calendar: OAuth2 integration for calendar sync
FreePBX: SIP call transfer capabilities
Composio: External platform orchestration
Sentry: Error tracking and performance monitoring

📊 Performance & Scalability

System Performance

Response Time: 1.2-2.5 seconds for voice-to-action
Throughput: 100+ concurrent voice sessions
Memory: Optimized for 512MB deployment
Storage: Redis-based session persistence
Monitoring: Real-time error tracking with Sentry

Redis Performance

Session Operations: <50ms average response time
Audio Buffer Storage: Efficient base64 encoding
Pub/Sub Latency: <10ms for real-time notifications
Cache Hit Rate: 95%+ for user data retrieval
Memory Usage: Optimized TTL and cleanup strategies

Composio Integration Performance

Tool Loading: 200-500ms per external API call
Authentication: OAuth2 with secure token management
Error Handling: Robust fallback for API failures
Rate Limiting: Respects external API limits
Caching: Intelligent result caching for repeated operations

Team

Hyeokjin Lee

Products & Tools

#### **Integrations & APIs** - **FreePBX:** SIP call transfer capabilities - **Google Calendar:** OAuth2 integration for calendar sync - **OpenAI:** GPT-4 for natural language understanding - **Sentry:** Error tracking and performance monitoring - **TTS:** Text-to-speech with Nova voice - **Whisper:** Speech-to-text conversion Composio Redis