Hackathon Showcase 3rd Place Winner

WarriorsPersonalAssistant

Team consisting of a Cornell aerospace PhD, a Stanford researcher, and startup founding engineers skilled in edge AI, robotics, full-stack, and on-device voice agents.

4 members

Project Description

Our hybrid routing system handles function calling by running FunctionGemma on-device and falling back to cloud Gemini when a query needs it. It combines adaptive traffic shifting, query normalization, and deterministic argument extraction. The agentic workflow is local-first: it prefers on-device reasoning through FunctionGemma and Cactus Compute for speed and privacy. Our model persistence optimization eliminates redundant initialization overhead, while the adaptive traffic shifter learns routing patterns per query category through exponential moving averages. Query rewriting normalizes indirect phrasing before processing, and hybrid argument extraction separates tool selection (FunctionGemma) from argument parsing (regex), which supports both accuracy and efficiency. Complex queries are escalated dynamically to cloud processing, while simple requests stay on-device at a high ratio.
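As an illustration of the per-category exponential moving average described above, here is a minimal sketch. It is not the team's implementation: the class name `TrafficShifter`, its parameters (`alpha`, `initial_rate`, `window`, `floor`), and the "local"/"cloud" labels are all assumptions made for this example.

```python
import random
from collections import defaultdict, deque


class TrafficShifter:
    """Hypothetical router: tracks on-device success per query category."""

    def __init__(self, alpha=0.3, initial_rate=0.5, window=20, floor=0.05, seed=None):
        self.alpha = alpha    # weight given to the newest outcome
        self.floor = floor    # minimum chance of trying the local model
        # EMA of local success per category, starting from a neutral prior.
        self.success_ema = defaultdict(lambda: initial_rate)
        # Sliding window of recent outcomes, kept only for inspection.
        self.history = defaultdict(lambda: deque(maxlen=window))
        self._rng = random.Random(seed)

    def record(self, category, local_succeeded):
        """Fold one local outcome into the category's moving average."""
        outcome = 1.0 if local_succeeded else 0.0
        previous = self.success_ema[category]
        self.success_ema[category] = self.alpha * outcome + (1 - self.alpha) * previous
        self.history[category].append(local_succeeded)

    def choose(self, category):
        """Pick "local" with probability equal to the (floored) success EMA."""
        p_local = max(self.floor, self.success_ema[category])
        return "local" if self._rng.random() < p_local else "cloud"


shifter = TrafficShifter(alpha=0.5, seed=0)
shifter.record("alarm", True)    # EMA moves from 0.5 to 0.75
shifter.record("alarm", False)   # EMA moves from 0.75 to 0.375
route = shifter.choose("alarm")  # "local" or "cloud", drawn at random
```

The floor keeps a small share of traffic on-device even for categories that have failed recently, so the average can recover if local accuracy improves. Whether the actual system does this is not stated in the description.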

Prior Work

This project was developed entirely during the hackathon period. The core hybrid routing architecture, adaptive traffic shifter, model persistence optimization, and regex argument extraction systems were all designed and implemented from scratch. We used existing models and APIs (FunctionGemma via Cactus Compute and Gemini 2.5 Flash), but the routing logic, multi-phase execution strategy, and agentic workflow coordination are our own contributions to local-first function calling.

Team

Products & Tools

Cactus Compute, Google DeepMind

Core Technologies:
FunctionGemma (270M-it model via Cactus Compute) - On-device function calling and tool selection
Gemini 2.5 Flash API - Cloud fallback for complex multi-tool coordination
Cactus Compute - Local inference framework enabling model persistence and optimization
Python 3 - Primary development language

Key Libraries & Frameworks:
Google GenAI SDK - Gemini API integration and structured tool definitions
Re (Regular Expressions) - Deterministic argument extraction and query pattern matching
Collections (defaultdict, deque) - Traffic shifter outcome tracking and sliding window analysis
JSON - Tool definition parsing and response formatting
Time - Performance measurement and optimization tracking
Random - Probabilistic routing decisions in traffic shifter

Architecture Components:
Adaptive Traffic Shifter - Machine learning component that dynamically learns optimal routing strategies
Query Rewriter - Natural language processing for indirect phrasing normalization
Regex Argument Reconstructor - Type-driven parameter extraction system
Multi-Phase Execution Engine - Orchestrates local processing, validation, and cloud fallback

These technologies enable dynamic escalation between edge and cloud processing, supporting agentic utility through intelligent local-first decision making while maintaining production-ready reliability and cross-domain generalization.
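To make the Query Rewriter and Regex Argument Reconstructor roles concrete, the sketch below shows one way indirect phrasing could be normalized and arguments pulled out by declared type once a tool has been selected. Everything here is assumed for illustration: the rewrite rules, the function names `rewrite_query` and `extract_arguments`, and the simplified parameter schema (a name mapped to "integer", "number", or "string"). Tool selection by FunctionGemma is deliberately left out, since the description does not show how the model is called.

```python
import re

# Hypothetical rewrite rules that strip polite or indirect lead-ins.
REWRITE_RULES = [
    (re.compile(r"^(?:could|can|would) you (?:please )?", re.IGNORECASE), ""),
    (re.compile(r"^i(?: would|'d) like (?:you )?to ", re.IGNORECASE), ""),
    (re.compile(r"^please ", re.IGNORECASE), ""),
]

NUMBER = re.compile(r"-?\d+(?:\.\d+)?")
QUOTED = re.compile(r'"([^"]+)"')


def rewrite_query(query):
    """Turn an indirect request into a direct command string."""
    text = query.strip()
    for pattern, replacement in REWRITE_RULES:
        text = pattern.sub(replacement, text)
    return text.rstrip("?.! ")


def extract_arguments(query, parameters):
    """Fill parameters in declaration order, using each one's declared type.

    Numeric parameters consume numbers as they appear in the query; string
    parameters consume double-quoted spans. Parameters with no match are
    omitted from the result.
    """
    numbers = iter(NUMBER.findall(query))
    quoted = iter(QUOTED.findall(query))
    args = {}
    for name, kind in parameters.items():
        if kind in ("integer", "number"):
            raw = next(numbers, None)
            if raw is not None:
                # Integers are truncated from the parsed float (assumption).
                args[name] = int(float(raw)) if kind == "integer" else float(raw)
        elif kind == "string":
            value = next(quoted, None)
            if value is not None:
                args[name] = value
    return args


query = rewrite_query('Could you please set a timer for 10 minutes labeled "pasta"?')
args = extract_arguments(query, {"minutes": "integer", "label": "string"})
# query == 'set a timer for 10 minutes labeled "pasta"'
# args == {"minutes": 10, "label": "pasta"}
```

In a design like this, a parameter missing from the result could serve as a signal to escalate the query to the cloud model, though the description does not say which signals the actual system uses.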

Additional Links

https://github.com/arhrid/warriors_function_gemma
