Nudgy-Nudge (Team: Sahara)
Team consisting of UC Berkeley and IIT Madras grads from Amazon Frontier AI, Tesla, and NLR, specializing in VLM/VLA robotics, physics-based AI, and healthcare ML.
YouTube Video
Project Description
Nudgy-Nudge is a fully on-device, privacy-first, agentic habit coaching engine built as a React Native mobile application. It runs FunctionGemma (a 270M-parameter instruction-tuned LLM purpose-built for function calling) locally via the Cactus Compute runtime, giving it the ability to observe real-time device signals, reason about user behavior, and autonomously execute structured tool calls, all without any mandatory cloud dependency. When connectivity is available, an intelligent HybridRouter dynamically escalates complex prompts to Gemini 2.0 Flash Lite via REST, creating a seamless edge-cloud continuum where on-device inference is the default and the cloud is an optional accelerator, never a requirement.
On-Device Execution via Cactus Compute and FunctionGemma
The architecture fundamentally relies on two pillars:
- Cactus Compute (cactus-react-native SDK): the native inference runtime that loads the FunctionGemma GGUF model directly into device memory. The CactusRuntime class manages model lifecycle (lazy on-demand loading to avoid blocking the JS thread at startup), downloads the Q4_K_M quantized model (~170MB) from HuggingFace on first launch via ModelManager, and exposes three distinct inference paths:
  - PATH 1: Gemini REST (online, preferred when API key is present): direct calls to generativelanguage.googleapis.com using Gemini 2.0 Flash Lite with native function-calling support.
  - PATH 2: Cactus Hybrid (online, model loaded + Cactus token): uses the Cactus SDK’s hybrid mode, which can dynamically route between on-device and cloud relay.
  - PATH 3: Local FunctionGemma (offline or fallback): pure on-device inference via the Cactus SDK’s local mode, achieving >100 tokens/second on modern mobile hardware.
- FunctionGemma 270M-it: a tiny, purpose-built function-calling model from the Gemma family. The FunctionGemmaClient wraps CactusRuntime, injecting 16 tool schemas into every inference call. It validates returned tool calls against schema definitions (checking required parameters), measures latency and confidence, and feeds performance metrics back into the routing system.
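The required-parameter check described for FunctionGemmaClient could look roughly like the sketch below. The ToolSchema and ToolCall shapes, the validateToolCall function, and the send_nudge tool name are hypothetical illustrations, not the project's actual types.

```typescript
// Hypothetical shapes: the project's real schema and tool-call types may differ.
interface ToolSchema {
  name: string;
  required: string[]; // names of parameters the model must supply
}

interface ToolCall {
  name: string;
  arguments: Record<string, unknown>;
}

interface ValidationResult {
  valid: boolean;
  errors: string[];
}

// Reject calls to unknown tools and calls that omit a required parameter.
function validateToolCall(call: ToolCall, schemas: ToolSchema[]): ValidationResult {
  const schema = schemas.find((s) => s.name === call.name);
  if (!schema) {
    return { valid: false, errors: [`unknown tool: ${call.name}`] };
  }
  const errors = schema.required
    .filter((param) => call.arguments[param] === undefined || call.arguments[param] === null)
    .map((param) => `missing required parameter: ${param}`);
  return { valid: errors.length === 0, errors };
}

// Example with a made-up tool definition.
const exampleSchemas: ToolSchema[] = [
  { name: "send_nudge", required: ["habit_id", "message"] },
];
const exampleResult = validateToolCall(
  { name: "send_nudge", arguments: { habit_id: "gym" } },
  exampleSchemas,
);
console.log(exampleResult.valid, exampleResult.errors);
```

An invalid result like this is the kind of signal that could be counted as a failure for the path that produced it.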
Intelligent Routing: HybridRouter
The HybridRouter is the decision engine that determines whether each agent cycle runs locally or in the cloud. Its routing logic considers:
- Connectivity state: offline forces local (FunctionGemma via Cactus).
- Battery level: below 15% forces local to save radio power.
- Prompt complexity: scored 0.0-1.0 based on whether health data, calendar events, meetings, doom-scrolling, recovery habits, or milestones are present. Simple prompts (complexity < 0.4) stay local; complex prompts (> 0.7) escalate to Gemini.
- Adaptive latency tracking: rolling window of the last 10 latencies per path. If local inference is 30%+ faster than cloud, it wins.
- Failure counters: consecutive failures on either path shift traffic to the other.
This means the system is never hard-coded to one path. It continuously adapts based on real-world conditions, always preferring local-first reasoning for speed and privacy.
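As a minimal sketch, the routing rules above can be expressed as a pure function. The thresholds (15% battery, 0.4 and 0.7 complexity, 10-sample window, 30% latency margin) come from the description; the rule precedence, the failure limit of 3, the handling of the 0.4-0.7 band, and all names here are assumptions made for illustration.

```typescript
type Route = "local" | "cloud";

interface RouterInput {
  online: boolean;
  batteryLevel: number;       // 0.0-1.0
  complexity: number;         // 0.0-1.0
  localLatenciesMs: number[]; // most recent sample last
  cloudLatenciesMs: number[];
  localFailures: number;      // consecutive failures
  cloudFailures: number;
}

const LATENCY_WINDOW = 10;
const MAX_CONSECUTIVE_FAILURES = 3; // assumed value, not stated in the write-up

// Mean of the last LATENCY_WINDOW samples, or null when there is no data yet.
function rollingMean(samples: number[]): number | null {
  const recent = samples.slice(-LATENCY_WINDOW);
  if (recent.length === 0) return null;
  return recent.reduce((sum, x) => sum + x, 0) / recent.length;
}

function chooseRoute(input: RouterInput): Route {
  // Hard constraints: no network or low battery forces local inference.
  if (!input.online) return "local";
  if (input.batteryLevel < 0.15) return "local";

  // Repeated failures on one path shift traffic to the other.
  if (input.cloudFailures >= MAX_CONSECUTIVE_FAILURES) return "local";
  if (input.localFailures >= MAX_CONSECUTIVE_FAILURES) return "cloud";

  // Prompt complexity decides the clear-cut cases.
  if (input.complexity < 0.4) return "local";
  if (input.complexity > 0.7) return "cloud";

  // Assumed tie-break for the middle band: local wins when it is at least
  // 30% faster than cloud, or when no latency data is available yet.
  const local = rollingMean(input.localLatenciesMs);
  const cloud = rollingMean(input.cloudLatenciesMs);
  if (local === null || cloud === null) return "local";
  return local <= cloud * 0.7 ? "local" : "cloud";
}
```

Keeping the decision in a pure function like this makes each rule easy to test in isolation, independent of the signal providers.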
Agentic Workflow: The Agent Loop
The AgentLoop is the core agentic decision cycle, running autonomously via event-driven triggers (not fixed cron schedules):
- Trigger conditions: interval ticks (10-15 min with 20% jitter), post-calendar-event, prolonged doom-scrolling (>15 min), prolonged inactivity, habit completion, health milestones (every 5000 steps, new exercise sessions), or manual/demo triggers.
- Context aggregation: the ContextAggregator builds a unified ContextSnapshot from 8 local signal providers – TimeSignal, CalendarSignal, MotionSignal, ScreenTimeSignal, ScrollSignal (doom-scroll detection), HealthSignal (steps, sleep, heart rate, exercise sessions from Health Connect), ConnectivitySignal, and BatterySignal.
- Habit state recalculation: the HabitStateEngine recalculates friction (contextual), resistance (nudge outcome history), and momentum (deterministic weighted formula) for all habits. The MomentumCalculator uses a 5-factor formula: momentum = 0.25*streak + 0.30*recency + 0.25*success_rate - 0.10*friction - 0.10*resistance. The LaundryPredictor adds a specialized depletion algorithm that projects gym clothing inventory against upcoming workout days.
- Prompt construction: the PromptBuilder constructs a system prompt (persona, voice rules, 16 tool descriptions, Google account integration status) and a user prompt (current time, screen usage, scroll duration, calendar events, health metrics, motion state, battery, and all habit states with momentum tiers, streak milestones, recovery flags, and cooldown status).
- LLM inference: routed through HybridRouter to either FunctionGemma locally or Gemini in the cloud.
- Deterministic tool execution: the ToolExecutor dispatches the LLM’s chosen tool call to one of 16 typed handlers. The LLM decides what to do; the executor decides how, and does so deterministically.
- Persistence and UI emission: full cycle results (trigger, context, prompt, response, tool call, routing decision, latency) are persisted to AsyncStorage and emitted to UI subscribers via the Zustand store.
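Two of the steps above are simple enough to sketch directly: the 5-factor momentum formula and the jittered interval trigger. The weights and the 20% jitter figure are taken from the description. The sketch assumes every factor is already normalized to the 0.0-1.0 range, that the result is clamped to 0.0-1.0, and that jitter means plus or minus 20% around a base interval; the names HabitFactors, computeMomentum, and nextIntervalMs are hypothetical.

```typescript
// Assumption: every factor has been normalized to the 0.0-1.0 range upstream.
interface HabitFactors {
  streak: number;
  recency: number;
  successRate: number;
  friction: number;
  resistance: number;
}

// Weighted sum from the write-up; friction and resistance subtract from momentum.
function computeMomentum(f: HabitFactors): number {
  const raw =
    0.25 * f.streak +
    0.30 * f.recency +
    0.25 * f.successRate -
    0.10 * f.friction -
    0.10 * f.resistance;
  // Clamping is an assumption so that the score stays within 0.0-1.0.
  return Math.min(1, Math.max(0, raw));
}

// Next interval tick: a base delay in minutes with +/-20% jitter applied.
// `rand` is injectable so the function can be tested deterministically.
function nextIntervalMs(baseMinutes: number, rand: () => number = Math.random): number {
  const jitter = (rand() * 2 - 1) * 0.2; // uniform in [-0.2, +0.2)
  return baseMinutes * 60_000 * (1 + jitter);
}

const strongHabit = computeMomentum({
  streak: 0.9, recency: 1.0, successRate: 0.8, friction: 0.2, resistance: 0.1,
});
console.log(strongHabit.toFixed(3), Math.round(nextIntervalMs(12)));
```

With these weights, a habit with perfect positive factors and no friction or resistance tops out around 0.8, so momentum tiers would need to be defined against that ceiling rather than 1.0.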
Tools and Technologies Used: Cactus Compute, FunctionGemma-270M-it, React Native, Gemini 2.0 Flash Lite, AsyncStorage, Notifee, Google Sign-In, React Native Calendar Events, React Native Health Connect, React Native FS, Zustand, Babel.
How These Tools Enable Dynamic Escalation and Local-First Goals
- Cactus Compute + FunctionGemma form the local-first foundation: every agent cycle can run entirely on-device with zero network calls. The model is downloaded once and stored locally via react-native-fs. Inference runs through the Cactus native module, producing structured function calls that the ToolExecutor dispatches deterministically. On this local path, decisions take under a second and privacy is complete: no habit data, health metrics, or behavioral signals leave the device.
- Gemini 2.0 Flash Lite is the dynamic escalation target. When the HybridRouter determines a prompt is too complex for the 270M local model (e.g., multi-habit milestone celebrations with health data and calendar conflicts), it routes to Gemini via a direct REST call with native function-calling mode (toolConfig: { functionCallingConfig: { mode: 'ANY' } }). The same tool schemas are sent to both models, ensuring identical tool execution regardless of inference path.
- Health Connect + Calendar Events + Notifee provide the real-world signal pipeline that makes the agent contextually aware. The 8 signal providers feed the ContextAggregator, which builds the unified snapshot that drives every LLM decision. This is what transforms Nudgy-Nudge from a reminder app into a genuine agentic system: it observes, reasons, and acts autonomously.
- AsyncStorage + Zustand keep all state local and reactive. Habit histories, nudge outcomes, agent cycle logs, and user settings persist entirely on-device. The Zustand store provides real-time UI updates (momentum meters, friction indicators, nudge explanations, stability forecasts) without any server round-trips.
- Google Sign-In + Google APIs extend the agent’s capabilities when the user opts in: creating real Google Calendar events, fetching upcoming schedules, and managing habit-related shopping lists via Google Tasks. The local-first principle still holds, because these are optional cloud integrations the agent invokes through its tool system, not architectural dependencies.
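For the Gemini escalation path, a request body with forced function calling might be assembled as in the sketch below. The toolConfig fragment and the generativelanguage.googleapis.com host come from the description above; the v1beta generateContent path, the gemini-2.0-flash-lite model id, the send_nudge tool, and the helper names are assumptions for illustration, and the project's actual request code may differ.

```typescript
// Minimal subset of the function-declaration shape used in this sketch.
interface FunctionDeclaration {
  name: string;
  description: string;
  parameters: {
    type: "object";
    properties: Record<string, { type: string; description?: string }>;
    required?: string[];
  };
}

interface GeminiRequestBody {
  systemInstruction: { parts: { text: string }[] };
  contents: { role: "user"; parts: { text: string }[] }[];
  tools: { functionDeclarations: FunctionDeclaration[] }[];
  toolConfig: { functionCallingConfig: { mode: "AUTO" | "ANY" | "NONE" } };
}

// Builds the JSON body; mode "ANY" asks the model to always return a function call.
function buildGeminiRequest(
  systemPrompt: string,
  userPrompt: string,
  tools: FunctionDeclaration[],
): GeminiRequestBody {
  return {
    systemInstruction: { parts: [{ text: systemPrompt }] },
    contents: [{ role: "user", parts: [{ text: userPrompt }] }],
    tools: [{ functionDeclarations: tools }],
    toolConfig: { functionCallingConfig: { mode: "ANY" } },
  };
}

// Assumed endpoint layout; verify against the current API reference before use.
function geminiEndpoint(model: string, apiKey: string): string {
  return (
    "https://generativelanguage.googleapis.com/v1beta/models/" +
    `${model}:generateContent?key=${encodeURIComponent(apiKey)}`
  );
}

// Hypothetical tool, mirroring the style of the shared tool schemas.
const sendNudge: FunctionDeclaration = {
  name: "send_nudge",
  description: "Send a short habit nudge notification to the user.",
  parameters: {
    type: "object",
    properties: {
      habit_id: { type: "string" },
      message: { type: "string" },
    },
    required: ["habit_id", "message"],
  },
};

const requestBody = buildGeminiRequest("You are a habit coach.", "User has scrolled for 20 minutes.", [sendNudge]);
console.log(JSON.stringify(requestBody.toolConfig));
```

The body would then be sent as a JSON POST to the endpoint, and the returned function call can be passed to the same ToolExecutor handlers used on the local path.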
Prior Work
Ideation was done prior to the hackathon.