Redact - Google DeepMind x Cactus Compute Global Hackathon
AI Tinkerers - San Francisco
Hackathon Showcase

Redact

Team led by a physics PhD candidate and AI engineer with experience at Shippo and Xealth, specializing in multi-agent RAG, Python, and custom developer tooling.

1 member

Cactus Privacy-Aware Voice-to-Action

Overview

I built a privacy-aware agentic system that converts natural language commands into executable function calls through a multi-layer pipeline: PII detection, intelligent hybrid routing, and a graduated escalation engine that decides, per query, whether to resolve locally on FunctionGemma via Cactus Compute or escalate to Gemini in the cloud. The architecture also supports on-device Whisper transcription via cactus_transcribe for voice input. The system achieves an F1 score of 1.00 while keeping 67% of queries entirely on-device, and it never exposes sensitive user data to the cloud unnecessarily.
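
To make the PII layer concrete, here is a minimal sketch of reversible redaction: sensitive spans are swapped for placeholders before any text leaves the device, and the original values are put back locally afterward. The regex patterns, the placeholder format, and the function names `redact` and `restore` are illustrative assumptions, not the project's actual detectors.

```python
import re

# Hypothetical patterns for illustration only; the project's real
# PII detectors are not described in this writeup.
PII_PATTERNS = {
    "EMAIL": re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+"),
    "PHONE": re.compile(r"\+?\d[\d\s().-]{7,}\d"),
}


def redact(text):
    """Replace each PII match with a placeholder token.

    Returns the scrubbed text and a token -> original value mapping
    that stays on the device.
    """
    mapping = {}
    for label, pattern in PII_PATTERNS.items():
        def _substitute(match, label=label):
            token = f"<{label}_{len(mapping)}>"
            mapping[token] = match.group(0)
            return token

        text = pattern.sub(_substitute, text)
    return text, mapping


def restore(text, mapping):
    """Put the original values back into text returned from the cloud."""
    for token, original in mapping.items():
        text = text.replace(token, original)
    return text


query = "Email alice@example.com or call 415-555-0134"
scrubbed, mapping = redact(query)
print(scrubbed)
print(restore(scrubbed, mapping) == query)
```

Only `scrubbed` would be sent to a cloud model in this sketch; `mapping` never leaves the device, so the cloud response can be rehydrated locally.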

How These Enable Dynamic Escalation

The Cactus Engine returns a confidence score (1.0 − normalized entropy) with every FunctionGemma inference call. The routing engine combines this signal with three additional checks (JSON parse success, argument validation against query hints, and required-parameter completeness) to make a per-query escalation decision:

  1. High confidence + valid output → Accept locally. Zero cloud cost, zero network latency, zero privacy exposure.
  2. Low confidence or invalid output → Attempt deterministic local synthesis from regex-extracted query hints. Still entirely on-device.
  3. All local paths exhausted → Escalate to Gemini 2.5 Flash via google-genai. When PII is present, the custom redaction engine scrubs sensitive data before it reaches the cloud and restores the original values locally afterward.
  4. private: mode active → Escalation disabled entirely. The query either succeeds locally or fails gracefully, with no exceptions.
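
The four rules above can be sketched as a small decision function. This is a rough illustration under stated assumptions rather than the project's code: the 0.7 threshold, the `{"name": ..., "arguments": {...}}` output shape, the `Route` enum, and every function name here are hypothetical, and normalizing the entropy by log(n) over a probability distribution is one plausible reading of "1.0 − normalized entropy".

```python
import json
import math
from enum import Enum


class Route(Enum):
    LOCAL_MODEL = "local_model"          # rule 1
    LOCAL_SYNTHESIS = "local_synthesis"  # rule 2
    CLOUD = "cloud"                      # rule 3
    FAILED = "failed"                    # rule 4: private mode, no local answer


def confidence_from_probs(probs):
    """1.0 minus Shannon entropy normalized by log(n) (assumed definition)."""
    if len(probs) < 2:
        return 1.0
    entropy = -sum(p * math.log(p) for p in probs if p > 0)
    return 1.0 - entropy / math.log(len(probs))


def is_valid_call(raw_output, required_params, hints):
    """Apply the three checks: JSON parse, required params, hint agreement."""
    try:
        call = json.loads(raw_output)
    except (json.JSONDecodeError, TypeError):
        return False
    if not isinstance(call, dict):
        return False
    args = call.get("arguments", {})
    if not isinstance(args, dict):
        return False
    if any(name not in args for name in required_params):
        return False
    # Every value extracted from the query must match the model's argument.
    return all(args.get(key) == value for key, value in hints.items())


def decide(confidence, raw_output, required_params, hints,
           can_synthesize, private_mode, threshold=0.7):
    """Return where this query should be resolved (threshold is assumed)."""
    if confidence >= threshold and is_valid_call(raw_output, required_params, hints):
        return Route.LOCAL_MODEL
    if can_synthesize:
        return Route.LOCAL_SYNTHESIS
    if private_mode:
        return Route.FAILED
    return Route.CLOUD


output = '{"name": "set_alarm", "arguments": {"hour": 7}}'
print(decide(0.9, output, ["hour"], {"hour": 7},
             can_synthesize=False, private_mode=False))
```

Ordering the checks this way means the cloud branch is reachable only after both local paths fail and private mode is off, which mirrors the graduated escalation described in the list.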
Cactus Compute, Gemini 2.5 Flash (via google-genai Python SDK), Google DeepMind, Whisper-Small (OpenAI via Cactus)