LLM Security Loop for PRs

See how an LLM can find security holes in AI-generated code. This talk demonstrates an exploit loop that catches vulnerabilities before they merge, turning exploits into regression tests.

Next.js, TypeScript, Vitest, OpenAI API, GitHub Pull Requests

Overview

I built a tiny, deliberately vulnerable finance app for a fake AI lab, plus an LLM exploit harness that turns an agent-written PR into a live IDOR (insecure direct object reference) attack.
It starts with an innocent-looking “fast invoice preview” feature in a Next.js app. Then the exploit loop kicks in: it reads the ticket, the route code, and the seed data, drafts candidate attack requests, fires them at the running app, and shows that a user in the OpenAI tenant can pull a fictional Anthropic invoice just by guessing the ID `inv-anthropic-4001`.
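
To make the loop above concrete, here is a minimal in-memory sketch, not the project's actual code. Every name in it (`Invoice`, `Session`, `previewInvoiceVulnerable`, `draftCandidateIds`, `runExploitLoop`) and every seed value other than `inv-anthropic-4001` is an assumption made for illustration. The LLM step is replaced by a hard-coded candidate list, and the HTTP request is replaced by a direct function call.

```typescript
// Hypothetical shapes; the real app's schema is not shown in this write-up.
type Invoice = { id: string; tenant: string; amountUsd: number };
type Session = { userId: string; tenant: string };
type Probe = { invoiceId: string; status: number; body: Invoice | null };

// Made-up seed data standing in for the app's fixtures.
const seedInvoices: Invoice[] = [
  { id: "inv-openai-1001", tenant: "openai", amountUsd: 1200 },
  { id: "inv-anthropic-4001", tenant: "anthropic", amountUsd: 9800 },
];

// The vulnerable pattern: look the invoice up by ID and never compare
// its tenant with the caller's tenant. The session is ignored on purpose.
function previewInvoiceVulnerable(_session: Session, invoiceId: string): Probe {
  const invoice = seedInvoices.find((inv) => inv.id === invoiceId) ?? null;
  return { invoiceId, status: invoice ? 200 : 404, body: invoice };
}

// Stand-in for the LLM step: in the harness described above, a model
// drafts these candidates from the ticket, route code, and seed data.
function draftCandidateIds(): string[] {
  return ["inv-openai-1001", "inv-anthropic-4001", "inv-anthropic-9999"];
}

// Fire every candidate as the given user and keep only the responses
// that succeeded and returned an invoice owned by a different tenant.
function runExploitLoop(session: Session): Probe[] {
  return draftCandidateIds()
    .map((id) => previewInvoiceVulnerable(session, id))
    .filter(
      (probe) =>
        probe.status === 200 &&
        probe.body !== null &&
        probe.body.tenant !== session.tenant,
    );
}

const leaks = runExploitLoop({ userId: "u-1", tenant: "openai" });
console.log(leaks.map((probe) => probe.invoiceId));
```

The own-tenant invoice and the nonexistent ID are filtered out, so the only surviving probe is the guessed Anthropic invoice.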

I’ll walk through the app, the vulnerable API route, the exploit agent prompt, the request/response trace, the validator that confirms the cross-tenant leak, and the regression test and fix it generates.
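
The validator and fix mentioned above could look roughly like the following sketch. It is an illustration under stated assumptions, not the generated code from the talk: `TraceEntry`, `isCrossTenantLeak`, `previewInvoiceFixed`, and `regressionCheck` are hypothetical names, and returning 404 for another tenant's invoice is one possible design choice (it avoids confirming that the ID exists). In the real project the final check would presumably live in a Vitest test; it is written as a plain function here so the block stays self-contained.

```typescript
// Hypothetical shapes, redefined here so the sketch stands alone.
type Invoice = { id: string; tenant: string; amountUsd: number };
type Session = { userId: string; tenant: string };
type TraceEntry = { requestedId: string; status: number; body: Invoice | null };

// Made-up seed data standing in for the app's fixtures.
const invoices: Invoice[] = [
  { id: "inv-openai-1001", tenant: "openai", amountUsd: 1200 },
  { id: "inv-anthropic-4001", tenant: "anthropic", amountUsd: 9800 },
];

// Validator: a trace entry counts as a confirmed leak only when the
// request succeeded AND the returned invoice belongs to another tenant.
function isCrossTenantLeak(session: Session, entry: TraceEntry): boolean {
  return (
    entry.status === 200 &&
    entry.body !== null &&
    entry.body.tenant !== session.tenant
  );
}

// Fixed lookup: the tenant check is part of the query itself, so a
// foreign invoice is indistinguishable from a missing one (404).
function previewInvoiceFixed(session: Session, invoiceId: string): TraceEntry {
  const invoice =
    invoices.find(
      (inv) => inv.id === invoiceId && inv.tenant === session.tenant,
    ) ?? null;
  return { requestedId: invoiceId, status: invoice ? 200 : 404, body: invoice };
}

// Regression check: replay the original exploit against the fixed
// lookup and require that it no longer leaks.
function regressionCheck(): boolean {
  const attacker: Session = { userId: "u-1", tenant: "openai" };
  const entry = previewInvoiceFixed(attacker, "inv-anthropic-4001");
  return entry.status === 404 && !isCrossTenantLeak(attacker, entry);
}

console.log(regressionCheck());
```

Keeping the validator separate from the lookup means the same predicate can judge both the original exploit trace and the post-fix replay.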

Tech stack

Next.js

Next.js is a full-stack React framework with hybrid rendering and Rust-based tooling.

Next.js is a React framework for building full-stack web applications with little up-front configuration. It supports hybrid rendering (Server-Side Rendering, Static Site Generation, and Incremental Static Regeneration), so each page can balance speed and SEO. Key features include React Server Components, Server Actions for running server code directly, and the App Router for advanced routing and nested layouts. Developed by Vercel, it uses Rust-based tools such as Turbopack and the Speedy Web Compiler (SWC) to speed up builds.

https://nextjs.org/

TypeScript

TypeScript is an open-source superset of JavaScript: it adds static typing and compiles to clean, standards-based JavaScript.

TypeScript is an open-source language developed by Microsoft. It is a superset of JavaScript that adds a static type system, which enables compile-time type checking and catches many errors before runtime (a critical benefit for large-scale applications). The TypeScript compiler (`tsc`) transpiles code into standards-based JavaScript for a configurable ECMAScript target, so the output runs in browsers and in other JavaScript runtimes such as Node.js.

https://www.typescriptlang.org/

Vitest

Vitest is a Vite-native testing framework: fast and Jest-compatible.

Vitest is a test runner built on Vite, designed for modern JavaScript and TypeScript projects. It reuses Vite’s configuration and transformation pipeline, so the application and its tests share one setup. Its watch mode reruns only the tests affected by a change, similar to Hot Module Replacement (HMR). Vitest maintains a Jest-compatible API (including `expect`, snapshot testing, and mocking via Tinyspy), making migration from existing test suites straightforward. It provides out-of-the-box support for ESM, TypeScript, and JSX, and is commonly used for component testing with frameworks such as Vue, React, and Svelte.

https://vitest.dev

OpenAI API

The OpenAI API gives applications programmatic access to OpenAI models such as GPT-4o, DALL-E 3, and Whisper.

The OpenAI API provides authenticated, programmatic access to a suite of generative AI models. Developers use REST endpoints and official libraries (Python, Node.js) to integrate capabilities like text generation (GPT-4o), image creation (DALL-E 3), and speech-to-text transcription (Whisper). The platform is built for production workloads, from complex reasoning tasks to real-time customer support agents.

https://platform.openai.com/

GitHub Pull Requests

GitHub Pull Requests (PRs) are the core collaboration tool: they propose code changes from a feature branch (e.g., `feat/login-fix`) to a base branch (e.g., `main`) for review and discussion before final merge.

GitHub Pull Requests (PRs) are the foundational mechanism for collaborative software development, managing proposed changes between branches. A developer pushes code from a feature branch (e.g., `dev/issue-401`) to a remote repository, then opens a PR targeting the base branch (commonly `main`). This action initiates the review cycle: stakeholders, like Code Owners, review the file diffs, add line-by-line comments, and suggest specific edits. Once all automated CI/CD checks pass and reviewers approve (e.g., two required approvals), the PR is merged. This structured process ensures quality control and transparent change management across a project’s primary codebase.

https://docs.github.com/en/pull-requests
