Preswald: Build fast, real-time dashboards with DuckDB and Python

Overview

This talk explains how to use DuckDB, an in-process OLAP database, to build real-time dashboards with Preswald, our open-source Python framework. Preswald combines DuckDB’s high-performance query engine with a declarative API, so developers can create fast, lightweight, code-first dashboards with full control over their analytics stack.

In-memory query execution: DuckDB runs inside the application process and can operate entirely in memory, executing complex SQL queries directly on flat data files (CSV, Parquet, or raw streams) without a separate database server, data warehouse, or ETL pipeline.
Real-time data transformation: Learn how DuckDB enables on-the-fly data transformations with SQL that runs natively on your data files, avoiding unnecessary data movement or duplication.
Design trade-offs: Preswald’s declarative Python API integrates tightly with tools like Pandas and Plotly, giving developers more control over data transformation, queries, schedules, and visualization workflows.

From an engineering standpoint, we decided to build Preswald around DuckDB because it let us rethink how analytics pipelines should work. Traditional setups rely on data warehouses, ETL pipelines, and distributed databases, which add a lot of complexity, latency, and maintenance overhead. These systems often require moving data around constantly, which slows things down and makes them harder to manage.

DuckDB stood out because of its in-process, columnar database engine. It’s optimized for vectorized execution and can run complex SQL queries directly on flat files like CSV and Parquet without a separate database server or a copy of the data in another system. This reduces I/O overhead and removes the need for extra infrastructure. We saw this as the foundation for building a faster, more lightweight approach to analytics.

To make it easier for developers to use DuckDB, we built Preswald with a declarative Python API. It lets developers define their data workflows without having to write raw SQL, while still taking full advantage of DuckDB’s performance. We designed the API to integrate with familiar tools like Pandas for data manipulation and Plotly for visualizations. Under the hood, Preswald uses DuckDB’s vectorized query engine to handle joins, aggregations, and transformations efficiently.

A big challenge was balancing performance and flexibility. DuckDB is incredibly fast in-memory, but it can also process data directly from disk for larger datasets. We built Preswald to handle both cases, allowing small, fast prototypes in memory and scaling up to larger workflows without adding extra dependencies or infrastructure.

Links

https://github.com/StructuredLabs/preswald
Pyodide/DuckDB WASM packages Python data apps for offline browser execution.
https://www.preswald.com/
Preswald converts Python/DuckDB scripts into reactive, zero-JavaScript, statically deployed web apps.

Tech stack