Retrieval After RAG: Hybrid Search, Agents, and Database Design — Simon Hørup Eskildsen of Turbopuffer
This post originally appeared in Latent Space.
“Models can learn to reason, but they can’t compress the world’s knowledge into a few terabytes of weights.”
Turbopuffer came out of a reading app.
In 2022, Simon was helping his friends at Readwise scale their infra for a highly requested feature: article recommendations and semantic search. Readwise was paying ~$5k/month for their relational database and vector search would cost ~$20k/month making the feature too expensive to ship. In 2023 after mulling over the problem from Readwise, Simon decided he wanted to “build a search engine” which became Turbopuffer.

We discuss:
• Simon’s path: Denmark → Shopify infra for nearly a decade → “angel engineering” across startups like Readwise, Replicate, and Causal → turbopuffer almost accidentally becoming a company
• The Readwise origin story: building an early recommendation engine right after the ChatGPT moment, seeing it work, then realizing it would cost ~$30k/month for a company spending ~$5k/month total on infra and getting obsessed with fixing that cost structure
• Why turbopuffer is “a search engine for unstructured data”: Simon’s belief that models can learn to reason, but can’t compress the world’s knowledge into a few terabytes of weights, so they need to connect to systems that hold truth in full fidelity
• The three ingredients for building a great database company: a new workload, a new storage architecture, and the ability to eventually support every query plan customers will want on their data
• The architecture bet behind turbopuffer: going all in on object storage and NVMe, avoiding a traditional consensus layer, and building around the cloud primitives that only became possible in the last few years
Keep reading with a 7-day free trial
Subscribe to SAIL Media to keep reading this post and get 7 days of free access to the full post archives.



