Lessons Learned Building a RAG-Based SaaS App on a Serverless Platform
Let me tell you a secret: building a SaaS product is like trying to assemble imaginary furniture without the manual—exhilarating when a piece finally clicks, rage-inducing when you realize you’ve been holding the screwdriver backward the whole time. As someone obsessed with AI’s potential, I recently threw myself into building a retrieval-augmented generation (RAG) chatbot. What started as a weekend project turned into a three-month journey of weekend debugging, existential dread over serverless timeouts, and one glorious moment when a stranger actually paid $1.99 to try my product. Here’s my story, the roadblocks I hit, and how I overcame them.
Why RAG?
RAG combines the power of large language models (LLMs) with domain-specific data retrieval, making it ideal for applications like document Q&A systems, research assistants, or customer support bots. What drew me to RAG was its multi-stage workflow, which breaks down into distinct components:
Chunking: Breaking documents into digestible sections.
Embeddings: Converting text into numerical representations.
Retrieval: Using cosine similarity to fetch relevant content.
Generation: Synthesizing answers with an LLM.
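The first three steps above can be sketched with a couple of pure functions. This is a minimal illustration, not the exact code from my app: the chunk size, overlap, and brute-force ranking are simplifying assumptions (in production, embeddings come from an API and similarity search runs in the database, e.g. pgvector).

```typescript
// Split text into overlapping fixed-size chunks. 500 chars with 50-char
// overlap are illustrative defaults, not a recommendation.
function chunkText(text: string, size = 500, overlap = 50): string[] {
  const chunks: string[] = [];
  for (let start = 0; start < text.length; start += size - overlap) {
    chunks.push(text.slice(start, start + size));
  }
  return chunks;
}

// Cosine similarity between two embedding vectors.
function cosineSimilarity(a: number[], b: number[]): number {
  let dot = 0, normA = 0, normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Rank stored chunks against a query embedding and keep the top k.
function topK(
  query: number[],
  stored: { chunk: string; embedding: number[] }[],
  k = 3
): { chunk: string; score: number }[] {
  return stored
    .map((s) => ({ chunk: s.chunk, score: cosineSimilarity(query, s.embedding) }))
    .sort((x, y) => y.score - x.score)
    .slice(0, k);
}
```

The top-k chunks are then pasted into the LLM prompt for the generation step.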
Each step offered opportunities to optimize for accuracy, speed, or cost—a playground for tinkerers. My journey began with a Supabase tutorial that promised a “production-ready” RAG app. While the tutorial was a goldmine for understanding core concepts, its markdown-only support felt limiting. Real users need PDFs—and that’s where the real learning began.
The PDF Problem: Parsing Isn’t as Simple as It Seems
The Supabase template handled markdown files only; PDFs were unsupported.
Here’s what I discovered:
Complexity: PDFs can include images, tables, scanned text, and multi-column layouts. Most TypeScript/JavaScript parsing libraries (like pdf-parse) struggled with non-text elements. No single library worked well across all PDFs.
Edge Cases: A 20MB SEC report with embedded charts broke my initial setup. A resume with tables and headers? Even worse.
My Workaround:
Azure Document Intelligence: Leveraged its API to convert PDFs to markdown.
While this worked for small files, larger documents exposed a critical flaw: serverless CPU timeouts.
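For reference, here’s the rough shape of that workaround. This is a sketch against Azure Document Intelligence’s REST API (prebuilt-layout model with markdown output); the api-version string and response shape reflect my setup at the time and should be checked against the current Azure docs. Analysis is asynchronous: the POST returns an Operation-Location URL that you poll until the job finishes.

```typescript
// Assumed api-version; verify against the latest Azure Document Intelligence docs.
const API_VERSION = "2024-02-29-preview";

// Build the analyze URL for the prebuilt-layout model with markdown output.
function analyzeUrl(endpoint: string): string {
  return (
    `${endpoint}/documentintelligence/documentModels/prebuilt-layout:analyze` +
    `?api-version=${API_VERSION}&outputContentFormat=markdown`
  );
}

async function pdfUrlToMarkdown(endpoint: string, key: string, pdfUrl: string): Promise<string> {
  // Kick off the analysis job.
  const start = await fetch(analyzeUrl(endpoint), {
    method: "POST",
    headers: { "Ocp-Apim-Subscription-Key": key, "Content-Type": "application/json" },
    body: JSON.stringify({ urlSource: pdfUrl }),
  });
  const operationUrl = start.headers.get("operation-location");
  if (!operationUrl) throw new Error(`analyze request failed: ${start.status}`);

  // Poll until the job completes.
  while (true) {
    await new Promise((r) => setTimeout(r, 2000));
    const res = await fetch(operationUrl, {
      headers: { "Ocp-Apim-Subscription-Key": key },
    });
    const body: any = await res.json();
    if (body.status === "succeeded") return body.analyzeResult.content; // markdown text
    if (body.status === "failed") throw new Error("document analysis failed");
  }
}
```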
Serverless Growing Pains: CPU Timeouts & Storage Traps
1. CPU Timeouts: The Hidden Cost of “Free” Tiers
Serverless platforms like Supabase Edge Functions impose strict execution and CPU time limits (e.g., 10–30 seconds). For RAG workflows, this became a bottleneck:
Problem: Embedding generation for a 50-page PDF (split into 1,000+ chunks) often timed out.
Root Cause: Local embedding models running in the edge runtime are CPU-intensive.
Solution:
Offload to OpenAI: Replaced local embeddings with text-embedding-3-small via API calls.
This trade-off added network latency and per-token cost, but kept the app within CPU limits.
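The key to making the API approach viable was batching: OpenAI’s `/v1/embeddings` endpoint accepts an array input, so chunks can be sent in groups instead of one request per chunk. A sketch, where the batch size of 100 is my own choice rather than an API requirement:

```typescript
// Split an array into fixed-size batches.
function toBatches<T>(items: T[], size: number): T[][] {
  const batches: T[][] = [];
  for (let i = 0; i < items.length; i += size) {
    batches.push(items.slice(i, i + size));
  }
  return batches;
}

async function embedChunks(apiKey: string, chunks: string[]): Promise<number[][]> {
  const vectors: number[][] = [];
  for (const batch of toBatches(chunks, 100)) {
    const res = await fetch("https://api.openai.com/v1/embeddings", {
      method: "POST",
      headers: { Authorization: `Bearer ${apiKey}`, "Content-Type": "application/json" },
      body: JSON.stringify({ model: "text-embedding-3-small", input: batch }),
    });
    const body: any = await res.json();
    if (!res.ok) throw new Error(body.error?.message ?? `embeddings failed: ${res.status}`);
    // The response preserves input order: data[i].embedding matches batch[i].
    for (const item of body.data) vectors.push(item.embedding);
  }
  return vectors;
}
```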
2. Storage Tier Limitation
The Supabase tutorial stored files in Supabase Storage (1GB free tier). While convenient, this wasn’t a serverless limitation—it was a platform choice.
Problem: 1GB fills fast with user uploads. A single 50MB PDF from 20 users would exhaust the tier.
Solution: Move file storage to a dedicated provider:
Cloudflare R2: Stored raw PDFs (10GB free tier, $0.015/GB-month, and no egress fees).
AWS S3: A solid alternative, but its smaller free tier (5GB) and per-GB egress fees made R2 the better fit for me.
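Inside a Cloudflare Worker, an R2 bucket binding exposes `put`/`get` directly (outside Workers you’d use R2’s S3-compatible API instead). A sketch of the upload path, where `R2BucketLike` is a minimal stand-in for the real binding type from `@cloudflare/workers-types` and the key scheme is my own convention:

```typescript
// Minimal stand-in for the Cloudflare Workers R2 bucket binding type.
interface R2BucketLike {
  put(
    key: string,
    value: ArrayBuffer | Uint8Array,
    options?: { httpMetadata?: { contentType?: string } }
  ): Promise<unknown>;
}

// One folder per user; sanitize the filename and prefix a timestamp so
// repeated uploads of "report.pdf" don't collide.
function objectKey(userId: string, filename: string): string {
  const safe = filename.replace(/[^a-zA-Z0-9._-]/g, "_");
  return `uploads/${userId}/${Date.now()}-${safe}`;
}

async function storePdf(
  bucket: R2BucketLike,
  userId: string,
  filename: string,
  bytes: Uint8Array
): Promise<string> {
  const key = objectKey(userId, filename);
  await bucket.put(key, bytes, { httpMetadata: { contentType: "application/pdf" } });
  return key; // persist this key in the Supabase DB next to the document row
}
```

The returned key is what goes into the database, so Supabase keeps the metadata while R2 holds the bytes.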
Key Architecture Decisions
| Component | Challenge | Solution |
| --- | --- | --- |
| PDF Parsing | Complex layouts, large files | Azure API |
| Embeddings | CPU timeouts | OpenAI API |
| Storage | Platform tier limits | Cloudflare R2 for files, Supabase for DB |
| User Experience | Processing delays | Progress indicators + webhook notifications |
Lessons Learned (The Hard Way)
Start with Managed Services: APIs like OpenAI Assistants abstract RAG’s complexity. Build RAG from scratch only if benefits outweigh complexity.
Design for Async Early: Assume large files will break sync workflows. Use queues (e.g., Cloudflare, RabbitMQ) from day one.
Test with Real-World Data: A “toy” PDF works until a user uploads a 100-page scanned manual.
Serverless ≠ Free: Optimize for API costs (e.g., batch embedding requests) and monitor usage.
The Launch: From Code to Customer
After months of iteration, I launched with a $1.99/month tier—and the first payment notification felt great. Stripe made monetization seamless, but the real win was seeing strangers use (and pay for!) something I’d built. A tweet about this milestone even went viral, validating the effort.
Final Thoughts: Why Build Now?
Today’s tools—Supabase, OpenAI, Vercel—democratize SaaS development. You don’t need a team or VC funding (at least for MVP); you need grit and a willingness to learn.
Whether you’re a seasoned engineer or a curious beginner, there’s never been a better time to ship something.