Lessons Learned Building a RAG-Based Application on a Serverless Platform
I built my first RAG chatbot side project: ~3 months from start to first paying customer at $1.99/mo. I started from Supabase’s RAG tutorial, which only handled markdown, and added PDF parsing on top. That’s where most of the work was.
What’s in a RAG pipeline
Chunking: split docs into pieces
Embeddings: text → vectors
Retrieval: cosine similarity over chunks
Generation: LLM answers using retrieved context
Each step is a knob you can tune for accuracy, latency, or cost.
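The retrieval step above is the simplest to show concretely. A minimal sketch of cosine-similarity retrieval over pre-embedded chunks might look like this (the names `Chunk` and `retrieve` are illustrative, not the app's actual code):

```typescript
// Cosine similarity between two embedding vectors.
function cosineSim(a: number[], b: number[]): number {
  let dot = 0, normA = 0, normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

interface Chunk { text: string; embedding: number[]; }

// Return the top-k chunks most similar to the query embedding;
// their text then goes into the LLM prompt as context.
function retrieve(query: number[], chunks: Chunk[], k = 4): Chunk[] {
  return [...chunks]
    .sort((x, y) => cosineSim(query, y.embedding) - cosineSim(query, x.embedding))
    .slice(0, k);
}
```

In practice a vector index (e.g. pgvector on Postgres) does this scan for you; the brute-force version is just to make the math visible.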
PDF parsing
The tutorial only supported markdown, but most customers work with PDFs, and PDF parsing is a whole problem of its own. PDFs are messier: images, tables, scanned text, multi-column layouts. JS/TS libraries like pdf-parse only handle clean text. A 20MB SEC report with embedded charts broke my first setup; resumes with table layouts broke it differently.
I decided I’d outsource PDF parsing and ended up routing PDFs through Azure Document Intelligence to convert them to markdown. That worked for small files; larger ones hit the next problem.
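The Document Intelligence flow is submit-then-poll: you POST the PDF to the prebuilt-layout model and poll the returned operation URL until analysis finishes. A hedged sketch, following Azure's REST reference (treat the exact api-version string as an assumption to check against current docs; this is not the app's actual code):

```typescript
const API_VERSION = "2024-11-30";

// Build the analyze URL for a Document Intelligence resource endpoint,
// asking for markdown output from the prebuilt-layout model.
function analyzeUrl(endpoint: string): string {
  return `${endpoint.replace(/\/$/, "")}/documentintelligence/documentModels/` +
    `prebuilt-layout:analyze?api-version=${API_VERSION}&outputContentFormat=markdown`;
}

// Submit the PDF, then poll the Operation-Location URL until done.
async function pdfToMarkdown(endpoint: string, key: string, pdf: ArrayBuffer): Promise<string> {
  const submit = await fetch(analyzeUrl(endpoint), {
    method: "POST",
    headers: { "Ocp-Apim-Subscription-Key": key, "Content-Type": "application/pdf" },
    body: pdf,
  });
  const pollUrl = submit.headers.get("operation-location")!;
  while (true) {
    await new Promise((r) => setTimeout(r, 2000));
    const res = await fetch(pollUrl, { headers: { "Ocp-Apim-Subscription-Key": key } });
    const body = await res.json();
    if (body.status === "succeeded") return body.analyzeResult.content;
    if (body.status === "failed") throw new Error("document analysis failed");
  }
}
```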
Serverless CPU timeouts
Supabase Edge Functions cap execution at 10–30 seconds on the free tier. A 50-page PDF chunked into 1,000+ pieces and embedded locally on the edge runtime blew through that.
I swapped local embeddings for OpenAI’s text-embedding-3-small over the API. That solved the CPU timeout, at the cost of some added but acceptable latency.
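The swap itself is a small amount of code: batch the chunk texts and send each batch to OpenAI's embeddings endpoint. A sketch under the assumption of a batch size of 100 (arbitrary; the real limit is on tokens per request), with illustrative names:

```typescript
// Split an array into fixed-size groups to cut request counts.
function batch<T>(items: T[], size: number): T[][] {
  const out: T[][] = [];
  for (let i = 0; i < items.length; i += size) out.push(items.slice(i, i + size));
  return out;
}

// Embed chunk texts via OpenAI's API instead of on the edge runtime.
async function embedChunks(texts: string[], apiKey: string): Promise<number[][]> {
  const vectors: number[][] = [];
  for (const group of batch(texts, 100)) {
    const res = await fetch("https://api.openai.com/v1/embeddings", {
      method: "POST",
      headers: { Authorization: `Bearer ${apiKey}`, "Content-Type": "application/json" },
      body: JSON.stringify({ model: "text-embedding-3-small", input: group }),
    });
    const body = await res.json();
    for (const d of body.data) vectors.push(d.embedding);
  }
  return vectors;
}
```

Batching matters doubly here: fewer requests means less wall-clock time inside the function's execution cap, and less per-request overhead on the bill.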
Storage
The next problem was storage. Supabase Storage has a 1GB free tier, which isn’t much to spread across many users. I moved file storage to Cloudflare R2: 10GB free, $0.015/GB-month after that, and no egress fees. Good enough for now.
So blob storage moved from Supabase to R2, but the Postgres tables stayed on Supabase.
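The migration is gentler than it sounds because R2 speaks the S3 API: mostly you point an S3 client at a different endpoint. A sketch with illustrative helpers (the account ID and key layout are assumptions, not the app's actual scheme):

```typescript
// R2's S3-compatible endpoint is derived from the Cloudflare account ID.
function r2Endpoint(accountId: string): string {
  return `https://${accountId}.r2.cloudflarestorage.com`;
}

// One object per uploaded file, namespaced by user.
function objectKey(userId: string, filename: string): string {
  return `uploads/${userId}/${filename}`;
}

// With @aws-sdk/client-s3 (or any S3 client), usage looks roughly like:
//   const s3 = new S3Client({
//     region: "auto",
//     endpoint: r2Endpoint(accountId),
//     credentials: { accessKeyId, secretAccessKey },
//   });
//   await s3.send(new PutObjectCommand({
//     Bucket: "files", Key: objectKey(uid, name), Body: pdf,
//   }));
```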
Architecture summary
Four decisions ended up shaping the pipeline.
PDF parsing went to Azure Document Intelligence because it handled messy PDFs well and produced markdown as output.
Embeddings moved off the edge runtime to OpenAI's API because local generation on Edge Runtime CPU timed out on anything substantial.
File storage moved to Cloudflare R2 because Supabase Storage's 1GB was simply too small, though the database stayed on Supabase.
And on the UX side, processing delays got papered over with progress indicators and webhook notifications so users weren't staring at a frozen screen.
What I’d do differently
Start with managed services. OpenAI Assistants handles most of this. Roll your own only if you need control the API doesn’t give.
Assume async from day one. Anything touching a large file goes through a queue.
Test on real user files early. Toy PDFs lie.
Serverless isn’t free. Monitor API spend, batch where possible.
Launch
$1.99/mo on Stripe. The first payment came in, and a tweet about it went viral. That was super exciting.

