I built a proof-of-concept for running RAG (Retrieval-Augmented Generation) entirely in the browser using WebGPU.
You can chat with PDF documents using models like Phi-3, Llama 3, or Mistral 7B – all running locally with zero backend. Documents never leave your device.
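To make the generation side concrete, here's roughly what a RAG-style chat call looks like through WebLLM's OpenAI-compatible API (a simplified sketch against upstream WebLLM's documented interface, not my exact code; the helper name and the model ID from WebLLM's prebuilt list are illustrative and may not match what the demo ships):

```ts
import { CreateMLCEngine } from "@mlc-ai/web-llm";

// Model ID taken from WebLLM's prebuilt model list (check the current list for exact names).
const engine = await CreateMLCEngine("Phi-3-mini-4k-instruct-q4f16_1-MLC");

// "context" is the set of chunks retrieved from the local vector store.
async function answer(question: string, context: string[]): Promise<string> {
  const reply = await engine.chat.completions.create({
    messages: [
      { role: "system", content: "Answer using only the provided context." },
      {
        role: "user",
        content: `Context:\n${context.join("\n---\n")}\n\nQuestion: ${question}`,
      },
    ],
  });
  return reply.choices[0].message.content ?? "";
}
```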
Tech stack:
– WebLLM + WeInfer (optimized fork with ~3.76x speedup)
– Transformers.js for embeddings (all-MiniLM-L6-v2)
– IndexedDB as the vector store (indexing path sketched after this list)
– PDF.js for parsing
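Roughly, the indexing path looks like this (simplified sketch with illustrative names, not the actual code): each chunk of parsed PDF text is embedded with all-MiniLM-L6-v2 via Transformers.js and stored next to its vector in IndexedDB.

```ts
// Depending on version, the package is published as @xenova/transformers
// or @huggingface/transformers; the pipeline API is the same.
import { pipeline } from "@xenova/transformers";

// all-MiniLM-L6-v2 produces 384-dim embeddings; normalizing here lets the
// search step use a plain dot product as cosine similarity.
const embedder = await pipeline("feature-extraction", "Xenova/all-MiniLM-L6-v2");

async function embed(text: string): Promise<number[]> {
  const output = await embedder(text, { pooling: "mean", normalize: true });
  return Array.from(output.data as Float32Array);
}

function openDb(): Promise<IDBDatabase> {
  return new Promise((resolve, reject) => {
    const req = indexedDB.open("rag-poc", 1);
    req.onupgradeneeded = () =>
      req.result.createObjectStore("chunks", { autoIncrement: true });
    req.onsuccess = () => resolve(req.result);
    req.onerror = () => reject(req.error);
  });
}

async function indexChunk(text: string): Promise<void> {
  const vector = await embed(text);
  const db = await openDb();
  const tx = db.transaction("chunks", "readwrite");
  tx.objectStore("chunks").add({ text, vector });
  await new Promise<void>((resolve, reject) => {
    tx.oncomplete = () => resolve();
    tx.onerror = () => reject(tx.error);
  });
}
```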
The main challenges were:
1. Getting esbuild to bundle without choking on onnxruntime-node (sketch after this list)
2. Managing COOP/COEP headers for SharedArrayBuffer (header sketch after this list)
3. Keeping the bundle reasonable (Angular + models = ~11MB base)
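On challenge 1, the idea is to keep esbuild from ever trying to resolve the Node-only ONNX backend, since the browser build only needs the WASM/WebGPU path. A minimal sketch with esbuild's JS API (the project actually goes through Angular's builder, so take this as the concept rather than the exact config):

```ts
import { build } from "esbuild";

// onnxruntime-node ships native .node binaries that esbuild cannot bundle.
// Marking it external keeps the browser build on the WASM/WebGPU backends
// that Transformers.js actually uses in the browser.
await build({
  entryPoints: ["src/main.ts"],
  bundle: true,
  format: "esm",
  platform: "browser",
  outdir: "dist",
  external: ["onnxruntime-node"],
});
```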
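On challenge 2, SharedArrayBuffer is only exposed when the page is cross-origin isolated, so every response needs both COOP and COEP headers. Here's a minimal local-serving sketch with plain Express (a hypothetical setup, not how the demo is deployed; in production the same two headers go into the hosting config):

```ts
import express from "express";

const app = express();

// Cross-origin isolation: both headers must be present before the browser
// exposes SharedArrayBuffer to the page.
app.use((_req, res, next) => {
  res.setHeader("Cross-Origin-Opener-Policy", "same-origin");
  res.setHeader("Cross-Origin-Embedder-Policy", "require-corp");
  next();
});

app.use(express.static("dist"));
app.listen(4200, () => console.log("http://localhost:4200"));
```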
Performance is surprisingly decent on modern hardware:
– Phi-3 Mini: 3-6 tokens/sec (WebLLM) → 12-20 tokens/sec (WeInfer)
– Llama 3.2 1B: 8-12 tokens/sec
Demo: https://webpizza-ai-poc.vercel.app/
Code: https://github.com/stramanu/webpizza-ai-poc
This is experimental – I’m sure there are better ways to do this. I’d appreciate feedback, especially on:
– Bundle optimization strategies
– Better vector search algorithms for IndexedDB (the brute-force baseline is sketched below)
– Memory management for large documents
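On the vector search point, the baseline to beat is a brute-force scan over everything in IndexedDB, roughly like this (simplified sketch with illustrative names, not the exact code; it assumes vectors were normalized at indexing time, so cosine similarity reduces to a dot product):

```ts
interface StoredChunk {
  text: string;
  vector: number[];
}

function dot(a: number[], b: number[]): number {
  let sum = 0;
  for (let i = 0; i < a.length; i++) sum += a[i] * b[i];
  return sum;
}

// Load every stored chunk and rank by similarity to the query embedding.
// O(n) per query, which is fine for a single PDF but won't scale far.
async function topK(db: IDBDatabase, query: number[], k = 4): Promise<StoredChunk[]> {
  const chunks = await new Promise<StoredChunk[]>((resolve, reject) => {
    const req = db.transaction("chunks").objectStore("chunks").getAll();
    req.onsuccess = () => resolve(req.result as StoredChunk[]);
    req.onerror = () => reject(req.error);
  });
  return chunks
    .map((chunk) => ({ chunk, score: dot(query, chunk.vector) }))
    .sort((a, b) => b.score - a.score)
    .slice(0, k)
    .map((entry) => entry.chunk);
}
```

Anything that avoids the full scan per query on large documents is exactly the kind of suggestion I'm after.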
Happy to answer questions!
