Show HN: LLM-Use – An LLM router that chooses the right model for each prompt

Hi HN,

I built *LLM-Use*, an open-source intelligent router that helps reduce LLM API costs by automatically selecting the most appropriate model for each prompt.

I created it after realizing I was using GPT-4 for everything — including simple prompts like “translate hello to Spanish” — which cost $0.03 per call. Models like Mixtral can do the same for $0.0003.

### How it works:
– Uses NLP (spaCy + transformers) to analyze prompt complexity
– Routes to the optimal model (GPT-4, Claude, LLaMA, Mixtral, etc.)
– Uses semantic similarity scoring to preserve output quality
– Falls back gracefully if a model fails or gives poor results
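The routing idea can be sketched in a few lines. This is a hypothetical simplification (the actual project uses spaCy and transformers for complexity analysis; the scoring weights, thresholds, and model names below are illustrative):

```python
# Toy sketch of complexity-based routing -- not the actual llm-use API.
def complexity_score(prompt: str) -> float:
    """Crude proxy: longer prompts with more varied vocabulary score higher."""
    words = prompt.lower().split()
    if not words:
        return 0.0
    lexical_diversity = len(set(words)) / len(words)
    length_factor = min(len(words) / 100, 1.0)
    return length_factor * (0.5 + 0.5 * lexical_diversity)

# Tiers ordered cheapest-first; names and thresholds are made up.
MODEL_TIERS = [
    (0.3, "mixtral-8x7b"),   # simple prompts
    (0.6, "claude-sonnet"),  # moderate prompts
    (1.1, "gpt-4"),          # complex prompts
]

def route(prompt: str) -> str:
    score = complexity_score(prompt)
    for threshold, model in MODEL_TIERS:
        if score < threshold:
            return model
    return MODEL_TIERS[-1][1]
```

With this scheme, "translate hello to Spanish" scores low and lands on the cheap tier, while a long multi-part prompt escalates to the strongest model.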

### Key features:
– Real-time streaming support for all providers
– A/B testing with statistical significance
– Response caching (LRU + TTL)
– Circuit breakers for production stability
– FastAPI backend with Prometheus metrics
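For the circuit-breaker feature, the core idea is to stop sending traffic to a provider after repeated failures and probe it again after a cooldown. A minimal sketch (hypothetical class, not the project's actual implementation):

```python
import time

# Illustrative circuit breaker -- trips after N failures, retries after a cooldown.
class CircuitBreaker:
    def __init__(self, max_failures: int = 3, reset_after: float = 30.0):
        self.max_failures = max_failures
        self.reset_after = reset_after
        self.failures = 0
        self.opened_at = None  # monotonic timestamp when the breaker tripped

    def allow(self) -> bool:
        if self.opened_at is None:
            return True  # closed: requests flow normally
        if time.monotonic() - self.opened_at >= self.reset_after:
            return True  # half-open: let one request through to probe recovery
        return False     # open: fail fast without calling the provider

    def record_success(self) -> None:
        self.failures = 0
        self.opened_at = None

    def record_failure(self) -> None:
        self.failures += 1
        if self.failures >= self.max_failures:
            self.opened_at = time.monotonic()
```

In the router, each provider gets its own breaker; when one trips, the fallback logic routes the request to the next-best model instead of waiting on timeouts.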

### Early results:
– Personal tests show up to 80% cost reduction
– Output quality preserved (verified via internal A/B testing)

### Technical notes:
– 2000+ lines of Python
– Supports OpenAI, Anthropic, Google, Groq, Ollama
– Complexity scoring: lexical diversity, prompt length, semantic analysis
– Quality checks: relevance, coherence, grammar
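As a toy illustration of the relevance check: the real project uses transformer embeddings for semantic similarity, but a bag-of-words cosine conveys the idea of gating a response against the prompt (the threshold and function names here are assumptions):

```python
import math
from collections import Counter

# Toy relevance check: cosine similarity over word counts.
# (The actual quality scoring uses semantic embeddings; this is a stand-in.)
def cosine_similarity(a: str, b: str) -> float:
    va, vb = Counter(a.lower().split()), Counter(b.lower().split())
    dot = sum(va[w] * vb[w] for w in va)
    norm = (math.sqrt(sum(v * v for v in va.values()))
            * math.sqrt(sum(v * v for v in vb.values())))
    return dot / norm if norm else 0.0

def passes_quality_check(prompt: str, response: str,
                         threshold: float = 0.2) -> bool:
    """Trigger fallback to a stronger model when a response drifts off-topic."""
    return cosine_similarity(prompt, response) >= threshold
```

A response that shares no vocabulary with the prompt scores 0 and fails the check, which is what triggers the graceful fallback described above.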

Repo: [https://github.com/JustVugg/llm-use](https://github.com/JustVugg/llm-use)

Thanks! Happy to answer questions.


Comments URL: https://news.ycombinator.com/item?id=45504149

Points: 1

Comments: 0

Source: github.com
