Show HN: Tokenflood – simulate arbitrary loads on instruction-tuned LLMs

Hi everyone, I just released an open-source load testing tool for LLMs:

https://github.com/twerkmeister/tokenflood

=== What is it and what problems does it solve? ===

Tokenflood is a load testing tool for instruction-tuned LLMs that can simulate arbitrary LLM loads in terms of prompt, prefix, and output lengths and requests per second. Instead of first collecting prompt data for different load types, you configure the desired parameters for your load test and you are good to go. It also lets you assess the latency effects of potential prompt parameter changes before spending the time and effort to implement them.

I believe it’s really useful for developing latency-sensitive LLM applications and for:
* Load testing self-hosted LLM setups
* Assessing the latency benefit of changes to prompt parameters before implementing those changes
* Assessing latency and intraday variation of latency on hosted LLM services before sending your traffic there
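
To make the idea of a parameterized load concrete, here is a minimal sketch of how a load type could be described purely by prompt, prefix, and output lengths plus requests per second. This is not tokenflood's actual configuration format or API: `LoadProfile`, `send_offsets`, and `synthetic_prompt` are hypothetical names made up for illustration, and one whitespace-separated word stands in for one token, which is only a rough approximation.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class LoadProfile:
    """Hypothetical description of one load type (not tokenflood's schema)."""
    prompt_tokens: int          # total prompt length per request
    prefix_tokens: int          # leading part shared by all requests
    output_tokens: int          # completion length one would request from the model
    requests_per_second: float  # target request rate
    duration_seconds: float     # how long the test runs


def send_offsets(profile: LoadProfile) -> list[float]:
    """Evenly spaced send times, in seconds from the start of the test."""
    # Small epsilon guards against float products landing just below an integer.
    total = math.floor(profile.requests_per_second * profile.duration_seconds + 1e-9)
    interval = 1.0 / profile.requests_per_second
    return [i * interval for i in range(total)]


def synthetic_prompt(profile: LoadProfile, request_index: int) -> str:
    """Build a prompt with a shared prefix and a per-request suffix.

    Assumption: one word approximates one token.
    """
    if profile.prefix_tokens > profile.prompt_tokens:
        raise ValueError("prefix cannot be longer than the prompt")
    prefix = ["shared"] * profile.prefix_tokens
    suffix_len = profile.prompt_tokens - profile.prefix_tokens
    # The suffix differs per request so only the prefix repeats across requests.
    suffix = [f"r{request_index}w{i}" for i in range(suffix_len)]
    return " ".join(prefix + suffix)
```

With a profile of 512 prompt tokens, 128 prefix tokens, 2 requests per second, and a 5 second duration, `send_offsets` yields 10 send times spaced 0.5 seconds apart, and every prompt from `synthetic_prompt` starts with the same 128 words. Changing one field and rerunning is the kind of what-if comparison described above, without collecting real prompt data first.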

=== Why did I build it? ===

Over the course of the past year, part of my work has been helping my clients meet their latency, throughput and cost targets for LLMs (PTUs, anyone?). That process involved making numerous choices about cloud providers, hardware, inference software, models, configurations and prompt changes. During that time I found myself running similar tests over and over with a collection of ad hoc scripts. I finally had some time on my hands and wanted to put them together properly in one tool.

=== What am I looking for? ===

I am sharing this for three reasons: hoping it can make others’ work on latency-sensitive LLM applications simpler, learning and improving from feedback, and finding new projects to work on.

So please check it out on GitHub (https://github.com/twerkmeister/tokenflood), comment, and reach out at thomas@werkmeister.me or on LinkedIn (https://www.linkedin.com/in/twerkmeister/) for professional inquiries.

=== Pics ===

Image of CLI interface: https://github.com/twerkmeister/tokenflood/blob/main/images/…

Result image: https://github.com/twerkmeister/tokenflood/blob/main/images/…


Comments URL: https://news.ycombinator.com/item?id=45898674