Hey HN! Author here.
Built this because deploying ML models is painful: Python + pip dependencies + Docker = 5GB+ images, which doesn't work for edge, embedded, or air-gapped systems.
LOOM loads HuggingFace transformers directly in Go. No Python runtime. ~10MB binary.
Technical highlights:
– Native safetensors parser (format sketch after this list)
– Pure Go BPE tokenizer (no transformers library)
– Full transformer stack (MHA, GQA, RMSNorm, SwiGLU)
– Cross-platform determinism (MAE < 1e-8)
– Published to PyPI, npm, NuGet
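For anyone curious why the safetensors parser is feasible in plain Go: the format is just an 8-byte little-endian header length, a JSON header describing each tensor, then the raw bytes. This is not LOOM's actual code, just a minimal sketch of reading that header (file name is hypothetical):

    package main

    import (
        "encoding/binary"
        "encoding/json"
        "fmt"
        "io"
        "os"
    )

    // tensorInfo mirrors one entry in the safetensors JSON header.
    type tensorInfo struct {
        Dtype       string   `json:"dtype"`
        Shape       []int64  `json:"shape"`
        DataOffsets [2]int64 `json:"data_offsets"` // [begin, end) into the byte buffer
    }

    func main() {
        f, err := os.Open("model.safetensors") // hypothetical file name
        if err != nil {
            panic(err)
        }
        defer f.Close()

        // First 8 bytes: little-endian u64 giving the JSON header size.
        var headerLen uint64
        if err := binary.Read(f, binary.LittleEndian, &headerLen); err != nil {
            panic(err)
        }

        // Next headerLen bytes: JSON mapping tensor name -> {dtype, shape, data_offsets}.
        raw := make([]byte, headerLen)
        if _, err := io.ReadFull(f, raw); err != nil {
            panic(err)
        }
        var header map[string]json.RawMessage
        if err := json.Unmarshal(raw, &header); err != nil {
            panic(err)
        }

        for name, entry := range header {
            if name == "__metadata__" { // optional string-to-string metadata block
                continue
            }
            var info tensorInfo
            if err := json.Unmarshal(entry, &info); err != nil {
                panic(err)
            }
            fmt.Printf("%s: dtype=%s shape=%v bytes=[%d,%d)\n",
                name, info.Dtype, info.Shape, info.DataOffsets[0], info.DataOffsets[1])
        }
        // Tensor data follows the header; data_offsets are relative to that point.
    }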
Tradeoff: CPU-only, 1-3 tok/s on small models. Correctness first, speed second.
Works with Qwen, Llama, Mistral, SmolLM. Cross-compiles everywhere Go runs.
Demo: https://youtu.be/86tUjFWow60
What layer types should I add next? Currently have: Dense, Conv2D, MHA, RNN, LSTM, LayerNorm, RMSNorm, SwiGLU, Softmax (10 variants), Residual.
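For a sense of how small these layers are, RMSNorm is just y_i = x_i / sqrt(mean(x^2) + eps) * weight_i. A standalone sketch of that formula (not LOOM's actual API; the weights and epsilon below are placeholder demo values):

    package main

    import (
        "fmt"
        "math"
    )

    // rmsNorm normalizes x by its root-mean-square, then scales elementwise
    // by a learned weight vector. eps guards against division by zero.
    func rmsNorm(x, weight []float64, eps float64) []float64 {
        var sumSq float64
        for _, v := range x {
            sumSq += v * v
        }
        inv := 1.0 / math.Sqrt(sumSq/float64(len(x))+eps)
        out := make([]float64, len(x))
        for i, v := range x {
            out[i] = v * inv * weight[i]
        }
        return out
    }

    func main() {
        x := []float64{1, 2, 3, 4}
        weight := []float64{1, 1, 1, 1} // identity scale for the demo
        fmt.Println(rmsNorm(x, weight, 1e-6))
    }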
Questions welcome!
