Supacrawler is an open-source web scraping API built in Go for performance and concurrency. It uses Playwright-Go for JS rendering and Redis for caching and queuing. It's easy to self-host (clone the repo and run docker-compose), or you can use the hosted version.
What Supacrawler supports:
– v1/scrape: fetch content (raw HTML or JS-rendered) from a URL
– v1/crawl: follow links to crawl entire sites or sections
– v1/screenshots: capture visual renderings of pages (full page, element, etc.)
– v1/watch: monitor pages for changes over time
– v1/parse: the new endpoint. You submit a URL plus a schema or desired output format (JSON, CSV, YAML, Markdown), and it returns structured data without needing custom scraper logic
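To give a feel for how a client might talk to these endpoints, here is a rough Go sketch. It is an illustration under assumptions, not the documented API: the `url` query parameter, the bearer-token header, the GET method, and the `http://localhost:8081` base address are all hypothetical placeholders, and `endpointURL` and `fetch` are helper names made up for this example. Check the repo for the real request shapes.

```go
package main

import (
	"fmt"
	"io"
	"net/http"
	"net/url"
	"strings"
)

// endpointURL joins a base address with an endpoint path such as "v1/scrape"
// and attaches the target page as a query parameter.
// ASSUMPTION: the "url" parameter name is hypothetical, not taken from the docs.
func endpointURL(base, endpoint, target string) (string, error) {
	u, err := url.Parse(strings.TrimRight(base, "/") + "/" + strings.TrimLeft(endpoint, "/"))
	if err != nil {
		return "", err
	}
	q := u.Query()
	q.Set("url", target)
	u.RawQuery = q.Encode()
	return u.String(), nil
}

// fetch sends a GET request and returns the raw response body.
// ASSUMPTION: bearer-token auth and the GET method are guesses for illustration.
func fetch(client *http.Client, reqURL, apiKey string) ([]byte, error) {
	req, err := http.NewRequest(http.MethodGet, reqURL, nil)
	if err != nil {
		return nil, err
	}
	if apiKey != "" {
		req.Header.Set("Authorization", "Bearer "+apiKey)
	}
	resp, err := client.Do(req)
	if err != nil {
		return nil, err
	}
	defer resp.Body.Close()
	if resp.StatusCode != http.StatusOK {
		return nil, fmt.Errorf("unexpected status: %s", resp.Status)
	}
	return io.ReadAll(resp.Body)
}

func main() {
	// ASSUMPTION: the base address of a self-hosted instance is a placeholder.
	reqURL, err := endpointURL("http://localhost:8081", "v1/scrape", "https://example.com")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	// Only the request URL is printed here; pass it to fetch with an
	// *http.Client to perform the call against a running instance.
	fmt.Println(reqURL)
}
```

The same `endpointURL` helper would cover the other endpoints by swapping the path (for example `v1/crawl`), though endpoints like v1/parse that take a schema would presumably need a request body rather than a query string.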
Repo: https://github.com/supacrawler/supacrawler
Cloud: https://supacrawler.com
Let me know what would make this a tool you’d rely on in production! Thanks for checking this out 🙂
Comments URL: https://news.ycombinator.com/item?id=45402233
Source: supacrawler.com