Show HN: DedupX – Duplicate file finder with perceptual image matching for macOS

Hi HN,

I built a duplicate file finder for my brother, a photography enthusiast who constantly runs out of storage because he resizes a lot of photos and ends up with many duplicates lying around.

Notes:
– Incremental hashing: Instead of loading entire files into memory, I hash files in chunks. Files with identical sizes get grouped and progressively hashed until they diverge or match completely.
– Perceptual hashing: For images, I use perceptual hashing (pHash) that generates a fingerprint based on visual content rather than bytes. Similar images have similar hashes.
– BK-Tree indexing: To efficiently search for similar hashes, I implemented a BK-tree that organizes hashes by Hamming distance. This lets me query “find all images within distance N” without comparing against every single hash.
– Configurable similarity: Users can adjust the Hamming distance threshold (1-15) to control how strict the matching should be.
– macOS Services integration: You can right-click any folder in Finder and select “Scan for Duplicates”.
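
To illustrate the incremental hashing note above, here is a minimal Python sketch. It is not DedupX's actual code: the function names, the 64 KiB chunk size, and the choice of SHA-256 are all assumptions made for the example. Files are first grouped by size, then each group is read one chunk at a time and split whenever the chunk hashes differ, so files that diverge early are never read to the end.

```python
import hashlib
import os
from collections import defaultdict

CHUNK_SIZE = 64 * 1024  # assumed chunk size, chosen only for this example


def find_exact_duplicates(paths, chunk_size=CHUNK_SIZE):
    """Return lists of paths whose contents are byte-for-byte identical."""
    # Step 1: only files of the same size can be exact duplicates.
    by_size = defaultdict(list)
    for path in paths:
        by_size[os.path.getsize(path)].append(path)

    duplicates = []
    for group in by_size.values():
        if len(group) > 1:
            duplicates.extend(_split_by_content(group, chunk_size))
    return duplicates


def _split_by_content(group, chunk_size):
    """Read candidates chunk by chunk, splitting the group as files diverge."""
    # A real tool would cap the number of simultaneously open files.
    handles = {path: open(path, "rb") for path in group}
    try:
        pending = [list(group)]
        finished = []
        while pending:
            current = pending.pop()
            # Bucket every file in the group by the hash of its next chunk.
            buckets = defaultdict(list)
            for path in current:
                chunk = handles[path].read(chunk_size)
                key = (len(chunk) == 0, hashlib.sha256(chunk).digest())
                buckets[key].append(path)
            for (at_eof, _), bucket in buckets.items():
                if len(bucket) < 2:
                    continue  # this file diverged from all others; drop it
                if at_eof:
                    finished.append(sorted(bucket))  # matched to the end
                else:
                    pending.append(bucket)  # still identical; keep reading
        return finished
    finally:
        for handle in handles.values():
            handle.close()
```

Calling find_exact_duplicates on a list of file paths returns one sorted list per set of identical files. Memory use stays at one chunk per open file, regardless of file size.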
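
The BK-tree note above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the app's implementation: the class and method names are hypothetical, and plain integers stand in for the perceptual hash fingerprints. Each child edge is labeled with its Hamming distance to the parent, and the triangle inequality lets a query skip every subtree whose edge label falls outside [d - N, d + N].

```python
def hamming(a: int, b: int) -> int:
    """Number of differing bits between two integer fingerprints."""
    return bin(a ^ b).count("1")


class BKTree:
    """Hypothetical minimal BK-tree keyed by Hamming distance."""

    def __init__(self):
        self.root = None  # each node is [hash_value, {distance: child_node}]

    def add(self, value: int) -> None:
        if self.root is None:
            self.root = [value, {}]
            return
        node = self.root
        while True:
            d = hamming(value, node[0])
            if d == 0:
                return  # identical hash already stored; a real tool would keep both files
            child = node[1].get(d)
            if child is None:
                node[1][d] = [value, {}]
                return
            node = child

    def query(self, value: int, max_distance: int):
        """Return sorted (distance, hash) pairs within max_distance of value."""
        results = []
        stack = [self.root] if self.root is not None else []
        while stack:
            node_value, children = stack.pop()
            d = hamming(value, node_value)
            if d <= max_distance:
                results.append((d, node_value))
            # Triangle inequality: matches can only live under edges in this range.
            for edge, child in children.items():
                if d - max_distance <= edge <= d + max_distance:
                    stack.append(child)
        return sorted(results)


tree = BKTree()
for h in (0b0000, 0b0001, 0b0011, 0b1111):
    tree.add(h)
print(tree.query(0b0000, 1))  # [(0, 0), (1, 1)]
```

Here max_distance plays the role of the user-adjustable similarity threshold: a smaller value prunes more of the tree and returns only near-identical fingerprints.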

The app has a free trial (10 scans or 7 days, whichever comes first) and then requires a license. I’m using Dodo Payments for licensing.

I’d love feedback from the community, especially on:
– Performance optimizations I might have missed
– Better UX patterns for the results view
– Edge cases in the similarity detection
– More feature suggestions

REQUIREMENTS: macOS 26.0.1 (Tahoe) on an Apple Silicon Mac

Happy to answer questions about the implementation or architecture!


Comments URL: https://news.ycombinator.com/item?id=45763117

Source: maheepk.net
