Xtool: Dedup
For large inputs (e.g., 50+ GB), use --global --external to sort on disk and reduce the memory footprint.

Integration Examples

Remove duplicate IPs from a web log (global):
grep "GET /api" access.log | xtool dedup --global > unique_ips.txt
Deduplicate the rows of a CSV file while keeping the header line:
head -n 1 sales.csv > clean.csv
tail -n +2 sales.csv | xtool dedup --global >> clean.csv
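
How xtool implements --global and --external is not described here. As a rough analogy only, the sketch below uses standard Unix tools (awk and sort, not xtool) to show the trade-off between deduplicating in memory and deduplicating by sorting on disk, applied to the same header-preserving CSV pattern. The file names sales_demo.csv, clean_mem.csv, and clean_disk.csv are made up for illustration.

```shell
# Hypothetical sample data: a header plus four rows, one of them a duplicate.
printf 'id,amount\n3,30\n1,10\n3,30\n2,20\n' > sales_demo.csv

# In-memory approach: awk stores every distinct line in a hash table,
# so memory grows with the number of unique rows. Input order is preserved.
head -n 1 sales_demo.csv > clean_mem.csv
tail -n +2 sales_demo.csv | awk '!seen[$0]++' >> clean_mem.csv

# Sort-based approach: sort can spill to temporary files on disk for
# large inputs, keeping memory bounded. Output is sorted, not in input order.
head -n 1 sales_demo.csv > clean_disk.csv
tail -n +2 sales_demo.csv | LC_ALL=C sort -u >> clean_disk.csv

cat clean_mem.csv
cat clean_disk.csv
```

Both outputs keep the header and contain each data row once. The awk variant keeps rows in first-seen order (3, 1, 2), while the sort variant reorders them (1, 2, 3), which is the usual price of a disk-backed sort.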