Why No Single Algorithm Solves Deduplication - and What to Do Instead | HackerNoon
Effective blocking dramatically cuts comparisons while still grouping true duplicates together. Several blocking strategies can be applied in multi-pass to improve recall.