Fuzzy Matching to identify similar profiles within Treasure Data
This Treasure Box contains the workflows needed to match profiles probabilistically using fuzzy matching algorithms such as Levenshtein distance. Profiles that are likely to be similar are gathered first, then compared further against behavioral data using deterministic matching. Finally, a report is generated that gives an overview of the similar profiles and of the relevant profiles that can be kept in the Treasure Data platform.
Process of Fuzzy Matching:
- Data Preprocessing
- Soundex Algorithm
- Master Key Generation
- Levenshtein Algorithm
- Dynamic Match Store Generation
- Dynamic Report Generation
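
The core of the steps above can be illustrated with a minimal Python sketch. It is not the code shipped in this Treasure Box, only an illustration under stated assumptions: names are lowercased as preprocessing, the Soundex code of the name serves as a stand-in grouping key so that only profiles in the same group are compared, and pairs within a Levenshtein distance of `max_distance` are reported. The function names, the `(profile_id, name)` input shape, and the threshold of 2 are all hypothetical.

```python
from collections import defaultdict
from itertools import combinations

# Standard American Soundex consonant groups: bfpv=1, cgjkqsxz=2, dt=3, l=4, mn=5, r=6.
_CODES = {}
for digit, letters in enumerate(("bfpv", "cgjkqsxz", "dt", "l", "mn", "r"), start=1):
    for ch in letters:
        _CODES[ch] = str(digit)


def soundex(name):
    """Return the 4-character Soundex code of a name ('' if it has no a-z letters)."""
    letters = [c for c in name.lower() if "a" <= c <= "z"]
    if not letters:
        return ""
    result = letters[0].upper()
    prev = _CODES.get(letters[0], "")
    for ch in letters[1:]:
        if ch in "hw":
            continue  # h and w are ignored and do not separate equal codes
        code = _CODES.get(ch, "")  # vowels map to '' and reset the previous code
        if code and code != prev:
            result += code
        prev = code
    return (result + "000")[:4]


def levenshtein(a, b):
    """Edit distance (insert, delete, substitute) using two rolling rows."""
    if len(a) < len(b):
        a, b = b, a
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(previous[j] + 1,          # deletion
                               current[j - 1] + 1,       # insertion
                               previous[j - 1] + cost))  # substitution
        previous = current
    return previous[-1]


def find_similar(profiles, max_distance=2):
    """Group (profile_id, name) pairs by Soundex code, then compare within each group."""
    blocks = defaultdict(list)
    for profile_id, name in profiles:
        cleaned = name.strip().lower()  # assumed preprocessing step
        blocks[soundex(cleaned)].append((profile_id, cleaned))
    matches = []
    for members in blocks.values():
        for (id_a, name_a), (id_b, name_b) in combinations(members, 2):
            distance = levenshtein(name_a, name_b)
            if distance <= max_distance:
                matches.append((id_a, id_b, distance))
    return matches


# Hypothetical sample data for illustration only.
sample = [(1, "Jon Smith"), (2, "John Smith"), (3, "Jane Doe")]
print(find_similar(sample))  # [(1, 2, 1)]
```

Grouping by a phonetic key before computing edit distances keeps the number of pairwise comparisons manageable, since Levenshtein distance is only evaluated inside each group rather than across every pair of profiles.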
Galleries
Example Dashboard
Use-Cases
- Identify similar profiles
- Integrate different data sources
- Spell Checker