A record that generates too many matches is considered a hotspot. Depending on the match configuration, the match process can sometimes generate a large number of matches, many of them irrelevant.
Here are a few reasons that can cause a hotspot:
- Noise data: Noise words are common words that do not identify a record on their own. Ideally the match process removes this data, but if a word is commonly used in the records and is not recognized by the match process, it can cause a hotspot. Example: if a source marks inactive customers by adding the word "INACTIVE" to the customer name, a large number of records in the MDM system will have "INACTIVE" in the customer name, potentially leading to irrelevant matches.
- Search level: The Exhaustive and Extreme search levels can cause overmatching.
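One way to spot candidate noise words before they reach the match process is to count how many records each token appears in. The following is only a minimal sketch in plain Python, run on a sample of names exported from the source; the function name `frequent_tokens`, the 50% cutoff, and the sample data are assumptions made for illustration and are not part of any MDM API.

```python
from collections import Counter


def frequent_tokens(names, min_share=0.5):
    """Return {token: share} for tokens found in at least min_share of the names.

    Hypothetical helper for illustration; the cutoff is an assumption.
    """
    counts = Counter()
    for name in names:
        # Count a token once per record, so repeats inside one name do not inflate it.
        counts.update(set(name.upper().split()))
    total = len(names)
    return {tok: n / total for tok, n in counts.items() if n / total >= min_share}


# Made-up sample data: three of four names carry the word "INACTIVE".
names = ["ACME CORP INACTIVE", "GLOBEX INACTIVE", "INITECH INACTIVE", "UMBRELLA LTD"]
noise = frequent_tokens(names, min_share=0.5)
print(noise)
```

Tokens reported this way are only candidates. A frequent word can still be meaningful, so each one should be reviewed before it is treated as noise.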
Impact of Hotspot:
- It delays the match process
- Sometimes the match job fails with a timeout
- Irrelevant records are matched
How to handle hotspot:
These are a few solutions we can try:
- DMAT: The Dynamic Match Analysis Threshold is used in the match process to limit the number of comparisons.
- Match Keys Distribution: This tool in the match configuration helps identify hotspots.
- During the match process you can put these records on hold with consolidation_ind=9. After match and merge you can review them manually.
- EXCLUDE_FROM_MATCH: You can create a column named "EXCLUDE_FROM_MATCH" in the BO and set its value to 1 for the records that cause overmatching.
- Cleansing and standardization: Proper cleansing and standardization help reduce noise data.
- Change in match configuration: Adding at least one exact column to the match rules and selecting the correct search level can help reduce overmatching.
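
The cleansing and EXCLUDE_FROM_MATCH ideas above can be sketched outside the MDM hub, for example in a pre-load script. This is a rough illustration under stated assumptions, not how the hub itself works: `cleanse_name`, `flag_hotspots`, the `NOISE_WORDS` set, the `match_key` field, and the `max_per_key` limit are all hypothetical, and the simple key used here only stands in for the match keys that the hub generates.

```python
from collections import Counter

# Assumption: noise words agreed with the source system owners.
NOISE_WORDS = {"INACTIVE"}


def cleanse_name(name, noise_words=NOISE_WORDS):
    """Upper-case the name, collapse whitespace, and drop known noise words."""
    tokens = [t for t in name.upper().split() if t not in noise_words]
    return " ".join(tokens)


def flag_hotspots(records, key_field="match_key", max_per_key=2):
    """Return copies of the records with EXCLUDE_FROM_MATCH set to 1
    when more than max_per_key records share the same key, else 0."""
    counts = Counter(r[key_field] for r in records)
    return [
        dict(r, EXCLUDE_FROM_MATCH=1 if counts[r[key_field]] > max_per_key else 0)
        for r in records
    ]


# Made-up records; the key is simply the cleansed name in this sketch.
raw = ["Unknown inactive", "UNKNOWN", " unknown ", "Acme Corp INACTIVE"]
records = [{"name": n, "match_key": cleanse_name(n)} for n in raw]
flagged = flag_hotspots(records, max_per_key=2)
for r in flagged:
    print(r["match_key"], r["EXCLUDE_FROM_MATCH"])
```

Removing the noise word first keeps "INACTIVE" from driving matches, and the flag then marks only the records whose key is still shared by too many rows. The right limit depends on the data and on the match configuration.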
