Monday, June 3, 2024

What is a Hotspot in MDM Match

Hotspot
A record that generates an excessive number of matches is called a hotspot. Depending on the match configuration, the match process can sometimes generate too many matches, many of which are irrelevant.

 

Here are a few reasons that can cause a hotspot:


  • Noise data: Noise words are common words that carry no identifying meaning on their own. Ideally the match process removes them, but if a word appears frequently in the records without being recognized by the match process, it can cause a hotspot. Example: if a source system appends the word "INACTIVE" to the customer name to flag inactive customers, a large number of records in the MDM system will contain "INACTIVE" in the name, potentially leading to irrelevant matches.
  • Search level: The Exhaustive and Extreme search levels can cause overmatching.
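To make the noise-word problem concrete, here is a minimal sketch (not MDM code; the noise-word list and function are illustrative) of a cleanse step that strips configured noise words from a name before it is used for matching:

```python
# Hypothetical cleanse step: strip configured noise words from a name
# before match key generation. "INACTIVE" mirrors the example above;
# the other entries are made-up placeholders.
NOISE_WORDS = {"INACTIVE", "INC", "LLC"}

def cleanse_name(name: str) -> str:
    """Remove noise words and collapse whitespace."""
    tokens = [t for t in name.upper().split() if t not in NOISE_WORDS]
    return " ".join(tokens)

print(cleanse_name("John Smith INACTIVE"))  # JOHN SMITH
```

If the stage job applies a cleanse function like this, the two records "John Smith" and "John Smith INACTIVE" produce the same match key instead of dragging every "INACTIVE" record into one hotspot.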

Impact of a hotspot:

  1.  It delays the match process
  2.  The match job can fail with a timeout
  3.  Irrelevant records get matched

How to handle a hotspot:

These are a few solutions we can try:
  • DMAT: The dynamic match analysis threshold limits the number of comparisons performed in the match process.
  • Match Keys Distribution: This tool in the match configuration helps identify hotspots.
  • During the match process you can keep these records on hold by setting CONSOLIDATION_IND=9. After match and merge, you can review them manually.
  • EXCLUDE_FROM_MATCH: You can create a column named EXCLUDE_FROM_MATCH in the base object and set it to 1 for the records that cause overmatching.
  • Cleansing and standardization: Proper cleansing and standardization help reduce noise data.
  • Change in match configuration: Adding at least one exact-match column to the match rules and selecting the correct search level can help reduce overmatching.
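The hold and exclude options can be sketched with an in-memory SQLite table standing in for a base object. This is a hedged illustration only: the `C_PARTY` table and its columns follow the naming conventions above but are not an actual MDM schema.

```python
import sqlite3

# Toy stand-in for a base object table; C_PARTY and its columns are
# illustrative, not a real MDM schema.
con = sqlite3.connect(":memory:")
con.execute("""CREATE TABLE C_PARTY (
    ROWID_OBJECT TEXT PRIMARY KEY,
    FULL_NAME TEXT,
    CONSOLIDATION_IND INTEGER,
    EXCLUDE_FROM_MATCH INTEGER DEFAULT 0)""")
con.executemany(
    "INSERT INTO C_PARTY VALUES (?, ?, ?, ?)",
    [("1", "JOHN SMITH", 4, 0), ("2", "INACTIVE CUSTOMER", 4, 0)])

# Put a suspected hotspot record on hold for manual review ...
con.execute("UPDATE C_PARTY SET CONSOLIDATION_IND = 9 WHERE ROWID_OBJECT = '2'")
# ... or flag it so the match job skips it entirely.
con.execute("UPDATE C_PARTY SET EXCLUDE_FROM_MATCH = 1 WHERE ROWID_OBJECT = '2'")

row = con.execute(
    "SELECT CONSOLIDATION_IND, EXCLUDE_FROM_MATCH FROM C_PARTY "
    "WHERE ROWID_OBJECT = '2'").fetchone()
print(row)  # (9, 1)
```

In a real hub these updates would be made through the supported tooling rather than direct SQL; the point is only what the two flags mean on a record.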

Sunday, May 12, 2024

Multiple ways to execute LOAD Job in MDM

Loading data into a base object is known as a load job. All Master Data Management operations that produce the golden record are performed on base object records. Here are the different ways to load data into a base object, that is, to run the load job:

1. Batch Viewer:

            You can execute the MDM load job directly from the Hub Console. Under the Batch Viewer tab you will see the list of base objects. Select a base object and run the load job.

  • The user must have Hub Console access
  • Batch job metrics are easily accessible
  • Job history is visible to the user
  • Job scheduling is not possible

2. SIF API:

           Informatica provides the SIF API ExecuteBatchLoad to run the load job. This API can be called from SoapUI or through a Java EJB call.

  • The user does not need to log in to the Hub Console
  • It does not provide a user interface for job metrics and job history
  • The API can be called from a Java class by creating an EJB client

Sample request parameters:

username - the user executing the load job
password - the user's password
orsId - the database (ORS) ID
tableName - the stage table name
forceUpdateInd - 1 or 0 (if 1, MDM loads data regardless of the last update date; if 0, MDM loads only data with a more recent last update date)

Sample response parameters:

message - a message about the status of the job
retCode - the return code
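The request parameters above can be assembled into a SOAP envelope along these lines. This is a hedged sketch: the envelope structure and the `urn:siperian.api`-style namespace are placeholders, and the exact element names should be taken from your SIF WSDL.

```python
# Illustrative ExecuteBatchLoad request body. The namespace and element
# names here are placeholders; take the exact names from the SIF WSDL.
def build_execute_batch_load(username, password, ors_id, table_name,
                             force_update_ind=0):
    return f"""<soapenv:Envelope
  xmlns:soapenv="http://schemas.xmlsoap.org/soap/envelope/"
  xmlns:urn="urn:siperian.api">
  <soapenv:Body>
    <urn:executeBatchLoad>
      <urn:username>{username}</urn:username>
      <urn:password>{password}</urn:password>
      <urn:orsId>{ors_id}</urn:orsId>
      <urn:tableName>{table_name}</urn:tableName>
      <urn:forceUpdateInd>{force_update_ind}</urn:forceUpdateInd>
    </urn:executeBatchLoad>
  </soapenv:Body>
</soapenv:Envelope>"""

# Hypothetical values, as you would enter them in SoapUI.
req = build_execute_batch_load("admin", "secret", "localhost-orcl-CMX_ORS",
                               "C_PARTY_STG", force_update_ind=1)
print(req)
```

The response then carries `message` and `retCode` back, which a caller can inspect to decide whether the load succeeded.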

3. Command Line Batch Execution:

An MDM batch load can be performed from the command line using the Command Line Batch Execution tool in the Resource Kit.

-username -password action -tablename [-forceload]

  • Command Line Batch Execution internally calls the SIF API
  • The user does not need Hub Console access
  • The job can be triggered from any scheduler tool
  • It can be used in shell scripts
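For scheduler integration, the command shown above can be wrapped in a small launcher. This is a sketch under assumptions: `run_batch.sh` is a placeholder name for the actual Resource Kit launcher script on your installation.

```python
import subprocess

# Hedged sketch of wrapping the Resource Kit command-line batch tool so a
# scheduler can call it. "run_batch.sh" is a placeholder for the actual
# launcher script; the flags follow the syntax shown above.
def build_load_command(username, password, table_name, force_load=False):
    cmd = ["run_batch.sh",          # placeholder launcher name
           "-username", username,
           "-password", password,
           "load",                  # the batch action to execute
           "-tablename", table_name]
    if force_load:
        cmd.append("-forceload")
    return cmd

def run_load(username, password, table_name, force_load=False):
    # check=False: inspect returncode yourself, like retCode from the SIF API
    return subprocess.run(
        build_load_command(username, password, table_name, force_load),
        check=False)

print(build_load_command("admin", "secret", "C_PARTY_STG", force_load=True))
```

A cron entry or any scheduler can then invoke this script and react to the process return code.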
If you know of any other ways to execute a load job, we'd love to hear from you in the comments section.

Wednesday, May 8, 2024

Informatica MDM Core Components

 


Here is the list of MDM core components and how they are used in the Multidomain MDM architecture:

1. MDM Hub Master Database (CMX_SYSTEM)

This is the master database that holds information about the ORS databases, MDM users, message queue settings, and IDD application configuration.

The default name of the MDM Hub Master Database is CMX_SYSTEM, but you can use a custom name.

2. Operational Reference Store (CMX_ORS)

CMX_ORS stores the business data, content metadata, and rules used to process and manage the master data. It holds all base objects, stage tables, landing tables, and the REPOS tables that contain the configuration details.

3. Hub Server

The Hub Server is a J2EE application that can be deployed to any application server supported by MDM. MDM supports three application servers:

  • JBoss
  • WebSphere
  • WebLogic

It manages the core and common MDM services.

4. Process Server

The Process Server processes batch jobs such as load, cleanse, recalculate BVT, and match & merge. After a successful deployment to the application server, you must register the Process Server in the Hub Console.

5. Provisioning Tool

The Provisioning tool is used to create and design the e360 portal. You can create multiple applications in the Provisioning tool and publish them to MDM.

6. ActiveVOS Schema

The ActiveVOS schema stores ActiveVOS metadata and task details.

7. Informatica ActiveVOS

This is a business process management tool that helps automate business processes and tasks. Informatica provides predefined MDM workflows by default, but you can customize these workflows using ActiveVOS Designer.

8. Informatica Data Director

IDD is the user interface used to create, update, and manage master data records. Records created through IDD flow into MDM in real time.

Tuesday, May 26, 2020

Informatica MDM Basic

In this article we will see how Informatica MDM works.

Informatica MDM System


Input System:

Input systems, also called source systems, are the systems from which data is inserted into the MDM landing tables. Source systems can be of two types:

Internal Source System:

This is a source system that is internal to the organization.

External Source System:

This is an external system whose data the MDM application can use for comparison.


Landing:

Data from the source systems is inserted into the landing tables. This can be done using any ETL tool or SQL queries.

Loading:

From landing, data is inserted into the MDM base object table by the load job. While loading data from landing to the base object, an intermediate job, the stage job, is executed. The stage job performs data cleansing and standardization.

Match:

The match job identifies similar records based on a set of rules. Match rules can use fuzzy or exact matching.

Merge:

Similar records are merged to create a single consolidated record. This is the golden record.
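The match and merge steps above can be sketched as a toy exact-match consolidation. This only shows the idea: real MDM matching uses fuzzy keys, search levels, and trust scores for survivorship, and the field names below are made up.

```python
from collections import defaultdict

# Toy records from two sources; field names are illustrative.
records = [
    {"source": "ONLINE", "name": "JOHN SMITH", "phone": None},
    {"source": "STORE",  "name": "JOHN SMITH", "phone": "555-0101"},
    {"source": "ONLINE", "name": "JANE DOE",   "phone": "555-0202"},
]

# Match: group records on an exact key (real MDM also supports fuzzy match).
groups = defaultdict(list)
for rec in records:
    groups[rec["name"]].append(rec)

# Merge: build one golden record per group, taking the first non-empty
# value for each field (a crude stand-in for survivorship/trust rules).
golden = []
for name, members in groups.items():
    merged = {"name": name}
    merged["phone"] = next((m["phone"] for m in members if m["phone"]), None)
    golden.append(merged)

print(golden)
```

Here the two "JOHN SMITH" records collapse into one golden record that picks up the phone number only the store record had, which is exactly the gap-filling behaviour described in the examples below.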

Output System:

These consolidated records can be published to any output system. Data can be published by:

1. Publishing data using message queues
2. Publishing data using SQL procedures
3. Publishing data using SQL views

These are some of the ways data can be published from MDM.


Friday, May 22, 2020

Master Data Management - Intro

This article will explain 

- What is Master data Management
- Why we need it
- Example

What is Master Data Management

Every organization collects data from various systems. Gathering data from these different systems, correcting inappropriate data, standardizing it, and then creating the single most reliable record is known as master data management; this record is called master data. Have you ever encountered a situation where the same information was stored in two different systems? It can be confusing and time-consuming to manage. Let's discuss a scenario in which we face this issue with duplicate data.

 Master Data Management System

A master data management system is a system used to manage this master data; many MDM tools are available in the market. Such a system collects data from multiple source systems, performs cleansing to improve data quality, merges similar records into a single record, and delivers that record to the business user.

The end user can be any system where the collective data is used to make business decisions.


Master Data Management



Why we need it

  1. Improved data quality: data collected by MDM is kept in a standard format, which maintains quality.
  2. Avoid redundancy: there may be duplicate records; MDM eliminates the redundancy so that a single record remains.
  3. A 360-degree view for the end user, bringing information from all sources into a single system.
  4. Helps in decision-making, as data from various systems is presented in a single view.
  5. Up-to-date data across all systems.
Example:

Here we will take the example of a retail supermarket. Data is collected into the database from various systems such as online sites, shops, and market stores. The same customer might purchase a product online and also buy one from the shop, which creates duplicate data. In this scenario, we will have duplicate information from two systems.

Some information might also be missing when buying products online. For example, a customer might not provide personal details. We would then be missing this information in the online system, but it could be collected from the store database. This way, missing data is filled in from other systems.

The customer might likewise provide incomplete address information in the store, but we can get those details from their online purchase, where the address is more trustworthy than the one from the store.

Managing data from multiple source systems and viewing it on a single platform is difficult. MDM provides all this information in a single view. In the above example, the information about a single customer, registered both online and at the store, is collected and stored in MDM, and business users can view it in one place.