Wednesday, June 11, 2025

Challenges I faced during PostgreSQL connection creation in IICS






Here are some common challenges faced during PostgreSQL connection creation in IICS (Informatica Intelligent Cloud Services), and tips to resolve them, especially from an MDM or data integration perspective:

1. Data Source Not Found.


  •          [IM002] [unixODBC][Driver Manager]Data source name not found, and no default driver specified (0)


This error can be due to either an incorrect data source name or a DSN that is not configured on the Secure Agent machine — especially for ODBC-based connections.

Fix: 


Verify that the System DSN has this data source configured:

1. Log in to the Secure Agent machine.
2. Select Start -> ODBC Data Sources -> System DSN.
3. Check whether this DSN is configured and test the connection.

Select the data source and click Configure, then run the test. If the test fails, recheck the connection details.
If the System DSN is not configured, create it.
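On a Linux Secure Agent there is no ODBC Administrator UI; the System DSN lives in the odbc.ini file read by unixODBC instead. A minimal sketch of a PostgreSQL entry — the driver path and connection values here are assumptions to adapt to your environment:

```ini
[PostgreSQL_DSN]
Description = PostgreSQL data source for the IICS Secure Agent
; Driver path varies by distribution and psqlODBC install location
Driver      = /usr/lib/x86_64-linux-gnu/odbc/psqlodbcw.so
Servername  = localhost
Port        = 5432
Database    = mdm_db
UserName    = testuser
```

You can list the configured DSNs with `odbcinst -q -s` and test one with `isql PostgreSQL_DSN testuser <password>`.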

2. Architecture Mismatch Error


The connection test failed because of the following error: [IM014] [Microsoft][ODBC Driver Manager] The specified DSN contains an architecture mismatch between the Driver and Application (0)


This error occurs when a 32-bit ODBC driver is used on a 64-bit platform.

Fix:

Select Start -> ODBC Data Sources (64-bit) and create a System DSN.
Use this new DSN in the connection's data source.

3. User does not have login access


 The connection test failed because of the following error: [08001] connection to server at "localhost" (::1), port 5432 failed: FATAL: role "testuser" is not permitted to log in (101)


This error occurs when the user configured in the IICS connection does not have login privileges.
Go to pgAdmin -> Login/Group Roles -> select the user -> Properties -> Privileges.






If the "Can login?" option is disabled, the user will not be able to log in.

Fix:

Enable the "Can login?" option.
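Equivalently, if you prefer SQL over the pgAdmin UI, the same fix can be applied from psql (using the role name from the error message above):

```sql
-- Allow the role used by the IICS connection to log in
ALTER ROLE testuser WITH LOGIN;
```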



Monday, May 5, 2025

Working with Custom Validation Rules in MDM

When working with Informatica MDM, validation rules are essential for maintaining data quality. While predefined rule types cover most use cases, custom validation rules give us the flexibility to address complex, domain-specific scenarios. 


🧩 What Are Custom Validation Rules?

Custom validation rules allow you to write SQL-based logic for validating data that goes beyond standard rule capabilities. They are extremely powerful, but there’s a catch:

Informatica MDM does not validate your SQL at design time.

That means if your query has errors, you might only discover them at runtime—during a Load or Revalidate job.


⚠️ The Surprising Behavior I Encountered

Recently, I ran into three key issues while working with custom validation rules:

1. Rule Not Triggering in Load, But Working in Revalidate
Turns out the custom rule depended on a child table, but during the Load job the child data hadn't been loaded yet, so the rule did not fire. Revalidate worked fine because all the data was already present.

2. Rule Worked in Load, Failed in Revalidate 
Here, I used aliases in the rule. Specifically, I referenced s.*, which points to staging columns. This worked during Load but failed during Revalidate because staging columns aren't available then. You have to write the query in such a way that it works in both the Load and Revalidate jobs.

3. Performance Bottleneck During Revalidate
The Revalidate job was taking over 24 hours. After optimizing the SQL logic (replacing IN with EXISTS), the job completed in under an hour.

Pro tip: Custom rules can be expensive—always evaluate performance on large datasets.
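The IN-to-EXISTS rewrite can be illustrated with a small, self-contained example. The table and column names below are hypothetical, and SQLite is used purely to make the snippet runnable — in MDM the rule would run against your hub database:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.execute("CREATE TABLE customer (rowid_object TEXT, name TEXT)")
cur.execute("CREATE TABLE address (rowid_object TEXT, city TEXT)")
cur.executemany("INSERT INTO customer VALUES (?, ?)",
                [("1", "Monika"), ("2", "Monica"), ("3", "Ravi")])
cur.executemany("INSERT INTO address VALUES (?, ?)",
                [("1", "Pune"), ("3", "Mumbai")])

# Original rule shape: uncorrelated IN subquery
in_query = """
SELECT rowid_object FROM customer
WHERE rowid_object IN (SELECT rowid_object FROM address)
"""

# Rewritten rule: correlated EXISTS, which the optimizer can often
# short-circuit per row instead of materializing the whole subquery
exists_query = """
SELECT c.rowid_object FROM customer c
WHERE EXISTS (SELECT 1 FROM address a
              WHERE a.rowid_object = c.rowid_object)
"""

in_result = sorted(cur.execute(in_query).fetchall())
exists_result = sorted(cur.execute(exists_query).fetchall())
assert in_result == exists_result  # same rows, different execution plan
print(in_result)  # [('1',), ('3',)]
```

Whether EXISTS is actually faster depends on the database, indexes, and data volumes, so always measure on a realistic dataset.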


✅ Best Practices for Custom Validation Rules

Here are some lessons I now live by:

1. 🔍 Test SQL Outside MDM First
Use a SQL client with real or sample data before pushing changes to Hub Console.

2. 🛡️ Write Defensive SQL
Handle edge cases. Use IS NULL, safe joins, and anticipate missing data during the Load phase.

3. 🧾 Be Descriptive
Name rules meaningfully and include detailed descriptions. You (and your teammates) will thank you later.

4. 🔄 Test Both Load & Revalidate
Always validate both paths. Don’t assume what works in one will work in the other.

5. 🔗 Know the Dependencies
Understand if your rule depends on data loaded by other mappings or child tables.

6. 📋 Check Job Logs Thoroughly
Often, rules don’t error—they just silently skip. Review cleanse/validation logs for subtle hints.

7. 📁 Version Control Your SQL
Keep SQL scripts versioned in Git or another system. This helps with collaboration, rollbacks, and audits.
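Practices 2 and 5 can be combined into one pattern: write the rule so it degrades gracefully when child data is absent during Load. A minimal sketch — the table names are hypothetical and SQLite is used only to make the snippet runnable:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.execute("CREATE TABLE party (rowid_object TEXT, email TEXT)")
cur.execute("CREATE TABLE party_phone (rowid_object TEXT, phone TEXT)")
# Simulate the Load phase: parent rows loaded, child table still empty
cur.executemany("INSERT INTO party VALUES (?, ?)",
                [("1", "a@x.com"), ("2", None)])

# Defensive rule: LEFT JOIN plus IS NULL checks, so parents with no
# child rows are still evaluated instead of silently dropped
rule = """
SELECT p.rowid_object
FROM party p
LEFT JOIN party_phone ph ON ph.rowid_object = p.rowid_object
WHERE p.email IS NULL AND ph.phone IS NULL
"""
invalid = cur.execute(rule).fetchall()
print(invalid)  # party 2 is flagged even though child data isn't loaded yet
```

An inner join here would have returned no rows at all during Load, which is exactly the "rule not triggering" surprise described above.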


🚀 Final Thoughts

Custom validation rules are indispensable in complex MDM implementations. By following these best practices, you can avoid common pitfalls and ensure data validation remains accurate and consistent across all jobs.

Have you experienced any challenging behavior with validation rules in MDM? Drop a comment—I’d love to hear your stories!

Tuesday, April 22, 2025

MDM isn't just about consolidating data—it's about governing it.

 In today’s data-driven world, businesses thrive—or suffer—based on how well they manage their information. A single data breach can have lasting consequences, not just financially, but in terms of trust and customer loyalty.

So, what exactly is Data Governance in the context of MDM, and why does it matter?



What is Data Governance in MDM?

Data Governance refers to the framework of policies, roles, standards, and metrics that ensure master data is accurate, consistent, secure, and used responsibly across the organization.

In an MDM context, this means governing:

  • Who owns the data

  • Who can change it

  • How it's validated

  • How exceptions are handled

  • How lineage is tracked




🧱 Core Elements of Data Governance in MDM



1. Data Ownership & Stewardship

  • Assigning data owners and stewards to domains like Product, Customer, Supplier.

  • Owners are accountable; stewards are operational.

  • You can use role-based security to enforce attribute-level access.

2. Business Rules & Validation

  • Enforcing attribute-level rules: mandatory fields, value formats, referential integrity.

  • Using Validation Rules in MDM Hub to catch issues at staging.

  • Validation logic needs to be revisited and re-verified to make sure it still works with updated data.

3. Workflow & Approval Processes

  • Automating issue resolution workflows via e360 portal/provisioning tool

  • Stewards can approve, reject, or escalate changes.

4. Audit & Lineage

  • Logging every change: who did what, when, and why. Most of this information is saved in MDM system columns (source, source LUD, created by, updated by).

  • Essential for compliance (GDPR, HIPAA, SOX).

5. Data Quality Metrics

  • Dashboards to track match confidence, duplicate rates, null values, etc.

  • Helps stewards prioritize fixes and spot data decay.

⚠️ What Happens Without Governance?

  • Golden records become unreliable 

  • Users lose trust in central MDM

  • Compliance violations go unnoticed

  • Regional teams bypass MDM altogether

Conclusion :

MDM without governance is just a glorified database.

Whether you're a developer, architect, or steward, think of governance as the invisible guardrails that protect your data—and your business—from chaos.














Thursday, July 25, 2024

Troubleshooting MDM Issues


At any step of MDM project development we face multiple issues. Troubleshooting those problems and fixing them is a major task in MDM development, and it is also a time-consuming process. While it's not possible to have the same solution for every problem, a systematic approach can definitely help you solve issues better and faster. These are a few things I follow while trying to fix an issue in MDM:

Steps to troubleshoot :

1. View the logs to get the exact error:

Sometimes you may find that the logs do not contain the error you are facing. This could be because the logs have rolled over, or because you are checking the wrong logs. You can either increase the log size or write the log file content into a new file to track all activity. You can also change the log mode to debug to get more details and a stack trace for the specific error.
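When the log file is large, pulling out just the root-cause lines speeds this step up. A small sketch — the log text here is made up for illustration:

```python
# Extract "Caused by" lines from a Java-style stack trace; the last
# "Caused by" is usually the real root cause of the failure
log = """\
2025-06-11 10:01:02 ERROR LoadJob failed
java.lang.RuntimeException: wrapper error
Caused by: java.sql.SQLException: connection refused
Caused by: java.net.ConnectException: Connection refused (Connection refused)
"""

causes = [line.strip() for line in log.splitlines()
          if line.startswith("Caused by")]
root_cause = causes[-1] if causes else None
print(root_cause)
```

The same idea works from the command line with `grep "Caused by" <logfile>`.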

2. Search related KBs:

You can search for the exact error code in the Informatica KB portal. You will usually find multiple KBs for a specific error; open them and check which one is most relevant to your issue. Sometimes searching for the "Caused by" statement returns a more relevant KB.

3. Search Java and DB error codes:

If you don't find any KB for the given error code, analyze the logs for more detail: the error will usually contain a Java or database error code. You can search this error code online to get more details and possible solutions.

4. Isolate the problem:

A few issues happen only under a specific condition or in a specific environment. Understanding which exact configuration change is causing the error will help you resolve it.

5. Informatica support:

If you are still not able to resolve the issue, you can get help from the Informatica support team by raising a case.



Common Informatica MDM Issues :

1. Installation and Upgrade:

Most errors are due to an unsupported component version, missing environment variables, or permission issues.

2. Performance Issues:

You can review the logs to understand which step is taking more time. You may find a specific query or procedure that is slow; based on that, you can change parameters to improve performance.

3. Job Failures:

A failure could be due to a connectivity issue or something specific to the data. Understanding the data flow in the job helps in troubleshooting.

4. Configuration Errors:

If you have configured something new and it does not work as expected, or you get an error during execution, recheck whether you followed all the steps given in the documentation. Any missing step can cause the issue, so going back and verifying each step helps.

My secret: Most of the time, a small tea break has helped me find a better solution. It breaks a thought process that was stuck in a loop and gives me a new direction or approach. I think you must try it 😃


 



Thursday, June 27, 2024

Security Basics: Key Elements You Need to Know

 

MDM Security

Master data management is a crucial system for managing golden records effectively. Therefore, making sure this information is used by the right person, in the right way, is equally important. Who can use the data, what data should be accessible to a user, and how to use it effectively are a few important questions addressed by the different security modules of MDM.

Tool Access: Tool access is managed in the Hub Console and controls the available processes and workbenches. A user needs Administrator access to work with Tool Access.

Dynamic Data Masking: This feature masks sensitive information when a user opens a record. Dynamic Data Masking is a separate Informatica product, so you need to make sure you are using a supported version of it.

Security Access Manager: This tool is accessible from the MDM Hub Console. It protects MDM resources, such as base objects and cleanse functions, from unauthorized access. You can assign the following privileges:

1. Read 
2. Create
3. Update
4. Delete
5. Merge
6. Execute

Authentication: Authentication is the process of identifying the user; a user name and password are used to authenticate. Types of authentication:

1. Internal: The user is created in the MDM Hub and the password is stored in MDM.
2. External: The user is authenticated by an external system such as LDAP or Microsoft Active Directory.

Authorization: Authorization is the process of determining whether a user has sufficient privileges to perform the operation they are attempting.

Secure Resources and Privileges: Resources can be secure or private. MDM has the following resources:

Base objects
Mappings
Packages
Cleanse functions
Match rule sets
Metadata
Profiles
Users table 

We can assign privileges to these resources. You can also create groups of resources and assign them to roles.

Roles: In MDM implementation, a variety of roles play crucial parts, including project manager, architect, developer, and administrator. To effectively manage MDM data, key roles such as business analysts, data stewards, and business users are essential. Furthermore, users have the flexibility to create new roles tailored to specific business requirements.

Thursday, June 13, 2024

Data Matching




Data matching is the process of identifying similar or identical entities in order to merge them and create a single view of those entities. Data is collected from multiple sources, and some entities may be identical. Finding those entities across different sources by comparing their attributes is the data matching process.

To understand this, let's take the example of a bank database containing customer data. This data is collected from various sources, e.g. the loan department, the savings account department, and the insurance department. There is a chance that a single customer is part of all three departments. Based on attributes like address, phone, name, and email, we can identify identical customers. This process is called data matching.

Data matching can be done in two ways:

  1. Fuzzy matching: The fuzzy match technique is used to find relevant data sets. It identifies two words as identical based on phonetics or similarity in pronunciation. Example: Monika and Monica can be considered a match.
  2. Exact matching: As the name suggests, two records are considered an exact match only if they are copies of each other. Example: Monika and Monica will not be considered a match; only Monika with another identical value, Monika, will match.
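The Monika/Monica example can be sketched in a few lines. MDM's fuzzy match uses its own population-based algorithms, so a simple similarity ratio is only an analogy, and the 0.8 threshold below is an arbitrary assumption:

```python
from difflib import SequenceMatcher

def exact_match(a: str, b: str) -> bool:
    # Exact matching: the values must be identical (ignoring case here)
    return a.lower() == b.lower()

def fuzzy_match(a: str, b: str, threshold: float = 0.8) -> bool:
    # Fuzzy matching: accept close spellings like Monika / Monica
    return SequenceMatcher(None, a.lower(), b.lower()).ratio() >= threshold

print(exact_match("Monika", "Monica"))  # False
print(fuzzy_match("Monika", "Monica"))  # True
```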

Benefits of Data Matching :

  • Reduced unnecessary costs: Consider a bank sending mail by speed post every month to a customer. The customer changed their address in one system, but the speed post system was not updated. With matching, the bank can identify the customer and send the speed post to the new address. It also helps reduce data storage.
  • Identifying duplicate records: Records are collected from various systems; matching helps identify the duplicates.
  • Verifying the accuracy of data: If we find two duplicate records, we can compare them to produce an accurate consolidated record.
  • Consolidating data: We can create a consolidated record, the best version of the record, by collecting records from multiple sources.

Challenges with data matching :

  • Data entry issues: While entering data into a system, someone may use a different spelling for the same name, or use short names. This results in incorrect or incomplete data.
  • Mismatches in data formats: The best example of a data format mismatch is phone numbers. (91) 12345678 and 9112345678 are the same number, but due to the different formats the match will not be detected. We can use data standardization techniques to avoid this.
  • Overmatching and undermatching: If the matching rules are too loose, we get too many matched records, which is overmatching. If the matching rules are too tight, we find very few duplicates, which is undermatching.
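The phone-format mismatch above is exactly what a standardization step fixes before matching. A minimal sketch — the digits-only normalization rule here is a simplistic assumption, not MDM's cleanse functions:

```python
import re

def standardize_phone(raw: str) -> str:
    # Keep digits only, so "(91) 12345678" and "9112345678" compare equal
    return re.sub(r"\D", "", raw)

a = standardize_phone("(91) 12345678")
b = standardize_phone("9112345678")
print(a == b)  # True
```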

Use case :

  • Retail Sector: Retailers can use MDM to consolidate customer data and, based on purchase behavior, provide offers to specific customers.
  • Financial Services: A bank can use this technique to identify potential fraud. Before approving any loan, the bank can check whether the applicant is an existing customer and whether they have already taken a loan with the bank.
  • Healthcare Sector: Healthcare providers can use this technique to link a patient with their medical history, so that before treating the patient they are aware of the existing health situation.
  • Marketing and Sales: Marketing and sales teams can use this information to plan new marketing schemes, comparing customer purchase history and designing offers for the customer.


Monday, June 3, 2024

What is Hotspot in MDM Match

Hotspot
If a record has too many matches, it is considered a hotspot. Depending on the match configuration, too many matches can sometimes be generated, and those matches can be irrelevant.

 

Here are a few reasons that can cause a hotspot:


  • Noise data: Noise words are common words that carry no identifying value on their own. Ideally this data is removed by the match process, but if a word is commonly used in the records and not recognized by the match process, it can cause a hotspot. Example: if a source sends customer names with the word "INACTIVE" appended to indicate inactive customers, a large number of records in the MDM system will contain "INACTIVE" in the customer name, potentially leading to irrelevant matches.
  • Search Level: The Exhaustive and Extreme search levels can cause overmatching.

Impact of Hotspot:

  1.  It causes delays in the match process
  2.  Sometimes the match job fails with a timeout
  3.  Irrelevant records get matched

How to handle a hotspot:

These are a few solutions you can try:
  • DMAT: The dynamic match analysis threshold is used to limit the number of comparisons in the match process.
  • Match Keys Distribution: This tool in the match configuration helps identify hotspots.
  • Hold for manual review: During the match process you can keep these records on hold with consolidation_ind = 9; after match and merge you can review them manually.
  • EXCLUDE_FROM_MATCH: You can create a column in the BO named "EXCLUDE_FROM_MATCH" and set its value to 1 for the records that cause overmatching.
  • Cleansing and standardization: Proper cleansing and standardization help reduce noise data.
  • Change in match configuration: Adding at least one exact column to the match rules and selecting the correct search level can help reduce overmatching.
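The "INACTIVE" scenario above is typically handled in cleansing before match keys are built. A minimal sketch of stripping configured noise words — the word list is an assumption for illustration, not a standard MDM list:

```python
# Hypothetical list of noise tokens that should never drive match keys
NOISE_WORDS = {"INACTIVE", "TEST", "UNKNOWN"}

def remove_noise(name: str) -> str:
    # Drop configured noise tokens from the name before matching
    tokens = [t for t in name.split() if t.upper() not in NOISE_WORDS]
    return " ".join(tokens)

print(remove_noise("JOHN SMITH INACTIVE"))  # JOHN SMITH
```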