Adding more value when cleansing SME client data ahead of a bespoke AI integration
- Neil Marchant
- Oct 10
- 3 min read
Many SMEs’ ERP data has grown organically over time, shaped more by day-to-day firefighting than by deliberate design. Product masters, supplier records and customer data often reflect years of incremental change, some locked in spreadsheets, some translated back into the ERP.
The data you’re faced with might well hide a history of quick fixes and inconsistent updates: confusing product hierarchies, unclear units of measure, duplicated items and missing fields.
When a consultant steps in to help, the immediate task can look purely clinical. The real opportunity is to improve the governance, ownership, structure and future usability of the core business data. That creates a foundation for ongoing improvement, accurate reporting, sounder decision-making and cleaner future developments (whether that’s AI, analytics, or a new ERP module).
The following principles can help when cleaning ERP static data, and show how to turn what could be a tactical task into a long-term value driver for your client’s business.
1. Understand the Data’s Functional Context and Future Use
Before changing or cleaning, make sure you understand how each dataset is used today and how it might evolve:
Broaden your thinking and don’t treat master data in isolation: an “item” record, for example, links to other systems, pricing, availability and reporting.
Understand how category, subcategory and license fields drive management reporting, analytics and AI model segmentation. Inconsistent hierarchies or missed fields here can cripple insights later.
Ask: “What decisions or reports depend on this field?”
Think beyond the current sprint. Consider what upcoming developments (e.g. new analytics dashboards, AI enrichment, planning tools, or system integrations) might need this data later.
Value add: Protect downstream logic and ensure the data model supports both current and future needs.
2. Define the Target Data Model & Quality Standards
Even if the client doesn’t have one yet, create a future-state data model or at least a “gold standard” template for key objects:
What does a good item / customer / supplier master look like?
What are the mandatory fields, naming conventions, and validation rules?
Are there hierarchy structures (product groups, categories, subcategories, license types) that need rationalisation?
Value add: Set the foundation for scalability, clean reporting and integration with AI or future ERP functionality.
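As a rough illustration, a “gold standard” template can be written down as a small set of machine-checkable rules. The sketch below is only one possible shape: the field names, the item code pattern, the length limit and the allowed units are all hypothetical and would need to come from the client’s agreed data model.

```python
import re

# Hypothetical "gold standard" for an item master. Every field name and rule
# here is an illustrative assumption, not taken from any specific ERP.
ITEM_RULES = {
    "item_code": {"required": True, "pattern": r"^[A-Z]{3}-\d{5}$"},
    "description": {"required": True, "max_length": 60},
    "unit_of_measure": {"required": True, "allowed": {"EA", "KG", "M"}},
    "category": {"required": True},
    "subcategory": {"required": False},
}


def validate_record(record, rules):
    """Return a list of human-readable rule violations for one record.

    Assumes field values are strings (or None), as in a typical CSV export.
    """
    problems = []
    for field, rule in rules.items():
        value = (record.get(field) or "").strip()
        if not value:
            if rule.get("required"):
                problems.append(f"{field}: missing mandatory value")
            continue  # nothing further to check on an empty field
        if "pattern" in rule and not re.fullmatch(rule["pattern"], value):
            problems.append(f"{field}: '{value}' breaks naming convention")
        if "max_length" in rule and len(value) > rule["max_length"]:
            problems.append(f"{field}: longer than {rule['max_length']} characters")
        if "allowed" in rule and value not in rule["allowed"]:
            problems.append(f"{field}: '{value}' is not an allowed value")
    return problems


sample = {
    "item_code": "abc-1",
    "description": "Widget",
    "unit_of_measure": "Each",
    "category": "",
}
for problem in validate_record(sample, ITEM_RULES):
    print(problem)
```

Keeping the rules in a plain dictionary means business users can review them without reading the validation logic, and the same rules can later be reused as entry checks.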
3. Set Up Data Ownership & Governance
Once data is clean, it won’t stay clean without ownership:
Assign data owners for each key domain (materials, customers, suppliers, etc.).
Define who approves new data creation and who reviews data quality periodically.
Document clear “rules of entry” and “rules of change.”
Establish a light-touch governance cadence; even a monthly review can prevent reversion to chaos.
Value add: Embed accountability and sustainability into the client’s data model, ensuring quality persists beyond the project.
4. Quantify and Communicate Improvements
Track and report measurable progress, because data quality is invisible without metrics:
% completeness
% duplicates removed
% obsolete records deactivated
Number of standardized or corrected fields
Improvements in data usability for reporting (e.g. clearer category hierarchies or cleaner license data)
Value add: Provide tangible proof of progress and ROI, reinforcing the business impact of your work.
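Two of these metrics are simple enough to compute directly from an extract. The following is a minimal sketch, assuming records arrive as dictionaries of string values; the field names (`item_code`, `description`, `uom`) and the choice of `description` as the duplicate key are illustrative assumptions only.

```python
def completeness_pct(records, mandatory_fields):
    """Share of mandatory cells that are populated, as a percentage."""
    total = len(records) * len(mandatory_fields)
    if total == 0:
        return 0.0
    filled = sum(
        1
        for record in records
        for field in mandatory_fields
        if str(record.get(field) or "").strip()
    )
    return round(100 * filled / total, 1)


def duplicate_pct(records, key_fields):
    """Share of records that repeat an earlier record on the chosen key.

    Keys are compared after trimming whitespace and lower-casing.
    """
    if not records:
        return 0.0
    seen = set()
    duplicates = 0
    for record in records:
        key = tuple(str(record.get(f) or "").strip().lower() for f in key_fields)
        if key in seen:
            duplicates += 1
        else:
            seen.add(key)
    return round(100 * duplicates / len(records), 1)


# Hypothetical extract before and after cleansing.
before = [
    {"item_code": "A1", "description": "Bolt M8", "uom": "EA"},
    {"item_code": "A2", "description": "bolt m8 ", "uom": ""},
    {"item_code": "A3", "description": "Nut M8", "uom": "EA"},
    {"item_code": "A4", "description": "", "uom": None},
]
after = [
    {"item_code": "A1", "description": "Bolt M8", "uom": "EA"},
    {"item_code": "A3", "description": "Nut M8", "uom": "EA"},
    {"item_code": "A4", "description": "Washer M8", "uom": "EA"},
]
mandatory = ["item_code", "description", "uom"]

for label, data in (("before", before), ("after", after)):
    print(
        label,
        f"completeness={completeness_pct(data, mandatory)}%",
        f"duplicates={duplicate_pct(data, ['description'])}%",
    )
```

Reporting the same two numbers before and after each cleansing pass gives the client a trend line rather than a one-off claim.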
5. Preserve Data Lineage & Auditability
When you clean or enrich data:
Keep the original data as a backup (in a separate table or file).
Track what was changed, why, and by whom.
Create a “before and after” report showing key data improvements.
Value add: Build trust and allow safe rollback if needed.
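One lightweight way to keep this trail is to route every correction through a single function that writes an audit entry. The sketch below assumes records are held in a dictionary keyed by item code; the function names, log fields and the `consultant` user label are hypothetical.

```python
import copy
from datetime import datetime, timezone


def apply_change(records, change_log, record_id, field, new_value, reason, changed_by):
    """Update one field and append an audit entry describing the change."""
    record = records[record_id]
    old_value = record.get(field)
    if old_value == new_value:
        return  # nothing to change, so nothing to log
    record[field] = new_value
    change_log.append({
        "record_id": record_id,
        "field": field,
        "old_value": old_value,
        "new_value": new_value,
        "reason": reason,
        "changed_by": changed_by,
        "changed_at": datetime.now(timezone.utc).isoformat(),
    })


def rollback(records, change_log):
    """Undo logged changes in reverse order, restoring each logged old value.

    Limitation of this sketch: a field that did not exist before the change
    is restored as None rather than removed.
    """
    while change_log:
        entry = change_log.pop()
        records[entry["record_id"]][entry["field"]] = entry["old_value"]


# Hypothetical item master keyed by item code.
items = {"A2": {"description": "bolt m8 ", "uom": "Each"}}
original = copy.deepcopy(items)  # untouched backup of the source data
log = []

apply_change(items, log, "A2", "uom", "EA", "Standardize unit of measure", "consultant")
apply_change(items, log, "A2", "description", "Bolt M8", "Trim and fix casing", "consultant")

# A simple "before and after" view built from the log.
for entry in log:
    print(f"{entry['record_id']}.{entry['field']}: "
          f"{entry['old_value']!r} -> {entry['new_value']!r} ({entry['reason']})")
```

The log doubles as the raw material for the “before and after” report, and `rollback(items, log)` returns the working copy to its starting state if a batch of changes is rejected.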
6. Identify Structural Issues vs. Cosmetic Ones
Not all “bad data” is just typos. Sometimes it reveals process or configuration flaws, e.g.:
Duplicate items because of poor governance or unclear creation rules.
Missing attributes because ERP configuration doesn’t make them mandatory.
Incorrect planning or reporting parameters because users bypassed defaults.
Value add: Diagnose root causes, not just symptoms, and recommend small governance or configuration tweaks that prevent recurrence.
7. Prepare Data for AI / Analytics Use
If the cleansed data will be shared with an AI tool:
Standardize and normalize categorical values (no free-text chaos like “EA”, “Each”, “each”).
Remove duplicates, obsolete items, and inactive records.
Ensure consistent identifiers and units of measure.
Value add: Make the dataset AI-ready, supporting accurate predictions and insights.
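The three steps above can be sketched as a single pass over an extract. This is a minimal example under stated assumptions: the alias table, the `status` field with an `active` value, and the field names are all hypothetical, and unrecognized units are set aside for human review rather than guessed.

```python
# Hypothetical mapping from observed free-text variants to one canonical code.
UOM_ALIASES = {
    "ea": "EA", "each": "EA",
    "kg": "KG", "kilo": "KG", "kilogram": "KG",
}


def normalize_uom(raw):
    """Return (canonical_code, recognized) for a raw unit-of-measure value."""
    key = (raw or "").strip().lower().rstrip(".")
    if key in UOM_ALIASES:
        return UOM_ALIASES[key], True
    return raw, False


def prepare_for_ai(records):
    """Split records into a clean list and a list needing manual review."""
    clean, review = [], []
    seen_codes = set()
    for record in records:
        if record.get("status") != "active":
            continue  # drop obsolete or inactive records
        code = (record.get("item_code") or "").strip().upper()
        if not code:
            review.append(record)  # no usable identifier
            continue
        if code in seen_codes:
            continue  # duplicate identifier; the first occurrence is kept
        uom, recognized = normalize_uom(record.get("uom"))
        if not recognized:
            review.append(record)  # unknown unit; do not guess
            continue
        seen_codes.add(code)
        clean.append({**record, "item_code": code, "uom": uom})
    return clean, review


raw = [
    {"item_code": "a1", "uom": "Each", "status": "active"},
    {"item_code": "A1 ", "uom": "EA", "status": "active"},
    {"item_code": "A2", "uom": "each", "status": "inactive"},
    {"item_code": "A3", "uom": "box of 10", "status": "active"},
    {"item_code": "A4", "uom": " kg.", "status": "active"},
]
clean, review = prepare_for_ai(raw)
print([(r["item_code"], r["uom"]) for r in clean])
print([r["item_code"] for r in review])
```

Sending ambiguous values to a review list keeps the cleansing rules honest: each new variant either earns an entry in the alias table or prompts a conversation with the data owner.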
8. Anticipate Downstream Integration Needs
If the client will migrate to a new ERP or integrate with BI, CRM, or planning systems:
Structure data to be migration-ready (consistent keys, flat hierarchies).
Ensure field naming conventions and code lengths align with the target system’s constraints.
Avoid “quick fixes” that only work in the current ERP environment.
Value add: Future-proof the data and reduce rework in future system rollouts.
9. Document and Deliver Reusable Assets
Leave behind:
A data dictionary
Cleansing rules and scripts
Business rules documentation
A simple “Data Quality Framework” playbook
Value add: Leave lasting intellectual property that strengthens the client’s internal capability and highlights your strategic contribution.




