Friday, April 13, 2012

Your Data Migration Project is Not a Data Quality Exercise


If I look back across all of the data migration exercises I’ve been involved with, one aspect has come up during every single one of them – at some point someone involved with the project will try to seize the opportunity to use the migration as the means by which all past data quality sins will be dealt with and solved. There was a time when I’d have embraced this idea and tried to incorporate it into the project plan; on more than one occasion I may have even been the person advancing the idea.

But not any more. That was then, this is now. Looking back over some of the projects which struck problems, one recurring theme seems to have been a push for wide-ranging data quality improvements as one of the key deliverables, either as a key aim from day one or as a mid-stream addition to, or change of, focus. I’ve reached a conclusion – looking at it now it seems stupidly obvious, self-evident if you like, but I suspect it may seem a tad controversial to more than a few people out there:

Data migration exercises are about migrating data. This should be the first and only aim. If you don’t get your data from the old system to the new one then you have failed – it’s as simple as that. All of those forecast business benefits that rely upon having data migrated will be reduced, deferred or missed entirely. Don’t even think about using the migration to do anything else; data migration is hard enough without making a conscious choice to increase the complexity, cost, timeframe and risk. Don’t do it to yourself!

I’d suggest only entertaining the concept of improving data quality when it can be shown that the new system can’t and won’t operate successfully with the quality of data currently available in the incumbent system. Then, and only then, look to incorporate data quality improvements, but keep them as light as possible. Only improve the data as much as is required to allow the new system to operate. This may seem counterintuitive, perhaps even bordering on dereliction of our duty as data professionals, but step back and look at the bigger picture: the interconnectivity of the data in (and across) systems. The changes you make to improve a particular aspect of the data in one area may have unforeseen impacts elsewhere. Sure, data profiling can go some way towards helping you understand the impacts here, but do you really want to be following convoluted lineage trails just to be sure that no nasty side effects crop up? Worse still, much improved data quality in one area of the system will soon be forgotten if whole other areas can’t even get the base data they need to operate due to migration woes.

Now, before all of you data quality folks delete my blog address from your favourites folder, let me reassure you that I’m not saying that we shouldn’t improve data quality. In fact, I think that the arrival of a new system, and the data migration project that comes with it, is a great opportunity to push for funding to improve data quality. We just need to play the long game and look a little further ahead, designing projects to make specific data quality improvements before the data migration even begins. Start selling the benefits of improved data quality well ahead of the migration (months, if not a year or more) and angle the pitch to show how these projects will reduce the risk of your company becoming the next data migration horror story headline, with huge cost overruns, customer churn and brand damage.