If I look back across all of the data
migration exercises I’ve been involved with, one thing has come up during every
single one of them: at some point someone involved with the project will try to
seize the opportunity to use the migration as the means by which all past data
quality sins will be dealt with and solved.
There was a time when I’d have embraced this idea and tried to
incorporate it into the project plan. On more than one occasion I may
even have been the person advancing the idea.
But not any more. That was then, this is
now. Looking back over some of the projects which struck problems, one
recurring theme seems to have been a push for wide-ranging data quality
improvements as one of the key deliverables, either added as a mid-stream
change of focus or set as a key aim from day one. I’ve reached a conclusion. Looking at it now
it seems stupidly obvious, self-evident if you like, but I suspect it may seem
a tad controversial to more than a few people out there:
Data migration exercises are about migrating data.
This should be the first and only aim. If you don’t get your data from the old
system to the new one then you have failed; it’s as simple as that. All of those
forecast business benefits that rely upon having data migrated will be reduced,
deferred or missed entirely. Don’t even think about using the migration to do
anything else. Data migration is hard enough without making a conscious choice
to increase the complexity, cost, timeframe and risk. Don’t do it to
yourself!
I’d suggest only entertaining the concept
of improving data quality when it can be shown that the new system can’t and
won’t operate successfully with the quality of data currently
available in the incumbent system. Then, and only then, look to incorporate
data quality improvements, but keep them as light as possible. Only improve the
data as much as is required to allow the new system to operate. This may seem
counterintuitive, perhaps even bordering on dereliction of our duty as data
professionals, but step back and look at the bigger picture, the
interconnectivity of the data in (and across) systems. The changes you may make
to improve a particular aspect of the data in one area may have unforeseen
impacts elsewhere. Sure, data profiling can go some way to help understand the
impacts here, but do you really want to be following convoluted lineage trails
just to be sure that no nasty surprise side effects crop up? Worse still, much
improved data quality in one area of the system will soon be forgotten if whole
other areas can’t even get the base data they need to operate due to migration
woes.
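To make the "only as much as is required" idea a little more concrete, here is a minimal sketch in Python of how a migration team might triage records. Everything in it is an assumption made for illustration: the field names (customer_id, postcode, phone), the two rule sets and the triage function are hypothetical and not taken from any real system. The point is simply that only failures the new system cannot tolerate hold a record back, while everything else is logged for a separate, later data quality project.

```python
# Hypothetical rules: checks the new system is ASSUMED to enforce on load.
# A record failing any of these cannot be migrated as-is.
BLOCKING_RULES = {
    "customer_id": lambda v: bool(v),
    "postcode": lambda v: bool(v and str(v).strip()),
}

# Hypothetical "nice to have" checks. Failures here do not stop the load;
# they are only recorded so they can be tackled outside the migration.
ADVISORY_RULES = {
    "phone": lambda v: v is None or str(v).replace(" ", "").isdigit(),
}


def triage(records):
    """Split records into loadable and blocked, and collect a backlog
    of advisory issues without acting on them."""
    loadable, blocked, backlog = [], [], []
    for rec in records:
        failed_blocking = [f for f, ok in BLOCKING_RULES.items() if not ok(rec.get(f))]
        failed_advisory = [f for f, ok in ADVISORY_RULES.items() if not ok(rec.get(f))]
        if failed_blocking:
            # Must be fixed, but only these fields, before migration.
            blocked.append((rec, failed_blocking))
        else:
            loadable.append(rec)
        if failed_advisory:
            # Left untouched during the migration; noted for later.
            backlog.append((rec, failed_advisory))
    return loadable, blocked, backlog


records = [
    {"customer_id": "C1", "postcode": "AB1 2CD", "phone": "not known"},
    {"customer_id": "", "postcode": "XY9", "phone": "01234 567890"},
]
loadable, blocked, backlog = triage(records)
```

In this sketch the first record still migrates despite its messy phone number, and the second is held back only until its missing customer_id is dealt with. Keeping the blocking list short is the whole design choice: every rule added to it is extra migration scope.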
Now, before all of you data quality folks
delete my blog address from your favourites folder, let me reassure you that
I’m not saying that we shouldn’t improve data quality. In fact I think that the
arrival of a new system and the data migration project that comes with it is a
great opportunity to push for funding to improve data quality. We just need to play
the long game and look a little further ahead, designing projects to make
specific data quality improvements before the data migration even begins. Start
selling the benefits of improved data quality well ahead of the migration
(months, if not a year or more) and angle the pitch in such a way as to show
how these projects will reduce the risk of your company becoming the next data
migration horror story headline with huge cost overruns, customer churn and
brand damage.