Friday, May 17, 2013

Data Migration: Low Hanging Fruit Can Be Bad For Your Health!



If you’ve been through a few data migration exercises then you’ll probably recognise the scene. You’re in the midst of a busy data migration, the pressure is on, and there’s too much to do and too little time to do it in. You’ve been back and forth with the business users trying to resolve data quality issues that are preventing data from loading successfully into the new system. One, two, three or more trial data loads have come and gone, and still there are pieces of data which stubbornly refuse to play ball and load into the new system.

With deadlines looming you start looking for ways to solve the problem, to migrate the data, to just get it in there. Or perhaps, even worse, those above you begin to apply the pressure for you to do the same. The questions and demands start to come: “Look for the low hanging fruit. What can you do to just make it work? Just get the data loaded!”

One common idea is that a whole parcel of load errors associated with free text fields can be easily and quickly solved by simply truncating the data to match the maximum allowable length of the fields in the target system. Load the data in, produce an exception report for the text fields that were too long and were truncated and, hey presto, problem solved! You’ve got a bundle of errors out of the way and proved that the data can load. There’s now confidence that, come the next cutover trial or the real cutover and go-live, the migration in this area will be fine. Now that it’s been done successfully once, it can be repeated successfully and can only get better. After all, the business users will get your exception report and immediately and diligently get to work on cleansing those fields which needed attention.

They will, won’t they? It’s not like they’ve got anything else competing for their time. Their line manager doesn’t need them to do anything else. They’re ready and waiting to cleanse data for you as their top priority.

OK, wake up and head back over to the real world! Once (or if) they finish the day-to-day tasks of their own job, not to mention the tasks of those colleagues who have been seconded to that big project that is underway, they may have time to add the data cleansing tasks to their to-do lists, but I bet those tasks don’t get added to the top of the list. If that data is going to get cleansed then you need to present a reason to prioritise its cleansing. A hard failure is that reason, and a much more compelling and immediate one than an exception report for data which (after all) has already loaded. If the data doesn’t load then it needs to be cleansed if it is to be available in the new system.

Give the business all the help you can. Point out exactly and unambiguously what the problem is, show them exactly which field(s) in which record(s) need to be cleansed and what good looks like (i.e. what rules and conditions must be met before the data can migrate), and use data profiling tools to give them some indication, ahead of the next trial load, that the cleansing they are doing is having the desired effect. But don’t own their problem, nor try to solve it for them in isolation. Most times, the business folks will know the data much better than the data migration folks. As soon as the data folks start making assumptions about what is safe to do with the data in the name of getting it to load, be it truncating text fields or some other technique, the organisation takes on a long-tail risk and liability: the meaning of the data may have been lost or changed, which may manifest as negative impacts on process or operations at some point post go-live. That is bad for the organisation and bad for you, as it is often the data migration folks who will be perceived as having been the cause of the problem.
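To make the "which field(s) in which record(s)" idea concrete, here is a minimal Python sketch of a pre-load length check. It is only an illustration under stated assumptions: the field names, the limits in MAX_LENGTHS, the sample records and the helper names are all hypothetical, and a real migration would take its rules from the target system's specification. The point is that the check reports the problem and holds the record back rather than silently altering the data.

```python
from dataclasses import dataclass

# Hypothetical target-system limits: field name -> maximum allowed length.
MAX_LENGTHS = {"description": 40, "notes": 20}


@dataclass
class Violation:
    record_id: str
    field: str
    actual_length: int
    max_length: int


def find_length_violations(records, max_lengths):
    """Return one Violation per over-length field. The data is never altered."""
    violations = []
    for record in records:
        for field, limit in max_lengths.items():
            value = record.get(field) or ""
            if len(value) > limit:
                violations.append(
                    Violation(record["id"], field, len(value), limit)
                )
    return violations


def loadable(records, violations):
    """Records with no violations; the rest are held back for cleansing."""
    bad_ids = {v.record_id for v in violations}
    return [r for r in records if r["id"] not in bad_ids]


# Hypothetical sample data for illustration only.
records = [
    {"id": "A1", "description": "Pump, centrifugal", "notes": "OK"},
    {"id": "A2", "description": "Valve",
     "notes": "Do not operate above 60C without sign-off"},
]

violations = find_length_violations(records, MAX_LENGTHS)
for v in violations:
    print(f"Record {v.record_id}: '{v.field}' is {v.actual_length} "
          f"characters, limit is {v.max_length}")
```

In this made-up example, truncating the notes on record A2 to 20 characters would leave "Do not operate above", which is exactly the kind of lost meaning that only someone who knows the data would spot. Holding the record back and handing the business a precise list gives them both the reason and the means to fix it.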


But, in the end, pragmatism must rule the day. Your data migration must run. There will be those (hopefully few) areas where you’ll just have to make some calls for the sake of allowing the data to get into the system at final cutover. If you don’t, you’ll quickly find your name in the same sentence as the words “…we didn’t go live because of…” So go ahead, pick that low hanging fruit, but wait until you absolutely must. It still may be poisonous, but it is less immediately deadly than the alternative!

1 comment:

  1. Data quality exists only when business users have a consistently high level of confidence in the accuracy. Adaptive products help organizations maintain high Data Quality and establish a “standard” root-cause identification process.
