Friday, May 17, 2013

Data Migration: Low Hanging Fruit Can Be Bad For Your Health!



If you’ve been through a few data migration exercises then you’ll probably recognise the scene. You’re in the midst of a busy data migration: the pressure is on, there’s too much to do and too little time to do it in. You’ve been back and forth with the business users trying to resolve data quality issues that are preventing data from successfully loading in the new system. One, two, three or more trial data loads have come and gone, and still there are pieces of data which stubbornly refuse to play ball and load into the new system.

With deadlines looming you start looking for ways to solve the problem, to migrate the data, to just get it in there. Or perhaps, even worse, those above you begin to apply pressure for you to do the same. The questions and demands start to come: “Look for the low-hanging fruit. What can you do to just make it work? Just get the data loaded!”

One common idea is that a whole parcel of load errors associated with free text fields can be easily and quickly solved by simply truncating the data to match the maximum allowable length of the fields in the target system. Load the data in, produce an exception report for the text fields that were too long and were truncated and, hey presto, problem solved! You’ve got a bundle of errors out of the way and proved that the data can load. There’s now confidence that, come the next cutover trial or the real cutover and go-live, the migration in this area will be fine. Now that it’s been done successfully once, it can be repeated successfully and can only get better. After all, the business users will get your exception report and immediately and diligently get to work on cleansing those fields which need attention.
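To make the idea concrete, here is a minimal sketch of what that truncate-and-report step might look like. This is an illustration rather than anything from a real project; the field names, maximum lengths and record structure are all invented:

    # Truncate over-length text fields to fit the target system, logging an
    # exception for each truncation. All field names and limits are invented.
    MAX_LENGTHS = {"description": 255, "comments": 1000}

    def truncate_and_report(records):
        """Return (records ready to load, exception report rows)."""
        exceptions = []
        for rec in records:
            for fld, max_len in MAX_LENGTHS.items():
                value = rec.get(fld) or ""
                if len(value) > max_len:
                    exceptions.append({"record_id": rec["id"], "field": fld,
                                       "original_length": len(value),
                                       "truncated_to": max_len})
                    rec[fld] = value[:max_len]  # the tail of the value is silently lost
        return records, exceptions

Every truncated value here has quietly lost information, and the only trace of that loss is an exception report for the business users to work through.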

They will, won’t they? It’s not like they’ve got anything else competing for their time. Their line manager doesn’t need them for anything else. They’re ready and waiting to cleanse data for you as their top priority.

OK, wake up and head back over to the real world! Once (or if) they finish the day-to-day tasks of their own job, not to mention covering for those colleagues who have been seconded to that big project that is underway, they may have time to add your data cleansing tasks to their to-do lists, but I bet those tasks don’t get added to the top of the list. If that data is going to get cleansed then you need to present a reason to prioritise its cleansing. A hard failure is that reason, and a much more compelling and immediate one than an exception report for data which (after all) has already loaded. If the data doesn’t load then it must be cleansed if it is to be available in the new system.

Give the business all the help you can. Point out exactly and unambiguously what the problem is, show them exactly which field(s) in which record(s) need to be cleansed and what good looks like (i.e. what rules and conditions must be met before the data can migrate), and use data profiling tools to give them some indication, ahead of the next trial load, that the cleansing they are doing is having the desired effect.

But don’t own their problem, and don’t try to solve it for them in isolation. Most times, the business folks will know the data much better than the data migration folks. As soon as the data folks start making assumptions about what is safe to do with the data in the name of getting it to load, be it truncating text fields or some other technique, the organisation takes on a long-tail risk: the meaning of the data may have been lost or changed, and that may manifest as negative impacts on processes or operations at some point after go-live. That’s bad for the organisation, and bad for you, because oftentimes it is the data migration folks who will be perceived as having caused the problem.
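To illustrate the kind of help I mean (showing exactly which record and field break which rule, with the rule spelled out in plain English), here is a minimal sketch of a rule-driven check that could run ahead of each trial load. The rules, field names and record structure are invented examples, not anything prescriptive:

    import re

    # Invented example rules: each names a field, the condition that "good"
    # data must meet, and a plain-English description for the report.
    RULES = [
        ("cost_centre", lambda v: bool(re.fullmatch(r"CC-\d{4}", v or "")),
         "must match the pattern CC-9999"),
        ("description", lambda v: v is not None and len(v) <= 255,
         "must be present and no longer than 255 characters"),
    ]

    def cleansing_report(records):
        """Return (record id, field, rule) for every value failing a rule."""
        return [(rec["id"], fld, rule_text)
                for rec in records
                for fld, check, rule_text in RULES
                if not check(rec.get(fld))]

Run against each fresh extract, the failure count per rule also doubles as a simple signal to the business that their cleansing effort is actually moving the numbers in the right direction.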


But, in the end, pragmatism must rule the day. Your data migration must run. There will be those (hopefully few) areas where you’ll just have to make some calls for the sake of getting the data into the system at final cutover. If you don’t, you’ll quickly find your name in the same sentence as the words “…we didn’t go live because of…”. So go ahead, pick that low-hanging fruit, but wait until you absolutely must; it may still be poisonous, but it’s less immediately deadly than the alternative!

Tuesday, March 19, 2013

Data as an Asset – Walk the Talk




Establishing data governance programs, data quality initiatives or other projects aimed at looking after or enhancing the well-being of a company’s data can be difficult to get underway and sometimes even more challenging to keep alive. Such initiatives (especially if your company’s funding approach forces you to tackle them as a series of individual projects) cannot always deliver the concrete and tangible “things” that you might get from a more traditional IT project. More often than not there is no new asset created, no new system to implement and enable across a go-live weekend, nor are the benefits always obtainable and measurable in the short term. Unless your business is facing real pain (and honestly attributes that pain not just to problems with the data itself, but also to how the data came to be in the state it is in) then you may well struggle to build support for data initiatives, and to keep them running.

Often this surprises data folks; it doesn’t seem logical. The business says all the right things about valuing data quality, about wanting a single source of truth from which to make better, faster and more cost-effective decisions, and maybe someone has even spouted the “data as an asset” phrase, but none of this has resulted in them beating down the door with bags of money to fund data improvement initiatives. At first glance, key business stakeholders are showing moral support for the concepts but no commitment to the actions required to really solve the problems in the longer term. They are the chicken to the data team’s pig: involved, but not committed.

But perhaps it’s not all that surprising. Are you, as a data leader, walking the walk or simply talking the talk? Do you treat your company’s data as an asset yourself, and does your data strategy reflect that? Do you even have a data strategy, or are you one of the many, described by Joyce Norris-Montanari in her recent blog post (http://www.dataroundtable.com/?p=12661), who lack a vision for their data and a strategy to get there?

If you’re not walking the talk then chances are you’ll need to do some work up front to sell a congruent picture to the business, particularly to those stakeholders holding the purse strings. Make sure that you can show the same basics around the company’s data that you would expect around any other asset:


  • Knowing what you have – an inventory of your data (a minimal sketch of what one inventory entry might hold follows this list);
  • Knowing where this data is – which systems house which data and (ideally) how it moves around between them;
  • Knowing which stakeholders care about which data (you don’t need to get to business data ownership just yet, but you should at least know which data helps achieve, or threatens, which stakeholder’s bonus package);
  • Knowing what preventive maintenance is required to keep your asset running at the required level and to optimise the investment in it – some idea (and documentation) of where the problem areas lie, i.e. the big data quality issues and, importantly, their real impact on the business;
  • An understanding of how you can protect your asset – which may include classifications for data along with associated handling rules and other security aspects; and
  • A vision for continuous improvement and a better future state of asset management – thought-out and documented ideas for better use of your asset (which might include identifying and reducing redundancy, identifying inventory gaps or working better at the extreme ends of the asset life cycle).
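As a deliberately simple sketch of the first few points above (everything here beyond the idea itself is invented for illustration), an inventory entry need be no more than a structured record per data set:

    from dataclasses import dataclass, field

    @dataclass
    class InventoryEntry:
        """One entry in a basic data inventory; all example values are invented."""
        name: str                # what the data is, e.g. "Customer master"
        systems: list            # which systems house it, and so where it moves
        stakeholders: list       # who cares about it (and whose bonus it touches)
        known_issues: list = field(default_factory=list)  # documented problem areas

    customer_master = InventoryEntry(
        name="Customer master",
        systems=["CRM", "Billing"],
        stakeholders=["Head of Sales", "Credit Control"],
        known_issues=["duplicate customer records across regions"],
    )

Even a spreadsheet with these columns would do; the point is that the information exists, is documented and can be shown to the business.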


With enough of the above in place you have your house in order and can say that you are indeed treating data as you would any other asset. Now that you have a congruent story to present to the business, you may find your next funding request is met more favourably. Even better, chances are you will have identified a number of projects along the way that can deliver some concrete shorter-term benefits as well.

Tuesday, March 12, 2013

Could External Quality Assurance Hamper Your Chance of Data Migration Success?


I’m not sure of the exact failure rate for data migration projects. Along the way I’ve seen Gartner report that somewhere around 83% of migrations either fail or overrun their budgets and schedules, and if memory serves I’ve read that Forrester put the success rate at around 16%. The exact number probably depends upon who is doing the reporting, whom they survey and how candid the responses they receive are. Whatever the case, the number is big, scarily big.

To my way of thinking any area of a major project where the weight of historical evidence suggests that somewhere between 8 and 9 of every 10 attempts will be significantly challenged should be subject to two things:

  •  External quality assurance processes, in an attempt to make sure that the chance of success isn’t derailed by things not being done as they should be. Adding another voice of experience, or another set of eyes, if you will; and

  •  Some form of contribution to the wider data migration community of practice to help understand where things go wrong and over time (as a collective drawing from the positive and negative experience across many projects) look to evolve the methodologies used to undertake data migrations and lift the success rate.


Unfortunately, at least in my experience, the two items often work at cross purposes. All too often I’ve seen the first endeavour block or even derail the second. Quality assurance efforts are often established as a risk mitigation exercise, and that same aversion to risk often results in a lack of comfort and confidence in anything which can’t be shown to have been done many times before. An established methodology is preferred over anything that might be construed as cutting (or bleeding) edge. That’s all well and good but, chances are, if you are following an established approach then that approach has also been followed by a fair number of those 83% of projects that failed (to some degree) before yours.

This resistance to any attempt to stray from the well-worn path hinders the adoption and evolution of new concepts and, in so doing, prevents them from gaining wider acceptance, development and enhancement over time from the wider crowd of data migration practitioners.

So we, as those practitioners, have two choices. We can accept that we can do little to change accepted practice, keep our heads down, collect our pay cheques and hope that luck or our best efforts place us in the lucky 17%, or we can look for ways to not only increase the chances of success for our own projects but also contribute to the longer-term success rate of data migration projects in general. If we do want change then we must also recognise that radical shifts in methodology just won’t be possible; governance and quality assurance processes simply won’t allow them. Instead, I think we must look for chances to use new techniques to build upon more accepted methodologies, filling the gaps or shoring up the areas that pose the biggest problems in our particular projects. This could take any number of forms, from using lead indicators alongside lag indicators to gradually build confidence across a project, to the gradual introduction of new and improved approaches to the techniques and timing of reconciliation.
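To make the reconciliation example concrete, here is a minimal sketch of the sort of check I have in mind: comparing a record count and a simple control total between source and target after each trial load. The field name and record structure are invented for the example:

    def reconcile(source_rows, target_rows, amount_field="amount"):
        """Compare a record count and a control total between source and target.
        A matching count is a weak signal on its own; a matching control total
        adds confidence that the values themselves survived the journey."""
        checks = {
            "record_count": (len(source_rows), len(target_rows)),
            "control_total": (sum(r[amount_field] for r in source_rows),
                              sum(r[amount_field] for r in target_rows)),
        }
        return {name: {"source": s, "target": t, "match": s == t}
                for name, (s, t) in checks.items()}

Tracked across successive trial loads, results like these become the lead indicators mentioned above: evidence of growing confidence long before the lag indicator of go-live success or failure arrives.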

Whatever, and however, we go about this, I hope that as a community of practitioners we can slowly build acceptance for new techniques, new methodologies and new measurement paradigms, and over time shift what is deemed to be acceptable and common practice. Who knows, maybe sometime before the end of my career we may actually see a failure rate that doesn’t send cold shivers down the collective spines of project managers everywhere.