Tuesday, December 4, 2012

Dead Men Pay No Bills

There should be more data quality horror stories out there. No, really, there should. When we data quality practitioners struggle to get stakeholder buy-in for our initiatives it's good to have a few in the kit bag to pull out and illustrate the problems poor quality data can cause. So, to that end, let me share a recent experience.

Mid-afternoon late last week my phone buzzed with the arrival of a new text message. It came from a number I didn't recognise: "Steven, we refer to our demand for payment of last week sent to your address in Midway Point. If this amount is not paid by Wednesday at 5PM legal action will be started. Please contact us on xxxxxxxx and quote reference number xxxxxxxx to arrange payment." My first thought was wrong number or phishing attempt. Either way I wasn't going to action the request. Odd though - my Father's name was Steven and he had a Midway Point address. Odder still: he died almost 5 years ago.

Curiosity got the better of me. A phone call to the provided number revealed that it was in fact a debt supposedly owed by my Father to a local electricity retailer. That bill had been dealt with years before when the lawyers were wrapping up Dad's affairs, so why was a debt collection service chasing payment now, and how on earth were they chasing me?

This is squarely in brand damage territory. Not the stuff of front page headlines, but certainly fodder for those trashy tabloid "current affairs" TV shows, and certainly capable of causing some degree of negative sentiment, brand avoidance and customer churn. Putting the commercial implications aside, it also has potential for emotional distress. Five years since Dad's passing gave me the space I needed to see the absurdity of the situation, but I suspect a more recent death would cause unnecessary upset for the person receiving the payment demand.

Looking from the outside in I can only make educated guesses at what is likely to have happened to get to this point. Here goes:

1. I know the company replaced their customer care and billing system between Dad's death and the payment demand. I assume there was a data migration exercise at the same time. Could it be that some part of the data associated with Dad's account wasn't migrated successfully, or was never migrated at all? Perhaps only the structured transactional data was migrated and the unstructured data attached to the account, such as notes advising of his death, wasn't migrated, or lost its association during migration.

2. There has been a concerted push from the major shareholder of the utility to recover debts recently. What if some form of data matching had been applied in order to find my mobile phone number (previously not associated with Dad's account) and use it as another communications channel to attempt to recover the debt? Dad and I share the same first initial and last name, we have at various times in the last forty-odd years shared the same address and we were born in the same month of the year. Did we somehow become fuzzily matched as the same person? I'm all for finding ways to get value from your data in new ways, but it's got to be based on solid foundations - if the data quality is not understood, acknowledged and good enough then the insights gained may well be flawed.
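To show just how easily an over-eager matching rule could conflate the two of us, here's a minimal sketch. The names, fields and matching rule are all hypothetical - I have no visibility into what the debt collector's matching actually looks like:

```python
# Hypothetical sketch of the kind of loose matching rule that could merge
# two distinct people into one. Fields, names and the rule itself are
# illustrative only, not any real organisation's logic.

def loosely_matched(a: dict, b: dict) -> bool:
    """Naive match: same first initial and surname, at least one shared
    address in the address history, and the same birth month.
    Deliberately (and dangerously) permissive."""
    same_name = (a["first_name"][0] == b["first_name"][0]
                 and a["surname"].lower() == b["surname"].lower())
    shared_address = bool(set(a["addresses"]) & set(b["addresses"]))
    same_birth_month = a["birth_month"] == b["birth_month"]
    return same_name and shared_address and same_birth_month

father = {"first_name": "Steven", "surname": "Example",
          "addresses": {"1 Main Rd, Midway Point"}, "birth_month": 6}
son = {"first_name": "Scott", "surname": "Example",
       "addresses": {"1 Main Rd, Midway Point", "9 Other St"}, "birth_month": 6}

print(loosely_matched(father, son))  # two different people, merged as one
```

A responsible record-linkage process would weight each attribute, treat a shared historical address as weak evidence at best, and route borderline scores to human review rather than auto-merging - exactly the kind of solid foundation I mean.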

Who knows what really happened - I'm guessing as to the chain of events that transpired. But I am sure of one thing - this is one bill that won't be paid.

Saturday, September 1, 2012

Why Data Quality is Like Shaving

Maintaining data quality is like shaving. That thought hit me early yesterday morning whilst I was doing it - shaving, that is, not maintaining data quality! Here's why.

A few things that most of us know about shaving and probably also know about data quality:


  1. It's not much use if you just do it once. You've got to keep at it every day.
  2. If you don't do it for a while you can still get back to a good state, it just might take a bit longer and be a bit uncomfortable for a while.
  3. Sometimes there's blood!
  4. Convincing someone who's never thought of data quality as important that they should do something about it is as hard as convincing someone who has had a beard for 20 years that they should shave it off.

And here are a few things that might not be quite so obvious about shaving (and about data quality undertakings):

  1. If you don't do it most people won't notice nor care. But a few people who are really important to you might treat you differently and not give you what you want. Just ask my wife!
  2. A little is good. Do a bit more and many people around you might look at you strangely but chances are the extra you did will pay you back many times over one day when you least expect it. Just ask a cyclist who has had to apply sticky bandages to his legs after a fall if it's easier with or without hair on his legs!
  3. Doing it everywhere you possibly can is probably a wasted effort (and just a little bit creepy!) And if you must do it everywhere, then some areas will definitely need more caution during the exercise (refer to item 3 in the first list!)
  4. The longer you've been around the more places you'll have to pay attention to. At 25 we all laughed at people with ear and nose hair, but it comes to us all in time. That data you've got hanging around from the time it was "faster" to work in spreadsheets will start showing eventually!
  5. Shaving gives you the chance to take a good look at yourself in the mirror every day, often without any of your imperfections covered up. Maintaining data quality calls for the same regular scrutiny and openness.

And here's the biggest thing most people will immediately accept about shaving and should accept about data quality too:

Chances are you're not going to ask someone from your internal hairdressing department to keep you clean-shaven. It will work far better if you take care of that yourself. Data quality's no different. It's your beard and it's your data!

Monday, August 27, 2012

So That's What IT Recruiters Are Good For!

The BI market seems to be fairly hot right now, at least in Australia, if the number of calls I'm getting from headhunters and IT recruiters is any sort of indicator. However, an experience I had this week makes me wonder.

Right now I have a need to ramp up the number of BI folks in our company to handle some project demands. It's a short term gig for the time being so we directly approached a few service firms as well as posting in one or two LinkedIn groups. Two days later I received an email from an IT recruiter offering me the chance to interview for a great BI position with a first class company they were working very closely with. The position description felt very familiar, and no wonder: I'd written most of it only a few days earlier. It had been augmented a bit to give the impression that what was in reality a short term engagement with a slim chance of extension was instead just the initial piece of a long engagement. A hunch had me checking Australia's main jobs website. Sure enough, two more recruiters were advertising jobs which were also obviously the roles I was trying to fill. Both had added extra or erroneous information, I suspect to make their version of the job more attractive than the one being advertised by their competitors.

Recruitment "consultants" taking initiative and creating opportunities to make income for themselves and their companies is one thing, but it causes problems for those of us working in, or managing those working in, the BI arena. With multiple parties advertising them, my two roles now seemed like eight, so when those two are filled it will seem as if many more roles have been quickly filled. I suspect this contributes to the general feeling that demand is high and might even be artificially forcing rates up. Worse still, will this perception of great demand draw people from other IT areas to re-train in the BI field, potentially causing a future glut of supply?

So let's be generous and say that recruitment consultants generate a buzz for us and drive potential candidates our way, but those of us with our own Human Resource Departments could do that anyway. What else do recruiters bring to the table? In my experience the majority of those I've dealt with lack any depth of understanding - matching skills and expertise to roles is haphazard at best, and even basic screening of applicants seems limited to "if they can spell BI then they're qualified to get a second interview".

So, for me, I struggle to see the value... But then I had an experience that made me realize that recruitment consultants do serve a valuable purpose - they're here to make us laugh. I was approached by a recruiter interested in my data warehousing expertise and keen to learn if I had a forklift licence and what my typing speed was. I suspect data warehousing may have been a new term for him! Another, during a preliminary chat when I was trying to find a candidate to fill a new data warehousing role, asked me if I felt rising real estate costs would make warehouse space more expensive and cause my firm to lift our rates.

So, guys, keep the laughs coming, but you'll excuse me if I pass on your other offerings!

Friday, August 24, 2012

The Next Data Rock Star

I miss the 1980s; there was lots to like about that time: great music, a lifestyle less cluttered by technology, crazy hair, the fact that I could still grow enough hair to have crazy hair. For most of my adulthood if you had asked me which time in my life I'd like to go back to I'd have picked the 1980s. But now I'm not so sure; I'm starting to pine for the late 1990s through to the mid 2000s. Why? Because that was a time when quite a few data folks working in the business intelligence arena were the rock stars of IT.

I'm lucky enough to have worked on my fair share of interesting and high profile data warehousing and BI gigs during that period. The problems I had the privilege of working on were often difficult, few (if any) commodity solutions or services had manifested, and the solutions we provided were often anything but cheap. However in solving those problems we added real value to the business, at times doing things that their traditional IT staff or software vendors had said couldn't be done or had tried (and failed) to solve with more application centric approaches. Perhaps most importantly, during that period few BI projects were undertaken unless the business could really see the benefit. Maybe this was because of the high cost, maybe not. Whatever the reason, those of us lucky enough to be leading the charge and playing key roles in these projects often got the accolades not just of our IT and data colleagues but of key business stakeholders as well. I was fortunate to parlay this into a successful consulting career, with word of mouth securing a series of back to back BI engagements stretching for over a decade.

But I fear that the BI Rock Star is a thing of the past. It's becoming a crowded field. Any number of people now "do BI" and what was once bordering on black magic is now commodity thanks to advances in (BI) software products and hardware processing power. Even worse, the word "reporting" is commonly used almost interchangeably with "business intelligence". We've all too often moved away from using BI undertakings to support complex decisions and solve wicked problems to a world where all too many reports are produced for banal reasons or reasons unknown. It's no wonder then that business stakeholders seem to be seeing less value from their business intelligence dollar.

So, if BI is indeed on the wane, I wonder where the next Data Rock Star will come from. If I were to encourage my kids into the field where would I point them? Will the Data Scientists be the next wave to be lauded by the business, or will those jumping on the Big Data bandwagon get that privilege? Maybe those playing in the NoSQL space are about to be thrust to glory, or perhaps the social media tsunami will mean it's the intersection of all three that will be the sweet spot. I actually think it won't be any of these. Both Big Data and Data Scientists to me still feel like buzz words - hot topics - that the industry has thrown up and bandied about to the point where airline magazine syndrome has kicked in. I don't see a clear path to the widespread killer need that means business stakeholders will clamour for solutions in these areas en masse any time soon.

So, if you can see more clearly than I, I'd love to know what the next big data thing will be. I miss being a rock star!


Saturday, August 18, 2012

Data Quality - Running in the Badlands

I like to run a little. Actually, I like to run a lot. Back in the days when my work had me travelling most weeks one of the upsides was the opportunity to regularly find new running routes in new cities. Most times this resulted in an enjoyable run and the chance for a bit of sightseeing. However on occasions things didn't go quite so well. Several times I've felt very uncomfortable and on one occasion I had genuine fears for my safety.

Three such times stand out for their similarities although they occurred in different cities around the world - Las Vegas, Milan and Philadelphia. On each occasion I was speaking at a conference and so had a little time to kill, but not enough time so that I could run too far from my hotel. All of these cities promote themselves as either tourist or conference destinations and to the casual observer seem to fit this brief well - safe, busy, functional and (in two of the three cases) prosperous. At first glance it would seem that nothing is amiss and that the "service" they offer is well provided; they are "fit for purpose", if you like. However in each city less than six blocks from my hotel I found myself running through either seriously rundown areas, areas with (in one case) chain mesh fences and burning cars or areas where I was harassed and threatened.

Now, I admit that probably 99% or more of tourists or other visitors to these cities will probably never see the areas I did nor find themselves in a threatening situation. It strikes me that there are parallels with data quality in many business systems: the places where everyone goes, those areas that support the most common transactions and queries, have had the most obvious data quality issues sorted out. In most cases the people using these systems largely have no problems which stem from data quality issues, so over time the perception develops that data quality is not an issue. Even when the odd data quality problem does crop up, it's often dismissed as an outlier, or the pain of it is forgotten before it can be used as a driver for a data quality improvement initiative. Much like the cities I visited, all seems to be working smoothly as it should.

However on my problem runs I definitely didn't get the result I was after - I didn't enjoy sightseeing and the training effect I had planned for didn't eventuate, as I either had to cut my run short or ran (away) at a faster pace than I had planned. Without doubt there was at least a loss of effectiveness and probably a failure to get to the desired outcome. The same occurs in our systems when data quality issues lurk just out of plain sight. When users venture into those areas, be it with ad-hoc requests or with infrequently run analytics, the outcome they seek may well either be missed entirely or hampered by inefficiencies. These issues have knock-on effects too - perhaps someone reading this blog won't run next time they attend a conference in one of those cities, or may even choose another conference entirely, in much the same way users may choose not to wade through an analytic process in a certain area of data again, preferring instead to rely on gut feel or their own sets of numbers from desktop spreadsheets.

So should we as data quality or data governance practitioners act to rectify these less mainstream data quality issues? The answer is probably "it depends". For me, it's a cost benefit question: if the impact of the problem is bigger than the cost to fix it then there could be a strong business case to be built for a fix. The bigger issue though is not knowing that the problems exist in the first place, so perhaps it is incumbent on us to attempt to understand where these problems lie so we, and our stakeholders, can best choose where to focus data quality improvement efforts. Interestingly, this very style of effort is actually in play at the hotel I stayed at in Philadelphia. Seeing me walk back into the lobby in my running gear the hotel concierge asked how my run was and then offered me a pre-prepared map showing run routes and a very clearly marked border indicating the outer limits of where it was safe to run. I only wish they had advertised the existence of this data steward before I'd gone out running!

Monday, July 16, 2012

Congratulations, Your Data Migration Project Was a Success! Or Was It?

In recent years I've begun to notice something amongst data migration providers that I'd seen little of before. Many of them (or at least many of those that I've had dealings with) are beginning to be far more proactive in taking steps to ensure that their migration projects are successful. That's a good thing, isn't it? Well I guess that depends how you define your success metrics. Does a successful migration always translate into benefit (or at least lack of significant pain) for the enterprise?

Perhaps the most noticeable shift has been towards the "less is more" philosophy. That is, rather than trying to take all of the data that's available, or that the business users say they need, there is a push to reduce the amount of data migrated between systems. In my opinion this is a good thing - not migrating data that isn't required should remove complexity, decrease cost and timelines and, perhaps most significantly, reduce the risk of unexpected problems leading to delays and cost blowouts. For me there are a number of scenarios when this approach makes a lot of sense:


  • Some of the data can be shown to add no value to business operations (that is it's only required for compliance or records management requirements) AND the business users agree that this is the case;
  • There have been several past restructures to business operations resulting in data which is very hard to map between different charts of account or the like;
  • The cost of migrating the data is significantly higher than the value that having it available in the new system would deliver AND not migrating it does not expose the company to any significant additional risk.
However, I've seen other arguments for this approach, among them:

  • Only open balances and other current data are required for the business to operate;
  • Migrating anything over and above current data introduces more cost and risk of project time overruns;
  • It will be more cost effective to use alternate means (other than migrating it) to get to old data, including:
    • Leaving the old system in place in a read only state;
    • Leaving the old data available via an existing data warehouse or BI solution;
    • Leaving (just) the database from the legacy solution in place and having IT staff query it to service user requests.
There's a common thread amongst these items - they all focus on a successful project outcome, i.e. getting the project delivered on time, whilst making some major assumptions about life post migration. So, if you're faced with a migration project where your vendor is proposing this style of approach, or even if you yourself are considering this option, take some time to validate those assumptions. Ask the business users if they really can operate on that minimal data set. If they say yes then ask them some more questions, paint a picture of how the world will be and test their theory - how will they do trend reporting, how will they respond to an e-discovery request, how will they bring a new product to market quickly or design future sales and marketing campaigns in a timely fashion?

Don't just talk to the business users, spend some time with the IT folks as well. What's the impact to them of having to keep old systems around for some (potentially significant) amount of time? Chances are IT Management won't have budgeted for having to continue to pay licence fees for the old software on top of the new fees, let alone the operational support costs of keeping the computing hardware that it runs upon up and running. This can be made worse still if a third party supports this hardware and requires that it be kept in warranty, as then there may well be new hardware purchases, product installations and potentially even a migration of the old system (onto the new hardware) as well. And then there's the need to maintain support and help desk skills with the old products. At best this means having to keep staff cross skilled, potentially reducing their ability to perform higher order work, but it may also mean the need to bring on new staff, adding head count and increasing wage costs. A tricky situation becomes even worse if the project business case doesn't recognise the need for increased IT operational spend or projects IT savings from the increased efficiency the new system will bring.

Only when you've satisfied yourself that the assumptions are understood and hold should you proceed with reducing the data migration volume. Chances are there will be a middle ground where the data required is more than the bare minimum set and both the business and IT can operate without significant compromise - aim here! Hit this and there's every chance that your migration project will be seen as successful not just at go-live but also in the weeks, months and years that follow. Aim for the minimum migration set without first validating the approach and you can enjoy the accolades at go-live, but be ready to hit the road in a hurry, because it won't be long before both business and IT managers are queuing up at your door to ask some hard questions. Good luck!




Thursday, May 17, 2012

Data Governance and the Not So Thirsty Horse

What's that old joke? The one about the deeply religious man caught in his house as flood waters rise around him. Early in the piece, as the waters are only a few feet deep, a dinghy arrives to take him to safety and he refuses, saying "I'll be fine, the Lord will look after me". Later, as the water rises and forces him to the house's upper storey, a bigger boat arrives and those on board implore him to leave his house and come with them. Once again he answers "I'll be fine, the Lord will look after me". Some hours later, as he is perched dangerously on his roof with water lapping at his feet, a helicopter descends and those on board urge him to leave before it is too late. But the man is insistent - "The Lord will look after me, I don't need you to save me". Not long after, the waters rise again and the man is swept away to his death. Arriving in Heaven he is confused and, on being granted an audience with God, asks Him "Lord, I put my faith in you. I trusted you. Why didn't you save me?". God looks at him somewhat incredulously and asks "What more did you want? I sent you a dinghy, a boat and a helicopter!"

On more than one occasion in my career I've marvelled at how much that joke paralleled what was going on around me. You may have even experienced this yourself at some time - demands coming from all sides: requests for clarity around data meanings and master data rulings, people demanding to know why you don't know the ins and outs and business rules of obscure pieces of data about to be touched by some project or other, and complaints that there's not enough documentation and other collateral in place - forcing them to spend more effort and dollars inside their project. If all this was indeed missing then I could understand the consternation, but when a data governance framework exists and the business users understand and acknowledge that they (and not the IT group) steward the data and are the ones with the true knowledge of that data, then I start to bristle a bit. Maybe not the first time, or even the second or third time you walk people through what's available and how to make best use of and leverage it, but sometime thereafter it's sure to come. I've noticed similar behaviour in other industries across my career, but it does seem to be remarkably pronounced amongst those working in IT.

Often I've put this down to some form of Not Invented Here Syndrome and accepted that some individuals will just not use what is put before them, be it through lack of acceptance, understanding or whatever. I'm sure that the phrase "you can lead a horse to water, but you can't make him drink" has passed my lips on numerous occasions. But lately I've started to think that perhaps this is not the answer. The horse really needs a drink, he just doesn't know the water is good yet! I've written before about the need for persistence and tenacity when implementing data governance programs. Perhaps this situation offers an opportunity to create greater acceptance of data governance, data stewardship and the like. Meeting those IT folks placing demands on you halfway may well work - give them a little of what they are asking for, do a bit of "their" work for them. Sure, it's re-inventing the wheel to some degree, but nudge them toward what's already in place, shepherd them into the data governance framework and show them (via a few small wins) how they can leverage the data stewards working within it. My initial findings suggest that where those people begin to see that it can make their jobs easier they start to be more accepting of it and begin to use it to augment their other activity, in some cases making it their first port of call. Given enough time I'm convinced we can make data governance advocates and champions of at least some of these people.

I'm keen to hear from you if you've experienced (and leveraged) similar issues as I'm sure I'll encounter many stables full of horses who don't know they're thirsty across the rest of my working life!

Monday, May 7, 2012

Data Governance and the Self Licking Ice Cream

A friend and colleague regularly uses the self licking ice cream phrase to illustrate to the project management team he mentors that projects need to provide value to the enterprise and not just operate in a vacuum. No matter how well those in the project build a system, or how well that project is managed, it's unlikely to be perceived as a useful and successful exercise unless the various stakeholder groups agree that the work was needed and worthwhile. From their point of view all that the project will have achieved is keeping those on the project busy and employed for its duration.

But that can't be said for data governance, can it? Every organization has at least some data quality problems, conflicts around data ownership, and data silos preventing the business from leveraging its data asset, so every organization could benefit from a data governance program.

That may be true, but I'd still like to bet that many, if not most, data governance programs are seen by many in the organization as self licking ice creams. Ask an exec or senior manager from one of those firms and chances are they'll not see the value of the data governance initiative. If you're heading a data governance program and your execs don't recognise the need for it then, even if you think what you're doing is worthwhile, I've got bad news for you - to others you're licking your own ice cream! It won't matter if you've been making great advances in improving data quality, finally getting the data stewards to work together to understand master data issues, or building understanding across your data landscape and producing mountains of documentation along the way. From the outside you'll be seen as doing little more than using scarce funds and making a call on their people's time for no other reason than to keep yourself in a job.

Of course securing an executive sponsor for the program and working to win wider executive awareness and buy-in will help avoid this scenario. However, this will be difficult to achieve and retain unless you continue to find ways to make your data governance work relevant and demonstrate value that can be seen by key stakeholders in the organization. For me, to do this it's critical to have demand placed on a data governance program. If people in the enterprise don't recognise they have data problems and aren't coming to you to help solve them, then you need to find something tangible for your program to tackle. If you're lucky enough to be facing upcoming system changes with associated data migration then that may be a great driver. If not, then try to find the areas where non-trivial amounts of money can be saved due to data problems, or look for the issues that keep your execs awake at night because they lack confidence in the data. If you believe that those situations can be helped through data governance then start to work on them; if not, keep looking - trying to force them to be data governance issues won't do you much good.

If there's nothing in your organization that you can apply data governance to (and be seen to create value) right now then don't be afraid to let things lie for a while. Pushing ahead on things others don't perceive as valuable may actually torpedo the longer term success of a data governance program. Better to lie low and bide your time rather than have others shut the program (and perhaps your career prospects) down. But, keep your ear to the ground because there will come a time when there will be a big problem facing the organization which data governance could help solve. Be ready to seize the opportunity. If you don't and one of your execs later reads about data governance then being a self licking ice cream will be the least of your worries - you may fall victim to airline magazine syndrome!

Friday, April 13, 2012

Your Data Migration Project is Not a Data Quality Exercise


If I look back across all of the data migration exercises I’ve been involved with, one aspect has come up during every single one of them – at some point someone involved with the project will try and seize the opportunity to use the migration as the means by which all past data quality sins will be dealt with and solved. There was a time when I’d have embraced this idea and tried to incorporate it into the project plan; on more than one occasion I may even have been the person advancing the idea.

But not any more. That was then, this is now. Looking back over some of the projects which struck problems, one recurring theme seems to have been a push for wide ranging data quality improvements as one of the key deliverables, either as a mid-stream addition to (or change of) focus, or as a key aim from day one. I’ve reached a conclusion – looking at it now it seems stupidly obvious, self evident if you like, but I suspect it may seem a tad controversial to more than a few people out there:

Data migration exercises are about migrating data. This should be the first and only aim. If you don’t get your data from the old to the new system then you have failed – it’s as simple as that. All of those forecast business benefits that rely upon having data migrated will be reduced, deferred or missed entirely. Don’t even think about using the migration to do anything else, data migration is hard enough without making a conscious choice to try and increase the complexity, cost, timeframe and risk. Don’t do it to yourself!!

I’d suggest only entertaining the concept of improving data quality when it can be shown that the new system can’t and won’t operate successfully with the level of quality of data currently available in the incumbent system. Then, and only then, look to incorporate data quality improvements, but keep them as light as possible. Only improve the data as much as is required to allow the new system to operate. This may seem counterintuitive, even perhaps bordering on delinquency of our duty as data professionals, but step back and look at the bigger picture, the interconnectivity of the data in (and across) systems. The changes you make to improve a particular aspect of the data in one area may have unforeseen impacts elsewhere. Sure, data profiling can go some way to help understand the impacts here, but do you really want to be following convoluted lineage trails just to be sure that no nasty surprise side effects crop up? Worse still, much improved data quality in one area of the system will soon be forgotten if whole other areas can’t even get the base data they need to operate due to migration woes.
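One way to keep improvements "as light as possible" is to profile only the fields the new system actually requires, against the minimum quality it demands, and remediate nothing else. Here's a rough sketch of the idea - the field names, validation rules and thresholds are all invented for illustration, not taken from any real migration:

```python
# Illustrative sketch: profile only the fields the new system needs,
# against the minimum quality thresholds it demands. Everything here
# (fields, rules, thresholds) is hypothetical.

def fitness_gaps(records, rules):
    """Return the fields whose populated-and-valid rate falls below the
    new system's minimum - i.e. the only places remediation is required."""
    gaps = {}
    total = len(records)
    for field, (validator, min_rate) in rules.items():
        ok = sum(1 for r in records if validator(r.get(field)))
        rate = ok / total if total else 1.0
        if rate < min_rate:
            gaps[field] = round(rate, 2)
    return gaps

records = [
    {"customer_id": "C1", "postcode": "7172"},
    {"customer_id": "C2", "postcode": ""},     # missing postcode
    {"customer_id": "C3", "postcode": "717"},  # malformed postcode
    {"customer_id": "C4", "postcode": "7018"},
]
rules = {
    "customer_id": (lambda v: bool(v), 1.0),                # mandatory
    "postcode": (lambda v: bool(v) and len(v) == 4, 0.95),  # 4-digit AU code
}
print(fitness_gaps(records, rules))  # only postcode falls below the bar
```

The point of structuring it this way is that anything not named in the rules, or already above its threshold, simply never becomes part of the migration's data quality scope.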

Now, before all of you data quality folks delete my blog address from your favourites folder, let me reassure you that I’m not saying that we shouldn’t improve data quality. In fact I think that the arrival of a new system and the data migration project that comes with it is a great opportunity to push for funding to improve data quality. We just need to play the long game and look a little further ahead - designing projects to make specific data quality improvements before the data migration even begins. Start selling the benefits of improved data quality well ahead of the migration (months, if not a year or more) and angle the pitch in such a way as to show how these projects will reduce the risk of your company becoming the next data migration horror story headline with huge cost overruns, customer churn and brand damage.

Tuesday, February 21, 2012

Big Data - Is it Really New?

Despite the fact that Big Data is fast becoming one of the buzzwords in the data industry, I still wonder if we yet know exactly what it is. Is it anything more than just a concept, an umbrella topic if you like, to attempt to signal that something new is beginning to emerge in our industry? But is big data really something new? Is the buzz justified? Do we really need to reinvent our approaches, our practices and the skills we engender in those coming up through data careers?

I’d argue that some, if not many, data professionals already know a thing or two about big data. Well over a decade ago I was the architect leading a project to build a data warehouse to handle call traffic and billing data for one of Australia’s biggest telecommunications companies. Back then the volume of data we were working with was hardly small, and I’d suggest that the people who work with that warehouse, and the BI solutions it enables, today are facing even larger data volumes thanks to the proliferation of mobile phones and other devices. We found ways of handling the challenges that the volume of data posed, and more than a few of these approaches would still work today, even without the additional leg up we now get from increased computing power.
But what of the other characteristics of big data? It’s not just volume that’s the challenge with big data, but the velocity too. I’ve heard some commentators argue that the rate at which data arrives is in fact the biggest issue that comes with big data. But we’ve been dealing with high velocity data for a while now as well. Complex event processing has been available in major database products for at least a few years, and people working with any form of operational technology will be used to data flowing in from sensors and other protection or control devices at millisecond intervals.
So I believe that we can leverage much of what we already know and practice as data professionals to start to address the volume and velocity aspects of [structured] big data. It’s the complexity and variety aspects of big data that I think will give us the real problems we need to deal with. The problem of making, or taking, the correct meaning from unstructured data, especially that coming from outside our organisations, such as sentiment analysis from social media, and then somehow finding a way to integrate it effectively with our structured data sets is where I believe we’ll find our headaches. It’s here that we’ll need innovative new tools and techniques, but even then I think they’ll rely on key areas and practices which at least some of us have worked with for a while. Metadata will play a big role here to establish the right context, and data mining may well make an appearance as well.
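To make the integration idea concrete, here’s a minimal sketch in Python (standard library only) of the pattern described above: derive a crude sentiment score from unstructured text, then join it to a structured data set via a shared key. The customer records, the social media mentions and the keyword lists are all hypothetical illustrations; a real solution would use proper NLP tooling and, as noted, would lean heavily on metadata for entity resolution.

```python
# Hypothetical keyword lists; real sentiment analysis would use trained models.
POSITIVE = {"great", "love", "fast", "helpful"}
NEGATIVE = {"slow", "awful", "hate", "outage"}

def sentiment_score(text: str) -> int:
    """Naive bag-of-words score: +1 per positive word, -1 per negative word."""
    words = text.lower().split()
    return sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)

# Structured data: customer records keyed by customer_id (illustrative only).
customers = {
    101: {"name": "Acme Pty Ltd", "segment": "enterprise"},
    102: {"name": "Jane Citizen", "segment": "retail"},
}

# Unstructured data: social media mentions already resolved to a customer_id
# (that entity resolution step is itself the hard metadata problem).
mentions = [
    {"customer_id": 101, "text": "Love the new portal and support was helpful"},
    {"customer_id": 102, "text": "Third outage this week and awful service"},
]

# Integrate: attach an aggregate sentiment score to each structured record.
for m in mentions:
    cust = customers[m["customer_id"]]
    cust["sentiment"] = cust.get("sentiment", 0) + sentiment_score(m["text"])
```

The hard part in practice isn’t the scoring or the join; it’s establishing, via metadata, that the text really refers to the entity you’re joining it to.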
So, in my opinion at least, we already know a thing or two about big data. Let’s not re-invent the wheel, but rather try to build on what we’ve learned in the past. It’s a novel idea I know, and perhaps not one that IT professionals have much of a proven track record with, but hopefully this type of approach might shorten the time to value for those making early investments in and around big data.

Tuesday, January 17, 2012

What Do You Mean You Don't Need a Data Model?

On more than one occasion across my career I've found myself having to justify why we should bother undertaking data modeling exercises. Thankfully I'm not facing that problem currently, but I am in the midst of refreshing a set of strategy, standards and guidelines documents and recently found myself writing a "Why Model Data?" section as a preface to one document. It got me thinking that, despite all of the prior times I'd had those discussions, I'd never really succinctly put down in one place why it is that I think we should undertake a data modeling process - how it adds value, if you like. So, this blog post is the result.

Modeling of information systems without modeling of the data they operate with can, and does, lead to bad outcomes. Modeling at only a systems or application level often fails to consider important characteristics of the underlying data, instead painting a picture which is only representative of one particular way of using that data. Such modeling frequently impedes reuse of data across organisations, lowers the quality and speed of decision making and can lead to unnecessary IT spend and the development of siloed IT applications with unwanted and sometimes unrecognised redundancy. Modeling of data assists with building understanding not only of data content but also of data relationships, with an effective data model being a key enabler of successful integration and well aligned and efficient application architectures and portfolios.

The explicit act of data modeling also encourages discussion about the meaning of data and the appropriate (or otherwise) use of various data elements in different functions of the organisation. It may well expose differing definitions and understanding of the same elements of data that, prior to the modeling exercise, those in the organisation had been unaware of.

Higher level data models also serve as a useful communication tool, helping to bridge the IT – Business divide and provide a mechanism to forge a common understanding of particular areas of the organisation’s data. Data which is easily and well understood is more likely to be reused, as not only is the barrier to reuse lowered, but the urge to collect and store the same data elsewhere for another purpose is less likely to come to the fore. This important aspect can be a mechanism for cost avoidance in that it lessens the risk of the design and implementation of systems which make ineffective or inappropriate use of the data assets.

Finally, when coupled with other [types of] models, data models allow an understanding of an organisation’s current technology landscape and the way it interacts with business processes. This understanding forms the foundation for efficient and low cost impact analysis around process and system change, in turn enabling business agility.

So there you have it. Part elevator pitch, part business case executive summary. I hope it helps someone the next time they're arguing for funding to develop or maintain the enterprise data model or trying to convince a bunch of rogue developers that simply using Visio to draw an ERD after the application is delivered isn't really the best approach!

Thursday, January 12, 2012

We Don't All Need World's Best Practice


In anything but the smallest of IT departments, chances are that at some time you will need to rely on the efforts of others to design and deliver some project or other, be they internal staff, contract resources brought in-house or a full blown systems integration firm. There’s also a reasonable chance that you may not have managerial authority over those undertaking the work, or perhaps at best a dotted line report to you. When faced with this scenario most of us will want to exert some degree of control or influence over how the work gets done or, at the very least, we’ll have an approach we’d prefer those charged with performing the work to follow.

One common element I’ve noted amongst many of the people I’ve managed or mentored over my career is a hesitation, or even unwillingness, to commit to paper a set of rules or guidelines which will govern the way in which others conduct their design, development and implementation activities. Often these folks have claimed that this sort of guiding documentation isn’t required, citing reasons including direct supervisory authority over the project team, the ability to exert technical influence in a one on one scenario with key project team members, or that the project team is experienced and skilled enough that such governance is not required. To my way of thinking none of these reasons really holds water. Even if you are able to direct or influence the behavior of the project team, tackling this in an ad-hoc fashion is ineffective and not fair on the project team (it’s hard to work within given constraints when those constraints are only trickle fed to you and often only arrive at a review stage), and often this direction falls by the wayside as project pressures heat up (if it ever actually occurs at all). Leaving things ungoverned is also a risky proposition – there are many ways to skin a cat – and there is no guarantee that the approach taken by the project team (no matter how good or effective it may be) will result in activities or outputs which mesh well with your existing team, processes, procedures or landscape. Failure to govern, guide and set some ground rules can, and usually will, result in project outcomes which are not as good as they could otherwise have been.

I have another theory as to why people are reluctant to commit such ground rules to paper – fear! It’s a natural reaction when you’re dealing with the unknown. Other than a paper CV and perhaps a few preliminary meetings you have little real idea of the depth of experience and skills of the “outsiders” you will be working with. Concern creeps in: “will they know so much more than me” and “will I look stupid alongside them or in their eyes” are just two of the things those little voices in your head might start to whisper to you. I can recall these feelings in the past even when the project lay squarely in an area in which I had deep expertise. Imagine how uncomfortable a person already a little uncertain of his or her skill and experience in an area might feel! Alongside the fear comes overwhelm: the feeling that there are just too many things to think about, that it would not be possible to get them all documented without missing at least one or two items. This thought process leads right back to feeding the fear, with concerns that omissions from the document will only make the author look even worse in the eyes of his or her management or the new people arriving for the project.

Let me put an idea out there. You don’t need to know more than the people coming into your company, you don’t even need to know the best way to tackle a certain technology problem or the ins and outs of the latest development techniques or what domain thought leaders are debating amongst themselves. There is no need to commit to documenting world’s best practice, nor to hold your project resources to that standard. Good practice will be fine for the vast majority of situations. But how do you even know what good practice is? And how do you make sure that you cover all of the big-ticket items? Knowing everything that is important to think about can be daunting.

I favour tackling this with something I call the Bad Outcomes approach. Rather than trying to think of everything that needs to be done in a certain way or toward a certain approach, simply make a list of the bad outcomes that could result from the upcoming project. Start with the really big ones; the things that might cost your company at the high end of the scale, whether it be financially or in other less direct ways such as reputation or brand damage, or even worse could cost you your job. Once you have those down, move on to the layer below: those bad outcomes which may not be as catastrophic but are still likely to cause a prolonged period of discomfort. Mull on these items for a few days; discuss them with colleagues, both from within the IT function and from the business, adding any new items that might come up. Revisit your list and drop any items which would only cause minimal impact or short term pain, and with luck you’ll have a relatively short list of the outcomes that you need to govern against. As an example, the last time I went through this exercise I ended up with only eleven items for a project likely to be worth multiple tens of millions of dollars. Now you’ll have focus – you’ll know what to work on that’s really important. It won’t matter if you don’t use the World’s Best Practice approach to govern and guide each item; so long as you find a way which is likely to avoid the outcome, you’ll have what you need, and you’ll likely have saved your company a pretty penny in avoided costs and perhaps even your job along the way.

The next time you’re faced with the need to craft a strategy or pen a set of standards or guidelines, don’t worry about what you don’t know. Remember, no-one will know all there is to know about a subject, so accept that you won’t always know the best way to solve a problem or everything there is to think about. I’m pretty sure you, like me, have your scars and war stories, so you’ll know what you want to avoid. Start there. Good luck!

Wednesday, January 4, 2012

Technology Leadership - A Misnomer?

If you've had a career of any reasonable length in the IT industry and have a track record of success behind you then chances are you're now in, or at some future point will be offered, a role which is considered a Technology Leadership role. In the data and information space these types of roles include Architects, Business Intelligence Managers, Competency Centre Leaders, Data Quality Managers and the like.

I've worked in many of these roles at various times, and the one thing that they all have in common is that this so-called technology leadership often deals with issues that have little to do directly with the technology at all. Rather, the Technology Leader spends his days setting strategy, looking for ways to deliver business value through future change or the current actions of his people, engaging in stakeholder management, selling concepts to senior management, and so on. Most of these issues are actually more closely aligned with managing functions, people, shaping programs of work, etc., and require some ability to control and direct, and access to, and authority over, budget and resourcing decisions. Without this, the leadership may become divorced from the ability to deliver, which can have serious ramifications for credibility, influence and the ability to show effect and value from the leadership role. Perhaps it could even be argued that people in such roles are shouldering, and in some cases having to own, some of the concerns traditionally associated with line management without having enough say in solving those problems - i.e. their plans could be quashed, derailed or rerouted by others who do have control over budget, resourcing or strategy.

In order to allow our technology leaders to add the most value, we need to realise and acknowledge that their leadership is in fact only in part technology based. Our technology leaders need a seat at the management table to be truly effective. Taking this step delivers another benefit: technology leaders are in the decision loop early, ensuring better alignment with other initiatives and allowing others in the management structure better visibility of, consideration of, and therefore use of, the area from which the "technology" leader comes.

The alternative to this arrangement is the introduction of Thought Leadership roles. These may well be ideal positions for the person formerly known as Technology Leader, if the organization can stretch to that. However most can't, due to lack of size, true need, commoditisation of many facets of the IT function, or shrinking budgets. The unkind (or is that better phrased as jealous?) spin on these roles might well be lots of thinking, musing and spouting of opinion without implementation efforts being hampered by real world constraints and politics. Hey Gartner - if you're ever looking, I'm over here :)