Although Big Data is fast becoming one of the buzzwords of the data industry, I still wonder whether we yet know exactly what it is. Is it anything more than a concept, an umbrella topic if you like, used to signal that something new is beginning to emerge in our industry? But is big data really something new? Is the buzz justified? Do we really need to reinvent our approaches, our practices and the skills we engender in those coming up through data careers?
But what of the other characteristics of big data? It’s not just volume that’s the challenge with big data, but velocity too. I’ve heard some commentators argue that the rate at which data arrives is in fact the biggest issue that comes with big data. But we’ve been dealing with high-velocity data for a while now as well. Complex event processing has been available in major database products for at least a few years, and people working with any form of operational technology will be used to data flowing in from sensors and other protection or control devices at millisecond intervals.
So I believe that we can leverage much of what we already know and practice as data professionals to start to address the volume and velocity aspects of [structured] big data. It’s the complexity and variety aspects of big data that I think will give us the real problems we need to deal with. The problem of making, or taking, the correct meaning from unstructured data, especially data coming from outside our organisations (sentiment from social media, for example), and then somehow finding a way to integrate it effectively with our structured data sets, is where I believe we’ll find our headaches. It’s here that we’ll need innovative new tools and techniques, but even then I think they’ll rely on key areas and practices which at least some of us have worked with for a while. Metadata will play a big role here in establishing the right context, and data mining may well make an appearance too.
So, in my opinion at least, we already know a thing or two about big data. Let’s not reinvent the wheel, but rather try to build on what we’ve learned in the past. It’s a novel idea, I know, and perhaps not one that IT professionals have much of a proven track record with, but hopefully this type of approach will shorten the time to value for those making early investments in and around big data.