Friday, October 14, 2016

What is big data?

What do you think big data is? Please fill in this poll before reading the rest of the post!

Really, have you filled in the poll?

I recently had a discussion about Big Data with my colleagues and realized that most of them believe big data is simply about having a lot of data. What about you? It is true that, to this day, there is no accepted definition of Big Data. The only point on which the various definitions agree is that Big Data is about much more than the sheer amount of data.

I have recently finished reading the book "Big Data: A Revolution That Will Transform How We Live, Work, and Think." It is a good read for anyone who is interested in this emerging hot topic.

This book basically claims:
Big data ushers in three big shifts: more, messy and correlations (the book’s chapters 2, 3 and 4). First, more. We can finally harness a vast quantity of information, and in some cases, we can analyze all the data about a phenomenon. This lets us drill down into the details we could never see before. Second, messy. When we harness more data, we can shed our preference for data that’s only of the best calibre, and let in some imperfections. The benefits of using more data outweigh those of using cleaner but less data. Third, correlations. Instead of trying to uncover causality, the reasons behind things, it is often sufficient to simply uncover practical answers. So if some combination of aspirin and orange juice puts a deadly disease into remission, it is less important to know what the biological mechanism is than to just drink the potion. For many things, with big data it is faster, cheaper and good enough to learn “what,” not “why.”

And why is it a revolution? 
One reason we can do these things is that we have so much more data, and one reason for that is that we are taking more aspects of society and rendering them into data form (discussed in chapter 5). With so much data around, and the ability to process it, big data is the bedrock of new companies.
The value of data is in its secondary uses, not simply in the primary purpose for which it was initially collected, which is the way we tended to value it in the past (noted in chapter 6). Hence, a big delivery company can reuse data on who sends packages to whom to make economic forecasts. A travel site crunches billions of old flight-price records from airlines to predict whether a given airfare is a good one, or whether the price is likely to increase or decrease. These extraordinary data services require three things: the data, the skills, and a big data mindset (examined in chapter 7). Today the skills are lacking and few have the mindset, even though the data seems abundant. But over time, the skills and creativity will become commonplace, and the most prized part will be the data itself.

What are the threats (or how will societies have to re-invent privacy)?
Big data also has a dark side (chapter 8). Privacy is harder to protect because the traditional legal and technical mechanisms don’t work well with big data. And a new problem emerges: propensity, that is, penalizing people based on what they are predicted to do, not what they have done. At the same time, there will be an increasing need to stay vigilant so that we don’t fall victim to the “dictatorship of data,” the idea that we shut off our reasoned judgment and place more trust in data-driven decisions than they deserve.
Solutions to these thorny problems (raised in chapter 9) include a fundamental rethink of privacy law and of the technology used to protect personal information. They also include a new class of professional, the “algorithmist,” who will do for the big data age what accountants and auditors did for an era 100 years ago, when the cornucopia of information swamping society was in the form of financial data.
What role is left for humanity? For intuition, experience and acting in defiance of what the data suggests? Big data is set to change not only how we interact with the world, but also ourselves.

So look at the poll again. Which answers would you choose now? If you are not ticking all of them, then get the book!
