The Big Data Myth

Big Data has become the craze of the business world.  Companies are spending millions of dollars on the latest technologies and are hiring data scientists in droves, seemingly in a rush to stay technologically relevant.  But what is “Big Data” and is it right for your business?  For many organizations, the answer may come as a surprise.

The definition of Big Data is as varied as the companies that claim to have it.  Consider this definition from a quick Google search:

noun: big data

  1. extremely large data sets that may be analyzed computationally to reveal patterns, trends, and associations, especially relating to human behavior and interactions.

    “much IT investment is going towards managing and maintaining big data”

There is no question that Big Data does, in fact, offer myriad opportunities to reveal invaluable patterns and trends that can lead organizations to both uncover unknown problems and discover completely new products, services, and ideas.

Take, for example, the popular book (turned movie) Moneyball.  Major League Baseball is a game of hundreds of players throwing hundreds of thousands of pitches across thousands of games each season.  Entire companies exist to collect millions of statistical records each year, detailing who threw each pitch, who was at bat, the weather and wind speed, how fast the pitch was, whether it was a fastball or curveball, whether the batter swung, whether he made contact, where the ball went… the list goes on and on.  Clearly, the vast amount of data available across multiple seasons of baseball is not manageable by the normal computer user with everyday tools.  That is why teams like the Athletics and Red Sox have become well known for hiring data scientists to navigate the data and turn it into information that improves their organizations.

The need exists in other industries as well.  The airlines have countless streams of data about flight schedules, flights, and passengers.  Twitter and Facebook see millions of posts on their sites every day.  In these cases, hiring a team of computer science and statistics PhDs is an understood necessity for making use of the data.

So, why are we calling “Big Data” a myth?  Let’s go back to the definition and consider the term “extremely large data sets”.  What does that mean?  In the case of Major League Baseball, airlines, and social networks, that can mean terabytes of data representing billions of data records.

However, chances are that most people reading this do not work in one of these places.  The typical organization is going to have thousands (most manufacturers and B2B organizations) to millions (larger retailers) of transactions.

This does not represent Big Data.  This is just Data.
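To put rough numbers on that distinction, here is a back-of-the-envelope sketch in Python.  The figure of 200 bytes per transaction record is purely an assumption for illustration, as are the function name and the example row counts; your own records may be far smaller or larger.

```python
# Back-of-the-envelope sizing sketch.
# ASSUMPTION: 200 bytes per record is a hypothetical average, not a measured value.

def estimated_size_gb(rows: int, bytes_per_row: int = 200) -> float:
    """Return a rough storage estimate in gigabytes (1 GB = 10**9 bytes)."""
    return rows * bytes_per_row / 1e9


if __name__ == "__main__":
    # Hypothetical row counts spanning the ranges discussed in the text.
    examples = [
        ("thousands of transactions", 10_000),
        ("millions of transactions", 5_000_000),
        ("billions of records", 5_000_000_000),
    ]
    for label, rows in examples:
        print(f"{label}: roughly {estimated_size_gb(rows):,.3f} GB")
```

Under that assumed record size, five million transactions come to about one gigabyte, which fits comfortably on a laptop, while five billion records come to about a terabyte.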

Big Data requires huge investments in infrastructure, rapidly evolving tools such as Hadoop and NoSQL, and scores of data scientists, who tend to have mostly academic backgrounds.

Data requires more modest investments in infrastructure, common (and sometimes free!) tools such as Microsoft SQL Server, PostgreSQL, Excel, and Tableau, and experienced analytics professionals with real-world business experience.  Most organizations need analysts who are both technically skilled and able to translate data into actionable business insights.