
Small-scale data analysis can be just as important as big data

Paul Newman Archive

Many firms emphasize data quality because of the perceived importance of big data - the general consensus is "more is better," and corporate IT departments often work tirelessly to amass as much information on consumers as possible. But lately, another school of thought has begun to emerge. Perhaps size doesn't matter - what counts is not the amount of information in one's database, but its accuracy and relevance.

According to Quartz, "small data" can be just as important as its "big" counterpart. The more information a company collects, the riskier the effort becomes - redundancies begin to crop up, false correlations might be drawn and the task of ensuring data quality becomes a drain on the company's time and money.

Michael Wu, principal scientist of data analytics at social media analysis firm Lithium, said that there are diminishing returns with data collection - the more you accumulate, the less each additional piece is worth to you. Past a certain point, you're only wasting your time.

"The information you can extract from any big data asymptotically diminishes as your data volume increases," Wu told the news source.

Here are three reasons why "small data" has a place in big business analysis.

Big data can become overcomplicated and expensive
Quartz explains that big data might be just as likely to confuse companies as to enlighten them. Data collection often involves bias, a lack of context or gaps in what's gathered. The more you collect, the more room there is for error, and fixing those errors often requires a massive amount of labor and computing power. Companies are often better off using simpler, more manageable clusters of data.

Even huge corporations aren't using Google-style tools
At Google, powerful servers work to evaluate every single page on the internet, analyzing its performance and traffic to find trends. That requires an enormous amount of hardware, not to mention headaches for the engineers asked to sift through everything. Most large corporations don't need to be like Google. Even Facebook and Yahoo, both known for their massive banks of data, don't need Google-style tools to sort through everything - at a certain point, it's overkill.

Small data helps tackle real problems
The Guardian notes that by using small, centralized clusters of data, organizations can solve real problems in communities. Issues like household energy use, local bus schedules and City Hall budgets can all be analyzed with smaller amounts of data, helping to effect meaningful changes.