A recent conversation between software insider Alex Gorelik and BeyeNETWORK revealed that companies are increasingly turning to the Hadoop architecture to store big data. While there may be a temptation to treat this new and different technology as separate from the regulations and best practices governing the rest of a company's data, governance and data quality are still important.
"You want to be able to apply the same data quality rules, the same data security protection, the same lifecycle management, archival and so on to Hadoop as you do outside of Hadoop," Gorelik told BeyeNETWORK.
Gorelik stated that companies must come to grips with data quality tools for their new Hadoop systems and must prepare employees to work with the new types of data. Unstructured data can be very different from the figures companies have always worked with, demanding a customized approach to get uniform results.
The ability to extract high-quality information from Hadoop may need to be spread widely across companies, according to a new Karmasphere survey. The researchers indicated that companies should train many employees in varied roles to use Hadoop, to counter a shortage of data scientists.