Data cleansing questions spread to government

Rachel Wheeler
As organizations in various fields begin to store and process big data sets, the same management and analysis problems tend to appear in many different contexts. According to GovWin, the recent Government Big Data Conference served as a stage for some of these concerns.

At the event, speakers addressed data quality problems by explaining that not every piece of information collected for new, high-capacity projects needs to be subjected to traditional cleansing procedures. This is hardly an overarching rule, however, and organizations' needs will likely vary significantly from one case to the next.

Because storing a wide range of data in various formats is one of the main concerns facing agencies, GovWin noted that choosing how to keep information is important. Some records can be stored unstructured and without cleansing, while others should be rigorously checked, depending on each project's needs and role in the federal infrastructure.

CIO recently reported on the wide range of use cases for big data in government. According to the publication, the recent $200 million big data program involves projects at five separate agencies. Possible uses for the information include high-speed Department of Defense insights and optimized Department of Energy computing.