Munging is the general term for irrevocably changing or damaging data so that it no longer matches its original state. The term is thought to have originated as a backronym for “Mash Until No Good”.
However, when referring specifically to Data Munging (or Data Wrangling), it means preparing your data for a dedicated purpose: taking the data from its raw state and transforming and mapping it into another format, normally for use beyond its original intent.
Often ‘raw’ data can be hard, even impossible, to analyse and gain useful insights from. This is where somebody will transform the data entries, fields, rows and columns into a more useful format. Activities to achieve this might include:
The final data can then be sent to the relevant data analyst or stored, ready to be analysed at a later date.
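As a rough illustration of this kind of transformation, the sketch below tidies a small block of raw, comma-separated text using only the Python standard library. The sample data, the column names (`name`, `signup_date`, `spend`) and the helper names `munge` and `normalise_date` are all hypothetical, and the choice to read ambiguous dates as day-first is an assumption made for the example, not a general rule.

```python
import csv
import io
from datetime import datetime

# Hypothetical raw input: untidy headers, mixed date formats,
# a missing value and a duplicated row.
RAW = """Name , signup_date,spend
 alice smith,03/01/2017, 12.50
BOB JONES,2017-01-04,
 alice smith,03/01/2017, 12.50
"""

def normalise_date(text):
    """Return the date as YYYY-MM-DD, or None if no known format matches.

    Assumption: slash-separated dates are day-first (DD/MM/YYYY).
    """
    for fmt in ("%d/%m/%Y", "%Y-%m-%d"):
        try:
            return datetime.strptime(text, fmt).strftime("%Y-%m-%d")
        except ValueError:
            continue
    return None

def munge(raw_text):
    """Clean raw CSV text into a list of consistent dictionaries."""
    reader = csv.DictReader(io.StringIO(raw_text))
    seen = set()
    rows = []
    for record in reader:
        # Tidy the header names and strip stray whitespace from values.
        clean = {key.strip().lower(): (value or "").strip()
                 for key, value in record.items()}
        name = clean["name"].title()
        date = normalise_date(clean["signup_date"])
        # Represent a missing spend as None rather than an empty string.
        spend = float(clean["spend"]) if clean["spend"] else None
        fingerprint = (name, date, spend)
        if fingerprint in seen:
            continue  # Drop exact duplicates.
        seen.add(fingerprint)
        rows.append({"name": name, "signup_date": date, "spend": spend})
    return rows

if __name__ == "__main__":
    for row in munge(RAW):
        print(row)
```

Each step here (tidying headers, normalising dates, handling missing values, removing duplicates) is only one possible choice; a real project would pick its rules to suit the purpose the data is being prepared for.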
A specific example of data munging can be found in Machine Learning, where data is restructured so that it can be used by a learning algorithm.
A common example of munging in the original, damaging sense involves email addresses. Typically, to prevent spam, a user will break the valid format of an email address by writing it in a way that humans understand but computers do not, such as:
JohnDOTdoeATJohnDoeDOTcom or John(dot)doe(at)John(dot)doe(dot)com
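The two spellings above can be produced with simple string replacement. The following is a minimal sketch, not a recommended anti-spam measure; the function names `munge_email` and `demunge_email` and the `style` parameter are hypothetical names chosen for this example.

```python
def munge_email(address, style="caps"):
    """Obfuscate an email address so it is no longer in a valid format.

    style="caps"  -> '@' becomes 'AT' and '.' becomes 'DOT'
    style="paren" -> '@' becomes '(at)' and '.' becomes '(dot)'
    """
    if style == "caps":
        at, dot = "AT", "DOT"
    elif style == "paren":
        at, dot = "(at)", "(dot)"
    else:
        raise ValueError("unknown style: " + style)
    return address.replace("@", at).replace(".", dot)

def demunge_email(text):
    """Reverse the parenthesised style only.

    The 'caps' style is not reversed here because it is ambiguous:
    a name that already contains 'AT' or 'DOT' cannot be told apart
    from an inserted marker.
    """
    return text.replace("(at)", "@").replace("(dot)", ".")

if __name__ == "__main__":
    print(munge_email("John.doe@JohnDoe.com"))
    print(munge_email("John.doe@John.doe.com", style="paren"))
    print(demunge_email("John(dot)doe(at)John(dot)doe(dot)com"))
```

Note how easily `demunge_email` undoes the parenthesised form: a fixed, predictable substitution is simple for a program to reverse, so this style of munging mainly deters the most basic automated collection.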