Chapter 2: Data cleaning

What is data cleaning?

Data cleaning is the process of identifying, correcting, or removing errors, inconsistencies, and irrelevant information in a dataset so that the data becomes accurate, consistent, complete and ready for analysis or decision-making.

In simpler terms, it means cleaning up the raw data (fixing mistakes, filling in missing parts, and removing duplicates or badly formatted entries) so that your analysis gives reliable results.
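
As a rough illustration of the steps just described, here is a minimal sketch in plain Python. The records, the field names (name, city, age), and the choice to fill a missing age with the median are all assumptions made up for this example; real datasets usually call for different rules, and a library such as pandas is often used instead.

```python
from statistics import median

# Hypothetical raw records: stray spaces, inconsistent capitalisation,
# a missing age, and one duplicate entry (the two "alice" rows).
raw_records = [
    {"name": " alice ", "city": "paris", "age": "30"},
    {"name": "Bob", "city": "LONDON", "age": None},
    {"name": "alice", "city": "Paris", "age": "30"},
    {"name": "Carol", "city": "Berlin", "age": "41"},
    {"name": "Dave", "city": "rome", "age": "35"},
]


def clean_records(records):
    """Return a cleaned copy of records (illustrative rules only)."""
    # 1. Fix badly formatted entries: trim spaces, unify capitalisation,
    #    and convert ages from text to integers.
    normalized = [
        {
            "name": rec["name"].strip().title(),
            "city": rec["city"].strip().title(),
            "age": int(rec["age"]) if rec["age"] is not None else None,
        }
        for rec in records
    ]

    # 2. Remove duplicates, keeping the first occurrence of each record.
    seen = set()
    unique = []
    for rec in normalized:
        key = (rec["name"], rec["city"], rec["age"])
        if key not in seen:
            seen.add(key)
            unique.append(rec)

    # 3. Fill in missing ages with the median of the known ages
    #    (an assumed strategy; other choices are equally valid).
    known_ages = [rec["age"] for rec in unique if rec["age"] is not None]
    fill_value = median(known_ages) if known_ages else None
    for rec in unique:
        if rec["age"] is None:
            rec["age"] = fill_value

    return unique


cleaned_records = clean_records(raw_records)
for record in cleaned_records:
    print(record)
```

Duplicates are removed before the missing value is filled so that a repeated row does not count twice when the median is computed.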