CSV date and number cleaning

Many broken analytics pipelines start with one small formatting mismatch. This guide focuses on the highest-value corrections for dates and numbers in CSV and spreadsheet-driven exports.

Unify date format policy

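Pick one canonical date format and apply it everywhere. The safest default for teams and automation is the ISO 8601 calendar date (YYYY-MM-DD): it is unambiguous across locales, and it sorts chronologically even when compared as plain text.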
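As a minimal sketch of that policy, the function below tries a short list of input formats and emits YYYY-MM-DD. The name to_iso_date and the KNOWN_FORMATS list are illustrative assumptions, not part of any particular tool; the list should reflect the formats your sources actually produce.

```python
from datetime import datetime

# Hypothetical list of formats the incoming files are known to use.
# Order matters: the first format that parses wins, so a string such as
# 03/04/2024 is read as day/month here. Choose the list per data source.
KNOWN_FORMATS = ("%Y-%m-%d", "%d/%m/%Y", "%d.%m.%Y")


def to_iso_date(raw, formats=KNOWN_FORMATS):
    """Return the date in raw as a YYYY-MM-DD string, or raise ValueError."""
    text = raw.strip()
    for fmt in formats:
        try:
            return datetime.strptime(text, fmt).date().isoformat()
        except ValueError:
            continue  # this format did not match, try the next one
    raise ValueError(f"unrecognized date format: {raw!r}")


print(to_iso_date("31/12/2023"))
print(to_iso_date(" 05.01.2024 "))
```

Raising on unknown formats, instead of guessing, keeps ambiguous values such as 03/04/2024 from being silently misread when a new source uses month/day order.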

Handle locale number formatting

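Thousands separators, decimal separators, and currency symbols change by locale. The same text can mean different values: 1.234 is one thousand two hundred thirty-four under German conventions but slightly more than one under US conventions. Inconsistent number parsing is a common source of metric drift.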
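One way to make the locale explicit is to pass the separators in, as in the sketch below. The function parse_number and its parameters are hypothetical names chosen for illustration; the sketch assumes the caller knows which convention each source file uses, and it does not handle accounting-style negatives such as (1,234).

```python
from decimal import Decimal, InvalidOperation


def parse_number(raw, decimal_sep=".", thousands_sep=","):
    """Parse a locale-formatted number into a Decimal, or raise ValueError."""
    text = raw.strip()
    if thousands_sep:
        text = text.replace(thousands_sep, "")
    if decimal_sep != ".":
        text = text.replace(decimal_sep, ".")
    # Keep only digits, sign, and the decimal point. This drops currency
    # symbols and any remaining whitespace.
    cleaned = "".join(ch for ch in text if ch in "0123456789+-.")
    try:
        return Decimal(cleaned)
    except InvalidOperation:
        raise ValueError(f"not a number: {raw!r}") from None


print(parse_number("$1,234.56"))
print(parse_number("1.234,56 €", decimal_sep=",", thousands_sep="."))
```

Decimal is used here instead of float so that currency amounts keep their exact decimal digits.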

Excel date serials and large integer precision

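Excel stores dates as serial numbers (days counted from an epoch), and an export can expose the raw serial, such as 45292 instead of 2024-01-01, when cell formatting is lost. Excel also keeps only 15 significant digits, so long numeric identifiers lose their trailing digits, and Excel or downstream tools may rewrite them in scientific notation (for example 1.23457E+17). Treat identifiers as text, not numbers.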
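The sketch below converts a serial to an ISO date and flags identifiers that have already been rewritten in scientific notation. It assumes the workbook uses Excel's default 1900 date system (not the 1904 system), and the function names are illustrative.

```python
import re
from datetime import date, timedelta

# Assumption: 1900 date system. Anchoring on 1899-12-30 absorbs Excel's
# fictitious 1900-02-29, so results are correct from serial 61
# (1900-03-01) onward.
EXCEL_EPOCH_1900 = date(1899, 12, 30)

SCI_NOTATION = re.compile(r"^[+-]?\d+(\.\d+)?[eE][+-]?\d+$")


def excel_serial_to_iso(serial):
    """Convert an Excel date serial (number or numeric string) to YYYY-MM-DD."""
    days = int(float(serial))  # drop the time-of-day fraction
    if days < 61:
        raise ValueError(f"serial {serial!r} is outside the supported range")
    return (EXCEL_EPOCH_1900 + timedelta(days=days)).isoformat()


def is_scientific_notation(value):
    """Return True if an identifier looks like it was rewritten as 1.23E+17."""
    return bool(SCI_NOTATION.match(value.strip()))


print(excel_serial_to_iso(45292))
print(is_scientific_notation("1.23457E+17"))
```

An identifier that is already in scientific notation cannot be repaired from the export alone, because the dropped digits are gone; the check only tells you to go back to the original source.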

Validation checks before publish

After transformation, validate for impossible dates, invalid numbers, and type mismatches. This catches hidden corruption before downstream systems fail.
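A minimal sketch of such a check, assuming rows are dictionaries of strings (as csv.DictReader produces) and that dates have already been normalized to YYYY-MM-DD. The function validate_row and the field names in the usage line are hypothetical.

```python
from datetime import date
from decimal import Decimal, InvalidOperation


def validate_row(row, date_fields, number_fields):
    """Return a list of problem descriptions; an empty list means the row passed."""
    problems = []
    for name in date_fields:
        value = row.get(name)
        try:
            # Rejects impossible dates such as 2024-02-30.
            date.fromisoformat(value)
        except (TypeError, ValueError):
            problems.append(f"{name}: invalid date {value!r}")
    for name in number_fields:
        value = row.get(name)
        try:
            number = Decimal(value)
        except (TypeError, InvalidOperation):
            problems.append(f"{name}: invalid number {value!r}")
            continue
        if not number.is_finite():
            # Decimal accepts "NaN" and "Infinity", which are rarely valid metrics.
            problems.append(f"{name}: non-finite number {value!r}")
    return problems


row = {"order_date": "2024-02-30", "amount": "12.50"}
print(validate_row(row, ["order_date"], ["amount"]))
```

Collecting every problem per row, instead of stopping at the first, makes it easier to report all failures in one pass before publishing.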