Data Health Scans

Getting Started with Data Health Scans.

Data Health Scan Workflows

Example: Pre-Analysis Validation

Scenario: Before running cohort analysis on a new dataset, you want to check for issues that might affect results.

Your Data (with problems)

customer_id  | month   | revenue
-------------|---------|--------
alice@co.com | 2024-01 | 100
Alice@co.com | 2024-02 | 100
bob          | 2024-01 | $50
bob          | 2024-01 | 50
carol        | 2024-02 | 200
carol        | (empty) | 200
dave         | 2024-03 | -75

Step-by-Step Setup

  1. Open Jetti and select your data sheet
  2. Report type: Data Flows
  3. Run all checks

Expected Results

Quality Score: 45/100 (Poor)

Issues Found:

Issue                 | Severity | Details
----------------------|----------|--------
Duplicate row         | High     | Rows 4 and 5 are identical (bob, 2024-01, 50)
Inconsistent IDs      | High     | "alice@co.com" and "Alice@co.com": same person?
Text in number column | Medium   | "$50" should be numeric
Missing value         | Medium   | Row 7 has empty month
Negative value        | Low      | dave has -75 revenue: intentional?
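
To make the five checks concrete, here is a minimal Python sketch that reproduces them on the sample data. It is an illustration only, not Jetti's implementation: the `scan` function, the finding names, and the rule that "$50" and "50" count as the same amount when looking for duplicates are all assumptions made for this example. Row numbers follow the sheet, where row 1 is the header.

```python
# Sample rows from the example; sheet row 1 is the header,
# so the first data row is sheet row 2.
ROWS = [
    ("alice@co.com", "2024-01", "100"),
    ("Alice@co.com", "2024-02", "100"),
    ("bob", "2024-01", "$50"),
    ("bob", "2024-01", "50"),
    ("carol", "2024-02", "200"),
    ("carol", "", "200"),
    ("dave", "2024-03", "-75"),
]


def parse_number(text):
    """Return the value as a float, ignoring any "$", or None if it is not numeric."""
    try:
        return float(text.replace("$", "").strip())
    except ValueError:
        return None


def is_plain_number(text):
    """True if the text converts to a number without any cleanup."""
    try:
        float(text)
        return True
    except ValueError:
        return False


def scan(rows):
    """Return a list of (issue, location) findings. Hypothetical checks, not Jetti's."""
    findings = []
    first_seen = {}   # (id, month, amount) -> sheet row where it first appeared
    id_variants = {}  # lowercased id -> set of spellings seen
    for sheet_row, (cid, month, revenue) in enumerate(rows, start=2):
        amount = parse_number(revenue)
        # Assumption: rows are duplicates if they match once "$" is ignored.
        key = (cid, month, amount)
        if key in first_seen:
            findings.append(("duplicate_row", (first_seen[key], sheet_row)))
        else:
            first_seen[key] = sheet_row
        id_variants.setdefault(cid.lower(), set()).add(cid)
        if revenue.strip() and not is_plain_number(revenue):
            findings.append(("text_in_number_column", sheet_row))
        if not month.strip():
            findings.append(("missing_value", sheet_row))
        if amount is not None and amount < 0:
            findings.append(("negative_value", sheet_row))
    # IDs that differ only by letter case are reported once per group.
    for variants in id_variants.values():
        if len(variants) > 1:
            findings.append(("inconsistent_ids", tuple(sorted(variants))))
    return findings


FINDINGS = scan(ROWS)
for finding in FINDINGS:
    print(finding)
```

The sketch reports findings only. It does not try to reproduce the quality score, because the scoring formula is not described here.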

How to Fix

  1. Duplicates: Remove row 5 (a duplicate of row 4 once "$50" is read as 50)
  2. Inconsistent IDs: Standardize to lowercase: =LOWER(A2)
  3. Text numbers: Remove the $ symbol and convert the result to a number: =VALUE(SUBSTITUTE(C2,"$",""))
  4. Missing values: Fill in the month for row 7, or remove the row
  5. Negative values: Verify whether the value is intentional (a refund?) or a data error
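
If the data is cleaned outside the sheet, the same fixes might look like the Python sketch below. The `clean` helper is hypothetical and not part of Jetti. It assumes every revenue cell is numeric once "$" is removed, sets rows with a missing month aside instead of guessing a value, and keeps negative amounts while listing them for manual review.

```python
ROWS = [
    ("alice@co.com", "2024-01", "100"),
    ("Alice@co.com", "2024-02", "100"),
    ("bob", "2024-01", "$50"),
    ("bob", "2024-01", "50"),
    ("carol", "2024-02", "200"),
    ("carol", "", "200"),
    ("dave", "2024-03", "-75"),
]


def clean(rows):
    """Apply the fixes to (customer_id, month, revenue) string tuples.

    Returns (cleaned, missing, negatives). Hypothetical helper for illustration.
    """
    cleaned = []    # rows that are ready for analysis
    missing = []    # rows set aside because the month is empty
    negatives = []  # rows kept in cleaned but flagged for manual review
    seen = set()
    for cid, month, revenue in rows:
        cid = cid.strip().lower()                 # same idea as =LOWER(A2)
        amount = float(revenue.replace("$", ""))  # raises ValueError on non-numeric text
        row = (cid, month, amount)
        if not month.strip():
            missing.append(row)                   # fill in or remove by hand
            continue
        if row in seen:
            continue                              # drop the exact duplicate
        seen.add(row)
        cleaned.append(row)
        if amount < 0:
            negatives.append(row)                 # refund or data error?
    return cleaned, missing, negatives


CLEANED, MISSING, NEGATIVES = clean(ROWS)
print(len(CLEANED), "clean rows")
print("missing month:", MISSING)
print("verify negatives:", NEGATIVES)
```

Normalizing IDs and amounts before checking for duplicates matters here: the two bob rows only match after "$50" becomes 50.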

Re-run After Fixes

After cleaning, run the health scan again. Target: a score of 90 or higher before proceeding to analysis.