Why Data Ethics Matters
Data science decisions affect real people — hiring algorithms, credit scoring, healthcare predictions, and criminal justice tools all carry ethical weight. A technically correct model can still cause harm if it encodes or amplifies societal biases.
Sources of Bias
- Selection bias — data doesn't represent the target population
- Measurement bias — data collection methods systematically distort values
- Historical bias — past discrimination encoded in training data
- Aggregation bias — treating diverse groups as homogeneous
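Selection bias, the first item above, is easy to demonstrate concretely. The sketch below uses entirely made-up numbers (two hypothetical groups with different income distributions and a survey that over-samples one of them) to show how a non-representative sample shifts an estimate away from the true population value:

```python
import random

random.seed(0)

# Hypothetical population: incomes for two equally sized groups.
# The group means (60 vs 40) are illustrative, not real data.
population = ([("urban", random.gauss(60, 10)) for _ in range(5000)]
              + [("rural", random.gauss(40, 10)) for _ in range(5000)])

# Selection bias: a survey that reaches urban respondents 90% of the
# time but rural respondents only 10% of the time over-represents the
# urban group relative to its 50% population share.
sample = [row for row in population
          if (row[0] == "urban" and random.random() < 0.9)
          or (row[0] == "rural" and random.random() < 0.1)]

pop_mean = sum(x for _, x in population) / len(population)
sample_mean = sum(x for _, x in sample) / len(sample)
print(f"population mean: {pop_mean:.1f}, biased sample mean: {sample_mean:.1f}")
```

The biased sample's mean lands several points above the population mean, even though every individual measurement is accurate — the distortion comes entirely from who got sampled.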
Fairness Metrics
There is no single definition of fairness. Demographic parity, equalised odds, and calibration are common metrics — but they can conflict: when base rates differ across groups, no non-trivial classifier can satisfy all of them at once. Choosing the right metric depends on the context and stakeholders.
Fairness is not just a technical problem. It requires collaboration between data scientists, domain experts, affected communities, and policymakers.
Key Takeaway
Ethical data science requires awareness of bias sources, deliberate fairness metric selection, and ongoing stakeholder engagement.