ai-ethics

AI Ethics

Comprehensive AI ethics skill covering bias detection, fairness assessment, responsible AI development, and regulatory compliance.

When to Use This Skill

Evaluating AI models for bias
Implementing fairness measures
Conducting ethical impact assessments
Ensuring regulatory compliance (EU AI Act, etc.)
Designing human-in-the-loop systems
Creating AI transparency documentation
Developing AI governance frameworks

Ethical Principles

Core AI Ethics Principles

| Principle | Description |
| --- | --- |
| Fairness | AI should not discriminate against individuals or groups |
| Transparency | AI decisions should be explainable |
| Privacy | Personal data must be protected |
| Accountability | Clear responsibility for AI outcomes |
| Safety | AI should not cause harm |
| Human Agency | Humans should maintain control |

Stakeholder Considerations

Users: How does this affect people using the system?
Subjects: How does this affect people the AI makes decisions about?
Society: What are broader societal implications?
Environment: What is the environmental impact?

Bias Detection & Mitigation

Types of AI Bias

| Bias Type | Source | Example |
| --- | --- | --- |
| Historical | Training data reflects past discrimination | Hiring models favoring male candidates |
| Representation | Underrepresented groups in training data | Face recognition failing on darker skin |
| Measurement | Proxy variables for protected attributes | ZIP code correlating with race |
| Aggregation | One model for diverse populations | Single medical model applied across groups with different risk profiles |
| Evaluation | Biased evaluation metrics | Overall accuracy hiding disparate impact |

Fairness Metrics

Group Fairness:

Demographic Parity: Equal positive prediction rates across groups
Equalized Odds: Equal true positive rate (TPR) and false positive rate (FPR) across groups
Predictive Parity: Equal precision (positive predictive value) across groups
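The three group metrics above all reduce to comparing per-group confusion-matrix rates. The following is a minimal sketch in plain Python, assuming binary labels and predictions (0 or 1) and one group label per example; the function names `group_rates` and `max_gap` are illustrative and not part of any library.

```python
from collections import defaultdict


def group_rates(y_true, y_pred, groups):
    """Return per-group selection rate, TPR, FPR, and precision."""
    counts = defaultdict(lambda: {"tp": 0, "fp": 0, "tn": 0, "fn": 0})
    for t, p, g in zip(y_true, y_pred, groups):
        # "t"/"f" = prediction correct or not, "p"/"n" = predicted class.
        key = ("t" if t == p else "f") + ("p" if p == 1 else "n")
        counts[g][key] += 1

    def ratio(num, den):
        return num / den if den else float("nan")

    rates = {}
    for g, c in counts.items():
        total = c["tp"] + c["fp"] + c["tn"] + c["fn"]
        rates[g] = {
            "selection_rate": ratio(c["tp"] + c["fp"], total),  # demographic parity
            "tpr": ratio(c["tp"], c["tp"] + c["fn"]),           # equalized odds
            "fpr": ratio(c["fp"], c["fp"] + c["tn"]),           # equalized odds
            "precision": ratio(c["tp"], c["tp"] + c["fp"]),     # predictive parity
        }
    return rates


def max_gap(rates, metric):
    """Largest difference in `metric` between any two groups (0 means parity)."""
    values = [r[metric] for r in rates.values()]
    return max(values) - min(values)


if __name__ == "__main__":
    y_true = [1, 0, 1, 0, 1, 0, 1, 0]
    y_pred = [1, 0, 1, 1, 1, 0, 0, 0]
    groups = ["a", "a", "a", "a", "b", "b", "b", "b"]
    rates = group_rates(y_true, y_pred, groups)
    for metric in ("selection_rate", "tpr", "fpr", "precision"):
        print(metric, round(max_gap(rates, metric), 3))
```

A gap of 0 on `selection_rate` corresponds to demographic parity, gaps of 0 on both `tpr` and `fpr` to equalized odds, and a gap of 0 on `precision` to predictive parity. What gap counts as acceptable is a policy decision that this sketch does not make.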

Individual Fairness:

Similar individuals should receive similar predictions
Counterfactual fairness: Would the outcome change if the protected attribute were different?
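A rough first check for the counterfactual question above is an attribute flip test: change only the protected attribute and see whether the prediction moves. This is a simplification. A full counterfactual fairness analysis needs a causal model of how the attribute influences other features, which this sketch does not attempt. The `predict` callable, the dictionary row format, and `toy_model` are assumptions for illustration.

```python
def flip_test(predict, rows, attribute, values):
    """Return (row, new_value) pairs where changing only `attribute` changes the prediction."""
    changed = []
    for row in rows:
        baseline = predict(row)
        for value in values:
            if value == row[attribute]:
                continue
            variant = {**row, attribute: value}  # copy with one field altered
            if predict(variant) != baseline:
                changed.append((row, value))
                break
    return changed


def toy_model(row):
    """Hypothetical, deliberately biased model used only to exercise the test."""
    return int(row["income"] > 50 or row["gender"] == "m")


if __name__ == "__main__":
    rows = [
        {"income": 60, "gender": "f"},
        {"income": 40, "gender": "f"},
        {"income": 40, "gender": "m"},
    ]
    flagged = flip_test(toy_model, rows, "gender", ["f", "m"])
    print(f"{len(flagged)} of {len(rows)} predictions depend on the attribute")
```

A model can pass this test and still discriminate through proxy features, so an empty result is weak evidence at best.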

Bias Mitigation Strategies

Pre-processing:

Resampling/reweighting training data
Removing biased features
Data augmentation for underrepresented groups
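As one concrete form of reweighting, the sketch below assigns each training example the weight P(group) * P(label) / P(group, label), so that group and label become statistically independent in the weighted data. This is one common reweighting scheme, chosen here for illustration; the source does not prescribe a specific method, and `reweighing_weights` is a hypothetical name.

```python
from collections import Counter


def reweighing_weights(groups, labels):
    """Per-example weights that decorrelate group membership from the label."""
    n = len(labels)
    group_counts = Counter(groups)
    label_counts = Counter(labels)
    joint_counts = Counter(zip(groups, labels))
    # weight = expected count under independence / observed count
    return [
        (group_counts[g] * label_counts[y]) / (n * joint_counts[(g, y)])
        for g, y in zip(groups, labels)
    ]


if __name__ == "__main__":
    groups = ["a"] * 4 + ["b"] * 4
    labels = [1, 1, 1, 0, 1, 0, 0, 0]  # group "a" is mostly positive, "b" mostly negative
    for g, y, w in zip(groups, labels, reweighing_weights(groups, labels)):
        print(g, y, round(w, 3))
```

Over-represented (group, label) combinations get weights below 1 and under-represented ones get weights above 1. The weights can then be passed to any training routine that accepts per-sample weights.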

In-processing:

Fairness constraints in loss function
Adversarial debiasing
Fair representation learning
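A fairness constraint in the loss function is often implemented as a soft penalty added to the ordinary training loss. The sketch below, assuming binary labels and predicted probabilities, adds a demographic-parity penalty (the gap between groups in mean predicted probability) to log loss. The weight `lam` and all function names are illustrative assumptions, and a real training setup would compute this inside a differentiable framework.

```python
import math


def log_loss(probs, labels, eps=1e-12):
    """Mean binary cross-entropy, with probabilities clipped away from 0 and 1."""
    total = 0.0
    for p, y in zip(probs, labels):
        p = min(max(p, eps), 1 - eps)
        total += -(y * math.log(p) + (1 - y) * math.log(1 - p))
    return total / len(labels)


def parity_gap(probs, groups):
    """Difference between the highest and lowest group mean predicted probability."""
    sums, counts = {}, {}
    for p, g in zip(probs, groups):
        sums[g] = sums.get(g, 0.0) + p
        counts[g] = counts.get(g, 0) + 1
    means = [sums[g] / counts[g] for g in sums]
    return max(means) - min(means)


def penalized_loss(probs, labels, groups, lam=1.0):
    """Accuracy term plus lam times the fairness penalty."""
    return log_loss(probs, labels) + lam * parity_gap(probs, groups)


if __name__ == "__main__":
    probs = [0.9, 0.8, 0.3, 0.2]
    labels = [1, 1, 0, 0]
    groups = ["a", "a", "b", "b"]
    print("plain loss:", round(log_loss(probs, labels), 4))
    print("penalized loss:", round(penalized_loss(probs, labels, groups, lam=1.0), 4))
```

Raising `lam` trades predictive accuracy for a smaller gap between groups; setting it to 0 recovers the unconstrained loss.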

Post-processing:

Threshold adjustment per group
Calibration
Reject option classification
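Threshold adjustment per group can be sketched as picking, for each group, the score cutoff that yields the same selection rate. The example below assumes higher scores mean a more positive prediction and targets equal selection rates. Other targets, such as equal TPR, follow the same pattern but need labels. All names are illustrative.

```python
def pick_threshold(scores, target_rate):
    """Cutoff such that roughly `target_rate` of `scores` are at or above it."""
    ranked = sorted(scores, reverse=True)
    k = round(target_rate * len(ranked))
    if k <= 0:
        return float("inf")  # select nobody
    return ranked[min(k, len(ranked)) - 1]


def group_thresholds(scores, groups, target_rate):
    """One cutoff per group, each chosen to hit the same selection rate."""
    by_group = {}
    for s, g in zip(scores, groups):
        by_group.setdefault(g, []).append(s)
    return {g: pick_threshold(v, target_rate) for g, v in by_group.items()}


def apply_thresholds(scores, groups, thresholds):
    """Binary decisions using each example's group-specific cutoff."""
    return [int(s >= thresholds[g]) for s, g in zip(scores, groups)]


if __name__ == "__main__":
    scores = [0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2]
    groups = ["a", "a", "a", "a", "b", "b", "b", "b"]
    thresholds = group_thresholds(scores, groups, target_rate=0.5)
    print(thresholds)
    print(apply_thresholds(scores, groups, thresholds))
```

With tied scores the achieved rate can exceed the target slightly, since every example at the cutoff is selected.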

Explainability & Transparency

Explanation Types

| Type | Audience | Purpose |
| --- | --- | --- |
| Global | Developers | Understand overall model behavior |
| Local | End users | Explain specific decisions |
| Counterfactual | Affected parties | Show what would need to change for a different outcome |

Explainability Techniques

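SHAP: Per-prediction feature attributions based on Shapley values
LIME: Local explanations from an interpretable surrogate model fitted around one prediction
Attention maps: Show which inputs a neural network attends to
Decision trees: Inherently interpretable
Feature importance: Global ranking of how much each feature drives model behavior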
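Global feature importance can be estimated without any model internals by permutation: shuffle one feature across rows and measure how much accuracy drops. The sketch below is a plain-Python illustration of that idea, assuming rows are dictionaries and `predict` is any callable returning a label. It is not the API of SHAP, LIME, or any other library.

```python
import random


def permutation_importance(predict, rows, labels, feature, n_repeats=10, seed=0):
    """Mean drop in accuracy when `feature` is shuffled across rows."""
    rng = random.Random(seed)

    def accuracy(data):
        return sum(predict(r) == y for r, y in zip(data, labels)) / len(labels)

    baseline = accuracy(rows)
    drops = []
    for _ in range(n_repeats):
        shuffled = [r[feature] for r in rows]
        rng.shuffle(shuffled)
        permuted = [{**r, feature: v} for r, v in zip(rows, shuffled)]
        drops.append(baseline - accuracy(permuted))
    return sum(drops) / n_repeats


if __name__ == "__main__":
    # Hypothetical model that looks only at "x" and ignores "noise".
    model = lambda row: row["x"]
    rows = [{"x": i % 2, "noise": i} for i in range(8)]
    labels = [r["x"] for r in rows]
    for feature in ("x", "noise"):
        print(feature, permutation_importance(model, rows, labels, feature))
```

A feature the model never reads gets an importance of exactly 0, while correlated features can share or mask each other's importance, so the ranking should be read with care.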

Model Cards

Document for each model:

Model purpose and intended use
Training data description
Performance metrics by subgroup
Limitations and ethical considerations
Version and update history
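The model card fields listed above map naturally onto a small structured record that can be validated and exported. This is a minimal sketch. The `ModelCard` class, its field names, and the example values are hypothetical and do not follow any particular model card schema.

```python
import json
from dataclasses import asdict, dataclass, field


@dataclass
class ModelCard:
    name: str
    version: str
    intended_use: str
    training_data: str
    metrics_by_subgroup: dict = field(default_factory=dict)
    limitations: list = field(default_factory=list)
    changelog: list = field(default_factory=list)

    def missing_sections(self):
        """Names of fields that are still empty and need to be filled in."""
        return [key for key, value in asdict(self).items() if not value]

    def to_json(self):
        return json.dumps(asdict(self), indent=2)


if __name__ == "__main__":
    card = ModelCard(
        name="loan-screening",  # placeholder values for illustration only
        version="1.2.0",
        intended_use="Pre-screening of applications before human review",
        training_data="Internal applications dataset, described elsewhere",
        metrics_by_subgroup={"group_a": {"tpr": 0.81}, "group_b": {"tpr": 0.78}},
    )
    print("missing:", card.missing_sections())
    print(card.to_json())
```

Checking `missing_sections()` in a release pipeline is one way to keep a model from shipping with undocumented limitations.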

AI Governance

AI Risk Assessment

Risk Categories (EU AI Act):

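| Risk Level | Examples | Requirements |
| --- | --- | --- |
| Unacceptable | Social scoring, harmful manipulation | Prohibited |
| High | Healthcare, employment, credit | Strict requirements (risk management, data governance, documentation, human oversight) |
| Limited | Chatbots | Transparency obligations |
| Minimal | Spam filters | No mandatory requirements |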
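For internal triage, the risk levels above can be encoded as a lookup that flags anything unknown for manual review. The sketch below uses only the examples already listed in this section. The mapping is illustrative, not a legal classification, and `triage` is a hypothetical helper.

```python
from enum import Enum


class RiskLevel(Enum):
    UNACCEPTABLE = "unacceptable"
    HIGH = "high"
    LIMITED = "limited"
    MINIMAL = "minimal"


# Hypothetical keyword map built only from the examples in this section.
EXAMPLE_USE_CASES = {
    "social scoring": RiskLevel.UNACCEPTABLE,
    "healthcare": RiskLevel.HIGH,
    "employment": RiskLevel.HIGH,
    "credit": RiskLevel.HIGH,
    "chatbot": RiskLevel.LIMITED,
    "spam filter": RiskLevel.MINIMAL,
}


def triage(use_case):
    """Return a RiskLevel for a known example, or None to signal manual review."""
    return EXAMPLE_USE_CASES.get(use_case.strip().lower())


if __name__ == "__main__":
    for case in ("Employment", "spam filter", "route planning"):
        level = triage(case)
        print(case, "->", level.value if level else "needs manual review")
```

Returning `None` for unknown cases, rather than defaulting to the minimal tier, keeps unclassified systems from slipping through without review.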

Governance Framework

Policy: Define ethical principles and boundaries
Process: Review and approval workflows
People: Roles and responsibilities (ethics board)
Technology: Tools for monitoring and enforcement

Documentation Requirements

Data provenance and lineage
Model training documentation
Testing and validation results
Deployment and monitoring plans
Incident response procedures

Human Oversight

Human-in-the-Loop Patterns

| Pattern | Use Case | Example |
| --- | --- | --- |
| Human-in-the-Loop | High-stakes decisions | Medical diagnosis confirmation |
| Human-on-the-Loop | Monitoring with intervention | Content moderation escalation |
| Human-out-of-the-Loop | Low-risk, high-volume | Spam filtering |

Designing for Human Control

Clear escalation paths
Override capabilities
Confidence thresholds for automation
Audit trails
Feedback mechanisms
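Several of the controls above (escalation, overrides, confidence thresholds, audit trails) can be combined in one small routing component. This is a sketch under stated assumptions: the `DecisionRouter` class, the 0.95 default cutoff, and the in-memory log are all hypothetical, and a production system would persist the log and authenticate reviewers.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass
class DecisionRouter:
    auto_threshold: float = 0.95  # hypothetical cutoff; tune per use case
    audit_log: list = field(default_factory=list)

    def _log(self, case_id, action, decision, **extra):
        entry = {
            "time": datetime.now(timezone.utc).isoformat(),
            "case_id": case_id,
            "action": action,
            "decision": decision,
            **extra,
        }
        self.audit_log.append(entry)
        return entry

    def route(self, case_id, prediction, confidence):
        """Automate only confident predictions; escalate the rest to a person."""
        action = "auto" if confidence >= self.auto_threshold else "human_review"
        self._log(case_id, action, prediction, confidence=confidence)
        return action

    def override(self, case_id, reviewer, decision, reason):
        """Record a human decision that replaces the model's output."""
        return self._log(case_id, "override", decision, reviewer=reviewer, reason=reason)


if __name__ == "__main__":
    router = DecisionRouter()
    print(router.route("case-1", "approve", 0.99))
    print(router.route("case-2", "reject", 0.60))
    router.override("case-2", reviewer="analyst-7", decision="approve", reason="income verified")
    print(len(router.audit_log), "audit entries")
```

Override entries double as a feedback signal: cases where reviewers disagree with the model are candidates for retraining data.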

Privacy Considerations

Data Minimization

Collect only necessary data
Anonymize when possible
Use aggregate rather than individual-level data
Delete data when no longer needed

Privacy-Preserving Techniques

Differential privacy
Federated learning
Secure multi-party computation
Homomorphic encryption
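Of the techniques above, differential privacy is the simplest to illustrate: a counting query is released with Laplace noise scaled to 1/epsilon, because adding or removing one person changes a count by at most 1. The sketch below is for illustration only. It uses Python's non-cryptographic `random` module and floating-point noise, neither of which is suitable for a real deployment, and `private_count` is a hypothetical name.

```python
import random


def laplace_noise(scale, rng):
    """Sample Laplace(0, scale) as the difference of two exponential draws."""
    return rng.expovariate(1 / scale) - rng.expovariate(1 / scale)


def private_count(values, predicate, epsilon, rng=None):
    """Noisy count of items matching `predicate` under epsilon-differential privacy."""
    if epsilon <= 0:
        raise ValueError("epsilon must be positive")
    rng = rng or random.Random()
    true_count = sum(1 for v in values if predicate(v))
    sensitivity = 1.0  # one person changes a count by at most 1
    return true_count + laplace_noise(sensitivity / epsilon, rng)


if __name__ == "__main__":
    ages = [23, 35, 41, 52, 67, 29, 74, 38]
    rng = random.Random(0)
    for epsilon in (0.1, 1.0, 10.0):
        noisy = private_count(ages, lambda a: a >= 40, epsilon, rng)
        print(f"epsilon={epsilon}: noisy count = {noisy:.2f} (true count is 4)")
```

Smaller epsilon means stronger privacy and noisier answers, and each released query spends part of the overall privacy budget.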

Environmental Impact

Considerations

Training compute requirements
Inference energy consumption
Hardware lifecycle
Data center energy sources
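The considerations above can be turned into a rough estimate: energy is accelerator power times runtime times the data center's power usage effectiveness (PUE), and emissions are energy times the grid's carbon intensity. In the sketch below every input value is a placeholder assumption to be replaced with measured figures, and `training_footprint` is a hypothetical helper.

```python
def training_footprint(gpu_count, gpu_power_watts, hours, pue, grid_kg_co2_per_kwh):
    """Return (energy in kWh, emissions in kg CO2e) for a training run."""
    it_energy_kwh = gpu_count * gpu_power_watts / 1000 * hours
    total_energy_kwh = it_energy_kwh * pue  # PUE adds cooling and facility overhead
    return total_energy_kwh, total_energy_kwh * grid_kg_co2_per_kwh


if __name__ == "__main__":
    # Placeholder inputs: 8 accelerators at 300 W for 100 hours,
    # PUE of 1.5, grid intensity of 0.4 kg CO2e per kWh.
    energy, co2 = training_footprint(8, 300, 100, 1.5, 0.4)
    print(f"about {energy:.0f} kWh and {co2:.0f} kg CO2e")
```

The same formula applies to inference by substituting serving hardware and hours. It leaves out hardware manufacturing, which the hardware lifecycle item covers.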

Mitigation

Efficient architectures
Model distillation
Transfer learning
Green hosting providers

Reference Files

references/bias_assessment.md - Detailed bias evaluation methodology
references/regulatory_compliance.md - AI regulation requirements

Integration with Other Skills

machine-learning - For model development
testing - For bias testing
documentation - For model cards
