Establishing Fairness and Transparency Through AI-Driven Data Lineage
DOI:
https://doi.org/10.15680/IJCTECE.2021.0406003
Keywords:
AI Lineage, Data Governance, Explainability, Transparency, Responsible AI, Data Ethics, Fairness, Model Accountability, Provenance, AI Lifecycle
Abstract
Artificial intelligence (AI) systems are now integral to decision-making across industries, yet concerns around fairness, transparency, and accountability continue to undermine public trust. As models grow in complexity, understanding how data flows through these systems becomes essential. AI-driven lineage offers a solution by providing dynamic, real-time tracking of data and model transformations, enabling transparency throughout the AI lifecycle. This paper explores the critical role of lineage in establishing fair and accountable AI systems. We analyze existing tools, frameworks, and standards, propose a methodology for implementing AI-driven lineage, and present a layered framework that supports ethical governance and regulatory compliance.