Large Language Models for Intelligent Data Stewardship in Enterprises: Architectures, Provenance, and Evidence-Mapped Governance
DOI:
https://doi.org/10.15680/IJCTECE.2024.0701007
Keywords:
AI Governance, Evidence Mapping, Enterprise AI, Responsible AI, NIST AI RMF, OECD AI Principles, Digital Transformation, Corporate AI Strategy
Abstract
Enterprises increasingly operate in highly heterogeneous data ecosystems that span legacy databases, cloud-native platforms, streaming pipelines, and unstructured knowledge repositories, all under mounting regulatory, ethical, and operational scrutiny. In this context, recent advances in large language models (LLMs), notably transformer-based architectures combined with retrieval-augmented generation (RAG), dense passage retrieval (DPR), programmatic and weak supervision, and knowledge-graph grounding, offer a unifying technical substrate for intelligent data stewardship at enterprise scale. By embedding stewardship objectives such as the FAIR principles, end-to-end provenance, semantic interoperability, and continuous data quality assurance directly into LLM-enabled workflows, organizations can move beyond static catalogs toward adaptive, context-aware systems capable of automated metadata enrichment, lineage-aware question answering, policy-sensitive data discovery, and assisted remediation of quality and compliance issues. This article synthesizes these foundations into an integrated reference architecture for LLM-assisted stewardship and introduces an evidence-mapping methodology that operationalizes governance assessment by aligning publicly observable signals with established standards and controls. Through an applied case study of Inspire Brands’ AI-driven governance initiatives, we demonstrate how evidence mapping enables a non-invasive yet systematic evaluation of organizational readiness, surfacing both strengths and gaps without requiring privileged internal disclosures. Finally, we outline open research challenges, including evaluation metrics for trustworthiness and explainability, robustness under regulatory change, and human-in-the-loop validation patterns, and offer practical recommendations to guide enterprise adopters in responsibly deploying LLMs as first-class components of modern data governance and stewardship ecosystems.
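As a rough illustration of the evidence-mapping idea described in the abstract, the sketch below groups publicly observable signals by the control they are judged to support and then lists controls with no supporting evidence. It is not the article's implementation. The control categories are the four core functions of the NIST AI RMF (Govern, Map, Measure, Manage), chosen here only as an example; the `Evidence` class, the helper functions, and the sample signals are hypothetical.

```python
from collections import defaultdict
from dataclasses import dataclass

# The four core functions of the NIST AI RMF, used here as coarse
# control categories (an assumption made for illustration only).
CONTROLS = ("GOVERN", "MAP", "MEASURE", "MANAGE")


@dataclass(frozen=True)
class Evidence:
    source: str   # hypothetical description of a publicly observable signal
    control: str  # control the signal is judged to support


def build_evidence_map(items):
    """Group evidence sources by the control they support."""
    mapping = defaultdict(list)
    for item in items:
        if item.control not in CONTROLS:
            raise ValueError(f"unknown control: {item.control}")
        mapping[item.control].append(item.source)
    return dict(mapping)


def find_gaps(evidence_map):
    """Return the controls that have no supporting evidence."""
    return [c for c in CONTROLS if not evidence_map.get(c)]


# Hypothetical signals; none of these describe a real organization.
items = [
    Evidence("published responsible-AI policy page", "GOVERN"),
    Evidence("job posting for a data governance lead", "GOVERN"),
    Evidence("conference talk describing model monitoring", "MEASURE"),
]

evidence_map = build_evidence_map(items)
gaps = find_gaps(evidence_map)
```

In this toy input, the controls with evidence count as strengths and the entries returned by `find_gaps` count as gaps. A real assessment would also need to weigh the quality and recency of each signal, which this sketch does not attempt.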

