Agentic Data Pipelines: Autonomous ELT Orchestration Using AI Agents on Microsoft Fabric and Databricks

Authors

  • Narendra Mangala, Data Engineer Manager, USA

DOI:

https://doi.org/10.15680/IJCTECE.2025.0806034

Keywords:

Agentic Data Pipelines, Autonomous ELT Orchestration, AI-Driven Data Engineering, Intelligent Data Pipelines, Microsoft Fabric Data Platform, Databricks Lakehouse Architecture, AI Agents in Data Workflows, Self-Healing Data Pipelines, Automated Data Integration, Scalable ELT Automation

Abstract

ELT workflows orchestrated by modern cloud services can be automated with agent-based frameworks that create and execute the required solutions. Working together in a multi-agent setup, agents allow tools such as Microsoft Fabric and Azure Data Factory to be used in novel ways. An evaluation of such an architecture, focused primarily on the agent-based approach within the Microsoft ecosystem, demonstrated that a complete pipeline could be processed as a single task on Databricks. Each agent's specialization influenced not only the tools chosen but also the orchestration pattern. The decisions taken at both levels were logged as text and compared with those of human engineers, providing insight into the explainability of the different patterns. Autonomous orchestration of ELT solutions using Microsoft Fabric Data Factory proved successful in this evaluation and paves the way toward an unassisted data engineering process. Supporting the agents with user-defined tasks might be useful for more complex cases.
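The comparison between logged agent decisions and human choices can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the `Decision` record, the step and choice labels, and the `agreement_rate` metric are hypothetical and are not taken from the paper, which does not specify how the comparison was scored.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Decision:
    step: str       # pipeline step the decision applies to (hypothetical label)
    choice: str     # tool or orchestration pattern selected (hypothetical label)
    rationale: str  # free-text explanation, logged as text


def agreement_rate(agent_log, human_log):
    """Fraction of shared steps where the agent and the human made the same choice."""
    human_by_step = {d.step: d.choice for d in human_log}
    shared = [d for d in agent_log if d.step in human_by_step]
    if not shared:
        return 0.0
    matches = sum(1 for d in shared if d.choice == human_by_step[d.step])
    return matches / len(shared)


# Illustrative logs only; these are not results from the evaluation.
agent_log = [
    Decision("ingest", "copy_activity", "Source has a supported connector."),
    Decision("transform", "databricks_job", "Target is a Delta Lake table."),
    Decision("orchestrate", "single_task", "Pipeline fits in one Databricks task."),
]
human_log = [
    Decision("ingest", "copy_activity", "Standard practice."),
    Decision("transform", "databricks_job", "Team convention."),
    Decision("orchestrate", "multi_task", "Separate tasks simplify retries."),
]

rate = agreement_rate(agent_log, human_log)
print(f"agreement on shared steps: {rate:.2f}")
```

Keeping the rationale as plain text alongside the choice is what lets a reviewer inspect why an agent diverged from a human on a given step, rather than only seeing that it did.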

 

With the emergence of Microsoft Fabric and Azure Data Factory, many products and services within the Microsoft ecosystem enable the convergence of data science and data engineering. Although Microsoft Fabric contains most of the required building blocks, the connections between them still need to be defined. AI agents capable of completing ELT tasks can help unify multiple services, enabling novel uses while freeing human operators from low-level activities. Microsoft Fabric Data Factory, which integrates multiple services, serves as the data engineering service. A data engine running on Databricks is also used to take advantage of Unity Catalog and Delta Lake. Data Factory handles data ingestion either through Copy activities or by calling other services.
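The division of labor described above, with ingestion through a Copy activity or another service followed by processing on Databricks, can be sketched with two specialized agents. This is a plain-Python illustration under stated assumptions: the class names, the `has_connector` flag, and the step labels are hypothetical, and no Microsoft Fabric or Databricks API is called.

```python
from dataclasses import dataclass, field


@dataclass
class Source:
    name: str
    has_connector: bool  # assumption: whether a built-in Copy connector exists


@dataclass
class Plan:
    steps: list = field(default_factory=list)  # (stage, tool, source) tuples
    log: list = field(default_factory=list)    # text log of each decision


class IngestionAgent:
    """Hypothetical agent that picks the ingestion path for a source."""

    def act(self, source, plan):
        if source.has_connector:
            tool, why = "copy_activity", "built-in connector available"
        else:
            tool, why = "external_service", "no connector; delegate to another service"
        plan.steps.append(("ingest", tool, source.name))
        plan.log.append(f"IngestionAgent: {tool} for {source.name} ({why})")


class TransformAgent:
    """Hypothetical agent that schedules the Databricks-side transformation."""

    def act(self, source, plan):
        plan.steps.append(("transform", "databricks_delta", source.name))
        plan.log.append(f"TransformAgent: write {source.name} to a Delta table")


def build_plan(sources):
    """Run each specialized agent over each source, in order."""
    plan = Plan()
    agents = [IngestionAgent(), TransformAgent()]
    for source in sources:
        for agent in agents:
            agent.act(source, plan)
    return plan


plan = build_plan([Source("orders", True), Source("legacy_erp", False)])
for line in plan.log:
    print(line)
```

In a real deployment each planned step would be handed to the corresponding service; here the plan is only recorded, which keeps the sketch self-contained.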

Published

2025-12-24

How to Cite

Agentic Data Pipelines: Autonomous ELT Orchestration Using AI Agents on Microsoft Fabric and Databricks. (2025). International Journal of Computer Technology and Electronics Communication, 8(6), 11891-11907. https://doi.org/10.15680/IJCTECE.2025.0806034