Exploring Metadata through Natural Language Interfaces Powered by Generative AI
DOI:
https://doi.org/10.15680/IJCTECE.2025.0802003
Keywords:
Natural language interfaces, Metadata exploration, Generative AI, Data catalogs, Conversational AI, Data governance, Language models, Knowledge graphs, Data discovery, User experience
Abstract
Metadata exploration plays a critical role in data management, enabling users to discover, understand, and utilize data assets effectively. Traditional metadata exploration tools often require technical expertise and navigation through complex interfaces, which can hinder accessibility and productivity. This paper proposes a novel approach leveraging generative artificial intelligence (AI) to create intuitive natural language interfaces (NLIs) for metadata exploration in diverse data ecosystems. By integrating generative AI models such as GPT-like transformers, the system allows users to interact with metadata repositories through conversational queries, making metadata discovery accessible to non-technical users. The generative AI interprets user intents expressed in natural language and dynamically constructs appropriate metadata queries. It also generates explanatory summaries, recommendations, and context-aware insights, enriching the metadata exploration experience. The proposed methodology involves fine-tuning large pre-trained language models on domain-specific metadata corpora and linking them with metadata catalogs and knowledge graphs. This enables the system to understand complex metadata schemas and relationships, providing accurate and contextually relevant responses. Evaluation on enterprise-scale metadata environments demonstrates significant improvements in user satisfaction, query success rates, and exploration efficiency compared to conventional keyword-based search interfaces. Users reported enhanced understanding of data assets and improved decision-making capabilities. The study illustrates the transformative potential of generative AI in democratizing metadata access, reducing dependency on specialized knowledge, and facilitating more effective data governance. Future research will focus on multimodal interfaces, real-time updates, and extending the system to handle ambiguous or incomplete metadata queries.
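To make the flow described in the abstract concrete (a natural language question is interpreted, turned into a structured metadata query, run against a catalog, and summarized), the following is a minimal sketch and not the authors' implementation. Every name in it (`Asset`, `MetadataQuery`, `CATALOG`, `interpret`, `run`, `answer`) is hypothetical, the catalog is a toy in-memory list, and the `interpret` function is a simple keyword matcher standing in for the fine-tuned language model the paper describes.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Set


@dataclass
class Asset:
    """One catalog entry (hypothetical schema, for illustration only)."""
    name: str
    owner: str
    tags: Set[str] = field(default_factory=set)


@dataclass
class MetadataQuery:
    """Structured query produced from a natural language question."""
    owner: Optional[str] = None
    tags: Set[str] = field(default_factory=set)


# Toy in-memory catalog; a real system would query a metadata catalog
# or knowledge graph instead.
CATALOG = [
    Asset("sales_orders", "finance", {"pii", "daily"}),
    Asset("web_clicks", "marketing", {"daily"}),
    Asset("customer_master", "finance", {"pii"}),
]

KNOWN_OWNERS = {"finance", "marketing"}
KNOWN_TAGS = {"pii", "daily"}


def interpret(question: str) -> MetadataQuery:
    """Stand-in for the language model: map a question to a structured query.

    Assumption: simple keyword matching replaces intent interpretation.
    """
    words = {w.strip("?.,").lower() for w in question.split()}
    owners = sorted(words & KNOWN_OWNERS)
    owner = owners[0] if owners else None
    return MetadataQuery(owner=owner, tags=words & KNOWN_TAGS)


def run(query: MetadataQuery) -> List[str]:
    """Return names of assets that satisfy every constraint in the query."""
    return [
        asset.name
        for asset in CATALOG
        if (query.owner is None or asset.owner == query.owner)
        and query.tags <= asset.tags
    ]


def answer(question: str) -> str:
    """Interpret the question, run the query, and summarize the result."""
    names = run(interpret(question))
    if not names:
        return "No matching data assets found."
    return "Matching data assets: " + ", ".join(names)


if __name__ == "__main__":
    print(answer("Which finance datasets contain PII?"))
```

Keeping the structured `MetadataQuery` separate from the interpretation step means the keyword matcher could, in principle, be swapped for a model call without changing how the catalog is queried.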