The End of Generative AI Experiments: Designing Production‑Grade Data Architectures for LLM Systems

Authors

  • Samanth Gurram, Engineering Manager, Data & AI, USA

DOI:

https://doi.org/10.15680/IJCTECE.2024.0701009

Keywords:

Generative AI, Large Language Models (LLMs), Retrieval-Augmented Generation (RAG), Data Volatility

Abstract

This paper examines why generative AI pilots fail, tracing the failures to flaws in the model or to volatile data. Using a quantitative experimental design, it compares large-parameter models with smaller, specialized models, each with and without Retrieval-Augmented Generation (RAG). The results show that data volatility, indexing delays, and weak ingestion pipelines were the primary sources of significant performance loss. Generative AI pilots worked well in controlled tests, but several of them collapsed when moved into an actual production setting. Smaller models backed by strong pipelines proved advantageous in cost-adjusted value. Governance and security risks escalated as systems were introduced into production environments.

Published

2024-02-09

How to Cite

The End of Generative AI Experiments: Designing Production‑Grade Data Architectures for LLM Systems. (2024). International Journal of Computer Technology and Electronics Communication, 7(1), 8233-8242. https://doi.org/10.15680/IJCTECE.2024.0701009
