Cloud-Enabled Data Lake Architectures for Large-Scale Autonomous Vehicle Datasets with Neural Network Integration
DOI:
https://doi.org/10.15680/IJCTECE.2024.0706005
Keywords:
Data lake architecture, Autonomous vehicles, Cloud computing, Large-scale datasets, Data ingestion, Metadata management, Real-time analytics, Big data storage, Data governance, Machine learning pipelines
Abstract
The exponential growth of autonomous vehicle (AV) technologies has led to the generation of massive, heterogeneous datasets requiring scalable and efficient storage, processing, and analysis. This paper proposes a cloud-enabled data lake architecture tailored for large-scale autonomous vehicle datasets, enabling seamless data ingestion, integration, and retrieval across multimodal sources such as LiDAR, radar, cameras, and vehicle-to-infrastructure (V2I) communication streams. By leveraging neural networks, the architecture facilitates advanced data analytics, including object detection, trajectory prediction, and anomaly detection, thereby enhancing decision-making for autonomous systems. The integration of cloud-native services ensures elasticity, high availability, and real-time data processing, while schema-on-read capabilities provide flexibility in handling structured and unstructured data. Experimental validation demonstrates that the proposed architecture reduces data latency, supports scalable training of deep learning models, and enhances the robustness of AV applications. This work contributes to building intelligent, AI-powered ecosystems that accelerate the safe deployment of large-scale autonomous transportation systems.
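The schema-on-read idea mentioned in the abstract can be illustrated with a minimal sketch. It is not the paper's implementation: the record layout, the field names (`sensor`, `ts`, `points`), and the helpers `read_lidar` and `read_timeline` are all hypothetical, chosen only to show that raw multimodal records can be stored as-is and given a schema when they are read.

```python
import json
from dataclasses import dataclass
from typing import Iterable, Iterator, List, Tuple

# Hypothetical raw records as they might land in the lake: one JSON object
# per line, with a different shape for each sensor type (assumed layout).
RAW_RECORDS = [
    '{"sensor": "lidar", "ts": 1700000000.10, "points": 120000}',
    '{"sensor": "camera", "ts": 1700000000.12, "width": 1920, "height": 1080}',
    '{"sensor": "radar", "ts": 1700000000.15, "targets": 7}',
    '{"sensor": "v2i", "ts": 1700000000.20, "phase": "red"}',
]


@dataclass
class LidarFrame:
    """Schema applied at read time; nothing enforced it at write time."""
    ts: float
    points: int


def read_lidar(lines: Iterable[str]) -> Iterator[LidarFrame]:
    """Project the raw records onto the LidarFrame schema, skipping others."""
    for line in lines:
        record = json.loads(line)
        if record.get("sensor") != "lidar":
            continue
        yield LidarFrame(ts=float(record["ts"]), points=int(record["points"]))


def read_timeline(lines: Iterable[str]) -> List[Tuple[float, str]]:
    """A second reader: the same raw data viewed as a (timestamp, sensor) timeline."""
    records = [json.loads(line) for line in lines]
    return sorted((float(r["ts"]), r["sensor"]) for r in records)


if __name__ == "__main__":
    print(list(read_lidar(RAW_RECORDS)))
    print(read_timeline(RAW_RECORDS))
```

Both readers consume the same unmodified records, so a new consumer, for example a trajectory-prediction training job, can define its own view without the stored data being rewritten.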

