Self-Adaptive Data Preprocessing with AI for Dynamic Data Environments
DOI:
https://doi.org/10.15680/IJCTECE.2025.0801005
Keywords:
Self-adaptive preprocessing, Artificial intelligence, Machine learning, Concept drift, Real-time data streams, Data preprocessing, Reinforcement learning, Normalization, Outlier detection
Abstract
In dynamic data environments characterized by evolving data distributions, concept drift, and real-time data streams, traditional static data preprocessing methods often fail to maintain model accuracy and reliability. This paper introduces a self-adaptive data preprocessing framework leveraging artificial intelligence (AI) to dynamically adjust preprocessing steps in response to changing data characteristics. The proposed framework integrates machine learning algorithms to monitor data streams, detect shifts in data distributions, and automatically adjust preprocessing techniques such as normalization, feature selection, and outlier detection. The framework employs reinforcement learning to continuously optimize preprocessing strategies, ensuring that the data fed into machine learning models remains relevant and of high quality. Experiments conducted on various real-world datasets demonstrate that the self-adaptive preprocessing approach significantly improves model performance compared to static preprocessing methods. Notably, the framework effectively handles concept drift and adapts to new data patterns without manual intervention. This research contributes to the field of data preprocessing by providing a scalable and automated solution that enhances the robustness and accuracy of machine learning models in dynamic environments. The self-adaptive framework offers a promising direction for future data preprocessing methodologies, particularly in applications involving real-time data analysis and decision-making.
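The monitor, detect, and adjust loop described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: the class name `AdaptiveStandardizer`, the `window` and `threshold` parameters, and the mean-shift test used as the drift signal are all assumptions chosen for illustration, and the reinforcement learning component is not shown. The sketch maintains running statistics for z-score normalization and re-estimates them from a recent window whenever the window mean departs from the long-run mean by more than a chosen number of standard errors.

```python
from collections import deque
from math import sqrt


class AdaptiveStandardizer:
    """Z-score scaler that re-estimates its statistics when the input drifts.

    Hypothetical illustration only; names and thresholds are assumptions.
    """

    def __init__(self, window=50, threshold=3.0):
        self.window = window            # size of the recent-sample buffer
        self.threshold = threshold      # drift threshold, in standard errors
        self.recent = deque(maxlen=window)
        self.n = 0                      # samples behind the current statistics
        self.mean = 0.0
        self.m2 = 0.0                   # sum of squared deviations (Welford)
        self.drift_count = 0            # number of resets triggered so far

    def _std(self):
        return sqrt(self.m2 / (self.n - 1)) if self.n > 1 else 0.0

    def _update(self, x):
        # Welford's online update of mean and sum of squared deviations.
        self.n += 1
        delta = x - self.mean
        self.mean += delta / self.n
        self.m2 += delta * (x - self.mean)

    def _drift_detected(self):
        # Only test once the buffer is full and there is enough history.
        if len(self.recent) < self.window or self.n < 2 * self.window:
            return False
        std = self._std()
        if std == 0.0:
            return False
        recent_mean = sum(self.recent) / len(self.recent)
        standard_error = std / sqrt(len(self.recent))
        return abs(recent_mean - self.mean) / standard_error > self.threshold

    def _reset_from_recent(self):
        # Discard old statistics and rebuild them from the recent window.
        values = list(self.recent)
        self.n, self.mean, self.m2 = 0, 0.0, 0.0
        for v in values:
            self._update(v)
        self.drift_count += 1

    def transform_one(self, x):
        """Update the statistics with x and return its normalized value."""
        self.recent.append(x)
        self._update(x)
        if self._drift_detected():
            self._reset_from_recent()
        std = self._std()
        return (x - self.mean) / std if std > 0 else 0.0
```

In use, each incoming value is passed through `transform_one`, and `drift_count` reports how often the scaler had to re-estimate its statistics. A static scaler fitted once on early data would instead keep mapping values from a shifted distribution far outside the range the downstream model was trained on.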