Serverless ETL with Auto-Scaling Triggers: A Performance-Driven Design on AWS Lambda and Step Functions
DOI:
https://doi.org/10.15680/IJCTECE.2022.0503004
Keywords:
Serverless Computing, ETL Architecture, AWS Lambda, Auto-Scaling, Performance Optimization
Abstract
The proliferation of cloud-native architectures has catalyzed a fundamental shift in data engineering paradigms, with serverless computing emerging as a transformative approach for Extract, Transform, and Load operations that exhibit variable workload patterns and irregular temporal characteristics. This article investigates the design, implementation, and comprehensive performance evaluation of a production-grade serverless ETL architecture leveraging AWS Lambda for compute execution and Step Functions for workflow orchestration, systematically addressing critical challenges including cold-start latency penalties, concurrency management under burst loads, and cost optimization across heterogeneous data volumes. Through rigorous empirical analysis spanning batch sizes from megabyte-scale events to hundred-gigabyte datasets under diverse concurrency scenarios, this article demonstrates that properly architected serverless ETL pipelines achieve linear scalability characteristics with near-perfect correlation between execution time and input data volume, while delivering substantial cost reductions for sporadic and low-frequency workloads compared to persistent cluster-based infrastructure. The experimental evaluation reveals critical performance thresholds, including cold-start latency profiles, break-even points between serverless and traditional architectures based on execution frequency, and auto-scaling responsiveness patterns that inform deployment decisions for production environments. The article establishes that serverless ETL represents a workload-dependent optimization rather than a universal best practice, with economic advantages manifesting primarily in scenarios characterized by unpredictable data arrival patterns, intermittent processing requirements, and elastic scaling demands that traditional infrastructure cannot efficiently accommodate without incurring significant idle resource costs and operational overhead.