Resilience Engineering Principles for Distributed Cloud-Native Applications under Chaos

Authors

  • Phanindra Gangina, Awoit Systems Inc., USA

DOI:

https://doi.org/10.15680/IJCTECE.2022.0505005

Keywords:

Resilience engineering, chaos engineering, fault tolerance, distributed systems, circuit breakers, bulkhead pattern, retry mechanisms, graceful degradation, system reliability, failure injection

Abstract

This paper presents resilience engineering and chaos engineering principles for building fault-tolerant, distributed cloud-native applications. It outlines design patterns that proactively improve reliability: circuit breakers, bulkheads, and adaptive retry mechanisms. It also introduces a systematic approach to failure injection testing, in which the system is deliberately disrupted to verify that it withstands unexpected conditions without failing. The proposed architecture decouples services with circuit breakers, which prevent cascading failures, and applies the bulkhead pattern to confine failures to individual components so that the system as a whole is not affected. Adaptive retry schemes adjust dynamically to network conditions and service availability. The architecture also supports graceful degradation: when some components fail, the application remains usable with reduced functionality and the user experience stays tolerable. Chaos engineering complements these patterns by injecting failures that replicate real-world conditions and stress-test the system's resilience. The paper serves as a detailed practical guide for engineers and developers building highly resilient, fault-tolerant cloud-native applications that can endure and recover from failures without causing major service outages.
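
As an illustration of the circuit-breaker and graceful-degradation ideas summarized in the abstract, the following is a minimal sketch and not the paper's implementation. The class name `CircuitBreaker`, the parameters `failure_threshold` and `reset_timeout`, and the `fallback` callable are all assumptions made for this example. After a configurable number of consecutive failures the breaker stops calling the failing dependency and serves the fallback instead; once the timeout has elapsed it allows a single trial call.

```python
import time
from typing import Callable, Optional, TypeVar

T = TypeVar("T")


class CircuitOpenError(RuntimeError):
    """Raised when a call is rejected because the circuit is open."""


class CircuitBreaker:
    """Hypothetical minimal circuit breaker (illustrative only)."""

    def __init__(
        self,
        failure_threshold: int = 3,
        reset_timeout: float = 30.0,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self.failure_threshold = failure_threshold
        self.reset_timeout = reset_timeout
        self._clock = clock  # injectable so the timeout can be tested
        self._failures = 0
        self._opened_at: Optional[float] = None

    @property
    def state(self) -> str:
        if self._opened_at is None:
            return "closed"
        if self._clock() - self._opened_at >= self.reset_timeout:
            return "half-open"  # timeout elapsed: allow a trial call
        return "open"

    def call(
        self,
        func: Callable[[], T],
        fallback: Optional[Callable[[], T]] = None,
    ) -> T:
        if self.state == "open":
            # Fail fast without touching the unhealthy dependency.
            if fallback is not None:
                return fallback()
            raise CircuitOpenError("circuit is open")
        try:
            result = func()
        except Exception:
            self._failures += 1
            # Trip when the threshold is reached, or re-open immediately
            # if a half-open trial call fails.
            if (
                self._failures >= self.failure_threshold
                or self._opened_at is not None
            ):
                self._opened_at = self._clock()
            if fallback is not None:
                return fallback()  # degrade gracefully
            raise
        # Success closes the circuit and clears the failure count.
        self._failures = 0
        self._opened_at = None
        return result
```

A caller might wrap a remote request as `breaker.call(fetch_profile, fallback=lambda: cached_profile)`, where `fetch_profile` and `cached_profile` are placeholder names. A production breaker would usually also need thread safety and metrics, which this sketch omits.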

Published

2022-09-07

How to Cite

Resilience Engineering Principles for Distributed Cloud-Native Applications under Chaos. (2022). International Journal of Computer Technology and Electronics Communication, 5(5), 5760-5770. https://doi.org/10.15680/IJCTECE.2022.0505005
