本页面中的信息并不完全以您的首选语言展示,我们正在完善其他语言版本。想要以您的首选语言了解相关信息,可以点击这里下载PDF。
Chaos Engineering
In the last year we've seen Chaos Engineering move from a much talked-about idea to an accepted, mainstream approach to improving and assuring distributed system resilience. As organizations large and small begin to implement Chaos Engineering as an operational process, we're learning how to apply these techniques safely at scale. The approach is definitely not for everyone, and to be effective and safe, it requires organizational support at scale. Industry acceptance and available expertise will definitely increase with the appearance of commercial services such as Gremlin and deployment tools such as Spinnaker implementing some Chaos Engineering tools.
In previous editions of the Radar, we've talked about using Chaos Monkey from Netflix to test how a running system is able to cope with outages in production by randomly disabling instances and measuring the results. Chaos Engineering is the nascent term for the wider application of this technique. By running experiments on distributed systems in production, we're able to build confidence that those systems work as expected under turbulent conditions. A good place to start understanding this technique is the Principles of Chaos Engineering website.
In previous editions of the Radar, we've talked about using Chaos Monkey from Netflix to test how a running system is able to cope with outages in production by randomly disabling instances and measuring the results. Chaos Engineering is the nascent term for the wider application of this technique. By running experiments on distributed systems in production, we're able to build confidence that those systems work as expected under turbulent conditions. A good place to start understanding this technique is the Principles of Chaos Engineering website.