Netflix Chaos Monkey Upgraded

netflixtechblog.com 3 min read
Summary (TL;DR)
Netflix released Chaos Monkey 2.0, a new version of its tool that randomly terminates production instances to test service resiliency. Key upgrades include integration with Spinnaker, scheduling configured as a mean time between terminations, termination grouping by app, stack, or cluster, and tracker support for reporting terminations to metrics systems. The version 1 ability to run other failure experiments (like CPU burn) has been removed. New features include automatic opt-out for canaries, cross-account terminations, and auto-disable during outages.
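
To make the scheduling and grouping ideas concrete, here is a minimal Python sketch. It is not Chaos Monkey's code or configuration format. It assumes a simple geometric model in which a mean of N days between terminations corresponds to a 1/N chance per group per day, and every name in it (`daily_probability`, `group_instances`, `pick_terminations`, the `canary` flag, and the instance dictionaries) is hypothetical.

```python
import random
from collections import defaultdict


def daily_probability(mean_work_days: float) -> float:
    """Assumed model: a per-day chance p yields a mean gap of 1/p days."""
    if mean_work_days < 1:
        raise ValueError("mean_work_days must be at least 1")
    return 1.0 / mean_work_days


def group_instances(instances, grouping):
    """Bucket instances by a hypothetical 'app', 'stack', or 'cluster' key."""
    groups = defaultdict(list)
    for inst in instances:
        if inst.get("canary"):
            continue  # canaries are opted out, mirroring the summary above
        groups[inst[grouping]].append(inst)
    return dict(groups)


def pick_terminations(instances, grouping, mean_work_days, rng):
    """Return at most one instance id per group for today's run."""
    p = daily_probability(mean_work_days)
    victims = []
    groups = group_instances(instances, grouping)
    for key in sorted(groups):  # sorted for a stable iteration order
        if rng.random() < p:
            victims.append(rng.choice(groups[key])["id"])
    return victims


if __name__ == "__main__":
    fleet = [
        {"id": "i-1", "app": "api", "cluster": "api-prod", "canary": False},
        {"id": "i-2", "app": "api", "cluster": "api-prod", "canary": False},
        {"id": "i-3", "app": "api", "cluster": "api-canary", "canary": True},
        {"id": "i-4", "app": "web", "cluster": "web-prod", "canary": False},
    ]
    print(pick_terminations(fleet, "cluster", 5, random.Random()))
```

Under this assumed model, a mean of 5 work days gives each cluster a 20% chance of losing one instance on any given day, and grouping by `app` instead of `cluster` would mean fewer, larger groups and therefore fewer terminations overall.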