A Day, a Week, a Month in the Life of a Cascading Failure




Microservice architectures offer substantial benefits in development flexibility and speed, but they also introduce significant complexity in how distributed components interact. Eventually, every distributed system will encounter an error or failure. Sometimes these errors escape the isolation of a single component and cause a series of cascading failures across the interconnected system. This is the story of one such failure, the lessons learned, and how it has influenced our architecture and design decisions moving forward.

Speaker

John Engelman

 
John Engelman is the Principal Engineer for Platform Services at Target in Minneapolis, MN. In this role, he oversees the developer experience on the Target Application Platform including the ...