As observability becomes better understood and more widely adopted, I thought it would be useful to compile a list of learning material for people new to the area. The better we observe, and consequently understand, our systems, the better we get at predicting and reducing risks.
A huge thanks to Abby Bangser for making her content available.
The doyennes of observability. These folks have tirelessly driven and promoted observability since before it became a ‘thing’. There’s a mountain of great content on the Honeycomb website. Start with https://www.honeycomb.io/resources/white-papers/, but they also have a podcast and videos, so it’s worth checking out their whole site. Plus you can sign up for free!
If you are the type of person who learns by doing, then this observability playground by Abby Bangser is perfect for you. It is a simple web app integrated with loads of monitoring and observability tools, so you can learn the basics.
You can download it here
And if you want to use Azure instead of AWS, go to Parveen Khan’s write-up
I’m holding a Friday mentoring group for testers new to observability, starting 15th January 2021. You can sign up at https://www.testingtimes.com.au/academy/
This is an excellent write-up on observability by Andy Dote
And this video: Unifying your Observability Pipeline by Aditya Mukerjee
How to Build Observable Distributed Systems by Pierre Vincent
The Venn diagram of observability by David Worth
The Missing O11y Primer by Daniel Dyla
I like this analogy by Katy Farmer: What is Observability?
Learn about Structured Logging by Aditya Praharaj
An example of the challenges with metrics (aggregates): “How NOT to Measure Latency” by Gil Tene
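To make the problem with aggregates a little more concrete, here is a minimal Python sketch. The latency numbers are made up for illustration and are not taken from Gil Tene's talk, and `percentile` is a hypothetical nearest-rank helper written for this example rather than a function from any monitoring tool.

```python
import math
import statistics

# Hypothetical sample: 98 fast requests and 2 very slow ones (milliseconds).
latencies_ms = [10.0] * 98 + [2000.0] * 2


def percentile(values, pct):
    """Nearest-rank percentile: the smallest sample that has at least
    pct percent of all samples at or below it. pct is an integer 1..100."""
    ordered = sorted(values)
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]


mean_ms = statistics.mean(latencies_ms)
median_ms = percentile(latencies_ms, 50)
p99_ms = percentile(latencies_ms, 99)

# The mean sits between the two groups and describes neither of them,
# while the median hides the slow requests entirely.
print(f"mean:   {mean_ms:.1f} ms")
print(f"median: {median_ms:.1f} ms")
print(f"p99:    {p99_ms:.1f} ms")
```

In this sample the mean describes a request that never actually happened, and the median says nothing about the two users who waited two seconds. Only the high percentile (or the raw events) surfaces them, which is the kind of gap the talk explores in far more depth.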
SLOs (Service Level Objectives)
Learn about SLOs and SLIs in this PDF created by Julie McCoy with Nicole Forsgren
Prometheus Explained by S Santhosh Nagaraj
Links to further content on resilience by Lorin Hochstein