Welcome to Padre Inc., a company that uses AWS as its cloud service provider. Padre has acquired Tiddler Inc., a startup that uses Google Cloud Platform. As an enterprise architect for Padre, you’re responsible for managing the multi-cloud operations of its SaaS project. Using Countly, you’ll define and implement key performance indicators (KPIs) to measure the success of Padre’s SaaS project. To ensure smooth operation of services, you’ll implement site reliability engineering (SRE) practices using Kubernetes, Prometheus, and Grafana. You’ll create a centralized logging flow that streams logs from AWS to Google Cloud, further improving the user experience. To balance the deployment of new features with the reliability of the services in production, you’ll implement error budgeting, a key SRE tool for achieving a predefined service-level objective (SLO). When you’re done, you’ll have useful skills for managing multi-cloud operations.
This series offers a good introduction to SRE for someone who is looking to gain first-hand experience in this field.
Padre Inc. uses AWS as its cloud provider and has acquired Tiddler Inc., an insurance analytics startup that uses Google Cloud Platform. As an enterprise architect at Padre, your task is to manage multi-cloud operations. Using Countly, you’ll define and implement key performance indicators (KPIs) to measure the success of Padre’s SaaS project, which runs on AWS and Google Cloud. To ensure smooth operation of services and an optimal user experience, you’ll implement site reliability engineering (SRE) practices using Kubernetes, Prometheus, and Grafana. When you’re done, you’ll have experience using business metrics to track the health of your software and confirm that it delivers value to customers and stakeholders.
You’re an enterprise architect for Padre Inc., which has acquired Tiddler Inc., an insurance analytics startup, for its SaaS platform. Since your company runs its cloud operations on AWS and the startup uses Google Cloud Platform, your task is to manage multi-cloud operations. As part of that effort, you’ll implement centralized logging to simplify log analysis across multiple applications. After exploring centralized logging using the ELK stack, you’ll create a centralized logging instance in Google Cloud Platform and stream the logs from AWS to Google Cloud. To signal the need for corrective actions, you’ll use Kibana to run queries against the log data and create an alert based on log events. When you’re done, you’ll have created a centralized logging flow that improves site reliability.
As an enterprise architect at Padre Inc., which runs its cloud operations on AWS, it’s up to you to manage multi-cloud operations for Tiddler Inc., a startup using Google Cloud that Padre has acquired for its SaaS project. Your task is to implement error budgeting to achieve a predefined service-level objective (SLO) and balance the deployment of new features with the reliability of the services in production. Using tools including Elasticsearch, Kibana, and Logstash, you’ll create an error budget policy and calculate burn rate, which indicates how quickly the budget is being spent. You’ll use Kibana to build a dashboard that uses burn rate data to drive alerts, which product owners rely on to confirm that service reliability meets service-level agreements (SLAs) when releasing new features. When you’re finished, you’ll have hands-on experience with valuable site reliability engineering (SRE) skills and concepts that you can apply to real-world projects.
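To make the burn rate idea concrete, here is a minimal Python sketch of the arithmetic. It assumes the common SRE convention that the error budget is 1 minus the SLO and that burn rate is the observed error ratio divided by that budget. The function names, the 99.9% SLO, the request counts, and the 30-day window are all hypothetical values chosen for illustration; the liveProject itself works with Elasticsearch and Kibana rather than plain Python, and may define these quantities differently.

```python
import math


def error_budget(slo: float) -> float:
    """Fraction of requests allowed to fail under the SLO (assumed: 1 - SLO)."""
    if not 0.0 < slo < 1.0:
        raise ValueError("slo must be strictly between 0 and 1")
    return 1.0 - slo


def burn_rate(failed: int, total: int, slo: float) -> float:
    """Observed error ratio divided by the error budget.

    A value of 1.0 means the budget would be used up exactly at the end
    of the SLO window; higher values mean it runs out sooner.
    """
    if total == 0:
        return 0.0  # no traffic, so no budget was spent
    return (failed / total) / error_budget(slo)


def hours_to_exhaustion(rate: float, window_hours: float) -> float:
    """Hours until a full budget is spent if the current burn rate holds."""
    return math.inf if rate == 0 else window_hours / rate


# Hypothetical numbers: 99.9% SLO, 50 failures out of 10,000 requests,
# measured against a 30-day (720-hour) SLO window.
rate = burn_rate(failed=50, total=10_000, slo=0.999)
remaining = hours_to_exhaustion(rate, window_hours=720)
print(f"burn rate: {rate:.2f}, budget exhausted in about {remaining:.0f} hours")
```

With these illustrative inputs the error ratio is five times the budget, so a full 30-day budget would last roughly six days at that pace. An alert threshold on burn rate is one way to flag this kind of situation before a release makes it worse.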
This liveProject provides clear instruction and helps to learn critical skills for my day-to-day work.
The experience was particularly enjoyable.
These liveProjects are for solutions architects and developers with a basic knowledge of AWS or Google Cloud Platform, the Linux command line, Kubernetes, JSON, and YAML. To begin these liveProjects you’ll need to be familiar with the following:
TOOLS