Merged further readings and references. Added absolute path for links

Sumit Sulakhe
2021-02-08 07:08:22 -08:00
committed by Sumit Sulakhe
parent 561f8cd547
commit b9a9d3cc12
4 changed files with 31 additions and 25 deletions
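
The link changes in this commit (relative targets such as `alerts.md` or `#introduction` becoming absolute URLs) could be reproduced with a script along the following lines. This is only a minimal sketch, not the method actually used for the commit. It assumes directory-style URLs in which `page.md` is served at `page/`, as the rewritten links in the diff suggest. The names `absolutize`, `LINK`, and `PAGE_URL` are hypothetical.

```python
import re
from urllib.parse import urljoin

# Assumed URL of the page whose links are rewritten (taken from the diff below).
PAGE_URL = "https://linkedin.github.io/school-of-sre/metrics_and_monitoring/introduction/"

# Inline Markdown links [text](target); the lookbehind skips images ![alt](src).
LINK = re.compile(r"(?<!!)\[([^\]]+)\]\(([^)\s]+)\)")


def absolutize(markdown: str, page_url: str = PAGE_URL) -> str:
    """Rewrite relative Markdown links as absolute URLs under page_url."""

    def fix(match: re.Match) -> str:
        text, target = match.group(1), match.group(2)
        # Leave links that already carry a scheme (https://, mailto:, ...) alone.
        if re.match(r"^[a-zA-Z][a-zA-Z0-9+.-]*:", target):
            return match.group(0)
        path, sep, fragment = target.partition("#")
        # Assumption: sibling page "name.md" is published at "../name/".
        if path.endswith(".md"):
            path = "../" + path[: -len(".md")] + "/"
        return f"[{text}]({urljoin(page_url, path + sep + fragment)})"

    return LINK.sub(fix, markdown)


if __name__ == "__main__":
    sample = "- [Logs](observability.md#logs)\n- [Introduction](#introduction)"
    print(absolutize(sample))
```

Resolving each target against the page URL with `urljoin` keeps anchors intact and avoids hand-typing every URL, which is where a slip in a path segment is most likely to creep in.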

@@ -16,12 +16,12 @@ In this course, we are focusing on building strong foundational skills. The cour
- [Linux Networking](https://sumit419.github.io/school-of-sre/linux_networking/intro/) - [Linux Networking](https://sumit419.github.io/school-of-sre/linux_networking/intro/)
- [Python and Web](https://sumit419.github.io/school-of-sre/python_web/intro/) - [Python and Web](https://sumit419.github.io/school-of-sre/python_web/intro/)
- Data - Data
- [Relational databases(MySQL)](https://sumit419.github.io/school-of-sre/databases_sql/intro/) - [Relational databases(MySQL)](https://linkedin.github.io/school-of-sre/databases_sql/intro/)
- [NoSQL concepts](https://sumit419.github.io/school-of-sre/databases_nosql/intro/) - [NoSQL concepts](https://linkedin.github.io/school-of-sre/databases_nosql/intro/)
- [Big Data](https://sumit419.github.io/school-of-sre/big_data/intro/) - [Big Data](https://linkedin.github.io/school-of-sre/big_data/intro/)
- [Systems Design](https://sumit419.github.io/school-of-sre/systems_design/intro/) - [Systems Design](https://linkedin.github.io/school-of-sre/systems_design/intro/)
- [Metrics and Monitoring](https://sumit419.github.io/school-of-sre/metrics_and_monitoring/introduction/) - [Metrics and Monitoring](https://linkedin.github.io/school-of-sre/metrics_and_monitoring/introduction/)
- [Security](https://sumit419.github.io/school-of-sre/security/intro/) - [Security](https://linkedin.github.io/school-of-sre/security/intro/)
We believe continuous learning will help in acquiring deeper knowledge and competencies in order to expand your skill sets, every module has added references which could be a guide for further learning. Our hope is that by going through these modules we should be able to build the essential skills required for a Site Reliability Engineer. We believe continuous learning will help in acquiring deeper knowledge and competencies in order to expand your skill sets, every module has added references which could be a guide for further learning. Our hope is that by going through these modules we should be able to build the essential skills required for a Site Reliability Engineer.

@@ -13,7 +13,8 @@ any service breakdown due to a shortage of resources. On the other hand,
when a service goes down due to an issue, early detection and when a service goes down due to an issue, early detection and
notification of such incidents can help you quickly fix the issue. notification of such incidents can help you quickly fix the issue.
![An alert notification received on Slack](images/image11.png) <p align="center"> Figure 8: An alert notification received on Slack </p> ![An alert notification received on Slack](images/image11.png)
<p align="center"> Figure 8: An alert notification received on Slack </p>
Today most of the monitoring services available provide a mechanism to Today most of the monitoring services available provide a mechanism to
set up alerts on one or a combination of metrics to actively monitor the set up alerts on one or a combination of metrics to actively monitor the

@@ -27,7 +27,8 @@ on). Let's look at some of the tools that are predominantly used.
- `-x` -- When displaying processes matched by other options, - `-x` -- When displaying processes matched by other options,
includes processes that do not have a controlling terminal. includes processes that do not have a controlling terminal.
![Results of top command](images/image12.png) <p align="center"> Figure 2: Results of top command </p> ![Results of top command](images/image12.png)
<p align="center"> Figure 2: Results of top command </p>
- `ss` -- The socket statistics command (ss) displays information - `ss` -- The socket statistics command (ss) displays information
about network sockets on the system. This tool is the successor of about network sockets on the system. This tool is the successor of
@@ -52,8 +53,8 @@ on). Let's look at some of the tools that are predominantly used.
displays the statistics in a human-readable format. displays the statistics in a human-readable format.
![Memory ![Memory
statistics on a host in human-readable form](images/image6.png) <p align="center"> Figure 4: Memory statistics on a host in human-readable form](images/image6.png)
statistics on a host in human-readable form </p> <p align="center"> Figure 4: Memory statistics on a host in human-readable form </p>
- `df --` The df command displays disk space usage statistics. The - `df --` The df command displays disk space usage statistics. The
`-i` command-line option is also often used to display `-i` command-line option is also often used to display
@@ -61,7 +62,8 @@ on). Let's look at some of the tools that are predominantly used.
statistics. The `-h` command-line option is used for displaying statistics. The `-h` command-line option is used for displaying
statistics in a human-readable format. statistics in a human-readable format.
![Disk usage statistics on a system in human-readable form](images/image9.png) <p align="center"> Figure 5: ![Disk usage statistics on a system in human-readable form](images/image9.png)
<p align="center"> Figure 5:
Disk usage statistics on a system in human-readable form </p> Disk usage statistics on a system in human-readable form </p>
- `sar` -- The sar utility monitors various subsystems, such as CPU - `sar` -- The sar utility monitors various subsystems, such as CPU
@@ -75,7 +77,8 @@ on). Let's look at some of the tools that are predominantly used.
specifies which network interface to watch. specifies which network interface to watch.
![Network bandwidth usage by ![Network bandwidth usage by
active connection on the host](images/image2.png) <p align="center"> Figure 6: Network bandwidth usage by active connection on the host](images/image2.png)
<p align="center"> Figure 6: Network bandwidth usage by
active connection on the host </p> active connection on the host </p>
- `tcpdump` -- The tcpdump command is a network monitoring tool that - `tcpdump` -- The tcpdump command is a network monitoring tool that
@@ -94,5 +97,6 @@ active connection on the host </p>
- `port <port number>` -- Filters traffic to or from a particular - `port <port number>` -- Filters traffic to or from a particular
port port
![tcpdump of packets on an interface](images/image10.png) <p align="center"> Figure 7: *tcpdump* of packets on *docker0* ![tcpdump of packets on an interface](images/image10.png)
<p align="center"> Figure 7: *tcpdump* of packets on *docker0*
interface on a host </p> interface on a host </p>

@@ -45,26 +45,26 @@ following topics:
## Course content ## Course content
- [Introduction](#introduction) - [Introduction](https://linkedin.github.io/school-of-sre/metrics_and_monitoring/introduction/#introduction)
- [Four golden signals of monitoring](#four-golden-signals-of-monitoring) - [Four golden signals of monitoring](https://linkedin.github.io/school-of-sre/metrics_and_monitoring/introduction/#four-golden-signals-of-monitoring)
- [Why is monitoring important?](#why-is-monitoring-important) - [Why is monitoring important?](https://linkedin.github.io/school-of-sre/metrics_and_monitoring/introduction/#why-is-monitoring-important)
- [Command-line tools](command-line_tools.md) - [Command-line tools](https://linkedin.github.io/school-of-sre/metrics_and_monitoring/command-line_tools/)
- [Third-party monitoring](third-party_monitoring.md) - [Third-party monitoring](https://linkedin.github.io/school-of-sre/metrics_and_monitoring/third-party_monitoring/)
- [Proactive monitoring using alerts](alerts.md) - [Proactive monitoring using alerts](https://linkedin.github.io/school-of-sre/metrics_and_monitoring/alerts/)
- [Best practices for monitoring](best_practices.md) - [Best practices for monitoring](https://linkedin.github.io/school-of-sre/metrics_and_monitoring/best_practices/)
- [Observability](observability.md) - [Observability](https://linkedin.github.io/school-of-sre/metrics_and_monitoring/observability/)
- [Logs](observability.md#logs) - [Logs](https://linkedin.github.io/school-of-sre/metrics_and_monitoring/observability/#logs)
- [Tracing](observability.md#tracing) - [Tracing](https://linkedin.github.io/school-of-sre/metrics_and_monitoring/observability/#tracing)
[Conclusion](conclusion.md) [Conclusion](https://linkedin.github.io/school-of-sre/metrics_and_monitoring/conclusion/)
## ##
@@ -221,7 +221,8 @@ Before we discuss monitoring an application, let us look at the
monitoring infrastructure. Following is an illustration of a basic monitoring infrastructure. Following is an illustration of a basic
monitoring system. monitoring system.
![Illustration of a monitoring infrastructure](images/image1.jpg) <p align="center"> Figure 1: Illustration of a monitoring infrastructure </p> ![Illustration of a monitoring infrastructure](images/image1.jpg)
<p align="center"> Figure 1: Illustration of a monitoring infrastructure </p>
Figure 1 shows a monitoring infrastructure mechanism for aggregating Figure 1 shows a monitoring infrastructure mechanism for aggregating
metrics on the system, and collecting and storing the data for display. metrics on the system, and collecting and storing the data for display.