mirror of
https://github.com/linkedin/school-of-sre
synced 2026-01-21 07:58:03 +00:00
Deployed 4239ecf with MkDocs version: 1.2.3
@@ -2110,22 +2110,22 @@
 practices in mind.</p>
 <ul>
 <li>
-<p><strong>Use the right metric type</strong> -- Most of the libraries available
+<p><strong>Use the right metric type</strong>—Most of the libraries available
 today offer various metric types. Choose the appropriate metric
 type for monitoring your system. Following are the types of
 metrics and their purposes.</p>
 <ul>
 <li>
-<p><strong>Gauge --</strong> <em>Gauge</em> is a constant type of metric. After the
+<p><strong>Gauge</strong>—<em>Gauge</em> is a constant type of metric. After the
 metric is initialized, the metric value does not change unless
 you intentionally update it.</p>
 </li>
 <li>
-<p><strong>Timer --</strong> <em>Timer</em> measures the time taken to complete a
+<p><strong>Timer</strong>—<em>Timer</em> measures the time taken to complete a
 task.</p>
 </li>
 <li>
-<p><strong>Counter --</strong> <em>Counter</em> counts the number of occurrences of a
+<p><strong>Counter</strong>—<em>Counter</em> counts the number of occurrences of a
 particular event.</p>
 </li>
 </ul>
@@ -2135,19 +2135,19 @@ practices in mind.</p>
 Types</a>.</p>
 <ul>
 <li>
-<p><strong>Avoid over-monitoring</strong> -- Monitoring can be a significant
-engineering endeavor<strong><em>.</em></strong> Therefore, be sure not to spend too
+<p><strong>Avoid over-monitoring</strong>—Monitoring can be a significant
+engineering endeavor. Therefore, be sure not to spend too
 much time and resources on monitoring services, yet make sure all
 important metrics are captured.</p>
 </li>
 <li>
-<p><strong>Prevent alert fatigue</strong> -- Set alerts for metrics that are
+<p><strong>Prevent alert fatigue</strong>—Set alerts for metrics that are
 important and actionable. If you receive too many non-critical
 alerts, you might start ignoring alert notifications over time. As
 a result, critical alerts might get overlooked.</p>
 </li>
 <li>
-<p><strong>Have a runbook for alerts</strong> -- For every alert, make sure you have
+<p><strong>Have a runbook for alerts</strong>—For every alert, make sure you have
 a document explaining what actions and checks need to be performed
 when the alert fires. This enables any engineer on the team to
 handle the alert and take necessary actions, without any help from
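The three metric types named in the first hunk (Gauge, Timer, Counter) can be illustrated with a minimal sketch. It uses only the Python standard library and does not reproduce the API of any particular metrics library; the class names, method names, and metric names (`requests_total`, `queue_depth`, `request_latency`) are hypothetical and chosen only for illustration.

```python
import time
from contextlib import contextmanager


class Counter:
    """Counts occurrences of an event; the value only ever increases."""

    def __init__(self):
        self.value = 0

    def inc(self, amount=1):
        if amount < 0:
            raise ValueError("a counter can only increase")
        self.value += amount


class Gauge:
    """Holds a value that stays the same until it is explicitly set again."""

    def __init__(self, initial=0.0):
        self.value = initial

    def set(self, value):
        self.value = value


class Timer:
    """Records how long each timed task took, in seconds."""

    def __init__(self):
        self.durations = []

    @contextmanager
    def time(self):
        start = time.perf_counter()
        try:
            yield
        finally:
            # Record the elapsed time even if the task raised an exception.
            self.durations.append(time.perf_counter() - start)


# Hypothetical metrics for an imaginary request handler.
requests_total = Counter()
queue_depth = Gauge()
request_latency = Timer()

for _ in range(3):
    with request_latency.time():
        requests_total.inc()  # one occurrence of the "request" event

queue_depth.set(7)  # the gauge keeps this value until the next set()
```

In this sketch the counter ends at 3, the gauge reads 7 until it is set again, and the timer holds one duration per timed block. A production metrics library would typically add labels, thread safety, and an export format on top of this.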