mongodb-atlas

Overview

The Jsonnet source code is available at github.com/grafana/jsonnet-libs.

Alerts

The complete list of pregenerated alerts is available here.

mongodb-atlas-alerts

MongoDBAtlasHighNumberOfCollectionExclusiveDeadlocks

alert: MongoDBAtlasHighNumberOfCollectionExclusiveDeadlocks
annotations:
  description: The number of collection exclusive-lock deadlocks occurring on node
    {{$labels.instance}} in cluster {{$labels.cl_name}} is {{printf "%.0f" $value}}
    which is above the threshold of 10.
  summary: There is a high number of collection exclusive deadlocks occurring.
expr: |
  sum without(cl_role,process_port,rs_nm,rs_state) (increase(mongodb_locks_Collection_deadlockCount_W[5m])) > 10
for: 5m
labels:
  severity: warning
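
A minimal sketch of how a rule such as the one above might be loaded into Prometheus, assuming the alerts are kept in a standalone rules file. The group name mirrors the mongodb-atlas-alerts heading above; the file name in the comment is hypothetical.

```yaml
# mongodb-atlas-alerts.yaml (hypothetical file name)
groups:
  - name: mongodb-atlas-alerts
    rules:
      # Rule body copied unchanged from the listing above.
      - alert: MongoDBAtlasHighNumberOfCollectionExclusiveDeadlocks
        annotations:
          description: The number of collection exclusive-lock deadlocks occurring on node
            {{$labels.instance}} in cluster {{$labels.cl_name}} is {{printf "%.0f" $value}}
            which is above the threshold of 10.
          summary: There is a high number of collection exclusive deadlocks occurring.
        expr: |
          sum without(cl_role,process_port,rs_nm,rs_state) (increase(mongodb_locks_Collection_deadlockCount_W[5m])) > 10
        for: 5m
        labels:
          severity: warning
```

A file shaped like this can be listed under `rule_files` in the Prometheus configuration and validated with `promtool check rules` before it is deployed.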

MongoDBAtlasHighNumberOfCollectionIntentExclusiveDeadlocks

alert: MongoDBAtlasHighNumberOfCollectionIntentExclusiveDeadlocks
annotations:
  description: The number of collection intent-exclusive-lock deadlocks occurring
    on node {{$labels.instance}} in cluster {{$labels.cl_name}} is {{printf "%.0f"
    $value}} which is above the threshold of 10.
  summary: There is a high number of collection intent-exclusive deadlocks occurring.
expr: |
  sum without(cl_role,process_port,rs_nm,rs_state) (increase(mongodb_locks_Collection_deadlockCount_w[5m])) > 10
for: 5m
labels:
  severity: warning

MongoDBAtlasHighNumberOfCollectionSharedDeadlocks

alert: MongoDBAtlasHighNumberOfCollectionSharedDeadlocks
annotations:
  description: The number of collection shared-lock deadlocks occurring on node {{$labels.instance}}
    in cluster {{$labels.cl_name}} is {{printf "%.0f" $value}} which is above the
    threshold of 10.
  summary: There is a high number of collection shared deadlocks occurring.
expr: |
  sum without(cl_role,process_port,rs_nm,rs_state) (increase(mongodb_locks_Collection_deadlockCount_R[5m])) > 10
for: 5m
labels:
  severity: warning

MongoDBAtlasHighNumberOfCollectionIntentSharedDeadlocks

alert: MongoDBAtlasHighNumberOfCollectionIntentSharedDeadlocks
annotations:
  description: The number of collection intent-shared-lock deadlocks occurring on
    node {{$labels.instance}} in cluster {{$labels.cl_name}} is {{printf "%.0f" $value}}
    which is above the threshold of 10.
  summary: There is a high number of collection intent-shared deadlocks occurring.
expr: |
  sum without(cl_role,process_port,rs_nm,rs_state) (increase(mongodb_locks_Collection_deadlockCount_r[5m])) > 10
for: 5m
labels:
  severity: warning

MongoDBAtlasHighNumberOfDatabaseExclusiveDeadlocks

alert: MongoDBAtlasHighNumberOfDatabaseExclusiveDeadlocks
annotations:
  description: The number of database exclusive-lock deadlocks occurring on node {{$labels.instance}}
    in cluster {{$labels.cl_name}} is {{printf "%.0f" $value}} which is above the
    threshold of 10.
  summary: There is a high number of database exclusive deadlocks occurring.
expr: |
  sum without(cl_role,process_port,rs_nm,rs_state) (increase(mongodb_locks_Database_deadlockCount_W[5m])) > 10
for: 5m
labels:
  severity: warning

MongoDBAtlasHighNumberOfDatabaseIntentExclusiveDeadlocks

alert: MongoDBAtlasHighNumberOfDatabaseIntentExclusiveDeadlocks
annotations:
  description: The number of database intent-exclusive-lock deadlocks occurring on
    node {{$labels.instance}} in cluster {{$labels.cl_name}} is {{printf "%.0f" $value}}
    which is above the threshold of 10.
  summary: There is a high number of database intent-exclusive deadlocks occurring.
expr: |
  sum without(cl_role,process_port,rs_nm,rs_state) (increase(mongodb_locks_Database_deadlockCount_w[5m])) > 10
for: 5m
labels:
  severity: warning

MongoDBAtlasHighNumberOfDatabaseSharedDeadlocks

alert: MongoDBAtlasHighNumberOfDatabaseSharedDeadlocks
annotations:
  description: The number of database shared-lock deadlocks occurring on node {{$labels.instance}}
    in cluster {{$labels.cl_name}} is {{printf "%.0f" $value}} which is above the
    threshold of 10.
  summary: There is a high number of database shared deadlocks occurring.
expr: |
  sum without(cl_role,process_port,rs_nm,rs_state) (increase(mongodb_locks_Database_deadlockCount_R[5m])) > 10
for: 5m
labels:
  severity: warning

MongoDBAtlasHighNumberOfDatabaseIntentSharedDeadlocks

alert: MongoDBAtlasHighNumberOfDatabaseIntentSharedDeadlocks
annotations:
  description: The number of database intent-shared-lock deadlocks occurring on node
    {{$labels.instance}} in cluster {{$labels.cl_name}} is {{printf "%.0f" $value}}
    which is above the threshold of 10.
  summary: There is a high number of database intent-shared deadlocks occurring.
expr: |
  sum without(cl_role,process_port,rs_nm,rs_state) (increase(mongodb_locks_Database_deadlockCount_r[5m])) > 10
for: 5m
labels:
  severity: warning

MongoDBAtlasHighNumberOfSlowNetworkRequests

alert: MongoDBAtlasHighNumberOfSlowNetworkRequests
annotations:
  description: The number of DNS and SSL operations taking more than 1 second to complete
    on node {{$labels.instance}} in cluster {{$labels.cl_name}} is {{printf "%.0f"
    $value}} which is above the threshold of 10.
  summary: There is a high number of slow network requests.
expr: |
  sum without (cl_role,rs_nm,rs_state,process_port) (increase(mongodb_network_numSlowSSLOperations[5m])) + sum without (cl_role,rs_nm,rs_state,process_port) (increase(mongodb_network_numSlowDNSOperations[5m])) > 10
for: 5m
labels:
  severity: warning

MongoDBAtlasDiskSpaceLow

alert: MongoDBAtlasDiskSpaceLow
annotations:
  description: The amount of hardware disk space being used on node {{$labels.instance}}
    in cluster {{$labels.cl_name}} is {{printf "%.0f" $value}}% which is above the
    threshold of 90%.
  summary: Hardware is running out of disk space.
expr: |
  100 * ((sum without (disk_name) (hardware_disk_metrics_disk_space_used_bytes)) / clamp_min((sum without (disk_name) (hardware_disk_metrics_disk_space_used_bytes)) + (sum without (disk_name) (hardware_disk_metrics_disk_space_free_bytes)), 1)) > 90
for: 5m
labels:
  severity: warning
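
The disk-space expression above computes used / (used + free) as a percentage per node. A rough way to sanity-check it is a promtool rule unit test. The sketch below assumes the rule lives in a rules file named mongodb-atlas-alerts.yaml; that file name, the test file name, and all sample series values and label values are made up for illustration.

```yaml
# disk-space-test.yaml (hypothetical file name)
# Run with: promtool test rules disk-space-test.yaml
rule_files:
  - mongodb-atlas-alerts.yaml   # hypothetical rules file containing MongoDBAtlasDiskSpaceLow

evaluation_interval: 1m

tests:
  - interval: 1m
    input_series:
      # Made-up data: 95 bytes used and 5 bytes free, held constant for 15 minutes,
      # so the expression evaluates to 100 * 95 / (95 + 5) = 95.
      - series: 'hardware_disk_metrics_disk_space_used_bytes{instance="node-1", cl_name="demo", disk_name="disk0"}'
        values: '95+0x15'
      - series: 'hardware_disk_metrics_disk_space_free_bytes{instance="node-1", cl_name="demo", disk_name="disk0"}'
        values: '5+0x15'
    alert_rule_test:
      # At 10m the condition has held longer than the 5m "for" window.
      - eval_time: 10m
        alertname: MongoDBAtlasDiskSpaceLow
        exp_alerts:
          - exp_labels:
              severity: warning
              instance: node-1
              cl_name: demo
            exp_annotations:
              summary: "Hardware is running out of disk space."
              description: "The amount of hardware disk space being used on node node-1 in cluster demo is 95% which is above the threshold of 90%."
```

The `disk_name` label is absent from the expected labels because the expression aggregates it away with `sum without (disk_name)`.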

MongoDBAtlasSlowHardwareIO

alert: MongoDBAtlasSlowHardwareIO
annotations:
  description: The latency time for read and write I/Os on node {{$labels.instance}}
    in cluster {{$labels.cl_name}} is {{printf "%.0f" $value}} seconds which is above
    the threshold of 3 seconds.
  summary: Read and write I/Os are taking too long to complete.
expr: |
  (sum without (disk_name) (increase(hardware_disk_metrics_read_time_milliseconds[5m])) + sum without (disk_name) (increase(hardware_disk_metrics_write_time_milliseconds[5m]))) / 1000 > 3
for: 5m
labels:
  severity: warning

MongoDBAtlasHighNumberOfTimeoutElections

alert: MongoDBAtlasHighNumberOfTimeoutElections
annotations:
  description: The number of elections being called due to the primary node timing
    out in replica set {{$labels.rs_nm}} in cluster {{$labels.cl_name}} is {{printf
    "%.0f" $value}} which is above the threshold of 10.
  summary: There is a high number of elections being called due to the primary node
    timing out.
expr: |
  sum without (cl_role,process_port,instance,rs_state) (increase(mongodb_electionMetrics_electionTimeout_called[5m])) > 10
for: 5m
labels:
  severity: warning

Dashboards

The following dashboards are generated from mixins and hosted on GitHub: