APM Blog


Powerful disk alerting on an enterprise scale

Dynatrace blog

Detecting anomalies at the disk level isn’t as simple as it may first seem. Hosts can include several disk types, such as a boot disk, a disk for log storage, and possibly a disk for business data storage. The right alerting thresholds also depend greatly on your organization’s strategy for data handling. While alerting on low disk space doesn’t make sense for fixed-size boot disk images, it makes perfect sense for disks that contain business-critical data.

With the ability to set fine-grained rules for disk-anomaly alerts, you can intelligently control your disk thresholds on an enterprise scale. Setting a single threshold for a specific disk on a single host may be simple enough, but such an approach isn’t practical when you’re dealing with thousands of hosts, each with multiple disks.

With the latest release, Dynatrace introduces several enhancements that enable you to fine-tune your disk alerting thresholds on an enterprise scale. The newly introduced global disk alerting rules are based on disk name patterns and tags for host groups. By specifying a global disk rule, you now have the ability to fine-tune disk alerting for all disks across your enterprise.

Define automatic disk-detection rules

To define automatic disk alerting settings, navigate to Settings > Anomaly detection > Infrastructure, where you’ll find the new automatic disk alerting settings. With the Detect low disk space automatically setting enabled, Dynatrace detects low disk space when the amount of free space on any monitored disk falls below 3% of overall capacity.
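To make the 3% rule concrete, here is a minimal Python sketch of the check described above. It is not Dynatrace code: the function name and parameters are hypothetical, and only the 3% default comes from the post.

```python
def is_low_disk_space(free_bytes: int, total_bytes: int,
                      threshold_pct: float = 3.0) -> bool:
    """Return True when free space is below threshold_pct of total capacity.

    Hypothetical helper for illustration; the 3% default mirrors the
    automatic setting described in the post.
    """
    if total_bytes <= 0:
        raise ValueError("total_bytes must be positive")
    # Compare cross-multiplied values to avoid floating-point division noise.
    return free_bytes * 100 < threshold_pct * total_bytes


if __name__ == "__main__":
    # A 500 GB disk with 10 GB free has 2% free space.
    print(is_low_disk_space(10 * 10**9, 500 * 10**9))
```

The comparison is strict, so a disk with exactly 3% free space would not be flagged in this sketch. Whether Dynatrace treats the boundary the same way is not stated in the post.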

Define custom disk-detection rules

You can define custom disk anomaly detection rules to distinguish between different types of disks within your environment. Disk anomaly detection rules contain filters for disk names and tagged hosts. You can select relevant disks by defining disk-name patterns. You can then define a specific threshold-alert setting for each group of tagged hosts. A typical use case is to define strict thresholds for production machines while accepting more relaxed thresholds for development hosts.

The example custom disk-alerting rule overrides the global disk-alerting thresholds for disks whose name includes the string /var and whose host is tagged with DynatraceDemo. Both conditions must be met for this alerting rule to be applied.
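The matching logic of such a rule can be sketched in a few lines of Python. This is an illustration only, not the Dynatrace implementation: the `DiskRule` class, the 10% threshold value, and the first-match-wins ordering are all assumptions. Only the /var name filter, the DynatraceDemo tag, and the 3% default come from the post.

```python
from dataclasses import dataclass
from typing import Iterable, Set


@dataclass(frozen=True)
class DiskRule:
    """Hypothetical model of a custom disk-alerting rule."""
    name_contains: str      # disk-name filter, e.g. "/var"
    host_tag: str           # host-tag filter, e.g. "DynatraceDemo"
    low_space_pct: float    # threshold applied when the rule matches


def rule_applies(rule: DiskRule, disk_name: str, host_tags: Set[str]) -> bool:
    # Both conditions must hold: name filter AND host tag.
    return rule.name_contains in disk_name and rule.host_tag in host_tags


def effective_threshold(rules: Iterable[DiskRule], disk_name: str,
                        host_tags: Set[str], default_pct: float = 3.0) -> float:
    # Assumption: the first matching rule overrides the global default.
    for rule in rules:
        if rule_applies(rule, disk_name, host_tags):
            return rule.low_space_pct
    return default_pct


if __name__ == "__main__":
    # The 10.0 threshold is an illustrative value, not taken from the post.
    rules = [DiskRule("/var", "DynatraceDemo", 10.0)]
    print(effective_threshold(rules, "/var/log", {"DynatraceDemo"}))
    print(effective_threshold(rules, "/var/log", {"staging"}))
```

A disk named /var/log on a host without the DynatraceDemo tag falls back to the global default here, which mirrors the requirement that both filters match.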

The disk-name pattern and host-tag filters enable you to quickly create powerful alerting rules for enterprise-scale environments that may include thousands of disks—without setting thresholds for individual disks.

Once saved, the example custom disk-alerting rule detailed above appears in the list of custom rules.

For smaller environments, you may only need to set specific thresholds for individual disks on a host. To facilitate this, Dynatrace now offers disk-specific thresholds on the host level. From any Host overview page, click the browse button and select Settings. Here you can set specific thresholds for the selected host.
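Conceptually, host-level settings behave like a per-host, per-disk lookup with a fallback. The Python sketch below shows that idea under stated assumptions: the host names, disk names, and 15% value are made up, and the fallback to a 3% default is an assumption about precedence that the post does not spell out.

```python
from typing import Dict

DEFAULT_LOW_SPACE_PCT = 3.0  # automatic default mentioned in the post

# Hypothetical host-level overrides: host name -> disk name -> threshold (%)
HOST_OVERRIDES: Dict[str, Dict[str, float]] = {
    "web-01": {"/data": 15.0},
}


def threshold_for(host: str, disk: str,
                  overrides: Dict[str, Dict[str, float]],
                  default_pct: float = DEFAULT_LOW_SPACE_PCT) -> float:
    """Return the disk-specific threshold for a host, or the default."""
    return overrides.get(host, {}).get(disk, default_pct)


if __name__ == "__main__":
    print(threshold_for("web-01", "/data", HOST_OVERRIDES))
    print(threshold_for("web-01", "/boot", HOST_OVERRIDES))
```

A flat lookup like this works for a handful of hosts. For thousands of disks, the pattern- and tag-based rules described earlier avoid listing every disk individually.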

The introduction of global disk alerting rules based on disk-name filters and host-tag filters is a big step forward in Dynatrace anomaly detection for disk storage. Global detection-rule settings offer a quick and powerful approach to overriding default thresholds for individual disks—even within environments with thousands of disks spread across numerous hosts. The newly introduced disk rules allow you to fine-tune the automatic disk-anomaly detection approach. So, head over to your environment’s Host anomaly detection settings page and get started with your own detection rules.

The post Powerful disk alerting on an enterprise scale appeared first on Dynatrace blog.
