Implementing naming conventions using labels in checkmk

Most IT organizations have strict naming conventions for their hosts (servers, network devices and others) representing valuable information about:

  • country
  • region
  • customer
  • application
  • server
  • client
  • network
  • environment (test/production)

This list might be longer or any kind of subset of these attributes. As a sample I use my devices available in my home environment. (A review of my naming convention seems to be very urgent…)

Often naming conventions offer the ability to understand the usage of a given device, the location, or whatever. This information is also required to make your monitoring journey a success.

checkmk offers two key features to attach information of this kind to discovered hosts, labels and tags. In my opinion, the most valuable advantage of labels, compared to tags, is the ability to assign labels dynamically, without any dependency to folders or other mechanism.

This qualifies labels to dynamically apply “attributes” to hosts, using the currently valid naming conventions.

My home naming conventions are:

The following things I’d like to identify for monitoring:

  • the owner of the system
  • the type of the system (virtual system or physical)
  • the application, which is hosted on the system
  • the classification of the physical systems (e.g.: always online or not)

To do so, follow these steps in checkmk:

First Step: Open the setup dialog

  1. Type “host labels” in the search area
  2. In the navigation bar click “Setup”
  3. Click on host labels to enter the rules area for labels

Second Step: Use the “Create rule in folder” button
As naming conventions should apply across all systems in my monitoring environment, I create these rules in the “Main directory”.


Fill in your settings

  1. A description of your Rule
  2. The label you want to attach to this set of devices (how labels work)
  3. Select explicit hosts, to write down a regular expression, selecting the focused host names. The “~” indicates, that a regular expression will follow.
  4. Don’t forget to save your changes.

After repeating the above steps for all of my weak naming convention entries, I have the following rule set.

Please note, that my label names follow some conventions, to make sure, that there is no chaos introduced in the setup of checkmk.

Summary:

Labels can be implied dynamically to the hosts in your checkmk monitoring environment. Tags can’t be applied in this way. Labels can be used later on, while defining monitoring rules, views, dashboard, filters and so on. Labels are a good mechanism to group hosts in different dimensions.

Embedding Monitoring into Existing Organizations

To ensure the success of an IT Monitoring project implementation in the long-run, the monitoring should be embedded into the existing organization with clear responsibilities and roles for each stakeholder.

In the slide below a sample organization is shown, with a typical matrix organization. This is often seen in enterprises.

There are four groups in focus regarding the monitoring deployment:

  • Users
    Most enterprises earn their money with these group of people. They expect a running application, performing well and supporting them getting their job done. These could be internal or external users or both.
  • Business
    This group are the stakeholders of the business processes and focus on getting things done, to maximize the contribution to the companies, they work for.
  • IT Application
    These people design, code, test and deploy the required applications.
  • IT Infrastructure
    This group of people are responsible to provide and run a reliable IT infrastructure and architecture for the future, including internal and external cloud platforms.

These definitions are samples, seen with different customers over the last years. Other organizations my have other setups working for them very well.

The point I want to highlight is, how to embed the monitoring into such an organization:

  • Support Level 1
    While the business support organization supports the questions from the user’s using the applications and processes supported. Often the product catalog is in focus, the order process itself, the configuration of product and so on. The questions are business driven rather than IT technical. IT technical question are rerouted to the IT support desk.
    The IT support desk is in charge for all questions, regarding IT related objects. This includes assistance for program usage help, user login issues, performance issues and so on.
  • Support Level 2
    This function is covered by an operations center. This organization keeps IT services up and running, monitors the state of key resources, applications, network and all other stuff making up a well performing IT environment. They are in charge to initiate requests to level 3 if problems arise, they can’t handle by themselves. Regarding monitoring, these people are the power user of the monitoring solution. All acquired data is used here, to provide a comprehensive overview about the IT status and to enable decisions, how to handle given situations.
  • Support Level 3
    Level 3 has several teams, contributing to this mission. Developers and system programmers have to collaborate to get solutions for serious problems arising while operating the application.
    • System programmers are in charge to fix problems with hard- and software packages and their configuration.
    • Developers have to deal with issues originating from self-developed application code.Today, these roles are often combined in so called DevOps teams. These teams are responsible to perform deep dive analysis in case of application errors, including log analysis, detailed performance measurement and threshold definition. They also have to continually develop monitoring thresholds to increase alarm accuracy. They install new monitoring tools and integrate these into the existing solution. DevOps teams always keep in mind, that any new technology requires also a new monitoring review and potentially a new monitoring component.

Monitoring should be embedded into IT management processes, best described in ITIL. Incident management, problem management and change management are the disciplines in focus. For more details on monitoring and ITIL see the Process Symphony Knowledge Base.

As I described in my blog entry Finding the best Monitoring Solution, monitoring is a process rather than a status. It is a never ending iteration of requirement review, software upgrade, solution design and execution. That’s why it is so important to embed monitoring into IT organization’s daily business.