Configuring Cluster Monitoring

To monitor the jobs in your cluster (your code), use the web console and the APC client, along with any third-party tools you use to instrument your applications, such as AppDynamics or New Relic.

To monitor the system (our code), Apcera supports both internal and external monitoring using third-party tools. Internal monitoring covers the components of the Apcera system. External monitoring covers the hosts running key components of the system.

To configure internal monitoring (Zabbix), populate the web_hostnames field of the chef.apzabbix section of the cluster.conf.erb file (Terraform) or the cluster.conf file. The following examples demonstrate both types of configuration. If you are using Terraform, configure monitoring in the cluster.conf.erb file; otherwise, configure monitoring in the cluster.conf file.

NOTE: To configure monitoring alerts using Zabbix, you must set the attribute chef.continuum.cluster_name in the cluster.conf file so that the cluster name has a persistent value.

Terraform config example

  "apzabbix": {
    "db": {
# TERRAFORM OUTPUT: monitoring-database-address
      "hostport": "<%= `terraform output monitoring-database-address`.chomp %>:5432",
      "master_user": "apcera_ops",
# TERRAFORM OUTPUT: monitoring-database-master-password
      "master_pass": "<%= `terraform output monitoring-database-master-password`.chomp %>",
      "zdb_user": "zabbix",
      "zdb_pass": "YOUR_PASSWORD_HERE"
    },
    "users": {
      "guest": { "user": "monitoring", "pass": "YOUR_PASSWORD_HERE" },
      "admin": { "user": "Admin", "pass": "YOUR_PASSWORD_HERE", "delete": false }
    },
    "web_hostnames": ["monitoring.clustername.example.com"],

    # To enable alerts via PagerDuty, create a service in PagerDuty of type 'Zabbix'
    # and insert the API key here
    "pagerduty": {
      "key": "API-ACCESS-TOKEN-GOES-HERE"
    },
    "email": {
      "sendto": "target@example.com",
      "smtp_server": "localhost",
      "smtp_helo": "localhost",
      "smtp_email": "from@example.com"
    }
  }

Cluster.conf example

If you are using Terraform, do not edit this file; see the Terraform example above instead.

  "apzabbix": {
    "db": {
      "hostport": "abcdefghijklm.nopqrstuvwzyx.us-west-2.rds.amazonaws.com:5432",
      "master_user": "apcera_ops",
      "master_pass": "YOUR_PASSWORD_HERE",
      "zdb_user": "zabbix",
      "zdb_pass": "YOUR_PASSWORD_HERE"
    },
    "users": {
      "guest": { "user": "monitoring", "pass": "YOUR_PASSWORD_HERE" },
      "admin": { "user": "admin", "pass": "YOUR_PASSWORD_HERE", "delete": false }
    },
    "web_hostnames": ["clustermon.clustername.tld"],
    "PF_pagerduty": {
      "key": "API-ACCESS-TOKEN-GOES-HERE"
    },
    "PF_email": {
      "sendto": "monitoring-events@example.com",
      "smtp_server": "localhost",
      "smtp_helo": "localhost",
      "smtp_email": "zabbix@example.com"
    }
  }

Monitoring config parameters

The following sections describe each monitoring configuration parameter that you can set.

db.hostport

Zabbix monitoring requires a Postgres database backend, which can be installed on the same host as the zabbix-server (internal) or hosted on RDS (external). Whether the zabbix-database is internal or external is determined by the Postgres connection string provided in cluster.conf for the chef.apzabbix.db.hostport parameter. If this value is set to localhost:5432, the zabbix-database is internal. Otherwise, as shown in the cluster.conf example, an RDS instance is used.

  • For Terraform, the hostport is generated for you.
  • For cluster.conf, populate this value with the host name and port of the monitoring server.
  • For AWS, get the value from the Outputs tab > MonitoringPostgresEndpoint resource.
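The internal versus external rule described above can be sketched as a small check. This is a minimal illustration in Python, not part of the Apcera tooling: the function name zabbix_db_location is hypothetical, and it assumes that only the literal host "localhost" counts as internal, as the text describes.

```python
def zabbix_db_location(hostport: str) -> str:
    """Classify a db.hostport value as "internal" or "external".

    Assumption: only the host "localhost" means the database runs on the
    zabbix-server host; any other host is treated as external (for example RDS).
    """
    host, sep, port = hostport.rpartition(":")
    if not sep or not host or not port.isdigit():
        raise ValueError(f"expected host:port, got {hostport!r}")
    return "internal" if host == "localhost" else "external"


print(zabbix_db_location("localhost:5432"))
print(zabbix_db_location("abcdefghijklm.nopqrstuvwzyx.us-west-2.rds.amazonaws.com:5432"))
```

The first call reports an internal database and the second an external one, which matches the two cases in the paragraph above.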

db.master_user

The database administrator name for the Postgres monitoring database. The default is "apcera_ops".

db.master_pass

  • For Terraform, the master_pass is auto-populated from the terraform.tfvars file, which you edit.
  • For cluster.conf, the value is a password you set here.
  • For AWS, get the value from the base.json file for the PostgreSQL Monitoring DB.

For the password, use a string that does not require URL escaping; that is, one that does not include the @, /, or \ characters.
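One way to check a candidate password before putting it in the configuration is to see whether URL-encoding changes it. The sketch below is an assumption-laden helper, not an Apcera utility: the name is_url_safe_password is hypothetical, and the test is stricter than the three characters named above, because it rejects any character that Python's urllib.parse.quote would percent-encode.

```python
from urllib.parse import quote


def is_url_safe_password(password: str) -> bool:
    """Return True if the password would survive URL encoding unchanged.

    quote(..., safe="") percent-encodes everything except ASCII letters,
    digits, and a few punctuation characters such as "-", "_", and ".".
    This is a conservative superset of the @, /, and \\ restriction.
    """
    return bool(password) and quote(password, safe="") == password


print(is_url_safe_password("s3cret-Pass_word"))
print(is_url_safe_password("p@ss/word"))
```

The same check applies to db.zdb_pass and to the users.guest and users.admin passwords, which also need to be URL-safe.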

db.zdb_user

The monitoring user name for the Postgres monitoring database. The default is "zabbix".

db.zdb_pass

For db.zdb_pass, set a new password of your choosing. Use a string that does not require URL escaping (see the note under db.master_pass).

users.guest

The users.guest parameter takes a user name and a URL-safe password that you enter here. The default user name is "monitoring".

users.admin

The users.admin parameter takes a user name and a URL-safe password that you enter here. The default user name is "admin".

web_hostnames

The web_hostnames field is used by nginx to define the server_name values of the virtual host. The URL you provide is the URL of the Zabbix monitoring console.

The Zabbix URL can be a subdomain of the cluster, or it can be any name you choose, as long as you add it to the chef.apzabbix.web_hostnames field in the cluster.conf file and publish a DNS entry that maps that name to the IP address of the server.

Example URL schemes:

http://<subdomain>.clustermon.<cluster-name>.<tld>

http://<cluster-name>.clustermon.<cluster-domain>.<tld>

http://<auniquename>.example.com
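Before sharing the console URL, it can help to confirm that the DNS entry described above resolves to the monitoring server. The following is a rough sketch only: the helper name dns_points_to, the host name, and the IP address are illustrative assumptions, and the check covers IPv4 only because it relies on socket.gethostbyname.

```python
import socket


def dns_points_to(hostname, expected_ip, resolve=socket.gethostbyname):
    """Return True if hostname resolves to expected_ip (IPv4 only).

    The resolve parameter exists so the lookup can be replaced in tests;
    by default it performs a real DNS query.
    """
    try:
        return resolve(hostname) == expected_ip
    except OSError:
        # socket.gaierror (name not found) is a subclass of OSError.
        return False


# Hypothetical values; substitute your own web_hostnames entry and server IP.
# dns_points_to("monitoring.clustername.example.com", "203.0.113.10")
```

A False result means either the name is not published yet or it points somewhere other than the expected address.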

pagerduty.key

The pagerduty.key value is generated by PagerDuty when you create a PagerDuty service. This block is optional; without it, no alerts are sent via PagerDuty.

email

The email block is used to configure email alerts from Zabbix. Email alerts are optional and can be configured in lieu of or in addition to PagerDuty alerts. Macros are documented in the Zabbix manual. Customization is possible on request.

The email > "sendto" field supports a single email address. You can add more using the Terraform console.

The subject of the email alerts is {TRIGGER.STATUS}: <cluster-name> - {HOST.NAME1} - {TRIGGER.NAME}. The format of the email message is as follows:

name:{TRIGGER.NAME}
id:{TRIGGER.ID}
status:{TRIGGER.STATUS}
hostname:{HOSTNAME}
ip:{IPADDRESS}
value:{TRIGGER.VALUE}
event_id:{EVENT.ID}
severity:{TRIGGER.SEVERITY}
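To get a feel for what an alert looks like once the macros are filled in, the sketch below substitutes sample values into the subject and message templates shown above. Zabbix performs the real expansion; the expand helper and every sample value here are made up for illustration.

```python
import re

SUBJECT = "{TRIGGER.STATUS}: <cluster-name> - {HOST.NAME1} - {TRIGGER.NAME}"
BODY = "\n".join([
    "name:{TRIGGER.NAME}",
    "id:{TRIGGER.ID}",
    "status:{TRIGGER.STATUS}",
    "hostname:{HOSTNAME}",
    "ip:{IPADDRESS}",
    "value:{TRIGGER.VALUE}",
    "event_id:{EVENT.ID}",
    "severity:{TRIGGER.SEVERITY}",
])


def expand(template, values):
    """Replace each {MACRO} with its sample value; leave unknown macros as-is."""
    return re.sub(
        r"\{([A-Z0-9_.]+)\}",
        lambda m: values.get(m.group(1), m.group(0)),
        template,
    )


# Sample values only; real values come from the Zabbix event.
sample = {
    "TRIGGER.NAME": "Disk space low",
    "TRIGGER.ID": "13491",
    "TRIGGER.STATUS": "PROBLEM",
    "HOST.NAME1": "host-01",
    "HOSTNAME": "host-01",
    "IPADDRESS": "10.0.0.12",
    "TRIGGER.VALUE": "1",
    "EVENT.ID": "98765",
    "TRIGGER.SEVERITY": "High",
}

print(expand(SUBJECT, sample))
print(expand(BODY, sample))
```

The <cluster-name> placeholder in the subject is not a Zabbix macro, so the sketch leaves it untouched.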

Postfix

You can configure the chef.postfix section of the cluster.conf file to define a site-specific mail relay. This may be necessary in an environment where the Zabbix server cannot send email directly and you need a relay host for email notifications.

Here is the configuration section you need to include:

chef: {
  # Define a site specific mail relay, if necessary
  "postfix": {
    "main": {
      "relayhost": "10.0.0.50"
    }
  }
}

For example, with the following configuration, the file /etc/postfix/main.cf contains relayhost = foo.example.com.

chef: {
  "postfix": {
    "main": {
      "relayhost": "foo.example.com"
    }
  }
}
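The mapping from the postfix "main" settings to lines in /etc/postfix/main.cf can be pictured with a short sketch. This is only an illustration of the key = value relationship described above; the actual file is written by the cluster's Chef tooling, and the render_main_cf helper is hypothetical.

```python
def render_main_cf(main_settings):
    """Render postfix "main" settings as `key = value` lines."""
    return "\n".join(f"{key} = {value}" for key, value in main_settings.items())


# Mirrors the chef.postfix section from the example above.
config = {"postfix": {"main": {"relayhost": "foo.example.com"}}}

print(render_main_cf(config["postfix"]["main"]))  # prints: relayhost = foo.example.com
```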

Monitoring Google Auth

You can configure your cluster.conf file to instruct Zabbix to monitor Google Auth.