Configuring Cluster Monitoring

To monitor the jobs in your cluster (your code), use the web console and the APC client, along with any third-party tools you use to instrument your applications, such as AppDynamics or New Relic.

To monitor the system (our code), Apcera supports both internal and external monitoring using third-party tools. Internal monitoring covers the components of the Apcera system. External monitoring covers the hosts running key components of the system.

To configure internal monitoring (Zabbix), populate the web_hostnames field of the chef.apzabbix section of the cluster.conf.erb (Terraform) or cluster.conf file. If you are using Terraform, configure monitoring in the cluster.conf.erb file; otherwise, use the cluster.conf file. The following examples demonstrate both types of configuration.

NOTE: To configure monitoring alerts using Zabbix, you must set the attribute chef.continuum.cluster_name in the cluster.conf to get a persistent value for the cluster-name.

Monitoring configuration example

The following configuration example shows the base cluster.conf settings for cluster monitoring using Zabbix.

chef: {
  "apzabbix": {
    "disable_default_site": true,
    "cachesize": "500MB",
    "web_hostnames": ["zabbix.mycluster.apcera-platform.io"],
    "sslonly": true,

    "db": {
      "hostport": "localhost:5432",
      "master_user": "postgres",
      "master_pass": "DB_ADMIN_PASSWORD",
      "zdb_user": "zabbix",
      "zdb_pass": "DB_USER_PASSWORD"
    },
    "users": {
      "guest": { "user": "guest", "pass": "GUEST_USER_PASSWORD" },
      "admin": { "user": "admin", "pass": "ADMIN_USER_PASSWORD", "delete": false }
    }
  } # apzabbix
}

If you are using Terraform, some of the monitoring parameters are auto-populated from other sources. Refer to the instructions for your platform.

Monitoring configuration parameters

The following table describes each of the monitoring configuration parameters that you can set.

| Parameter | Description |
|---|---|
| disable_default_site | The default, and the recommended value, is true. |
| cachesize | The default Zabbix cache size is 32MB. You can change this by setting the value in GB, MB, KB, or Byte. In the above example it is set to 500MB. |
| web_hostnames | Used by Nginx to define the server_name values of the virtual hosts. The URL you provide is the URL to the Zabbix monitoring console. The Zabbix URL can be a subdomain of the cluster, or it can be any name you choose, as long as you put it into the chef.apzabbix.web_hostnames field and publish a DNS entry that maps that name to the IP of the server. Example valid URL schemes: `http://<subdomain>.clustermon.<cluster-name>.<tld>`; `http://<cluster-name>.clustermon.<cluster-domain>.<tld>`; `http://<auniquename>.example.com`. |
| sslonly | By default Zabbix monitoring uses HTTPS. Set to false to disable (not recommended). |
| db.hostport | Zabbix monitoring requires a Postgres DB backend. Populate this value with the host name and port of the monitoring DB. If chef.apzabbix.db.hostport is set to localhost:5432, as in the above example, the zabbix-database is installed on the same host as the zabbix-server. To specify an external database, such as Amazon RDS: "hostport": "abcdefghijklm.nopqrstuvwzyx.us-west-2.rds.amazonaws.com:5432", where the value is taken from the Outputs tab > MonitoringPostgresEndpoint resource in the AWS console. |
| db.master_user | The DB administrator name for the Postgres monitoring database. For EE the default is "apcera_ops". For CE the default is "postgres". |
| db.master_pass | The DB administrator password for the Postgres monitoring database. Use a string that does not require URL escaping (that is, one that does not include the @, /, or \ characters). |
| db.zdb_user | The monitoring user name for the Postgres monitoring database. The default is "zabbix". |
| db.zdb_pass | A new password that you define here for the monitoring user. Use a string that does not require URL escaping (see db.master_pass). |
| users.guest | Takes a user name and a URL-safe password that you enter here. On EE the default user name is "monitoring"; on CE it is "guest". |
| users.admin | Takes a user name and a URL-safe password that you enter here. The default user name is "admin". |
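
The password parameters above ask for strings that do not need URL escaping. As a minimal sketch of one way to produce such values, the following Python helpers check a candidate password for the three characters the table names (@, /, and \) and generate a random password from letters and digits only. The function names are hypothetical, and restricting the alphabet to letters and digits is a conservative assumption rather than a documented requirement.

```python
import secrets
import string

# Characters the parameter table says to avoid because they need URL escaping.
UNSAFE_CHARS = set("@/\\")


def needs_url_escaping(password: str) -> bool:
    """Return True if the password contains @, /, or a backslash."""
    return any(ch in UNSAFE_CHARS for ch in password)


def generate_password(length: int = 24) -> str:
    """Generate a random password using letters and digits only (assumed safe set)."""
    alphabet = string.ascii_letters + string.digits
    return "".join(secrets.choice(alphabet) for _ in range(length))


if __name__ == "__main__":
    candidate = generate_password()
    print(candidate, needs_url_escaping(candidate))
```

You could use a generated value for db.master_pass, db.zdb_pass, and the pass fields under users, then store it wherever you keep cluster secrets.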

Configuring monitoring email alerts

You can configure the cluster to send notifications of Zabbix-triggered monitoring alerts to an email address, a pager address, or both. The email block configures email alerts from Zabbix. The pagerduty entry sends alert notifications to a pager.

For example, using cluster.conf:

chef: {
  "apzabbix": {
    "disable_default_site": true,
    "cachesize": "500MB",
    "web_hostnames": ["clustermon.mycluster.example.com"],
    "sslonly": true,
    "db": {
      "hostport": "MonitoringPostgresEndpoint.us-west-2.rds.amazonaws.com:5432",
      "master_user": "acme_ops",
      "master_pass": "DB_ADMIN_PASSWORD",
      "zdb_user": "zabbix",
      "zdb_pass": "DB_USER_PASSWORD"
    },
    "users": {
      "guest": { "user": "monitoring", "pass": "PASSWORD" },
      "admin": { "user": "admin", "pass": "PASSWORD", "delete": false }
    },
    "pagerduty": {
      "key": "API-ACCESS-TOKEN-GOES-HERE"
    },
    "email": {
      "sendto": "target@example.com",
      "smtp_server": "localhost",
      "smtp_helo": "localhost",
      "smtp_email": "from@example.com"
    }
  } # apzabbix
}

Or using cluster.conf.erb:

chef: {
  "apzabbix": {
    ...
    "pagerduty": {
      "key": "API-ACCESS-TOKEN-GOES-HERE"
    },
    "email": {
      "sendto": "target@example.com",
      "smtp_server": "localhost",
      "smtp_helo": "localhost",
      "smtp_email": "from@example.com"
    }
  } # apzabbix
}

| Parameter | Description |
|---|---|
| pagerduty.key | The API key that PagerDuty outputs when you create a PagerDuty service. This block is optional; without it, no alerts are sent via PagerDuty. To enable alerts via PagerDuty, create a service in PagerDuty of type 'Zabbix' and add the API key. |
| email.sendto | The recipient email address. The sendto field supports a single email address. |
| email.smtp_server | The hostname of the SMTP server. |
| email.smtp_helo | The email server common name (typically the same as smtp_server). |
| email.smtp_email | The email address of the sender. |
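
Before relying on alert delivery, it can help to confirm that the SMTP server accepts mail with the values you plan to put in the email block. The following Python sketch is an independent check, not part of Apcera or Zabbix: it builds a test message from the four email parameters and sends it with the standard smtplib module. The function names and the test subject line are hypothetical, and port 25 is an assumption, because the email block does not specify a port.

```python
import smtplib
from email.message import EmailMessage

# Values mirror the "email" block in the example; replace them with your own.
EMAIL_SETTINGS = {
    "sendto": "target@example.com",
    "smtp_server": "localhost",
    "smtp_helo": "localhost",
    "smtp_email": "from@example.com",
}


def build_test_message(settings: dict) -> EmailMessage:
    """Build a plain-text test message from the email block values."""
    msg = EmailMessage()
    msg["From"] = settings["smtp_email"]
    msg["To"] = settings["sendto"]
    msg["Subject"] = "Test message for monitoring alert delivery"
    msg.set_content("If you can read this, the SMTP settings accept mail.")
    return msg


def send_test_message(settings: dict, port: int = 25) -> None:
    """Send the test message; raises an smtplib exception or OSError on failure."""
    msg = build_test_message(settings)
    # local_hostname is the name smtplib presents in its HELO/EHLO greeting.
    with smtplib.SMTP(settings["smtp_server"], port,
                      local_hostname=settings["smtp_helo"], timeout=10) as smtp:
        smtp.send_message(msg)
```

Calling send_test_message(EMAIL_SETTINGS) from the host that runs the Zabbix server gives the closest approximation of the path an alert would take.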

NOTE: The subject of the email alerts is {TRIGGER.STATUS}: <cluster-name> - {HOST.NAME1} - {TRIGGER.NAME}. The format of the email message is as follows. See also the macros documented in the Zabbix manual.

name:{TRIGGER.NAME}
id:{TRIGGER.ID}
status:{TRIGGER.STATUS}
hostname:{HOSTNAME}
ip:{IPADDRESS}
value:{TRIGGER.VALUE}
event_id:{EVENT.ID}
severity:{TRIGGER.SEVERITY}
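
Because the message body is a list of key:value lines, it is straightforward to process alerts in a script, for example in a mail filter. The following Python sketch parses a body in the format shown above into a dictionary. The function name and all sample values are hypothetical; in a real alert, Zabbix substitutes each macro with its actual value.

```python
def parse_alert_body(body: str) -> dict:
    """Parse 'key:value' lines from an alert email body into a dict."""
    fields = {}
    for line in body.splitlines():
        line = line.strip()
        if not line or ":" not in line:
            continue  # skip blank lines and lines without a key
        # Split on the first colon only, since values may contain colons.
        key, value = line.split(":", 1)
        fields[key.strip()] = value.strip()
    return fields


# Hypothetical alert body with the macros already expanded.
sample = """\
name:Disk space low
id:13491
status:PROBLEM
hostname:host-01
ip:10.0.0.12
value:1
event_id:8812
severity:High
"""

alert = parse_alert_body(sample)
print(alert["status"], alert["severity"])
```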

Postfix

You can configure the chef.postfix section of the cluster.conf to define a site-specific mail relay. This may be necessary in an environment where the Zabbix server is unable to send email directly and you need a relay host for email notification.

Here is the configuration section you need to include:

chef: {
  # Define a site specific mail relay, if necessary
  "postfix": {
    "main": {
      "relayhost": "10.0.0.50"
    }
  }
}

For example, with the following configuration, the file /etc/postfix/main.cf contains relayhost = foo.example.com.

chef: {
  "postfix": {
    "main": {
      "relayhost": "foo.example.com"
    }
  }
}
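
To confirm that the relay setting reached the host, you can inspect /etc/postfix/main.cf after deployment. The following Python sketch extracts the relayhost value from the text of a main.cf file. The function name and the sample contents are hypothetical, and the parser is deliberately simplified: it skips comments and continuation lines and does not expand $name references.

```python
from typing import Optional


def find_relayhost(main_cf_text: str) -> Optional[str]:
    """Return the last relayhost value in main.cf text, or None if it is not set."""
    relayhost = None
    for line in main_cf_text.splitlines():
        # Skip blank lines, continuation lines (leading whitespace), and comments.
        if not line.strip() or line[0].isspace() or line.lstrip().startswith("#"):
            continue
        name, sep, value = line.partition("=")
        if sep and name.strip() == "relayhost":
            relayhost = value.strip()  # a later definition overrides an earlier one
    return relayhost


# Hypothetical main.cf contents matching the example configuration.
sample = "# site-specific mail relay\nmyhostname = zabbix.example.com\nrelayhost = foo.example.com\n"
print(find_relayhost(sample))
```

Pass the function the contents of /etc/postfix/main.cf from the monitoring host. On a host with Postfix installed, running postconf relayhost reports the same setting.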

Monitoring Google Auth

You can configure your cluster.conf file to instruct Zabbix to monitor Google Auth.