Configuring Cluster Monitoring
To monitor the jobs in your cluster (your code), use the web console and the APC client, along with any third-party tools you use to instrument your applications, such as AppDynamics or New Relic.
To monitor the system (our code), Apcera supports both internal and external monitoring using third-party tools. Internal monitoring monitors the components of the Apcera system. External monitoring monitors the hosts running key components of the system.
To configure internal monitoring (Zabbix), you populate the web_hostnames field of the chef.apzabbix section of the cluster.conf.erb (Terraform) or cluster.conf file. If you are using Terraform, configure monitoring in the cluster.conf.erb file; otherwise, configure monitoring in the cluster.conf file. The following examples demonstrate both types of configuration.
NOTE: To configure monitoring alerts using Zabbix, you must set the attribute chef.continuum.cluster_name in the cluster.conf to get a persistent value for the cluster-name.
Monitoring configuration example
The following configuration example shows the base cluster.conf settings for cluster monitoring using Zabbix.
chef: {
  "apzabbix": {
    "disable_default_site": true,
    "cachesize": "500MB",
    "web_hostnames": ["zabbix.mycluster.apcera-platform.io"],
    "sslonly": true,
    "db": {
      "hostport": "localhost:5432",
      "master_user": "postgres",
      "master_pass": "DB_ADMIN_PASSWORD",
      "zdb_user": "zabbix",
      "zdb_pass": "DB_USER_PASSWORD"
    },
    "users": {
      "guest": { "user": "guest", "pass": "GUEST_USER_PASSWORD" },
      "admin": { "user": "admin", "pass": "ADMIN_USER_PASSWORD", "delete": false }
    }
  } # apzabbix
}
If you are using Terraform, some of the monitoring parameters are auto-populated from other sources. Refer to the instructions for your platform.
Monitoring configuration parameters
The following table describes each of the monitoring configuration parameters that you can set.
Parameter | Description |
---|---|
disable_default_site | Default and recommended is true. |
cachesize | The default Zabbix cachesize is 32MB. You can change this by setting the value in GB, MB, KB, or Byte. In the above example it is set to 500MB. |
web_hostnames | Used by Nginx to define the virtual hosts server_name values. The URL you provide is the URL to the Zabbix monitoring console. The Zabbix URL can be a subdomain of the cluster, or it can be any name you choose as long as you put it into the chef.apzabbix.web_hostnames field and publish a DNS entry for that name and the IP of the server. Examples of valid URLs: http://<subdomain>.clustermon.<cluster-name>.<tld>; http://<cluster-name>.clustermon.<cluster-domain>.<tld>; http://<auniquename>.example.com. |
sslonly | By default Zabbix monitoring uses HTTPS. Set to false to disable (not recommended). |
db.hostport | Zabbix monitoring requires a Postgres DB backend. You populate this value with the host name and port of the monitoring DB. If chef.apzabbix.db.hostport is set to localhost:5432 as in the above example, the zabbix-database is installed on the same host as the zabbix-server. To specify an external database, such as Amazon RDS, use "hostport": "abcdefghijklm.nopqrstuvwzyx.us-west-2.rds.amazonaws.com:5432", where the value is taken from the Outputs tab > MonitoringPostgresEndpoint resource in the AWS console. |
db.master_user | The DB administrator name for the Postgres monitoring database. The default is "apcera_ops". |
db.master_pass | The DB admin password for the Postgres monitoring database. Use a string that does not require URL escaping (that is, one that does not include the @, /, or \ characters). |
db.zdb_user | The monitoring user name for the Postgres monitoring database. The default is "zabbix". |
db.zdb_pass | A new password that you define here for the monitoring database user. Use a string that does not require URL escaping (see db.master_pass). |
users.guest | Takes a user name and a URL-safe password that you enter here. The default user name is "monitoring". |
users.admin | Takes a user name and a URL-safe password that you enter here. The default user name is "admin". |
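
The password and hostport constraints in the table can be checked before you edit the cluster configuration. The following Python sketch is illustrative only: the helper names (is_url_safe, split_hostport, check_apzabbix) are hypothetical and not part of any Apcera tooling, and it assumes the apzabbix settings have already been loaded into a plain dictionary.

```python
from typing import Dict, List, Tuple

# Characters the parameter table says to avoid in passwords.
URL_UNSAFE = set("@/\\")
UNSAFE_MSG = "contains @, / or a backslash"


def is_url_safe(password: str) -> bool:
    """Return True if the password contains none of the listed characters."""
    return not (set(password) & URL_UNSAFE)


def split_hostport(hostport: str) -> Tuple[str, int]:
    """Split a 'host:port' string such as 'localhost:5432' into its parts."""
    host, sep, port = hostport.rpartition(":")
    if not sep or not host or not port.isdigit():
        raise ValueError("expected host:port, got %r" % hostport)
    return host, int(port)


def check_apzabbix(cfg: Dict) -> List[str]:
    """Collect human-readable problems found in an apzabbix-style dict."""
    problems = []
    db = cfg.get("db", {})
    try:
        split_hostport(db.get("hostport", ""))
    except ValueError as err:
        problems.append("db.hostport: %s" % err)
    for key in ("master_pass", "zdb_pass"):
        if not is_url_safe(db.get(key, "")):
            problems.append("db.%s: %s" % (key, UNSAFE_MSG))
    for name, user in cfg.get("users", {}).items():
        if not is_url_safe(user.get("pass", "")):
            problems.append("users.%s.pass: %s" % (name, UNSAFE_MSG))
    return problems
```

An empty list from check_apzabbix means that none of these particular checks failed; it does not validate the rest of the configuration.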
Configuring Monitoring email alerts
You can configure the cluster to send notifications of Zabbix-triggered monitoring alerts to an email address, a pager address, or both. The email block is used to configure email alerts from Zabbix. The pagerduty entry is used to send alert notifications to a pager.
For example, using cluster.conf:
chef: {
  "apzabbix": {
    "disable_default_site": true,
    "cachesize": "500MB",
    "web_hostnames": ["clustermon.mycluster.example.com"],
    "sslonly": true,
    "db": {
      "hostport": "MonitoringPostgresEndpoint.us-west-2.rds.amazonaws.com:5432",
      "master_user": "acme_ops",
      "master_pass": "DB_ADMIN_PASSWORD",
      "zdb_user": "zabbix",
      "zdb_pass": "DB_USER_PASSWORD"
    },
    "users": {
      "guest": { "user": "monitoring", "pass": "PASSWORD" },
      "admin": { "user": "admin", "pass": "PASSWORD", "delete": false }
    },
    "pagerduty": {
      "key": "API-ACCESS-TOKEN-GOES-HERE"
    },
    "email": {
      "sendto": "target@example.com",
      "smtp_server": "localhost",
      "smtp_helo": "localhost",
      "smtp_email": "from@example.com"
    }
  } # apzabbix
}
Or using cluster.conf.erb:
chef: {
  "apzabbix": {
    ...
    "pagerduty": {
      "key": "API-ACCESS-TOKEN-GOES-HERE"
    },
    "email": {
      "sendto": "target@example.com",
      "smtp_server": "localhost",
      "smtp_helo": "localhost",
      "smtp_email": "from@example.com"
    }
  } # apzabbix
}
Parameter | Description |
---|---|
pagerduty.key | The API key that PagerDuty generates when you create a PagerDuty service. The pagerduty block is optional; without it, no alerts are sent through PagerDuty. To enable alerts through PagerDuty, create a service of type 'Zabbix' in PagerDuty and add its API key here. |
email.sendto | The recipient email address. The sendto field supports a single email address. |
email.smtp_server | The host name of the SMTP server. |
email.smtp_helo | The email server common name (typically the same as smtp_server). |
email.smtp_email | The email address of the sender. |
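
Before you commit the email settings, it can help to confirm that the mail path accepts a message built from the same values. The following Python sketch is one way to do that from the monitoring host. It assumes the fields map onto an SMTP conversation in the usual way (smtp_server is the host to connect to, smtp_helo is the name announced in HELO/EHLO, smtp_email is the sender, sendto is the recipient). The function names and the subject text are hypothetical.

```python
import smtplib
from email.message import EmailMessage


def build_test_message(email_cfg: dict) -> EmailMessage:
    """Build a test message from an 'email' block (sendto, smtp_email)."""
    msg = EmailMessage()
    msg["From"] = email_cfg["smtp_email"]
    msg["To"] = email_cfg["sendto"]
    msg["Subject"] = "Test: monitoring alert delivery"  # made-up subject
    msg.set_content("Test message for the configured alert mail path.")
    return msg


def send_test_message(email_cfg: dict) -> None:
    """Send the test message through smtp_server, announcing smtp_helo."""
    # local_hostname is the name smtplib presents in HELO/EHLO.
    with smtplib.SMTP(email_cfg["smtp_server"],
                      local_hostname=email_cfg["smtp_helo"]) as smtp:
        smtp.send_message(build_test_message(email_cfg))
```

Calling send_test_message with the same four values used in the email block opens a real SMTP connection, so run it only on a host that is allowed to reach the mail server.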
NOTE: The subject of the email alerts is {TRIGGER.STATUS}: <cluster-name> - {HOST.NAME1} - {TRIGGER.NAME}. The format of the email message is as follows. See also the macros documented in the Zabbix manual.
name:{TRIGGER.NAME}
id:{TRIGGER.ID}
status:{TRIGGER.STATUS}
hostname:{HOSTNAME}
ip:{IPADDRESS}
value:{TRIGGER.VALUE}
event_id:{EVENT.ID}
severity:{TRIGGER.SEVERITY}
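
Because each line of the message body is a key:value pair, a receiving script can turn an alert into a dictionary with very little code. The following Python sketch shows one possible approach. The function name is hypothetical, and it assumes the macros have already been expanded by Zabbix by the time the message arrives.

```python
def parse_alert_body(body: str) -> dict:
    """Parse 'key:value' lines from an alert email body into a dict."""
    fields = {}
    for line in body.splitlines():
        # Split on the first colon only, so values may contain colons.
        key, sep, value = line.partition(":")
        if sep:  # ignore lines that have no colon
            fields[key.strip()] = value.strip()
    return fields


# Made-up sample values, for illustration only.
sample = "name:Disk full: /var\nstatus:PROBLEM\nhostname:host-1\nseverity:High\n"
alert = parse_alert_body(sample)
```

Splitting on the first colon keeps values such as trigger names intact even when they contain a colon themselves.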
Postfix
You can configure the chef.postfix section of the cluster.conf to define a site-specific mail relay. This may be necessary if the Zabbix server in your environment cannot send email directly and you need a relay host for email notifications.
Here is the configuration section you need to include:
chef: {
  # Define a site specific mail relay, if necessary
  "postfix": {
    "main": {
      "relayhost": "10.0.0.50"
    }
  }
}
For example, with the following configuration, the file /etc/postfix/main.cf contains relayhost = foo.example.com.
chef: {
  "postfix": {
    "main": {
      "relayhost": "foo.example.com"
    }
  }
}
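
After the configuration is applied, you can confirm that the relay setting reached /etc/postfix/main.cf. The following Python sketch reads the relayhost value from the text of a main.cf file. The function name is hypothetical, and the parser is deliberately simple: it skips comments and continuation lines and does not expand $name references.

```python
from typing import Optional


def read_relayhost(main_cf_text: str) -> Optional[str]:
    """Return the relayhost value from main.cf text, or None if unset."""
    value = None
    for line in main_cf_text.splitlines():
        # Lines starting with whitespace continue the previous parameter.
        if line[:1].isspace():
            continue
        stripped = line.strip()
        if not stripped or stripped.startswith("#"):
            continue
        name, sep, rest = stripped.partition("=")
        if sep and name.strip() == "relayhost":
            value = rest.strip()  # the last definition in the file wins
    return value
```

For example, passing the contents of /etc/postfix/main.cf on the monitoring host should return foo.example.com for the configuration above.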
Monitoring Google Auth
You can configure your cluster.conf file to instruct Zabbix to monitor Google Auth.