Securing Cluster Secrets

This section describes the Apcera Platform secret store and how to configure it.

About cluster secrets

The Apcera Platform is a distributed system that requires a number of cluster secrets, including passwords and private keys. To secure these secrets, Apcera provides a secret store. Secrets secured in the secret store include the following:

Secret                                  Description
Component class keys                    Automatically generated private keys for inter-cluster communications among components via NATS
Component DB password                   For PostgreSQL DB access
Key value store token                   For Consul access
Identity provider shared secrets        Secrets required to integrate with an external identity provider, for example the administrator password for an LDAP provider. LDAP, AD, Google Auth, and Keycloak are supported.
Certificates and Keys for HTTPS Routes  SSL certs and keys provided by you when setting up job routes.
Encrypted file system keys              Keys created for encrypting data at rest for containers.

Secret store architecture

The Apcera secret store uses two third-party components to encrypt and store cluster secrets: HashiCorp Vault and Consul. Vault encrypts the secrets, and Consul stores them (in encrypted form) for high availability. The two components are tightly coupled: both must run on the same central host within the same datacenter. Note that for new clusters Consul is also the component store, replacing the Postgres-based component database previously used for that purpose.

Vault operations are straightforward. During deployment the system inserts secrets into Vault where they are encrypted and stored in Consul. At runtime each Apcera component or job connects to Vault and fetches the secrets it needs to operate. Vault gets the secrets from Consul, decrypts them and passes them to the calling component.
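The write and read paths described above can be illustrated with a toy model. This is only a sketch: ToyVault and ToyKVStore are hypothetical stand-ins, not the Vault or Consul APIs, and the keystream cipher is a placeholder that must not be used for real secrets. It shows only that the key value store ever sees encrypted bytes, while callers get plaintext back from the Vault layer.

```python
import hashlib
import secrets


class ToyKVStore:
    """Hypothetical stand-in for Consul: holds only opaque, encrypted bytes."""

    def __init__(self):
        self._data = {}

    def put(self, key, blob):
        self._data[key] = blob

    def get(self, key):
        return self._data[key]


class ToyVault:
    """Hypothetical stand-in for Vault: encrypts on write, decrypts on read."""

    def __init__(self, store):
        self._store = store
        self._key = secrets.token_bytes(32)  # illustrative master key

    def _keystream(self, nonce, length):
        # Toy keystream built from SHA-256 in counter mode; NOT real cryptography.
        out = b""
        counter = 0
        while len(out) < length:
            block = self._key + nonce + counter.to_bytes(4, "big")
            out += hashlib.sha256(block).digest()
            counter += 1
        return out[:length]

    def write(self, path, secret):
        # Deployment time: encrypt the secret, then persist it in the KV store.
        nonce = secrets.token_bytes(16)
        plain = secret.encode("utf-8")
        stream = self._keystream(nonce, len(plain))
        cipher = bytes(a ^ b for a, b in zip(plain, stream))
        self._store.put(path, nonce + cipher)

    def read(self, path):
        # Runtime: fetch the encrypted blob, decrypt it, return it to the caller.
        blob = self._store.get(path)
        nonce, cipher = blob[:16], blob[16:]
        stream = self._keystream(nonce, len(cipher))
        return bytes(a ^ b for a, b in zip(cipher, stream)).decode("utf-8")
```

In this model a component would call read("component/db-password") at startup, and an operator inspecting the key value store directly would see only the nonce and ciphertext.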

For example, instead of storing secrets in the cluster.conf file, Orchestrator lets you collect secrets via user input. These secrets are encrypted by Vault and stored in Consul, then retrieved from Vault by the requesting component when needed. Another example is the set of keys used for protecting NATS message integrity. These keys are stored in Vault and retrieved from Vault by each Apcera component when the cluster is deployed. Secrets for jobs, such as SSL certs for HTTPS routes and data-at-rest encryption keys, are likewise stored in Vault.

Cluster passphrase

When you deploy or upgrade a 3.x cluster, you are prompted for the cluster passphrase. The cluster passphrase is the Vault login credential that is used by the system to authenticate with Vault. The cluster passphrase must be at least 8 characters in length. A one-way, non-reversible hash of the cluster passphrase is stored in Vault, but not the passphrase itself.
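The idea of keeping only a one-way hash can be sketched as follows. The 8-character minimum comes from the text above; everything else is an assumption made for illustration. In particular, the source does not say which hash Vault uses, so PBKDF2-HMAC-SHA256 with a random salt is a stand-in, and the function names are hypothetical.

```python
import hashlib
import hmac
import os

MIN_PASSPHRASE_LENGTH = 8  # minimum length stated in this section
ITERATIONS = 100_000       # assumed work factor, chosen for illustration


def hash_passphrase(passphrase, salt=None):
    """Return (salt, digest); the passphrase itself is never stored."""
    if len(passphrase) < MIN_PASSPHRASE_LENGTH:
        raise ValueError("cluster passphrase must be at least 8 characters")
    if salt is None:
        salt = os.urandom(16)
    digest = hashlib.pbkdf2_hmac(
        "sha256", passphrase.encode("utf-8"), salt, ITERATIONS
    )
    return salt, digest


def verify_passphrase(candidate, salt, expected_digest):
    """Re-hash the candidate and compare digests in constant time."""
    digest = hashlib.pbkdf2_hmac(
        "sha256", candidate.encode("utf-8"), salt, ITERATIONS
    )
    return hmac.compare_digest(digest, expected_digest)
```

Because only the salt and digest are kept, a lost passphrase cannot be recovered from what is stored, which is why it has to be kept safely elsewhere.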

The first time you deploy a cluster you establish the cluster passphrase. On subsequent deployments you must provide the cluster passphrase to proceed with any deployment operation; that is, any time you run the orchestrator-cli deploy command you must provide the cluster passphrase. You must also provide the cluster passphrase when using the orchestrator-cli security command (see Secret store command-line interface).

The cluster passphrase is not recoverable by Apcera. If you forget or lose your passphrase you will not be able to deploy or update your cluster. Your only recourse would be to destroy your cluster via terraform destroy. For this reason, you must securely store the cluster passphrase separately from Apcera. Apcera recommends that you use a third-party password management tool such as LastPass to secure the cluster passphrase.

Default cluster passphrase

For clusters where security is not a concern, such as test or demo clusters, you can use the default cluster passphrase by specifying the --nopass option when you deploy a cluster or run the orchestrator-cli security command. For example, the following deploys a release using the --nopass option.

orchestrator-cli deploy -c cluster.conf --release-bundle "release-3.0.0.tar.gz" --nopass

If you use the --nopass option on initial cluster deployment, you must provide it on all subsequent deployments and whenever you run the orchestrator-cli security command.

The default cluster passphrase (--nopass) is insecure and should not be used for production clusters.

Similarly, the --non-interactive flag instructs the system that the user is not required to acknowledge certain password prompts.

orchestrator-cli deploy -c cluster.conf --release-bundle "release-3.0.0.tar.gz" --non-interactive

The --non-interactive option assumes that you want to use the default password. If you want to enable --non-interactive mode with a custom password, set --nopass=false.
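The interaction between the two flags can be summarized as a small decision function. This is a sketch of the behavior as described in this section, not the orchestrator-cli implementation; the function name and the use of None for "flag not given" are assumptions.

```python
from typing import Optional


def uses_default_passphrase(
    non_interactive: bool = False, nopass: Optional[bool] = None
) -> bool:
    """Decide whether the default cluster passphrase would be used.

    nopass is None when --nopass was not given, True for --nopass,
    and False for --nopass=false.
    """
    if nopass is not None:
        # An explicit --nopass or --nopass=false always wins.
        return nopass
    # Without an explicit value, --non-interactive implies --nopass.
    return non_interactive
```

For example, uses_default_passphrase(non_interactive=True, nopass=False) models a non-interactive deploy that still uses a custom passphrase.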

Updating cluster passphrase

You may need to update the cluster passphrase under the following circumstances:

  • The cluster was set up with the default passphrase (insecure) and you now want to use a custom passphrase.
  • The cluster was set up with a custom passphrase that you now want to update, for example, because the passphrase is compromised or you have passphrase rotation requirements.

To update the cluster passphrase, use the command orchestrator-cli security changepwd -c cluster.conf. You will be prompted to enter the existing passphrase.

If you are using the default passphrase and you want to change to use a custom passphrase, run the above command using the --nopass option. For example:

$ orchestrator-cli security changepwd -c cluster.conf --nopass
Using default cluster passphrase
Provide a new cluster passphrase at least 8 characters in length.
new Cluster Passphrase:

Using stdin to input the cluster passphrase

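If you do not want the cluster passphrase to appear on screen or to be typed at an interactive prompt, for example when scripting a deployment, you can use the --input-json flag. With this flag, the command reads a JSON structure containing the answers to its prompts, including the cluster passphrase, from standard input (stdin).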
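A script might build that JSON structure as sketched below. The three attribute names are the ones listed in the --input-json help text in this document; the helper name and its validation are hypothetical additions for illustration.

```python
import json

# Attribute names taken from the --input-json help text in this document.
SUPPORTED_KEYS = ("clusterPassphrase", "configPassphrase", "newClusterPassphrase")


def build_input_json(**answers):
    """Serialize prompt answers for piping to a command run with --input-json."""
    unknown = sorted(set(answers) - set(SUPPORTED_KEYS))
    if unknown:
        raise ValueError("unsupported attribute(s): " + ", ".join(unknown))
    return json.dumps(answers)


# Hypothetical usage: the resulting string would be written to the stdin of
# the orchestrator-cli process, for example with subprocess.run(..., input=payload).
payload = build_input_json(clusterPassphrase="pass@word1")
```

Reading the passphrase from a password manager or an environment variable before calling the helper avoids typing it on the command line at all.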

Secret store command-line interface

You use the orchestrator-cli security command to manage the cluster secret store and other facets of security. By default this command performs the status subcommand. The orchestrator-cli security --help option provides command usage information. Note that you will need to provide your cluster passphrase to access the orchestrator-cli security command options (or include the --nopass flag if you used that option when deploying your cluster).

Usage: orchestrator-cli security [args]

The 'security' command is used manage the cluster secret store and other
facets of security. By default it performs the `status` subcommand. With the
`refreshtokens` subcommand, it will unseal vault and refresh the app tokens
in the cluster which the components use in order to connect to Vault.

Subcommands:

  changepwd     - Provide new cluster passphrase.
  unseal        - Unseal any sealed vault servers following a reboot.
  refreshtokens - Unseal any sealed vault servers and refresh the app tokens.
  reset         - Clear saved CA and vault certificates allowing deploy to recreate.
  status        - Check input passphrase and report status of vault server.


Command options:

 -c, --config FILE     - The cluster configuration file to describe the settings
                         that the cluster should reflect.

 --non-interactive     - Indicates that the user is not required to acknowledge certain
                         warning messages and permits the command to proceed (warning messages
                         are still displayed). This option implies --nopass so if the default
                         cluster passphrase should not be set then --nopass=false must
                         also be specified.

 --nopass              - Indicates that the system is using the default value as the cluster
                         passphrase. This option is a convinience for testing and demo systems,
                         THIS OPTION IS INSECURE AND NOT RECOMMENDED FOR PRODUCTION SYSTEMS.

 --input-json          - Indicates that stdin will provide a json structure with answers to prompts.
                         Supported attributes are "clusterPassphrase", "configPassphrase" and
                         "newClusterPassphrase".
                         Example: echo '{"clusterPassphrase":"pass@word1"}' | orchestrator-cli security --input-json ...

 --chef-server-log FILE     - Cause the internal chef-server to write to the given logfile.
 --chef-client-log FILE     - Cause the chef-client to write to the given logfile.
 --martini-debug-log FILE   - Cause the martini logs to be written to a logfile.

Notes

The unseal subcommand is implied by the status and refreshtokens subcommands.

The reset subcommand resets the certificates used for the TLS connections that Orchestrator and the platform components make to Vault. These HTTPS connections protect secrets in transit from Vault to the various requesters. A CA cert is created on the first deploy, and a server cert and key are issued from it for Vault. On each redeploy a new Vault server cert is issued from the original CA cert. The reset subcommand performs a deeper renewal by clearing the CA cert so that a new one is created on the next deploy, which you should run immediately after reset to complete the process.
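For scripted use, the command line for the security command can be assembled programmatically. The sketch below follows the argument order of the changepwd example earlier in this section (subcommand first, then -c and flags); the helper name is hypothetical, and it covers only the subcommands and two of the flags listed in the usage text.

```python
# Subcommands as listed in the usage text above.
SUBCOMMANDS = ("changepwd", "unseal", "refreshtokens", "reset", "status")


def security_argv(config, subcommand="status", nopass=False, input_json=False):
    """Build an argument list for the orchestrator-cli security command."""
    if subcommand not in SUBCOMMANDS:
        raise ValueError("unknown subcommand: " + subcommand)
    argv = ["orchestrator-cli", "security", subcommand, "-c", config]
    if nopass:
        argv.append("--nopass")      # default passphrase; insecure
    if input_json:
        argv.append("--input-json")  # answers to prompts arrive on stdin
    return argv
```

Returning a list rather than a single string keeps the arguments safe to pass to a process launcher without shell quoting.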

Deploying Vault

New Apcera clusters deployed with release 3.0 will automatically have Vault and Consul enabled. You do not need to do anything other than provide the cluster passphrase during deployment.

If you want to upgrade your existing Apcera cluster to release 3.0, before you upgrade the cluster software you will need to add the Vault and Consul components to your cluster.conf file, then deploy the cluster using the 3.0 release.

The Consul component is called kv-store in cluster.conf.

To deploy Vault and Consul for the purposes of upgrading your cluster to Apcera Platform release 3.0, complete the following steps:

  1. Open the cluster.conf.erb file for editing.

  2. In the machines section, add the vault and kv-store components to the central host.

    For example:

     machines: {
       ...
       central: {
         hosts: ['10.0.0.252', '10.0.1.80', '10.0.2.49']
         suitable_tags: [
           "component-database"
           "api-server"
           "job-manager"
           "router"
           "package-manager"
           "stagehand"
           "cluster-monitor"
           "health-manager"
           "metrics-manager"
           "nats-server"
           "events-server"
           "auth-server"
           "ldap-auth-server"
           "google-auth-server"
           "vault"
           "kv-store"
         ]
       }
       ...
     }
    
  3. In the components section, add the vault and kv-store components and specify the count for each. The kv-store count must be an odd number: 1, 3, 5, etc. Typically the component count for both vault and kv-store matches the number of central hosts you are deploying (unless you have an even number of centrals, which is not common).

    For a minimum production deployment, the Vault and Consul component count is 3. For example:

     components: {
       ...
       # Central Components
       component-database: 3
               api-server: 3
              job-manager: 3
                   router: 3
                 kv-store: 3
                    vault: 3          
          package-manager: 3
           health-manager: 3
          metrics-manager: 3
              nats-server: 3
            events-server: 3
          cluster-monitor: 1
              auth-server: 3
         ldap-auth-server: 3
       google-auth-server: 3
       ...
     }
    
  4. Generate the updated cluster.conf and redeploy the cluster.

    For example:

     erb cluster.conf.erb > cluster.conf
    
     orchestrator-cli deploy -c cluster.conf --release-bundle "release-3.0.0-1a2b3c4.tar.gz"
    
  5. Enter and confirm the cluster passphrase.

  6. Monitor Vault and Consul.

    Log in to Zabbix and verify that Vault and Consul are running.
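The component counts from step 3 can be checked before you redeploy. The sketch below is a hypothetical pre-flight helper, not part of orchestrator-cli. As background not stated in this document: Consul servers reach agreement by majority quorum, so an even server count tolerates no more failures than the odd count below it, which is the usual reason for the odd-number rule.

```python
def quorum(servers):
    """Majority needed among the given number of servers."""
    return servers // 2 + 1


def tolerated_failures(servers):
    """How many servers can fail while a majority remains."""
    return servers - quorum(servers)


def check_counts(central_hosts, kv_store, vault):
    """Return (errors, warnings) for the vault and kv-store component counts."""
    errors, warnings = [], []
    if kv_store < 1 or kv_store % 2 == 0:
        errors.append("kv-store count must be an odd number (1, 3, 5, ...)")
    if vault < 1:
        errors.append("vault count must be at least 1")
    # Matching the number of central hosts is typical, not mandatory.
    if kv_store != central_hosts:
        warnings.append("kv-store count differs from the number of central hosts")
    if vault != central_hosts:
        warnings.append("vault count differs from the number of central hosts")
    return errors, warnings
```

With the example configuration above, check_counts(3, 3, 3) reports no errors or warnings, and tolerated_failures shows that 3 and 4 servers both survive the loss of a single server.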

Prompting for Secrets

Optionally you can also add prompts to your cluster.conf for storing supported secrets.

See Configuring Secret Prompts.

Troubleshooting Vault

This section provides troubleshooting tips for deploying and using Vault.

The following error message means you did not enter the proper cluster passphrase.

Provide passphrase for cluster: ******
Unable to login to secret store; this may result in deploy failure later. Error: invalid credentials

The cluster passphrase is the Vault login credential. Orchestrator does not store it anywhere, so to validate the passphrase Orchestrator must be able to reach Vault and authenticate against it.

The following error message means either that a previous deploy failed or that Vault is down.

Failed to connect to Secret Store; Continuing might lead to a failed deploy and you might be prompted for secrets again. Do you wish to continue? [y/n]:

If it is the former, you can proceed. If it is the latter, log in to Zabbix to confirm that Vault is down.