Using Orchestrator to Deploy Apcera Clusters

This documentation describes how to use the Orchestrator tool. Refer to the installation instructions for your platform for directions on using Orchestrator to deploy Apcera to your chosen environment.

Orchestrator Overview

You use Orchestrator to install, deploy, and update your Apcera cluster. You configure Orchestrator using the cluster.conf configuration file populated for your environment.

You use the orchestrator-cli to perform the following cluster management functions:

  • Install and update cluster components based on the cluster configuration file (cluster.conf)
  • Scale and update cluster components based on an updated cluster.conf or new software release or both
  • Manage cluster deployments, including machine status, SSH, backup/restore, reboot, and cluster node data

Orchestrator is provided as a virtual machine (VM) appliance that you import into your deployment environment. The Orchestrator host includes the Orchestrator server and a PostgreSQL database. You can obtain the Orchestrator image from the Apcera Support Portal.

When you run the orchestrator-cli deploy command, Orchestrator reads the release manifest, reviews the configuration file (cluster.conf), verifies machine states (for existing clusters), and generates a list of the actions it will take to deploy the cluster. It then deploys each cluster component identified in the configuration file, using a clone of the base VM image as a host for one or more cluster components.

See also upgrading and scaling a cluster.

Orchestrator versions

This section briefly describes Orchestrator versions. See also the release notes.

Latest promoted release

Orchestrator 0.5.3 is the latest promoted release.

You must upgrade to this version to take advantage of its features. See updating Orchestrator software.

Older Orchestrator versions

Orchestrator 0.5.2 included bug fixes.

Orchestrator 0.4.3 included the following changes:

  • Support for semantic versioning and the build-id for the installation process.
  • Changes the --update-latest-release option to update only within patch releases.

For example, if you are running Apcera Platform version 2.0.0 and version 3.0.0 is promoted at the same time as 2.0.1, then version 2.0.1 is installed. In this case, version 3.0.0 requires a manual and explicit upgrade, as would version 2.2.0.
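The patch-only update rule can be sketched as follows. This is a hypothetical helper for illustration only, not part of orchestrator-cli:

```python
# Sketch of the patch-only compatibility rule used by --update-latest-release.
# Hypothetical helper; version strings are assumed to be "major.minor.patch".
def is_patch_update(deployed: str, candidate: str) -> bool:
    """Return True only if candidate differs from deployed in the patch part."""
    d_major, d_minor, d_patch = (int(x) for x in deployed.split("."))
    c_major, c_minor, c_patch = (int(x) for x in candidate.split("."))
    return (c_major, c_minor) == (d_major, d_minor) and c_patch >= d_patch

print(is_patch_update("2.0.0", "2.0.1"))  # True: updated automatically
print(is_patch_update("2.0.0", "3.0.0"))  # False: requires explicit upgrade
print(is_patch_update("2.0.0", "2.2.0"))  # False: requires explicit upgrade
```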

If you are using a legacy Orchestrator version (older than version 0.4.3), Apcera strongly recommends that you update the Orchestrator tool to the latest version. See Updating Orchestrator software.

Using the Orchestrator CLI

To use the Orchestrator tool, you use the orchestrator-cli executable, which is based on APC.

Usage: orchestrator-cli COMMAND [command-specific-options]

Global flags:

  -h, --help  - View a specific command's help.

Subcommands:

  backup      - Backup the orchestrator
  clusterinfo - View Orchestrator and deployed release versions
  collect     - Collect logs from cluster nodes
  deploy      - Provision or deploy updates to an Apcera cluster
  init        - Initialize the database used by the Orchestrator
  prune-chef  - Clean up nodes registered with chef
  reboot      - Reboot machines in an Apcera cluster
  releaseinfo - View information of a list of available releases
  repair      - Repair machines in an Apcera cluster
  restore     - Restore the orchestrator
  ssh         - Open an ssh session to a cluster node
  status      - Display cluster node status information
  teardown    - Used to completely teardown an Apcera cluster
  version     - View the Orchestrator's current version

backup

Usage: orchestrator-cli backup --config cluster.conf

See Using Orchestrator Backup and Restore.

clusterinfo

Usage: orchestrator-cli clusterinfo [args]

Shows the Apcera platform build number from the deployed release, and optionally node information.

Command options:

 --show-nodes               - Indicates that node information should be displayed.

 --json                     - Show results in JSON format.

Examples:

Base command:

orchestrator@orchestrator:~$ orchestrator-cli clusterinfo
Apcera Platform Build: 2:2.4.0apc48 (build: 88dde20)

Show cluster nodes:

orchestrator-cli clusterinfo --show-nodes

Show cluster nodes in JSON output:

orchestrator-cli clusterinfo --show-nodes --json

Using jq to extract node information from output:

orchestrator-cli clusterinfo --show-nodes --json | jq '.nodes'
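If jq is not available, the JSON output can be parsed with a short script. This is a sketch only; the exact JSON shape is an assumption (the jq filter '.nodes' above implies a top-level "nodes" key):

```python
import json

# Sketch: extract node entries from the JSON output of
# `orchestrator-cli clusterinfo --show-nodes --json`.
# The field names below are assumptions, not a documented schema.
sample_output = '{"build": "2.4.0", "nodes": [{"ip": "10.0.0.4", "name": "f2afba9c"}]}'

info = json.loads(sample_output)
for node in info.get("nodes", []):
    print(node["ip"], node["name"])
```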

collect

Usage: orchestrator-cli collect logs [args]

The 'collect logs' command is used to open an SSH session on one of the nodes provisioned by the Orchestrator and collect the core component logs.

It takes a selector as its argument, which can be one of the following:

  • The name of the box provisioned, usually a short-form UUID.
  • One of the Chef tags applied to the box.
  • The 'all' keyword.

Command options:

 --from PATH                - Directories or files to be collected. Can be comma, semicolon or space separated. For example: "/var/log/continuum-*/current,/tmp/chef-*"

The Orchestrator will query its list of nodes for potential matches. If one is found, it will drop you right into an ssh session as the "ops" user on that box. It will also generate an ephemeral known hosts file that is passed to SSH, so the host should already be recognized; proceed carefully if it is not.

If multiple systems match the query, then it will return a list of the matches and prompt for which box to connect to.

The 'all' argument will log in to and collect logs from all cluster nodes.

The 'collect' command relies on the 'ops-key' being already added to the user's ssh-agent, and the ssh-agent being accessible.

If the --from flag is not specified, logs are collected from /var/log/continuum-*/current, /tmp/orchestrator-*, and /tmp/chef-*.

See also pulling the logs.

deploy

The deploy command is used to provision and update the machines in the cluster based on the cluster configuration.

Usage: orchestrator-cli deploy --config cluster.conf [args]

Command options:

 -c, --config FILE          - The cluster configuration file to describe the
                              settings that the cluster should reflect.
                              Required.

 --release RELEASE          - Used to specify a particular release to deploy,
                              instead of the most recently used release.

 --release-base-url URL     - The base URL to use for retrieving release
                              metadata.

 --release-bundle PATH      - Provide a path to a release bundle. Overrides
                              --release and --release-base-url.

 --concurrency NUM          - Sets the number of concurrent actions that the
                              orchestrator will perform against the cluster at
                              the same time. (default: 5)

 --dry                      - Trigger the orchestrator into "dry run" mode,
                              where it evaluates the changes that need to be
                              made and generates an image representing the
                              dependency tree of the operations. The image is
                              written to graph.png.

 --update-latest-release    - Indicates that the orchestrator should retrieve
                              the latest compatible release. Two releases are
                              compatible only if their versions differ in the
                              'patch' part of the version number.

 --update-name              - Indicates that the cluster name should be updated
                              based on the name given in the cluster
                              configuration file.

 --skip-deploy              - Indicates that the normal "Deploy" step, where
                              the individual components are all deployed and
                              updated, should be skipped. This is useful when
                              you want to take only the machine provisioning
                              actions.

 --machines                 - Indicates the set of machines on which the deploy
                              operation will be performed when doing a cluster
                              update. Machine names should be provided as a
                              comma-separated list. These names can also be
                              config names, such as Central or
                              instance_manager. This is useful when the deploy
                              needs to be limited to specific machines.
                              Recipes with the * tag are not skipped and are
                              executed on every machine. DO NOT use this flag
                              together with the --repair flag.

 --tags                     - Indicates the set of tags to deploy when doing a
                              cluster update. Tag names should be provided as a
                              comma-separated list. This is useful when the
                              deploy only needs to run specific tags. If it is
                              not set, the deploy runs all tags.

 --repair                   - Indicates that the machine state check and auto
                              repair step should be performed during the
                              cluster update. Machines with state errors are
                              unconfigured and removed during repair prior to
                              the "Deploy" step.

 --non-interactive          - Indicates that the user is not required to
                              acknowledge certain warning messages and permits
                              the "Deploy" step to proceed. Warning messages
                              are still displayed.

 --chef-server-log FILE     - Cause the internal chef-server to write a logfile.

 --chef-client-log FILE     - Cause the chef-client to write a logfile.

 --martini-debug-log FILE   - Cause the martini logs to be written to a logfile.

init

The init command is used to initialize the database used by the Orchestrator. It ensures the initial records exist to be able to accept configuration settings and generates some of the internal keys that are needed for communicating with the machines it will be creating. You must run this the first time you create a cluster.

Usage: orchestrator-cli init

prune-chef

Usage: orchestrator-cli prune-chef [args]

The 'prune-chef' command is used to remove nodes that are registered with Chef, but not recognized by the Orchestrator.

It is used in the migration path to bridge existing machines with new machines created by the Orchestrator, and remove them once the deploy is complete.

Command options:

 -c, --config FILE     - The cluster configuration file to describe the settings that the cluster should reflect. (default: cluster.conf)

reboot

Usage: orchestrator-cli reboot --config cluster.conf [args]

The 'reboot' command is used to perform a rolling reboot of the cluster. Machines in the Apcera cluster are rebooted in a prescribed order that is specific to the deployed release.

See Rebooting cluster hosts.

Command options:

 -c, --config FILE          - The cluster configuration file to describe the
                              settings that the cluster should reflect.

 --concurrency NUM          - Sets the number of concurrent actions that the
                              orchestrator will perform against the cluster at
                              the same time. (default: 1)

 --dry-run                  - Trigger the orchestrator into "dry run" mode,
                              where it will evaluate whether machines are 
                              required to be rebooted due to various conditions
                              on the machines.  However, machines will not be
                              rebooted.

 --force                    - Indicates that machines are to be rebooted.

 --machines                 - Indicates the set of machines on which reboot will
                              be performed.  Machine names should be provided as
                              a comma separated list. These names could also be
                              config names, such as Central, instance_manager.
                              This is useful when reboot is limited to certain
                              machines in the Apcera cluster.

 --chef-server-log FILE     - Cause the internal chef-server to write a logfile

 --chef-client-log FILE     - Cause the chef-client to write a logfile

 --martini-debug-log FILE   - Cause the martini logs to be written to a logfile

releaseinfo

Usage: orchestrator-cli releaseinfo [args]

Shows available release information, which includes version and timestamp of each release in reverse chronological order.

Command options:

 --all-releases             - Show all available releases.  By default only releases newer than deployed version are shown.
 --json                     - Show results in JSON format.
 --show-constraints         - Extended output showing release constraints

For example:

orchestrator-cli releaseinfo
Deployed Release: 2.4.0
╭─────────────────────────────────────────╮
│           Available Releases            │
├─────────┬───────────────────────────────┤
│ Version │ Release Date                  │
├─────────┼───────────────────────────────┤
│ 2.2.3   │ 2016-10-24 21:06:39 +0000 UTC │
│ 2.2.2   │ 2016-10-14 16:56:54 +0000 UTC │
│ 2.2.1   │ 2016-09-20 18:38:41 +0000 UTC │
│ 2.2.0   │ 2016-07-20 17:18:24 +0000 UTC │
│ 2.0.2   │ 2016-06-23 16:07:30 +0000 UTC │
│ 2.0.1   │ 2016-04-14 04:37:41 +0000 UTC │
│ 2.0.0   │ 2016-03-29 16:55:16 +0000 UTC │
│ 508.1.5 │ 2016-02-25 21:49:43 +0000 UTC │
│ 506.1.0 │ 2016-02-11 22:10:33 +0000 UTC │
│ 504.1.8 │ 2016-02-08 19:58:03 +0000 UTC │
│ 449.3.1 │ 2015-12-04 20:34:38 +0000 UTC │
│ 447.2.4 │ 2015-12-04 00:57:16 +0000 UTC │
│ 447.2.0 │ 2015-11-23 19:25:32 +0000 UTC │
│ 444.2.7 │ 2015-11-03 19:32:20 +0000 UTC │
│ 444.2.3 │ 2015-11-02 19:27:29 +0000 UTC │
│ 442.8.3 │ 2015-10-21 18:36:28 +0000 UTC │
╰─────────┴───────────────────────────────╯
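The default filtering behavior (hiding releases older than the deployed version unless --all-releases is passed) can be sketched as follows. This is a hypothetical illustration, not the Orchestrator's actual implementation:

```python
# Sketch: by default, releaseinfo lists only releases newer than the
# deployed version. Hypothetical helper; assumes "major.minor.patch" strings.
def newer_releases(deployed, available):
    def key(version):
        return tuple(int(x) for x in version.split("."))
    return [v for v in available if key(v) > key(deployed)]

print(newer_releases("2.2.1", ["2.2.3", "2.2.2", "2.2.1", "2.2.0", "2.0.2"]))
# ['2.2.3', '2.2.2']
```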

repair

Usage: orchestrator-cli repair --config cluster.conf [args]

The 'repair' command is used to repair the cluster. Note that it is a destructive command. It deletes the machines in error state.

Command options:

 -c, --config FILE     - The cluster configuration file to describe the settings
                         that the cluster should reflect.
                         (default: cluster.conf)

 --machines            - Indicates the set of machines on which repair operation
                         will be performed. Machine names should be comma
                         separated. These names could also be config names,
                         such as central or instance_manager. This is useful
                         when repair needs to be run on specific machines.
                         If it is not set, then repair will run on all the
                         machines.

 --make-generic        - Converts the machine definitions in the existing cluster
                         to the generic machine definition.

 --non-interactive     - Indicates that the user is not required to acknowledge
                         certain warning messages and permits "Repair" step
                         to proceed. Warning messages are still displayed.
 
 --reconfigure-chef    - Reconfigures the chef agents on the cluster machines.
                         This flag can be used to reconfigure chef agents
                         after migrating the orchestrator to a new machine.

restore

Usage: orchestrator-cli restore [args]

The 'restore' command is used to restore the orchestrator to a backed up state. It requires a backup file.

See Using Orchestrator Backup and Restore.

Command options:

 -f, --file FILE     - The backup file from which the orchestrator will be
                       restored.

ssh

Usage: orchestrator-cli ssh [selector]

The 'ssh' command is used to open an SSH session on one of the nodes provisioned by the Orchestrator. It takes a selector as its argument, which can be one of the following:

  • The name of the box provisioned, usually a short-form UUID.
  • The IP address of one of the provisioned boxes.
  • One of the Chef tags applied to the box.
  • A generic Chef query (such as "hostname:sample").
  • A regexp, starting with '/', which matches against tag names.

The Orchestrator will query its list of nodes for potential matches. If one is found, it will drop you right into an ssh session as the "ops" user on that box. It will also generate an ephemeral known hosts file that is passed to SSH, so the host should already be recognized; proceed carefully if it is not.

If multiple systems match the query, then it will return a list of the matches and prompt for which box to connect to as shown below. The 'ssh' command relies on the 'ops-key' being already added to the user's ssh-agent, and the ssh-agent being accessible.

orchestrator-cli ssh
Starting up database... done

Multiple nodes matched your query, please select which:

1) IP: 10.0.0.4, Name: f2afba9c, Tags: monitored,tcp-router,tcp-router-nginx
2) IP: 10.0.0.5, Name: bc8c57e4, Tags: monitored,router
3) IP: 10.0.0.6, Name: 8c43b2b7, Tags: monitored,router
4) IP: 10.0.1.5, Name: 94e86d45, Tags: monitored,monitoring,zabbix-server
5) IP: 10.0.2.10, Name: 31e89301, Tags: monitored,auditlog-database,auditlog-database-master
6) IP: 10.0.2.11, Name: 04281f3a, Tags: monitored,stagehand,job-manager,cluster-monitor,api-server,component-database,package-manager,health-manager,metrics-manager,nats-server,events-server,component-database-master
7) IP: 10.0.2.13, Name: c6e3049b, Tags: monitored,riak-node,riak-server
8) IP: 10.0.2.14, Name: 38004b90, Tags: monitored,package-manager,metrics-manager,cluster-monitor,api-server,component-database,health-manager,nats-server,job-manager,events-server
9) IP: 10.0.2.15, Name: 4d8f9add, Tags: monitored,instance-manager
10) IP: 10.0.2.16, Name: 2997ff2b, Tags: monitored,riak-node,riak-server
11) IP: 10.0.2.4, Name: f974f1af, Tags: monitored,instance-manager
12) IP: 10.0.2.5, Name: de4a8ca8, Tags: monitored,riak-node,riak-stanchion,riak-server
13) IP: 10.0.2.6, Name: 4ee2b53f, Tags: monitored,api-server,component-database,nats-server,cluster-monitor,metrics-manager,job-manager,events-server,package-manager,health-manager
14) IP: 10.0.2.7, Name: 9676106d, Tags: monitored,auditlog-database
15) IP: 10.0.2.8, Name: 3f966668, Tags: monitored,auth-server,google-auth-server,basic-auth-server,app-auth-server
16) IP: 10.0.2.9, Name: 5a3e3b71, Tags: monitored,statsd-server,redis-server,graphite-server

Pick a host [1]: 9
To run a command as administrator (user "root"), use "sudo <command>".
See "man sudo_root" for details.

ops@zoaz-4d8f9add:~$ 
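The selector matching described above can be sketched as follows. This is a hypothetical helper for illustration, not the Orchestrator's actual implementation:

```python
import re

# Sketch of ssh selector matching: a selector can be a node name, an IP
# address, a Chef tag, or a '/'-prefixed regexp matched against tag names.
# Hypothetical data shapes; not the Orchestrator's real node records.
def match_nodes(selector, nodes):
    if selector.startswith("/"):
        pattern = re.compile(selector[1:])
        return [n for n in nodes if any(pattern.search(t) for t in n["tags"])]
    return [n for n in nodes
            if selector in (n["name"], n["ip"]) or selector in n["tags"]]

nodes = [
    {"ip": "10.0.2.15", "name": "4d8f9add", "tags": ["monitored", "instance-manager"]},
    {"ip": "10.0.0.5", "name": "bc8c57e4", "tags": ["monitored", "router"]},
]
print(match_nodes("instance-manager", nodes))  # one match: connect directly
print(match_nodes("monitored", nodes))         # two matches: prompt the user
```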

status

Usage: orchestrator-cli status --config cluster.conf [args]

The 'status' command is used to show current known node status, including IP address, machine tag, machine role, hostname, connectivity and reboot states.

By default the output is printed in terminal table format, for example:

orchestrator@orchestrator:~$ orchestrator-cli status -c cluster.conf
╭─────────────────────────────────────────────────────────────────────────────────────────────────╮
│                                       Cluster Node Status                                       │
├───────┬───────────┬──────────┬──────────────────┬───────────────┬──────┬────────┬──────┬────────┤
│ Index │ IP        │ Name     │ Machine Config   │ Hostname      │ Ping │ Status │ Note │ Reboot │
├───────┼───────────┼──────────┼──────────────────┼───────────────┼──────┼────────┼──────┼────────┤
│ 1     │ 10.0.0.4  │ f2afba9c │ tcp_router       │ zoaz-f2afba9c │ OK   │ OK     │      │        │
│ 2     │ 10.0.0.5  │ bc8c57e4 │ router           │ zoaz-bc8c57e4 │ OK   │ OK     │      │        │
│ 3     │ 10.0.0.6  │ 8c43b2b7 │ router           │ zoaz-8c43b2b7 │ OK   │ OK     │      │        │
│ 4     │ 10.0.1.5  │ 94e86d45 │ monitoring       │ zoaz-94e86d45 │ OK   │ OK     │      │        │
│ 5     │ 10.0.2.10 │ 31e89301 │ auditlog         │ zoaz-31e89301 │ OK   │ OK     │      │        │
│ 6     │ 10.0.2.11 │ 04281f3a │ central          │ zoaz-04281f3a │ OK   │ OK     │      │        │
│ 7     │ 10.0.2.13 │ c6e3049b │ riak             │ zoaz-c6e3049b │ OK   │ OK     │      │        │
│ 8     │ 10.0.2.14 │ 38004b90 │ central          │ zoaz-38004b90 │ OK   │ OK     │      │        │
│ 9     │ 10.0.2.15 │ 4d8f9add │ instance_manager │ zoaz-4d8f9add │ OK   │ OK     │      │        │
│ 10    │ 10.0.2.16 │ 2997ff2b │ riak             │ zoaz-2997ff2b │ OK   │ OK     │      │        │
│ 11    │ 10.0.2.4  │ f974f1af │ instance_manager │ zoaz-f974f1af │ OK   │ OK     │      │        │
│ 12    │ 10.0.2.5  │ de4a8ca8 │ riak             │ zoaz-de4a8ca8 │ OK   │ OK     │      │        │
│ 13    │ 10.0.2.6  │ 4ee2b53f │ central          │ zoaz-4ee2b53f │ OK   │ OK     │      │        │
│ 14    │ 10.0.2.7  │ 9676106d │ auditlog         │ zoaz-9676106d │ OK   │ OK     │      │        │
│ 15    │ 10.0.2.8  │ 3f966668 │ singleton        │ zoaz-3f966668 │ OK   │ OK     │      │        │
│ 16    │ 10.0.2.9  │ 5a3e3b71 │ metricslogs      │ zoaz-5a3e3b71 │ OK   │ OK     │      │        │
╰───────┴───────────┴──────────┴──────────────────┴───────────────┴──────┴────────┴──────┴────────╯

Command options:

 -c, --config FILE          - The cluster configuration file to describe the settings
                              that the cluster should reflect.

 --html                     - Show status information in HTML table format
 --json                     - Show status information in JSON format
 --list                     - Show status information in comma separated text format
 --markdown                 - Show status information in markdown table format
 --all, --all-fields        - Include all fields in status information.  JSON output
                              always includes all fields.

Examples:

$ orchestrator-cli status -c cluster.conf --list

$ orchestrator-cli status -c cluster.conf --markdown --all-fields

teardown

Usage: orchestrator-cli teardown [args]

The teardown command is used to tear down an entire cluster. Note that this is a destructive command. The command removes Apcera components from their hosts. The command does not remove the underlying infrastructure.

Options:

-c, --config FILE - The cluster configuration file to describe the settings that the cluster should reflect. (default: cluster.conf)

--concurrency NUM - Sets the number of concurrent actions that the orchestrator will perform against the cluster at the same time. (default: 5)

version

Usage: orchestrator-cli version

Shows the Orchestrator's current version.

Example:

orchestrator@orchestrator:~$ orchestrator-cli version
0.5.3 (c51a3dd)

Configuring Orchestrator

You use a file named cluster.conf to configure Orchestrator and deploy your cluster. The cluster.conf file is the configuration template for your cluster which you populate and upload to the Orchestrator machine.

Apcera strongly recommends that you secure and version control your cluster.conf file.

The cluster.conf file includes the following subsections, each containing parameter values that you populate:

  • Provisioner: Specifies information related to the creation of the machines that will run within the cluster.
  • Machines: Defines the machines types that will be used within the cluster.
  • Components: Specifies the desired number of each of the component type.
  • Chef Configuration: Settings used by Chef and made available to machines in the cluster.

Refer to the Cluster Configuration documentation for details on populating the cluster.conf file.

If you are using Terraform to install Enterprise Edition, you will use the Apcera-provided cluster.conf.erb file to generate the cluster.conf file. Refer to the installation instructions for your platform.

Running Orchestrator

To run Orchestrator:

  • Configure the network device to map to the proper virtual machine network.
  • Import the Orchestrator VM.
  • Power on the virtual machine.

Orchestrator defaults to configuring the network via DHCP. If no DHCP server is available on the network, it will time out after two minutes.

If DHCP fails, or if a static IP address allocation is desired, you can configure the machine to use a static IP. The base operating system for the Orchestrator is Ubuntu 14.04. Network configuration can be set up by editing the file at /etc/network/interfaces.

Refer to the following example to configure a static IP address for the network device on eth0. This should replace the default configuration for eth0 in the /etc/network/interfaces file. After configuring the network device, run sudo service networking restart to restart the networking services.

auto eth0
iface eth0 inet static
  address 10.0.0.30
  netmask 255.255.255.0
  gateway 10.0.0.1
  dns-nameservers 8.8.8.8

Logging in to Orchestrator

Once the Orchestrator is up and running, you can log into it using the default orchestrator user with password orchestrator.

vm login: orchestrator
password: orchestrator

For example: ssh orchestrator@172.27.16.41

It is recommended that you change the password immediately via the passwd command.

NOTE: This may be different depending on your platform. On OpenStack, we use the ec2-user to log in initially based on the keypair that you selected when you launched the Orchestrator instance.

Connecting to Orchestrator using SSH

It is generally more convenient to use SSH to remotely access the Orchestrator host. To access the Orchestrator instance using SSH, you will need a key pair that the Orchestrator recognizes. To generate the necessary SSH key pair, see the SSH documentation.

Once you log in to the Orchestrator host using SSH, you can use the commands orchestrator-cli ssh and orchestrator-cli collect logs to access other machine hosts and to pull the component logs. Both of these commands expect the following SSH configuration:

  • Have ssh-agent running.
  • Enable SSH Agent forwarding (ForwardAgent yes is configured).
  • Have preloaded into your ssh-agent the SSH key that you configured in the cluster.conf file for access to servers inside the cluster.

Steps to add your SSH key to your SSH agent:

1) On Debian systems (and derivatives), the ssh-agent is started by default when you log in.

But, if for some reason you need to start the SSH agent, run the command eval $(ssh-agent).

2) Add your key to the agent by entering the command ssh-add and your password when prompted.

Note that ssh-add without arguments adds your default keys (such as ~/.ssh/id_rsa). If your key is elsewhere, use the command ssh-add /path/to/ssh/key.

For example, to add your SSH key to your local SSH agent, use command ssh-add my-ssh-key.pem where my-ssh-key.pem is the name of your SSH key.

Run command ssh-add -l to verify that your key was added successfully.

3) Once you have added the key, connect to the Orchestrator host.

For example:

$ ssh -A orchestrator@40.77.109.110
orchestrator@40.77.109.110's password: 
Last login: Sun Nov  6 19:48:15 2016 from 4.16.82.26
orchestrator@orchestrator:~$ 

Updating Orchestrator software

To check what version of Orchestrator you are on, run either of the following commands:

orchestrator-cli version or orchestrator-cli -v

Before deploying or updating a cluster, you should update the Orchestrator software with the latest promoted version.

Before updating Orchestrator, you should perform a backup.

To update Orchestrator software:

  1. SSH into the Orchestrator VM: ssh orchestrator@X.X.X.X
  2. Run sudo apt-get update
  3. Run sudo apt-get install orchestrator-cli

If you have connected to the Orchestrator host as the root user, you do not need to run the sudo command.

See also performing an air-gapped Orchestrator update.

Copying cluster.conf to Orchestrator

Once the configuration for the cluster is ready, copy it to the user's home directory on the Orchestrator machine using scp. By default, the Orchestrator will look for the configuration file to be in the home directory.

Typically the cluster configuration file is named cluster.conf, but you can specify a different file name (with the *.conf extension) using the -c flag.

NOTE: You should always copy over the cluster.conf file anytime you update Orchestrator software. First, you likely will need to update the configuration file to support new features provided by the software update. Second, the software update removes the cluster configuration file. If you are updating your cluster, make sure you keep a version-controlled local copy of your cluster.conf file.

To copy the cluster configuration file to the Orchestrator host using SCP:

scp cluster.conf orchestrator@40.77.109.110:~
orchestrator@40.77.109.110's password: 
cluster.conf                                                                                100% 7476     7.3KB/s   00:00

If prompted, enter yes at the warning. You should see that the cluster.conf file is copied 100% to the Orchestrator host.

If you want to use your own key, use the following syntax:

scp -i <key-file-name> cluster.conf <username>@<orchestrator-host-public-ip>:~

For example:

scp -i continuum.pem cluster.conf root@52.26.8.34:~

NOTE: If you logged in as the orchestrator user earlier and did some setup, the files may be in different places.

Initializing Orchestrator DB

You only need to do this for new cluster deployments. The command initializes the database by generating the keys and credentials needed for both the cluster and for Chef.

To initialize the Orchestrator database, run the following command:

orchestrator-cli init

This command is only needed once when creating a new cluster.

Performing a dry run

Before doing a new or update deployment, it is recommended that you perform a dry run using the --dry flag with the deploy command.

When you perform a dry run, Orchestrator reads the release manifest, reviews the configuration file, verifies machine states, and generates a list of the actions it will take to deploy the cluster. A dry run does not alter any state, but instead calculates the actions necessary to do the deploy. The result of the dry run is a graph.png image file that you can copy to your local machine using scp and view.

The dry run indicates if the configuration file is well formed; it does not provide insight into the entire deployment outcome.

To perform a dry run, run the following command:

orchestrator-cli deploy --config cluster.conf RELEASE --dry

See Deploying a cluster for RELEASE options.

Once the dry run completes, use the following command to copy the graph.png file to your local machine so you can inspect it:

scp [-i key-name.pem] user@ip-address:file-name.ext /some/local/directory

For example:

scp orchestrator@40.77.109.110:graph.png ~/my-apcera-cluster

The graph will typically have a branch of actions for each machine that needs to be created. On initial cluster spin up, it will likely have a column of actions for each server. The process for a new machine is to clone the base template, wait for it to be bootstrapped by OpenStack, update the base Orchestrator agent on the machine, and then run our base Chef configuration on the host to get it up to date. It will then apply all the necessary tags for the different processes that machine will have installed. The main "Deploy" step is where the Orchestrator follows the Apcera bundled release process to install and update the various parts of the infrastructure.

Deploying a cluster

When you perform a full deploy, Orchestrator creates or updates the cluster according to the configuration file instructions. The deployment will take 30 minutes or more. During the deployment, the Orchestrator CLI outputs progress messages and logs as the Chef component receives requests back from the remote hosts for metadata.

If you are performing an upgrade of a deployed cluster, refer to the upgrading and scaling documentation.

Before deploying you should perform a dry run and validate cluster.conf.

There are four ways to deploy a cluster.

Latest promoted release

orchestrator-cli deploy --config cluster.conf --update-latest-release [--dry]

Specific promoted release

View available releases:

orchestrator-cli releaseinfo

Deploy specific release:

orchestrator-cli deploy -c cluster.conf --release 2.2.3 [--dry]

Local release bundle

Deploy a release bundle that you copied to the Orchestrator host:

orchestrator-cli deploy -c cluster.conf --release-bundle release-2.4.0-88dde99.tar.gz [--dry]

Remote release bundle

You must use Orchestrator 0.5.3 or later to deploy a remote release bundle:

orchestrator-cli deploy -c cluster.conf --release-bundle https://s3-us-west-2.amazonaws.com/test/release-2.4.0-88dde99.tar.gz [--dry]

Verifying cluster deployment

When Orchestrator completes a successful cluster deployment or update, you see the message Done with cluster updates.

Run the command orchestrator-cli status -c cluster.conf to verify the status of each cluster host.

If one or more machines require a reboot, run the command orchestrator-cli reboot -c cluster.conf.

At this point the Apcera cluster should be running and accessible.

To verify successful deployment:

  • Log in to the web console at http(s)://console.<cluster-name>.<domain-name>.<tld>.
  • Install APC and verify that you can log in using APC: apc target http(s)://<cluster-name>.<domain-name>.<tld>.
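As a quick sanity check, you can derive the console URL from your cluster's base domain and probe it; a minimal sketch (the domain below is a hypothetical example):

```shell
# Build the console URL from the cluster's base domain
# (<cluster-name>.<domain-name>.<tld> in the pattern above).
console_url() {
  echo "https://console.$1"
}

url=$(console_url mycluster.example.com)   # hypothetical domain
echo "$url"                                # -> https://console.mycluster.example.com

# Probe it once the cluster is up (requires network access):
# curl -skI "$url" | head -n 1
```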

To test cluster functionality:

Troubleshooting deployments

See also the cluster troubleshooting documentation.

If the orchestrator-cli deploy -c cluster.conf RELEASE command fails:

  1. Examine your cluster.conf and make sure it validates by performing a dry run deployment.

    To see what has changed: diff -u cluster.conf <old_cluster.conf> (assuming you backed it up).

  2. SSH into the Orchestrator host.

  3. Check which files have changed most recently: ls -ltr.

  4. Run orchestrator-cli ssh to list cluster hosts:

     orchestrator@kiso-orchestrator:~$ orchestrator-cli ssh
     Starting up database... done
    
     Multiple nodes matched your query, please select which:
    
     1) IP: 10.0.0.103, Name: b6cbbdb2, Tags: splunk-forwarder,monitored,component-database,router,job-manager,package-manager,api-server
     2) IP: 10.0.0.213, Name: 238f4c65, Tags: splunk-forwarder,monitored,ip-manager
     3) IP: 10.0.0.231, Name: 32075cea, Tags: splunk-forwarder,monitored,auth-server,health-manager,nats-server,metrics-manager
     4) IP: 10.0.0.249, Name: 9de2f0df, Tags: splunk-forwarder,monitored,graphite-server,redis-server,statsd-server
     5) IP: 10.0.0.32, Name: b16c73b1, Tags: splunk-forwarder,monitored,instance-manager
     6) IP: 10.0.0.37, Name: 1e394e16, Tags: splunk-forwarder,monitored,nfs-server
     7) IP: 10.0.0.49, Name: 6301bf44, Tags: splunk-forwarder,monitored,auditlog-database
     8) IP: 10.0.0.58, Name: ff34d0fc, Tags: splunk-forwarder,monitored,splunk-indexer,splunk-server
     9) IP: 10.0.0.71, Name: 1bb9fa65, Tags: splunk-forwarder,monitored,instance-manager
     10) IP: 10.0.0.9, Name: 26fcfd30, Tags: splunk-forwarder,monitored,splunk-search,splunk-server
     11) IP: 10.0.0.90, Name: 037ffac0, Tags: splunk-forwarder,monitored,instance-manager
     12) IP: 10.0.0.97, Name: 62dd9dc5, Tags: splunk-forwarder,monitored,tcp-router
     13) IP: 10.0.1.118, Name: 7c5e756c, Tags: splunk-forwarder,monitored,instance-manager
     14) IP: 10.0.1.152, Name: 42631f0f, Tags: splunk-forwarder,monitored,auditlog-database,auditlog-database-master
     15) IP: 10.0.1.223, Name: 53fbd4ac, Tags: splunk-forwarder,monitored,instance-manager
     16) IP: 10.0.1.42, Name: 4c8845ef, Tags: splunk-forwarder,monitored,instance-manager
     17) IP: 10.0.1.95, Name: 316dbe02, Tags: splunk-forwarder,monitored,component-database,router,job-manager,package-manager,cluster-monitor,api-server,component-database-master
     18) IP: 10.0.2.178, Name: 2f5536c3, Tags: splunk-forwarder,monitored,instance-manager
     19) IP: 10.0.2.22, Name: 4ed8d611, Tags: splunk-forwarder,monitored,monitoring,zabbix-server
     20) IP: 10.0.2.235, Name: 9dab4b06, Tags: splunk-forwarder,monitored,api-server,component-database,router,stagehand,job-manager,package-manager
     21) IP: 10.0.2.87, Name: 03c4b4b3, Tags: splunk-forwarder,monitored,instance-manager
     22) IP: 10.0.2.88, Name: 31a0bc97, Tags: splunk-forwarder,monitored,instance-manager
    
  5. Pick a host.

    Since the orchestrator-agent runs on all nodes in the cluster, you can pick any of the machines from the list. If you know where the deploy failed, SSH into the specific node that is running the component (e.g. ip-manager).

  6. Change directory to /var/log/orchestrator-agent.

  7. Examine the current log.

    If the current log indicates that the orchestrator-agent was shut down, you must restart the orchestrator-agent:

     orchestrator-agent has exited.
     orchestrator-agent-supervisor pid: 22813
     Watching for orchestrator-agent (22814) to exit
     Orchestrator Agent ready.
     Error in launcher run: listen tcp 0.0.0.0:7778: bind: address already in use
     orchestrator-agent has exited.
     orchestrator-agent-supervisor pid: 22823
     Watching for orchestrator-agent (22824) to exit
     Orchestrator Agent ready.
     Error in launcher run: listen tcp 0.0.0.0:7778: bind: address already in use
     Requesting shutdown
     ./run: line 25: kill: (22824) - No such process
    
  8. SSH into the node as root.

  9. Execute the following command to verify the orchestrator-agent status:

     sv status orchestrator-agent
    

    If it is down, you will see:

     root@kiso-32075cea:/var/log# sv status orchestrator-agent
     down: orchestrator-agent: 4794201s, normally up; run: log: (pid 2148) 4825216s
    
  10. Execute sv start orchestrator-agent to restart the orchestrator-agent:

     root@kiso-32075cea:~# sv start orchestrator-agent
     ok: run: orchestrator-agent: (pid 18947) 0s
     root@kiso-32075cea:~# lsof -Pni tcp:7778
     COMMAND     PID USER   FD   TYPE   DEVICE SIZE/OFF NODE NAME
     orchestra 18948 root    3u  IPv6 35767312      0t0  TCP *:7778 (LISTEN)
    

Troubleshooting Instance Manager

  1. SSH into the Orchestrator host as root: <ops_repo>/bin/orchestrator-ssh kiso
  2. Execute orchestrator-cli ssh /instance-manager to display only the Instance Managers (IMs).
  3. Once you are connected to an IM, you can tail the log:

     $ cd /var/log/continuum-instance-manager
     $ tail -50f current
    

Performing an air-gapped installation

A typical installation of the Apcera Platform Enterprise Edition requires the Orchestrator host to have internet connectivity for two primary purposes:

  • To update the orchestrator-cli software, and
  • To download the Apcera release software that is installed on cluster hosts.

You can perform an air-gapped installation that removes the internet-access requirement for Orchestrator for one or both of these purposes.

Air-gapped Orchestrator update

  1. Download the orchestrator-cli Debian Package from the Apcera Support Site.

    The file name indicates the latest Orchestrator version.

    For example: orchestrator-cli_0.5.3_amd64.deb.

  2. Copy the Debian package to the Orchestrator host using SCP or some other means.

     scp orchestrator-cli_0.5.3_amd64.deb ops@40.77.109.110:~
    
  3. SSH in to the Orchestrator host as the ops user.

     ssh -A ops@40.77.109.110
    
  4. Update the Orchestrator host to the latest orchestrator-cli version.

     sudo dpkg -i orchestrator-cli_{version}_amd64.deb
    
  5. Reboot the Orchestrator host.

     sudo reboot
    
  6. SSH in to the Orchestrator host as the orchestrator user.

     ssh -A orchestrator@40.77.109.110
    
  7. Verify the Orchestrator version.

     $ orchestrator-cli version
     0.5.3 (6b7e06f)
     orchestrator@orchestrator:~$
    

Air-gapped cluster deployment

To provide cluster components with required dependencies, you copy the Apcera release bundle (tarball) to the Orchestrator host. The release bundle contains the latest Apcera software, the cluster manifest, and all the dependencies needed by the components. The tarball is approximately 375 MB.

To upload the Apcera release bundle to the Orchestrator host and deploy a cluster:

  1. Obtain the release bundle from the Apcera Support Site.

  2. Copy the release bundle to the Orchestrator host using SCP.

    For example: scp release-2.4.0-88dde99.tar.gz orchestrator@172.27.16.41:~

    If you updated cluster.conf, also copy it to the Orchestrator host.

  3. Deploy the cluster, specifying the local --release-bundle PATH.

    SSH to the Orchestrator host, for example: ssh orchestrator@172.27.16.41

    Perform a dry run: orchestrator-cli deploy -c cluster.conf --release-bundle release-2.4.0-88dde99.tar.gz [--dry]

    Deploy the cluster: orchestrator-cli deploy -c cluster.conf --release-bundle release-2.4.0-88dde99.tar.gz
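After copying the bundle, it is worth confirming that it arrived intact before deploying. A sketch using sha256sum (the small stand-in file below substitutes for the real release bundle, e.g. release-2.4.0-88dde99.tar.gz):

```shell
# checksum_ok SUMFILE -> exit 0 if the recorded checksum still matches.
checksum_ok() {
  sha256sum --status -c "$1"
}

# Stand-in for the real release bundle, for illustration only.
printf 'demo contents\n' > bundle.tar.gz
sha256sum bundle.tar.gz > bundle.tar.gz.sha256   # record before scp

# After copying both files to the Orchestrator host:
checksum_ok bundle.tar.gz.sha256 && echo "bundle intact"
```

If the check fails, re-copy the bundle rather than deploying from a possibly truncated tarball.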