Troubleshooting Apcera CE Deployments

This section provides troubleshooting tips for deploying the Apcera Platform Community Edition.

If you need help or support

Apcera provides full documentation for the Apcera Platform Community Edition. In addition, the Apcera Platform Community Edition is community supported. To get help or support, check out the Google Groups forum at https://community.apcera.com/.

Setup tool errors

This section provides apcera-setup troubleshooting tips.

If you cannot run the apcera-setup tool

We recommend you run the apcera-setup tool in the directory where you unzipped the download file. Alternatively, the directory where the apcera-setup tool is located must be in your $PATH.

On Windows, the current working directory (by default, the directory where files are downloaded) is automatically in the $PATH.

On Unix-based systems, the current working directory is not in the $PATH by default.

On Mac, if the directory where the apcera-setup tool is installed is not in your $PATH, you must run ./apcera-setup <command> from that directory to execute the tool.

Note that the instructions assume that the current working directory is in your path.
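As a rough illustration of the $PATH check described above, the following sketch tests whether a directory is listed in $PATH and appends it for the current terminal session if it is not. The function names in_path and add_to_path are hypothetical helpers introduced here, not part of apcera-setup.

```shell
# Return success if directory $1 is one of the entries in $PATH.
in_path() {
  case ":$PATH:" in
    *":$1:"*) return 0 ;;
    *) return 1 ;;
  esac
}

# Append directory $1 to $PATH for the current shell session only.
add_to_path() {
  in_path "$1" || PATH="$PATH:$1"
  export PATH
}

# Example: make the tool in the current directory reachable by name.
add_to_path "$(pwd)"
```

The change lasts only for the current session. To make it permanent you would add the export to your shell profile.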

If the apcera-setup tool is not executable

If you receive an error such as command not found or permission denied when you run ./apcera-setup, make sure that the apcera-setup file is executable.

To do this on a Unix host, run the command chmod u+x apcera-setup.

If the apcera-setup executable fails to run due to file system permissions (Linux or Mac)

If the executable file fails to run due to file system permissions, you can check the current permissions (Linux or Mac) as follows:

$ ls -l apcera-setup

-rwxr-xr-x@ 1 someuser  staff  48599364 Dec  7 11:13 apcera-setup

And then run the following chmod command to update the permissions, for example:

$ chmod 755 apcera-setup
$ ls -l apcera-setup
-rwxr-xr-x@ 1 someuser  staff  48599364 Dec  7 11:13 apcera-setup
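
The permission checks in the two sections above can be combined into one step. This is a minimal sketch, assuming a POSIX shell. The function name ensure_executable is a hypothetical helper, not part of apcera-setup.

```shell
# Make file $1 executable for its owner if it is not already.
ensure_executable() {
  if [ ! -f "$1" ]; then
    echo "not found: $1" >&2
    return 1
  fi
  # -x is true when the current user may execute the file.
  [ -x "$1" ] || chmod u+x "$1"
}
```

For example, ensure_executable ./apcera-setup adds the execute bit only when it is missing and leaves the file untouched otherwise.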

If the apcera-setup tool does not have internet access

The apcera-setup tool requires internet access to register the domain name and update DNS entries and, if you opted in, to send installation status information that helps us improve our software.

If you run the tool without internet access and try to connect to your platform, you receive an error similar to the following:

[ERROR] Failed to reach http://console.sub-domain.apcera-platform.io - dial tcp: lookup console.sub-domain.apcera-platform.io: no such host

Try running the apcera-setup tool again once you have internet access.

DNS issues

This section provides DNS troubleshooting tips.

If the domain name you chose is already registered

If the sub-domain you provide is already registered, you receive the error "Domain 'domain-name' is already registered," and you are prompted to enter the DNS Update Token to prove you were the person who registered the domain. See Configuring DNS for Apcera CE for guidance.

If this is the first time you are deploying the platform, you will not have a DNS Update Token. In this case, you need to enter a sub-domain name that is unique in our DNS server in order to proceed with the deployment.

If you need to update the IP address associated with a domain name

See Updating the IP address associated with a domain name.

If you receive a DNS lookup error

If, when running ./apcera-setup config, you receive the ERROR "Failed to check if DNS entry exists:… i/o timeout," this means that apcera-setup was not able to contact the DNS server.

To troubleshoot, check if you can reach the server that checks if a DNS entry exists using https://apcera-dns.prod.apcera.net/exists/<subdomain-name>.

For example: https://apcera-dns.prod.apcera.net/exists/my-subdomain-name.
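
If you prefer to check from a terminal rather than a browser, the lookup can be sketched with curl. This example only tests whether the service answers within a timeout. It makes no assumption about the format of the response body. The function names are hypothetical helpers, and the 10-second timeout is an arbitrary choice.

```shell
# Build the lookup URL for sub-domain $1 (URL pattern taken from the text above).
dns_exists_url() {
  printf 'https://apcera-dns.prod.apcera.net/exists/%s\n' "$1"
}

# Print the HTTP status code returned for sub-domain $1.
# curl exits non-zero if the server cannot be reached within 10 seconds.
check_dns_service() {
  curl -sS --max-time 10 -o /dev/null -w '%{http_code}\n' "$(dns_exists_url "$1")"
}
```

Running check_dns_service my-subdomain-name prints a status code if the server is reachable. A timeout or connection error suggests the same network problem that apcera-setup encountered.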

If you receive a DNS update warning

If you receive the WARNING "Failed to see DNS update within timeout," this means the update to the DNS server did not complete within the 5-minute timeout.

Generally you can ignore this warning because the setup tool will try again.

If you can't connect to DNS

Features on a home network router, such as Parental Controls, can impact the ability to connect with the DNS Nameserver. In such cases, apcera-setup may not be able to configure your cluster for access to the default DNS Nameservers (at primary IP 8.8.8.8 and secondary IP 8.8.4.4), and the cluster will not be able to resolve hostnames on the internet.

AWS errors

This section provides AWS troubleshooting tips.

If you cannot connect to your AWS cluster

If you receive the following error:

"[WARNING] Could not retrieve machine state: Failed to get VM status: NoCredentialProviders: no valid providers in chain. Deprecated."

This means you have not set your AWS access keys in your terminal session. You need to set your access keys to be able to connect to your AWS cluster.
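
The following sketch checks whether the keys are present in the current session. It assumes that apcera-setup reads the standard AWS SDK environment variables AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY, which is where the NoCredentialProviders error normally originates. The function name aws_keys_set is a hypothetical helper, and the values in the comments are placeholders.

```shell
# Return success only if both standard AWS credential variables are non-empty.
aws_keys_set() {
  [ -n "${AWS_ACCESS_KEY_ID:-}" ] && [ -n "${AWS_SECRET_ACCESS_KEY:-}" ]
}

# To set the keys for this terminal session (placeholder values shown):
#   export AWS_ACCESS_KEY_ID="<your-access-key-id>"
#   export AWS_SECRET_ACCESS_KEY="<your-secret-access-key>"

if aws_keys_set; then
  echo "AWS access keys are set"
else
  echo "AWS access keys are missing; export them before running apcera-setup"
fi
```

Exported variables apply only to the terminal session in which you set them, so repeat the export in any new terminal window.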

If you receive an IAM permissions error

If you receive an error similar to the following:

Creating a VPC, subnet, and security group for Apcera Setup...
[ERROR] Failed to run command "create": AccessDenied: User: arn:aws:iam::395162948718:user/user-name is not authorized to perform: cloudformation:CreateStack on resource: arn:aws:cloudformation:us-west-2:395162948718:stack/cluster-apcera-setup-3793499103/*

This means you do not have policy permissions in AWS to create CloudFormation stacks. Make sure you have attached all the required CloudFormation policies to your IAM user.

If you cannot create the CloudFormation stack

If you receive an error similar to the following:

Creating a VPC, subnet, and security group for Apcera Setup...
[ERROR] Failed to run command "create": Failed to get VPC for Apcera Setup stack

This may indicate a resource quota issue, such as exceeding the maximum number of VPCs. To troubleshoot the issue, in the AWS Console select Management Tools > CloudFormation. Make sure you select the correct region in the upper right. The stack is listed (such as cluster-apcera-setup-3798906349), and its Status column shows an error. Select (check) the stack and then view the errors in the Events tab below.
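
As an alternative to the console, the same events can be listed from a terminal. This is a sketch that assumes the AWS CLI is installed and configured with your access keys. It is not something apcera-setup provides. The function name failed_stack_events is a hypothetical helper.

```shell
# List failed events for stack $1 in region $2 using the AWS CLI.
# Prints the logical resource and the reason AWS recorded for each failure.
failed_stack_events() {
  aws cloudformation describe-stack-events \
    --stack-name "$1" \
    --region "$2" \
    --query "StackEvents[?contains(ResourceStatus, 'FAILED')].[LogicalResourceId, ResourceStatusReason]" \
    --output table
}
```

For example, failed_stack_events cluster-apcera-setup-3798906349 us-west-2 would show the failure reasons for the stack named above, if it was created in that region.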

Local VM errors

This section provides troubleshooting tips for local VM issues.

If you are prompted to upgrade a VM

If you are using VMware and are prompted to upgrade the virtual hardware of the VM, please do not choose to upgrade. If you upgrade the virtual hardware of the VM, there is no guarantee that VMware Tools will keep working with the new version of the virtual hardware.

If you close a VM during provisioning or operations

If you mistakenly close a VM window during provisioning or operations, or if virtual machine provisioning times out, you will need to destroy the cluster, and recreate and redeploy the platform.

Note that if the platform did not get to the deployed stage, you may need to manually remove the virtual machines using your VM software management console.

If you change networks

If you change networks while the local cluster is running, such as from a work wireless network to a home network, you may have to redeploy. See Manage Platform Resources.

If you need to do a brute force cleanup of a local deploy

  1. Delete the images directory from the local file system.

  2. Delete the config.json file from the apcera-setup working directory.

Then you can start over with config, create, etc.

This may be necessary if apcera-setup is in a state where "destroy" cannot complete because the cluster VMs are not running and one of the VMs will not resume, so running "resume" first does not work.
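
The two cleanup steps above can be sketched as a small shell function. It assumes the images directory sits inside the apcera-setup working directory. Adjust the path if yours is elsewhere. The function name cleanup_local_deploy is a hypothetical helper.

```shell
# Remove leftover local deployment state from working directory $1
# (default: the current directory).
# Assumption: the images directory is located inside the working directory.
cleanup_local_deploy() {
  dir=${1:-.}
  rm -rf "$dir/images"
  rm -f "$dir/config.json"
}
```

Because config.json holds your cluster credentials, you may want to copy it elsewhere before running the cleanup.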

Cluster access issues

This section provides troubleshooting tips for cluster login issues.

If you cannot log in to your cluster using the web console

When you log in to the web console, make sure you select the correct login option.

Basic auth is the default authentication mechanism for the Community Edition. Unless you have enabled Google auth, you will need to provide a user name and password and choose to log in using "Basic auth".

Also, make sure you are using the correct username and password. The config.json file has your credentials if you forgot them.

Lastly, check the logs/apcera-setup.log file in the working directory for authentication errors.

If the web console is not responding

If the web console is not responding, you may need to restart the web console job using the APC client.

To do this, run the command apc job restart lucid, where lucid is the job name of the web console.

If you cannot target your cluster using APC

This edition of the Apcera Platform supports basic authentication. When you log in to your platform using APC, make sure you use the apc login --basic command.

This edition of the Apcera Platform does not use HTTPS. Thus, when you target your platform using APC (apc target), you must include the http:// protocol prefix because APC defaults to HTTPS.

For example:

$ apc target sub-domain-name.apcera-platform.io

Results in the error:

Get https://api.auniquename.apcera-platform.io/v1/info: dial tcp 192.168.1.9:443: connection refused

You must use the following instead:

apc target http://sub-domain-name.apcera-platform.io

Resulting in:

Note: connections over HTTP are insecure and may leak secure information.
Targeted [http://sub-domain-name.apcera-platform.io]

Note that ./apcera-setup info provides the target URL in case you need to recall the exact syntax.
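
To avoid forgetting the prefix, you could wrap the target name in a small helper. This is only a sketch. The function name target_url is hypothetical and is not an APC feature.

```shell
# Prefix $1 with http:// unless it already names a scheme.
target_url() {
  case "$1" in
    http://*|https://*) printf '%s\n' "$1" ;;
    *) printf 'http://%s\n' "$1" ;;
  esac
}
```

You would then run apc target "$(target_url sub-domain-name.apcera-platform.io)", which passes http://sub-domain-name.apcera-platform.io to APC.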

Deployment errors

This section describes how to check the logs to troubleshoot deployment errors.

If you encounter deployment errors and need to check the logs

Apcera Platform Community Edition logs output to the local log file in the working directory, such as /apcera-setup/logs/apcera-setup.log. If you encounter a platform configuration, creation, or deployment error, the first place to check is the apcera-setup.log file.
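
A quick way to scan the log is to filter for error lines. The sketch below assumes the log file uses the same [ERROR] and [FATAL] tags that apcera-setup prints to the terminal. Adjust the pattern if your log is formatted differently. The function name recent_errors is a hypothetical helper.

```shell
# Print the last $2 (default 20) lines of log file $1 tagged ERROR or FATAL.
# Assumption: log lines carry the same [ERROR]/[FATAL] tags as terminal output.
recent_errors() {
  grep -E '\[(ERROR|FATAL)\]' "$1" | tail -n "${2:-20}"
}
```

For example, recent_errors logs/apcera-setup.log 5 shows the five most recent matching lines.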

For deployment errors, you can SSH into the Orchestrator host and pull all log files, or SSH into a component host and check the logs there. See the management and operations documentation for instructions.

If deployment fails without reason

If you run ./apcera-setup deploy and the deployment fails with the following error message:

[ Deploying Cluster ]
Orchestrator IP: [10.0.0.22]
Uploading cluster.conf...
Cleaning up old releases before the deploy...
Deploying... This might take a while, check "apcera-setup.log" for details...
[ERROR] Failed to execute SSH: Process exited with: 1. Reason was:  ()
[FATAL] Failed to run command "deploy" : Process exited with: 1. Reason was:  ()

This means you are attempting to deploy an Apcera version (such as the latest promoted release) using an out-of-date apcera-setup binary.

In this case, the solution is either to update the apcera-setup tool so that it is compatible with the latest promoted release, or to specify a particular release version to deploy (other than the latest release).

If you want to deploy a previous version of Apcera, such as 508, you would run the following deployment command:

./apcera-setup deploy -r https://apcera-releases.s3.amazonaws.com/continuum/bundles/release-508.1.5.tar.gz

Or the following if you have the release bundle local:

./apcera-setup deploy -r /Users/user/apcera-releases/release-508.1.5.tar.gz
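
If you deploy specific releases often, the bundle URL can be built from the version number. This sketch assumes that other releases follow the same naming pattern as the 508.1.5 example above, which the source does not confirm. The function name release_url is a hypothetical helper.

```shell
# Build the bundle URL for release version $1.
# Assumption: all releases follow the naming pattern of the 508.1.5 example.
release_url() {
  printf 'https://apcera-releases.s3.amazonaws.com/continuum/bundles/release-%s.tar.gz\n' "$1"
}
```

You would then run ./apcera-setup deploy -r "$(release_url 508.1.5)".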

Bootstrapping errors

If you receive a bootstrapping error similar to the following:

[ERROR] Failed to bootstrap VMs: error targeting cluster...connection refused

You can add sample packages to the platform using the following method:

  1. Download the package script files from the apcera/package-scripts repository to a local working directory.

  2. Target your cluster and log in using APC.

  3. Change (cd) to the working directory.

  4. Build each package using apc package build.
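
Step 4 can be sketched as a loop over the downloaded scripts. It assumes the package scripts use a .conf extension and that apc package build accepts the script path as its argument. Check both against the repository and your APC version. The function name build_packages is a hypothetical helper.

```shell
# Build every package script found in directory $1 (default: current directory).
# Assumption: package scripts use a .conf extension.
build_packages() {
  for script in "${1:-.}"/*.conf; do
    [ -e "$script" ] || continue   # skip when no scripts match
    apc package build "$script" || echo "build failed: $script" >&2
  done
}
```

The loop continues past a failed build and reports it on standard error, so one broken script does not stop the remaining packages from building.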

Cluster destroy failure

If the destroy operation fails, you will see the following message:

Cluster state can be restored by retrying with the '--force' option.

In this case you can use the apcera-setup destroy --force command to tear down your cluster.