Troubleshooting apcera-install errors

This section provides troubleshooting tips for deploying the Apcera Platform.

apcera-install tool errors

This section provides apcera-install troubleshooting tips.

If you cannot run the apcera-install tool

We recommend you run the apcera-install tool in the directory where you unzipped the download file. Alternatively, the directory where the apcera-install tool is located must be in your $PATH.

On Windows, the current working directory is automatically searched for executables, so you can run the tool from the directory where you unzipped the download file.

On Unix-based systems, the current working directory is not in the $PATH by default.

On Linux or Mac, if the directory where the apcera-install tool is installed is not in your $PATH, you must run ./apcera-install <command> from that directory to execute the tool.

Note that the instructions assume that the current working directory is in your $PATH.
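
As a minimal sketch, the following adds the tool's directory to $PATH for the current terminal session on a Unix-based system. The directory in INSTALL_DIR is an assumption made for illustration; replace it with the directory where you unzipped the download file.

```shell
# Hypothetical location; replace with the directory where you unzipped the download.
INSTALL_DIR="$HOME/apcera-install"

# Prepend the directory to PATH only if it is not already listed.
case ":$PATH:" in
  *":$INSTALL_DIR:"*) ;;                        # already present, nothing to do
  *) PATH="$INSTALL_DIR:$PATH"; export PATH ;;
esac
```

The change lasts only for the current session. To make it permanent, you would typically add the same lines to your shell startup file.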

If the apcera-install tool is not executable

If you receive the error "command not found" when you run ./apcera-install, make sure that the apcera-install file is executable.

To do this on a Unix host, run the command chmod u+x apcera-install.

If the apcera-install executable fails to run due to file system permissions (Linux or Mac)

If the executable file fails to run due to file system permissions, you can use an operating system command (Linux or Mac) as follows to check the file system settings:

$ ls -l apcera-install

-rwxr-xr-x@ 1 someuser  staff  48599364 Dec  7 11:13 apcera-install

Then run the chmod command to update the permissions, for example:

$ chmod 755 apcera-install
$ ls -l apcera-install
-rwxr-xr-x@ 1 someuser  staff  48599364 Dec  7 11:13 apcera-install
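
The check and the fix can be combined in one small helper. This is a sketch only; the function name ensure_executable is hypothetical and is not part of the apcera-install tool.

```shell
# Make a file executable for its owner, but only if it is not already executable.
# Returns 1 if the file does not exist.
ensure_executable() {
  file="$1"
  if [ ! -e "$file" ]; then
    echo "not found: $file" >&2
    return 1
  fi
  if [ ! -x "$file" ]; then
    chmod u+x "$file"
  fi
}
```

For example, run ensure_executable apcera-install from the directory that contains the tool, and then run ./apcera-install <command> as usual.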

If the apcera-install tool does not have internet access

The apcera-install tool requires internet access to register the domain name and update DNS entries, and (if you opted in) to send installation status information that helps us improve our software.

If you run the tool without internet access and try to connect to your platform, you receive an error similar to the following:

[ERROR] Failed to reach http://console.sub-domain.apcera-platform.io - dial tcp: lookup console.sub-domain.apcera-platform.io: no such host

Try running the apcera-install tool again once you have internet access.

DNS issues

This section provides DNS troubleshooting tips.

If the domain name you chose is already registered

If the sub-domain you provide is already registered, you receive the error "Domain 'domain-name' is already registered," and you are prompted to enter the DNS Update Token to prove you were the person who registered the domain. See Configuring DNS for Apcera Platform for guidance.

If this is the first time you are deploying the platform, you will not have a DNS Update Token. In this case, you need to enter a sub-domain name that is unique in our DNS server in order to proceed with the deployment.

If you need to update the IP address associated with a domain name

See Updating the IP address associated with a domain name.

If you receive a DNS lookup error

If, when running ./apcera-install config, you receive the ERROR "Failed to check if DNS entry exists:… i/o timeout," this means that apcera-install was not able to contact the DNS server.

To troubleshoot, check whether you can reach the server that verifies DNS entries by opening https://apcera-dns.prod.apcera.net/exists/<subdomain-name>.

For example: https://apcera-dns.prod.apcera.net/exists/my-subdomain-name.
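
If you prefer to test from the terminal, the same URL can be requested with curl. The sketch below assumes curl is installed; the function name check_dns_service is hypothetical. It reports only whether the server answered within 10 seconds and does not interpret the response body, because the response format is not described here.

```shell
# Subdomain to check; "my-subdomain-name" is the placeholder from the example URL.
SUBDOMAIN="my-subdomain-name"
URL="https://apcera-dns.prod.apcera.net/exists/${SUBDOMAIN}"

# Request the URL with a 10-second limit and report whether the server answered.
check_dns_service() {
  if curl --silent --show-error --max-time 10 --output /dev/null "$1"; then
    echo "reachable: $1"
  else
    echo "unreachable: $1" >&2
    return 1
  fi
}
```

Run check_dns_service "$URL" to perform the check. A timeout here suggests a network or firewall problem between your machine and the DNS service rather than a problem with the tool itself.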

If you receive a DNS update warning

If you receive the WARNING "Failed to see DNS update within timeout," this means the update to the DNS server did not complete within the timeout (5 minutes).

Generally, you can ignore this warning because the install tool will try again.

If you can't connect to DNS

Features on a home network router, such as Parental Controls, can impact the ability to connect with the DNS Nameserver. In such cases, apcera-install may not be able to configure your cluster for access to the default DNS Nameservers (at primary IP 8.8.8.8 and secondary IP 8.8.4.4), and the cluster will not be able to resolve hostnames on the internet.

Cluster access issues

This section provides troubleshooting tips for cluster login issues.

If you cannot log in to your cluster using the web console

It is possible that the cluster cannot connect to the identity provider of your choice (Google OAuth2, Keycloak, Active Directory, or LDAP). Check the logs/apcera-install.log file in the working directory for authentication errors.

You can try logging in to the cluster using basic auth. (If you forgot them, you can find the user name and password for basic auth in the apcera-install.json file.)

If you can log in with your basic auth credentials but not with a specific identity provider, run the apcera-install config auth command to reconfigure. (Refer to Config command.)

If you cannot log in to your cluster using APC

If your cluster has multiple identity providers configured, make sure to specify which identity provider to use.

The command flags are:

  --basic         - Use the Apcera Platform's built-in auth provider
  --ldap-basic    - Use Active Directory or basic LDAP authentication
  --google        - Use Google Device OAuth2
  --keycloak      - Use Keycloak OAuth2 authentication

For example, to log in using your Google credentials, run the following command:

$ apc login --google

You may also want to try logging in using your basic auth credentials:

$ apc login --basic

If the web console is not responding

If the web console is not responding, you may need to restart the web console job using the APC client.

To do this, run the command apc job restart /apcera::lucid, where lucid is the job name of the web console.

If you cannot target your cluster using APC

The APC command (apc target) assumes that the connection to your cluster is secured. If you opted out of HTTPS configuration, you must include the http:// protocol prefix because APC defaults to HTTPS.

For example:

$ apc target sub-domain-name.apcera-platform.io

Results in the error:

Get https://api.auniquename.apcera-platform.io/v1/info: dial tcp 192.168.1.9:443: connection refused

You must use the following instead:

$ apc target http://sub-domain-name.apcera-platform.io

Resulting in:

Note: connections over HTTP are insecure and may leak secure information.
Targeted [http://sub-domain-name.apcera-platform.io]

Note that ./apcera-install status displays the target URL in case you need to check the exact syntax.

Deployment errors

This section describes how to check the logs to troubleshoot deployment errors.

If you encounter deployment errors and need to check the logs

Apcera Platform logs output to the local log file in the working directory, such as /apcera-install/logs/apcera-install.log. If you encounter a platform configuration, creation, or deployment error, the first place to check is the apcera-install.log file.

For deployment errors, you can SSH into the Orchestrator host and pull all log files, or SSH into a component host and check the logs there. See the management and operations documentation for instructions.
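
To scan the local apcera-install.log file quickly, you can filter it for error lines. The sketch below assumes the log file uses the same [ERROR] and [FATAL] markers that appear in the console output; the function name show_log_errors is hypothetical.

```shell
# Print ERROR and FATAL lines, prefixed with their line numbers, from an install log.
# Assumes the log uses the same [ERROR] / [FATAL] markers as the console output.
show_log_errors() {
  grep -n -E '\[(ERROR|FATAL)\]' "$1"
}
```

For example, run show_log_errors logs/apcera-install.log from the working directory. The function prints nothing and returns a non-zero status when no matching lines are found.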

If deployment fails without reason

If you run ./apcera-install deploy and the deployment fails with the following error message:

[ Deploying Cluster ]
Orchestrator IP: [10.0.0.22]
Uploading cluster.conf...
Cleaning up old releases before the deploy...
Deploying... This might take a while, check "apcera-install.log" for details...
[ERROR] Failed to execute SSH: Process exited with: 1. Reason was:  ()
[FATAL] Failed to run command "deploy" : Process exited with: 1. Reason was:  ()

This means you are attempting to deploy an Apcera version (such as the latest promoted release) using an out-of-date apcera-install binary.

In this case, the solution is either to update the apcera-install tool so that it is compatible with the latest promoted release, or to specify a particular version to deploy (other than the latest release).

Package import failure

If you receive an error during the apcera-install loadpkgs operation, you can build sample packages using the following method:

  1. Download the package script files from the apcera/package-scripts repository to a local working directory.

  2. Target your cluster and log in using APC.
    For example:
    $ apc target example.apcera-platform.io
    $ apc login --basic
    
  3. Change your working directory to where you downloaded the package scripts.

  4. Build each package using apc package build.
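
When there are many package scripts, it can help to list the build commands before running them. The sketch below only prints the commands and does not run apc. It assumes that package scripts are files ending in .conf, which may not match your download; the function name list_build_commands is hypothetical.

```shell
# Print one "apc package build" command per package script found under a directory.
# Assumes package scripts are files ending in .conf; adjust the pattern if yours differ.
list_build_commands() {
  find "$1" -type f -name '*.conf' | sort | while IFS= read -r script; do
    printf 'apc package build %s\n' "$script"
  done
}
```

For example, run list_build_commands . from the directory that contains the package scripts, review the list, and then run the commands for the packages you need.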

Cluster destroy issue

If destroy fails with errors

If the apcera-install apply operation was interrupted by Ctrl + C before its completion, the apcera-install destroy operation may fail to destroy the partially provisioned cluster resources.

For example, a timeout error or a "couldn't find resource" error may be thrown:

* aws_subnet.secondary: Error deleting subnet: timeout while waiting for state to become 'destroyed' (last state: 'pending', timeout: 10m0s)

* aws_internet_gateway.apcera-install-gw: Error waiting for internet gateway (igw-b6d22fd2) to detach: couldn't find resource (31 retries)

If the destroy operation continues to fail, please use the cloud provider's management tool to manually delete the cloud resources. Refer to the /workspace/<platform>/terraform.tfstate file for the IDs of the resources that have been provisioned during the apply operation.
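
One way to collect the resource IDs is a plain text search of the state file. This is a sketch only: it lists every distinct "id" value it finds, and it cannot tell you which resource type each ID belongs to, so cross-check the results against the file before deleting anything. The function name list_state_ids is hypothetical.

```shell
# List the distinct "id" values recorded in a Terraform state file.
# This is a text search, so it does not depend on the exact layout of the file.
list_state_ids() {
  grep -o '"id": *"[^"]*"' "$1" | sed 's/^"id": *"//; s/"$//' | sort -u
}
```

For example, run list_state_ids terraform.tfstate from the directory that contains the state file, and then look up each ID in your cloud provider's management tool.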

AWS errors

This section provides AWS troubleshooting tips.

If you cannot connect to your AWS cluster

If you receive the following errors:

Please set your AWS credentials in your environment or home directory.
See http://docs.apcera.com/install/prereqs/aws-prereqs/#set-keys for more details.
Failed to configure AWS provider: missing AWS credentials

This means you have not set your AWS access keys in your terminal session. You need to set your access keys to be able to connect to your AWS cluster.
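
Before you rerun the tool, you can confirm that the keys are present in your terminal session. The sketch below assumes the tool reads the standard AWS environment variables AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY; it does not check credentials stored in your home directory. The function name check_aws_env is hypothetical.

```shell
# Report which of the standard AWS credential variables are unset or empty.
# Returns non-zero if any of them is missing.
check_aws_env() {
  missing=0
  for name in AWS_ACCESS_KEY_ID AWS_SECRET_ACCESS_KEY; do
    eval "value=\${$name:-}"
    if [ -z "$value" ]; then
      echo "missing: $name" >&2
      missing=1
    fi
  done
  return "$missing"
}

# To set the keys for the current session, export them with your own values:
# export AWS_ACCESS_KEY_ID="<your-access-key-id>"
# export AWS_SECRET_ACCESS_KEY="<your-secret-access-key>"
```

Run check_aws_env after exporting the keys. If it prints nothing, both variables are set in the session.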

If you receive an IAM permissions error

If you receive an UnauthorizedOperation error similar to the following:

* module.apcera-install-aws.aws_ebs_volume.metricslogs-redis: 1 error(s) occurred:

* aws_ebs_volume.metricslogs-redis: Error creating EC2 volume: UnauthorizedOperation: You are not authorized to perform this operation. Encoded authorization failure message: ...
	status code: 403, request id: 9e760ff7-e682-4814-8a6c-61e1c5719f9d

This means your IAM user does not have the policy permissions in AWS required to create the cluster resources (in this example, an EC2 volume). Make sure you added all the required policies for your IAM user.
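
To find out which action was denied, you can decode the encoded authorization failure message with the AWS CLI. This is a general AWS technique rather than a feature of apcera-install. The sketch assumes the AWS CLI is installed and that your IAM user is allowed to call sts:DecodeAuthorizationMessage; the function name decode_auth_failure is hypothetical.

```shell
# Decode the "Encoded authorization failure message" from an UnauthorizedOperation error.
# Requires the AWS CLI and the sts:DecodeAuthorizationMessage permission.
decode_auth_failure() {
  if [ -z "${1:-}" ]; then
    echo "usage: decode_auth_failure <encoded-message>" >&2
    return 2
  fi
  aws sts decode-authorization-message --encoded-message "$1" \
    --query DecodedMessage --output text
}
```

Pass the full encoded string from the error as the only argument. The decoded output is a JSON document that names the denied action and resource, which you can compare against the policies attached to your IAM user.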