Release Notes

Jul 01, 2021

Cluster (3.2.10)

  • Upgrade Notes
    • See the separate issue ticket for how to migrate to AWS RDS PostgreSQL 12.5. Issue 57

  • Cluster changes
    • Changed how cgroups_initd is loaded to accommodate changes introduced in glibc >= 2.32. Issue 104

    • Optimized control-plane traffic to send only a summary of heartbeat instance health and dynamic DNS info for virtual networks, and to use compression to further reduce size. Issue 31

    • Updated PM-DockerHandler to enable connection reuse with the v2 registry.

    • Fixed typos in the NTP configuration to prevent stratum oscillations with the Amazon Time Sync Service.

    • Fixed log formatting during Job Manager shutdown.

  • Chef changes
    • Disabled excessive logging in graphite-web. Issue 49

    • Updated Zabbix PostgreSQL to version 12.5. Issue 57

Orchestrator (2.2.2)

  • orchestrator-cli now supports fetching release metadata and contents from S3 buckets that require authentication via an IAM role. Issue 16

  • The Ping utility now uses net.DialTimeout to avoid overly long timeouts.

  • Updated the orchestrator-cli ssh-config help message.

APC (3.2.10)

  • No changes.

Web Console (9.1.0)

  • No changes.

Guide (3.2.10)

  • No changes.

Oct 07, 2020

Cluster (3.2.8)

  • Upgrade Notes
    • This release contains system-level package and security updates, including an updated Linux kernel. The cluster must be rebooted to fully apply this release. APCERA-312

    • Improved the kernel update process: forced removal of the old kernel is no longer required; the desired kernel is simply installed and the bootloader configuration updated. APCERA-312

  • Cluster changes
    • Removed the legacy job log implementation, including the APIs GET /logs/:channel and GET /v1/logs/:channel, which are superseded by GET /jobs/{uuid}/logs. APCERA-257
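For callers migrating off the removed endpoints, a minimal sketch of the replacement call (the host, token variable, and UUID below are hypothetical placeholders):

```shell
# Hypothetical values for illustration only.
api="https://api.example.com"
job_uuid="00000000-0000-0000-0000-000000000000"

# Old (removed): GET /logs/:channel and GET /v1/logs/:channel
# New:           GET /jobs/{uuid}/logs
new_path="/jobs/${job_uuid}/logs"
# curl -s -H "Authorization: Bearer $APC_TOKEN" "${api}${new_path}"
echo "GET ${new_path}"
```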

    • Improved compatibility with OpenJDK and CentOS Docker images. APCERA-186 APCERA-266 APCERA-255 APCERA-267 APCERA-275

    • Minor fix to trace level log message formatting in websocket handling. continuum#7031

    • Resolved an issue where all metrics reported 0 for some period of time. APCERA-131

    • Instance Manager no longer creates the legacy virtual networks bridge implementation. This improves the stability of the Instance Manager and Virtual Networks. APCERA-142 APCERA-285

    • Removed duplicate multi-line job diffs in the Job Manager log. APCERA-192

    • Improved the stability of Stagehand package uploads during deploy. APCERA-214

    • Improved performance of listing Virtual Networks with Local IPAM enabled when only a summary is required. GET /networks supports a new query param ipam=false, which skips gathering detailed IPAM information; setting this param exactly to false invokes the optimized response. This reduces load on Health Managers and improves performance for requesters. See the linked API documentation for specifics. APCERA-182
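As a sketch of the optimized request (the host and auth token are hypothetical; only the path and query param come from this note):

```shell
api="https://api.example.com"        # hypothetical API host
url="${api}/networks?ipam=false"     # the param must be exactly "false" to skip IPAM detail
# curl -s -H "Authorization: Bearer $APC_TOKEN" "$url"
echo "$url"
```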

  • Chef changes
    • Monitoring Server has been tuned to improve performance and stability. APCERA-180 APCERA-301 continuum-chef#14

    • Web server for Monitoring now supports longer hostnames. ENGT-13818

    • Deployment waits for Monitoring Server to successfully start. APCERA-180

    • Basic auth is now the default: continuum.auth_server.identity.default_provider now defaults to basic, and continuum.auth_server.identity.google.enabled defaults to false. APCERA-58 APCERA-148
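Written out as settings (dotted-key form as used in this note; the actual cluster.conf syntax may differ), the new defaults are:

```
continuum.auth_server.identity.default_provider = "basic"
continuum.auth_server.identity.google.enabled = false
```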

    • Option continuum.sysadmin.kernel_type.ubuntu.trusty defaults to hwe_aws. Valid values are hwe_aws, hwe_3_2_2, hwe. This specifies the kernel "release train" for the cluster machines. It is not recommended to change this setting. APCERA-312

    • All nodes set sysctl kernel.pid_max to 4194304. This improves stability of the cluster. APCERA-247 APCERA-142 continuum#7032
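The equivalent manual change can be sketched as follows (Chef applies this automatically on cluster nodes; the fragment file name here is illustrative):

```shell
# Write the setting to a sysctl.d-style fragment (demo path; on a node this
# would live under /etc/sysctl.d/ and applying it requires root).
echo "kernel.pid_max = 4194304" > ./99-pidmax-demo.conf
# sudo sysctl -p ./99-pidmax-demo.conf   # would apply it immediately
```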

    • Disabled the ureadahead upstart job /etc/init/ureadahead-other.conf. Improves cluster stability under contention. APCERA-247

    • Improved default handling on AWS when ntp.servers is empty in cluster.conf. Previously, this would default to 4 public NTP pool servers. Now, when servers is empty and the ohai attribute ec2.instance_id is present, the default ntp.servers will be 169.254.169.123, the Amazon Time Sync Service. This requires an (empty) ohai hint file at /etc/chef/ohai/hints/ec2.json, which should be present in all current AMIs. APCERA-161 continuum-chef#13 continuum-chef#20
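The hint is simply an empty JSON file at a fixed path; a sketch (a demo directory is used here, whereas on a node the real path is /etc/chef/ohai/hints/ec2.json):

```shell
hint_dir="./ohai-hints-demo"   # on a node: /etc/chef/ohai/hints
mkdir -p "$hint_dir"
: > "$hint_dir/ec2.json"       # an empty file is sufficient as the EC2 hint
```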

    • Improved timekeeping in virtual machines to prevent stratum oscillations. When 169.254.169.123 is present in ntp.servers, a lab-verified best-practice configuration is applied. APCERA-161 continuum-chef#13 continuum-chef#20

    • The HTTP header X-Cc-Request-ID is now logged in router access logs. APCERA-264

APC (3.2.8)

  • Improved performance of the network list subcommand: apc now passes the query param ipam=false to GET /networks. This command never displayed the detailed IPAM data in apc output. APCERA-182

  • (macOS only) Support is now "best effort": apc is no longer packaged as a signed pkg file; an unsigned file is packaged with this release. Users may continue to use an older version of apc by setting the environment variable APC_NO_UPDATES=1, which disables apc auto-update.
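Per the note, pinning an older apc is a one-line environment setting:

```shell
export APC_NO_UPDATES=1   # disables apc auto-update for this shell session
# apc version             # would now run without fetching a newer release
```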

Web Console

  • Improved performance of listing Virtual Networks by passing query param ipam=false to GET /networks. APCERA-182

Guide

  • Improved the explanation of Job Manager cache configuration. APCERA-51

  • Corrected a number of errors and omissions in the API documentation and OpenAPI v2 (Swagger) spec. issuetrackers/trs-and-stories#20

  • Documented the ipam=false query param on GET /networks. APCERA-182

Sep 20, 2019

Cluster (3.2.5)

  • Cluster changes
    • Fixed an issue with NATS payload size limits being exceeded, causing e.g. apc network list to fail with response code 500 (APCERA-69).

    • Fixed issues with excessive logging and scalability in large clusters introduced with 3.2.3 by refactoring of the DNS refresh functionality in the Instance Manager (APCERA-158, APCERA-182).

    • Fixed issue with TCP router deploy failures (APCERA-140).

    • Fixed issue in the Docker image downloader which over time could make the cluster slow and start reporting errors due to nats: timeout (APCERA-57).

    • Fixed issue with Job Manager failing to restart during cluster deploy when the Metricslogs host was replaced (APCERA-121).

    • Fixed issue with incorrect reporting of CPU metrics for jobs and cluster during deploy causing large spikes in CPU graphs (APCERA-145).

    • Fixed issue with Instance Manager logs being flooded during DNS refresh (APCERA-158).

    • Fixed an issue with jobs being prevented from restarting when the update permission was missing in policy (APCERA-143).

  • Chef changes
    • Added Zabbix monitoring of Open vSwitch related processes (APCERA-181).

    • Added rotation of Graphite logs to prevent running out of disk space (APCERA-85).

Orchestrator (2.2.2)

  • No changes.

APC (3.2.5)

  • No changes.

Web Console (9.1.0)

  • No changes.

Guide (3.2.5)

  • No changes.

Jul 11, 2019

Cluster (3.2.4)

  • Cluster changes
    • Fixed an issue with DNS lookup failures within virtual networks (APCERA-156).

  • Chef changes
    • No changes.

Orchestrator (2.2.2)

  • No changes.

APC (3.2.4)

  • No changes.

Web Console (9.1.0)

  • No changes.

Guide (3.2.4)

  • No changes.

May 13, 2019

Cluster (3.2.3)

  • Cluster changes
    • Fixed an issue with the HTTP request for network data which made e.g. the command apc network list very slow (APCERA-56).

    • Fixed an issue with jobs being incorrectly marked as errored on a second job failure. The issue appeared if a job had two failures more than three days apart, with a period of running OK in between (APCERA-59).

  • Chef changes
    • Fixed rotation for /var/log/zabbix and /var/log/glusterfs (APCERA-50).

Orchestrator (2.2.2)

  • No changes.

APC (3.2.3)

  • No changes.

Web Console (9.1.0)

  • No changes.

Guide (3.2.3)

  • No changes.

Oct 26, 2018

Cluster (3.2.2)

  • Cluster changes
    • Fixed a regression where the errored_state_window was not being honored for flapping jobs and such a job would not reach an errored state.

    • Increased the cache size in the Job Manager cache (for the Consul store), reducing contention; this contention was impacting the download of Docker images. The cache sizes for jobs and mapped FQNs (Providers, Services, Bindings) are now configurable through cluster.conf.

    • Updated the Apcera Platform Linux kernel to release 4.4.0-135 which includes fixes to address CVE-2018-12233, CVE-2018-13094, and CVE-2018-13405 as per https://www.ubuntuupdates.org/package/canonical_kernel_team/xenial/main/base/linux-source-4.4.0

  • Chef changes
    • See note above in the cluster section on updating the Linux kernel to release 4.4.0-135.

Orchestrator (2.2.2)

  • No changes.

APC (3.2.2)

  • No changes.

Web Console (9.1.0)

  • No changes.

Guide (3.2.2)

  • Added note indicating the requirement to use Postgres 9.6 for the monitoring database when using AWS RDS.

  • Added note indicating the need to separately administer utilization of resources by services that are outside the management scope of the Apcera Platform.

Aug 24, 2018

Cluster (3.2.1)

  • Cluster changes
    • NOTE: The 3.2.1 release has not been tested on OpenStack.

    • Corrected the defect so that clusters can be deployed in AWS regions that require AWS S3 Signature Version 4 for authentication.

  • Chef changes
    • No changes.

Orchestrator (2.2.2)

  • No changes.

APC (3.2.1)

  • No changes.

Web Console (9.1.0)

  • No changes.

Guide (3.2.1)

  • Corrected multiple dead links in the documentation.

Mar 29, 2018

Cluster (3.2.0)

  • Upgrade Notes
    • Apcera Platform release 3.2.0 is an LTS release with significant platform changes. Before upgrading to this release, be sure to read the upgrade instructions.

  • Cluster changes
    • DEPRECATION NOTICE: The use of Gluster to provide cluster configured NAS for user applications running in the cluster is deprecated in the Apcera Platform 3.2 release. For user applications requiring HA NAS running in a cluster, Apcera's recommended configuration is to manage external NFS systems and configure an NFS provider for jobs running in the cluster, for example AWS EFS. (The use of Gluster in a cluster for package storage is unaffected by this announcement and is still supported.)

    • Job Rolling Restarts are now supported as an alternative job updating mechanism that supports updating jobs with no downtime or loss in level of service.

    • Job routes now support UDP.

    • Enhanced job health monitoring is now available via a TCP-based "liveness" check for jobs with non-optional ports. The interval between checks is configurable via cluster.conf settings.

    • Fixed issue with Apcera installation error on Azure platform (Raw Terraform install).

    • Fixed issue with terraform bundle for different providers.

    • Updated API/data changes for the new UDP port type and route, including support for a separate UDP router component.

    • Fixed bug regarding application performance degradation when writing to STDOUT.

    • Fixed an issue with events server performance that could overload the central host CPU and DoS the cluster.

    • Fixed an issue where active events server clients could cause a restart to time out, and a deploy to fail.

    • Fixed issue with v3.0 / Virtual Network / Stale Discovery Address.

    • Fixed an issue where PM gluster client log file names disrupted logrotate.

    • Apache packages are now configured to log the original client IP.

    • Improved the additional values in job syslog output. When log forwarding is initialized, a 'marker' log message is now sent which includes the FQN of the job whose logs are being forwarded.

    • Improved netfilter logs.

  • Chef changes
    • Added sanity check in Chef to ensure that remote_subnets are CIDRs.

    • Implemented changes in Chef to support UDP router as separate component.

    • Added Chef support for CEP-MNO default route.

    • Added Chef support for the TCP Liveness Probe variable.

Orchestrator (2.2.2)

  • Updated source compilation environment to use Go 1.9.4.

  • Minor bug fixes for Orchestrator's support for Vault.

  • Minor bug fixes for Orchestrator's backup command. (orchestrator-cli)

  • Minor bug fixes for handling network errors during reboot and deploy.

  • Orchestrator-cli is now included in the release package so that it can update itself prior to deploy.

  • Orchestrator-agent is now included in the release package and is installed during deploy.

APC (3.2.0)

  • Updated LDAP underlying library to support UTF-8.

  • Improved logging from apc to allow for better diagnostics when troubleshooting.

  • Fixed issue in apc so that encrypt value for job can be changed with curl.

  • Improved user accessibility of PM migration status.

  • Fixed an issue with the event subscription API where it was leaking user authorization tokens into logs.

  • Improved apc cluster usage report.

  • Added discovery address to information displayed in apc network show.

  • Fixed issue with apc download where in some cases it was not downloading the full package.

  • Improved exporting of quotas.

  • Fixed support of special characters in LDAP Password.

  • Fixed an issue with inability to use policy matching a Job FQN or Network FQN.

  • Fixed issue with internal server error after retiring a package using apc app create.

  • Fixed issue with setting https-only value using apc route update.

  • Updated default policy to display serviceParam encrypt.

  • Improved how apc job show command displays the routes.

  • Corrected behavior for the apc login --batch option, which now fails if the user is already logged in.

  • Fixed issue where apc app from package command may stage a package even if it isn't supposed to.

  • Improved how apc job show command displays which ports on a job are marked optional.

  • Improved how apc network show command displays the subnet pool name.

  • Fixed issue with apc job show nfs-admin.

  • Improved how apc app stats handles unbound resources.

  • Fixed apc job update job_name so that removing hard/soft tags no longer requires a restart.

Web Console (9.1.0)

  • UDP ports can be added to jobs. UDP routes can be mapped to jobs on UDP ports.

  • Job Rolling Restarts added to jobs. Docker jobs and capsules may be created with rolling mode enabled. The instances table on the job scheduling page shows two additional fields: the number of instances up-to-date and running, and the instances that will be restarted. A progress modal is shown during job update and multi-resource manifest upload when rolling mode is enabled on the job.

  • Certificates/Secrets are now supported.

  • Autoscaling added to jobs.

  • Converted IEC units to SI units for reporting RAM and disk usage.

  • Added network graph to cluster page.

  • Instance manager table can be filtered by tags.

  • Added a restart confirmation prompt when enabling or disabling SSH.

  • A job is now restarted when an environment update fails.

  • Added more repositories from which you can create a job using a Docker image.

  • Password field for a job is now more secure.

  • Enhanced job graphs.

  • Added OS version in help section.

  • Cross-site requests using credentials (such as cookies) can be enabled using an environment variable on the lucid job.

  • Success/Error notification messages now always stick to the top of the window.

  • Empty environment variables are now ignored.

  • Tutorial links now link to the correct site.

Guide (3.2.0)

  • Updated API documentation for rolling restart functionality, including new flags in existing APC commands.

  • Added documentation for configuring and managing rolling restarts of jobs.

  • Updated API documentation for working with UDP ports/routes.

  • Added documentation for configuring and managing UDP ports/routes in APC, MRM and web console.

  • Added documentation for configuring TCP Liveness Probe checks.

  • Updated documentation about new orchestrator-cli capabilities.

  • Updated the Azure Terraform installation sections of the documentation.

  • Removed Apcera Install CLI (beta) documentation.

  • General documentation corrections throughout for typos, etc.

  • Updated documentation to make exporting serviceParams clear.

Mar 01, 2018

Cluster (3.0.3)

  • Cluster changes
    • NOTE: The 3.0.3 release has not been tested on OpenStack.

    • Updated the kernel with the OS vendor's latest release, which addresses the Meltdown vulnerability.

    • Improved timeout and retry logic when checking AWS S3 for package resources.

  • Chef changes
    • No changes.

Orchestrator (2.0.21)

  • No changes.

APC (3.0.3)

  • No changes.

Web Console (9.0.0)

  • No changes.

Guide (3.0.3)

  • No changes.

Oct 23, 2017

Cluster (3.0.1)

  • Cluster changes
    • NOTE: The 3.0.1 release has not been tested on OpenStack.

    • Fixed issue in virtual networks where DNS entries for a job's discovery address would become stale. This would occur if a member job in a virtual network was deleted, recreated, and re-joined to the same virtual network.

    • Fixed issue where PPIDs were incorrectly parsed in /proc/<pid>/stat.

    • Fixed policy error when joining a job to a virtual network for some policy configurations.

  • Chef changes
    • No changes.

Orchestrator (2.0.22)

  • Fixed issue when restoring an Orchestrator database backup using the orchestrator-cli restore command.

  • Improvements to orchestrator-cli backup command.

APC (3.0.1)

  • Fixed issue with data reported by the apc cluster usage command. Previously, usage reports for a given day would change depending on when the query was executed. This was due to side-effects of the data retention policy configured for the cluster's metrics storage system (Graphite). Note that the new command will only display cluster usage that occurred since the cluster was updated to this release (3.0.1). See the APC documentation for more information.

Web Console (9.0.0)

  • No changes.

Guide (3.0.1)

  • Added Metrics API documentation.

  • Updated NFS service gateway documentation.

Sep 22, 2017

Cluster (3.0.0)

  • Upgrade Notes
    • Apcera Platform release 3.0.0 is an LTS release with significant platform changes. Before upgrading to this release, be sure to read the upgrade instructions.

    • The Apcera-provided Terraform modules have been updated, including the retirement of previous generation instance types in favor of new ones (for example, replacing the M3 type with T2 for AWS). Note that these will be destructive changes if you download and use these updated modules to perform the upgrade. The recommendation is to upgrade using your existing Terraform modules and then migrate to the new instance types over time.

    • If you previously enabled local IPAM (beta) for your 2.6.x installation, you will need to disable local IPAM (by commenting it out) and revert to global IPAM before upgrading to release 3.0.

    • Apcera Platform release 3.0.0 features a new component store that improves availability. If you want to migrate from Store 2 to Store 3, upgrade to 3.0.0 first, then migrate at a later time.

  • Cluster changes
    • Added container log truncation which prevents logs from growing more than 10MB.

    • Job Autoscaling added as part of the platform.

    • Added subnet pools for virtual networks.

    • Fixed an issue where the JM would incorrectly state a job update contained no changes when certain environment variable changes were made.

    • Fixed an issue where soft negative scheduling tags were not being applied correctly.

    • Added new OvS driver for virtual networks.

    • Integrated with Hashicorp Vault backed by Consul for secure storage of cluster secrets. This first phase of integration stores component keys, database passwords and (optionally) external auth server connection credentials.

    • HTTP Router errors now return empty HTML pages; previously, Apcera-branded pages were used.

    • Fixed issue where instance errors could permanently penalize IM and introduce scheduling artifacts.

    • Added an event message for decreasing a job's instance count (previously there was only one for increases).

    • Added the "domain" endpoints for installing, uninstalling and listing (POST, DELETE and GET, respectively) certificates and private-keys for domains on the router.

    • Added the subnet pool resource for configurable virtual networks. Supports POST, DELETE and GET actions on the resource.

    • Significantly sped up the /v1/version endpoint.

    • When updating a job (i.e. PUT /v1/jobs/:uuid:), if there are no changes to the job, you receive an HTTP 200 response with the unmodified job instead of an error.

    • Added the 'secret' set of endpoints for certificate/secret functionality. Supports POST, GET and DELETE actions for importing, listing, and deleting secrets/certificates.

  • Chef changes
    • If you have deployed an APCFS high-availability file system, this release will upgrade GlusterFS from version 3.7.8 to version 3.8.12 and Ganesha NFS from version 2.3.0 to version 2.4.5.

    • Added some missing certificate authorities to the system CA list, required for validating connections to external services signed by those CAs.

    • Corrected a typo in the splunk-forwarder tag when untagging.

    • Deploy, configure and populate Hashicorp Vault. Migrates component keys and database password out of orchestrator/chef database and cluster file system and into Vault.

    • Orchestrator version updated to 2.0. This version of orchestrator includes Vault support.

    • Introduced new dynamic taint adjustment options.

    • Allowed forced rotation of router HTTP access logs.

    • Updated Splunk (where used) to version 6.5.3.

Orchestrator (2.0.7)

  • Improved 'Downloading' progress message.

  • Retained CA key and database password between multiple deploys.

  • Deploy command outputs a warning message if audit logging on Vault cannot be enabled.

  • Fixed an issue with the Vault status check during deploy.

  • Enabled Vault audit logging to syslog.

  • Fixed an issue in orchestrator agent that causes the agent process to panic.

  • Fixed an issue in the teardown command when the machine number could not be reclaimed.

  • A reusable Vault token now replaces the one-time-use Vault token during deploy.

  • Enabled Consul backend functionality and set component ACLs.

  • Added secret/encfs to JM and IM Vault permissions.

  • Updated the Zabbix token on deploy and refresh commands.

  • Improved component secret reliability.

  • Fixed a deploy bug related to Orchestrator not exiting after a Chef error.

  • Fixed an issue in reclaiming machine number when rescaling cluster down.

  • Increased IM ID limit to 4096.

  • Added the ability to store component secrets in vault.

  • Fixed an issue in log collection.

  • Fixed an issue in backup command.

  • Added support for components revoking already-used tokens.

  • Removed the refresh-vault-token command and implemented the security command.

  • Added configuration of Vault, including enforcement of cluster passphrase and encryption/decryption of persisted answers.

  • Fixed chef output log collection issue.

APC (3.0.0)

  • Added multiple commands (apc subnet pool create/delete/list/show) associated with the newly defined subnet pool resource.

  • Updated the apc network create command to take in a user specified subnet pool (--pool).

  • Fixed an issue where apc job list output would be nondeterministic if a job was more than one of: app, gateway, stager, pipeline.

  • Fixed a bug where apc app delete would consider the app name in a manifest file, but not the namespace.

  • Fixed a bug which caused temporary files to be left on the user's machine after updating APC.

  • Updated multiple APC commands to consistently use the flags -i, --instance-id when specifying an instance id.

  • Fixed an issue where app deploy --keep-previous=false would not remove the old package if the app was stopped.

  • Updated APC help for default route naming scheme.

  • Added ANSI terminal emulation support on Windows for a better experience, especially when connected to a Linux container.

Web Console (9.0.0)

  • Added new UI for creating routes and mapping routes to jobs.

  • Added new ability to manage secrets/certificates.

  • Added UI for configuring job auto-scaling.

  • The Docker launcher UI no longer includes the curated list of Docker images that was present in prior releases.

  • Added the cluster OS version to the Help popup menu.

Guide (3.0.0)

  • New API documentation generated from OpenAPI specification.

  • Updated API documentation to include "v2" endpoints and new API features, including for managing secrets and routes.

  • Added documentation for configuring and managing secret storage and encryption.

  • Added documentation for managing SSL certificates and keys.

  • Added documentation for data encryption at rest.

  • Added new architecture diagram.

  • Added documentation for configurable networks.

  • Added documentation for configuring job auto-scaling.