Configuring Gluster

This section describes how to configure your Apcera Platform Enterprise Edition to use Gluster and nfs-ganesha for high-availability storage of user data and platform packages.

Overview

Gluster provides a reliable, scalable, and flexible distributed file system. Apcera utilizes Gluster to provide:

  • APCFS-HA - highly available NFS services for Apcera Platform jobs and user data
  • Gluster Package Storage Backend - highly available file system for storing Apcera Platform Packages.

APCFS-HA Overview

The architecture diagram below shows an example of an HA NFS server configuration for a cluster.

[Architecture diagram: highly available NFS-Ganesha servers backed by Gluster]

The diagram shows highly available NFS-Ganesha servers using Gluster as the backend storage system. NFS-Ganesha is a user-mode file server that supports NFS (v3, v4.0, v4.1, pNFS) and 9P from the Plan9 operating system, and it addresses the performance challenges that many earlier user-space file systems faced.

Gluster Package Storage Backend Overview (Technical Preview)

Starting with Apcera release 2.6.0, Gluster is the preferred package storage backend for new clusters. Gluster is
scheduled to supersede Riak-CS in a future release.

Provisioning Gluster Services

To add Gluster-based services to an Apcera cluster, you need to provision three (or a multiple of three) machines with the proper tag, each with a dedicated disk of equal size for file storage.

  • Machines tagged gluster-server provide APCFS-HA. The recommended volume size is 200 GB per VM.
  • Machines tagged cluster-object-storage provide Gluster Package Storage Backend. The recommended volume size depends on the size of your cluster.

Tagging a machine with both gluster-server and cluster-object-storage is not supported.

Note: Apcera provides a Terraform module (gluster.tf) to automate the provisioning of Gluster server resources for AWS. Refer to the AWS installation instructions for more details about using Terraform to create AWS resources.
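Before running the deploy, you can verify that each tagged machine has its dedicated data disk attached. A minimal check (the /dev/xvds device path is the one assumed in the chef examples below and may differ in your environment):

$ lsblk /dev/xvds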

Configuring APCFS-HA via Orchestrator (cluster.conf)

This section describes highly available NFS server configuration details for your Apcera deployment. You do this by updating the cluster.conf file as described below.

Machines

Under the machines schema, the host IPs of the Gluster servers need to be listed as shown in the example below:

machines: {
  ...
  # Gluster, for HA NFS
  gluster: {
    hosts: ['10.0.0.160', '10.0.1.50', '10.0.2.241']
    suitable_tags: [
      "gluster-server"
    ]
  }
}

Components

Add the component count for the Gluster servers: gluster-server: <# of gluster servers>. For example, if you are adding three Gluster servers to your cluster, the component schema should include the gluster-server keyword as shown below:

components: {
  ...
  # Gluster must run on a multiple of three hosts for replication
  gluster-server: 3
  ...
}

Chef

The dedicated disks for the Gluster servers must have the purpose gluster-brick0 under the chef schema, as shown in the following example.

chef: {
  "continuum": {
    ...

    # Specify mount settings. The Instance Manager uses LVM for container
    # volumes. This sets the default to be /dev/xvdb.
    "mounts": {
      ...

      "gluster-brick0": {
        # TERRAFORM OUTPUT: gluster-device
        "device": "/dev/xvds"
	      "logical_volumes": {
          "gluster-brick0": {
	           "thin_provision": "150G"
	        }
	      }
      }
      ...

This entry controls what portion of the raw disk volume is available for file storage; the rest is reserved for snapshot backups. For example, if you provisioned a 200 GB volume for each Gluster server and set thin_provision to 150G, 50 GB of the volume remains reserved for snapshot backups.

NOTE: Changing this value after initial deployment requires re-provisioning the Gluster hosts.
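As a sizing illustration only (following the 75% guideline noted in the vSphere example below), a hypothetical 400 GB brick disk would be configured with a 300G thin-provisioned logical volume, leaving roughly 100 GB reserved for snapshots:

"gluster-brick0": {
  "device": "/dev/xvds"
  "logical_volumes": {
    "gluster-brick0": {
      "thin_provision": "300G"  # ~75% of a hypothetical 400 GB disk
    }
  }
}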

vSphere example configuration

The following shows an example vSphere configuration for Gluster:

Machines

machines: {
  # The gluster machines are used for HA storage for NFS service within the cluster.
  gluster: {
    cpu:    2
    memory: 4096
    disks: [
      { size: 200, purpose: gluster-brick0 }
    ]
    suitable_tags: [
      "gluster-server"
    ]
  }
}

Components

components: {
  # Gluster must run on a multiple of three hosts for replication
  gluster-server: 3
}

Chef

chef: {
  "continuum": {
    "mounts": {
      "gluster-brick0": {
        "logical_volumes": {
          "gluster-brick0": {
            "thin_provision": "150G"  # Recommended value: 75% of the raw disk size. This is the size available for file storage, vs overhead for snapshots, etc.
          }
        }
      }
    } # mounts
    "gluster": {
      # See the Snapshot Backups section below before configuring.
      # "enable_snapshots" = true
    } # gluster
  } # continuum
} # chef

Configuring Gluster Package Storage Backend via Orchestrator (cluster.conf)

This section describes Gluster Package Storage Backend configuration details for your Apcera deployment. You do this by updating the cluster.conf file as described below.

Machines

Under the machines schema, the host IPs of the Gluster Package Storage Backend servers need to be listed as shown in the example below:

machines: {
  ...
  cluster_storage: {
    hosts: ['10.0.0.161', '10.0.1.51', '10.0.2.242']
    suitable_tags: [
      "cluster-object-storage"
    ]
  }
}

Components

Add the component count for the Gluster Package Storage Backend servers: cluster-object-storage: <# of Gluster Package Storage Backend servers>. For example, if you are adding three Gluster Package Storage Backend servers to your cluster, the component schema should include the cluster-object-storage keyword as shown below:

components: {
  ...
  # Gluster Package Storage Backend must run on a multiple of three hosts for replication
  cluster-object-storage: 3
  ...
}

Chef: Mounts

Dedicated disks for Gluster Package Storage Backend servers must have the purpose cluster-object-storage-brick0 under the chef.continuum.mounts schema, as shown in the following example. These servers do not use LVM.

chef: {
  "continuum": {
    ...
    "mounts": {
      ...

      "cluster-object-storage-brick0": {
        "device": "/dev/xvds"
      }
      ...

vSphere example configuration

The following shows an example vSphere configuration for Gluster Package Storage Backend:

Machines

machines: {
  cluster_storage: {
    cpu:    2
    memory: 4096
    disks: [
      { size: 200, purpose: cluster-object-storage-brick0 }
    ]
    suitable_tags: [
      "cluster-object-storage"
    ]
  }
}

Components

components: {
  # Gluster Package Storage Backend must run on a multiple of three hosts for replication
  cluster-object-storage: 3
}

Diagnostic Commands

Here are some diagnostic commands you can run when checking the status of the Gluster servers and their volumes (see the usage example following the list).

  • gluster pool list - Shows the servers known to Gluster and their current connection state.
  • gluster volume info - Shows configuration information for a volume, including all configured bricks.
  • gluster volume status - Shows the active status of a volume, including all running bricks.
  • gluster volume rebalance - Migrates data between servers to rebalance disk usage. Gluster allocates entire files to a set of servers, so over time one set of servers can end up with more data than others, and a rebalance may be necessary to redistribute the data.
  • gluster volume heal <name> info - Checks the health status of Gluster, reporting any files that currently do not have the expected number of replicas.
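For example, to run these commands you can log in to one of the Gluster servers from the orchestrator (the /gluster-master host name and the apcfs-default volume name are taken from the snapshot restore procedure below and may differ in your cluster):

On the orchestrator:

# orchestrator-cli ssh /gluster-master

On the gluster-master:

# sudo gluster pool list
# sudo gluster volume status
# sudo gluster volume heal apcfs-default info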

Snapshot Backups

This section describes Gluster features for snapshotting data. Snapshot settings only apply to APCFS-HA.

Snapshot Value

Gluster snapshot backups provide point-in-time snapshots of the contents of a Gluster volume.

Caveats

Gluster snapshots are implemented in the underlying storage layer of each brick using LVM thin provisioning and LVM snapshots. Unfortunately, Gluster does not handle situations where individual bricks have mismatched sets of LVM snapshots well. Under normal operation this is not a problem, but any time a host is replaced the mismatch causes various Gluster failures. The workaround is to delete all existing snapshots BEFORE running a deploy that replaces any hosts. To do this, log in to one of the Gluster server machines as root and run gluster snapshot delete all.

Because of these issues and risks, Gluster snapshots are not enabled by default.
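For example, the workaround can be applied from the orchestrator (the /gluster-master host name follows the restore procedure below):

On the orchestrator:

# orchestrator-cli ssh /gluster-master

On the gluster-master:

# sudo gluster snapshot list
# sudo gluster snapshot delete all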

Enabling Snapshots

To enable snapshot backups, set the following entry in the cluster.conf:

chef: {
  "continuum": {
    ...
    "gluster": {
        "enable_snapshots" = true
   }
  }
}

Once enabled, a Gluster snapshot will be created daily, and a snapshot will be created during every cluster deploy.
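To confirm that snapshots are being created after a deploy, you can list them on one of the Gluster servers (the same command used in the restore procedure below):

# sudo gluster snapshot list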

Using a snapshot

Because snapshots affect an entire volume, and multiple Apcera NFS services live within a single volume, Gluster's snapshot restore mechanism cannot be used without impacting multiple NFS services. Instead, to restore data from a previous snapshot, you must mount the snapshot and copy the data into the live volume. The process looks like the following:

1) Identify the unique sub-directory of the Gluster volume that contains your data. The simplest way to do this is to create a capsule, bind that capsule to your NFS service, and look at the mounted NFS partition:

$ apc capsule create -i linux identify-nfs-service

$ apc service bind my-nfs-service -j identify-nfs-service

$ apc capsule connect identify-nfs-service

Inside the capsule:

# df | grep nfs
apcfs-ha.localdomain:/apcfs-default/5ab0dca3-c48c-4d24-8505-9f2064379bcc8c425fbb-29fb-4547-8100-4dbab27462c9 154687488 505856 146300928   1% /nfs/apcfs-default
# exit

Delete the capsule.

$ apc capsule delete identify-nfs-service

Note: The sub-directory in the example above is 5ab0dca3-c48c-4d24-8505-9f2064379bcc8c425fbb-29fb-4547-8100-4dbab27462c9

2) Log in to the gluster-master server and determine which snapshot you want to mount and recover data from.

On the orchestrator:

# orchestrator-cli ssh /gluster-master

On the gluster-master:

# sudo gluster snapshot list
apcfs-default_GMT-2016.03.15-18.40.15

3) Activate and mount the snapshot volume on the gluster-master host.

# sudo gluster snapshot activate apcfs-default_GMT-2016.03.15-18.40.15

# sudo mkdir /mnt/snapshot

# sudo mount -t glusterfs localhost:/snaps/apcfs-default_GMT-2016.03.15-18.40.15/apcfs-default /mnt/snapshot

4) Verify the snapshot is the data you want to restore by inspecting the contents of the subdirectory you identified earlier, e.g. /mnt/snapshot/5ab0dca3-c48c-4d24-8505-9f2064379bcc8c425fbb-29fb-4547-8100-4dbab27462c9

5) Create a snapshot of the current data, mount the live Gluster volume and restore the old data.

# sudo gluster snapshot create apcfs-default apcfs-default
snapshot create: success: Snap apcfs-default_GMT-2016.03.15-18.40.15 created successfully

# sudo mkdir /mnt/apcfs-default

# sudo mount -t glusterfs localhost:/apcfs-default /mnt/apcfs-default

# cd /mnt/snapshot

# sudo rsync -av 5ab0dca3-c48c-4d24-8505-9f2064379bcc8c425fbb-29fb-4547-8100-4dbab27462c9/ /mnt/apcfs-default/5ab0dca3-c48c-4d24-8505-9f2064379bcc8c425fbb-29fb-4547-8100-4dbab27462c9/

6) Unmount the gluster volume and snapshot.

# sudo umount /mnt/apcfs-default
# sudo umount /mnt/snapshot
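Optionally, once the restored data is verified, you can deactivate the snapshot you activated in step 3 (gluster snapshot deactivate is a standard Gluster command; the snapshot name is the one identified in step 2):

# sudo gluster snapshot deactivate apcfs-default_GMT-2016.03.15-18.40.15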

Scaling

This section describes how to scale up and scale down Gluster servers.

Scale Up

You must remove all snapshots before attempting a scale-up operation by running gluster snapshot delete all. See Snapshot Backups Caveats.

Adding more capacity to a cluster can be done by adding three or more servers (a multiple of three). The disk sizes should be the same as those of the existing servers. If you want to increase the disk size, you can do so by replacing one server at a time with a new server that has a larger disk. Between server replacements, you must wait until data replication completes and the new server is fully populated; the gluster volume heal <name> info command listed above can be used to check that status, as shown below.
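For example, after each server replacement you might confirm from one of the Gluster servers that the new host has joined and replication has caught up before replacing the next server (assuming the apcfs-default volume name used elsewhere in this document):

# sudo gluster pool list
# sudo gluster volume heal apcfs-default info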

Scale Down

Scaling a Gluster cluster down is a delicate operation; therefore, it is NOT AUTOMATED. Apcera strongly recommends engaging Apcera Support to assist with this operation.