Working with Semantic Pipelines

A semantic pipeline is a job that mediates connections between jobs and services with awareness of wire protocol semantics. Access to any service or service resource can be configured to require a semantic pipeline. Clusters can have multiple semantic pipelines, each keyed to a specific protocol. Apcera comes with semantic pipelines for MySQL, PostgreSQL, Memcache, Redis, and HTTP. Each semantic pipeline understands its associated protocol: it accepts client connections, authenticates them locally, and then establishes connections onward to the backing servers using separate, ephemeral credentials where the protocol supports them.

Overview

The semantic pipeline passes data between the client and server, based on decisions made by handlers for specific events. An event is any significant occurrence for the protocol associated with that semantic pipeline. For example, for databases there are events for SQL commands like DELETE, DROP, INSERT, and UPDATE. An event handler is a job or external service that the semantic pipeline interacts with when the associated event occurs.

For example, you can provide an event handler that gets called when a PostgreSQL command of DELETE or DROP comes from a job and is headed for a service of type postgres. In connecting that event handler to the semantic pipeline, you may tell the semantic pipeline to wait for a go-ahead before proceeding. An event handler that interacts that way is called a hook. On the other hand, if the command is a SELECT, you might want the semantic pipeline to pass the command to the database right away without waiting for the event handler to respond. That kind of event handler interaction is called a notification. You might use it for logging information about database requests.

A semantic pipeline instance is never shared between jobs. For performance reasons, the system creates a dedicated instance of the semantic pipeline and pairs it with the bound job.

Semantic pipeline types

Apcera Platform provides several types of semantic pipelines that you can use out of the box. With a semantic pipeline in place, you can write event handlers for custom actions in the language of your choice. Examples include inserting latency, custom logging, and dynamic administrative access controls.

Apcera Platform provides the following semantic pipelines:

Type | Description | Protocol Support | Ephemeral Credentials
http | Direct internet connection using a semantic pipeline. | Supports HTTP/1.1 or HTTP/1.0 with a Host header. | No
memcache | Memcache caching service. | Supports both ASCII and binary variants of the Memcache protocol. The Command values reported in the Event payload are set to one of the following values: UNKNOWN, GET, SET, ADD, REPLACE, APPEND, PREPEND, INCRDECR, QUIT, FLUSH, NOOP, VERSION, STATS, VERBOSITY, TOUCH, DELETE, SASL_LISTMECH, SASL_AUTH, or SASL_STEP. Note that a number of command variants are mapped to a single value; for example, get/gets/rget all map to GET; similarly, set/rset/cas all map to SET. TLS and SASL are currently not supported. | No
mysql | MySQL database server. | Supports MySQL protocol version 4.1 and later. | Yes
postgres | PostgreSQL database server. | Supports PostgreSQL protocol v3.0, used by PostgreSQL 7.4 and later. The semantic pipeline expects the PostgreSQL database to have a TLS certificate installed, such as the default snakeoil certificate. If the database is configured without any TLS support, the semantic pipeline won't be able to connect to it; the SP always up-levels the database connection to use TLS. | Yes
redis | Redis key-value store. | Supports the Redis protocol. TLS is not currently supported. | Yes

Disabling semantic pipelines

By default, when a job is bound to a service of a type that supports semantic pipelines, the system automatically creates a semantic pipeline job to mediate the service binding connection. When a semantic pipeline is generated, the only way to connect to the service is with the ephemeral credentials provided by the service binding.

Starting with Apcera release 2.4.0, you can disable automatic semantic pipeline generation using the policy claim type sp.disable on the job::/ realm. If such a policy is in place, a semantic pipeline is not generated for access to that service. With the semantic pipeline disabled via policy, you can connect to the service using either the binding credentials or the credentials used to set up the provider.

See disabling semantic pipelines for details.

Using event handlers

Semantic pipelines communicate with event handlers using HTTP and JSON. Apcera also provides simple event handlers for hooks and notifications directly within semantic pipelines for fast-path evaluation of simple rules.

The way to connect an event handler with a semantic pipeline is to create a rule. Rules are associated with a given service or service resource. You develop the event handler code to listen at a certain URL, and you specify the URL as an argument to the command that creates the rule. The semantic pipeline uses that URL to deliver events to the event handler. The form of an event is a JSON structure that tells the handler what triggered the call and supplies the associated protocol-specific data. The event handler responds with a JSON structure that either permits or prohibits the requested operation and says why.
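
The exchange described above can be sketched as a small hook event handler. This is a minimal illustration rather than Apcera's reference implementation: it assumes the semantic pipeline delivers each event as a JSON body in an HTTP POST, and the names decide, HookHandler, serve, DENIED_COMMANDS, and the port number are all hypothetical.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer

# Hypothetical policy for this sketch: commands the handler refuses.
DENIED_COMMANDS = {"DELETE", "DROP"}


def decide(event: dict) -> dict:
    """Build the hook response for one event sent by the semantic pipeline."""
    command = str(event.get("Command", "")).upper()
    if command in DENIED_COMMANDS:
        return {"Permitted": False, "Reason": f"{command} is blocked by policy"}
    return {"Permitted": True, "Reason": f"{command} allowed"}


class HookHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        # Assumption: the event arrives as a JSON object in the POST body.
        length = int(self.headers.get("Content-Length", 0))
        try:
            event = json.loads(self.rfile.read(length))
            if not isinstance(event, dict):
                raise ValueError("event must be a JSON object")
        except ValueError:
            self.send_error(400, "malformed event payload")
            return
        body = json.dumps(decide(event)).encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)


def serve(port: int = 8080) -> None:
    """Listen on the address whose URL would be given to the rule."""
    HTTPServer(("", port), HookHandler).serve_forever()
```

Calling serve() starts the listener; the URL of that listener is what you would pass when creating the rule. Keeping the decision in a plain function such as decide makes the policy easy to test without a running server.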

A semantic pipeline can send a notification to an event handler at any of the following points in the lifecycle of a client request to a service resource:

  • Pre-hook: before sending the command to any event handler. The notification includes all commands issued by the client.
  • Post-hook: after sending the command to enough event handlers to know whether it is allowed, but before the command is passed to the service or denied. The notification includes the results of the hook evaluation; the handler can examine the payload to determine the permission status.
  • Round-trip (not available with all protocols): when a response is received from the service. The protocol determines whether the roundtrip notification is sent at the beginning or the end of receiving a response. If hooks deny the command, the semantic pipeline does not send a roundtrip notification for it. Round-trip notifications are not available with, for example, mysql or redis.
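
The ordering of these notification points can be modeled in a few lines. The sketch below is only a simulation for reasoning about the lifecycle, not pipeline code. The stage labels "pre" and "roundtrip", the function name trace_request, and the rule that a command is allowed only when every hook permits it are assumptions made for illustration.

```python
from typing import Callable, Iterable, List, Tuple

Hook = Callable[[str], bool]  # returns True to permit the command


def trace_request(command: str, hooks: Iterable[Hook],
                  roundtrip_supported: bool) -> Tuple[bool, List[str]]:
    """Return (permitted, notification stages emitted) for one command."""
    stages = ["pre"]  # pre-hook: sent before any event handler sees the command

    # Assumption: evaluation stops at the first hook that denies the command.
    permitted = all(hook(command) for hook in hooks)

    # Post-hook: the outcome is known, but the command has not yet been
    # forwarded to the service or denied.
    stages.append("post")

    # Round-trip: only for permitted commands, and only if the protocol supports it.
    if permitted and roundtrip_supported:
        stages.append("roundtrip")
    return permitted, stages
```

For example, with a single hook that denies DROP, a DROP command yields only the pre and post stages, while a SELECT on a protocol with round-trip support yields all three.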

Writing semantic pipeline rules

A semantic pipeline rule associates the event handler with a semantic pipeline.

Operations allowed on rules include the following:

Operation | Description
create | Creates a rule governing the behavior of a semantic pipeline for a service resource
delete | Deletes a rule created on a semantic pipeline
list | Lists all available rules for the semantic pipeline
show | Shows detailed information about a rule and the service and service resource pair on which it acts

The apc rule create command adds an event handler to a semantic pipeline.

apc rule create <rule-name> [command-specific-options]

Run apc rule create -h to see the possible arguments.

The following table shows commands you can include in rules directed at supported protocols:

Protocol | List of Commands
PostgreSQL | Any SQL command, e.g. SELECT, DROP, CREATE, INSERT, etc.
MySQL | Any SQL command
Memcache | UNKNOWN, GET, SET, ADD, REPLACE, APPEND, PREPEND, INCRDECR, QUIT, FLUSH, NOOP, VERSION, STATS, VERBOSITY, TOUCH, DELETE, SASL_LISTMECH, SASL_AUTH, SASL_STEP
Redis | Any Redis command in the Redis command reference
HTTP | http

For Memcache, a number of similar commands are flattened down to one common command name for specifying hooks. Thus, for instance, any of get/gets/rget are presented as GET, while any of set/rset/cas are presented as SET.
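
As a quick reference when choosing the value for a rule, the flattening can be expressed as a lookup. This sketch covers only the variants named above; the function name rule_command_for is hypothetical, and returning None for any other variant reflects that its mapping is not documented here, not how the pipeline itself behaves.

```python
from typing import Optional

# Command values the Memcache semantic pipeline reports, as listed above.
COMMAND_VALUES = {
    "UNKNOWN", "GET", "SET", "ADD", "REPLACE", "APPEND", "PREPEND", "INCRDECR",
    "QUIT", "FLUSH", "NOOP", "VERSION", "STATS", "VERBOSITY", "TOUCH", "DELETE",
    "SASL_LISTMECH", "SASL_AUTH", "SASL_STEP",
}

# Only the variants named in the text; other mappings are not documented here.
VARIANTS = {
    "get": "GET", "gets": "GET", "rget": "GET",
    "set": "SET", "rset": "SET", "cas": "SET",
}


def rule_command_for(wire_command: str) -> Optional[str]:
    """Return the command name to use in a rule, or None if it is not documented."""
    name = wire_command.strip().lower()
    if name in VARIANTS:
        return VARIANTS[name]
    if name.upper() in COMMAND_VALUES:
        return name.upper()
    return None
```

So a rule meant to catch cas requests would specify SET, and one meant to catch gets would specify GET.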

Rule examples

Here are some examples of semantic pipeline rules.

In addition to the following examples, refer to creating semantic pipeline rules.

apc rule create denydelete-PG --service pgdb -t hook --commands delete
apc rule create my_pg_rule -t notification --provider postgres --service mydb -u http://site/url --stage post

JSON example

Here is a sample of what a semantic pipeline might send to a hook event handler:

{
  "Text": "create table foo(a varchar(32), b integer);\u0000",
  "Datastore": "apcera1",
  "ClientInfo": {
    "ClientAddress": "127.0.0.1:50408",
    "ClientUser": "barney",
    "DatastoreUser": "apceratest1"
  },
  "Command": "CREATE",
  "Parameters": null
}

The event handler must respond with a JSON structure containing the following:

Response Field | Type | Description
Permitted | boolean | Decides whether or not the request is allowed
Reason | string | Included in the logs

For example:

{
  "Permitted": true,
  "Reason": "creation allowed"
}
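
When testing an event handler, it can help to check its responses against the two required fields before connecting it to a semantic pipeline. The following is a minimal sketch of such a check; parse_hook_response is a hypothetical helper, not part of Apcera, and it treats a missing Reason as an error on the assumption that both fields are mandatory.

```python
import json
from typing import Tuple


def parse_hook_response(body: str) -> Tuple[bool, str]:
    """Return (permitted, reason), or raise ValueError if the response is unusable."""
    data = json.loads(body)  # json.JSONDecodeError is a subclass of ValueError
    if not isinstance(data, dict):
        raise ValueError("response must be a JSON object")
    # Field names are case-sensitive: "permitted" does not count as "Permitted".
    if not isinstance(data.get("Permitted"), bool):
        raise ValueError('missing or non-boolean "Permitted" field')
    if not isinstance(data.get("Reason"), str):
        raise ValueError('missing or non-string "Reason" field')
    return data["Permitted"], data["Reason"]
```

A response that spells the field as "permitted" fails this check, which is a common mistake in languages whose JSON encoders lower-case field names by default.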

Example of a semantic pipeline payload for a roundtrip notification:

{
  "Text": "create table foo(a varchar(32), b integer);\u0000",
  "Datastore": "apcera1",
  "ClientInfo": {
    "ClientAddress": "127.0.0.1:50408",
    "ClientUser": "barney",
    "DatastoreUser": "apceratest1"
  },
  "Command": "CREATE",
  "Parameters": null,
  "AuthKnown": true,
  "AuthResult": true,
  "ReceivedTime": "2013-09-10T19:06:17.303518486Z",
  "TotalLatency": 19545783,
  "ThisFrameLatency": 19514909,
  "PreHookLatency": 30874,
  "ServerResponseLatency": 17051624,
  "HookLatencies": [
    {"Hook": "deny_1", "Latency": 789},
    {"Hook": "drop_2", "Latency": 6179},
    {"Hook": "http://localhost:12345/auth", "Latency": 2446435}
  ],
  "HookTotalLatency": 2463285
}
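
A notification handler might use the latency fields in this payload to spot slow hooks. The sketch below works only with the fields shown in the sample and makes no assumption about their units; summarize_hook_latencies is a hypothetical helper name.

```python
def summarize_hook_latencies(payload: dict) -> dict:
    """Summarize the per-hook latency entries of a roundtrip notification."""
    hooks = payload.get("HookLatencies") or []
    slowest = max(hooks, key=lambda h: h["Latency"], default=None)
    return {
        "hook_count": len(hooks),
        "sum_of_listed_hooks": sum(h["Latency"] for h in hooks),
        "reported_hook_total": payload.get("HookTotalLatency"),
        "slowest_hook": slowest["Hook"] if slowest else None,
    }


# Latency fields copied from the sample payload above.
sample = {
    "HookLatencies": [
        {"Hook": "deny_1", "Latency": 789},
        {"Hook": "drop_2", "Latency": 6179},
        {"Hook": "http://localhost:12345/auth", "Latency": 2446435},
    ],
    "HookTotalLatency": 2463285,
}
summary = summarize_hook_latencies(sample)
```

For the sample, the listed hook latencies add up to 2453403 while HookTotalLatency reports 2463285, so a handler should read the reported total instead of assuming it equals the sum of the individual entries.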

Response codes

The event handler also uses the following response codes:

Response Code | Response Meaning | Description
200 | Received | Request received
Any other | Denied | Malformed payload

  • A URL can use any of these schemes: http:, https:, syslog:
  • Only the http: and https: schemes can be used for hooks.
  • URLs can point to anything reachable, inside or outside Apcera.
  • Response fields are, per JSON, case-sensitive.
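
The scheme restrictions above are easy to check before creating a rule. This is a small sketch using Python's standard urllib.parse module; check_handler_url is a hypothetical helper, and the rule type names follow the -t hook and -t notification values used in the rule examples.

```python
from urllib.parse import urlparse

NOTIFICATION_SCHEMES = {"http", "https", "syslog"}
HOOK_SCHEMES = {"http", "https"}  # hooks need a response, so syslog is excluded


def check_handler_url(url: str, rule_type: str) -> bool:
    """Return True if the URL scheme is usable for the given rule type."""
    if rule_type not in ("hook", "notification"):
        raise ValueError(f"unknown rule type: {rule_type!r}")
    scheme = urlparse(url).scheme.lower()
    allowed = HOOK_SCHEMES if rule_type == "hook" else NOTIFICATION_SCHEMES
    return scheme in allowed
```

For instance, a syslog: URL passes for a notification rule but is rejected for a hook rule. The host name in any such URL is up to you, since handlers can live inside or outside Apcera.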