Custom LUA HAProxy HealthCheck

I’ve recently been tasked with using HAProxy to front a LDAP server in AWS to allow SSL access to the LDAP backend. This was essentially following this guide from AWS. It is a great shame that AWS don’t offer SSL termination in the Directory Service itself, but the HAProxy/ELB solution isn’t too complex and does work well.

While testing out some failure scenarios I did however spot an issue which I didn’t like.

There exists a situation whereby if (for whatever unlikely reason) an HAProxy instance cannot reach any of its downstream LDAP backends the ELB health check of the HAProxy instance will still pass and thus the instance will still be considered healthy. The ELB will still send traffic to the instance, but then things go south because the instance cannot see any LDAP backends and the request fails.

I spent a bit of time googling for a solution and didn’t find anything which solved this, what we really want is for HAProxy to stop listening on the TCP port of the front end when all the backends are unavailable, however that isn’t an option.

The solution I came up with was to use a custom HTTP health check endpoint, which runs a simple LUA script inside HAProxy. This uses the new LUA support, added around to HAProxy around version 1.6.1. Not all the binaries I found had it compiled in as an option so we ended up using the official Docker container.

The way this works is you define a custom HTTP front end in your haproxy.cfg like this:

global
    # Lots removed for brevity!
    lua-load /usr/local/etc/haproxy/haproxy-smart-tcp-healthcheck.lua

# Custom HTTP Health check endpoint
frontend status-lua

    bind  *:8000
    mode http
    http-request use-service lua.status_service

# LDAP frontend
frontend ldap_front

    bind  *:1389
    description LDAP Service
    option socket-stats
    option tcpka
    timeout client 5s
    default_backend aws-ldap

# Downstream AWS LDAP backends
backend aws-ldap

    balance roundrobin
    server directory1 10.0.0.1:8081 check port 8081
    server directory2 10.0.0.2:8081 check port 8081
    option tcp-check</code>

This now runs the LUA service tcp_healthcheck defined in the file haproxy-smart-tcp-healthcheck.lua whenever a request is made for / on port 8000 .

The script itself turned out to be pretty simple, once I got my head around some LUA quirks!

All it does, is query the HAProxy state for all the servers associated with a given backend (named aws-ldap on line 15). Then we loop that list of backends, and if any of them are UP we return a 200 status with the message Found some servers up , if they are all down we return a 500 status with the message Found NO servers up .

-- This service checks all the servers in the named backend (see the 
-- backend_name var). If _any_ of them are up, it returns 200 OK. If 
-- they are all down it returns a 500 FAILED.
-- 
-- This is intended to be used as a HTTP health check from an upstream
-- load balancer, without this check the most intelligent health check 
-- that could be performed is a simple TCP check on the HAProxy frontend.
-- This would not fail in the event that HAProxy cannot see *any* of its
-- downstream servers

core.register_service("tcp_healthcheck", "http", function (applet)

    -- Harcoded backend here, if anybody knows how to pass vars into Lua
    -- from the haproxy.cfg please shout!
    local backend_name = "aws-ldap"
    local r = ""

    backend = core.proxies[backend_name]
    servers = backend["servers"]

    local any_up = false

    for k, v in pairs(servers) do

        status = v.get_stats(v)["status"]

        -- If _any_ of the servers are up, we will return OK
        if (status == "UP") then
            any_up = true
        end

    end

    if ( any_up ) then
        core.log(core.debug, "Found some servers up")
        applet:set_status(200)
        r = "OK"
    else
        core.log(core.debug, "Found NO servers up")
        applet:set_status(500)
        r = "FAILED"

    end

    applet:start_response()
    applet:send(r)

end)

With this all installed in the HAProxy instance it was a simple matter of configuring the ELB in AWS to use a HTTP health check for the path / on port 8000.

The AWS Console config for the ELB Health check looks like this:

alt text

And you can see the Listeners configuration is still in TCP passthrough mode, with TLS termination performed at the ELB.

alt text

The one optimisation I’d like to make to this but have not yet found a solution for, is to pass the name of the backend to query through to the script as an argument. Currently it is hardcoded into the script which would mean duplicating the script if you need more than one of these custom health checks, that would be a pain. However in our use-case the HAProxy is only fronting a single LDAP directory so we can live with this single hardcoded piece fo configuration!