I’ve recently been tasked with using HAProxy to front an LDAP server in AWS so that clients can reach the LDAP backend over SSL. This essentially followed this guide from AWS. It is a great shame that AWS don’t offer SSL termination in the Directory Service itself, but the HAProxy/ELB solution isn’t too complex and does work well.
While testing some failure scenarios, however, I spotted an issue I didn’t like.
If (for whatever unlikely reason) an HAProxy instance cannot reach any of its downstream LDAP backends, the ELB health check of that HAProxy instance still passes, so the instance is still considered healthy. The ELB keeps sending traffic to it, and then things go south: the instance cannot see any LDAP backends and the request fails.
I spent a bit of time googling for a solution and didn’t find anything that solved this. What we really want is for HAProxy to stop listening on the TCP port of the frontend when all the backends are unavailable, but that isn’t an option.
The solution I came up with was a custom HTTP health check endpoint that runs a simple Lua script inside HAProxy. This uses the Lua support added to HAProxy around version 1.6.1. Not all the binaries I found had it compiled in, so we ended up using the official Docker container.
The way this works is you define a custom HTTP front end in your haproxy.cfg like this:
global
    # Lots removed for brevity!
    lua-load /usr/local/etc/haproxy/haproxy-smart-tcp-healthcheck.lua

# Custom HTTP Health check endpoint
frontend status-lua
    bind *:8000
    mode http
    # The service name must match the one registered in the Lua script
    http-request use-service lua.tcp_healthcheck

# LDAP frontend
frontend ldap_front
    bind *:1389
    mode tcp
    description LDAP Service
    option socket-stats
    option tcpka
    timeout client 5s
    default_backend aws-ldap

# Downstream AWS LDAP backends
backend aws-ldap
    mode tcp
    balance roundrobin
    option tcp-check
    server directory1 10.0.0.1:8081 check port 8081
    server directory2 10.0.0.2:8081 check port 8081
This now runs the Lua service tcp_healthcheck, defined in the file haproxy-smart-tcp-healthcheck.lua, whenever a request is made for / on port 8000.
The script itself turned out to be pretty simple, once I got my head around some Lua quirks!
All it does is query the HAProxy state for all the servers associated with a given backend (named aws-ldap on line 15). Then we loop over that list of servers: if any of them is UP we log the message Found some servers up and return a 200 status; if they are all down we log the message Found NO servers up and return a 500 status.
-- This service checks all the servers in the named backend (see the
-- backend_name var). If _any_ of them are up, it returns 200 OK. If
-- they are all down it returns a 500 FAILED.
--
-- This is intended to be used as a HTTP health check from an upstream
-- load balancer, without this check the most intelligent health check
-- that could be performed is a simple TCP check on the HAProxy frontend.
-- This would not fail in the event that HAProxy cannot see *any* of its
-- downstream servers
core.register_service("tcp_healthcheck", "http", function (applet)
    -- Hardcoded backend here, if anybody knows how to pass vars into Lua
    -- from the haproxy.cfg please shout!
    local backend_name = "aws-ldap"
    local r = ""
    -- core.proxies is keyed by proxy name; a backend exposes its servers
    -- as a table keyed by server name
    local backend = core.proxies[backend_name]
    local servers = backend["servers"]
    local any_up = false

    for k, v in pairs(servers) do
        -- get_stats() returns the same fields as the stats CSV output
        local status = v:get_stats()["status"]
        -- If _any_ of the servers are up, we will return OK
        if (status == "UP") then
            any_up = true
        end
    end

    if ( any_up ) then
        core.log(core.debug, "Found some servers up")
        applet:set_status(200)
        r = "OK"
    else
        core.log(core.debug, "Found NO servers up")
        applet:set_status(500)
        r = "FAILED"
    end

    applet:add_header("Content-Type", "text/plain")
    applet:start_response()
    applet:send(r)
end)
Link to gist here.
With this all installed on the HAProxy instance, it was a simple matter of configuring the ELB in AWS to use an HTTP health check for the path / on port 8000.
The AWS Console config for the ELB Health check looks like this:
And you can see that the Listeners configuration is still in TCP passthrough mode, with TLS termination performed at the ELB.
The one optimisation I’d like to make, but have not yet found a solution for, is to pass the name of the backend to query through to the script as an argument. Currently it is hardcoded into the script, which would mean duplicating the script if you need more than one of these custom health checks; that would be a pain. However, in our use case the HAProxy is only fronting a single LDAP directory, so we can live with this single hardcoded piece of configuration!
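As an added example (my own code, not the script from the post or its gist), here is one way to get rid of the hardcoded backend name: the same check, but with the backend taken from the request path, so a single script can serve /health/aws-ldap, /health/some-other-backend and so on. It only relies on HAProxy Lua API pieces the post already uses (core.proxies, Server:get_stats, the HTTP applet) plus the applet's path field; the service name tcp_healthcheck_path and the /health/<backend> URL scheme are my choices, and the decision to treat a transitional "UP 1/3" style status as up is an assumption about what the deployment wants.

```lua
-- Added example, not the original post's script: health check service
-- whose backend name comes from the request path (/health/<backend>).
-- Assumed names: service "tcp_healthcheck_path", URL scheme "/health/<backend>".

-- Extract the backend name from the path. Only [A-Za-z0-9_.-] is accepted,
-- so an unexpected value cannot be reflected back or used to look up
-- anything other than a plain proxy name.
local function backend_from_path(path)
    if path == nil then return nil end
    return string.match(path, "^/health/([%w_%.%-]+)$")
end

-- Count servers whose status starts with "UP". HAProxy reports a server
-- that is UP but currently failing checks as e.g. "UP 1/3"; it is still
-- receiving traffic, so it is counted as up here (assumption; tighten the
-- test to status == "UP" if a stricter rule is wanted).
local function count_servers(backend)
    local up, total = 0, 0
    for _, srv in pairs(backend.servers) do
        total = total + 1
        local status = srv:get_stats()["status"] or ""
        if string.sub(status, 1, 2) == "UP" then
            up = up + 1
        end
    end
    return up, total
end

core.register_service("tcp_healthcheck_path", "http", function (applet)
    local name = backend_from_path(applet.path)
    local code, body

    if name == nil then
        -- Path did not match /health/<backend>
        code, body = 404, "unknown health check path"
    else
        -- core.proxies holds frontends as well as backends; a frontend has
        -- no servers, so it falls into the "no servers" case below.
        local backend = core.proxies[name]
        local up, total = 0, 0
        if backend ~= nil and backend.servers ~= nil then
            up, total = count_servers(backend)
        end

        if total == 0 then
            code, body = 404, "unknown backend or backend has no servers"
        elseif up > 0 then
            code, body = 200, string.format("OK %d/%d servers up", up, total)
        else
            -- 503 rather than 500: the instance is alive but has nothing
            -- to forward to. Any non-200 fails the ELB check either way.
            code, body = 503, string.format("FAILED 0/%d servers up", total)
        end
    end

    core.log(core.info, string.format("health check path=%s -> %d %s",
                                      tostring(applet.path), code, body))
    applet:set_status(code)
    applet:add_header("Content-Type", "text/plain")
    applet:add_header("Cache-Control", "no-cache")
    applet:start_response()
    applet:send(body)
end)
```

To use it, swap the http-request line in the status-lua frontend for http-request use-service lua.tcp_healthcheck_path and point the ELB health check at /health/aws-ldap on port 8000. Keep port 8000 reachable only from the ELB (security group), since the response body reveals how many backend servers are up.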