Mangling Apache2 usernames with mod_perl

1 Comment »

I have a Kerberos-authenticated network connected to Active Directory as the KDC. AD is case-insensitive but case-preserving, so clients might log in to my web server using principals in any number of cases; they might be danny or DANNY or Danny. So, to make applications on my sane Linux platform – where differnet characters are actually different – I wrote a quick mod_perl authorization module to force the case down.

The module – at /srv/www/html/test_auth/perl/Danny/ – looks like:

package Danny::loweruser;

use strict;
use warnings;

#use Apache2::Access;
use Apache2::RequestRec;

use Apache2::Const -compile => qw(OK HTTP_UNAUTHORIZED);

sub handler {
my $r = shift;
$r->user( lc $r->user() );
return Apache2::Const::OK;


And the Apache config looks vaguely like this:

PerlSwitches -I/srv/www/html/test_auth/perl

#PerlResponseHandler ModPerl::Registry
PerlAuthzHandler Danny::loweruser

AuthType basic
AuthName "Auth test"
AuthBasicProvider file
AuthUserFile "/srv/www/html/test_auth/userdb"
Require valid-user

Yeah, everything is in one directory. That’s not a secure way to do things; it’s a way to quickly test and easily clean up later. :)

I was testing with a PHP page that printes $_SERVER[“PHP_AUTH_USER”], so I’ve consequently pulled out most of my hair, because that stupid variable pulls from the HTTP headers instead of what the web server actually provides; it stays upper-case. The REMOTE_USER key is what I really wanted from the array. What I get from that is that PHP documentation is evil and anyone who uses PHP_AUTH_USER should be banned from writing code.

Actually, I’m not entirely clear on how PHP_AUTH_USER is getting set, and it’s late enough that I’m not going to dig in to it. This will work properly with anything that actually uses REMOTE_USER, per “everything since the 90’s except for PHP”, so it’ll be fine for what I actually needed.

gluster replication peer failure

No Comments »

Just a quick post because I couldn’t find anything on this in a quick Google search.

I was in the process of migrating data in a replicated volume from one machine to another when the destination machine was interrupted (it was actually rebooted by an automated process kicked off by another admin; that’s what poor communication gets you). Then the destination machine wouldn’t boot. This machine mounted several gluster volumes from localhost, but glusterd wouldn’t start, which caused the boot process to hang on mounting.
Read the rest of this entry

pre-9.5 PostgreSQL upsert

No Comments »

With the recent PostgreSQL 9.5 release, one of the exciting new features is “upsert” support, which is where you want to either update an existing row or insert a new one if it doesn’t already exist. If you’re still on an old version of PostgreSQL, however, you here’s how I’ve been getting that accomplished. This example uses the Perl DBI mechanism, but I think it’s readable enough that it’d be easy to convert to anything else with parametrized query preparation.
Read the rest of this entry

statistical packet distribution with iptables

No Comments »

So, Linux iptables has a couple of modules which allow you to distribute traffic across multiple hosts. But there isn’t any good documentation I can find which correctly explains how to use them. I figured it out, so I’m going to share. :)
Read the rest of this entry

Find next available number in a non-contiguous sequence

No Comments »

Recently, I needed to write a function which would find the next available value in a sequence of user IDs.  My data is in a PostgreSQL table called agtab within a scheme called accounts.  There is a column named uid and a column named gid for each account.  The table was populated using the aggregate of several “legacy” systems, and the existing IDs are not contiguous.  I need to have a mechanism which can quickly find the smallest unused ID.  The brute force method would be to start at the lowest possible value and just start incrementing a value until I found one that’s unused.  But that’s inelegant and won’t scale real well.  I also need to skip over a chunk in the middle of the range which is allocated for another use.  I ended up with the idea of calculating the difference of the list of all IDs and the same list with all IDs incremented by one; the lowest value in “ids plus one” which isn’t also in “ids” is the lowest free ID.  Enter the “EXCEPT” clause.

Read the rest of this entry

Recursion and namerefs in ksh93

No Comments »

Today I had a reason to use namerefs in a recursive function in a ksh93 script.  Specifically, I wanted to build a list of email addresses for the management chain above an arbitrary number of employees.  I used an associative array so duplicates would automatically be removed.  But I wanted the function adding elements to the array to potentially work with different arrays.

Originally, I wrote the function roughly like this:

function append_managers {
  typeset -l user=${1:?}
  typeset -n arr=${2:?}
  typeset manager

  lookup "$user"
  lookup "$manager"
  is_vp "$manager" && return
  append_manager "$manager" "${!arr}"

I think this is pretty self-explanatory, but the way this should work is that the function takes a user name and an array name as mandatory arguments.  It uses a function (lookup()) to populate a few global arrays, including “manager”, “email”, and one with the title (so is_vp() can later determine if the person is a VP or below).  The manager’s email address is inserted into the associative array which was passed in, and then if the manager is not a VP or above, the function calls itself to continue climbing up the management chain.  The things in here which might be unfamilar are

  • typeset -l, which forces the variable’s value to lowercase (which works better with my directory)
  • typeset -n, which makes the variable a “nameref”, so using the variable affects a variable with the name of the nameref’s value.  It’s kind of like a pointer.  In this case, referencing $arr will actually reference the variable with the same name as what was passed in as the second argument.  If you do “a=’hello’;typeset -n b=’a’;print $b”; you should get “hello” in ksh93.
  • ${!var}, which gets the name of the variable pointed to by a nameref.  In the case above of ${!arr} and a function call like “append_manager ‘danny’ ‘cc'”, that would expand to ‘cc’.

This last bullet point is where things get hung up.  The variable scoping in ksh93 with functions defined like “function f{}” (as opposed to “f(){}”) is lexical.  In general, lexical scoping means that the scope is like the code would read; if you fully expand the code by copying the whole functions into the places where they’re called, copy includes in place, etc so that you have one long procedure, then variables’ scope would be where they appear.  In this situation, that essentially translates to “variables defined in a function using typset are local to that function and to any functions called by that function.”  So, recursive calls to “append_manager” with the same array name should all be able to see the same original array.

Unfortunately, in the version of ksh93 on the machine where I’m developing this – a RHEL6 system with “AT&T ksh 93u+” – this does not work as expected.  On the second iteration through the recursion, I get an “arithmatic syntax error” when I try to use the nameref.  I tried a number of variations on the approach; I tried hard-coding the array name, I tried putting the array name into a variable and then using that variable to hold the nameref name (instead of $2), and I tried making argument $2 optional so recursive calls just provided a new name while using the nameref scoped within the first call to the function.  In all cases, I got either the same error or an empty array at the end.  I suspect that the recursion messes up the internal variable tracking.  Regardless of the cause, wWhat I finally ended up with was replacing the recursive call with this:

append_manager "$manager" "arr"

So, each call gets a new locally-scoped nameref variable pointing to the nameref array in its parent.  It makes me feel slightly ill to have a call where I’m basically doing “typeset -n a=a”, but since the new “arr” is in a different, lower scope than the first, they’re technically two separate variables.  It actually works, though, and even with various combinations of things I threw at it, it works properly.  So, that’s what’s in my final script – along with some comments hopefully saving a future programmer the time.

One final note: I mentioned that I used an associative array to avoid duplicates.  But that means I need to pull the indices (AKA “keys”) from the array in order to generate my final list of email addresses.  Considering an array named “cc”, the code to generate an Outlook-friendly (separated by semicolons – ugh) list of email address looks like this:

    OLDIFS=$IFS; IFS=";"
    print "cc: " "${!cc[*]}"

Or like this:

    list=$( printf '; %s' "${!cc[@]}" )
    print "cc: ${list#??}"

The first option uses IFS to separate the arguments, which is limited to a single character.  The second option (the one I actually used), prints each element preceded by a semicolon and a space, then prunes the first two characters off of that list.  One could also use “${list:2}”, but I use the “prune” modifier more often than “substring” because I still have some older POSIX shell habits where the substring option wasn’t an option. :)

Keeping a directory in sync with SVN

No Comments »

I keep my CFEngine policy (and some other similar things) in a Subversion repository.  The progression from unit test to integration test to production is handled by using tags.  Basically, the integration test policy is the trunk, unit tests are done by branching the trunk, and promotion to production is done by tagging a revision of the trunk with a release name (monthly_YYYY_MM.POINT). But this discussion doesn’t need to be just about that approach; my solution should work for pretty much anyone who needs a directory to match a portion of a subversion structure.

Read the rest of this entry

Running trac-admin from python

No Comments »

Recently, I’ve been trying to speed up my Subversion post-commit hooks.  I have several things which are run from the hook, and the number of separate commands leads to a bunch of fork() calls (or clone(), whatever).  Several of the scripts are already Python, so I figured I’d just write the hooks themselves in Python, making it so that the python interpreter would only need to start once and allowing the separate methods to be pre-compiled python.  This should decrease overall execution time, making the end-user experience slightly better overall by decreasing the time they have to wait on the server.  We’re talking about fractions of a second, but I have some operations which bulk-create directories in SVN or otherwise cause tens or hundreds of new revisions to be created at one time (which is necessary for the way some of my integration processes work), so it actually adds up.

This is also an excuse for me to learn Python, so bear with me if the code below is horrible.  Actually, don’t bear with me – leave a comment letting me know how it should have been done. :)

Read the rest of this entry

Using a database to configure CFEngine

No Comments »

I’m responsible for a pretty large CFEngine installation.  CFEngine is designed to be pretty self-sufficient even when the network is unavailable, so it basically works by keeping its configuration local on each machine, and running from that local copy.  This is mostly implemented using a file-based configuration structure.  There’s a main configuration file ( which includes several additional configuration files.  In pretty much every situation, one of the promises (the name for an individual policy item) or bundles of promises will ensure that the local config files are in sync with the configuration files on the central master.

While it’s possible to use LDAP or define some variables on the central master, the main way configuration is done is by putting the policy into some files on the master and then allowing individual systems to copy those files down; the central master is basically just a fairly efficient file server.

Read the rest of this entry

Comparing Puppet and CFEngine in recursive file handling

No Comments »

So, we all know that ruby’s memory management is sketchy at best, and the Puppet is generally slow. But how can we quantify that? One of the metrics which is important to my usage is that of verifying the permissions on a large number of files. To that end, I wrote a simple script to compare the performance of ensuring that the contents of a large directory of files are owned by a specific group. Before each test, I remove a temp directory, create a set of sequentially-named files with the wrong group ownership, and then correct the ownership. I then run the same command again to see how quickly it can verify the permissions – which should be the common case.

For the baseline, I use “find | xargs chgrp”, which is slightly slower than “chgrp -R”, but not much slower (and, in my mind, slightly more fair). I then use a simple CFEngine policy and a simple Puppet policy to do the same thing.  The summary?  Puppet is dog slow at file recursion, while CFEngine is nearly as fast as pure find.  CFEngine actually uses less memory than the shell when you get to many files (probably due to the pipe to xargs), and Puppet wastes memory like it’s been surfing the web for weeks using an old version of Firefox.

Read the rest of this entry