Today I had a reason to use namerefs in a recursive function in a ksh93 script. Specifically, I wanted to build a list of email addresses for the management chain above an arbitrary number of employees. I used an associative array so duplicates would automatically be removed. But I wanted the function adding elements to the array to potentially work with different arrays.
Originally, I wrote the function roughly like this:
function append_managers { typeset -l user=${1:?} typeset -n arr=${2:?} typeset manager lookup "$user" manager="${manager[$user]}" lookup "$manager" arr[${email[$manager]}]=1 is_vp "$manager" && return append_manager "$manager" "${!arr}" }
I think this is pretty self-explanatory, but the way this should work is that the function takes a user name and an array name as mandatory arguments. It uses a function (lookup()) to populate a few global arrays, including “manager”, “email”, and one with the title (so is_vp() can later determine if the person is a VP or below). The manager’s email address is inserted into the associative array which was passed in, and then if the manager is not a VP or above, the function calls itself to continue climbing up the management chain. The things in here which might be unfamilar are
- typeset -l, which forces the variable’s value to lowercase (which works better with my directory)
- typeset -n, which makes the variable a “nameref”, so using the variable affects a variable with the name of the nameref’s value. It’s kind of like a pointer. In this case, referencing $arr will actually reference the variable with the same name as what was passed in as the second argument. If you do “a=’hello’;typeset -n b=’a’;print $b”; you should get “hello” in ksh93.
- ${!var}, which gets the name of the variable pointed to by a nameref. In the case above of ${!arr} and a function call like “append_manager ‘danny’ ‘cc'”, that would expand to ‘cc’.
This last bullet point is where things get hung up. The variable scoping in ksh93 with functions defined like “function f{}” (as opposed to “f(){}”) is lexical. In general, lexical scoping means that the scope is like the code would read; if you fully expand the code by copying the whole functions into the places where they’re called, copy includes in place, etc so that you have one long procedure, then variables’ scope would be where they appear. In this situation, that essentially translates to “variables defined in a function using typset are local to that function and to any functions called by that function.” So, recursive calls to “append_manager” with the same array name should all be able to see the same original array.
Unfortunately, in the version of ksh93 on the machine where I’m developing this – a RHEL6 system with “AT&T ksh 93u+” – this does not work as expected. On the second iteration through the recursion, I get an “arithmatic syntax error” when I try to use the nameref. I tried a number of variations on the approach; I tried hard-coding the array name, I tried putting the array name into a variable and then using that variable to hold the nameref name (instead of $2), and I tried making argument $2 optional so recursive calls just provided a new name while using the nameref scoped within the first call to the function. In all cases, I got either the same error or an empty array at the end. I suspect that the recursion messes up the internal variable tracking. Regardless of the cause, wWhat I finally ended up with was replacing the recursive call with this:
append_manager "$manager" "arr"
So, each call gets a new locally-scoped nameref variable pointing to the nameref array in its parent. It makes me feel slightly ill to have a call where I’m basically doing “typeset -n a=a”, but since the new “arr” is in a different, lower scope than the first, they’re technically two separate variables. It actually works, though, and even with various combinations of things I threw at it, it works properly. So, that’s what’s in my final script – along with some comments hopefully saving a future programmer the time.
One final note: I mentioned that I used an associative array to avoid duplicates. But that means I need to pull the indices (AKA “keys”) from the array in order to generate my final list of email addresses. Considering an array named “cc”, the code to generate an Outlook-friendly (separated by semicolons – ugh) list of email address looks like this:
OLDIFS=$IFS; IFS=";" print "cc: " "${!cc[*]}" IFS=$OLDIFS
Or like this:
list=$( printf '; %s' "${!cc[@]}" ) print "cc: ${list#??}"
The first option uses IFS to separate the arguments, which is limited to a single character. The second option (the one I actually used), prints each element preceded by a semicolon and a space, then prunes the first two characters off of that list. One could also use “${list:2}”, but I use the “prune” modifier more often than “substring” because I still have some older POSIX shell habits where the substring option wasn’t an option. :)