Find next available number in a non-contiguous sequence

Recently, I needed to write a function which would find the next available value in a sequence of user IDs.  My data is in a PostgreSQL table called agtab within a scheme called accounts.  There is a column named uid and a column named gid for each account.  The table was populated using the aggregate of several “legacy” systems, and the existing IDs are not contiguous.  I need to have a mechanism which can quickly find the smallest unused ID.  The brute force method would be to start at the lowest possible value and just start incrementing a value until I found one that’s unused.  But that’s inelegant and won’t scale real well.  I also need to skip over a chunk in the middle of the range which is allocated for another use.  I ended up with the idea of calculating the difference of the list of all IDs and the same list with all IDs incremented by one; the lowest value in “ids plus one” which isn’t also in “ids” is the lowest free ID.  Enter the “EXCEPT” clause.

To start with, here’s a query to just find the lowest unused value which is greater than 1000 (reserved for system IDs):

SELECT uid+1 AS uid FROM account.agtab WHERE uid > 1000
EXCEPT
SELECT uid FROM account.agtab WHERE uid > 1000
ORDER BY uid ASC
LIMIT 1

Here, the EXCEPT clause returns all of the entries from the first query which are not in the second query.  Pretty straightforward.  But, this only works when there are values in the database, and it only takes into account the uid field.  Going forward, I want every new account to have the same uid and gid value, which means I need to find the lowest value which is unused in both columns.  So, I’ll calculate a union of the uid and gid columns, and do the same thing with subqueries:

(
  SELECT uid+1 AS id FROM account.agtab WHERE uid > 1000
  UNION
  SELECT gid+1 AS id FROM account.agtab WHERE gid > 1000
)
EXCEPT
(
  SELECT uid AS id FROM account.agtab WHERE uid > 1000
  UNION
  SELECT gid AS id FROM account.agtab WHERE gid > 1000
)
ORDER BY id ASC
LIMIT 1

Almost there.  This solution requires that the tables have at least one value, and I don’t like writing solutions which fail in edge cases.  Initially I thought I’d use COALESCE, but then it occurred to me that it’d make more sense to just add another UNION statement with the bottom of the allowable range:

(
 SELECT uid+1 AS id FROM account.agtab WHERE uid > 1000
 UNION
 SELECT gid+1 AS id FROM account.agtab WHERE gid > 1000
 UNION
 SELECT 1001 AS id
)
EXCEPT
(
 SELECT uid AS id FROM account.agtab
 UNION
 SELECT gid AS id FROM account.agtab
)
ORDER BY id ASC
LIMIT 1

Hooray; the function returns the first value – 1001 – if the table is empty.  One could argue that the sane alternative would be to just use a sequence if the table was empty, but this would still be useful in a situation where people could manually insert arbitrary IDs into the table.  For example, perhaps you’re supporting a large variety of users, some of whom occasionally purchase software from incredibly bad vendors who write code which relies upon an account having a specific UID.  Not that such a thing has ever happened to me several times or anything… :/

But wait, there’s one more wrinkle.  The range 10,000 to 9,999,999 is reserved for a different purpose.  When my IDs get to 9,999, I need to jump to 10 million as the next available ID.  Thanks to using the union solution rather than a coalesce, I just add that second alternative lower bound value as well, and modify the where clause:

(
 SELECT uid+1 AS id FROM account.agtab WHERE (uid > 1000 AND uid < 9999) OR (uid >= 10000000)
 UNION
 SELECT gid+1 AS id FROM account.agtab WHERE (uid > 1000 AND uid < 9999) OR (uid >= 10000000)
 UNION
 SELECT 1001 AS id
 UNION
 SELECT 10000000 AS id
)
EXCEPT
(
 SELECT uid AS id FROM account.agtab
 UNION
 SELECT gid AS id FROM account.agtab
)
ORDER BY id ASC
LIMIT 1

It should be pretty visible from this how one could extend the approach to exclude multiple ranges.  It should also be pretty visible how this is gonna be a whole bunch of repeated typing.  So, let’s move it to a WITH statement:

WITH ids AS (
 SELECT uid AS id FROM account.agtab
 UNION
 SELECT gid AS id FROM account.agtab 
),
ids_plus_one AS (
 SELECT id+1 AS id
 FROM ids
 WHERE (id > 1000 AND id < 9999)
    OR (id >= 1000000)
)
(
  SELECT id FROM ids_plus_one
  UNION SELECT 1001 AS id
  UNION SELECT 1000000 AS id
)
EXCEPT ( SELECT id FROM ids )
ORDER BY id ASC
LIMIT 1

I think that’s pretty readable, which translates to being more maintainable.  I had a hard time finding an easy solution online (people were using integer tables and all sorts of weird stuff, and no one had anything which excluded ranges), so hopefully this helps someone. :)