Find in set for PostgreSQL

Having been a Postgres user for quite a while, I’ve become accustomed to things that when I was using MySQL seemed non-essential, you know all that crazy stuff like referential integrity and foreign keys? Stuff that makes a database, well, a database.

Consider the following situation, from an XML feed, or something else, from your own app you are given a number of ids, this could be from some sort of mechanism that handles your search or something else, whatever. So, given the id’s of table foo in the order 3,5,1,9,7 how would you retrieve them from the database in that order?

Maybe you would use a case statement?

select * from foo where id in (3,5,1,9,7) order by 
  case id 
    when 3 then 1 
    when 5 then 2 
    when 1 then 3
    when 9 then 4
    when 7 then 5

how horrible is that? What you have here is a brittle map which most likely involves way too much typing for it to be worth it. Plus writing something like that in ActiveRecord or another ORM is serious cause for concern.

Maybe you would sort them in your application post retrieval? But should you have to? Should you expose your application to sorting that your database should be able to trivially solve?

Well find_in_set is here to help! Or atleast in MySQL

mysql> select find_in_set(7, '3,5,1,9,7');
| find_in_set(7, '3,5,1,9,7') |
|                           5 |

So you could use this in your order by as:

select * from foo where id in (3,5,1,9,7) 
  order by find_in_set(id, '3,5,1,9,7');

Great, yeah? Well … other than this is a MySQL specific function with no comparable function in Postgres. Fear not! All this technically is, is a reverse index look up on a non-sorted list (average complexity O(n/2)). To get this functionality in Postgres all you need is the following:

create or replace function find_in_array(
  needle anyelement, haystack anyarray) returns integer as $$ 
  i integer; 
  for i in 1..array_upper(haystack, 1) loop 
    if haystack[i] = needle then 
      return i; 
    end if; 
  end loop; 
  raise exception 'find_in_array: % not found in %', needle, haystack; 
$$ language 'plpgsql';

Note, this uses a Postgres array rather than a set. So, to execute, just run the following:

select * from foo where id in (3,5,1,9,7)
  order by find_in_array(id, '{3,5,1,9,7}');

For added Ruby madness, you could even extend the Array class by adding Array#to_postgres_array:

class String
  def sql_escape
    self.gsub(/[%_'\\"]/, "\\\\\\0")

class Array
  def to_postgres_array
    "'{" + self.inject([]) do |mem, val|
      mem << (val.kind_of?(String) ? "\"#{val.sql_escape}\"" : val)
    end.join(", ") + "}'"

irb(main):001:0> [1,2,3].to_postgres_array
=> "'{1, 2, 3}'"

Enjoy your sorted records!

“Maybe you would sort them in your application post retrieval? But should you have to?”