Thursday, September 26, 2013

Evaluating Backreferences in Oracle REGEXP_REPLACE

Here's the problem. I want to use Oracle's REGEXP_REPLACE function to replace a string of digits with a zero-padded string. In Perl I would do it like this:
#--- Change xyz123 to xyz000123
my $string = "xyz123";
$string =~ s/(\d+)$/sprintf("%06d", $1)/e;
So here is what I started with in Oracle PL/SQL:
my_string := 'xyz123'
regexp_replace(my_string, '(\d+)$', lpad('\1', 6, 0));
And what I got was xyz0000123. What I discovered was that the LPAD function was being evaluated before the backreference (\1) was. So no matter what I did, the LPAD function always saw the string to be padded as 2 characters long.

So to get around this, I wrote another function called evalpad. With this, I do the replacement in 2 steps. The first is to replace the string with an embedded pad psuedo-function. Then I call a second function to do the actual replacement. Here's the code:
create or replace function evalpad (
  strIn in varchar
)
return
  varchar
as
  retval varchar(255);
  arg1 varchar(50);
  arg2 varchar(50);
  padded_number varchar(100);
begin
  retval := strIn;
  while (regexp_instr(retval, 'pad\((\d+)') > 0) loop
    arg1 := regexp_substr(retval, 'pad\((\d+)');
    arg1 := regexp_replace(arg1, 'pad\(', '');
    arg2 := regexp_substr(retval, 'pad\((\d+),\s*(\d+)');
    arg2 := regexp_replace(arg2, 'pad\((\d+),\s*', '');
    padded_number := lpad(arg1, arg2, 0);
    retval := regexp_replace(retval, 'pad\((\d+),\s*(\d+)\)', padded_number, 1, 1);
  end loop;
  return retval;
end;
Then in my original function I do this:
my_string := 'xyz123'
regexp_replace(my_string, '(\d+)$', 'pad(\1, 6)');
my_string := evalpad(my_string);
 
Blogger Templates