Monday, March 8, 2010

Regular expressions and backreferences

While trying to find a way to match quoted strings in a text, where both double and single quotes are allowed, I first looked at using two separate sub-expressions OR-ed together. Then I started wondering whether it was possible to say something like "match a sequence that's exactly the same as that other sequence you matched earlier". After some looking around I found the answer in backreferences. For example, the following will match a single- or double-quoted string:

(["']).*?\1
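Here the group (["']) captures the opening quote, .*? lazily matches as little as possible, and \1 only matches the same character that group 1 captured, so a string opened with " cannot be closed with '. As a quick way to try it out, here is a minimal sketch in Python; the choice of Python, the name QUOTED, and the sample text are my own assumptions for illustration, not part of the original experiment:

```python
import re

# Group 1 captures the opening quote; \1 must then match that same character.
QUOTED = re.compile(r"""(["']).*?\1""")

# Sample text (made up for this sketch): each quoted string contains
# the *other* kind of quote inside it.
text = 'He said "it\'s fine" and then \'she said "ok"\'.'

matches = [m.group(0) for m in QUOTED.finditer(text)]
for match in matches:
    print(match)

# A string opened with one kind of quote and "closed" with the other
# is not matched at all.
mismatched = QUOTED.search('"mismatched\'')
print(mismatched)
```

Note that . does not match newlines by default in most regex flavours, so a quoted string spanning several lines would need the "dot matches all" option (re.DOTALL in Python).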

1 comment:

  1. To also match unquoted attribute values, we had to resort to using two regexes a while back.

    http://frank.vanpuffelen.net/2007/04/how-to-optimize-regular-expression.html

    Let me know if you find a way to match all variants with one regex, because it would speed up some of our products. :-)
