January 18, 2010  /  Jamie Appleseed  /  Tech 

Wildcards such as .* in Ruby’s regular expressions (regex) are greedy by default, meaning they will match as much of the string as possible.

Let’s say you have to parse some strings which will be either “Sweet entertainment!” or “Great entertainment!”. Now, for the sake of this example, let’s say you want to grab the first word of these strings (“Sweet” or “Great”) using a regular expression with wildcards.

If you wrote the regular expression as /^.*t/, it would capture “Sweet entertainment” because the wildcard is greedy by default: everything up to the very last “t” is captured, so “entertainment” is included too.

Luckily, there’s a simple way to make your wildcards non-greedy: just append a question mark, like this: /^.*?t/. Now the wildcard is non-greedy, it stops at the first “t” it can, and you get the wanted outcome “Sweet” or “Great” without the “entertainment” part.

The part of each string that matches is shown after the arrow:

/^.*t/ on “Sweet entertainment!” → “Sweet entertainment”
/^.*?t/ on “Sweet entertainment!” → “Sweet”
/^.*t/ on “Great entertainment!” → “Great entertainment”
/^.*?t/ on “Great entertainment!” → “Great”
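
The four cases above can be checked in Ruby itself. What follows is only a minimal sketch: the helper name matched_part is made up for this illustration and is not part of the original post, and it relies on String#[] returning the matched substring (or nil) when given a regular expression.

```ruby
# Hypothetical helper (an assumption for this example): returns the part of
# `string` matched by `pattern`, or nil if there is no match.
def matched_part(pattern, string)
  string[pattern]
end

["Sweet entertainment!", "Great entertainment!"].each do |s|
  # Greedy: .* runs to the end and backtracks to the last "t".
  puts "greedy     /^.*t/  on #{s.inspect}: #{matched_part(/^.*t/, s).inspect}"
  # Non-greedy: .*? stops as soon as the first "t" is reached.
  puts "non-greedy /^.*?t/ on #{s.inspect}: #{matched_part(/^.*?t/, s).inspect}"
end
```

Running it should print the longer match for the greedy pattern and just the first word for the non-greedy one.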

So there you go. Just append a question mark to the wildcard to make it non-greedy. This rule also applies to the plus character, which by default is greedy too, but can be turned non-greedy by appending a question mark the same way: /^.+?t/.
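
To see the plus variant in action, here is a small sketch along the same lines. The helper name match_of and the extra test string “tent” are assumptions added for illustration only. The one difference worth noticing is that + needs at least one character before the “t”, so on a string that starts with “t” the two non-greedy patterns give different results.

```ruby
# Hypothetical helper (an assumption for this example): returns the matched
# substring, or nil if the pattern does not match.
def match_of(pattern, string)
  string[pattern]
end

s = "Sweet entertainment!"
puts match_of(/^.+t/, s).inspect    # greedy plus: runs up to the last "t"
puts match_of(/^.+?t/, s).inspect   # non-greedy plus: stops at the first "t"

# "tent" is an illustrative string that is not in the original post.
# .*? may match zero characters, while .+? must consume at least one.
puts match_of(/^.*?t/, "tent").inspect
puts match_of(/^.+?t/, "tent").inspect
```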
