Regex – Optional Item
The question mark makes the preceding token in the regular expression optional. E.g.: colou?r matches both colour and color.
1 2 3 4 5 6 | <?php $pattern = "@colou?r@"; $string = "color or colour"; preg_match_all($pattern,$string,$a); print_r($a); ?> |
output:
You can make several tokens optional by grouping them together using round brackets, and placing the question mark after the closing bracket. E.g.: Nov(ember)? will match Nov and November.
You can write a regular expression that matches many alternatives by including more than one question mark. Feb(ruary)? 23(rd)? matches February 23rd, February 23, Feb 23rd and Feb 23.
Looking Inside The Regex Engine
Let’s apply the regular expression colou?r to the string The colonel likes the color green.
The first token in the regex is the literal c. The first position where it matches successfully is the c in colonel. The engine continues, and finds that o matches o, l matches l and another o matches o. Then the engine checks whether u matches n. This fails. However, the question mark tells the regex engine that failing to match u is acceptable. Therefore, the engine will skip ahead to the next regex token: r. But this fails to match n as well. Now, the engine can only conclude that the entire regular expression cannot be matched starting at the c in colonel. Therefore, the engine starts again trying to match c to the first o in colonel.
After a series of failures, c will match with the c in color, and o, l and o match the following characters. Now the engine checks whether u matches r. This fails. Again: no problem. The question mark allows the engine to continue with r. This matches r and the engine reports that the regex successfully matched color in our string.

Discussion 5 Responses
Trackbacks
Leave a Reply