A Perl regex matching problem
my $data = "msn bb";
if ( $data =~ /(?=[a-z]+)\s+(\w+)/ ) {
    if ( defined $1 ) {
        print "matches: $1\n";
    }
}

my $data = "msn bb";
if ( $data =~ /(?<=msn\s)(\w+)/ ) {
    if ( defined $1 ) {
        print "matches: $1\n";
    }
}
[Solution]
Looking ahead and looking behind
This section concerns the lookahead and lookbehind assertions. First, a little background.
In Perl regular expressions, most regexp elements 'eat up' a certain amount of string when they match. For instance, the regexp element [abc] eats up one character of the string when it matches, in the sense that Perl moves to the next character position in the string after the match. There are some elements, however, that don't eat up characters (advance the character position) if they match. The examples we have seen so far are the anchors. The anchor ^ matches the beginning of the line, but doesn't eat any characters. Similarly, the word boundary anchor \b matches wherever a character matching \w is next to a character that doesn't, but it doesn't eat up any characters itself. Anchors are examples of zero-width assertions: zero-width, because they consume no characters, and assertions, because they test some property of the string. In the context of our walk-in-the-woods analogy to regexp matching, most regexp elements move us along a trail, but anchors have us stop a moment and check our surroundings. If the local environment checks out, we can proceed forward. But if the local environment doesn't satisfy us, we must backtrack.
Checking the environment entails either looking ahead on the trail, looking behind, or both. ^ looks behind, to see that there are no characters before. $ looks ahead, to see that there are no characters after. \b looks both ahead and behind, to see if the characters on either side differ in their "word-ness".
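To make "zero-width" concrete, here is a minimal sketch that measures how many characters a match consumed, using Perl's built-in @- and @+ arrays (start and end offsets of the last successful match). The helper name consumed and the sample string are made up for illustration.

```perl
use strict;
use warnings;

my $s = "cat nap";    # sample string, chosen only for this illustration

# Hypothetical helper: number of characters consumed by the first match
# of $re in $str, or undef if there is no match.
sub consumed {
    my ($str, $re) = @_;
    return $str =~ $re ? $+[0] - $-[0] : undef;
}

my $class_width = consumed($s, qr/[abc]/);   # ordinary element, eats one character
my $caret_width = consumed($s, qr/^/);       # anchor, eats nothing
my $bound_width = consumed($s, qr/\b/);      # word boundary, eats nothing

print "[abc] consumed $class_width character(s)\n";   # 1
print "^     consumed $caret_width character(s)\n";   # 0
print "\\b    consumed $bound_width character(s)\n";  # 0
```

The character class advances the match position by one, while both anchors succeed without moving it.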
The lookahead and lookbehind assertions are generalizations of the anchor concept. Lookahead and lookbehind are zero-width assertions that let us specify which characters we want to test for. The lookahead assertion is denoted by (?=regexp) and the lookbehind assertion is denoted by (?<=fixed-regexp). Some examples are
$x = "I catch the housecat 'Tom-cat' with catnip";
$x =~ /cat(?=\s)/;                     # matches 'cat' in 'housecat'
@catwords = ($x =~ /(?<=\s)cat\w+/g);  # matches,
                                       # $catwords[0] = 'catch'
                                       # $catwords[1] = 'catnip'
$x =~ /\bcat\b/;                       # matches 'cat' in 'Tom-cat'
$x =~ /(?<=\s)cat(?=\s)/;              # doesn't match; no isolated 'cat' in
                                       # middle of $x
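Applied to the two patterns in the question, the same reasoning explains the difference. The sketch below assumes the goal is to capture the word that follows "msn". In the first pattern the lookahead (?=[a-z]+) and the \s+ after it are tested at the same position, because the lookahead consumes nothing; no character is both a lowercase letter and whitespace, so that pattern can never match. The variable names and the third, lookaround-free pattern are illustrative additions, not part of the original question.

```perl
use strict;
use warnings;

my $data = "msn bb";

# Lookahead version from the question: (?=[a-z]+) demands a letter at the
# current position, and \s+ then demands whitespace at that same position.
my $lookahead_matched = $data =~ /(?=[a-z]+)\s+(\w+)/ ? 1 : 0;

# Lookbehind version from the question: asserts that "msn" plus one
# whitespace character sits immediately before the captured word.
my ($after_msn) = $data =~ /(?<=msn\s)(\w+)/;

# Illustrative alternative without lookaround: consume the letters and the
# whitespace, then capture what follows.
my ($plain) = $data =~ /[a-z]+\s+(\w+)/;

print "lookahead pattern matched: $lookahead_matched\n";   # 0
print "lookbehind capture: $after_msn\n";                  # bb
print "plain capture: $plain\n";                           # bb
```

When the surrounding text does not need to be excluded from the overall match, the plain capturing form is usually the simplest choice; lookbehind is useful when only the captured word itself should be matched or replaced.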