
A question about Perl regex matching

2012-04-05 

Perl regex matching problem

Perl code
my $data = "msn bb";
if ( $data =~ /(?=[a-z]+)\s+(\w+)/ ) {
    if ( defined $1 ) {
        print "matches: $1\n";
    }
}


I want to match bb and print it.
But this regular expression does not match.
Why is that?

[Solution]
That is odd. If the original poster only wants to match bb and print it, what is the (?=[a-z]+) in the regex meant to do?
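[Solution]
(?=...) is a lookahead.
(?<=...) is a lookbehind.

In the original pattern, (?=[a-z]+) only checks that a letter comes next without consuming it, so \s+ then has to match at that same position. The next character cannot be both a letter and whitespace, so the pattern never matches.

What is needed here is a lookbehind, but a lookbehind does not allow variable-length patterns, i.e. quantifiers such as + or *.

So the code can be written as: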
Perl code
my $data = "msn bb";
if ( $data =~ /(?<=msn\s)(\w+)/ ) {
    if ( defined $1 ) {
        print "matches: $1\n";
    }
}
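
As a rough comparison, the sketch below runs the original pattern, the lookbehind version, and a plain pattern that simply consumes the first word against the same string. The helper name first_capture is made up for illustration; the consuming variant is only one possible alternative and assumes that only $1 is needed.

```perl
use strict;
use warnings;

# Hypothetical helper: return $1 if the pattern matches, else undef.
sub first_capture {
    my ($text, $re) = @_;
    return $text =~ $re ? $1 : undef;
}

my $data = "msn bb";

# Original: the lookahead wants a letter exactly where \s+ must start.
my $original   = first_capture($data, qr/(?=[a-z]+)\s+(\w+)/);

# Fixed-length lookbehind, as in the reply above.
my $lookbehind = first_capture($data, qr/(?<=msn\s)(\w+)/);

# Consuming the first word instead of asserting it.
my $consuming  = first_capture($data, qr/[a-z]+\s+(\w+)/);

print "original:   ", ( defined $original ? $original : "(no match)" ), "\n";
print "lookbehind: $lookbehind\n";
print "consuming:  $consuming\n";
```

The first line reports no match, while the other two both capture bb.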
[Solution]
Looking ahead and looking behind

This section concerns the lookahead and lookbehind assertions. First, a little background.

In Perl regular expressions, most regexp elements 'eat up' a certain amount of string when they match. For instance, the regexp element [abc] eats up one character of the string when it matches, in the sense that Perl moves to the next character position in the string after the match. There are some elements, however, that don't eat up characters (advance the character position) if they match. The examples we have seen so far are the anchors. The anchor ^ matches the beginning of the line, but doesn't eat any characters. Similarly, the word boundary anchor \b matches wherever a character matching \w is next to a character that doesn't, but it doesn't eat up any characters itself. Anchors are examples of zero-width assertions. Zero-width, because they consume no characters, and assertions, because they test some property of the string. In the context of our walk in the woods analogy to regexp matching, most regexp elements move us along a trail, but anchors have us stop a moment and check our surroundings. If the local environment checks out, we can proceed forward. But if the local environment doesn't satisfy us, we must backtrack.

Checking the environment entails either looking ahead on the trail, looking behind, or both. ^ looks behind, to see that there are no characters before. $ looks ahead, to see that there are no characters after. \b looks both ahead and behind, to see if the characters on either side differ in their "word-ness".
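
The zero-width property can be checked directly. After a successful match, $-[0] and $+[0] hold the start and end offsets of the match, and for an anchor the two coincide. The following is a small sketch; the helper match_span and the sample string are made up for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper: return (start offset, width) of the first match.
sub match_span {
    my ($text, $re) = @_;
    return unless $text =~ $re;
    return ( $-[0], $+[0] - $-[0] );
}

my $s = "cat nap";    # made-up sample string

# Three anchors, plus one ordinary element for comparison.
for my $pair ( [ '^', qr/^/ ], [ '$', qr/$/ ], [ '\b', qr/\b/ ], [ '[abc]', qr/[abc]/ ] ) {
    my ( $name, $re ) = @$pair;
    my ( $start, $width ) = match_span( $s, $re );
    printf "%-5s start=%d width=%d\n", $name, $start, $width;
}
```

The three anchors report a width of 0, while [abc] reports a width of 1 because it consumes the character it matches.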

The lookahead and lookbehind assertions are generalizations of the anchor concept. Lookahead and lookbehind are zero-width assertions that let us specify which characters we want to test for. The lookahead assertion is denoted by (?=regexp) and the lookbehind assertion is denoted by (?<=fixed-regexp). Some examples are

Perl code
    $x = "I catch the housecat 'Tom-cat' with catnip";
    $x =~ /cat(?=\s)/;   # matches 'cat' in 'housecat'
    @catwords = ($x =~ /(?<=\s)cat\w+/g);  # matches,
                                           # $catwords[0] = 'catch'
                                           # $catwords[1] = 'catnip'
    $x =~ /\bcat\b/;  # matches 'cat' in 'Tom-cat'
    $x =~ /(?<=\s)cat(?=\s)/; # doesn't match; no isolated 'cat' in
                              # middle of $x
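
To make the "zero-width" part of these examples visible, the sketch below inspects what the lookahead match actually consumed. It reuses the sample string from above; the variable names $matched and $next are introduced here only for illustration.

```perl
use strict;
use warnings;

my $x = "I catch the housecat 'Tom-cat' with catnip";

my ( $matched, $next );
if ( $x =~ /cat(?=\s)/ ) {
    $matched = $&;                       # text actually consumed by the match
    $next    = substr( $x, $+[0], 1 );   # character right after the match
    print "matched '$matched', next char is '$next'\n";
}

# The lookbehind likewise leaves the preceding whitespace out of each match.
my @catwords = ( $x =~ /(?<=\s)cat\w+/g );
print "catwords: @catwords\n";
```

The whitespace tested by (?=\s) is still unconsumed after the match, and the words returned by the lookbehind pattern carry no leading space.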
