I have the following questions:
- Why does test 1 match the expected "hall" while test 2 matches "hallo"?
- Why does test 1 match the whole string while test 4 matches only a part of it?
Because that's the way Perl regexes work: if you have the expression (.*?)o? then, for preference, the .*? part will match *no characters at all*. So basically your expression matches either no characters, or one character if the next character is an "o". Since you're doing a search and replace, the effect is:

* If the next character is not an "o", match a zero-length string and output an empty string (the contents of $1).
* Since the last match was against a zero-length string, skip to the next character.
* Otherwise, if the next character is an "o", match it and output $1, which is again an empty string.
* Move to the end of the string matched.
* Find the next match and output all unmatched text (everything from the end of the last match to the start of this one).
* Repeat.

So in effect we end up deleting every letter "o". Or at least I think that's what's going on here after a very brief look ;-)

HTH, John.