regex


EscapeCharacter
06-19-2003, 02:31 AM
I'm no regular expression guru, but I'm sure the output of the following statement is wrong. I have this:


echo "blah blah Opera" |sed -e 's/[^Opera]//g'

output:

aaOpera


I can't for the life of me figure out why the two a's from "blah blah" are being printed. I've tried the same statement with PHP and Perl regex and get the same results in all of them. As far as I can tell, only Opera should be printed.

mrBen
06-19-2003, 03:15 AM
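It's because [^Opera] doesn't mean the word "Opera"; it matches any single character other than the letters 'O', 'p', 'e', 'r', and 'a'. The letters 'b', 'l' and 'h', along with the space character, are matched and deleted, but the letter 'a' in the middle of 'blah' is in the set, so it is left alone.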

EscapeCharacter
06-19-2003, 03:42 AM
Oh OK, and that's caused by the brackets, correct? So how would I go about filtering out everything but "Opera"?

mrBen
06-19-2003, 06:24 AM
Maybe:

sed -e 's/Opera//g'? Certainly it was the square brackets causing you problems.

EscapeCharacter
06-19-2003, 02:34 PM
Originally posted by mrBen
Maybe:

sed -e 's/Opera//g'? Certainly it was the square brackets causing you problems.

That does the opposite of what I want. I want everything but Opera to be replaced with nothing.

sploo22
06-19-2003, 02:56 PM
I'm using Windows (blecch) at the moment so I can't test this, but try 's/.*Opera.*/Opera/g'.
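For anyone who wants to check that suggestion, here is a minimal sketch, assuming a POSIX-style sed. The second input line and the -n/p variant are illustrative additions, not something posted in the thread. The g flag is left off because .* already spans the whole line, so there is only ever one match per line.

```shell
# Replace the whole line with just "Opera" when the line contains it.
result=$(echo "blah blah Opera" | sed -e 's/.*Opera.*/Opera/')
echo "$result"

# A line without "Opera" does not match, so sed passes it through unchanged.
no_match=$(echo "blah blah Mozilla" | sed -e 's/.*Opera.*/Opera/')
echo "$no_match"

# With -n and the p flag, only lines where the substitution happened are printed,
# so a non-matching line produces no output at all.
only_hits=$(echo "blah blah Mozilla" | sed -n -e 's/.*Opera.*/Opera/p')
echo "[$only_hits]"
```

The -n/p form is probably closer to "everything but Opera is replaced with nothing" when the input has lines that don't mention Opera at all.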

bwkaz
06-19-2003, 11:33 PM
Originally posted by EscapeCharacter

echo "blah blah Opera" |sed -e 's/[^Opera]//g'

output:

aaOpera
Square brackets match any single character from the set inside them, unless the set begins with a caret -- then the bracket expression matches any single character BUT the ones inside the brackets.

So your sed search expression matches any single character except capital O, lowercase p, e, r, and a. This keeps your "Opera" word intact all right, but it also keeps the a's from "blah" intact as well (because of the a between the brackets).
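To make the character-by-character behaviour concrete, here is a small sketch, assuming a standard sed and tr. The tr line and the "Oprah pear" input are illustrative additions. It shows that the order of the letters inside the brackets doesn't matter, only which characters are in the set.

```shell
# [^Opera] deletes every character that is NOT one of O, p, e, r, a.
# b, l, h and the spaces go; both a's from "blah" stay.
result=$(echo "blah blah Opera" | sed -e 's/[^Opera]//g')
echo "$result"

# tr -cd deletes the complement of a character set, which is the same operation.
result_tr=$(echo "blah blah Opera" | tr -cd 'Opera')
echo "$result_tr"

# The set is unordered: only h and the space are removed from this input.
scrambled=$(echo "Oprah pear" | sed -e 's/[^Opera]//g')
echo "$scrambled"
```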

The simplest way around it that I can think of is what sploo22 posted. However, I'm not at all sure what that will do if there's more than one Opera in a line... maybe this isn't an issue, though.
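A quick sketch of the multiple-Opera case, with a made-up input line. Because .* is greedy on both sides, the whole line is one match and collapses to a single "Opera". If every occurrence is wanted, grep -o is one option; note that -o is a GNU/BSD grep extension rather than POSIX, so treat that part as an assumption about your system.

```shell
# Two occurrences on one line: the single match covers the entire line,
# so only one "Opera" comes out, even with the g flag.
collapsed=$(echo "Opera blah Opera blah" | sed -e 's/.*Opera.*/Opera/g')
echo "$collapsed"

# grep -o prints each match on its own line (GNU/BSD extension, not POSIX).
each=$(echo "Opera blah Opera blah" | grep -o 'Opera')
echo "$each"
```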