Using sed with grouping


bykedog
06-18-2003, 12:45 PM
Hello,

I am trying to replace a large number of links in several web pages. The idea is to change this (examples are all one line, no CR):

<a href="http://www.firstgov.gov/" class="smalltext">

to this:

<a href="javascript:openScript('/redirect.php?redirect=http://www.firstgov.gov/',650,450)" class="smalltext">

I came up with this:

sed -e "s/href\=\"\(http.*\)\"/href=\"javascript\:openScript\(\'\/redirect\.php\?redirect=\1\'\,650\,450\)\"/g" page4.php.old > page4.php.new

This only works when the line in the page is:

<a class="smalltext" href="http://www.firstgov.gov/">

Otherwise I get:

<a href="javascript:openScript('/redirect.php?redirect=http://www.firstgov.gov" class="smallesttext',650,450)">

So how do I tell sed to make the grouping between the first set of matching quotes, not the first and last quote on the line?

Thanks in advance.

Strogian
06-18-2003, 01:39 PM
Tip #1: Use single quotes to enclose your sed script, if at all possible. That really cuts down on the number of backslashes you'll need.
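A minimal sketch of what that tip buys you, assuming a POSIX shell and a made-up sample line modeled on the one in the question (the variable names and the "tiny" class are only for illustration). The same substitution is written twice, once double-quoted and once single-quoted:

```shell
# Hypothetical sample line, modeled on the one in the question.
line='<a href="http://www.firstgov.gov/" class="smalltext">'

# Double-quoted script: each literal " has to be escaped for the shell.
with_double=$(printf '%s\n' "$line" | sed -e "s/class=\"smalltext\"/class=\"tiny\"/")

# Single-quoted script: the shell passes the text to sed untouched.
with_single=$(printf '%s\n' "$line" | sed -e 's/class="smalltext"/class="tiny"/')

# Both variables should hold the same rewritten line.
printf '%s\n' "$with_double" "$with_single"
```

One caveat: the replacement text in this thread contains literal single quotes, so a double-quoted script may still be the simpler choice for that particular command.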

Now that that's out of the way.. :)

Perl, I think, has a way to switch the "mode" of pattern matching (greedy vs. lazy, or something like that). I don't think you can do that in sed, though; it's always greedy. You can still get what you want, you just have to modify your regular expression so that it matches only what you want and nothing more. Maybe this would work:

Change this:

sed -e "s/href\=\"\(http.*\)\"/href=\"javascript\:openScript\('\/redirect\.php\?redirect=\1'\,650\,450\)\"/g" page4.php.old > page4.php.new

To this:

sed -e "s/href\=\"\(http[^ ]*\)\"/href=\"javascript\:openScript\('\/redirect\.php\?redirect=\1'\,650\,450\)\"/g" page4.php.old > page4.php.new


That would work if you expect whatever is in quotes to not have spaces in it. At least, I think that would work. I gotta admit I didn't actually look at the whole thing.. it's too complicated for me. :)
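To see the difference on a single line, here is a small sketch, again assuming a made-up sample line and a shortened replacement (the square brackets just mark what the group captured). It also tries [^"]* as a third variant; that one is an assumption on top of the suggestion above, not something tested against the real pages, but it should stop at the closing quote even when no space follows it:

```shell
# Hypothetical sample line with href as the FIRST attribute (the failing case).
line='<a href="http://www.firstgov.gov/" class="smalltext">'

# Greedy .* runs on to the LAST double quote on the line.
greedy=$(printf '%s\n' "$line" | sed -e 's/href="\(http.*\)"/href="[\1]"/')

# [^ ]* cannot cross a space, so the match ends at the URL's closing quote.
no_space=$(printf '%s\n' "$line" | sed -e 's/href="\(http[^ ]*\)"/href="[\1]"/')

# [^"]* cannot cross a double quote at all (an assumed, stricter variant).
no_quote=$(printf '%s\n' "$line" | sed -e 's/href="\(http[^"]*\)"/href="[\1]"/')

printf '%s\n' "$greedy" "$no_space" "$no_quote"
```

The first line printed should have the closing bracket after smalltext, showing that the group swallowed the class attribute; the other two should bracket only the URL.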

EDIT: Yeah, it changed some stuff to smilies.. Ignore that part. :)

bykedog
06-18-2003, 03:41 PM
Strogian,

Thanks! Your suggestion worked like a charm, since they're all URLs, with no spaces:

sed -e "s/href\=\"\(http[^ ]*\)\"/href=\"javascript\:openScript\('\/redirect\.php\?redirect=\1'\,650\,450\)\"/g" page4.php.old > page4.php.new

Thanks for the tip too - I ended up using quotes because I had to escape the / (because it's part of the replacement text). I assumed there's another way to use it, but I managed to get this working for now.
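On escaping the /: sed accepts other characters as the delimiter of the s command, so one possible way around it is to pick a character that does not occur in the pattern or the replacement. A sketch using | as the delimiter, run against a made-up sample line rather than the real page4.php.old:

```shell
# Hypothetical sample line, modeled on the one in the question.
line='<a href="http://www.firstgov.gov/" class="smalltext">'

# With | as the s-command delimiter, the slashes in the replacement need no
# backslashes. The script stays double-quoted because the replacement
# contains literal single quotes.
result=$(printf '%s\n' "$line" |
  sed -e "s|href=\"\(http[^ ]*\)\"|href=\"javascript:openScript('/redirect.php?redirect=\1',650,450)\"|g")

printf '%s\n' "$result"
```

The colon, comma, dot, question mark and parentheses are not special in the replacement part, so they are left unescaped here as well.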

Thanks for the tip about Perl too. So much to learn, so little time... ;)