Algorithm using AWK, SED or your own recommendations...


himatech
05-19-2004, 08:20 PM
Dear All,

First of all, I'm sorry if my presentation of the problem is a little complicated.

The following is a small part of a PostScript file:

%!PS-Adobe-3.0
%%Pages: (atend)
%%Title: scs2ps
%%BoundingBox: 0 0 612 792
%%LanguageLevel: 2
%%EndComments

%%BeginProlog
%%BeginResource: procset general 1.0.0
%%Title: (General Procedures)
%%Version: 1.0
/s { % x y (string)
3 1 roll
moveto
show
} def
%%EndResource
%%EndProlog

%%Page: 1 1
%%BeginPageSetup
/pgsave save def
/Courier [4.09 0 0 10.91 0 0] selectfont
%%EndPageSetup
40.09 690.55 (M) s
44.18 690.55 (R) s
48.27 690.55 (.) s
56.45 690.55 (A) s
60.55 690.55 (L) s
64.64 690.55 (W) s
68.73 690.55 (A) s
72.82 690.55 (L) s
76.91 690.55 (I) s
81.00 690.55 (D) s
89.18 690.55 (S) s
93.27 690.55 (A) s
97.36 690.55 (I) s
101.45 690.55 (F) s
109.64 690.55 (E) s
113.73 690.55 (L) s
117.82 690.55 (D) s
121.91 690.55 (I) s
126.00 690.55 (N) s
134.18 690.55 (A) s
138.27 690.55 (B) s
pgsave restore
showpage
%%PageTrailer
%%Trailer
%%Pages: 1
%%EOF

The output is "MR. ALWALID SAIF ELDIN AB". The spacing between adjacent characters within a word is 4.09, and the spacing between the last character of one word and the first character of the next word is 8.18.

I want to shift this line a little to the right so that it starts at 50.09 instead of 40.09.

Example:
The difference between the "M" and the "R" is 44.18 - 40.09 = 4.09.
The difference between one word and the next, such as "MR." and "ALWALID", is 56.45 - 48.27 = 8.18.

So I want a simple operation that tells me whether the next character is the start of a new word or not.
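
One way to sketch this word-boundary check in awk is to compare each x coordinate with the previous one and treat any gap clearly larger than the 4.09 character width as the start of a new word. The function name `find_words`, the 1.5 threshold factor, and the line-matching pattern are assumptions made for illustration; the sketch also assumes no character inside the parentheses is itself a space.

```shell
#!/bin/sh
# find_words (hypothetical name): rebuild the text from "x y (c) s" lines,
# inserting a space wherever the x gap exceeds 1.5 character widths.
# 4.09 is the character width from the sample; 1.5 is an assumed threshold.
find_words() {
    awk -v width=4.09 '
        # Only consider lines shaped like: number number (char) s
        /^[0-9.]+ [0-9.]+ \(.*\) s$/ {
            ch = $3
            gsub(/^\(|\)$/, "", ch)    # strip the surrounding parentheses
            if (seen && $1 - prev > 1.5 * width)
                text = text " "        # large gap: this character starts a new word
            text = text ch
            prev = $1
            seen = 1
        }
        END { print text }
    ' "$@"
}
```

Called as `find_words file.ps` (or with the file on standard input), it prints the reconstructed line, which makes it easy to check that the gaps are being read the way you expect.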

I know how to do this using `sed`, but I don't know how to shift the rest of the characters based on the spacing between one character and the next. Another solution, using `AWK`, is:

awk '{print $1+10 " " $2 " " $3 " " $4}' /tmp/fic2

But the problem with AWK is that the file does not have this structure from beginning to end, so the command would also alter lines that should stay as they are.
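
A possible way around this in awk is to put a pattern in front of the action, so that only lines shaped like `x y (c) s` are modified and everything else is printed untouched. This is only a sketch, assuming every text line follows that four-field layout; `shift_x` and `dx` are illustrative names, and `/tmp/fic2` is the path from the command above.

```shell
#!/bin/sh
# shift_x (hypothetical name): add DX to the x coordinate of text lines only.
# Usage: shift_x DX [FILE...]
shift_x() {
    dx=$1
    shift
    awk -v dx="$dx" '
        # Text lines look like: number number (char) s
        /^[0-9.]+ [0-9.]+ \(.*\) s$/ {
            $1 = sprintf("%.2f", $1 + dx)   # keep two decimals in the result
        }
        { print }                           # all other lines pass through as-is
    ' "$@"
}

# Example: shift_x 10 /tmp/fic2 > /tmp/fic2.shifted
```

Formatting with `%.2f` keeps the coordinates in the same two-decimal form as the input.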

Can you provide a better algorithm or another tool?

Thank you for your kind help in advance.

bwkaz
05-19-2004, 10:38 PM
I'm sure it's possible in Perl or Python... but I don't have a clue how to do it.

What I'd do is just manually go through each line that needs to change, and add 10 to its first coordinate. It'll get repetitive, sure, but I can't program any simple way to do it (due to my lack of Perl/Python/text manipulation language knowledge)...

flukshun
05-20-2004, 11:01 AM
You've got the right idea. I don't know how "smart" you want the editing script to be, but with this particular file you could restrict the changes to the correct segment by using '%%EndPageSetup' and 'pgsave restore' as the start/end triggers.

This seems to work:


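#!/usr/bin/perl
use strict;
use warnings;

while (<>) {
    # The range operator is true from %%EndPageSetup through "pgsave restore";
    # the second test excludes the two marker lines themselves.
    if ((/^%%EndPageSetup$/ .. /^pgsave restore$/)
        and not (/^%%EndPageSetup$/ or /^pgsave restore$/)) {
        my @words = split(/ /, $_);   # fields: x, y, (char), s
        $words[0] += 10;              # shift the x coordinate by 10
        print join(' ', @words);
    }
    else {
        print;                        # copy every other line unchanged
    }
}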

fluxion@purity ~/scripts/test: cat postscript | shifter.pl
%!PS-Adobe-3.0
%%Pages: (atend)
%%Title: scs2ps
%%BoundingBox: 0 0 612 792
%%LanguageLevel: 2
%%EndComments

%%BeginProlog
%%BeginResource: procset general 1.0.0
%%Title: (General Procedures)
%%Version: 1.0
/s { % x y (string)
3 1 roll
moveto
show
} def
%%EndResource
%%EndProlog

%%Page: 1 1
%%BeginPageSetup
/pgsave save def
/Courier [4.09 0 0 10.91 0 0] selectfont
%%EndPageSetup
50.09 690.55 (M) s
54.18 690.55 (R) s
58.27 690.55 (.) s
66.45 690.55 (A) s
70.55 690.55 (L) s
74.64 690.55 (W) s
78.73 690.55 (A) s
82.82 690.55 (L) s
86.91 690.55 (I) s
91 690.55 (D) s
99.18 690.55 (S) s
103.27 690.55 (A) s
107.36 690.55 (I) s
111.45 690.55 (F) s
119.64 690.55 (E) s
123.73 690.55 (L) s
127.82 690.55 (D) s
131.91 690.55 (I) s
136 690.55 (N) s
144.18 690.55 (A) s
148.27 690.55 (B) s
pgsave restore
showpage
%%PageTrailer
%%Trailer
%%Pages: 1
%%EOF
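
Note that the listing above prints 91 and 136 rather than 91.00 and 136.00, because the numeric sum loses its trailing zeros when it is turned back into a string. For comparison, here is a rough awk sketch of the same start/end trigger idea that formats the result with two decimals. The function name `shift_segment` and the `inside` flag are made up for this example, and the shift of 10 is hard-coded as in the Perl script.

```shell
#!/bin/sh
# shift_segment (hypothetical name): shift x by 10 only between the
# %%EndPageSetup and "pgsave restore" marker lines.
shift_segment() {
    awk '
        /^pgsave restore$/ { inside = 0 }     # end trigger: stop before this line
        inside {
            $1 = sprintf("%.2f", $1 + 10)     # two decimals, so 81.00 becomes 91.00
        }
        { print }
        /^%%EndPageSetup$/ { inside = 1 }     # start trigger: begin after this line
    ' "$@"
}
```

It can be run the same way as the Perl script, for example `shift_segment postscript > shifted.ps`, where `shifted.ps` is just a placeholder output name.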