simple perl parser


threadhead
05-23-2004, 07:11 AM
hi im writing a perl parser for the output
of iwlist <dev> scanning.

the output of the above program looks like that:

.....

ath0 Scan completed :
Cell 01 - Address: 00:0D:88:84:F7:4A
Mode:Master
Encryption key:off
Quality:7/0 Signal level:-88 dBm Noise level:-95 dBm
Mode:Master
ESSID:"default"
Frequency:2.437GHz
Bit Rate:1Mb/s
Bit Rate:2Mb/s
Bit Rate:5Mb/s
Bit Rate:11Mb/s
Bit Rate:22Mb/s


ath0 Scan completed :
Cell 01 - Address: 00:0D:88:84:F7:4A
Mode:Master
Encryption key:off
Quality:7/0 Signal level:-88 dBm Noise level:-95 dBm
Mode:Master
ESSID:"default"
Frequency:2.437GHz
Bit Rate:1Mb/s
Bit Rate:2Mb/s
Bit Rate:5Mb/s
Bit Rate:11Mb/s
Bit Rate:22Mb/s

ath0 Scan completed :
..............


the problem with this is that my perl script only takes the first occurrence of the regex i am applying.

a snippet of my perl code

#opening input and output file
...

@contents = <READ>; #the file we want to read from

foreach(@contents){
s/\n//ig;
}

if("@contents" =~ m/regexes go here/ig){
#blah
}
#close files
....



now, how would i be able to apply the regex to the whole array of contents, or the whole file?
i think the problem is because of that if("@contents").

i also tried a
while(<READ>){
}
around that if() but without success.

any ideas? ;)
thanks for your time
threadhead

maccorin
05-23-2004, 07:20 AM
Originally posted by threadhead
now, how would i be able to apply the regex to the whole array of contents, or the whole file?
i think the problem is because of that if("@contents").

i also tried a
while(<READ>){
}
around that if() but without success.

any ideas? ;)
thanks for your time
threadhead

you could just read the whole file into a scalar instead of an array

here's two ways


$file = "";
while(<READ>) {
$file .= $_; # .= appends strings; += would add them as numbers
}



$file = `cat filename.txt`;


you'll still need to either replace all the \n's with spaces or use the /s switch (it lets . match a newline; /m only changes what ^ and $ match)
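
A minimal sketch of the slurp idea, assuming the scan log is small enough to hold in memory. The $sample text and the in-memory filehandle are stand-ins for the real READ handle, and the ESSID pattern is only an illustration, not the regex from the original script.

```perl
use strict;
use warnings;

# Stand-in for the scan log; in the thread this would come from the READ handle.
my $sample = <<'END';
ath0 Scan completed :
Cell 01 - Address: 00:0D:88:84:F7:4A
ESSID:"default"

ath0 Scan completed :
Cell 01 - Address: 00:0D:88:84:F7:4A
ESSID:"default"
END

# Open the string as an in-memory file so the example is self-contained.
open(my $fh, '<', \$sample) or die "cannot open in-memory file: $!";

# With $/ (the input record separator) undefined, <$fh> returns the whole file.
my $file = do { local $/; <$fh> };
close($fh);

# In list context, /g returns every match instead of only the first one.
my @essids = $file =~ /ESSID:"([^"]*)"/g;
print scalar(@essids), " ESSID lines found\n";
```

Using local keeps the change to $/ confined to the do block, so later reads elsewhere still work line by line.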

maccorin
05-23-2004, 07:23 AM
Originally posted by threadhead

#opening input and output file
...

@contents = <READ>; #the file we want to read from

foreach(@contents){
s/\n//ig;
}

if("@contents" =~ m/regexes go here/ig){
#blah
}
#close files
....


Sorry for the double post, i just thought of this, i don't think i answered your question w/ my first post

have you tried


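$text = "@contents"; # /g needs a real variable to remember where the last match ended
while($text =~ m/regexes go here/igm) {
do_something($&); # $& holds the text of the current match, $_ does not
}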


that's untested, but i _think_ it should work (i usually just use perl for quick line-by-line regex's w/ the -pe switch)
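
A small sketch of that loop with capture groups, assuming the lines are joined into one variable first. The second address and the "office" ESSID are invented sample data, and the pattern is only an example.

```perl
use strict;
use warnings;

# Invented sample lines standing in for @contents.
my @contents = (
    "Cell 01 - Address: 00:0D:88:84:F7:4A\n",
    "ESSID:\"default\"\n",
    "Cell 02 - Address: 00:11:22:33:44:55\n",
    "ESSID:\"office\"\n",
);

# Join once into a variable; in scalar context /g continues from the
# position of the previous match stored with that variable.
my $text = join '', @contents;

my @pairs;
while ($text =~ /Address: (\S+)\s+ESSID:"([^"]*)"/g) {
    # $1 and $2 belong to the match found in this iteration.
    push @pairs, "$1 $2";
}
print "$_\n" for @pairs;
```

Matching against a freshly interpolated string such as "@contents" inside the loop condition would start from the beginning on every pass, which is why the variable matters.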

threadhead
05-23-2004, 07:39 AM
well with those regexes i want to print a certain format to a file.

like:

if("@contents" =~ m/(1)(2)(3)/ig){
format PRINT =
@<<<< @<<<
$1, $2
.
write PRINT;
}


how could i apply your solution to that?

thanks alot!

maccorin
05-23-2004, 02:17 PM
i'm not sure what you're doing, it's prolly due to my own incompetence though :( I'm sorry I can't be of further help.

specifically i'm not sure what the @<<<< does. I guess i should look up the format command for perl, but i'm too lazy right this second.

flukshun
05-23-2004, 03:42 PM
I also think you should be more specific about what exactly you are trying to extract from the file. Unless you put things into context, it seems unnecessarily complicated to slurp everything into a scalar rather than just matching on a line by line basis (since the file is already formatted as such).

But keeping with your initial requirements, a generic solution would be to slurp everything into a scalar, store each match (encompassing the entire relevant string) into an array, then extract particular fields from those matching strings by looping through the resulting array. Example:

undef $/; # slurp mode: read the whole file at once ("" would be paragraph mode)
$file = <FILE>;
@matches = $file =~ m/regex/gs;

foreach $string (@matches) {
$string =~ m/(1)(2)(3)/;
print "$1 $2 $3\n";
}


But as I stated before, it's hard to know what you want when you've made no specifications on the type of data you wanna pull. The above method only makes sense if your matches span multiple lines. If they don't, you shouldn't slurp the data in the first place as it just makes it harder to parse.
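
The two-step approach could be sketched like this, assuming each record starts at a "Cell NN" line. The second cell and both patterns are made up for illustration.

```perl
use strict;
use warnings;

# Sample input; the second cell is invented so there is more than one record.
my $file = <<'END';
ath0 Scan completed :
Cell 01 - Address: 00:0D:88:84:F7:4A
Mode:Master
Encryption key:off
ESSID:"default"
Cell 02 - Address: 00:11:22:33:44:55
Mode:Managed
Encryption key:on
ESSID:"office"
END

# Step 1: one array element per cell. /s lets . cross newlines, and the
# lookahead stops each match at the next cell or at the end of the input.
my @cells = $file =~ /(Cell \d+ .*?)(?=Cell \d+ |\z)/gs;

# Step 2: pull the individual fields out of each multi-line block.
my @rows;
foreach my $cell (@cells) {
    my ($address) = $cell =~ /Address: (\S+)/;
    my ($mode)    = $cell =~ /Mode:(\S+)/;
    my ($essid)   = $cell =~ /ESSID:"([^"]*)"/;
    push @rows, "$address $mode $essid";
}
print "$_\n" for @rows;
```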

cux
05-23-2004, 04:15 PM
If you have the entire file in a scalar $contents, you can match against it in a loop:


@contents = <READ>;
close(READ);
$contents = "";

foreach $line (@contents){
$contents .= $line;
}

## get rid of the newlines, the space is optional unless you
## depend on them in your regex
$contents =~ s/\n/ /ig;

## now just do your matching in a while loop
# with a weak regex
while ( $contents =~ m/ath0(.*?)(?=ath0|$)/ig ){
## now the match is in $1
#blah
}



As you can see I'm no perl expert but I think this is the basic idea you are looking for.

threadhead
05-23-2004, 05:19 PM
thanks for all your helping hands, but i just found a quick hack for my problem.

to be more specific about my problem.


ath0 Scan completed :
Cell 01 - Address: 00:0D:88:84:F7:4A
Mode:Master
Encryption key:off
Quality:7/0 Signal level:-88 dBm Noise level:-95 dBm
Mode:Master
ESSID:"default"
Frequency:2.437GHz
Bit Rate:1Mb/s
Bit Rate:2Mb/s
Bit Rate:5Mb/s
Bit Rate:11Mb/s
Bit Rate:22Mb/s


assume i want to extract the values that are behind
mode, encryption and essid.
that way i couldn't do line by line matching since i am not looking for the same pattern every time. one time it is "Mode:" i am looking for and another time it is "ESSID:". from my understanding, that makes line by line matching impossible.

the quick hack i was applying to my code to get it working was the following.
i made one big regex for all the things i have been looking for, like essid and some more. those values had to be on the same line though.
so i searched for the pattern "Cell " within the whole file.
if it was found i took the following 8 lines and removed their newline feeds and put all of the 8 lines into a $variable.
this $var was then matched by my big regular expression.

this was done until it reached the end of the file.

complicated but it does its job.
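
Roughly, the hack described above might look like the following sketch. The original code was not posted, so the variable names, the span of 8 lines, and the big regex here are all assumptions.

```perl
use strict;
use warnings;

# Sample lines, already without their newlines.
my @lines = (
    'ath0 Scan completed :',
    'Cell 01 - Address: 00:0D:88:84:F7:4A',
    'Mode:Master',
    'Encryption key:off',
    'Quality:7/0 Signal level:-88 dBm Noise level:-95 dBm',
    'Mode:Master',
    'ESSID:"default"',
    'Frequency:2.437GHz',
    'Bit Rate:1Mb/s',
    'Bit Rate:2Mb/s',
);

my $span = 8;    # lines taken after each "Cell " line (assumed from the post)
my @records;
for my $i (0 .. $#lines) {
    next unless $lines[$i] =~ /^Cell /;
    my $last = $i + $span;
    $last = $#lines if $last > $#lines;    # do not run past the end of the file
    # An array slice plus join puts the whole cell on one line.
    push @records, join ' ', @lines[$i .. $last];
}

# One big regex applied to each joined record.
my @found;
for my $record (@records) {
    if ($record =~ /Address: (\S+).*?Mode:(\S+).*?Encryption key:(\S+).*?ESSID:"([^"]*)"/) {
        push @found, "$1 $2 $3 $4";
    }
}
print "$_\n" for @found;
```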

luckily the codes are only for myself and not my employer.

maccorin
05-23-2004, 05:25 PM
you could do it line by line still


/(ESSID|Mode|Encryption\skey):(.*)/


that puts which one you're working with in $1, and the value in $2
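
A short sketch of how that pattern could be used line by line, collecting the values in a hash. The %field hash and the in-memory filehandle are additions for the example.

```perl
use strict;
use warnings;

my $log = <<'END';
Cell 01 - Address: 00:0D:88:84:F7:4A
Mode:Master
Encryption key:off
Quality:7/0 Signal level:-88 dBm Noise level:-95 dBm
Mode:Master
ESSID:"default"
Frequency:2.437GHz
END

open(my $fh, '<', \$log) or die "cannot open in-memory file: $!";

my %field;
while (my $line = <$fh>) {
    # $1 is the label that matched, $2 is everything after the colon.
    if ($line =~ /(ESSID|Mode|Encryption\skey):(.*)/) {
        $field{$1} = $2;    # a repeated label (Mode here) overwrites the earlier value
    }
}
close($fh);

print "$_ = $field{$_}\n" for sort keys %field;
```

Since . does not match a newline by default, $2 stops at the end of the line without any chomp.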

threadhead
05-23-2004, 05:34 PM
more experience in perl would've helped.
didnt know i can do it that way, damn thats pretty easy. ;)
all the stuff i was matching got printed to a file finally.
how would i erase even lines from the file, or even catch them before
they get printed.

are there any good functions in perl that do something like that?
like shrinking an array or automatically detecting even lines in a file?

thanks

flukshun
05-23-2004, 05:52 PM
Originally posted by threadhead
how would i erase even lines from the file, or even catch them before they get printed.

are there any good functions in perl that do something like that?
like shrinking an array or automatically detecting even lines in a file?

when looping through a file (using the default input separator of "\n"), $. contains the current line count. Using it, you can determine whether or not the line is even by checking if the number is evenly divisible by two. Example:


while (<FILE>) {
print $_ unless ($. % 2 == 0); # % is the remainder; & 2 would only test one bit
}


Catching these lines before printing to the file can be done using the same check on a running counter.
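
The running-counter variant could look like this sketch. @output and its contents are made-up stand-ins for whatever is about to be written to the file.

```perl
use strict;
use warnings;

# Hypothetical lines that are about to be printed.
my @output = ('first', 'second', 'third', 'fourth');

my $count = 0;
my @kept;
for my $line (@output) {
    $count++;                    # plays the role of $. when no file is being read
    next if $count % 2 == 0;     # skip lines 2, 4, 6, ...
    push @kept, $line;
}
print "$_\n" for @kept;
```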

threadhead
05-23-2004, 05:57 PM
oh no you misunderstood me.
i meant 'even' in the sense of identical or equal.

excuse me for that confusion.

flukshun
05-23-2004, 06:07 PM
Originally posted by threadhead
oh no you misunderstood me.
i meant 'even' in the sense of identical or equal.

excuse me for that confusion.

hehe, that's quite the discrepancy. This is a little bit harder, but one method is to use a hash to store each line, then check each line against the hash table to determine whether or not the line has already been encountered. Example:

while (<FILE>) {
print $_ unless ( exists $lines{$_} );
$lines{$_}++;
}


I'm not aware of any built-in functions that serve the same purpose.
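
The same idea is often written with a single post-increment test. A self-contained sketch, with invented sample lines and an in-memory filehandle in place of FILE:

```perl
use strict;
use warnings;

my $data = "alpha\nbeta\nalpha\ngamma\nbeta\n";
open(my $fh, '<', \$data) or die "cannot open in-memory file: $!";

my (%seen, @unique);
while (my $line = <$fh>) {
    # $seen{$line}++ is 0 (false) the first time a line shows up and
    # true on every repeat, so only the first copy gets through.
    next if $seen{$line}++;
    push @unique, $line;
}
close($fh);

print @unique;
```

This keeps the first occurrence of each line and preserves the original order.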

threadhead
05-23-2004, 06:15 PM
i didnt try your solution, but i will do and read up on hashes.
thanks ;)

btw: dont hashes look like
%hash = ("localhost", "remote");
?

how would i then store values in the hash and/or compare certain values
to the stored ones?
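
For the hash question above, a minimal sketch of storing and checking values. The keys and values are only examples.

```perl
use strict;
use warnings;

# A list of two items creates one pair: key "localhost", value "remote".
my %hash = ("localhost", "remote");

# Store another value under a new key.
$hash{"00:0D:88:84:F7:4A"} = 1;

# exists checks whether a key is present at all.
my $known = exists $hash{"00:0D:88:84:F7:4A"} ? "yes" : "no";

# A stored value is compared like any other scalar (eq for strings).
my $same = $hash{"localhost"} eq "remote" ? "yes" : "no";

print "address known: $known, value matches: $same\n";
```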

threadhead
05-24-2004, 11:26 AM
as feedback for your answers
i post the finished and working code here.

the log file output must look like that
(produced by 'iwlist <device> scanning')


ath0 Scan completed :
Cell 01 - Address: 00:0D:88:84:F7:4A
Mode:Master
Encryption key:off
Quality:7/0 Signal level:-88 dBm Noise level:-95 dBm
Mode:Master
ESSID:"default"
Frequency:2.437GHz
Bit Rate:1Mb/s
Bit Rate:2Mb/s
Bit Rate:5Mb/s
Bit Rate:11Mb/s
Bit Rate:22Mb/s


ath0 Scan completed :
Cell 01 - Address: 00:0D:88:84:F7:4A
Mode:Master
Encryption key:off
Quality:7/0 Signal level:-88 dBm Noise level:-95 dBm
Mode:Master
ESSID:"default"
Frequency:2.437GHz
Bit Rate:1Mb/s
Bit Rate:2Mb/s
Bit Rate:5Mb/s
Bit Rate:11Mb/s
Bit Rate:22Mb/s



the program i wrote:

#!/usr/local/bin/perl -w
#simple parser for output of
#iwlist <device> scanning

$readfile = shift || die "usage:\n" . "\tperl $0 <file from scan.sh> <output file>\n";
$outfile = shift || die "usage:\n" . "\tperl $0 <file from scan.sh> <output file>\n";
#opening source file and output file
open(READ,"$readfile") || die "cannot open $readfile: $!\n";
open(PRINT,">$outfile")|| die "cannot open $outfile: $!\n";

#preformat file for list
$a = "Address Mode, Mode2 Encryption key Sig. lvl. ESSID Frequency\n";
$b = "--------------------------------------------------------------------------------------------------------------------------------------------------\n";
print PRINT $a;
print PRINT $b;
$count = 0;

print "\n";

while(<READ>){
if(m/(Address: |Mode:|Encryption key:|Signal level:|ESSID:|Frequency:)(.\S{1,})/){
#assigning pattern match'ed value to slot $count
$values[$count] = $2;
#do we have a complete entry?
if($count eq 6){
#first value is always the address, check if already existing
if($values[0] =~ m/\S{2}:\S{2}:\S{2}:\S{2}:\S{2}:\S{2}/){
if($hash{$values[0]}){
print "already added '$values[0]', omitting entry!\n";
}
else{
#that address was not written until now, ok to write
$hash{$values[0]} = 1;
format PRINT =
@<<<<<<<<<<<<<<<< || @<<<<<<<<<<, @<<<<<<<<<< || @<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<< || @<<<<<<<<< || @<<<<<<<<<<<<<<<<<<<<< || @<<<<<<<
$values[0],$values[1],$values[4],$values[2],
$values[3] . " dBm",$values[5],$values[6]
.
write PRINT;

}
} #if($values[0] =~ m/\S{2}:\S{2}:\S{2}:\S{2}:\S{2}:\S{2}/)
#we wrote to file, lets begin a new cycle
$count = 0;
} #if($count eq 6)
#if not, we can continue writing values to the array
else{
$count++;
}
} #if(m/(Address: |Mode:|Encryption key:|Signal level:|ESSID:|Frequency:)(.\S{1,})/)
} #while(<READ>)

print "everything done!\n";
print "formatted list now in '$outfile'!\n\n";

close(READ);
close(PRINT);
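
For comparison, a sketch of the same parsing logic with strict, lexical filehandles and printf instead of format. This is a restructuring under assumptions, not the posted program: parse_scan, the column widths, and the in-memory sample are inventions for the example, and it prints to STDOUT rather than to an output file.

```perl
use strict;
use warnings;

# Labels of the fields to collect, as in the posted regex.
my $label = qr/Address: |Mode:|Encryption key:|Signal level:|ESSID:|Frequency:/;

# Reads a scan log from a filehandle and returns one array ref per
# unique address, with the 7 values in the order they appear.
sub parse_scan {
    my ($fh) = @_;
    my (@values, %seen, @records);
    while (my $line = <$fh>) {
        next unless $line =~ /($label)(\S+)/;
        # An Address line always starts a new record, so an incomplete
        # cell cannot shift the values of the cells after it.
        @values = () if $1 eq 'Address: ';
        push @values, $2;
        next unless @values == 7;
        push @records, [@values] unless $seen{ $values[0] }++;
        @values = ();
    }
    return @records;
}

my $cell = <<'END';
ath0 Scan completed :
Cell 01 - Address: 00:0D:88:84:F7:4A
Mode:Master
Encryption key:off
Quality:7/0 Signal level:-88 dBm Noise level:-95 dBm
Mode:Master
ESSID:"default"
Frequency:2.437GHz
Bit Rate:1Mb/s
END
my $sample = $cell . "\n" . $cell;    # the same scan twice, as in the log

open(my $fh, '<', \$sample) or die "cannot open in-memory file: $!";
my @records = parse_scan($fh);
close($fh);

my $layout = "%-17s  %-8s %-8s  %-4s  %-8s  %-12s  %s\n";
printf $layout, qw(Address Mode Mode2 Key Signal ESSID Frequency);
for my $r (@records) {
    my ($addr, $mode, $key, $signal, $mode2, $essid, $freq) = @$r;
    printf $layout, $addr, $mode, $mode2, $key, "$signal dBm", $essid, $freq;
}
```

Like the posted regex, \S+ stops at the first space, so an ESSID containing spaces would be cut short.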