Perl question


TaeShadow
11-13-2000, 01:05 AM
I just started learning Perl 4 days ago and I have a question. I am trying to write a script that parses through a mail file and prints a list giving the sender and subject of each message. Here is the code I have so far:


#!/usr/bin/perl -w

print "Enter the name of the mail file: ";
$file = <STDIN>;
chomp $file;
open(FILE, $file);

while (<FILE>)
{
    if (/^(?:From\b)/) { @from = split(/ /); }
    if (/^(?:Subject\b)/) { @subject = split(/:/); }

    $messages{$from[1]} = $subject[1];
}

foreach (keys(%messages))
{
    print "From: $_\n";
    print "Subject: $messages{$_}\n\n";
}

It almost works, but there are a few problems. First, suppose the subject line contains a colon (e.g. "Subject: Re: whatever"). It will only display the text between the first and second colons. Also, the From line sometimes returns strange answers. How could I tell it to look at the line and just pick out the email address? Btw, the mail file I am using is the inbox file generated by KMail. Thank you.


Tae

klamath
11-13-2000, 01:51 AM
Something like this (untested - it should work, but I'm tired and I may have made a typo or two):


#!/usr/bin/perl -w
use strict;

print "Enter the name of the mail file: ";
my $filename = <STDIN>;
# Keep only letters, digits, underscore, comma, period and space;
# this also strips the trailing newline.
$filename =~ tr/a-zA-Z0-9_,. //cd;
open(FILE, "<$filename") or die "Couldn't open $filename: $!\n";

my ($from, $subj, %messages);
while (<FILE>) {
    chomp;

    if (s/^From:\s+//) {
        $from = $_;
    } elsif (s/^Subject:\s+//) {
        $subj = $_;
    }

    # Once both headers have been seen, record the pair and reset.
    if (defined($from) and defined($subj)) {
        $messages{$from} = $subj;
        $from = undef;
        $subj = undef;
    }
}
close(FILE);

foreach (keys %messages) {
    print "From: $_\n";
    print "Subject: $messages{$_}\n\n";
}


BTW, there was a fairly major security hole in your original code. Can you spot it? (Hint: don't trust any input.)

Also, it uses the 'strict' pragma. It's a very good idea to get used to programming with it.

------------------
- Klamath
Get my GnuPG Key Here (http://klamath.dyndns.org/mykey.asc)
Looking for an open source project to contribute to? Check out the BBB (http://bbb.sourceforge.net)

TaeShadow
11-13-2000, 02:41 AM
Ok, I understand everything you did except for one thing. Can you explain the line:

$filename =~ tr/a-zA-Z0-9_,. //cd;


I don't understand how to use tr very well.

I cannot find the security hole in my code. It appears that you modified it so that it takes out any characters that are not letters, numbers, or _,. , but I'm not seeing how it is a security hole not to do that.


Tae

YaRness
11-13-2000, 09:45 AM
if (/^From:\s+(.*)$/) { $from = $1; }

# the () marks a pattern to be accessed with $1 ($2 for the second one, etc) after matching.


the ".*" in there might be better replaced with a more explicit pattern ( "(?:[^\s]+\s*)+" maybe?), but you can try and play with that on yer own i guess, if you haven't already tried doing backreferencing.

klamath's code is prolly better, i just thought i'd give a go at how i'd do it with what i know (which is very little)
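
A minimal sketch of the capture-group approach, applied to both of the original problems (colons in the subject, and picking out just the address). The sample header lines and the address pattern are assumptions made for illustration; the pattern is a rough "something@something" match, not a full mail-address parser.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical sample header lines, standing in for lines read from the mail file.
my @lines = (
    'From: Jane Doe <jane@example.com>',
    'Subject: Re: whatever',
);

my ($from, $subj);
for my $line (@lines) {
    if ($line =~ /^From:\s+(.*)$/) {
        $from = $1;
        # Rough address pattern (an assumption); keep only the address if one is found.
        if ($from =~ /([\w.+-]+\@[\w.-]+)/) {
            $from = $1;
        }
    }
    elsif ($line =~ /^Subject:\s+(.*)$/) {
        # (.*) keeps everything after the header name, including later colons.
        $subj = $1;
    }
}

print "From: $from\n";
print "Subject: $subj\n";
```

With the sample lines above this prints "From: jane@example.com" and "Subject: Re: whatever".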


<edit> from the perlop man page: on tr//, the c and d at the end of it in the above code do this:

c Complement the SEARCHLIST.
d Delete found but unreplaced characters.
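
A small sketch of what those two flags do together, assuming the character set was meant to be a-zA-Z0-9 plus underscore, comma, period and space. The input string is a made-up example of hostile input.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical hostile input, including the newline that <STDIN> would leave on it.
my $filename = "|rm -r foo; inbox.txt\n";

# c: complement the list, so it means "every character NOT listed here".
# d: delete those characters, because the replacement list is empty.
# tr returns the number of characters it matched (here: deleted).
my $clean   = $filename;
my $removed = ($clean =~ tr/a-zA-Z0-9_,. //cd);

print "Cleaned: '$clean'\n";
print "Removed: $removed\n";
```

Here the pipe, the hyphen, the semicolon and the newline are deleted, leaving 'rm r foo inbox.txt' and a count of 4.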

------------------
"Assembly of Japanese bicycle require great peace of mind."
Registered Linux User #188285 http://counter.li.org/
------------------

[This message has been edited by YaRness (edited 13 November 2000).]

klamath
11-13-2000, 07:04 PM
TaeShadow - the part you don't understand fixes the security hole. What it does is remove all the characters that are NOT in that set from the filename - it keeps alphanumerics + period, underscore, comma, and space (add a few more if you need).

The reason this is necessary is because of the way open() works. For example, the following is legal:

open(FILE, "|rm -r foo");

Which would run 'rm -r foo' and delete 'foo'. When the filename starts with '|', Perl starts the command and opens a pipe for writing to it; when it ends with '|', Perl runs the command and reads its output. Either way, the command gets executed. So if the user entered the filename '|rm -r /' (and the script happened to be running as root), the system is toast. The transliteration fixes this by deleting all the shell metacharacters (like '|', '<', '>', back-quotes, etc).
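
Another way to close the same hole, sketched here as an alternative rather than as what the code above does, is the three-argument form of open() (available since Perl 5.6). The mode is passed separately, so the filename is never inspected for '|', '<' or '>'. The input string below is a made-up example.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical hostile input; with three-argument open it is only ever a file name.
my $filename = '|echo pwned';
my $opened   = 0;

# The mode ('<') is its own argument, so nothing in $filename can
# switch open() into pipe or write mode.
if (open(my $fh, '<', $filename)) {
    $opened = 1;
    print "Opened a file literally named '$filename'\n";
    close($fh);
} else {
    print "Couldn't open '$filename': $!\n";
}
```

Unless a file with that literal name happens to exist, the open simply fails and no command is run.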


TaeShadow
11-13-2000, 07:15 PM
That is scary.

Why did you use tr? What is wrong with using a substitution?


Tae

klamath
11-13-2000, 07:23 PM
tr// is faster.


YaRness
11-13-2000, 08:45 PM
that's weird, i remember writing a paragraph about the security fix, musta deleted it by accident on the edit.

:(

oh well.


TaeShadow
11-14-2000, 05:37 PM
So as long as the user doesn't enter a pipe character, I'm safe, right?

I appreciate it anyway, YaRness :)

jemfinch
11-14-2000, 05:48 PM
Originally posted by TaeShadow:
So as long as the user doesn't enter a pipe character, I'm safe, right?


No, they can also enter angle brackets ('>' or '>>') and clobber other files.
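
A harmless way to see this for yourself, sketched with a scratch file from File::Temp so nothing real gets damaged. The "user input" is made up for the demonstration.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Create a scratch file with some content in it.
my ($tmp, $path) = tempfile(UNLINK => 1);
print $tmp "important data\n";
close($tmp);

# Hypothetical user input: a leading '>' tells two-argument open to write.
my $input = ">$path";
open(FILE, $input) or die "Couldn't open $input: $!\n";
close(FILE);

# The file was truncated just by being opened; nothing was ever printed to it.
my $size = (stat $path)[7];
print "Size after open: $size bytes\n";
```

The scratch file ends up at 0 bytes, even though the script only meant to read it.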

Jeremy