Perl question
TaeShadow
11-13-2000, 01:05 AM
I just started learning Perl 4 days ago and I have a question. I am trying to write a script that parses through a mail file and prints a list giving the sender and subject of each message. Here is the code I have so far:
#!/usr/bin/perl -w
print "Enter the name of the mail file: ";
$file = <STDIN>;
chomp $file;
open(FILE, $file);
while (<FILE> )
{
if (/^(?:From\b)/) { @from = split(/ /); }
if (/^(?:Subject\b)/) { @subject = split(/:/); }
$messages{$from[1]} = $subject[1];
}
foreach (keys(%messages))
{
print "From: $_\n";
print "Subject: $messages{$_}\n\n";
}
It works almost nicely, but there are a few problems. First, suppose the subject line contains a colon (eg. "Subject: Re: whatever"). It will only display the stuff between the first and second colons. Also, the From line sometimes returns strange answers. How could I tell it to look at the line and just pick out the email address? Btw, the mail file I am using is the inbox file generated by KMail. Thank you.
Tae
klamath
11-13-2000, 01:51 AM
Something like this (untested - it should work, but I'm tired and I may have made a typo or two):
#!/usr/bin/perl -w
use strict;
print "Enter the name of the mail file: ";
my $filename = <STDIN>;
$filename =~ tr/a-zA-Z0-9_,. //cd;
open(FILE, "<$filename") or die "Couldn't open $filename\n";
my ($from, $subj, %messages);
while (<FILE> ) {
chomp;
if (s/^From:\s+//) {
$from = $_;
} elsif (s/^Subject:\s+//) {
$subj = $_;
}
if (defined($from) and defined($subj)) {
$messages{$from} = $subj;
$from = undef;
$subj = undef;
}
}
foreach (keys %messages) {
print "From: $_\n";
print "Subject: $messages{$_}\n\n";
}
BTW, there was a fairly major security hole in your original code. Can you spot it? (Hint: don't trust any input.)
Also, it uses the 'strict' pragma. It's a very good idea to get used to programming with it.
------------------
- Klamath
Get my GnuPG Key Here (http://klamath.dyndns.org/mykey.asc)
Looking for an open source project to contribute to? Check out the BBB (http://bbb.sourceforge.net)
TaeShadow
11-13-2000, 02:41 AM
Ok, I understand everything you did except for one thing. Can you explain the line:
$filename =~ tr/a-zA-Z0-9_,. //cd;
I don't understand how to use tr very well.
I cannot find the security hole in my code. It appears that you modified it so that it takes out any characters that are not letters, numbers, or _,. , but I'm not seeing how it is a security hole not to do that.
Tae
YaRness
11-13-2000, 09:45 AM
if (/^From:\s+(.*)$/) { $from = $1; }
# the () marks a pattern to be accessed with $1 ($2 for the second one, etc) after matching.
the ".*" in there might be better replaced with a more explicit pattern ( "(?:[^\s]+\s*)+" maybe?), but you can try and play with that on yer own i guess, if you haven't already tried doing backreferencing.
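Here is a rough sketch of how capturing could handle both of the original problems (colons in the subject, and pulling just the address out of the From line). The sub names subject_of and address_of and the sample address are made up for illustration, and the address pattern is deliberately loose rather than a full mail-address parser:

```perl
use strict;
use warnings;

# Return the text after "Subject:", keeping any later colons intact.
sub subject_of {
    my ($line) = @_;
    # A limit of 2 makes split stop after the first colon.
    my (undef, $subject) = split /:/, $line, 2;
    return undef unless defined $subject;
    $subject =~ s/^\s+|\s+$//g;    # trim surrounding whitespace
    return $subject;
}

# Return the first thing on the line that looks like an email address
# (hypothetical, loose pattern; not a complete address grammar).
sub address_of {
    my ($line) = @_;
    return $line =~ /([\w.+-]+\@[\w.-]+)/ ? $1 : undef;
}

print subject_of("Subject: Re: whatever\n"), "\n";              # prints "Re: whatever"
print address_of('From: Tae Shadow <tae@example.com>'), "\n";   # prints "tae@example.com"
```

The third argument to split is what keeps "Re: whatever" in one piece; without it the line is cut at every colon.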
klamath's code is prolly better, i just thought i'd give a go at how i'd do it with what i know (which is very little)
<edit> from the perlop man page: on tr//, the c and d at the end of it in the above code do this:
c Complement the SEARCHLIST.
d Delete found but unreplaced characters.
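To see the two modifiers working together, here is a small sketch. The sub name sanitize and the sample inputs are invented for the example. The set is written with A-Z; a range like A-z would also cover the punctuation that sits between Z and a in ASCII, including the backquote.

```perl
use strict;
use warnings;

# Keep only letters, digits, underscore, comma, period, and space.
# /c complements the list (so it means "everything else"),
# /d deletes those characters because no replacement is given.
sub sanitize {
    my ($name) = @_;
    (my $clean = $name) =~ tr/a-zA-Z0-9_,. //cd;
    return $clean;
}

print sanitize("|rm -r /\n"), "\n";        # prints "rm r "
print sanitize("inbox_2000.txt"), "\n";    # prints "inbox_2000.txt"
```

Note that the newline from STDIN is not in the list either, so it gets deleted too, which is why that script can do without a chomp on the filename.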
------------------
"Assembly of Japanese bicycle require great peace of mind."
Registered Linux User #188285 http://counter.li.org/
------------------
[This message has been edited by YaRness (edited 13 November 2000).]
klamath
11-13-2000, 07:04 PM
TaeShadow - the part you don't understand fixes the security hole. What it does is remove all the characters that are NOT in that set from the filename - it keeps alphanumerics + period, underscore, comma, and space (add a few more if you need them).
This is necessary because of the way open() works. For example, the following is legal:
open(FILE, "|rm -r foo");
That would delete the file or directory named 'foo'. When the string starts with '|', Perl treats the rest as a command, runs it, and opens a pipe for writing to it instead of opening a file. So if the user entered the filename '|rm -r /' (and the script happened to be running as root), the system is toast. The transliteration fixes this by deleting all the shell metacharacters (like '|', '<', '>', back-quotes, etc).
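A different way to close the same hole, assuming Perl 5.6 or later, is the three-argument form of open. This is not what the script above does; it is a sketch of an alternative, and the sub name open_for_reading is made up for the example:

```perl
use strict;
use warnings;

# Open a file for reading with the three-argument form of open.
# The mode is passed separately, so characters such as '|', '<', and '>'
# in $name are treated as part of the file name, not as instructions.
sub open_for_reading {
    my ($name) = @_;
    open(my $fh, '<', $name) or return undef;
    return $fh;
}

# No command is run here; Perl just looks for a file with this odd name.
my $fh = open_for_reading('|echo this never runs');
print defined $fh ? "opened\n" : "could not open: $!\n";
```

With this form the filename no longer needs to be stripped of metacharacters to stay safe from pipes and redirects, though checking user input is still a good habit.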
TaeShadow
11-13-2000, 07:15 PM
That is scary.
Why did you use tr? What is wrong with using a substitution?
Tae
klamath
11-13-2000, 07:23 PM
tr// is faster.
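For comparison, a substitution with a negated character class gives the same result; tr just does it without going through the regex engine. A small sketch, with a made-up sample input:

```perl
use strict;
use warnings;

my $input = "my inbox; rm -r /\n";

# Transliteration: delete every character that is NOT in the list.
(my $with_tr = $input) =~ tr/a-zA-Z0-9_,. //cd;

# Substitution: a negated character class does the same job.
(my $with_s = $input) =~ s/[^a-zA-Z0-9_,. ]//g;

print "same result\n" if $with_tr eq $with_s;    # prints "same result"
```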
YaRness
11-13-2000, 08:45 PM
that's weird, i remember writing a paragraph about the security fix, musta deleted it by accident on the edit.
oh well.
TaeShadow
11-14-2000, 05:37 PM
So as long as the user doesn't enter a pipe character, I'm safe, right?
I appreciate it anyway, YaRness.
jemfinch
11-14-2000, 05:48 PM
Originally posted by TaeShadow:
So as long as the user doesn't enter a pipe character, I'm safe, right?
No, they can also enter angle brackets and clobber other files.
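A sketch of what that looks like with a two-argument open. Everything happens in a throwaway temp directory, and the file name victim.txt is made up for the demonstration:

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);

# Work in a throwaway directory so nothing real is touched.
my $dir    = tempdir(CLEANUP => 1);
my $victim = "$dir/victim.txt";

# Create a file with some content in it.
open(my $out, '>', $victim) or die "Cannot create $victim: $!";
print {$out} "important data\n";
close($out);
my $size_before = (stat $victim)[7];

# Simulated user input: a leading '>' turns a two-argument open into
# "open for writing", which truncates the file.
my $user_input = ">$victim";
open(my $fh, $user_input) or die "Cannot open: $!";
close($fh);
my $size_after = (stat $victim)[7];

print "before: $size_before bytes, after: $size_after bytes\n";
```

The file is emptied even though the script only ever meant to read it.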
Jeremy