subroutines in perl


singlespeed
02-13-2006, 04:54 PM
I'm not sure the title matches what I'm trying to do, so please bear with me. I've got a bunch of Perl scripts that parse different text files and load the parsed data into our database. That part is done and works fine, but because the text files differ I've ended up with 5 separate Perl scripts, one for each file type.

The first thing each of them does is look in a specific directory for unprocessed files. It also checks whether any unprocessed file has the same original name as an already processed file. This is to prevent importing the same file twice.

This section is the same in all 5 scripts except for a single parameter that tells it which directory to look in. I'd like to split this part out into its own script, call it from the main script, and get back an array containing the names of the files that need processing. Is there a way to do this?

Thanks,

chrism01
02-13-2006, 07:54 PM
Well, I'd write it as 1 prog with subs, put the dir names in a cfg file, and read that at startup. That makes it trivial to move the prog around.
Also, to avoid wasting time/code on filename checks, I always use 2 dirs: incoming and archive.
Every file that's been successfully imported is then moved (immediately) into the archive dir.
That way you can always re-load if you have to, but you don't have to check filenames.
The number of incoming/archive dirs depends on whether the groups of files have sufficiently unique names or each group needs its own dirs.
Typical filename formats include <src_system>_<data_desc>_<datetimestamp>.<data_type>
HTH
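
A minimal sketch of the incoming/archive approach described above, not anyone's actual script. The directory names `incoming` and `archive` and the `import_file` placeholder are assumptions made for illustration; the real parse-and-load step would go where the placeholder is.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use File::Copy qw(move);

# Hypothetical directory names; in practice these would be read from a cfg file.
my $incoming = 'incoming';
my $archive  = 'archive';

# Placeholder for the real parse-and-load step; returns true on success.
sub import_file {
    my ($path) = @_;
    return 1;
}

# Return the plain files currently waiting in a directory, sorted by name.
sub pending_files {
    my ($dir) = @_;
    opendir(my $dh, $dir) or die "Cannot open $dir: $!";
    my @files = sort grep { -f "$dir/$_" } readdir($dh);
    closedir($dh);
    return @files;
}

# Import each waiting file and move it to the archive dir straight away,
# so a file that has been loaded is never seen in incoming again.
sub process_incoming {
    my ($in, $arch) = @_;
    my @done;
    for my $name (pending_files($in)) {
        my $src = "$in/$name";
        next unless import_file($src);
        move($src, "$arch/$name") or die "Cannot archive $src: $!";
        push @done, $name;
    }
    return @done;
}

# Only run when both directories exist in the current working directory.
process_incoming($incoming, $archive) if -d $incoming && -d $archive;
```

Files whose import fails stay in the incoming dir, so they are picked up again on the next run.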

singlespeed
02-14-2006, 09:39 AM
The files are much too different, so a config file would be almost as labor intensive as having individual programs. I thought of the incoming/archive dir, but I don't want to risk the users importing the same file twice. Checking filenames isn't foolproof but does help. The files always have datestamps in the name. Re-importing is not something I want the users doing; for me it's an easy matter.

Ideally the one piece of code that I'd like to re-use is the file checking. Is there no way in Perl to kick off an external script and then get an array from it as output in the parent script?

singlespeed
02-22-2006, 10:52 AM
Anyone have any input on this?

truls
02-24-2006, 03:59 AM
my @array = `perl script.pl`;

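This assumes script.pl prints its output one item per line. The backticks run an external program and, in list context, return its output as a list of lines. Each line still ends with its newline, so chomp the array before using the values as filenames.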
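
A slightly fuller sketch of the same backtick technique, with the exit status checked and the newlines removed. The helper name `lines_from` is made up for this example, and the child command is a stand-in one-liner so the sketch runs on its own; a real caller would pass something like "perl script.pl" instead.

```perl
use strict;
use warnings;

# Run a command and return its output lines without trailing newlines.
# Dies if the command could not be run or exited with a non-zero status.
sub lines_from {
    my ($cmd) = @_;
    my @lines = `$cmd`;
    die "Command failed: $cmd (status $?)" if $? != 0;
    chomp @lines;
    return @lines;
}

# Stand-in child process: the current perl ($^X) printing two filenames.
my @files = lines_from(qq{$^X -le "print for qw(a.txt b.txt)"});
print "$_\n" for @files;
```

This works, but every call starts a new perl process and the data has to be passed through text, which is part of why a shared sub or module is usually the cleaner choice for code reuse.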

BTW: If you are going to do some Perl programming, I would recommend getting a good book on the subject. Learning Perl (easy) and Programming Perl (medium) would be a good start, and then the Perl Cookbook so you don't actually have to do anything yourself. (I nicked this from Chapter 16 of Programming Perl.)

singlespeed
02-24-2006, 09:30 AM
I actually do have "Learning Perl", the "Llama" book from O'Reilly. I've just recently found the answer I need on perlmonks.org. Using modules is the proper way.

Thanks for the reply.
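
For readers landing here later, a rough sketch of what the module approach could look like. The thread does not show the actual solution, so everything here is an assumption: the package name `FileCheck`, the sub name `unprocessed_files`, and the idea that the already processed names are passed in as a plain list and compared by exact name. The package is kept in the same file only so the example runs as-is; normally it would live in its own FileCheck.pm and be loaded with `use FileCheck;`.

```perl
use strict;
use warnings;

# Hypothetical module holding the one piece of shared code.
package FileCheck;

# Return the names of plain files in $dir that are not in the list of
# already processed names (exact-name comparison, an assumption here).
sub unprocessed_files {
    my ($dir, @processed) = @_;
    my %seen = map { $_ => 1 } @processed;
    opendir(my $dh, $dir) or die "Cannot open $dir: $!";
    my @todo = sort grep { -f "$dir/$_" && !$seen{$_} } readdir($dh);
    closedir($dh);
    return @todo;
}

package main;

# Each of the 5 scripts would call the sub with its own directory.
# Here the directory comes from the command line, if one was given,
# and 'already_loaded.txt' is a made-up processed filename.
my $dir = shift @ARGV;
if (defined $dir) {
    my @files = FileCheck::unprocessed_files($dir, 'already_loaded.txt');
    print "$_\n" for @files;
}
```

The sub returns an ordinary list, so the calling script gets its array directly without starting a second process or parsing printed output.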