Hello, I am very new to Perl and to this forum. Please help with this problem.
I have 200 two-column data files named xyz_channel_1.dat ... xyz_channel_200.dat. The columns of each file are similar, as shown below: the first column 'X' starts at 1 and ends at 200, and the second column 'Y' is arbitrary. Columns X and Y are separated by a single white space.

X Y
1 2
2 3
3 6
4 5
5 7
. .
. .
. .
. .
200 9

For each file I want to integrate over two different ranges of X, say integrate1 (X=3 to 70) and integrate2 (X=90 to 150), and then subtract: diff = integrate2 - integrate1. Finally, I want to save the calculated 'diff' from all the files xyz_channel_1.dat ... xyz_channel_200.dat in a new out.dat file.

When you say "integrate" what do you mean exactly? In terms of columns and mathematics what does "integrate" mean? Is there an arbitrary "range" you'd like to integrate?

As for the basics, IMO you will want to read each file in, loop through the lines (skipping the first line of each file), split each line into an array, and push the array reference into another array. Then you can "integrate" (whatever that means) from the resultant array.

file 1.dat

X Y
1 5
2 9
3 9
4 7
5 1

file 2.dat

X Y
1 3
2 4
3 7
4 9
5 10

Code to read through them and get the values of each. You can adjust the integration code to deal with a particular range, but remember perl arrays start with 0 as an index.

use strict;
use warnings;

open(my $out, ">", "output.dat") or die "Cannot open output.dat: $!";
for my $x (1 .. 2) {
	my @holder;
	open(my $in, "<", "$x.dat") or die "Cannot open $x.dat: $!";
	my $y = 0;
	while (<$in>) {
		chomp;
		$y++;
		next if $y == 1; #skips the first (header) line
		my @pairs = split(/ /);
		push(@holder, \@pairs);
	}
	close $in;
	for (@holder) {
		#do the "integration" whatever that is
		print "col 1: $_->[0] col 2: $_->[1]\n";
	}
	#send to output file, e.g. with: print $out ...
}
close $out;

Hi, thanks. I tried it, but it doesn't help me.

Integration simply means summation. To explain my problem, take for example file 1.dat (which you made). If we integrate for x=2 to 4, the answer is 25, and if we integrate for another range, x=2 to 5, the answer is 26.
So the 'diff' of my problem is -1, i.e. the difference of the two integrations, 25-26.
I want to do the same for all my 200 data files, xyz_channel_1.dat ... xyz_channel_200.dat, and save the 'diff' for each input file in the same out.dat file, so in the end I will have 200 values in a single column in out.dat.
Please consider me a beginner in Perl.
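
To make the arithmetic above concrete, here is a minimal sketch that sums Y over an inclusive range of X for the sample 1.dat values. The sub name range_sum and the hand-typed hash are assumptions made for illustration; they are not part of anyone's posted code.

```perl
use strict;
use warnings;

# Y values of the sample file 1.dat, keyed by X (typed in by hand here).
my %y_of = (1 => 5, 2 => 9, 3 => 9, 4 => 7, 5 => 1);

# Sum Y for every X in the inclusive range $start..$end.
sub range_sum {
    my ($data, $start, $end) = @_;
    my $sum = 0;
    for my $x ($start .. $end) {
        $sum += $data->{$x} if exists $data->{$x};
    }
    return $sum;
}

my $int1 = range_sum(\%y_of, 2, 4);   # 9 + 9 + 7
my $int2 = range_sum(\%y_of, 2, 5);   # 9 + 9 + 7 + 1
my $diff = $int1 - $int2;
print "integrate1=$int1 integrate2=$int2 diff=$diff\n";
# prints: integrate1=25 integrate2=26 diff=-1
```

The same sub could be called once per range for each of the 200 files after the file has been read into a hash.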

Here is some code that does these integrations across files. If there's only 1 integration per file, this code should work. I have put in some debug code to help you see what is happening.

my $x;

my @range_holder;
#you'll want a better way to read the ranges
my %range_hash1=("start",2,"end",4);
push @range_holder,\%range_hash1;
my %range_hash2=("start",2,"end",5);
push @range_holder,\%range_hash2;


my %ahash;
for ($x=1;$x<3;$x++){
	open (FILE, "<", "$x\.dat");
	my $y;
	while (<FILE>){
		chomp;
		$y++;
		next if($y==1); #skips the first line
		my($index,$value)=split(/ /);
		$ahash{$x}->{$index}=$value; #that's {file number} {row number} = value
	}
	close FILE;
}
my $c;
my @sum_holder;
for (@range_holder){
	$c++; #the file number we're working on
	my $start=$_->{start};
	print "start $start\n";
	my $end=$_->{end};
	print "end $end\n";
	my $sum=0;
	my $k;
	print "range $c\n";
	for ($k=$start;$k<=$end;$k++){
		$sum=$sum+$ahash{$c}->{$k};
		print "value $ahash{$c}->{$k} \t sum $sum\n";
	}
	push @sum_holder,$sum;
}

#now you have all the integrations in an array
print $sum_holder[0]-$sum_holder[1]."\n";

output:

start 2
end 4
range 1
value 9          sum 9
value 9          sum 18
value 7          sum 25
start 2
end 5
range 2
value 4          sum 4
value 7          sum 11
value 9          sum 20
value 10         sum 30
-5

Sorry to post again, but here is the elegant solution to your issue (and I am all for elegant and terse):

use strict;
my $x;
my %ahash;
my $files=2; #set to however many files you have or read the DIR
for ($x=1;$x<=$files;$x++){
	open (FILE, "<", "$x\.dat");
	my $y;
	while (<FILE>){
		chomp;
		$y++;
		next if($y==1); #skips the first line
		my($index,$value)=split(/ /);
		$ahash{$x}->{$index}=$value; #that's {file number} {row number} = value
	}
	close FILE;
}
#### being sloppy stuff
#### now that you have the hashes of hashes stored you can calculate 
#### any range by passing the ref to the inner hash with the begin
#### line and the end line - see below
my $fileno1=1; #the file number you'd like to use for integration
my $fileno2=2;
my $var1=calc_integration(\$ahash{$fileno1},2,4); #that's ref to inside hash and begin/end range
my $var2=calc_integration(\$ahash{$fileno2},2,5); 
my $intg=$var1-$var2;
print "diff is $intg\n";
#### end the sloppy stuff

sub calc_integration{
	my ($h,$start,$end)=@_;
	my $x;
	my $sum;
	for ($x=$start;$x<=$end;$x++){
		$sum=$sum+$$h->{$x};
	}
	return $sum;
}

My wife says I have to turn off the computer now so this is just a stub. The following shows you can use glob to do something to every file that is named according to a specific pattern:

#!/usr/bin/perl
use strict;
use warnings;

my @files_to_integrate = glob("/home/david/Programming/Perl/xyz_channel_*.dat");

foreach my $f (@files_to_integrate){
    integrate_file($f);
}

sub integrate_file {
    my $file = $_[0];
    print "Integrating $file now\n";
}

I think you said you want to do two integrations and calculate the difference on each file, correct? I hope to have time to look at this a little more tomorrow.

OK, try this. Not saying it's better than mitchems' solution, just different.

#!/usr/bin/perl
use strict;
use warnings;

my ($start_range1, $end_range1, $start_range2, $end_range2) = (2, 4, 2, 5);
my $cur_dir = "/home/david/Programming/Perl"; #Path to data files on my computer
my @files_to_integrate = glob("$cur_dir/xyz_channel_*.dat");

foreach my $f (@files_to_integrate){
    my ($sum1, $sum2) = integrate_file($f);
    my $diff = $sum2 - $sum1;
    print "$f $sum1 $sum2 $diff\n";
}

sub integrate_file {
    my $file = $_[0];
    my ($s1, $s2) = (0, 0);
    open my $fin, '<', $file or die "Failed to open $file: $!";
    $_ = <$fin>; #Skip first line
    while (<$fin>){
        my ($x, $y) = split(/ /, $_);
        $s1 += $y if ($x >= $start_range1) and ($x <= $end_range1);
        $s2 += $y if ($x >= $start_range2) and ($x <= $end_range2);
    }
    return ($s1, $s2)
}

Well, one important difference between my solution and yours is that I am making full use of references and you're not. I'm not saying that's better or worse, but for a newbie, perhaps he doesn't understand references that well and if he's doing it for a class, the teacher will probably not believe that he wrote either of our solutions :-), especially mine because of all the de-referencing stuff.

BTW, I like the "glob" way of reading the files. I typically use something like this:

opendir(DATA, "$DATAROOT/$dir") || die "Can't open data directory ($DATAROOT)\n";
while( $file = readdir(DATA) )
{
 #do something with each file
}

But both work! Nice job! I love how Perl allows you to do stuff so many different ways.

Yes, references can be very useful but the notation looks scary when you're new to Perl. When reading files into an array I used to avoid using references too as I found the various alternative notations confusing. Instead I would save each line as an array element and split the element into another array as needed to access the component fields (which could require splitting the same array element more than once). But sometimes references are pretty much essential, such as for passing or returning more than one list to or from subroutines -- so it's good to become familiar with them sooner or later. (Not to mention hash references. Passing attributes to objects and saving complex data to databases require hashes and hash references.)
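
As a rough illustration of the point about passing more than one list to a subroutine, here is a small sketch. The sub name pairwise_sum and the sample arrays are made up for this example.

```perl
use strict;
use warnings;

# Passing two arrays directly would flatten them into one list in @_,
# so the sub receives a reference to each array instead.
sub pairwise_sum {
    my ($xs, $ys) = @_;          # two array references
    my @sums;
    for my $i (0 .. $#{$xs}) {
        push @sums, $xs->[$i] + $ys->[$i];
    }
    return \@sums;               # return a reference as well
}

my @first  = (1, 2, 3);
my @second = (10, 20, 30);
my $result = pairwise_sum(\@first, \@second);
print "@{$result}\n";            # prints: 11 22 33
```

Without the backslashes, the sub would see a single six-element list and could not tell where the first array ends.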

As for the opendir and readdir way, on the one hand it is not quite as simple as the glob but it is more powerful, as it lets you access subdirectories as well as files.

One potential downside with using my @files_to_integrate = glob("$cur_dir/xyz_channel_*.dat"); to get a list of files is that the order in which the filenames occur in the list will be alphabetical -- so that the xyz_channel_10.dat file will be processed before the xyz_channel_2.dat file. That may not matter in this case, and if it does it should not be too difficult to sort the results as desired.
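
A sketch of the kind of sort that could put the names in numerical order, assuming the names always end in _<number>.dat. The helper channel_number and the hard-coded list (standing in for what glob returns) are hypothetical.

```perl
use strict;
use warnings;

# Hypothetical file names in plain string order, standing in for glob output.
my @files = sort map { "xyz_channel_$_.dat" } (1, 2, 10, 11, 100, 200);

# Extract the channel number from a name like xyz_channel_12.dat.
sub channel_number {
    my ($name) = @_;
    my ($n) = $name =~ /_(\d+)\.dat$/;
    return $n;
}

# Compare the extracted numbers with <=> instead of comparing the strings.
my @sorted = sort { channel_number($a) <=> channel_number($b) } @files;
print "$_\n" for @sorted;
```

In string order xyz_channel_10.dat comes right after xyz_channel_1.dat; after the numeric sort the list runs 1, 2, 10, 11, 100, 200.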

Edited 6 Years Ago by d5e5: Forgot to mention hash references.

Hi,
@mitchems & @d5e5, thank you very much.
@d5e5, in my case it matters that the 'diff' is done in numerical order, i.e. it should start with file xyz_channel_1.dat and finish with xyz_channel_200.dat.

So, as far as I understand, this 'glob' will not work then.

With help from both of you I have made the code below, but it is not printing anything. I want to save the output in a file called out.dat.

In fact, in the out.dat file I want to print two columns. The first column will be some corresponding values:

my $velocity;
for ($velocity=3; $velocity=203; $velocity=$velocity+1)

and the second column will be the 'diff' from the files, starting from xyz_channel_1.dat.
So out.dat will look like the following:

3             'diff'from xyz_channel_1.dat 
4             'diff'from xyz_channel_2.dat 
5             'diff'from xyz_channel_3.dat
6             'diff'from xyz_channel_4.dat
.
.
.
.
.
203            'diff'from xyz_channel_200.dat

#!/usr/bin/perl
use strict;
use warnings;

my $velocity;
for ($velocity=3; $velocity=203; $velocity=$velocity+1); 
 
my ($start_range1, $end_range1, $start_range2, $end_range2) = (20, 30, 100, 110);
opendir(DATA, "/shuklax/data/xyz_channel") || die "Can't open data directory \n";

# all the files xyz_chennel_1.dat ...... xyz_chennel_200.dat are in "/shuklax/data/xyz_channel".....

while( $file = readdir(DATA) )
{
foreach my $f ($file){
    my ($sum1, $sum2) = integrate_file($f);
    my $diff = $sum2 - $sum1;
    print "$f $sum1 $sum2 $diff\n";
}
 
sub integrate_file {
    my $file = $_[0];
    my ($s1, $s2) = (0, 0);
    open my $fin, '<', $file or die "Failed to open $file: $!";
    $_ = <$fin>; #Skip first line
    while (<$fin>){
        my ($x, $y) = split(/ /, $_);
        $s1 += $y if ($x >= $start_range1) and ($x <= $end_range1);
        $s2 += $y if ($x >= $start_range2) and ($x <= $end_range2);
    }
    return ($s1, $s2)
}
.....
....
....
...

# pls print $velocity $diff in out.dat

PLEASE CORRECT AND COMPLETE THE ABOVE CODE.
THANKS IN ADVANCE..

I like the reference stuff, but it can be confusing. One thing I love about Perl is that you can have a reference or alias to just about anything, including functions. Have you used Moose at all? I am currently working with Catalyst and Moose. It's a lot like Ruby on Rails, but with the power of CPAN behind it. Awesome IMO.

The first time I started using references was back in 1994 when I wrote a wrapper to DBI that returned a reference to an array of hashes - the return of the SQL command with each line in the array and each column used as a key for the inside hash. I have to admit, I still use that "tool" today for quick DB queries.

The code looks a bit like this (first written in 1994), the down side is that it makes a connection each time:

sub goSQL{
use DBI;
my ($statement,$database)=@_; #pass the SQL statement,database
my $dbh;
my $user;
my $password="password";
my $db;
$database="DB_default" if($database eq "");
$user="DB_user";
if(!$dbh){
	$dbh = DBI->connect("dbi:mysql:$database", $user, $password);
}
	if ($dbh){
		my $id;
		my $count;
		my $color;
		my $sth;
		my @rows;
		my $err;
		$sth=$dbh->prepare("$statement");
		$sth->execute();
		my $nf=$sth->{NUM_OF_FIELDS};
		if ($nf!=0){
			while($id=$sth->fetchrow_hashref()){
				push @rows,$id;
			}
		}
		return \@rows;
	} else {
	    return 0;
	}  
}

If you combine that with an object that holds the particulars (user, db, password), you can get a ref. Then you use it like this:

my $h=goSQL("select * from table", "database");
for (@$h){ #dereference the returned array reference
    print "$_->{column}\n";
}

All you have to do is put the following code in your file:

open (OUT, ">", "out.dat");
#.... other code
print OUT "$velocity $diff\n";

And BTW you should probably say

for ($velocity=3; $velocity<=203; $velocity++) { ... }
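
One caveat on loop conditions, offered as a sketch: in Perl a single = in the condition slot assigns 203 to $velocity, which is always true, so such a loop never ends. A comparison such as <= is needed, and the loop body belongs in braces rather than after a trailing semicolon. The array @velocities below exists only for illustration.

```perl
use strict;
use warnings;

my @velocities;
# "<=" compares; a single "=" here would assign 203 and loop forever.
for (my $velocity = 3; $velocity <= 203; $velocity++) {
    push @velocities, $velocity;
}
print scalar(@velocities), " values, from $velocities[0] to $velocities[-1]\n";
# prints: 201 values, from 3 to 203
```

Note that 3 to 203 inclusive is 201 values, one more than 200 files, so the end value may need adjusting.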

Hi, thanks.
But I don't know how to read the files from the data directory in sorted order, so I am not able to use my files in the program.

Please check the code below and help with the correct syntax, and check for errors. Please consider me new.

#!/usr/bin/perl
use strict;
use warnings;
my $velocity;
for ($velocity=3; $velocity=203; $velocity++); 
my ($start_range1, $end_range1, $start_range2, $end_range2) = (20, 30, 100, 110);
opendir(DATA, "/shuklax/data/xyz_channel") || die "Can't open data directory \n";
open (OUT, ">", "out.dat");
while( $file = readdir(DATA) )
{ 


# How to read here the files /shuklax/data/xyz_channel/xyz_channel_*.dat in sorted order..., 
    # something like so that i can read the content of $file as @files_to_integrate one by one in sorted order....
# my @files_to_integrate = $file = /shuklax/data/xyz_channel/xyz_channel_*.dat"); 
  #
#



foreach my $f (@files_to_integate){
    my ($sum1, $sum2) = integrate_file($f);
    my $diff = $sum2 - $sum1;
    print "$f $sum1 $sum2 $diff\n";
}
 
sub integrate_file {
    my $file = $_[0];
    my ($s1, $s2) = (0, 0);
    open my $fin, '<', $file or die "Failed to open $file: $!";
    $_ = <$fin>; #Skip first line
    while (<$fin>){
        my ($x, $y) = split(/ /, $_);
        $s1 += $y if ($x >= $start_range1) and ($x <= $end_range1);
        $s2 += $y if ($x >= $start_range2) and ($x <= $end_range2);
    }
    return ($s1, $s2)
}
print OUT "$velocity $diff\n";

Since you know there are 200 files numbered 1-200, why not do this:

my $x;
for ($x=1;$x<201;$x++){
open (FILE,"<","/shuklax/data/xyz_channel/xyz_channel_$x\.dat") or die "Can't open file $x: $!";
while (<FILE>){
#do your thing with each file
}
close FILE;
}

That will pick up each file in order from 1-200.

I didn't mean to say that glob will not work. What I mean is that if you want the filenames in a particular order you need to do some sorting. readdir doesn't help in this case -- glob gives you an array which you can sort and save in another array. Then process each filename in the sorted array and the results will be in order by the integer portion of the filename. Try the following (after changing the $cur_dir from my value to yours). I tested it for a few small files, not 200. But I think it should work for your 200 files.

#!/usr/bin/perl
use strict;
use warnings;

my ($start_range1, $end_range1, $start_range2, $end_range2) = (2, 4, 2, 5);
my $cur_dir = "/home/david/Programming/Perl"; #Path to data files on my computer
my @files_to_integrate = glob("$cur_dir/xyz_channel_*.dat");
my @files_to_integrate_sorted = sort sortfiles @files_to_integrate;

open my $fout, '>', "$cur_dir/out.dat"; #Output file for $velocity and $diff
foreach my $f (@files_to_integrate_sorted){
    my ($sum1, $sum2, $velocity) = integrate_file($f);
    my $diff = $sum2 - $sum1;
    print "$f $sum1 $sum2 $diff\n";
    printf $fout "%-14d%5d from %s\n", ($velocity, $diff, $f);
}
close ($fout);


sub sortfiles {#Sort filenames numerically by the digit(s) in the filename
    my ($nbr_a) = ($a =~ /\D+(\d+)\D/);
    my ($nbr_b) = ($b =~ /\D+(\d+)\D/);
    $nbr_a <=> $nbr_b;
}

sub integrate_file {
    my $file = $_[0];
    my ($s1, $s2) = (0, 0);
    my ($v) = ($file =~ /\D+(\d+)\D/);#Velocity should be the integer from filename plus 2
    $v = $v + 2; #For example, velocity of xyz_channel_1.dat is 3
    open my $fin, '<', $file or die "Failed to open $file: $!";
    $_ = <$fin>; #Skip first line
    while (<$fin>){
        my ($x, $y) = split(/ /, $_);
        $s1 += $y if ($x >= $start_range1) and ($x <= $end_range1);
        $s2 += $y if ($x >= $start_range2) and ($x <= $end_range2);
    }
    return ($s1, $s2, $v)
}

Well, hey.... All I can say, d5e5, is that you've gone above and beyond here. WOW. Nice. Good job. The idea of sorting either glob or opendir/readdir seems to work BTW. I haven't really had the situation in which the order of the files matters much. I do have some cool code that determines whether a file has been "touched" but that's for another time. So, did your wife give you hell about that? Hehe.

Hi
Velocity is not related to the file number. It is something that starts at some arbitrary value, increases with a constant step, and ends at some value. Of course, (end value - start value)/step value is the number of my xyz_channel_*.dat files. I just want the velocity printed from start to end, in parallel with the 'diff' values, starting with the xyz_channel_1.dat file and ending with xyz_channel_200.dat.

So in total there are two problems. The first problem is the velocity: it prints a constant value "6", but I want it to increase step by step, for example:

my $velocity;
for ($velocity=6.659; $velocity=14.659; $velocity=$velocity+0.04){
print "$velocity \n";
}

And the second problem is sorting the files for the calculation: it is sorting, but not in numerical order. See the output I get:

6                 3 from /data/shuklax/xyz_channel/xyz_channel_1.dat
6                -3 from /data/shuklax/xyz_channel/xyz_channel_10.dat
6               -13 from /data/shuklax/xyz_channel/xyz_channel_100.dat
6                -9 from /data/shuklax/xyz_channel/xyz_channel_101.dat
6                 6 from /data/shuklax/xyz_channel/xyz_channel_102.dat
6                -5 from /data/shuklax/xyz_channel/xyz_channel_103.dat
6               -17 from /data/shuklax/xyz_channel/xyz_channel_104.dat
6               -11 from /data/shuklax/xyz_channel/xyz_channel_105.dat
6               -16 from /data/shuklax/xyz_channel/xyz_channel_106.dat
6                -9 from /data/shuklax/xyz_channel/xyz_channel_107.dat
6                -1 from /data/shuklax/xyz_channel/xyz_channel_108.dat
6                -7 from /data/shuklax/xyz_channel/xyz_channel_109.dat
6               -15 from /data/shuklax/xyz_channel/xyz_channel_11.dat
6                -2 from /data/shuklax/xyz_channel/xyz_channel_110.dat
6               -12 from /data/shuklax/xyz_channel/xyz_channel_111.dat
6                -9 from /data/shuklax/xyz_channel/xyz_channel_112.dat
6                 1 from /data/shuklax/xyz_channel/xyz_channel_113.dat
6                -8 from /data/shuklax/xyz_channel/xyz_channel_114.dat
6                 1 from /data/shuklax/xyz_channel/xyz_channel_115.dat
6                -5 from /data/shuklax/xyz_channel/xyz_channel_116.dat
6               -16 from /data/shuklax/xyz_channel/xyz_channel_117.dat
6                -2 from /data/shuklax/xyz_channel/xyz_channel_118.dat
6                -3 from /data/shuklax/xyz_channel/xyz_channel_119.dat
6                -1 from /data/shuklax/xyz_channel/xyz_channel_12.dat
6               -13 from /data/shuklax/xyz_channel/xyz_channel_120.dat
6                10 from /data/shuklax/xyz_channel/xyz_channel_121.dat
6               -15 from /data/shuklax/xyz_channel/xyz_channel_122.dat
6                 0 from /data/shuklax/xyz_channel/xyz_channel_123.dat
6               -11 from /data/shuklax/xyz_channel/xyz_channel_124.dat

I want the output from the files in following order......

/data/shuklax/xyz_channel/xyz_channel_1.dat
/data/shuklax/xyz_channel/xyz_channel_2.dat
/data/shuklax/xyz_channel/xyz_channel_3.dat
/data/shuklax/xyz_channel/xyz_channel_4.dat
/data/shuklax/xyz_channel/xyz_channel_5.dat
.
.
.
.
.
.
/data/shuklax/xyz_channel/xyz_channel_200.dat

please check the code,

#!/usr/bin/perl
use strict;
use warnings;
my ($start_range1, $end_range1, $start_range2, $end_range2) = (30, 100, 40, 150);
my $cur_dir = "/data/shuklax/xyz_channel/"; #Path to data files on my computer
my @files_to_integrate = glob("$cur_dir/xyz_channel_*.dat");
my @files_to_integrate_sorted = sort sortfiles @files_to_integrate;
open my $fout, '>', "$cur_dir/xyz_channel_out.dat"; #Output file for $omega and $diff
foreach my $f (@files_to_integrate_sorted){
    my ($sum1, $sum2, $velocity) = integrate_file($f);
    my $diff = $sum2 - $sum1;
    print "$f $sum1 $sum2 $diff\n";
    printf $fout "%-14d%5d from %s\n", ($velocity, $diff, $f);
}
close ($fout);

sub sortfiles {            #Sort filenames numerically by the digit(s) in the filename
    my ($nbr_a) = ($a =~ /\D+(\d+)\D/);
    my ($nbr_b) = ($b =~ /\D+(\d+)\D/);
    $nbr_a <=> $nbr_b;
}
sub integrate_file {
    my $file = $_[0];
    my ($s1, $s2) = (0, 0);
    my $velocity;
    for ($velocity=6.65959; $velocity<=14.65959; $velocity=$velocity+0.04){
    open my $fin, '<', $file or die "Failed to open $file: $!";
    $_ = <$fin>; #Skip first line
    while (<$fin>){
        my ($x, $y) = split(/ /, $_);
        $s1 += $y if ($x >= $start_range1) and ($x <= $end_range1);
        $s2 += $y if ($x >= $start_range2) and ($x <= $end_range2);
    }
    return ($s1, $s2, $velocity)
}
}

#!/usr/bin/perl
use strict;
use warnings;
my ($start_range1, $end_range1, $start_range2, $end_range2) = (30, 100, 40, 150);
my $cur_dir = "/data/shuklax/xyz_channel"; #Path to data files on my computer
open my $fout, '>', "$cur_dir/xyz_channel_out.dat"; #Output file for $omega and $diff
my $x;
my $f;
for ($x=1;$x<201;$x++){
    	my ($sum1, $sum2, $velocity) = integrate_file("$cur_dir/xyz_channel_$x\.dat");
    	my $diff = $sum2 - $sum1;
    	print "$f $sum1 $sum2 $diff\n";
    	printf $fout "%-14d%5d from %s\n", ($velocity, $diff, $f);   
}
close ($fout);
 
sub integrate_file {
    my $file = $_[0];
    my ($s1, $s2) = (0, 0);
    my $velocity;
    for ($velocity=6.65959; $velocity<=14.65959; $velocity=$velocity+0.04){
	    open my $fin, '<', $file or die "Failed to open $file: $!";
	    $_ = <$fin>; #Skip first line
	    while (<$fin>){
	        my ($x, $y) = split(/ /, $_);
	        $s1 += $y if ($x >= $start_range1) and ($x <= $end_range1);
	        $s2 += $y if ($x >= $start_range2) and ($x <= $end_range2);
	    }
	    return ($s1, $s2, $velocity)
     }
}

Hi, it is not working.
It still prints a constant '6' in the first column of the output.
$f is not initialized in printf.
The variable $x is used twice; is this OK?
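
On the question of $x being used twice: under Perl's scoping rules, a variable declared with my inside a sub or block is separate from one of the same name outside it, so the two do not interfere. A small sketch (the sub name inner is made up for illustration):

```perl
use strict;
use warnings;

my $x = 1;                 # file-level $x, like the loop counter

sub inner {
    my $x = 99;            # a new, unrelated $x visible only in this sub
    return $x;
}

my $from_sub = inner();
print "outer x is still $x, the sub returned $from_sub\n";
# prints: outer x is still 1, the sub returned 99
```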

Try this. Some of this is d5e5's code, so I don't know if that will work. I didn't modify his code. What you need is a FILE handle, not a variable, to print to the file. I didn't test it because I don't have a copy of the 200 files.

#!/usr/bin/perl
use strict;
use warnings;
my ($start_range1, $end_range1, $start_range2, $end_range2) = (30, 100, 40, 150);
my $cur_dir = "/data/shuklax/xyz_channel"; #Path to data files on my computer
open OUT, '>', "$cur_dir/xyz_channel_out.dat" or die "Failed to open output file: $!"; #Output file for $omega and $diff
my $x;
my $f;
for ($x=1;$x<201;$x++){
    	$f = "$cur_dir/xyz_channel_$x.dat"; #so that $f is set before it is printed
    	my ($sum1, $sum2, $velocity) = integrate_file($f);
    	my $diff = $sum2 - $sum1;
    	print "$f $sum1 $sum2 $diff\n";
    	printf OUT "%-14d%5d from %s\n", ($velocity, $diff, $f);   
}
close (OUT);
 
sub integrate_file {
    my $file = $_[0];
    my ($s1, $s2) = (0, 0);
    my $velocity;
    for ($velocity=6.65959; $velocity<=14.65959; $velocity=$velocity+0.04){
	    open my $fin, '<', $file or die "Failed to open $file: $!";
	    $_ = <$fin>; #Skip first line
	    while (<$fin>){
	        my ($x, $y) = split(/ /, $_);
	        $s1 += $y if ($x >= $start_range1) and ($x <= $end_range1);
	        $s2 += $y if ($x >= $start_range2) and ($x <= $end_range2);
	    }
	    return ($s1, $s2, $velocity)
     }
}

Looking at this more carefully, it's all messed up. Your for loop in the subroutine should be in the main code. You return values before you go to the next velocity. It's going to have to be rewritten completely. The reason you get a 6 in the first column is that the printf format (%d) has no decimals. You probably want a printf more like this. OUT is a file handle to the output file.

printf OUT "%.5f %5d from %s\n", ($velocity, $diff, $f);
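
The effect of the format can be checked in isolation with sprintf. This is only a sketch; the variable names are chosen for the example.

```perl
use strict;
use warnings;

my $velocity = 6.65959;

# %d keeps only the integer part; %.5f keeps five decimal places.
my $as_int   = sprintf("%d",   $velocity);
my $as_float = sprintf("%.5f", $velocity);
print "integer format: $as_int, float format: $as_float\n";
# prints: integer format: 6, float format: 6.65959
```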

In order to fix this, you will need to rearrange the code to call the velocity in the top part of the file, not the subroutine. As I said, the subroutine is returning before the for loop goes forward. So, you are only calculating the first value of velocity.
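
A minimal sketch of both points, assuming the velocity for file number n is start + (n - 1) * step (that formula is an assumption based on the description in the thread, and the sub name first_only is hypothetical):

```perl
use strict;
use warnings;

my ($start, $step) = (6.65959, 0.04);   # values taken from the thread

# A return inside the loop body exits the sub on the first pass,
# so only the starting value ever comes back.
sub first_only {
    for (my $v = $start; $v <= 14.65959; $v += $step) {
        return $v;
    }
}

# Computing the velocity in the main loop gives one value per file.
my @velocities;
for my $file_no (1 .. 5) {              # 5 files here; 200 in the real data
    push @velocities, $start + ($file_no - 1) * $step;
}

printf "sub returns %.5f\n", first_only();
printf "file %d -> %.5f\n", $_ + 1, $velocities[$_] for 0 .. $#velocities;
```

Deriving each velocity from the file number, rather than adding the step repeatedly, also avoids accumulating floating-point rounding error over 200 files.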

Do you want something more like this?

#!/usr/bin/perl
use strict;
use warnings;
my ($start_range1, $end_range1, $start_range2, $end_range2) = (3, 5, 4, 5); #I only have 2 files with 5 numbers each
my $cur_dir = "."; #Path to data files on your computer, mine are in current dir
open OUT, '>', "$cur_dir/xyz_channel_out.dat"; #Output file for $omega and $diff
my $x;
my $f;
for ($x=1;$x<3;$x++){ #this should be 201 for 200 files
    $f="$cur_dir/$x\.dat"; #my files are named 1.dat and 2.dat, you will want a prefix
    my $velocity=0;
    for ($velocity=6.65959; $velocity<=14.65959; $velocity=$velocity+0.04){
    		my ($sum1, $sum2) = integrate_file($f);
    		my $diff = $sum2 - $sum1;
    		print "$f $sum1 $sum2 $diff\n";
    		printf OUT "%.5f %5d from %s\n", ($velocity, $diff, $f);   
    }
}
close (OUT);
 
sub integrate_file {
    my $file = $_[0];
    my ($s1, $s2) = (0, 0);
	    open FILE, '<', $file or die "Failed to open $file: $!";
	    $_ = <FILE>; #Skip first line
	    while (<FILE>){
	    	chomp;
	        my ($x, $y) = split(/ /);
	        $s1 += $y if ($x >= $start_range1) and ($x <= $end_range1);
	        $s2 += $y if ($x >= $start_range2) and ($x <= $end_range2);
	    }
	    return ($s1, $s2) 
}

I don't understand why the array of filenames wouldn't sort on your computer. They sorted OK on mine. Anyway, let's use mitchems' for loop counter to retrieve the files in the correct order and increment the velocity for each input file (I'm still not sure I understand how velocity is related to the data.)

#!/usr/bin/perl
use strict;
use warnings;

my ($start_range1, $end_range1, $start_range2, $end_range2) = (30, 100, 40, 150);
#my $cur_dir = "/data/shuklax/xyz_channel"; #Path to data files on my computer
my $cur_dir = "/home/david/Programming/Perl"; #Path to data files on my computer

#Output file for $omega and $diff
open my $fout, '>', "$cur_dir/xyz_channel_out.dat"
        or die "Failed to open $cur_dir/xyz_channel_out.dat: $!";
my $x;
my $velocity = 6.65959; # Add 14.65959 for each file?
#for ($x=1;$x<201;$x++){
for ($x=1;$x<3;$x++){
    my $f = "$cur_dir/xyz_channel_$x\.dat";
    my ($sum1, $sum2) = integrate_file($f);
    my $diff = $sum2 - $sum1;
    print "$f $sum1 $sum2 $diff\n";
    printf $fout "%-14.5f %5d from %s\n", ($velocity, $diff, $f);
    $velocity += 14.65959; #Increment velocity by step
}
close ($fout);
 
sub integrate_file {
    my $file = $_[0];
    my ($s1, $s2) = (0, 0);
    
    open my $fin, '<', $file or die "Failed to open $file: $!";
    $_ = <$fin>; #Skip first line
    while (<$fin>){
        my ($x, $y) = split(/ /, $_);
        $s1 += $y if ($x >= $start_range1) and ($x <= $end_range1);
        $s2 += $y if ($x >= $start_range2) and ($x <= $end_range2);
    }
    return ($s1, $s2)
}
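
For comparison, if you would rather keep the glob and sort the names yourself, here is a minimal sketch of a numeric sort on the channel number. The helper names channel_number and sort_by_channel are made up for this example, the file names in the demo are samples rather than real data, and it assumes every name ends in _<number>.dat.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: pull the trailing number out of a name such as
# ".../xyz_channel_12.dat"; returns -1 if the name does not match.
sub channel_number {
    my ($name) = @_;
    return $name =~ /_(\d+)\.dat$/ ? $1 : -1;
}

# Sort names by channel number instead of by string comparison.
sub sort_by_channel {
    my @names = @_;
    return sort { channel_number($a) <=> channel_number($b) } @names;
}

# Sample names only; in the real script these would come from glob().
my @files = map { "xyz_channel_$_.dat" } (110, 12, 2, 1, 200);
print "$_\n" for sort_by_channel(@files);
```

A plain sort compares the names as strings, which is how xyz_channel_12.dat ends up between 119 and 120; comparing the extracted numbers with <=> avoids that.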

Hi,
Please help me out with this!

In the final code I now need one integration range that is not fixed, while the other range stays fixed. Range2, the one that is not fixed, varies according to an equation: $start_range2 = $m * $n + 100, where $n is the running number of my file and $m is some fixed value (e.g. -0.1), and $end_range2 = $start_range2 + 50; 50 because I want a fixed width for range2.

I have now added the variables $start_range2 and $end_range2 inside the for loop over $n, so that the integration range changes as the file number $n increases. I left the definition of range1 at the beginning, since it does not change with $n.
But I am not able to compile the code below; perhaps I am not using the correct syntax to define $start_range2, $end_range2 and $m.

#!/usr/bin/perl
use strict;
use warnings;

my ($start_range1, $end_range1) = (30, 100);
my $m = (-0.1);
my $cur_dir = "/data/shuklax/xyz_channel"; #Path to data files on my computer

#Output file for $omega and $diff
open my $fout, '>', "$cur_dir/xyz_channel_out.dat"
        or die "Failed to open $cur_dir/xyz_channel_out.dat: $!";
my $n;
my $velocity = 6.65959; 
for ($n=1;$n<201;$n++){
	my $start_range2 = $m * $n + 110;
	my $end_range2 = $start_range2 + 70;
	my $f = "$cur_dir/xyz_channel_$x\.dat";
    	my ($sum1, $sum2) = integrate_file($f);
    	my $diff = $sum2 - $sum1;
    	print "$f $sum1 $sum2 $diff\n";
    	printf $fout "%-14.5f %5d from %s\n", ($velocity, $diff, $f);
	$velocity += 0.01; #Increment velocity by step
}
close ($fout);
 
sub integrate_file {
    my $file = $_[0];
    my ($s1, $s2) = (0, 0);
    
    open my $fin, '<', $file or die "Failed to open $file: $!";
    $_ = <$fin>; #Skip first line
    while (<$fin>){
        my ($x, $y) = split(/ /, $_);
        $s1 += $y if ($x >= $start_range1) and ($x <= $end_range1);
        $s2 += $y if ($x >= $start_range2) and ($x <= $end_range2);
    }
    return ($s1, $s2)
}

You can't compile it because of scoping issues with your new variables. You need a # on the first line and need to re-scope your variables. Since you declare the range2 variables ($start_range2 and $end_range2) with my inside the for loop, they are not visible in the subroutine. I fixed that. This one compiles, but I don't have the 200 files to test the integration:

#!/usr/bin/perl
use strict;
use warnings;
my $x=0;
my $start_range1=30;
my $end_range1 = 100;
my $start_range2;
my $end_range2;
my $m = (-0.1);
my $cur_dir = "/data/shuklax/xyz_channel"; #Path to data files on my computer
 
#Output file for $omega and $diff
open my $fout, '>', "$cur_dir/xyz_channel_out.dat"
        or die "Failed to open $cur_dir/xyz_channel_out.dat: $!";
my $n;
my $velocity = 6.65959; 
for ($n=1;$n<201;$n++){
    $start_range2 = $m * $n + 110;
    $end_range2 = $start_range2 + 70;
    my $f = "$cur_dir/xyz_channel_$x\.dat";
    my ($sum1, $sum2) = integrate_file($f);
    my $diff = $sum2 - $sum1;
    print "$f $sum1 $sum2 $diff\n";
    printf $fout "%-14.5f %5d from %s\n", ($velocity, $diff, $f);
    $velocity += 0.01; #Increment velocity by step
}
close ($fout);
 
sub integrate_file {
    my $file = $_[0];
    my ($s1, $s2) = (0, 0);
 
    open my $fin, '<', $file or die "Failed to open $file: $!";
    $_ = <$fin>; #Skip first line
    while (<$fin>){
        my ($x, $y) = split(/ /, $_);
        $s1 += $y if ($x >= $start_range1) and ($x <= $end_range1);
        $s2 += $y if ($x >= $start_range2) and ($x <= $end_range2);
    }
    return ($s1, $s2)
}

One more thing... looking at mitchems' version, now that the loop counter is renamed $n instead of $x, I think the following line:

my $f = "$cur_dir/xyz_channel_$x\.dat";

should be replaced by:

my $f = "$cur_dir/xyz_channel_$n\.dat"; #$n is the file counter
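
Another way around the scoping problem, instead of declaring the range variables at file level, would be to pass the range to the subroutine as arguments. This is only a sketch: integrate_range is a made-up name, it sums a single range per call, and the demo reads the small 1.dat sample from earlier in the thread out of an in-memory string rather than from the real files.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Sum the Y values whose X lies in [$start, $end] (inclusive).
# Takes an open filehandle so the same routine works on files or test data.
sub integrate_range {
    my ($fh, $start, $end) = @_;
    my $sum = 0;
    my $header = <$fh>;                 # skip the "X Y" header line
    while (my $line = <$fh>) {
        my ($x, $y) = split ' ', $line;
        next unless defined $y;         # ignore blank or malformed lines
        $sum += $y if $x >= $start && $x <= $end;
    }
    return $sum;
}

# Demo on the 1.dat sample, read from an in-memory filehandle.
my $data = "X Y\n1 5\n2 9\n3 9\n4 7\n5 1\n";
open my $fh, '<', \$data or die "Cannot open in-memory data: $!";
print integrate_range($fh, 2, 4), "\n";   # 9 + 9 + 7 = 25
close $fh;
```

In the loop over $n the caller would compute $start_range2 and $end_range2 and pass them in, so nothing has to be visible outside the loop. The file would need to be reopened (or rewound with seek) before summing the second range.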
