d5e5 109 Master Poster

Sorry for another regex post, but I have been trying to get a regex to work that checks for unwanted characters in a string, like @#?! I have tried

var cityreg=/^[^$%@!]+$/;

but it doesn't seem to work?

You need to assign the regex pattern object to a variable and then use the test method of the regex object to check the input you pass to it.

function check()
{
    var text=prompt("Please enter some text","Harry Potter");
    var clean=/^[^$%@!]+$/; //Define your regex pattern
    if (text!=null && text!=""){
      if (clean.test(text)){ //Use your regex pattern to test input and return true or false
        alert(text + " is OK.")
      }else{
        alert(text + " contains an invalid character.")
      }
    }
}
d5e5 109 Master Poster
>>> line_of_numbers = "5 7 6 35 346 245"
>>> list_of_numbers = line_of_numbers.split()
>>> print list_of_numbers
['5', '7', '6', '35', '346', '245']
>>>
d5e5 109 Master Poster

I haven't tested this but something like the following should work. It opens a new file for output using a prefix plus the input file name for the name of each output file.

#!/usr/bin/perl
use strict;
use warnings;

#If you want to open a new output file for every input file
#Do it in your loop, not here.
#my $outfile = "KAC.pdb";
#open( my $fh, '>>', $outfile );

opendir( DIR, "/data/tmp" ) or die "$!";
my @files = readdir(DIR);
closedir DIR;

foreach my $file (@files) {
    next unless -f "/data/tmp/$file"; #Skip '.', '..' and subdirectories
    open( FH, '<', "/data/tmp/$file" ) or die "$!";
    my $outfile = "output_$file"; #Add a prefix (anything, doesn't have to say 'output')
    open(my $fh, '>', $outfile) or die "$!";
    while (<FH>) {
        my ($line) = $_;
        chomp($line);
        if ( $line =~ m/KAC 50/ ) {
            print $fh $_;
        }
    }
    close(FH);
    close($fh);
}
d5e5 109 Master Poster

For really simple HTML you can use regular expressions. For more complex data it would be better to search CPAN for a good HTML parser module and learn how to use it (which I haven't got around to doing yet). Meanwhile the following script should do what you want.

#!/usr/bin/perl
#ParseList.pl
use strict;
use warnings;
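open my $fh, '<', '/home/david/Programming/Perl/data/list.txt' or die "Failed to open list.txt: $!";
my @list;
while (<$fh>){
    #Save the capture only when the pattern matched; otherwise $1 is stale or undefined
    push @list, $1 if m/href="(\w+\.\w+)"/;
}
close $fh;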
print "Here is the list:\n";
print join(", ", @list);
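
For comparison, here is a minimal sketch of the parser-based approach, written in Python rather than Perl and using the standard library's html.parser module. The class name HrefCollector and the sample markup are made up for illustration; a CPAN module would look different.

```python
from html.parser import HTMLParser


class HrefCollector(HTMLParser):
    """Collect the href attribute of every <a> tag (hypothetical helper)."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs; value may be None
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value is not None:
                    self.hrefs.append(value)


# Sample markup invented for this sketch
sample = '<ul><li><a href="one.txt">One</a></li><li><a href="two.txt">Two</a></li></ul>'
collector = HrefCollector()
collector.feed(sample)
collector.close()
print(", ".join(collector.hrefs))
```

Unlike the regex, this does not care whether the link sits on its own line or how the attribute value is spelled.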
d5e5 109 Master Poster

OK, I duplicated the data and added group names PREVIT:nmrValidate_300w.pdb etc. (See attached input data file test.txt)

#!/usr/bin/perl
use strict;
use warnings;

my %group;
my $groupname;

#Change the following line to assign path to your data file
my $path = '/home/david/Programming/Perl/data';
my $filename = "$path/test.txt";
open(my $fh, '<', $filename) or die "Failed to open $filename: $!";

while (<$fh>){
    chomp;
    if (m/^PREVIT/){
        $groupname = $_;
        $group{$groupname}->{count_all} = 0;
        $group{$groupname}->{count_50_ALY} = 0;
        next;
    }else{
        $group{$groupname}->{count_all}++;
        $group{$groupname}->{count_50_ALY}++ if m/-50 -ALY/;
    }
}
close $fh;

#The following code adapted from code found at
# http://devdaily.com/perl/edu/qanda/plqa00016/
print "\nTHE FIVE GROUPS HAVING MOST -50 -ALY\n";
print "GROUP COUNTS IN DESCENDING NUMERIC ORDER:\n";
print "Number of contact\t\tnumber of ALY group\n";
my $line_count = 1;
foreach my $key (sort {$group{$b}->{count_50_ALY} <=> $group{$a}->{count_50_ALY};} (keys(%group))) {
    last  if $line_count > 5; #Print only the top five groups.
    print "($line_count)\t$group{$key}->{count_all}\t\t\t$group{$key}->{count_50_ALY} \t\t $key\n";
    $line_count++;
}
#Output is:
#THE FIVE GROUPS HAVING MOST -50 -ALY
#GROUP COUNTS IN DESCENDING NUMERIC ORDER:
#Number of contact		number of ALY group
#(1)	6			4 		 PREVIT:nmrValidate_301w.pdb
#(2)	6			4 		 PREVIT:nmrValidate_192w.pdb
#(3)	4			3 		 PREVIT:nmrValidate_251w.pdb
#(4)	4			3 		 PREVIT:nmrValidate_302w.pdb
#(5)	14			0 		 PREVIT:nmrValidate_166w.pdb
d5e5 109 Master Poster

An hour and 1/2 huh? Well, admittedly, my code was sloppy. I want to do some profiling to see where the time is spent. I have never used the CSV module before and I put in some unnecessary code. Glad you got it solved though!

Mike's script works for me too. Now Bastien has two solutions, so that's great. I like the way Text::CSV parses fields with multiple embedded commas, and I use it for that, but it has more options and features that I haven't figured out how to use properly. There is also a separate Text::CSV::Encoded module, which I have not installed or tried. Hopefully you won't need it. (In fact, aren't most or all of the French accented characters representable in a single-byte encoding such as Latin-1? Plain ASCII itself contains no accented letters.)

My preference would be to convert all the hex in each record first, perhaps create a temporary 'converted' file that could be input to a second script that would use Text::CSV to parse the records into fields, if necessary. But breaking processing up into multiple steps is just my preference, probably ingrained from years of writing batch programs on old mainframes.
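
A rough sketch of that two-step idea, in Python rather than Perl: step 1 decodes each X'...' run, step 2 hands the converted record to a CSV parser. The function name decode_hex_fields and the shortened sample record are assumptions for illustration. One detail worth noting is that the decoded text can itself contain a comma, so this sketch wraps it in double quotes before parsing.

```python
import csv
import re

HEX_FIELD = re.compile(r"X'([0-9a-fA-F]+)'")


def decode_hex_fields(record):
    """Step 1: replace each X'...' run with the characters its hex digits encode."""
    def decode(match):
        # bytes.fromhex raises ValueError on an odd number of digits
        text = bytes.fromhex(match.group(1)).decode("latin-1")
        text = text.rstrip("\r\n")  # same effect as dropping a trailing 0d0a
        # Quote the result so an embedded comma cannot split the field later
        return '"' + text.replace('"', '""') + '"'
    return HEX_FIELD.sub(decode, record)


# Shortened sample record; the hex run is the one from the thread
record = "BERRY,FR,X'332c206176656e7565204e6577746f6e0d0a',FRANCE"
converted = decode_hex_fields(record)

# Step 2: parse the converted record into fields
fields = next(csv.reader([converted]))
print(fields)
```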

d5e5 109 Master Poster

Dear David Ellis
Thank you so much for your solution ^^.
By the way, if I want to count the 5 biggest groups of "-50 -ALY", I change this command to:
$biggest[5] < $group{$key}
It does not work. How can I fix it?

Best regards
Quy

Sorry, Quy, I'm not sure I understand your latest question. There were only three groups in the data you posted so there is no "5 biggest group". Could you show me what output you want the program to print?
Best regards,
David

d5e5 109 Master Poster

Something has changed in the convert subroutine in the last couple of scripts you posted.

##my $mem = $_; #This does not contain the argument passed to the sub
my $mem = $_[0]; #The first (and only) element in the array of args, @_

When I run the following script, which reads the attached file, I get the attached output file. (If, by chance, this version of the script works for you, then there's no need to send us your data files. Otherwise, please attach your input file and we'll have a look at it.)

#!/usr/bin/perl
#TextCSVBastienP.pl
use strict;
use warnings;
use diagnostics;
#use Encode qw( _utf8_on );
#binmode STDOUT, ":utf8";
my $dir = '/home/david/Programming/Perl/data';
my $in = $dir . '/' . '1.csv.txt'; #Added .txt in order to attach files to post
my $out = $dir . '/' . 'out.csv.txt';#Daniweb requires .txt extension to attach 
#open(INFILE,"<:raw:encoding(utf-16):crlf:utf8", "1.csv");
open(INFILE, '<', $in) or die "Failed to open $in: $!";
open(OUTFILE, '>', $out) or die "Failed to open $out: $!";
while(<INFILE>)
{
	my $rec = $_;
	chomp($rec);
	$rec =~ s/X'([a-fA-F0-9]+)'/convert($1)/eg;
        print OUTFILE "$rec\n";
}
close(INFILE);
close(OUTFILE);

sub convert{
    #This subroutine contains Mike's regex to substitute hex digits with
    #the corresponding characters
    my $mem = $_[0]; #The first (and only) element in the array of args, @_
    $mem =~ s/0d0a//; #Remove 0d0a from string of hex digits
    $mem=~s/([a-fA-F0-9]{2})/chr(hex $1)/eg;
    chomp $mem;
    return $mem;
}
BastienP commented: As usual, a very helpful post :-) +1
d5e5 109 Master Poster

I forgot that you wanted to count only the lines containing "-50 -ALY". That is easily done by adding a condition to the statement that increments the value of the hash item. The following prints only one group that has the most "-50 -ALY" records (if more than one group has the 'most' it prints only one of them.)

#!/usr/bin/perl
use strict;
use warnings;

my %group;
my $groupname;

#Change the following line to assign path to your data file
my $path = '/home/david/Programming/Perl/data';
my $filename = "$path/test.txt";
open(my $fh, '<', $filename) or die "Failed to open $filename: $!";

while (<$fh>){
    chomp;
    if (m/^PREVIT/){
        $groupname = $_;
        $group{$groupname} = 0;
        next;
    }else{
        $group{$groupname}++ if m/-50 -ALY/;
    }
}
close $fh;

my @biggest = ('None', 0);
while(my ($key, $value) = each(%group)) {
    if ($biggest[1] < $group{$key}){
        @biggest = ($key, $value);
    }
}
print "The $biggest[0] group has the most '-50 -ALY' lines ($biggest[1]).\n";
#### Output is
# The PREVIT:nmrValidate_192w.pdb group has the most '-50 -ALY' lines (4).
d5e5 109 Master Poster
#!/usr/bin/perl
use strict;
use warnings;

my %group;
my $groupname;

#Change the following line to assign path to your data file
my $path = '/home/david/Programming/Perl/data';
my $filename = "$path/test.txt";
open(my $fh, '<', $filename) or die "Failed to open $filename: $!";

while (<$fh>){
    chomp;
    if (m/^PREVIT/){
        $groupname = $_;
        $group{$groupname} = 0;
        next;
    }else{
        $group{$groupname}++;
    }
}
close $fh;

#The following code adapted from code found at
# http://devdaily.com/perl/edu/qanda/plqa00016/
print "\nGROUP COUNTS IN DESCENDING NUMERIC ORDER:\n";
foreach my $key (sort hashValueDescendingNum (keys(%group))) {
   print "\t$group{$key} \t\t $key\n";
}

#----------------------------------------------------------------------#
#  FUNCTION:  hashValueDescendingNum                                   #
#                                                                      #
#  PURPOSE:   Help sort a hash by the hash 'value', not the 'key'.     #
#             Values are returned in descending numeric order          #
#             (highest to lowest).                                     #
#----------------------------------------------------------------------#

sub hashValueDescendingNum {
   $group{$b} <=> $group{$a};
}

This gives the following output:

GROUP COUNTS IN DESCENDING NUMERIC ORDER:
	14 		 PREVIT:nmrValidate_166w.pdb
	6 		 PREVIT:nmrValidate_192w.pdb
	4 		 PREVIT:nmrValidate_251w.pdb
d5e5 109 Master Poster

Tried parentheses but still same deal - only shows records with an entry in 'install' table, not all records. I think a JOIN might be what I'm after, but I'm not sure yet how to do it across 3 tables.

It looks like you need a left join, but it's not clear to me for which table you want to list all rows, and which tables should be joined to lookup corresponding rows if they exist plus return rows of null fields where they do not (for example products that have no rows in the 'install' table.)

Example of left join for three tables
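
As a minimal sketch of such a join, the following uses Python's built-in sqlite3 module so it can run without a MySQL server. The tables product, install and site and their columns are hypothetical, since the real schema was not posted. The LEFT JOIN syntax shown is standard SQL and should carry over to MySQL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical schema: every install row points at one product and one site
conn.executescript("""
    CREATE TABLE product (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE install (id INTEGER PRIMARY KEY, product_id INTEGER, site_id INTEGER);
    CREATE TABLE site    (id INTEGER PRIMARY KEY, city TEXT);
    INSERT INTO product VALUES (1, 'Widget'), (2, 'Gadget');
    INSERT INTO site    VALUES (10, 'Lyon');
    INSERT INTO install VALUES (100, 1, 10);
""")

# product is the table whose rows must all appear; the other two are looked up
rows = conn.execute("""
    SELECT p.name, i.id, s.city
    FROM product AS p
    LEFT JOIN install AS i ON i.product_id = p.id
    LEFT JOIN site    AS s ON s.id = i.site_id
    ORDER BY p.id
""").fetchall()
conn.close()

for row in rows:
    print(row)
```

The product with no install row still appears, with NULL (None in Python) in the columns taken from the other two tables.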

d5e5 109 Master Poster

I'm new to MySQL, so pardon the dumb question, but what does the word 'juggernaut' mean when it appears immediately before your CREATE statement (as shown in your attached thumbnail)?

d5e5 109 Master Poster

Worked fine for me in the MySQL monitor in terminal.

mysql> CREATE TABLE employees (              
    ->              employeeNumber int(11) NOT NULL,    
    ->              lastName varchar(50) NOT NULL,      
    ->              firstName varchar(50) NOT NULL,     
    ->              extension varchar(10) NOT NULL,     
    ->              email varchar(100) NOT NULL,        
    ->              officeCode varchar(10) NOT NULL,    
    ->              reportsTo int(11) default NULL,     
    ->              jobTitle varchar(50) NOT NULL,      
    ->              PRIMARY KEY  (employeeNumber)       
    ->            );
Query OK, 0 rows affected (0.10 sec)
d5e5 109 Master Poster

You could create a subroutine to which you pass each directory name and put your code in there. Try running the following first. It has a subroutine called 'something' that prints a message each time it is called. You could put your code in there.

#!/usr/bin/perl
#DoSomethingInDirAndSubdirs.pl
use strict;
use warnings;
my $startdir = '/home/david/Programming';
do_something_in_dir_and_subdirs($startdir);

sub do_something_in_dir_and_subdirs {
    my $dir = $_[0];
    opendir DH, $dir or die "Failed to open $dir: $!";
    my @d;
    my $res = something("Doing something in $dir");
    while ($_ = readdir(DH)) {
        next if $_ eq "." or $_ eq "..";
        my $fn = $dir . '/' . $_;
        if (-d $fn) {
            push @d, $fn;
        }
    }
    closedir DH; #Release the handle before the recursion reuses the name DH
    if (scalar @d == 0) { #If no directories found, $dir is lowest dir in this branch
        #We're done.
        return;
    }
    foreach (@d) {
        do_something_in_dir_and_subdirs($_); #Look for directories in directory
    }
}

sub something {
    my $msg = $_[0]; #Assign passed argument to variable
    print '-' x 75, "\n";
    print "$msg\n";
}
d5e5 109 Master Poster
#!/usr/bin/perl
use strict;
use warnings;

#For testing, let's assign a string constructed with q(), which puts single
#quotes around text that may contain quotes. You can read from your file instead.
my $rec = q("CN=BERRY Richard,OU=TestFinance,OU=HR,OU=CORP,OU=CR,DC=mycorp,DC=com","BERRY Richard",BERRY,FR,"Puylouvier",FRANCE,,56000,"+33 1 23 45 67 89","+ 33 1 23 45 67 80",Richard,FRANCE,IT,X'332c206176656e7565204e6577746f6e0d0a',Bob.Malone,Richard.BERRY@mycorp.net," -",);
#Remove X and ticks, call convert subroutine with data from between the ticks
#Substitute with string returned from convert subroutine.
$rec =~ s/X'([a-fA-F0-9]+)'/convert($1)/eg;
print $rec;

sub convert{
    #This subroutine contains Mike's regex to substitute hex digits with
    #the corresponding characters
    my $mem = $_[0];
    $mem =~ s/0d0a//; #Remove 0d0a from string of hex digits
    $mem=~s/([a-fA-F0-9]{2})/chr(hex $1)/eg;
    chomp $mem;
    return $mem;
}
d5e5 109 Master Poster

I haven't figured out how to do this yet but the solution may involve something called fork, which is a way of starting up a copy of your program that knows that it is a copy (i.e. not the parent) and can do something independently and somehow communicate back to the parent that the task has been accomplished. It couldn't hurt to read a short Python fork example.
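
A minimal sketch of the idea, assuming a Unix-like system (os.fork is not available on Windows). Here the child reports back to the parent through nothing more than its exit status; the printed messages are only illustrative.

```python
import os
import sys

pid = os.fork()  # returns 0 in the child, the child's pid in the parent
if pid == 0:
    # Child process: do the independent task, then exit with a status code
    print("child %d working" % os.getpid())
    sys.stdout.flush()  # os._exit does not flush buffered output
    os._exit(0)
else:
    # Parent process: block until the child finishes and read its status
    _, status = os.waitpid(pid, 0)
    exit_code = os.WEXITSTATUS(status)
    print("parent saw child %d exit with code %d" % (pid, exit_code))
```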

d5e5 109 Master Poster

I'm not sure why you need an infinite loop but maybe you want to wrap the sendIP() in a try block. The finally block will quit the server whether or not your sendIP() succeeds or causes an exception.

#!/usr/bin/env python
# This is the loop i need to break. I know there will be another problem with the sleep() :
while 1:
    try:
        sendIP()
        time.sleep(25000)
    finally: #The following statements run whether or not an exception occurs
        print "Connection Closed.."
        server.quit()
        break
d5e5 109 Master Poster

From what I've read, it's considered a best practice. One plus is that more than one web page can include the same script. Also if the script is called from different web pages and it's already in memory it doesn't have to be reloaded. I've been reading about it in Sams Teach Yourself JavaScript in 24 Hours

d5e5 109 Master Poster

Yes, this site was down part of yesterday afternoon. Since this thread hasn't been marked solved yet, here's one more approach:

#!/usr/bin/env python
from HTMLParser import HTMLParser

class MyHTMLParser(HTMLParser):
    labels = []
    values = []
    save_next = False
    def handle_data(self, *args):
        s, = args
        if MyHTMLParser.save_next:
            MyHTMLParser.values.append(s)
            MyHTMLParser.save_next = False

        if str(args).find("Allowance") > 0:
            MyHTMLParser.labels.append(s)
            MyHTMLParser.save_next = True

#Assign web page contents to htmlstring (Snipped from post to save space)        
htmlstring = """<Please paste webpage content here>"""

h = MyHTMLParser()
h.feed(htmlstring)
h.close()

for i in range(1, 5):
    print "%30s:%15s" % (h.labels[i], h.values[i])
d5e5 109 Master Poster

If you stick with the regular expressions approach, the following should return the full time (hh:mm:ss).

rawstr = r"""Time Until Allowance Refill</td><td style="border-width:0px;">\s*(?P<Refill_Time>\d+:\d+:\d+)\D"""

In regular expressions \d+ means "one or more consecutive digits", so (?P<Refill_Time>\d+:\d+:\d+)\D means "one or more consecutive digits followed by a colon followed by ... etc." and the \D represents any character that is not a digit. Only the portion matched by the pattern between the parentheses belongs to the named group Refill_Time, which gives you the data you want (hopefully).
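
A short sketch of the pattern in use, run against one table row copied from the page source posted earlier in the thread (the variable names sample and refill_time are mine):

```python
import re

rawstr = r"""Time Until Allowance Refill</td><td style="border-width:0px;">\s*(?P<Refill_Time>\d+:\d+:\d+)\D"""

# One row of the allowance table, as it appears in the posted page source
sample = ('<td style="border-width:0px;">Time Until Allowance Refill</td>'
          '<td style="border-width:0px;"> 22:18:01</td>')

match = re.search(rawstr, sample)
# group("Refill_Time") returns only the text matched inside the parentheses
refill_time = match.group("Refill_Time") if match else None
print(refill_time)
```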

d5e5 109 Master Poster

Kodos The Python Regex Debugger suggests the following ways to find Plan Allowance value.

#!/usr/bin/env python
import re
rawstr = r"""Plan Allowance \(MB\)</td><td style="border-width:0px;">\s*(?P<Plan_Allowance>\d+)\D"""
embedded_rawstr = r"""Plan Allowance \(MB\)</td><td style="border-width:0px;">\s*(?P<Plan_Allowance>\d+)\D"""
matchstr = """<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN">
<html><head>
<meta http-equiv="refresh" Content="60;url=/stlui/user/allowance_request.html">
<title>Download Allowance Status</title></head><body><h3 style="font-size: 150%; color:blue; text-align:center; font-weight:bold">DOWNLOAD ALLOWANCE STATUS</h3><TABLE width='100%' cellpadding='10'><tr><td style="text-align:center;"><h3>Usage within allowance - no download restrictions</h3></td><td><span style="display:block;text-align:center">100%</span><table style="font-size:1; background-color:white; border:1px solid black;" width='15%' cellspacing='0' cellpadding='0' align='center'><tr style="background-color:green"><td>&nbsp;</td></tr><tr style="background-color:green"><td>&nbsp;</td></tr><tr style="background-color:green"><td>&nbsp;</td></tr><tr style="background-color:green"><td>&nbsp;</td></tr><tr style="background-color:green"><td>&nbsp;</td></tr><tr style="background-color:green"><td>&nbsp;</td></tr><tr style="background-color:green"><td>&nbsp;</td></tr><tr style="background-color:green"><td>&nbsp;</td></tr><tr style="background-color:green"><td>&nbsp;</td></tr><tr style="background-color:green"><td>&nbsp;</td></tr><tr style="background-color:green"><td>&nbsp;</td></tr><tr style="background-color:green"><td>&nbsp;</td></tr><tr style="background-color:green"><td>&nbsp;</td></tr><tr style="background-color:green"><td>&nbsp;</td></tr><tr style="background-color:green"><td>&nbsp;</td></tr><tr style="background-color:green"><td>&nbsp;</td></tr><tr style="background-color:green"><td>&nbsp;</td></tr><tr style="background-color:green"><td>&nbsp;</td></tr><tr style="background-color:green"><td>&nbsp;</td></tr><tr style="background-color:green"><td>&nbsp;</td></tr><tr style="background-color:green"><td>&nbsp;</td></tr><tr style="background-color:green"><td>&nbsp;</td></tr><tr style="background-color:green"><td>&nbsp;</td></tr><tr style="background-color:green"><td>&nbsp;</td></tr><tr style="background-color:green"><td>&nbsp;</td></tr><tr style="background-color:green"><td>&nbsp;</td></tr><tr style="background-color:green"><td>&nbsp;</td></tr><tr 
style="background-color:green"><td>&nbsp;</td></tr><tr style="background-color:green"><td>&nbsp;</td></tr><tr style="background-color:green"><td>&nbsp;</td></tr><tr style="background-color:green"><td>&nbsp;</td></tr><tr style="background-color:green"><td>&nbsp;</td></tr><tr style="background-color:green"><td>&nbsp;</td></tr><tr style="background-color:green"><td>&nbsp;</td></tr><tr style="background-color:green"><td>&nbsp;</td></tr><tr style="background-color:green"><td>&nbsp;</td></tr><tr style="background-color:green"><td>&nbsp;</td></tr><tr style="background-color:green"><td>&nbsp;</td></tr><tr style="background-color:green"><td>&nbsp;</td></tr><tr style="background-color:green"><td>&nbsp;</td></tr><tr style="background-color:green"><td>&nbsp;</td></tr><tr style="background-color:green"><td>&nbsp;</td></tr><tr style="background-color:green"><td>&nbsp;</td></tr><tr style="background-color:green"><td>&nbsp;</td></tr><tr style="background-color:green"><td>&nbsp;</td></tr><tr style="background-color:green"><td>&nbsp;</td></tr><tr style="background-color:green"><td>&nbsp;</td></tr><tr style="background-color:green"><td>&nbsp;</td></tr><tr style="background-color:green"><td>&nbsp;</td></tr><tr style="background-color:green"><td>&nbsp;</td></tr><tr style="background-color:green"><td>&nbsp;</td></tr><tr style="background-color:green"><td>&nbsp;</td></tr><tr style="background-color:green"><td>&nbsp;</td></tr><tr style="background-color:green"><td>&nbsp;</td></tr><tr style="background-color:green"><td>&nbsp;</td></tr><tr style="background-color:green"><td>&nbsp;</td></tr><tr style="background-color:green"><td>&nbsp;</td></tr><tr style="background-color:green"><td>&nbsp;</td></tr><tr style="background-color:green"><td>&nbsp;</td></tr><tr style="background-color:green"><td>&nbsp;</td></tr><tr style="background-color:green"><td>&nbsp;</td></tr><tr style="background-color:green"><td>&nbsp;</td></tr><tr style="background-color:green"><td>&nbsp;</td></tr><tr 
style="background-color:green"><td>&nbsp;</td></tr><tr style="background-color:green"><td>&nbsp;</td></tr><tr style="background-color:green"><td>&nbsp;</td></tr><tr style="background-color:green"><td>&nbsp;</td></tr><tr style="background-color:green"><td>&nbsp;</td></tr><tr style="background-color:green"><td>&nbsp;</td></tr><tr style="background-color:green"><td>&nbsp;</td></tr><tr style="background-color:green"><td>&nbsp;</td></tr><tr style="background-color:green"><td>&nbsp;</td></tr><tr style="background-color:green"><td>&nbsp;</td></tr><tr style="background-color:green"><td>&nbsp;</td></tr><tr style="background-color:green"><td>&nbsp;</td></tr><tr style="background-color:green"><td>&nbsp;</td></tr><tr style="background-color:green"><td>&nbsp;</td></tr><tr style="background-color:green"><td>&nbsp;</td></tr><tr style="background-color:green"><td>&nbsp;</td></tr><tr style="background-color:green"><td>&nbsp;</td></tr><tr style="background-color:green"><td>&nbsp;</td></tr><tr style="background-color:green"><td>&nbsp;</td></tr><tr style="background-color:green"><td>&nbsp;</td></tr><tr style="background-color:green"><td>&nbsp;</td></tr><tr style="background-color:green"><td>&nbsp;</td></tr><tr style="background-color:green"><td>&nbsp;</td></tr><tr style="background-color:green"><td>&nbsp;</td></tr><tr style="background-color:green"><td>&nbsp;</td></tr><tr style="background-color:green"><td>&nbsp;</td></tr><tr style="background-color:green"><td>&nbsp;</td></tr><tr style="background-color:green"><td>&nbsp;</td></tr><tr style="background-color:green"><td>&nbsp;</td></tr><tr style="background-color:green"><td>&nbsp;</td></tr><tr style="background-color:green"><td>&nbsp;</td></tr><tr style="background-color:green"><td>&nbsp;</td></tr><tr style="background-color:green"><td>&nbsp;</td></tr><tr style="background-color:green"><td>&nbsp;</td></tr><tr style="background-color:green"><td>&nbsp;</td></tr><tr style="background-color:green"><td>&nbsp;</td></tr><tr 
style="background-color:green"><td>&nbsp;</td></tr></table><span style="display:block;text-align:center">0%</span></td></tr><tr><td>&nbsp;</td><td><table style="font-size:18; border-width:0px;" width=90% cellspacing=0 cellpadding=0 ALIGN=CENTER><tr><td>&nbsp;</td><td style="border-width:0px;">Plan Allowance (MB)</td><td style="border-width:0px;"> 625</td></tr><tr><td>&nbsp;</td><td style="border-width:0px;">Allowance Remaining (MB)</td><td style="border-width:0px;"> 625</td></tr><tr><td><span style="background-color: green;">&nbsp;&nbsp;&nbsp;</span>&nbsp;</td><td style="border-width:0px;">Allowance Remaining (%)</td><td style="border-width:0px;"> 100</td></tr><tr><td>&nbsp;</td><td style="border-width:0px;">Time Until Allowance Refill</td><td style="border-width:0px;"> 22:18:01</td></tr></table></td></tr></TABLE>
</body>
</html>"""

# method 1: using a compile object
compile_obj = re.compile(rawstr)
match_obj = compile_obj.search(matchstr)

# method 2: using search function (w/ external flags)
match_obj = re.search(rawstr, matchstr)

# method 3: using search function (w/ embedded flags)
match_obj = re.search(embedded_rawstr, matchstr)

# Retrieve group(s) from …
d5e5 109 Master Poster

What was the error that you saw in the terminal? If an error other than IOError, UnicodeDecodeError or SyntaxError occurs (if that is possible?) then none of your except blocks would catch it. The else block, which reports success, runs only when the try block raises nothing, so such an error would propagate and end the script with a traceback.
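
A small test can settle which branch runs. In the sketch below, the helper run and the choice of KeyError are assumptions for illustration; the except clause lists the same three exception types as above. In Python, an exception that no except clause names skips the else block and propagates to the caller.

```python
def run(action):
    """Return which branch handled the call (hypothetical helper)."""
    try:
        action()
    except (IOError, UnicodeDecodeError, SyntaxError):
        return "handled"
    else:
        return "success"  # reached only when action() raised nothing


def raise_key_error():
    raise KeyError("not one of the listed exceptions")


print(run(lambda: None))  # no exception, so the else branch runs

try:
    outcome = run(raise_key_error)
except KeyError:
    outcome = "propagated"  # the else branch was skipped
print(outcome)
```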

d5e5 109 Master Poster

What are the names of your input files? Before you run your program you have 10 files. After you run it you will have 20 (10 input plus the 10 new output files). If you run your program more than once, the program will need to specify the names of the files it needs to read, since you won't want to read the output files as well.

I recommend you name your input files according to an easily recognisable pattern, such as, "in_001.txt", "in_002.txt" etc. Then your program can make an array of file names to read, and as it opens each file it can open a new output file named something like "out_001.txt", "out_002.txt" etc.
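
The naming scheme could be sketched like this (in Python; the helper name output_name and the list of file names are made up for illustration). Names that do not follow the input pattern, including earlier output files, are skipped.

```python
import re


def output_name(input_name):
    """Map 'in_NNN.txt' to 'out_NNN.txt'; return None for any other name."""
    match = re.fullmatch(r"in_(\d{3})\.txt", input_name)
    return "out_%s.txt" % match.group(1) if match else None


# Pretend directory listing: two inputs, one old output, one unrelated file
names = ["in_002.txt", "in_001.txt", "out_001.txt", "notes.txt"]
pairs = [(name, output_name(name)) for name in sorted(names) if output_name(name)]
print(pairs)
```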

d5e5 109 Master Poster

One more thing... looking at mitchems' version, now that the loop counter is renamed $n instead of $x, I think the following line: my $f = "$cur_dir/xyz_channel_$x\.dat"; should be replaced by: my $f = "$cur_dir/xyz_channel_$n\.dat"; #$n is the file counter

d5e5 109 Master Poster

I can't explain what is going on in Windows, but from what you say it seems the solution is to have your program get and use the absolute path joined with the filename. The following should work on both platforms:

#!/usr/bin/env python
import webbrowser
import os
abs_path = os.path.abspath('.') #Absolute path of current working directory
filename = os.path.join(abs_path, 'index.html')
print 'Try to open ' + filename
webbrowser.open(filename)
d5e5 109 Master Poster

I see mitchems has answered your question for installing in Windows. As for Ubuntu, I think I installed the DBD::mysql module the lazy way by using the Synaptic Package Manager and doing a quick search for libdbd-mysql-perl. You may already have the DBI module installed as it may have been included with your Perl already. You can read more about installing these modules here.

d5e5 109 Master Poster

Is it possible to open and work in two files from the same or different locations simultaneously?

How is it possible? Please suggest

Thanks / Regards
Mahesh

Yes it is possible. How? When you open a file you assign it to a file handle which you give any name you want. You can open another file and assign it to a file handle with a different name. You can read more about opening files here
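
The same principle, sketched in Python rather than Perl: each open file gets its own handle with its own name, and both stay open at once. The file names and the temporary directory are assumptions made so the sketch can run anywhere.

```python
import os
import tempfile

workdir = tempfile.mkdtemp()
in_path = os.path.join(workdir, "input.txt")    # hypothetical file names
out_path = os.path.join(workdir, "output.txt")

# Create a small input file so the example is self-contained
with open(in_path, "w") as setup:
    setup.write("first line\nsecond line\n")

# Two handles, two names: read from one file while writing to the other
with open(in_path) as reader, open(out_path, "w") as writer:
    for line in reader:
        writer.write(line.upper())

with open(out_path) as check:
    result = check.read()
print(result)
```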

d5e5 109 Master Poster

Please wrap your program in [code] Your program goes here [/code] tags. Otherwise we can't see how the lines in your program are indented and have to guess. Also when you say it didn't work, do you mean that you got no output, or that the output was not what you wanted? If you got some output please show us what it looked like.

d5e5 109 Master Poster

OK, I'll try that.

One more question: I want to open a local file with the browser. Is it possible to use a relative path? I wouldn't know where the user places the folder, so the path to the html file has to be relative.

I tried opening the file using the absolute path, then finding out my current working directory and opening the file with a path relative to it, then changing the cwd to the file location and opening the file by name only. It all works, on Linux anyway.

>>> import webbrowser
>>> webbrowser.open('/home/david/Programming/Python/index.html')
True
>>> import os
>>> os.getcwd()
'/home/david'
>>> webbrowser.open('Programming/Python/index.html')
True
>>> os.chdir('Programming/Python')
>>> webbrowser.open('index.html')
True
>>>
d5e5 109 Master Poster

Hello Bastien,
You're quite welcome. Thanks for the feedback. No problem about the delay. I was thinking that if my assumption that all fields in your input were in quotes was wrong -- i.e. if the numeric fields were not in quotes, then the script wouldn't work correctly without a minor change. I'm glad it worked.

Regards,
David

d5e5 109 Master Poster

Oh, wait, it's not completely solved:

On Linux, everything is fine.

On Windows, however, webbrowser.open only opens Explorer, even though Firefox is the default.

If I try startfile, it does open the associated program as d5e5 explained, but it fails to open the help file (it seems it only works with URLs)

I tried the following

>>> import webbrowser
>>> webbrowser.open('c:\Users\David\Programming\Javascript\CHAPTERS\ch 08\index.htm')
True
>>>

on Windows Vista Home Premium and it opened the file in my default browser, which is Chrome. In Linux it opened in Firefox, which is my default browser for Linux. I don't know why webbrowser is using Explorer on your computer when it is not the default. (Maybe your version of Windows or Explorer is different than mine?) Have you tried the following?

>>> webbrowser.get("windows-default").open('c:\Users\David\Programming\Javascript\CHAPTERS\ch 08\index.htm')
True
>>>
d5e5 109 Master Poster

Hi
Velocity is not related to the file number; it starts at some arbitrary value, increases by a constant step and ends at some value. Of course, (end value - start value)/step value is the number of my xyz_channel_*.dat files. I just want the velocity printed from start to end, alongside the 'diff' value, starting with the xyz_channel_1.dat file and ending with xyz_channel_200.dat.

So in total there are two problems. The first is the velocity: it prints a constant value "6", but I want it to increase in steps, for example:

my $velocity;
for ($velocity=6.659; $velocity=14.659; $velocity=$velocity+0.04){
print "$velocity \n";
}

The second problem is sorting the files for the calculation: it is sorting, but not in numerical order.
See the output I get:

6                 3 from /data/shuklax/xyz_channel/xyz_channel_1.dat
6                -3 from /data/shuklax/xyz_channel/xyz_channel_10.dat
6               -13 from /data/shuklax/xyz_channel/xyz_channel_100.dat
6                -9 from /data/shuklax/xyz_channel/xyz_channel_101.dat
6                 6 from /data/shuklax/xyz_channel/xyz_channel_102.dat
6                -5 from /data/shuklax/xyz_channel/xyz_channel_103.dat
6               -17 from /data/shuklax/xyz_channel/xyz_channel_104.dat
6               -11 from /data/shuklax/xyz_channel/xyz_channel_105.dat
6               -16 from /data/shuklax/xyz_channel/xyz_channel_106.dat
6                -9 from /data/shuklax/xyz_channel/xyz_channel_107.dat
6                -1 from /data/shuklax/xyz_channel/xyz_channel_108.dat
6                -7 from /data/shuklax/xyz_channel/xyz_channel_109.dat
6               -15 from /data/shuklax/xyz_channel/xyz_channel_11.dat
6                -2 from /data/shuklax/xyz_channel/xyz_channel_110.dat
6               -12 from /data/shuklax/xyz_channel/xyz_channel_111.dat
6                -9 from /data/shuklax/xyz_channel/xyz_channel_112.dat
6                 1 from /data/shuklax/xyz_channel/xyz_channel_113.dat
6                -8 from /data/shuklax/xyz_channel/xyz_channel_114.dat
6                 1 from /data/shuklax/xyz_channel/xyz_channel_115.dat
6                -5 from /data/shuklax/xyz_channel/xyz_channel_116.dat
6               -16 from /data/shuklax/xyz_channel/xyz_channel_117.dat
6                -2 from /data/shuklax/xyz_channel/xyz_channel_118.dat
6                -3 from /data/shuklax/xyz_channel/xyz_channel_119.dat
6                -1 from /data/shuklax/xyz_channel/xyz_channel_12.dat
6               -13 from /data/shuklax/xyz_channel/xyz_channel_120.dat
6                10 from /data/shuklax/xyz_channel/xyz_channel_121.dat
6               -15 from /data/shuklax/xyz_channel/xyz_channel_122.dat
6                 0 from /data/shuklax/xyz_channel/xyz_channel_123.dat
6               -11 from /data/shuklax/xyz_channel/xyz_channel_124.dat

I want the output from the files …
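
Both problems can be illustrated with a short sketch (in Python rather than Perl). A plain string sort puts xyz_channel_10.dat before xyz_channel_2.dat, so the sort key here is the number extracted from the name. The helper name channel_number and the short list of paths are assumptions; the start value and step are the ones quoted above.

```python
import re


def channel_number(path):
    """Extract N from a name ending in xyz_channel_N.dat."""
    match = re.search(r"xyz_channel_(\d+)\.dat$", path)
    if match is None:
        raise ValueError("unexpected file name: %s" % path)
    return int(match.group(1))


# The order a plain string sort would produce
paths = ["xyz_channel_1.dat", "xyz_channel_10.dat",
         "xyz_channel_100.dat", "xyz_channel_2.dat"]
ordered = sorted(paths, key=channel_number)  # sort by number, not by text

start, step = 6.659, 0.04
for index, path in enumerate(ordered):
    # Multiplying by the index avoids accumulating floating-point error
    velocity = start + index * step
    print("%.3f  %s" % (velocity, path))
```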

d5e5 109 Master Poster

Hi,
@mitchems & @d5e5 ........Thank you very much.
@d5e5......... in my case; it matters that 'diff' should be done in numerical order i.e. start from file xyz_channel_1.dat,.... and should finish with xyz_channel_200.dat.

so, as far as i understand this 'glob' will then not work...

With help from both of you, I have made the code below, but it is not printing anything. I want to save the output in a file called out.dat.

In fact, in the out.dat file I want to print two columns; the first column will be some corresponding values:

my $velocity;
for ($velocity=3; $velocity=203; $velocity=$velocity+1)

and the second column will be the 'diff' from the files, starting from xyz_channel_1.dat,
so out.dat will look like the following:

3             'diff' from xyz_channel_1.dat
4             'diff' from xyz_channel_2.dat
5             'diff' from xyz_channel_3.dat
6             'diff' from xyz_channel_4.dat
.
.
.
.
.
203           'diff' from xyz_channel_200.dat
#!/usr/bin/perl
use strict;
use warnings;

my $velocity;
for ($velocity=3; $velocity=203; $velocity=$velocity+1); 
 
my ($start_range1, $end_range1, $start_range2, $end_range2) = (20, 30, 100, 110);
opendir(DATA, "/shuklax/data/xyz_channel") || die "Can't open data directory \n";

# all the files xyz_channel_1.dat ...... xyz_channel_200.dat are in "/shuklax/data/xyz_channel".....

while( $file = readdir(DATA) )
{
foreach my $f ($file){
    my ($sum1, $sum2) = integrate_file($f);
    my $diff = $sum2 - $sum1;
    print "$f $sum1 $sum2 $diff\n";
}
 
sub integrate_file {
    my $file = $_[0];
    my ($s1, $s2) = (0, 0);
    open my $fin, '<', $file or die "Failed to open $file: $!";
    $_ = <$fin>; #Skip first line
    while (<$fin>){
        my ($x, $y) = split(/ /, $_);
        $s1 += $y if ($x >= …
d5e5 109 Master Poster

Well, one important difference between my solution and yours is that I am making full use of references and you're not. I'm not saying that's better or worse, but for a newbie, perhaps he doesn't understand references that well and if he's doing it for a class, the teacher will probably not believe that he wrote either of our solutions :-), especially mine because of all the de-referencing stuff.

BTW, I like the "glob" way of reading the files. I typically use something like this:

opendir(DATA, "$DATAROOT/$dir") || die "Can't open data directory ($DATAROOT)\n";
while( $file = readdir(DATA) )
{
 #do something with each file
}

But both work! Nice job! I love how Perl allows you to do stuff so many different ways.

Yes, references can be very useful but the notation looks scary when you're new to Perl. When reading files into an array I used to avoid using references too as I found the various alternative notations confusing. Instead I would save each line as an array element and split the element into another array as needed to access the component fields (which could require splitting the same array element more than once). But sometimes references are pretty much essential, such as for passing or returning more than one list to or from subroutines -- so it's good to become familiar with them sooner or later. (Not to mention hash references. Passing attributes to objects and saving complex data to databases require hashes and hash references.)

As …

d5e5 109 Master Poster

OK, try this. Not saying it's better than mitchems' solution, just different.

#!/usr/bin/perl
use strict;
use warnings;

my ($start_range1, $end_range1, $start_range2, $end_range2) = (2, 4, 2, 5);
my $cur_dir = "/home/david/Programming/Perl"; #Path to data files on my computer
my @files_to_integrate = glob("$cur_dir/xyz_channel_*.dat");

foreach my $f (@files_to_integrate){
    my ($sum1, $sum2) = integrate_file($f);
    my $diff = $sum2 - $sum1;
    print "$f $sum1 $sum2 $diff\n";
}

sub integrate_file {
    my $file = $_[0];
    my ($s1, $s2) = (0, 0);
    open my $fin, '<', $file or die "Failed to open $file: $!";
    $_ = <$fin>; #Skip first line
    while (<$fin>){
        my ($x, $y) = split(/ /, $_);
        $s1 += $y if ($x >= $start_range1) and ($x <= $end_range1);
        $s2 += $y if ($x >= $start_range2) and ($x <= $end_range2);
    }
    return ($s1, $s2)
}
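
One thing to watch with glob: the names come back sorted as strings, so xyz_channel_109.dat lands before xyz_channel_11.dat, as in the listing quoted earlier in this thread. If the files have to be processed in numeric order, the number can be pulled out of each name and used as the sort key. Here is a minimal sketch of that idea in Python 3. The helper name channel_number is made up for illustration, the three file names are just samples of the xyz_channel_N.dat pattern, and the "+ 2" assumes the velocity column starts at 3 for file 1 as described in the question.

```python
import re

def channel_number(path):
    """Extract N from a name like .../xyz_channel_N.dat (hypothetical helper)."""
    match = re.search(r"xyz_channel_(\d+)\.dat$", path)
    if match is None:
        raise ValueError("unexpected file name: " + path)
    return int(match.group(1))

# Sample names in the string order a glob-style listing would give
names = ["xyz_channel_109.dat", "xyz_channel_11.dat", "xyz_channel_2.dat"]

# Sort by the embedded number instead of by the raw string
ordered = sorted(names, key=channel_number)

for name in ordered:
    velocity = channel_number(name) + 2  # assumption: file 1 pairs with velocity 3
    print(velocity, name)
```

The same approach works in Perl with sort and a regex capture in the comparison block.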
d5e5 109 Master Poster

My wife says I have to turn off the computer now so this is just a stub. The following shows you can use glob to do something to every file that is named according to a specific pattern:

#!/usr/bin/perl
use strict;
use warnings;

my @files_to_integrate = glob("/home/david/Programming/Perl/xyz_channel_*.dat");

foreach my $f (@files_to_integrate){
    integrate_file($f);
}

sub integrate_file {
    my $file = $_[0];
    print "Integrating $file now\n";
}

I think you said you want to do two integrations and calculate the difference on each file, correct? I hope to have time to look at this a little more tomorrow.

d5e5 109 Master Poster

Hi there. Day 2 of programming python. In this thread I posted my first attempt
http://www.daniweb.com/forums/post1231604.html#post1231604
and growing from there it goes to slightly deeper water here.

I have three .txt files:

nvutf8.txt here new vocab items are stored
esutf8.txt here example sentences are stored
exoututf8.txt example sentences from esutf8.txt containing vocab from nvutf8.txt is supposed to be stored here.

I have written the following code:

#step1: find example sentences in esutf8.txt which contain new voc items from nvutf8.txt
#step2: among those sentences find those which contain as few as possible new words from kvutf8.txt (known vocab).

import codecs

enout = codecs.open('Python/ExListBuild/exoututf8.txt', encoding = 'utf-8', mode = 'w')

nvin = codecs.open('Python/ExListBuild/nvutf8.txt', encoding = 'utf-8', mode = 'r')

for line in open('Python/ExListBuild/nvutf8.txt'):
	newvocab = nvin.readline()
	print "-"
	print "next vocab item being checked"
	print "-"
	esin = codecs.open('Python/ExListBuild/esutf8.txt', encoding = 'utf-8', mode = 'r')
	for line in open('Python/ExListBuild/esutf8.txt'):
		sentence = esin.readline()
		index = sentence.find(newvocab)
		if index==-1:
			print "nope"
		else:
			print "yes"
			enout.write(sentence)
	esin.close()
nvin.close()

There are some hard to understand irregularities going on.

I use the following example sentences in esutf8.txt:
我前边要拐弯了,请注意。
车来了快跑。
请排好队上车。
带好自己的东西。
方向错了!
我给你讲一个成语故事。
感谢你对我们的关心。

For new vocab I use in nvutf8.txt:

And I get returned in exoututf8.txt:
我前边要拐弯了,请注意。
我给你讲一个成语故事。
感谢你对我们的关心。

So it worked fine for 我, but it did not work for 要 (which is in the first sentence).
EDIT: Apparently it always works ONLY for the last …
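
A likely explanation, assuming each vocab item sits on its own line in nvutf8.txt: readline() returns the line with its trailing newline still attached, so every item except the last one is searched for as "item\n" and is never found inside a sentence. Stripping the line before searching avoids that. Below is a small Python 3 sketch with in-memory lists standing in for the files. The vocab lines are an assumed sample (the real list is not shown above), and find_pairs is a made-up helper name.

```python
sentences = [
    "我前边要拐弯了,请注意。\n",
    "车来了快跑。\n",
    "我给你讲一个成语故事。\n",
]
# Assumed sample: the last line of the vocab file has no trailing newline
vocab_lines = ["要\n", "我"]

def find_pairs(vocab_lines, sentences, strip):
    """Return (word, sentence) pairs where the word occurs in the sentence."""
    pairs = []
    for raw in vocab_lines:
        word = raw.strip() if strip else raw
        if not word:
            continue  # ignore blank lines in the vocab file
        for sentence in sentences:
            if word in sentence:
                pairs.append((word, sentence))
    return pairs

without_strip = find_pairs(vocab_lines, sentences, strip=False)
with_strip = find_pairs(vocab_lines, sentences, strip=True)
print(len(without_strip), "matches without strip,", len(with_strip), "with strip")
```

Iterating directly over one file handle (for line in nvin:) instead of mixing a for loop on one handle with readline() on another would also make the original script easier to follow.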

d5e5 109 Master Poster

I made up a simplified data file containing the following:

Tom,no,2009
Dick,no,2008
Harry,se,2010
Jane,ip,2009
Frank,te,2007

Notice the last record has a deliberately invalid item for testing. The following should print your data, after looking up the desired values for $dat_type that are defined in a hash. Using a hash means you don't need a complex if-else statement to test every possible $dat_type.

#!/usr/bin/perl
use strict;
use warnings;

# %dtypes hash will associate dat_type keys with values to print
my %dtypes = (no => 'nokia',
              se => 'sony ericsson',
              ip => 'iphone');

my $dat_file = 'sample.csv';

open (my $fh, '<', $dat_file) or die ("Could not open $dat_file: $!");

my $type2print;
printf "%-10s%-25s%-10s\n", ('Name', 'Type', 'Year');
while (my $line = <$fh>){
    chomp $line; #Remove trailing linefeed
    my ($name,$dat_type,$year) = split ',', $line;
    
    if (exists $dtypes{$dat_type}) {
        $type2print = $dtypes{$dat_type}
    }else{
        $type2print = "Unknown Type '$dat_type'";
    }

    printf "%-10s%-25s%-10s\n", ($name, $type2print, $year);
}

This prints the following output:

Name      Type                     Year      
Tom       nokia                    2009      
Dick      nokia                    2008      
Harry     sony ericsson            2010      
Jane      iphone                   2009      
Frank     Unknown Type 'te'        2007
d5e5 109 Master Poster

Neat. I didn't know you could print to a scalar variable. But then I haven't read all the documentation.

d5e5 109 Master Poster

Sorry, tonyjv, I posted the above before noticing your solution, and it was too late to edit or delete mine. I agree it is much better to call the same function for each weekday than to define separate functions for Monday, Tuesday, etc.

d5e5 109 Master Poster

The ValueError occurs after your while valid == False: loop has completed. Since the ValueError occurs outside the loop where you do your exception handling, it is not handled and so crashes the program.
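
One way to fix it is to keep the conversion that can raise ValueError inside the try block within the loop, so a bad entry just triggers another prompt. The original code is not shown here, so this is only a sketch in Python 3: the function name read_int and the ask parameter (which lets the prompt be replaced for testing) are assumptions.

```python
def read_int(ask=input):
    """Keep asking until the reply parses as an integer."""
    while True:
        reply = ask("Enter a whole number: ")
        try:
            # int() raises ValueError here, inside the loop, where it is handled
            return int(reply)
        except ValueError:
            print(repr(reply), "is not a whole number, try again.")
```

Called as read_int(), it prompts on the console and only returns once int() succeeds.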

d5e5 109 Master Poster

Thanks tonyjv, but I just can't figure this out. Using my incredible genius, (sarcasm) it finally clicked in my brain that the code in red cannot possibly give me how many minutes are left along with the hours. Because the code there now is meant to show me how many minutes are in the number of hours they worked! :$

It's supposed to show how many hours they worked, and then how many minutes are left.
Like, if the clock in time is 11:20AM, and clock out time is 9:15PM. The program should say that they worked 9 hours and 55 minutes that day.

import time

print "When did they clock in? Use HH:MMtt format."
clockIn = raw_input()

timeString1 = "05/23/10 " + clockIn
timeTuple1 = time.strptime(timeString1, "%m/%d/%y %I:%M%p")

print "When did they clock out?"
clockOut = raw_input()

timeString2 = "05/23/10 " + clockOut
timeTuple2 = time.strptime(timeString2, "%m/%d/%y %I:%M%p")

totalTime = time.mktime(timeTuple2) - time.mktime(timeTuple1)

print "They worked",totalTime//(60.0*60),"hours and",totalTime/60.0,"minutes today."

Here is what happens when I enter the clock in and clock out time I said above.

IDLE 2.6.5 ==== No Subprocess ====
>>>
When did they clock in? Use HH:MMtt format.
11:20AM
When did they clock out?
9:15PM
They worked 9.0 hours and 595.0 minutes today.
>>>

I don't know where to go from here to make this work...

Your code neglects to subtract the minutes that you have already converted into hours from the remaining minutes. For example, to convert 65 minutes …
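
A short sketch of that arithmetic in Python 3, using the clock-in and clock-out times from the question. It uses datetime rather than time.mktime, which is a substitution on my part, and lets divmod split the total minutes into whole hours and the minutes left over.

```python
from datetime import datetime

fmt = "%I:%M%p"  # same 12-hour format as in the question, without the date

clock_in = datetime.strptime("11:20AM", fmt)
clock_out = datetime.strptime("9:15PM", fmt)

total_minutes = int((clock_out - clock_in).total_seconds()) // 60

# divmod returns (whole hours, leftover minutes) in one step
hours, minutes = divmod(total_minutes, 60)

print("They worked", hours, "hours and", minutes, "minutes today.")
```

For these times the total is 595 minutes, which divmod turns into 9 hours and 55 minutes.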

d5e5 109 Master Poster

Let me see if I got this right:

os.startfile("PATH OR URL") would open the defaultbrowser for windows and send it there?

and

webbrowser.open("PATH OR URL") would do the same but for any platform?


can you clarify why they are so different?? Which one is better???

On a non-Windows platform the os module doesn't have a startfile function at all. I guess os.startfile works on Windows by checking the Windows registry for associations between the file extension and its default program.

webbrowser.open("PATH OR URL") works on both Windows and Linux, etc., presumably because it does not depend on the Windows registry to discover the default browser. That's all I know about os.startfile and webbrowser.open so far so can't say which is best.
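
To make the difference concrete, here is a small sketch in Python 3 that checks whether os.startfile is present and falls back to webbrowser.open otherwise. The helper name choose_opener is made up for illustration, and nothing is actually opened when the snippet runs.

```python
import os
import webbrowser

def choose_opener():
    """Return a callable that opens a path or URL with the default program."""
    if hasattr(os, "startfile"):
        # Only present on Windows builds of Python
        return os.startfile
    # Available on every platform
    return webbrowser.open

opener = choose_opener()
print("Would open with:", opener.__name__)
```

If the target is always a web page or a local HTML file, calling webbrowser.open directly is probably the simpler choice, since it needs no platform check.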

d5e5 109 Master Poster
>>> import webbrowser
>>> webbrowser.open('/home/david/index.html')
True
>>>

It opens my default browser, Firefox, and displays my index.html file. I think it's supposed to do the same on other platforms.

d5e5 109 Master Poster

Look at vegaseat's code snippet about date and time handling and scroll down to the part that says "Calculate difference between two times (12 hour format) of a day:". That looks like what you need.

d5e5 109 Master Poster

I made up the following data to test and put it in a file called 'users.csv'.

"Name","ComplicatedColumn","AccountDisabled","OnSite",
"Fred","Age=25,Pet=Fish","0","1",
"Donna","Age=29,Pet=Cat","0","1",
"Tom","Age=35,Pet=Gerbil","1","0",
"Linda","Age=27,Pet=Dog","0","0",

The following works for me (finally) except if the "AccountDisabled" header name doesn't exist in the file it won't catch that error... so it may need better error-checking.

#!/usr/bin/perl
use strict;
use warnings;
use Text::CSV;
my $dir = '/home/david/Programming/Perl';
my $file = $dir . '/' . 'users.csv';

my $csv = Text::CSV->new({always_quote => 1});


open (my $fh, "<", $file) or die $!;
my @file = <$fh>;
close $fh;
my $check_colname= "AccountDisabled";
my $check_col_index;
if ($csv->parse($file[0])) { #Look at the first line to find index of column name
    my @header = $csv->fields();
    ($check_col_index) = grep { $header[$_] eq $check_colname} 0..$#header;
    print "Index is $check_col_index\n";
}else{
    die "Couldn't parse first line of $file";
}

foreach (@file) { #Loop through array and print
    if ($csv->parse($_)) {
        my @columns = $csv->fields();
        next if $columns[$check_col_index] eq "1"; #If AccountDisabled column has a "1" in it
        my $status = $csv->combine(@columns);    # combine columns into a string
        my $line   = $csv->string();             # get the combined string
        print "$line\n";
    } else {
        my $err = $csv->error_input;
        print "Failed to parse line: $err";
    }
}
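
For comparison, here is a sketch of the same filter in Python 3 using the standard csv module, with the missing-header check added. The function name enabled_rows is made up, and the data is the test file above held in a string so the example runs on its own.

```python
import csv
import io

SAMPLE = '''"Name","ComplicatedColumn","AccountDisabled","OnSite",
"Fred","Age=25,Pet=Fish","0","1",
"Donna","Age=29,Pet=Cat","0","1",
"Tom","Age=35,Pet=Gerbil","1","0",
"Linda","Age=27,Pet=Dog","0","0",
'''

def enabled_rows(text, column="AccountDisabled"):
    """Return the header plus every row whose `column` field is not "1"."""
    rows = list(csv.reader(io.StringIO(text)))
    header = rows[0]
    if column not in header:
        # This is the error the Perl version does not catch
        raise ValueError("column %r not found in header" % column)
    index = header.index(column)
    return [header] + [row for row in rows[1:] if row[index] != "1"]

for row in enabled_rows(SAMPLE):
    print(row)
```

The quoted commas inside ComplicatedColumn are handled by the csv parser, just as Text::CSV handles them in the Perl script.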

I'll be away tomorrow (Saturday). Have a good weekend.

BastienP commented: Once again something 100% usefull. Bastien +1
mitchems commented: Nice use of Text::CSV! +2
d5e5 109 Master Poster

You could do something like the following:

#!/usr/bin/perl
use strict;
use warnings;
my $string;
my $desired_string_length = 25; #For testing. You can change this to 7000 later.
my $strcount; #Count the separate strings printed

open my $fh, "/home/david/Programming/Perl/cnames.txt" or die "couldn't find file: $!";

while (<$fh>){
    chomp; #Removes trailing new-line character from $_
    $string .= "$_ "; # "$_ followed by one space"
    if (length($string) > $desired_string_length) {
        print "$string\n";
        $string = ''; #Reset string to empty
        $strcount++; #Add one to counter
    }
}     
if (length($string) > 0) {#If any data remains in string
    print $string;
    $strcount++;
}
close $fh;
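
The Perl script above starts a new string only after the length has already gone past the target, so each string can run slightly over. If the limit must not be exceeded, the check can be done before a name is appended. Here is a sketch of that variant in Python 3. The function name chunk_names and the limit of 15 are only for illustration, and the file names are the ones mentioned in the question.

```python
def chunk_names(names, limit):
    """Group names into space-separated strings without splitting any name.

    Each string stays within `limit` characters, except when a single
    name is itself longer than the limit.
    """
    chunks = []
    current = ""
    for name in names:
        candidate = name if not current else current + " " + name
        if current and len(candidate) > limit:
            chunks.append(current)  # close the full chunk
            current = name          # start a new one with this name
        else:
            current = candidate
    if current:
        chunks.append(current)      # keep whatever is left over
    return chunks

names = ["hello.c", "why.c", "time.c", "office.c", "work.c"]
for chunk in chunk_names(names, 15):
    print(chunk)
```

Because names are only ever added whole, every string ends with a complete file name.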

Thanks, that works perfectly but now I have another question.... $string has a lot of file names stored in it such as hello.c why.c time.c office.c work.c etc. The length of this string is about 30000 characters. I want to make sets of about 7000 characters and I want to make sure that I always have a complete file name at the end, so I can't have hello.c why.c ti (it should always have the .c extension). I'm not sure how to do this, any help will be appreciated.

Thanks once again,
Rahul

d5e5 109 Master Poster

As a rule it's better to start a new topic, even if the problem is similar to the solved one. We'll still be able to refer to the solved one if necessary. There is an option where you can attach a text file to your post if you want. That way the record format of the data is preserved.

Here I am, trying to drop a line where a specific field is equal to a specific value, and stuck again...

I'm replying to this thread because it's the same kind of stuff, but feel free to tell me if I should split the topic instead.

I have to check if a specific column value for a line is equal to a fixed value. For instance, I still have this file with user account data from Active Directory, one user per line. If the user's account has the status disabled, I have to drop the line (because I won't list him in the phonebook; he has probably left the company). So I'm comparing the field named "AccountDisabled" in the header to 0 or 1. If the value is 1, then drop the line.

I'll post my test scripts (though they don't work the way I want them to).

Regards,
Bastien

d5e5 109 Master Poster

Interesting. I have to leave my computer now but I'll try and have a look at this tomorrow.

d5e5 109 Master Poster

You can read the correct version of this code at this link. I think you have some typing errors in what you show us but it's hard to tell without the code tags.