d5e5 109 Master Poster

Instead of taking filename from alt take it from the end of the src URL.

# Obtains all individual comic data
sub getComicData {
    my $siteData = get("$sitePrefix$current/");
    my @data = split /\n/, $siteData;
    foreach (@data) {
        if (/http:\/\/xkcd\.com\/(\d+)\//) {
            $current = $1;
        }
        
        #Instead of taking filename from alt
        #take it from the end of the src URL
        if (/src="(http:\/\/imgs\.xkcd\.com\/comics\/(.+\.\w{3}))"/) {
            $currentUrl = $1;
            #if (/alt="(.+?)"/) {
            #    $title = $1;
            #    $title = "House of Pancakes" if $current == 472;  # Color title on comic 472 with weird syntax
            #}
            $title = $2;
            say "File to save: $title";
            if (/title="(.+?)"/) {    #title attribute, commonly known as 'alt' text
                $alt = $1;
            }
        }
    }
}
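
The file-name idea can be tried on its own, away from the download loop. The following is only a sketch: image_from_line is a hypothetical helper (it is not part of the script above) and the sample line is made up. It escapes the dots in the host name and uses [^"/]+ so the captured name cannot run past the closing quote.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: returns (url, filename) for a line that contains a
# comic <img> tag, or an empty list when the line has no such tag.
sub image_from_line {
    my ($line) = @_;
    # Escaped dots match only a literal '.'; [^"/]+ stops at a quote or slash.
    if ($line =~ m{src="(http://imgs\.xkcd\.com/comics/([^"/]+\.\w{3}))"}) {
        return ($1, $2);
    }
    return;
}

# Made-up sample line, for illustration only
my $line = '<img src="http://imgs.xkcd.com/comics/sample_comic.png" title="hover text" alt="Sample" />';
my ($url, $file) = image_from_line($line);
print "File to save: $file\n";
```

The file name comes from the second capture group, so it always matches the name used on the image server regardless of what the alt attribute says.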
d5e5 109 Master Poster

(Also, how do I properly wrap my code in the [code] tags, as this is my first post with a code snippet?)

[CODE]#First line of code
.
.
.
.
#Last line of code[/CODE]
d5e5 109 Master Poster

This works for me in a Linux environment:

perl -n -i.bak -e '$r=1 if m/<process-type="Remote">/;$m=1 if $r && m/<\/module-data>/;print;if ($r and $m){print "blah\n" x 7;($r,$m)=(0,0);}' file.txt

file.txt now contains:

--------------------------FILE------------------------------------------
<process-type="Local">
               <module-data>
               </module-data>
<process-type="Remote">
               <module-data>
               </module-data>
blah
blah
blah
blah
blah
blah
blah
--------------------------FILE------------------------------------------
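
If the one-liner is hard to read, the same two-flag logic can be written out as a small script. This is a sketch under the assumption that the lines are already in an array; add_after_remote_block is a name I made up for illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Returns a copy of the lines with $count filler lines added after the first
# </module-data> that follows each <process-type="Remote"> line.
sub add_after_remote_block {
    my ($lines, $filler, $count) = @_;
    my $in_remote = 0;    # true once a Remote process-type line has been seen
    my @out;
    foreach my $line (@$lines) {
        push @out, $line;
        $in_remote = 1 if $line =~ m/<process-type="Remote">/;
        if ($in_remote && $line =~ m{</module-data>}) {
            push @out, ($filler) x $count;
            $in_remote = 0;    # reset so later Local blocks are left alone
        }
    }
    return @out;
}

my @lines = (
    '<process-type="Local">',  '<module-data>', '</module-data>',
    '<process-type="Remote">', '<module-data>', '</module-data>',
);
print "$_\n" foreach add_after_remote_block(\@lines, 'blah', 7);
```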
d5e5 109 Master Poster

I don't know how to do it with a one-liner.

The "add only if not already present" requirement is much more easily done by building a hash instead of an array.

#!/usr/bin/perl
use strict;
use warnings;
use 5.010;

my %emails = ('sam@email.com' => undef,
              'john@email.com' => undef,
              'jenifer@email.com' => undef);#Hash of emails

while (<DATA>){
    chomp;
    s/^Zip_Name: //;#Remove unwanted text at beginning of $_ (default record variable)
    $emails{$_} = undef; #Email as key in hash automatically unique.
}

say "Hash contains the following emails:";
say foreach (sort keys %emails);

__DATA__
Zip_Name: jenni@email.com
Zip_Name: sam@email.com
Zip_Name: dave@email.com
Zip_Name: john@email.com

This gives the following output:

Hash contains the following emails:
dave@email.com
jenifer@email.com
jenni@email.com
john@email.com
sam@email.com
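
If you also need to know whether a given email was new, exists on the hash answers that before you add the key. A minimal sketch, where add_if_new is a hypothetical helper and not part of the script above:

```perl
#!/usr/bin/perl
use strict;
use warnings;

my %emails = map { $_ => undef } qw(sam@email.com john@email.com);

# Hypothetical helper: adds $email as a key and returns 1 only if it was new.
sub add_if_new {
    my ($seen, $email) = @_;
    return 0 if exists $seen->{$email};    # already a key, nothing to do
    $seen->{$email} = undef;
    return 1;
}

foreach my $email (qw(dave@email.com sam@email.com)) {
    my $status = add_if_new(\%emails, $email) ? 'added' : 'already present';
    print "$email: $status\n";
}
```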
d5e5 109 Master Poster

Hi d5e5,

Great, thank you. This is what I was looking for, and I got my desired output.

Once again, thanks for all your effort and support.

You're welcome. I'm glad it finally works.

Please don't forget to mark this thread solved.

d5e5 109 Master Poster
#!/usr/bin/perl
use strict;
use warnings;
use 5.010;

my $filename_old = 'old.txt';
my $filename_new = 'new.txt';
my %data;
my %moved;

read_file($filename_old);
read_file($filename_new);

#Find lines moved from old
foreach my $v(keys %{$data{$filename_old}}){
    foreach my $g(keys %{$data{$filename_old}->{$v}}){
        $data{$filename_new}->{$v}->{$g}->{count} = 0
            unless defined $data{$filename_new}->{$v}->{$g}->{count};
        if ($data{$filename_old}->{$v}->{$g}->{count}
            > $data{$filename_new}->{$v}->{$g}->{count}) {
            $moved{'from'}->{$v}->{'value'} = $v;
            $moved{'from'}->{$v}->{'group'} = []
                unless defined $moved{'from'}->{$v}->{'group'};
            push @{$moved{'from'}->{$v}->{'group'}}, $g;
            #say "$v $g count is $data{$filename_old}->{$v}->{$g}->{count}";
            #say "$filename_new $v $g count is $data{$filename_new}->{$v}->{$g}->{count}";
        }
    }
}

#Find lines moved to new
foreach my $v(keys %{$data{$filename_new}}){
    foreach my $g(keys %{$data{$filename_new}->{$v}}){
        $data{$filename_old}->{$v}->{$g}->{count} = 0
            unless defined $data{$filename_old}->{$v}->{$g}->{count};
        if ($data{$filename_new}->{$v}->{$g}->{count}
            > $data{$filename_old}->{$v}->{$g}->{count}) {
            $moved{'to'}->{$v}->{'value'} = $v;
            $moved{'to'}->{$v}->{'group'} = []
                unless defined $moved{'to'}->{$v}->{'group'};
            push @{$moved{'to'}->{$v}->{'group'}}, $g;
            #say "$v $g count is $data{$filename_old}->{$v}->{$g}->{count}";
            #say "$filename_new $v $g count is $data{$filename_new}->{$v}->{$g}->{count}";
        }
    }
}

foreach my $k(sort keys %{$moved{'from'}}){
    my $v = $moved{'from'}->{$k}->{'value'};
    my @gf = @{$moved{'from'}->{$k}->{'group'}};
    my @gt = @{$moved{'to'}->{$k}->{'group'} || []}; #Empty list if the value was not moved to any group
    say "Value $v from group @gf has been moved to @gt group";
}

sub read_file {
    my $filename = shift;
    open my $fh, '<', $filename or die "Failed to open $filename: $!";    
    while (<$fh>){
        chomp;
        next if m/^##/; #Skip commented-out data lines
        next unless m/\d{3}/;
        my ($group, $value) = split;
        $data{$filename}->{$value}->{$group}->{'count'}++;
    }
}

This gives the following output:

Value 465 from group Unknown has been moved to DEF group
Value 876 from group ABC has been moved to Unknown group
d5e5 109 Master Poster

I assumed that if a certain value was found, this implies that it was moved. However, I am not sure if the OP wanted to check for values that moved from any group (which, if I'm not mistaken, is what my code does).
Since we have 2 code samples now, I'm sure the OP should be able to figure it out by himself :)

I'm still not completely sure that any program can say what lines 'moved' from FileA to FileB and vice versa by reading only those two files without reading a previous version of them -- especially since @realoneomer had to show us the contents of the previous versions in order to explain what it meant for lines to 'move'.

If I needed to do something like this, I would first try to get by with using some procedure to compare FileA with FileB, such as the diff command in Linux or one of several examples available in Perl to find the difference between two text files, and see if that served the purpose. Trying to determine what lines moved from what file to what file might require reading the previous files and comparing with them as well, and that could require a lot of work.
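
As a rough illustration of the plain comparison I mean, here is a sketch that reports the lines found in one file but not in the other. It assumes the lines are already in arrays, and the sample rows are made up; only_in_first is a hypothetical name.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Returns the lines of @$first that do not occur in @$second, in original order.
sub only_in_first {
    my ($first, $second) = @_;
    my %in_second = map { $_ => 1 } @$second;
    return grep { !$in_second{$_} } @$first;
}

# Made-up sample rows, for illustration only
my @file_a = ('ABC 123', 'ABC 987', 'DEF 456');
my @file_b = ('ABC 123', 'DEF 456', 'GHI 435');

print "Only in FileA: $_\n" foreach only_in_first(\@file_a, \@file_b);
print "Only in FileB: $_\n" foreach only_in_first(\@file_b, \@file_a);
```

This only says which lines differ between the two files. It cannot say that a line "moved" from anywhere, which is the part that would need the previous versions.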

d5e5 109 Master Poster

The following does not give exactly the output you want but hopefully it is a first step in that direction.

#!/usr/bin/perl
use strict;
use warnings;
use 5.010;

my $filename_a = 'a.txt';
my $filename_b = 'b.txt';
my %data;

read_file($filename_a);
read_file($filename_b);

#use Data::Dumper;
#print Dumper(\%data);

foreach my $v(keys %{$data{$filename_a}}){
    foreach my $g(keys %{$data{$filename_a}->{$v}}){
        $data{$filename_b}->{$v}->{$g}->{count} = 0
            unless defined $data{$filename_b}->{$v}->{$g}->{count};
        if ($data{$filename_a}->{$v}->{$g}->{count}
            != $data{$filename_b}->{$v}->{$g}->{count}) {
            say "$filename_a $v $g count is $data{$filename_a}->{$v}->{$g}->{count}";
            say "$filename_b $v $g count is $data{$filename_b}->{$v}->{$g}->{count}";
        }
    }
}

sub read_file {
    my $filename = shift;
    open my $fh, '<', $filename or die "Failed to open $filename: $!";    
    while (<$fh>){
        chomp;
        next if m/^##/; #Skip commented-out data lines
        next unless m/\d{3}/;
        my ($group, $value) = split;
        $data{$filename}->{$value}->{$group}->{'count'}++;
    }
}

This gives the following output:

a.txt 431 ABC count is 1
b.txt 431 ABC count is 0
a.txt 431 Unknown count is 1
b.txt 431 Unknown count is 0
d5e5 109 Master Poster

Hello Perl gurus,

I have been working with two text files in Perl, but after a full day of effort I have nothing to show for it, so I decided to post here for help. Here are the details of what I want to do.

(I generate one file from a shell script, and at the end of execution the file is renamed to file_old. The next day the shell script runs again and generates a file named file_new. I then want to compare the two files: if a value in col2 has changed, show me the value of col1 from either FileA or FileB.)

I have two text files named FileA and FileB, and both files have two columns, like this:

FileA

Col1 Col2
ABC 123
ABC 987
DEF 456
DEF 898
DEF 658
GHI 789

FileB also has two columns and looks like this:

Col1 Col2
ABC 123
ABC 987
DEF 456
DEF 898
DEF 658
GHI 789
GHI 435
GHI 654
KLM 543
KLM 123
KLM 324

Now I want to compare col2 of both files against col1. If any data in col2 has moved in the first or second file, then show me the value in col1 from both files, i.e., if data has moved from FileA to FileB then show …

d5e5 109 Master Poster

The pending method gives you the number of items in the queue, so you can peek at each of them in a loop.

#!/usr/bin/perl
use strict;
use warnings;

use Thread::Queue;
my $q = Thread::Queue->new;
$q->enqueue('item1', 'item2', 'item3');

my $count = $q->pending;
my @queued_items;
push @queued_items, $q->peek($_) foreach(0 .. $count-1);

print "Items currently in queue:\n";
print join("\n", @queued_items), "\n";
d5e5 109 Master Poster

k_manimuthu's answers should work fine. Here is a slightly different way to do the same thing.

#!/usr/bin/perl
use strict;
use warnings;

my $input_file = 'blast.txt';

open my $fh, '<', $input_file or die "Cannot Open the $input_file : $!";

my $sequence;
while (<$fh>){
    chomp;
    $sequence .= $_ unless m/^>/;#Skip the line that starts with >
}

print $sequence, "\n";

if ($sequence =~ /^ATG.*TAT$/){
    print "The above sequence starts with ATG and ends with TAT, so it's a gene.";
}
else{
    print "The above sequence is not a gene.";
}
close $fh;

This gives the following output:

ATGGGCCTACATCCACSTAT
The above sequence starts with ATG and ends with TAT, so it's a gene.
d5e5 109 Master Poster

split splits a string according to a pattern that separates what we want to be elements in a list and returns a list, if used in list context. Since the default is to split words separated by one or more whitespace characters, I could have taken the defaults for the split arguments, so that it would split the contents of the string in $_ (the default variable) on the default pattern. Here is a slightly improved, commented version of the subroutine, in a little script I used to test it from the command line.

#!/usr/bin/perl
use strict;
use warnings;

#The following works even when more than one space separates the words.
print format_spn_string('peas carrots     beets corn');

sub format_spn_string{
    local $_ = shift; #Assign first (and only) subroutine argument to $_ without clobbering the caller's $_
    my @array = split; #Split string ($_ if unspecified) and assign list to @array
    my $out = join ', ', @array; #join elements of list. Separator is comma space
    return $out;
}
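
To see why the separator pattern matters, here is a small sketch that splits the same string three ways. The test string is built with an explicit run of five spaces so the counts are easy to check.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Five spaces between 'carrots' and 'beets', built explicitly so the count is clear
my $text = 'peas carrots' . (' ' x 5) . 'beets corn';

my @by_single = split /\s/,  $text;  # every single whitespace character separates two fields
my @by_run    = split /\s+/, $text;  # a whole run of whitespace counts as one separator
my @default   = split ' ',   $text;  # like /\s+/, and also skips leading whitespace

print scalar(@by_single), " fields with /\\s/\n";
print scalar(@by_run),    " fields with /\\s+/\n";
print scalar(@default),   " fields with ' '\n";
print join(', ', @default), "\n";
```

With /\s/ the five spaces produce four empty fields between carrots and beets, which is where the stray commas come from when the result is joined.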
d5e5 109 Master Poster

Please try the following oversimplified script for an example of how to format your SPNs with commas.

#!/usr/bin/perl
use strict;
use warnings;
 
use CGI qw(:standard);

main();

sub main
{
####print "<<HEADER"; #Commented out. Causes "malformed header" error.

print header; #Let the CGI module print your header
print "<html><head>\n";   
print "<title>RED DRILL PAGE</title></head>\n";
print "<body>\n";
print "<h3>Red Drill Credit Exposure</h3>\n";
if (param('spns')){
    my $spns = param('spns');
    $spns = format_spn_string($spns);
    print "spns are: $spns";
}
else{
    show_form();#calling show_form subroutine
}
print"</body></html>\n";
}

sub show_form
{
        my $url = url;
        print qq{<form name="input" action="$url" method="get">\n};
        print qq{<table align="center" border="1" bordercolor="black" cellpadding="2" cellspacing="0">\n};
        print qq{<tr>};
        print qq{<td align="right">Please enter your SPNs</td>};
        print qq{</tr>\n};
        print qq{<tr>};
        print qq{<td align="left"><input type="text" width="7" name="spns" value="">};
        print qq{<BR>Place each SPN separated by a space</td>};
        print qq{</tr>\n};
        print qq{</table><center><input type="submit" value="Submitted"></center></form>\n};
}

sub format_spn_string{
    my $in = shift;
    my $out = join ', ', split /\s/, $in;
    return $out;
}
d5e5 109 Master Poster

What happened when you tested it? I can't test your script because I don't have all your modules and don't have your l2cgi.cfg file. Try to access your script on your web server, then open /var/log/apache2/error.log or whatever file your server uses to report errors and see what errors you find near the end of the log.

Where did you find Murex::Passwords? I don't see it on CPAN.

Don't require cgi-lib.pl. Somebody asked about cgi-lib.pl on Perl Monks in 2005 and was advised not to use it because it was old and obsolete.

d5e5 109 Master Poster

Your example has only enough data for inserting one row.

#!/usr/bin/env python
#python 2
import re
import sqlite3

conn = sqlite3.connect('example')
c = conn.cursor()

mylist = []
with open("usage.log") as fp:
      for line in fp:
            match = re.match(r'(0\.0\.0|1\.6\.1|1\.8\.1)\(([0-9\.]+)', line)
            if not match: continue
            version, value = match.groups()
            mylist.append(value)

#Execute the cursor   
c.execute('INSERT INTO energielog (sernr, peak, kwh) VALUES (?, ?, ?)', mylist)

# Save (commit) the changes
conn.commit()

#Retrieve and display all rows from your table
c.execute('select * from energielog order by sernr')
for row in c:
    print row

# Close the cursor
c.close()

Running the above gives the following output:

(u'06026104', u'0.1501', u'02484.825')

Note: I don't specify a value for ROWID because I assume it will autoincrement.

d5e5 109 Master Poster

When you talk about the names of your rows you confuse me. Do you mean you have a table named energielog having three columns named sernr, peak and kwh?

d5e5 109 Master Poster

Good, except for one detail. Look at the last word at the end of your output when you run your program. Does it say 'None'? Is 'None' the last word in alice_in_wonderland.txt?

d5e5 109 Master Poster

I see what you mean about the split on spaces. Do you think if I remove that then the program will work? Thanks

No, that sounds too optimistic. Usually when I debug a program I get rid of one error and then another one pops up.:(

I think the reason the $site variable has no value when you try to concatenate it with something else is that the parseREBASE subroutine expects to find both the name and the site on each line that it reads, but in the file you attached there is only one field on each line. After reading a line to get the name, the program should read the next line to get the site. I made a change to the parseREBASE sub to read the next line and assign it to $site so $site will not be uninitialized when it is used. Try replacing the parseREBASE sub with the following:

sub parseREBASE {

    my($rebasefile) = @_;

    use strict;
    use warnings;

    # Declare variables
    my @rebasefile = (  );
    my %rebase_hash = (  );
    my $name;
    my $site;
    my $regexp;

    # Read in the REBASE file
    my $rebase_filehandle = open_file($rebasefile);

    while(<$rebase_filehandle>) {

    # Discard header lines
    ( 1 .. /Rich Roberts/ ) and next;

    # Discard blank lines
    /^\s*$/ and next;
    #--------------------------Start of changes 2010-12-02 d5e5
    ##### The following commented-out code assumes there are two or three fields
    ##### in each line of the file you attached, but there is only one
    ##### field per …
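
The idea of reading the name from one line and the site from the next can be shown in isolation. This is only a sketch, not the changed parseREBASE sub: pair_names_with_sites is a hypothetical helper that takes the lines as a list instead of reading the file.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Pairs each enzyme name with the recognition site on the line that follows it.
sub pair_names_with_sites {
    my @lines = grep { !/^\s*$/ } @_;    # drop blank lines first
    my %site_for;
    while (@lines >= 2) {
        my $name = shift @lines;         # first line of the pair is the name
        my $site = shift @lines;         # next line is its site
        $site_for{$name} = $site;
    }
    return %site_for;
}

my %site_for = pair_names_with_sites('AanI', 'TTA!TAA', '', 'AarI', 'CACCTGCNNNN!');
print "$_ => $site_for{$_}\n" foreach sort keys %site_for;
```

Because both values are taken in the same loop pass, the site is always defined when it is used. A trailing name with no site line is simply ignored.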
d5e5 109 Master Poster

Now I get the error.
Regarding this statement in the parseREBASE sub: my @fields = split( " ", $_); I don't understand why you split each line from the file on spaces because each non-blank line appears to contain only one sequence followed by end-of-line character but no spaces. For example, lines 10 through 15 of the file you attached look like this:

AanI
TTA!TAA
AarI
CACCTGCNNNN!
AasI
GACNNNN!NNGTC

... so why split on spaces?

My computer time is just about over for today but I'll try to have another look at this tomorrow.

d5e5 109 Master Poster

I copied and ran your script but couldn't reproduce the error you got. The program kept prompting me with the message "Search for what restriction site for (or quit)?: " as long as I typed and entered some input. When I pressed enter with no input the program exited with no error. I created a dummy file called 'rebase.txt' but didn't know what to put in it.

That message you got, "Use of uninitialized value $site in concatenation (.) or string", probably means that the $site variable has no value assigned to it when some statement attempts to combine it with another string. But I don't know what data to enter to get your program to reproduce the error you are getting.

d5e5 109 Master Poster

The question confuses me too. I still see only 10 bases and I don't see any 'ATG' in the sequence. Whoever gave you this question may have made a mistake.

d5e5 109 Master Poster

print read_book() calls the read_book function once. The read_book function does something to each of the lines in the file, so at the end of the first for-loop the variable l (bad name for a variable... looks like a number 1) contains the last line of the file. Then the second for-loop reads the first word into variable w and returns w. Returning from a function means exiting the function, which is called only once. Result: the one word returned is the first word in the last line of the file.

d5e5 109 Master Poster

Hello, I would like to know if I need to use a regular expression to match the desired substring in order to print out 10 characters of the start codon ATG.

My dna sequence is "CATAGAGATA"

Thanks for any advice.

I don't think I understand the question. Your dna sequence consists of 10 characters and you want to print out 10 characters starting with the substring 'ATG'? I don't see any occurrence of the substring 'ATG' in your sequence. Can we shuffle the dna sequence until it contains (or starts with?) 'ATG'? Please tell us how you would determine the output without using a program and then maybe we can advise how to write a program that does it.

For example, does the following do what you want?

#!/usr/bin/perl
use strict;
use warnings;
use List::Util qw(shuffle); #This module includes a method to shuffle arrays.

my $str = "CATAGAGATA";
my @arr;

while (1){
    @arr = $str =~ m/[AGCT]/g; #Convert string into array of single letters
    @arr = shuffle(@arr); #Shuffle the letters of the array randomly
    last if join('', @arr[0,1,2]) eq 'ATG'; #Exit loop if first 3 elements are the start codon
}

print "Shuffled sequence is:\n";
print join('', @arr), "\n";

This outputs something like the following (the bases after ATG vary from run to run):

Shuffled sequence is:
ATGAGACATA
d5e5 109 Master Poster

LOL?
are you trying to unmask my lack of Perl knowledge?
because I already stated I started learning perl few days ago.

BTW thanks for the answer but what I was wondering was if it is possible to print messages while the loop is running not when it finishes.

Because I coded a small script that runs for about 5 mins and It would be a nice detail to print the number of seconds until the process finishes

I think that is not possible thanks anyways.

I didn't mean to disparage your knowledge of perl and maybe I misunderstood your original question. What I thought you asked originally was "while the loop is running i want to print the numbers of times the loop has been repeated" and IMO mitchems answered that question. Now you ask how to print the number of seconds left before the process finishes. In my opinion that is a different question. Of course if you don't know in advance how much time a process is going to take then you probably cannot say how much longer the process will take while the process is running. But you would need to provide us with more information about the process before we can say whether what you want to do is possible. Why not start a new thread to ask it and mark this one solved?
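
For what it's worth, printing while a loop runs is possible, and if every iteration takes roughly the same time you can extrapolate the time left from the time used so far. The following is only a sketch under that assumption; seconds_remaining is a made-up helper and the loop body is a placeholder.

```perl
#!/usr/bin/perl
use strict;
use warnings;

$| = 1;    # turn off output buffering so each message appears immediately

# Estimates the seconds remaining, assuming every iteration takes about the same time.
sub seconds_remaining {
    my ($elapsed, $done, $total) = @_;
    return undef if $done <= 0;    # nothing to extrapolate from yet
    return $elapsed / $done * ($total - $done);
}

my $total = 5;
my $start = time();
foreach my $i (1 .. $total) {
    # ... the real work for one iteration would go here ...
    my $left = seconds_remaining(time() - $start, $i, $total);
    printf "finished %d of %d, about %.0f seconds left\n", $i, $total, $left;
}
```

The estimate is only as good as the assumption behind it. If the iterations vary a lot in length, the number will jump around.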

d5e5 109 Master Poster

RTMF...

use strict;
use warnings;
my $x=0;
while($x<100){
	$x++;
	print "$x\n";
}
print "the loop ran $x times\n";

I wonder if terabyte would mind telling us how to code a few tools, including calculators without printing during a loop.:-/

If I were the OP, I'd mark this one solved.

d5e5 109 Master Poster

Thank you for your help and it makes sense

You're welcome. Please mark this thread solved.

d5e5 109 Master Poster

Could you have more than one prod_id per proposal-evaluator group? If so, which prod_id do you want... first, last, greatest, least? Or maybe you should group by prod_id in addition to PA.proposal_id and PA.evaluator_ID.

d5e5 109 Master Poster
#!/usr/bin/perl
#score_many_mutant.pl
use strict;
use warnings;

my $sequence='A G G G C A C C T C T C A G T T C T C A T T C T A A C A C C A C
A T A A T T T T T A T T T G T A T T A T T C A G A T T T T T C A T G A A C T T T
T C C A C A T A G A A T G A A G T T G A C A T T G T T A T T T C T C A G G G T C
T C G G T T C A C C A G T A T T T G A C A A A C T T G A A G C T G A A C T A G C
T A A A G C T G C T A T G T C A T T G C C T G C A A C C A A G G G C T T T C A G
T T T G G T A G T G G G T T T G C A G G C A C C T T T T T G A C T G G G A G T G
A A C A C A A T G A …
d5e5 109 Master Poster

Sorry about the square brackets this was my first time using the wrap code feature on this site. I thank you for all of the help and will try to see if the previous code you made can help me with the calculation of the scores.

Assuming the script you posted mutates a string of bases by changing one letter at one position, couldn't you calculate the scores as follows?

#!/usr/bin/perl
#score_mutant.pl
use strict;
use warnings;

my $sequence = 'AGCT'; #Short string (Should work for long strings too.)
my   $mutant = 'AGAT'; #Copy of above string except C has mutated to A.

print "Sequence,Mutant,Score\n";
foreach(0 .. length($sequence) - 1){
    my $s = substr($sequence, $_, 1);
    my $m = substr($mutant, $_, 1);
    my $score = determine_score($s, $m);
    print "$s,$m,$score\n";
}

sub determine_score{
    my ($alpha, $beta) = sort @_; #Sort two base args in alphabetical order
    
    #If the base pair did not change assign 0.
    return 0 if $alpha eq $beta;
    
    #If a purine was mutated to a purine,
    #or a pyrimidine to a pyrimidine assign a value of +1 to that base pair.
    #If a purine was mutated to a pyrimidine or vice versa
    #assign a value of -1 to that base pair.
    my %rules;
    $rules{'A'}{'G'} = +1;
    $rules{'A'}{'T'} = -1;
    $rules{'A'}{'C'} = -1;
    $rules{'G'}{'T'} = -1;
    $rules{'C'}{'G'} = -1;
    $rules{'C'}{'T'} = +1;
    
    return $rules{$alpha}{$beta};
}

This gives the following output:

Sequence,Mutant,Score
A,A,0
G,G,0
C,A,-1
T,T,0
d5e5 109 Master Poster

Why do you have square brackets around some of your statements? Once I removed the square brackets it seemed to run OK. It takes a long string of bases and modifies one of the bases at a random position. I don't understand how to use the array of scores either. If you were to generate an array of scores for the sequence compared to the mutant, you would end up with an array of all zero scores except possibly one non-zero element, because all the bases are the same except one. But I don't understand what the scores mean or how they are used.

d5e5 109 Master Poster

I have tried numerous times to incorporate all aspects of my script but I am only getting it to shuffle but not the random shuffle and not the mutation. Can you tell me what order I need to use with my subroutines in order to mutate the DNA sequence and then perform the random shuffle and then calculate the z score? I first put the srand expression and then my sequence and then a subroutine to shuffle the sequence. I don't understand how I am supposed to compare the original sequence and the mutated sequence. Thank you for all your help

The script you quoted reads a file into an array called @original, copies @original to @shuffled and then shuffles @shuffled the required number of times (between 10 and 20). You can add logic to the script you quoted to define the rules for assigning z-scores to base pairs constructed from @original and @shuffled. This logic is demonstrated in the script in the post at http://www.daniweb.com/forums/post1380986.html#post1380986. At the end you will have an array of z-scores called @scores. The first score in @scores is determined by the first base pair in @base_pairs, and so forth. I don't know what you want to do with the @scores array. Maybe just print it? I don't know what you mean when you say the shuffle is not a random shuffle.

d5e5 109 Master Poster

Hi,
Actually, I want the mail to look the same on all systems.

Since I used \t as the delimiter, I was not getting that.

I want to use printf but I don't know how to use it.

What I actually want is, for example, taking the first line:

phaneesh should come under user

/proj/sw_apps/phaneesh under path

5.7GB under used space

1.20% under used%

Exactly. But because I was using \t I could not do it, as its width differs from system to system.

So in short, I need help printing 4 scalars under 4 headings, neatly aligned, in the mail.

Please try to help. Thanks.

I haven't tried the sendmail program but I guess if you can get it to line up OK on the display that's a start. Let's try lining up the column headers with the first line of data.

#!/usr/bin/perl
use strict;
use warnings;

printf ("%7s","user");
printf ("%25s","path");
printf ("%22s","Usedspace");
printf ("%23s", "Used%\n");
print '_' x 79, "\n";

#From your desired output, I guess your variables have the following values
my $user1 = 'phaneesh';
my $path1 = '/proj/sw_apps/phaneesh';
my $space1 = '5.7';
my $unit = 'GB'; #Named $unit rather than $b, which Perl reserves for sort
my $usedp1 = '1.20';
my $p = '%';

printf ("%-17s%-20s%13s%2s%21s%s\n", $user1, $path1, $space1, $unit, $usedp1, $p);

This gives the following output:

user                     path             Usedspace                 Used%
_______________________________________________________________________________
phaneesh         /proj/sw_apps/phaneesh          5.7GB                 1.20%

I find http://www.devdaily.com/blog/post/perl/reference-page-perl-printf-formatting-format-cheat-sheet a handy guide to printf.
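
Rather than picking the widths by hand, you could work them out from the data. This is a sketch of that idea, assuming all rows are available before anything is printed; column_format is a hypothetical helper, not something from your script.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Builds a printf format whose column widths fit the longest cell in each column.
sub column_format {
    my @rows = @_;
    my @width;
    foreach my $row (@rows) {
        foreach my $i (0 .. $#$row) {
            my $len = length $row->[$i];
            $width[$i] = $len if !defined $width[$i] || $len > $width[$i];
        }
    }
    # Left-justify every column and separate the columns with two spaces
    return join('  ', map { "%-${_}s" } @width) . "\n";
}

my @rows = (
    ['user',     'path',                   'Usedspace', 'Used%'],
    ['phaneesh', '/proj/sw_apps/phaneesh', '5.7GB',     '1.20%'],
);
my $format = column_format(@rows);
printf($format, @$_) foreach @rows;
```

The columns only line up in a fixed-width font, so a mail client that shows the message in a proportional font will still look ragged.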

d5e5 109 Master Poster

I don't know what caused the error message you were getting but if you read the entire file into a string variable, make sure all letters are upper-case, remove all non-letter characters (including spaces and carriage-returns), and split the string into an array it should work OK.

#!/usr/bin/perl
use strict;
use warnings;
use List::Util qw(shuffle); #This module includes a method to shuffle arrays.

#Create integer between 10 and 20
my $times_to_shuffle = 10 + int(rand(11));
my $filename = 'Test_sequence.txt';
my $string = slurp_file($filename);
$string = uc($string); #Make sure all letters are upper case
$string =~ s/[^AGCT]//g; #Remove all characters that are not A, G, C, or T

my @original = $string =~ m/[AGCT]/g; #Assign all bases to an array
my $base_count = @original; #Count elements in array
print "Number of bases in sequence is $base_count \n";

print "Shuffle sequence $times_to_shuffle times.\n";
my @shuffled = @original; #Copy original array to array to be shuffled.
foreach (1..$times_to_shuffle){
    @shuffled = shuffle(@shuffled);
}

print "@shuffled\n";

sub slurp_file {
    my $file = $_[0];
    local $/; #Undefine the input record separator to read the whole file at once
    open( my $fh, '<', $file ) or die "Could not open $file ... $!";
    my $text = <$fh>;
    close $fh;
    return $text;
}

Gives the following output:

Number of bases in sequence is 829 
Shuffle sequence 11 times.
G T A A G A T A A T A A A A G T G T T G T A G C A A G G A T T A G T A T A T A C G C C …
d5e5 109 Master Poster

My sequence is quite large so I was creating a shuffled sequence to the original sequence but I keep on getting a message that there is an "uninitialized value." My sequence has over 1,000 bases.

Can you attach your sequence as a text file to your post? (See the "Manage Attachments" button.)

d5e5 109 Master Poster

Try this modified version of one of dch26's solutions:

#!/usr/bin/perl
use strict;
use warnings;

my $df_stats = qx{df /home/}; #I don't have dir called /compare/ so I used /home/

my @fields=split(/\s+/,$df_stats); #Split on one or more whitespace characters

#Can we assume first 6 fields are column headers?
#If so, instead of the following
#print "available space=$fields[3], used=$fields[4]\n";

#Try this:
my $available = $fields[6+3];
my $used = $fields[6+4];

print "available space=$available, used=$used\n";

Running this on my computer (Linux platform) gives the following output:

available space=6655136, used=66830396
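
Counting header words is a bit fragile, because "Mounted on" splits into two words. A sketch of an alternative is to split the output into lines first and take the fields from the data line. I am assuming the POSIX output format of df -P here (a header line, then one line per filesystem); parse_df is a hypothetical helper, and the sample text uses made-up numbers so the sketch runs without calling df.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Parses text in `df -P` format and returns selected fields of the first data line.
sub parse_df {
    my ($text) = @_;
    my @lines = split /\n/, $text;
    shift @lines;    # discard the header line
    my ($fs, $blocks, $used, $available, $capacity) = split ' ', $lines[0];
    return (used => $used, available => $available, capacity => $capacity);
}

# Canned text with made-up numbers; in a real script it would be something
# like: my $text = qx{df -P /home/};
my $text = <<'END';
Filesystem     1024-blocks     Used Available Capacity Mounted on
/dev/sda1        100000000 30000000  70000000      30% /
END

my %df = parse_df($text);
print "available space=$df{available}, used=$df{used}\n";
```

This still assumes the filesystem name contains no spaces, so treat it as a starting point rather than a finished parser.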
d5e5 109 Master Poster

How about something like this. Make an array of strings representing the base pair combinations of original and shuffled sequences. Then assign a score to each base pair according to the rules which can be represented by a hash.

#!/usr/bin/perl
use strict;
use warnings;

#If a purine was mutated to a purine,
#or a pyrimidine to a pyrimidine assign a value of +1 to that base pair.
#If a purine was mutated to a pyrimidine or vice versa
#assign a value of -1 to that base pair.
#If the base pair did not change assign 0.

my %rules = (
            AA => 0,
            AG => +1,
            AT => -1,
            AC => -1,
            GA => +1,
            GG => 0,
            GT => -1,
            GC => -1,
            TA => -1,
            TG => -1,
            TT => 0,
            TC => +1,
            CA => -1,
            CG => -1,
            CT => +1,
            CC => 0
            );

#Dummy sequence for testing
my @original = qw(C G T T T G T A A A T T G C A T C A A G);
my @shuffled = qw(G T T C A A A G T A G C T T A G A C T T);

my @scores;
my @base_pairs = make_base_pairs(\@original, \@shuffled);
foreach my $bp (@base_pairs){
    push @scores, $rules{$bp}
}
print join("\t", @base_pairs), "\n";
print join("\t", @scores);
sub make_base_pairs{
    my @orig = @{$_[0]}; #de-reference input array
    my @shuf = @{$_[1]}; #de-reference input array
    my $idx = 0;
    my @bps;
    foreach my $base (@orig){
        push @bps, …
d5e5 109 Master Poster

Thanks for the suggestion. I looked through it but I did not see anything. In order to get the same base distribution I was thinking that I had to use the srand expression. I would like to know if I am going in the right direction. Thank you

If you want rand to return the same sequence each time you run your program then using srand with a constant makes sense to me. If you don't explicitly call srand, it is called implicitly each time you run your program using time etc. as arguments so that shuffling can give you different results. (For most purposes, you do want shuffling to give unpredictable results.) I haven't tried it but that's what I read in http://perldoc.perl.org/functions/srand.html.
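
Here is a small sketch of the constant-seed idea. To keep it independent of how any module draws its random numbers, it uses a hand-written Fisher-Yates shuffle built directly on rand; seeded_shuffle is a made-up name and the seed 42 is arbitrary.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Fisher-Yates shuffle built directly on rand, so it follows whatever seed srand set.
sub seeded_shuffle {
    my ($seed, @items) = @_;
    srand($seed);    # same seed, same sequence of rand results
    for (my $i = $#items; $i > 0; $i--) {
        my $j = int(rand($i + 1));            # random index from 0 to $i
        @items[$i, $j] = @items[$j, $i];      # swap the two elements
    }
    return @items;
}

my @bases = qw(C G T T T G T A A A);
print join(' ', seeded_shuffle(42, @bases)), "\n";
print join(' ', seeded_shuffle(42, @bases)), "\n";    # same seed, same order
```

Both lines print the same arrangement within one run. Whether a given seed produces the same arrangement on a different Perl build is not something I would rely on.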

d5e5 109 Master Poster

You're welcome. I really don't know anything about z-scores as I haven't studied biology for about 40 years.:) You could look for a list of modules on CPAN but I don't know which module would best suit your purpose. Maybe this one?

d5e5 109 Master Poster

You are welcome choosenalpha. Please don't forget to mark this thread solved.

d5e5 109 Master Poster

What I posted above is not quite right. You don't want to start with the original sequence each time you shuffle. You want to replace the original sequence with the result of shuffling and then shuffle the new sequence, and so on for a random number of repetitions.

Plus, you don't have to write your own shufflearray subroutine. Perl 5.7 and later comes with the List::Util module that you can import and use for its shuffle() function. The revised outline script would look like this.

#!/usr/bin/perl
use strict;
use warnings;
use List::Util qw(shuffle); #This module includes a method to shuffle arrays.

#Create integer between 10 and 20 (inclusive)
my $times_to_shuffle = 10 + int(rand(11)); #int(rand(11)) gives 0 through 10

#Dummy sequence for testing
my @sequence = qw(C G T T T G T A A A T T G C A T C A A G);

print "Shuffle sequence $times_to_shuffle times.\n";
foreach (1..$times_to_shuffle){
    @sequence = shuffle(@sequence);
    print "@sequence\n";
}

This gives the following output:

Shuffle sequence 17 times.
T C A G A T A A T G A G T T T G C C T A
C T G A A G C T T A A T T G T C T A G A
G G G T C A C T T G A A A A T T C T T A
G A A G A C T G T C T C G T A T T T A A
G T G A …
d5e5 109 Master Poster

I would create the main outline of the logic first. If it needs to call a subroutine, you can create a dummy subroutine (sometimes called a 'stub') and modify it later after figuring out the details of how it will accomplish its task. A subroutine stub consists of the subroutine name plus logic to assign the arguments to its own variables, plus a comment to indicate it is just a stub and the real logic needs to be filled in later.

Your first task, I think, is to calculate the number of times you need to call the shufflearray subroutine and assign this number to a variable. If you haven't already figured out how to extract the FASTA sequence (whatever that is), start with a hardcoded sequence to test your subroutine for the first time. An example of a draft of your main logic might be something like this:

#!/usr/bin/perl
use strict;
use warnings;

#Create integer between 10 and 20 (inclusive)
my $times_to_shuffle = 10 + int(rand(11)); #int(rand(11)) gives 0 through 10

#Dummy sequence for testing
my @sequence = qw(C G T T T G T A A A T T G C A T C A A G);
my @shuffled_sequence;

print "Shuffle sequence $times_to_shuffle times.\n";
foreach (1..$times_to_shuffle){
    @shuffled_sequence = shufflearray(\@sequence); #Pass reference to array as argument
    print "@shuffled_sequence\n";
}

sub shufflearray{
    #This sub is just a stub
    #Logic needed to randomly resequence @in and return as @out
    my @in = @{$_[0]}; #Dereference passed array ref
    
    my @out = @in;
    return @out;
}
d5e5 109 Master Poster

Yes.

#!/usr/bin/env python

filename = '/home/david/Programming/Python/data.txt'

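with open(filename) as f:  # the with block closes the file when done
    for line in f:
        if line.startswith('apple'):
            print line,  # trailing comma: line already ends with a newline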
d5e5 109 Master Poster

All your scripts should include the following:

use strict;
use warnings;

These will give error messages or warnings if your script does things that are likely to produce unexpected results, such as using a variable that was never declared. Fix your script according to the error messages or warnings that the strict and warnings pragmas give.

For example, let's run the first two statements from what you posted:

#!/usr/bin/perl
use strict;
use warnings;

random_int(10, 20);
print "$x\n";

This gives the following output:

Global symbol "$x" requires explicit package name at /home/david/Programming/Perl/temp.pl line 6.
Execution of /home/david/Programming/Perl/temp.pl aborted due to compilation errors.

Also, you print $x but you don't show us any statement assigning a value to $x... so what is $x supposed to contain?

d5e5 109 Master Poster

Please wrap your script (or the perl code you wish to show us) in [CODE]Your program goes here[/CODE] tags.

You say part of the script is giving you "the issue". Can you show us the issue, preferably by giving an example of an input sequence, an example of the output you expect and an example of the unsatisfactory output that you are getting?

d5e5 109 Master Poster

To fix it, add a newline character immediately after FOOTER. Like this:

#!/usr/bin/perl
use strict;
use warnings;
print <<FOOTER;
</body> <! --end tag for main page section -->
</html>	<!-- end tag for entire HTML page -->
FOOTER
#Had to add newline after FOOTER
#to avoid "Can't find the string terminator..." error

Now output is

</body> <! --end tag for main page section -->
</html>	<!-- end tag for entire HTML page -->

See http://www.perlmonks.org/?node_id=686004 for some explanations.

d5e5 109 Master Poster

And OP must add the recursion?
By the way, to only check whether something is a palindrome, this is enough:

def ispalindrome(x): return x == x[::-1]

(The recursive version can be written with same amount of lines)

I like that. It's much shorter than what I was thinking of. However the OP says that palindromes can be sentences as well as single words, so the function perhaps should also remove all punctuation and spaces from the input string before testing it the first time. That way it could handle palindromic sentences such as, "Red Roses run no risk, sir, on nurses order."
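Here is a rough sketch of what I mean, not a finished solution. The names normalize and is_palindrome_sentence are my own, and I am assuming that "punctuation and spaces" means anything that is not a letter or a digit, and that upper and lower case should count as the same letter.

```python
def normalize(text):
    """Lower-case the text and keep only letters and digits."""
    return ''.join(ch.lower() for ch in text if ch.isalnum())


def is_palindrome_sentence(text):
    """Clean the input once, then apply the same slice comparison."""
    cleaned = normalize(text)
    return cleaned == cleaned[::-1]


print(is_palindrome_sentence("Red Roses run no risk, sir, on nurses order."))
```

The cleaning happens once, before the test, so a recursive version could call normalize first and then recurse on the cleaned string only.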

d5e5 109 Master Poster

The following statement fails to specify the file open mode. The default mode of read only is assumed.

open(OUT,"LuContig091010RNAcomp.fa")or die $!;#No redirection character so file is opened in read mode

I prefer the 3-argument version of the open, like this:

open( OUT, ">", "LuContig091010RNAcomp.fa" ) || die "Can't create output file $!";

See http://perldoc.perl.org/perlopentut.html#Simple-Opens

d5e5 109 Master Poster

@snippsat

Thank-you, but can you please will this work if i'm reading the file from a list? My project requires me to create a def function that can convert the list of numbers into a number before I can get the average. Btw am I able to put anything in the brackets at the last line you've indicated? Sorry for being so noob :(

"...reading the file from a list?" You mean reading a file into a list? If so, please post the contents of the file between [CODE=text] [/code] tags so we can see how many numbers per line, what character occurs between the numbers (space, tab, comma, etc.) and what other characters are in the file besides numbers.

By "...convert the list of numbers into a number before I can get the average" do you mean add the list of numbers to get a total? That's what sum(numList) does in snippsat's function.
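Until we see the file, here is a small sketch of the idea. I am assuming the numbers are separated by spaces or tabs and that nothing else is in the file; read_numbers and average are names I made up, not snippsat's, and sample_lines just stands in for an open file.

```python
def read_numbers(lines):
    """Collect every whitespace-separated number from an iterable of lines."""
    numbers = []
    for line in lines:
        numbers.extend(float(field) for field in line.split())
    return numbers


def average(numbers):
    """Add the list up with sum() and divide by how many numbers there are."""
    return sum(numbers) / float(len(numbers))


# Stand-in for the lines of a real file (assumed format)
sample_lines = ["3 4 5\n", "8\n"]
num_list = read_numbers(sample_lines)
print(sum(num_list))      # total of the list
print(average(num_list))  # total divided by the count
```

With a real file you would pass the open file object to read_numbers instead of sample_lines, since looping over a file also gives one line at a time.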

d5e5 109 Master Poster

I have Python 2.6 and it worked fine for me without my uncommenting #n = fin.readline() or anything else. I made no changes except the file path and it showed me a word count for my file (not the constitution but some other file in my folder).

When it doesn't work for you, does the wordTotal variable equal zero, or in what way does it not work?

d5e5 109 Master Poster

Please try this approach.

#!/usr/bin/perl
use strict;
use warnings;

my (@in, @out);
while(<DATA>){
    chomp;
    push @in, [split(/\t/)];#Build an array of arrays
}

my $prev_aref;
foreach my $aref (@in){#foreach array reference
    if (!defined($prev_aref)
        or $$aref[1] ne $$prev_aref[1]
        or abs($$aref[2] - $$prev_aref[2]) >= 250){#At least 250 away from prev loc
        push @out, $aref;
        $prev_aref = $aref;
    }
}

@out = sort my_sort @out;

foreach my $aref (@out){
    print join("\t", @$aref), "\n"; #Parentheses keep "\n" out of the joined list
}

sub my_sort{
    my $r = $$a[1] cmp $$b[1]; #Compare group
    if ($r == 0) { #If both rows are in the same group
        $r = $$a[2] <=> $$b[2]; #Compare location
    }
    return $r;
}
__DATA__
12	scaffold534_10_	147	-103	397	D	mdv1-miR-M3*_MI	-1	1.77e+02
12	scaffold534_10_	1391	1141	1641	D	mdv1-miR-M3*_MI	-2	2.92e+03
19	scaffold534_10_	1525	1275	1775	D	cin-miR-4218-5p	-2	4.62e-01
16	scaffold534_10_	5765	5515	6015	D	mmu-miR-546_MIM	-2	2.07e+01
21	scaffold534_10_	6625	6375	6875	D	ath-miR414_MIMA	-2	3.54e-02
12	scaffold534_13_	1969	1719	2219	D	mdv1-miR-M3*_MI	-2	2.92e+03
18	scaffold534_15_	208	-42	458	D	cin-miR-4194-3p	-2	1.65e+00
12	scaffold534_16_	8087	7837	8337	D	mdv1-miR-M3*_MI	-2	2.92e+03
19	scaffold534_16_	16182	15932	16432	D	gma-miR1533_MIM	-2	4.62e-01
19	scaffold534_16_	16185	15935	16435	D	gma-miR1533_MIM	-2	4.62e-01
19	scaffold534_16_	16188	15938	16438	D	gma-miR1533_MIM	-2	4.62e-01
19	scaffold534_16_	16191	15941	16441	D	gma-miR1533_MIM	-2	4.62e-01
19	scaffold534_16_	16194	15944	16444	D	gma-miR1533_MIM	-2	4.62e-01
12	scaffold534_16_	17672	17422	17922	D	mdv1-miR-M3*_MI	-2	2.92e+03

This gives the following output.

12	scaffold534_10_	147	-103	397	D	mdv1-miR-M3*_MI	-1	1.77e+02	
12	scaffold534_10_	1391	1141	1641 …