KevinADC

Re: count characters in a string

  #21  
Apr 11th, 2008
Your code does not seem to work properly, katharnakh. I did not try to determine why. I don't think the "exists" function works on arrays:

exists ${$cnter{$word}}[$index]

like it does on hash keys:

exists $cnter{$word}{$title}

so that might be a problem.

godevars

  #22  
Apr 13th, 2008
I tried it as well, and the output shows the same numbers for each word in every section. I think it does have something to do with [$index].

godevars

  #23  
Apr 13th, 2008
I added some spacing and then decided to total all the words in each section. I declared a new variable, $count, and add to it the count that was just pushed onto the end of the @t array. My thought was that this would total all the words in the section, which I could then print as the last line. The second foreach loop is for one section, correct?

foreach my $word (sort keys %cnter){
      print OUT "$word : ";
      my @t = ();
      my $count = 0;
      foreach my $title (@order) {
            push @t, (exists $cnter{$word}{$title}) ?  $cnter{$word}{$title} : 0;
            my $count += $t[-1] # this would add the number just pushed to end of array
      }
      print OUT join("\t", @t),"\n";
      print OUT "$count"; #prints total then goes to next title
} 

I may be interpreting this incorrectly.

Thanks-

KevinADC

  #24  
Apr 13th, 2008
Did you run the code? The first obvious problem is declaring the variable twice with "my". The code you posted looks like it will always display zero because of that: the second "my" creates a new $count inside the inner loop, so the outer $count is never changed.

But even if you declare the variable properly, it will not total the words per section. It looks like it will total the occurrences of a single word across all the sections it is found in. Say the word were "foo" and it was found 3 times in section I and 2 times in section III: the code will print 5, I think. I did not run the code to check, but I can tell it is not totalling the words per section.

I would compute that total while the data is being read in from the file rather than afterwards, while the data is being printed to the OUT file, although the latter is more than likely possible.
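
The two points above can be sketched as follows. This is only an illustration under stated assumptions: the @lines array is made-up sample text standing in for readme.txt, and %section_total and %word_total are hypothetical names that do not appear in the posted script. $count is declared once per word, and the per-section totals are accumulated separately while the lines are read.

```perl
use strict;
use warnings;

# Hypothetical sample input standing in for readme.txt.
my @lines = (
    '=Section I=',
    'foo bar foo',
    'foo baz',
    '=Section III=',
    'foo foo bar',
);

my (%cnter, %section_total, %word_total, @order, $title);

for (@lines) {
    my @line = split /\s+/;
    if ($line[0] =~ /^=/) {
        tr/=//d for @line;                  # strip the "=" markers from the title
        $title = "@line";
        push @order, $title;
    }
    else {
        $cnter{$_}{$title}++ for @line;     # count per word, per section
        $section_total{$title} += @line;    # number of words read in this section
    }
}

for my $word (sort keys %cnter) {
    my @t;
    my $count = 0;                          # declared once, outside the inner loop
    for my $section (@order) {
        push @t, (exists $cnter{$word}{$section}) ? $cnter{$word}{$section} : 0;
        $count += $t[-1];                   # no second "my" here
    }
    $word_total{$word} = $count;            # total for this word over all sections
    print "$word : ", join("\t", @t), "\ttotal: $count\n";
}

# Totals per section, gathered during the read loop.
print "$_ has $section_total{$_} words\n" for @order;
```

With this sample, "foo" appears 3 times in Section I and 2 times in Section III, so its $count is 5, while the section totals are a separate set of numbers.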
Last edited by KevinADC : Apr 13th, 2008 at 6:28 pm.

godevars

  #25  
Apr 14th, 2008
I did run it and you are right: I was getting zeros. I thought it was because of how I set it up. I corrected it and I am now getting the count for the word across the sections. I am now looking at creating a hash for each word where the keys are the sections and the values are the counts. Then, outside the loops, I could add up the values for each key to get a total number of words per section. The section is $title and the counts are what is being pushed in the second 'foreach'.

%section_hash = ($title => $t[-1])

I tried this, and each assignment replaced my values, so only the last section was left. I reviewed this and thought it might be easier to make an array of the numbers for each section instead. Is there a way to change the name of an array within the loop? In this example, I want to end up with 2 section arrays.

my @section_array = ();
push @section_array, @t; 

I'd like the @section_array to change based on $title. I'll try again tomorrow.
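
One possible approach, sketched here with hypothetical data: assigning a list to %section_hash replaces the whole hash each time, which would explain why only the last section survived. Instead of renaming an array inside the loop, a single hash keyed by $title can hold one array reference per section. The names %section_counts and %section_total, and the sample counts, are assumptions for illustration only.

```perl
use strict;
use warnings;

# Hypothetical counts in the same shape as %cnter from the thread.
my %cnter = (
    foo => { 'Section I' => 3, 'Section II' => 2 },
    bar => { 'Section I' => 1 },
);
my @order = ('Section I', 'Section II');

my (%section_counts, %section_total);
for my $word (sort keys %cnter) {
    for my $title (@order) {
        my $n = exists $cnter{$word}{$title} ? $cnter{$word}{$title} : 0;
        push @{ $section_counts{$title} }, $n;  # one array per section, keyed by title
        $section_total{$title} += $n;           # updates one key, leaves the others alone
    }
}

for my $title (@order) {
    print "$title: @{ $section_counts{$title} } (total $section_total{$title})\n";
}
```

Writing $section_total{$title} = ... or += ... touches only that key, whereas %section_total = (...) throws away everything stored so far.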

Thanks-

katharnakh

  #26  
Apr 14th, 2008
Hi Kevin,
Originally Posted by KevinADC:
Your code does not seem to work properly, katharnakh. I did not try to determine why. I don't think the "exists" function works on arrays:

exists ${$cnter{$word}}[$index]

like it does on hash keys:

exists $cnter{$word}{$title}

so that might be a problem.

The program was executing, but the output was wrong. I realised later that it is important to maintain the indexes of the titles encountered. I have made the changes: %index is now declared globally instead of locally in the else block.
use strict;
use warnings;
open(IN, "readme.txt") or die "ERROR: $!";
open(OUT, ">seeme.txt") or die "ERROR: $!";
my (%cnter, $title, @order, %index);

while(<IN>) {
    next if (/^\s*$/);
    chomp;
    my @line = split(/\s+/);
    if($line[0] =~ /^=/) {
        $line[0] =~ tr/=//d; # remove all the "=" from the section title
        $title = "@line";
        push @order, $title;
    }
    else {
        tr/,.?!//d for @line; # remove some punctuation
        tr/A-Z/a-z/ for @line; # convert all text to lower case so 'Word' and 'word' are the same

        # hash storing section => its index in the @order array
        @index{@order} = (0..$#order);

        #$cnter{$_}{$title}++ for @line;
        ${$cnter{$_}}[$index{$title}]++ for @line;
    }
}

print OUT join("\t",@order),"\n";
foreach my $word (sort keys %cnter){
    print OUT "$word :";
    my @t = ();
    foreach my $title (@order) {
        #push @t, (exists $cnter{$word}{$title}) ? $cnter{$word}{$title} : 0;
        push @t, (defined ${$cnter{$word}}[$index{$title}]) ? ${$cnter{$word}}[$index{$title}] : 0;
    }
    print OUT join("\t", @t),"\n";
}
close(IN);
close(OUT);

Thanks Kevin...
katharnakh.

katharnakh

  #27  
Apr 14th, 2008
Originally Posted by KevinADC:
Your code does not seem to work properly, katharnakh. I did not try to determine why. I don't think the "exists" function works on arrays:

exists ${$cnter{$word}}[$index]

like it does on hash keys:

exists $cnter{$word}{$title}

so that might be a problem.

Hi Kevin,
the code does work with the line push @t, (exists ${$cnter{$word}}[$index{$title}]) ? ${$cnter{$word}}[$index{$title}] : 0; in place. I am not sure why; maybe you can tell, if you have an idea, because the code does include the use strict; use warnings; pragmas and it does not seem to throw any warnings or errors... I forgot to mention that point in the earlier post.

katharnakh.
Last edited by katharnakh : Apr 14th, 2008 at 3:59 am.

godevars

  #28  
Apr 14th, 2008
Still learning on my end.
I am reviewing the code and am trying to understand this:

$cnter{$_}{$title}++ for @line;

I see this as a hash, %cnter, being populated. $_ is the word and $title is the section. I see that the keys are $_ in this loop, which are all the words. The value would then be the $title, and the ++ is to count each individual word appearing in the section. Is the 'for @line' portion used for reading each line as it comes through?

I think this means the code already has a hash of all the words in each section: keys %cnter. I am trying to figure out how to identify the hash for each section.

Thanks-

KevinADC

  #29  
Apr 14th, 2008
Originally Posted by katharnakh:
Hi Kevin,
the code does work with the line push @t, (exists ${$cnter{$word}}[$index{$title}]) ? ${$cnter{$word}}[$index{$title}] : 0; in place. I am not sure why; maybe you can tell, if you have an idea, because the code does include the use strict; use warnings; pragmas and it does not seem to throw any warnings or errors... I forgot to mention that point in the earlier post.

katharnakh.

Then I guess the exists function does work for arrays as well as hashes. I will try some simple tests later and see what happens.
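
A simple test along those lines might look like the sketch below (the @counts array and the other variable names are made up for illustration). For an array element, exists reports whether the element has been initialized, even if its value is undef, while defined reports whether it holds a value. Note that the Perl documentation for exists discourages calling it on array elements, so defined is usually the safer check for counters like these.

```perl
use strict;
use warnings;

my @counts;
$counts[2] = 4;                 # only index 2 has been assigned

my (@seen, @has_value);
for my $i (0 .. 3) {
    push @seen,      exists  $counts[$i] ? 1 : 0;
    push @has_value, defined $counts[$i] ? 1 : 0;
}
print "exists : @seen\n";
print "defined: @has_value\n";

$counts[0] = undef;             # assigned, but holds no value
my $exists_0  = exists  $counts[0] ? 1 : 0;
my $defined_0 = defined $counts[0] ? 1 : 0;
print "after assigning undef: exists=$exists_0 defined=$defined_0\n";
```

Neither call produces a warning under use strict; use warnings;, which matches what katharnakh observed.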

KevinADC

  #30  
Apr 14th, 2008
Originally Posted by godevars:
Still learning on my end.
I am reviewing the code and am trying to understand this:

$cnter{$_}{$title}++ for @line;

I see this as a hash, %cnter, being populated. $_ is the word and $title is the section. I see that the keys are $_ in this loop, which are all the words. The value would then be the $title, and the ++ is to count each individual word appearing in the section. Is the 'for @line' portion used for reading each line as it comes through?

I think this means the code already has a hash of all the words in each section: keys %cnter. I am trying to figure out how to identify the hash for each section.

Thanks-


%cnter is a two dimensional hash ( a hash of hashes). $_ (the words) and $title (the section title) are bot hash keys. The value of a hash key can be another hash (and more things besides). ++ is the count of each word per section.

'for @line' just loops through the @line array and applies the value of each "line" to $_ which is used to build the hash up with. Its a short way of writing:

for (@line) {
    $cnter{$_}{$title}++;
}
 

It does mean there is a hash with every word counted per section.

%cnter = (
    word1 => {
        title1 => count,
        title2 => count,
    },
    word2 => {
        title1 => count,
        title2 => count,
    },
    # etc.
);

If a word was not found in a section, that section title will not be a key in that word's hash. This is why my code later checks all the section titles and applies a value of 0 (zero) if a word was not found in a particular section.
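
To make the shape of the hash concrete, here is a small sketch that builds a hash of hashes from two made-up sections and then fills in the zeros on output. The @sample data and the %row hash are hypothetical and exist only for this illustration; the counting line and the exists check are the same idioms discussed above.

```perl
use strict;
use warnings;

my @order  = ('title1', 'title2');
# Hypothetical text for each section.
my @sample = (
    [ title1 => 'the cat saw the dog' ],
    [ title2 => 'the dog' ],
);

my %cnter;
for my $pair (@sample) {
    my ($title, $text) = @$pair;
    $cnter{$_}{$title}++ for split /\s+/, $text;    # $_ is each word in turn
}

my %row;
for my $word (sort keys %cnter) {
    # A title missing from this word's inner hash means the count is zero.
    $row{$word} = join "\t",
        map { exists $cnter{$word}{$_} ? $cnter{$word}{$_} : 0 } @order;
    print "$word :\t$row{$word}\n";
}
```

Here "cat" only has a title1 key, so the title2 column for it is printed as 0 rather than left empty.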