
Sentence count query

Hi, I need to output the number of sentences in a file. The file looks like this:

For the eyeing of my scars, there is a charge,
For the hearing of my heart,
It really goes.

And there is a charge a very large charge,
For a word or a touch,
Or a bit of blood.

As you can see, there are 6 sentences.
The Perl code I have written for this is as follows:

open(OUT, "<$file") || die "Cant open $file: $!";
  
  $sentences = 0;
  my($ch);

while($ch = getc(OUT))
{
 if($ch eq"?" || $ch eq "!" || $ch eq ".")
 {
   $sentences++;
 }
} 

   close(OUT);

  print("Statistics for $file\n");
  print("Sentences: $sentences\n");


For this, sentences can end with ?, ! or ., which is why I have them in the code. The output of the code, however, says
Sentences: 0
which of course is wrong. I'm missing something, but what? I also need some pointers on counting characters and words, because I'm struggling to find resources that are helpful to me.
Thanks

inked
Newbie Poster
14 posts since Dec 2008
Reputation Points: 10
Solved Threads: 0
 

You are comparing strings:

if($ch eq"?" || $ch eq "!" || $ch eq ".")

instead of searching for patterns:

if($ch =~ /[?!.]/)

EDIT: actually, let me try something, I think getc() might be affecting the code too. I never use it so I have to check how it works.

KevinADC
Posting Shark
921 posts since Mar 2006
Reputation Points: 246
Solved Threads: 67
 

OK, when I run your code the count is 2. The problem appears to be that you are not chomping the input and you have "!" instead of "," in your conditional statement.

When "!" is changed to "," the count goes to 7 because the first sentence has 2 commas:

For the eyeing of my scars, there is a charge,

You also need to chomp $ch:

chomp($ch)

before comparing the strings:

# defined() keeps the loop running when the character read is "0"
while (defined(my $ch = getc(DATA)))
{
    chomp($ch);
    if ($ch eq "?" || $ch eq "," || $ch eq ".")
    {
        $sentences++;
    }
}
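
A minimal sketch of the character-by-character count, for anyone who wants to test it without a file on disk. The sub name count_terminators and the in-memory filehandle are illustrative assumptions, not part of the posted code. It tests getc with defined(), since a bare while ($ch = getc(...)) would stop early on the character "0", and it skips chomp because a newline never equals one of the terminators anyway.

```perl
use strict;
use warnings;

# Hypothetical helper: counts '.', '?' and '!' read one character at a time.
sub count_terminators {
    my ($fh) = @_;
    my $count = 0;
    # getc returns undef at end of file; defined() also protects against
    # the character "0", which is false in a plain boolean test.
    while (defined(my $ch = getc($fh))) {
        $count++ if $ch eq '.' || $ch eq '?' || $ch eq '!';
    }
    return $count;
}

my $text = "For the eyeing of my scars, there is a charge,\n"
         . "For the hearing of my heart,\n"
         . "It really goes.\n";

# Open a read-only filehandle on the string instead of a real file.
open(my $fh, '<', \$text) or die "Cannot open in-memory file: $!";
print "Sentences: ", count_terminators($fh), "\n";   # prints "Sentences: 1"
close($fh);
```

The first verse contains a single period, so only one terminator is counted; the commas at the line ends are not sentence terminators under this rule.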
KevinADC
 

The concept of sentences does not translate well to poetry or song lyrics, as they do not follow the same grammatical rules as formal writing.

KevinADC
 

Thank you so much :)
Any advice on counting characters and words? Like an idea of what sort of statements to use, or anywhere I can find some useful info?

inked
 

A Google search found this page:

http://en.literateprograms.org/Category:Programming_language:Perl

It lists a "word count" program, but I have not read it to see how well written it is.

And this one:

http://www.perlmonks.org/?node_id=457784
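
Independently of those pages, here is a rough sketch of one common way to count characters and words in Perl, assuming the whole text fits in memory as a single string. The sub name count_words_chars is made up for illustration, and a "word" is assumed to be any run of non-whitespace characters, which may not match every definition.

```perl
use strict;
use warnings;

# Hypothetical helper: returns (words, characters) for a string.
sub count_words_chars {
    my ($text) = @_;
    my $chars = length($text);            # every character, newlines included
    # Match all runs of non-whitespace in list context, then take the
    # number of matches by assigning through an empty list.
    my $words = () = $text =~ /\S+/g;
    return ($words, $chars);
}

my $sample = "For a word or a touch,\nOr a bit of blood.\n";
my ($words, $chars) = count_words_chars($sample);
print "Words: $words\n";
print "Characters: $chars\n";
```

If newlines should not count as characters, chomp each line before measuring it, or subtract the number of lines.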

KevinADC
 

Right, so I've got the sentences working, and I included some code to count the number of lines. I tested them separately and they work fine, but put them together in the code and I get this output:

Lines:7
Sentences: 0

Here is the complete code I have written so far:

#!C\perl\bin\perl.exe 



if($#ARGV == -1)
{
  print("Please enter a filename ");
  $file = <STDIN>;
  chomp($file);
}
   else
   {
  $file = $ARGV[0];
   }

      if($file !~ m/^[a-zA-Z\_]{1}[a-zA-Z0-9\_]{7}(\.txt|\.TXT)$/) 
      {
      die("Incorrect format!\n");  
      }

  if(!-e $file)
  {
    die("error! file does not exist!\n");
    }
 
  if(-z $file)
  { 
    die("File is empty!\n");
    }
  
   open(OUT, "<$file") || die "Cant open $file: $!";
    
    for($line=0; <OUT>; $line++){
    } 

  $sentences = 0;
  my($ch);

 while($ch = getc(OUT))
 {

   chomp($ch);
 
 if($ch eq ".")
 {
   $sentences++;
 }
} 
 

   

   close(OUT);

  print("Statistics for $file\n");
  print("Lines: $line\n");
  print("Sentences: $sentences\n");


Do I need to explicitly open and close each filehandle separately for each count, or am I just missing something?
Also, I found some code to count paragraphs:

$/ = '';
open(FH, $file) or die "Cant open $file: $!";
1 while <FH>;
$para_count = $.;

which I tried to include in the code. My understanding is that $/ = ''; sets a kind of paragraph mode for the file, which again works fine alone, but with the rest of the code it treats everything like a paragraph and I get different output: instead of Lines: 7 I get Lines: 2 and Paragraphs: 2.
I'm completely lost.
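
Two things seem to be interacting here, and the sketch below shows one possible way to handle both. First, once a loop has read a filehandle to the end, a later getc or <FH> on the same handle has nothing left to read, so the handle has to be rewound with seek (or closed and reopened). Second, $/ is a global setting, so assigning '' to it changes every later line read too; local inside a block limits the change. The sub name count_all and the in-memory filehandle are assumptions made for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper: counts lines, then periods, then paragraphs,
# reusing one filehandle and rewinding it between passes.
sub count_all {
    my ($fh) = @_;

    my $lines = 0;
    $lines++ while <$fh>;                 # leaves $fh at end of file

    seek($fh, 0, 0) or die "Cannot rewind: $!";   # back to the start
    my $sentences = 0;
    while (defined(my $ch = getc($fh))) {
        $sentences++ if $ch eq '.';
    }

    seek($fh, 0, 0) or die "Cannot rewind: $!";
    my $paragraphs = 0;
    {
        local $/ = '';                    # paragraph mode, this block only
        $paragraphs++ while <$fh>;
    }                                     # $/ is "\n" again from here on

    return ($lines, $sentences, $paragraphs);
}

my $poem = "For the eyeing of my scars, there is a charge,\n"
         . "For the hearing of my heart,\n"
         . "It really goes.\n"
         . "\n"
         . "And there is a charge a very large charge,\n"
         . "For a word or a touch,\n"
         . "Or a bit of blood.\n";

open(my $fh, '<', \$poem) or die "Cannot open in-memory file: $!";
my ($lines, $sentences, $paragraphs) = count_all($fh);
close($fh);
print "Lines: $lines\nSentences: $sentences\nParagraphs: $paragraphs\n";
```

For the poem from the first post this reports 7 lines, 2 sentences (only periods are counted) and 2 paragraphs.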

inked
 

Hi inked. I am doing exactly the same as you, and I find that with getc I am unable to count the lines, as getc (I believe) reads the file one character at a time. My code is the same as yours except...

if ($ch =~ m/\s/) {$ws++;} #word
$c++; #character

I tried to write code to count each line like this...
if ($ch =~ m/\n/g) {$L++;} #lines

but this doesn't work, as it doesn't include the blank space which is a new line. So it only counts 6 lines when there are 7.

The code I wrote for paragraphs is this...
if ($ch =~ m/^$/){ $p++;} #paragraph

...but neither of them works. I can do these without getc, but I'm not sure how to put them all together in one file yet. Maybe use a subroutine? Any clues?

EDIT - Maybe I should create a new variable, different from $ch, and use that variable to read the file without getc. I'll try tomorrow after work.
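
Along the lines of that last idea, here is a sketch of a single pass that reads whole lines instead of single characters. The sub name text_stats and the hash keys are hypothetical, and it assumes that a paragraph is a run of non-blank lines and a word is a whitespace-separated token. Reading by line also counts a final line that has no trailing newline, which a per-character test for "\n" would miss.

```perl
use strict;
use warnings;

# Hypothetical helper: one pass over the file, reading whole lines.
sub text_stats {
    my ($fh) = @_;
    my %n = (lines => 0, words => 0, chars => 0, paragraphs => 0);
    my $in_paragraph = 0;
    while (my $line = <$fh>) {
        $n{lines}++;                      # also counts a last line without "\n"
        $n{chars} += length($line);       # newline characters included
        if ($line =~ /\S/) {              # the line has non-blank content
            $n{paragraphs}++ unless $in_paragraph;
            $in_paragraph = 1;
            my @tokens = split ' ', $line;    # split on runs of whitespace
            $n{words} += scalar @tokens;
        }
        else {
            $in_paragraph = 0;            # a blank line ends the paragraph
        }
    }
    return %n;
}

my $text = "one two\nthree\n\nfour";      # note: no newline after "four"
open(my $fh, '<', \$text) or die "Cannot open in-memory file: $!";
my %stats = text_stats($fh);
close($fh);
print "$_: $stats{$_}\n" for sort keys %stats;
```

Wrapping the loop in a sub, as suggested above, keeps the counters together and makes it easy to call once per file.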

steve80
Newbie Poster
1 post since Jan 2009
Reputation Points: 10
Solved Threads: 1
 

inked,

add this line to your code:

use warnings;

and rerun it and list any warnings you get.

KevinADC
 

OK, did what you said and this is what I got:

Scalar found where operator expected line 38 near $sentences"
Missing semicolon on previous line?

"my" variable $ch masks earlier declaration in same scope line 42

There are a few syntax errors, but I'm sure I can find them.

inked
 

OK, fixed a few errors; I now get this message:

useless use of a variable in void context
line 40
name main::lines used only once possible typo line 60

inked
 

Post your code again, the code that returns the warning you mention. "Useless use of a variable" is generally something like:

$foo;

The above line does nothing; it's useless. That's a warning though, not an error. And the other warning is obvious: you have something used only once in your program (a scalar or filename or filehandle, etc.), which is also a warning.

KevinADC
 

Yeah, I found the typo warning, so it's just the "useless use" warning left:

#!C\perl\bin\perl.exe 

use warnings;



if($#ARGV == -1)
{
  print("Please enter a filename ");
  $file = <STDIN>;
  chomp($file);
}
   else
   {
  $file = $ARGV[0];
   }

      if($file !~ m/^[a-zA-Z\_]{1}[a-zA-Z0-9\_]{7}(\.txt|\.TXT)$/) 
      {
      die("Incorrect format!\n");  
      }

  if(!-e $file)
  {
    die("error! file does not exist!\n");
    }
 
  if(-z $file)
  { 
    die("File is empty!\n");
    }
  
   open(OUT, "<$file") || die "Cant open $file: $!";
    
   for($line=0; <OUT>; $line++)
   {  
   }
  $sentences = 0;
   
  ($ch);

 while($ch = getc(OUT))
 {

   chomp($ch);
 
 if($ch eq ".")
 {
   $sentences++;
 }   

} 
 

   

   close(OUT);

  print("Statistics for $file\n");
  print("Lines: $line\n");
  print("Sentences: $sentences\n");
inked
 

This line is doing nothing (it's useless):

($ch);


I am not sure what you think this does:

open(OUT, "<$file") || die "Cant open $file: $!";
for($line=0; <OUT>; $line++)
{ 
}


I'm surprised that doesn't return a syntax error. The loop body is empty, so all it does is read to the end of the file while incrementing $line. A more conventional way to count the lines in the file:

open(OUT, "<$file") || die "Cant open $file: $!";
$line++ while (<OUT>);
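
A small sketch comparing the two spellings, assuming perlop's description that a lone <FH> used as the condition of a for (;;) loop gets the same implicit assignment to $_ as it does in while. The variable names and the in-memory filehandle are illustrative only. Either way, the handle sits at end of file afterwards, so a getc loop that follows on the same handle has nothing left to read unless the handle is rewound or reopened.

```perl
use strict;
use warnings;

my $text = "line one\nline two\nline three\n";

# Form from the question: <FH> as the condition of a C-style for loop.
open(my $fh, '<', \$text) or die "Cannot open in-memory file: $!";
my $for_count;
for ($for_count = 0; <$fh>; $for_count++) { }
close($fh);

# Form from the reply: statement-modifier while.
open($fh, '<', \$text) or die "Cannot open in-memory file: $!";
my $while_count = 0;
$while_count++ while <$fh>;
my $at_eof = eof($fh) ? "yes" : "no";     # nothing left to read here
close($fh);

print "for: $for_count, while: $while_count, at end of file: $at_eof\n";
```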
KevinADC
 

Hey, changed my code and it's working fine. Thanks for all your help; I'll keep the parts of this code that work for future reference.

inked
 

This question has already been solved
