Hello friends,
I need to parse some data from a file and arrange it in an output file in a certain format. However, the input file is confusing and has enough small inconsistencies that I am now stuck. Can somebody help?
Thanks
Aj
I am attaching the main part of the input file that is causing me trouble.

I am using this code:

my @file1 =<INFILE>;
foreach $lines(@file1)
{
        if($i==3)
        {
                $i=1;
                print OUTFILE"\\";
                print OUTFILE"\n";
        }

        if ($lines =~m/^\\/)
        {
                $i=$i+1;
                $lines=~s/\\//g;
                print OUTFILE"\n";
                if ($i==2)
                {
                        print OUTFILE"$lines";
                }
        }
        if($lines =~m/^JASSS:|Date:|Title:/)
        {
                print OUTFILE"$lines";
        }
        elsif ($lines =~m/^Author: && ^Address:/)
        {
                print OUTFILE"\n";
                print OUTFILE"$lines";
        }
        elsif ($lines =~m/^Address:/)
        {
                print OUTFILE"\n";
                print OUTFILE"$lines";
        }
        else
        {
                chomp($lines);
                print OUTFILE"$lines";
        }
}

close (INFILE);
close(OUTFILE);
exit;

The output comes out like this:
ID: 1.1.2
Date: 31 Jan 1998
Title: Qualitative Modeling and Simulation of Socio-Economic Phenomena
Author: Giorgio Brajnik
Address: Dipartimento di Matematica e InformaticaUniversit&agrave; di Udine Udine Italy Italy
Author: Marji Lines {The author and address should not run together on one line. They come out correctly here, but if I change the code to get the 2nd record right, this one breaks.}
Address: Dipartimento di Scienze StatisticheUniversit&agrave; di Udine Udine Italy 33100 Italy


This paper describes an application of recently developed qualitative reasoning techniques to complex, socio-economic allocation problems.
\
ID: 2.3.3
Date: 30 Jun 1999
Title: Simulating Household Waste Management Behaviours
Author: Peter Tucker
Address: Environmental Initiatives GroupHigh Street PAISLEY PA1 2BE United Kingdom
Author: Andrew Smith
Address: Language Evolution and Computation Research UnitSchool of Philosophy Psychology and Language Sciences University of Edinburgh, {I don't want a new line here, but if I change my code to get this right, the 1st record breaks.}
Adam Ferguson Building, 40 George Square EH8 9LL Edinburgh, United Kingdom

The paper reports the outcome of research to demonstrate the proof of concept.

NOTE: {I would like output that satisfies both criteria.}

Attachments
\\

ID: 1.1.2

Date: 31 Jan 1998

Title: Qualitative Modeling and Simulation of Socio-Economic Phenomena

Author: Giorgio Brajnik

Address: Dipartimento di Matematica e InformaticaUniversit&agrave; di Udine	Udine	Italy	Italy	

Author: Marji Lines

Address: Dipartimento di Scienze StatisticheUniversit&agrave; di Udine	Udine	Italy	33100	Italy	

\\

This paper describes an application of recently developed  qualitative reasoning techniques to complex, socio-economic  allocation problems. We explain why we believe traditional  optimization methods are inappropriate and how qualitative reasoning  could overcome some of these shortcomings. A case study is presented where an authority is expected to devise a policy that  satisfies certain constraints. We describe how sets of rules of  thumb implementing such a policy can be analyzed and validated by  the decision maker using a program which automatically builds and  simulates qualitative models of the underlying dynamical system.  Such a program constructs and simulates models from incomplete  descriptions of initial states and functional relationships between  variables. We show that it nevertheless gives sufficient information  to the decision maker.

\\
ID: 2.3.3

Date: 30 Jun 1999

Title: Simulating Household Waste Management Behaviours

Author: Peter Tucker

Address: Environmental Initiatives GroupHigh Street	PAISLEY	PA1 2BE	United Kingdom	

Author: Andrew Smith

Address: Language Evolution and Computation Research UnitSchool of Philosophy	Psychology and Language Sciences	University of Edinburgh,

Adam Ferguson Building, 40 George Square	EH8 9LL	Edinburgh, United Kingdom	

\\

The paper reports the outcome of research to demonstrate the proof of concept for simulating individual, collective and interactive household waste management behaviours to provide a tool for efficient integrated waste management planning. The developed model simulates whole communities as distributions of individual households engaged in managing their own domestic waste, through home composting or recycling activities. The research addresses the personal hierarchical ordering of these activities, choices for participation and the factors affecting the waste diversion levels to each of the available outlets. These choices are driven by the underlying attitudes of the community residents, linked in part to socio-demographic factors but also containing a large random, or stochastic, element. Structures for modelling the stochastic variations are developed. The social elements of the simulation are used as control parameters determining the waste material flows through the household which provide a process simulation, or material balance, across the household. The developed models enable the investigation of possible management interventions to increase overall performance. Behavioural responses to other external stimuli can also be simulated. Model application to the simulation of environmental impacts from recycling are discussed briefly. The paper concludes with examples drawn from model validation trials on kerbside newspaper recycling schemes. 

\\
my @file1 =<INFILE>;
foreach $lines(@file1)
{
        chomp($lines);
        if($i==3)
        {
                $i=1;
                print OUTFILE"\\";
                print OUTFILE"\n";
        }
        if ($lines =~m/^\\/)
        {
                $i=$i+1;
                $lines=~s/\\//g;
                print OUTFILE"\n";             
                if ($i==2) 
                {              
                        print OUTFILE"\n"; 
                        print OUTFILE"$lines";         
                } 
        }
        if($lines =~m/^JASSS:|Date:|Title:/)
        {              
                print OUTFILE"\n"; 
                print OUTFILE"$lines";   
        }
        elsif ($lines =~m/^Author:/)
        {
                print OUTFILE"\n"; 
                print OUTFILE"$lines";
        }
        elsif ($lines =~m/^Address:/)
        {
                print OUTFILE"\n";
                $lines=~s/,$/, /g;
                print OUTFILE"$lines";
        }
        else
        {
                print OUTFILE"$lines";
        }
}

close (INFILE);
close(OUTFILE);
exit;


Sometimes it's simpler to read the entire document into one string variable and then apply a series of global substitute commands. The nice thing about this approach is that you can add one substitute command to your program, run it, visually inspect the output, then add another and test again until the output looks right.

I sometimes find this more intuitive because it's similar to what I would do by hand if I didn't have time to write a program: load the document into a good text editor that supports regular expressions for search and replace, and keep running search-and-replace commands until the document has been tidied up. I tested the following with your ID.txt input file:

#!/usr/bin/perl -w
#ParseFile.pl
use strict;
my ($f1, $f2) = @ARGV;
open (INFILE, $f1) || die "Can't open $f1: $!";
open (OUTFILE, ">$f2") || die "Can't open $f2: $!";
undef $/; #When $/ doesn't contain a record-end character Perl reads entire file
my $string = <INFILE>; #Read entire file into a string variable
$/ = "\n";
my $stringout = $string;
$stringout =~ s/^\\\\//gm; #Remove double backslashes at start of any line
$stringout =~ s/^(JASSS:|ID:|Date:|Title:|Address:|Author:)(.*)\n/$1$2/gm; #Remove extra newlines
$stringout =~ s/\n^Author:/Author:/gm; #Remove extra newline before Author
$stringout =~ s/^ID:/\\\n$&/gm; #Put a single backslash on the line before ID:
#Remove the extra blank lines and single backslash at the start of the document (not global)
$stringout =~ s/^\s*\\//m;
print OUTFILE $stringout;
close INFILE;
close OUTFILE;

Thanks, but I tested it again and your code doesn't work for me. However, I have already solved the problem.
Code like this gives me the output I actually wanted:

my @file1 =<INFILE>;
foreach $lines(@file1)
{
        chomp($lines);
        if ($lines =~m/^\\/)
        {
                $i=$i+1;
                $lines=~s/\\//g;
                print OUTFILE"\n";
                if ($i==2)
                {
                        print OUTFILE"\n";
                        print OUTFILE"$lines";
                }
        }
        if($i==3)
        {
                $i=1;
                print OUTFILE"\\";
                print OUTFILE"\n";
        }
        if($lines =~m/^JASSS:|Date:|Title:/)
        {
                print OUTFILE"\n";
                print OUTFILE"$lines";
        }
        elsif ($lines =~m/^Author:/)
        {
                print OUTFILE"\n";
                print OUTFILE"$lines";
        }
        elsif ($lines =~m/^Address:/)
        {
                print OUTFILE"\n";
                $lines=~s/,\n/, /g;
                print OUTFILE"$lines";
        }
        else
        {
                print OUTFILE"$lines";
        }
}
close (INFILE);
close(OUTFILE);
exit;

Anyway, thanks for your reply. I learnt a new way of dealing with files.

Cheers
Aj

You're welcome. I took a second look at my output today and see that it still isn't quite right. Both ways have their pros and cons, but yours actually worked, so that's what counts. :)
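
For anyone who still wants to try the slurp-and-substitute route, here is a minimal sketch of how the substitutions might be ordered. It runs on a shortened in-memory sample rather than a file. The helper name tidy_records and the sample records are made up for illustration, and it assumes every record starts with ID:, that an address only continues onto the next line when it ends in a comma, and that the file opens and closes with a double-backslash line. Treat it as a starting point, not a tested replacement for the loop above.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper (not from the thread): tidy a slurped document
# with a fixed sequence of global substitutions.
sub tidy_records {
    my ($text) = @_;
    $text =~ s/\n{2,}/\n/g;                # 1. drop the blank lines between input lines
    $text =~ s/^(Address:.*,)\n/$1 /gm;    # 2. join an address that continues on the next line
    $text =~ s/\A\\\\\n//;                 # 3. remove the delimiter that opens the document
    $text =~ s/^\\\\\n(?=ID:|\z)/\\\n/gm;  # 4. delimiter before a new record (or at the end) becomes a single backslash
    $text =~ s/^\\\\\n/\n/gm;              # 5. remaining delimiters (header/abstract) become a blank line
    return $text;
}

# Shortened, made-up sample in the same layout as the attachment.
# In a single-quoted heredoc the backslashes are taken literally.
my $sample = <<'END';
\\

ID: 1.1.2

Author: Marji Lines

Address: Udine Italy

\\

First abstract.

\\
ID: 2.3.3

Author: Andrew Smith

Address: University of Edinburgh,

40 George Square Edinburgh

\\

Second abstract.

\\
END

print tidy_records($sample);
```

On the sample, each record should come out as its header lines, a blank line, the abstract, and then a line holding a single backslash, with the second address joined onto one line. The order matters: step 4 has to run before step 5, otherwise the record separators would be turned into blank lines too.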
