I have code that parses and validates all the fields present in an input weblog file.
My first field is ip_address, which currently can have a value like
Now there is a change: ip_address can also be a dummy value, something like $10.00, $23.123.34. or $12.233.

How should I change my regular expression so that it handles both kinds of value?

Code:

#!/usr/bin/perl -w

use strict;

while (<DATA>) {

   $_ =~ m|^
            (\d+\.\d+\.\d+\.\d+)?                       # capture client IP
            \s                                          # followed by a space
            ([\w-]+)\s                                  # capture '-' or their membership id
            \[(\d{1,2}/\w{3}/\d{4}:\d{2}:\d{2}:\d{2})   # then the date
            \s\+\d{4}\]\s"                              # the ' +0100] "' ready for the method on the next line
            (\w{3,4})\s                                 # the method
            (\/.*?)\s                                   # the request
            (\w{4}\/\d\.\d)"\s                          # the protocol
            (\d{3})\s([\d-]+?)\s"                       # status & content length
            (.+?)"\s"                                   # referer
            (.*?)"\s"                                   # user agent, will need post-processing
            (.+?)"                                      # whole cookie string, will need post-processing
         |x or next;                                    # skip lines that do not match

   my $cookies = cookieStringCleaner($11);

   my ($persistant, $session);
   foreach my $loopvar (@$cookies) {

       if ($loopvar =~ /^eBizDAn/i) {
           $persistant = $loopvar;
       }
       elsif ($loopvar =~ /^eBizCo/i) {
           $session = $loopvar;
       }
   }

   print "\n\n\nLINE: $.\nIP: $1\nMEMBER: $2\nDATE: $3\nMETHOD: $4
$10\nCOOKIE Persist: $persistant\nCOOKIE Session: $session";
}

sub cookieStringCleaner {

   my $cookieString = shift;

   # clean up the data a bit, remove spaces and '-'
   # the '-' is an error by (other language)  random num generator.
   # taking it out will make lookups easier as they will just be a  number

   $cookieString =~ tr/ //d; 
   $cookieString =~ tr/-//d; 

   my @cookies = split(/;/, $cookieString);

   return \@cookies;
}

I tried replacing (\d+.\d+.\d+.\d+)? with something like (\$.*|\d+.\d+.\d+.\d+)?, but in the case of $10.00 this captures two extra fields, so it returns the value "$10.00 - -".
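
The over-capture described above is consistent with how a greedy .* behaves: it also matches spaces, and gives back only as much as the rest of the pattern needs. A minimal sketch follows. The sample line and the variable names are invented for illustration and are not taken from the real log.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical line prefix, invented for illustration only.
my $line = '$10.00 - - - [10/Oct/2011:13:55:36 +0100]';

# Greedy .* keeps consuming, spaces included, and backtracks only as far
# as it must for the remainder of the pattern to match.
my ($greedy) = $line =~ m{^(\$.*)\s[\w-]+\s\[};

# \S+ cannot cross whitespace, so it stops at the end of the first field.
my ($bounded) = $line =~ m{^(\$\S+)\s};

print "greedy:  $greedy\n";
print "bounded: $bounded\n";
```

On this sample line the greedy capture comes out as "$10.00 - -", while the bounded one is just "$10.00".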

Please suggest. Thanks in advance.

Edited by Reverend Jim: Fixed formatting


Can you provide a few lines of the log file to show the options and what you wish to get out of it?






and that should work for you!
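
For reference, one possible way to accept both forms in the first field is sketched here. It assumes the dummy value is always a dollar sign followed only by digits and dots, as in the three examples from the question. The sample lines, the 192.0.2.10 address and the names $first_field and @results are made up for the sketch. The original pattern uses | as its delimiter, so an unescaped | inside it would end the pattern. The sketch therefore switches to braces so that | can be used for alternation.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# First-field pattern: either a dotted quad, or a dollar sign followed by
# digits and dots. The whole group stays optional, as in the original.
my $first_field = qr{
    (
        \d+\.\d+\.\d+\.\d+     # real client IP
      |
        \$[\d.]+               # dummy value: dollar sign, then digits and dots
    )?
}x;

# Hypothetical sample lines; only the start of each line is checked here.
my @lines = (
    '192.0.2.10 - [10/Oct/2011:13:55:36 +0100]',
    '$10.00 - [10/Oct/2011:13:55:36 +0100]',
    '$23.123.34. - [10/Oct/2011:13:55:36 +0100]',
    '$12.233 - [10/Oct/2011:13:55:36 +0100]',
);

my @results;
for my $line (@lines) {
    # $1 is the first field, $2 the '-' or membership id that follows it.
    if ($line =~ m{^ $first_field \s ([\w-]+) \s \[ }x) {
        my $ip = defined $1 ? $1 : '';
        push @results, [$ip, $2];
        print "IP: $ip  MEMBER: $2\n";
    }
}
```

Because [\d.]+ cannot match a space, the capture ends at the first field and the following fields stay where the rest of the pattern expects them. If the dummy value can contain other characters, \$\S+ would be a looser alternative.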
