Hi,
I have an application that harvests data from HTML pages, and it provides an interface for that purpose.
I implemented (customized) that interface for a particular HTML page I want to harvest.
But the JVM hits me with:

Exception in thread "main" java.lang.ExceptionInInitializerError
Caused by: java.lang.RuntimeException: Uncompilable source code - cannot find symbol
symbol  : class HarvesterConfigurationStub
location: package au.com.allhomes.listing.harvester
        at au.com.allhomes.listing.harvester.harness.ListingTestHarness.<clinit>(ListingTestHarness.java:11)
Could not find the main class: au.com.allhomes.listing.harvester.harness.ListingTestHarness.  Program will exit.

I am not sure why this happens when I run this command:

java -Xmx512m -cp . au.com.allhomes.listing.harvester.harness.ListingTestHarness --inputHtml /Harvester/build/classes/sample/data/rsearch.html --listingType residentialSale --parserConfigImpl au.com.allhomes.listing.harvester.parse.RealEstateCoParserConfiguration --urlConfigFile /Harvester/build/classes/sample/data/local_url_map.properties --outputXml results-realestate.xml --pageType listingPage

Please stop PMing me.

Post this "batch" script you're talking about, and post the entire and exact command line you are using to execute the program manually.

@REM !/bin/bash
@REM The listing harvester test harness is intended to be used to test the transformation
@REM from an input HTML file to an output allhomes XML file.
@REM
@REM
set HARVESTER_CLASSPATH=.;./sample
set HARVESTER_CLASSPATH=%HARVESTER_CLASSPATH%;JTidy-minimal.jar
set HARVESTER_CLASSPATH=%HARVESTER_CLASSPATH%;all-gui-util.jar
set HARVESTER_CLASSPATH=%HARVESTER_CLASSPATH%;allhomes-listing-feed-binding_v0_2.jar
set HARVESTER_CLASSPATH=%HARVESTER_CLASSPATH%;commons-httpclient-3.1.jar
set HARVESTER_CLASSPATH=%HARVESTER_CLASSPATH%;commons-lang-2.4.jar
set HARVESTER_CLASSPATH=%HARVESTER_CLASSPATH%;harvester-smurf.jar
set HARVESTER_CLASSPATH=%HARVESTER_CLASSPATH%;log4j-1.2.14.jar
set HARVESTER_CLASSPATH=%HARVESTER_CLASSPATH%;xbean-minimal.jar
set HARVESTER_CLASSPATH=%HARVESTER_CLASSPATH%;testng-5.8-jdk15.jar


java -Xmx512m -classpath %HARVESTER_CLASSPATH% au.com.allhomes.listing.harvester.harness.ListingTestHarness --inputHtml /home/you/test/data/Test.html --listingType residentialSale --parserConfigImpl au.com.allhomes.listing.harvester.parse.RuralPropertyCoParserConfiguration --parserClassImpl au.com.allhomes.listing.harvester.parse.RobertsReListingSearchResultParser --urlConfigFile sample/data/local-url-map.properties --outputXml test-results.xml --pageType listingPage

This is the batch file. It executes successfully and produces the desired output for the test run, in which case

au.com.allhomes.listing.harvester.parse.RuralPropertyCoParserConfiguration
is the 'parserConfigImpl', where I use 'RealEstateCoParserConfiguration' in my case.
I run it by going to the directory where the batch file is kept and typing the batch file's name at the command prompt.
For example, the batch file is kept at
C:\harvester
so =>
C:\harvester>listing-harvester-test-harness

Your problem is that you are not setting the classpath. Look at all those set commands. Those are setting the classpath; you are attempting to execute the command manually without a classpath. Of course you are going to have problems.

Had you mentioned, and then shown this in your last thread we might have avoided all of this.
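A quick way to confirm that a missing classpath entry is the culprit is to ask the JVM whether it can see a given class at all. The following is only a minimal sketch: ClasspathCheck and isLoadable are names made up for this illustration and are not part of the harvester.

```java
// Sketch only: ClasspathCheck is a hypothetical helper, not part of the harvester code.
class ClasspathCheck {
    /** Returns true if the named class can be located on the current classpath. */
    static boolean isLoadable(String className) {
        try {
            // initialize=false: only locate the class, do not run its static initializers
            Class.forName(className, false, ClasspathCheck.class.getClassLoader());
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            // LinkageError also covers NoClassDefFoundError for a missing dependency
            return false;
        }
    }

    public static void main(String[] args) {
        for (String name : args) {
            System.out.println(name + " -> " + (isLoadable(name) ? "found" : "NOT found"));
        }
    }
}
```

Run it with exactly the same -classpath value as the failing command and pass the class names in question, for example the au.com.allhomes.listing.harvester.HarvesterConfigurationStub from the error message. Anything reported as NOT found is missing from that classpath.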

But these are all jar files, which work fine with the batch file, whereas I am trying to execute the application, which I have not jarred yet.
I got your point too. But how does the classpath get set multiple times in the command java -Xmx512m ....
Or else does it look for the appropriate classpath value to substitute in place of '%HARVESTER_CLASSPATH%'?
Is the scenario clear enough now?

But these are all jar files, which work fine with the batch file, whereas I am trying to execute the application, which I have not jarred yet.

So? So then you use the directories where the "pre-packaging" classfiles are rather than the jarfiles.

I got your point too. But how does the classpath get set multiple times in the command java -Xmx512m ....

It doesn't get set multiple times. "set" in batch is a command.

set <VARNAME>=value

and you use a variable in batch with

%VARNAME%

I should think that would be somewhat self-explanatory for any programmer looking at that.

Or else does it look for the appropriate classpath value to substitute in place of '%HARVESTER_CLASSPATH%'?

see the explanation above.

Is the scenario clear enough now?

Regardless of what the "scenario" is, if the program is written to need 20 libraries, and you try to run it without using any, it's not going to work. It's as simple as that.
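If it is unclear what the JVM actually received as its classpath, it can simply be printed. This is a hedged sketch: ClasspathReport and describe are hypothetical names for illustration, while the java.class.path system property is standard.

```java
import java.io.File;
import java.util.ArrayList;
import java.util.List;

// Sketch only: ClasspathReport is a hypothetical helper name.
class ClasspathReport {
    /** Splits a classpath string on the platform separator and describes each entry. */
    static List<String> describe(String classpath) {
        List<String> lines = new ArrayList<>();
        for (String entry : classpath.split(File.pathSeparator)) {
            if (entry.isEmpty()) {
                continue; // skip empty entries such as a trailing separator
            }
            File f = new File(entry);
            String kind = !f.exists() ? "MISSING" : f.isDirectory() ? "directory" : "file";
            lines.add(entry + " [" + kind + "]");
        }
        return lines;
    }

    public static void main(String[] args) {
        // java.class.path holds the classpath the JVM was started with
        for (String line : describe(System.getProperty("java.class.path"))) {
            System.out.println(line);
        }
    }
}
```

Started with "-cp ." it should list a single entry, which makes it obvious that none of the libraries are visible to the program.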

To make "set" a bit clearer (seems like it's needed) we'll go through that.

set HARVESTER_CLASSPATH=.;./sample

HARVESTER_CLASSPATH = .;./sample

set HARVESTER_CLASSPATH=%HARVESTER_CLASSPATH%;JTidy-minimal.jar

HARVESTER_CLASSPATH = .;./sample;JTidy-minimal.jar

set HARVESTER_CLASSPATH=%HARVESTER_CLASSPATH%;all-gui-util.jar

HARVESTER_CLASSPATH = .;./sample;JTidy-minimal.jar;all-gui-util.jar

set HARVESTER_CLASSPATH=%HARVESTER_CLASSPATH%;allhomes-listing-feed-binding_v0_2.jar

HARVESTER_CLASSPATH = .;./sample;JTidy-minimal.jar;all-gui-util.jar;allhomes-listing-feed-binding_v0_2.jar

set HARVESTER_CLASSPATH=%HARVESTER_CLASSPATH%;commons-httpclient-3.1.jar

HARVESTER_CLASSPATH = .;./sample;JTidy-minimal.jar;all-gui-util.jar;allhomes-listing-feed-binding_v0_2.jar;commons-httpclient-3.1.jar

set HARVESTER_CLASSPATH=%HARVESTER_CLASSPATH%;commons-lang-2.4.jar

HARVESTER_CLASSPATH = .;./sample;JTidy-minimal.jar;all-gui-util.jar;allhomes-listing-feed-binding_v0_2.jar;commons-httpclient-3.1.jar;commons-lang-2.4.jar

set HARVESTER_CLASSPATH=%HARVESTER_CLASSPATH%;harvester-smurf.jar

HARVESTER_CLASSPATH = .;./sample;JTidy-minimal.jar;all-gui-util.jar;allhomes-listing-feed-binding_v0_2.jar;commons-httpclient-3.1.jar;commons-lang-2.4.jar;harvester-smurf.jar

set HARVESTER_CLASSPATH=%HARVESTER_CLASSPATH%;log4j-1.2.14.jar

HARVESTER_CLASSPATH = .;./sample;JTidy-minimal.jar;all-gui-util.jar;allhomes-listing-feed-binding_v0_2.jar;commons-httpclient-3.1.jar;commons-lang-2.4.jar;harvester-smurf.jar;log4j-1.2.14.jar

set HARVESTER_CLASSPATH=%HARVESTER_CLASSPATH%;xbean-minimal.jar

HARVESTER_CLASSPATH = .;./sample;JTidy-minimal.jar;all-gui-util.jar;allhomes-listing-feed-binding_v0_2.jar;commons-httpclient-3.1.jar;commons-lang-2.4.jar;harvester-smurf.jar;log4j-1.2.14.jar;xbean-minimal.jar

set HARVESTER_CLASSPATH=%HARVESTER_CLASSPATH%;testng-5.8-jdk15.jar

HARVESTER_CLASSPATH = .;./sample;JTidy-minimal.jar;all-gui-util.jar;allhomes-listing-feed-binding_v0_2.jar;commons-httpclient-3.1.jar;commons-lang-2.4.jar;harvester-smurf.jar;log4j-1.2.14.jar;xbean-minimal.jar;testng-5.8-jdk15.jar

java -Xmx512m -classpath %HARVESTER_CLASSPATH% ...

java -Xmx512m -classpath .;./sample;JTidy-minimal.jar;all-gui-util.jar;allhomes-listing-feed-binding_v0_2.jar;commons-httpclient-3.1.jar;commons-lang-2.4.jar;harvester-smurf.jar;log4j-1.2.14.jar;xbean-minimal.jar;testng-5.8-jdk15.jar ...
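For anyone who finds it easier to read in Java, the walk-through above can be mimicked with a map and a string substitution. This is only a rough model of what the command interpreter does: BatchSetDemo is a made-up class, and treating an undefined variable as an empty string is a simplification.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Sketch only: a toy model of "set" and %VAR% expansion, not a real batch interpreter.
class BatchSetDemo {
    private static final Pattern VAR = Pattern.compile("%([^%]+)%");
    private final Map<String, String> vars = new LinkedHashMap<>();

    /** Replaces every %NAME% with its current value (empty if undefined, a simplification). */
    String expand(String text) {
        Matcher m = VAR.matcher(text);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            String value = vars.getOrDefault(m.group(1), "");
            m.appendReplacement(out, Matcher.quoteReplacement(value));
        }
        m.appendTail(out);
        return out.toString();
    }

    /** Mimics "set NAME=value": the right-hand side is expanded first, then stored. */
    void set(String name, String value) {
        vars.put(name, expand(value));
    }

    String get(String name) {
        return vars.get(name);
    }

    public static void main(String[] args) {
        BatchSetDemo cmd = new BatchSetDemo();
        cmd.set("HARVESTER_CLASSPATH", ".;./sample");
        cmd.set("HARVESTER_CLASSPATH", "%HARVESTER_CLASSPATH%;JTidy-minimal.jar");
        cmd.set("HARVESTER_CLASSPATH", "%HARVESTER_CLASSPATH%;all-gui-util.jar");
        // prints: java -Xmx512m -classpath .;./sample;JTidy-minimal.jar;all-gui-util.jar ...
        System.out.println(cmd.expand("java -Xmx512m -classpath %HARVESTER_CLASSPATH% ..."));
    }
}
```

Each set call reads the old value, appends one entry and stores the result, so the variable is assigned several times but the java command only ever sees the final value.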

Where do the libraries come into the picture now? But I understood about setting the classpath as in the batch file; I will write a batch file that refers to the class files instead of the jar files.

Hmm, thanks for the nice clarification above. It's crystal clear now. I also found a few search results about using a variable in a batch file to set the classpath. The classpath tells the JVM which directories to search for class files, and its entries are separated by semicolons.

So then you use the directories where the "pre-packaging" classfiles are rather than the jarfiles.

OK, now I have copied all the external libraries it needs into the
compilation's current directory, and my batch file looks like this:

@REM !/bin/bash
@REM Now the harvester runs for the real estate site to convert
@REM it as input HTML file to an output realEstate XML file.
@REM
@REM
set HARVESTER_CLASSPATH=.;./sample


java -Xmx512m -classpath %HARVESTER_CLASSPATH% au.com.allhomes.listing.harvester.harness.ListingTestHarness --inputHtml /Harvester_new/build/classes/sample/data/rsearch.html --listingType residentialSale --parserConfigImpl au.com.allhomes.listing.harvester.parse.RealEstateCoParserConfiguration --urlConfigFile /Harvester_new/build/classes/sample/data/local_url_map.properties --outputXml results-realestate.xml --pageType listingPage

So I think the JVM will now look in the current directory and also in the sample folder for data. Is this right? It still seems to have trouble finding the HTML file, though.

What html file?

If you mean one of the arguments such as

--inputHtml /Harvester_new/build/classes/sample/data/rsearch.html

then make sure you have the argument correct.

Also, if you copied jars somewhere, then you still need to include those jars on the class path (or unjar, which I wouldn't suggest) as including a directory in the classpath will not automatically include all jars within those directories.
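To illustrate the point about directories and jars: each jar has to appear as its own classpath entry. The sketch below builds such a string for one directory. JarClasspath and forDirectory are hypothetical names used only for this example.

```java
import java.io.File;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Sketch only: JarClasspath is a hypothetical helper name.
class JarClasspath {
    /** Returns the directory itself followed by every .jar directly inside it. */
    static String forDirectory(File dir) {
        List<String> entries = new ArrayList<>();
        entries.add(dir.getPath()); // covers loose .class files in the directory
        File[] jars = dir.listFiles((d, name) -> name.toLowerCase().endsWith(".jar"));
        if (jars != null) {         // null when dir is missing or not a directory
            Arrays.sort(jars);      // predictable order
            for (File jar : jars) {
                entries.add(jar.getPath());
            }
        }
        return String.join(File.pathSeparator, entries);
    }

    public static void main(String[] args) {
        File dir = new File(args.length > 0 ? args[0] : ".");
        System.out.println(forDirectory(dir));
    }
}
```

As far as I know, Java 6 and later also accept a wildcard entry such as lib/* on the classpath, which pulls in every jar in that directory, but listing the jars explicitly as in the batch file works on any version.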


make sure you have the argument correct.

Yes, it's the HTML file that goes in as input, and it is correct in the command in my batch file (with changes made to include the jarred libraries), which is below:

@REM !/bin/bash
@REM Now the harvester runs for the real estate site to convert
@REM it as input HTML file to an output realEstate XML file.
@REM 
@REM
set HARVESTER_CLASSPATH=.;./sample
set HARVESTER_CLASSPATH=%HARVESTER_CLASSPATH%;JTidy-minimal.jar
set HARVESTER_CLASSPATH=%HARVESTER_CLASSPATH%;allhomes-listing-feed-binding_v0_2.jar
set HARVESTER_CLASSPATH=%HARVESTER_CLASSPATH%;commons-httpclient.jar
set HARVESTER_CLASSPATH=%HARVESTER_CLASSPATH%;commons-lang-2.0.jar
set HARVESTER_CLASSPATH=%HARVESTER_CLASSPATH%;persistence.jar
set HARVESTER_CLASSPATH=%HARVESTER_CLASSPATH%;testng-5.9-jdk15.jar
set HARVESTER_CLASSPATH=%HARVESTER_CLASSPATH%;xbean.jar
set HARVESTER_CLASSPATH=%HARVESTER_CLASSPATH%;xercesImpl.jar
 
java -Xmx512m -classpath %HARVESTER_CLASSPATH% au.com.allhomes.listing.harvester.harness.ListingTestHarness --inputHtml /Harvester_new/build/classes/sample/data/rsearch.html --listingType residentialSale --parserConfigImpl au.com.allhomes.listing.harvester.parse.RealEstateCoParserConfiguration --urlConfigFile /Harvester_new/build/classes/sample/data/local_url_map.properties --outputXml results-realestate.xml --pageType listingPage

Also, if you copied jars somewhere, then you still need to include those jars on the class path (or unjar, which I wouldn't suggest) as including a directory in the classpath will not automatically include all jars within those directories.

Thanks for this special piece of knowledge, which I would not have expected otherwise. I have made this change; please note the batch file above.
But it has now become very demotivating for me: I keep getting the same old error about not being able to read the HTML file,
even though I checked the folder access rights and the paths as well.
I kept it here C:\Harvester_new\build\classes\sample

public InputStream getHtmlFile() {
      InputStream htmlFileStream;
      String htmlFileName = this.cli.getHtmlFile();
      if (htmlFileName != null) {
         File htmlFile = new File(htmlFileName);
         if (htmlFile.exists() && htmlFile.canRead()) {
            try {
               htmlFileStream = new FileInputStream(htmlFile);
            }
            catch (FileNotFoundException e) {
               throw new IllegalArgumentException("Cannot read>> the inputHtml [" + htmlFileName + "]");
            }
         }
         else {
            throw new IllegalArgumentException("Cannot read <<the inputHtml [" + htmlFileName + "]");
         }
      }
      else {
         htmlFileStream = null;
      }

      return htmlFileStream;
   }

I get this exception at the command prompt =>
Error: Cannot read <<the inputHtml [/Harvester_new/build/classes/sample/data/rsearch.html]
So it goes into the outer else branch. I checked the file permissions too. It has full permissions, and the path is also correct.

If you are on a windows system, you probably need to include the drive designation (i.e. C: ) on the argument.

c:/Harvester_new/build/classes/sample/data/rsearch.html

instead of

/Harvester_new/build/classes/sample/data/rsearch.html
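Rather than guessing how the path is interpreted, you can let java.io.File tell you. A small sketch follows; PathResolution and explain are made-up names for illustration.

```java
import java.io.File;

// Sketch only: PathResolution is a hypothetical helper name.
class PathResolution {
    /** Describes how java.io.File interprets the given path string on this platform. */
    static String explain(String path) {
        File f = new File(path);
        return "given=" + path
                + " | isAbsolute=" + f.isAbsolute()
                + " | resolvesTo=" + f.getAbsolutePath();
    }

    public static void main(String[] args) {
        for (String path : args) {
            System.out.println(explain(path));
        }
    }
}
```

As far as I know, on Windows a path that starts with a slash but has no drive letter is not considered absolute and is resolved against the drive of the current working directory, so the same argument can point to different places depending on where the command is run from. Running this with both forms of the path shows exactly where each one ends up.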

If you are on a windows system, you probably need to include the drive designation (i.e. C: ) on the argument.

Sorry, I don't agree this time; it worked fine on Windows before.
Still, I tried the following too:

java -Xmx512m -classpath %HARVESTER_CLASSPATH% au.com.allhomes.listing.harvester.harness.ListingTestHarness --inputHtml C:/Harvester_new/build/classes/sample/data/rsearch.html --listingType residentialSale --parserConfigImpl au.com.allhomes.listing.harvester.parse.RealEstateCoParserConfiguration --urlConfigFile C:/Harvester_new/build/classes/sample/data/local_url_map.properties --outputXml results-realestate.xml --pageType listingPage

Also, I am running it from the
C:\Harvester_new\build\classes directory at the command prompt, and it contains everything: the batch file, the jarred libraries and the sample folder too.
I can't think of what's wrong now.

Use a debugger and find out what happens behind this line

String htmlFileName = this.cli.getHtmlFile();

You are only using "File" in the rest of it, so it has nothing to do with the classpath, and as long as the filename is correct and you have permissions to both it and the directory it's in, it will be found. It all, however, depends on what that method above returns (which probably depends on what you enter as an argument).

Also, from earlier

... the batch file ... ... --inputHtml /Harvester_new/build/classes/sample/data/rsearch.html ... ...
I kept it here C:\Harvester_new\build\classes\sample

So which is it?

If it is

/Harvester_new/build/classes/sample/rsearch.html

as you seem to indicate with the last line of the above quote and you use

/Harvester_new/build/classes/sample/data/rsearch.html

in the batch (as you posted above), then, well, yeah, the file won't be found.

How about some diagnosis, as opposed to scratching head & trying things at random. The subtle << in the error message implies that no exception was thrown (except by you), and that the (htmlFile.exists() && htmlFile.canRead()) is false. So, which is it? Display each of these separately so you know whether it's unable to find the file or unable to read it (eg because the file is still open in an editor or previous test run that wasn't terminated)
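Something along these lines would show each check separately. It is only a sketch: FileDiagnosis and diagnose are hypothetical names, and the real code would call it with whatever this.cli.getHtmlFile() returns.

```java
import java.io.File;

// Sketch only: FileDiagnosis is a hypothetical helper, not part of the harvester code.
class FileDiagnosis {
    /** Reports each property of the file on its own line instead of one combined test. */
    static String diagnose(String fileName) {
        File f = new File(fileName);
        File parent = f.getAbsoluteFile().getParentFile();
        StringBuilder sb = new StringBuilder();
        // brackets make stray spaces or quotes in the argument visible
        sb.append("argument=[").append(fileName).append("]\n");
        sb.append("absolutePath=[").append(f.getAbsolutePath()).append("]\n");
        sb.append("parentExists=").append(parent != null && parent.exists()).append("\n");
        sb.append("exists=").append(f.exists()).append("\n");
        sb.append("isFile=").append(f.isFile()).append("\n");
        sb.append("canRead=").append(f.canRead()).append("\n");
        return sb.toString();
    }

    public static void main(String[] args) {
        for (String name : args) {
            System.out.print(diagnose(name));
        }
    }
}
```

If exists comes back false, compare absolutePath character by character with what is really on disk. If exists is true but canRead is false, look at permissions or at another process holding the file.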

BTW, whether the above is the problem (or even part of it), or just some sort of cut-n-paste error in the post, doesn't matter. You will find that 90% of the debugging work you will ever do is "attention to detail" stuff like you see right here. Basic stuff. And if you cannot start getting that right you are going to have a very hard time, especially when you insist that you have checked and rechecked and checked the stuff again. If you say that "on the job" and then someone comes behind you and finds this sort of simple "attention to detail" error too often, you are going to be in hot water.

One of your daily mantras should be "attention to detail", say it, do it, live it, be it.

How about some diagnosis, as opposed to scratching head & trying things at random. The subtle << in the error message implies that no exception was thrown (except by you), and that the (htmlFile.exists() && htmlFile.canRead()) is false. So, which is it? Display each of these separately so you know whether it's unable to find the file or unable to read it (eg because the file is still open in an editor or previous test run that wasn't terminated)

Yes, that's true. I have noticed this; that is exactly why I put two different marker strings (">>" and "<<") in the messages, to confirm it.
I know this exception is thrown by me, when the if loop comes out false.
My worry is not this, but why the if loop is false and it can't find the HTML file.

To find out why "loop is false" you can (a) contemplate your navel in the hope that the answer is supplied by your subconscious, or (b) do like I said and find out.

Sorry, I have needed to work on a PHP project these past few days, so I will get back to the harvesting project once I wind the PHP work up.
