I'm working on a web scraper of sorts, and my main goal is the data. What is a good, efficient way to organize the data I'm downloading?
Here is what I have so far: I download the web page, which ends up as a FileReader object, then pass it to a class that extends ParserCallback to strip out the tags. After organizing the data a little, it looks like this:
About to parse http://www.futureshop.ca/Search/SearchResults.aspx?q=10140009
Sony 15.5" Intel Core i3 330M 2.13GHz Laptop (VPCEB12FDT) - Future Shop
Regular Price: $749.99
Discount -$50.00
Sale Price $699.99
Approximate Battery Life Up To 4 Hours
Hard Drive Speed/Capacity 500GB 5400 RPM
LED Backlit Display Not Applicable
Optical Drive SuperMulti Dual Layer DVD+/-R/RW
Processor Speed 2.13 GHz
Processor Type Intel Core i3 Processor 330M
RAM 4 GB
Screen Size 15.5"
...
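For reference, the parsing step described above could be sketched roughly like this. It assumes the callback in question is `HTMLEditorKit.ParserCallback` from `javax.swing.text.html` driven by a `ParserDelegator`; the class name `TextExtractor`, the `extract` helper, and the sample HTML are hypothetical, and a `StringReader` stands in for the `FileReader` so the example is self-contained.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.swing.text.html.HTMLEditorKit;
import javax.swing.text.html.parser.ParserDelegator;

// Hypothetical callback: keeps the text between tags and ignores the tags themselves.
public class TextExtractor extends HTMLEditorKit.ParserCallback {
    private final List<String> chunks = new ArrayList<>();

    @Override
    public void handleText(char[] data, int pos) {
        // Called by the parser for each run of text; skip whitespace-only runs.
        String text = new String(data).trim();
        if (!text.isEmpty()) {
            chunks.add(text);
        }
    }

    public List<String> getChunks() {
        return chunks;
    }

    // Works with any Reader, so a FileReader over the downloaded page fits here too.
    public static List<String> extract(Reader reader) throws IOException {
        TextExtractor callback = new TextExtractor();
        // The third argument tells the parser to ignore charset declarations in the page.
        new ParserDelegator().parse(reader, callback, true);
        return callback.getChunks();
    }

    public static void main(String[] args) throws IOException {
        // Made-up sample markup, not the real page.
        String html = "<html><body><h1>Sony Laptop</h1><p>Sale Price $699.99</p></body></html>";
        for (String chunk : extract(new StringReader(html))) {
            System.out.println(chunk);
        }
    }
}
```

Each text run arrives as its own chunk, which is why the number of entries varies from page to page.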
Because every page on the site has a different amount of information, I tried using an ArrayList and then converting it to an array, but that smelled a little.
Note: this is a hobby project for myself; I work for this company :P
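To make the ArrayList-to-array step concrete, here is a minimal sketch of that approach. The class name `SpecList`, the `toArray` helper, and the sample values are hypothetical and only illustrate the shape of the conversion, not the actual project code.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical holder for the variable-length list of lines scraped from one page.
public class SpecList {

    // Copies the list into a String[] sized to match however many lines the page had.
    public static String[] toArray(List<String> lines) {
        return lines.toArray(new String[0]);
    }

    public static void main(String[] args) {
        // The list grows as lines are parsed, since the count is not known up front.
        List<String> lines = new ArrayList<>();
        lines.add("RAM 4 GB");
        lines.add("Screen Size 15.5\"");

        String[] fixed = toArray(lines);
        System.out.println(fixed.length + " entries: " + Arrays.toString(fixed));
    }
}
```

The conversion works, but the array adds nothing the list did not already provide, which is probably where the smell comes from.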