I'm a software development student, and over the summer break I decided to make an Android app, since I figure the best way to improve as a programmer is to program. What I'm trying to do is search a website for information (e.g. search Wikipedia for the release date of Superman, or search eBay for the price of an Nvidia Titan).

My problem is that to achieve this I need to use the website's own search engine and then parse the result of that search. I honestly don't know where to start with that, so any help or tips would be most welcome.

Thank you in advance,

All 12 Replies

After re-reading my own post I realize I could be a bit more specific. I don't want to make an app that searches the entire web, only one that searches specific hardcoded websites.

So let's say I make an application for cinemas: the user inputs the name of the film and the cinema he wants, and the application prints the times. In the background the application would go to the cinema's web page, use that page's search engine, parse the result page and print a list of times.

Many sites provide a dedicated API that returns JSON objects, and there are a number of libraries that can help you convert JSON to a POJO (plain old Java object), for example Gson or Jackson. In the worst-case scenario, where you actually have to search through HTML, you may want to use the Jsoup library.

Thank you peter_budo for your reply. I have indeed been looking through websites to find their APIs and some of them have one (Amazon and eBay, for example), but how would I handle the ones that don't? I had already considered using Jsoup to parse the website, but after looking at the available methods I don't see how I would be able to submit a search and then parse the resulting page.

Here's an example: Overclockers doesn't have an API so I would have to somehow use their search engine and then use Jsoup to parse the result of that search (I'm not even sure it's possible).

Well, given the purpose and size of their business, Overclockers are unlikely to provide any API to help you out. You may have to look for an API from one of the larger best-price/top-deal search engines instead.

So there really is no way of doing it using some sort of HTML API? I was hoping for a way to submit the HTML page with data in a text field.

I would strongly advise against it. It requires a lot of manual work that goes down the drain as soon as the website owner decides to make changes...

True, but for educational reasons I would still like to do it. I assume I would need to use something like GET and POST?
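
For anyone reading along, a minimal sketch of the GET side of that idea: a search form submitted with GET simply requests the form's action URL with the text field appended as a URL-encoded query parameter. The class name, the base URL and the parameter name here are hypothetical; the real values have to be read from the site's own <form action="..."> and <input name="..."> attributes.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

// Sketch only: builds the URL a search form would request on a GET submit.
final class SearchUrlBuilder {

    // baseUrl and paramName are assumptions for illustration; take the real
    // ones from the target page's form markup.
    static String buildSearchUrl(String baseUrl, String paramName, String query) {
        try {
            // Encode the user's input so spaces and symbols survive the request.
            String encoded = URLEncoder.encode(query, "UTF-8");
            return baseUrl + "?" + paramName + "=" + encoded;
        } catch (UnsupportedEncodingException e) {
            // UTF-8 is a required charset on every Java platform.
            throw new IllegalStateException("UTF-8 not available", e);
        }
    }
}
```

The resulting URL could then be fetched with Jsoup.connect(url).get(), which returns the result page as a parsed Document. For a form that uses POST, Jsoup's connection also accepts .data(name, value) followed by .post(). On Android the request has to run off the main thread.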

Indeed, but if you look at the source you'll notice <ul id="productList" class="view-grid">...</ul>, and only that part matters. I created a method which receives a String (the ul tag) and iterates over it, saving the price and the alt text in an object and then placing that object in a HashMap (alt is the key). Once it reaches the end I simply return the Map.

That is what I meant. It's a lot of manual parsing, and if they make the smallest change to their page you can get screwed and spend a lot of time figuring out what changed and how to fix it. (Speaking from my own experience with DICE's Battlelog for Battlefield 3 and filtering data out of it. Luckily a lot of the data is in JSON, so I don't need to parse HTML apart from the first page after login.)
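
A rough sketch of what such a method might look like, using only the standard library. This is not the poster's actual code: the item markup it expects (an <img alt="..."> followed by a <span class="price">...</span> inside each <li>) is an assumption, as are the class and method names, and for simplicity it maps alt text straight to the price string instead of to a product object.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Sketch only: pulls (alt text -> price) pairs out of the product list markup.
final class ProductListParser {

    // Hypothetical item layout: <li> ... <img alt="NAME"> ... 
    // <span class="price">PRICE</span> ... </li>
    private static final Pattern ITEM = Pattern.compile(
            "<li\\b[^>]*>.*?<img\\b[^>]*\\balt=\"([^\"]*)\"[^>]*>.*?"
            + "<span\\b[^>]*class=\"price\"[^>]*>([^<]*)</span>.*?</li>",
            Pattern.DOTALL | Pattern.CASE_INSENSITIVE);

    // ulHtml is the text of the <ul id="productList"> element.
    static Map<String, String> parse(String ulHtml) {
        Map<String, String> prices = new HashMap<>();
        Matcher m = ITEM.matcher(ulHtml);
        while (m.find()) {
            // group(1) is the image alt text (the key), group(2) the price text.
            prices.put(m.group(1).trim(), m.group(2).trim());
        }
        return prices;
    }
}
```

Regular expressions over HTML break easily, for instance when an item has no image or the attributes change. With Jsoup, selecting "ul#productList li" and reading each item's img alt attribute would likely be sturdier, though it still depends on the page layout staying the same.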

Indeed, no doubt it's a horrible idea! But since I'm only making the app to self-learn how to program for Android it'll do :)
