I'm using the code from http://schmidt.devlib.org/java/file-download.html, hoping to pass in a URL and get the file that URL points to. It works sometimes, but about 90% of the time it doesn't. Many of the pages I want to download also don't end in .html, although I'm not sure whether that is a problem. What I want to know is whether it's possible to supply a URL as a string and then download the file that URL points to. I have a feeling there's an easier way to do this, but I'm not sure.
Here's an example of a page that I want to download:
https://www.sportsbet.com.au/results/racing/Date/today
I'm writing a little program that downloads HTML files, scrapes the important information from them, and writes it to a database. So far, as you might have guessed, things aren't going too well.
import java.io.*;
import java.net.*;

/*
 * Command line program to download data from URLs and save
 * it to local files. Run like this:
 * java FileDownload http://schmidt.devlib.org/java/file-download.html
 * @author Marco Schmidt
 */
public class FileDownload {
    public static void download(String address, String localFileName) {
        OutputStream out = null;
        URLConnection conn = null;
        InputStream in = null;
        SocketAddress sa = new InetSocketAddress("proxy.csu.edu.au", 8080);
        Proxy proxy = new Proxy(Proxy.Type.HTTP, sa);
        try {
            URL url = new URL(address);
            out = new BufferedOutputStream(
                new FileOutputStream(localFileName));
            conn = url.openConnection(proxy);
            in = conn.getInputStream();
            byte[] buffer = new byte[1024];
            int numRead;
            long numWritten = 0;
            while ((numRead = in.read(buffer)) != -1) {
                out.write(buffer, 0, numRead);
                numWritten += numRead;
            }
            System.out.println(localFileName + "\t" + numWritten);
        } catch (Exception exception) {
            exception.printStackTrace();
        } finally {
            try {
                if (in != null) {
                    in.close();
                }
                if (out != null) {
                    out.close();
                }
            } catch (IOException ioe) {
            }
        }
    }

    public static void download(String address) {
        int lastSlashIndex = address.lastIndexOf('/');
        if (lastSlashIndex >= 0 &&
            lastSlashIndex < address.length() - 1) {
            download(address, address.substring(lastSlashIndex + 1));
        } else {
            System.err.println("Could not figure out local file name for " +
                address);
        }
    }

    public static void main(String[] args) {
        download("http://schmidt.devlib.org/java/file-download.html");
    }
}
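For comparison, here is a minimal sketch of what a shorter "URL string in, file out" version might look like. It is not a confirmed fix for the failures described above. It assumes Java 7 or later (for java.nio.file.Files and try-with-resources) and a direct connection with no proxy; going through the proxy would still need url.openConnection(proxy) as in the code above. The class name SimpleDownload, the helper fileNameFor, and the fallback name "download.html" are made up for illustration.

```java
import java.io.IOException;
import java.io.InputStream;
import java.net.URI;
import java.net.URL;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.nio.file.StandardCopyOption;

public class SimpleDownload {

    // Hypothetical helper: uses the last path segment of the address as the
    // local file name, ignoring any query string. Falls back to
    // "download.html" (an arbitrary choice) when the address ends in '/'.
    static String fileNameFor(String address) {
        String path = address;
        int query = path.indexOf('?');
        if (query >= 0) {
            path = path.substring(0, query);
        }
        String name = path.substring(path.lastIndexOf('/') + 1);
        return name.isEmpty() ? "download.html" : name;
    }

    // Copies whatever the URL returns into the target file, replacing any
    // existing file, and returns the number of bytes written. Exceptions are
    // passed to the caller instead of being swallowed.
    static long download(String address, Path target) throws IOException {
        URL url = URI.create(address).toURL();
        try (InputStream in = url.openStream()) {
            return Files.copy(in, target, StandardCopyOption.REPLACE_EXISTING);
        }
    }

    public static void main(String[] args) throws IOException {
        String address = args.length > 0
                ? args[0]
                : "https://www.sportsbet.com.au/results/racing/Date/today";
        Path target = Paths.get(fileNameFor(address));
        long bytes = download(address, target);
        System.out.println(target + "\t" + bytes);
    }
}
```

Run with a URL as the only argument, for example: java SimpleDownload https://www.sportsbet.com.au/results/racing/Date/today. Because nothing is caught here, a failed download ends with a stack trace naming the actual cause rather than an empty or partial file.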
Angus Cheng