Hi! Does anybody know how to recursively read in files from a specific directory on the internet, in Java?
I want to read in all the text files from this web directory: http://www.cs.ucdavis.edu/~davidson/courses/170-S11/Female/

I know how to read in multiple files that are in a folder on my computer, and I know how to read in a single file from the internet. But how can I read in multiple files from the internet without hardcoding the URLs?

// List the files on my Desktop
final File folder = new File("/Users/crystal/Desktop");
File[] listOfFiles = folder.listFiles();   // null if the path is not a readable directory

if (listOfFiles != null) {
	for (File fileEntry : listOfFiles) {
		if (!fileEntry.isDirectory()) {
			System.out.println(fileEntry.getName());
		}
	}
}
// Reading data from the web
try {
	// Create a URL object
	URL url = new URL("http://www.cs.ucdavis.edu/~davidson/courses/170-S11/Female/5_1_1.txt");

	// Read all of the text returned by the HTTP server;
	// try-with-resources closes the reader even if reading fails
	try (BufferedReader in = new BufferedReader(new InputStreamReader(url.openStream()))) {
		String line;      // Holds the current line of the file

		// Read through the file one line at a time and print each line
		while ((line = in.readLine()) != null) {
			System.out.println(line);
		}
	}
} catch (MalformedURLException e) {
	// The URL string could not be parsed
	e.printStackTrace();
} catch (IOException e) {
	// Any other I/O failure (connecting, reading): print a stack trace
	e.printStackTrace();
}

Thanks!

Use HttpURLConnection. Open one connection to fetch the directory listing (you will also need an HTML parser to pull the file links out of that page), then open a new connection with a new URL for each file. Google HttpURLConnection and HTMLParser and try a few things out. P.S. This only works if the directory is "listable" and the files are "retrievable" over HTTP, i.e. you can already browse the listing and open the files in a browser. Otherwise none of this will help.
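
To make that concrete, here is a rough sketch of the two-step approach: fetch the listing page, pull out the links, then read each file. Note the assumptions: the class name WebDirectoryReader and its methods are made up for illustration, it uses URL.openStream() as in the question rather than HttpURLConnection directly, and it uses a simple regular expression in place of a real HTML parser, which is probably only good enough for a plain server-generated index page. It has not been checked against the directory in the question.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.net.MalformedURLException;
import java.net.URL;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class, not part of any library
class WebDirectoryReader {

	// Matches href="..." or href='...' (assumes simple listing markup)
	private static final Pattern HREF =
			Pattern.compile("href\\s*=\\s*[\"']([^\"']+)[\"']", Pattern.CASE_INSENSITIVE);

	// Collect every link in the listing HTML whose target ends with the given suffix
	public static List<URL> extractLinks(URL base, String html, String suffix)
			throws MalformedURLException {
		List<URL> links = new ArrayList<>();
		Matcher m = HREF.matcher(html);
		while (m.find()) {
			String href = m.group(1);
			if (href.toLowerCase().endsWith(suffix)) {
				// Resolve the (usually relative) link against the directory URL
				links.add(new URL(base, href));
			}
		}
		return links;
	}

	// Download the content behind a URL as text (assumes UTF-8)
	public static String fetch(URL url) throws IOException {
		StringBuilder text = new StringBuilder();
		try (BufferedReader in = new BufferedReader(
				new InputStreamReader(url.openStream(), StandardCharsets.UTF_8))) {
			String line;
			while ((line = in.readLine()) != null) {
				text.append(line).append('\n');
			}
		}
		return text.toString();
	}

	public static void main(String[] args) throws IOException {
		if (args.length == 0) {
			System.out.println("usage: java WebDirectoryReader <directory-url-ending-in-slash>");
			return;
		}
		URL directory = new URL(args[0]);
		String listing = fetch(directory);            // step 1: get the listing page
		for (URL file : extractLinks(directory, listing, ".txt")) {
			System.out.println("== " + file);         // step 2: read each linked file
			System.out.print(fetch(file));
		}
	}
}
```

You would run it with the directory URL (including the trailing slash) as the only argument. To go recursive, one option would be to also collect links ending in "/" and call the same steps on each of them, skipping the parent-directory link so you don't loop.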
