Hi DW.

Does anyone know how to read a text file line by line from the web/internet in VC++?

I have a function that works well for reading a local file, but it doesn't seem to find or read a file on a web server. The file I would like to read is located at a web address like this: http://www.thesitedomain.com/test.txt
Any working ideas for solving this problem?


Are you sure about that link? I clicked it once and it appeared to be a boilerplate "for sale" page, and now I get a broken link from the Daniweb parser when I click on it. So I can't see precisely what you are trying to read.

However, you did say "text file", and you did say that you are able to read it line by line on your local machine successfully. Thus it would appear to me that your question reduces to "How can I fetch text from the web into a buffer that I can read using C++?"

The generic answer, in my opinion, is to use libcurl.


It's my personal preference, but there are other libraries. libcurl is written for use in C, and there is a C++ "wrapper", though I've generally found it easier to use the C interface directly. This requires you to be familiar enough with C++ and C to link and use static libraries with your code. You mention that you are using Visual C++, so you'll need to make sure that libcurl is built for use with Visual C++. There is a README that describes how to build libcurl, or sometimes you get lucky and someone has already done it for you. It's not too hard for folks familiar with C++, but it can be a pain for those who are not.

The nice thing about curl is that it has bindings for all sorts of languages. Personally I'd do the whole thing in C/C++, but if the library linking, etc., is too much of a hassle, you can fetch the text using another implementation, including command-line cURL, then parse that text using your C++ code that is already working. So take a look at the link and see if you can get some very small "Hello World"-type webpage parsing going in Visual C++ (they have examples). If you can, then that is your route. If you can't, then fetch with cURL or something else in a different language, then somehow get that text to your C++ program that already works. That might involve sockets, pipes, forks, signals, etc., none of which are standard C++, but the learning curve really isn't so bad.
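As a rough sketch of the "fetch with command-line cURL, parse in C++" route, the program can start the command through a pipe and read its output line by line. This assumes a curl executable is on the PATH; the helper name read_lines_from_command is made up for illustration, and popen is POSIX rather than standard C++ (Visual C++ calls it _popen).

```cpp
#include <cassert>
#include <stdio.h>   // popen/pclose are declared here on POSIX systems
#include <stdexcept>
#include <string>
#include <vector>

#ifdef _WIN32
// Visual C++ spells the POSIX functions with a leading underscore.
#define popen _popen
#define pclose _pclose
#endif

// Hypothetical helper: run `command` and return its standard output as lines.
std::vector<std::string> read_lines_from_command(const std::string& command)
{
    assert(!command.empty());
    FILE* pipe = popen(command.c_str(), "r");
    if (!pipe)
        throw std::runtime_error("could not start: " + command);

    std::vector<std::string> lines;
    std::string current;
    char chunk[4096];
    while (fgets(chunk, sizeof chunk, pipe)) {
        current += chunk;
        // A line longer than the chunk arrives in pieces; wait for its newline.
        if (!current.empty() && current.back() == '\n') {
            current.pop_back();
            if (!current.empty() && current.back() == '\r')
                current.pop_back();
            lines.push_back(current);
            current.clear();
        }
    }
    if (!current.empty())
        lines.push_back(current);  // last line without a trailing newline

    pclose(pipe);
    return lines;
}
```

A call such as read_lines_from_command("curl -s http://www.thesitedomain.com/test.txt") would then return the file's lines, ready for the parsing code that already works on local files.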

Thus, the overall answer is "It depends", and it largely depends on your familiarity with C++. It might also depend on the webpage you're fetching.


"That might involve sockets, pipes, forks, signals, etc., none of which are standard C++"

A little clarification here. Signals are standard in C++. I meant that the task involves creating a child process using cURL to fetch the webpage. When complete, you would signal the parent process somehow. At that point, the parent process would process the data using the function you have that already works. The way you do all of this differs between Linux, Windows, Visual C++, mingw, etc., and thus is not "standard". You can use the standard csignal library to raise the signal. You can also use the exit function from cstdlib, which is also standard, to notify the parent.
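One simple way to sketch the "child notifies the parent through its exit code" idea, using only standard headers, is std::system. It blocks until the child finishes and hands back its status. The helper names and the temporary-file approach below are assumptions for illustration, not the only way to do it.

```cpp
#include <cassert>
#include <cstdlib>
#include <fstream>
#include <string>
#include <vector>

// Hypothetical helper: run a fetch command in a child process and report
// whether it exited with status 0. std::system blocks until the child ends.
bool run_fetch_command(const std::string& command)
{
    assert(!command.empty());
    int status = std::system(command.c_str());
    return status == 0;  // how non-zero values are encoded is platform-specific
}

// Stand-in for the local-file reader that already works.
std::vector<std::string> read_lines_from_file(const std::string& path)
{
    std::vector<std::string> lines;
    std::ifstream in(path.c_str());
    std::string line;
    while (std::getline(in, line))
        lines.push_back(line);
    return lines;
}
```

The parent could call run_fetch_command("curl -s -f -o test.txt http://www.thesitedomain.com/test.txt") and, only if that returns true, pass "test.txt" to the reader. The -f flag asks curl to exit with a non-zero status on HTTP errors instead of saving the error page.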

The "standard" vs. "non-standard" way of doing things is both C++'s best and worst feature. You can do practically anything in C++, but you have to worry about cross-platform compatibility way more than in, say, Java.

Votes + Comments
I find that wget is most useful for this sort of stuff.
It works but not always and everywhere.

I've just seen something about


and I will try it and see if it's what I'm looking for, but it looks like it is.


Without additional clarification (whether you are using MFC, ATL, or .NET), here is a sample piece of .NET (C++/CLI) code that reads a text file from a web server and extracts only the subtitles from what look like story descriptions.
The text is on TextFiles.com.

I use a WebClient with the method OpenRead to open a stream to the data.
I use a Regular Expression to strip out only the subtitle.
I have attached a screenshot of the output.

#include "stdafx.h"

using namespace System;
using namespace System::IO;
using namespace System::Net;
using namespace System::Text::RegularExpressions;

int main(void)
{
    Regex^ rxSubTitles = gcnew Regex("\\d.*>(?<subtitle>.*)<");
    String^ strFileIn = L"http://textfiles.com/adventure/221baker.txt";
    WebClient^ wc = gcnew WebClient();
    StreamReader^ fileWebIn = gcnew StreamReader(wc->OpenRead(strFileIn));
    String^ strData = "";

    // ReadLine() returns nullptr once the end of the stream is reached
    while (nullptr != (strData = fileWebIn->ReadLine()))
    {
        if (rxSubTitles->IsMatch(strData))
        {
            // take ONLY the subtitle without additional decoration
            Console::WriteLine(rxSubTitles->Match(strData)->Groups["subtitle"]->Value);
        }
    }

    fileWebIn->Close();
    return 0;
}
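If you are not using .NET, roughly the same subtitle extraction can be sketched in native C++ with std::regex once the text has been fetched by some other means. The helper name extract_subtitle is made up for illustration, and the pattern is the one above with the named group replaced by a numbered one.

```cpp
#include <cassert>
#include <regex>
#include <string>

// Hypothetical helper: return the subtitle found in `line`, or "" if none.
// std::regex's default grammar has no named groups, so the capture is
// addressed by index 1 instead of by the name "subtitle".
std::string extract_subtitle(const std::string& line)
{
    static const std::regex subtitle_rx("\\d.*>(.*)<");
    std::smatch match;
    if (std::regex_search(line, match, subtitle_rx)) {
        assert(match.size() == 2);  // whole match plus one capture group
        return match[1].str();
    }
    return "";
}
```

You would call extract_subtitle on each line as you read it and print the result when it is not empty. The exact pattern depends on how the lines in the real file are decorated, so treat it as a starting point.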