0

Hi DW.

Does anyone know how to read a text file line by line from the web/internet in VC++?

I have a function that reads a local file without problems, but it does not seem to find or read a file on a web server. The file I would like to read is at a web address like this: http://www.thesitedomain.com/test.txt
Any working ideas for solving this problem?

Last Post by thines01
0

Are you sure about that link? I clicked once and it appeared to be a boilerplate "for sale" page, and now I get a broken link from the Daniweb parser when I click on it. So I can't see precisely what you are trying to read.

However, you did say "text file", and you did say that you are able to read it line by line on your local machine successfully. Thus it would appear to me that your question reduces to "How can I fetch text from the web into a buffer that I can read using C++?"

The generic answer, in my opinion, is to use libcurl.

https://curl.haxx.se/libcurl/

It's my personal preference, but there are other libraries. libcurl is written for use in C, and there is a "wrapper" for C++, though I've generally found it easier to use the C version directly. This requires you to be familiar enough with C and C++ to link and use static libraries with your code. You mention that you are using Visual C++, so you'll need to make sure that libcurl is built for use with Visual C++. There is a README that describes how to build libcurl, or sometimes you can get lucky and find that someone has already done it for you. It's not too hard for folks familiar with C++, but it can be a pain for those who are not.

The nice thing about curl is that it's available for all sorts of languages. For me, I'd do the whole thing in C/C++, but if the library linking, etc., is too much of a hassle, you can fetch the text using another implementation, including command line cURL, then parse that text using your C++ code that is already working. So take a look at the link and see if you can get some very small "Hello World"-type webpage parsing going in Visual C++ (they have examples). If you can, then that is your route. If you can't, then fetch with cURL or something else in a different language, then somehow get that text to your C++ program that already works. That might involve sockets, pipes, forks, signals, etc., none of which are standard C++, but the learning curve really isn't so bad.
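
To make the "fetch with command line cURL, then parse in C++" route a bit more concrete, here is a rough sketch that reads a command's output through a pipe and splits it into lines. The helper name read_lines_from_command is made up for illustration, and it assumes popen is available (POSIX), or _popen/_pclose on Visual C++. It also assumes a curl executable is on the PATH when you pass it a curl command.

```cpp
#include <cstdio>
#include <stdexcept>
#include <string>
#include <vector>

// popen is POSIX, not standard C++; Visual C++ spells it _popen/_pclose.
#ifdef _WIN32
#define POPEN _popen
#define PCLOSE _pclose
#else
#define POPEN popen
#define PCLOSE pclose
#endif

// Hypothetical helper: runs `command` in a child process and returns
// everything it wrote to standard output, split into lines.
std::vector<std::string> read_lines_from_command(const std::string& command)
{
    FILE* pipe = POPEN(command.c_str(), "r");
    if (!pipe)
        throw std::runtime_error("could not start: " + command);

    std::vector<std::string> lines;
    std::string current;
    char buffer[4096];
    while (std::fgets(buffer, sizeof buffer, pipe)) {
        current += buffer;
        // fgets stops at a newline or when the buffer is full, so a line
        // is only complete once the accumulated text ends with '\n'.
        if (!current.empty() && current.back() == '\n') {
            current.pop_back();
            if (!current.empty() && current.back() == '\r')
                current.pop_back();          // tolerate CRLF line endings
            lines.push_back(current);
            current.clear();
        }
    }
    if (!current.empty())
        lines.push_back(current);            // last line had no newline
    PCLOSE(pipe);
    return lines;
}
```

A call might look like read_lines_from_command("curl -s http://www.thesitedomain.com/test.txt"), after which each element can go through whatever per-line parsing already works for the local file.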

Thus, the overall answer is "It depends", and it largely depends on your familiarity with C++. It might also depend on the webpage you're fetching.

2

That might involve sockets, pipes, forks, signals, etc., none of which are standard C++

A little clarification here. Signals are standard in C++. I meant that the task involves creating a child process using cURL to fetch the webpage. When complete, you would signal the parent process somehow. At that point, the parent process would process the data using the function you have that already works. The way you do all of this differs between Linux, Windows, Visual C++, mingw, etc., and thus is not "standard". You can use the standard csignal library to raise the signal. You can also use the exit function from cstdlib, which is also standard, to notify the parent.
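
As a rough sketch of the "child process notifies the parent through its exit status" idea, the snippet below uses std::system from cstdlib, which blocks until the child exits. The helper name fetch_then_read and the file name in the usage note are hypothetical. The value std::system returns is implementation-defined, so treating zero as success is an assumption, though it holds on the common Windows and POSIX toolchains.

```cpp
#include <cstdlib>
#include <fstream>
#include <string>
#include <vector>

// Hypothetical helper: runs `command`, which is expected to write the
// fetched text to `output_path`, and reads that file line by line only
// if the child reported success.
bool fetch_then_read(const std::string& command,
                     const std::string& output_path,
                     std::vector<std::string>& lines)
{
    // Assumption: a zero result from std::system means the child exited
    // successfully (the exact encoding is implementation-defined).
    if (std::system(command.c_str()) != 0)
        return false;

    std::ifstream in(output_path.c_str());
    if (!in.is_open())
        return false;

    std::string line;
    while (std::getline(in, line))
        lines.push_back(line);
    return true;
}
```

With command line cURL this could be called as fetch_then_read("curl -s -o page.txt http://www.thesitedomain.com/test.txt", "page.txt", lines), and the existing local-file code could then be pointed at page.txt.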

The "standard" vs. "non-standard" way of doing things is both C++'s best and worst feature. You can do practically anything in C++, but you have to worry about cross-platform compatibility way more than in, say, Java.

Votes + Comments
I find that wget is most useful for this sort of stuff.
It works but not always and everywhere.
0

I've just seen something about Boost, and I will try it to see whether it's what I need. So far it looks like it is what I'm looking for.

1

Without additional clarification (if you are using MFC, ATL, or .NET), here is a sample piece of code that reads a text file from a web server and extracts only the subtitles from what looks like story descriptions.
The text is on TextFiles.com.

I use a WebClient with the method OpenRead to open a stream to the data.
I use a Regular Expression to strip out only the subtitle.
I have attached a screenshot of the output.

#include "stdafx.h"

using namespace System;
using namespace System::IO;
using namespace System::Net;
using namespace System::Text::RegularExpressions;

int main(void)
{
    Regex^ rxSubTitles = gcnew Regex("\\d.*>(?<subtitle>.*)<");
    String^ strFileIn = L"http://textfiles.com/adventure/221baker.txt";
    WebClient^ wc = gcnew WebClient();
    StreamReader^ fileWebIn = gcnew StreamReader(wc->OpenRead(strFileIn));
    String^ strData = "";

    while(!fileWebIn->EndOfStream)
    {
        strData = fileWebIn->ReadLine();

        if(rxSubTitles->IsMatch(strData))
        {
            // take ONLY the subtitle without additional decoration
            Console::WriteLine(rxSubTitles->Match(strData)->Groups["subtitle"]->Value->ToString());
        }
    }

    fileWebIn->Close();
    return 0;
}
Attachments Output.jpg 62.26 KB
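
For anyone not on .NET, the subtitle-matching step could be sketched in standard C++ with std::regex, roughly as below. This is not the code above, just an approximation of its regex step. The helper name extract_subtitle is made up, and because std::regex's default ECMAScript grammar has no named groups, the subtitle is taken from capture group 1 instead of a group called "subtitle".

```cpp
#include <regex>
#include <string>

// Hypothetical helper: returns the text captured between '>' and '<' on a
// line that contains a digit before the '>', or an empty string when the
// line does not match. Same pattern as above, minus the named group.
std::string extract_subtitle(const std::string& line)
{
    static const std::regex subtitle_pattern("\\d.*>(.*)<");
    std::smatch match;
    // regex_search looks for the pattern anywhere in the line, which is
    // how Regex::IsMatch / Regex::Match behave in the .NET version.
    if (std::regex_search(line, match, subtitle_pattern))
        return match[1].str();
    return std::string();
}
```

It only covers the matching; the lines still have to be fetched some other way.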
0

@Thines01. Thank you for that; it does look like what I'm looking for. I should mention that I'm using Microsoft Visual C++ 2010 under the Microsoft Visual Studio 2010 IDE, and the project is a GUI (CLI) application. My result with Boost is that it worked fine in a console application, but when I tried to include it in my GUI app, an error about WinSock.h already being included was thrown. So Boost didn't help, because it causes a problem in my intended application.

I will give this a try to see whether it solves my problem, and I will comment back. As for cURL, I had already downloaded it, but I couldn't understand how to use it to read the text, and I couldn't follow the library-building part because it talked about UNIX whereas I'm using Windows.

0

@Thines01. I've tested your code and it does retrieve the data. The issue I have now is that my other code, which reads a file on the local system, uses getline, .find and npos to locate specific data in the file, and I can't seem to apply that when using your code.

Here is a sample text file, which should give you an idea of the kind of file I'm working with and how the data is laid out.
STRUCTURE

ID,Number,Key (the Key part is sometimes empty or a single digit)

111,676546755,
132,4735432,4

What I'm trying to do is search the file for a given ID. If the ID is found, I want to read that line and split it into corresponding variables, one per field. The fields that matter here are the Number and the Key (note that the Key can sometimes be empty); they are needed for a further verification step.

That's what I want to achieve. Here is a piece of the code that I used to achieve this with a local file.

ifstream in("d:\\aaaa.txt");
if(in.is_open())
{
    while(!in.eof() && getline(in, line, ','))
    {
        if((offset = line.find(ID, 0)) == string::npos)
        {
            // AS YOU CAN SEE, HERE IS WHERE I ASSIGN EACH FIELD'S DATA
            // TO ITS CORRESPONDING VARIABLE.
            ID = atoi(line.c_str());
            in >> dat;
            getline(in, NUMBER, ',');
            getline(in, line, '\n');

            KEY = (int)atof(line.c_str());

Edited by Mr.M: Correcting
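
For reference, the lookup described above (find a given ID, then split that line into Number and an optional Key) could be sketched in standard C++ roughly as follows. The Record struct and the find_record name are made up for illustration, and the sketch assumes one "ID,Number,Key" record per line as in the sample. It takes any std::istream, so the same function can be fed an ifstream for a local file or an istringstream built from text fetched off the web.

```cpp
#include <cstdlib>
#include <istream>
#include <sstream>
#include <string>

// Hypothetical record type for one "ID,Number,Key" line.
struct Record {
    int id;
    std::string number;   // kept as text, like NUMBER in the fragment above
    bool has_key;         // false when the Key field is empty
    int key;
};

// Scans `in` line by line and fills `out` from the first line whose ID
// field equals `wanted_id`. Returns false if no such line exists.
bool find_record(std::istream& in, int wanted_id, Record& out)
{
    std::string line;
    while (std::getline(in, line)) {
        if (!line.empty() && line.back() == '\r')
            line.pop_back();                     // tolerate CRLF line endings

        std::istringstream fields(line);
        std::string id_text, number, key_text;
        if (!std::getline(fields, id_text, ',') ||
            !std::getline(fields, number, ','))
            continue;                            // blank or malformed line
        std::getline(fields, key_text);          // may legitimately be empty

        char* end = 0;
        long id = std::strtol(id_text.c_str(), &end, 10);
        if (end == id_text.c_str() || *end != '\0' || id != wanted_id)
            continue;                            // header row or another ID

        out.id = static_cast<int>(id);
        out.number = number;
        out.has_key = !key_text.empty();
        out.key = out.has_key ? std::atoi(key_text.c_str()) : 0;
        return true;
    }
    return false;
}
```

Comparing the parsed ID as a number, rather than searching the line with find, avoids a false hit when the ID digits also appear inside the Number field.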

2

Easy to forget how much power you have with the Unix toolbelt at your fingertips.

Screen_Shot_2017-08-21_at_15_49_23.png

Votes + Comments
Toolbelt, power? Batman using the force?
0

So we don't muddy the water with a second question, let's continue this in another post OR in a personal message.
There may be some benefit to others if you make it a public post.
So, my suggestion is to close this post and start another one.
Either way, I will give you a thines01-flavored answer to your question.
