Hi everyone,

I have no idea how to start this, or whether it is even possible.

I have a text file (.txt) with a lot of lines, and I want to transfer it to Excel (.xlsx / CSV).

Not everything should be transferred to Excel, only specific numbers (an array).

Example Text File:

AUG.02, 2015
SMPH01 RPMM 020000
AAXX 02001
98132 31462 21101 10293 20264 40145 53020 70540 82100 33302 20255 56399 59004 82817 ='01
  rh93/76 rmks 255 2am, fe.
98133 31467 20902 10300 20270 40109 5//// 70544 82200 222// 20501 33301 20255 56299 58002
 82818 MJ. RMK.RH=90/80 MIN.=25.5 8AM
98134 32465 21002 10287 20253 39928 40114 53013 82200 33301 20250 56999 58002 82819 ='12
  max=305@0600z min=250@1800z rh=87/79 MP
98222 31570 10201 10284 20242 40114 51007 70544 81200 33301 20249 56999 59002 81820 ='01
  occurance of min. Temp. 2210z ca

As you can see, it is space-delimited.

I want to extract some specific numbers and ignore the rest.

Lines 1 to 3 will be ignored.

I take the numbers from lines 4, 6, 7, 8 and 10.

The result should look like this (xlsx / CSV):

   A      B      C      D      E      F      G      H      I      J
1 98     132     3     146     7     054     8     210   20255   82817
2 98     133     3     146     7     054     8     220   20255   82818
3 98     134     3     246     5     301     8     220   20250   82819

Column = Col
Col A always starts with 98
Col B is the three digits after the 98
Col C is the first digit of the second cell (ex. 98132 3 1462)
Col D is the next three digits of the second cell (ex. 98132 3 146 2)
Col E is the first digit of cell 8 (ex. 7 0540)
Col F is the next three digits of cell 8 (ex. 7 054 0)
Col G is the first digit of cell 9 (ex. 8 2100)
Col H is the next three digits of cell 9 (ex. 8 210 0)

Col I is the cell that follows the cell starting with 333. In my example above that cell is 33302, so Col I is 20255. The 333 cell is the clue for where Col I appears.

Col J skips the two cells after Col I. In the example above (20255 56399 59004 82817), 56399 and 59004 are ignored and Col J is 82817.

Another problem is that the Col J value for row 2 (82818) is on the next line of the text file.
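
One way to read the column rules above: split a line into cells on whitespace, then cut each cell by character position. The sketch below assumes every cell is a run of characters separated by spaces. The helper names `splitCells` and `piece` are made up for illustration and are not part of any library.

```cpp
#include <sstream>
#include <string>
#include <vector>

// Split one line into whitespace-separated cells.
std::vector<std::string> splitCells(const std::string& line)
{
    std::vector<std::string> cells;
    std::istringstream iss(line);
    std::string cell;
    while (iss >> cell)
        cells.push_back(cell);
    return cells;
}

// Return up to `count` characters of `cell` starting at `pos`,
// or an empty string when the cell is too short.
std::string piece(const std::string& cell,
                  std::string::size_type pos,
                  std::string::size_type count)
{
    if (pos >= cell.size())
        return "";
    return cell.substr(pos, count);
}
```

The vector is indexed from 0, so "cell 8" in the description is `cells[7]`. For the first data line, `piece(cells[0], 0, 2)` would give Col A and `piece(cells[7], 1, 3)` would give Col F.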

Please help me; an explanation would make it easier for me to understand.

Thank you and God Bless;

-zelrick

You could read/skip the first 2 lines ...
Then read 2 lines, skipping over that first line and parsing the 2nd line ... till done

You could use stringstream objects to ease parsing each line of numbers you wish to parse.

Hello Sir David W,

I'm sorry, but could you give me something to start from?

I'm new to C++ programming, sorry.

-zelrick

It is possible to parse data, but it may be a nightmare depending on the following...

1) Will the data format be correct all the time?
2) If not, how do you have to deal with improperly formatted data?
3) Could you simply ignore/dump the improperly formatted data?

If you can assume that the data always comes in 2 lines, simply read 2 lines and merge them into 1, so there is no need to check which line the portions of interest are on. Then you should be able to easily parse the (digit) strings from there.
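
A minimal sketch of that read-two-and-merge idea, assuming each record really does occupy exactly two lines (the later posts show that the real file does not always). The function name `readMergedPair` is hypothetical.

```cpp
#include <istream>
#include <sstream>
#include <string>

// Read two physical lines and join them with a single space.
// Returns false when there is no further line to read.
// If only one line is left, that line is returned on its own.
bool readMergedPair(std::istream& in, std::string& merged)
{
    std::string first, second;
    if (!std::getline(in, first))
        return false;
    merged = first;
    if (std::getline(in, second))
        merged += ' ' + second;
    return true;
}
```

Called in a loop after the header lines have been skipped, each `merged` string can then be parsed with a single `istringstream`.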

Hello Sir David W, Sir Taywin,

Sir David W,

I'm still studying the link you gave me, and I understand some of the content. Thank you for the link.

Sir Taywin,

Yes, the first 3 lines are always in the correct format. From line 4 to the end there is no fixed format; it changes from time to time, but each cell consists of 5 characters (this is in the text file). I assume it can be put in Excel since it is space-delimited. Example:

//This is text file:
98133 31467 20902 10300 20270 40109 5//// 70544 82200

As you can see, there are 5 characters and then a space.

I'm confused now, so please also help me with my logic. This is what I think should happen:

1. In my C++ code, I give the name of the text file (multiple text files would be good).
2. When I run my C++ program, it creates a new file (CSV is fine, but xlsx would also work).
3. When I open the CSV / xlsx file, only what I need is there (referring to the example in my post, only the selected strings are written).

Is my logic right or not?

Can C++ do this? I'm sorry, I'm very confused about how it will work.

PS: I'm using Visual Studio 2008. I use strings for everything, since no computation is involved and some of the content is not numeric; it would be more complicated if I used integers.

Thank you; God Bless

-zelrick

Show us the code you have tried so far ... for this:

You could read/skip the first 2 lines ... Then read (in pairs of) 2 lines, skipping over that first line and parsing the 2nd line ... till done ... You could use stringstream objects to ease parsing each line of numbers you wish to parse ... (as per the examples at the link provided.)

Hello Sir David W,

I've been searching, and here is what I have so far.

It has an error and does not run, but I'm still sticking with this code:

#include<iostream>
#include<fstream>
#include<string>
#include<vector>
#include<sstream>

using namespace std;

int main();
{
ifstream in("untitled.txt");
vectors< vectors<string> > data;
string line;

{
    while (getline(in,line)){
        vectors <string> fields;
        std::stringstream linestream(line);
        for (int col = 0; col < 21 ; col++){
            string f;
            linestream >> f;
            fields.push_back(f);
        }
        data.push_back(fields);
    }

    for (vectors < vectors<string> >::size_type row = 0; row < data.size() ; row++ ) {
        for (vectors<string>::size_type col = 0; col < data[row].size() ; col++ ) {
            cout<<data[row][col] << ",";
        }
        cout << endl;
    }
    for (vectors<string>::size_type col = 0; col < data[row].size(); col++ ) {
        for (vectors<vectors<string>>::size_type row = 0 ; row < data.size() ; row++ ) {
            cout << data[row][col] << ",";
        }
        cout << endl;
}

return 0;
}
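
For reference, the listing above stops the compiler for a few mechanical reasons: the type is spelled `vector`, not `vectors`; `int main();` has a stray semicolon; the `{` after `string line;` is never closed; and the last pair of loops uses `row` before it is declared. A minimal corrected sketch follows, assuming the goal was simply to read every space-separated field and print it back as CSV. The names `Row`, `readRows` and `printRows` are invented here, and fields are read until the line ends rather than always reading 21 of them.

```cpp
#include <iostream>
#include <sstream>
#include <string>
#include <vector>

typedef std::vector<std::string> Row;

// Read every line of `in` and split it into whitespace-separated fields.
std::vector<Row> readRows(std::istream& in)
{
    std::vector<Row> data;
    std::string line;
    while (std::getline(in, line)) {
        Row fields;
        std::istringstream linestream(line);
        std::string f;
        while (linestream >> f)   // stops at the real end of the line
            fields.push_back(f);
        data.push_back(fields);
    }
    return data;
}

// Write each row as one line of comma-separated values.
void printRows(std::ostream& out, const std::vector<Row>& data)
{
    for (std::vector<Row>::size_type row = 0; row < data.size(); ++row) {
        for (Row::size_type col = 0; col < data[row].size(); ++col) {
            if (col != 0)
                out << ',';
            out << data[row][col];
        }
        out << '\n';
    }
}
```

In a program, `readRows` would be given an `std::ifstream` opened on the text file and `printRows` would be given `std::cout` or an `std::ofstream`.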

Hello Sir David W, Sir Taywin,

You might want to see this; Click

I have been searching for someone with the same problem as me. According to the comments, he successfully solved his problem. When I tried it, it gave me nothing.

My second try (new code):

There is still an error that I can't fix,

but it works fine in the video tutorial that I watched.

It should just read everything in the text file.

Here is the code:

#include<iostream>
#include<fstream>
#include<string>
#include<vector>
#include<sstream>
//#include<array>
#include<iomanip>
#include<stdio.h>

int rowA = 0;
int colA = 0;

using namespace std;

int main()
{
string lineA;
string x;
int arrayA[10][10] = {{0}};
string filename;
ifstream fileIN;

// Intro
cout << "This Program Reads the Number of rows and columns in your data file" << endl;
cout << "(aswell as inputs the data file into an Array" << endl;
cout << "\nPlease Enter the Data file below and press Enter" << endl;
cin >> filename;
fileIN.open(filename);

// Error Check
if(fileIN.fail()) {
    cerr << "* File you are trying to access cannot be found or opened!";
    exit(1);
}

// Reading the Data File
cout << "\n" << endl;
while (fileIN.good()) {
    while(getline(fileIN, lineA)) {
        istringstream streamA(lineA);
        colA = 0;
        while(streamA >>x) {
            arrayA[rowA][colA] = x;
            colA++;
        }
        rowA++;
    }
}

// Display Data
cout << "# of Rows ---->" << rowA << endl;
cout << "# of Columns ---->" << colA << endl;
cout << " " << endl;
for(int i = 0; i<rowA; i++) {
    for(int j = 0; j<colA; j++) {
        cout << left << setw(6) << arrayA[i][j] << " ";
    }
    cout << endl;
}

return 0;
}
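
The second attempt fails mainly because `arrayA` is an array of `int` while `x` is a `string`, so `arrayA[rowA][colA] = x;` cannot compile. Two further points are likely, though they depend on the compiler: before C++11, `open` takes a `const char*`, so `fileIN.open(filename.c_str())` may be needed, and `exit` is declared in `<cstdlib>`. A fixed 10 x 10 array is also too small for the sample file, and `colA` ends up holding the column count of the last line only. The sketch below covers just the counting part, without any fixed-size array; `countRowsAndColumns` is a hypothetical name.

```cpp
#include <cstddef>
#include <istream>
#include <sstream>
#include <string>

// Count the lines in `in` and the largest number of
// whitespace-separated groups found on any single line.
void countRowsAndColumns(std::istream& in,
                         std::size_t& rows,
                         std::size_t& maxCols)
{
    rows = 0;
    maxCols = 0;
    std::string line;
    while (std::getline(in, line)) {
        std::istringstream fields(line);
        std::string group;
        std::size_t cols = 0;
        while (fields >> group)
            ++cols;
        if (cols > maxCols)
            maxCols = cols;
        ++rows;
    }
}
```

If the groups themselves need to be kept, a `vector` of `vector<string>` avoids having to guess the array size in advance.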

You really do not need to read the processed records into a vector ...

I would keep it simple ... something like this:

// fileReadWrite.cpp //

#include <iostream>
#include <fstream>
#include <sstream>
#include <string>

using namespace std;

const char* FILE_IN  = "rawDataForExcel.txt";
const char* FILE_OUT = "selectedData.csv";

// example of 'in file'
/*
    AUG.02, 2015
    SMPH01 RPMM 020000
    AAXX 02001
    98132 31462 21101 10293 20264 40145 53020 70540 82100 33302 20255 56399 59004 82817 ='01
      rh93/76 rmks 255 2am, fe.
    98133 31467 20902 10300 20270 40109 5//// 70544 82200 222// 20501 33301 20255 56299 58002
     82818 MJ. RMK.RH=90/80 MIN.=25.5 8AM
    98134 32465 21002 10287 20253 39928 40114 53013 82200 33301 20250 56999 58002 82819 ='12
      max=305@0600z min=250@1800z rh=87/79 MP
    98222 31570 10201 10284 20242 40114 51007 70544 81200 33301 20249 56999 59002 81820 ='01
      occurance of min. Temp. 2210z ca
*/

struct CSV
{
   string ary[10]; // A,B,C,D,E,F,G,H,I,J; //

   // returns 0, or 1 if NEXT line is read in already //
   int extract( istream& fin, const string& str )
   {
       istringstream iss( str );
       string tmp;
       iss >> tmp;
       ary[0] = tmp.substr(0,2);
       ary[1] = tmp.substr(2);

       iss >> tmp;
       ary[2] = tmp.substr(0,1);
       ary[3] = tmp.substr(1,3);
       ary[4] = ary[5] = ary[6] = ary[7] = ary[8] = ary[9] = "x";

       for( int i = 0; i < 6; ++ i )
            iss >> tmp; // skip over 3,4,5,6,7 and read 8
       ary[4] = tmp.substr(0,1);
       ary[5] = tmp.substr(1,3);

       iss >> tmp;
       ary[6] = tmp.substr(0,1);
       ary[7] = tmp.substr(1,3);

       while( iss >> tmp && tmp.substr(0,3) != "333" ) ;
       iss >> ary[8] >> tmp >> tmp;
       iss >> ary[9];
       if( !iss ) // then read next line ... //
       {
           string tmp;
           getline( fin, tmp );
           istringstream iss( tmp );
           iss >> ary[9];
           return 1;
       }

       return 0;
   }


   friend ostream& operator << ( ostream& os, const CSV& rec )
   {
       for( int i = 0; i < 9; ++ i )
            os << rec.ary[i] << ',';
       return os << rec.ary[9] << endl;
   }

} ;




int main()
{
    ifstream fin( FILE_IN );
    ofstream fout( FILE_OUT );
    if( fin && fout )
    {
        CSV rec;
        string line;
        int count = 0;

        // skip over first two lines ... //
        while( getline( fin, line ) && ++count < 2 ) ;

        count = 0; // reset to zero //
        while( getline( fin, line ) )
        {
            ++count;
            if( count % 2 == 0 )
            {
                //cout << line << endl; // while debugging //
                int flag = rec.extract( fin, line );
                cout << count/2 << ',' << rec; // write each 'rec' on a new line, to the console screen
                fout << count/2 << ',' << rec; // write each 'rec' on a new line, in the output CSV file (FILE_OUT)
                count += flag;
            }
        }

        fout.close();
        fin.close();
    }
    else
    {
        if( !fin ) cout << "There was a problem opening file " << FILE_IN << endl;
        if( !fout ) cout << "There was a problem opening file " << FILE_OUT << endl;
    }
}

Hello Sir David W;

I've tried your code and no error appears, but when I run it, it only shows "Press any key to continue..."

Thank you; God Bless;

-zelrick

You need to have the file it is looking for ... available.

For ease of access, place the file to be read in the same folder as your compiled .exe (executable) file.

Did you NOT see the comments ?

const char* FILE_IN  = "rawDataForExcel.txt";
const char* FILE_OUT = "selectedData.csv";
// example of 'in file'
/*
    AUG.02, 2015
    SMPH01 RPMM 020000
    AAXX 02001
    98132 31462 21101 10293 20264 40145 53020 70540 82100 33302 20255 56399 59004 82817 ='01
      rh93/76 rmks 255 2am, fe.
    98133 31467 20902 10300 20270 40109 5//// 70544 82200 222// 20501 33301 20255 56299 58002
     82818 MJ. RMK.RH=90/80 MIN.=25.5 8AM
    98134 32465 21002 10287 20253 39928 40114 53013 82200 33301 20250 56999 58002 82819 ='12
      max=305@0600z min=250@1800z rh=87/79 MP
    98222 31570 10201 10284 20242 40114 51007 70544 81200 33301 20249 56999 59002 81820 ='01
      occurance of min. Temp. 2210z ca
*/

The demo program, as above, IS looking for a file with the name:

"rawDataForExcel.txt"

So ...

Make sure a file with the above contents is there ... and called by that name!

I did change the File in and File out names, but I didn't know that I needed to put the file in the same folder where I build. Sorry sir, my mistake.

AWESOME! It works, but I get an error when I try it on the real file.

Sir David W, can you help me figure out why I get the error? This is one of my complete files:

AUG.02, 2015
SMPH01 RPMM 020000
AAXX 02001
98132 31462 21101 10293 20264 40145 53020 70540 82100 33302 20255 56399 59004 82817 ='01
rh93/76 rmks 255 2am, fe.
98133 31467 20902 10300 20270 40109 5//// 70544 82200 222// 20501 33301 20255 56299 58002
82818 MJ. RMK.RH=90/80 MIN.=25.5 8AM
98134 32465 21002 10287 20253 39928 40114 53013 82200 33301 20250 56999 58002 82819 ='12
max=305@0600z min=250@1800z rh=87/79 MP
98222 31570 10201 10284 20242 40114 51007 70544 81200 33301 20249 56999 59002 81820 ='01
occurance of min. Temp. 2210z ca
98223 31460 21101 10288 20259 40103 52013 70540 81205 33301 20240 56999 58006 81819 ='08
RH 96/47 TMIN 24.0 @ 5:30 A.M. TAFOR RPLI 0606 32006KT 9999 FEW019 TEMPO 0612 32008KT 9999
FEW019 SCT100 RPLI 020000Z 11002KT 9999 FEW019TCU 29/26 Q1010 NOSIG RMK A2982 TCU W RPG
98232 31570 30000 10279 20246 30109 40112 53009 70500 81100 222// 20100 33302 20262 56999
58007 81820 RH92/65 MAX.TEMP.33.0@0600Z MIN.TEMP.26.2@2200Z REM.SLIGHT SEA ET'09
98233 31568 50000 10253 20234 40121 53024 70540 83230 33302 20240 56999 58010 83820 85358='08
T MIN = 24.0 @ 5:40 AM, MX RH = 90%, MN RH = 65% MLT
98324 11462 40901 10280 20251 40107 53013 60084 70544 82211 33312 20244 56929 58006 70084 82816
83358 CC MIN TEMP 24.4 AT 2100
98325 31462 31601 10278 20252 40101 53011 70592 83970 222// 20000 33302 20260 56999 58001 81916
82817 98308 SEA SMOOTH RH-89/71 RMK-26.0/0000Z EI '56
98327 11468 50301 10278 20251 39934 40107 52011 69984 70544 82108 33312 20254 56999 59002
70008 82820 85280 MAX RH = 94 MIN RH = 44 MAX TEMP = 32.8 MIN TEMP = 25.4 / 4:57 am - GD '13
98328 11364 23601 10186 20177 38523 40097 53022 69901 70544 82201 33311 20164 50270 55007
59014 82814.rh 98/82% min temp.16.4 dc @0250z:-) rb'02
98330 11475 20000 10274 20258 30074 40106 52019 69964 70544 81901 33301 20255 56999 58008
70006 81918 555 00045 Tmin. = 25.5C at 6:00AM dc'08
98334 12480 20000 10270 20243 39897 40099 52013 60014 82200 33311 20245 56999 59001 70010
82819 small short wavelets jg.Mx33.0at3pmmn24.5at8pmrh 91/55 '13
98336 31465 30402 10284 20252 30102 40107 53009 70544 82201 33301 20244 56999 58003
82815. Min. 244-2352 Z. AM '14
98425 11560 80402 10280 20237 40101 52007 60224 70544 8487/ 33312 20245 56999 58004 70220
84620 88358 min. 2200 PL'59
98426 31568 60000 10285 20243 40098 52010 70544 81178 33303 20252 56999 58002 81820 83358
86280; RHmax=88%, min=53%; Tmin=25.2C@2145UTC ...HH'08
98427 11470 30000 10270 20236 39923 40100 53006 60154 70182 82360 33302 20240 56999 59004
70150 82917 83357 555 20010 RD RH 90/68 tme lwst tmp. 2200 Z'11
98428 11460 43602 10281 20242 40097 53008 60114 71541 82961 33312 20274 56999 58001 70106
81917 81618= 2100Z 89/62 /cg'56
98429 11559 70402 10278 20228 40098 51009 69954 71544 85970 333 20258 56999 58000 70005
82922 83823 85359 = t.m. 25.8/3;27pm - ct'08
98430 1145870401 10274 20242 40100 52009 60454 70584 84970 333 20240 56999 58000 70447
81918 82620 85358 555 29997=min t 24.0/6am=cd'59
98431 31470 41602 10271 20251 30057 40104 53005 70544 82230 33302 20253 56999 58002 82818
84360= rh-90/60 tmin@2330z kb'54
98432 11468 70000 10264 20254 40097 52008 69904 70584 83570 33312 20248 56999 58001 79999
83618 86358 555 29999 rh-90/56 total mo rr- 309.1 mm rbm'06
98433 31462 73203 10228 20211 39386 40104 53004 60044 71544 85970 33301 20205 56999 58005
70044 85912 lp '01
21433 31460 73203 10217 20207 39382 40102 55002 74022 83270 33302 56999 83812 87357 lp2149
98434 11468 73603 10265 20259 30098 40106 53014 60114 72582 84908 33313 20251 56199 58000
70109 83917 81818 84275 555 20018 RH= 97/74 min temp2984 NEG RMK. NB/VV @ 2200Z / RR'07
98435 12472 30000 10284 20255 40076 53005 60064 82101 33311 20244 56999 59006 70055 82818='09
RH=95/75 min temp=24.4C RL
98440 11470 20000 10274 20258 30093 40098 53009 60504 70280 81931 33312 20225 56999 58000 70503
81915 81816 sb
98444 12468 20000 10280 20252 30076 40095 53013 60015 82201 33311 20245 56999 59001 70010
82818 tmax@2200z rh=61-97% MT '57
98446 12475 62202 10240 20205 30065 40111 53007 60024 84208 33312 20242 56599 58001 70020
84818 86270 RH=94/74.JP'13
98526 12575 80000 10270 20249 40103 5//// 69924 82207 33312 20238 56999 59001 70002 82820 88274 98315 jbf

VIA SMS/GLOBE/SMART/PLDT      JUDGE FORLIFE/CM/RR

Sir David W, my program must read every line that starts with "98", for example 98435 and 98440.

If a line does not start with 98, it should be treated as a continuation of the line above.

PS: please tell me why I get an error here but not with our first example.

Thank you; God Bless;

-zelrick

The raw data file NEEDs to be 'massaged' to be regular to be read ok ...

IF these TWO lines are FIXED ...

It seems to be ALL read ok.

AUG.02, 2015
SMPH01 RPMM 020000
AAXX 02001
98132 31462 21101 10293 20264 40145 53020 70540 82100 33302 20255 56399 59004 82817 ='01
rh93/76 rmks 255 2am, fe.
98133 31467 20902 10300 20270 40109 5//// 70544 82200 222// 20501 33301 20255 56299 58002
82818 MJ. RMK.RH=90/80 MIN.=25.5 8AM
98134 32465 21002 10287 20253 39928 40114 53013 82200 33301 20250 56999 58002 82819 ='12
max=305@0600z min=250@1800z rh=87/79 MP
98222 31570 10201 10284 20242 40114 51007 70544 81200 33301 20249 56999 59002 81820 ='01
occurance of min. Temp. 2210z ca
98223 31460 21101 10288 20259 40103 52013 70540 81205 33301 20240 56999 58006 81819 ='08
RH 96/47 TMIN 24.0 @ 5:30 A.M. TAFOR RPLI 0606 32006KT 9999 FEW019 TEMPO 0612 32008KT 9999 FEW019 SCT100 RPLI 020000Z 11002KT 9999 FEW019TCU 29/26 Q1010 NOSIG RMK A2982 TCU W RPG
98232 31570 30000 10279 20246 30109 40112 53009 70500 81100 222// 20100 33302 20262 56999
58007 81820 RH92/65 MAX.TEMP.33.0@0600Z MIN.TEMP.26.2@2200Z REM.SLIGHT SEA ET'09
98233 31568 50000 10253 20234 40121 53024 70540 83230 33302 20240 56999 58010 83820 85358='08
T MIN = 24.0 @ 5:40 AM, MX RH = 90%, MN RH = 65% MLT
98324 11462 40901 10280 20251 40107 53013 60084 70544 82211 33312 20244 56929 58006 70084 82816
83358 CC MIN TEMP 24.4 AT 2100
98325 31462 31601 10278 20252 40101 53011 70592 83970 222// 20000 33302 20260 56999 58001 81916
82817 98308 SEA SMOOTH RH-89/71 RMK-26.0/0000Z EI '56
98327 11468 50301 10278 20251 39934 40107 52011 69984 70544 82108 33312 20254 56999 59002
70008 82820 85280 MAX RH = 94 MIN RH = 44 MAX TEMP = 32.8 MIN TEMP = 25.4 / 4:57 am - GD '13
98328 11364 23601 10186 20177 38523 40097 53022 69901 70544 82201 33311 20164 50270 55007
59014 82814.rh 98/82% min temp.16.4 dc @0250z:-) rb'02
98330 11475 20000 10274 20258 30074 40106 52019 69964 70544 81901 33301 20255 56999 58008
70006 81918 555 00045 Tmin. = 25.5C at 6:00AM dc'08
98334 12480 20000 10270 20243 39897 40099 52013 60014 82200 33311 20245 56999 59001 70010
82819 small short wavelets jg.Mx33.0at3pmmn24.5at8pmrh 91/55 '13
98336 31465 30402 10284 20252 30102 40107 53009 70544 82201 33301 20244 56999 58003
82815. Min. 244-2352 Z. AM '14
98425 11560 80402 10280 20237 40101 52007 60224 70544 8487/ 33312 20245 56999 58004 70220
84620 88358 min. 2200 PL'59
98426 31568 60000 10285 20243 40098 52010 70544 81178 33303 20252 56999 58002 81820 83358
86280; RHmax=88%, min=53%; Tmin=25.2C@2145UTC ...HH'08
98427 11470 30000 10270 20236 39923 40100 53006 60154 70182 82360 33302 20240 56999 59004
70150 82917 83357 555 20010 RD RH 90/68 tme lwst tmp. 2200 Z'11
98428 11460 43602 10281 20242 40097 53008 60114 71541 82961 33312 20274 56999 58001 70106
81917 81618= 2100Z 89/62 /cg'56
98429 11559 70402 10278 20228 40098 51009 69954 71544 85970 333 20258 56999 58000 70005
82922 83823 85359 = t.m. 25.8/3;27pm - ct'08
98430 1145870401 10274 20242 40100 52009 60454 70584 84970 333 20240 56999 58000 70447
81918 82620 85358 555 29997=min t 24.0/6am=cd'59
98431 31470 41602 10271 20251 30057 40104 53005 70544 82230 33302 20253 56999 58002 82818
84360= rh-90/60 tmin@2330z kb'54
98432 11468 70000 10264 20254 40097 52008 69904 70584 83570 33312 20248 56999 58001 79999
83618 86358 555 29999 rh-90/56 total mo rr- 309.1 mm rbm'06
98433 31462 73203 10228 20211 39386 40104 53004 60044 71544 85970 33301 20205 56999 58005
70044 85912 lp '01 21433 31460 73203 10217 20207 39382 40102 55002 74022 83270 33302 56999 83812 87357 lp2149
98434 11468 73603 10265 20259 30098 40106 53014 60114 72582 84908 33313 20251 56199 58000
70109 83917 81818 84275 555 20018 RH= 97/74 min temp2984 NEG RMK. NB/VV @ 2200Z / RR'07
98435 12472 30000 10284 20255 40076 53005 60064 82101 33311 20244 56999 59006 70055 82818='09
RH=95/75 min temp=24.4C RL
98440 11470 20000 10274 20258 30093 40098 53009 60504 70280 81931 33312 20225 56999 58000 70503
81915 81816 sb
98444 12468 20000 10280 20252 30076 40095 53013 60015 82201 33311 20245 56999 59001 70010
82818 tmax@2200z rh=61-97% MT '57
98446 12475 62202 10240 20205 30065 40111 53007 60024 84208 33312 20242 56599 58001 70020
84818 86270 RH=94/74.JP'13
98526 12575 80000 10270 20249 40103 5//// 69924 82207 33312 20238 56999 59001 70002 82820 88274 98315 jbf

If the (next) line (after the line processed) DOES NOT begin with '98' ...
i.e. if this is the case ...
and then ...
this NEXT LINE is to be taken as a 'continuation' ...

then you can make some slight edit to code to process RAW DATA (without it being pre-processed-massaged.)

Hello Sir David W;

Sorry for the late reply;

This code:

while( iss >> tmp && tmp.substr(0,3) != "333" ) ;

will be my reference. I would change 333 to 98, but I don't know where to put it.

PS: In this code:

const char* FILE_IN  = "Untitled.txt";

can I put more than one .txt file, for processing multiple files?

"Untitled.txt","sample.txt","sample2.txt"; //can be?

If so, what will the output in Excel look like?

NO!

This code: while( iss >> tmp && tmp.substr(0,3) != "333" ) ;
will be my reference I change 333 to 98, But I don't know where to put ...

NO!!! Leave that as is, since that was a part of your program data extraction specifications.

You need to think about what your program is supposed to do ... at each step ... and thus what edit(s) to make to example???

If you wish to process several files, each with the very same structure as the first, you could store the names in an array or another C++ container like a vector, and then loop through each file name to process that file in turn ... while outputting processed data to files (or appending to a common output file if that was what you want.)

But ... get your code working perfectly ok for ONE representative example input file firstly !!!

Hint: the code might be a little simpler ... Read / discard 1st 3 lines ...
Then read a line ... and while exists next line and next line does not begin with "98" append to first line
Otherwise have first line of next rec... So process prev rec / 'extended line' and output the extracted data from this 'extended line'

Hello Sir David W;

Under int main():

string PK; // stands for primary key;

while ( getline( fin, line) && PK++ > 1);
PK = 98;

Am I right to put the while loop under int main()?

Sorry for being naive.

Thank you; God Bless;

-zelrick

This code of yours:

string PK; // stands for primary key;
while ( getline( fin, line) && PK++ > 1);
PK = 98;

does NOT make any sense at all!

If PK is a C++ string,
then the default value (when constructed as you have coded)
is "", i.e. the empty string, and then this code:

PK++ makes NO sense at all!

But ...

PK += "Sam"; 
// this code makes sense, it 'concatenates' the string "Sam" 
// to the end of the empty string PK ... 
// so that PK then becomes "Sam" //

And then trying to compare a C++ string value to an int of value 1 ...
this reveals that you are really terribly mixed up,
and NOT understanding even C++ coding basics!

You need to start at the beginning and learn to program step by step.
(Study the 6 fast steps link I gave you previously ...)
There are many beginning C++ tutorials on line ...

Try this slower paced one:

http://developers-heaven.net/forum/index.php/topic,127.0.html

Where did you get this problem of extracting data from source files structured as you have suggested ... to produce the particular extracted csv data file you seem to want?

But take a look ...
this is how one might code a solution ...
after your last 'hint' ...
about the way the data is structured in the input data file:

// fileReadWrite3.cpp //  // 2015-08-26 //

#include <iostream>
#include <fstream>
#include <sstream>
#include <string>

/*
    Hint:

    The code might be a little simpler ... Read / discard 1st 3 lines ...
    Then read a line ... and while exists next line and next line does not begin with "98" append to first line
    Otherwise have first line of next rec...
    So process prev rec / 'extended line' and output the extracted data from this 'extended line'
*/

using namespace std;

const char* FILE_IN  = "rawDataForExcel.txt";
const char* FILE_OUT = "selectedData.csv";

// example of 'in file'
/*
AUG.02, 2015
SMPH01 RPMM 020000
AAXX 02001
98132 31462 21101 10293 20264 40145 53020 70540 82100 33302 20255 56399 59004 82817 ='01
rh93/76 rmks 255 2am, fe.
98133 31467 20902 10300 20270 40109 5//// 70544 82200 222// 20501 33301 20255 56299 58002
82818 MJ. RMK.RH=90/80 MIN.=25.5 8AM
98134 32465 21002 10287 20253 39928 40114 53013 82200 33301 20250 56999 58002 82819 ='12
max=305@0600z min=250@1800z rh=87/79 MP
98222 31570 10201 10284 20242 40114 51007 70544 81200 33301 20249 56999 59002 81820 ='01
occurance of min. Temp. 2210z ca
98223 31460 21101 10288 20259 40103 52013 70540 81205 33301 20240 56999 58006 81819 ='08
RH 96/47 TMIN 24.0 @ 5:30 A.M. TAFOR RPLI 0606 32006KT 9999 FEW019 TEMPO 0612 32008KT 9999
FEW019 SCT100 RPLI 020000Z 11002KT 9999 FEW019TCU 29/26 Q1010 NOSIG RMK A2982 TCU W RPG
98232 31570 30000 10279 20246 30109 40112 53009 70500 81100 222// 20100 33302 20262 56999
58007 81820 RH92/65 MAX.TEMP.33.0@0600Z MIN.TEMP.26.2@2200Z REM.SLIGHT SEA ET'09
98233 31568 50000 10253 20234 40121 53024 70540 83230 33302 20240 56999 58010 83820 85358='08
T MIN = 24.0 @ 5:40 AM, MX RH = 90%, MN RH = 65% MLT
98324 11462 40901 10280 20251 40107 53013 60084 70544 82211 33312 20244 56929 58006 70084 82816
83358 CC MIN TEMP 24.4 AT 2100
98325 31462 31601 10278 20252 40101 53011 70592 83970 222// 20000 33302 20260 56999 58001 81916
82817 98308 SEA SMOOTH RH-89/71 RMK-26.0/0000Z EI '56
98327 11468 50301 10278 20251 39934 40107 52011 69984 70544 82108 33312 20254 56999 59002
70008 82820 85280 MAX RH = 94 MIN RH = 44 MAX TEMP = 32.8 MIN TEMP = 25.4 / 4:57 am - GD '13
98328 11364 23601 10186 20177 38523 40097 53022 69901 70544 82201 33311 20164 50270 55007
59014 82814.rh 98/82% min temp.16.4 dc @0250z:-) rb'02
98330 11475 20000 10274 20258 30074 40106 52019 69964 70544 81901 33301 20255 56999 58008
70006 81918 555 00045 Tmin. = 25.5C at 6:00AM dc'08
98334 12480 20000 10270 20243 39897 40099 52013 60014 82200 33311 20245 56999 59001 70010
82819 small short wavelets jg.Mx33.0at3pmmn24.5at8pmrh 91/55 '13
98336 31465 30402 10284 20252 30102 40107 53009 70544 82201 33301 20244 56999 58003
82815. Min. 244-2352 Z. AM '14
98425 11560 80402 10280 20237 40101 52007 60224 70544 8487/ 33312 20245 56999 58004 70220
84620 88358 min. 2200 PL'59
98426 31568 60000 10285 20243 40098 52010 70544 81178 33303 20252 56999 58002 81820 83358
86280; RHmax=88%, min=53%; Tmin=25.2C@2145UTC ...HH'08
98427 11470 30000 10270 20236 39923 40100 53006 60154 70182 82360 33302 20240 56999 59004
70150 82917 83357 555 20010 RD RH 90/68 tme lwst tmp. 2200 Z'11
98428 11460 43602 10281 20242 40097 53008 60114 71541 82961 33312 20274 56999 58001 70106
81917 81618= 2100Z 89/62 /cg'56
98429 11559 70402 10278 20228 40098 51009 69954 71544 85970 333 20258 56999 58000 70005
82922 83823 85359 = t.m. 25.8/3;27pm - ct'08
98430 1145870401 10274 20242 40100 52009 60454 70584 84970 333 20240 56999 58000 70447
81918 82620 85358 555 29997=min t 24.0/6am=cd'59
98431 31470 41602 10271 20251 30057 40104 53005 70544 82230 33302 20253 56999 58002 82818
84360= rh-90/60 tmin@2330z kb'54
98432 11468 70000 10264 20254 40097 52008 69904 70584 83570 33312 20248 56999 58001 79999
83618 86358 555 29999 rh-90/56 total mo rr- 309.1 mm rbm'06
98433 31462 73203 10228 20211 39386 40104 53004 60044 71544 85970 33301 20205 56999 58005
70044 85912 lp '01
21433 31460 73203 10217 20207 39382 40102 55002 74022 83270 33302 56999 83812 87357 lp2149
98434 11468 73603 10265 20259 30098 40106 53014 60114 72582 84908 33313 20251 56199 58000
70109 83917 81818 84275 555 20018 RH= 97/74 min temp2984 NEG RMK. NB/VV @ 2200Z / RR'07
98435 12472 30000 10284 20255 40076 53005 60064 82101 33311 20244 56999 59006 70055 82818='09
RH=95/75 min temp=24.4C RL
98440 11470 20000 10274 20258 30093 40098 53009 60504 70280 81931 33312 20225 56999 58000 70503
81915 81816 sb
98444 12468 20000 10280 20252 30076 40095 53013 60015 82201 33311 20245 56999 59001 70010
82818 tmax@2200z rh=61-97% MT '57
98446 12475 62202 10240 20205 30065 40111 53007 60024 84208 33312 20242 56599 58001 70020
84818 86270 RH=94/74.JP'13
98526 12575 80000 10270 20249 40103 5//// 69924 82207 33312 20238 56999 59001 70002 82820 88274 98315 jbf
*/

struct CSV
{
   string ary[10]; // A,B,C,D,E,F,G,H,I,J; //

   void extract( const string& str )
   {
       istringstream iss( str );
       string tmp;
       iss >> tmp;
       ary[0] = tmp.substr(0,2);
       ary[1] = tmp.substr(2);

       iss >> tmp;
       ary[2] = tmp.substr(0,1);
       ary[3] = tmp.substr(1,3);

       // initIal ... just in case bad data ... and not RE-SET ... //
       //ary[4] = ary[5] = ary[6] = ary[7] = ary[8] = ary[9] = "x";

       for( int i = 0; i < 6; ++ i )
            iss >> tmp; // skip over 3,4,5,6,7 and read 8
       ary[4] = tmp.substr(0,1);
       ary[5] = tmp.substr(1,3);

       iss >> tmp;
       ary[6] = tmp.substr(0,1);
       ary[7] = tmp.substr(1,3);

       while( iss >> tmp && tmp.substr(0,3) != "333" ) ; // skip over until "333..." reached ... //

       iss >> ary[8] >> tmp >> tmp >> ary[9];
   }


   friend ostream& operator << ( ostream& os, const CSV& rec )
   {
       for( int i = 0; i < 9; ++ i )
            os << rec.ary[i] << ',';
       return os << rec.ary[9] << endl;
   }

} ;




int main()
{
    ifstream fin( FILE_IN );
    ofstream fout( FILE_OUT );
    if( fin && fout )
    {
        CSV rec;
        string line, line_next;
        int count = 0;

        // skip over first 3 lines ... and read 4th //
        while( getline( fin, line ) && ++count < 4 ) ;

        count = 0; // reset to zero //

        while( getline( fin, line_next ) )
        {
            if( line_next.substr(0,2) != "98" )
                line += ' ' + line_next;
            else
            {
                rec.extract( line );
                ++count;
                cout << count << ',' << rec; // write each 'rec' on a new line, to the console screen
                fout << count << ',' << rec; // write each 'rec' on a new line, in the output CSV file (FILE_OUT)
                line = line_next; // update next rec/line ... //
            }
        }

        rec.extract( line ); // need to process the LAST record, which is NOT processed in the loop above ... //
        ++count;
        cout << count << ',' << rec; // write each 'rec' on a new line, to the console screen
        fout << count << ',' << rec; // write each 'rec' on a new line, in the output CSV file (FILE_OUT)

        fout.close();
        fin.close();
    }
    else
    {
        if( !fin ) cout << "There was a problem opening file " << FILE_IN  << endl;
        if( !fout ) cout << "There was a problem opening file " << FILE_OUT << endl;
    }
}

Hello Sir David W;

I have been doing this in VB.net, but I ran into problems:

1. I don't know how to process multiple files.
2. It has a GUI, which makes the work take more time.
3. Some strings, like "9////", cannot be found.
4. When I process a .txt file opened with Excel (using VB.net code), some lines don't follow my delimiter settings, so some results look like "984329876567": the whitespace is ignored and the cells become one long number.

So I turned to C++, which might be able to do the following:

1. Process multiple text files into Excel with only the wanted values (I really don't know whether this is possible).
2. No GUI; go directly from text file to Excel (which will take less time).
3. Handle a string even if it is "9////" or "abcde".

Sorry if my code is not good, but I'm trying my best :(

In VB.net, I can check for the "98" by doing this:

 If row(0).ToString.StartsWith("98") Then

In C++, can I do this? (Based on the code you provided):

If (ary[0] == "98") //since the ary[0] must contain "98"
{
    ary[0] = tmp.substr(0,2);
}
else
    iss >> tmp; //this should skip on next line that have "98" on ary[0]; must read next line;

Am I on the right path?

Thank you; God Bless;

-zelrick

You have already been given a working C++ example solution ... if you are really sincere about learning to program in C++, then ... you really do need to start at the beginning ... as I previously indicated, and even provided some links to help you to start.

For example ... if I were to attempt to talk to you about 'entropy' and the 2nd law of thermodynamics ... before you had studied 'energy' ( / 'heat' / 'work' ) and the 1st law of thermodynamics ... and 'the atomic nature' of all matter ... and the concept of 'temperature' ... would you be able to understand and solve problems that related to the concept of 'entropy' ?

http://developers-heaven.net/forum/index.php/topic,2587.msg3103.html#msg3103

Hello Sir David W;

I'm afraid I don't have enough time to study all of it. But can you please help me figure it out?

string s ("98");
if (ary[0] == s) {
    ary[0] = tmp.substr(0,2);
}
else {
    string tmp;
    getline( fin, tmp);
    return 1;
}

And lastly, Sir David W, what if ary[9] is blank? Can this code handle it?

if (ary[9] == ""){
    cout<< "N/A";
    string tmp;
    getline( fin, tmp);
    return 1;
}

PS: (Edit Post)

I first commented out everything from:

iss >> ary[9];
// through:
return 1

The result is "x".

That worked for ary[4], but in our current code ary[9] picks up the next line, the one that starts with 98.

Thank you; God Bless;

-zelrick

Did you try out the working code I provided?

Did you read the comments in the code and try to follow the steps?

If you do not take time to read the instructions and follow them ... then why should I take any more time with you?

Hello Sir David W;

Yes sir. I have been trying out the working code, but I'm afraid of messing it up, so I'm asking before I change it.

Sir, is my understanding of this correct?

iss >> tmp //is getting the next line?

if yes;

string s ("98");
for (int i = 1; i < 10; i++)
if (ary[0] == s) {
    ary[0] = tmp.substr(0,2);
}
else if {
    string tmp;
    getline( fin, tmp);
    istringstream iss (tmp);
    iss >>arry[i];
    return 1;
}

iss >> tmp; //is getting the next line?

No ... it is getting the next item (the next whitespace-delimited word), not the next line.

string test = "a BB ccc";
istringstream iss( test );
string str[3];
iss >> str[0] >> str[1] >> str[2];

What do you think is now in each array element?

Can you write a short program to test this out ... and see what happens?

Hello Sir David W;

I tried this:

#include <iostream>
#include <fstream>
#include <sstream>
#include <string>
using namespace std;
int main()
{
string test = "a BB ccc";
istringstream iss( test );
string str[3];
iss >> str[0] >> str[1] >> str[2];
return 0;
}

There was no output, so I added:

cout<< str[3]; //before return 0;

and weird things happened.

But as I understand it, the output should be "a BB ccc": 0 stands for a, 1 for BB, and 2 for ccc.

In C++ you need to write your own code to print out the elements of an array ...

string test = "a BB ccc";
istringstream iss( test );
string str[3];
iss >> str[0] >> str[1] >> str[2];
// try this ... or code a loop
cout << str[0] << ',' << str[1] << ',' << str[2];

Hello Sir David W;

Ok sir I got it;

#include <iostream>
#include <fstream>
#include <sstream>
#include <string>
using namespace std;
int main()
{
string test = "a BB ccc";
istringstream iss( test );
string str[3];
iss >> str[0] >> str[1] >> str[2];
for (int i=0; i < 3; i++)
cout << str[i];
return 0;
}

But iss >> is still hard for me to understand.

Does iss mean something like "next", and is ">>" the symbol for calling?

Like cin >>, which calls a variable.

example:
int a = 0;
cout<<"....";
cin>>a; //this call variable a

Sir David W, can you explain this part a bit more? Thank you.

istringstream iss( str );
string tmp;
iss >> tmp;

Can you test this code ?

cout << "Enter 3 words separated by a space: ";
string a,b,c;
cin >> a >> b >> c;
cout << "You entered: " << a << " and " << b << " and " << c << endl;

Do you see how the extraction operator >> works with a cin stream?

istringstream objects are 'string streams': via the extraction operator >> you can extract strings (or other types) from the stream, one whitespace-delimited item at a time.
