I'm new at scripting and I need help with converting multiple ".TXT" files to one ".CSV" file. Each file will have a different number of lines, then after the last line there is a blank line then more data, I don't want to use the data after the blank line. I need help with this today if possible.
The contents are as follows.

AAAUSA-0046 AAAA Functional Aggregate AAAAA061,100.00,0.00,0.00,good
AAAUSA-0057 AAAA CCCC - AAAAA026 - Functional Aggregate,100.00,0.00,0.00,good

Each line starts with the AAAUSA-nnnn, I need to have this in the first field of the CSV file. The second field in both lines will be different titles and some will include special characters.
The third field will be the "AAAAA061" .
The fourth will be the 100.00.

The "AAAAA061" and "AAAAA026" may be before or after the "Functional Aggregate", This isn't a big concern since I want to eliminate "Functional Aggregate".
Next I want to eliminate the commas and keep only the 100.00.
The output should look like this: for each row.
AAAUSA-0046 AAAAA AAAAA061 100.00

Field-1 = AAAUSA-0046
Field2 = AAAAA ( this could be multiple words with commas, dashes and spaces etc.)
Field3 = AAAAA061
Field4 = 100.00

Thanks in advance.

Recommended Answers

All 15 Replies

Windows batch files isn't sophisticated enough to do all that. You will need to write a small program in one of the man programming languages such as C or C++.

Is this the output you want?

AAAUSA-0046,AAAA,AAAAA061,100.00
AAAUSA-0057,AAAA CCCC -,AAAAA026,100.00
Press any key to continue . . .

I'll be running the script through a cygwin command shell, so couldn't this be done using regex expressions?

I need it with out the commas, so that it can be put into an XLS spread sheet.
AAAUSA-0046 AAAA AAAAA061 100.00
AAAUSA-0057 AAAA CCCC AAAAA026100.00
The "AAAA" or "AAAA CCCC" would be in the second sell of the spread sheet.

Cell A1 would contain - AAAUSA-0057
Cell A2 would contain - "AAAA" or "AAAA CCCC"
Cell A3 would contain - AAAAA026
Cell A4 would contain - 100.00

I don't know. But it only took me about 1/2 hour to write a C++ program, 70 line (including blank lines) that produced the output I posted. And it will run a whole lot quicker than doing that with a shell script.

>>I need it with out the commas, so that it can be put into an XLS spread sheet.

"AAAUSA-0046" "AAAA" "AAAAA061" 100.00
"AAAUSA-0057" "AAAA CCCC" "AAAAA026" 100.00
Press any key to continue . . .

I doing this as a training exercise for work and was requested to do it as a script. How about power shell? I never worked with programs so I'm not sure how to do C or C++.

If you're using cygwin then you're not doing a windows batch script, you're more than likely using bash, sh, tcsh, zsh, etc. What shell are you running in cygwin?

I'm using bash in cygwin, but It would be best to use a windows batch script if possible.

you can't do windows script in cygwin. why not this

I am able to do commands such as ls and pwd so I can perform some unix commands, but I don't have GnyWin32 in the path. Do I need to download Coreutils?

You're using bash in cygwin but saying a windows batch file would be preferable but exploring adding coreutils... what language can this be in? I'm afraid you're out of luck when it comes to a .BAT file unless you can hunt down those guys who made as400 batch files... they did things I didn't know DOS was capable of.

Is a bash script an acceptable solution?

Member Avatar for onaclov2000

If you have perl, this would be a simple program to come up with.....you can just grab all .txt files, then open each of them, and read it in print out as needed, and once you see for example 2 blank lines, stop getting data.....

Sorry I didn't get back sooner, The shell is BASH. It was suggested to use ADO instead, any thoughts?

What I forgot to mention is that I need to read in the numbers at the end of each line. They represent percentages so the values can vary.
EX: 100.00,0.00,0.00,good

I came up with a vb script that reads in the data but doesn't stop at the blank line of the input. Also every thing is put into the first cell of each row.
The typical line will have the following: What is in the () would be the in the 2nd cell of each row of the csv file, I want to eleminate the dashes anf the words "Functional Aggregate". The other issue is that some rows may have the "Functional Aggregate" and the AAAAA026 switched around.
AAAUSA-0057 (AAAA CCCC) - AAAAA026 - Functional Aggregate,100.00,0.00,0.00,good

The output needs to look like the following:
AAAUSA-0057" "AAAA CCCC" "AAAAA026" "100.00" "0.00" "0.00"
this would be cells A1 - A6 and so on.

This is the VBS code I have so far.

Const xlGeneralFormat = 1
Const xlNormal = -4143
Const xlLastCell = 5

Dim sFiNa
Dim oFS
Dim oExcel
Dim oWBook
Dim sTmp

' Text file to work with

sFiNa = "FunBusAgg"

Set oFS = CreateObject( "Scripting.FileSystemObject" )
sFiNa = oFS.GetAbsolutePathName( sFiNa )
Set oExcel = CreateObject( "Excel.Application")

oExcel.Visible = True
' oExcel.Whatever = False

sTmp = "Working with MS Excel Vers. " & oExcel.Version _
& " (" & oExcel.Workbooks.Count & " Workbooks)"

oExcel.Workbooks.Open sFiNa, xlGeneralFormat
Set oWBook = oExcel.Workbooks( 1 )

' oExcel.Range(oExcel.cells(1,1),oExcel.cells(100,1)).Select
oExcel.Range( oExcel.cells( 1, 1 ) _
, oExcel.cells( oWBook.Sheets( 1 ).UsedRange.SpecialCells( xlLastCell ).Row , 1 ) _
).Select
oExcel.Selection.TextToColumns oExcel.Range("A1"), xlGeneralFormat

' save as XLS
oWBook.SaveAs sFiNa + ".csv", xlNormal


' oWBook.Close
' oExcel.Quit

Be a part of the DaniWeb community

We're a friendly, industry-focused community of developers, IT pros, digital marketers, and technology enthusiasts meeting, networking, learning, and sharing knowledge.