Convert a group of csv files to another group of csv files

Question

skiabox 0 Light Poster

12 Years Ago

I have some files, consisting of end of day stock data in the following format :

Filename: NYSE_20120116.txt
<ticker>,<date>,<open>,<high>,<low>,<close>,<vol>
A,20120116,36.15,36.36,35.59,36.19,3327400
AA,20120116,10.73,10.78,10.53,10.64,20457600

How can I create one file per symbol? For example, for company A:

Filename : A.txt
<ticker>,<date>,<open>,<high>,<low>,<close>,<vol>
A,20120116,36.15,36.36,35.59,36.19,3327400
A,20120117,39.76,40.39,39.7,39.99,4157900

(I don't want A.txt to contain the header line (the one starting with <ticker>) or the first column (the symbol A), even though the sample above shows both.)

I have tried to do it using a bash script, but the script is extremely slow.

Thank you.

python


hughesadam_87 54 Junior Poster

12 Years Ago

I'm not quite clear. You're saying you want to be able to open the file on the fly, and based on what you call, it will automatically ignore the first line and column? Or do you want to alter all the files in one go?

TrustyTony 888 ex-Moderator Team Colleague Featured Poster · Answer 1 · 2012-07-06T09:40:18+00:00

Collect the rows into a dictionary keyed by ticker, then write each key to its own file.

f = ["""\
<ticker>,<date>,<open>,<high>,<low>,<close>,<vol>
A,20120116,36.15,36.36,35.59,36.19,3327400
AA,20120116,10.73,10.78,10.53,10.64,20457600
""", """\
<ticker>,<date>,<open>,<high>,<low>,<close>,<vol>
A,20120117,26.15,36.36,35.59,36.49,3327400
AA,20120117,10.73,20.78,10.53,10.64,20457600
"""
     ]

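# Map each ticker to the list of its rows, with the ticker column removed.
collect = dict()

for day in f:
    # [1:] skips the header line of each daily file.
    for line in day.splitlines()[1:]:
        # Split off the ticker; d holds the remaining fields as one string.
        key, d = line.split(',', 1)
        collect.setdefault(key, []).append(d)


# Print each ticker followed by its rows (a real script would write
# these rows to key + '.txt' instead of printing them).
for key, info in sorted(collect.items()):
    print(key)
    print('\n'.join(info) + '\n')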

Output:

A
20120116,36.15,36.36,35.59,36.19,3327400
20120117,26.15,36.36,35.59,36.49,3327400

AA
20120116,10.73,10.78,10.53,10.64,20457600
20120117,10.73,20.78,10.53,10.64,20457600
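
The snippet above only prints the grouped rows. To finish the "write each key to its own file" step, something like the following might work. It is a minimal sketch that assumes `collect` has the same shape as above (ticker mapped to a list of row strings without the ticker column); the function name `write_per_symbol` and the `out_dir` parameter are made up for illustration.

```python
import os


def write_per_symbol(collect, out_dir):
    """Write each ticker's rows to <out_dir>/<ticker>.txt (hypothetical helper)."""
    os.makedirs(out_dir, exist_ok=True)
    for key, rows in collect.items():
        path = os.path.join(out_dir, key + '.txt')
        # One open/write per ticker, so the number of file operations is
        # the number of symbols, not the number of input lines.
        with open(path, 'w') as out:
            out.write('\n'.join(rows) + '\n')
```

Calling `write_per_symbol(collect, 'by_symbol')` after the grouping loop would then produce `by_symbol/A.txt`, `by_symbol/AA.txt`, and so on.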

hughesadam_87 54 Junior Poster · Answer 2 · 2012-07-06T17:20:43+00:00

PyTony's way is a nice solution. I just wanted to add that, since you already mentioned trying to do this in bash, have you looked into awk? It runs from the shell and is adept at manipulating files, extracting rows and columns, etc., on the fly in the terminal. My buddy swears by it for on-the-fly data manipulation.

skiabox 0 Light Poster · Answer 3 · 2012-07-06T20:41:17+00:00

Maybe I was not so clear.
I have 24 files named like nyse_20120116, one per trading day.
Each such file contains 3200 stock symbols with their open, high, low, close and volume.
I want to create 3200 files of the form stock_name.txt (for example A.txt, AA.txt), with each file containing all the data for that stock from the 24 daily files.
I believe it is clear now.
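
For the setup described here (24 daily files of about 3200 rows each, so roughly 76,800 rows in total), everything fits comfortably in memory, and the whole job can be sketched end to end in Python. This is only a sketch under some assumptions: the function name `split_by_symbol`, its parameters, and the default pattern `NYSE_*.txt` are illustrative, and the pattern would need adjusting if the real files are named differently (for example lowercase `nyse_` or no extension).

```python
import csv
import glob
import os
from collections import defaultdict


def split_by_symbol(in_dir, out_dir, pattern='NYSE_*.txt'):
    """Group rows from all daily files by ticker and write <ticker>.txt files.

    Returns the number of ticker files written. Names and the default
    pattern are assumptions for illustration.
    """
    rows_by_ticker = defaultdict(list)
    # Sorting the file names gives chronological order, because the date
    # in the name is written as YYYYMMDD.
    for path in sorted(glob.glob(os.path.join(in_dir, pattern))):
        with open(path, newline='') as f:
            reader = csv.reader(f)
            next(reader, None)  # skip the <ticker>,<date>,... header line
            for row in reader:
                if row:  # ignore blank lines
                    # Keep everything except the first (ticker) column.
                    rows_by_ticker[row[0]].append(row[1:])
    os.makedirs(out_dir, exist_ok=True)
    for ticker, rows in rows_by_ticker.items():
        out_path = os.path.join(out_dir, ticker + '.txt')
        with open(out_path, 'w', newline='') as out:
            csv.writer(out, lineterminator='\n').writerows(rows)
    return len(rows_by_ticker)
```

A call such as `split_by_symbol('daily', 'by_symbol')` would read every matching file in `daily` and write one file per ticker into `by_symbol`, without the header line or the ticker column.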