Editing CSV files

Question

lstensland 0 Newbie Poster

14 Years Ago

I am new to programming and have been reading a few books on Pyhton but can't seem to grasp what I believe is probably a simple process. I need someone to show me briefly a sample script that would allow me to open a CSV file and edit it line by line. This is an example of part of the CSV file I have.

Links:

I want to remove certain things from each line, for example the numbers at the beginning of each line, entire lines that have certain words, etc. This is a very short part of the entire file but from this all I would want left is a new file containing this:

http://www.example.com/products

I can't figure out how to create a script that would weed through a CSV file and remove portions of lines or entire lines depending on the content.

Any help in the right direction would be much appreciated. Thanks.

python

5 Contributors
15 Replies
440 Views
2 Weeks Discussion Span
Latest Post 14 Years Ago Latest Post by lstensland

jice 53 Posting Whiz in Training

14 Years Ago

Your datas aren't in a csv format...
For csv, here is a simple example to read and write datas

import csv

# This class is to define another csv format if you need to
class excel_french(csv.Dialect):
    delimiter=';'
    quotechar='"'
    doublequote=True
    skipinitialspace=False
    lineterminator='\n'
    quoting=csv.QUOTE_MINIMAL

fic='yourfile.csv'
outcsvfic='out.csv'

csv.register_dialect('excel_french', excel_french)

cr=csv.reader(open(fic,"r"), 'excel_french') # 'excel_french' is optionnal.
            # Only if you want to use another csv format than the default one
cw=csv.writer(open(outcsvfic,'w'), 'excel_french')

for row in cr:
    print row
    cw.writerow(row[:2])

More informations in the doc
http://docs.python.org/library/csv.html

TrustyTony 888 ex-Moderator

14 Years Ago

To take only links which contain products from your (not CSV) data:

for x in open('linklist.txt').readlines():
    _,_,link=x.partition(' ==> ')
    if 'products' in link: print link

jice 53 Posting Whiz in Training

14 Years Ago

To take only links which contain products from your (not CSV) data:
for x in open('linklist.txt').readlines():
    _,_,link=x.partition(' ==> ')
    if 'products' in link: print link

Why do you "readlines()" ?
As far as i know, this loads the whole file in memory which is of no use...

for x in open('linklist.txt'):
    _,_,link=x.partition(' ==> ')
    if 'products' in link: print link

would do exactly the same

TrustyTony 888 ex-Moderator

14 Years Ago

Just catched it from some newbie posters I answered recently, I do normally use in open(..)

Reply to this topic

Be a part of the DaniWeb community

We're a friendly, industry-focused community of developers, IT pros, digital marketers, and technology enthusiasts meeting, networking, learning, and sharing knowledge.

lstensland 0 Newbie Poster · Answer 1 · 2010-05-26T20:39:31+00:00

Your datas aren't in a csv format...
For csv, here is a simple example to read and write datas

import csv

# This class is to define another csv format if you need to
class excel_french(csv.Dialect):
    delimiter=';'
    quotechar='"'
    doublequote=True
    skipinitialspace=False
    lineterminator='\n'
    quoting=csv.QUOTE_MINIMAL

fic='yourfile.csv'
outcsvfic='out.csv'

csv.register_dialect('excel_french', excel_french)

cr=csv.reader(open(fic,"r"), 'excel_french') # 'excel_french' is optionnal.
            # Only if you want to use another csv format than the default one
cw=csv.writer(open(outcsvfic,'w'), 'excel_french')

for row in cr:
    print row
    cw.writerow(row[:2])

More informations in the doc
http://docs.python.org/library/csv.html

Thanks for your reply. I know this is basic stuff but seeing an example is the best way for me to learn how this all ties together. I will continue reading up on Python but this information has been a great help. Thanks.

lstensland 0 Newbie Poster · Answer 2 · 2010-05-26T20:42:22+00:00

To take only links which contain products from your (not CSV) data:
for x in open('linklist.txt').readlines():
    _,_,link=x.partition(' ==> ')
    if 'products' in link: print link

Thanks for your reply. This is another piece of very helpful information. I appreciate you taking your time to help me with this.

lstensland 0 Newbie Poster · Answer 3 · 2010-06-06T10:59:33+00:00

I have found that it may be more helpful for my situation to be able to delete several items from my links. From a sample link this:

http://www.testsite.com/site-map
http://www.testsite.com/customer-login
http://www.testsite.com/
http://www.testsite.com/products
http://www.testsite.com/services
http://www.testsite.com/become-a-customer
http://www.testsite.com/contact
http://www.testsite.com/about
http://www.testsite.com/LearningCenter
http://www.testsite.com/knivesandtools
http://www.testsite.com/knivesandtools
http://www.testsite.com/knives
http://www.testsite.com/knivesaccessories

I would like to be able to delete all links that have: site-map, customer-login, services, become-a-customer, contact, about, and LearningCenter. Any ideas?

TrustyTony 888 ex-Moderator Team Colleague Featured Poster · Answer 4 · 2010-06-06T11:21:35+00:00

Like this:

from __future__ import print_function

not_these=['site-map',
           'customer-login',
           'services',
           'become-a-customer',
           'contact',
           'about',
           'LearningCenter']

for i in open('links.txt'):
    for nono in not_these:
        if nono in i:
##            print (i, 'has',nono)
            break
    else: print(i, end='')

lstensland 0 Newbie Poster · Answer 5 · 2010-06-08T08:35:04+00:00

Like this:

from __future__ import print_function

not_these=['site-map',
           'customer-login',
           'services',
           'become-a-customer',
           'contact',
           'about',
           'LearningCenter']

for i in open('links.txt'):
    for nono in not_these:
        if nono in i:
##            print (i, 'has',nono)
            break
    else: print(i, end='')

Yes that's it. I just need this to be written to a csv file instead of printed but I will try to do that myself. Like I said, I am quite new. I have been through a few python books and am finding it difficult to grasp. I understand what I read and can create the examples in the book, but when it comes to creating a script on my own I have a hard time getting my ideas to actually work.

Do you have any suggestions you could offer as far as books or tutorials that may better explain Python or give more complete examples that I can learn from?

Thanks.

Gribouillis 1,391 Programming Explorer Team Colleague · Answer 6 · 2010-06-08T09:55:37+00:00

Perhaps you could read the sticky thread Starting Python http://www.daniweb.com/forums/thread20774.html !

ennoil 0 Newbie Poster · Answer 7 · 2010-06-09T00:15:19+00:00

2 things...First, and this will make life easier for what you want to do, you are NOT dealing with CSV files. These are plain text files which are easier to work with.
Second, I have been learning Python as well. I have spent (and in some cases wasted) money on several books. The 2 books that I have used most have been "Beginning Python, Using Python 2.6 and Python 3.1" (by James Payne, Wrox) and "Python: Essential Reference" (by David M Beazley, Addison Wesley). I start with the beginning book, read the first 3-4 chapters, then started writing and used the indexes of both to find things I need...I then ask here or other places if Google does not help.
I do know BASH scripting and have been trying to convert my knowledge to Python...

lstensland 0 Newbie Poster · Answer 8 · 2010-06-09T03:29:44+00:00

The files I am using are CSV files, I open them with Microsoft Excel. When I post information on them I just copy the text from Excel and paste it here to show what information I am dealing with.

Thanks for the advice on the books, I will look into them right away.

ennoil 0 Newbie Poster · Answer 9 · 2010-06-09T07:57:15+00:00

When I think of a CSV file, I look for something like:

"Column 1","Column 2","Column 3"

You gave us:

0. blog ==> http://blog.example.com/
1. site map ==> http://www.example.com/site-map

Unless this is one column from your CSV file then it can be treated like a normal text file. It would be easier to work with then. I am also not sure if the csv module would even recognize this as a CSV file (if it does, can you use "==>" as the delimiter???). Because of the text between the number and the ==>, you would have a really hard time using spaces as delimiters...

lstensland 0 Newbie Poster · Answer 10 · 2010-06-10T06:48:25+00:00

When I think of a CSV file, I look for something like:
"Column 1","Column 2","Column 3"
You gave us:
0. blog ==> http://blog.example.com/
1. site map ==> http://www.example.com/site-map
Unless this is one column from your CSV file then it can be treated like a normal text file. It would be easier to work with then. I am also not sure if the csv module would even recognize this as a CSV file (if it does, can you use "==>" as the delimiter???). Because of the text between the number and the ==>, you would have a really hard time using spaces as delimiters...

I think you might be right. I have a script that pulls links and I have told it to save as "example.csv". I get a list from top to bottom of links. These are not separated by commas but the file is ".CSV" and is opened with Excel.

You are right about using "==>" as the delimiter. Someone replied earlier with a script that uses this and the script works. The reason I am saving as .csv is because the final output will be a CSV file like the example I gave earlier and will be a database that I will upload to a website. I am going to have certain info in certain cells in an Excel sheet that will need to be modified before the upload. I think it will be more organized and easier to use if I can say for example, "Append column C to column E" or "Multiply column d by 1.25".

While it is not actually separated by commas, if I were to manually fill in an Excel sheet with the information and select "Save as CSV", the output will be the same.

lstensland 0 Newbie Poster · Answer 11 · 2010-06-10T06:49:03+00:00

I think you might be right. I have a script that pulls links and I have told it to save as "example.csv". I get a list from top to bottom of links. These are not separated by commas but the file is ".CSV" and is opened with Excel.

You are right about using "==>" as the delimiter. Someone replied earlier with a script that uses this and the script works. The reason I am saving as .csv is because the final output will be a CSV file like the example I gave earlier and will be a database that I will upload to a website. I am going to have certain info in certain cells in an Excel sheet that will need to be modified before the upload. I think it will be more organized and easier to use if I can say for example, "Append column C to column E" or "Multiply column d by 1.25".

While it is not actually separated by commas, if I were to manually fill in an Excel sheet with the information and select "Save as CSV", the output will be the same.