Dear friends,
sorry for my ignorance about Python containers, but I need to store (and access) 4 different values per file in a folder: the filename (with its path) and 3 other values (each a string or None). When I access the database, I will need to look up the 3 values for one specific file.
I have been exploring different solutions, but since I am not an expert, I have not managed to find a good one.
Would a list of tuples work well? Something like

import glob

mylist = []
for file in glob.glob(inputpath + "\\*.txt"):
    # one tuple per file, matching the "list of tuples" described above
    mylist.append((file, value1, value2, value3))

I was even thinking of saving the tuples in a file and accessing them once the process is completed, but I am not sure this is the fastest/best way.
Thank you for your help,
Gianluca

All 6 Replies

You can use a namedtuple as a simple container, and pickle to save the list to a file and load it back:

import glob
from collections import namedtuple

FileInfo = namedtuple("FileInfo", "path foo bar baz")

mylist = []
for file in glob.glob(inputpath + "\\*.txt"):
    mylist.append(FileInfo(file, value1, value2, value3))

import pickle
pkl = "fileinfo.pkl"

with open(pkl, "wb") as ofh:
    pickle.dump(mylist, ofh)

with open(pkl, "rb") as ifh:
    print(pickle.load(ifh))

Great, thank you.
And how can I get the foo of the third file, for example?

Sorry, I misunderstood your question. You can go with pickle if the number of records is small enough and you are ready to load the whole list in memory to access elements.
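To answer the question directly: once the list is in memory (freshly built, or returned by pickle.load), the third record is at index 2 and its fields are attributes. Here is a minimal sketch with made-up sample records standing in for the real glob results. The by_path dictionary is my own addition, assuming the paths are unique, for looking up one specific file by name.

```python
from collections import namedtuple

FileInfo = namedtuple("FileInfo", "path foo bar baz")

# Made-up sample records standing in for the real glob results.
mylist = [
    FileInfo("a.txt", "a0", "a1", None),
    FileInfo("b.txt", "b0", None, "b2"),
    FileInfo("c.txt", "c0", "c1", "c2"),
]

# Third file: index 2 (indices start at 0), then the field by name.
third_foo = mylist[2].foo

# Lookup by path: build a dict keyed on the path once,
# then each lookup no longer scans the whole list.
by_path = {rec.path: rec for rec in mylist}
b_values = by_path["b.txt"][1:]   # slice off the path, keep (foo, bar, baz)

print(third_foo)
print(b_values)
```

Indexing by position only helps if you know the order of the files; the dictionary is usually the more convenient way to find the 3 values of one given file.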

Otherwise there are many persistence solutions in Python. A simple one in your case is the ZODB module, a package which was created for the Zope framework but is independent of Zope. Install it with pip (pip install ZODB) or, on older setups, with easy_install. Here is the code for your case:

#!/usr/bin/env python
# -*-coding: utf8-*-
from __future__ import unicode_literals, print_function

from collections import namedtuple
from contextlib import contextmanager
from persistent.list import PersistentList

@contextmanager
def zodb(path, create = False):
    """
    My own context to open easily a zodb database
    """
    from ZODB import FileStorage, DB
    import transaction
    storage = FileStorage.FileStorage(path, create = create)
    db = DB(storage)
    conn = db.open()
    root = conn.root()
    try:
        yield root
    finally:
        transaction.commit()
        conn.close()
        db.close()
        storage.close()

FileInfo = namedtuple("FileInfo", "path foo bar baz")
datafile = "mylist.fs"
CREATE = True

if CREATE:
    mylist = PersistentList()
    mylist.append(FileInfo("file1", "val0", "val1", "val2"))
    mylist.append(FileInfo("file2", "v0", "v1", None))

    with zodb(datafile, create=True) as root:
        root["mylist"] = mylist
else:
    with zodb(datafile) as root:
        print(root["mylist"][1].foo)

Run it the first time with CREATE = True, then with CREATE = False.

Another solution could be a sqlite3 database, or an HDF5 file managed with the h5py module (I like this solution).
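For the sqlite3 route, something along these lines might work. It is only a sketch: the table name fileinfo, the column names foo/bar/baz and the lookup helper are placeholders I made up, and it uses an in-memory database so that it runs as-is (pass a filename to sqlite3.connect to keep the data on disk).

```python
import sqlite3

# ":memory:" keeps the sketch self-contained; use e.g. "fileinfo.db" to persist.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE IF NOT EXISTS fileinfo ("
    "path TEXT PRIMARY KEY, foo TEXT, bar TEXT, baz TEXT)"
)

rows = [
    ("file1", "val0", "val1", "val2"),
    ("file2", "v0", "v1", None),   # None is stored as SQL NULL
]
with conn:  # commits on success, rolls back on error
    conn.executemany("INSERT OR REPLACE INTO fileinfo VALUES (?, ?, ?, ?)", rows)

def lookup(path):
    """Return (foo, bar, baz) for one file, or None if the path is unknown."""
    cur = conn.execute(
        "SELECT foo, bar, baz FROM fileinfo WHERE path = ?", (path,)
    )
    return cur.fetchone()

result = lookup("file2")
missing = lookup("no_such_file")
print(result)
conn.close()
```

Making path the primary key means the database looks up one file without reading the whole table into memory, which is the case pickle handles poorly.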

I read that for large lists, the zc.blist module would be a useful add-on to ZODB. It hides a B-tree storage structure behind a list interface. A PyPI search for 'zodb' may reveal other goodies.

Assuming filenames are unique, create a Python dictionary where the filenames are the keys:
mydict = {"file1": ("val0", "val1", "val2"), "file2": ("val10", "val11", "val12"), ...}
and then use the Python module shelve to create a 'persistent to file' dictionary.

Here is an example of module shelve:

''' Shelve_test1.py
use a Python dictionary and module shelve
to create a 'persistent to file' dictionary

The files created depend on the dbm backend in use.
On my system Python2 creates a single (24kb) file 'my_shelve.slv'

and Python3 creates 3 much smaller (1kb) files:
"my_shelve.slv.dir"
"my_shelve.slv.dat"
"my_shelve.slv.bak"
'''

import shelve
import pprint

# name shelve byte stream data file
data_file = "my_shelve.slv"

# save a python dictionary object to a file
# access key will be string 'comp_d' or whatever you pick
sh = shelve.open(data_file)
# this will auto-save to the shelve file
sh['comp_d'] = {'upgrade': 'steep hill', 'chip': 'munchies for TV'}

sh.close()

# retrieve the python dictionary object from the file
# use correct access key 'comp_d'
sh = shelve.open(data_file)
dc = sh['comp_d']

# then add additional data
dc['cursor'] = 'someone who swears'
# reassign via access key 'comp_d'; modifying dc alone is not
# written back unless the shelf was opened with writeback=True
sh['comp_d'] = dc

# close when done
sh.close()

# test the whole thing
# retrieve the updated dictionary from the file
sh = shelve.open(data_file)
mydict = sh['comp_d']
pprint.pprint(mydict)

print('-'*30)

# or pick a particular item
print(mydict['cursor'])

# close when done
sh.close()

""" my output -->

{'chip': 'munchies for TV',
 'cursor': 'someone who swears',
 'upgrade': 'steep hill'}
------------------------------
someone who swears

"""
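Applied to the original question, each filename could be its own shelf key instead of storing one big dictionary under a single key. This is a rough sketch, assuming Python 3.4 or later (where a shelf can be used in a with statement); the sample values and the temporary file location are invented for the demo.

```python
import os
import shelve
import tempfile

# Temporary location so the demo does not litter the working directory.
data_file = os.path.join(tempfile.mkdtemp(), "fileinfo_shelf")

# Each filename is its own shelf key (keys must be strings);
# the value is the tuple of the 3 other values.
with shelve.open(data_file) as sh:
    sh["file1"] = ("val0", "val1", "val2")
    sh["file2"] = ("val10", None, "val12")

# Later, possibly in another run: only the requested entry is unpickled.
with shelve.open(data_file) as sh:
    value1, value2, value3 = sh["file2"]
    has_file3 = "file3" in sh

print(value1, value2, value3)
```

With one key per file, reading the 3 values of a single file does not require loading all the other records.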

I would stick with a list of tuples and module pickle to save/dump and load the container object. Use a list of lists if you want to edit the contents.
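The difference between the two containers, in a short sketch with invented values: a list record can be edited in place, while a tuple raises TypeError on item assignment. The namedtuple suggested earlier in the thread is also immutable, but its _replace method returns an updated copy.

```python
from collections import namedtuple

FileInfo = namedtuple("FileInfo", "path foo bar baz")

as_tuple = ("file1", "val0", "val1", "val2")
as_list = ["file1", "val0", "val1", "val2"]
as_named = FileInfo(*as_tuple)

# Lists can be edited in place.
as_list[1] = "new0"

# Tuples cannot: item assignment raises TypeError.
try:
    as_tuple[1] = "new0"
    tuple_editable = True
except TypeError:
    tuple_editable = False

# A namedtuple is immutable too, but _replace builds an updated copy
# and leaves the original record unchanged.
updated = as_named._replace(foo="new0")

print(as_list, tuple_editable, updated)
```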
