Hello everybody!

I have to go through each subfolder and extract matching strings from each of the files (File_1, File_2, File_3, File_4, etc.). Unfortunately, I don't know how to do that under Linux.
I have the following structure:

MainFolder:
---Subfolder A
------File_1
------File_2
------File_3
---Subfolder B
------File_4
------File_5
------File_6
---Etc.

My original code only goes through all the files in the first subfolder (Subfolder A).

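import glob
import os
import sys

path = sys.argv[1]
# '*.*' only matches names that contain a dot, directly inside path
for file in glob.glob(os.path.join(path, '*.*')):
    print("Current file:", file)
    with open(file, 'r') as f:
        for line in f:
            fields = line.split()
            # do something with fields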

How do I go through the rest of the subfolders? I tried os.walk(), but so far I have only managed to make it list all the subfolders and files; I can't get it to open each file. Can anyone help?

Thank you,
M.

You can use os.walk this way:

import os

def must_open(dirpath, filename):
    """Change this function to decide whether a file should be opened."""
    return True

def opened_files(*args):
    """Generate a sequence of pairs (path to file, opened file)
    for a given directory. Takes the same arguments as os.walk.
    Each file is closed when the loop moves on to the next one."""
    for dirpath, dirnames, filenames in os.walk(*args):
        for filename in filenames:
            if must_open(dirpath, filename):
                filepath = os.path.join(dirpath, filename)
                with open(filepath, "r") as f:
                    yield (filepath, f)

# main loop:
mydir = "/home/user/path/to/my/dir"
for filepath, file in opened_files(mydir):
    for line in file:
        pass  # do something with each line
commented: Thank you. Your solution works! +1
commented: nice opened_files generator +2
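
Since the original question was about extracting matching strings from every file, here is a minimal sketch of how the same os.walk loop could be combined with a regular expression. The function name find_matches, the pattern argument, and the default utf-8 encoding are assumptions made for illustration; they do not come from the posts above, so adjust them to whatever "matching" means for your files.

```python
import os
import re


def find_matches(root, pattern, encoding="utf-8"):
    """Yield (filepath, line_number, matched_text) for each regex match
    found in any file below root. The name and signature are hypothetical."""
    regex = re.compile(pattern)
    for dirpath, dirnames, filenames in os.walk(root):
        # Sorting in place makes the traversal order predictable.
        dirnames.sort()
        for filename in sorted(filenames):
            filepath = os.path.join(dirpath, filename)
            # errors="replace" keeps the walk going when a file is not valid text.
            with open(filepath, "r", encoding=encoding, errors="replace") as handle:
                for line_number, line in enumerate(handle, start=1):
                    for match in regex.finditer(line):
                        yield filepath, line_number, match.group(0)
```

To use it, loop over the generator, for example over find_matches(mydir, r"foo \d+"), and print or collect each (filepath, line_number, matched_text) triple. Because the files are opened inside a with block, each one is closed before the next is read.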