I have a directory tree with many subdirectories, each containing many files.
I want to recursively find the 5 oldest files under the base directory as a whole, not the 5 oldest from each subdirectory.
I am writing a shell script that lists them using
ls -lRtur|egrep "txt|jpg" > /tmp/file1
Now I want to sort the entries in /tmp/file1 the same way ls -ltr does, that is, from oldest to newest.
How do I sort based on the Linux timestamp?
The file names themselves also have timestamps embedded in them,
so I could sort on those after extracting them, if that is easier.
My /tmp/file1 has entries like the one below.
-rw-rw-r--. 1 usr1 usr1 705 2010-01-22 17:25 sample20100603173659.jpg

I want to get the 5 oldest files and then delete them.

This will get you the oldest 5 files:

find . -type f -print0 | xargs -0 stat -t '%s' | awk '{ print $9 "\t" $16 }' | tr -d '"' | sort -n | head -5 | awk '{print $2}'

find . -type f : list all regular files below the current directory (no directories or links)
-print0 : terminate each file name with a NUL character instead of a newline
xargs -0 : pass each NUL-terminated name from the left side as an argument to the command on the right
stat -t '%s' : print the file info with times as seconds since the Unix epoch (this is the BSD/macOS meaning of -t)
awk ... : show the modification time and the file name, tab-separated
tr -d : remove the extraneous double quotes
sort -n : sort in numeric order, so the oldest (smallest) timestamp comes first
head -5 : show the first 5 items, which are the 5 oldest
awk ... : show only the file name

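find's -print0 and xargs -0 are used at least partly to avoid problems if the file names have spaces in them. Note that the awk steps still split on whitespace, so a name containing spaces would be cut off at the first space. Using xargs is also efficient because it passes many file names to each stat invocation instead of starting stat once per file.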

You could probably use stat -f with a custom format to avoid the first awk. The second awk could be replaced with cut -f2 (tab is cut's default delimiter), or you could use a different separator in the first awk command or in the stat -f format.

If there are a *lot* of these files, or you want to run this very often, it is worth trimming the pipeline: each awk stage and each pipe adds process overhead.

Be sure that -t works for your version of stat: on Linux, GNU stat treats -t as "terse output" and uses -c for a format string instead. Also be sure that you actually have a stat command. Not all OSs provide it by default (though it is always available "somehow").

As a Python programmer, I would use Python for this: you invoke the interpreter just once, so there is no pipe overhead, and the modification times can be sorted directly.
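For what it's worth, here is a minimal sketch of what that Python approach could look like. It is not a tested replacement for the pipeline above, just an illustration. The function name oldest_files and its suffixes parameter are my own inventions, and it assumes that "oldest" means the smallest modification time (st_mtime), not the access time that ls -u sorts by.

```python
import heapq
import os


def oldest_files(base_dir, count=5, suffixes=None):
    """Return up to `count` regular files under base_dir, oldest mtime first.

    `suffixes` is a hypothetical convenience filter, e.g. (".txt", ".jpg");
    None means "accept every file".
    """
    candidates = []
    for dirpath, _dirnames, filenames in os.walk(base_dir):
        for name in filenames:
            if suffixes is not None and not name.lower().endswith(suffixes):
                continue
            path = os.path.join(dirpath, name)
            # Skip symlinks and anything that is not a regular file,
            # roughly mirroring `find . -type f`.
            if os.path.islink(path) or not os.path.isfile(path):
                continue
            candidates.append((os.stat(path).st_mtime, path))
    # nsmallest picks the `count` smallest timestamps without a full sort.
    return [path for _mtime, path in heapq.nsmallest(count, candidates)]
```

Calling oldest_files(".", 5, suffixes=(".txt", ".jpg")) returns the five paths, oldest first. I would print and check that list before passing each path to os.remove(), since a deletion cannot be undone. Because the paths never go through a whitespace-splitting step, file names with spaces should not be a problem here.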
