Jan 11th, 2009

Large File Support - Win32

Hi all,

I'm relatively new to Python (I've been writing Python code for about half a year now) and I'm trying to figure out how to use "seek" on larger files.

I'm doing some work with large hard disk image files, and the raw devices themselves. I need the ability to seek towards the tail end of a file that is extremely large (hundreds of gigs).

Below is an excerpt of the code I'm using:

  f.seek(self.offset*512L)
  f.seek(self.sectors_per_cluster*self.start_cluster_mft*512L, os.SEEK_CUR)
  f.seek(record_number*512L*self.mft_cluster_size, os.SEEK_CUR)
  mft = f.read(512L*self.mft_cluster_size)
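Because Python integers do not overflow, the three seeks in the excerpt could in principle be folded into one absolute byte offset computed up front. The following is a minimal sketch in Python 3 syntax, not the poster's code: the function name `mft_record_offset`, the plain parameters standing in for the `self.` attributes, and the sample values are all hypothetical.

```python
SECTOR_SIZE = 512  # bytes per sector, as in the excerpt


def mft_record_offset(offset, sectors_per_cluster, start_cluster_mft,
                      record_number, mft_cluster_size):
    """Return the absolute byte offset reached by the three seeks in the excerpt."""
    partition_start = offset * SECTOR_SIZE
    mft_start = sectors_per_cluster * start_cluster_mft * SECTOR_SIZE
    record_start = record_number * SECTOR_SIZE * mft_cluster_size
    return partition_start + mft_start + record_start


# Hypothetical values, chosen only to push the result past 4 GiB.
position = mft_record_offset(offset=63, sectors_per_cluster=8,
                             start_cluster_mft=786432,
                             record_number=10000000, mft_cluster_size=2)
print(position)  # a plain int larger than 2**32
```

Whether a single `f.seek(position)` with a value this large is accepted still depends on the Python build and the platform, which is the question the rest of the thread is about.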

It looks like seek in Python currently only lets me pass a C long (a 32-bit integer) as the offset to fseek, or maybe it's just that I can't use long long (64-bit) integers with Python under my current operating system. I don't know how to get around this limitation...

The Windows API can definitely work with large files, and in Linux I believe (if I remember correctly) I could use lseek64 using C.

Is there a way around this limitation with Python? Is there a library I can use, or any other way I can do this without having to write some kind of crazy hack?

BTW, I'm using a 32 bit build of Windows XP and Python 2.5.
int3grate
Jan 12th, 2009

Re: Large File Support - Win32

Why don't you use the second parameter of the seek function?

From http://docs.python.org/library/stdty...l#file-objects :
Quote:
file.seek(offset[, whence])

Set the file’s current position, like stdio’s fseek. The whence argument is optional and defaults to os.SEEK_SET or 0 (absolute file positioning); other values are os.SEEK_CUR or 1 (seek relative to the current position) and os.SEEK_END or 2 (seek relative to the file’s end). There is no return value.

For example, f.seek(2, os.SEEK_CUR) advances the position by two and f.seek(-3, os.SEEK_END) sets the position to the third to last.

Note that if the file is opened for appending (mode 'a' or 'a+'), any seek() operations will be undone at the next write. If the file is only opened for writing in append mode (mode 'a'), this method is essentially a no-op, but it remains useful for files opened in append mode with reading enabled (mode 'a+'). If the file is opened in text mode (without 'b'), only offsets returned by tell() are legal. Use of other offsets causes undefined behavior.

Note that not all file objects are seekable.

Changed in version 2.6: Passing float values as offset has been deprecated.
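The three whence modes described in the quoted documentation can be tried out on a small in-memory buffer. This is only an illustrative sketch in Python 3 syntax; `io.BytesIO` is used here as a stand-in for a file opened in binary mode, and the ten-byte payload is made up.

```python
import io
import os

buf = io.BytesIO(b"0123456789")  # in-memory stand-in for a binary file

buf.seek(4)                      # absolute: os.SEEK_SET is the default
buf.seek(2, os.SEEK_CUR)         # relative to the current position: 4 + 2
after_relative = buf.tell()

buf.seek(-3, os.SEEK_END)        # three bytes before the end
tail = buf.read()                # reads from there to the end

print(after_relative, tail)
```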
jice
Jan 12th, 2009

Re: Large File Support - Win32

At one time you could pass a float to seek() and it would convert to a long long. I don't know if that still works or not. It would be doubtful if you are using 2.6 or 3.0.
woooee
Jan 12th, 2009

Re: Large File Support - Win32

I am using the second parameter, if you look at the example code I posted... The problem is that even though the Windows build does have large file support, sys.maxint is only a 32-bit value. So a 32-bit integer is the biggest number I can specify when using seek. That means the max file size I can address is 4GB.

I don't see any way around this, and may have to switch to C# to complete this project...
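For reference, the arithmetic behind a 32-bit offset limit can be checked directly. The sketch below (Python 3 syntax) uses hypothetical names such as `fits_in_signed_32` and a made-up 300 GiB position; it only illustrates the numbers and says nothing about what a particular Python build actually accepts.

```python
GIB = 2**30

INT32_MAX = 2**31 - 1   # largest signed 32-bit value (sys.maxint on a 32-bit Python 2 build)
UINT32_MAX = 2**32 - 1  # largest unsigned 32-bit value


def fits_in_signed_32(offset):
    """True if offset can be stored in a signed 32-bit integer."""
    return -2**31 <= offset <= INT32_MAX


image_tail = 300 * GIB  # hypothetical position near the end of a large disk image

print(INT32_MAX + 1 == 2 * GIB)       # signed limit sits just under 2 GiB
print(UINT32_MAX + 1 == 4 * GIB)      # unsigned limit sits just under 4 GiB
print(fits_in_signed_32(image_tail))  # the large offset does not fit
```

Note that a signed 32-bit offset tops out just below 2 GiB; the 4 GiB figure corresponds to an unsigned 32-bit value.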
int3grate
Jan 12th, 2009

Re: Large File Support - Win32

Quote originally posted by woooee:
At one time you could pass a float to seek() and it would convert to a long long. I don't know if that still works or not. It would be doubtful if you are using 2.6 or 3.0.

Well, I'm using Python 2.5. Do you know if that would still work? If so, how would I do that?
int3grate
Jan 12th, 2009

Re: Large File Support - Win32

I actually tried using floats with seek in Python 2.5.2 and I got the following error:

  OverflowError: long int too large to convert to int
int3grate
Jan 12th, 2009

Re: Large File Support - Win32

And this may be because of 32-bit MS Windows limitations. For comparison, datetime will go down to microseconds on Linux, but only milliseconds on MS Windows. There was a bug report filed, and supposedly fixed, but the files tested were only a few GB each (http://bugs.python.org/issue1672853). You may have to either split the file into parts or use the whence option as stated above, with either a seek from the end, or multiple os.SEEK_CUR statements (which may or may not work).
Quote:
f.seek(2, os.SEEK_CUR) advances the position by two and f.seek(-3, os.SEEK_END) sets the position to the third to last.
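The "multiple os.SEEK_CUR statements" idea can be sketched as a loop that never passes more than a signed 32-bit value to any single seek. This is a hedged illustration in Python 3 syntax: `seek_in_steps` and `MAX_STEP` are hypothetical names, the demonstration runs on a small in-memory buffer with an artificially low step limit, and, as noted above, it may or may not help on a build whose underlying file position is itself limited to 32 bits.

```python
import io
import os

MAX_STEP = 2**31 - 1  # largest offset a signed 32-bit seek can take


def seek_in_steps(f, target, max_step=MAX_STEP):
    """Move f to absolute position target using relative seeks of at most max_step."""
    f.seek(0)  # start from the beginning of the file
    remaining = target
    while remaining > 0:
        step = min(remaining, max_step)
        f.seek(step, os.SEEK_CUR)  # advance relative to the current position
        remaining -= step
    return f.tell()


# Small in-memory demonstration with an artificially low step limit.
demo = io.BytesIO(bytes(10000))
reached = seek_in_steps(demo, 9000, max_step=4096)
print(reached)
```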
woooee
