Large File Support - Win32


#1, int3grate, Jan 11th, 2009
Hi all,

I'm relatively new to Python (I've been writing Python code for about half a year now) and I'm trying to figure out how to use seek() on larger files.

I'm doing some work with large hard disk image files, and the raw devices themselves. I need the ability to seek towards the tail end of a file that is extremely large (hundreds of gigs).

Below is an excerpt of the code I'm using:

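    # Python 2 excerpt; every offset is in bytes (sector counts times 512)
    f.seek(self.offset*512L)  # absolute seek
    f.seek(self.sectors_per_cluster*self.start_cluster_mft*512L, os.SEEK_CUR)  # relative seek
    f.seek(record_number*512L*self.mft_cluster_size, os.SEEK_CUR)  # relative seek
    mft = f.read(512L*self.mft_cluster_size)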
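
The three seeks above add up to a single absolute byte offset, and working that out shows how quickly it passes the signed 32-bit range. The following is only a sketch: the function name mft_record_offset and the geometry values are made up for illustration and are not taken from the original code.

```python
SECTOR_SIZE = 512  # bytes per sector, as in the excerpt above


def mft_record_offset(offset, sectors_per_cluster, start_cluster_mft,
                      record_number, mft_cluster_size):
    """Return the absolute byte offset reached by the three seeks above."""
    sectors = (offset
               + sectors_per_cluster * start_cluster_mft
               + record_number * mft_cluster_size)
    return sectors * SECTOR_SIZE


# Hypothetical disk geometry, chosen only to illustrate the arithmetic.
target = mft_record_offset(offset=63, sectors_per_cluster=8,
                           start_cluster_mft=786432, record_number=10,
                           mft_cluster_size=2)
print(target)               # total offset in bytes
print(target > 2**31 - 1)   # does it exceed a signed 32-bit integer?
```

With these example values the offset is already past 3 GB, so any API that only takes a signed 32-bit offset could not express it in one call.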

It looks like seek in Python currently only accepts a long (32-bit integer) as the offset passed to fseek, or maybe I just can't use long long (64-bit) integers with Python under my current operating system. I don't know how to get around this limitation.

The Windows API can definitely work with large files, and in Linux I believe (if I remember correctly) I could use lseek64 using C.

Is there a way around this limitation with Python? Is there a library I can use, or any other way to do this without having to write some kind of crazy hack?

BTW, I'm using a 32-bit build of Windows XP and Python 2.5.

#2, jice, Jan 12th, 2009
Why don't you use the second parameter of the seek function ?

From http://docs.python.org/library/stdty...l#file-objects :
file.seek(offset[, whence])

Set the file’s current position, like stdio’s fseek. The whence argument is optional and defaults to os.SEEK_SET or 0 (absolute file positioning); other values are os.SEEK_CUR or 1 (seek relative to the current position) and os.SEEK_END or 2 (seek relative to the file’s end). There is no return value.

For example, f.seek(2, os.SEEK_CUR) advances the position by two and f.seek(-3, os.SEEK_END) sets the position to the third to last.

Note that if the file is opened for appending (mode 'a' or 'a+'), any seek() operations will be undone at the next write. If the file is only opened for writing in append mode (mode 'a'), this method is essentially a no-op, but it remains useful for files opened in append mode with reading enabled (mode 'a+'). If the file is opened in text mode (without 'b'), only offsets returned by tell() are legal. Use of other offsets causes undefined behavior.

Note that not all file objects are seekable.

Changed in version 2.6: Passing float values as offset has been deprecated.
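
The whence values described in the quoted documentation can be tried on a small in-memory buffer. This is a minimal sketch written for Python 3, using io.BytesIO as a stand-in for a real file; the buffer contents are made up for illustration.

```python
import io
import os

# Ten bytes of sample data standing in for a real file.
buf = io.BytesIO(b"0123456789")

buf.seek(4)                   # absolute positioning (os.SEEK_SET is the default)
buf.seek(2, os.SEEK_CUR)      # advance two bytes from the current position
pos_after_cur = buf.tell()

buf.seek(-3, os.SEEK_END)     # third byte from the end
pos_after_end = buf.tell()
tail = buf.read()             # everything from that position to the end

print(pos_after_cur, pos_after_end, tail)
```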

#3, woooee, Jan 12th, 2009
At one time you could pass a float to seek() and it would convert to a long long. I don't know if that still works or not. It would be doubtful if you are using 2.6 or 3.0.

#4, int3grate, Jan 12th, 2009
I am using the second parameter, if you look at the example code I posted. The problem is that even though the Windows build does have large file support, sys.maxint is only a 32-bit value. So a 32-bit integer is the biggest number I can specify when using seek. That means the max file size I can address is 4GB at most.

I don't see any way around this, and may have to switch to C# to complete this project...
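
To put numbers on the limit discussed in this post, the sketch below compares a signed 32-bit maximum with an offset of a few hundred GB and then tries such a seek on a temporary file. It is written for Python 3 and assumes a platform and filesystem that accept 64-bit file offsets; it does not reproduce the Python 2.5 / Windows XP setup from the thread, and the 300 GiB target is a made-up example.

```python
import tempfile

INT32_MAX = 2**31 - 1        # largest signed 32-bit value, just under 2 GiB
GIB = 2**30

target = 300 * GIB           # hypothetical offset, "hundreds of gigs" into a file
bits_needed = target.bit_length()

with tempfile.TemporaryFile() as f:
    # Seeking past the end only moves the position; nothing is written,
    # so no disk space is consumed.
    f.seek(target)
    reached = f.tell()

print(target > INT32_MAX, bits_needed, reached == target)
```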

#5, int3grate, Jan 12th, 2009
Originally posted by woooee:
At one time you could pass a float to seek() and it would convert to a long long. I don't know if that still works or not. It would be doubtful if you are using 2.6 or 3.0.
Well, I'm using Python 2.5. Do you know if that would still work? If so, how would I do that?

#6, int3grate, Jan 12th, 2009
I actually tried using floats with seek in Python 2.5.2 and got the following error:

    OverflowError: long int too large to convert to int

#7, woooee, Jan 12th, 2009
This may be because of 32-bit MS Windows limitations. For example, datetime goes down to microseconds on Linux, but only to milliseconds on MS Windows. A bug report was filed, and supposedly fixed, but the files tested were only a few GB each: http://bugs.python.org/issue1672853

You may have to either split the file into parts or use the whence option as stated above, with either a seek from the end or multiple os.SEEK_CUR seeks (which may or may not work). As the docs quoted above put it:
f.seek(2, os.SEEK_CUR) advances the position by two and f.seek(-3, os.SEEK_END) sets the position to the third to last.
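
The "multiple os.SEEK_CUR" idea could look roughly like the sketch below: reach a large offset through several relative seeks, each no larger than a signed 32-bit value. The helper name seek_in_steps is hypothetical, the code is written for Python 3 on a 64-bit build, and whether this approach actually gets around the limit on the original poster's system is untested.

```python
import io
import os

MAX_STEP = 2**31 - 1  # largest step a signed 32-bit offset can express


def seek_in_steps(f, target, max_step=MAX_STEP):
    """Move f to absolute position target using only bounded relative seeks."""
    f.seek(0, os.SEEK_SET)
    remaining = target
    while remaining > 0:
        step = min(remaining, max_step)
        f.seek(step, os.SEEK_CUR)
        remaining -= step
    return f.tell()


# In-memory stand-in for a large file; seeking past the end allocates nothing.
buf = io.BytesIO()
position = seek_in_steps(buf, 5 * 2**30)
print(position)
```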