Large File Support - Win32
Join Date: Jan 2009
Posts: 11
Reputation:
Solved Threads: 3
Hi all,
I'm relatively new to Python (I've been writing Python code for about half a year now) and I'm trying to figure out how to use "seek" on larger files.
I'm doing some work with large hard disk image files, and with the raw devices themselves. I need to be able to seek towards the tail end of a file that is extremely large (hundreds of gigabytes).
Below is an excerpt of the code I'm using:
```python
# Python 2 excerpt from inside a method; 512L is a long literal (bytes per sector)
f.seek(self.offset*512L)
f.seek(self.sectors_per_cluster*self.start_cluster_mft*512L, os.SEEK_CUR)
f.seek(record_number*512L*self.mft_cluster_size, os.SEEK_CUR)
mft = f.read(512L*self.mft_cluster_size)
```
It looks like seek in Python currently only lets me pass a long (32-bit integer) as the offset, as with fseek, or maybe it's just that I can't use long long (64-bit) integers with Python under my current operating system. I don't know how to get around this limitation...
The Windows API can definitely work with large files, and on Linux I believe (if I remember correctly) I could use lseek64 from C.
Is there a way around this limitation in Python? Is there a library I can use, or some other way to do this without having to write some kind of crazy hack?
BTW, I'm using a 32-bit build of Windows XP and Python 2.5.
Join Date: Oct 2007
Posts: 149
Reputation:
Solved Threads: 38
Why don't you use the second parameter of the seek function?
From http://docs.python.org/library/stdty...l#file-objects :
file.seek(offset[, whence])
Set the file's current position, like stdio's fseek. The whence argument is optional and defaults to os.SEEK_SET or 0 (absolute file positioning); other values are os.SEEK_CUR or 1 (seek relative to the current position) and os.SEEK_END or 2 (seek relative to the file's end). There is no return value.
For example, f.seek(2, os.SEEK_CUR) advances the position by two and f.seek(-3, os.SEEK_END) sets the position to the third to last.
Note that if the file is opened for appending (mode 'a' or 'a+'), any seek() operations will be undone at the next write. If the file is only opened for writing in append mode (mode 'a'), this method is essentially a no-op, but it remains useful for files opened in append mode with reading enabled (mode 'a+'). If the file is opened in text mode (without 'b'), only offsets returned by tell() are legal. Use of other offsets causes undefined behavior.
Note that not all file objects are seekable.
Changed in version 2.6: Passing float values as offset has been deprecated.
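To make the three whence modes quoted above concrete, here is a minimal sketch on a throwaway 10-byte temporary file. The function name `demo_whence` is made up for illustration, and the file is opened in binary mode so that arbitrary offsets are legal.

```python
import os
import tempfile

def demo_whence():
    """Return the positions reached by three kinds of seek on a 10-byte file."""
    with tempfile.TemporaryFile() as f:   # binary mode ('w+b') by default
        f.write(b"0123456789")
        f.seek(4)                         # absolute: os.SEEK_SET is the default
        absolute = f.tell()
        f.seek(2, os.SEEK_CUR)            # relative to the current position
        relative = f.tell()
        f.seek(-3, os.SEEK_END)           # relative to the end of the file
        from_end = f.tell()
        tail = f.read()                   # the last three bytes
    return absolute, relative, from_end, tail

print(demo_whence())
```

The relative seek starts from position 4 and lands on 6, and the seek from the end lands three bytes before the end of the file.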
Join Date: Jan 2009
Posts: 11
Reputation:
Solved Threads: 3
I am using the second parameter, if you look at the example code I posted... The problem is that even though the Windows build does have large file support, sys.maxint is only a 32-bit value (2**31 - 1). So a 32-bit integer is the biggest number I can specify when using seek. That means the largest offset I can address is at most 4 GB (and only 2 GB if the offset is treated as signed).
I don't see any way around this, and may have to switch to c# to complete this project....
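The arithmetic behind this limit can be sketched as follows. The 512-byte sector size comes from the excerpt earlier in the thread; the 300 GiB position and the helper names `byte_offset` and `fits_in_int32` are hypothetical, chosen only to illustrate the point. The sketch runs on a current Python, where integers have arbitrary precision, so it only shows why such an offset cannot be represented as a signed 32-bit value.

```python
SECTOR_SIZE = 512        # bytes per sector, as in the excerpt above
INT32_MAX = 2**31 - 1    # largest signed 32-bit value (sys.maxint on a 32-bit Python 2)

def byte_offset(sector):
    """Convert a sector number into a byte offset."""
    return sector * SECTOR_SIZE

def fits_in_int32(offset):
    """True if offset can be stored in a signed 32-bit integer."""
    return -INT32_MAX - 1 <= offset <= INT32_MAX

# Hypothetical sector located 300 GiB into a disk image.
sector = 300 * 2**30 // SECTOR_SIZE
offset = byte_offset(sector)

print(offset, fits_in_int32(offset))
```

Any position past the first 2 GiB of the image fails the check, which matches the kind of overflow being described here.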
Join Date: Jan 2009
Posts: 11
Reputation:
Solved Threads: 3
Well, I'm using Python 2.5. Do you know if that would still work? If so, how would I do that?
Join Date: Jan 2009
Posts: 11
Reputation:
Solved Threads: 3
I actually tried using floats with seek in Python 2.5.2 and got the following error:
OverflowError: long int too large to convert to int
Join Date: Dec 2006
Posts: 1,029
Reputation:
Solved Threads: 290
This may be because of 32-bit MS Windows limitations. (Similarly, datetime resolves down to microseconds on Linux, but only to milliseconds on MS Windows.) A bug report was filed, and supposedly fixed, but the files tested were only a few GB each (see http://bugs.python.org/issue1672853). You may have to either split the file into parts or use the whence option as stated above, with either a seek from the end or multiple os.SEEK_CUR seeks (which may or may not work).
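The "multiple os.SEEK_CUR" idea could look roughly like the sketch below. The helper name `seek_in_steps` and the `max_step` parameter are invented for illustration, and the per-call limit of 2**31 - 1 is an assumption. Whether this actually helps on the Python 2.5 / 32-bit Windows setup from this thread is untested, in line with the "may or may not work" caveat.

```python
import os
import tempfile

MAX_STEP = 2**31 - 1   # assumed largest offset a single seek call accepts

def seek_in_steps(f, offset, max_step=MAX_STEP):
    """Reach absolute position `offset` using relative seeks of at most max_step bytes."""
    if offset < 0:
        raise ValueError("offset must be non-negative")
    f.seek(0)                          # start from the beginning of the file
    remaining = offset
    while remaining > 0:
        step = min(remaining, max_step)
        f.seek(step, os.SEEK_CUR)      # each individual step stays within the limit
        remaining -= step
    return f.tell()

# Small demonstration with an artificially tiny step size.
with tempfile.TemporaryFile() as demo:
    demo.write(b"abcdefghijklmnop")
    print(seek_in_steps(demo, 10, max_step=3), demo.read(1))
```

The tiny `max_step` in the demonstration only forces several steps on a small file; for a real disk image the default would be used.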