read in using fread() for arbitrary length data
Join Date: Mar 2008
Posts: 7
Hi..
I want to read each record (one per line) using fread(). The problem is that the record length is arbitrary.
e.g.
1 "Joshua" "Rosenthal" "34 Mellili Ln" "Earlwood" 1 "000113133121" 0.000
2 "Martin" "Serrong" "45 Rosenthal Ccl" "Doveton" 1 "000113133121" 0.000
3 "Jacob" "Leramonth" "59 Dalion Pl" "Belmont" 1 "000113133121" 0.000
Since fread() needs to be told how many characters to read, I'm doing it like this:
    while (fgets(str, 100, fp) != NULL)   // read each line using fgets
    {
        recordLen[j] = strlen(str);       // so i can get length of each record
        j++;
    }

    do
    {
        len = 0;
        randNum = lrand48() % 830000;     // read record randomly, 830000 is number of records
        rewind(fp);
        for (i = 0; i < randNum - 1; i++) // in order to seek the file pointer, i'm sum up length of each record
                                          // until the record i want to read
        {
            len += recordLen[i];
        }
        fseek(fp, len, SEEK_SET);
        fgets(str, 100, fp);              // another fgets() to read the record I want to read, to find the length
        fseek(fp, len, SEEK_SET);
        result = fread(buffer, CHAR_BYTE, strlen(str), fp);
        count++;
    } while (count < MAX_READ);           // MAX_READ = 50000, read in 50000 times
Does anyone have a better way to do it? The way I'm doing it takes a lot of time, and time is the important thing in my assignment (I'm measuring file access time).
Why use fread at all? fgets seems to be a better fit for your problem, especially since you're using it anyway to preprocess the record lengths. The more input requests you make, the slower your code will be, so if optimization is your goal, minimize the number of calls that read from the file.
Ideally you would keep a portion of the file in memory at any given time, but it's difficult to make this efficient when the access pattern is random.
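To make the "fewer read calls" point concrete, here is a minimal sketch of fetching one record with a single fseek and a single fgets, instead of fgets, a second fseek, and then fread. The helper name read_record_at and its parameters are made up for illustration, and it assumes you already know the byte offset where the record starts.

```c
#include <stdio.h>

/* Hypothetical helper (not from the original post): read the record that
   starts at byte `offset` into buf, which holds `size` bytes.
   One fseek plus one fgets; no second seek and no fread pass.
   Returns 1 on success, 0 on seek failure or end of file. */
int read_record_at(FILE *fp, long offset, char *buf, int size)
{
    if (fseek(fp, offset, SEEK_SET) != 0)
        return 0;
    return fgets(buf, size, fp) != NULL;
}
```

If the length is still needed afterwards, strlen(buf) gives it without touching the file again.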
I'm here to prove you wrong.
If all you're doing is calculating file access time, what's wrong with just doing random position + random length?
The length of an individual record doesn't seem that important if you're not actually using that record when you're done.
All you're going to produce is something like bytes/sec as the answer - right?
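A rough sketch of what "random position + random length" could look like, in case it helps. Everything here is an assumption for illustration: the function name random_reads, its parameters, the use of rand() rather than lrand48(), and timing with clock(), which reports processor time and not wall-clock time, so it may under-report time spent waiting on the disk.

```c
#include <stdio.h>
#include <stdlib.h>
#include <time.h>

/* Hypothetical benchmark: perform `reads` reads, each at a random offset
   and with a random length in [1, max_len].
   Returns the total number of bytes read, or -1 on bad arguments, and
   stores the elapsed processor time in *seconds.
   Note: rand() may only reach RAND_MAX (as low as 32767), so on a large
   file a wider generator would be needed to cover every offset. */
long random_reads(FILE *fp, long file_size, int reads, size_t max_len,
                  double *seconds)
{
    char *buf;
    long total = 0;
    clock_t start;
    int i;

    if (file_size <= 0 || max_len == 0)
        return -1;
    buf = malloc(max_len);
    if (buf == NULL)
        return -1;

    start = clock();
    for (i = 0; i < reads; i++) {
        long pos = rand() % file_size;
        size_t len = (size_t)(rand() % (long)max_len) + 1;

        if (fseek(fp, pos, SEEK_SET) != 0)
            break;
        total += (long)fread(buf, 1, len, fp);  /* may be short near EOF */
    }
    *seconds = (double)(clock() - start) / CLOCKS_PER_SEC;

    free(buf);
    return total;
}
```

Dividing the returned byte count by the elapsed time gives the bytes/sec figure, with no need to know where any record begins or ends.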
>so there's no way to get the length of record directly?
No, but you're already assuming that the maximum length of a record is 99 characters, so I don't really see how the length matters. Why not build the length array as you need it, and instead of storing the record length, store the offset of the record. Something like so:

    recordLen: array[0..830000]
    n: int as 0

    # The first record is always at offset 0
    recordLen[n] := 0

    while count < MAX_READ do
      randNum: int as rand() % 830000

      if randNum > n then
        # Fill the lengths up to randNum
        seek fp to recordLen[n]

        # Build the offsets up to randNum
        while n < randNum and read str from fp do
          n := n + 1
          recordLen[n] := recordLen[n - 1] + length str
        loop
      else
        seek fp to recordLen[randNum]
      endif

      # At this point we're at the correct offset
      read str from fp
      # Process str
      count := count + 1
    loop

That way you only read as much of the file as necessary. There's also an added benefit of being able to calculate the record length from the offsets if you need it. As long as seeking is quick, you should see at least some improvement, unless you're unlucky and you hit the upper end of the random range immediately.
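For anyone who wants to try the idea in C, one possible translation of the pseudocode is sketched below. This is only a sketch under some assumptions: the helper name read_record, the `known` counter (playing the role of n), and MAX_LINE are all made up for illustration, and the offsets are accumulated with strlen, which assumes the file is opened in binary mode or on a system where text mode does not translate line endings.

```c
#include <stdio.h>
#include <string.h>

#define MAX_LINE 100  /* same 99-character record limit the original code assumes */

/* Hypothetical helper: read record number `target` (0-based) into str,
   which must hold MAX_LINE bytes.
   offsets[i] is the byte offset of record i; the caller sets
   offsets[0] = 0 and *known = 0 before the first call and provides room
   for target + 1 entries. The table is extended only as far as needed.
   Returns 1 on success, 0 if the seek fails or the file is too short. */
int read_record(FILE *fp, long *offsets, long *known, long target, char *str)
{
    if (target > *known) {
        /* Resume scanning at the last record whose offset is known. */
        if (fseek(fp, offsets[*known], SEEK_SET) != 0)
            return 0;
        while (*known < target) {
            if (fgets(str, MAX_LINE, fp) == NULL)
                return 0;
            offsets[*known + 1] = offsets[*known] + (long)strlen(str);
            (*known)++;
        }
        /* The file position is now exactly offsets[target]. */
    } else if (fseek(fp, offsets[target], SEEK_SET) != 0) {
        return 0;
    }
    return fgets(str, MAX_LINE, fp) != NULL;
}
```

The benchmark loop would then call read_record with a random target each time. The length of record i is offsets[i + 1] - offsets[i] once both entries have been filled in.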
Last edited by Narue; Apr 23rd, 2008 at 12:37 pm.