Views: 2838 | Replies: 4
crab68 (Newbie Poster)

'GREP' usage on huge files, any Limitation?

  #1  
Dec 13th, 2006
grep -v "connected" filename > newfile

Regarding the use of grep with its output redirected to a file, as in the sample above: are there any limitations, especially when the file is big? I have experienced record truncation when the output is written to a file. Has anyone experienced this before? How can this problem be resolved?
jim mcnamara (Junior Poster)

Re: 'GREP' usage on huge files, any Limitation?

  #2  
Dec 13th, 2006
Record truncation? That is not normal behavior unless the records contain embedded ASCII NUL characters. Lack of disk space or exceeding an enabled quota will also cause the output file to be truncated.
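
A rough way to check the first two possibilities is sketched below. The helper names `count_nul_bytes` and `free_kb` are made up for illustration, and quotas are not checked because the command for that varies between systems.

```shell
#!/bin/sh
# Hypothetical helpers, not part of any standard tool.

# Count the NUL bytes in a file. tr deletes every byte except NUL,
# and wc counts what is left.
count_nul_bytes() {
    tr -cd '\000' < "$1" | wc -c | tr -d ' '
}

# Print the free space, in kilobytes, on the filesystem that holds
# the given directory. -P asks df for the one-line POSIX format.
free_kb() {
    df -Pk "$1" | awk 'NR == 2 { print $4 }'
}

# Example, using the file name from the first post:
#   count_nul_bytes filename
#   free_kb .
```

A non-zero count from `count_nul_bytes` suggests the input is not plain text. Comparing `free_kb` for the output directory against the expected output size shows whether the disk could plausibly have filled up.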

Some older grep implementations limit lines to LINE_MAX bytes, which is commonly 2048; GNU grep has no such limit.
There is also the concept of largefiles: files too big for a signed 32-bit file offset to address, i.e. larger than 2 GB (2^31 - 1 bytes).
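
Both limits can be checked from the shell. This is only a sketch: the function names are invented, `length` counts characters in the current locale rather than bytes, and the size comparison assumes a shell whose `test` handles numbers above 2^31.

```shell
#!/bin/sh
# Hypothetical helpers for checking line length and file size.

# Print the length of the longest line in a file.
longest_line() {
    awk '{ n = length($0); if (n > max) max = n } END { print max + 0 }' "$1"
}

# Print the size of a file in bytes.
file_size() {
    wc -c < "$1" | tr -d ' '
}

# Succeed if the file is larger than a signed 32-bit offset can address.
exceeds_32bit_offset() {
    [ "$(file_size "$1")" -gt 2147483647 ]
}

# Example, using the file name from the first post:
#   longest_line filename
#   exceeds_32bit_offset filename && echo "largefile"
```

On a system where awk has its own line length limit, `longest_line` may not be reliable for very long records.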

Which of these things applies to your case?
Last edited by jim mcnamara : Dec 13th, 2006 at 9:51 am.
crab68 (Newbie Poster)

Re: 'GREP' usage on huge files, any Limitation?

  #3  
Dec 14th, 2006
The file size is about 1.2 GB. The records were truncated when the command ran in the script, but when I ran it manually later, the records in the output file were not truncated. So it is an intermittent problem. It could be due to disk space, but I can't verify that.
jim mcnamara (Junior Poster)

Re: 'GREP' usage on huge files, any Limitation?

  #4  
Dec 14th, 2006
The way disk I/O works in Unix is that data is parked in an in-memory cache in the kernel; it is not guaranteed to be on disk when the write() system call returns. Every 30 seconds or so the syncer daemon issues a sync, which forces the kernel to write everything in its buffers to disk.

What you are seeing is an incomplete write operation, for whatever reason. Common reasons are: a signal was sent to the process that terminated it; write() or sync failed because something else filled up the disk (maybe a temp file) and then that file went away; or disk errors caused a fatal error. If it's an NFS-mounted disk, then the network also becomes an issue. What errors do you see in the log?
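
If the script keeps no log, something like the following sketch would at least record why a run failed. The function name `filter_with_checks` and the log file argument are assumptions, not part of the original script. It relies on grep's documented exit codes: 0 or 1 on success, greater than 1 on error.

```shell
#!/bin/sh
# filter_with_checks SRC DST LOG   (hypothetical helper)
filter_with_checks() {
    src=$1; dst=$2; log=$3

    # Same command as in the first post, with stderr appended to a log.
    grep -v "connected" "$src" > "$dst" 2>> "$log"
    status=$?

    # 0 = lines selected, 1 = none selected. Anything higher is an
    # error, or 128 + signal number if grep was killed by a signal.
    if [ "$status" -gt 1 ]; then
        echo "$(date): grep exited with status $status" >> "$log"
        return "$status"
    fi

    # Cross-check: lines grep should have selected vs. lines written.
    expected=$(grep -vc "connected" "$src")
    actual=$(wc -l < "$dst" | tr -d ' ')
    if [ "$expected" != "$actual" ]; then
        echo "$(date): expected $expected lines, found $actual in $dst" >> "$log"
        return 1
    fi
    return 0
}

# Example: filter_with_checks filename newfile grep_errors.log
```

The cross-check reads the input a second time, which is slow on a file of more than a gigabyte, but it catches a short output file even when grep itself reported success.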
crab68 (Newbie Poster)

Re: 'GREP' usage on huge files, any Limitation?

  #5  
Dec 18th, 2006
There was no tracking of error messages in the script. I will probably need to write a program to split the file into two.
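
For the splitting step, the standard `split` utility may be enough without a separate program. A minimal sketch follows, where the function name `split_in_two` and the `part_` prefix are placeholders:

```shell
#!/bin/sh
# split_in_two SRC PREFIX   (hypothetical helper)
# Writes PREFIXaa and PREFIXab, each holding about half the lines of SRC.
split_in_two() {
    src=$1; prefix=$2

    total=$(wc -l < "$src" | tr -d ' ')

    # Round up so that at most two pieces are produced.
    half=$(( (total + 1) / 2 ))
    [ "$half" -gt 0 ] || half=1

    split -l "$half" "$src" "$prefix"
}

# Example: split_in_two filename part_
#   grep -v "connected" part_aa >  newfile
#   grep -v "connected" part_ab >> newfile
```

Splitting on line boundaries keeps every record whole, so running grep on each half and concatenating the results should give the same output as one pass over the full file.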