Hello,
I have a file, say id.dat.
In this file each line contains only an id number.
I have a group of files (say 1.dat, 2.dat, 3.dat, ..., 50.dat). There is an unspecified number of records in each of them.
If any of these records contains any of the id numbers present in id.dat, then that record is copied to a new file, say new.txt.
So, to summarize, we need to collect all the records from the group of files (1.dat, 2.dat, 3.dat, ..., 50.dat) that contain an id number from id.dat into a single file called new.txt.

Please note:
1) The records are not all the same size.
2) The id numbers in the records (in files 1.dat, 2.dat, 3.dat, ..., 50.dat) may be at different positions, so we need to search for the id numbers in the records using some kind of substring match.

Configuration: Unix shell scripting

All 6 Replies

Do your own homework!

Hey, it's not homework.
I work for a company, and I do this task manually, which consumes a lot of my time.
If this can somehow be automated, I will save a lot of time for other work.

aswin_agarwal> Hey Its not homework.
Do not expect much help if you cannot see that this is your own problem, and therefore your own homework to complete. If you do not show effort, as the rules of the forum require, this thread is in vain.

I suggest you start looking into 'awk' and post the script you have tried so far.

OK, this is work related.
I download files from servers, and those files contain data on certain members. There are a lot of files. I did write C code for it, but as I am not really good at shell scripting, I am unable to find a way to solve this problem in the shell. I will list the C code here. Now I have given you the complete details that are required to solve this problem.

Consider this: in a line of text you need to search for a substring, and if you find the substring, you copy that line to a new file. So you have to copy all lines from all files that contain that substring into a single new file. Now, this substring is an id, and as explained above there are n ids stored in a separate file. If any record contains any of these ids, you need to copy that record into the final file that we are creating.

In simple words, from the whole data of all members, we are extracting the data of our list of members.

Here is the C code that I wrote:

#include <cstdio>
#include <cstdlib>
#include <cstring>
#include <iostream>
using namespace std;

int main()
{
    // The original read up to 500 bytes into a 256-byte buffer; sizes now match the fgets() calls.
    char buffer[512], buffer1[256], buffer2[256];
    FILE *myfile1, *myfile, *myfile2, *myfile3;

    myfile1 = fopen("out1.dat", "w");                  // output: matching records
    if (myfile1 == NULL) {
        cout << "error cannot create-->out1.dat" << endl;
        exit(1);
    }
    myfile3 = fopen("c:/tc/bin/tfiles/in1.dat", "r");  // input: one data file name per line
    if (myfile3 == NULL) {
        cout << "error file missing-->myfile3." << endl;
        exit(1);
    }
    // Loop on fgets() itself; while(!feof()) handles the last line twice.
    while (fgets(buffer2, sizeof buffer2, myfile3) != NULL) {
        buffer2[strcspn(buffer2, "\r\n")] = '\0';      // strip the line ending, if any
        if (buffer2[0] == '\0') continue;              // skip blank lines
        cout << "file name -->" << buffer2 << endl;
        myfile = fopen(buffer2, "r");
        if (myfile == NULL) {
            cout << "error file missing-->" << buffer2 << endl;
            exit(1);
        }
        while (fgets(buffer, sizeof buffer, myfile) != NULL) {
            // The id list is reopened for every record, as in the original.
            myfile2 = fopen("list.dat", "r");
            if (myfile2 == NULL) {
                cout << "error file missing-->myfile2." << endl;
                exit(1);
            }
            int flag = 0;
            while (flag == 0 && fgets(buffer1, sizeof buffer1, myfile2) != NULL) {
                buffer1[strcspn(buffer1, "\r\n")] = '\0';
                // An empty id would match every record, so skip it.
                if (buffer1[0] != '\0' && strstr(buffer, buffer1) != NULL) {
                    flag = 1;
                    fputs(buffer, myfile1);            // copy the record once
                }
            }
            if (flag == 0) cout << "record rejected" << endl;
            fclose(myfile2);
        }
        fclose(myfile);
    }
    fclose(myfile1);
    fclose(myfile3);
    return 0;
}


The id is a 10-digit number. It may also contain letters.

Hi Aswin!

I think what Aia meant was that this forum is a great place to go for help with something that you're working on, but it's not the best place to post a set of requirements and say "write me a script!". If you want somebody to do all the work for you, you might be better off posting some place like rentacoder (or does daniweb have a section like that?) :)

If you can show us what you've tried so far (shell scripting, not C ;) ), we might be able to point you in the right direction! You'll probably want to loop through your id.dat file for the ids, and use something like 'awk' or 'grep' to pull them out of your data files.
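To make the awk idea a bit more concrete, here is a minimal sketch, not a finished solution. It assumes that id.dat holds one id per line and that an id only needs to occur somewhere in the record as a literal substring. The function name collect_records is made up for illustration.

```shell
#!/bin/sh
# Sketch only: id.dat, the numbered .dat files and new.txt follow the thread;
# collect_records is a hypothetical helper name.

# collect_records IDFILE DATAFILE...
# Prints every record that contains at least one id as a literal substring.
collect_records() {
    idfile=$1
    shift
    awk '
        # While reading the id list: remember each non-empty id.
        FILENAME == ARGV[1] { sub(/\r$/, ""); if ($0 != "") ids[$0] = 1; next }
        # While reading the data files: print the record once if any id occurs in it.
        { for (id in ids) if (index($0, id) > 0) { print; next } }
    ' "$idfile" "$@"
}
```

It could be called as: collect_records id.dat [0-9]*.dat > new.txt. Because index() does a plain substring test, an id will also match inside a longer token, which may or may not be what you want.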

Should be fairly simple!

-G

Use grep (obviously). Option -f might be of interest if ids.txt has the needed format (one search pattern per line). If it doesn't have the needed format, you might wanna set it up right using sed or something.
Assuming all your numbered input dat files are in ".":

# -F treats each id as a literal string; -h keeps file names out of the output
grep -h -F -f ids.txt [0-9]*.dat >> new.txt
# OR
find . -name "[0-9]*.dat" -exec grep -h -F -f ids.txt {} \; >> new.txt
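If the id list came from a Windows machine or has blank lines in it, grep -f can misbehave: a trailing carriage return stops the id from matching, and an empty pattern matches every record. Here is a rough sketch of cleaning the list with sed before handing it to grep. The helper name match_ids is made up, and -h is (as far as I know) a common extension found in GNU and BSD grep rather than something POSIX guarantees.

```shell
#!/bin/sh
# Sketch only: file names follow the thread; match_ids is a hypothetical helper name.

# match_ids IDFILE DATAFILE...
# Prints each record that contains any id from IDFILE as a fixed string.
match_ids() {
    idfile=$1
    shift
    clean=$(mktemp) || return 1
    # Strip a trailing carriage return, then drop blank lines.
    sed -e "s/$(printf '\r')\$//" -e '/^$/d' "$idfile" > "$clean"
    # -F: ids are literal strings, not regular expressions
    # -h: do not prefix output lines with the file name
    # -f: read one pattern per line from the cleaned list
    grep -h -F -f "$clean" "$@"
    status=$?
    rm -f "$clean"
    return "$status"
}
```

Usage would be something like: match_ids ids.txt [0-9]*.dat >> new.txt. Each matching record is printed once, even if it contains several ids.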

If that's not the case, you'll have to loop.

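for f in [0-9]*.dat
do
    # Read ids line by line; this is safer than `cat` if a line contains spaces.
    while IFS= read -r id
    do
        # Append every record in $f that contains this id as a literal string.
        # Note: a record containing two ids is copied twice.
        grep -F -e "$id" "$f" >> new.txt
    done < ids.txt
done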

If you don't know shell scripting look up the manuals listed in this forum / WWW. I suggest learning grep, sed, awk at least.

PS: "Homework" here means something you are supposed to do. Doesn't matter if it's given by your boss or teacher. :)
