I'm trying to load some data I download from my real estate system. They have split out the features in a separate file. Here are my 2 files:

File1.txt (~25,000 records)
|123|Main|St|City|State|1701,1708|800,801|MORE|MORE|ETC|
|987|Bailey|Dr|City|State|1702,1708,1717|800,801|MORE|MORE|ETC|

File2.txt (~1,300 records)
|RES|800|City Sewer|
|RES|801|City Water|
|RES|1701|1 Car|
|RES|1702|2 Car|
|CND|1702|2 Car|
|RES|1708|Door Opener|
|RES|1717|Side Load|

I need to do a couple of thing which I have done with awk. Combine fields 1, 2, and 3 and extract 4, 5, 6, and 7 for mysql insert. Problem is I need to change the values of field 6 and 7 to their matching value in file2.txt

So my insert looks like this now
INSERT INTO DB_TABLE VALUES ('123 Main St', 'City', 'State', '1701,1708', '800,801');

but it needs to be:
INSERT INTO DB_TABLE VALUES ('123 Main St', 'City', 'State', '1 Car, Door Opener', 'City Sewer, City Water');

I think I can do it easily with a multi-dim array but I don't know how to read in file2 in the same awk. I could load these into the database and do it but I have about 25,000 records in file1.txt nightly and I think it would be much faster just do this in some shell scripting.

Any ideas? I guess another option would be python, perl, c, etc. I thought maybe sed on the output file, but I still have 2 files to read.

Thanks,
Chris

Hey there,

here's a quick, ugly-looking-but-easy-to-understand (I hope ;) bit of shell script. I didn't format the output exactly, but you should be able to do that simply by modifying the end of each line that seeds the newg and newh variable. The script is called c (below):

Using your two files, which I'm calling "a" and "b"

$ cat a
|123|Main|St|City|State|1701,1708|800,801|MORE|MORE|ETC|
|987|Bailey|Dr|City|State|1702,1708,1717|800,801|MORE|MORE|ETC|
$ cat b
|RES|800|City Sewer|
|RES|801|City Water|
|RES|1701|1 Car|
|RES|1702|2 Car|
|CND|1702|2 Car|
|RES|1708|Door Opener|
|RES|1717|Side Load|

#!/bin/bash

IFS="|"

while read a b c d e f g h i
do
        newg=`for x in \`echo $g|awk -F"," '{for (i=1;i<=NF;i++) printf("%s|", $i) }'\`;do grep $x b|awk -F"|" '{printf("%s ", $4)}';done`
        newh=`for x in \`echo $h|awk -F"," '{for (i=1;i<=NF;i++) printf("%s|", $i) }'\`;do grep $x b|awk -F"|" '{printf("%s ", $4)}';done`
        echo "$b $c $d $e $f $newg $newh"
done <a

It produces this output:

$ ./c
123 Main St City State 1 Car Door Opener City Sewer City Water
987 Bailey Dr City State 2 Car 2 Car Door Opener Side Load City Sewer City Water

Hope that helps you out :)

, Mike

Be a part of the DaniWeb community

We're a friendly, industry-focused community of developers, IT pros, digital marketers, and technology enthusiasts meeting, networking, learning, and sharing knowledge.