Hi Gurus,

A colleague and I have a process where we want to look at a file we receive from a vendor (XML) and data that is in our db. In order to bypass loading the xml and comparing it once loaded we are loading the db data and xml into datatables so that we can use LINQ to see what has changed and/or doesn't exist.

Any other thoughts/ideas on how to compare these to data sources?

Thanks,

S

Could you load the smaller repository into RAM (into a Dictionary/Hashtable) and compare the larger repository to it line-by-line?

If both are too large for RAM, you could scan the XML for certain keys, and index the XML file (seek) then query back into the file when the DB record is encountered with the same key.

If both fit into RAM, Link-To-Objects is a perfect solution for the audit.

>>A colleague and I have a process where we want to look at a file we receive from a vendor (XML) and data that is in our db. In order to bypass loading the xml and comparing it once loaded we are loading the db data and xml into datatables so that we can use LINQ to see what has changed and/or doesn't exist.

Just load them both in memory on your machine and compare them. How you go about that depends on the type of data your receiving, relationships, number of records, etc etc. Don't import the data to your database so you can compare there. That creates overhead on the sql server.

Comments
You are always giving nice suggestion

>>we are loading the db data and xml into datatables

Just load them both in memory on your machine and compare them.

Did i mis-read this or is there some ambiguity in the term 'datatable'?
I took it to mean a datatable object in memory :p
If they mean a datatable in the database then i wholeheartedly concur with sknake...process it in memory if it isnt too large to do so :)

This article has been dead for over six months. Start a new discussion instead.