try DISTINCT
eg: select distinct * from table1 join table2 on blah...
hollystyles
Veteran Poster
1,182 posts since Feb 2005
Reputation Points: 262
Solved Threads: 68
The ON clause will only look for ID numbers that are equal to each other.
No the default JOIN specifys only records that match, the default join is INNER.
LEFT OUTER JOIN (can be abreviated to LEFT JOIN) means all records in left table whether there's a match in the right table or not, RIGHT JOIN is the opposite FULL OUTER JOIN being both.
Your query:
SELECT tableA.* FROM tableA LEFT JOIN tableB
ON tableA.id = tableB.id
WHERE tableB.id IS NULL
Has nothing to do with distinct records, using a left join and a where is null clause is the classic way to get UNMATCHED records from the left table.
Having written all the above I now think I know what you're after, you want to insert only records from table a into table b where they don't exist in table b already ?
If so then you are on the right track with the unmatched type query, that is the way to do it, I;m not aware of a better way to be honest.
hollystyles
Veteran Poster
1,182 posts since Feb 2005
Reputation Points: 262
Solved Threads: 68
Hmm my gut feeling is it isn't, but I also know the only way to be sure is to test it for ourselves using profiler and looking at the query plan for each scenario, which could give me some material for another Sql Article, this could be quite an interesting thread actually. I only have Sql 2005 having finally evicted Sql 2000 from my box only yesterday, but the results should still be indicative for both versions.
hollystyles
Veteran Poster
1,182 posts since Feb 2005
Reputation Points: 262
Solved Threads: 68
OK so I set up a database as follows:
IF EXISTS(SELECT * FROM sysdatabases WHERE name='DBmyUnmatchedTest')
DROP DATABASE DBmyUnmatchedTest
GO
create database DBmyUnmatchedTest
GO
use DBmyUnmatchedTest
GO
create table a
(
iD int identity(1,1) primary key,
val int
)
create table b
(
iD int NOT NULL,
val int
)
declare @val int
set @val = 1
while @val <= 1000
begin
insert into a (val)
values (@val)
select @val = @val + 1
end
insert into b (iD, val)
select iD, val from a where iD <= 500
This creates a database and two related tables, we insert half the records in table a to table b.
I then executed the two types of statement and looked at the query plans. The second statement using the where not exists syntax did appear to have a slightly better plan, the where b.iD is null had an additional filter at the end of the plan, but it only representes 1% of the total cost so really there wasn't much between them. When I added an index to the iD column in table b both queries' performance improved equally.
Due to limitations in the express edition of Sql Server 2005 I can't attach Sql Profiler. But the client statistics reveal a slightly better average Total Execution Time of 108ms for the original left join query as opposed to an average of 130ms for the not exists method.
hollystyles
Veteran Poster
1,182 posts since Feb 2005
Reputation Points: 262
Solved Threads: 68