I'm looking to select near-duplicate rows in a table. Initially I just sorted all the rows by name and picked out the duplicates manually (which took a while), but I'm wondering if there's a "real" way to go about doing this. The problem is that the duplicate names can differ slightly. I've also tried cutting the names down to substrings and comparing those, but if I cut the names to 5 characters, it doesn't catch all the duplicates, and if I cut them to 4 characters, it flags some entries as duplicates when they're not. Any ideas?
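To make the trade-off described above concrete, here is a minimal Python sketch. It assumes the substrings are the leading characters of each name (the question does not say so explicitly), and the sample names and the `group_by_prefix` helper are hypothetical, invented only for illustration.

```python
from collections import defaultdict


def group_by_prefix(names, length):
    """Bucket names by their first `length` characters (case-insensitive).

    Hypothetical helper: only buckets holding more than one name are
    returned, since those are the candidate duplicates.
    """
    groups = defaultdict(list)
    for name in names:
        key = name.strip().lower()[:length]
        groups[key].append(name)
    return {key: members for key, members in groups.items() if len(members) > 1}


# Made-up sample data: "Acmee Corp" is a typo of "Acme Corp",
# while "Acmeville Bank" is a genuinely different entry.
names = ["Acme Corp", "Acmee Corp", "Acmeville Bank", "Zenith Ltd"]

# 5 characters: "acme " and "acmee" differ, so the real duplicate is missed.
print(group_by_prefix(names, 5))

# 4 characters: all three share "acme", so an unrelated name is flagged too.
print(group_by_prefix(names, 4))
```

No fixed prefix length separates the two cases in this sample, which is why a manual pass over the candidates may still be needed.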

OK, I've been told that a few extra matches are fine and that we can look through them manually. But I have an additional question. My query outputs only the substring by which I did the grouping. How can I get the query to output the primary key and full string as well? The database is PostgreSQL. I understand that in MySQL you could just add the fields you want to the SELECT clause, but PostgreSQL won't let you. Is there a way around that?

q1: my magic crystal ball allows me to see what you have done and point to the right spot in your code.....
English translation: post your select statement, at least, between [code] [/code] tags, and perhaps a description of the column names.

q2: well, actually, you were told wrong. See http://www.postgresql.org/docs/9.0/static/sql-select.html, for example: WHERE somecolumn SIMILAR TO avalue ORDER BY something different
and just for fun coz I like the search animation let me google that for you
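For the follow-up question about returning the primary key and full name alongside the grouped substring, one common pattern is to join the table back to the grouped subquery. The sketch below is only an illustration under stated assumptions: the table `customers` and its columns `id` and `name` are hypothetical (the original query was never posted), and it runs against an in-memory SQLite database through Python's `sqlite3` module as a stand-in. The query uses only `substr`, `lower`, `GROUP BY`, `HAVING`, and a join, so it should carry over to PostgreSQL, but it has not been checked there.

```python
import sqlite3

# In-memory SQLite database as a stand-in; table and column names are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT)")
conn.executemany(
    "INSERT INTO customers (id, name) VALUES (?, ?)",
    [(1, "Acme Corp"), (2, "Acmee Corp"), (3, "Zenith Ltd")],
)

# The subquery finds prefixes shared by more than one row; joining back to
# the table recovers the primary key and the full name for each such row.
query = """
SELECT c.id, c.name, d.prefix
FROM customers AS c
JOIN (
    SELECT substr(lower(name), 1, 4) AS prefix
    FROM customers
    GROUP BY substr(lower(name), 1, 4)
    HAVING count(*) > 1
) AS d ON substr(lower(c.name), 1, 4) = d.prefix
ORDER BY d.prefix, c.id
"""

rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

Because the outer query is not grouped, any column of the table can be listed in its SELECT clause without having to appear in a GROUP BY.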
