I am developing an sql query to use to grab information from a postgres db. The scenario is as follows: the database I am grabbing information is the backend postgres database of an application (Tableau). My objective is to pull all of the workbooks, views, and datasources (considered objects) that are untagged in the database. I am using 4 tables - taggings (contains all tagged objects related to the object by the objects id), views (contains all views and their ids), workbooks (contains all workbooks and their ids), and datasources (contains all datasources and their ids).

I have already successfully created queries to pull all of the tagged objects from the database (they are the inner select's in the not exists statements), but getting the untagged objects is proving more difficult. The code I currently have works successfully for the datasources, but the workbooks section returns one extra record than it should and the views section returns 19 extra records than it should. I am using not exists to get all of the objects that are not tagged, and union to combine the separate queries for views, datasources, and workbooks. I cannot spot the error, so please let me know if you can.

I am testing the query using pgAdmin.

SELECT distinct w.name as Object, 
t.taggable_type as Type
FROM workbooks w, taggings t
WHERE t.taggable_type = 'Workbook'
AND NOT EXISTS
(SELECT distinct w.name
FROM taggings
WHERE taggings.taggable_type = 'Workbook' 
AND w.id = taggings.taggable_id)

Union

SELECT distinct v.name as Object, 
t.taggable_type as Type
FROM views v, taggings t
WHERE NOT EXISTS
(SELECT v.name
FROM taggings
WHERE taggings.taggable_type = 'View' 
AND v.id = taggings.taggable_id)
AND t.taggable_type = 'View'

Union

SELECT distinct d.name as Object, 
t.taggable_type as Type
FROM datasources d, taggings t
WHERE t.taggable_type = 'Datasource'
AND NOT EXISTS
(SELECT d.name
FROM taggings
WHERE taggings.taggable_type = 'Datasource' 
AND d.id = taggings.taggable_id)
order by type

I hope inner join works in postgress, try following query

select v.name as object, t.taggable_type as Type
FROM views v left outer join taggings t on (v.id = t.taggable_id
and t.taggable_type = 'View')
where t.taggable_id is null

select v.name as object, t.taggable_type as Type
FROM views v left outer join taggings t on (v.id = t.taggable_id
and t.taggable_type = 'View')
where t.taggable_id is null

This is a good idea, however this solution returns 76 less records than expected. I can't figure out why I can't get the correct records to be returned.

can you post sample data script and table structutre, I will try it here

I'm not sure what you mean by data script, but the table structures are below. It may be important to note that I am querying a database that is in the backend of an application, so I have no power to change any of the structure or data contained within, all I can do is query the data.

The views table structure that is relavent to this query:
id (serial number), name (char varying(255))

Note: A view represents an object. All objects have a similar table structure.

Taggings table structure that is relavent:
taggable_id (int), taggable_type(char varying(255))

Notes: the taggable id relates to the id field in the view table.
A "tagging" relates an object id to the tag id that it is tagged with when it is published. For example, a View with confidential information would have a confidential tag attached to it. The view's id and the confidential tag's id are both be linked together in the Taggings table.
The taggable type field will specify if the object tagged is a View, Workbook, or Datasource, which are all valid Tableau objects.

by sample I mean you sample values in "views" table and "taggings" table. then tell me what output you expect from that 2 tables. LIke give me as following
views
id , name
1, view_1
2, view_2
3, view_3
4, view_4

taggings
taggable_id, taggable_type
2, view
3, view

output required
1, view_1
4, view_4

Like about you give me sample data for both table, then give me what output u requrie from that 2 tables sample data. Is above given case is correct?

The sample case you have given is correct, yes. Only for the output I want
view_1, view
view_4, view
instead. Being the name and the taggable_type because I plan on doing this with multiple types of objects.

select v.name as object, t.taggable_type as Type
FROM views v left outer join taggings t on (v.id = t.taggable_id
and t.taggable_type = 'View')
where t.taggable_id is null

IF you looking for
view_1, view
view_4, view

then my previous query is perfect. It must work, check again your data.

Be a part of the DaniWeb community

We're a friendly, industry-focused community of developers, IT pros, digital marketers, and technology enthusiasts meeting, networking, learning, and sharing knowledge.