hi,
I’ve got a big problem. Could anyone give me a hand?

My aim is:Remove the duplicate data, that is, the various entries of EndTime and Qos for a same ServiceID due to the repeated feedback. The rule is to only keep the records with the lowest Qos for a same ServiceID. If there are several records having the lowest Qos, then only keep whatever one of these records. The client accepts and prefers to this scheme. In this example, for the ID=6,7, and 8, just keep 6 or 7.

explain:There are some technical service data (ServiceID, TCID, EndTime, and QoS) submitted by client in a whole year, and I have added a unique primary key (ID by name) because there are some duplicates.

some data:

ServiceID   ID  TCID    EndTime Qos
2000    2   Jacob   2011/1/1    2
2000    3   Jacob   2011/1/1    2
2001    4   Jacob   2011/1/1    2
2002    5   Jacob   2011/2/3    1
2003    6   Tyler   2011/1/4    1

Data Structure:

  • ID: Unique primary key of record
  • ServiceID: ID of a certain service
  • TCID: ID of a technical support engineer
  • EndTime: Ending Time of aservice
  • Qos:Quality of service (1 Satisfied; 2 Average; 3Unsatisfactory/Dissatisfied).

DDL and insert SQL(mysql5):

CREATE TABLE `service` (
  `ServiceID` INTEGER(11) NOT NULL,
  `ID` INTEGER(11) NOT NULL ,
  `TCID` VARCHAR(40)  NOT NULL,
  `EndTime` DATE NOT NULL,
  `Qos` CHAR(1)  NOT NULL,
  PRIMARY KEY (`ID`),
  UNIQUE KEY `ID` (`ID`)
);
COMMIT;
INSERT INTO `service` (`ServiceID`, `ID`, `TCID`, `EndTime`, `Qos`) VALUES
  (2004, 9, 'Jacob', '2011-02-04', '1'),
  (2000, 2, 'Jacob', '2011-01-01', '2'),
  (2000, 3, 'Jacob', '2011-01-01', '2'),
  (2001, 4, 'Jacob', '2011-01-01', '2'),
  (2002, 5, 'Jacob', '2011-02-03', '1'),
  (2003, 6, 'Tyler', '2011-01-04', '1'),
  (2003, 7, 'Tyler', '2011-01-04', '1'),
  (2003, 8, 'Tyler', '2011-01-03', '2'),
  (2005, 10, 'Jacob', '2011-02-05', '1'),
  (2006, 11, 'Jacob', '2011-02-04', '2'),
  (2007, 12, 'Jacob', '2011-01-08', '1'),
  (2008, 13, 'Tyler', '2011-02-06', '1'),
  (2009, 14, 'Dylan', '2011-02-08', '1'),
  (2010, 15, 'Dylan', '2011-02-09', '1'),
  (2014, 16, 'Andrew', '2011-01-01', '1'),
  (2013, 17, 'Andrew', '2011-01-01', '1'),
  (2012, 18, 'Andrew', '2011-02-19', '1'),
  (2011, 19, 'Andrew', '2011-02-02', '1'),
  (2015, 20, 'Andrew', '2011-02-01', '1'),
  (2016, 21, 'Andrew', '2011-01-19', '1'),
  (2017, 22, 'Jacob', '2011-01-01', '1'),
  (2018, 23, 'Dylan', '2011-02-03', '1'),
  (2019, 24, 'Dylan', '2011-01-09', '1'),
  (2020, 25, 'Dylan', '2011-01-01', '1'),
  (2021, 26, 'Andrew', '2011-01-03', '1'),
  (2021, 27, 'Dylan', '2011-01-11', '1'),
  (2022, 28, 'Jacob', '2011-01-09', '1'),
  (2023, 29, 'Tyler', '2011-01-19', '1'),
  (2024, 30, 'Andrew', '2011-02-01', '1'),
  (2025, 31, 'Dylan', '2011-02-03', '1'),
  (2026, 32, 'Jacob', '2011-02-04', '1'),
  (2027, 33, 'Tyler', '2011-02-09', '1'),
  (2028, 34, 'Daniel', '2011-01-06', '1'),
  (2029, 35, 'Daniel', '2011-02-01', '1');
 COMMIT;

Almost forget to say that, just a reminder, the client only accept the SQL instead of stored procedure in database to implement it.

this is my first aim, there are 3 more, I am intended to complete all these steps one after another. Could anyone help achieve the first goal? I know it is quite complicated and many thanks to you in advance.

Recommended Answers

All 9 Replies

I forget to say that: I only need query statement because customer did not allow us to write database, so I cant run update or delete statement.
thank you again, all the nice people.

Member Avatar for diafol

So to get this right:

SHow distinct service id, keeping lowest QoS, then lowest ID

SELECT DISTINCT serviceid,tcid,endtime,qos FROM (SELECT * FROM service ORDER BY serviceid, qos, id) AS base GROUP BY serviceid

I may have overcooked it.

commented: answered my question,thank you +0

Wow, It worked! thank you, ardav
It's what I want. Then I can go with next step, and can you help me further? (or should I start a new question/thread?)

aim2: Qos=1 represents the Satisfied Service. The monthly total Satisfied Services achieved by each technical support engineer is referred to as “Personal Total Satisfied Services this Month”. Those technical support engineers whose “Personal Total Satisfied Service this Month” ranking among the top 2 in that month are referred to as the “Top 2 since this Month”. If they are outstanding enough to be entitled to the “Top 2 since this Month” every month, then they can be referred to as “Outstanding of Class 1”. All in all, this step is to compute the “Outstanding of Class 1”. In this example, “Top 2 since this Month” for January is Andrew and Jacob, and that for February are Andrew, Dlyan, and Jacob. Therefore, the honor title of “Outstanding of Class 1” is awarded to Andrew and Jacob.

aim3:Then, to compute the “Outstanding of Class 2” (the engineers of “Top 2 since this Month” are not included) equals to compute those ranking the third and the forth places. In this example, the “Outstanding of Class 2” is the Tyler.

aim4:The final goal is to combine the “Outstanding of Class 1” with “Outstanding of Class 2”. The result will be ultimately transferred to report for rendering. My dataSet should be:

TCID           level
Andrew         top2
Jacob          top2
Tyler          top4

I know It maybe a complicated SQL, but I'm in trouble for 2 weeks, hope some one can help me.

Member Avatar for diafol

SELECT tcid, mnth, cnt, RANK
FROM(SELECT TCID, mnth, cnt,
@tcidv:=CASE WHEN @mn <> mnth THEN 1 ELSE @tcidv+1 END AS RANK,
@mn:=mnth AS MYCNT FROM(SELECT @tcidv:= 0) AS t,(SELECT @mn:= 0) AS m,(SELECT tcid, MONTH(endtime) AS mnth, COUNT(Qos) AS cnt FROM service WHERE qos = 1 GROUP BY mnth, tcid ORDER BY mnth, cnt DESC, tcid) AS base) AS base2 WHERE RANK < 5

Will give the top four for every month, ranked like this:

Capture45

What it doesn't do is keep RANK = 1 for Dylan and Jacob for month 2. A bit of playing about with the SQL variables should sort that out.

Are you sure you need all this through SQL, you may find it easier through some language like php or asp?

thank you ardav for reply!
I must use SQL because I should return a dataset/resultSet to reporting tools , Jasper only identify SQL ( JAVA bean may does too, but It seems too complicated).
about rank and data:
after filtering duplicate date, the TCID and cnt and rank should be:

month   cnt TCID    rank
1       4     Andrew    1
1       1     Daniel    5
1       2     Dyla    3
1       3     Jacob  2
1       2     Tyler  3
2       4     Andrew    1
2       1     Daniel    5
2       4     Dyla    1
2       4     Jacob  1
2       2     Tyler  4

I think you forgot filtering data as you have done it in aim1, and rank is not equal to order, we should use rank( otherwise, how to chose the people with the same cnt)

according to data above:
Top 2 since January is: Andrew, Jacob
Top 2 since February is: Andrew, Dyla, Jacob
so Outstanding of Class 1 is: Andrew, Jacob

Top 4 since January is: Dyla, Tyler
Top 4 since February is: Tyler
so Outstanding of Class 2 is: Tyler

please continue to help me
BTW, I think datakeyword's answer is right, but It is not a SQL solution.

hi, datakeyword
thanks for your email reply and msn help, I have got correct answer and add esProc to Jasper, It worked well.
I still need a SQL solution, because It is desighed to using SQL.

hi, ardav
datakeyword's answer is right, and It spend me half day to study it.
But ESPROC is not a ANSI standard, so I still need a SQL solution, waiting for your help.

I will take ESPROC as my final solution in the project at this friday if I can't make a MySQL statement out.
It seemed more easy to understand than MYSQL syntax even MSSQL, and It is in my ability.

Be a part of the DaniWeb community

We're a friendly, industry-focused community of developers, IT pros, digital marketers, and technology enthusiasts meeting, networking, learning, and sharing knowledge.