I am trying to come up with a query to track a student cohort based on their start term. I want to see if the number of students that started in 2012 actually go down in 4 years time. For this example I am starting with fall 2012 (12/FA). I am closer to where I want to be from where I started. Nevertheless, the results I want should be:

| Student Cohort 2012 | Year |
------------------------------
| 2                   | 2012 |
| 1                   | 2013 |
| 1                   | 2014 |
| 0                   | 2015 |

But these are the results I am getting:

| Student Cohort 2012 | Year |
------------------------------
| 2                   | 2012 |
| 6                   | 2014 |

I have created an SQLFiddle of data that can be viewed here: http://sqlfiddle.com/#!2/1547e/3

Any help with figuring this out is greatly appreciated.

Recommended Answers

All 4 Replies

You'll want to set DISTINCT on your query

select COUNT(DISTINCT a.stuAcadCredID, 0) AS 'Student Cohort 2012', YEAR( STR_TO_DATE( a.addDate,  "%Y" ) ) AS  'Year'
from stu_acad_cred a
LEFT JOIN application b ON a.termCode = b.startTerm 
where a.addDate >= DATE_SUB(CURDATE(),INTERVAL 4 YEAR)
AND a.termCode = '12/FA'
GROUP BY YEAR( STR_TO_DATE( a.addDate,  "%Y" ) )

That will make the numbers accurate. I'm not really sure how to return a year that doesn't have any data. I'm not an SQL person, so there is probably some trick that I'm not aware of.

@pixelsoul Thanks for chiming in. It makes sense to use DISTINCT and have no idea why I did not think of that. I will take a closer look at your query, but I have edited my Fiddle (http://sqlfiddle.com/#!2/9ff722/1) with your query and the results are still not what I am looking for. I also found some mistakes I made and corrected it.

There are two students that have a start term of 12/FA, but only one of them actually registered for the 12/FA term (stu_acad_cred records). So, the 12/FA cohort should start with student since only one student registered for a course in 12/FA. I probably need to take a second look at my query and use an INNER JOIN instead of a LEFT JOIN.

Any input is still welcomed. Again, thanks for taking the time look at my issue.

I think you want something more like this

SELECT
COUNT(DISTINCT a.stuID) AS 'Students'
,YEAR( STR_TO_DATE( a.addDate,  "%Y" ) ) AS  'Year'

FROM
stu_acad_cred a

INNER JOIN application b
ON a.stuID = b.personID

WHERE a.addDate >= DATE_SUB(CURDATE(),INTERVAL 4 YEAR)
AND a.termCode = '12/FA'

GROUP BY YEAR( STR_TO_DATE( a.addDate,  "%Y" ) )

Your latest query is showing 3 for STUDENT COHORT 2012 because student/person 12 is involved in 3 of different groups. I believe what you want is a distinct count of students that are involved, which for 2012 would only be 1, and for 2013 would be 4. Hopefully I am correct about that.

Ok, here is the updated http://sqlfiddle.com/#!2/9ff722/55 which now gives me the correct number of students in a particular cohort. Now, this line a.addDate >= DATE_SUB( CURDATE( ) , INTERVAL 4 YEAR ) in my query is not needed since it is not doing what I hoped.

I want to show the 12/FA cohort over time. Meaning that I want to track this student until he or she graduates based on registrations to see if he or she continues or withdraws at some point. Is it possible to start with 2012 and then have the select query increment/loop step wise (by year or term) for a certain number of years or terms? If so, how?

Thanks

Be a part of the DaniWeb community

We're a friendly, industry-focused community of developers, IT pros, digital marketers, and technology enthusiasts meeting, networking, learning, and sharing knowledge.