I am trying to determine the time complexity (and space complexity if anyone would like to have a go at it) of a sort function i implemented with a custom comparator. it is the sort function from the STL <algorithm>. First off, I am using Visual C++, so I am assuming it uses the standard C++ sort, which from what I have read on the web has an average complexity of N*log(N) and a worst case of N^2. Now, here is the implementation:

bool by_length (const string& s1, const string& s2) {
    
     if ( s1.length() == s2.length() ) 
	return lexicographical_compare( s1.begin(), s1.end(), s2.begin(), s2.end() );
									
	return s1.length() < s2.length();
}


int main() {

vector<string> v;

//populate vector here....

sort(v.begin(), v.end(), by_length);

return 0;
}

Now, i have also read that lexicographical_compare (another STL <algorithm>) runs in linear time. I am looking for best case, average case, and worst case. Now, an affecting factor is that there may be anywhere from no strings with the same length in the vector up to all strings having the same length. any help is appreciated. also, if anyone has any tips on optimization, that is appreciated also.

Now, an affecting factor is that there may be anywhere from no strings with the same length in the vector up to all strings having the same length.

Obviously the best case for comparison is where no strings have the same length, in which case the comparison is O(1) because std::string::length() has constant complexity[1].

Worst case is when all strings have the same length. This is where lexicographical_compare() is called every time. Then the cost of a comparison grows from O(1) to O(M) where M is the length of the string.

Average case is harder because the complexity of your comparisons depend on the data. Without an expected average data set, the best you can do is make broad assumptions (eg. half of the strings will have the same length).

Also note that the complexity of std::sort() is total, not just comparisons. So for a more specific estimate you'd want to calculate the comparisons rather than overall time complexity and add your per-item comparison cost to that. It's not as simple with std::sort() because you don't know the exact algorithm (though introsort is a good assumption).


[1] It's not a hard requirement, but I'd be surprised to find an implementation that doesn't do it.

@Narue - first off, is it possible to find out which sort is being implemented with std::sort()? second, you said:

for a more specific estimate you'd want to calculate the comparisons rather than overall time complexity and add your per-item comparison cost to that.

just to be clear, would i add the inner lexicographic_compare time complexity to the outer sort's time complexity, or multiply by it?

is it possible to find out which sort is being implemented with std::sort()?

Yes, but the answer would only apply to a single implementation of the standard library.

would i add the inner lexicographic_compare time complexity to the outer sort's time complexity, or multiply by it?

One possible example: O(N logN + f(M)) where f(M) is the cost of M comparisons.

Edited 5 Years Ago by Narue: n/a

This article has been dead for over six months. Start a new discussion instead.