Now, an affecting factor is that there may be anywhere from no strings with the same length in the vector up to all strings having the same length.
Obviously the best case for comparison is where no strings have the same length, in which case the comparison is O(1) because std::string::length() has constant complexity[1].
Worst case is when all strings have the same length. This is where lexicographical_compare() is called every time. Then the cost of a comparison grows from O(1) to O(M) where M is the length of the string.
Average case is harder because the complexity of your comparisons depend on the data. Without an expected average data set, the best you can do is make broad assumptions (eg. half of the strings will have the same length).
Also note that the complexity of std::sort() is total, not just comparisons. So for a more specific estimate you'd want to calculate the comparisons rather than overall time complexity and add your per-item comparison cost to that. It's not as simple with std::sort() because you don't know the exact algorithm (though introsort is a good assumption).
[1] It's not a hard requirement, but I'd be surprised to find an implementation that doesn't do it.