Is there a systematic method to analyze the time complexity of a program strictly from output data such as run time, the number of data swaps, data comparisons, etc.? For example...

    Data Size | Run Time | Comparisons | Moves
    ---------------------------------------------
    1000      | 0.00     | 15448       | 23965
    10000     | 0.00     | 251006      | 369906
    100000    | 0.04     | 4029787     | 5551078
    1000000   | 0.65     | 64737104    | 84244884

I'm pretty sure the algorithm that produced this data has a time complexity of O(n^1.5). I am wondering if there is a way to verify that from the data above?

I want to do this because I am trying to analyze another, more complex algorithm. (I can't get its complexity simply by looking at its code, so an alternative method would be helpful.)

All 3 Replies

Well, you can look for a correlation between the size of the thing the calculation depends on (data size) and the resulting size of the calculation, which could in fact be any of the other columns. However, your run time column is clearly useless here as it does not have the resolution to show anything.

However, using either comparisons or moves could work.

Excel, or any other spreadsheet, is a good tool for this. Plot the data in a graph and then get Excel to calculate a trend line. In this case it appears that both comparisons and moves have a near-linear relationship to data size.

However, the problem with mathematical analysis of this data is the lack of data points; four really isn't a big set, and that makes it easy for the maths to go wrong.
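If you want a number rather than just a trend line, one ordinary bit of maths (not anything specific to your program, just an assumption that the counts follow a power law count ≈ c·n^k) is to take the slope of log(count) against log(n) between consecutive rows; that slope estimates the exponent k. A minimal sketch in C++, using the comparison counts from the question:

    #include <cmath>
    #include <cstdio>

    int main()
    {
        // Sizes and comparison counts copied from the table in the question
        const double n[]     = { 1000, 10000, 100000, 1000000 };
        const double count[] = { 15448, 251006, 4029787, 64737104 };

        // If count ~ c * n^k, then k is the slope of log(count) against log(n).
        // Estimating it between consecutive rows shows whether it stays stable.
        for (int i = 1; i < 4; ++i)
        {
            double k = (std::log(count[i]) - std::log(count[i - 1]))
                     / (std::log(n[i]) - std::log(n[i - 1]));
            std::printf("n: %8.0f -> %8.0f   estimated exponent k = %.2f\n",
                        n[i - 1], n[i], k);
        }
        return 0;
    }

If the printed exponents hover around 1.5, that supports the O(n^1.5) guess; if they settle somewhere else, that is what the data is actually saying. The same thing can be done in the spreadsheet by adding a power trend line to a log-log plot.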

Banfa said almost everything you should know. But I would not recommend using the Comparisons or Moves columns for time-complexity analysis, simply because comparisons and/or moves can show a different trend than the run time column. Since these trends can in principle differ, I would strictly recommend plotting only the run time.
(Of course, as Banfa said, you don't have enough points, so re-do the experiment and collect a lot more data points, 10 or even 100, and preferably with good time resolution.)
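If the algorithm is written in C++ (the question doesn't say, so that is an assumption here), a small driver loop can generate those extra points directly with a higher-resolution clock; std::sort below is only a stand-in for the algorithm under test:

    #include <algorithm>
    #include <chrono>
    #include <cstddef>
    #include <cstdio>
    #include <random>
    #include <vector>

    int main()
    {
        std::mt19937 rng(12345);

        // Sweep many sizes (20 steps here) instead of four, so the fitted
        // trend line has something to work with.
        for (int step = 1; step <= 20; ++step)
        {
            const std::size_t n = 50000u * static_cast<std::size_t>(step);
            std::vector<int> data(n);
            for (std::size_t i = 0; i < n; ++i)
                data[i] = static_cast<int>(rng());

            const auto start = std::chrono::steady_clock::now();
            std::sort(data.begin(), data.end());   // replace with the algorithm under test
            const auto stop = std::chrono::steady_clock::now();

            const double ms =
                std::chrono::duration<double, std::milli>(stop - start).count();
            std::printf("%zu\t%.3f\n", n, ms);     // tab-separated: paste into a spreadsheet
        }
        return 0;
    }

Paste the output into the spreadsheet and fit the trend line against twenty points instead of four.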

I agree, but if you are going to use time you will need a smaller unit of time so that you get usable results.

Is that time in seconds? If so, you need to find a way to measure the time in milliseconds.

Also, the problem with time, if you are not careful, is that on a multi-core, multi-tasking system you can accidentally measure elapsed time rather than the processing time of your algorithm. If your program gets interrupted a lot by a higher-priority process, that could severely skew your results.
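Assuming C++ again, std::chrono::steady_clock measures elapsed (wall-clock) time, while std::clock() reports processor time on most implementations (its exact meaning is implementation-defined, so check your platform's documentation). Timing the same call both ways makes the difference visible:

    #include <algorithm>
    #include <chrono>
    #include <cstddef>
    #include <cstdio>
    #include <ctime>
    #include <vector>

    int main()
    {
        // Dummy workload standing in for the algorithm under test
        std::vector<int> data(1000000);
        for (std::size_t i = 0; i < data.size(); ++i)
            data[i] = static_cast<int>(data.size() - i);

        const auto wall_start = std::chrono::steady_clock::now(); // elapsed time
        const std::clock_t cpu_start = std::clock();              // processor time

        std::sort(data.begin(), data.end());                      // replace with your algorithm

        const std::clock_t cpu_end = std::clock();
        const auto wall_end = std::chrono::steady_clock::now();

        const double wall_ms =
            std::chrono::duration<double, std::milli>(wall_end - wall_start).count();
        const double cpu_ms =
            1000.0 * static_cast<double>(cpu_end - cpu_start) / CLOCKS_PER_SEC;

        std::printf("elapsed: %.3f ms   cpu: %.3f ms\n", wall_ms, cpu_ms);
        return 0;
    }

If the two numbers diverge a lot, something else was using the machine and the elapsed-time figure says little about the algorithm itself.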
