I am a new Python user and I am trying to write an implementation that calculates an empirical CDF. So far, I have some code (attached below) that returns a list of tuples [(datapoint, P(X>=x)), ...]. The problem I am trying to resolve is how to handle replicated data, e.g. [1, 1, 4, 6, 7, ...]. My implementation can't handle repeated numbers. Any ideas to improve my implementation would be welcome, thanks.

    class EmpiricalCDF:
        def __init__(self, datalist):
            '''
            class that holds a list of data and returns
            cdf defined as p(X>=x)
            '''
            self.datalist = datalist
            self.n = len(datalist)

        def cdf_data(self):
            data = self.datalist
            plotdata = []
            for i in range(len(data)):
                n = float(self.n)
                length = len(data)
                plotdata.append((data[0], length/n))
                data.pop(0)
            return plotdata
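
One possible direction for the repeated-values problem, as a minimal sketch only: count how often each distinct value occurs, then walk the distinct values in ascending order while tracking how many observations are still greater than or equal to the current one. The function name `empirical_survival` is made up for illustration and is not part of the class above. It keeps the P(X>=x) definition used in the question (strictly speaking, that is the complementary CDF rather than the usual P(X<=x)), and it does not modify the input list.

```python
from collections import Counter


def empirical_survival(data):
    """Return [(x, P(X >= x)), ...] with one entry per distinct value.

    Hypothetical helper for illustration; input does not need to be sorted.
    """
    n = len(data)
    counts = Counter(data)   # distinct value -> number of occurrences
    remaining = n            # observations >= the current value
    result = []
    for value in sorted(counts):
        # float() keeps true division under both Python 2 and Python 3
        result.append((value, remaining / float(n)))
        remaining -= counts[value]   # drop all copies of this value at once
    return result


points = empirical_survival([1, 1, 4, 6, 7])
```

With this approach the duplicated `1` produces a single tuple instead of two, and each probability is the fraction of observations at or above that value.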