Hi there,
I'm Jega, and I'm doing a project on data preprocessing using discretization (a data mining method). I need to implement an algorithm called EFB (Equal Frequency Binning) in C++. Is anyone here familiar with machine learning? I have the algorithm, but I'm weak on the coding part.

Discretize(Interval)
Begin
    PotentialCutpoints = ComputeCutPoints(Interval);
    PriorityQueueIntervals.Add(Interval);
    While stopping criteria is not met do
        If PriorityQueueCPs is empty
            Foreach cutpoint CP in PotentialCutpoints do
                scoreCP = ComputeScoringFunction(CP, Interval);
                PriorityQueueCPs.Add(CP, scoreCP);
            End For
        Else
            BestCP = PriorityQueueCPs.GetBest();
            CurrentInterval = PriorityQueueIntervals.GetBest();
            NewIntervals = Split(CurrentInterval, BestCP);
            LeftInterval = NewIntervals.GetLeftInterval();
            RightInterval = NewIntervals.GetRightInterval();
            PotentialLeftCPs = ComputeCutPoints(LeftInterval);
            PotentialRightCPs = ComputeCutPoints(RightInterval);
            Foreach cutpoint CP in PotentialLeftCPs do
                scoreCP = ComputeScoringFunction(CP, LeftInterval);
                PriorityQueueCPs.Add(CP, scoreCP);
                PriorityQueueIntervals.Add(LeftInterval, scoreCP);
            End For
            // the same foreach cycle for PotentialRightCPs
        End If
    End While
End
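
The pseudocode above is a generic top-down discretizer driven by a scoring function. Plain equal-frequency binning does not need the priority queues: it sorts the values and places cut points so that each bin gets roughly the same number of values. Here is a minimal C++ sketch of that simpler idea. The function names `equalFrequencyCutPoints` and `binIndex`, and the choice to place each cut halfway between two neighbouring values, are my assumptions and are not part of the algorithm as posted.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Hypothetical helper: returns at most (numBins - 1) cut points so that each
// bin holds roughly the same number of values. Takes the data by value
// because it needs a sorted copy.
std::vector<double> equalFrequencyCutPoints(std::vector<double> values,
                                            std::size_t numBins) {
    std::vector<double> cuts;
    if (values.empty() || numBins < 2) return cuts;

    std::sort(values.begin(), values.end());
    const std::size_t n = values.size();

    for (std::size_t k = 1; k < numBins; ++k) {
        // Index of the first value that belongs to bin k.
        const std::size_t idx = k * n / numBins;
        if (idx == 0 || idx >= n) continue;

        // Assumption: put the cut halfway between the two neighbours.
        const double cut = (values[idx - 1] + values[idx]) / 2.0;

        // Skip repeated cuts, which appear when the data has many ties.
        if (cuts.empty() || cut > cuts.back()) cuts.push_back(cut);
    }
    return cuts;
}

// Hypothetical helper: maps a value to its bin index in [0, cuts.size()].
// A value exactly equal to a cut point goes into the bin on the right.
std::size_t binIndex(double v, const std::vector<double>& cuts) {
    return static_cast<std::size_t>(
        std::upper_bound(cuts.begin(), cuts.end(), v) - cuts.begin());
}
```

For the values 1 through 9 and 3 bins, this gives the cut points 3.5 and 6.5, so each bin holds three values. With heavily tied data the bins can end up uneven or fewer than requested, since equal values are never split across bins.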
