Hi, we are developing the KNN (k-nearest neighbours) classification algorithm and are unable to proceed past a certain point. We have been able to calculate the Euclidean distance from an instance to each training point, sort the distances in ascending order, and pick the three smallest. However, we are now supposed to trace each of those distances back to the pair of points (the training point and the instance) that produced it. Once we know the points, we should be able to find the majority class label among them and assign it to the instance. We are stuck after sorting the values; any help will be much appreciated.

The Java code is below:

static void classifierKNN()
  {
       List trainingset = dataset.subList(0, dataset.size()-3);
       List testset = dataset.subList(dataset.size()-2, dataset.size()-1);
       List lastele = (List) testset.get(0); // take the test instance itself, not the one-element sublist wrapping it
       ArrayList<Double>  dm = new ArrayList<Double>();
       ArrayList <Double> point = new ArrayList<Double>();

        int size= trainingset.size();
        double dist[]= new double[size];
        for(int i=0;i<trainingset.size();i++)
      {
          List temp =(List)trainingset.get(i);
          dist[i] = (distanceMetric(temp,lastele));
          dm.add(dist[i]);
          point.addAll(temp);
      }

        System.out.println("Distance: "+ dm);
        // The values of dm and point should ideally be located in one structure (list or array),
        // but we are unable to concatenate them successfully and keep getting garbage values.
        System.out.println("Point: "+ point);

        ArrayList<Double>  asc = new ArrayList<Double>();
        Arrays.sort(dist);
        System.out.println("Asc sorting");
        for (int i=0;i<3;i++)
        {

          asc.add(dist[i]);
        }
         System.out.println(asc);
  }

Hi, you just need to create a class, for example named MyDistance, that holds a point's id and its distance to the tested point. Then define a comparator that lets the sorting function compare two MyDistance objects by the value stored in the distance member. When you sort the array of MyDistance objects with that comparator, each id moves together with its distance, so it is easy to trace back which point produced which distance.
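
A rough sketch of that idea follows. It is not the original poster's code: it assumes the training points are double[] arrays with a parallel String[] of class labels, and the names KnnSketch, euclidean and classify are made up for illustration (euclidean only stands in for the distanceMetric method from the question).

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Pairs a training point's index with its distance to the tested point.
class MyDistance {
    final int id;          // index of the point in the training set
    final double distance; // distance from that point to the tested point

    MyDistance(int id, double distance) {
        this.id = id;
        this.distance = distance;
    }
}

class KnnSketch {
    // Plain Euclidean distance (hypothetical stand-in for distanceMetric).
    static double euclidean(double[] a, double[] b) {
        double sum = 0.0;
        for (int i = 0; i < a.length; i++) {
            double d = a[i] - b[i];
            sum += d * d;
        }
        return Math.sqrt(sum);
    }

    // Returns the majority label among the k training points closest to query.
    static String classify(double[][] points, String[] labels, double[] query, int k) {
        List<MyDistance> dists = new ArrayList<>();
        for (int i = 0; i < points.length; i++) {
            dists.add(new MyDistance(i, euclidean(points[i], query)));
        }
        // Sorting the objects keeps every id attached to its distance.
        dists.sort(Comparator.comparingDouble((MyDistance m) -> m.distance));

        Map<String, Integer> votes = new HashMap<>();
        String best = null;
        int bestCount = 0;
        for (int i = 0; i < Math.min(k, dists.size()); i++) {
            String label = labels[dists.get(i).id]; // trace back through the id
            int count = votes.merge(label, 1, Integer::sum);
            if (count > bestCount) { // on a tie, the label that got there first wins
                bestCount = count;
                best = label;
            }
        }
        return best;
    }

    public static void main(String[] args) {
        double[][] points = { {1.0, 1.0}, {1.5, 2.0}, {5.0, 5.0}, {6.0, 5.5}, {1.2, 0.8} };
        String[] labels = { "A", "A", "B", "B", "A" };
        System.out.println(classify(points, labels, new double[] {1.1, 1.0}, 3));
    }
}
```

The ids of the first k sorted entries are the indices of the nearest training points, so the same loop can also print the points themselves if needed.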

See http://www.javadeveloper.co.in/java-example/java-comparator-example.html for an example of how to use a comparator to sort an array of custom objects. Hope this helps.

Hi,

If you'd like to make it more efficient, you could use a k-d tree data structure to find the closest points.
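
To give a feel for what that involves, here is a minimal sketch rather than a tuned implementation. It assumes a non-empty set of double[] points that all have the same number of dimensions, and it only answers a single nearest-neighbour query; for KNN the search would have to keep the k best candidates instead of one. The class name KdTreeSketch and its methods are invented for this example.

```java
import java.util.Arrays;
import java.util.Comparator;

// Minimal k-d tree over double[] points with a single nearest-neighbour query.
// Not thread-safe: the search keeps its running best in instance fields.
class KdTreeSketch {
    private static class Node {
        final double[] point;
        final Node left, right;

        Node(double[] point, Node left, Node right) {
            this.point = point;
            this.left = left;
            this.right = right;
        }
    }

    private final Node root;
    private final int dims;
    private double[] best;
    private double bestDistSq;

    KdTreeSketch(double[][] points) {
        dims = points[0].length; // assumes at least one point
        root = build(points.clone(), 0, points.length, 0);
    }

    // Builds the subtree for pts[from, to), splitting at the median on axis depth % dims.
    private Node build(double[][] pts, int from, int to, int depth) {
        if (from >= to) return null;
        final int axis = depth % dims;
        Arrays.sort(pts, from, to, Comparator.comparingDouble((double[] p) -> p[axis]));
        int mid = (from + to) >>> 1;
        return new Node(pts[mid],
                build(pts, from, mid, depth + 1),
                build(pts, mid + 1, to, depth + 1));
    }

    // Returns the stored point closest to query (squared Euclidean distance).
    double[] nearest(double[] query) {
        best = null;
        bestDistSq = Double.POSITIVE_INFINITY;
        search(root, query, 0);
        return best;
    }

    private void search(Node node, double[] query, int depth) {
        if (node == null) return;
        double d = distSq(node.point, query);
        if (d < bestDistSq) {
            bestDistSq = d;
            best = node.point;
        }
        int axis = depth % dims;
        double diff = query[axis] - node.point[axis];
        Node near = diff < 0 ? node.left : node.right;
        Node far = diff < 0 ? node.right : node.left;
        search(near, query, depth + 1);
        // The far side can only hold a closer point if the splitting plane
        // is nearer to the query than the best match found so far.
        if (diff * diff < bestDistSq) search(far, query, depth + 1);
    }

    private static double distSq(double[] a, double[] b) {
        double sum = 0.0;
        for (int i = 0; i < a.length; i++) {
            double d = a[i] - b[i];
            sum += d * d;
        }
        return sum;
    }
}
```

Building the tree this way re-sorts at every level, which is simple but not the fastest construction; the benefit shows up when many queries are run against the same training set.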

Phil
