Hello all,

I have a long list of coordinates which are supposed to trace the outline of a city. They are supposed to go into a program which plots the trajectory from one coordinate to the next, thereby plotting the outline. For some reason (I'm not sure why; I'm trying to help a friend with her lab work and now I'm just attached to figuring out the problem), the coordinates are scrambled. Furthermore, there are some coordinates present which are found inside, rather than on the perimeter of, the city. I'm trying to write an algorithm that sorts the coordinates in order and filters out the interior ones. My approach is as follows:

  1. Find the average x and y coordinates
  2. Transform all cordinates to be centered around the average (subtract average x and y from each coordinate pair).
  3. Use atan2 function to get the quadrant-inclusive tangents of each coordinate pair relative to the origin.
  4. Peform a selection sort and sort all cordinates in by inreasing arctan value.
  5. During selection sort, if two consecutive coordinate pairs are found within x degrees of each other, compare their respective distances from the origin. Keep the farthest one and delete the inner one.

When the x (degrees) value is set to 0, the algorithm correctly goes around the perimeter but gets fuzzy in the regions where there are interior coordinates present. However, when x is set to anything but 0, a bunch of points are chopped off. Strangely, no matter what the x value is, it's always the same points which are deleted. Could someone please look at the program below and tell me where I'm going wrong?

Note: The program is fairly well commented, but I'm trying to teach my friend simple C++. She's already familiar with simple C, so a lot of the commentary is geared for her. Please pardon the excessive commenting.

P.S. I'm self taught in C# and basically learned C++ on the fly; if you see any examples of poor or inefficient coding, I'd definitely appreciate the critique.

P.P.S.: I know the lat/long values should have been stored in a 2xn array but I couldn't figure out how to pass multidimensional arrays, so I just made one for lat and one for long.

#include <cstdlib> // was stdlib.h in C
#include <iostream> // was stdio.b in C
#include <string>   // needed to use strings
#include <fstream>  // needed to read and write to files (to create filestreams)
#include <math.h>   // for performing exponent operations
#include <sstream>  // for converting int to string (for status update)

using namespace std; // Write it at the top of every C++ program.

const int COORDINATE_LIMIT = 20000; // The size of the latitude and longitude arrays. This changes with size of file.

string GetFileName(); // gets the NAME of the input file
void ReadFile(string, double[], double[], int &, string[]); // gets the VALUES in the input file, AND the number of coordinates, AND the column headers.
void SortFile(double[], double[], int); // Sorts the data. 
string GetPrintName(); // gets the NAME of the file to be written
void PrintFile(string, double[], double[], int, string[]); // writes the sorted arrays to a file.

int main(int argc, char *argv[])
{
    string input_Filename; // the name of the input file
    string output_Filename; // the name of the sorted file to be written
    double latitude[COORDINATE_LIMIT]; // the list of latitude values
    double longitude[COORDINATE_LIMIT]; // the list of longitude values
    int NumCoords;  // the number of coordinates in the input file
    string Headers[2]; // the headers at the top of the input file. It might be necessary to keep these so that Igor can read the file?

    cout << "Warning. This program only allows " << COORDINATE_LIMIT << " coordinate pairs. If you have more, please change the array size and recompile.\n\n";

    input_Filename = GetFileName(); // get input file name
    ReadFile(input_Filename, latitude, longitude, NumCoords, Headers); // read the data from the input file
    SortFile(latitude, longitude, NumCoords); // sort the data
    output_Filename = GetPrintName(); // get the output file name
    PrintFile(output_Filename, latitude, longitude, NumCoords, Headers); // write the output file

    system("PAUSE"); // "Press any key to continue"
    return EXIT_SUCCESS; // tells Windows that your program closed successfully.
}

string GetFileName() // Gets name of the input file.
{
       string coordinates;
       cout << "Enter File Name: ";
       cin  >> coordinates;
       cout << "\n";
       return coordinates;
}

//PARAMETERS//////
// string Filename: the name of the file as input by the user. Should be "coordinates.txt"
// double latitude[]: The array that will contain all of the lattitude values
// double longitude[]: The array that will contain all of the longitude values
// int& NumCoords: The number of coordinates (lattitude + longitude / 2) in the input file. This will help you when you are using loops to sort them.
// string Headers[]: The column headers of the input file ("Indylat_new" and "Indylong_new")
/////////////////
void ReadFile(string Filename, double latitude[], double longitude[], int& NumCoords, string Headers[]) // Reads data into arrays and records how many coordinate pairs there are.
{
     //COMMENTS: The way fstream works is kind of weird. It reads spaces as if it they were new lines. So each iteration of the while loop
     // reads the next coordinate. The first line in the file says "Indylat_new     Indylong_new". When counter = -2, the computer reads
     // "Indylat_new". Then, when the counter = -1, the computer reads "Indylong_new". And so on. So when counter is an even number, it reads
     // a latitude and when counter is an odd number, it reads a longtiude.

     string temp;      // string which will hold each value from the input file
     int counter = -2; // So that the first two strings are skipped (Because they aren't numbers)
     NumCoords = 0;   // so that the number of coordinates begins at 0 before any are read

       RESTART: // goto command label. You'll see what this is for when you read the final "else" clause.

       ifstream InputFileStream; // creates an input filestream called InputFileStream. Don't worry about what a filestream is; just accept that you need it.
       InputFileStream.open(Filename.c_str()); // opens the file called Filename (that we passed in as a parameter). You don't need to know why I had to add ".cstr()" to the end. It gives a compiler error without it.
       if (InputFileStream.is_open()) // if the file successfully opens, the program continues to the while loop. Otherwise, it goes to the final "else" clause.
       {
        while (!InputFileStream.eof()) // reads each string in the input file
        {
           InputFileStream >> temp;    // assigns the current string in the input file to the temp variable.
           if (counter < 0)                 // skips the first two strings ("Indylat_new" and "Indylong_new")
           {
                       Headers[counter + 2] = temp; // adds the current string to the list of column headers
           }
           else if (counter % 2 == 0)       // if counter is an even number, it reads a latitude
           {
                latitude[counter / 2] = atof(temp.c_str());
                // atof will convert the value of temp to float (double). (In case you ever need it, atoi converts string to int).
                // note that again, I had to add ".c_str()". Again, it doesn't matter why for now, other than that the compiler gives an error without it.
                // The position in the array is (counter / 2) because counter goes up when reading latitude AND longitude
                NumCoords++;                
           }
           else if (counter % 2 == 1)      // if counter is an odd number, it reads a longitude
           {
                longitude[counter / 2] = atof(temp.c_str()); // same notes as before
                NumCoords++;
           }
           counter++;
        }

        NumCoords /= 2;
        cout << "File has been read successfully. " << NumCoords << " entries were detected.\n";
       }
       else // lets you know if the file couldn't be opened
       {
           cout << "Sorry, could not find file " << Filename << ". Did you add the file extension?\n";
           Filename = GetFileName(); // gets the corrected file name
           goto RESTART; // goes back to the RESTART label at the beginning of this function to try reading the file again.
       }
       InputFileStream.close(); // closes the input file so that other programs can open it.

       return;
}

//Gets the coordinates of the center of the city
void GetCenter(double center[], double latitude[], double longitude[], int NumCoords)
{        
         center[0] = 0; center[1] = 0;
         for(int i = 0; i < NumCoords; i++)
         {
                 center[0] += latitude[i];
                 center[1] += longitude[i];
         }
         center[0] /= NumCoords;
         center[1] /= NumCoords;

         return;
}

// transforms coordinates to be centered around the new origin
void Coord_Transform(double center[], double latitude[], double longitude[], int NumCoords)
{
     for  (int i = 0; i < NumCoords; i++)
     {
          latitude[i] = latitude[i] - center[0];
          longitude[i] = longitude[i] - center[1];
     }         
}

// untransforms coordinates back to their original values
void Coord_Inverse_Transform(double center[], double latitude[], double longitude[], int NumCoords)
{
     for  (int i = 0; i < NumCoords; i++)
     {
          if (latitude[i] != 0 && longitude[i] != 0) // 0,0 is flag that the coordinate should be deleted - therefore shouldn't be transformed.
          {
             latitude[i] = latitude[i] + center[0];
             longitude[i] = longitude[i] + center[1];
          }
     }         
}

// Produces the array of arctans which is sorted
void GetAtans(double Atan[], double latitude[], double longitude[], int NumCoords)
{
     for (int i = 0; i < NumCoords; i++)
     {
         Atan[i] = atan2(longitude[i], latitude[i]);
     }
     return;    
}

// asks for the angular tolerance within which to delete interior coordinate pairs
double GetSliceSize()
{
       double x;
       cout << "When two points are found to be within x degrees of each other, the point closer to the center will be deleted.\n";
       cout << "Please enter the value of x: ";
       cin >> x;

       x = x *  3.14159265/ 180;

       return x;
}

// returns true if one coordinate pair and its closest angular neighbor are within tolerance of each other
bool Proximity_Alert (double old_atan, double new_atan, double slice_size)
{
       bool alert = false;
       if (new_atan >= old_atan - slice_size || new_atan <= old_atan + slice_size)
       {
          alert = true;
       }
       return alert;
}

//Sorts the input file
//PARAMETERS///
// double[] latitude: array of all latitudes (populated from file).
// double[] longitude: array of all longitudes (populated from file).
// int NumCoords: The number of entries in the input file
///////////////
void SortFile(double latitude[], double longitude[], int NumCoords)
{
     double center[2]; // The latitude,longitude of the city center
     double Atan[COORDINATE_LIMIT]; // the array of arctan values
     double Lowest_Atan;  // the lowest arctan for the current iteration
     int Lowest_Pos;      // the position of the lowest arctan in the array for the current iteration
     double array_temp;   // holding variable to perform the array swap
     double SliceSize;    // angular tolerance for exclusing interior coordinates
     double Z_temp_old;   // distance of the ith coordinate pair from the city center
     double Z_temp_new;   // distance of the (lowest_Pos)th coordinate pair from the city center

     GetCenter(center, latitude, longitude, NumCoords);
     Coord_Transform(center, latitude, longitude, NumCoords);
     GetAtans(Atan, latitude, longitude, NumCoords);     
     SliceSize = GetSliceSize();

     for (int i = 0; i < NumCoords - 1; i++)
     {
         if (Atan[i] == -100) { continue;} // -100 is flag that this value shouldn't be used. Note that the normal range of Atan is -3.14 to 3.14.
         Lowest_Atan = Atan[i];            // Sets the initial comparison value for the selection sort
         Lowest_Pos = i;                   // Sets the initial comparison value position for the selection sort
         for (int j = i + 1; j < NumCoords; j++)
         {
             if (Atan[j] != -100 && Atan[j] <= Lowest_Atan) // -100 is flag that the value shouldn't be used
             {
                Lowest_Atan = Atan[j];      // if the current Atan value is the lowest found yet, set this to the new selection sort comparing var
                Lowest_Pos = j;             // Also, record the position of this value
             }
         }

         //Delete non-perimeter points
         if (SliceSize != 0 && Proximity_Alert(Atan[i], Atan[Lowest_Pos], SliceSize) == true)
         {
            Z_temp_old = sqrt(pow(latitude[i],2) + pow(longitude[i],2));                     //get distance from ith coordinate pair
            Z_temp_new = sqrt(pow(latitude[Lowest_Pos],2) + pow(longitude[Lowest_Pos],2));   // get distance from (Lowest_Pos)th coordinate pair

            if (Z_temp_old < Z_temp_new) { latitude[i] = 0; longitude[i] = 0; Atan[Lowest_Pos] = -100; }  // delete whichever one is closer to the center
            if (Z_temp_new < Z_temp_old) { latitude[Lowest_Pos] = 0; longitude[Lowest_Pos] = 0; Atan[Lowest_Pos] = -100;}
         }

         //Swap latitude and longitude. Put the coordinate with the lowest distance NEXT TO coordiante[i].
          array_temp = Atan[i];
          Atan[i] = Atan[Lowest_Pos];
          Atan[Lowest_Pos] = array_temp;
          array_temp = latitude[i]; 
          latitude[i] = latitude[Lowest_Pos];
          latitude[Lowest_Pos] = array_temp;
          array_temp = longitude[i];
          longitude[i] = longitude[Lowest_Pos];
          longitude[Lowest_Pos] = array_temp;          
     }

     Coord_Inverse_Transform(center, latitude, longitude, NumCoords); // back-transform coordinates

     return;     
}

string GetPrintName() // asks for the name of the output file
{
       string FileName;

       cout << "Please enter the name that you wish to call the output file: ";
       cin >> FileName;

       return FileName;
}

//PARAMETERS///
// string FileName: The name that the output file is to be called.
// double[] latitude: array of all latitudes (sorted)
// double[] longitude: array of all longitudes (sorted)
// int NumCoords: number of entries to be written
// string[] Headers: the column headers from the input file
////////////////
void PrintFile(string FileName, double latitude[], double longitude[], int NumCoords, string Headers[])
{
     ofstream OutputFileStream; // creates an output filestream called "OutputFileSream"
     OutputFileStream.open(FileName.c_str()); // opens the file for writing. Again, .c_str() is just something you have to do.

     // Writes the column headers
     OutputFileStream << Headers[0] << '\t' << Headers[1]; // The '\t' means tab, because that's how the column headers are separated in the input file.

     // Writes the data
     for (int i = 0; i < NumCoords; i++)
     {
         if (latitude[i] != 0 && longitude[i] != 0) // 0 is flag that the coordinate should be deleted
         {
            OutputFileStream << "\n" << latitude[i] << " " << longitude[i];
         }
     }

     cout << "File " << FileName << " has been written.\n";
}

Thanks!

  1. Don't use arrays for a co-ordinate array use a vector of a co-ordinate class

    class coordinate
    {
    public:
        // Constructors etc
        coordinate();
        coordinate(double lat, double lon);
        coordinate(const coordinate& cpy);
        ~coordinate();
    
        // Getters setters
        double getLat() const;
        double getLong() const;
        void setLat(double value);
        void setLong(double value);
        void setLatLong(double lat, double lon);
    
        // Add utilities if you wish 
        // e.g. a method to get baring of one coord from another
    
    private:
         double latitude;
         double longitude;
    };
    
    std::vector<coordinate> coordList;
    

Then pass the vector by reference to functions.

It looks like you are combining trying to sort and trying to remove interior points, you need to sort first. Also you have a lot of code for the sort but it you just added the atan2 value to your coordinate class and create a comparitor based on the atan2 value you can just call std::sort

// Assuming atan2 added into class

bool compCoord(const coordinate&op1, const coordinate&op2)
{
    return op1.getAtan2() < op2.getAtan2();
}

std::sort(coordList.begin(), coordList.end(), compCoord);

Since you are sorting if there are a lot of coordinates it may be better to use a list rather than a vector to hold you coordinate list

std::list<coordinate> coordList;

Once you are using a STL container (vector or list) you can use std::transform and or std::for_each with simple operators to perform you transformations and do atan2 calculations.

I think the logic at line 174 is wrong, assuming all values are positive then it will always return true, take old_atan as 2 and slice_size and 0.1 and it equates to (new_atan > 1.9 or new_atan < 2.1) this is true for any value of new_atan. I think you meant and not or (&& not ||)

(posting this while trying to repress the urge to criticize the style and design)

Line 174 is wrong. The condition on the if-statement should be with && not with or.

The whole logic of the sorting is wrong. Lines 206-207 assume that the value at index i is not yet sorted (so, it is a candidate for being the lowest-angle point). But then, in the block between 218 and 225, that assumption seems to be flipped, because now, you assume the element at index i to be the last sorted value (i.e., the value placed there by the previous iteration). The loop is schitzopheric. I think that this will work better:

 for (int i = 0; i < NumCoords - 1; i++)
 {
     if (Atan[i] == -100) { continue;} 
     Lowest_Atan = Atan[i]; 
     Lowest_Pos = i; 
     for (int j = i + 1; j < NumCoords; j++)
     {
         if (Atan[j] != -100 && Atan[j] <= Lowest_Atan) 
         {
            Lowest_Atan = Atan[j]; 
            Lowest_Pos = j; 
         }
     }

     if (i > 0 &&              // is there a sorted element preceding i?
         SliceSize != 0 && 
         Proximity_Alert(Atan[i-1], Atan[Lowest_Pos], SliceSize) == true)
     {
         Z_temp_old = sqrt(pow(latitude[i-1],2) + pow(longitude[i-1],2));
         Z_temp_new = sqrt(pow(latitude[Lowest_Pos],2) 
                         + pow(longitude[Lowest_Pos],2));
         if (Z_temp_old < Z_temp_new) {
             // Invalidate the i-1 coordinate: 
             latitude[i-1] = 0; longitude[i-1] = 0; 
             Atan[i-1] = -100; 
         } 
         if (Z_temp_new < Z_temp_old) { 
             // Invalidate the Lowest_Pos coordinate:
             latitude[Lowest_Pos] = 0; longitude[Lowest_Pos] = 0; 
             Atan[Lowest_Pos] = -100;
         }
         // NOTICE HERE!!!
         // AND, bring i back by 1 (because we either invalidated i-1 or 
         //  invalidated the Lowest_Pos, either way, we must go back one step).
         --i
     }

     //If we had to invalidate the Lowest_Pos, we should not swap it:
     if(Atan[Lowest_Pos] == -100)
       continue;

     //Swap latitude and longitude. NOTE: Use standard functions: (in <algorithm>)
     std::swap(Atan[i], Atan[Lowest_Pos]);
     std::swap(latitude[i], latitude[Lowest_Pos]);
     std::swap(longitude[i], longitude[Lowest_Pos]);        
 }

(end of "just fix your problem" part of my post)

I will abstain from telling you how C++ is not C and that your code is very horrible in that sense. But since you made it clear that your problem pertains to an actual implementation (not a toy exercise in a programming course), I must say something about the approach that you are using:

1. Selection-sort is shit!

You should use an off-the-shelf sorting algorithm if you are going to do this approach that you are doing. I suggest creating a struct (or class) with your coordinate information (relative to center) and the angle (from the atan). Then, define the comparison operator < with respect to the angle. Finally, have an array of these structs (or better, using a std::vector) that you generate the same way you generate it now (e.g. your Coord_Transform() and GetAtans() functions can be merged into one function that generates the array of structs (with relative coordinates and angles)). You can then use the standard std::sort algorithm (in <algorithm>) to sort the array by angle. After the sorting, you can do the pruning (eliminating the "sliced" points). This is going to be much faster and neater then using a selection-sort.

2. Not everything is a circle!

By that, I mean that the general outline of points that you are going to get is probably some kind of polygon or blobby shape (maybe not even convex), and so, using this kind of sorting by radial angle from the center is really not a good idea (even if you assume that taking the average of all points is a meaningful value for the "center"). You should try a completely different approach if you want it to work reasonably well.

For example, I would suggest that you do a kind of walking from point-to-point approach. If you pick a point in the input array, find its nearest neighbor(s), and then order the point and its (best) neighbor in the output array, and then repeat that with the neighbor point (incrementally building a sequence of points that follow each other). When you reach a point that doesn't have a close-enough neighbor or that its neighbor is already in the output array (you came back around the perimeter), then you stop that chain of points and start a new one with the first remaining (unsorted) point from the input array. At the end of this, you should have one (or a few) longer sequences of points which represent the outline of something, in addition to a bunch of small sequences (one or a few points only) representing the glitches that you are talking about (invalid interior points or whatever else).

Furthermore, using this technique, you can also do more "filtering" as you go along (while building a valid sequence of points). First of all, you should probably keep track of the expected direction of the next neighboring points, such that you can eliminate neighbors that don't follow naturally from the previous points in the sequence. I have also used this technique in the past to detect line-segments, sharp corners and blobby shapes (like the outline of a tree or a person). I did this just by maintaining an approximating polynomial (e.g., a least-square spline segment) for the past few points in the current sequence. If the polynomial seemed to have more or less zero curvature for a while, it meant that the sequence was a straight-line. A sharp increase in curvature between the looking-ahead spline-segment and the past polynomial indicated a sharp corner (and I could mark that point as the end on one line segment and the start of another). And, if it seemed that the curvature was more or less constant (and non-zero), then I would identify it as a possible circle-looking shape, which I would later fit a circle around those points.

Of course, this method will be much more efficient if you have a good method for looking up nearest neighbors in an array of points (doing look-ups in O(logN) instead of O(N)). For a 2D problem, a basic Kd-tree implementation would do the trick. Personally, I use a home-brewed dynamic vantage-point tree for this kind of work, but it is a bit of an overkill for a simple problem like this.

Could you please provide the problem statement? The reason I ask is that, as stated, it is ambiguous. Picture the points occurring within a 5-point star pattern. You probably want the perimeter to be a pentagon - convex at every external point. Otherwise you still won't get a smooth star shape, but a very jagged one, the size of the jags depending on how big your "x degrees" limit is.

If you can assume a convex shape, you need to determine whether or not each point is inside (toward the center) of a line between its two neighbors (a concave point), and if so eliminate it. You need to do that until the graph is convex at all points.

Good luck.

This article has been dead for over six months. Start a new discussion instead.