Hey guys! Much help needed here! I am a student doing a protein structure alignment algorithm which is my final year project, using various forms of translation, rotation & computing the RMSD.

From line 164:

This is my loop calculation, basically it computes the Root Mean Square Deviation (RMSD) of my X, Y & Z coordinates.

An example of how it looks in a text file that I will be reading from & storing in my 2D arrays. (NOTE: I'll be playing round with 2 diff PROTEIN files: File 1- coordinates1[][], File 2- coordinates2 [][]. 1 file has approx. 1000+ - 3000+ LINES, depending on the size of the protein)

ATOM 1 N VAL A 1 -4.040 15.048 13.602 1.00 37.04 N
ATOM 2 CA VAL A 1 -3.621 15.574 14.908 1.00 25.44 C
ATOM 3 C VAL A 1 -2.766 14.564 15.637 1.00 24.20 C

I'll be extracting these values(X,Y & Z respectively) from the files and will be storing in my 2D array.


Extracted coordinates:
X Y Z
-4.040 15.048 13.602
-3.621 15.574 14.908
-2.766 14.564 15.637

I have a couple of problems here:

When I do: RMSD_Final = R2 / coordinatesRotZ.length;
It literally divides by the size of the coordinates, which is 3000.

Here's an example of my declaration of my 2D array:
public static double [][] coordinatesRotZ = new double [3000][1000];

so I guess for my loops and all, it goes on and on for 3000 times, which i don't want that. I want it to go according to the number of rows that has numbers in it. Please note that I do not know the amount of rows which I will be using for the file reading part, that's why I put 3000. because I might be using diff files each time with diff number of rows each time.

please do advise:

1. Is there any possibility of getting back the number of rows that i've used so that I can use that as my .length & my denominator during the division?

2. The logic of the computation part from line 164 : I want to 'accumulate' all my R1s so that I can divide it by the total no. of coordinates of that particular file. Is it correct? Or perhaps is there any other better way of doing it? I wish to keep it simple as my java knowledge is limited.

Thank you all in advance! Please do help me as I am stuck on this for days and weeks.

public class testest{

   private static StringBuffer buffer;
   private static BufferedReader input1=null;
    private static String st[]= new String[20000]; //keep at 20000, initially 1000


   public static String [] arrayM1 = new String [3000];
   public static String [] arrayM2 = new String [3000];
   public static double [][] coordinates1 = new double [3000][1000];
   public static double [][] coordinates2 = new double [3000][1000];
   public static double [][] coordinates2T = new double [3000][1000];
   public static double [] Translate = new double [10];
   //public static double [] R1 = new double [3];

   public static double [][] coordinatesDiff = new double [3000][1000];
   public static double [][] coordinatesRotz = new double [3000][1000];
   public static double [][] coordinatesRotX = new double [3000][1000];
   public static double [][] coordinatesRotZ = new double [3000][1000];

   static DecimalFormat fmt= new DecimalFormat("0.000"); //3 D.P

    public static void main(String args[]) {


   int i=0,j=0;
   String text1;
   String text2;
   double X=0.0;
   double Y=0.0;
   double Z=0.0;
   double alpha=0.0;
   double beta =0.0;
   double theta=0.0;
   double R1 = 0.0;
   double R2 = 0.0;
   double R3 = 0.0;
   double RMSD_Final = 0.0;
   GetFilenameWithoutExtension example=new GetFilenameWithoutExtension(); //new instance of class
   int counter2=0;
   String line2="";
   int counter1=0;
   String line1="";
   

   try{

        FileInputStream fstream1 = new FileInputStream("C:\\Users\\Hazel\\Documents\\Desktop\\MPSIP\\proteintest1.txt");
        FileInputStream fstream2 = new FileInputStream("C:\\Users\\Hazel\\Documents\\Desktop\\MPSIP\\proteintest2.txt");
        DataInputStream in1 = new DataInputStream(fstream1); // Get the object of DataInputStream
        DataInputStream in2 = new DataInputStream(fstream2); // Get the object of DataInputStream
        BufferedReader br1 = new BufferedReader(new InputStreamReader(in1));
        BufferedReader br2 = new BufferedReader(new InputStreamReader(in2));
// to get protein ID

        while ((text1 = br1.readLine()) != null)   {  //Read File Line By Line

            if(text1.trim().startsWith("ATOM")) {
        	StringTokenizer s1 = new StringTokenizer(text1," ");

          	while(s1.hasMoreTokens()) {

			String ss1 = s1.nextToken();
		    	counter1++;

		    	if(counter1 == 3){
		    		line1 += ss1;
		    		arrayM1[i] = ss1;
		    	}
		}

                coordinates1[i][0]= Double.parseDouble(text1.substring(30,38));//X
                coordinates1[i][1]= Double.parseDouble(text1.substring(38,46));//Y
                coordinates1[i][2]= Double.parseDouble(text1.substring(46,56));//Z

		i+=1;
            }//end of if
	} //end of while.

	i=0;

	while ((text2 = br2.readLine()) != null)   {  //Read File Line By Line

            if(text2.trim().startsWith("ATOM")) {
        	StringTokenizer s2 = new StringTokenizer(text2," ");
        	while(s2.hasMoreTokens()) {

			String ss2 = s2.nextToken();
		    	counter2++;

		    	if(counter2 == 3){
		    		line2 += ss2;
		    		arrayM2[i] = ss2;
		    	}
		}

                coordinates2[i][0]= Double.parseDouble(text2.substring(30,38));//X
                coordinates2[i][1]= Double.parseDouble(text2.substring(38,46));//Y
                coordinates2[i][2]= Double.parseDouble(text2.substring(46,56));//Z
                
                
                
                i+=1;
                
                
            }//end of if

	} //end of while.

    }//end of try

    catch( IOException e) {
	e.printStackTrace();
    }//end of catch


//------------------------------Translation-------------------------------------



    for(int a=0; a<3; a++) {
    	
        for(int b=0; b<3; b++){
            for(int c=0; c<3; c++){

                X = coordinates1[1][0]/coordinates2[1][0]; //alpha c coordinate
                Y = coordinates1[1][1]/coordinates2[1][1];
                Z = coordinates1[1][2]/coordinates2[1][2];

                double [][] Translation1={ {X,0.0,0.0}, {0.0,Y,0.0}, {0.0,0.0,Z} };

                coordinates2T[a][b] += coordinates2[a][c]*Translation1[c][b];
        
        
       

//-------------------------Rotation & RMSD Cal----------------------------------

//Uses Leohard Euler's angle of rotation theorem
//5 degrees in rad.

                double [][] Rotz={ {Math.cos(0.087266463), Math.sin(-0.087266463), 0.0},
                     {Math.sin(0.087266463), Math.cos(0.087266463), 0.0},
                    {0.0, 0.0 , 1.0} };

                double [][] RotX={ {1.0,0.0,0.0},
                    {0.0,Math.cos(0.087266463), Math.sin(-0.087266463)},
                    {0.0,Math.sin(0.087266463), Math.cos(0.087266463)} };

                double [][] RotZ={ {Math.cos(0.087266463), Math.sin(0.087266463), 0.0},
                    {Math.sin(-0.087266463), Math.cos(0.087266463), 0.0},
                    {0.0, 0.0 , 1.0} };

                coordinatesRotz[a][b]+= coordinates2T[a][c]*Rotz[c][b];
                coordinatesRotX[a][b]+= coordinatesRotz[a][c]*RotX[c][b];
                coordinatesRotZ[a][b]+= coordinatesRotX[a][c]*RotZ[c][b];
              
             

            }
        }
    }

  for(int a=0; a<coordinates1.length; a++){
  
        for(int b=0; b<coordinatesRotZ; b++){
     R1 =Math.sqrt((Math.pow((coordinates1[a][0]-coordinatesRotZ[b][0]),2.0) //computes X
                  +Math.pow((coordinates1[a][1]-coordinatesRotZ[b][1]),2.0)// computes Y
                +Math.pow((coordinates1[a][2]-coordinatesRotZ[b][2]),2.0)));//computes Z
                
               R2 += R1;  //Store R1 for every loop into R2

       }
    }
    
    RMSD_Final = R2 / coordinatesRotZ.length;



      
    if( RMSD_Final!= 0.0)    
    System.out.println("RMSD="+fmt.format(RMSD_Final));
}
}
  1. When you insert the values to the arrays, keep a counter - this way you will know what was the last cell that you have written to - at the end instead of RMSD_Final = R2 / coordinatesRotZ.length; just do RMSD_Final = R2 / counter;
  2. The loop in line 164 accumulates all the R1 into R2, just be sure that R2 is 0 when you start.
  1. When you insert the values to the arrays, keep a counter - this way you will know what was the last cell that you have written to - at the end instead of RMSD_Final = R2 / coordinatesRotZ.length; just do RMSD_Final = R2 / counter;
  2. The loop in line 164 accumulates all the R1 into R2, just be sure that R2 is 0 when you start.

Thank you! but how do I know how many rows my array have used up. I tried printing count after the for loop and it gives me a shocking 9000000. But i don't need that much, i only need to limit it to the number of rows my array have used. can i get the value of it?

When do you increment? if you need only the number of rows of the matrix, you need to count only when you finish with a row, and not every cell (which will increment every column and every row, more than you need).

When do you increment? if you need only the number of rows of the matrix, you need to count only when you finish with a row, and not every cell (which will increment every column and every row, more than you need).

OK, I got it! Thanks :D

This question has already been answered. Start a new discussion instead.