Hello,

I am trying to do matrix multiplication on a 2D array stored in pitched memory.

I am able to load the 2D array onto the GPU using cudaMallocPitch() and cudaMemcpy2D(), but I am not able to write the multiplication code correctly.

The output I am getting is wrong.

Can anyone help me with the code?

Here is the code I have written:

//---code for matrix multiplication using pitch---

```
float Pvalue = 0;
xid = blockIdx.x * blockDim.x + threadIdx.x;
yid = blockIdx.y * blockDim.y + threadIdx.y;
for (int k = 0; k < N; ++k) {  // D = T * M
    float Melement = T[yid*pitch+k];
    float Nelement = M[k+xid*pitch];
    Pvalue += Melement * Nelement;
}
D[yid*pitch+xid] = Pvalue;
__syncthreads();
```

//---------
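
Two things in the snippet above look like probable causes of the wrong output. First, the pitch returned by cudaMallocPitch() is a row stride in bytes, but the code adds it directly to float indices, so each row offset is too large by a factor of sizeof(float). Second, for D = T*M the element of M needed is row k, column xid, whereas `M[k+xid*pitch]` reads row xid, column k. The following is only a minimal sketch of how the indexing could look, under some assumptions: the matrices are square N x N and row-major, each allocation has its own pitch, and the names `matMulPitched`, `rowPtr`, `pitchT`, `pitchM` and `pitchD` are made up for illustration and are not from the original code.

```cuda
#include <cmath>
#include <cstdio>
#include <vector>
#include <cuda_runtime.h>

// Hypothetical helper: start of row `row` in a pitched allocation.
// `pitch` is in BYTES, so the offset is applied to a char pointer.
__device__ inline float* rowPtr(const float* base, size_t pitch, int row) {
    return (float*)((char*)base + (size_t)row * pitch);
}

// D = T * M for square N x N row-major matrices (assumed layout).
__global__ void matMulPitched(const float* T, size_t pitchT,
                              const float* M, size_t pitchM,
                              float* D, size_t pitchD, int N) {
    int xid = blockIdx.x * blockDim.x + threadIdx.x;  // column of D
    int yid = blockIdx.y * blockDim.y + threadIdx.y;  // row of D
    if (xid >= N || yid >= N) return;                 // guard partial blocks

    const float* tRow = rowPtr(T, pitchT, yid);
    float Pvalue = 0.0f;
    for (int k = 0; k < N; ++k) {
        // T[yid][k] * M[k][xid]
        Pvalue += tRow[k] * rowPtr(M, pitchM, k)[xid];
    }
    rowPtr(D, pitchD, yid)[xid] = Pvalue;
    // No __syncthreads() needed: threads do not share data here.
}

int main() {
    const int N = 5;  // deliberately not a multiple of the block size
    const size_t rowBytes = N * sizeof(float);
    std::vector<float> hT(N * N), hM(N * N), hD(N * N);
    for (int i = 0; i < N * N; ++i) {
        hT[i] = (float)(i % 7);
        hM[i] = (float)(i % 3) - 1.0f;
    }

    float *dT = nullptr, *dM = nullptr, *dD = nullptr;
    size_t pitchT = 0, pitchM = 0, pitchD = 0;
    cudaMallocPitch((void**)&dT, &pitchT, rowBytes, N);
    cudaMallocPitch((void**)&dM, &pitchM, rowBytes, N);
    cudaMallocPitch((void**)&dD, &pitchD, rowBytes, N);

    // Host rows are tightly packed, so the host pitch is rowBytes.
    cudaMemcpy2D(dT, pitchT, hT.data(), rowBytes, rowBytes, N, cudaMemcpyHostToDevice);
    cudaMemcpy2D(dM, pitchM, hM.data(), rowBytes, rowBytes, N, cudaMemcpyHostToDevice);

    dim3 block(16, 16);
    dim3 grid((N + block.x - 1) / block.x, (N + block.y - 1) / block.y);
    matMulPitched<<<grid, block>>>(dT, pitchT, dM, pitchM, dD, pitchD, N);

    cudaError_t err = cudaMemcpy2D(hD.data(), rowBytes, dD, pitchD, rowBytes, N,
                                   cudaMemcpyDeviceToHost);
    if (err != cudaSuccess) {
        printf("CUDA error: %s\n", cudaGetErrorString(err));
        return 1;
    }

    // Compare against a plain CPU reference.
    float maxErr = 0.0f;
    for (int i = 0; i < N; ++i) {
        for (int j = 0; j < N; ++j) {
            float ref = 0.0f;
            for (int k = 0; k < N; ++k) ref += hT[i * N + k] * hM[k * N + j];
            maxErr = fmaxf(maxErr, fabsf(ref - hD[i * N + j]));
        }
    }
    printf("max abs error vs CPU reference = %g\n", maxErr);

    cudaFree(dT);
    cudaFree(dM);
    cudaFree(dD);
    return 0;
}
```

An equivalent option, if float indexing is preferred, would be to pass `pitch / sizeof(float)` to the kernel as the row stride in elements. The separate pitch per array is a cautious choice, since nothing in the post shows that all three allocations returned the same pitch.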

I am waiting for your help.

Thanks in advance.