Guys..My task is to convert for loop c++ to asm....and the speed shoud be fast....i managed to convert to asm the loop part but when i run and enter 2000 it shows 33 second is it possible to be lower the 5 second....

``````#include <iostream>
#include <ctime>

using namespace std;

void main ()
{
time_t start, end;
unsigned short a, b, c, y, count=0;
float diff;

cout << "Enter y : ";
cin >> y;
cout << "Calculation start..." << endl;
start = clock();

//this is loop part which need to change to increase the speed

_asm{

mov dx,y    // Set value of y to dx
mov ax,0    // AX as a, start from 0
set1:   mov bx,0    // BX as b, start from 0 / reset to 0
set2:   mov cx,0    // CX as c, start from 0 / reset to 0

again:  push ax     // Store AX value to Stack
add ax,ax   // This 3 line to calculate the
add ax,bx   // value of 2a + b - c
sub ax,cx   //

again2: cmp ax,dx   // Compare the value (2a + b - c) with y (DX)
inc count   // Else IF equal then count++;

skip:   cmp cx,dx   // Comparing c (CX) with y (DX)
inc cx      // Else c++;
pop ax      // Restore the 'a' (AX) value form stack
// (because AX value changed when calculate the value 2a+b-c)

up:     cmp bx,dx   // Comparing b (BX) with y (DX)
inc bx      // Else b++;
pop ax      // Restore 'a' value.
jmp set2    // Jump to set2 which will reset c value to 0

up2:    pop ax      // Restore 'a' value
cmp ax,dx   // Compare a (AX) with y (DX)
inc ax      // Else a++;
jmp set1    // Jump to set1 which will reset b value to 0

stop:
}

//Do not change the code below
end = clock();
diff = (float)(end - start)/CLOCKS_PER_SEC;
cout << "Calculation complete." << endl;
cout << "Time used is " << diff << " second" << endl;
cout << "There are " << count << " combination to produce " << y << endl;

system ("pause");

}
``````

You have used more loops than is necassary. The thing is once you have chosen a value of a and b you can calculate the value of c that solves the equation 2a + b - c = y. You do not need a loop that checks every value of c.

In fact since you just want a count of solutions you don't even need to calculate the value of c all you need to check is that the value would be in the right range 0 <= c <= y.

Written in c++ it looks something like

``````for(a=0; a<=y; a++)
{
for(b=0; b<=y; b++)
{
if ((((2 * a) + b - y) <= y) && (((2 * a) + b) >= y))
{
++count;
}
}
}
``````

This takes about 0.019s to run or if you switch on the optomiser 0.006s for an input of 2000. (NOTE with 3 loops the C++ solution was taking around 33s to complete or 20s under optomisation).

Interestingly the calculate c and then check it's in the right range solution

``````for(a=0; a<=y; a++)
{
for(b=0; b<=y; b++)
{
c = (2 * a) + b - y;

if (0 <=c && c <=y)
{
++count;
}
}
}
``````

takes 0.039s with no optomisation (slower) about 0.004s with optomisation (faster) which kind of suggests the optomiser finds it easier to optomise simpler code (which makes sense).

owh ok thanks alot but i want it in assembly....

The algorithm won't change - only the language. Consider Banfa's post as pseudo-code and convert that to assembler... :-)

Alternatively consider that your assembly routine takes 33s, my original C++ port of your assembly routine without optomisation also takes 33s but with optomisation takes 20s and accept that the C++ compiler and optomiser does a better job than you of producing assembly/machine code and just use the C++ implementation with optomisation switched on.

owh ok thanks for comments....is there any online converter available C++ to assembly...i need to convert this to assembly now...

``````for (a=0; a<=y; a++) {
for (b=0; b<=y/2; b++) {
for (c=0; c<=y/3; c++) {
if ((a+2*b+3*c) == y)
count++;
}
}
}
``````

Firstly I still recon you have 1 loop too many in that code. Again after the second loop you can calulate c and then check it is in the correct range. A bit of maths and an if statement is always generally going to be quicker than a loop.

To answer your question many C++ compilers can actually output the assembler for the code the are compiling, for example by using the -S switch with gcc the compiler outputs assembler to <filename>.s

owh ok thanks anyway is this correct according to c++ code...and when i input 2000 i still got 30sec....i want to get atleast 5 or lower is it possible?

`````` _asm{

mov dx,y    // Set value of y to dx
mov ax,0    // AX as a, start from 0
set1:   mov bx,0    // BX as b, start from 0 / reset to 0
set2:   mov cx,0    // CX as c, start from 0 / reset to 0

again:  push ax     // Store AX value to Stack
add ax,ax   // This 3 line to calculate the
add ax,bx   // value of 2a + b - c

again2: cmp ax,dx   // Compare the value (2a + b - c) with y (DX)
inc count   // Else IF equal then count++;

skip:   cmp cx,dx   // Comparing c (CX) with y (DX)
inc cx      // Else c++;
pop ax      // Restore the 'a' (AX) value form stack
// (because AX value changed when calculate the value 2a+b-c)

up:     cmp bx,dx   // Comparing b (BX) with y (DX)
inc bx      // Else b++;
pop ax      // Restore 'a' value.
jmp set2    // Jump to set2 which will reset c value to 0

up2:    pop ax      // Restore 'a' value
cmp ax,dx   // Compare a (AX) with y (DX)
inc ax      // Else a++;
jmp set1    // Jump to set1 which will reset b value to 0

stop:

stop:
}
``````

That is because you still have 3 loops, get rid of the third loop as I showed you. The problem with your implementation is that because of the nested loops it has a complexity of N^3. By removing a loop you reduce the complexity to N^2 which makes a huge difference to execution time.

It is slow because of the number of nested loops you use.

If I just compile the C++ code I get the following runtimes

``````                      Compiler      Execution
Code                  Optomised       Time