Guys..My task is to convert for loop c++ to asm....and the speed shoud be fast....i managed to convert to asm the loop part but when i run and enter 2000 it shows 33 second is it possible to be lower the 5 second....

```
#include <iostream>
#include <ctime>
using namespace std;
void main ()
{
time_t start, end;
unsigned short a, b, c, y, count=0;
float diff;
cout << "Enter y : ";
cin >> y;
cout << "Calculation start..." << endl;
start = clock();
//this is loop part which need to change to increase the speed
_asm{
mov dx,y // Set value of y to dx
mov ax,0 // AX as a, start from 0
set1: mov bx,0 // BX as b, start from 0 / reset to 0
set2: mov cx,0 // CX as c, start from 0 / reset to 0
again: push ax // Store AX value to Stack
add ax,ax // This 3 line to calculate the
add ax,bx // value of 2a + b - c
sub ax,cx //
again2: cmp ax,dx // Compare the value (2a + b - c) with y (DX)
jne skip // IF not equal then jump to skip
inc count // Else IF equal then count++;
skip: cmp cx,dx // Comparing c (CX) with y (DX)
jge up // IF c >= y, jump to up
inc cx // Else c++;
pop ax // Restore the 'a' (AX) value form stack
// (because AX value changed when calculate the value 2a+b-c)
jmp again // Jump to again to recount and recheck
up: cmp bx,dx // Comparing b (BX) with y (DX)
jge up2 // IF b >= y, jump to up2
inc bx // Else b++;
pop ax // Restore 'a' value.
jmp set2 // Jump to set2 which will reset c value to 0
up2: pop ax // Restore 'a' value
cmp ax,dx // Compare a (AX) with y (DX)
jge stop // IF a >= y, jump to stop
inc ax // Else a++;
jmp set1 // Jump to set1 which will reset b value to 0
stop:
}
//Do not change the code below
end = clock();
diff = (float)(end - start)/CLOCKS_PER_SEC;
cout << "Calculation complete." << endl;
cout << "Time used is " << diff << " second" << endl;
cout << "There are " << count << " combination to produce " << y << endl;
system ("pause");
}
```

You have used more loops than is necassary. The thing is once you have chosen a value of a and b you can calculate the value of c that solves the equation 2a + b - c = y. You do not need a loop that checks every value of c.

In fact since you just want a count of solutions you don't even need to calculate the value of c all you need to check is that the value would be in the right range 0 <= c <= y.

Written in c++ it looks something like

```
for(a=0; a<=y; a++)
{
for(b=0; b<=y; b++)
{
if ((((2 * a) + b - y) <= y) && (((2 * a) + b) >= y))
{
++count;
}
}
}
```

This takes about 0.019s to run or if you switch on the optomiser 0.006s for an input of 2000. (NOTE with 3 loops the C++ solution was taking around 33s to complete or 20s under optomisation).

Interestingly the calculate c and then check it's in the right range solution

```
for(a=0; a<=y; a++)
{
for(b=0; b<=y; b++)
{
c = (2 * a) + b - y;
if (0 <=c && c <=y)
{
++count;
}
}
}
```

takes 0.039s with no optomisation (slower) about 0.004s with optomisation (faster) which kind of suggests the optomiser finds it easier to optomise simpler code (which makes sense).

The algorithm won't change - only the language. Consider Banfa's post as pseudo-code and convert that to assembler... :-)

Alternatively consider that your assembly routine takes 33s, my original C++ port of your assembly routine without optomisation also takes 33s but with optomisation takes 20s and accept that the C++ compiler and optomiser does a better job than you of producing assembly/machine code and just use the C++ implementation with optomisation switched on.

owh ok thanks for comments....is there any online converter available C++ to assembly...i need to convert this to assembly now...

```
for (a=0; a<=y; a++) {
for (b=0; b<=y/2; b++) {
for (c=0; c<=y/3; c++) {
if ((a+2*b+3*c) == y)
count++;
}
}
}
```

Firstly I still recon you have 1 loop too many in that code. Again after the second loop you can calulate c and then check it is in the correct range. A bit of maths and an if statement is always generally going to be quicker than a loop.

To answer your question many C++ compilers can actually output the assembler for the code the are compiling, for example by using the -S switch with gcc the compiler outputs assembler to <filename>.s

owh ok thanks anyway is this correct according to c++ code...and when i input 2000 i still got 30sec....i want to get atleast 5 or lower is it possible?

```
_asm{
mov dx,y // Set value of y to dx
mov ax,0 // AX as a, start from 0
set1: mov bx,0 // BX as b, start from 0 / reset to 0
set2: mov cx,0 // CX as c, start from 0 / reset to 0
again: push ax // Store AX value to Stack
add ax,ax // This 3 line to calculate the
add ax,bx // value of 2a + b - c
add ax,cx //
again2: cmp ax,dx // Compare the value (2a + b - c) with y (DX)
jne skip // IF not equal then jump to skip
inc count // Else IF equal then count++;
skip: cmp cx,dx // Comparing c (CX) with y (DX)
jge up // IF c >= y, jump to up
inc cx // Else c++;
pop ax // Restore the 'a' (AX) value form stack
// (because AX value changed when calculate the value 2a+b-c)
jmp again // Jump to again to recount and recheck
up: cmp bx,dx // Comparing b (BX) with y (DX)
jge up2 // IF b >= y, jump to up2
inc bx // Else b++;
pop ax // Restore 'a' value.
jmp set2 // Jump to set2 which will reset c value to 0
up2: pop ax // Restore 'a' value
cmp ax,dx // Compare a (AX) with y (DX)
jge stop // IF a >= y, jump to stop
inc ax // Else a++;
jmp set1 // Jump to set1 which will reset b value to 0
stop:
stop:
}
```

That is because you still have 3 loops, get rid of the third loop as I showed you. The problem with your implementation is that because of the nested loops it has a complexity of N^3. By removing a loop you reduce the complexity to N^2 which makes a huge difference to execution time.

It is slow because of the number of nested loops you use.

If I just compile the C++ code I get the following runtimes

```
Compiler Execution
Code Optomised Time
Your Original Code No 7.237s
Your Original Code Yes 1.144s
Rewritten With 2 Loops No 0.018s
Rewritten With 2 Loops Yes 0.002s
```

You ought to be able to get at least as good as the unoptomised compilation of the C++ code.

Also the code in assembly at lines 9-10 does not implement either the equation of you C++ code or the equation in the comments.