You have used more loops than is necassary. The thing is once you have chosen a value of a and b you can calculate the value of c that solves the equation 2a + b - c = y. You do not need a loop that checks every value of c.
In fact since you just want a count of solutions you don't even need to calculate the value of c all you need to check is that the value would be in the right range 0 <= c <= y.
Written in c++ it looks something like
for(a=0; a<=y; a++)
{
for(b=0; b<=y; b++)
{
if ((((2 * a) + b - y) <= y) && (((2 * a) + b) >= y))
{
++count;
}
}
}
This takes about 0.019s to run or if you switch on the optomiser 0.006s for an input of 2000. (NOTE with 3 loops the C++ solution was taking around 33s to complete or 20s under optomisation).
Interestingly the calculate c and then check it's in the right range solution
for(a=0; a<=y; a++)
{
for(b=0; b<=y; b++)
{
c = (2 * a) + b - y;
if (0 <=c && c <=y)
{
++count;
}
}
}
takes 0.039s with no optomisation (slower) about 0.004s with optomisation (faster) which kind of suggests the optomiser finds it easier to optomise simpler code (which makes sense).
Banfa
Practically a Master Poster
695 posts since Mar 2010
Reputation Points: 508
Solved Threads: 109
Skill Endorsements: 5
The algorithm won't change - only the language. Consider Banfa's post as pseudo-code and convert that to assembler... :-)
rubberman
Posting Maven
2,571 posts since Mar 2010
Reputation Points: 365
Solved Threads: 305
Skill Endorsements: 52
Alternatively consider that your assembly routine takes 33s, my original C++ port of your assembly routine without optomisation also takes 33s but with optomisation takes 20s and accept that the C++ compiler and optomiser does a better job than you of producing assembly/machine code and just use the C++ implementation with optomisation switched on.
Banfa
Practically a Master Poster
695 posts since Mar 2010
Reputation Points: 508
Solved Threads: 109
Skill Endorsements: 5
Firstly I still recon you have 1 loop too many in that code. Again after the second loop you can calulate c and then check it is in the correct range. A bit of maths and an if statement is always generally going to be quicker than a loop.
To answer your question many C++ compilers can actually output the assembler for the code the are compiling, for example by using the -S switch with gcc the compiler outputs assembler to <filename>.s
Banfa
Practically a Master Poster
695 posts since Mar 2010
Reputation Points: 508
Solved Threads: 109
Skill Endorsements: 5
That is because you still have 3 loops, get rid of the third loop as I showed you. The problem with your implementation is that because of the nested loops it has a complexity of N^3. By removing a loop you reduce the complexity to N^2 which makes a huge difference to execution time.
It is slow because of the number of nested loops you use.
If I just compile the C++ code I get the following runtimes
Compiler Execution
Code Optomised Time
Your Original Code No 7.237s
Your Original Code Yes 1.144s
Rewritten With 2 Loops No 0.018s
Rewritten With 2 Loops Yes 0.002s
You ought to be able to get at least as good as the unoptomised compilation of the C++ code.
Also the code in assembly at lines 9-10 does not implement either the equation of you C++ code or the equation in the comments.
Banfa
Practically a Master Poster
695 posts since Mar 2010
Reputation Points: 508
Solved Threads: 109
Skill Endorsements: 5