I literally picked up asm today. I have been trying to create a 3D software rendering engine in C++, and figured that doing things directly in asm would be faster... but I tried the following code:

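#include <iostream>
#include <ctime>
#include <conio.h>
#include <tchar.h>
using namespace std;

int _tmain(int argc, _TCHAR* argv[])
{
	int num, timestart, dur;
	timestart = clock();
	for(int i = 0 ; i < 10000000; i++)
	{
		__asm
		{
			MOV EAX, 11;
			MOV ECX, 5;
			XOR EDX, EDX;	// DIV divides EDX:EAX, so clear the high half first
			DIV ECX;	// quotient -> EAX, remainder -> EDX
			MOV num, EAX;	// store the quotient (EAX) to match num = 11 / 5
		}
		//num = 11 / 5;
	}
	dur = clock() - timestart;
	cout << dur << endl;
	getch();
	return 0;
}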

First I ran it the way it is now, then I uncommented "num = 11 / 5;" inside the for loop and commented out the asm block. For the above test, the variable dur read 156 milliseconds for the asm and only 78 milliseconds for the C++...
Can anyone explain?
Thanks =)

All 5 Replies

That compilers are smarter than you perhaps?

> I literally picked up asm today.
Whereas the combined asm experience of all the people who wrote the compiler is probably in the 1000+ YEAR category. My bet is that they've figured out some stuff that you haven't.

In particular
1. 11/5 is a constant. Seeing that, the compiler will do the work at compile time rather than literally emitting the code (like you did) to calculate the result (num = 2 in other words).
2. You don't use the num result, so it might not even do that much.
3. Having eliminated step 2, it sees the loop is empty, and gets rid of that as well.
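
To make those three points concrete, here is a rough sketch (not the original poster's code). The names time_loop, LoopResult and sink are made up for illustration. The static_asserts show that 11 / 5 is already known at compile time, and the loop stores through a volatile variable, which is one common way to stop the optimiser from deleting a timing loop as dead code.

```cpp
#include <ctime>

// 11 / 5 is a constant expression, so the compiler can evaluate it itself.
static_assert(11 / 5 == 2, "integer division truncates");
static_assert(11 % 5 == 1, "the remainder is what DIV leaves in EDX");

// Hypothetical result type: elapsed time plus the last value computed.
struct LoopResult
{
    long ms;
    int last;
};

// Hypothetical helper: times a loop whose result is observable.
// A store to a volatile object is a side effect the optimiser must keep,
// so the loop body cannot be thrown away.
LoopResult time_loop(int iterations, int divisor)
{
    volatile int sink = 0;
    std::clock_t start = std::clock();
    for (int i = 0; i < iterations; ++i)
    {
        sink = (i + 11) / divisor;   // depends on i, so it is not one constant
    }
    std::clock_t ticks = std::clock() - start;

    LoopResult result;
    result.ms = static_cast<long>(ticks * 1000 / CLOCKS_PER_SEC);
    result.last = sink;
    return result;
}
```

Calling time_loop(10000000, 5) gives a number that at least measures real divisions, though the compiler is still free to replace the division with something cheaper if it can see that the divisor is constant.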

> and figured out doing things directly would be faster
How?
By guessing that you could (the wrong approach)
Or by using a profiler to find out where the real hot spots are.

I meant I figured "that" doing it directly may be better. I was going to try doing simple things like my cross/dot products and stuff with asm to see if that makes any difference.

Another thing: I took a look at the disassembly of this app, and it doesn't even seem to have a division where I did one in the above app, even after I changed it to i / 2 instead of 11 / 5.

Well, /2 is commonly achieved by doing >>1 (a right shift: SAR or SHR in x86 assembly, ASR on some other architectures).
Only, the compiler probably knows a few more tricks than that.
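
A small sketch of the shift trick, with function names invented for the example. For unsigned values the shift and the division always agree. For signed values, C++ division truncates toward zero while an arithmetic right shift rounds toward negative infinity, so negative inputs need a small adjustment before shifting. The signed version assumes >> on a negative int is an arithmetic shift, which is guaranteed from C++20 and is what mainstream compilers did before that.

```cpp
#include <cstdint>

// Unsigned: dividing by 2 and shifting right by 1 give the same result.
constexpr std::uint32_t half_by_divide(std::uint32_t x) { return x / 2; }
constexpr std::uint32_t half_by_shift(std::uint32_t x)  { return x >> 1; }

// Signed: add 1 to negative inputs first so the shift rounds toward zero,
// matching what x / 2 does. Assumes an arithmetic right shift (see above).
constexpr std::int32_t signed_half_by_shift(std::int32_t x)
{
    return (x + (x < 0 ? 1 : 0)) >> 1;
}

static_assert(half_by_divide(11) == half_by_shift(11), "unsigned case agrees");
static_assert(signed_half_by_shift(-7) == -7 / 2, "signed case needs the fix-up");
```

This is only meant to show why the disassembly has no DIV in it. Writing the shift by hand is unlikely to beat what the optimiser already emits for i / 2.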

Like I said, you're pitting your knowledge against the distilled experience of MANY experienced programmers.

> I was going to try doing simple things like my cross/dot products and stuff with asm to see if that makes any difference.
First thing you need to do is make sure this is in fact a hot-spot by doing some profiling.
Then figure out if there is a better way of working the problem which involves fewer calls to the hot function. The biggest optimisation is to simply not call it; so if you can identify useless calls, then that's a big win.
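
If no profiler is to hand, a crude timing harness is a starting point. The sketch below is only that: Vec3, dot, Timing and time_dot_products are assumed names for illustration, not anything from the original engine, and a real profiler will tell you far more. It returns a checksum along with the time so that the work being measured is actually used.

```cpp
#include <chrono>
#include <cstddef>
#include <vector>

struct Vec3 { float x, y, z; };

inline float dot(const Vec3& a, const Vec3& b)
{
    return a.x * b.x + a.y * b.y + a.z * b.z;
}

// Hypothetical result type: the summed dot products and the elapsed time.
struct Timing
{
    float checksum;
    long long microseconds;
};

// Sums dot(a[i], b[i]) over the shorter of the two buffers and times it.
Timing time_dot_products(const std::vector<Vec3>& a, const std::vector<Vec3>& b)
{
    typedef std::chrono::steady_clock Clock;
    const std::size_t n = a.size() < b.size() ? a.size() : b.size();

    float sum = 0.0f;
    const Clock::time_point start = Clock::now();
    for (std::size_t i = 0; i < n; ++i)
    {
        sum += dot(a[i], b[i]);
    }
    const Clock::time_point stop = Clock::now();

    Timing t;
    t.checksum = sum;
    t.microseconds =
        std::chrono::duration_cast<std::chrono::microseconds>(stop - start).count();
    return t;
}
```

Run it over a realistically sized buffer and compare the result against the time for a whole frame before deciding the dot product is worth touching at all.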

Rewriting in ASM is the absolute last resort when you've exhausted all other avenues.

Finally, if you're going to do it in asm, you need one hell of a crafty approach to the problem to make it worthwhile.

Simply writing out instructions which you imagine a really stupid compiler would generate, then trying to squeeze out a bit of dead code will mean you'll always lose. Write your code, then look at the asm, and see if you can figure any of it out when the optimiser is turned on. There's all sorts of stuff you'd never think of going on in there.
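
One way to follow that advice is to put a small function in its own file and ask the compiler for an assembly listing. The sketch below uses a made-up file name (vec.cpp) and a plain cross product as the example. The commands in the comments are the usual ones for MSVC and GCC/Clang, so adjust them to your own setup.

```cpp
// vec.cpp (hypothetical file name)
//
// MSVC:       cl /O2 /c /FA vec.cpp    writes the listing to vec.asm
// GCC/Clang:  g++ -O2 -S vec.cpp       writes the listing to vec.s
//
// Compare the listing with and without optimisation (/Od or -O0)
// to see how much the optimiser changes.

struct Vec3 { float x, y, z; };

// Plain C++ cross product: the code whose generated asm you want to read.
Vec3 cross(const Vec3& a, const Vec3& b)
{
    Vec3 r;
    r.x = a.y * b.z - a.z * b.y;
    r.y = a.z * b.x - a.x * b.z;
    r.z = a.x * b.y - a.y * b.x;
    return r;
}
```

Reading the optimised listing for something this small is a reasonable first step before trying to write a hand-tuned version.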

Your compiler is NOT a programmable calculator which takes every character you write literally and outputs a little bit of assembler to match what you're trying to do.

Thanks Salem...that does all make sense. I guess I got a long way to go =)
