Hi, I was wondering how to hop a stream dynamically, as opposed to this static hop.

streamin hopwiththis;
streamin in;
streamout out;

mov eax,ecx;

and eax,1023;
cmp eax,0;
jnz end0;

movaps xmm0,in;
movaps out,xmm0;

end0:


I'm not exactly sure what you mean by a static versus a dynamic hop, but in your assembly you don't need the compare instruction: the AND operation already sets the zero flag when its result is zero!

You only show a code snippet, but you can combine your code even further!

;
test ecx,(1024-1);
jnz end0;

Note that TEST is a read-only AND operation: an AND is performed, but the result is not written to the register. You also don't have to use only the eax register unless you're using a very old 80x86 processor. This also means the code is faster, as there is no stall waiting for the value to be written back into the register before that register is read again for the compare.

In this particular case you used an immediate bit mask of (2^N)-1, which is typically used for wrapping back around to the origin.

You can also TEST against a register, so that the masking value is held in a register, such as

test ecx,edx
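
For anyone who finds it easier to read in C, here is a rough sketch of what the TEST/JNZ pair does. It is only an illustration: hop_due, count_hops, counter and mask are names I made up, and I'm assuming the hop size is a power of two so that mask = hop - 1. Keeping the mask in a variable (the register form of TEST) is what would let the hop change at run time.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: returns 1 when the sample counter lands on a hop
   boundary. Mirrors "test ecx,edx / jnz end0": the AND result is only
   examined, never stored. Assumes mask == hop - 1, hop a power of two. */
static int hop_due(uint32_t counter, uint32_t mask)
{
    assert((mask & (mask + 1u)) == 0u);  /* mask must look like 0..0111..1 */
    return (counter & mask) == 0u;
}

/* Counts how many of the first n counter values would run the body. */
static uint32_t count_hops(uint32_t n, uint32_t mask)
{
    uint32_t hits = 0;
    for (uint32_t i = 0; i < n; ++i) {
        if (hop_due(i, mask)) {
            ++hits;
        }
    }
    return hits;
}
```

With a mask of 1023 the body runs once every 1024 samples, with a mask of 0 it runs on every sample, and with a mask of 4095 it runs once every 4096 samples.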

The purpose was to convert a stream to mem, to be used for the hop size. This way I can use null or a bigger hop to adjust the cycles, which would block streams out of processing or slow them down.
Excuse me for asking newbie questions, by the way.
This is my 2nd day assembling.

If I understand correctly, the way described above has the same functionality, with fewer cycles.
But how do I get the stream linked to the hop size?

My programmer demands I do it like the initial example.
I think it's just old style. How could I make the hop size vary, and how do I set it to null?

To help you, I need some clarity. "Stream" can mean many things. It could be a disk file that is read N bytes at a time. Or it could be a file in memory acting like a virtual file, where blocks of memory are skipped: for example parsing a TIFF file, reading a header, and ignoring a chunk by advancing the memory pointer by the number of bytes you wish to skip. You keep using the plural, indicating there are multiple of these 'files', but I really don't understand what you mean by 'hop' other than skipping (seeking) over N bytes of memory. If that is the case, you can pass in the number of bytes to skip, then adjust your pointer by that many bytes. You used a mask of 2^10 - 1 = 1023, which implies you're reading from a cache? Or is that your block size?

There are two ways to round up to a set block size! If the block size is 2^N then you can use an AND operation.

#define ALIGN1024( len ) (( (len) + 1023 ) & ~1023 ) // round up to a multiple of 1024
And the following, where N is a power of two:
#define ALIGN_N( len, N ) (( (len) + ((N)-1) ) & ~((N)-1) ) // round up to a multiple of N

But if not 2^N then you have to use a modulo method!

Essentially a remainder, but I don't know if this is what you are really asking for!

y = (( (x + (R-1)) / R ) * R);

You can also do this with a modulo, instead of relying on the truncated integer result of the division to round up.
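
To make the three variants concrete, here is a minimal sketch in C. The function names are hypothetical, len and block are assumed to be unsigned 32-bit values, and overflow near the top of the range is not handled.

```c
#include <assert.h>
#include <stdint.h>

/* Round len up to a multiple of block using the AND trick.
   Only valid when block is a power of two. */
static uint32_t round_up_pow2(uint32_t len, uint32_t block)
{
    assert(block != 0u && (block & (block - 1u)) == 0u);
    return (len + (block - 1u)) & ~(block - 1u);
}

/* Round up using truncating integer division; any non-zero block. */
static uint32_t round_up_div(uint32_t len, uint32_t block)
{
    assert(block != 0u);
    return ((len + (block - 1u)) / block) * block;
}

/* Same result using the remainder instead of the division. */
static uint32_t round_up_mod(uint32_t len, uint32_t block)
{
    assert(block != 0u);
    uint32_t rem = len % block;
    return rem == 0u ? len : len + (block - rem);
}
```

For a power-of-two block all three agree; for something like a block of 6 only the division and modulo versions apply.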

The streams coming in are audio streams at sample rate,
typically 44100 or 88200 Hz.
I was trying to reduce the processing by taking the code out of processing when the result of "some comparison" == 0.

Basically because I have two separate halves of sound switching back and forth.

The hop refers to the speed at which processing is done.
My other thought would be to have it at 1 (normal speed) when active,
and 4096 when not active.
So it should be more efficient.

But if the option exists to skip processing until called upon again,
that would be my main objective.

movaps WTME,xmm1;

/* WTME is a float that I have available.
The jump from null should happen on (WTME != 0), "is not zero".
*/
////////////////////////////////////////// I have this..

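mov eax, "MY FUNCTION" [0];   // load the raw bits of the first value
shl eax,1;                    // shift out the top (sign) bit; sets the zero flag if nothing is left
jz end;                       // result is zero: skip "MY CODE"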

"MY CODE"

end: 

//////////////////////////////////////////// Or this..

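float temp=0;

push eax;                  // save eax
movaps xmm7,MY FUNCTION;   // load the four packed floats
movaps temp,xmm7;          // spill them to memory
fld temp[0];               // push the first float onto the FPU stack
fistp temp[0];             // store it back as a rounded integer and pop
mov eax,temp[0];
cmp eax,0;                 // is the rounded value zero?
jnz end0;                  // rounded value is non-zero: skip "MY CODE"

"MY CODE"

end0:
pop eax;                   // restore eax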

/////////////////////////////////////////////////

To clarify what I have here, WTME is a float.
Somehow, I'd like (preferably with the 2nd code) to detect when WTME
is zero, to set the hop to zero.
I've managed to do this, but it didn't work nicely at all.
It did save CPU.
So, what would be the right way to go about retrofitting this piece of code to my float?

You only show a snippet of the logic for a single stream.
But here are a few things that come to mind!

Is your memory 128-bit aligned? You are using the MOVAPS instruction which requires memory to be aligned. If unsure or if not aligned use the slower MOVUPS instead.

Note, you aren't using the register to look at the data so you can use any load instruction!

Also interleave those load-save instructions. You're stalling the processor waiting for the load to complete!

You're burning time converting the near 0.0 to integer 0. Can you do this another way, or are you intentionally trying to find where a waveform passes through zero amplitude? (Of course, if you're cropping the wave at that point, don't forget to change the data value to pure zero or you'll get a popping sound!)

Just a thought. To save some speed try to use a pre-mask on the data.

mov eax, Source Memory

TEST eax,07ffff00
jnz continuework

Note I loaded memory from the source memory not the destination memory. This way I don't have to wait for the data to be written first so no instruction stall!

Also I used an integer mask, so if the value is obviously not near zero (other bits are set), there's no need to spend time tasking the FPU with push/pop work.

Of course don't forget to do your really close zero logic.
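
Here is a rough C sketch of the integer-mask idea, in case it helps. It assumes a 32-bit IEEE-754 float; the helper names and the two masks (0x7FFFFFFF and 0x7F800000) are my own choices for illustration and are not the mask from the snippet above.

```c
#include <stdint.h>
#include <string.h>

/* Copy the raw IEEE-754 bits of a float into an integer (no FPU work). */
static uint32_t float_bits(float x)
{
    uint32_t bits;
    memcpy(&bits, &x, sizeof bits);
    return bits;
}

/* True for +0.0 and -0.0: every bit except the sign bit is clear.
   Same idea as "shl eax,1 / jz". */
static int is_zero(float x)
{
    return (float_bits(x) & 0x7FFFFFFFu) == 0u;
}

/* Cheap pre-check: a zero exponent field means the value is zero or a
   denormal, i.e. its magnitude is below roughly 1.18e-38. */
static int is_zero_or_denormal(float x)
{
    return (float_bits(x) & 0x7F800000u) == 0u;
}
```

Anything that fails the cheap check is clearly not zero, so the slower close-to-zero comparison only has to run on the values that pass it.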

Alternatively, you're using a SIMD operation and thus loading 128 bits (four 32-bit single-precision floating-point values). So why not clear the sign bit, and then compare all four simultaneously to a near-zero floating-point value? Four flags will be set or cleared. If one is set, indicating less than, then do the extra work to figure out which one!
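
A minimal sketch of that four-at-once compare, written in C with SSE intrinsics rather than raw assembly. The function name near_zero_mask and the eps threshold are assumptions of mine; the scalar branch is only there so the sketch still builds where SSE is not available.

```c
#if defined(__SSE__) || defined(_M_X64) || (defined(_M_IX86_FP) && _M_IX86_FP >= 1)
#include <xmmintrin.h>
#define HAVE_SSE_SKETCH 1
#endif

/* Returns a 4-bit mask: bit i is set when |v[i]| < eps.
   Uses an unaligned load, so v does not need 16-byte alignment. */
static int near_zero_mask(const float v[4], float eps)
{
#ifdef HAVE_SSE_SKETCH
    __m128 x    = _mm_loadu_ps(v);               /* load four floats */
    __m128 sign = _mm_set1_ps(-0.0f);            /* only the sign bits set */
    __m128 absx = _mm_andnot_ps(sign, x);        /* (~sign) & x clears the sign */
    __m128 lt   = _mm_cmplt_ps(absx, _mm_set1_ps(eps));
    return _mm_movemask_ps(lt);                  /* one flag bit per lane */
#else
    int mask = 0;
    for (int i = 0; i < 4; ++i) {
        float a = v[i] < 0.0f ? -v[i] : v[i];    /* scalar absolute value */
        if (a < eps) {
            mask |= 1 << i;
        }
    }
    return mask;
#endif
}
```

A result of 0 means none of the four samples is near zero, so the whole group can take the fast path; any other value tells you which lanes need the extra work.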

Funny, I never knew where that came from.
In this case I use the 128-bit registers sequentially.
(It's pre-assembled from other code, so no harm done here.)
In another case I don't. That means I can cut that patch
by three quarters, including masking it again.
I'm going to check this out for now and try to get it assembled.
I'll get back to it.
