Loop: LD F0, 0(R1) ;F0 = array element
 ADDD F4, F0, F2 ;add scalar in F2
 SD 0(R1), F4 ;store result
 SUBI R1, R1, #8 ;decrement pointer by 8 bytes (one double)
 BNEZ R1, Loop ;branch if R1 != zero

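On DLX, the execution looks like this (the number at the end of each line is the clock cycle in which that line issues):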

Loop: LD   F0, 0(R1)    1
      stall             2
      ADDD F4, F0, F2   3
      stall             4
      stall             5
      SD   0(R1), F4    6
      SUBI R1, R1, #8   7
      BNEZ R1, Loop     8
      stall             9

I want to understand when the stalls occur.
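
To check my reading of the cycle counts, here is a minimal Python sketch of a simple in-order issue model. The latency table and the one-cycle branch delay are assumptions taken from the usual textbook description of the DLX floating-point pipeline; they are not stated anywhere in the listing above. The names `LATENCY`, `BRANCH_DELAY`, `LOOP`, and `schedule` are made up for this illustration.

```python
# Stall cycles needed between a producing and a consuming instruction.
# ASSUMPTION: typical textbook values for the DLX FP pipeline.
LATENCY = {
    ("load", "fp_alu"): 1,    # LD result used by an FP add
    ("fp_alu", "store"): 2,   # FP add result used by SD
    ("fp_alu", "fp_alu"): 3,  # FP add result used by another FP op
    ("load", "store"): 0,     # LD result used by SD
}
BRANCH_DELAY = 1  # ASSUMPTION: one empty cycle after a branch

# (text, kind, destination register or None, source registers)
LOOP = [
    ("LD   F0, 0(R1)",  "load",    "F0", ["R1"]),
    ("ADDD F4, F0, F2", "fp_alu",  "F4", ["F0", "F2"]),
    ("SD   0(R1), F4",  "store",   None, ["R1", "F4"]),
    ("SUBI R1, R1, #8", "int_alu", "R1", ["R1"]),
    ("BNEZ R1, Loop",   "branch",  None, ["R1"]),
]


def schedule(instrs):
    """Return a list of (cycle, text) rows, inserting 'stall' rows as needed."""
    rows = []
    producer = {}  # register -> (kind of producing instruction, issue cycle)
    cycle = 0      # cycle of the most recently emitted row
    for text, kind, dest, srcs in instrs:
        earliest = cycle + 1  # at most one instruction issues per cycle
        for reg in srcs:
            if reg in producer:
                p_kind, p_cycle = producer[reg]
                gap = LATENCY.get((p_kind, kind), 0)  # unknown pairs: no stall
                earliest = max(earliest, p_cycle + gap + 1)
        # Fill the cycles spent waiting for an operand with stalls.
        for c in range(cycle + 1, earliest):
            rows.append((c, "stall"))
        rows.append((earliest, text))
        cycle = earliest
        if dest is not None:
            producer[dest] = (kind, cycle)
        if kind == "branch":
            # Empty cycle(s) after the branch.
            for _ in range(BRANCH_DELAY):
                cycle += 1
                rows.append((cycle, "stall"))
    return rows


if __name__ == "__main__":
    for cycle, text in schedule(LOOP):
        print(f"{cycle:2d}  {text}")
```

Under those assumed latencies the sketch puts the stalls in cycles 2, 4, 5, and 9: one after the load because ADDD needs F0, two after ADDD because SD needs F4, and one after the branch. If the real pipeline has different latencies, only the `LATENCY` table and `BRANCH_DELAY` need to change.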


