Computer Architecture Help
I cannot seem to solve this problem. If someone can please help me, I would much appreciate it. Thanks.
This problem is concerned with how variations of Tomasulo’s algorithm perform when running a very common loop. This is a vector loop called the DAXPY loop (for double-precision aX plus Y), and it is the central operation in Gaussian elimination. The code below implements the operation Y = aX + Y for a vector of length 100. It assumes that initially R1 = 0 and F0 contains the value of a.
foo:  L.D     F2, 0(R1)       ; load X(i)
      MUL.D   F4, F2, F0      ; multiply a*X(i)
      L.D     F6, 0(R2)       ; load Y(i)
      ADD.D   F6, F4, F6      ; add a*X(i) + Y(i)
      S.D     F6, 0(R2)       ; store Y(i)
      DADDIU  R1, R1, #8      ; increment X index
      DADDIU  R2, R2, #8      ; increment Y index
      DSGTUI  R3, R1, #800    ; test if done
      BEQZ    R3, foo         ; loop if not done
In this code, the instruction DSGTUI R3, R1, #800 is an integer ALU operation that compares register R1 with the unsigned immediate value 800, setting R3 to 1 if R1 > 800 and to 0 otherwise.
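For anyone who finds the assembly hard to follow, here is a rough Python sketch of what the loop computes, plus a literal trace of the three loop-control instructions. The function names `daxpy` and `count_iterations` are made up for illustration and are not part of the problem. The trace assumes only what is stated above: R1 starts at 0, goes up by 8 each pass, and R3 is 1 exactly when R1 > 800.

```python
def daxpy(a, x, y):
    """Return the updated Y, where each element is a*X(i) + Y(i)."""
    return [a * xi + yi for xi, yi in zip(x, y)]


def count_iterations(limit=800, stride=8):
    """Trace the loop-control instructions literally and count the passes.

    Hypothetical helper: only R1 and R3 are modelled, nothing else.
    """
    r1 = 0                            # stated initial value of R1
    trips = 0
    while True:
        trips += 1                    # one pass through the loop body
        r1 += stride                  # DADDIU R1, R1, #8
        r3 = 1 if r1 > limit else 0   # DSGTUI R3, R1, #800
        if r3 != 0:                   # BEQZ R3, foo falls through when R3 != 0
            return trips


print(daxpy(2.0, [1.0, 2.0], [10.0, 20.0]))  # [12.0, 24.0]
print(count_iterations())                    # 101
```

Read literally, the strict greater-than test lets the loop run 101 passes rather than 100, because the pass that ends with R1 = 800 still branches back. That does not matter for the three-iteration tables the problem asks for, but it is worth knowing if you simulate the whole loop.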
The pipeline functional units have the following characteristics:
PIC ATTACH ONE
Assume that:
- The functional units are not themselves pipelined.
- There is no forwarding between functional units, so that results are communicated using the CDB.
- The EX stage does both the effective address calculation and the memory access for loads and stores (so that the pipeline is IF – ID – IS – EX – WB).
- Loads take one cycle.
- The issue (IS) and write result (WB) stages each take 1 clock cycle.
- There are 5 load buffer slots and 5 store buffer slots.
- Assume that the BEQZ instruction takes 0 clock cycles.
- Assume that a queued instruction in a reservation station may execute in the same cycle that the previous instruction writes to the CDB.
- Assume also that a data dependent instruction begins to execute in the cycle after the data value is broadcast on the CDB.
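The timing rules in the list above can be turned into a small helper when filling in the tables by hand. This is only a sketch under my reading of the assumptions: issue takes one cycle, so EX can start no earlier than the cycle after IS; an instruction waiting on an operand starts EX the cycle after that operand is broadcast on the CDB; and the caller supplies the first cycle in which the functional unit is free. The names `earliest_ex_start`, `stall_cycles`, and `unit_free_cycle` are hypothetical, and counting stalls as the wait between IS and EX is an assumption, since the attached tables define the exact convention.

```python
def earliest_ex_start(issue_cycle, operand_cdb_cycles=(), unit_free_cycle=0):
    """Earliest cycle an instruction can enter its first EX cycle.

    issue_cycle        -- cycle in which the instruction is in IS
    operand_cdb_cycles -- cycles in which each pending source operand
                          is broadcast on the CDB (empty if none pending)
    unit_free_cycle    -- first cycle the functional unit is available
                          (assumption: computed by the caller)
    """
    start = issue_cycle + 1           # IS takes one cycle, EX follows it
    for cdb_cycle in operand_cdb_cycles:
        start = max(start, cdb_cycle + 1)   # cycle after the broadcast
    return max(start, unit_free_cycle)


def stall_cycles(issue_cycle, ex_start):
    """Cycles spent waiting between IS and the first EX cycle (assumed convention)."""
    return ex_start - (issue_cycle + 1)


# Example: issued in cycle 3, waiting on a value broadcast in cycle 7.
start = earliest_ex_start(3, [7])
print(start, stall_cycles(3, start))  # 8 4
```

The latencies themselves come from Table 1 in the attachment, so this helper deliberately takes CDB cycles as inputs instead of guessing how long each unit takes.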
a) For this part of the problem, use the single-issue Tomasulo MIPS pipeline shown in Figure 2.9 of your text with the pipeline latencies shown below in Table 1. Show the number of stall cycles for each instruction and what clock cycle each instruction begins execution (i.e. enters its first EX cycle) for three iterations of the loop. How many cycles does each loop iteration take? Give your answers in the form of a table like the one labeled Part (a) below. The first couple of lines of that table are filled in to give you an idea of what to do. You are to finish the table starting at the third line and continuing until 3 iterations are complete.
b) For this part, use the code for the DAXPY loop and a fully pipelined floating-point unit with the latencies of Table 1. Assume two-issue Tomasulo hardware with one integer unit taking one execution cycle (that means 0 cycles to use) for all integer operations. Show the number of stall cycles for each instruction and what clock cycle each instruction begins execution (i.e. enters its first EX cycle) for three iterations of the loop. Show your answer in the form of a table like the one labeled Part (b) below. The first few lines of that table are filled in to give you an idea of what to do. You are to finish the table starting at the fourth line and continuing until 3 iterations are complete.
c) Again using the MIPS code for DAXPY given above, assume Tomasulo’s algorithm with speculation as shown in Figure 2.14 of your text. Assume the latencies of Table 1 and also assume that there are separate integer functional units for effective address calculation, for ALU operations, and for branch condition evaluation. Create a table like the one labeled Part (c) below for the first three iterations of the loop. The first few lines of that table are filled in to give you an idea of what to do. You are to finish the table starting at the fourth line and continuing until 3 iterations are complete.
PIC ATTACH TWO
PIC ATTACH THREE





