Points missed:

Student's Name: \_\_\_\_\_

Total score: \_\_\_\_\_ /100 points

East Tennessee State University

Department of Computer and Information Sciences

CSCI 4717 – Computer Architecture

TEST 3 for Fall Semester, 2003

Section 201

### **Read this before starting!**

- The total possible score for this test is 100 points.
- This test is closed book and closed notes.
- You may use one sheet of scrap paper that you will turn in with your test.
- When possible, indicate final answers by drawing a box around them. This is to aid the grader (who might not be me!). Failure to do so might result in no credit for the answer. Example:



- **1 point will be deducted** per answer for missing or incorrect units when required. **No** assumptions will be made for hexadecimal versus decimal, so you should always include the base in your answer.
- If you perform work on the back of a page in this test, indicate that you have done so in case the need arises for partial credit to be determined.

"Fine print"

Academic Misconduct:

Section 5.7 "Academic Misconduct" of the East Tennessee State University Faculty Handbook, June 1, 2001:

"Academic misconduct will be subject to disciplinary action. Any act of dishonesty in academic work constitutes academic misconduct. This includes plagiarism, the changing or falsifying of any academic documents or materials, cheating, and the giving or receiving of unauthorized aid in tests, examinations, or other assigned school work. Penalties for academic misconduct will vary with the seriousness of the offense and may include, but are not limited to: a grade of 'F' on the work in question, a grade of 'F' in the course, reprimand, probation, suspension, and expulsion. For a second academic offense the penalty is permanent expulsion."

### Very short answers – Each answer is worth 2 points

1. Give an example of a type of operand fetch that would take longer than average to perform.

For the next 4 questions, use the following abbreviations for the stages of a 6-stage pipeline:

- Fetch instruction (FI)
- Decode instruction (DI)
- Calculate operands (CO)
- Fetch operands (FO)
- Execute instruction (EI)
- Write operand (WO)

- 2. List all of the stages that together could result in a bus resource conflict.
- 3. A write-read data dependency occurs when the \_\_\_\_\_ stage of an instruction depends on a previous instruction's \_\_\_\_\_ stage.
- 4. Not all instructions use all stages. List all of the stages that could be considered optional for some instructions.
- 5. At the end of which stage does a processor know the result of a conditional branch?
- 6. If a single bit is used for a branch history and it is initialized to the "don't branch" state, then how many erroneous predictions will be made during the full execution of a typical loop?

a.) 1 b.) 2 c.) 3 d.) Varies depending on the circumstance

- 7. In what level of the memory hierarchy are the branch history bits typically stored?
- 8. True or false: A loop buffer doesn't help if the instruction hasn't been executed recently.
- 9. True or false: A loop buffer only provides a speedup because it reduces the instruction fetch time by using a high-speed memory for the buffer.
- 10. Which approach to branch prediction will the typical for-loop perform best with, "predict always taken" or "predict never taken"?
- 11. List one of the two methods of handling global variables in a system using register windows.
- 12. True or false: The purpose behind the delayed branch is to avoid flushing the pipeline.
- 13. Modify the code below to support a delayed branch. (There are two different ways.)

| Opcode | Operands |
|--------|----------|
| mov    | al, 23   |
| cmp    | bl, ch   |
| jne    | label_a  |
| inc    | bl       |
| add    | al, bl   |

- 14. Which cache coherence protocol uses distributed control, the directory protocol or snoopy protocol?
- 15. Assume a multiprocessor system uses the MESI protocol. If the current state of a line in processor A's cache is shared and processor B modifies that data without writing the new value to main memory, what does the state of that line in processor B's cache change to?

a.) modified b.) exclusive c.) shared d.) invalid e.) cannot be determined

- 16. Assume a multiprocessor system uses the MESI protocol. If the current state of a line in processor A's cache is shared and processor A modifies that data and writes the new value to main memory, but no other processor updates their copies, what does the state of that line in processor A's cache change to?

a.) modified b.) exclusive c.) shared d.) invalid e.) cannot be determined

- 17. Assume a multiprocessor system uses the MESI protocol. If processor A contains a line in its cache that no other processor contains in its cache, what is the state of the line in processor A's cache?

a.) modified b.) exclusive c.) shared d.) invalid e.) cannot be determined

- 18. For the following code, how many virtual registers will need to be created using register renaming? Do not count the original assignment of register values.

Answer:

19. Which two of the following dependencies exhibit the same behavior?

a.) Data dependency b.) Procedural dependency c.) Resource conflict

20. Name an application where vector computing would be the best option.

For the next 5 questions, fill in the blank with the SMP bus configuration that best completes the sentence: time shared (TS), multiport memory (MM), or central control (CC).

- 21. \_\_\_\_\_ results in a very complex printed circuit board (lots of wires).
- 22. \_\_\_\_\_\_ is the most reliable (no central point of failure).
- 23. \_\_\_\_\_\_ is the best in terms of performance.
- 24. \_\_\_\_\_\_ is the simplest to implement.
- 25. \_\_\_\_\_\_ is the easiest to expand with additional processors.

### Sort of short answers – Each answer is worth 3 points

- 26. What simple change can be made to a two-stage RISC pipeline to turn it into a three-stage pipeline? (Remember that a 2-stage pipeline is restricted to overlapping only non-memory execution stages with memory access stages from other instructions.)
- 27. List the two factors that frustrate the theory that a processor will run faster with a greater number of stages in its pipeline.
- 28. List the two problems with creating multiple streams to handle conditional branches, i.e., putting both the jump and the non-jump execution streams into their own pipeline.
- 29. Circle ALL of the addressing modes supported by the typical RISC processor.
  - a.) Register b.) Indirect addressing c.) PC-relative
  - d.) Displacement e.) Dual operands f.) Unaligned addressing
- 30. What does an *instruction window* in a superscalar machine allow the processor to do?

### Medium answers – Each answer is worth 5 points

31. Assume 4 instructions, one of which is a conditional branch, are passed through a pipeline with 6 stages. Using $\tau$ to represent the time it takes to complete a stage, calculate how long it will take to pass all 4 instructions through the pipeline *both without and with prefetching*. Assume the conditional branch is not the last instruction.

Without prefetching: \_\_\_\_\_ With prefetching: \_\_\_\_\_

32. Many CISC instructions are meant to close the gap between opcodes and HLL instructions. Name two reasons why this isn't always effective in making applications run faster.

33. List 3 of the 4 characteristics discussed in class of a RISC machine.

34. Identify the write-read, write-write, and read-write dependencies in the instruction sequence below by entering each line pair with a dependency in the correct column of the table to the right. For example, if L1 and L4 had a write-write dependency (which they don't), you would enter L1-L4 in the column labeled "write-write".

| Line | Instruction  | write-read | write-write | read-write |
|------|--------------|------------|-------------|------------|
| L1:  | R1 ← R2 + R3 |            |             |            |
| L2:  | R4 ← R1 + 1  |            |             |            |
| L3:  | R1 ← R3 * 2  |            |             |            |
| L4:  | R5 ← R1 + R3 |            |             |            |
| L5:  | R5 ← R5 + 10 |            |             |            |

35. Describe the problem encountered when an interrupt occurs in a superscalar processor.

36. List two of the four potential advantages to be gained when using a symmetric multiprocessor system.

37. List three of the five operating system design considerations for a symmetric multiprocessor system.