Note: Make reasonable assumptions where necessary and clearly state them. Feel free to discuss problems with classmates, but the only written materials that you may consult while writing your solutions are the textbook and lecture slides/videos. Solutions should be uploaded as a single PDF file on Canvas. Show your solution steps so that you can receive partial credit for incorrect answers and so that we know you have understood the material. Don't just show us the final answer.
This homework has an automatic penalty-free 1.5-day extension to accommodate any COVID/family-related disruptions. In other words, try to finish your homework by Wednesday 1:25pm to keep up with the lecture content, but if necessary, you may take until Thursday 11:59pm. Of course, Thursday is Thanksgiving, so just get it done early so you have a stress-free break.
Briefly describe the Spectre attack. Try to address the following in your answer: (i) Where are the attacker and victim threads executing, and what resources do they share? (ii) What is the code sequence in the victim that causes the information leak, and how does it do so? (iii) How does the attacker inspect the victim's information leak? (iv) How does the attacker amplify the information leak?
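As a rough illustration of the mechanism this question asks about, the sketch below is a toy software model, not a working exploit. It assumes the bounds-check-bypass variant of Spectre and a flush-then-probe style of inspection. A Python set stands in for the shared cache, and "speculation" is modeled by performing the out-of-bounds load before the bounds check takes effect. Every name in it (ToyCache, victim, attacker_read_byte, LINE) is hypothetical and chosen only for this example.

```python
# Toy model only: nothing here measures real timing or real speculation.
LINE = 64  # assumed cache line size in bytes


class ToyCache:
    """Stand-in for a cache shared by attacker and victim."""

    def __init__(self):
        self.lines = set()

    def flush(self):
        self.lines.clear()

    def touch(self, addr):
        self.lines.add(addr // LINE)

    def is_cached(self, addr):
        # On real hardware the attacker would infer this from load latency.
        return addr // LINE in self.lines


def victim(cache, memory, array1_len, x):
    # Architectural intent: if x < array1_len: y = probe[memory[x] * LINE]
    # Model of a mistrained branch: the dependent loads run first, and the
    # bounds check only decides whether the result is architecturally kept.
    value = memory[x]
    cache.touch(value * LINE)  # probe array assumed to start at address 0
    if x < array1_len:
        return value
    return None  # speculative work squashed, but the cache line stays


def attacker_read_byte(cache, memory, array1_len, x):
    cache.flush()                          # evict the probe array
    victim(cache, memory, array1_len, x)   # trigger the leaky code path
    hits = [v for v in range(256) if cache.is_cached(v * LINE)]
    return hits[0] if len(hits) == 1 else None


if __name__ == "__main__":
    memory = b"pub!SECRET"   # first 4 bytes are the in-bounds array
    cache = ToyCache()
    leaked = bytes(attacker_read_byte(cache, memory, 4, x) for x in range(4, 10))
    print(leaked)
```

In this model the victim never returns the out-of-bounds byte, yet the attacker recovers it from which probe line ended up cached. Repeating the call with successive values of x is one simple way to extend the leak from a single byte to a longer region.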
Consider a symmetric shared-memory multiprocessor (3 processors sharing a bus) implementing a snooping cache coherence protocol such as the one discussed in class. For each of the events below, explain the coherence protocol steps (does the cache flag a hit/miss, what request is placed on the bus, who responds, is a writeback required, etc.) and mention the eventual state of the data block in the caches of each of the 3 processors. Follow the format shown in Slide 11. Assume that X and Y are not in any of the caches at the start of the sequence, the caches are direct-mapped, and blocks X and Y map to the same set in each cache (X and Y cannot co-exist in a cache at any time).
P1: Write X
P2: Write X
P3: Read X
P1: Read X
P3: Write X
P3: Read Y
P2: Write Y
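The event sequence above can be replayed with a small simulator to sanity-check a hand-written trace. The sketch below assumes a three-state MSI write-invalidate snooping protocol, which may differ from the protocol discussed in class (for example, MESI, or a different rule for who supplies data), so treat its output as illustrative rather than as the expected answer. The function name simulate, the EVENTS list, and the wording of the logged actions are all made up for this example.

```python
# Assumption: MSI write-invalidate snooping; states are "M", "S", or absent.
EVENTS = [
    ("P1", "W", "X"), ("P2", "W", "X"), ("P3", "R", "X"), ("P1", "R", "X"),
    ("P3", "W", "X"), ("P3", "R", "Y"), ("P2", "W", "Y"),
]


def simulate(events, procs=("P1", "P2", "P3")):
    # X and Y map to the same set of a direct-mapped cache, so each cache
    # is modeled as a single slot holding (block, state) or None.
    cache = {p: None for p in procs}
    log = []
    for proc, op, blk in events:
        line = cache[proc]
        held = line is not None and line[0] == blk
        actions = []
        if held and (op == "R" or line[1] == "M"):
            request = "hit"  # no bus traffic needed
        else:
            if held:
                request = "upgrade"  # write hit on a Shared copy
            else:
                request = "read miss" if op == "R" else "write miss"
                if line is not None and line[1] == "M":
                    # Evicting a dirty block that conflicts with blk.
                    actions.append(f"{proc} writes back {line[0]}")
            for other in procs:  # every other cache snoops the request
                copy = cache[other]
                if other == proc or copy is None or copy[0] != blk:
                    continue
                if copy[1] == "M":
                    # Whether memory is also updated here depends on the
                    # protocol variant; this sketch only records the flush.
                    actions.append(f"{other} flushes {blk}")
                if op == "W":
                    cache[other] = None
                    actions.append(f"{other} invalidates {blk}")
                else:
                    cache[other] = (blk, "S")
            cache[proc] = (blk, "M" if op == "W" else "S")
        log.append((f"{proc} {op} {blk}", request, actions, dict(cache)))
    return log


if __name__ == "__main__":
    for event, request, actions, state in simulate(EVENTS):
        print(event, "|", request, "|", "; ".join(actions) or "-", "|", state)
```

Each printed row gives the event, the request type, any writebacks or invalidations, and the resulting contents of the three caches, which roughly mirrors the columns a trace table would need.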
Consider the same sequence of memory accesses as above. Assume that 4 processors are connected with a point-to-point interconnect and implement distributed shared memory with a directory-based cache coherence protocol. For the above sequence of instructions, what is the total number of interconnect message transfers while implementing a write-invalidate protocol? For each instruction, list the messages that must be sent on the network and the state of the line in the caches and in the directory. Assume that a message can include some control information as well as an address and cache line. Also assume that the home nodes for memory locations X and Y are both associated with processor P4. Follow the format shown in Slide 16. Assume that X and Y are not in any of the caches at the start of the sequence, the caches are direct-mapped, and blocks X and Y map to the same set in each cache (X and Y cannot co-exist in a cache at any time).