CS/EE 3810

Assignment 4

Due: 10:45am, Thu Feb 12th, 2026

Note: Make reasonable assumptions where necessary and clearly state them. Feel free to discuss problems with classmates, but the only written materials that you may consult while writing your solutions are the textbook and the lecture slides, videos, and notes.

While all other homeworks are worth 100 points, this one is worth 150 points, i.e., it offers 50 points of extra credit.

Every homework has an automatic, penalty-free 1.5-day extension to accommodate any health- or family-related disruptions. In other words, try to finish your homework by Thursday 10:45am to keep up with the lecture content, but if necessary, you may take until Friday 11:59pm.

For this homework, please upload three separate files: the first is a typed-up solution to Q1, and the other two are separate .asm files for Q2 and Q3. The TAs will download and test your .asm files on MARS. Note that your MARS programs will be graded on readability and user-friendliness, as well as correctness. That means LOTS OF COMMENTS!! Again, here's the document that provides an overview of MARS. The examples at the end of that doc will be especially useful as you write these longer programs.

Program Errors (30 points): Consider the following MIPS assembly procedure, countaA. The procedure receives the starting address of a string as an argument. It goes through the entire string and returns the number of occurrences of the letter "A" and the number of occurrences of the letter "a". The procedure has 3 errors. Identify these errors. Provide comments to explain the role of each line in the program.
countaA:
    addi $v0, $zero, 0
    addi $v1, $zero, 0
    addi $t0, $zero, 65
    addi $t1, $zero, 97
loop:
    lw $t2, ($a0)
    beq $t2, $zero, endproc
    beq $t2, $t0, skipA
    addi $v0, $v0, 1
skipA:
    bne $t2, $t1, skipa
    addi $v1, $v1, 1
skipa:
    j loop
endproc:
    jr $ra
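
To check your corrected procedure, it can help to have a high-level statement of what countaA is supposed to return. The Python sketch below is only such a reference model: it is not a MIPS solution and says nothing about where the three errors are. The function name and the order of the two results are assumptions made for illustration.

```python
def count_a_reference(text: str) -> tuple[int, int]:
    """Return (count of 'A', count of 'a'), stopping at the first NUL character."""
    # A MIPS string ends at its null terminator, so ignore anything after it.
    body = text.split("\0", 1)[0]
    return body.count("A"), body.count("a")


if __name__ == "__main__":
    # "aardvArk" has one 'A' and two 'a'; the text after the NUL is not counted.
    print(count_a_reference("aardvArk\0AAA"))  # (1, 2)
```

Running your fixed MIPS procedure on the same string in MARS should give the same two counts.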
Data Compression (50 points): Write a MIPS assembly program in the MARS simulator that accepts an input string of fewer than 50 characters, applies the following compression algorithm to the string, and then prints the resulting compressed string. A valid input string consists only of letters, i.e., a-z and A-Z. First check the string to make sure it's a valid input (if not, print "Invalid Input" and quit). Then walk through the string looking for consecutive occurrences of the same character and replace them with the character and a count (called a "run length encoding"). Limit yourself to run lengths of less than 10, i.e., split a longer run into pieces of at most 9. For example, if you see AAAAA, you would replace it with A5. If you see BBBBBBBBBBBB, you would replace it with B9B3. If you see CC, you would replace it with C2. Single character occurrences do not need a count. At the end, print the compression ratio, which is a floating point number = (size of input string) / (size of output string). For floating point conversion, see the FAQ at the end of the MARS google doc. Here is an example run of the program:

Provide an input string with less than 50 characters and only containing a-z or A-Z:
AACCCCCGTTTTTTTTTTTTTTAAAabbcd
The compressed string is:
A2C5GT9T5A3ab2cd
The compression ratio is 1.875.

Your program must have at least one procedure call with at least one save/restore of a register, while following the save/restore conventions. Before you start, go through the MARS overview in detail. For example, the code on pages 7 and 8 will be useful as you go about allocating space for an array of characters and reading a string. For reference, here is an ASCII table.
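
If you want expected outputs to compare your MARS program against, the compression rule can be modeled in a few lines of a high-level language. The Python sketch below is one possible reference model, not the required MIPS program. The names `compress` and `compression_ratio` are hypothetical, and treating an empty string as invalid is an assumption (the problem statement does not cover that case).

```python
def compress(text: str, max_run: int = 9) -> str:
    """Run-length encode text, splitting any run longer than max_run."""
    # Assumption: an empty string is rejected along with the other invalid inputs.
    valid = 0 < len(text) < 50 and all(c.isascii() and c.isalpha() for c in text)
    if not valid:
        raise ValueError("Invalid Input")
    out = []
    i = 0
    while i < len(text):
        run = 1
        # Extend the run while the same character repeats, up to max_run.
        while i + run < len(text) and text[i + run] == text[i] and run < max_run:
            run += 1
        # A single occurrence is written without a count.
        out.append(text[i] if run == 1 else f"{text[i]}{run}")
        i += run
    return "".join(out)


def compression_ratio(text: str) -> float:
    """(size of input string) / (size of output string)."""
    return len(text) / len(compress(text))


if __name__ == "__main__":
    # The example string, built in pieces so the run lengths are explicit.
    example = "AA" + "C" * 5 + "G" + "T" * 14 + "AAA" + "abbcd"
    print(compress(example))           # A2C5GT9T5A3ab2cd
    print(compression_ratio(example))  # 1.875
```

The run of 14 T's shows the splitting rule: it is emitted as T9 followed by T5, because no single count may exceed 9.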

Pattern Prediction (70 points):
In this problem, you will be given a string of 0's and 1's as input. Using the algorithm below, your program will predict the next bit in the string. Given an input string, your program will first scan the string to train itself, culminating in a prediction of the next bit. The training is done with an array of 32 counters with values between 0 and 7. All 32 counters are initialized to the value 4. Let's take the following 24-bit input string as an example to explain the algorithm:

Bit value	1	1	1	0	1	1	1	0	1	1	1	0	1	1	1	0	1	1	1	0	1	1	1	0
Bit-id	0	1	2	3	4	5	6	7	8	9	10	11	12	13	14	15	16	17	18	19	20	21	22	23

This string is a recurring pattern of 1110, so it should be easy for us humans to guess that the next bit should be a 1. The algorithm starts its training from bit-id 4. It looks at the 5 most recent bits, i.e., bit-ids 0-4 (in this case, 11101), and converts that 5-bit binary number into a decimal number (in this case, 29). It then looks at bit-id 5. If bit-id 5 is 1, then (in this case) it increments counter[29]; if bit-id 5 is 0, then it decrements counter[29]. These are "saturating" increments and decrements, i.e., if the counter is already at 7, an increment keeps it at 7, and if the counter is already at 0, a decrement keeps it at 0. Note that counter[29]'s value is being trained to remember the common-case next bit. If counter[29] is a high value (greater than 3), then the model thinks the next bit after sequence 11101 (the number 29) will be 1; if counter[29] is a low value (less than 4), then the model thinks the next bit after sequence 11101 will be 0.
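
The first training step can be traced in a short Python sketch. This is only an illustration of the arithmetic, not MIPS code; the helper names `window_index` and `saturating_update` are hypothetical. The index is built one character at a time (double the running value, then add the new bit), which is roughly the shift-and-add approach you might use in assembly.

```python
def window_index(bits: str, end: int) -> int:
    """Integer value of the 5 bits at positions end-4 .. end (inclusive)."""
    index = 0
    for ch in bits[end - 4 : end + 1]:
        # Convert the character '0'/'1' to 0/1 and append it as the new low bit.
        index = index * 2 + (ord(ch) - ord("0"))
    return index


def saturating_update(counter: int, next_bit: str) -> int:
    """Increment on '1', decrement on '0', clamped to the range 0..7."""
    if next_bit == "1":
        return min(counter + 1, 7)
    return max(counter - 1, 0)


if __name__ == "__main__":
    bits = "1110" * 6       # the 24-bit example string
    counters = [4] * 32     # all counters start at 4
    idx = window_index(bits, 4)                                 # bit-ids 0-4 -> 11101
    counters[idx] = saturating_update(counters[idx], bits[5])   # bit-id 5 is 1
    print(idx, counters[idx])  # 29 5
```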

After this first training step, the algorithm advances to bit-id 5. It looks at the 5 most recent bits, i.e., bit-ids 1-5 (in this case, 11011), converts them into a decimal number (27), uses that number to index the counter array (accesses counter[27]), and examines bit-id 6 to decide whether to increment or decrement that counter (in this case, bit-id 6 is 1, so counter[27] is incremented).

This training repeats until we reach the end of the string, i.e., as soon as the character after the current bit-id is not 0 or 1, we switch from training to prediction. In the above example, this happens when we reach bit-id 23. So we look at the last 5 bits (bit-ids 19-23, which in this case are 01110) and convert them to a decimal number (14 in this case). That decimal number is used to index our array (in this case, we'll examine counter[14]). If counter[index] is greater than 3, we print our predicted next bit as 1. If counter[index] is less than 4, we print our predicted next bit as 0.
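
Putting the training loop and the final prediction together, a compact reference model might look like the Python sketch below. It is meant for generating expected answers to test your MARS program against, and it is not the MIPS solution. The function name `predict_next_bit` is hypothetical, and `int(..., 2)` stands in for the character-by-character conversion you will need to write yourself in assembly.

```python
def predict_next_bit(bits: str) -> str:
    """Train 32 saturating counters on bits, then predict the following bit."""
    counters = [4] * 32
    # Train with every 5-bit window that still has a bit after it.
    for end in range(4, len(bits) - 1):
        idx = int(bits[end - 4 : end + 1], 2)
        if bits[end + 1] == "1":
            counters[idx] = min(counters[idx] + 1, 7)   # saturate at 7
        else:
            counters[idx] = max(counters[idx] - 1, 0)   # saturate at 0
    # Predict from the last 5 bits of the string.
    last = int(bits[-5:], 2)
    return "1" if counters[last] > 3 else "0"


if __name__ == "__main__":
    print(predict_next_bit("1110" * 6))  # 1
```

In the 24-bit example, counter[14] is incremented four times (when the window ends at bit-ids 7, 11, 15, and 19), so it saturates at 7 and the prediction is 1.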

Note that the program accepts the input as a string. You will need code to convert a string of 0 and 1 characters into a number when generating the index. We will not feed your program invalid inputs, i.e., the input string will contain only 0 and 1 characters, and the input string length will remain between 6 and 100. A correct implementation of the above algorithm will do a decent job detecting most recurring patterns that are shorter than 6 characters.

While it was laborious to read this question, you have learned an important computer hardware concept. Modern processors use variants of this algorithm to identify patterns and make predictions about future behavior, e.g., whether a conditional branch will be true or false. We will study this further when we get to Branch Predictors.