![]() |
|||
Homework 1: ShellThis assignment will teach you how to use the Unix system call interface and the shell by implementing a small shell, which we will refer to as the 5460/6450 shell. You will also learn how to use GDB to debug your code. You can do this assignment on any operating system that supports the Unix API (Linux CADE machines, your laptop that runs Linux or Linux VM, and even MacOS, etc.). You don't need to set up xv6 for this assignment Submit your programs and the shell through Gradescope (see instructions at the bottom of this page). NOTE: YOU CANNOT PUBLICLY RELEASE SOLUTIONS TO THIS HOMEWORK. It's ok to show your work to your future employer as a private Git repo, however any public release is prohibited. For Mac / OSX users. The support of 32 bit applications is depricated in the latest version of your system. So if you already updated your system to macOS Catalina or have updated your XCode then we recommend you to do the homework at the CADE machines. NOTE: We are aware that there are several tutorials on writing shells online. This assignment itself borrows heavily from Stephen Brennan's blog post. We strongly encourage you to do this assignment without referring to the actual code in those implementations. You are welcome to look at broad concepts (which we also try to explain here), but the actual implementation should be your work.
NOTE: We recently were made aware of the GNU readline library. Bash (and other shells) rely heavily on it for auto-complete, moving the cursor around when entering input, and even reverse-search. For those interested, this is a really interesting read on the history of TIP: While building this assignment, several parts, like adding support for I/O redirection and pipes might not be immediately obvious, and are quite involved. We encourage you to take a look at xv6's shell to get design clues (sh.c). Note however, that you cannot take the xv6 implementation and submit it (or any other submissions from previous years). You might pass all the test cases, but you will receive a 0 on this assignment if you don't submit what is entirely your work. We will build shell in the following parts: 1) Reading and parsing a command, 2) Executing programs, 3) Implementing support for I/O redirection, and 4) Implementing support for pipes. To start, this is a skeleton of a simple UNIX shell. #include <stdlib.h> #include <unistd.h> #include <stdio.h> #include <fcntl.h> #include <string.h> #include <assert.h> #include <sys/types.h> #include <sys/stat.h> #include <sys/wait.h> #include <errno.h> int main(int argc, char **argv) { sh_loop(); return EXIT_SUCCESS; }
As we will see, the shell is a program that essentially waits for user input, executes commands, and repeats. We will keep our shell simple, by just calling a function $gcc sh.c -o utsh Here gcc will compile your program as The basicsThe main job of a shell is to execute commands. One way to break this down is:
sh_loop() function, hence can look something like the following.
void sh_loop(void) { char *line; char **args; int status; do { printf("utsh$ "); line = sh_read_line(); args = sh_split_line(line); status = sh_execute(args); free(line); free(args); } while (status); }It runs in a loop, and it provides a prompt to the user every time the loop executes: utsh$Once the user enters a command, it calls sh_read_line to read the command, sh_split_line to parse it, and finally sh_execute to execute the command. It then loops back, trying to do the same thing all over again. Note here that the termination of the loop is dependant on the
status variable, which you will have to set appropriately when you write the sh_execute function.
Reading a lineWe do not want to test you on your skills with reading and parsing lines in C, which can be quite involved if one wants to handle several possible error situations. Hence, we provide you with a template forsh_loop() below.
The shell has to read characters from stdin into a buffer to parse it. The thing to note is that you cannot know before hand, how much text a user is going to input as a command, and hence, you cannot know how much buffer to allocate. One strategy is to start with an allocation of small size using
We encourage you to try out writing your char *sh_read_line(void) { char *line = NULL; size_t bufsize = 0; // have getline allocate a buffer for us if (getline(&line, &bufsize, stdin) == -1) { if (feof(stdin)) // EOF { fprintf(stderr, "EOF\n"); exit(EXIT_SUCCESS); } else { fprintf(stderr, "Value of errno: %d\n", errno); exit(EXIT_FAILURE); } } return line; }We have given an implementation of the parser for you, but make sure you understand what getline is doing.
Parsing the lineNow that we have the line inputted by the user, we need to parse it into a list of arguments. We won't be supporting backslash or quoting in our command line arguments. The list of arguments will be simply be separated by whitespace. What this means is a command likeecho "hello world" , will be parsed into 3 tokens: echo , "hello , and world" (and not into 2 tokens echo , and hello world as it should be ideally).
That being said, the parser, #define SH_TOK_BUFSIZE 64 #define SH_TOK_DELIM " \t\r\n\a" char **sh_split_line(char *line) { int bufsize = SH_TOK_BUFSIZE; int position = 0; char **tokens = malloc(bufsize * sizeof(char *)); char *token, **tokens_backup; if (!tokens) { fprintf(stderr, "sh: allocation error\n"); exit(EXIT_FAILURE); } token = strtok(line, SH_TOK_DELIM); while (token != NULL) { tokens[position] = token; position++; if (position >= bufsize) { bufsize += SH_TOK_BUFSIZE; tokens_backup = tokens; tokens = realloc(tokens, bufsize * sizeof(char *)); if (!tokens) { free(tokens_backup); fprintf(stderr, "sh: allocation error\n"); exit(EXIT_FAILURE); } } token = strtok(NULL, SH_TOK_DELIM); } tokens[position] = NULL; return tokens; }At the start of the function, we begin tokenizing by calling strtok() which returns a pointer to the first "token". What strtok() actually does is return pointers to within the string you give it (we call that pointer token ), and places a null terminator \0 at the end of each token. We store each pointer in an array (buffer) of character pointers called tokens .
Finally, we reallocate the array of pointers if necessary. The process repeats until no token is returned by strtok() , at which point we null-terminate the list of tokens.
Part 0: Debugging the shell program with GDBOn UNIX systems the main debugger is GDB (GNU debugger). To be able to comfortably debug your code compile it with the -g option which will instruct the compiler to generate debug symbols (variable names, source lines, etc.) for the program. For example, download sh_t.c and change the command you ran to build sh.c to$ gcc sh_t.c -o utsh -Wall -g -fno-picThis will compile your shell program with debugging symbols ( -g flag), and for simplicity avoid generating position independent code ( -fno-pic flag). Then you can start you program under control of gdb:
$ gdb utshThis starts gdb ready to execute your utsh program. To get it running type the run command in the GDB command prompt (or just r for short) :
(gdb) runNow the program runs and prints a shell prompt: utsh$ GDB is a feature-rich debugger, and it will take you some time to learn all the features. Here are a few starting points: GDB tutorial, GDB intro and GDB Cheat Sheet. Probably, the best resource for this homework is Operating Systems from 0 to 1. Chapter 6. Runtime Inspection and Debug (it is strongly recommended to read this chapter). At a high level you need only two main things: 1) breakpoints and 2) ability to examine data. Breakpoints can be set with the "b" command inside gdb. Breakpoints and single stepping
Running the program on its own is not that useful. Lets try setting a breakpoint on the "main" function to examine what the program is actually doing. Type break main (gdb) b main Breakpoint 3 at 0x401100: file sh_t.c, line 120. (gdb) r The debugger stopped at the beginning of the main function (line 120 of sh_t.c). You can examine the source code of the program by typing list (or l for short). (gdb) list 115 @return status code 116 */ 117 int main(int argc, char **argv) 118 { 119 // Run command loop. 120 sh_loop(); 121 return EXIT_SUCCESS; 122 } Now you can execute the program line by line by typing next (or n for short), which executes the next line. By default typing next will skip over functions. Type step (or s for short) to step into a function. Try stepping into the sh_loop function by running step. (gdb) s sh_loop () at sh_t.c:95 95 { We are now inside the sh_loop function. Type l to list the source code, and then type n repeatedly to execute the function line by line. Note that we can also type n once, and then simply hit Enter asking GDB to execute the last command for us. (gdb) l 90 91 /** 92 @brief Loop getting input and executing it. 93 */ 94 void sh_loop(void) 95 { 96 char *line; 97 char **args; 98 int status; 99 (gdb) n 101 printf("utsh$ "); (gdb) list 96 char *line; 97 char **args; 98 int status; 99 100 do { 101 printf("utsh$ "); 102 line = sh_read_line(); 103 args = sh_split_line(line); 104 status = sh_execute(args); 105 (gdb) n 102 line = sh_read_line(); (gdb) TUI: Graphical User InterfaceThe second most useful feature is the TUI mode that turns GDB into a real modern debugger. Here is a useful discussion about TUI.You can switch into TUI by pressing Ctrl-X and then "1", or start gdb in TUI mode right away $ gdb utsh -tuiYou can also type tui enable in the gdb command prompt (this command doesn't work with some terminal emulators like Kitty, try another one). Start the program from the begginging and single step it with n and s. The source code of the program will be scrolling in the TUI window in the top part of the screen. Examining dataYou can print values of variables with "print", e.g., print the values of i and sum (gdb) p status $1 = 1 (gdb) Conditional breakpointsWhile debugging programs it's often useful to see what the program is doing right before it crashes. One way to do this is to step through, one at a time, every statement of the program, until we get to the point of execution where we want to examine the state of the program. This works, but sometimes you may want to just run until you reach a particular section of code based on a condition, and stop execution at that point so you can examine data at that point of execution.
For instance, in the sh_loop function, you might want to examine the state of the program when the GDB allows you to set conditional breakpoints. To set a conditional breakpoint to break inside the loop of the sum function when the (gdb) b sh_t.c:103 Breakpoint 1 at 0x40139c: file sh_t.c, line 103. (gdb) condition line != 0Note that the 1 in the condition refers to the breakpoint number we were notified about when we initially set the breakpoint. We can also achieve the above in one command statement with the following: (gdb) break sh_t.c:103 if line!=NULL We now start execution of the program with the run or r command, and type (gdb) r utsh$ ll Breakpoint 1, sh_loop () at sh_t.c:103 103 args = sh_split_line(line); When the breakpoint is hit we can check if the value of i is not NULL: (gdb) p line $2 = 0x4056b0 "ll\n" Exploring crashesNow, lets take a look at how you can use GDB to debug your crashing programs. First, lets generate a program that crashes. In sh_t.c, change line 71 totokens[111111] = token; and recompile.
(gdb) r utsh$ sdfsd Program received signal SIGSEGV, Segmentation fault. 0x00000000004012e3 in sh_split_line (line=line@entry=0x4056b0 "sdfsd") at sh_t.c:71 71 tokens[111111] = token;You can use the backtrace (bt) command to look at the backtrace (a chain of function invocations leading to the crash): (gdb) bt #0 0x00000000004012e3 in sh_split_line (line=line@entry=0x4056b0 "sdfsd") at sh_t.c:71 #1 0x00000000004013a1 in sh_loop () at sh_t.c:103 #2 0x0000000000401109 in main (argc=Here, the GDB tells you that sh_split_line got a segmentation fault at line 18 in main.c. You see that there are three stack frames available (0 for sh_split_line, 1 for sh_loop, 2 for main ).
You can use the frame sh_split_line with the list command
(gdb) f 1 #1 0x00000000004013a1 in sh_loop () at sh_t.c:103 103 args = sh_split_line(line); (gdb) l 98 int status; 99 100 do { 101 printf("utsh$ "); 102 line = sh_read_line(); 103 args = sh_split_line(line); 104 status = sh_execute(args); 105 106 free(line); 107 free(args); (gdb) Debugging forked processIt is also possible to debug a child process. See Debugging programs with multiple processes You can test it with the following code: Set a breakpoint afterfork()
#include Part 1: Executing programs (30 Points)NOTE: For the rest of this assignment, you will be doing all the implementation. You are free to modify any functions that we provide, including their signatures. What we provide is a template which we encourage you to use, as we expect it will make things easier for you.Now, finally we can come to the part where we make our tiny shell do what it was created for: starting to execute programs! By now, our shell should start and offer the user a prompt: utsh$ In this part of the assignment, you have to extend the shell to allow simple execution of external programs, for instance utsh$ ls bar.txt foo.txt utsh sh.c utsh$The execution of programs, of course, is handled by the sh_execute function.
int sh_execute(char **args) { if (args[0] == NULL) { return 1; // An empty command was entered. } return sh_launch(args); // launch } You should do this by implementing the Remember to return an appropriate return value from
You might find it useful to look at the manual page for $man 3 execand read about execv .
NOTE: When you type utsh$ /bin/ls This should execute the program You may want to change the utsh shell to always try Your shell should handle arguments to the called program , i.e. this should work utsh$ ls /home aburtsev utsh$TIP: In GDB, if you want to debug child processes, set follow-fork-mode child is sometimes useful. This is a good
reference .
Part 2: I/O redirection (30 Points)Now that you can execute commands, let us extend the features our shell provides. Now you have to implement I/O redirection commands so that you can run: utsh$ echo "utsh is cool" > x.txt utsh$ cat < x.txt utsh is cool utsh$ You should extend You might find the man pages for Part 3: Pipes (40 Points)Finally, you have to implement support for pipes so that you can run command pipelines such as: utsh$ ls | sort | uniq | wc 11 11 85 utsh$ You have to extend Test that you can run the above pipeline. The From one of the CADE machines you should be able to run the following command correctly (here a.out is your utsh shell): $ a.out < t.sh Submit your work Submit your solution through Gradescope Gradescope CS5460/6450
Operating Systems. Pack your shell, / - sh.c
Challenge exercises (total extra 50%, 10% each)The shell we have built is very simple. It does not support built-in commands, likecd , history , etc. It does not support providing a list of commands, or running jobs in the background. There is no support for globbing, quoting or backslash escaping, to name a few important features typical in shells.
You can add any feature of your choice to your shell. But, you may want to consider the following as a start:
|
|||
![]() |
|||
Updated: January, 2024
|