Server Lab


This lab is based on the Tiny web server by Bryant and O’Hallaron for Computer Systems: A Programmer’s Perspective, Third Edition.

Due: Wednesday, December 7, 11:59pm

In this lab, you’ll modify a web server to implement a concurrent chat facility. Part of the server’s functionality will involve acting as a client to import conversations from other web servers that implement the chat protocol.

Implementing the user interaction and service queries well enough to handle the "simple.rkt" test is worth a check grade (i.e., 80%). Adding concurrency well enough to handle the "trouble.rkt" test is worth a check~ (which will count as 90%). Finally, adding support for the “import” service and passing the "stress.rkt" test is worth a check+ (i.e., 100%).

Getting Started

Start by unpacking servlab-handout.zip. The only file you will be modifying and handing in is "tinychat.c".

You can build and run the initial web server, which is a variant of the book’s Tiny web server, with

$ make

$ ./tinychat ‹port›

where ‹port› is some number between 1024 and 65535. After starting tinychat, you can connect to it from a web browser on the same machine with

http://localhost:‹port›/

The initial server replies to any request with an HTML page that contains a simple form. The server also prints to standard output any query parameters that it received through the request, both in the URL and as POST data. Note that if you submit the form, the text that you entered is printed as a content query parameter.

Server Behavior

Your revised server must keep track (in memory) of any number of conversations, where each conversation is identified by a case-sensitive topic string. Any number of clients should be able to connect to the server (up to typical load limits) and communicate in any number of different conversations. The server starts with an empty set of conversations.

To allow users to participate in conversations, your server must work simultaneously in two ways: in a user- and browser-friendly mode as described in User Interaction, and in a program-friendly mode as described in Service Queries.

Finally, your server must be highly available as described in Availability and Client Constraints, which means that it is robust against slow or misbehaving clients. This availability requirement will require concurrency in your server implementation. For grading, we will test your server by throwing a mixture of clients—fast and slow, behaving and misbehaving—all at the same time.

Your server’s output to stdout and stderr will be ignored, so you can use those streams for logging or debugging as you see fit.

User Interaction

When your server is contacted with the root URL (like the one shown above), it should present a form to let a user “sign in” to a specific conversation. That is, the user should provide a name and a topic string. If no conversation exists for the given topic string, a new one is started as initially empty.

The result should be a page that shows the full content of the conversation up to that point, plus a text box for a user to add a new entry to the conversation. Each conversation entry should be prefixed by the name of the user that contributed the content.

If the user adds text from the conversation-specific page, the given text should be added to the conversation as an entry from that user. The user should not have to re-provide their name or topic when adding to the conversation at this point, since that information should be carried over from the sign-in page.

If the user submits empty text, then the conversation page should just refresh, instead of adding an entry to the conversation. Unlike a real chat program, the conversation does not have to update automatically in a user’s browser as others contribute to the conversation; sending empty content is a way for the user to get the latest conversation content.

The styling and precise details for user interaction are up to you, as long as user interactions work at least in the ways described above.

You are free to use any form of URLs (including query parameters and fragments) to implement user interactions, as long as the root URL is a starting point and as long as other URLs can’t conflict with the URLs for Service Queries. Whatever strategy you use, the browser will have to carry a topic and user name from the sign-in step to later steps that add to a conversation.

For relevant information about HTML forms and HTTP, see About HTML Forms.

Service Queries

In addition to a browser-oriented view of a conversation, your server must support three GET service actions that are more suitable for automated clients:

/conversation?topic=‹topic› — Returns the content of the conversation identified by ‹topic› as plain text (i.e., text/plain; charset=utf-8). The result is empty if no such conversation exists already.
Each conversation entry has the form ‹user›: ‹content› followed by a carriage return and newline. Entries are in the order that they were added to the conversation. Some ‹content› may itself contain :, carriage returns, and newlines; the intent is that conversations are accumulated as strings and don’t need to be parsed by a program.
/say?user=‹user›&topic=‹topic›&content=‹content› — Add ‹content› as a conversation entry by ‹user› to the end of the conversation specified by ‹topic›. Unlike user interactions, supplying an empty ‹content› should still add to the conversation.
The result sent back to the request is unspecified (but must be a valid HTTP response).
/import?topic=‹topic›&host=‹host›&port=‹port› — Contacts a conversation server running on ‹host› at ‹port› to get its conversation content for ‹topic›. That content is appended to the server’s own conversation for the same ‹topic› (creating the conversation if it does not exist already).
The given ‹host› and ‹port› should refer to some chat server. The server that receives the import request should itself not fail if the server at ‹host› and ‹port› fails to respond, and it should only import a conversation if the server reports success with status code 200. As long as the HTTP status code from the server at ‹host› and ‹port› is 200, the importing server can assume that the conversation is well-formed.
The given ‹host› and ‹port› can refer to the same server that received the import request. In that case, as long as no other client is accessing the server at the same time, the conversation for ‹topic› will become two appended copies of its content.
The importing server should not return a result to the import-requesting client until the import succeeds. The result sent back to a client that makes an import request is unspecified (but must be a valid HTTP response).
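The status-code rule for /import can be sketched as a small check on the first line of the remote server's response. The helper name status_is_200 is hypothetical and not part of the handout, and the sketch assumes that the status line has already been read into a NUL-terminated buffer (for example, with one of the book's line-reading functions).

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper (not part of the handout): returns 1 if the HTTP
   status line reports code 200, and 0 otherwise, including for
   malformed lines. */
int status_is_200(const char *status_line) {
  char version[16];
  int code = 0;

  assert(status_line != NULL);
  /* A status line looks like "HTTP/1.0 200 OK\r\n". */
  if (sscanf(status_line, "%15s %d", version, &code) != 2)
    return 0;
  if (strncmp(version, "HTTP/", 5) != 0)
    return 0;
  return code == 200;
}
```

With a check like this, the importing server would append the response body to its own conversation only when the function returns 1, and otherwise leave the conversation unchanged.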

In the queries described above, ‹topic›, ‹user›, and ‹content› should all be interpreted as UTF-8 text with the usual encoding within URLs. Basically, that means you don’t have to worry about encodings as long as you use the provided parsing functions.

As an example, suppose that your server has just started running on localhost at port 8090. Then,

$ curl "http://localhost:8090/say?user=me&topic=demo&content=one"

$ curl "http://localhost:8090/say?user=you&topic=demo&content=two"

$ curl "http://localhost:8090/import?topic=demo&host=localhost&port=8090"

$ curl "http://localhost:8090/conversation?topic=demo"

should print, as a result of the last request,

me: one

you: two

me: one

you: two

User interactions and service queries operate on the same set of conversations. For example, if a user adds to a conversation through a browser, then that entry should be reported by a request through /conversation that specifies the same topic.

Availability and Client Constraints

You must make essentially no assumptions about clients of your server. It’s ok to limit the initial request line and header lines to MAXLINE characters. Otherwise, as long as clients follow the HTTP protocol and as long as machine resources are not exhausted (including memory or allowed TCP connections), your server should continue to respond to new requests in a timely manner. Your server must not run out of resources as a result of failing to free unneeded memory or close finished connections.

You should make a good effort to report errors to clients, but it’s also ok to just drop a client that makes an invalid request. It’s ok if communication errors cause the error-checking csapp.[ch] functions to print errors; the starting server includes exit_on_error(0) so that discovered errors are printed and the function returns, instead of exiting the server process.

As long as a client is well behaved, the server should not drop a client’s addition to a conversation, even if multiple clients are adding to the same conversation at the same time. For example, if three clients each add 1000 entries to a conversation concurrently, the result will be a conversation with 3000 entries. The interleaving of entries from the different clients is unspecified in that case, but each entry should be represented by consecutive characters (i.e., two entries must not be interleaved at the character level).
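One way to meet this requirement is to hold a lock for the whole duration of each append. The following is a minimal sketch under that assumption, using plain pthreads rather than the csapp wrappers. The type conversation_t and the functions conversation_init, conversation_add, and demo_three_writers are hypothetical names chosen for illustration and are not part of the handout.

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical conversation record: the text plus a lock guarding it. */
typedef struct {
  pthread_mutex_t lock;
  char *text;   /* heap-allocated, NUL-terminated once non-NULL */
  size_t len;   /* length of text, not counting the NUL */
} conversation_t;

void conversation_init(conversation_t *c) {
  pthread_mutex_init(&c->lock, NULL);
  c->text = NULL;
  c->len = 0;
}

/* Appends one complete entry while holding the lock, so entries from
   different threads are never interleaved at the character level.
   The entry is silently dropped if memory runs out. */
void conversation_add(conversation_t *c, const char *entry) {
  size_t n = strlen(entry);
  pthread_mutex_lock(&c->lock);
  char *bigger = realloc(c->text, c->len + n + 1);
  if (bigger != NULL) {
    memcpy(bigger + c->len, entry, n + 1);  /* copies the NUL too */
    c->text = bigger;
    c->len += n;
  }
  pthread_mutex_unlock(&c->lock);
}

/* Thread body: adds 1000 identical entries to the conversation. */
static void *add_many(void *arg) {
  for (int i = 0; i < 1000; i++)
    conversation_add(arg, "me: hi\r\n");
  return NULL;
}

/* Runs three writers concurrently and returns the number of entries,
   computed from the final length of the conversation text. */
size_t demo_three_writers(void) {
  conversation_t c;
  pthread_t tids[3];
  conversation_init(&c);
  for (int i = 0; i < 3; i++)
    pthread_create(&tids[i], NULL, add_many, &c);
  for (int i = 0; i < 3; i++)
    pthread_join(tids[i], NULL);
  size_t entries = c.len / strlen("me: hi\r\n");
  free(c.text);
  pthread_mutex_destroy(&c.lock);
  return entries;
}
```

A single lock per conversation (or even one lock for the whole table) is enough here, since the lab does not probe efficiency.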

There is no a priori limit on the length of user names, topic names, conversation content, or time that the server must stay running. If your server runs out of memory because the given data is too large, that’s ok. If your server crashes because it didn’t expect a user name to have 0 characters or to have 1,753 characters, that’s not ok. If your server runs out of memory because it has a leak and cannot deal with thousands of sequential requests to access a conversation topic, that’s also not ok.

We will not probe the efficiency of your server by checking whether many additions to a conversation are handled in linear time, or whether a new conversation can be added in constant time. In particular, it should work to simply store a conversation as a string and allocate a new string to append any addition to the conversation.
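The simple strategy described above, allocating a fresh string for each addition, might look like the following sketch. The function name append_entry is hypothetical. The entry layout (user, colon, space, content, carriage return, newline) follows the format given under Service Queries.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: returns a newly allocated string holding `old`
   followed by "<user>: <content>\r\n". `old` may be NULL to mean an
   empty conversation. The caller frees the result, and also frees
   `old` once it is no longer needed. Returns NULL if allocation fails. */
char *append_entry(const char *old, const char *user, const char *content) {
  assert(user != NULL && content != NULL);
  if (old == NULL)
    old = "";
  size_t old_len = strlen(old);
  /* user + ": " + content + "\r\n" */
  size_t add_len = strlen(user) + 2 + strlen(content) + 2;
  char *result = malloc(old_len + add_len + 1);
  if (result == NULL)
    return NULL;
  memcpy(result, old, old_len);
  snprintf(result + old_len, add_len + 1, "%s: %s\r\n", user, content);
  return result;
}
```

Because the result is a new allocation, the old string must be freed after the conversation table is updated, or the server will leak memory on every addition.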

When reporting a conversation back in human-readable form for user-interactive mode, your server should never generate ill-formed HTML as a result of conversation content, even if the conversation includes, say, the character < by itself. We will send arbitrary data through the service-query interface to make sure that it is reported back intact through the service-query interface, and we’ll try perverse user names and topic strings such as conversation, &, or ". The "more_string.[ch]" library provides relevant encoding and decoding functions.
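To illustrate what entity encoding has to accomplish, here is a stand-alone sketch. In the lab itself, the encoding functions from "more_string.[ch]" are the intended tool; their signatures are not shown in this handout, so the function html_escape below is a hypothetical stand-in and not the library's API.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for an entity encoder: returns a newly
   allocated copy of `s` with &, <, >, and " replaced by HTML entities,
   or NULL if allocation fails. The caller frees the result. */
char *html_escape(const char *s) {
  assert(s != NULL);
  /* "&quot;" is the longest replacement: at most 6 bytes per input byte. */
  char *out = malloc(strlen(s) * 6 + 1);
  if (out == NULL)
    return NULL;
  char *p = out;
  for (; *s != '\0'; s++) {
    const char *rep = NULL;
    switch (*s) {
    case '&': rep = "&amp;"; break;
    case '<': rep = "&lt;"; break;
    case '>': rep = "&gt;"; break;
    case '"': rep = "&quot;"; break;
    }
    if (rep != NULL) {
      size_t n = strlen(rep);
      memcpy(p, rep, n);
      p += n;
    } else {
      *p++ = *s;
    }
  }
  *p = '\0';
  return out;
}
```

Encoding should be applied only when generating HTML. The text stored in the conversation, and the text returned by /conversation, stays unencoded so that it is reported back intact.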

Support Libraries

In addition to "csapp.[ch]", the starting code in servlab-handout.zip includes "dictionary.[ch]" and "more_string.[ch]".

The "dictionary.[ch]" library provides an implementation of dictionaries with case-sensitive or case-insensitive keys. When you add to the dictionary, the given string key is copied, but the given value pointer is added as-is. When creating a dictionary, you supply a function that is used to destroy values in the dictionary when they are no longer needed. For example, if data values are allocated with malloc, supply free to be used to destroy data values when free_dictionary is called or when the value is replaced with a different one. See "dictionary.h" for more information.

The "more_string.[ch]" library provides string helpers and functions for some basic parsing and encoding. See "more_string.h" for details.

The starting code in servlab-handout.zip includes a few tests as sanity checks (but you’ll need more of your own):

The "simple.rkt" script runs some basic checks and reports any issues that it detects, but it doesn’t use the /import service query. If a server is running on localhost at port 8090, then
$ racket simple.rkt localhost 8090
reports whether problems are found.
The "trouble.rkt" script helps check how well a server responds to misbehaved clients, and it requires the server to handle concurrent connections. If a server is running on localhost at port 8090, then
$ racket trouble.rkt localhost 8090
reports its status. If the script doesn’t finish in 5-10 seconds, then something has gone wrong.
The "trouble.rkt" script may cause your server to report many connection errors. That’s fine, and you may want to redirect output to /dev/null when running this test. There’s only a problem if the "trouble.rkt" script itself reports errors.
The "stress.rkt" script throws lots of concurrent service queries at a server and uses the /import service query. If a fresh server is running on localhost at port 8090, then
$ racket stress.rkt localhost 8090
runs the stress test and prints no output if no problems are found.
The "stress.rkt" script assumes that the server state is fresh (i.e., all conversations are empty). If your server prints logging information to stdout or stderr, you’ll probably want to redirect it to /dev/null when running this test.

About HTML Forms

For the user interaction part of your server as described in User Interaction, you’ll need a basic knowledge of HTML and forms. Although it’s not covered in the book or lecture, the web is full of information on those topics, naturally.

The initial code serves a form that is implemented as

<form action="reply"
      method="post"
      enctype="application/x-www-form-urlencoded"
      accept-charset="UTF-8">
 <input type="text" name="content">
 <input type="submit" value="Send">
</form>

The action="reply" part causes a browser to send a request with the path /reply when a user clicks the “Send” button on the served page. The method="post" part causes that request to be a POST request, as opposed to a GET request. The enctype="application/x-www-form-urlencoded" part causes the POST request data to use the same format as the query part of a URL (without a leading ?). The accept-charset="UTF-8" part of the form makes Unicode (including emoji) characters work as submitted content.

Specifically, since the text field has name="content", the data sent in a POST request if the user submits “Hello there!” will be

content=Hello+there%21
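To make the encoding concrete, the sketch below decodes a single form-urlencoded value by hand. It is for illustration only: the provided parsing functions already handle this for you, and the function name form_decode is hypothetical and not part of the handout.

```c
#include <assert.h>
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

/* Returns the value of a hex digit, or -1 if c is not a hex digit. */
static int hex_value(char c) {
  if (c >= '0' && c <= '9')
    return c - '0';
  c = (char)tolower((unsigned char)c);
  if (c >= 'a' && c <= 'f')
    return c - 'a' + 10;
  return -1;
}

/* Hypothetical decoder for one application/x-www-form-urlencoded
   value: '+' becomes a space and %XX becomes the byte with hex value
   XX. A '%' that is not followed by two hex digits is copied as-is.
   Returns a newly allocated string, or NULL if allocation fails. */
char *form_decode(const char *s) {
  assert(s != NULL);
  char *out = malloc(strlen(s) + 1);  /* decoding never grows the text */
  if (out == NULL)
    return NULL;
  char *p = out;
  while (*s != '\0') {
    int hi = -1, lo = -1;
    if (*s == '+') {
      *p++ = ' ';
      s++;
    } else if (*s == '%' && (hi = hex_value(s[1])) >= 0
               && (lo = hex_value(s[2])) >= 0) {
      *p++ = (char)(hi * 16 + lo);
      s += 3;
    } else {
      *p++ = *s++;
    }
  }
  *p = '\0';
  return out;
}
```

For the example above, decoding Hello+there%21 yields the original text, since %21 is the hex code for the exclamation mark.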

To include extra information in the POST sent back to the server, you can use a hidden field in the form. For example, adding

<input type="hidden" name="demo" value="yes">

to the form above causes the sent data to be

content=Hello+there%21&demo=yes

When generating HTML that embeds an arbitrary hidden value, be sure to use entity_encode to avoid generating ill-formed HTML.

Tips

Pay attention to the ownership rules for values added to or extracted from a dictionary. See Support Libraries and the comments in "dictionary.h".
If you get seg faults or if values seem to be changing out from under you, don’t forget to try tools like Valgrind.
Hidden fields in an HTML form can be a good way to propagate a user name and conversation topic from the sign-in page, so that they’re sent back to the server as part of the form to add an entry to a conversation.
Use functions like query_encode and entity_encode to ensure that an unusual user name or topic string doesn’t create trouble when embedded in HTML or a URL.
More generally, you don’t want to be in the business of encoding, decoding, or parsing strings. If you find yourself having to parse, encode, or decode a string, check again whether functions in "more_string.[ch]" could be used for the job—maybe with a slightly different approach to the communication pattern.
You’ll need to use Pthread_create to make your server handle multiple clients concurrently, but it makes sense to add concurrency as a last step.
When you do add concurrency, you’ll need to make sure that uses of the conversation table are properly synchronized. The "stress.rkt" script mostly tries to check your server’s synchronization.
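The thread-per-connection pattern mentioned in the tips can be sketched as follows. This is a minimal illustration using plain pthread_create so that it stands alone; in the lab you would likely use the Pthread_create wrapper from "csapp.[ch]". The names handle_connection, connection_thread, and serve_in_thread are hypothetical, and handle_connection is only a placeholder for real request handling.

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

/* Placeholder for the real request handler: sends a fixed response. */
static void handle_connection(int connfd) {
  const char *reply = "HTTP/1.0 200 OK\r\n\r\n";
  ssize_t ignored = write(connfd, reply, strlen(reply));
  (void)ignored;
}

/* Thread body: takes ownership of the heap-allocated descriptor. */
static void *connection_thread(void *arg) {
  int connfd = *(int *)arg;
  free(arg);
  handle_connection(connfd);
  close(connfd);  /* always close, or descriptors leak */
  return NULL;
}

/* Starts a detached thread to serve connfd. Returns 0 on success and
   -1 on failure; on failure, connfd is closed before returning. */
int serve_in_thread(int connfd) {
  pthread_t tid;
  /* Pass the descriptor through the heap so that the accept loop can
     reuse its own variable without racing against the new thread. */
  int *fdp = malloc(sizeof(int));
  if (fdp == NULL) {
    close(connfd);
    return -1;
  }
  *fdp = connfd;
  if (pthread_create(&tid, NULL, connection_thread, fdp) != 0) {
    free(fdp);
    close(connfd);
    return -1;
  }
  pthread_detach(tid);  /* nobody joins; resources are freed on exit */
  return 0;
}
```

An accept loop would call serve_in_thread with each newly accepted descriptor and then go straight back to accepting, so that a slow client ties up only its own thread.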
