Chapter 9
Socket Programming

Many Internet applications use a client/server model for communication: a server listens for connections, and a client initiates connections to the server. How are these client and server programs implemented? In this chapter you will learn the basic programming constructs, called sockets, used to create client and server programs. You can use these constructs to implement your own client/server application. This chapter explains sockets using the C programming language; example C source code is given in Appendix D. Sockets are also available in other programming languages, and Appendix E gives example Python source code. All the source code is available for download via http://ict.siit.tu.ac.th/~sgordon/netlab/source/.

9.1 Programming with Sockets

Sockets are programming constructs used to communicate between processes. Sockets can be used with several types of systems; the type of main interest to us is the Internet socket (the other commonly used type is the Unix domain socket).

Sockets for Internet programming were created in early versions of Unix (written in C code). Due to the popularity of Unix for network computing at the time, these Unix/C-based sockets became quite common. The same concept has since been extended to other languages and other operating systems. So although we use C code and a Unix-based system (Ubuntu Linux), the principles can be applied to almost any computer system.

There are two main Internet socket types, corresponding to the two main Internet transport protocols:

  1. Stream sockets use the Transmission Control Protocol (TCP) to communicate. TCP is stream-oriented, sending a stream of bytes to the receiver. It is also a reliable transport protocol, which means it guarantees that all data arrives at the receiver, and arrives in order. TCP starts by setting up a connection (we have seen the 3-way handshake in other labs), and then sends data between sender and receiver. TCP is used for most data-oriented applications like web browsing, file transfer and email.
  2. Datagram sockets use the User Datagram Protocol (UDP) to communicate. UDP is an unreliable protocol. There is no connection setup and there are no retransmissions. The sender simply sends a packet (datagram) to the receiver, and hopes that it arrives. UDP is used for most real-time oriented applications like voice over IP and video conversations.

In this lab we are dealing only with Stream (TCP) sockets.

The basic procedure is shown in Figure 9.1. The server must first create a socket, then associate or bind an IP address and port number to that socket. Then the server listens for connections.


Figure 9.1: Socket communications

The client creates a socket and then connects to the server. The connect() system call from the client triggers a TCP SYN segment from client to server.

The server accepts the connection from the client. The accept() system call is a blocking function: when the program calls accept(), the call does not return until the server receives a TCP SYN segment from a client and completes the 3-way handshake.

After the client returns from the connect() system call, and the server returns from the accept() system call, a connection has been established. Now the two can send data.

Sending and receiving data is performed using the write() and read() functions. read() is a blocking function: it returns only when the socket has received data (or when the connection is closed or an error occurs). You (the application programmer) must correctly coordinate reads and writes between the client and server. If a client calls read() but the server never sends any data and keeps the connection open, then the client will wait forever!

9.1.1 Servers Handling Multiple Connections

It is common for a server to be implemented such that it can handle multiple connections at a time. The most common way to do this is for a main server process to listen for connections, and when a connection is established, to create a child process to handle that connection (while the parent process returns to listening for connections). In our example, we use the fork() system call.

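The fork() system call creates a new process, which is the child process of the current process. Both the parent and child process continue execution at the statement following the call to fork(). fork() returns a process ID (pid), which may be:

  - negative (-1): the call failed and no child process was created;
  - zero: the code is running in the child process;
  - positive: the code is running in the parent process, and the value is the process ID of the child.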

Hence we can use the process ID returned from fork() to determine what to do next: the parent process (pid > 0) will end the current loop iteration and go back to waiting for connections, while the child process (pid = 0) will perform the data exchange with the client.

9.1.2 Further Explanation

You should read the source code for server.c, and then the source code for client.c. The comments contain further explanations of how the socket communication is performed.

The example code for client.c and server.c came from http://www.cs.rpi.edu/courses/sysprog/sockets/sock.html. You may read through the details on this web page.

Most of the socket system calls are described in detail in their individual man pages. You should use the man pages for finding out further details of each function. Note that you may have to specify the section of man pages to use (which is section 2, the System Calls section):

$ man -S2 accept 
ACCEPT(2)             Linux Programmer's Manual           ACCEPT(2) 
 
NAME 
     accept - accept a connection on a socket 
...

Note 3 (Unix man pages). Man pages in Unix are grouped into sections. A command, file or function may exist with the same name in different sections. Execute man man to get a list of the sections. For example, accept is a System Call (Section 2) as well as a System Administration command (Section 8). Executing man accept will give you the manual page for the System Administration command. To see the manual page for the System Call, you must explicitly specify the section: man -S2 accept. If you don’t know the section you are looking for, use the -k option to search, for example man -k accept.

9.2 Tasks

For the following tasks you should capture the tests using Wireshark to understand the relationship between the applications and the network communications. For new programs you create, you must demonstrate the program and source code to the instructor.

Task 9.1. Download, compile and test the provided client/server sockets programs.

Task 9.2. Modify the client/server programs to allow exchange of multiple messages. To do so, create a “fake” login mechanism. The server should ask the client for a username by sending a username message, and then the client will send the username to the server. Then the server will prompt for a password by sending the password message, and then the client will send the password to the server. Finally, the server will check if the username and password match an already known username/password pair, and send a response back to the client. Then both the client and server can finish (of course the server should still handle more connections). The output of the interaction should look as below:

     Client                               Server

1.   Username:
2.   <user types in username, e.g. "X">
3.                                        Client login with username "X"
4.   Password:
5.   <user types in password, e.g. "Y">
6.                                        Client entered password "Y"
7.                                        Username and password are correct.
8.   You are now logged in.

Task 9.3. Using the supplied client/server sockets programs, implement a third program: a proxy server.