Issue 15, Fall 1999

Client-Server Applications

Lincoln D. Stein

Packages Used

Chatbot::Eliza:                   http://www.perl.com/CPAN
IO::Socket, IO::Handle:     bundled with Perl

Ah, for the good old days, when real programmers used vi, networking software was written in C, and monolithic client/server applications ruled the Internet. Nowadays, of course, it's easy to slap up a network application in a matter of minutes with a little CGI scripting, and if you are using a Web authoring tool like Microsoft FrontPage, you can even write a serviceable network application without knowing any programming whatsoever.

Still, there are times when the Web paradigm breaks down, and you need an application that has the immediacy and interactivity of an old-style client/server application, where custom software on both sides of the connection exchanges data using protocols designed specifically for the purpose at hand. It's not generally appreciated how easy it is to write client/server applications in Perl. In this column, I'll show you how to do it.

Our running example uses the Chatbot::Eliza module, John Nolan's marvelous pure-Perl clone of Joseph Weizenbaum's classic psychotherapist simulation and the subject of his TPJ #9 article. Chatbot::Eliza works very simply. It accepts a line of input from the user, transforms it according to a clever set of transformational rules, and echoes it back to the user in the form of a question. The effect is a lot like talking to a Freudian psychoanalyst, except that the conversation never seems to go anywhere. (But do real-life psychoanalysis sessions go anywhere either? What does that mean to you?)

Although Chatbot::Eliza can be customized extensively, most people will use its command-line interface, which can be invoked in just four lines of code (see Listing 1). The program begins by loading the Chatbot::Eliza module. Line 3, which puts STDOUT into autoflush mode, isn't strictly necessary for this example, but comes in handy when we turn the script into a server in the next section.

Listing 1

 0 #!/usr/bin/perl
 1 # file eliza.pl
 2 use Chatbot::Eliza;

 3 $| = 1;
 4 my $bot = Chatbot::Eliza->new;
 5 $bot->command_interface;

The script now creates a new Chatbot::Eliza object, which is stored in the $bot variable, and invokes the object's command_interface() method. This method now takes over the show, reading user input from STDIN, and writing transformed text to STDOUT. When you run the script, the conversation goes something like this:

  % eliza.pl
  Eliza:    Is something troubling you?
  you:    Yes, my mother is getting on my nerves.
  Eliza:    What else comes to mind when you think of your mother?
  you:    I think of my father, of course.
  Eliza:    Your father?
  you:    Yes, the husband of my mother.
  Eliza:    Your mother?
  you:    Yes, the wife of my father.
  Eliza:    Your father?
  you:    Yes, the father of my brother.
  Eliza:    Tell me more about your family.
  you:    I have a dog too.
  Eliza:    That's quite interesting.
  you:    quit
  Eliza:    Life is tough. Hang in there! 

Eliza may seem more insightful when you try it.

What if we wanted to make Chatbot::Eliza available via the Internet? You could probably write a CGI wrapper around the module and use text entry fields to input lines of text, but this wouldn't be entirely straightforward. For one thing, there would be the overhead of receiving and rendering the entire HTML page for each line of the conversation, and the rendering delay would make the conversation feel less spontaneous. For another thing, the Chatbot object maintains a certain amount of state. That is, it remembers portions of the session, allowing it to return to previous topics and restart stalled conversations. However, Web sessions are not stateful, and in order to create a CGI interface to Chatbot::Eliza, you'd have to maintain a pool of Chatbot::Eliza objects in memory or on disk, and associate them with users so that one user doesn't end up in the middle of another user's psychoanalysis session. This can be done, but it's not short or elegant.

Using the inetd Super-Daemon

In contrast, we can turn Listing 1 into a server without changing a line of code. If you are using a Unix (or Linux) system, the inetd super-daemon will do all the dirty work for you. Just add the following line to the bottom of the /etc/inetd.conf configuration file:
  12000 stream tcp nowait lstein /tmp/eliza/eliza.pl eliza.pl

You must have superuser access to edit this file. Replace the user name lstein with your own login name, and adjust the path to the eliza.pl script to reflect its actual location.

When you're done editing the file, restart the inetd daemon by sending it a HUP signal. You can do this by finding its process ID (PID) using the ps command and then using the kill command to send the signal. For example:

  % ps aux | grep inetd
  root    657  0.0  0.8  1220  552 ?       S    07:07   0:00 inetd
  lstein  914  0.0  0.5   948  352 pts/1   S    08:07   0:00 grep inetd
  % kill -HUP 657

Two shortcuts which work on many Linux systems are shown below:

  % kill -HUP `cat /var/run/inetd.pid`
  % killall -HUP inetd

Now you can use the vanilla telnet program to talk to Eliza. telnet acts as the client, and eliza.pl is the server:

 % telnet localhost 12000
 Trying 127.0.0.1...
 Connected to localhost.
 Escape character is '^]'.
 Eliza:  Hello, I am a computer program. 
 you:    I am a human being.
 Eliza:  How long have you been a human being?
 you:    About 39 years.
 Eliza:  That is interesting. Please continue.
 you:    It started when I was born...
 Eliza:  Why do you tell me you were born now?
 you:    Because you asked.
 Eliza:  We were discussing you -- not me.
 you:    bye
 Eliza:  Goodbye. It was nice talking to you.
 Connection closed by foreign host. 

If you aren't on a Unix system, or don't have superuser access, don't worry. A Chatbot::Eliza server that will work for you is coming up soon.

Now let's look at that inetd.conf line in more detail. It's divided into seven fields delimited by whitespace (tabs or spaces):

12000: This is the port number that the server will listen on, and can be any number between 1024 and 65535. Numbers from 1 to 1023 are reserved for use by standard services like email and the Web. Port numbers above this range can be used by any program, but only one program can use a given port at a time. Be sure to check that your system doesn't already use a particular port for some service before adding a new server (you can use the netstat program for this purpose). 12000 is usually a pretty safe bet. This number can be replaced by a symbolic name taken from the file /etc/services.

stream: This field specifies the server type, and can be either stream, for connection-oriented services that send and receive data as continuous streams, or dgram, for services that send and receive short, self-contained messages. Any program that reads STDIN and writes STDOUT is a stream-based service, so we use stream here.

tcp: This specifies the communications protocol, and may be either tcp or udp (many systems also support a few more esoteric protocols, but we won't discuss them here). The TCP protocol is a connection-oriented, reliable protocol that is used for stream-type communication. UDP is used for message-oriented datagrams. Stream-based services will use tcp.

nowait: This tells inetd what to do after launching the server program. It can be wait, to tell inetd to wait until the server is done before launching the program again to handle a new incoming connection, or nowait, which allows inetd to launch the program multiple times to handle several incoming connections at once. The most typical value for stream-based services is nowait, since communications sessions may be minutes or hours long. The implication of this value, however, is that there may be several copies of the script running at once. Some versions of inetd allow you to put a ceiling on this value.

lstein: This is the user name that the server program will run under. As noted above, replace it with your own login name.

/tmp/eliza/eliza.pl: This is the full path to the program. /tmp is not the best place to put executables, since many systems clear /tmp at boot time, but it suffices for tests and demos like this one. You'll want to choose a more stable directory, such as /usr/local/bin, or /usr/local/sbin.

eliza.pl: The seventh and subsequent fields are command-line arguments for the script. This can be any number of space-delimited command-line arguments and switches. By convention, the first of these is the name of the program itself. You can use the actual script name, as shown here, or make up your own name, such as "elizabot". This value will show up in the script in the $0 variable. Other command-line switches will appear in the @ARGV array in the usual manner.
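
To make the field layout concrete, here is a small sketch that splits the example entry into its seven parts. The field labels (service, socket_type, and so on) and the parse_inetd_line() helper are names I made up for illustration; inetd itself has no such API.

```perl
use strict;
use warnings;

# Field labels are my own shorthand, not official inetd terminology.
my @FIELD_NAMES = qw(service socket_type protocol wait_mode user program args);

# Split one inetd.conf entry into its seven whitespace-delimited fields.
# The limit of 7 keeps everything after the program path together in "args".
sub parse_inetd_line {
    my ($line) = @_;
    my %entry;
    @entry{@FIELD_NAMES} = split ' ', $line, scalar @FIELD_NAMES;
    return \%entry;
}

my $entry = parse_inetd_line(
    '12000 stream tcp nowait lstein /tmp/eliza/eliza.pl eliza.pl');

# Print each label next to the value taken from the example line.
printf "%-12s %s\n", $_, $entry->{$_} for @FIELD_NAMES;
```

Because the split is limited to seven pieces, any extra switches after the program name stay together in the last field, which mirrors how inetd hands them to the script.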

inetd allows you to take any program written in Perl (or another language) and turn it into a server. The main restriction is that the program must use standard input, standard output, and standard error for its interface. Fancy stuff, such as Curses-based graphics, will probably not work. There are very few gotchas, the main one being output buffering issues. By default, when Perl detects that STDOUT is not connected directly to the user's screen, it will buffer its print() statements to make them more efficient. This is okay when Perl is writing to a file, but not okay when it's writing to the network under the control of inetd. The Chatbot::Eliza object will write its initial greeting, but because the greeting is short it just gets buffered on the server side of the connection and the user never sees it.

The solution to this problem is simple. Just turn on autoflushing by setting the $| global to true.
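
As a quick sketch, these are two common ways to turn buffering off. The greeting text is borrowed from the session shown earlier; nothing else here is specific to Chatbot::Eliza.

```perl
use strict;
use warnings;
use IO::Handle;    # provides the autoflush() method on filehandles

# Classic idiom: $| applies to the currently selected handle,
# which is STDOUT unless select() has been called.
$| = 1;

# Method form: names the handle explicitly, so it works just as well
# for STDERR or for any other filehandle.
STDOUT->autoflush(1);
STDERR->autoflush(1);

# $| now reports that the selected handle (STDOUT) is unbuffered.
my $stdout_unbuffered = $|;

# This line is sent right away instead of sitting in a buffer.
print "Eliza:\tIs something troubling you?\n";
```

The method form is a little longer, but it avoids any doubt about which handle is currently selected.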

A Standalone Server

What if you don't have superuser access, or are using a system that doesn't support the inetd super-daemon? Or what if the script has to load a lot of modules at startup time, making the launch time delay unacceptable for your application?

Under these circumstances, you can write a standalone server that does all the networking stuff itself. Thanks to Graham Barr's wonderful IO::Socket module, the code is not all that much more complicated than the original script.

Before we walk through the code, some socket theory. Much of the Internet runs across Berkeley sockets, a networking API (application programming interface) that was part of one of the early Berkeley Software Distribution (BSD) releases of Unix. A socket is a communications endpoint that can be connected to another socket on the same machine or somewhere else on the Internet. There are different types of sockets corresponding to different network protocols and each having a unique addressing scheme. The most familiar kind, the TCP/IP socket, uses an address consisting of an IP address and a port number.

Once connected, data sent to the socket at one end appears out the other end, and vice versa. From Perl's point of view, sockets are filehandles, just like the more conventional ones that are connected to files and pipes. This makes writing networked applications extremely straightforward.
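
To see the "two connected ends" idea without involving the network at all, here is a minimal sketch that asks the operating system for a pair of sockets that are already connected to each other. It assumes a Unix-like system where socketpair() is available; the variable names are mine.

```perl
use strict;
use warnings;
use IO::Socket;

# Ask the OS for two sockets that are already connected to each other.
my ($left, $right) = IO::Socket->socketpair(AF_UNIX, SOCK_STREAM, PF_UNSPEC)
    or die "socketpair failed: $!";
$_->autoflush(1) for $left, $right;

# Whatever is printed to one end can be read from the other,
# using the same print() and <> operators that work on files.
print $left "hello from the left\n";
my $heard_on_right = <$right>;

print $right "hello from the right\n";
my $heard_on_left = <$left>;

print "right end read: $heard_on_right";
print "left end read:  $heard_on_left";
```

A TCP/IP connection behaves the same way once it is established; the only difference is how the two ends find each other.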

Consider this complete networking client:

  use IO::Socket;
  my $s = IO::Socket::INET->new( PeerAddr => 'phage.cshl.org',
                               PeerPort => 'daytime');
  die "Can't connect: $@" unless $s;
  print <$s>; 

The first line loads the IO::Socket module, defining a number of new object classes for dealing with sockets and a number of handy constants. The second line attempts to create a new IO::Socket object. There are currently two subclasses of IO::Socket. One, called IO::Socket::INET, is used for Internet communications using TCP/IP. The other, used for communications between two processes on the same machine, is called IO::Socket::UNIX.

As we want to make an Internet connection, we attempt to create an IO::Socket::INET object by calling its new() method. new() recognizes multiple named arguments. In this case, we need just two: PeerAddr gives the name of the remote host to contact (in this case "phage.cshl.org") and PeerPort gives the port number or symbolic name of the service to connect to, in this case the "daytime" service that runs on many Unix machines. new() attempts to connect to the indicated machine and port. If successful, it returns a new IO::Socket object. Otherwise new() returns an undefined value and leaves an error message in $@.

Once created, a socket object looks and feels a lot like a read/write filehandle. You can use it as the argument to print(), or read lines from it using the angle-bracket operator (<>). Socket objects also support a large number of input/output methods inherited from the IO::Handle base class. For example, you can call $socket->print() to transmit some data across the connection, and you can call $socket->getline() to fetch a line of text.

The daytime service waits for incoming connections and then transmits its idea of the current day and time. We read whatever text it sends us using the <> operator, and immediately print it.

If you run the program, you'll see this (adjusted for the correct time, of course):

  % daytime.pl
  Thu Sep 16 09:50:48 1999 
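
If you don't have a daytime server handy, the same exchange can be tried entirely on the local machine. The sketch below is my own stand-in, not part of the original example: it opens a throwaway listening socket on the loopback address, connects to it, and uses the method-style print() and getline() calls described above. A port number of 0 asks the operating system to pick any free port.

```perl
use strict;
use warnings;
use IO::Socket;

# A throwaway listening socket on the loopback interface.
my $listener = IO::Socket::INET->new(
    LocalAddr => '127.0.0.1',
    LocalPort => 0,          # 0 means "any free port"
    Listen    => 1,
    Proto     => 'tcp',
) or die "Can't listen: $@";
my $port = $listener->sockport;

# The client connects exactly as in the daytime example.
my $client = IO::Socket::INET->new(
    PeerAddr => '127.0.0.1',
    PeerPort => $port,
    Proto    => 'tcp',
) or die "Can't connect: $@";

# The server side of the connection plays the part of the daytime service.
my $server_side = $listener->accept or die "accept failed: $!";
$server_side->print(scalar(localtime), "\n");
$server_side->close;

# Method-style read on the client side.
my $timestamp = $client->getline;
print "Server says: $timestamp";
```

Both ends live in one process here only to keep the demonstration short; a real client and server would be separate programs.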

Servers are not much harder to write. Listing 2 gives the source code for eliza_server.pl, a network-ready pseudo-psychoanalyst. It begins by importing the Chatbot::Eliza and IO::Socket modules, and brings in the WNOHANG constant from the POSIX module (used by the CHLD signal handler, see below).

 
Listing 2

  0  #!/usr/bin/perl
  1  # file: eliza_server.pl
  2  use Chatbot::Eliza;
  3  use IO::Socket;
  4  use POSIX 'WNOHANG';
    
  5  use constant PORT => 12000;

  6  # signal handler for child die events
  7  $SIG{CHLD} = sub { while ( waitpid(-1,WNOHANG)>0 ) { } };
        
  8  my $listen_socket = IO::Socket::INET->new(LocalPort => PORT,
  9                                           Listen => 20,
 10                                           Proto  => 'tcp',
 11                                           Reuse   => 1);
 12  die "Can't create a listening socket: $@" unless $listen_socket;
 13  warn "Server ready.  Waiting for connections...\n";   
    
 14  while (my $connection = $listen_socket->accept) {
 15      die "Can't fork: $!" unless defined (my $child = fork());
 16      if ($child == 0) {
 17          $listen_socket->close;
 18          interact($connection);
 19          exit 0;
 20      }
 21  } continue {
 22          $connection->close;
 23  }
    
 24  sub interact {
 25      my $sock = shift;
 26      STDIN->fdopen($sock,"r")  || die "Can't reopen STDIN: $!";
 27      STDOUT->fdopen($sock,"w") || die "Can't reopen STDOUT: $!";
 28      STDERR->fdopen($sock,"w") || die "Can't reopen STDERR: $!";
 29      STDOUT->autoflush(1);
 30      Chatbot::Eliza->new->command_interface;
 31  }
 

The code then defines a constant containing the port number to run on. We use 12000 again here. Be careful if you've already installed the inetd version of the script, because the two can't share the same port. You should either deactivate the inetd configuration line (by commenting it out and sending inetd a HUP signal), or change the constant to an unused port.

Line 7 installs a handler for the CHLD signal. I will explain this technical detail after the main code walkthrough.

The real fun begins in line 8, where we create a new IO::Socket object to accept incoming connections. Again we call the IO::Socket::INET class's new() method, but the arguments are quite different. Instead of providing new() with PeerAddr and PeerPort arguments, we hand it LocalPort and Listen arguments. LocalPort tells new() that it is to "bind to" (associate itself with) local port 12000, and Listen tells new() that the socket will be used to accept incoming connections. The numeric argument to Listen specifies how many incoming requests can be queued up while waiting for the server to call accept() (discussed below). For this presumably low-volume service, a queue of 20 pending connections is a very generous assumption! The other two arguments are not strictly necessary. Proto specifies the communications protocol, in this case tcp. Since stream-based TCP servers are much more common than message-based UDP servers, IO::Socket::INET's new() method will default to TCP unless otherwise specified.

The Reuse argument tells new() that it is okay to reuse the port number if the program is killed and immediately restarted. Ordinarily, the operating system will impose a small delay of about 90 seconds between the time a socket is killed and the time its port can be reused. During this time, new() will be unable to create a new socket. This delay is protection against one program accidentally inheriting another program's delayed incoming connections. However, this protection is irrelevant when it's the same program opening the socket, so servers generally set Reuse to a true value in order to disable this delay.

Another optional argument, not used in this example, is LocalAddr, which takes a local hostname or IP address. In the event that your machine has more than one network interface (or multiple IP addresses associated with the same interface), you can use LocalAddr to choose which interface the socket should listen to. If not specified, the socket will accept incoming connections addressed to any of your machine's IP addresses.
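
Here is a hedged sketch of a listening socket that uses these optional arguments. It binds to the loopback address only, and it asks for port 0 (any free port) purely so the demonstration can't collide with a server you already have running; a real server would use a fixed port such as 12000. I've spelled the reuse option ReuseAddr, which newer versions of IO::Socket::INET document as the preferred name for Reuse.

```perl
use strict;
use warnings;
use IO::Socket;

my $listen_socket = IO::Socket::INET->new(
    LocalAddr => '127.0.0.1',  # loopback only; omit to listen on every interface
    LocalPort => 0,            # 0 asks the OS for any free port (demo only)
    Listen    => 20,
    Proto     => 'tcp',
    ReuseAddr => 1,            # newer spelling of the Reuse argument
) or die "Can't create a listening socket: $@";

# Ask the socket what it actually ended up bound to.
my $bound_host    = $listen_socket->sockhost;
my $bound_port    = $listen_socket->sockport;
my $reuse_enabled = $listen_socket->sockopt(SO_REUSEADDR);

printf "listening on %s:%d (address reuse %s)\n",
    $bound_host, $bound_port, $reuse_enabled ? 'on' : 'off';
```

Clients on other machines won't be able to reach a socket bound this way, which can be handy while you are still testing.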

If something goes wrong, new() returns undef and places a description of the error in the $@ global. We die() with a suitable message (line 12). Otherwise, we store the returned socket object in the variable $listen_socket. Technically, the socket returned by this call is a "listen socket", as opposed to the "connected socket" that was returned by new() in the short example earlier.

The loop in lines 14 through 23 is where all the action happens. Multiple clients are going to connect to our server, and we must service each one in turn. Since we don't know in advance when a connection is going to come in, the most efficient way to do this is to go to sleep and let the operating system tell us that a new connection is ready for servicing. The accept() method does this. It suspends the process until an incoming connection is attempted, at which point it completes the connection and returns (to the server) a brand new socket object connected to the remote client. The server uses this connected socket to talk to the client. When it's finished, it closes the connected socket. Meanwhile, the original listening socket is still available to accept() new incoming connections.

At the top of the loop (line 14), we call the listen socket's accept() method, and some time later it returns a connected socket. We could now go ahead and work with the connected socket, but there would be a slight problem. While we were working with the connected socket, other clients might be trying to connect, and wouldn't get an answer from the server until it called accept() again. Our server wants to call accept() again as soon as possible - preferably at the same time that it's servicing the current connection.

To do this requires the server to walk and chew gum at the same time. On Unix systems, fork() is the way to do multiprocessing (we'll turn to Windows real soon now). The fork() call spawns a duplicate process called the "child." The child is identical in every respect to its parent, but with one difference. In the parent process, the fork() call's return value is the process ID of the child. In the child process, fork() returns numeric 0. In case of an error, fork() returns undef. The strategy here is for the child process to handle the task of talking to the connected client, while the parent goes back to the top of the loop and calls accept(). This way many clients can connect simultaneously; each will have a dedicated child process to talk to.
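
The three possible return values of fork() are easy to see in isolation. This is a minimal sketch for Unix-like systems; the exit status of 7 is an arbitrary value I picked so the parent has something to look at.

```perl
use strict;
use warnings;

my $pid = fork();

# undef: the fork failed and no child was created.
die "Can't fork: $!" unless defined $pid;

if ($pid == 0) {
    # 0: we are in the child. Do the child's work, then leave.
    exit 7;
}

# Anything else: we are in the parent, and $pid is the child's process ID.
waitpid($pid, 0);            # wait for this particular child to finish
my $status = $? >> 8;        # the high byte of $? holds the exit status
print "parent $$ saw child $pid exit with status $status\n";
```

In the server, the child's "work" is the whole Eliza conversation, and the parent returns to accept() instead of waiting.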

Line 15 calls fork() and saves the result code to the variable $child. If $child is undefined, then the fork() failed for some reason, and the server dies. Otherwise, it looks at the return value. If the value is equal to numeric 0, then the server knows it's in the child process. The child won't be calling accept() again, so it doesn't need the listen socket, so it closes it by calling the socket's close() method. While this closing is not strictly necessary, in network communications it's always a good idea to tidy up unneeded resources, and it avoids the possibility of the child inadvertently trying to perform operations on the listen socket.

The child now calls a subroutine named interact(), passing it the connected socket object. interact() manages the Eliza conversation and returns when the user terminates the connection (by typing bye, for example). After interact() returns, the child process itself terminates by calling exit().

Meanwhile, back in the parent process, the main loop closes the connected socket by calling its close() method (line 22) and goes back to the top of the loop to accept more connections. Explicitly closing the connected socket in this way is good practice because it avoids the possibility of the parent inadvertently interfering with the child's I/O.

The actual input/output operations are performed in the interact() subroutine, lines 24-31. There's a slight problem with wiring Chatbot::Eliza up to the network, because Eliza's command_interface() method is hardwired to read and write to STDIN/STDOUT, whereas we want it to communicate via the connected socket. We could fix this by reaching into the chatbot's published lower-level methods and calling the routines to print the prompts and transform strings ourselves. This isn't much work, but there's an even lazier way to do it. We simply replace the default STDIN and STDOUT filehandles with the connected socket by reopening them.

When we loaded IO::Socket, it also brought in methods from its parent class, IO::Handle. Among these methods is a filehandle method called fdopen(), which allows you to do a brain transplant on any previously opened filehandle, including the standard ones. Essentially, fdopen() closes the existing filehandle and reopens it using information from another filehandle that you give it. We call fdopen() three times, once each for STDIN and STDOUT, and once for STDERR for good measure. Each time we call fdopen(), we pass the socket object and a symbolic file access code. STDIN is reopened for reading with a mode of r, while STDOUT and STDERR are both reopened for writing with a mode of w. Now, almost as if by magic, writing to STDOUT and STDERR will send data flying down the socket, and reading from STDIN will perform a read on the socket.
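
The brain transplant is easier to believe when you watch it happen. In this sketch (Unix-like systems only, and using a socketpair in place of a real network connection), a child process reopens its STDOUT onto a socket and then uses a perfectly ordinary print(). The parent reads the text from the other end of the socket.

```perl
use strict;
use warnings;
use IO::Socket;

# A connected pair of sockets stands in for a network connection.
my ($parent_end, $child_end) =
    IO::Socket->socketpair(AF_UNIX, SOCK_STREAM, PF_UNSPEC)
    or die "socketpair failed: $!";

defined(my $pid = fork()) or die "Can't fork: $!";

if ($pid == 0) {
    # Child: point STDOUT at the socket, as interact() does.
    $parent_end->close;
    STDOUT->fdopen($child_end, "w") or die "Can't reopen STDOUT: $!";
    STDOUT->autoflush(1);
    print "this line was written with a plain print()\n";
    exit 0;
}

# Parent: the child's "standard output" arrives on our end of the socket.
$child_end->close;
my $line = $parent_end->getline;
waitpid($pid, 0);
print "parent read from socket: $line";
```

Only the child's STDOUT is affected, because the reopening happens after the fork.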

The last detail is to call STDOUT's autoflush() method. This is equivalent to setting $| to a true value, but is a bit easier to understand. At this point, we create a new Chatbot::Eliza object and invoke its command_interface() method.

When you run this program, it will print out a message saying that it's waiting for connections, and then it will appear to hang. Go to a second command-line window, telnet to port 12000, and talk to the psychiatrist for a while. Leaving the session open, go to another window, and telnet to the server again. The server should be able to handle both sessions simultaneously. If you go to a fourth window and run the ps command, you should see three copies of the script running. One is the parent server waiting in the accept() method for incoming connections. The other two are the children spawned to deal with the two running sessions.

To stop the server, go back to the original window and press the interrupt key (usually control-C).

Now to explain the CHLD signal handler. Whenever a parent process forks a child, and the child exits before the parent does, the Unix operating system gives the parent a chance to examine the status code from the child to see if it exited normally or as the result of an error. The CHLD signal is used to alert the parent that something has happened to its child, and the wait() and waitpid() calls are used to retrieve the status code, a process known as reaping. In this particular case we don't care about our children's exit status codes, but the operating system doesn't know that. If a child exits and the parent doesn't wait() on it, a mummified version of the child process will hang around in the system process table until the parent either waits on it, or the parent exits. These so-called zombie processes can take up system resources and are generally undesirable.

The general technique for avoiding this problem is to install a CHLD signal handler in the parent. The handler calls wait() or waitpid() to retrieve the status code of the exited process and allow the zombie to go to its eternal reward. For a variety of reasons involving the handling of stopped processes and the rare event in which two children exit at nearly the same moment, the best technique is to call waitpid() in a tight loop with a first argument of -1 and a second argument of WNOHANG. Together these arguments tell waitpid() to reap the next child that's available, and prevent the call from blocking if there happens to be no child ready for reaping. The handler will loop until waitpid() returns a negative number or zero, indicating that no more reapable children remain.
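
Here is a small sketch of the reaping loop at work, again for Unix-like systems. It uses the same handler as the server but adds a counter ($reaped, my own addition) so that you can see how many children were collected. The ten-second deadline is just a safety net for the demonstration.

```perl
use strict;
use warnings;
use POSIX 'WNOHANG';

my $reaped = 0;

# Reap every child that is ready, without blocking if none is.
$SIG{CHLD} = sub {
    while ((my $pid = waitpid(-1, WNOHANG)) > 0) {
        $reaped++;
    }
};

# Launch three children that exit immediately.
for (1 .. 3) {
    defined(my $pid = fork()) or die "Can't fork: $!";
    exit 0 if $pid == 0;
}

# Give the signals a moment to arrive. sleep() returns early when a
# signal comes in, so this loop usually finishes quickly.
my $deadline = time + 10;
sleep 1 while $reaped < 3 && time < $deadline;

print "reaped $reaped children\n";
```

The inner loop matters: if two children exit at nearly the same moment, the handler may be called only once, and a single waitpid() would leave one zombie behind.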

A Threaded Server

You Windows and MacPerl users are probably fretting at this point, because neither of the server implementations I've shown so far will run on your platforms. You can't use inetd because there isn't one built into either operating system (though I understand there are plug-in replacements available on the Internet which you could try), and neither supports fork().

Although I can offer no solace for MacPerl developers, Windows users are in luck. (MacPerl developers can use a technique known as I/O multiplexing to simulate the behavior of multithreading. This technique involves keeping track of multiple socket objects simultaneously and using select() to distinguish which ones are ready for reading and writing. It's a bit tedious to write such a server, but the reward is a system that runs more efficiently than either the fork() or the multithreading techniques. However, this is outside the scope of the article. For the gory details, see the W. Richard Stevens book referenced at the end of this article.) Recent ports of Perl can take advantage of multithreading capabilities, which in some ways are an improvement over fork(). In this section, I will show the standalone server rewritten to use threading. It's worth emphasizing here that Perl's multithreading facilities are still unstable. If you are on a Unix system you're safer using fork() rather than threads. Windows users must use multithreading for server applications because other options are limited.

Listing 3 gives the code for the multithreaded version of the server. You must be using Perl 5.005_03 or higher for this program to work, and it must have been compiled with thread support. It starts out similarly to the forking version, except that it brings in the Thread module and my own derivative of the Chatbot::Eliza module called Chatbot::Eliza::Server (line 4). The rationale for using this derivative class will be explained momentarily.

 
Listing 3

  0  #!/usr/bin/perl
  1  # file: eliza_thread.pl
  2  use IO::Socket;
  3  use Thread;
  4  use Chatbot::Eliza::Server;
    
  5  use constant PORT => 12000;
  6  my $listen_socket = IO::Socket::INET->new(LocalPort => PORT,
  7                                        Listen    => 20,
  8                                        Proto     => 'tcp',
  9                                        Reuse     => 1);
 10  die "Can't create a listening socket: $@" unless $listen_socket;
    
 11  warn "Listening for connections...\n";
    
 12  while (my $connection = $listen_socket->accept) {
 13    my $t = Thread->new(\&interact,$connection) || 
                           die "Can't start a thread: $!";
 14    $t->detach;
 15  }
    
 16  sub interact {
 17    my $handle = shift;
 18    Chatbot::Eliza::Server->new->command_interface($handle,$handle);
 19    $handle->close();
 20  }
 

The listening socket is created exactly as before (lines 6-9), but the accept loop and the interact() subroutine are both rather different. After accept() returns a new connected socket, the code creates a new thread of execution by calling the Thread class's new() method. The arguments to new() are a subroutine reference and an optional list of arguments to pass to it. In this case, we pass Thread->new() a reference to the interact() subroutine and the connected socket object.

Perl launches a new thread of execution and immediately calls interact(), returning a new Thread object which we'll call the session thread. When interact() is finished, the thread terminates. Back in the main thread, the Thread object can be used to monitor and control the session thread's activities.

Threads can either be attached or detached. If attached, they hang around indefinitely after they've finished execution waiting for the main thread to call their join() method. This allows threads to return a result or return value to the main thread. Detached threads go into the background and disappear as soon as they're finished executing. This is used for threads that don't have any useful information to return to the main process. The threads that handle connections are of the latter type, so after creating the session thread, the main thread immediately calls the object's detach() method. At this point, the main thread can go back to waiting for incoming connections with accept().
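
To illustrate the attached case, here is a minimal sketch in which the main thread collects a return value with join(). Please note the swap: it uses the later threads module (lowercase) rather than the Thread module used in Listing 3, because that is the interface shipped with more recent Perls. It also checks at run time whether your perl was built with thread support, so it does nothing harmful if it wasn't.

```perl
use strict;
use warnings;
use Config;

my $answer;

if ($Config{useithreads}) {
    require threads;

    # An attached thread: the value returned by its subroutine is held
    # until the main thread collects it with join().
    my $worker = threads->create(sub { return 6 * 7 });
    $answer = $worker->join;
    print "worker thread returned $answer\n";
}
else {
    print "this perl was built without thread support\n";
}
```

A detached thread has no join() step; you call detach() on the thread object instead, and whatever the subroutine returns is simply discarded.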

The interact() subroutine is shorter than the previous version. It recovers the connected socket from its argument list and places it in a variable named $handle. It then creates a new Chatbot::Eliza::Server object and immediately invokes command_interface(). Chatbot::Eliza::Server is a small subclass that I wrote for the purposes of supporting the multithreaded server. It is identical in all respects to Chatbot::Eliza, except that the command_interface() method now takes two filehandles as arguments, one for reading user input, and the other for writing psychoanalyst output.

The rationale for creating this subclass is that the previous trick of reopening STDIN and STDOUT won't work in a multithreaded environment. Unlike the multiprocess solution based on fork(), where changes to global variables in the child don't affect the corresponding variables in the parent, each thread of execution in a multithreaded application shares exactly the same globals. Reopening STDIN in one session thread would affect the STDIN filehandle in all threads, with confusing results. For the same reason, you'll notice that the main thread doesn't close the connected socket, and the session thread doesn't close the listen socket. The main thread and each of the session threads are responsible for closing their own sockets when they are finished executing.
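
The general idea of handing a routine its input and output handles, rather than letting it reach for the globals, can be sketched without Chatbot::Eliza at all. The parrot_session() subroutine below is a toy I invented for the purpose; it is not part of any module. It is driven here by in-memory filehandles, but a connected socket could be passed for both arguments just as well.

```perl
use strict;
use warnings;
use IO::Handle;    # getline() and print() methods on filehandles

# A toy conversation loop that only touches the handles it is given.
sub parrot_session {
    my ($in, $out) = @_;
    $out->print("bot:\tSay something (or 'bye').\n");
    while (defined(my $line = $in->getline)) {
        chomp $line;
        if ($line =~ /^(?:bye|quit)$/i) {
            $out->print("bot:\tGoodbye.\n");
            last;
        }
        $out->print("bot:\tWhy do you say \"$line\"?\n");
    }
}

# Drive the session from a string and capture the output in another string.
my $script     = "I am tired\nbye\n";
my $transcript = '';
open my $in,  '<', \$script     or die "Can't read from string: $!";
open my $out, '>', \$transcript or die "Can't write to string: $!";

parrot_session($in, $out);
close $out;

print $transcript;
```

Because nothing in the subroutine refers to STDIN or STDOUT, any number of sessions can run side by side without stepping on each other's handles.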

Listing 4 shows you the code to Chatbot::Eliza::Server. It is essentially a cut-and-paste job in which I took the command_interface() method out of the Chatbot::Eliza code, and overrode it with a version that reads and writes to filehandle objects that are passed to it at runtime. Essentially, I substituted $in->getline() everywhere that the original was reading from STDIN, and the expression $out->print() everywhere that the original was printing to STDOUT. The bad news is that I had to subclass the Chatbot::Eliza object in order to get the multithreaded server to work. The good news is that Perl's object-oriented features allowed me to do this without messing with the other 99% of the Chatbot::Eliza code.

 
Listing 4

  0  package Chatbot::Eliza::Server;
  1  use Chatbot::Eliza;
    
  2  @ISA = 'Chatbot::Eliza';
    
  3  sub command_interface {
  4    my ($self,$in,$out) = @_;
  5    die "usage: \$bot->command_interface(\$input_handle,\$output_handle)"
  6        unless $in && $out;
  7    my ($user_input, $previous_user_input, $reply);
      
  8    $self->botprompt($self->name . ":\t");  # Set Eliza's prompt 
  9    $self->userprompt("you:\t");           # Set user's prompt
    
 10    # Print an initial greeting
 11    $out->print ($self->botprompt,
 12                 $self->{initial}->[ int rand scalar @{ $self->{initial} } ],
 13                 "\n");
    
 14    while (1) {
 15      $out->print ($self->userprompt);
 16      $previous_user_input = $user_input;
 17      chomp( $user_input = $in->getline ); 
    
 18      # If the user wants to quit,
 19      # print out a farewell and quit.
 20      if ($self->_testquit($user_input) ) {
 21          $reply = $self->{final}->[ int rand scalar @{ $self->{final} } ];
 22          $out->print ($self->botprompt,$reply,"\n");
 23          last;
 24      } 
    
 25      # Invoke the transform method
 26      # to generate a reply.
 27      $reply = $self->transform( $user_input );
    
 28      # Print the actual reply
 29      $out->print ($self->botprompt,$reply,"\n");
 30    }
 31  }
    
 32  1;
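The handle-passing idea can be tried without a network or the Chatbot::Eliza module. The sketch below is an illustration under stated assumptions, not part of Chatbot::Eliza::Server: converse() is a hypothetical helper, the reply callback is a stand-in for transform(), and in-memory filehandles take the place of the socket. Unlike Listing 4, it also stops when getline() returns undef at end of input, which is what happens when a client hangs up.

```perl
#!/usr/bin/perl
# Sketch: a conversation loop that uses only the handles it is given.
use strict;
use warnings;
use IO::File;   # makes getline() and print() available as handle methods

# Hypothetical helper: read lines from $in, write one reply per line to $out.
# $respond is a code reference standing in for Eliza's transform() method.
sub converse {
    my ($in, $out, $respond) = @_;
    my $turns = 0;
    while (defined(my $line = $in->getline)) {   # undef means end of input
        chomp $line;
        last if $line =~ /^(?:quit|bye)$/i;      # simplified quit test
        $out->print($respond->($line), "\n");
        $turns++;
    }
    return $turns;
}

# In-memory filehandles play the part of the connected socket.
my $script = "I feel tired\nquit\n";
open(my $in, '<', \$script) or die "Can't open input: $!";
my $transcript = '';
open(my $out, '>', \$transcript) or die "Can't open output: $!";

converse($in, $out, sub { "Why do you say: $_[0]?" });
close $out;
print $transcript;
```

Because the loop never mentions STDIN or STDOUT, any number of these conversations can run side by side in one process, each with its own pair of handles.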

Launching Standalone Servers from inetd

If you've been following along so far, you may have felt a little bit dissatisfied with both the inetd and standalone server solutions. The problem with inetd is that a new copy of the script must be launched each and every time a connection comes in. There will be a perceptible delay while the script is launched, and having many copies of the script running simultaneously will consume memory resources (in contrast, when a script forks, much of its memory space is shared).

The standalone server has its problems too. First of all, you must launch it manually, or arrange for it to be launched at system startup time. This can be inconvenient, particularly if the service is used only occasionally. Secondly, there's actually a bit more work that must be done with the simple standalone server example before it's really ready for production. The server must background itself automatically, dissociate itself from the controlling terminal, respond appropriately to HUP signals, write status messages to the system log, and so forth.
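To give a feel for the first two chores on that list, here is a sketch of a daemonize() routine modeled on the recipe in the perlipc documentation. It is not taken from any of the listings, and it covers only backgrounding and detaching from the terminal; HUP handling and logging to the system log are left out.

```perl
#!/usr/bin/perl
# Sketch: put a server in the background and detach it from the terminal.
use strict;
use warnings;
use POSIX qw(setsid);

sub daemonize {
    chdir("/")                     or die "Can't chdir to /: $!";
    open(STDIN,  "<", "/dev/null") or die "Can't read /dev/null: $!";
    open(STDOUT, ">", "/dev/null") or die "Can't write /dev/null: $!";
    defined(my $pid = fork())      or die "Can't fork: $!";
    exit 0 if $pid;                # the original process returns to the shell
    (setsid() != -1)               or die "Can't start a new session: $!";
    open(STDERR, ">&", \*STDOUT)   or die "Can't dup STDOUT: $!";
    return $$;                     # process ID of the background server
}

# A standalone server would call this once, before entering its accept loop:
#   daemonize();
```

The call to setsid() is what dissociates the process from its controlling terminal. It only succeeds in a process that is not already a process group leader, which is one reason the fork() comes first.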

Fortunately, inetd provides a mechanism that combines the convenience of inetd with the performance of the standalone server. It's achieved by changing the nowait flag in the inetd.conf line to wait and making a few small changes to the standalone server.

To understand the effect of this, a brief discussion of inetd internals is in order. When inetd is first launched, it scans its configuration file and creates a whole bunch of listening sockets, one for each service defined in the configuration file. inetd monitors all these sockets simultaneously by using I/O multiplexing, a technique we haven't discussed in this article (see the earlier footnote). When a connection comes in, inetd calls accept() to create a connected socket, and then uses the trick from Listing 2 to make the standard input, output, and error file descriptors all point to the connected socket. It now invokes the appropriate server program, which inherits the modified file descriptors. For Perl programs, these file descriptors eventually become the STDIN, STDOUT, and STDERR file handles seen by the script. So writing to STDOUT sends data to the connected socket, and reading from STDIN reads data from the socket. Meanwhile, inetd goes back to waiting for incoming connections on the original listen socket.

This scenario occurs if the fourth field of the inetd.conf line is nowait. What happens if the field is wait? In this case, when inetd detects that a client is trying to establish a connection on a socket, it does not call accept(). Instead, it copies the listen socket into standard input, standard output, and standard error, and invokes the server program. The server must call accept() itself and handle the session. It is free to call accept() again as many times as it likes. inetd will wait politely for the server to finish and exit, at which point it will go back to listening on the socket.
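The wait handoff can be imitated in a single script, which makes it easier to see who calls accept(). In the sketch below the child process first plays the part of inetd by copying a listening socket onto STDIN, and then plays the part of the server by recovering the socket and accepting a connection itself. The helper name wait_mode_demo() and the use of an arbitrary free port on 127.0.0.1 are assumptions made for the demonstration; the real inetd does considerably more.

```perl
#!/usr/bin/perl
# Sketch: what a server launched in "wait" mode receives on STDIN.
use strict;
use warnings;
use IO::Socket;
use POSIX ();

sub wait_mode_demo {
    # A listening socket on any free local port (LocalPort => 0).
    my $listen = IO::Socket::INET->new(
        LocalAddr => '127.0.0.1', LocalPort => 0,
        Listen    => 5,           Proto     => 'tcp',
    ) or die "Can't listen: $!";
    my $port = $listen->sockport;

    defined(my $child = fork()) or die "Can't fork: $!";
    if ($child == 0) {
        # Playing inetd: copy the listen socket onto STDIN, unaccepted.
        open(STDIN, '<&', $listen) or POSIX::_exit(1);
        close $listen;

        # Playing the server: recover the socket and call accept() ourselves.
        POSIX::_exit(2) unless -S STDIN;
        my $sock = IO::Socket->new_from_fd(\*STDIN, "r+") or POSIX::_exit(3);
        my $connection = $sock->accept                    or POSIX::_exit(4);
        $connection->print("hello from the server\n");
        $connection->close;
        POSIX::_exit(0);
    }

    # Parent: act as the client that triggered the launch.
    close $listen;
    my $client = IO::Socket::INET->new(
        PeerAddr => '127.0.0.1', PeerPort => $port, Proto => 'tcp',
    ) or die "Can't connect: $!";
    my $greeting = $client->getline;
    $client->close;
    waitpid($child, 0);
    return $greeting;
}

print wait_mode_demo();
```

The client's connection sits in the listen queue until the server gets around to calling accept(), so nothing is lost in the time it takes to launch the server.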

Listing 5 contains the last version of our chatbot server. This one is designed to conserve resources without sacrificing performance. It is launched by inetd using the "wait" mechanism described above. It then services requests until a certain period of idle time goes by without any new incoming connections. At this point the server exits and returns control to inetd.

 
Listing 5

  0  #!/usr/bin/perl
  1  # file: eliza_inetd_server.pl
  2  use Chatbot::Eliza;
  3  use IO::Socket;
  4  use POSIX 'WNOHANG';
        
  5  use constant TIMEOUT => 1; # 1 minute default
  6  my $timeout = shift || TIMEOUT;
    
  7  # signal handler for timeout
  8  $SIG{ALRM} = sub { exit 0 };
  9  # signal handler for child die events
 10  $SIG{CHLD} = sub { while ( waitpid(-1,WNOHANG)>0 ) { } };
    
 11  # retrieve socket from STDIN
 12  die "STDIN is not a socket" unless -S STDIN;
 13  my $listen_socket = IO::Socket->new_from_fd(STDIN,"r+") 
 14    || die "Can't create socket: $!";
    
 15  warn "Server ready.  Waiting for connections...\n";   
    
 16  while (my $connection = $listen_socket->accept) {
 17    die "Can't fork: $!" unless defined (my $child = fork());
 18    if ($child == 0) {
 19        alarm(0);
 20        $listen_socket->close;
 21        interact($connection);
 22        exit 0;
 23    }
 24  } continue {
 25      $connection->close;
 26      alarm ($timeout * 60);
 27  }
    
 28  sub interact {
 29    my $sock = shift;
 30    STDIN->fdopen($sock,"r")  || die "Can't reopen STDIN: $!";
 31    STDOUT->fdopen($sock,"w") || die "Can't reopen STDOUT: $!";
 32    STDERR->fdopen($sock,"w") || die "Can't reopen STDERR: $!";
 33    STDOUT->autoflush(1);
 34    Chatbot::Eliza->new->command_interface;
 35  }

The code is almost identical to the standalone server of Listing 2. One new feature is a command-line argument indicating the number of minutes of idle time to allow before the server exits. Line 6 retrieves this argument, and defaults to one minute if absent. Line 8 installs an ALRM handler. Every time through the main loop the code will set a timer using the alarm() call. If the alarm goes off before accept() returns, this handler will be invoked, causing the server to exit.

Lines 12 through 14 retrieve the listen socket. Instead of creating a new socket by calling IO::Socket::INET's new() method, the code checks STDIN with the -S operator. -S returns true if STDIN is actually a socket. If it's not a socket, then the server dies with an error. (This can happen if someone tries to run the server from the command line.) Otherwise, the code turns STDIN into an IO::Socket object by calling the IO::Socket class's new_from_fd() method. This method is nearly identical to fdopen(), except that it avoids having to first create the IO::Socket, and then reopen it. If this call is successful, the $listen_socket variable will contain a listening socket that is all ready to accept() an incoming connection.

The main loop, lines 16-27, is identical to Listing 2 with the addition of two calls to alarm(). Each time through the loop, in the continue{} block, the code calls alarm() with the value of the timeout expressed in seconds. If the code reaches this point before the alarm goes off, the server gets a new lease on life. Otherwise the ALRM handler is called and the server exits. However, we don't want the alarm to go off within a child session, so the code carefully turns off the alarm each time it forks off a child (line 19).
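The arm, cancel, and re-arm rhythm of alarm() can be tried in isolation. Note that the sketch below uses a different technique from Listing 5: instead of exiting from the ALRM handler, it calls die inside an eval block so that the caller can see what happened. The helper with_idle_timeout() is a hypothetical name, and the four-argument select() is there only to simulate a long wait.

```perl
#!/usr/bin/perl
# Sketch: give up on a blocking operation after a period of idle time.
use strict;
use warnings;

# Hypothetical helper: run $work, but stop waiting after $seconds.
sub with_idle_timeout {
    my ($seconds, $work) = @_;
    my $value;
    my $finished = eval {
        local $SIG{ALRM} = sub { die "idle timeout\n" };
        alarm($seconds);        # start the countdown
        $value = $work->();     # stands in for the blocking accept() call
        alarm(0);               # finished in time, so cancel the alarm
        1;
    };
    alarm(0);                   # make sure no alarm is left pending
    return $value if $finished;
    die $@ unless $@ eq "idle timeout\n";   # pass along unrelated errors
    return "timed out";
}

# Work that finishes immediately beats the alarm.
print with_idle_timeout(5, sub { "connection" }), "\n";

# A three-second wait loses to a one-second alarm.
print with_idle_timeout(1, sub { select(undef, undef, undef, 3); "too late" }), "\n";
```

Exiting straight from the handler, as Listing 5 does, is the simpler choice when the only sensible reaction to a timeout is for the whole server to go away.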

To run this version of the chatbot server, enter a line like this one in inetd.conf and send inetd a HUP signal:

12000 stream tcp wait lstein /tmp/eliza/eliza_inetd_server.pl eliza_inetd_server.pl 2

While running ps or top in a separate window, telnet to port 12000 a few times and confirm that the same parent server is processing all the requests. Now refrain from connecting to the server for two minutes. You will see the parent disappear, leaving any active child sessions running. The next time you telnet to the port, a new parent server will be launched.

Further Information

Everything I know about Berkeley sockets I learned from W. Richard Stevens, Unix Network Programming, Volume 1: Networking APIs: Sockets and XTI, Prentice Hall, 1997. It's written for C programmers, but there's nothing there that can't be applied to Perl immediately. Also read through the POD documentation for IO::Handle, IO::Socket, and perlipc.

__END__


Lincoln Stein is the author of CGI.pm.