Introduction:

This article is a practical tutorial for developing 
HyperText Markup Language (HTML) based GUI UNIX 
administration tools.  It presents a simple login audit 
tool (La Tool) which generates reports for a User ID 
selected from a drop down scroll list.  Details of how 
to restrict access to these HTML based tools are 
outlined with respect to NCSA's httpd v1.3 World Wide 
Web (WWW) server.  

Why use an HTML browser?  Glad you asked! Traditional 
UNIX system administration consisted of editing text 
files in obscure directories. This was adequate when 
punch cards and printer terminals ruled the earth! Text 
only interfaces are inadequate in a modern UNIX 
environment dominated by graphical user interfaces such 
as X11.

X11 has allowed various workstation manufacturers to 
build graphical interfaces for system administration 
tasks.  Unfortunately X11 programming is cumbersome 
and not readily portable to non-UNIX systems.  In an 
age of heterogeneous networked computers it is likely 
that a sysadmin won't be at an Xterm UNIX console.  
This can be unhandy if the sysadmin was trained to use 
GUI tools and doesn't have the foggiest notion of which 
text file needs modification.

What is needed is a non-platform specific way of 
presenting and acting upon graphical/textual data 
across networks.  Fortunately we have a portable 
graphical user interface solution available in the form 
of WWW server/browser technology.  A web server not
only delivers HTML files, it can also act upon information
coming from the browser.  Mechanisms for data 
processing on the server end are described by the
Common Gateway Interface (CGI).  CGI allows complex 
administration scripting on the server using any 
programming language. 

Security:

WWW server/browser technology allows custom GUI based 
sysadmin tools with a minimum of coding.  A sysadmin 
could contact the UNIX host using a WWW browser which 
supports authentication, running on practically any 
computer/OS!  Of course this flexibility makes security 
a concern. You don't want any person with a WWW browser 
to access your HTML system administration scripts.  
Limits must be placed on accepting incoming data as 
well, otherwise a clever hacker would simply duplicate 
the front end GUI and post his or her own data!  

The NCSA httpd supports user, group and IP address/
domain authentication.   These mechanisms provide a 
security level comparable to root login over a standard 
TCP/IP Telnet connection.  Eeek!  Let's clarify that.  
There are three major types of security breaches when
using a WWW server/client.  Firstly an intruder might 
'listen' to the connection and glean the access 
password.  Or an intruder might copy/substitute the 
HTML source as it is delivered to the client/server.  
If an intruder can do this, they can just as easily 
grab your root password as it is delivered across 
the wire to a remote UNIX box.  Commercial servers, 
such as Netscape, utilize an encryption system that 
keeps casual IP packet snoopers from viewing sensitive 
information.  Finally, an intruder can utilize holes in 
the server security to access these administration 
scripts, or the system itself. 

Think about how often you Telnet to a machine to do
administrative tasks.  Are you on a trusted network?
If so, the basic authentication mechanisms supported by
the free WWW servers will suffice.  If you are security 
conscious and never use Telnet, consider investing in a 
server that does data encryption across the network.

For this article we will use user authentication. Here 
are step-by-step instructions for setting up access 
authentication to the La Tool CGI binary.

1) Create a protected directory for admin tools.  I did 
   this in my home directory and called it 'Dadmintool'.
   Make certain that whatever user id you run the server 
   as has read and execute privilege for this directory.
   My httpd runs with the id set to 'nobody'.

2) Inform the httpd server that this directory contains 
   CGI scripts.  For NCSA httpd v1.3 you would go to the 
   configuration file sub-directory for your server. 
   Edit the 'srm.conf' file and make an alias reference 
   to your just created CGI sub-directory.  Here is my 
   entry: 

   ScriptAlias /admin/ /home/ccb8m/Dadmintool/

3) Create a '.htaccess' file in the CGI sub-directory.  
   The '.htaccess' file describes the authorization 
   requirements for La Tool or any other admin script 
   in this sub-directory.  This is La Tool's '.htaccess' 
   file:

   AuthUserFile /home/ccb8m/Dadmintool/.htpasswd
   AuthGroupFile /dev/null
   AuthName UNIX AdminTool
   AuthType Basic

   <Limit GET POST>
   require user sys
   </Limit>

4) Create a '.htpasswd' file in the CGI sub-directory.  
   This file looks very similar to "/etc/passwd" but 
   only contains a login name and the encrypted 
   password.  This means you can limit/grant access to 
   HTML documents independent of users with valid login 
   accounts.  To create this file you need to use the 
   'htpasswd' command.  This is supplied in the 
   "support" sub-directory of the httpd 1.3 source 
   release. To create a htpasswd file called '.htpasswd' 
   containing the user 'sys' type the following: 

   htpasswd -c .htpasswd sys

   You will be prompted for a new password along with 
   password verification.  It's easiest if you run the 
   htpasswd command in your recently created 
   administration binary directory.  Otherwise you 
   need to move '.htaccess' and '.htpasswd' to that 
   directory.
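
As a second line of defense, a CGI program can confirm
that the server really did authenticate the caller.  The
sketch below assumes the server exports the standard CGI
variable REMOTE_USER for requests that passed Basic
authentication.  The helper names is_admin_user() and
request_is_authorized() are mine and are not part of the
La Tool listings; the user 'sys' comes from the
'.htaccess' file above.

```c
#include <stdlib.h>
#include <string.h>

/* Return 1 if remote_user names the one account allowed to run
 * admin tools, 0 otherwise.  A NULL value means the server did
 * not authenticate the request at all. */
int is_admin_user(const char *remote_user, const char *allowed)
{
    if (remote_user == NULL || allowed == NULL)
        return 0;
    return strcmp(remote_user, allowed) == 0;
}

/* Hypothetical wrapper: look up REMOTE_USER in the environment. */
int request_is_authorized(void)
{
    return is_admin_user(getenv("REMOTE_USER"), "sys");
}
```

A tool could call request_is_authorized() first and print
an error page instead of doing any work when it returns 0.
This only backs up the '.htaccess' rules, it does not
replace them.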

Now that we have a protected place for administration 
scripts to reside, let's take a look at how they work 
with the WWW server/browser.

HTML and WWW Client/Server Interaction:

The connection between a WWW browser and server is 
stateless.  For each bit of data an HTML document 
requires, a communication method must be specified.  
The communication method can be thought of as a server 
command, with additional hidden information. Getting 
files from the server requires the GET method, while 
sending information is usually accomplished via the
POST method.  If a CGI program only requires a command 
line parameter, data is bundled with the GET method.  
This stateless transaction paradigm requires user 
authentication to be handled by the server, with 
implicit acknowledgment by the browser.  This means 
that you have the option of typing in the user name/
password ONCE to access a document or CGI binary.
Otherwise you would have to type the login ID/
password for each element retrieved!  The browser keeps 
authentication information cached until you quit.  This 
is obviously a low level of security, but again it is 
no worse than an open Telnet connection.
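
To see why this is a low level of security, it helps to
look at what the browser actually sends.  With Basic
authentication the user name and password are joined by
a colon and base64 encoded, which is an encoding and not
encryption.  The sketch below is a plain base64 encoder;
the function name base64_encode() is my own and is not
taken from the httpd source.

```c
#include <stddef.h>
#include <string.h>

/* Base64-encode len bytes of src into dst (NUL-terminated).
 * dst must hold at least 4 * ((len + 2) / 3) + 1 bytes. */
void base64_encode(const unsigned char *src, size_t len, char *dst)
{
    static const char tbl[] =
        "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
        "abcdefghijklmnopqrstuvwxyz"
        "0123456789+/";
    size_t i;

    /* Full groups of three input bytes become four characters. */
    for (i = 0; i + 2 < len; i += 3) {
        *dst++ = tbl[src[i] >> 2];
        *dst++ = tbl[((src[i] & 0x03) << 4) | (src[i + 1] >> 4)];
        *dst++ = tbl[((src[i + 1] & 0x0f) << 2) | (src[i + 2] >> 6)];
        *dst++ = tbl[src[i + 2] & 0x3f];
    }
    /* One or two leftover bytes are padded out with '='. */
    if (i < len) {
        *dst++ = tbl[src[i] >> 2];
        if (i + 1 < len) {
            *dst++ = tbl[((src[i] & 0x03) << 4) | (src[i + 1] >> 4)];
            *dst++ = tbl[(src[i + 1] & 0x0f) << 2];
        } else {
            *dst++ = tbl[(src[i] & 0x03) << 4];
            *dst++ = '=';
        }
        *dst++ = '=';
    }
    *dst = '\0';
}
```

Encoding the string "sys:secret" with this routine gives
the text that would follow "Authorization: Basic" in each
request.  Anyone who can read the packets can reverse it
just as easily.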

HTML is a set of commands that can be embedded in text
documents.  These embedded commands partition text into 
logical format and link elements. The characters <, >, 
and / differentiate HTML commands from regular text.  
A logical partition beginning for text is specified by 
<HTML_ELEMENT>.  The end of the text partition is
specified by </HTML_ELEMENT>.  Some HTML elements don't 
specifically operate on the text in a document. Instead
they are inclusive, such as inserting graphics or 
specialized format elements such as a paragraph break.
These elements don't require </HTML_ELEMENT>.  Listed 
below are HTML examples, for inline graphics, making 
text bold and italic, inserting a horizontal bar
and specifying the end of a paragraph.

<IMG SRC="http://www.cs.virginia.edu/~ccb8m/Dgif/latool.gif"> 
<B><I>Bold and Italic!!</I></B>
<HR>
This sentence comes right below the horizontal bar. Text
is displayed without any formatting information.  If I
didn't put in a paragraph break, the next bit of text
would be appended to this rambling monologue.<P> This is
the next sentence and it starts a new paragraph.

The above commands are good for display purposes, but 
the most important HTML element is the anchor.  With it 
you can make text or a logical HTML partition point to 
any Internet host in the world! Here is a bit of text 
turned into an HTML anchor: 

<A HREF="http://www.cs.virginia.edu/~ccb8m">
Charles Bundy's home page.</A>

Anchor elements are important, because they let users 
have local control of data while supporting global 
accessibility.  Don't start chewing your fingernails!
If you have httpd authentication set up, 'global' refers
to machine type, rather than 'devious hacker'.

The example anchor command requires a hypertext 
reference parameter (HREF). The 'HREF' parameter is a 
URL or Uniform Resource Locator.  The URL is split into 
three parts. The first two parts are network related and 
specify what kind of resource server to contact.  If you 
use 'http' the browser uses default port 80 and 
expects 'httpd' to be accepting commands.  If you use 
'ftp' the browser uses port 21 and expects 'ftpd' to 
be accepting commands. Immediately after '://' is the 
machine address of where to send server commands.  The 
third part of a URL specifies which file to retrieve.  
The path to this file is relative to the default HTML
document directory.  My HTTP server uses "/http/htdocs" 
as the default HTML document directory.  If a ~USERID 
is specified, httpd goes to that user's home directory 
and appends a default HTML document sub-directory.  If 
an HTML document/path isn't specified a default file 
name is used.  Here is a description of the anchor 
example's URL.  The browser opens a connection to the 
University of Virginia Computer Science WWW server. It 
sends a GET method to port 80 requesting that the 
default file 'index.html' be retrieved from the 
sub-directory "/uf12/home/ccb8m/public_html".
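
The three-part split described above can be sketched in
C.  This is only an illustration: the struct url_parts
and the function split_url() are names I made up, and
the code ignores extras such as an explicit port number.

```c
#include <string.h>

struct url_parts {
    char scheme[16];   /* e.g. "http" or "ftp"          */
    char host[128];    /* machine address after "://"   */
    char path[256];    /* file to retrieve, "/" if none */
};

/* Split "scheme://host/path" into its three parts.
 * Returns 0 on success, -1 if the URL does not fit this shape. */
int split_url(const char *url, struct url_parts *out)
{
    const char *sep = strstr(url, "://");
    const char *host, *slash;
    size_t n;

    if (sep == NULL || sep == url)
        return -1;
    n = (size_t)(sep - url);
    if (n >= sizeof out->scheme)
        return -1;
    memcpy(out->scheme, url, n);
    out->scheme[n] = '\0';

    host = sep + 3;
    slash = strchr(host, '/');
    n = slash ? (size_t)(slash - host) : strlen(host);
    if (n == 0 || n >= sizeof out->host)
        return -1;
    memcpy(out->host, host, n);
    out->host[n] = '\0';

    /* With no path given, the server picks its default document. */
    if (slash == NULL)
        slash = "/";
    if (strlen(slash) >= sizeof out->path)
        return -1;
    strcpy(out->path, slash);
    return 0;
}
```

Feeding it the anchor example's URL yields the scheme
'http', the host 'www.cs.virginia.edu' and the path
'/~ccb8m'.  Mapping that path to a real directory is the
server's job, not the browser's.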

Common Gateway Interface (CGI):

Under most circumstances a browser will send commands 
that the server processes directly. Most of these 
commands are "get a file and send it to me".  Once the 
file gets to the browser it is either displayed or a 
helper application is called.  Web browsers can process 
HTML and GIF data, but don't have a WordPerfect viewer 
built in.  Thus, WordPerfect would be called as a 
'helper application' to display a data stream that is 
foreign to the browser.  The reverse case is when a 
browser needs to send something which requires complex 
processing on the server end. 

Servers don't have a way of doing generic processing. 
They must pass data to a local program which can act 
upon the information.  The mechanism for passing data 
to a program and sending the results (usually HTML) 
back to a browser are defined by the Common Gateway 
Interface. CGI isn't a 'programming language', it is 
simply a definition of how data is to be transferred. 
There are three transfer methods specified in the CGI 
specification: Standard Input, Environment Variables, 
and Command Line Parameters.  Most CGI binaries use a 
combination of Standard Input and Environment Variables, 
and La Tool does this as well.
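
Here is a minimal sketch of how the two mechanisms fit
together for a POST.  It assumes the server puts the
size of the posted data in the standard CGI variable
CONTENT_LENGTH and delivers the data itself on Standard
Input.  The function read_body() and its max_len limit
are my own illustration, not code from the listings.

```c
#include <stdio.h>
#include <stdlib.h>

/* Read exactly content_length bytes of form data from `in` into a
 * freshly allocated, NUL-terminated buffer.  Returns NULL on error.
 * max_len guards against absurd or hostile lengths. */
char *read_body(FILE *in, const char *content_length, size_t max_len)
{
    char *end, *buf;
    long len;

    if (content_length == NULL)
        return NULL;
    len = strtol(content_length, &end, 10);
    if (end == content_length || *end != '\0' || len < 0 ||
        (size_t)len > max_len)
        return NULL;

    buf = malloc((size_t)len + 1);
    if (buf == NULL)
        return NULL;
    if (fread(buf, 1, (size_t)len, in) != (size_t)len) {
        free(buf);
        return NULL;
    }
    buf[len] = '\0';
    return buf;
}
```

A CGI binary would call it as
read_body(stdin, getenv("CONTENT_LENGTH"), 1024) and free
the result when done.  Passing the stream and the length
in as parameters keeps the routine easy to test away from
the server.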

Case Example: La Tool

La Tool is written in 'C' instead of an interpreted 
scripting language such as 'Perl'.  For those of you 
who know Perl, you will probably be scratching your 
head and asking "Why?"  (See Sidebar 1.)  Suffice to 
say I am more comfortable with 'C' than 'Perl'.

The source code is broken into four parts, shown in 
Listings 1 through 4.  Listing 6 is the makefile used to 
create La Tool.  The makefile consists of environment
variables plus TARGET : DEPENDENCY lines followed by 
ACTION lines. The ACTION lines are differentiated from 
the TARGET : DEPENDENCY lines by TAB characters.  So 
don't put any spaces in front of an ACTION line, use 
TABs.  Otherwise the 'make' command will complain about 
illegal characters.

Listing 5 is the include file which specifies the size 
of the log record array, and the GIF logo and background 
picture URLs. If you hate to type, a source code bundle 
including 'latool.gif' and 'chalk.gif' is available at:

http://www.cs.virginia.edu/~ccb8m/latool.tar.Z

Development should be kept separate from the production 
directory (Dadmintool). I advise creating a "Dsrc/Dtool" 
sub-directory and extracting/creating the source 
listings there.  

To Process or not, That is the Question:

Take a look at Listing 1, which contains the main() 
procedure.  main() does some initialization and 
then gets the environment variable REQUEST_METHOD.  
REQUEST_METHOD is one of several standard environment
variables that the server exports to child processes.
If REQUEST_METHOD is set to 'POST' La Tool will extract 
the user ID from STDIN and call check() to process and 
send results back.  Otherwise La Tool calls fillitForm() 
to send an HTML form for a sysadmin to select the user 
ID. 

This is the standard method of determining whether a
script needs to process information or send some 
sort of input form.  Another important thing to note 
is line 23:

printf("Content-TYPE: text/html\n\n");

The server normally sends this line plus a blank line
based upon file extensions.  For example, if a GET method 
is issued for an 'index.html' file, Content-TYPE is 
returned as shown above.  When a CGI binary is called 
the WWW server doesn't make any assumptions about the 
return content.  Thus your CGI program has to explicitly 
tell the browser that it is sending text containing HTML 
commands (or a GIF file, etc.) 
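
The decision and the header can be sketched as two small
helpers.  This is not Listing 1; the names
choose_action() and send_header() are made up for the
illustration, and only REQUEST_METHOD and the
Content-type line come from the discussion above.

```c
#include <stdio.h>
#include <string.h>

enum action { SEND_FORM, PROCESS_INPUT };

/* Decide what to do from the REQUEST_METHOD value.  Anything other
 * than POST (including a missing variable) gets the input form. */
enum action choose_action(const char *request_method)
{
    if (request_method != NULL && strcmp(request_method, "POST") == 0)
        return PROCESS_INPUT;
    return SEND_FORM;
}

/* Write the header a CGI program must send before any HTML:
 * the content type followed by one blank line. */
void send_header(FILE *out)
{
    fputs("Content-type: text/html\n\n", out);
}
```

A main() built on these would call send_header(stdout),
then pass getenv("REQUEST_METHOD") to choose_action() and
branch to check() or fillitForm() on the result.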

When the browser issues a POST method La Tool prints out 
HTML "wrapper" information and calls check().  check() 
does all the work of reading the 'wtmp' file and 
printing out login statistics INDEPENDENT of the HTML 
interface.  This means you can use La Tool as a skeleton 
for any admin code you are using now.  Substitute your 
code for check(), change fillitForm() to suit your input 
needs, and modify the parsing of STDIN.  Voila! Instant 
HTML GUI, with a caveat.  If your fillitForm() needs to
send more than one item of information, you will need
to break up STDIN by watching for '&'.  Browser return
data is of the form:

INPUT_VAR1=INFORMATION&INPUT_VAR2=INFORMATION& ...

The '&' separates variable assignment pairs, and the
browser encodes spaces as '+' and other special
characters as %XX escapes.  La Tool
only passes one item of information, thus a typical
STDIN stream would be:

ut_name=root
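
If your own tool posts several fields, a small parser
saves a lot of grief.  The sketch below assumes the usual
form encoding, where NAME=VALUE pairs are joined by '&',
spaces arrive as '+' and other special characters as %XX
hex escapes.  The functions url_decode() and get_field()
are hypothetical helpers, not part of La Tool.

```c
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

/* Decode %XX escapes and '+' (space) in place. */
static void url_decode(char *s)
{
    char *w = s;

    while (*s) {
        if (*s == '+') {
            *w++ = ' ';
            s++;
        } else if (s[0] == '%' && isxdigit((unsigned char)s[1]) &&
                   isxdigit((unsigned char)s[2])) {
            char hex[3] = { s[1], s[2], '\0' };
            *w++ = (char)strtol(hex, NULL, 16);
            s += 3;
        } else {
            *w++ = *s++;
        }
    }
    *w = '\0';
}

/* Copy the decoded value of field `name` from `body` into out.
 * Returns 0 on success, -1 if the field is missing or too long. */
int get_field(const char *body, const char *name, char *out, size_t outsz)
{
    size_t nlen = strlen(name);
    const char *p = body;

    while (*p) {
        const char *end = strchr(p, '&');
        size_t plen = end ? (size_t)(end - p) : strlen(p);

        if (plen > nlen && strncmp(p, name, nlen) == 0 &&
            p[nlen] == '=') {
            size_t vlen = plen - nlen - 1;
            if (vlen >= outsz)
                return -1;
            memcpy(out, p + nlen + 1, vlen);
            out[vlen] = '\0';
            url_decode(out);
            return 0;
        }
        if (end == NULL)
            break;
        p = end + 1;
    }
    return -1;
}
```

For La Tool's single field, get_field(body, "ut_name",
user, sizeof user) would leave 'root' in user.  Checking
the length before copying matters here, since the data
comes from the network and can't be trusted.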

User Interface - fillitForm() and seluser():

Listing 2 contains the code which prints out an HTML
input form.  This form was designed for Netscape, but
looks the same under Mosaic v2.x.  If the La Tool 
interface doesn't suit your browser, get rid of the 
offending HTML elements!

Input elements are contained within the HTML FORM
element.  FORM defines the action to be taken for a 
group of HTML input elements, such as <SELECT>.  
Input elements have a NAME and usually a default value, 
unless they are free form.  This is an example of a 
free form text entry box:

<INPUT TYPE="text" NAME="ut_name" SIZE=12>

The user interface presented by La Tool tries to 
minimize input mistakes.  This is accomplished by 
querying the UNIX passwd file and presenting a 
selection of valid login IDs.  The first version of 
La Tool had a text box in which you typed the login ID.  
This was very unfriendly, since you had to know the 
exact login ID before running La Tool.  This is 
rectified by seluser() which is presented in Listing 3.
It uses the <SELECT> element to build a drop down list 
of valid login IDs. 

seluser() builds this list by parsing the "/etc/passwd"
file.  There is a drawback.  La Tool decides upon the 
input choices for a user.  La Tool's idea of what is
'valid' may not contain all possible valid inputs.  If 
you are running NIS you can't get all the user names 
from "/etc/passwd".  In this instance a free-form text 
box would be better.  The local system might not have 
all user information but the login ID will always be 
recorded in 'wtmp'.

Since this is a distributed GUI, information about which 
machine you are querying is presented. The logging file 
is displayed as well, though it might be better to make 
this a user input.  To make sure no one file has more
lines than La Tool can handle (specified by NUMREC, see
Listing 5) you might prune "/var/adm/wtmp" on a monthly
basis.  La Tool could present another <SELECT> list
which lets a user pick a particular month/year file to 
peruse.

Presentation of Results - check():

Listing 4 contains check() which does the complex 
processing for La Tool.  Unlike the other routines,
which are mostly HTML printf() statements, check()
prints out very little HTML.  It takes two parameters,
a user ID and the log file to process.  check() reads
in the log file and then starts at the first record.
If the record contains a login name which matches the
user ID parameter a second search begins.  This starts
at the next record and terminates when a matching TTY
logout record is discovered.  The search also terminates
if a shutdown record is encountered.  Online time is 
calculated from the start and end records, and this
information is printed along with the two records.
Online time is added to the accumulated time which is
printed when all the records have been scanned.  The
unit time is seconds, but hours and minutes are printed
using a modulus / unit conversion factor.  Any error 
messages or warnings are printed for display by the 
browser. 
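
The modulus / unit conversion step is easy to get wrong,
so here is a small sketch of it.  It is not the code
from Listing 4; split_duration() and format_duration()
are names chosen for the illustration.

```c
#include <stdio.h>

/* Break a duration in seconds into hours, minutes and seconds. */
void split_duration(long total, long *hours, long *minutes, long *seconds)
{
    *hours   = total / 3600;          /* whole hours           */
    *minutes = (total % 3600) / 60;   /* minutes left over     */
    *seconds = total % 60;            /* seconds left over     */
}

/* Format a duration as "H:MM:SS" into buf. */
void format_duration(long total, char *buf, size_t bufsz)
{
    long h, m, s;

    split_duration(total, &h, &m, &s);
    snprintf(buf, bufsz, "%ld:%02ld:%02ld", h, m, s);
}
```

An online time of 3725 seconds comes out as 1 hour,
2 minutes and 5 seconds.  Keeping the running total in
seconds and converting only when printing avoids
carrying errors between sessions.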

Compiling and Running:

If you have all six listings present in Dsrc/Dtool, edit
Listing 5, latool.h.  Set HTTP_HOST to the machine that
latool is installed on.  I used the IP address, but you
can use a domain name like: viper.cs.virginia.edu.

If you got the archive "latool.tar.Z" you will have 
'latool.gif' and 'chalk.gif'.  Put these into your 
default HTML system directory (or user directory).
Now change BACK_URL and LOGO_URL such that they point
to the proper machine/path.  For example, assume httpd 
is on a local machine called 'rodan' and the gifs are 
in the HTML default system directory. The define 
statements would be:

#define BACK_URL "http://rodan/chalk.gif"
#define LOGO_URL "http://rodan/latool.gif"

If you didn't get the source archive, you can create
a background and logo gif using any paint program.

Now type 'make' to create the latool binary.  Move 
'latool' to the administration directory (Dadmintool).
Fire up your favorite WWW browser and try to access
latool.  If you copied the ScriptAlias as described 
in security setup the URL for 'rodan' would be:

http://rodan/admin/latool

Conclusion:

La Tool is a very simple CGI program which does
some form of system administration.  It demonstrates
basic concepts, but doesn't go in depth about various
input elements.  To find out more about CGI and 
elements such as radio buttons read these two books:

[1] Robert J. Mudry, "Serving the Web", Coriolis Group
[2] Ian S. Graham, "HTML SourceBook", John Wiley & Sons

Sidebar 1: PERL

Perl is an interpreted scripting language whose features 
include fast string manipulation, access to almost all 
UNIX system library functions, security tracing for suid 
programs and automatic dynamic memory allocation for 
arrays.  The source will look very familiar to C 
programmers with a splash of BASIC and REXX thrown in.  
Since Perl is interpreted, a CGI author doesn't have to 
go through a compilation cycle to see results. It's easy 
to read and modify, making it ideal for prototyping.  

Perl seems to be the default CGI scripting language for 
most HTML/CGI authors.  For system administration CGI 
programs I would advocate the use of Perl.  For large 
scale CGI program development use C/C++.

Sidebar 2: Internet Resources

The Internet is a wonderful source of information, and 
a great time saver!  For the UNIX administrator who is 
interested in using CGI and HTML to ease accounting and 
administration check out the following information 
sources:

Anonymous FTP sites:

ftp.ncsa.uiuc.edu - 
This is where you can get the latest and greatest 
versions of the NCSA WWW server and browser.  Ready to 
execute binaries for the Mosaic browser are located in 
the "/Web/Mosaic/Unix/binaries/2.7b" directory.  The 
last sub-directory is the current version, so you might 
want to do an 'ls' on "/Web/Mosaic/Unix/binaries" for 
the latest version sub-directory.  The httpd WWW server 
binaries are located in the "/Web/httpd/Unix/ncsa_httpd/
current" directory.

ftp2.netscape.com
ftp3.netscape.com
      ...
ftp8.netscape.com -
This is the anonymous ftp site for Netscape.  Netscape is
free if you are non-profit, or academic.  Windows, UNIX
and Macintosh distributions can be found in the respective
paths:

/netscape/windows
/netscape/unix
/netscape/mac

WWW UNIX pages:

Sigurdur Bjornss (sib@saga.is) uses an HTML GUI to 
manage users on an Internet service.  This CGI script 
lets a sysadmin create user accounts as well as 
administer account information for billing purposes.  
A password change example is available via anonymous ftp 
at "ftp.saga.is".  The path is: 

/pub/unix/www/cgipasswd.tar.Z

USENET news groups:

comp.infosystems.www.authoring.cgi
comp.infosystems.www.authoring.html
comp.infosystems.www.authoring.images
