
Based on the previous sample program, the main program of the URL checker is pretty short:
/* URLCHECK.CMD - IBM REXX Sample Program */
Parse Arg URLList HTMLFile
/* Load REXX Socket library if not already loaded */
If RxFuncQuery("SockLoadFuncs") Then
  Do
    Call RxFuncAdd "SockLoadFuncs","RXSOCK","SockLoadFuncs"
    Call SockLoadFuncs
  End
/* check all URLs in the specified file for expiration */
URLS.0 = 0
Changed = ""
Unchanged = ""
Commented = ""
Call CheckURLs URLList
Call WriteHTML HTMLFile, Changed, Unchanged, Commented
Exit
The following variables are used in the main program:
Based on the information in the three index variables and the 'URLS' stem, the function 'WriteHTML' then writes out an HTML file with links to all URLs from the input file, grouped by their status.
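The index lists are plain blank-delimited strings of line numbers that point into the 'URLS' stem. The following is a minimal sketch of that idea, not part of the sample itself; the stem contents and the loop variable 'I' are made up for illustration:

```rexx
/* Sketch only: index lists pointing into a stem (sample data assumed) */
URLS.0 = 3
URLS.1 = "http://www.ibm.com THU, 18 JUL 1996 17:41:10 GMT"
URLS.2 = "http://www.myhost.mydomain/users/chris.html"
URLS.3 = "#http://www2.hursley.ibm.com/rexx/"

Changed = ""
Commented = ""
Changed = Changed 2        /* append index 2 to the list */
Commented = Commented 3    /* append index 3 to the list */

/* walk a list: each word is an index into the URLS stem */
Do I = 1 To Words(Changed)
  Idx = Word(Changed, I)
  Say "changed:" URLS.Idx
End
```

Keeping only the indices in the lists means that each line of the input file is stored once, in the stem, and can be rewritten or formatted later from the same place.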
The following functions are reused from the previous sample and are not listed again:
/********************************************************/
/* */
/* Procedure: CheckURLs */
/* Purpose: Check the modification dates of all URLs */
/* listed in the specified file. If the date */
/* has changed, update the list file with */
/* the new date. */
/* Arguments: URLFile - file containing URL list */
/* Returns: nothing */
/* */
/********************************************************/
CheckURLs: Procedure Expose URLS. Changed Unchanged,
                            Commented
  Parse Arg URLFile
  Index = 0
  Do While Lines(URLFile)
    /* read line with URL and last modification date */
    URLLine = LineIn(URLFile)
    /* remember line for later update of file */
    Index = Index + 1
    URLS.0 = Index
    URLS.Index = URLLine
    /* if first character is not a "#" then process URL */
    If SubStr(URLLine, 1, 1) \= "#" Then
      Do
        /* retrieve header for specified URL */
        Parse Var URLLine URL ModDate
        Header = GetHeader(URL)
        If Length(Header) \= 0 Then
          Do
            /* header could be read, find date */
            DocDate = GetModificationDate(Header)
            If Length(ModDate) = 0 | ModDate \= DocDate Then
              Do
                /* this URL has been changed, add to list */
                /* of changed URLs and update the date */
                Changed = Changed Index
                URLS.Index = URL DocDate
              End
            Else
              /* add index to list of unchanged URLs */
              Unchanged = Unchanged Index
          End
        Else
          /* header could not be read, so treat the URL */
          /* as unchanged */
          Unchanged = Unchanged Index
      End
    Else
      /* add index to list of all commented out URLs */
      Commented = Commented Index
  End
  /* close input stream, erase it and then rewrite it */
  Call Stream URLFile, "C", "CLOSE"
  "@DEL" URLFile
  Do Index = 1 To URLS.0
    Call LineOut URLFile, URLS.Index
  End
  Call Stream URLFile, "C", "CLOSE"
  Return
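Each line of the list file holds a URL, optionally followed by the modification date stored by an earlier run. The following sketch shows how 'Parse Var' splits such a line; the variable names 'Line1' and 'Line2' and their contents are assumptions made for illustration:

```rexx
/* Sketch only: how one line of the list file is split up */
Line1 = "http://www.ibm.com THU, 18 JUL 1996 17:41:10 GMT"
Line2 = "http://www.ibm.com"

/* first word goes to URL, the rest of the line to ModDate */
Parse Var Line1 URL ModDate
Say "URL     =" URL
Say "ModDate =" ModDate

/* a line without a stored date leaves ModDate empty */
Parse Var Line2 URL ModDate
Say "Length of ModDate:" Length(ModDate)
```

Because 'ModDate' is empty for a URL that has never been checked, such a URL is reported as changed the first time its header can be read.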
After all documents have been checked, the result is formatted into
an HTML file with links to the original documents. For details of
HTML (HyperText Markup Language) see RFC 1866. The output file is created
in the function 'WriteHTML'. It deletes any existing version
of the output file, creates a simple header, formats the lists of changed,
unchanged and commented documents, and finally ends the file with a
simple trailer containing the current date and time:
/********************************************************/
/* */
/* Procedure: WriteHTML */
/* Purpose: Create a new HTML document with links to */
/* the input URLs grouped by modification. */
/* Arguments: HTML - output filename */
/* Changed - list of changed URL indices */
/* Unchanged - list of unchanged URL indices */
/* Commented - list of commented URL indices */
/* Returns: nothing */
/* */
/********************************************************/
WriteHTML: Procedure Expose URLS.
  Parse Arg HTML, Changed, Unchanged, Commented
  /* write new HTML document with links to URLs */
  "@DEL" HTML "1>NUL 2>NUL"
  Call LineOut HTML, "<html><head>"
  Call LineOut HTML, "<title>My link list</title>"
  Call LineOut HTML, "</head><body>"
  Call LineOut HTML, "<h1>Changed documents</h1>"
  Call FormatURLList HTML, Changed
  Call LineOut HTML, "<h1>Unchanged documents</h1>"
  Call FormatURLList HTML, Unchanged
  Call LineOut HTML, "<h1>Commented documents</h1>"
  Call FormatURLList HTML, Commented
  Call LineOut HTML, "<p><i>Documents checked at",
                     Date() "on" Time() "</i>"
  Call LineOut HTML, "</body></html>"
  Return
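The trailer line relies on a REXX detail that is easy to misread: a comma at the end of a line continues the clause on the next line and is replaced by a blank, so it does not start a new argument here. A minimal sketch of the same expression, using an assumed variable name 'Stamp':

```rexx
/* Sketch only: a trailing comma continues the clause on the next line */
Stamp = "<p><i>Documents checked at",
        Date() "on" Time() "</i>"

/* the pieces are joined with single blanks, date and time included */
Say Stamp
```

The result is a single string in which the text, the date, the word "on" and the time are each separated by one blank.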
The function 'FormatURLList' is used to format a single index
list into the HTML output format with one URL per line. This version of
the formatter simply writes a line break followed by a hyperlink to the
document, using the URL of the document as the link text. Another solution
would be to format the URLs in an unordered list; see the HTML reference
for more formatting options.
/********************************************************/
/* */
/* Procedure: FormatURLList */
/* Purpose: Format a list of URL indices into an HTML */
/* formatted list with links to the URLs. */
/* Arguments: HTML - output filename */
/* List - list of indices */
/* Returns: nothing */
/* */
/********************************************************/
FormatURLList: Procedure Expose URLS.
  Parse Arg HTML, List
  /* are there any indices in the list? */
  If Words(List) > 0 Then
    Do
      Do Index = 1 To Words(List)
        Idx = Word(List, Index)
        Parse Var URLS.Idx URL ModDate
        URL = Strip(URL, "L", "#")
        Call LineOut HTML, "<br><a href=""" || URL || """>"
        Call LineOut HTML, URL || "</a>"
        If Length(ModDate) > 0 Then
          Call LineOut HTML, ", last modified at" ModDate
      End
    End
  Else
    Call LineOut HTML, "<p><i>no documents in list</i><p>"
  Return
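The unordered-list alternative mentioned before the listing could be sketched as follows. 'FormatURLListUL' is a hypothetical name and not part of the sample; the small driver in front of it, the stem contents and the output filename 'links.html' are assumptions made only to keep the sketch runnable on its own:

```rexx
/* Sketch only: driver with assumed sample data and output file name. */
/* LineOut appends if links.html already exists.                      */
URLS.0 = 2
URLS.1 = "http://www.ibm.com THU, 18 JUL 1996 17:41:10 GMT"
URLS.2 = "#http://www2.hursley.ibm.com/rexx/"
Call FormatURLListUL "links.html", "1 2"
Call Stream "links.html", "C", "CLOSE"
Exit

/* Hypothetical variant of FormatURLList: one <li> entry per URL */
FormatURLListUL: Procedure Expose URLS.
  Parse Arg HTML, List
  If Words(List) > 0 Then
    Do
      Call LineOut HTML, "<ul>"
      Do Index = 1 To Words(List)
        Idx = Word(List, Index)
        Parse Var URLS.Idx URL ModDate
        /* remove the comment marker from commented out URLs */
        URL = Strip(URL, "L", "#")
        Item = '<li><a href="' || URL || '">' || URL || '</a>'
        If Length(ModDate) > 0 Then
          Item = Item || ", last modified at" ModDate
        Call LineOut HTML, Item || "</li>"
      End
      Call LineOut HTML, "</ul>"
    End
  Else
    Call LineOut HTML, "<p><i>no documents in list</i></p>"
  Return
```

Since the variant takes the same arguments as 'FormatURLList', it could be called from 'WriteHTML' in its place without further changes.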
This is a sample input file for the URL checker:
http://www.ibm.com
http://www.myhost.mydomain/users/chris.html
#http://www2.hursley.ibm.com/rexx/
After running the checker, the resulting HTML file could look like this:
<html><head>
<title>My link list</title>
</head><body>
<h1>Changed documents</h1>
<p><i>no documents in list</i><p>
<h1>Unchanged documents</h1>
<br><a href="http://www.ibm.com">
http://www.ibm.com</a>
, last modified at THU, 18 JUL 1996 17:41:10 GMT
<br><a href="http://www.myhost.mydomain/users/chris.html">
http://www.myhost.mydomain/users/chris.html</a>
, last modified at MONDAY, 22-JUL-96 19:51:25 GMT
<h1>Commented documents</h1>
<br><a href="http://www2.hursley.ibm.com/rexx/">
http://www2.hursley.ibm.com/rexx/</a>
<p><i>Documents checked at 24 Jul 1996 on 18:43:56 </i>
</body></html>
Running the program every night or early morning on a server gives you a
daily updated list of the documents you want to follow. By using the
generated HTML file as the startup page in your web browser you can access
the changed documents directly via their links.
The URL checker shown here can be improved in many ways, for example: