                 Internet JUNKBUSTER Technical Information

                                Manual Page

A copy of this page in standard man macro format is included in the tar
archive.

Name

junkbuster - The Internet Junkbuster Proxy (TM)

Synopsis

junkbuster configfile (Version 2.0 onwards)
junkbstr.exe configfile (Windows)
junkbuster [-a] [-y] [-s] [-c] [-v]
[-u user_agent] [-r referer] [-t from]
[-b blockfile] [-j jarfile] [-l logfile]
[-w NAME=VALUE] [-x Header_text]
[-h [bind_host_address][:bind_port]]
[-f forward_host[:port]] [-d N]
[-g gw_protocol[:[gw_host][:gw_port]]]
(Version 1.4 and earlier)

Description

junkbuster is an instrumentable proxy that filters the HTTP stream between
web servers and browsers. Its main purpose is to enhance privacy.

Versions before 2.0 used command-line options; Versions from 2.0 onward use
a configuration file. The following descriptions of the options first give
the older command-line usage, then the new configfile line.

In Versions 2.0.1 upwards on Windows, a start-up message is printed and the
configuration is read from the file junkbstr.ini if it exists and no
argument was given.

All files except the configfile are checked for changes before each page is
fetched, so they may be edited without restarting the proxy.

Options

-b blockfile
blockfile  blockfile
     Block requests to URLs matching any pattern given in the lines of the
     blockfile. The junkbuster instead returns status 202, indicating that
     the request has been accepted (though not completed), and a message
     identifying itself (though the browser may display only a broken image
     icon). (Versions before 2.0 returned an error 403 (Forbidden).) The
     syntax of a pattern is [domain][:port][/path] (the http:// or https://
     protocol part is omitted). To decide if a pattern matches a target,
     the domains are compared first, then the paths.

     To compare the domains, the pattern domain and the target domain
     specified in the URL are each broken into their components.
     (Components are separated by the . (period) character.) Next each of
     the target components is compared with the corresponding pattern
     component: last with last, next-to-last with next-to-last, and so on.
     (This is called right-anchored matching.) If all of the pattern
     components find their match in the target, then the domains are
     considered a match. Case is irrelevant when comparing domain
     components.

     A successfully matching pattern can be an anchored substring of a
     target, but not vice versa. Thus if a pattern doesn't specify a
     domain, it matches all domains. Furthermore, when comparing two
     components, the components must either match in their entirety or up
     to a wildcard * (star character) in the pattern. The wildcard feature
     implements only a "prefix" match capability ("abc*" vs. "abcdefg"),
     not suffix matching ("*efg" vs. "abcdefg") or infix matching
     ("abc*efg" vs. "abcdefg"). The feature is restricted to the domain
     component; it is unrelated to the optional regular expression feature
     in the path (described below).

     If a numeric port is specified in the pattern domain, then the target
     port must match as well. The default port in a target is port 80.

     If the domain and port match, then the target URL path is checked for
     a match against the path in the pattern. Paths are compared with a
     simple case-sensitive left-anchored substring comparison. Once again,
     the pattern can be an anchored substring of the target, but not vice
     versa. A path of / (slash) would match all paths. Wildcards are not
     considered in path comparisons.

     For example, the target URL
        the.yellow-brick-road.com/TinMan/has_no_brain
     would be matched (and blocked) by the following patterns
        yellow-brick-road.com
     and
        Yellow*.COM
     and
        /TinM
     but not
        follow.the.yellow-brick-road.com
     or
        /tinman
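
     The comparison rules above can be sketched in Python as follows. This
     is only an illustration of the description, not the proxy's actual
     code: the function names (split_pattern, component_matches,
     domain_matches, url_matches) are hypothetical, and the optional
     regular expression feature for paths is not handled.

```python
def split_pattern(pattern):
    """Split '[domain][:port][/path]' into (domain, port, path)."""
    host, slash, path = pattern.partition("/")
    path = slash + path                  # keep the leading slash, "" if no path
    domain, _, port = host.partition(":")
    return domain.lower(), port, path    # case is irrelevant for domains


def component_matches(pat, tgt):
    """Compare one domain component; a trailing * is a prefix wildcard."""
    if pat.endswith("*"):
        return tgt.startswith(pat[:-1])
    return pat == tgt                    # otherwise match in entirety


def domain_matches(pat_domain, tgt_domain):
    """Right-anchored comparison: last with last, and so on."""
    if not pat_domain:
        return True                      # no domain given: matches all domains
    pat = pat_domain.split(".")
    tgt = tgt_domain.split(".")
    if len(pat) > len(tgt):
        return False                     # pattern is longer than the target
    return all(component_matches(p, t)
               for p, t in zip(reversed(pat), reversed(tgt)))


def url_matches(pattern, target):
    """Return True if the pattern matches the target URL (no protocol part)."""
    p_domain, p_port, p_path = split_pattern(pattern)
    t_domain, t_port, t_path = split_pattern(target)
    if not domain_matches(p_domain, t_domain):
        return False
    if p_port and p_port != (t_port or "80"):   # default target port is 80
        return False
    # Paths: case-sensitive, left-anchored substring comparison.
    return (t_path or "/").startswith(p_path)
```

     Under these assumptions, url_matches("Yellow*.COM",
     "the.yellow-brick-road.com/TinMan/has_no_brain") is true, while the
     pattern /tinman fails because path comparison is case-sensitive.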

     Comments in a blockfile start with a # (hash) character and end at a
     new line. Blank lines are also ignored.

     Lines beginning with a ~ (tilde) character are taken to be exceptions:
     a URL blocked by previous patterns that matches the rest of the line
     is let through. (The last match wins.)
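
     A minimal sketch of how comments, blank lines and ~ exceptions could
     be combined under the last-match-wins rule. The function is_blocked is
     hypothetical; it takes the pattern comparison as a caller-supplied
     predicate, and prefix_match below is only a crude stand-in for the
     real domain and path comparison.

```python
def is_blocked(url, blockfile_lines, matches):
    """Return True if the last matching blockfile line blocks the URL.

    matches(pattern, url) is a caller-supplied predicate (an assumption of
    this sketch) that implements the pattern comparison.
    """
    blocked = False
    for raw in blockfile_lines:
        line = raw.split("#", 1)[0].strip()   # a comment runs from # to end of line
        if not line:
            continue                          # blank lines are ignored
        if line.startswith("~"):
            if matches(line[1:], url):
                blocked = False               # exception: let the URL through
        elif matches(line, url):
            blocked = True                    # a later line may still override this
    return blocked


def prefix_match(pattern, url):
    """Stand-in matcher for the demo only: plain left-anchored comparison."""
    return url.startswith(pattern)


example_lines = [
    "# block an ad server",
    "ads.example.com",
    "",
    "~ads.example.com/allowed   # but let this directory through",
]
```

     With example_lines, a request for ads.example.com/banner.gif is
     blocked, while ads.example.com/allowed/logo.gif is let through by the
     later exception line.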

     Patterns may contain POSIX regular expressions provided the junkbuster
     was compiled with this option (the default in Version 2.0 on). The
     idiom /*.*/ad can then be used to match any URL containing /ad (such
     as http://nomatterwhere.com/images/advert/g3487.gif for example).
     These expressions don't work in the domain part.

     In version 1.3 and later the blockfile and cookiefile are checked for
     changes before each request.

-w NAME=VALUE
wafer  NAME=VALUE
     Specifies a pair to be sent as a cookie with every request to the
     server. (Such boring cookies are called wafers.) This option may be
     called more than once to generate multiple wafers. The original
     Netscape specification prohibited semi-colons, commas and white space;
     these characters will be URL-encoded if used in wafers. The Path and
     Domain attributes are not currently supported.
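
     As an illustration of the encoding described above, the following
     Python sketch percent-encodes semi-colons, commas and white space in a
     wafer. The function names are hypothetical, and the assumption that
     exactly these characters (and no others) are encoded is this sketch's
     reading of the text, not a statement about the proxy's code.

```python
def encode_wafer_part(text):
    """Percent-encode the characters the Netscape specification prohibited."""
    out = []
    for ch in text:
        if ch in ";, \t\r\n":                 # semi-colon, comma, white space
            out.append("%{:02X}".format(ord(ch)))
        else:
            out.append(ch)
    return "".join(out)


def make_wafer(name, value):
    """Build the NAME=VALUE pair sent to the server as a cookie."""
    return encode_wafer_part(name) + "=" + encode_wafer_part(value)
```

     For example, make_wafer("NOTICE", "no cookies, please") would yield
     NOTICE=no%20cookies%2C%20please.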

-c cookiefile
cookiefile  cookiefile
     Enforce the cookie management policy specified in the cookiefile. If
     this option is not used all cookies are silently crunched, so that
     users who never want cookies aren't bothered by browsers asking
     whether each cookie should be accepted. However, cookies can still get
     through via JavaScript and SSL, so alerts should be left on.

     In Version 1.2 and later this option must be followed by a filename
     containing instructions on which sites are allowed to receive and set
     cookies. By default cookies are dropped in both the browser's request
     and the server's response, unless the URL requested matches an entry
     in the cookiefile. The matching algorithm is the same as for the
     blockfile. A leading > character allows server-bound cookies only; a <
     allows only browser-bound cookies; a ~ character stops cookies in both
     directions. Thus a cookiefile containing a single line with the two
     characters >* will pass on all cookies to servers but not give any new
     ones to the browser.
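
     The leading characters can be read as a pair of permissions, one for
     each direction. The Python sketch below shows one way to interpret a
     cookiefile line; the function name is hypothetical, and treating an
     entry with no leading character as allowing both directions is this
     sketch's reading of the text above.

```python
def parse_cookie_line(line):
    """Return (pattern, to_server, to_browser) for one cookiefile entry."""
    if line.startswith(">"):
        return line[1:], True, False     # server-bound cookies only
    if line.startswith("<"):
        return line[1:], False, True     # browser-bound cookies only
    if line.startswith("~"):
        return line[1:], False, False    # stop cookies in both directions
    return line, True, True              # assumed: plain entry allows both
```

     For the single-line cookiefile mentioned above, parse_cookie_line(">*")
     gives the pattern * with cookies passed to servers but not to the
     browser.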

-j jarfile
jarfile  jarfile
     All Set-cookie attempts by the server are logged to jarfile. If no
     wafer is specified, one containing a canned notice (the vanilla wafer)
     is added as an alert to the server unless the suppress-vanilla-wafer
     option is invoked.

-v
suppress-vanilla-wafer
     Suppress the vanilla wafer.

-t from
from  from
     If the browser discloses an email address in the FROM header (most
     don't), replace it with from. If from is set to . (the period
     character) the FROM is passed to the server unchanged. The default is
     to delete the FROM header.

-r referer
referer  referer
     Whenever the browser discloses the URL that led to the current
     request, replace it with referer. If referer is set to . (period) the
     URL is passed to the server unchanged. In Version 1.4 and later, if
     referer is set to @ (at) the URL is sent in cases where the cookiefile
     specifies that a cookie would be sent. (No way to send bogus referers
     selectively is provided.) The default is to delete Referer.

     Version 2.0 onwards also accepts the spelling referrer, which most
     dictionaries consider correct.
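
     The special values can be summarised in a short Python sketch. The
     function and parameter names are hypothetical, and dropping the header
     when @ is set but no cookie would be sent is an assumption drawn from
     the default behaviour described above.

```python
def outgoing_referer(browser_value, setting=None, cookie_would_be_sent=False):
    """Return the Referer value to send, or None to delete the header."""
    if browser_value is None:
        return None                      # the browser disclosed nothing
    if setting is None:
        return None                      # default: delete Referer
    if setting == ".":
        return browser_value             # pass the URL unchanged
    if setting == "@":
        # Send the real URL only where the cookiefile would allow a cookie.
        return browser_value if cookie_would_be_sent else None
    return setting                       # replace with the configured value
```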

-u user-agent
user-agent  user-agent
     Information disclosed by the browser about itself is replaced with the
     value user-agent. If user-agent is set to . (period) the User-Agent
     header is passed to the server unchanged, along with any UA headers
     produced by MS-IE (which would otherwise be deleted). In Version 1.4
     and later, if user-agent is set to @ (at) these headers are sent
     unchanged in cases where the cookiefile specifies that a cookie would
     be sent; otherwise only the default User-Agent header is sent. That
     default is Mozilla/3.0 (Netscape) with an unremarkable Macintosh
     configuration. If used with a browser less advanced than Mozilla/3.0
     or IE-3, the default may encourage pages containing extensions that
     confuse the browser.

-h [host][:port]
listen-address  [host][:port]
     If host is specified, bind the junkbuster to that IP address. If a
     port is specified, use it. The default port is 8000; the default host
     is localhost. Before Version 2.0.2, the default was to bind to all IP
     addresses (INADDR_ANY); but this has been restricted to localhost to
     avoid unintended security breaches. (To open the proxy to all, use the
     line
        listen-address :8000
     in the configuration file.) For fine-grained control, use the aclfile
     option.
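
     A rough Python sketch of how the [host][:port] argument might be
     interpreted. The function name is hypothetical. It assumes that an
     absent option means localhost port 8000, and that an empty host part
     (as in :8000) means all addresses, which is represented here by the
     empty string.

```python
def parse_listen_address(spec=None):
    """Return (host, port); an empty host stands for all IP addresses."""
    if spec is None:
        return "localhost", 8000         # defaults from Version 2.0.2 on
    host, _, port = spec.partition(":")
    return host, int(port) if port else 8000
```

     Under these assumptions parse_listen_address(":8000") returns an empty
     host with port 8000, which opens the proxy to all.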

-f forward_host[:port]
forwardfile  forwardfile
     Version 1.X required all HTTP requests from the client to be forwarded
     to the same destination. Version 2.0 onwards takes its routing
     specification from a forwardfile, allowing selection of the proxy
     (a.k.a. forwarding host) and gateway according to the URL. Here is a
     typical line.

     *         lpwa.com:8000      .      .

     Each line contains four fields: target, forward_to, via_gateway_type
     and gateway. As usual, the last target domain that matches the
     requested URL wins, and the * character alone matches any domain. The
     target domain need not be a fully qualified hostname; it can be a
     general domain such as com or co.uk or even just a port number. For
     example, because LPWA does not handle SSL, the line above will
     typically be followed by a line such as

     :443    .      .      .

     to allow SSL transactions to proceed directly. The cautious would also
     add an entry in their blockfile to stop transactions to port 443 for
     all but specified trusted sites.

     If the winning forward_to field is . (the dot character) the proxy
     connects directly to the server given in the URL, otherwise it
     forwards to the host and port number specified. The default port is
     8000. The via_gateway_type and gateway fields also use a dot to
     indicate no gateway protocol. The gateway protocols are explained
     below.

     The example line above in a forwardfile alone would send everything
     through port 8000 at lpwa.com with no gateway protocol, and is
     equivalent to the old -f lpwa.com:8000 with no -g option. For more
     information see the example file provided with the distribution.
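
     To make the four fields concrete, here is a small Python sketch that
     splits one forwardfile line. The function name and the returned
     dictionary keys are hypothetical; the default ports (8000 for the
     forwarding host, 1080 for a SOCKS gateway) are the ones given in this
     page. Selecting the winning line for a URL is not shown.

```python
def parse_forward_line(line):
    """Split a forwardfile line into its four whitespace-separated fields."""
    target, forward_to, gw_type, gateway = line.split()

    def host_port(field, default_port):
        if field == ".":
            return None                  # a dot means "none" / connect directly
        host, _, port = field.partition(":")
        return host, int(port) if port else default_port

    return {
        "target": target,
        "forward_to": host_port(forward_to, 8000),
        "gateway_type": None if gw_type == "." else gw_type,
        "gateway": host_port(gateway, 1080),
    }
```

     For the typical line shown earlier, the result has target *,
     forward_to ("lpwa.com", 8000) and no gateway.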

     Configure with care: no loop detection is performed. When setting up
     chains of proxies that might loop back, try adding Squid.

-g gw_protocol[:[gw_host][:gw_port]]
     Use gw_protocol as the gateway protocol. This option was introduced in
     Version 1.4, but was folded into the forwardfile option in Version
     2.0. The default is to use no gateway protocol; this may be explicitly
     specified as direct on the command line or the dot character in the
     forwardfile. The SOCKS4 protocol may be specified as socks or socks4.
     The SOCKS4A protocol is specified as socks4a. The SOCKS5 protocol is
     not currently supported. The default SOCKS gw_port is 1080.

     The user's browser should not be configured to use SOCKS; the proxy
     conducts the negotiations, not the browser.

     The user identification capabilities of SOCKS4 are deliberately not
     used; the user is always identified to the SOCKS server as
     userid=anonymous. If the server's policy is to reject requests from
     anonymous, the proxy will not work. Use a debug value of 3 to see the
     status returned by the server.

-d N
debug  N
     Set debug mode. The most common value is 1, to pinpoint offensive
     URLs, so they can be added to the blockfile. The value of N is a
     bitwise logical-OR of the following values:
     1 = URLs (show each URL requested by the browser);
     2 = Connections (show each connection to or from the proxy);
     4 = I/O (log I/O errors);
     8 = Headers (as each header is scanned, show the header and what is
     done to it);
     16 = Log everything (including debugging traces and the contents of
     the pages).
     If no logfile is defined, the debug data goes to standard output.
     Multiple debug lines are permitted; they are logical OR-ed together.

     Because most browsers send several requests in parallel the debugging
     output may appear intermingled, so the single-threaded option is
     recommended when using debug with N greater than 1.
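
     Because N is a bitwise OR, a value can be built from, or broken down
     into, the individual flags. A small Python sketch (the names
     DEBUG_FLAGS, describe_debug and combine_debug are hypothetical):

```python
DEBUG_FLAGS = {
    1: "URLs",
    2: "Connections",
    4: "I/O",
    8: "Headers",
    16: "Log everything",
}


def describe_debug(n):
    """List the names of the flags set in the debug value n."""
    return [name for bit, name in DEBUG_FLAGS.items() if n & bit]


def combine_debug(values):
    """OR together the values of several debug lines."""
    n = 0
    for value in values:
        n |= value
    return n
```

     For example, debug lines with the values 1 and 8 combine to 9, which
     shows each URL requested and each header as it is scanned.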

-y
add-forwarded-header
     Add X-Forwarded-For headers to the server-bound HTTP stream indicating
     the client IP address to the server, in the new style of Squid 1.1.4.
     If you want the traditional HTTP_FORWARDED response header, add it
     manually with the -x option.

-x HeaderText
add-header  HeaderText
     Add the HeaderText verbatim to requests to the server. Typical uses
     include adding old-style forwarding notices such as Forwarded: by
     http://pro-privacy-isp.net and reinstating the Proxy-Connection:
     Keep-Alive header (which the junkbuster deletes so as not to reveal
     its existence). No checking is done for correctness or plausibility,
     so it can be used to throw any old trash into the server-bound HTTP
     stream. Please don't litter.

-s
single-threaded
     Do not fork() a separate process (or create a separate thread) to
     handle each connection. Useful when debugging to keep the process
     single-threaded.

-l logfile
logfile  logfile
     Write all debug data into logfile. The default logfile is the standard
     output.

aclfile  aclfile
     Unless this option is used, the proxy talks to anyone who can connect
     to the listen-address, and everyone who can has equal permissions on
     where they can go. An access file allows restrictions to be placed on
     these two policies, by distinguishing some source IP addresses and/or
     some destination addresses. (If a forwarder or a gateway is being
     used, its address is considered the destination address, not the
     ultimate IP address of the URL requested.)

     Each line of the access file begins with either the word permit or
     deny followed by source and (optionally) destination addresses to be
     matched against those of the HTTP request. The last matching line
     specifies the result: if it was a deny line or if no line matched, the
     request will be refused.

     A source or destination can be specified as a single numeric IP
     address, or with a hostname, provided that the host's name can be
     resolved to a numeric address: this cannot be used to block all .mil
     domains for example, because there is no single address associated
     with that domain name. Either form may be followed by a slash and an
     integer N, specifying a subnet mask of N bits. For example, permit
     207.153.200.72/24 matches the entire Class-C subnet from 207.153.200.0
     through 207.153.200.255. (A netmask of 255.255.255.0 corresponds to 24
     bits of ones in the netmask, as with *_MASKLEN=24.) A value of 16
     would be used for a Class-B subnet. A value of zero for N in the
     subnet mask length will cause any address to match; this can be used
     to express a default rule. For more information see the example file
     provided with the distribution.
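
     The rules above (subnet mask of N bits, last matching line wins,
     refusal when nothing matches) can be sketched with Python's standard
     ipaddress module. The function names are hypothetical, and this
     sketch handles numeric addresses only; hostnames would first have to
     be resolved.

```python
import ipaddress


def parse_acl_line(line):
    """Return (action, source_network, destination_network_or_None)."""
    fields = line.split()
    action = fields[0]                                  # "permit" or "deny"
    # strict=False lets 207.153.200.72/24 stand for the whole subnet.
    src = ipaddress.ip_network(fields[1], strict=False)
    dst = (ipaddress.ip_network(fields[2], strict=False)
           if len(fields) > 2 else None)                # destination is optional
    return action, src, dst


def is_permitted(acl_lines, source, destination):
    """Apply the access file; the last matching line specifies the result."""
    src_ip = ipaddress.ip_address(source)
    dst_ip = ipaddress.ip_address(destination)
    allowed = False                                     # no line matched: refuse
    for line in acl_lines:
        action, src, dst = parse_acl_line(line)
        if src_ip in src and (dst is None or dst_ip in dst):
            allowed = (action == "permit")
    return allowed
```

     With the two lines permit 207.153.200.72/24 and deny 207.153.200.9,
     every address from 207.153.200.0 through 207.153.200.255 is accepted
     except 207.153.200.9, and all other sources are refused.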

     If you like these access controls you should probably have a firewall;
     they are not intended to replace one.

trustfile  trustfile
     This feature is experimental, has not been fully documented and is
     very subject to change. The goal is for parents to be able to choose a
     page or site whose links they regard suitable for their young children
     and for the proxy to allow access only to sites mentioned there. To do
     this the proxy examines the referer variable on each page request to
     check they resulted from a click on the ``trusted referer'' site: if
     so the referred site is added to a list of trusted sites, so that the
     child can then move around that site. There are several uncertainties
     in this scheme that experience may be able to iron out; check back in
     the months ahead.

trust_info_url  trust_info_url
     When access is denied due to lack of a trusted referer, this URL is
     displayed with a message pointing the user to it for further
     information.

hide-console
     In the Windows version only, instructs the program to disconnect from
     and hide the command console after starting.

-a
     (Obsolete) Accept the server's Set-cookie headers, passing them
     through to the browser. This option was removed in Version 1.2 and
     replaced by an improvement to the -c option.

Installation and Use

Browsers must be told where to find the junkbuster (e.g. localhost port
8000). To set the HTTP proxy in Netscape 3.0, go through: Options; Network
Preferences; Proxies; Manual Proxy Configuration; View. See the FAQ for
other browsers. The Security Proxy should also be set to the same values,
otherwise shttp: URLs won't work.

Note the limitations explained in the FAQ.

Checking Options

To allow users to check that a junkbuster is running and how it is
configured, it intercepts requests for any URL ending in /show-proxy-args
and blocks them, returning instead information on its version number and
current configuration, including the contents of its blockfile. To get an
explicit warning that no junkbuster intervened when the browser's proxy
setting is not configured, it's best to point the browser to a URL that
gives one, such as
http://internet.junkbuster.com/cgi-bin/show-proxy-args on the Junkbusters
website.

See Also

http://www.junkbusters.com/ht/en/ijbfaq.html
http://www.junkbusters.com/ht/en/cookies.html
http://internet.junkbuster.com/cgi-bin/show-proxy-args
http://www.cis.ohio-state.edu/htbin/rfc/rfc2109.html
http://squid.nlanr.net/Squid/
http://www-math.uni-paderborn.de/~axel/

Copyright and GPL

Written and copyright by the Anonymous Coders and Junkbusters Corporation
and made available under the GNU General Public License (GPL). This
software comes with NO WARRANTY. Internet Junkbuster Proxy is a trademark
of Junkbusters Corporation.

Copyright 1996-9 Junkbusters Corporation. Copying and distribution
permitted under the GNU General Public License. 1999/07/01
http://www.junkbusters.com/ht/en/ijbman.html
webmaster@junkbusters.com
