               Generic Methods of Anti-Virus Technology.
                   Copyright (c) 1995, by Zvi Netiv.

ABSTRACT

This paper provides an introductory discussion of current generic
methods of antivirus security. Integrity analyzers are contrasted with
checksumming methods and several advanced generic techniques of virus
capture and integrity analysis are introduced. These concepts are then
applied to virus detection and system recovery with the conclusion that
generic integrity analyzers provide the most fundamental component in an
integrated antivirus security protocol.

INTRODUCTION.

Antivirus (AV) programs usually belong to one of the following
categories: on-demand scanners, TSR scanners, activity blockers, and
generic AV tools. While scanners and blockers are generally understood,
there is some confusion about what is meant by generic AV. For most
users, as well as the majority of computer security experts, generic AV
brings to mind only 'integrity checking.'

However, checksumming isn't really a generic AV method. The term
generic, from the Latin 'genus,' implies that a method applies to one
group or kind, to the exclusion of others. While checksumming applies
to all viruses, it does not apply ONLY to viruses. The checksum of a
file, or of a program, may change for many reasons most of which are not
connected with virus infection. A few common, non-viral reasons for the
changing of a file, and of its checksum, are its replacement with a
newer version, self configuration, and corruption due to truncation and
cross linking. There are other causes, as well.

Checksumming has long been used for antivirus integrity testing since
viruses need to change the hosts they invade in order to get executed
and replicate. Most antivirus products now include some sort of
integrity check. The majority use a CRC, while others use a proprietary
checksum algorithm. Cryptographic methods like DES are used as well.
Although no such case has been reported yet, it is possible in principle
for a virus to adjust an infected file so that its checksum is left
unchanged. A CRC algorithm is complex enough, however, that it seems
safe to assume that no virus writer will bother to include one in the
virus's code.

Checksum integrity checking does not take full advantage of the
capabilities of modern generic antivirus methods. These new technologies
cover all aspects of antivirus protection. Some of the new capabilities
are not possible with signature scanners and activity blockers. Generic
techniques include virus capture, damage recovery, and advanced methods
like correlation.

GENERIC VIRUS CAPTURE.

When a virus strikes, the first event to identify is that something is
wrong. Since viruses are real programs that have to execute their code
somehow, they necessarily leave traces. Virus capturing, the equivalent
of detection in generic terminology, is based on sensing phenomena that
indicate the possible presence of a virus. This is different from
finding a specific signature in memory or in a file, which is the
method used in scanners and TSR's. The range of useful phenomena
for virus capturing is broad and includes self-baiting, self integrity
checking, verification of memory stealing, launching bait sequences,
sensing piggybacking or file killing, integrity checking (yes, this
too), and the sensing of deception like the taking away of the file
handle. It's also possible to use tunneling for capturing boot viruses,
either by tracing the origin of certain interrupts, or with hardware
access.

None of these generic methods requires a priori knowledge of specific
viruses, as signature scanners do. All viruses, if they conform to the
definition of a virus, will disclose their presence to one or more of
the sensing methods mentioned. This has important implications. Virus
capturing methods can detect the activity of both existing and new
viruses, which isn't possible using known virus signatures.

GENERIC INTEGRITY ANALYZERS.

Integrity analyzers deserve special attention because they are the most
powerful of the generic methods. A properly designed analyzer can tell
you more about a virus, in a matter of moments, than any other tool.
When properly used the integrity analyzer can determine the size of the
attacking virus, whether it uses stealth, if it's full or only semi-
stealth (there is a difference in recovering from each), and if the file
is recoverable.

To understand how integrity analyzers work it is instructive to compare
them to integrity checkers. The latter calculate a number (a CRC) that
represents every byte in the processed file. Changing a single byte
anywhere in the file will result in a different calculated CRC.
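
As a minimal sketch of what such a checker computes, the following
Python fragment uses the standard zlib CRC-32. The byte string standing
in for a program file, and the name crc32_of, are illustrative
assumptions and are not taken from any particular product.

```python
import zlib

def crc32_of(data: bytes) -> int:
    """Return the CRC-32 of data as an unsigned 32-bit integer."""
    return zlib.crc32(data) & 0xFFFFFFFF

original = bytes(range(256)) * 4       # stand-in for a program file
modified = bytearray(original)
modified[100] ^= 0x01                  # flip one bit in a single byte

crc_before = crc32_of(original)
crc_after = crc32_of(bytes(modified))

# The checker learns only that the two numbers differ.
print(f"{crc_before:08X} -> {crc_after:08X}")
```

The two values differ, but neither number says where the change is or
what caused it.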

For example, DOS's SETVER.EXE program keeps its table of program
versions in a special section of the SETVER.EXE file itself, a kind of
'internal overlay'. Entries can be added to or removed from the SETVER
table. Such changes are legitimate, as they are the purpose of the
program, and adding or deleting an entry does not indicate infection.
An integrity checker will detect changes to the SETVER table without
distinguishing between changes caused by a virus and legitimate ones.
An antivirus integrity analyzer can tell the difference.

The DOS programs that interest us are of two types, COM and EXE. There
are other executable types as well, such as SYS, OVL, DLL, and DRV, but
they are of no particular interest to us here, since most of them are
covered under the umbrella of the EXE and New EXE types. Every program
has an entry point, which is where DOS begins executing the program's
instructions. The entry point is usually indicated by a
'jump' or 'call' instruction at the beginning of COM files, or by a set
of pointers, contained in the header of EXE programs. In both cases, the
entry point is well defined and can be found from the file parameters.
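
The lookup can be sketched in Python as follows. The header offsets
used (header size in paragraphs at 08h, initial IP at 14h, initial CS
at 16h) are those of the standard DOS MZ header. The function name
entry_offset is hypothetical, New EXE files are not handled, and a COM
file that does not begin with a plain jump is simply treated as
entering at offset 0.

```python
import struct

def entry_offset(image: bytes) -> int:
    """Return the file offset where execution effectively begins."""
    if len(image) >= 0x18 and image[:2] in (b"MZ", b"ZM"):
        # EXE: the entry point is CS:IP, relative to the end of the header.
        (hdr_paras,) = struct.unpack_from("<H", image, 0x08)
        ip, cs = struct.unpack_from("<HH", image, 0x14)
        return hdr_paras * 16 + ((cs * 16 + ip) & 0xFFFFF)
    # COM: execution starts at file offset 0; follow a leading jump.
    if len(image) >= 3 and image[0] == 0xE9:       # JMP near, rel16
        (rel,) = struct.unpack_from("<H", image, 1)
        return (3 + rel) & 0xFFFF
    if len(image) >= 2 and image[0] == 0xEB:       # JMP short, rel8
        (rel,) = struct.unpack_from("<b", image, 1)
        return 2 + rel
    return 0

# A COM image whose first instruction is JMP +0010h.
print(hex(entry_offset(b"\xE9\x10\x00" + b"\x90" * 32)))   # 0x13
```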

When a virus infects a program, it either appends, prepends, or inserts
its code into the file and then modifies the entry pointer(s) to start
execution from the virus code. Let's assume, for example, that we take a
snapshot of a few bytes at the entry point of the uninfected file. If we
then compare the entry points of the original and the infected file, and
the bytes at the same offset relative to each entry point, we'll see
that they have changed. Normally we will find three kinds of changes in
the infected file when it is compared to the uninfected one. The header
(or the jump
address at the beginning of the file) has changed, the code itself at
the entry point has changed, and the location of the entry point has
changed, as well.
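
The comparison can be illustrated with a small Python sketch. It models
only a COM file that starts with a near jump, and the "infected" image
is a harmless stand-in built by appending filler bytes and redirecting
the jump. The names Snapshot, snapshot and changed_fields are
hypothetical.

```python
import struct
from typing import List, NamedTuple

class Snapshot(NamedTuple):
    head: bytes        # jump instruction at the start of the file
    entry: int         # file offset of the entry point
    entry_code: bytes  # a few bytes of code found at the entry point

def com_entry(image: bytes) -> int:
    """Follow a leading near jump (opcode E9h); otherwise entry is 0."""
    if len(image) >= 3 and image[0] == 0xE9:
        (rel,) = struct.unpack_from("<H", image, 1)
        return (3 + rel) & 0xFFFF
    return 0

def snapshot(image: bytes, n: int = 8) -> Snapshot:
    entry = com_entry(image)
    return Snapshot(image[:3], entry, image[entry:entry + n])

def changed_fields(before: Snapshot, after: Snapshot) -> List[str]:
    return [f for f in Snapshot._fields
            if getattr(before, f) != getattr(after, f)]

# Clean program: a jump to offset 3 followed by the program body.
clean = b"\xE9\x00\x00" + b"\xB4\x09\xCD\x21\xCD\x20" + b"\x00" * 20
# Stand-in for an appending infection: filler bytes are added at the end
# and the jump is redirected to them. No real virus code is involved.
infected = (b"\xE9" + struct.pack("<H", len(clean) - 3)
            + clean[3:] + b"\x90" * 16)

diff = changed_fields(snapshot(clean), snapshot(infected))
print(diff)
```

All three sampled fields differ between the two images, which matches
the three kinds of change described above.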

Just a few significant bytes can reveal more information about virus
infection than a complete file CRC, or the cryptographic signature of
the whole file. Moreover, while the latter is incapable of revealing
whether the change was caused by a virus, the data used in generic
integrity analysis inherently contains this capability. We need, of
course, a few more parameters to confirm whether the change is viral, or
due to non-viral factors such as corruption, or perhaps installation of
a newer version of the program. A possible way to discriminate between
the options is to include in a file's integrity database parameters that
should NOT change as a result of infection by a virus. The presence of
these parameters in the modified file necessarily indicates a virus
infection, and their absence implies that the change is not viral.
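
One way to picture this test is the Python sketch below. The paper does
not say which parameters are stored, so the choice made here (a short
sample of the program body that an infection is expected to leave
intact) is purely an assumption, as are the names Record, make_record
and classify and the wording of the verdicts.

```python
from typing import NamedTuple

class Record(NamedTuple):
    length: int   # size of the clean file
    head: bytes   # first bytes; an infection is expected to change these
    body: bytes   # sample an infection is expected to leave intact

def make_record(clean: bytes, body_at: int = 16, n: int = 16) -> Record:
    return Record(len(clean), clean[:4], clean[body_at:body_at + n])

def classify(rec: Record, current: bytes) -> str:
    # "unchanged" only means that the sampled parameters still match.
    if len(current) == rec.length and current[:4] == rec.head:
        return "unchanged"
    # The original body is still inside the modified file, so the
    # program was altered around its own code rather than replaced.
    if rec.body and rec.body in current:
        return "consistent with infection"
    return "probably not viral"

clean = bytes(range(64))                      # stand-in for a program
rec = make_record(clean)
appended = b"\xE9\x3D\x00\x90" + clean[4:] + b"\xCC" * 32
replaced = bytes(reversed(range(64))) * 2     # e.g. a newer version

print(classify(rec, appended))
print(classify(rec, replaced))
```

A practical analyzer would need more than one such parameter before
reaching a verdict; the sketch shows only the shape of the decision.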

Let's now consider the potential of integrity analyzers for determining
the specific file alterations and methods of operation used by a
particular virus, and how to handle them. Many of the newer viruses use
stealth, although not all have full stealth capability. Discriminating
between semi- and full stealth depends upon whether an integrity checker
can detect changes when the virus is active in memory. If no changes are
detected, then the virus is full stealth and cooperative methods can be
used to remove it. If changes can be seen when the virus is active in
memory, but no change can be seen in file size, then the virus is
semi-stealth. Some semi-stealth viruses will show a DECREASE in file
size by exactly the length of the virus code. This is the result of an
unsuccessful attempt to conceal their presence.
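
The decision rules in this paragraph can be written down as a short
Python function. It assumes that the file is already known to be
infected and that the observations were made while the virus was
resident, relative to the clean record in the integrity database. The
function name, its parameters, and the catch-all "no stealth" result
are assumptions added for illustration.

```python
from typing import Optional

def classify_stealth(content_change_seen: bool,
                     size_delta_seen: int,
                     virus_length: Optional[int] = None) -> str:
    """Classify a resident virus from what an integrity check observes."""
    if not content_change_seen and size_delta_seen == 0:
        # The virus presents the clean file in every respect.
        return "full stealth"
    if content_change_seen and size_delta_seen == 0:
        # Only the file size is hidden.
        return "semi-stealth"
    if virus_length is not None and size_delta_seen == -virus_length:
        # The size is reduced by exactly the length of the virus code.
        return "semi-stealth (size over-corrected)"
    return "no stealth"

print(classify_stealth(False, 0))
print(classify_stealth(True, 0))
```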

The conclusion, which is perhaps surprising to some, is that CRC
methods, and cryptographic signatures, are not the best suited for
antivirus integrity checking. A generic integrity analyzer is better
adapted for this purpose. It can identify changes in program files
specific to virus presence and also potentially critical virus
characteristics.

GENERIC RECOVERY FROM VIRUS ATTACKS.

Generic recovery is the restoration of infected programs to their
original, pre-infected state. Many security experts recommend that a
system be recovered by deleting the infected programs and replacing them
with clean ones from backup. The main reason given for preferring
replacement to restoration is the claim that you can't be sure the
restoration results in a byte for byte identity with the original
program. The fact is, however, that this can be easily
verified. In the highly technological world in which we live, there is
no room for superstition or speculation, certainly not if the facts can
be verified. For a privately owned PC, or for a critical application PC,
replacement might be the simplest course of action. But in business
network environments, where time is of prime importance, fast recovery
of executable programs may be imperative. Critical data files can always
be restored from backup if there is a concern for their integrity. In
fact, rapid program restoration can speed the process of restoring
critical data files from backup by returning a system to operational
status more quickly.

Computer viruses are deterministic code and they function in a
deterministic way, too. Virus names like Satan and Devil's Dance are
just folklore and have nothing to do with unnatural powers. The
exactness of the recovery of a program from a virus can be verified
easily by comparing, byte for byte, a restored program to a clean one
from backup. In case of a massive infection, generically restore a few
infected samples and compare them to clean originals to determine
whether complete restoration is possible. If it is, then you can be sure
that the restoration of the rest will be complete, as well. This results
from the deterministic nature of a specific virus's method of infection
and the inherent logical structure of executable files.
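
The sample check described above amounts to a byte for byte
comparison, sketched here in Python. The function names are
hypothetical, and the byte strings in the usage lines stand in for the
contents of restored files and their clean backup copies.

```python
from typing import Iterable, Optional, Tuple

def first_difference(restored: bytes, clean: bytes) -> Optional[int]:
    """Return the offset of the first differing byte, or None if equal."""
    for offset, (a, b) in enumerate(zip(restored, clean)):
        if a != b:
            return offset
    if len(restored) != len(clean):
        # One file is a prefix of the other.
        return min(len(restored), len(clean))
    return None

def restoration_verified(pairs: Iterable[Tuple[bytes, bytes]]) -> bool:
    """True if every (restored, clean) sample pair is identical."""
    return all(first_difference(r, c) is None for r, c in pairs)

samples = [(b"\xE9\x00\x00program one", b"\xE9\x00\x00program one"),
           (b"MZ program two", b"MZ program two")]
print(restoration_verified(samples))
```

Reporting the offset of the first mismatch, and not just a yes or no,
makes it easier to see what a failed restoration left behind.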

An advantage of advanced generic methods is that file integrity
authentication and file restoration can be accomplished using the same
database files. This capability results from the generic nature of the
processes involved. It also further demonstrates the value of generic
integrity analysis over CRC and cryptographically based checksummers.
The databases of the latter do not contain the information required to
restore infected programs. A checksum (or CRC) is just a 16, 32 or 64
bit number. How can you restore a file with the knowledge that its
pre-infected checksum was 1234 and when infected it is 4321? By
contrast, when critical program file characteristics have been sampled
and stored in databases, it is possible to use this information to
restore files to their original condition, byte for byte.
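
A minimal Python sketch of the idea follows. It covers only the
simplest case, a virus that appends its code and overwrites the first
few bytes of the host; other infection methods would need more stored
information. The record layout and the names Record, make_record and
restore are assumptions, and the "infected" image is built from filler
bytes only.

```python
from typing import NamedTuple

class Record(NamedTuple):
    length: int   # size of the clean file
    head: bytes   # first bytes of the clean file

def make_record(clean: bytes, n: int = 16) -> Record:
    """Sample the clean file before any infection takes place."""
    return Record(len(clean), clean[:n])

def restore(infected: bytes, rec: Record) -> bytes:
    """Undo an appending infection: put back the head, cut off the tail."""
    return rec.head + infected[len(rec.head):rec.length]

clean = b"\xE9\x00\x00" + bytes(range(40))
record = make_record(clean)

# Stand-in for an infected copy: redirected jump plus appended filler.
infected = b"\xE9\x28\x00" + clean[3:] + b"\x90" * 64

restored = restore(infected, record)
print(restored == clean)   # True
```

The stored record is only a few bytes larger than a checksum, yet it is
enough to rebuild the clean file in this case.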

COOPERATIVE RECOVERY METHODS.

Cooperative methods form a special category of generic recovery
methods. They apply only to full stealth viruses, of both the boot and
file infector types.

The principle involved is extremely simple. The recovery process takes
advantage of the fact that a full stealth virus, either boot or file,
will present the correct, uninfected data of the inspected sector or
file, when the virus is active in memory.

To recover from a stealthed boot infector (MBR infectors are referred to
as "boot infectors," as well), simply copy the stealthed image of the
infected sector and rewrite it to the same place using tunneling
techniques. The advantage of cooperative recovery over the undocumented
'generic' technique known as FDISK /MBR is that the cooperative method
writes back EXACTLY what was in the master boot sector in the first
place, whereas in many cases FDISK may cause more harm than good.
There are products that implement this cooperative recovery
method under the name 'SeeThru' technology.

Another effective recovery technique, this time for files, is
cooperative integrity checking. A new integrity database is established
while the virus is
still active in memory. Then the computer is rebooted and the programs
are restored from the database made when the virus was resident. This
technique is effective only against full stealth viruses. It has been
implemented successfully against Tremor, a common virus in Germany, and
works against other full stealth viruses such as NATAS, Die_Hard,
Hemlock, N8fall, Invisible Man, Uruguay, and all strains of Frodo, as
well.

VIRUS ANALYZERS.

The problem that haunts security experts when they face a new, or
modified virus, is that it usually takes days, sometimes weeks or even
months, until antivirus developers have an algorithm available to
restore systems from the new virus. We have seen instances when whole
enterprises were halted for days because of attacks from new viruses.

Generic methods have a lot to offer in such situations. First, generic
recovery will allow return of systems to operational status in the
shortest possible time. In most cases, programs can be completely
restored using the integrity database without waiting for the
disassembly and analysis of the virus. In the case of destructive,
overwriting viruses, the integrity database can be used to identify
which files need to be removed and replaced.

The correlation scanner, a breakthrough in generic AV technology, can be
used to spot infected files that were not found by the integrity
analyzer, as well as the source of the infection. If the attacking virus
is of an unknown type, no signature scanner will find it. The file that
brought the infection into the system will not have a pre-infected,
"clean" record in the database since it was already infected when the
database for it was created. The correlation scanner will find the
original infector, and other infected files, by similarity to an
infected sample identified by an integrity analyzer, captured through a
baiting process, or designated by the user. Correlation scanners have
recently been enhanced so that a library of signatures can now be used
with them, as well.

The generic correlator identifies similarities in the processing that a
file undergoes during infection (a virus infection is a 'process'), in
the cryptographic model used in the virus code, and the matching of a
signature string. The correlation scanner has proven itself effective
with plain, encrypted, and even polymorphic viruses.
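
The paper does not describe the correlation algorithm, so the Python
sketch below should not be read as the correlator's actual method. It
illustrates only the simplest of the three ideas, matching a string
taken from an infected sample against other files, and would work only
for plain (unencrypted) code. All names and the filler bytes are
hypothetical.

```python
from typing import Dict, List

def take_signature(sample: bytes, entry: int, n: int = 12) -> bytes:
    """Take n bytes at the entry point of a known infected sample."""
    return sample[entry:entry + n]

def correlate(signature: bytes, files: Dict[str, bytes]) -> List[str]:
    """Return the names of files that contain the sample's string."""
    if not signature:
        return []
    return sorted(name for name, image in files.items()
                  if signature in image)

stub = bytes([0xAA, 0x55] * 6)            # filler, not real virus code
sample = b"\x00" * 20 + stub + b"\x11" * 8
files = {
    "A.COM": b"\x01" * 30 + stub,
    "B.COM": b"\x02" * 50,
    "C.EXE": b"MZ" + b"\x03" * 10 + stub + b"\x04" * 5,
}

suspects = correlate(take_signature(sample, entry=20), files)
print(suspects)
```

The string comes from a sample captured on the affected system, not
from a vendor's library, so no update is needed before searching.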

The correlator may replace the scanner in many cases, and can be used to
disinfect a computer, without needing updates, and with no delays. It is
a tool that empowers users to restore their systems independently of
antivirus security experts who may not always have the resources to
respond quickly to user needs and requests.

INTEGRATED ANTIVIRUS PROTECTION.

Having reviewed the basics of generic antivirus technology, let's now
discuss overall antivirus strategy. An effective antivirus defense
should consist of several elements. Because of its broad effective
spectrum, generic protection can and should have a pivotal role in any
integrated AV defense strategy. It can capture a virus that passes
through a known virus screening process and act as a buffer BEFORE
loading an activity blocker or TSR scanner, if one is used.

On-demand and TSR scanners can detect only viruses that are contained in
their signature database. Often, this does not include all known
viruses. Therefore, virus scanning should always be PRECEDED by running
a generic probe and integrity analyzer, especially before scanning a
file server. There is always the risk that a new fast infector, or even
a known one not included or accurately detected, is active in memory.
A fast infector infects programs as they are opened, not only when they
are executed, so signature scanning without first performing a generic
integrity analysis can result in the infection of every executable the
scanner inspects.

The backbone of an integrated antivirus strategy is the integrity
analyzer. First, you need to run it regularly in order to keep its
database up to date. Secondly, it's arguably the most effective means to
detect an infection in its earliest stage. Thirdly, it will be the first
to capture the presence of a fast infector BEFORE it can infect any
significant number of programs, even if the virus is new. And lastly, it
provides a very effective way to quickly and fully recover a system to
its original, pre-infected condition, in the event of an infection.

Virus scanners are a convenience for scanning new software before
installation and for scanning floppies. They do offer some potential
for detecting known viruses prior to executing their hosts. However, no
scanner detects all known viruses, and new viruses are produced daily.
It is also very easy to mask existing viruses from detection so
that even a comprehensive scanner will fail to detect them. These are
significant practical limitations that preclude use of signature
scanners as a primary method of AV security and defense.

The core component of a comprehensive AV defense can be modeled upon the
military equivalent of a "fail safe." A "fail safe" is a counter
defensive process that will operate despite unknown and variable
conditions and methods of attack.

Generic methodologies best implement this concept as they function
without regard to the specific details prevailing at the time of the
viral attack, and use multiple, overlapping and partially redundant
mechanisms for virus capture and, if needed, system recovery. Deliberate
redundancy means that one or more of the generic methods will operate
effectively despite variability in virus hosts or methods of viral
action. In contrast, signature based antivirus products can only handle
known threats, those contained in their databases.

Therefore, a more defensible AV strategy combines generic capture and
restoration methods with known virus scanners. A cost effective benefit
of this approach is that scanners will not need to be updated as
frequently since the main purpose of updates is to detect new viruses.
The generic methods can act as both a buffer and safety net to protect
systems until preventative methods using scanners are possible. Since it
can take anywhere from days to months for signature detection and
cleaning algorithms to be made available, generic methods provide a
highly significant "fail safe" to overall AV defense strategies. Even
more importantly, advanced generic methods such as those presented here
will provide both virus capture AND system restoration immediately.

The use of antivirus TSR's and activity blockers is debatable. They
always adversely and, at times, significantly impact system performance
and available resources. They can also interfere with normal program
operation, and conflict with other applications. Finally, they can
become vectors that spread infection across all the files they scan
when an undetected virus, whether known or new, is present. Yet there
are users who won't give them up. Sometimes this decision is based on
the mistaken assumption that TSR's provide protection beyond what is
available from scanners. In fact, scanners are invariably more
comprehensive and reliable than the TSR from the same vendor.

From a practical standpoint, when generic, TSR scanner, and activity
blocking methods are used together, the latter two should be run AFTER
completing
the generic tests and integrity analyses. Generics and AV TSR's do not
coexist well. The latter intercept generic probes as if they were
viruses. This is especially true of activity blockers.

After gaining some experience with the various elements of an integrated
antivirus strategy, users will be better able to decide how important
signature scanning, TSR's, and activity blockers are to a cost
effective, and efficient AV defense. Past experience shows that users
tend with time to rely on generic AV since it proves dependable, and is
the least intrusive and obstructive to them. Users don't disable generic
AV methods as they often do with TSR's and scanners. This is true even
after long periods without viral incidents because the generic methods
do not interfere with normal system operation, consume system resources,
or negatively impact system speed and performance.

When a virus does strike, the generic AV will be in place to
capture it, stop it, and restore the system to operational status
quickly.


Acknowledgements to Robert C. Casas, Ph.D., CPC Ltd, Glenview, IL, USA,
for assistance in revising and editing the original paper which was
published elsewhere.


Zvi Netiv is the author of InVircible (IV), the first all-generic
antivirus. He manages NetZ Computing, Israel, which continues
development and production of IV.
