                                The Art
                              of Lossless
                           Data Compression
                                vol. 22t

Here are the results of tests performed in May 2001 to compare
lossless compression of English texts by all known programs good enough
for this purpose, including RK, DC, YBS, Bzip2, IMP, RAR and 7-zip.

See Archive Comparison Test by J.Gilchrist for more details: http://act.by.net

If anybody wants to start or continue such tests,
or can suggest some other sets of texts, or other compression programs,
 (not sources or algorithm descriptions, executable programs only)
or knows we have missed something important,
 (some new fantastic technology, an algorithm or even a program capable
 of lossless compression of up to 1000:1 etc.)
please let us know immediately: artest@inbox.ru   Thank you!


[[1]] COMPRESSION QUALITY
=========================
             (see also
             [[2]] Speed
             [[3]] Details
             [[4]] Comments)

The fifth line shows results for the sum of four Canterbury Corpus Large Set
files; the eleventh line shows the sum of all 1231 files in six sets.


Original PPMonstr PPMD    RK      DC      BOA    SBC     BEE     YBS     UHArc
585.61%  100%   105.23  100.70  105.54  107.16  105.98  109.66  106.08  105.36
414.65% 101.41  105.75  102.94  102.19  101.35  103.73  105.87  102.81   100%
591.98%  100%   104.97  101.61  104.18  108.09  104.72  107.91  104.47  103.34
675.45%  100%   108.79  102.80  114.03  115.60  112.90  115.35  112.80  116.17
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
529.35%  100%   104.77  101.06  103.95  105.26  104.63  107.42  104.23  103.13

492.94%  100%   104.25  101.90  103.73  106.62  105.57  107.26  106.54  105.45
398.92%  100%   102.76  101.46  101.83  103.66  103.64  105.35  104.36  104.05
436.99%  100%   102.63  101.61  102.66  104.30  104.13  105.02  105.19  105.50
733.54%  100%   101.39  101.42  111.46  112.25  108.82  113.68  110.65  113.37
341.03%  100%   102.39  106.22  107.84  103.75  106.50  104.23  105.70  110.54
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
427.88%  100%   102.48  102.51  104.14  104.43  104.77  105.30  105.33  106.68


  BA     ZZip     ACB    777     SZip    ERI    BZip2     ACE    RAR     7-zip
110.28  110.48  108.75  115.54  111.99  113.05  122.35  139.58  139.64  160.82
104.68  104.01  103.67  101.29  104.65  107.02  111.82  113.43  113.35  111.96
108.65  108.78  108.23  113.96  113.02  111.34  122.46  142.25  143.32  163.83
113.47  113.41  114.26  130.89  118.43  115.63  133.69  143.59  145.24  190.08
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
107.10  106.92  106.35  110.59  109.26  109.58  118.69  129.78  130.25  145.56

108.52  109.13  109.26  114.41  112.96  112.49  118.83  137.33  137.55  155.12
106.41  107.07  107.85  108.93  110.25  110.17  114.01  131.81  135.76  143.27
107.35  108.10  108.40  110.07  110.61  111.58  116.94  135.22  136.58  148.86
119.88  114.63  117.48  117.65  118.78  119.80  137.36  150.03  155.95  180.86
108.68  109.75  109.43  111.02  111.11  111.93  112.85  130.72  130.22  138.02
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
108.05  108.52  108.85  110.52  110.98  111.69  116.54  134.15  135.61  147.05


[[2]] Speed
===========
Canterbury Corpus Large Set http://corpus.canterbury.ac.nz/ftp/large.zip
was used for this test, on an AMD-K6-400 machine with 192 MB RAM and Windows 98.

 Programs,options        Overall      Average      Compress Extract  Compressed
                          score,       Users'        time,   time,     size,
                                       score,       seconds seconds    bytes
                      seconds  %    seconds  %
NO COMPRESSION         4446   538%   4446   562%        0       0    16005619
7z a -tufa1            1324   160%   1057   133%      296       8     3672086
7z a -tufa1 -mx        1322   160%   1057   133%      294       8     3672086
7z a -tzip             1283   155%   1231   155%       58       5     4393637
7z a -tzip -mx         1325   160%   1237   156%       97       6     4401174
7zip a                 1278   154%   1229   155%       55       3     4393637
7zip a -mx             1325   160%   1237   156%       98       5     4401174
777 a -mg              1372   166%   1159   146%      237     151     3544038
acb B                  3236   392%   2202   278%     1148    1156     3352388
acb b                  3934   476%   2585   327%     1499    1527     3272388
acb u                  5115   619%   3243   410%     2080    2139     3225662
ace32 a                1212   146%   1124   142%       98       5     3992645
ace32 a -d4096         1168   141%   1072   135%      106       6     3801917
ace32 a -d4096 -s-     1208   146%   1116   141%      103       5     3962381
ace32 a -d4096 -m1     1160   140%   1112   140%       53       6     3965841
ace32 a -d4096 -m5     1353   164%   1076   136%      309       5     3746553
arh a                  1133   137%   1078   136%       61      59     3647067
arh a -2 -mm           1132   137%   1078   136%       60      59     3647067
arh a -1 -mm           1438   174%   1302   164%      152       8     4605607
arh a -2 -1            1283   155%   1093   138%      212      59     3647067
ba -k                  1024   124%    977   123%       52      23     3421195
ba -k -1               1148   139%   1113   140%       39      22     3914655
ba -k -50              1006   121%    945   119%       68      22     3298943
bee a -m1 -d3          1407   170%   1217   153%      211     198     3593467
bee a -m2 -d3          1460   177%   1229   155%      256     238     3479698
bee a -m3 -d3          1700   206%   1347   170%      392     355     3432029
bix a -mdg -s          1141   138%    995   125%      163       3     3514944
boa -m15               1344   162%   1142   144%      225     236     3182739
boa -m15 -s            1321   160%   1122   141%      221     230     3132810
boa -m7                1316   159%   1130   142%      207     216     3217354
boa -m1                1400   169%   1260   159%      155     166     3886863
bzip2 -k               1060   128%   1021   129%       43      13     3616113
bzip2 -k -1            1185   143%   1155   146%       33      12     4106479
bzip2 -k -5            1077   130%   1044   132%       37      13     3700097
bzip2 -k -9            1057   128%   1021   129%       40      13     3616113
dc e                    927   112%    902   114%       27      17     3179173
dc e -ft                933   113%    906   114%       29      17     3192832
dc e -b16300            826   100%    791   100%       39      17     2773427
dc e -b16300 -mt5       825   100%    790   100%       38      17     2773427
eri a -m1              1057   128%    970   122%       96      22     3378440
eri a -m2              1054   127%    962   121%      102      23     3346586
eri a -m3              1060   128%    958   121%      113      25     3318853
eri a                  1070   129%    958   121%      124      26     3313568
imp98 a -mm -m3        1218   147%   1140   144%       87       4     4059874
imp98 a -mm -2         1024   124%    995   125%       33      10     3533763
imp98 a -2 -s4         1025   124%    995   125%       33      10     3533695
imp a -2 -s4           1021   123%    992   125%       32       9     3530158
pkzip -es              1657   200%   1653   209%        4       2     5945622
pkzip -a               1317   159%   1305   165%       14       1     4691491
pkzip -exx             1390   168%   1291   163%      110       1     4605942
ppmd e -o3 -m184       1093   132%   1083   137%       11      13     3849571
ppmd e -o4 -m184        985   119%    973   123%       13      15     3447452
ppmd e -o5 -m184        938   113%    925   116%       15      17     3263988
ppmd e -o6 -m184        912   110%    897   113%       17      19     3155348
ppmd e -o7 -m184        898   108%    880   111%       20      22     3084162
ppmd e -o8 -m184        890   107%    869   110%       23      25     3032824
ppmd e -o9 -m184        891   108%    867   109%       27      29     3007612
ppmd e -o10 -m184       901   109%    865   109%       40      41     2953155
ppmd e -o11 -m184       915   110%    864   109%       56      56     2891692
ppmd e -o12 -m184      1029   124%    937   118%      102     112     2935640
ppmonstr e -o3 -m184   1136   137%   1107   140%       32      35     3850354
ppmonstr e -o4 -m184   1031   125%    998   126%       37      40     3437676
ppmonstr e -o5 -m184    986   119%    949   120%       42      44     3243866
ppmonstr e -o6 -m184    966   117%    924   116%       47      49     3132547
ppmonstr e -o7 -m184    952   115%    905   114%       52      55     3040773
ppmonstr e -o8 -m184    942   114%    888   112%       60      63     2949530
ppmonstr e -o9 -m184    942   114%    880   111%       68      72     2888367
ppmonstr e -o10 -m184   959   116%    882   111%       85      87     2834425
ppmonstr e -o11 -m184   987   119%    892   112%      106     108     2785525
ppmonstr e -o12 -m184  1100   133%    959   121%      157     158     2831191
rar a                  1191   144%   1130   143%       67       5     4029084
rar a -mm -m1          1233   149%   1204   152%       33       5     4304860
rar a -mm -m5          1438   174%   1132   143%      340       5     3938355
rar a -mm -s           1193   144%   1129   142%       71       5     4023405
rk -mf1                1147   139%   1127   142%       23      20     3978408
rk -mf2                1186   143%   1095   138%      101      48     3735704
rk -mf3                1280   155%   1096   138%      204      50     3693704
rk -mx1                1385   167%   1163   147%      248     279     3093640
rk -mx2                1454   176%   1200   151%      282     315     3086308
rk -mx3                1461   177%   1204   152%      285     319     3087044
sbc c -on -b59          947   114%    897   113%       56      18     3146705
sbc c -oa -b59          966   117%    912   115%       59      19     3195883
sbc c -of -b59          959   116%    907   114%       58      19     3176987
sbc c -os -b59          871   105%    813   102%       65      20     2832457
szip -o6               1021   123%    994   125%       29      27     3475264
szip -o8               1020   123%    985   124%       39      29     3430586
szip -o8 -b41          1001   121%    964   121%       41      30     3348344
ufa a                  1387   168%   1137   143%      277      28     3895425
ufa a -mg -mu32        1335   161%   1159   146%      196     211     3344003
uharc a -m1 -md8192    1262   153%   1184   149%       87      25     4141149
uharc a -m2 -md8192    1275   154%   1139   144%      152      25     3955624
uharc a -m3 -md8192    1434   173%   1108   140%      363      25     3768111
uharc a -mz -md8192    1134   137%   1111   140%       26      30     3884071
uharc a -mx -md8192    1019   123%    932   117%       97      83     3023083
ybs -m16mu              914   110%    820   103%      104      17     2857446
ybs -m16mu -r           933   113%    827   104%      118      16     2878433
ybs -m8m                935   113%    888   112%       51      16     3123345
zzip a                 1017   123%    972   122%       51      23     3400243
zzip a -mm -mx         1015   123%    969   122%       52      24     3383215
zzip a -mm -a          1017   123%    970   122%       52      25     3383215

The Overall score is calculated by adding compression time, extraction time,
and the time it would take to transfer the compressed file over a 28,800 bps
link: (compressed_size)/3600 seconds, because 28800 bits_per_second is
3600 bytes_per_second.
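
The Overall score formula above can be sketched in a few lines of Python.
This is only an illustration: the function and constant names are made up
here, and results may differ from the table by about a second in some rows,
presumably because the table shows times as whole seconds.

```python
# Illustrative names; only the formula itself comes from the text above.
LINK_BYTES_PER_SECOND = 28800 // 8  # 28,800 bps = 3600 bytes per second


def overall_score(compress_s, extract_s, compressed_bytes):
    """Seconds to compress, transfer over the link, and extract."""
    transfer_s = compressed_bytes / LINK_BYTES_PER_SECOND
    return compress_s + extract_s + transfer_s


# Row "dc e -b16300": 39 s compress, 17 s extract, 2773427 bytes.
print(round(overall_score(39, 17, 2773427)))   # table lists 826
# Row "NO COMPRESSION": only the transfer time counts.
print(round(overall_score(0, 0, 16005619)))    # table lists 4446
```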

The Average Users' score is calculated by adding (compress_time/10) +
extract_time + the time it would take to transfer the compressed file over
a 28,800 bps link.
Compression time is divided by 10 here because more than 90% of people never
compress anything themselves (with compression programs), yet they
use compressed data almost _every_ time they use computers and/or the Internet.
That is why compression time matters less to them.
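
A minimal Python sketch of the Average Users' score as described above.
The function name is invented for illustration, and the result may be off
by about a second from the table in some rows, presumably because the table
shows times as whole seconds.

```python
def users_score(compress_s, extract_s, compressed_bytes):
    """Average Users' score: compression time counts only one tenth."""
    transfer_s = compressed_bytes / 3600  # 28,800 bps = 3600 bytes/second
    return compress_s / 10 + extract_s + transfer_s


# Row "dc e -b16300": 39 s compress, 17 s extract, 2773427 bytes.
print(round(users_score(39, 17, 2773427)))   # table lists 791
# Row "acb u": slow compression is discounted, slow extraction is not.
print(round(users_score(2080, 2139, 3225662)))   # table lists 3243
```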


[[3]] Details
=============
Detailed results are no longer included in this main text
(1490 lines reporting 65614 results on 1231 files in 6 sets),
but can be found in the FULL version, with TEXTS.DAT and *.BAT,
at http://geocities.com/SiliconValley/Bay/1995/artest22.zip
or http://artest1.tripod.com/artest22.zip


[[4]] Comments
==============
Links to download programs:
~~~~~~~~~~~~~~~~~~~~~~~~~~~
7-Zip 2.24    :W http://www.7-zip.com/dl/7zip224.exe                              463K
ACE32 2.02    :W ftp://ftp.forlangs.net/pub/windows/winace/ace202.exe             587K
ERI32 4.16fre :e http://geocities.com/eri32/eri416fr.zip                           94K
PkzipC  4.00  :W ftp://ftp.pkware.com/pkzc400s.exe                               3470K
RK-dos 1.04.1 :e http://rksoft.virtualave.net/downloads/rk104a1d.exe              461K
RK     1.04.1 :W http://rksoft.virtualave.net/downloads/rk104a1w.exe              380K
RAR32  2.80   :e ftp://ftp.netlab.sk/public/rarsoft/rar/rarx280.exe               269K
WinRAR 2.80   :W ftp://ftp.netlab.sk/public/rarsoft/rar/wrar280.exe               621K
BA 1.01b5     :e http://hem.spray.se/mikael.lundqvist/ba101br5.zip                 61K
SBC 0.860b    :e http://geocities.com/sbcarchiver/sbc0860b.zip                    208K
ZZip 0.36c    :W http://www.via.ecp.fr/~damien/downloads/zzip-win32.zip            35K
PPMD var.H,
PPmonstr v.H  :W ftp://ftp.cdrom.com/.2/sac/pack/ppmdh.rar                         57K

BIX 1.00b7    :W http://www.7-zip.com/dl/ufa/bix100b7.zip                          89K
777 0.04b1    :W http://www.7-zip.com/dl/ufa/777004b1.zip                          72K
UFA 0.04b1    :W http://www.7-zip.com/dl/ufa/ufa004b1.zip                          64K
ArHanGeL 1.40 :a http://geocities.com/SiliconValley/Lab/6606/arh140.zip            50K
Imp     1.1   :e http://www.winimp.com/imp110d.zip                                266K
Imp-win 1.12  :W http://www.winimp.com/imp112.exe                                 122K
PkZip   2.50  :a ftp://ftp.simtel.net/pub/simtelnet/msdos/arcers/pk250dos.exe     202K
ACB 2.00c     :e ftp://ftp.simtel.net/pub/simtelnet/msdos/compress/acb_200c.zip    42K
BOA 0.58b     :e ftp://ftp.cdrom.com/.2/sac/pack/boa058.zip                        74K
DC 0.98b      :W ftp://ftp.cdrom.com/.2/sac/pack/dc124.zip                         55K
Bzip2 1.0.1   :W ftp://sourceware.cygnus.com/pub/bzip2/v100/bzip2-100-x86-win32.exe 68K
SZip 1.12a    :W http://www.compressconsult.com/szip/szip_112a_win32.zip           71K
UHArc 0.2b    :e ftp://ftp.cdrom.com/.2/sac/pack/uharc02.zip                      101K
YBS 0.03e     :e http://members.nbci.com/vycct/ybs003ed.zip                        55K
YBS 0.03e     :W http://members.nbci.com/vycct/ybs003ew.zip                        43K
BEE 0.4.8     :W mailto:Andrew.Filinsky@p11.f4.n452.z2.fidonet.org

:a - any DOS  - DOS programs, will run under pure DOS or in a DOS box
:e - extender - DOS programs using DOS extenders like DOS/4GW or CWSDPMI
:W - windows  - Windows95/98/NT/etc programs

If a direct link doesn't work, most probably a newer version of the program
has appeared at the same site: visit the web page, or list the whole directory
on the FTP server (i.e. try the same URL, but without the filename).


Homepages:
~~~~~~~~~~
Arhangel     : http://geocities.com/SiliconValley/Lab/6606
BA           : http://hem.spray.se/mikael.lundqvist
Eri32        : http://geocities.com/eri32
      mirror : http://artest1.tripod.com
RK           : http://rksoft.virtualave.net
Imp,WinImp   : http://www.technelysium.com.au
      mirror : http://www.winimp.com
ACE,WinACE   : http://www.winace.com
PkZip        : http://www.pkware.com
RAR,WinRAR   : http://www.rarsoft.com
BZip2        : http://sources.redhat.com/bzip2
SZip         : http://www.compressconsult.com/szip
ZZip         : http://www.zzip.f2s.com
YBS          : http://members.nbci.com/vycct
SBC          : http://geocities.com/sbcarchiver
Ufa,777,
    BIX,7-Zip: http://www.7-zip.com

PPMD, PPMonstr, ACB, Bee, BOA, DC, UHArc - no homepage.


What's new:
~~~~~~~~~~~
12 new programs tested: RK, SBC, ZZip, ACE, 7-zip, RAR32, WinRAR,
ERI32, BA, PPMD, PPMonstr, UHARC.

Test data was updated, and a set of Russian texts was added.

The latest beta versions of BEE, DC, UFA, UHArc are available
from the authors by e-mail request:
BEE: Andrew.Filinsky@p11.f4.n452.z2.fidonet.org
DC: EdgarBinder@t-online.de
UFA: support@7-zip.com
UHARC: Uwe.Herklotz@gmx.de

Results of ArHanGeL, IMP, BICOM, BIX and Pkzip
are in the full version only (TEXTS.DAT file).


WARNINGS:
~~~~~~~~~
BA 1.00beta5 can't correctly decompress shaks12.txt.

DC 0.99.158b failed to decompress 1DFRE10.dc, ANDES10.dc, and BTI0110.dc,
saying "Corrupted block" (while the t(est) command reports "Test successful").

ERI32 4.8fre can't compress files larger than (free DPMI memory)/6, i.e.
about 10 MB on a PC with 64 MB RAM. The largest file (44 MB) was split into
5 chunks of 9000000 bytes each (the last chunk was 8894190 bytes).
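
The chunk sizes reported above are consistent with a plain fixed-size split.
A minimal Python sketch, assuming the file size is 4*9000000 + 8894190 bytes
(derived from the figures above, not stated explicitly); the helper name is
hypothetical and is not part of ERI32 or of the test scripts.

```python
def chunk_sizes(total_bytes, chunk_bytes):
    """Sizes of fixed-size chunks; the last chunk holds the remainder."""
    full, rest = divmod(total_bytes, chunk_bytes)
    return [chunk_bytes] * full + ([rest] if rest else [])


total = 4 * 9000000 + 8894190      # assumed size of the largest file
sizes = chunk_sizes(total, 9000000)
print(len(sizes), sizes[-1])       # 5 8894190
```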

No problems were found in any of the other compressors.


The LATEST RELEASE, and all previous versions of these tests can be found
at http://geocities.com/SiliconValley/Bay/1995/ and http://artest1.tripod.com/


Send your suggestions and comments to artest@inbox.ru
With best kind regards,
A.Ratushnyak,
RAO Inc.
