                                The Art
                              of Lossless
                           Data Compression
                                vol. 26t

Here are the results of tests performed in December 2003 to compare
lossless compression of "plain" texts by all known, sufficiently good programs
developed for this purpose, including UHArc, PPMd, Bzip2, RAR, ACE and 7-zip.

See the Archive Comparison Test by J. Gilchrist for more details:
http://compression.ca

If anybody wants to start or continue such tests,
or can suggest other sets of texts or other compression programs
 (executable programs only, not sources or algorithm descriptions),
or knows that we have missed something important
 (some fantastic new technology, an algorithm, or even a program capable
 of lossless compression of up to 1000:1, etc.),
please let us know immediately: artest@inbox.ru   Thank you!


[[1]] COMPRESSION QUALITY
=========================
             (see also
             [[2]] Speed
             [[3]] Details
             [[4]] Comments)

The seventh (last) line of each table shows the results for the sum of all
1231 texts in the six sets.

Origin DURILCA Entropy   Slim    RKC     EPM  Compressia PAQ6  PPMonstr  PPMN 
555.90% 100.04  100.47  100.13   100%   101.76  100.92  101.06  102.24  106.01
567.66%  100%   104.53  108.58  104.13  110.79  108.38  110.82  113.42  115.01
455.43%  100%   104.56  108.09  106.31  109.96  108.47  110.10  112.89  111.94
513.18%  100%   104.14  109.85  107.26  112.58  110.55  113.25  115.19  114.70
799.24% 100.41  101.80  108.25  124.62  102.27  106.97  102.55   100%   115.93
432.59%  100%   122.52  122.68  127.29  123.95  130.67  124.45  125.58  123.82
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
506.08%  100%   108.04  111.99  112.20  113.73  114.12  114.19  115.87  116.11

 ASH     BEE      PPMd    RAR    UHArc    DC     SBC    BZip2   7-zip   pkzip 
101.08  107.67  108.14  109.58  105.80  109.17  109.49  124.64  152.86  159.97
113.49  117.44  118.78  120.19  117.81  119.46  120.79  136.85  178.63  186.09
112.30  116.43  117.05  117.39  115.89  116.26  117.70  130.16  163.57  170.65
114.86  119.36  120.27  120.44  120.33  120.57  121.66  137.33  174.83  181.91
110.63  111.40  109.06  110.11  117.68  121.44  118.06  149.67  197.06  205.34
126.87  128.81  129.39  130.16  138.55  136.80  135.14  143.16  175.08  181.85
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~..~~~~~~~~~~~~~~~~~~~~~~
116.43  120.13  120.75  121.21  123.03  123.18  123.45  137.84  173.93  181.03

Results of many other programs are in full version only, TEXTS.DAT file.
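
The source does not spell out how the percentages above are normalized. The
following minimal Python sketch assumes that each row is scaled so that the
smallest result for that set reads 100%, and it simply looks up which program
holds that 100% in each row of the first table. The row labels ("set 1" ...
"all sets") are hypothetical names introduced here for illustration; the
figures are copied from the table.

```python
# Relative sizes from the first quality table (columns DURILCA .. PPMN).
# ASSUMPTION: 100% marks the smallest result in each row; the source does not
# state this explicitly. Row labels are hypothetical.
PROGRAMS = ["DURILCA", "Entropy", "Slim", "RKC", "EPM",
            "Compressia", "PAQ6", "PPMonstr", "PPMN"]

ROWS = {
    "set 1":    [100.04, 100.47, 100.13, 100.00, 101.76, 100.92, 101.06, 102.24, 106.01],
    "set 2":    [100.00, 104.53, 108.58, 104.13, 110.79, 108.38, 110.82, 113.42, 115.01],
    "set 3":    [100.00, 104.56, 108.09, 106.31, 109.96, 108.47, 110.10, 112.89, 111.94],
    "set 4":    [100.00, 104.14, 109.85, 107.26, 112.58, 110.55, 113.25, 115.19, 114.70],
    "set 5":    [100.41, 101.80, 108.25, 124.62, 102.27, 106.97, 102.55, 100.00, 115.93],
    "set 6":    [100.00, 122.52, 122.68, 127.29, 123.95, 130.67, 124.45, 125.58, 123.82],
    "all sets": [100.00, 108.04, 111.99, 112.20, 113.73, 114.12, 114.19, 115.87, 116.11],
}


def best_program(row):
    """Return the name of the program with the lowest relative size in a row."""
    index = min(range(len(row)), key=row.__getitem__)
    return PROGRAMS[index]


winners = {label: best_program(row) for label, row in ROWS.items()}

for label, name in winners.items():
    print(f"{label}: {name}")
```

Only the first table is scanned because every row already has its 100% entry
there; all figures in the second table are above 100.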


[[2]] Speed
===========
The Canterbury Corpus Large Set, http://corpus.canterbury.ac.nz/resources/large.zip,
was used for this test, on a 970 MHz PC with 256 MB of RAM running Windows 98.

Programs,             Compression/    Overall    Average Users'   Compressed
options               Extraction,      Score         Score           Size
                        seconds     seconds, %    seconds, %       bytes ,  %
no compression            0     0     4446  559     4446  577    16005619  600
7za a -t7z               88     1     1104  138     1024  133     3650998  138
7za a -t7z -mx          154     1      982  123      843  109     2975903  113
7za a -tzip              23     0     1244  156     1223  158     4393623  167
7za a -tzip -mx          44     0     1268  159     1227  159     4401160  167
ash04a /o6 /m230        110   114     1100  138     1001  130     3154310  119
ash04a /o9 /m230        164   170     1130  142      982  127     2863384  108
ash04a /o16 /m230       221   210     1524  191     1324  171     3932449  149
ash04a /o6 /m230 /s16   117   121     1113  140     1007  130     3146157  119
ash04a /o9 /m230 /s16   172   177     1142  143      987  128     2855460  108
ash04a /o16 /m230 /s16  235   224     1541  193     1329  172     3895407  148
bee a -m1                70    71     1049  132      986  128     3268345  124
bee a -m2               144   148     1167  146     1037  134     3147914  119
bee a -m3               201   204     1267  159     1085  140     3099133  117
durilca e -o8 -t2(31)    77    77      964  121      895  116     2915855  110
durilca e -o9 -t2(31)    80    80      969  121      897  116     2911029  110
durilca e -o10 -t2(31)   82    83      970  122      896  116     2899602  110
durilca e -o12 -t2(31)   83    85      970  122      895  116     2886375  109
durilca e -o16 -t2(31)   85    86      972  122      895  116     2880306  109
durilca e -o32 -t2(31)   87    88      975  122      896  116     2878335  109
durilca e -o64 -t2(31)   88    89      977  122      898  116     2878490  109
durilca e -o128 -t2(31)  88    90      978  123      898  116     2878614  109
epm9 c008               117   116     1034  130      929  120     2882720  109
epm9 c012               142   142     1119  140      991  128     3007492  114
epm9 c016               147   147     1138  143     1006  130     3040288  115
grzipii e                16    10      908  114      893  115     3170895  120
paq6v2a                 388   387     1597  200     1247  161     2957673  112
paq6v2a -6             2551  2710     6002  754     3706  481     2664718  101
rar a -m1                13     1     1239  155     1227  159     4408290  167
rar a -m2                21     1     1169  147     1150  149     4131324  157
rar a -m3                36     1     1156  145     1123  145     4026937  153
rar a -m4                19    11      914  114      896  116     3178761  120
rar a -m5                25    16      920  115      898  116     3164814  120
rar a -m5 -s             25    16      923  116      900  116     3173148  120
rar a -mc16t -s          35    26      992  124      960  124     3347746  127
rar a -mc16t+ -s         35    26      992  124      960  124     3347746  127
rar a -mc16:128t -s      40    31      955  120      919  119     3180234  120
rar a -mc16:128t+ -s     40    31      955  120      919  119     3180234  120
rar32 a -mc16t -s        37    28      996  125      962  124     3347746  127
rkc -mf -M230M -o8       94     6     1083  136      998  129     3537812  134
rkc -mx -M230M -o8      120   130     1019  128      911  118     2766843  105
rkc -mxx -M230M -o8     276   276     1320  166     1071  139     2763319  105
rkc -mxx -M230M -o12    297   297     1334  167     1067  138     2663521  101
rkc -mxx -M230M -o16    318   312     1361  171     1074  139     2630041  100
rkc -mxx -M230M -ft     316   316     1367  172     1082  140     2645990  100
rkc -mf -M230M -td+      94     6     1083  136      998  129     3537812  134
rkc -mx -M230M -td+     144   156     1032  129      902  117     2633402  100
rkc -mxx -M230M -td+    305   311     1346  169     1072  139     2630041  100
slim a -d32 -w21        581   650     2034  255     1511  196     2890141  109
slim a -d16 -w21        577   648     2029  255     1509  195     2890280  109
slim a -d8 -w21         566   626     1995  250     1486  192     2891166  109
slim a -d4 -w21         521   583     1908  239     1439  186     2892897  109
uhbc e                   40    31      951  119      914  118     3164344  120
//previous
ace32 a -d4096           66     2     1124  141     1058  137     3801917  142
ace32 a -d4096 -m1       31     2     1134  143     1104  143     3965841  149
ace32 a -d4096 -m5      206     2     1249  157     1045  136     3746553  140
arh a                    38    40     1091  137     1053  137     3647067  137
arh a -2 -1              68    40     1121  141     1054  137     3647067  137
ba -k -50                35    12      964  121      929  121     3298943  124
bix a -mdg -s            92     1     1069  134      978  127     3514944  132
boa -m1                  86    88     1253  158     1168  152     3886863  146
boa -m15                139   141     1165  146     1027  133     3182739  119
boa -m15 -s             138   140     1148  144     1011  131     3132810  117
bzip2 -k                 21     6     1032  130     1011  131     3616113  136
bzip2 -k -9              20     6     1031  130     1011  131     3616113  136
Entropy   t o12          94    95     1003  126      910  118     2932445  110
Entropy   t o16          98    99     1001  126      904  117     2892711  108
Entropy   t o32         105   106     1009  127      905  118     2873677  108
Entropy   t o64         112   111     1022  128      911  118     2873318  108
compcl c -b15            37    20      904  114      868  113     3049569  114
compcl c -b15 -s         38    29      808  102      770  100     2668128  100
dc e                     13     7      903  114      890  116     3179173  119
dc e -b16300 -mt5        17     7      795  100      778  101     2773427  104
eri a                    39    17      936  118      897  116     3168414  119
eri a -m3                59    21      996  125      937  122     3295385  124
eri a -m6                59    21      989  124      931  121     3272926  123
gcac a                   26    12      980  123      954  124     3390603  127
gcac s                   26    12      981  123      955  124     3395064  127
imp98 a -mm              31     1     1175  148     1143  148     4112387  154
imp98 a -mm -2           13     5      999  126      986  128     3533761  132
imp98 a -2 -s4           13     5      999  126      986  128     3533693  132
pkzip -es                 1     1     1654  208     1652  215     5945622  223
pkzip -a                  4     1     1308  164     1304  169     4691491  176
pkzip -exx               16     1     1296  163     1280  166     4605942  173
ppmdi e -o7 -m232        11    12      904  114      893  116     3169000  119
ppmdi e -o12 -m232       25    26      915  115      891  116     3113630  117
ppmdi e -o16 -m232       27    28      916  115      890  116     3100943  116
ppmn_km e -o6 -MT1       30    30      931  117      901  117     3132278  117
ppmn_km e -o8 -MT1       64    65      993  125      929  121     3107654  116
ppmn_km e -o9            62    63      990  125      929  121     3115560  117
ppmn_km e -o9 -M:50      49    50      949  119      900  117     3058436  115
ppmonstr e -o7 -m232     64    67      974  123      911  118     3035498  114
ppmonstr e -o8 -m232     71    74      980  123      910  118     3007964  113
ppmonstr e -o64 -m232   101   103     1020  128      920  119     2937387  110
qlfc a                   22    11      973  122      952  124     3385084  127
rk -mf2                  50    20     1108  139     1058  137     3735704  140
rk -mx1                 144   143     1147  144     1004  130     3093640  116
rk -mx2                 173   173     1203  151     1032  134     3086312  116
sbc c -b63               29     9      914  115      885  115     3151930  118
sbc c -os -b63           29     9      810  102      782  101     2779632  104
szip -o4                  4    10     1027  129     1023  133     3647445  137
szip -o6                 17    14      996  125      979  127     3475264  130
szip -o8 -b41            27    17      973  122      947  123     3348344  125
zzip a                   21    11      977  123      956  124     3400243  127
zzip a -mx               22    12      973  122      952  124     3383060  127
zzip a -mx -30m          30    12      940  118      910  118     3233147  121
abc13 -c                 20     9      950  119      931  120     3313820  124
abc24 -c                 29    16      923  116      897  116     3159570  118
uharc a -m1 -md32768     63     5     1026  129      969  125     3446069  129
uharc a -m2 -md32768    100     5      980  123      890  115     3151572  118
uharc a -m3 -md32768    110     5      973  122      874  113     3087249  115
uharc a -mz -md32768      8     9     1084  136     1077  139     3842041  144
uharc a -mx -md32768     60    55      936  117      882  114     2953184  110
ybs -m1m                 22     8      952  119      931  120     3316356  124
ybs -m2m                 25     8      937  117      915  118     3255538  122
ybs -m4m                 28     8      919  115      894  116     3178183  119
ybs -m8m                 31     8      905  113      877  113     3116271  116
ybs -m16mu               33     9      835  105      805  104     2852642  106
ybs -m16mu -r            34     9      841  105      811  105     2874130  107
ybs_d -m16mu             34     9      836  105      805  104     2852642  106

The Overall score is calculated by adding the compression time, the extraction
time, and the time it would take to transfer the compressed file over a
28,800 bps link (3600 bytes per second):
(compressed_size)/3600
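
As a rough illustration of the Overall score, here is a minimal Python sketch
of the formula described above. The function and constant names are
hypothetical; only the formula and the sample figures (the "no compression"
row of the table) come from this text.

```python
# Sketch of the Overall score: compress time + extract time + transfer time.
# Names are hypothetical; the formula follows the description in the text.
LINK_BITS_PER_SECOND = 28_800
LINK_BYTES_PER_SECOND = LINK_BITS_PER_SECOND // 8  # 3600 bytes per second


def overall_score(compress_seconds, extract_seconds, compressed_size_bytes):
    """Seconds to compress, transfer over the 28,800 bps link, and extract."""
    transfer_seconds = compressed_size_bytes / LINK_BYTES_PER_SECOND
    return compress_seconds + extract_seconds + transfer_seconds


# "no compression" row: 0 s, 0 s, 16005619 bytes
no_compression = overall_score(0, 0, 16005619)
print(round(no_compression))  # 4446, as in the table
```

For other rows the recomputed value can differ from the published column by a
second or so, presumably because the timings in the table are shown rounded
to whole seconds (an assumption, not something the text states).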

The Average Users' score is calculated as (compress_time/10) + extract_time +
the time it would take to transfer the compressed file over a 28,800 bps link.
Compression time is divided by 10 here because more than 90% of people will
never compress anything in their lives (with compression programs), yet they
use compressed data almost _every_ time they use a computer and/or the Internet.
That is why compression time matters much less to them.
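
The same idea can be sketched in Python for the Average Users' score. Again,
the function name is hypothetical; the weighting (compression time divided by
10) and the sample figures (the "7za a -t7z" row) are taken from this text.

```python
# Sketch of the Average Users' score: compression time counts only one tenth.
# The function name is hypothetical; the formula follows the text.
LINK_BYTES_PER_SECOND = 28_800 // 8  # 3600 bytes per second


def average_users_score(compress_seconds, extract_seconds, compressed_size_bytes):
    """Score weighted towards extraction and transfer time."""
    transfer_seconds = compressed_size_bytes / LINK_BYTES_PER_SECOND
    return compress_seconds / 10 + extract_seconds + transfer_seconds


# "7za a -t7z" row: 88 s to compress, 1 s to extract, 3650998 bytes
score_7z = average_users_score(88, 1, 3650998)
print(round(score_7z))  # 1024, as in the table
```

With this weighting a slow compressor that extracts quickly is penalized far
less than under the Overall score, which is the stated intent.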


[[3]] Details
=============
Detailed results are no longer included in this main text
(thousands of lines reporting about 60,000 results on 1231 files in 6 sets),
but can be found in the FULL version, with TEXTS.DAT and *.BAT,
at http://compression.ru/artest/artest26.zip
or http://artest1.tripod.com/artest26.zip


[[4]] Comments
==============
Download links and homepages for the programs
are now in the links.htm file.

What's new:
~~~~~~~~~~~
12 new programs were tested:

ASH 04a
7-zip 3.13
RAR 3.30b5
UHBC 1.0
EPM 9
Slim 0.021a
BEE 0.7.7
Durilca 0.3a
PAQ 6v2
RKC 1.02
GRZipII 0.2.3
BWIC

The latest beta versions of DC, Entropy and UHArc were available
from their authors by e-mail request:
Entropy: artest@inbox.ru
DC: EdgarBinder@t-online.de
UHArc: Uwe.Herklotz@gmx.de

The set of Russian texts used to be at http://arte.nm.ru, but is now available
only on request from artest@inbox.ru


WARNINGS:
~~~~~~~~~
Beta versions of RKC, EPM and BWIC fail to compress and/or decompress many
files. The authors have been notified.
ASH 04a can fail to decompress some large files if it does not have enough memory.

BA 1.00beta5 cannot correctly decompress shaks12.txt or the set used for speed
measurements.

DC 0.99.158b failed to decompress 1DFRE10.dc, ANDES10.dc, and BTI0110.dc,
reporting "Corrupted block" (while the t(est) command reports "Test successful").

No problems were found in any of the other compressors.

ESP, Rkive and many other programs are no longer tested;
their results and links can be found in previous volumes of ARTest.

The LATEST RELEASE and all previous volumes can be found
at http://compression.ru/artest/


Send your suggestions and comments to artest@inbox.ru
With kind regards,
A.Ratushnyak
