January 1996

---------------------------------------------------------------------------

NTFS disc fragmentation

NTFS has so many advantages over the dreadfully archaic FAT file format
that it is difficult to know where to start listing them. Suffice it to
say that on almost every count, NTFS is better than or equal to FAT. And
so it should be considered the disc format of choice for serious NT
Server and Workstation users.

There is only one real reason for using FAT format: to maintain
compatibility with Dos, thus allowing for dual-booting back to Windows 3.x
or Windows 95.

Amongst all the claimed advantages of NTFS is the suggestion that an
NTFS disc partition can't become fragmented, where a file gets scattered
across the disc in small pieces like confetti. Obviously such
fragmentation is not good for performance, because the drive has to do
multiple seeks in order to read the whole file, rather than just reading
the file straight in one gulp.

The performance of a FAT partition can become considerably degraded by
fragmentation, which is why there are tools to repack, or defragment, a
FAT partition. However, no such tools come with NT, and the NTFS disc
format cannot be defragmented using Dos FAT-format tools.

Microsoft have always maintained that NTFS doesn't need a defragger
tool, that it is effectively self-defragmenting, and that because it is
a sensible operating system, such problems just won't arise.

And, to an extent, they certainly have a point. All of the dreadful "large
floppy disc" heritage of FAT format is gone, replaced by a properly
designed and structured file system that has similarities to some Unix
filesystems.

Many of the NTFS performance features are barely documented, although
most are definitely in the realms of "standard accepted practice" for
sensible disc systems.

So NTFS does things like elevator seeking, whereby head movement is
minimised. Imagine a situation where app one wants some data from track
1, then app two wants some data from track 100, then app three wants
some data from track 20, and finally app four wants data from track 60.
Imagine that all the requests are made almost simultaneously. Serviced
in arrival order, the reads would be track 1, track 100, track 20 and
finally track 60, which is inefficient because the head has to reverse
direction twice. NT resequences the reads so that the data is read in
the order tracks 1, 20, 60 and then 100, in a single sweep. The correct
data is then sent back to the relevant applications. If it was a FAT
partition being read by Windows 3.x, the head reads would be in the
originating order, not the optimised one.
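The reordering described above can be sketched in a few lines. This is
purely illustrative, not NT's actual scheduler: it assumes the head
starts at track 0 and that all four requests are pending at once, so a
single outward sweep serves them with no direction changes.

```python
def elevator_order(requests, head_position=0):
    """Serve pending track requests in one sweep outward from the head.

    Sorting by distance from the head position yields the single-sweep
    order; in the worked example this turns the arrival order
    1, 100, 20, 60 into 1, 20, 60, 100.
    """
    return sorted(requests, key=lambda track: abs(track - head_position))

# The four requests from the text, in arrival order (apps one to four).
pending = [1, 100, 20, 60]
print(elevator_order(pending))   # the resequenced order: 1, 20, 60, 100
```

A real elevator scheduler also has to cope with requests arriving while
a sweep is in progress, which this toy version ignores.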

Another example where NTFS scores over FAT is that the directory
appears to be stored in the middle of the disc, so it's at the optimum
position for reading. Compare this with FAT, where the directory sits
on the first tracks.

When it comes to storing data, NT does some clever tricks with NTFS.
For example, if the amount of data is small (less than around 2k or so),
the data is actually stored in the directory entry itself: no separate
disc file is created for the data.

And when writing a file, NT tries to find a slot big enough for it; it
doesn't just start writing at the beginning of the disc, hopping over
existing data as necessary.

Finally, NT seems to leave space before and after a file, to allow it a
bit of room to grow without becoming fragmented. The net result,
however, is that an NTFS disc can start to fragment well before the
disc is 100% full. Compare this with a cleanly formatted FAT partition
of 1Gb in size: copy 900Mb of data onto the drive and the data is
written sequentially, starting from the beginning of the disc, so there
should be no fragmentation. Because NTFS appears to leave guard space
before and after files, the disc can be "full" (from the perspective of
fragmentation) well before a full disc's worth of data has been
written.

Details on this from Microsoft are rather sketchy. However, I have heard
figures of around 70% capacity being possible before fragmentation occurs.
In other words, if you take a 1Gb NTFS partition and put 700Mb of data onto
it, all of the disc "space" will be used, because 700Mb of data actually
equals 700Mb of data plus 300Mb of guard-band space too. Obviously, you can
put 1Gb of data onto a 1Gb NTFS partition, but that last 300Mb or so will
be fragmented.
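The arithmetic behind that claim is simple enough to write down. Note
that the 70% threshold is the hearsay figure quoted above, not a
documented NTFS constant, and the 1000Mb partition size is chosen to
match the round numbers in the text.

```python
# Back-of-envelope sums for the guard-band claim: on a 1Gb NTFS
# partition, roughly 70% can hold file data before fragmentation begins.
PARTITION_MB = 1000      # the 1Gb partition from the example
USABLE_FRACTION = 0.70   # reported (undocumented) threshold

unfragmented = PARTITION_MB * USABLE_FRACTION   # data stored cleanly
guard_space = PARTITION_MB - unfragmented       # consumed by guard bands

print(f"~{unfragmented:.0f}Mb of data, ~{guard_space:.0f}Mb of guard space")
# i.e. 700Mb of data effectively occupies the whole 1000Mb partition
```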

This appears to be a good argument for not running NTFS partitions at full
capacity. However, the real-world performance of NTFS has never in my
experience really been affected by such things, so the power features of NT
and NTFS (elevator seeking, unified cache manager etc) must be covering up
and alleviating any problems.

So why am I talking about NTFS defragmentation? Simple: Executive
Software have just launched an NTFS defragmenter tool! Executive
Software are well known in the VMS minicomputer marketplace for their
defragmentation tools, so it's good to see such a company come to the
NT marketplace.

Their tool, called Diskeeper, is available in two versions: one for
workstation and one for server. Installation was relatively simple and
straightforward, and requires a system reboot.

Diskeeper is not like the classic Norton Speedisk. It doesn't repack
all the files into a nice ordered arrangement, putting the free space
at the end of the partition. Such an arrangement would be less than
ideal for NTFS, because of the completely different architecture of the
disc format. However, it is worthwhile making sure that each file is
contiguous and thus not fragmented, and so that's what it does. It
examines each file for fragmentation, and defragments the file if
necessary.

However, its real tour de force is that this works even on a live drive
that you are using. It even works on files that you have open at the
time. It manages this feat by working at a very low level in the NT
disc operating system, co-operating with the NTFS fault-tolerant
drivers. Diskeeper runs as a low-priority background task, and thus
doesn't impact foreground performance. The basic mode of operation is
called "Set it and forget it", whereby you schedule the system to check
each drive every four hours or so, and then to do whatever
defragmenting is necessary. Then you can just forget that Diskeeper is
running.

To work out how much fragmentation there is, two analysis tools are
provided. One, called Fragmentation Analysis, does a file-by-file
examination of the disc, reporting things like the number of fragmented
files, the number of fragments used by these files, and the percentage
of disk space that is fragmented. It also gives an "average fragments
per file" rating.

The other analysis tool is called Fragmentation Monitor, which is a
graphical representation of the drive. This applet is most useful, because
it gives you a "45,000 feet" view of your disc, and you can immediately see
where there is fragmentation. Blue bars are unfragmented files, grey is
unused disc space, and red is fragmentation. This analysis tool sounds
ideal, except it appears to have one major limitation: it is too coarse
in its action. By this I mean that there is no zoom-in facility. On a
1280x1024 pixel display, there are 1280 horizontal pixels available. If you
are looking at a 4Gb partition, then each one-pixel wide vertical stripe
represents some 3.125Mb of disc space. Given that the cluster size (and
hence unit of fragmentation) is some 4Kb, it means that each pixel
represents some 800 units of allocatable disc space. It would appear that
Fragmentation Monitor indicates a red bar if there is any fragmentation
within that block of disc space, which means it can be rather pessimistic
in its operation. I've suggested a zoom mode, where you can actually
drill down to the logical sector level; maybe this is something for the
next release.
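The coarseness arithmetic above is easy to reproduce. This sketch uses
binary gigabytes (4 x 1024^3 bytes), which lands close to the "some
800 units" figure quoted; counting in decimal gigabytes instead gives
the 3.125Mb-per-pixel number, so the exact values depend on which
convention you pick.

```python
PARTITION_BYTES = 4 * 1024 ** 3   # a 4Gb partition
DISPLAY_PIXELS = 1280             # horizontal pixels at 1280x1024
CLUSTER_BYTES = 4 * 1024          # 4Kb cluster: the unit of fragmentation

# How much disc each one-pixel-wide stripe of the display covers.
bytes_per_pixel = PARTITION_BYTES / DISPLAY_PIXELS
clusters_per_pixel = bytes_per_pixel / CLUSTER_BYTES

print(f"{bytes_per_pixel / 1024**2:.1f}Mb per pixel")   # about 3.2Mb
print(f"{clusters_per_pixel:.0f} clusters per pixel")   # about 819, "some 800"
```

With roughly 800 clusters behind each pixel, a single fragmented file
anywhere in that span is enough to paint the whole stripe red, which is
why the display reads pessimistically.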

The obvious first question about Diskeeper is the inevitable "is it
safe?". Well, it seems fine so far, and (touch wood!) hasn't lunched
any data. And I've been running it continuously on two machines armed
with about 10Gb of disc store.

The next question is "does it disrupt foreground operation?". Well, the
Fragmentation Monitor is something of a processor hog, and really
shouldn't be run unless you need its specific functionality (which is
rather limited in real usefulness in this release). But the actual
defragmenting task takes little processor load, and it certainly yields
well to foreground tasks doing real work. So no problems there.

Finally, is it worth it? Have I seen significant performance
improvements? To be perfectly honest, I'd have to say that I've not
really seen any dramatic improvements in performance. Viewed in that
light, the cost of the software seems a little extravagant. However, I
know of some users who have reported genuine and meaningful performance
speed-ups when using Diskeeper, so I am quite happy that the benefit is
there. Why haven't I seen big improvements? I think a lot comes down to
the machines themselves. My main NTW workstation has a lot of processor
horsepower (twin 120MHz Pentiums) but, more importantly, it has a lot
of ram (96Mb). So I regularly have 20-30Mb of NT disc cache in
operation. In addition, it has a very fast SCSI adapter (Adaptec 3940
PCI) and fast hard discs (including the awesome 4Gb Quantum Atlas
drive). So I think that the amount of benefit I'd see, given that NT
does read-ahead caching, is likely to be small anyway. The same is true
of my Primary Domain Controller NT Server box, a Pentium 100 with 64Mb
of ram. This is basically a network server box, used to host SQL Server
6.0 and Exchange Server. I can't really say I've seen any performance
boosts there either.

Discussions with colleagues who have used Diskeeper successfully would tend
to suggest that the defragmentation speed-up is noticeable on machines of
limited resources. In other words, a machine with less than 32Mb of ram.
One friend reported significant speed up on a 16Mb ram computer with a full
1Gb hard disc running NT Server.

There are two flies in the ointment. Firstly, Diskeeper installs
slightly modified versions of NTFS.SYS, NTOSKRNL.EXE and so on, which
allow it to do its low-level work. So if you install a Service Pack,
Diskeeper will stop working until you get a new update from Executive
Software. I would be a lot happier if the Diskeeper functions were
built into the standard Microsoft shipping versions of these files, to
ensure that you are always up to date. Although I'm sure the changes
that Executive Software makes to these files are quite trivial, I would
far rather they were built in.

Secondly, it has a few bugs and awkward areas in the user interface
that really need to be sorted out. The Fragmentation Analysis tool gets
all hot and bothered when you give it a partition that's larger than
4Gb, although I would stress that the core defragmenting software works
fine. And the main user interface could do with a major clean-up to
remove some aspects which are just not intuitive at first glance.

Given that, I'm in a quandary. Diskeeper certainly does what it sets
out to achieve. And defragmenting files is certainly "a good thing".
But whether it is worth the entry ticket is a point of debate: some
users say "without doubt it is", others (like me) are not so sure. And
I don't like its reliance on Executive Software continuing to release
fixes to stay in sync with Microsoft NT Service Packs. However, it is
certainly worthy of examination; contact the helpful chaps at
Serverware on 01732 464624 for more information.

---------------------------------------------------------------------------

Service Pack 2

This was released a few hours before I finished this column, so I've
just installed it onto the big Asustek computer.

Well, the help file lists a number of bugs that have been fixed, and
I'm sure they are all laudable ones. However, it's time to trot out my
usual whinge about poor documentation. For example, the Visual Basic 4
Enterprise Edition readme file states: "Visual Basic applications using
machines acting as Remote Automation servers (in other words, running
AUTMGR32.EXE) may consume resources until they finally exhaust all
available resources and the machine hangs. This typically occurs when
memory is not cleaned up after objects are created and then released by
the operating system. This problem is fixed by NT 3.51 Service Pack 2."

So naturally one would expect this NT 3.51 problem, which is apparently
fixed in Service Pack 2, to be documented in the fixed-bug list that is
supplied with Service Pack 2. Er, no. I can't find any mention of it.

I've said this before and I will repeat it until Hell freezes over:
this level of documentation on bug fixes is simply not acceptable for a
mission-critical product. I don't give a damn what other companies get
away with; there are absolute standards that are important, and the
sooner Redmond and Winnersh wake up to this, the better. For sure, I
might get better documentation if I spent thousands of pounds a year on
Microsoft corporate support, but there is no excuse for such a complete
list being absent from the standard service pack. Size is hardly an
excuse: SP2 weighs in at some 7Mb zipped!

The Office95 redraw mess, which I detailed a couple of issues back and
which has been confirmed by many users on the NTWORK forum on Compuserve on
a wide variety of hardware platforms, is still there in SP2. As is the
Z-order window manager bug, also confirmed on a number of machines. As is
the SUBST.EXE inability to work with long directory names containing spaces
(which I reported to Redmond during the NT 3.51 beta program back in
March). And dare I mention the word "Calculator"?

---------------------------------------------------------------------------

Organic Art Update

Just had Mark Atkinson on the phone; he's the programmer wizard and
technical guru behind the Organic Art package I described recently. He
says that the release date has been put back to the end of January, to
allow for improvements to be made. Rather than "just" being a
screensaver, it will hopefully allow you to design and create your own
scenes too. And it should support GLINT graphics accelerators like the
Creative 3D Blaster card. Cost is still hopefully in the 40-50 pounds
bracket. Sounds mouthwatering!

---------------------------------------------------------------------------

Octopus 1.5

I've just received Octopus 1.5 in beta form. This is the hugely
impressive file-system replication engine that I've described before,
which allows real-time replication of directories and files between NT
workstations and servers. At the time, I said it was an essential
purchase for any serious user of NT Server who had backup servers and
who wanted protection against machine failure.

Well, Octopus had one flaw: if the source server went down, the
destination server didn't automatically kick in and take over.
According to the documentation, this now happens automatically. The two
servers maintain a little conversation between them, and when the
destination server sees that the source server is no longer there, it
renames itself and restarts for you. Hence any client connections will
be automatically remade to the destination server. Sounds impressive,
and I'll be trying it out this month. Hopefully Octopus should be
released by the time you read this; it's from Serverware too, like
Diskeeper.
