View the hierarchical structure of the disk c. Disk structure. Check out Windows Explorer
Arkhangelsk State University
Kotlas branch
full-time department
Faculty: technical
Specialty: PGS
Course work
Discipline: computer science
Topic: Disk File Structure
Performed
1st year student
Zhubreva Olga
Alexandrovna
Checked:
Introduction
§ 1 The concept of a file system
§ 2 MS-DOS file system
§ 3 Windows 95 file system
§ 4 Windows NT file system
Conclusion
Introduction.
This methodological manual explains the concept of a "file system", one of
the most important concepts in the course "Computer Software", and presents
the structure of the file systems of the MS-DOS, Windows 95 and Windows NT
operating systems.
The structure of the manual follows from this goal: the material is divided
into 4 main parts (presented as paragraphs), and each part is further
divided, where necessary, into smaller subsections.
§ 1 The concept of a file system.
1.1. File system definition.
File (from the English word "file": a folder or binder).
A file is a named area of memory on a physical medium, intended for storing
information.
The set of operating system facilities that provide access to information
on external media is called a file management system, or file system.
The file system is the functional part of the operating system that is
responsible for exchanging data with external storage devices.
ORGANIZING ACCESS TO THE FILE
Directory structure
Recall how books are stored in a library and how a book is found by its
call number in the catalogue. The way files are stored on a disk and
accessed is organized along the same lines.
Access is the procedure of establishing a connection with memory and a file
located in it in order to write and read data.
The logical drive name that precedes the file name in the specification
identifies the logical drive on which the file is to be searched for. On
that same drive a directory is maintained that records the full names of
the files together with their characteristics: date and time of creation,
size (in bytes) and special attributes. As in a library catalogue, the full
file name registered in the directory serves as the code by which the
operating system finds the location of the file on the disk.
A directory is a list of files that indicates their location on the disk.
A directory can be in one of two states: current (active) or passive. MS
DOS remembers the current directory on each logical drive.
The current (active) directory is the directory in which the user is
working at the present moment.
A passive directory is a directory that is not being worked with at the
present moment.
The MS DOS operating system uses a hierarchical directory structure
(Fig. 9.1). Every disk always has a single main (root) directory. It is at
level 0 of the hierarchy and is denoted by the symbol "\". The root
directory is created when the disk is formatted (initialized), has a
limited size and cannot be deleted by DOS tools. The main directory can
contain other directories and files, which are created by operating system
commands and can be removed by the corresponding commands.
Fig. 9.1. Hierarchical directory structure
A parent directory is a directory that has subdirectories. A subdirectory
is a directory that is contained in another directory.
Thus any directory that contains lower-level directories can be, on the one
hand, a parent to them and, on the other hand, subordinate to a
higher-level directory. As a rule, where this causes no confusion, the term
"directory" is used to mean either a subdirectory or a parent directory,
depending on the context.
Directories on disks are organized as system files. The only exception is
the root directory, for which a fixed area is allocated on the disk. A
directory can be accessed in the same way as an ordinary file.
Note. The directory structure may contain directories that hold no entries
at all; such directories are called empty.
The rules for naming subdirectories are the same as the rules for naming
files (see subsection 9.1). To distinguish them formally from files,
subdirectories are usually given only a name, although a type can be added
by the same rules as for files.
Access to the contents of a file is organized from the main directory
through a chain of subordinate directories (subdirectories) of the i-th
level. A directory at any level can store entries for both files and
lower-level directories.
Fig. 9.2 shows the simplest directory structure, in which the main
(level 0) directory stores only file entries and no lower-level directories
exist.
Fig. 9.3 shows a hierarchical directory structure in which directories at
any level store entries for files and for lower-level directories. A
lower-level directory can be reached only by passing sequentially through
the subordinate directories above it.
Fig. 9.2. Simplest directory structure, with no lower-level directories
Fig. 9.3. Typical directory structure that includes lower-level
directories. A lower-level directory is designated by three digits: the
first gives the level number, the second the serial number of the directory
at that level, and the third the level at which its name is registered.
Every directory name consists of CAT followed by these indexes. For
example, CAT342 is the name of a third-level directory, number 4 at its
level, whose name is registered in a second-level directory.
You cannot go from the main directory straight to a directory at, say,
level 5. You must pass through all the higher-level directories that
precede it.
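The level-by-level descent described above can be sketched in Python. This
is only an illustrative model: the nested dictionary, the function name
find and the sample directory and file names are assumptions made for the
example, not part of MS DOS.

```python
# Hypothetical in-memory model: a directory is a dict, a file is a string.
ROOT = {
    "CAT1": {
        "CAT2": {"F1.TXT": "text file"},
    },
    "KIT.BAS": "BASIC program",
}


def find(root, chain):
    """Descend from the root through every name in `chain`, one level at a time."""
    node = root
    for name in chain:
        # A lower level can only be reached through the directory above it.
        if not isinstance(node, dict) or name not in node:
            raise FileNotFoundError(name)
        node = node[name]
    return node


print(find(ROOT, ["CAT1", "CAT2", "F1.TXT"]))
```

Asking for CAT2 directly from the root fails in this model, because CAT2 is
registered only in CAT1.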
The principle described above, access to a file through a directory, is the
basis of the file system.
The file system is the part of the operating system that manages the
placement of files and directories on the disk and access to them.
Closely related to the file system is the concept of the disk file
structure, by which we mean how the main directory, subdirectories, files
and the operating system are laid out on the disk, and how many sectors,
clusters and tracks are allocated to them.
Rules for forming the disk file structure. When creating the file structure
of a disk, the MS DOS operating system follows a number of rules:
A file or directory can be registered under the same name in different
directories, but only once in any one directory;
The order of file and subdirectory names in the parent directory is
arbitrary;
A file can be split into several parts, which are allocated areas of disk
space of equal size on different tracks and sectors.
Path and prompt
Figs. 9.1 - 9.3 show that a file is accessed through a directory by means
of the file name registered in it. If the directory structure is
hierarchical, the operating system organizes access to the file according
to the position of the subdirectory in which the name of the required file
is registered.
Access to a file can be organized as follows:
If the file name is registered in the current directory, it is enough to
specify only the name in order to access the file;
If the file name is registered in a passive directory, then from the
current directory you must specify the path, i.e. the chain of subordinate
directories through which the file is to be reached.
A path is the chain of subordinate directories that must be traversed
through the hierarchy to reach the directory in which the required file is
registered. In a path, directory names are written in order and separated
from one another by the \ symbol.
The user interacts with the operating system through the command line shown
on the display screen. The command line always begins with a prompt, which
ends with the > symbol. The prompt may display the name of the current
drive, the name of the current directory, the current time and date, the
path, and separator characters.
The operating system prompt is an indication on the display screen that the
operating system is ready to accept user commands.
Example 9.8.
A:\
The current drive is floppy drive A. The current directory is the main
directory, as indicated by the \ symbol.
C:\CAT1\CAT2
The current drive is hard disk C. The current directory is the second-level
directory CAT2, which is contained in the first-level directory CAT1, which
in turn is registered in the main directory.
There are three ways of specifying the path to a file, depending on where
it is registered:
The file is in the current directory (no path). To access the file, it is
enough to specify its full name;
The file is in a passive directory at one of the lower levels, subordinate
to the current directory. To access the file, you must specify a path
listing the names of all lower-level directories lying on the way
(including the directory in which the file is registered);
The file is in a passive directory on a different branch of the hierarchy
from the current directory. To access the file, you must specify the path
starting from the main directory, i.e. starting with the \ character. This
is because movement in a hierarchical structure is possible only
vertically, from the top down; horizontal transitions from directory to
directory are not allowed.
The examples below illustrate the possible forms of the path.
Example 9.9.
Condition: the file F1.TXT is registered in the current first-level
directory K1 of hard disk C. The prompt C:\K1 is therefore displayed on the
screen.
Explanation: in this case there is no path, and to access the file it is
enough to specify only its full name, F1.TXT.
Example 9.10.
Condition: the file F1.TXT is registered in the second-level directory K2
of hard disk C. The current directory is K1, and the prompt on the screen
shows it.
Explanation: in this case the path starts at the current directory K1 and
leads down through its subordinate directory K2. The path through K2 is
therefore written before the full file name.
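The three ways of specifying a path can be sketched as a small Python
function that turns what the user types into a full specification. This is
a simplified illustration, not the actual MS DOS algorithm; the function
name resolve and the sample directory names are assumptions.

```python
def resolve(current_dir, spec):
    """Combine the current directory (e.g. 'C:\\K1') with a typed specification."""
    drive, cur_path = current_dir[:2], current_dir[2:]
    if len(spec) >= 2 and spec[1] == ":":
        return spec                      # drive given: already a full specification
    if spec.startswith("\\"):
        return drive + spec              # path starts from the main directory
    if not cur_path.endswith("\\"):
        cur_path += "\\"
    return drive + cur_path + spec       # path continues down from the current directory


print(resolve("C:\\K1", "F1.TXT"))           # file in the current directory
print(resolve("C:\\K1", "K2\\F1.TXT"))       # file in a subordinate directory
print(resolve("C:\\K1", "\\CAT1\\F1.TXT"))   # file on another branch
```

The third call shows why a path to another branch has to begin with the \
character: without it, the path would be interpreted relative to K1.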
Now that the concept of a path is familiar, let us return to the concept of
a file specification introduced in subsection 9.1. A distinction is made
between the short file specification and the full file specification, which
includes the path. Fig. 9.4 shows the rules for forming each kind of
specification.
Fig. 9.4. Specification formats (optional parameters are indicated)
Example 9.12. Short form of the file specification
C:\KIT.BAS
The file KIT.BAS, containing a BASIC program, is located in the main
directory of the hard disk.
Full form of the file specification
C:\CAT1\CAT2\BOOK1.TXT
The text file BOOK1.TXT is registered in the second-level directory CAT2 of
hard disk C.
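As an illustration, a file specification can be split into its parts
(drive, path, name and type) with a few string operations. The function
name parse_spec and the dictionary keys are assumptions chosen for this
sketch, and the sample file names are hypothetical.

```python
def parse_spec(spec):
    """Split a file specification into drive, directory chain, name and type."""
    drive = ""
    if len(spec) >= 2 and spec[1] == ":":
        drive, spec = spec[:2], spec[2:]
    *dirs, filename = spec.split("\\")
    dirs = [d for d in dirs if d]            # drop the empty piece before a leading \
    name, _, ftype = filename.partition(".")
    return {"drive": drive, "path": dirs, "name": name, "type": ftype}


print(parse_spec("C:\\KIT.BAS"))
print(parse_spec("C:\\CAT1\\CAT2\\F1.TXT"))
```

For the short form the path list is empty; for the full form it holds the
chain of directories between the main directory and the file.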
Structure of directory entries
We now turn to the structure of the entries that a directory stores for
lower-level files and subdirectories.
The directory entry for a file contains the file name and type, the file
size in bytes, the date and time of creation, and a number of other
parameters that the operating system needs in order to organize access.
The entry for a lower-level subdirectory in the parent directory contains
its name, attribute, and date and time of creation.
Consider the possible variants of directory contents.
1st option. The directory stores only file entries (Fig. 9.5). A message
giving the directory name is displayed before the file entries; in this
case it is the main directory of floppy disk A. At the end of the directory
contents a message gives the number of files stored on the disk and the
free disk space in bytes. For example:
4 file(s) 359560 bytes free
Here 4 is the number of files on the disk and 359560 is the free disk space
in bytes.
2nd option. The directory stores only entries for lower-level directories
(Fig. 9.6). At the end of the directory, as in the previous case, there is
a similar line giving the amount of free disk space.
3rd option. The directory stores entries for both files and lower-level
directories (Fig. 9.7). This structure shows that the directory contains
3 files and 2 lower-level directories, BASIC and LEXICON. The disk has
2.6575 MB of free space.
Fig. 9.7. The main directory stores files and subdirectories
The three variants discussed above show the contents of the main directory.
The structure of directories at level 1 and below is identical and differs
from the main directory only in that two entries consisting of dots are
placed before the entries for files and lower-level directories (Fig. 9.8).
The dots at the beginning indicate that the screen shows the contents of a
subdirectory, here the first-level directory KNIGA, which contains two text
files, SVET and TON.
|Directory of C:\KNIGA      |          |       |
|.                          |11-12-90  |09:40  |
|..                         |10-10-91  |08:30  |
|svet    txt    55700       |04-04-90  |10:05  |
|ton     txt    60300       |03-05-91  |11:20  |
|2 files 912348 bytes free  |          |       |
Fig. 9.8. Structure of entries in a subdirectory
1.2. The FAT file system.
Windows operating systems use the FAT file system, which was developed for
DOS. In it, every DOS partition and volume has a boot sector, and every DOS
partition contains two copies of the file allocation table (FAT).
The FAT is a table that records the relationship between the files and
folders of a partition and their physical location on the hard disk.
At the start of each hard disk partition there are two copies of the FAT,
located one after the other. Like the boot sectors, the FAT lies outside
the area of the disk that is visible to the file system.
When files are written to disk, they do not necessarily occupy exactly the
space that corresponds to their size. Files are normally split into
clusters of a fixed size, which may be scattered across the partition. As a
result, the FAT is not a list of files and their locations but a list of
the partition's clusters and their contents, with a reference from each
cluster to the next cluster occupied by the same file.
FAT entries are 12-, 16- or 32-bit numbers, usually written in hexadecimal.
Their size is determined by the FDISK program, and their values are
generated directly by the FORMAT program.
On all floppy disks and on hard disks of up to 16 MB, the FAT uses 12-bit
entries. Hard and removable disks of 16 MB or more usually use 16-bit
entries.
The FAT file system was used in all versions of MS-DOS and in the first two
releases of OS/2 (versions 1.0 and 1.1). Each logical volume had its own
FAT, which served two purposes: it held the allocation information for
every file on the volume, in the form of a linked list of allocation units
(clusters), and it showed which allocation units were free.
When the FAT was invented, it was an excellent solution for managing disk
space, mainly because the floppy disks on which it was used were rarely
larger than a few megabytes. The FAT was small enough to be kept in memory
permanently, which allowed very fast random access to any part of any file.
When the FAT was applied to hard disks, it became too large to be kept
resident in memory, and system performance suffered. In addition, because
the information about free disk space was spread across a large number of
FAT sectors, it was awkward to use when allocating space for files, and
file fragmentation became an obstacle to high efficiency.
Furthermore, the relatively large clusters used on hard disks left a great
deal of space unused, since on average half a cluster was wasted for every
file.
For several years Microsoft and IBM tried to extend the life of the FAT
file system by lifting the limits on volume size, improving allocation
strategies, caching path names, and moving tables and buffers into extended
memory. These could only be regarded as temporary measures, however,
because the file system was simply not suited to large random-access
devices.
§ 2 File system of the MS-DOS operating system.
One of the concepts of the MS DOS file system is the logical disk.
Logical drives:
To DOS, each logical disk is a separate magnetic disk. Each logical disk
has its own unique name. Letters of the English alphabet from A to Z
(inclusive) are used as logical drive names. The number of logical drives
is therefore no more than 26.
The letters A and B are reserved strictly for the floppy disk drives (FDD)
of the IBM PC. Starting with the letter C, names are given to the logical
drives (partitions) of the hard disk (HDD, "Winchester").
The figures show a representation of a logical disk.
If a given IBM PC has only one FDD, the letter B is skipped.
Only logical drives A and C can be system drives.
File structure of a logical disk:
To access information on a disk (located in a file), you need to know the
physical address of the first sector (surface number, track number, sector
number), the total number of clusters occupied by the file, the address of
the next cluster if the file is larger than one cluster, and so on. All of
this is obscure, difficult and unnecessary for the user.
MS DOS relieves the user of this work and does it itself. To provide access
to files, the MS DOS file system organizes and maintains a specific file
structure on the logical disk.
File structure elements:
Start sector (bootstrap sector, Boot sector),
File allocation table (FAT),
Root directory,
Data area (the remaining free disk space).
These elements are created by special programs (in the MS DOS environment)
during disk initialization.
Start sector (boot sector, Boot sector):
Here is the information required by MS DOS to work with the disk:
OS ID (if the disk is system),
Disk sector size,
Number of sectors in the cluster,
Number of spare sectors at the beginning of the disk,
Number of FAT copies on the disk (standard - two),
Number of items in the directory,
Number of sectors on the disk,
Disc format type,
Number of sectors in FAT,
Number of sectors per track,
Number of surfaces,
OS boot block.
The FAT follows the start sector.
FAT (File Allocation Table):
MS DOS treats the disk data area (see above) as a sequence of numbered
clusters.
The FAT is an array of elements that address the clusters of the disk's
data area. Each cluster of the data area corresponds to one FAT element.
The FAT elements form chains of links to the clusters of each file in the
data area.
The FAT is an extremely important element of the file structure. Damage to
the FAT can lead to complete or partial loss of information on the entire
logical disk. That is why two copies of the FAT are stored on the disk.
There are special programs that monitor the state of the FAT and correct
damage.
Root directory:
This is a specific area of the disk, created when the disk is initialized
(formatted), that contains information about the files and directories
stored on the disk.
The root directory always exists on a formatted disk, and a disk always has
exactly one. The size of the root directory on a given disk is fixed, so
the maximum number of files and other (child) directories, or
subdirectories, "attached" to it is strictly limited.
Summing up, we can conclude that MS-DOS is a 16-bit operating system that
runs in the real mode of the processor.
§ 4 File system of the Windows 95 operating system.
4.1. Background to the creation of FAT32.
In 1987 a crisis arose in the field of personal computers. The capabilities
of the FAT file system, which Microsoft had developed more than ten years
earlier for the Standalone Disk Basic interpreter and later adapted for the
DOS operating system, were exhausted. FAT had been designed for hard disks
of no more than 32 MB, and the new, larger HDDs turned out to be of no use
to PC users. Some independent vendors offered their own solutions to this
problem, but only with the arrival of DOS 4.0 was the crisis overcome, for
a while.
Significant changes to the file system structure in DOS 4.0 allowed the
operating system to work with disks of up to 128 MB; later minor additions
raised this limit to 2 GB. At the time this capacity seemed to exceed any
conceivable need. However, if the history of personal computers teaches
anything, it is that a capacity which "exceeds any conceivable need" very
quickly becomes "barely sufficient for serious work". Indeed, hard disks
now on sale usually have a capacity of 2.5 GB or more, sometimes much more,
and the 2 GB ceiling that once freed us from restrictions has become yet
another obstacle to overcome.
4.2. Description of FAT32.
For Windows 95, Microsoft developed a new extension of the FAT system,
FAT32, and shipped it without any fanfare in OEM Service Release 2.
FAT32 is installed only on new PCs, and you should not count on getting it
by upgrading to a new version of Windows 95, although Microsoft states that
this extension will become part of the main Windows upgrade package.
4.2.1. Disk areas
This file system provides a number of special areas on the disk, set aside
when the disk is formatted in order to organize the disk space: the master
boot record, the partition table, the boot record, the file allocation
table (from which the FAT system takes its name) and the root directory.
At the physical level, disk space is divided into 512-byte areas called
sectors. The FAT system allocates space to files in blocks that consist of
a whole number of sectors and are called clusters. The number of sectors in
a cluster must be a power of two. Microsoft calls these clusters allocation
units, and the SCANDISK report gives their size, for example "16,384 bytes
in each allocation unit".
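The relationship between sectors and clusters can be checked with a short
Python sketch. The function name cluster_size is an assumption made for the
example; the 512-byte sector size and the power-of-two rule come from the
text above.

```python
SECTOR_SIZE = 512  # bytes per sector, as stated in the text


def cluster_size(sectors_per_cluster):
    """Return the cluster size in bytes for a valid sectors-per-cluster count."""
    n = sectors_per_cluster
    # A positive integer is a power of two exactly when it has a single 1 bit.
    if n < 1 or (n & (n - 1)) != 0:
        raise ValueError("sectors per cluster must be a power of two")
    return n * SECTOR_SIZE


print(cluster_size(32))  # the 16,384-byte allocation unit from the SCANDISK example
```

A cluster of 32 sectors gives the 16,384 bytes quoted from the SCANDISK
report, and 64 sectors give a 32 KB cluster.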
4.2.2. FAT chain
The FAT is a database that links the clusters of disk space to files. It
holds exactly one element for each cluster. The first two elements contain
information about the FAT system itself. The third and subsequent elements
correspond to clusters of disk space, starting with the first cluster
available for files. A FAT element can contain one of several special
values, indicating that:
The cluster is free, i.e. not used by any file;
The cluster contains one or more sectors with physical defects and must not
be used;
The cluster is the last cluster of a file.
For any cluster that is used by a file but is not its last cluster, the FAT
element contains the number of the next cluster occupied by the file.
Every directory, whether the root directory or a subdirectory, is also a
database. A DOS directory holds one main entry for each file (Windows 95
adds further entries for long file names). Unlike the FAT, where each
element consists of a single field, a directory entry for a file consists
of several fields. Some of them (name, extension, size, date and time) can
be displayed with the DIR command. But the FAT system also provides a field
that DIR does not display: the number of the first cluster allocated to the
file.
When a program asks the operating system for the contents of a file, the OS
looks through the directory entry for that file to find its first cluster.
It then reads the FAT element for that cluster to find the next cluster in
the chain. By repeating this process until the last cluster of the file is
found, the OS determines exactly which clusters belong to the file and in
what order. In this way the system can give the program any part of the
file it requests. This way of organizing a file is called a FAT chain.
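Following a FAT chain can be sketched in Python as below. The table is a
small made-up example held in a list, and the marker values follow the
FAT16 convention in simplified form (0 for a free cluster, 0xFFF7 for a bad
one, 0xFFF8 and above for the last cluster of a file). The function name
cluster_chain and the sample table are assumptions for illustration.

```python
FREE = 0x0000
BAD = 0xFFF7
END_MIN = 0xFFF8  # any value from here up marks the last cluster of a file


def cluster_chain(fat, first_cluster):
    """Return the ordered list of clusters of a file, starting at first_cluster."""
    chain = []
    seen = set()
    cluster = first_cluster
    while True:
        if cluster in seen:
            raise ValueError("damaged FAT: the chain loops back on itself")
        seen.add(cluster)
        chain.append(cluster)
        nxt = fat[cluster]
        if nxt >= END_MIN:
            return chain            # last cluster of the file reached
        if nxt in (FREE, BAD):
            raise ValueError("damaged FAT: the chain runs into a free or bad cluster")
        cluster = nxt


# Elements 0 and 1 are reserved; the file occupies clusters 2 -> 3 -> 7.
fat = [0xFFF8, 0xFFFF, 3, 7, FREE, FREE, BAD, 0xFFFF, FREE, FREE]
print(cluster_chain(fat, 2))
```

The starting cluster (2 here) is the value that the operating system would
read from the file's directory entry.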
In the FAT system, files are always allocated a whole number of clusters.
On a 1.2 GB hard disk with 32 KB clusters, the directory may show that a
text file containing the words "hello, world" is only 12 bytes long, yet
the file actually occupies 32 KB of disk space. The unused part of the
cluster is called slack (wasted space). For small files almost the entire
cluster may be slack, and on average the loss is half a cluster per file.
On an 850 MB hard disk with 16 KB clusters and an average file size of
about 50 KB, about 16% of the disk space allocated to files is lost to
space that is allocated but unused.
One way to recover disk space is to use a disk compression program such as
DriveSpace, which makes the "lost space" available to other files.
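The slack figures quoted above can be reproduced with a few lines of
Python. The function names are assumptions for this sketch. The 16% figure
is read here as the average loss of half a cluster divided by the average
file size; that reading is an assumption, although it matches the number in
the text.

```python
KB = 1024


def allocated_bytes(file_size, cluster_size):
    """Disk space actually taken by a file: a whole number of clusters."""
    clusters = -(-file_size // cluster_size)  # ceiling division
    return clusters * cluster_size


def slack(file_size, cluster_size):
    """Bytes that are allocated to the file but not used by it."""
    return allocated_bytes(file_size, cluster_size) - file_size


# A 12-byte "hello, world" file on a disk with 32 KB clusters.
hello_slack = slack(12, 32 * KB)

# Average loss of half a 16 KB cluster, relative to a 50 KB average file.
average_loss = (16 * KB) / 2
ratio = average_loss / (50 * KB)

print(hello_slack, ratio)
```

With these inputs the 12-byte file wastes all but 12 bytes of its cluster,
and the ratio comes out at 0.16.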
4.2.3. Other changes in FAT32
To make it possible to work with the larger number of clusters, each file's
directory entry must set aside 4 bytes for the file's starting cluster
(instead of 2 bytes in FAT16). Traditionally each directory entry consists
of 32 bytes (Fig. 1). In the middle of the entry, 10 bytes (bytes 12 to 21)
were unused; Microsoft had reserved them for its own future needs. Two of
them are now used as the additional bytes needed to specify the starting
cluster in FAT32.
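A 32-byte directory entry of this kind can be unpacked with Python's struct
module, as in the sketch below. The field offsets used (name in bytes 0-7,
extension in 8-10, attribute in 11, high half of the starting cluster in
20-21, low half in 26-27, size in 28-31) follow the commonly documented
FAT32 layout rather than anything stated in this text, so treat them as an
assumption; bytes 12 to 19 are simply skipped, and the function name
parse_entry is hypothetical.

```python
import struct

# name, extension, attribute, 8 skipped bytes, cluster high word,
# time, date, cluster low word, file size (little-endian, 32 bytes total)
ENTRY = struct.Struct("<8s3sB8x4HI")

ATTR_DIRECTORY = 0x10


def parse_entry(raw):
    """Decode one 32-byte directory entry into a dictionary."""
    name, ext, attr, hi, _time, _date, lo, size = ENTRY.unpack(raw)
    return {
        "name": name.decode("ascii").rstrip(),
        "type": ext.decode("ascii").rstrip(),
        "is_dir": bool(attr & ATTR_DIRECTORY),
        "first_cluster": (hi << 16) | lo,   # 4 bytes in total under FAT32
        "size": size,
    }


# A made-up entry for illustration: file SVET.TXT starting at cluster 0x10002.
raw = ENTRY.pack(b"SVET    ", b"TXT", 0x20, 0x0001, 0, 0, 0x0002, 55700)
print(parse_entry(raw))
```

Under FAT16 the high word would be zero and only the 2-byte low word would
carry the starting cluster.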
The operating system has always provided for two copies of the FAT, but
only one of them was used. With the move to FAT32, the operating system can
work with either copy. Another change is that the root directory, which
previously had a fixed size and a strictly defined location on the disk,
can now grow freely as needed, like a subdirectory. There is no longer any
limit on the number of entries in the root directory. This is especially
important because each long file name takes up several directory entries.
The combination of a relocatable root directory and the ability to use both
copies of the FAT is a good basis for smooth dynamic resizing of disk
partitions, for example shrinking a partition to free space for another
operating system. This new approach is less dangerous than those used by
third-party programs for resizing partitions under FAT16.
From all of the above we can conclude:
MS-DOS was a purely 16-bit operating system and ran in the real mode of the
processor. In Windows 3.1, part of the code was 16-bit and part 32-bit.
Windows 3.0 supported the real mode of the processor; when version 3.1 was
developed, it was decided to drop that support.
Windows 95 is a 32-bit operating system. Only some modules contain 16-bit
code, for compatibility with MS-DOS mode. Windows 95 uses 32-bit code
wherever possible.
§ 5 File system of the Windows NT operating system.
5.1. Brief description of the Windows NT operating system.
The global computer industry is currently developing very rapidly. System
performance is rising, and with it the ability to process large volumes of
data. Operating systems of the MS-DOS class can no longer cope with this
flow of data and cannot make full use of the resources of modern computers.
For this reason there has recently been a move to more powerful and more
advanced operating systems of the UNIX class, an example of which is
Windows NT, released by Microsoft Corporation.
A user who sees the Microsoft Windows NT operating system for the first
time notices a clear outward resemblance to the familiar interface of
Windows 3.x. This visible similarity, however, is only a minor part of
Windows NT.
Windows NT is a 32-bit operating system with preemptive (priority-based)
multitasking. Its fundamental components include security features and
well-developed network services.
Windows NT also provides compatibility with many other operating systems,
file systems and networks.
As the following figure shows, Windows NT is a modular operating system
(more advanced than a monolithic one) that consists of separate,
interconnected and relatively simple modules.
The main modules of Windows NT, listed from the lowest level of the
architecture to the highest, are the Hardware Abstraction Layer (HAL), the
kernel, the Executive, the protected subsystems and the environment
subsystems.
Modular structure of Windows NT
5.2. Windows NT file system.
When Windows NT was first released, it supported three file systems: the
File Allocation Table (FAT), which provided compatibility with MS-DOS; the
High Performance File System (HPFS), which provided compatibility with LAN
Manager; and a new file system, the New Technology File System (NTFS).
NTFS had a number of advantages over the file systems used at that time on
most file servers.
To ensure data integrity, NTFS keeps a transaction log. This approach does
not rule out the loss of information, but it considerably increases the
likelihood that the file system will remain accessible even if the
integrity of the server has been compromised. This is possible because the
transaction log is used to track incomplete attempts to write to the disk
the next time Windows NT boots. The transaction log is also used to check
the disk for errors, instead of checking every file as is done with a file
allocation table.
One of the main advantages of NTFS is security. NTFS makes it possible to
add Access Control Entries (ACEs) to an Access Control List (ACL). An ACE
contains the identifier of a group or user and an access mask, which can be
used to restrict access to a particular directory or file. This access may
include the ability to read, write, delete, execute and even take ownership
of files.
An ACL, in turn, is a container that holds one or more ACEs. It makes it
possible to restrict the access of particular users or groups of users to
specific directories or files.
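The relationship between an ACL and its ACEs can be illustrated with a
deliberately simplified Python model. It is not the NTFS implementation:
the class and function names, the permission bits and the user and group
names are all assumptions, and the model handles only "allow" entries,
whereas a real system has further rules.

```python
from dataclasses import dataclass

# Hypothetical permission bits for the sketch.
READ, WRITE, DELETE, EXECUTE = 1, 2, 4, 8


@dataclass(frozen=True)
class ACE:
    trustee: str   # user or group name
    mask: int      # permissions granted to that trustee


def is_allowed(acl, identities, requested):
    """True if the ACEs matching the caller's identities grant every requested bit."""
    granted = 0
    for ace in acl:
        if ace.trustee in identities:
            granted |= ace.mask
    return (granted & requested) == requested


# An ACL is simply a list (container) of ACEs attached to a file or directory.
acl = [ACE("alice", READ | WRITE), ACE("students", READ)]

print(is_allowed(acl, {"alice"}, WRITE))
print(is_allowed(acl, {"students"}, WRITE))
```

A caller whose identities match no entry in the list is granted nothing,
which is how the list restricts access to the named users and groups.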
In addition, NTFS supports long names of up to 255 characters, containing
uppercase and lowercase letters in any order. One of the main features of
NTFS is the automatic creation of equivalent names that are compatible with
MS-DOS.
NTFS also has a compression feature, which first appeared in NT version
3.51. It makes it possible to compress any file, directory or NTFS disk.
Unlike MS-DOS compression programs, which create a virtual disk in the form
of a hidden file and compress all the data on that disk, Windows NT uses an
additional layer of the file subsystem to compress and decompress the
required files without creating a virtual disk. This is useful for
compressing either a particular part of a disk (for example, a user
directory) or files of a particular type (for example, graphics files). The
only drawback of NTFS compression is its low compression ratio compared
with MS-DOS compression schemes. On the other hand, NTFS compression is
more reliable and performs better.
So, from all of the above we can conclude:
For compatibility with other operating systems, Windows NT supports the FAT
file system. In addition, Windows NT has its own file system, NTFS, which
is not compatible with FAT. NTFS has a number of advantages over FAT and
offers higher reliability and performance.
Conclusion.
MS-DOS is a 16-bit operating system that runs in the real mode of the
processor. In Windows 3.1, part of the code is 16-bit and part 32-bit.
Windows 3.0 supported the real mode of the processor; when version 3.1 was
developed, it was decided to drop that support.
Windows 95 is a 32-bit operating system that runs only in the protected
mode of the processor. The kernel, including memory management and process
scheduling, contains only 32-bit code, which reduces overhead and speeds up
operation. Only some modules contain 16-bit code, for compatibility with
MS-DOS mode. Windows 95 uses 32-bit code wherever possible, which improves
the reliability and fault tolerance of the system. In addition, 16-bit code
is used for compatibility with legacy applications and drivers.
Windows NT is not a further development of earlier products. Its
architecture was created from scratch to meet the requirements for a modern
operating system. To keep the new operating system compatible, the Windows
NT developers retained the familiar Windows interface and implemented
support for existing file systems (such as FAT) and for various
applications (written for MS-DOS and Windows 3.x). The developers also
included in Windows NT tools for working with various kinds of networks.
Reliability and robustness are provided by architectural features that
protect application programs from damaging one another and the operating
system. Windows NT uses fault-tolerant structured exception handling at all
architectural levels, includes the recoverable NTFS file system, and
provides protection through its built-in security system and advanced
memory management techniques.
Users access files by symbolic names. However, human memory limits the number of object names that a user can refer to by name. The hierarchical organization of the namespace allows us to significantly expand these boundaries. This is why most file systems have a hierarchical structure, in which levels are created by allowing a lower-level directory to be contained within a higher-level directory (Figure 19).
Fig. 19. File system hierarchy:
a – single-level organization; b – tree; c – network
The graph describing the directory hierarchy can be a tree or a network. Directories form a tree if a file is allowed to be included in only one directory (Fig. 19, b), and a network - if the file can be included in several directories at once (Fig. 19, c). For example, in MS-DOS and Windows, directories form a tree structure, while in UNIX they form a network structure. In a tree structure, each file is a leaf. The top-level directory is called the root directory, or root.
With this organization, the user is freed from remembering the names of all files; he only needs to have a rough idea of which group a particular file can be assigned to in order to find it by sequentially browsing directories. The hierarchical structure is convenient for multi-user work: each user with their files is localized in their own directory or subtree of directories, and at the same time, all files in the system are logically connected.
A special case of a hierarchical structure is a single-level organization, when all files are included in one directory (Fig. 19, a).
File names
All file types have symbolic names. Hierarchically organized file systems typically use three types of file names: simple, compound, and relative.
A simple, or short, symbolic name identifies a file within a single directory. Simple names are assigned to files by users and programmers, and they must take into account OS restrictions on both the range of characters and the length of the name. Until relatively recently, these boundaries were very narrow. Thus, in the FAT file system, the length of names was limited to scheme 8.3 (8 characters - the name itself, 3 characters - the name extension), and in the s5 file system, supported by many versions of the UNIX OS, a simple symbolic name could not contain more than 14 characters. However, it is much more convenient for the user to work with long names because they allow you to give the files easy-to-remember names that clearly indicate what is contained in the file. Therefore, modern file systems, as well as improved versions of pre-existing file systems, tend to support long, simple symbolic file names. For example, on the NTFS and FAT32 file systems included with the Windows NT operating system, a file name can contain up to 255 characters.
Examples of simple file and directory names:
Supplement to CD 254L in Russian.doc
installable filesystem manager.doc
In hierarchical file systems, different files are allowed to have the same simple symbolic names, provided they belong to different directories. That is, the “many files - one simple name” scheme works here. To uniquely identify a file in such systems, a so-called full name is used.
The full name is a chain of simple symbolic names of all directories through which the path from the root to the given file passes. Thus, the full name is a compound name, in which simple names are separated from each other by the separator accepted in the OS. Often a forward slash or a backslash is used as the separator, and it is customary not to specify the name of the root directory. In Fig. 19, b two files have the simple name main.exe, but their compound names /depart/main.exe and /user/anna/main.exe are different.
In a tree file system, there is a one-to-one correspondence between a file and its full name “one file – one full name”. In file systems that have a network structure, a file can be included in several directories, which means it can have several full names; here the correspondence “one file - many full names” is valid. In both cases, the file is uniquely identified by its full name.
A file can also be identified by a relative name. The relative file name is determined through the concept of “current directory”. For each user, at any given time, one of the file system directories is the current directory, and this directory is selected by the user himself upon an OS command. The file system captures the name of the current directory so that it can then use it as a complement to relative names to form the fully qualified file name. When using relative names, the user identifies a file by the chain of directory names through which the route from the current directory to the given file passes. For example, if the current directory is /user, then the relative file name /user/anna/main.exe is anna/main.exe.
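The relationship between relative and full names can be sketched in a few lines of Python. This is only an illustration under the assumption of UNIX-style names with "/" as the separator; the helper names full_name and relative_name are hypothetical and not part of any operating system interface.

```python
import posixpath


def full_name(current_dir: str, name: str) -> str:
    """Complete a relative name with the current directory.

    A name that already starts at the root is returned unchanged
    (apart from normalization).
    """
    if posixpath.isabs(name):
        return posixpath.normpath(name)
    return posixpath.normpath(posixpath.join(current_dir, name))


def relative_name(current_dir: str, full: str) -> str:
    """Express a full name as a route starting from the current directory."""
    return posixpath.relpath(full, start=current_dir)


print(full_name("/user", "anna/main.exe"))
print(relative_name("/user", "/user/anna/main.exe"))
```

With the current directory /user, the first call yields the full name /user/anna/main.exe and the second call yields the relative name anna/main.exe, matching the example in the text.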
Some operating systems allow you to assign multiple simple names to the same file, which can be interpreted as aliases. In this case, just as in a system with a network structure, the correspondence “one file – many full names” is established, since each simple file name corresponds to at least one full name.
And although the full name uniquely identifies the file, it is easier for the operating system to work with the file if there is a one-to-one correspondence between the files and their names. For this purpose, it assigns a unique name to the file, so that the ratio “one file - one unique name” is valid. The unique name exists along with one or more symbolic names assigned to the file by users or applications. The unique name is a numeric identifier and is intended only for the operating system. An example of such a unique file name is the inode number in UNIX system.
Mounting
In general, a computer system may have several disk devices. Even a typical personal computer usually has one hard drive, one floppy drive, and a CD-ROM drive. Powerful computers are usually equipped with a large number of disk drives on which disk packs are installed. Moreover, even one physical device can be represented by the operating system as several logical devices, in particular by dividing the disk space into partitions. The question arises: how should file storage be organized in a system with several external memory devices?
The first solution is that each device hosts a self-contained file system, that is, the files located on this device are described by a directory tree that is in no way connected to the directory trees on other devices. In this case, to uniquely identify the file, the user must specify the logical device identifier along with the compound symbolic file name. An example of such an autonomous existence of file systems is the MS-DOS operating system, in which the full file name includes the letter identifier of the logical drive. So, when accessing a file located on drive A, the user must specify the name of this drive: A:\privat\letter\uni\let1.doc.
Another option is to organize file storage in which the user is given the opportunity to combine file systems located on different devices into a single file system, described by a single directory tree. This operation is called mounting. Let's look at how this operation is carried out using the UNIX OS as an example.
Among all the logical disk devices available in the system, the operating system distinguishes one device, called the system one. Let there be two file systems located on different logical drives (Fig. 20), and one of the drives is the system drive.
The file system located on the system disk is designated as the root file system. To link the file hierarchies, an existing directory in the root file system is selected, in this example the man directory. Once the mount is complete, the selected man directory becomes the root directory of the second file system. Through this directory, the mounted file system is attached as a subtree to the general tree (Fig. 21).
Fig. 20. Two file systems before mounting
Fig. 21. Shared file system after mounting
Once a shared file system is mounted, there is no logical difference for the user between the root and mounted file systems; in particular, file naming is done in the same way as if it had been a single file system to begin with.
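How a mounted tree is consulted when a name is looked up can be sketched as follows. This is a simplified model, not the actual UNIX implementation: the mount table, the device names disk0 and disk1, and the sample paths are assumptions made for illustration; only the man mount point comes from the example above.

```python
import posixpath

# Hypothetical mount table: mount point -> logical device.
# "/" is the root file system on the system disk; the second
# file system is attached at the existing directory /man.
MOUNTS = {"/": "disk0", "/man": "disk1"}


def resolve(path, mounts=MOUNTS):
    """Return (device, path inside that device's own directory tree)."""
    path = posixpath.normpath(path)
    best = "/"
    for point in mounts:
        prefix = point.rstrip("/") + "/"
        # The longest mount point that covers the path wins.
        if (path == point or path.startswith(prefix)) and len(point) > len(best):
            best = point
    inner = "/" + path[len(best):].lstrip("/")
    return mounts[best], inner


print(resolve("/man/man1/ls.1"))
print(resolve("/etc/passwd"))
```

The user always writes one kind of name; the lookup alone decides which device holds the file and which path to follow inside that device's tree.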
File attributes
The concept of “file” includes not only the data and name it stores, but also its attributes. File attributes are information that describes the properties of a file. Examples of possible file attributes:
file type (regular file, directory, special file, etc.);
file owner;
file creator;
password to access the file;
information about permitted file access operations;
times of creation, last access and last change;
current file size;
maximum file size;
“read-only” flag;
“hidden file” flag;
“system file” flag;
“archive file” flag;
“binary/character” attribute;
“temporary” attribute (remove after the process is completed);
lock flag;
length of the record in the file;
pointer to the key field in the record;
key length.
The set of file attributes is determined by the specifics of the file system: file systems of different types may use different sets of attributes to characterize files. For example, on file systems that support only flat files, there is no need for the last three attributes in the list, which are related to file structuring. In a single-user OS, the set of attributes will lack characteristics relevant to users and security, such as the owner of the file, the creator of the file, the password for accessing the file, and information about authorized access to the file.
The user can access attributes using the facilities provided for this purpose by the file system. Typically, you can read the values of any attribute, but only change some. For example, a user can change the permissions of a file (provided they have the necessary permissions to do so), but they cannot change the creation date or current size of the file.
File attribute values can be directly contained in directories, as is done in the MS-DOS file system (Fig. 22, a). The figure shows the structure of a directory entry containing a simple symbolic name and file attributes. Here the letters indicate the characteristics of the file: R - read-only, A - archived, H - hidden, S - system.
Fig. 22. Directory structure:
a – MS-DOS directory entry structure (32 bytes); b – UNIX OS directory entry structure
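As an illustration of a directory that stores attributes directly, the sketch below decodes a 32-byte MS-DOS directory entry. It assumes the standard FAT entry layout (8-byte name, 3-byte extension, attribute byte at offset 0Bh, starting cluster at 1Ah, size at 1Ch); the sample entry is constructed by hand and the function name parse_dir_entry is hypothetical.

```python
import struct

# Bits of the attribute byte, using the letters from the text.
ATTR_FLAGS = {0x01: "R", 0x02: "H", 0x04: "S", 0x20: "A"}


def parse_dir_entry(entry: bytes) -> dict:
    """Decode the name, attribute flags, first cluster and size of one entry."""
    if len(entry) != 32:
        raise ValueError("an MS-DOS directory entry is exactly 32 bytes")
    name, ext, attr = struct.unpack_from("<8s3sB", entry, 0x00)
    first_cluster, size = struct.unpack_from("<HI", entry, 0x1A)
    name = name.decode("ascii").rstrip()
    ext = ext.decode("ascii").rstrip()
    return {
        "name": name + ("." + ext if ext else ""),
        "flags": "".join(c for bit, c in ATTR_FLAGS.items() if attr & bit),
        "first_cluster": first_cluster,
        "size": size,
    }


# Hand-made sample: LET1.DOC, read-only + archive, cluster 2, 1024 bytes.
sample = b"LET1    DOC" + bytes([0x21]) + bytes(14) + struct.pack("<HI", 2, 1024)
print(parse_dir_entry(sample))
```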
Another option is to place attributes in special tables, so that the directories contain only links to these tables. This approach is implemented, for example, in the ufs file system of the UNIX OS. In this file system, the directory structure is very simple. The record for each file contains a short symbolic file name and a pointer to the file's index descriptor (inode), which is the ufs name for the table in which the file attribute values are concentrated (Fig. 22, b).
In both versions, directories provide a link between file names and the files themselves. However, the approach of separating the file name from its attributes makes the system more flexible. For example, a file can easily be included in several directories at once. Entries for this file in different directories may have different simple names, but the link field will have the same inode number.
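The separation of names from attributes can be modeled with two small tables. This is a toy in-memory model rather than real ufs structures; the inode number 17, the attribute values, and the name tool.exe are invented for the example.

```python
# Hypothetical inode table: inode number -> attribute table of the file.
inodes = {17: {"owner": "anna", "size": 2048, "links": 0}}

# Directories contain only pairs (simple name -> inode number).
directories = {"/user/anna": {}, "/depart": {}}


def link(directory: str, name: str, inode: int) -> None:
    """Include an existing file in a directory under a simple name."""
    directories[directory][name] = inode
    inodes[inode]["links"] += 1


def attributes(directory: str, name: str) -> dict:
    """Follow the link field of a directory entry to the attribute table."""
    return inodes[directories[directory][name]]


link("/user/anna", "main.exe", 17)
link("/depart", "tool.exe", 17)

# Two different simple names in two directories lead to the same inode.
print(attributes("/user/anna", "main.exe") is attributes("/depart", "tool.exe"))
print(inodes[17]["links"])
```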
A variable-length object called file.
A file is a named sequence of bytes of arbitrary length. Since a file can have zero length, creating a file amounts to giving it a name and registering it in the file system; this is one of the functions of the OS.
Usually a single file stores data of one type. In this case, the data type determines the file type.
Since there is no size limit in the file definition, one can imagine a file having 0 bytes (empty file), and a file having any number of bytes.
When defining a file, special attention is paid to the name. It actually carries address data, without which the data stored in the file will not become information due to the lack of a method to access it. In addition to addressing-related functions, a file name can also store information about the type of data contained in it. This is important for automatic tools for working with data, because based on the file name (or rather, its extension), they can automatically determine an adequate method for extracting information from the file.
File structure - hierarchical structure in which the operating system displays files and directories (folders).
The top of the structure is the name of the medium on which the files are saved. Below it, the files are grouped into directories (folders), within which nested directories can be created.
Names of external storage media. The disks on which information is stored on the computer have their own names: each disk is named with a letter of the Latin alphabet, followed by a colon. Floppy disks are always assigned the letters A: and B:. The logical drives of the hard drive are named starting with the letter C:. CD drive names follow the names of all logical drives. For example, suppose a floppy drive, a hard drive divided into 3 logical drives, and a CD drive are installed. The letters of the storage media are then: A: - floppy disk drive; C:, D:, E: - logical drives of the hard drive; F: - CD drive.
Logical drive, or volume (English: volume or partition) - part of the computer's long-term memory, considered as a whole for ease of use. The term "logical disk" is used in contrast to "physical disk", which refers to the memory of one specific disk medium.
For the operating system, it does not matter where the data is located - on laserdisc, on a hard drive partition, or on a flash drive. To unify the represented areas of long-term memory, the concept of a logical disk is introduced.
In addition to the stored information, the volume contains a description of the file system; as a rule, this is a table listing all files and their attributes (File Allocation Table, FAT). The table determines, in particular, in which directory (folder) a particular file is located. Thanks to this, when a file is moved from one folder to another within the same volume, the data is not transferred from one part of the physical disk to another; only the entry in the file allocation table is changed. If a file is moved from one logical drive to another (even if both logical drives are located on the same physical drive), the data is necessarily transferred physically (copied, with the original deleted afterwards if the copy succeeds).
For the same reason, formatting and defragmenting each logical drive does not affect the others.
Directory (folder) - disk space (a special system file) that stores service information about files (name, extension, creation date, size, etc.). Directories at lower levels are nested within directories at higher levels. The higher-level directory (superdirectory) is called the parent directory of the lower-level directories it contains. The top level of nesting of the hierarchical structure is the root directory of the disk (Fig. 1). The directory that the user is currently working with is called the current directory.
The rules for naming a directory are no different from the rules for naming a file, although it is not customary to specify name extensions for directories. When writing a file access path through a system of subdirectories, all intermediate directories are separated by a specific symbol. Many operating systems use "\" (backslash) as this character.
The requirement for a unique file name is obvious: without it, unambiguous access to data cannot be guaranteed. In computing, the uniqueness of names is enforced automatically: neither the user nor an automated tool can create a file with a name identical to an existing one.
When a file is used that is not in the current directory, the program accessing the file needs to indicate where exactly the file is located. This is done by specifying the path to the file.
The path to the file is the name of the medium (disk) followed by a sequence of directory names separated by the “\” character in Windows (the “/” character is used in UNIX-family operating systems). This path specifies the route to the directory in which the desired file is located.
There are two different methods of specifying the file path. In the first, each file is given an absolute path name (full file name), consisting of the names of all directories from the root to the one that contains the file, and the name of the file itself. For example, the path C:\Abby\Doc\otchet.doc means that the root directory of disk C: contains a directory Abby, which in turn contains a subdirectory Doc, where the file otchet.doc is located. Absolute path names always begin with the media name and root directory and are unique. The second method uses a relative path name together with the concept of the current directory. The user can designate one of the directories as the current working directory. In this case, all path names that do not begin with a separator character are considered relative and are resolved relative to the current directory. For example, if the current directory is C:\Abby, then the file with the absolute path C:\Abby\Doc\otchet.doc can be referred to as Doc\otchet.doc.
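The example with C:\Abby can be reproduced with Python's standard pathlib module, which can manipulate Windows-style paths on any platform without touching the disk. This is only a sketch of the naming rules; the variable names are chosen for illustration.

```python
from pathlib import PureWindowsPath

absolute = PureWindowsPath(r"C:\Abby\Doc\otchet.doc")
current = PureWindowsPath(r"C:\Abby")           # current directory
relative = PureWindowsPath(r"Doc\otchet.doc")   # does not start with a separator

print(absolute.drive)                  # the media name
print(absolute.parts)                  # root, directories, file name
print(relative.is_absolute())          # a relative name is not absolute
print(current / relative == absolute)  # current directory + relative name
print(absolute.relative_to(current))   # absolute name seen from the current directory
```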
Because the file structure of a computer can be very large, searching for the necessary documents by simply navigating through the file structure is not always convenient. It is usually assumed that every computer user should know (and remember) the structure of the folders in which he stores documents. However, there are times when documents are saved outside of this structure. For example, many applications save documents to default folders if the user has forgotten to specify explicitly where the document should be saved. This default folder can be the folder that was last used for saving, the folder in which the application itself is located, or some kind of service folder, for example \My Documents. In such cases, document files can be “lost” in the mass of other data.
The need to search for files arises especially often during setup work. A typical case is when, in search of the source of uncontrolled changes in the operating system, you need to find all the files that have been changed recently. Automatic file search tools are also widely used by specialists who set up computer systems: it is difficult for them to find their way around the file structure of someone else's personal computer, and searching for the necessary files by navigation is not always productive.
The primary Windows XP search tool is launched from the Main Menu with the command Start > Find > Files and Folders. Another launch option is just as convenient: from any folder window (View > Explorer Bars > Search > Files and Folders, or the F3 key).
The controls provided on the search panel allow you to narrow the search area based on the available information about the file name and address. The wildcard characters «*» and «?» are allowed when entering a file name. The character «*» replaces any number of arbitrary characters, and the character «?» replaces any one character. So, for example, searching for a file named *.txt will display all files with the name extension .txt, and the result of searching for files with the name *.??t will be a list of all files with the name extensions .txt, .bat, .dat and so on.
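The behavior of the two wildcard characters can be tried out with Python's standard fnmatch module, which uses the same meaning for «*» and «?». The file names in the list are invented for the example, and fnmatchcase compares names case-sensitively, which is an assumption of this sketch rather than a property of the Windows search panel.

```python
from fnmatch import fnmatchcase

# Hypothetical directory listing.
names = ["readme.txt", "run.bat", "data.dat", "report.doc", "notes.text"]


def search(pattern, candidates):
    """Return the names that match a pattern containing * and ?."""
    return [name for name in candidates if fnmatchcase(name, pattern)]


print(search("*.txt", names))
print(search("*.??t", names))
```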
When searching for files with “long” names, you should keep in mind that if the “long” name contains spaces (and this is acceptable), then when creating a search task, such a name should be enclosed in quotes, for example: "Current work.doc".
The search bar has additional hidden controls. They appear when you click on the downward expanding arrow.
· The question When were the last changes made? allows you to limit the search scope by the date the file was created, last modified, or opened.
· The question What is the file size? allows you to limit your search to files of a certain size.
· The item Extra options allows you to specify the file type, allow viewing of hidden files and folders, and set some other search parameters.
In cases where an unformatted text document is being sought, it is possible to search not only by file attributes, but also by its content. The desired text can be entered in the field A word or phrase in a file.
Searching for a document based on a text fragment does not produce results if it is a document that has formatting, because the formatting codes violate the natural sequence of text character codes. In these cases, you can sometimes use the search tool that comes with the application that formats the documents.
19. Data compression and file archiving.
A characteristic feature of most “classical” data types that people traditionally work with is a certain redundancy. The degree of redundancy depends on the type of data. In addition, the degree of data redundancy depends on the coding system adopted. So, for example, we can say that coding text information by means of the Russian language (using the Russian alphabet) gives on average 20-30% more redundancy than encoding adequate information by means of the English language.
Redundancy also plays an important role in information processing. However, when it comes not to processing, but to storing finished documents or transmitting them, redundancy can be reduced, which gives the effect of data compression.
If information compression methods are applied to finished documents, the term data compression is often replaced with the term data archiving, and the software tools that perform these operations are called archivers.
Depending on the object in which the data being compressed is located, there are:
- compaction (archiving) of files;
- compaction (archiving) of folders;
- disc compaction.
If data content changes during data compression, the compression method is irreversible and when data is restored from a compressed file, the original sequence is not completely restored. Such methods are also called loss-controlled compression methods. They are applicable only for those types of data for which the formal loss of part of the content does not lead to a significant decrease in consumer properties. First of all, this applies to multimedia data: video sequences, music recordings, sound recordings and drawings. Lossy compression methods usually provide much higher compression ratios than reversible methods, but they cannot be applied to text documents, databases and, especially, to program code. Typical lossy compression formats are:
- .JPG for graphic data;
- .MPG for video data;
- .MP3 for audio data.
If data compression only changes its structure, then the compression method is reversible. From the resulting code, you can restore the original array by applying the reverse method. Reversible methods are used to compress any type of data. Typical lossless compression formats are:
- .GIF, .TIF, .PCX and many others for graphics data;
- .AVI for video data;
- .ZIP, .ARJ, .RAR, .LZH, .LH, .CAB and many others for any data type.
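Reversibility and the role of redundancy can be checked with a short experiment. The sketch below uses the standard zlib module (the Deflate method also used by .ZIP archives) purely as a convenient example of a reversible method; the sample data is artificial.

```python
import os
import zlib

redundant = b"ABCD" * 1000       # highly redundant data
random_like = os.urandom(4000)   # data with almost no redundancy

packed = zlib.compress(redundant)

# A reversible method restores the original sequence exactly.
restored = zlib.decompress(packed)
print(restored == redundant)

# The more redundant the data, the stronger the compression.
print(len(redundant), "->", len(packed))
print(len(random_like), "->", len(zlib.compress(random_like)))
```

The redundant sequence shrinks to a small fraction of its size, while the random bytes hardly compress at all, which is why the degree of redundancy determines the achievable compression.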
The “classical” data compression formats, widely used in everyday computer work, are the .ZIP and .ARJ formats. Recently, the popular .RAR format has been added to them.
The basic functions that most modern archive managers perform include:
- extracting files from archives;
- creation of new archives;
- adding files to an existing archive;
- creation of self-extracting archives;
- creation of distributed archives on low-capacity media;
- testing the integrity of the archive structure;
- full or partial restoration of damaged archives;
- protection of archives from viewing and unauthorized modification.
Self-extracting archives. A self-extracting archive is prepared on the basis of a regular archive by attaching a small software module to it. The archive itself receives the .EXE name extension, which is typical for executable files.
Distributed archives. Some managers (for example WinZip) perform splitting directly onto floppy disks, and some (for example WinRAR and WinArj) allow you to pre-split the archive into fragments of a given size on the hard drive. Subsequently, they can be transferred to external media by copying.
When creating distributed archives, the WinZip manager has an unpleasant feature: each volume carries files with the same names. As a result, it is not possible to determine from the file name which volume is stored on each floppy disk. The WinArj and WinRAR archive managers give all files of a distributed archive different names and therefore do not create such problems.
Archive protection. In most cases, archives are protected using a password, which is requested when you try to view, unpack or change the archive.
The additional functions of archive managers are service functions that make work more convenient. They are often implemented by connecting external utilities and provide:
- viewing files of various formats without extracting them from the archive;
- search for files and data inside archives;
- installation of programs from archives without preliminary unpacking;
- checking the archive for computer viruses before it is unpacked;
- cryptographic protection of archival information;
- decoding of e-mail messages;
- “transparent” compaction of .EXE and .DLL executable files;
- creation of self-extracting multi-volume archives;
- selecting or adjusting the information compression ratio.
You can double-click on the folder icon, after which Explorer will launch and show you the contents of the selected folder (see Fig. 21.1).
When you double-click a file's icon, the program associated with that file launches and displays its contents. This is not necessarily the program that created the file. For example, graphic files can be opened with a special viewing program rather than the graphics editing program that created them.
When you open a program file, the program starts.
Once you open a folder, you will see its contents in the folder window. You can configure Windows so that each folder opens in its own window. Here's how to do it.
1. In the folder window, select Tools > Folder Options.
The Folder Options dialog box appears.
2. On the General tab, select Open each folder in a separate window.
3. Click OK.
When you're done, don't forget to close all folder windows.
View tree structure
The hardest part about working with folders and files is organizing them into what computer scientists call a tree structure. The tree structure is clearly visible on the left side of the Explorer window. This area of the window is called Folders (see Figure 21.1). If you don't see this list, click the Folders button on the toolbar, or select View > Explorer Bars > Folders from the menu.
Using the mouse, you can quickly find any folder in the tree structure, if, of course, you know where to look for it. After clicking on a folder, its contents are displayed on the right in the window.
By clicking on the “+” (plus) sign next to the corresponding folder, you can see all its subfolders, i.e. branch of a tree structure.
By clicking on the “-” (minus) sign next to a folder, you close the corresponding branch of the tree structure.
How to hide a tree structure
When the Folders panel is closed, the Explorer window displays a list of tasks for files and folders, as shown in Fig. 21.2. This list contains basic operations with files in a given folder, transitions to other directories on the computer, and other similar tasks.
The list of tasks depends on the type of folder you are viewing, the selected file, and its type.
Note that any of the taskbars can be shown or hidden by clicking the arrow icon.
The initial sector of the hard disk contains the Master Boot Record, which is loaded into memory and executed.
The last part of this sector contains the partition table: a 4-element table with 16-byte elements. This table is manipulated by the FDISK program (or an equivalent utility on another operating system).
During boot, the ROM-BIOS loads the Master Boot Record and transfers control to its code. This code reads the partition table to determine the partition that is marked as active. The boot sector of that partition is then read into memory and executed.
Table 1. Structure of the Master Boot Record and partition table
Table 2. Partition descriptor structure
The partition code is used to determine the presence and location of the primary and extended partitions on the disk. Once the desired partition has been located, its size and coordinates can be extracted from the corresponding descriptor fields. If 0 is written in the partition code field, then the descriptor is considered empty, that is, it does not define any partition on the disk.
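A minimal sketch of reading the partition table is given below. It assumes the conventional layout of the first sector: four 16-byte descriptors starting at offset 1BEh, each holding an activity flag (80h for the active partition), the starting head/sector/cylinder, the partition code, the ending head/sector/cylinder, the relative sector at offset 08h and the partition size in sectors. The function name and the hand-built sample sector are hypothetical.

```python
import struct

TABLE_OFFSET = 0x1BE   # assumed standard position of the partition table
ENTRY_SIZE = 16
ENTRY_FORMAT = "<B3sB3sII"  # flag, CHS start, code, CHS end, rel. sector, size


def parse_partition_table(sector: bytes) -> list:
    """Return the non-empty descriptors of a 512-byte first sector."""
    if len(sector) != 512:
        raise ValueError("expected one 512-byte sector")
    partitions = []
    for index in range(4):
        offset = TABLE_OFFSET + index * ENTRY_SIZE
        flag, _start, code, _end, rel_sector, size = struct.unpack_from(
            ENTRY_FORMAT, sector, offset)
        if code == 0:          # code 0 marks an empty descriptor
            continue
        partitions.append({"index": index, "active": flag == 0x80,
                           "code": code, "rel_sector": rel_sector,
                           "sectors": size})
    return partitions


# Hand-built sample: one active FAT16 partition (code 06h).
sample = bytearray(512)
struct.pack_into(ENTRY_FORMAT, sample, TABLE_OFFSET,
                 0x80, bytes([1, 1, 0]), 0x06, bytes([15, 63, 99]), 63, 204800)
print(parse_partition_table(bytes(sample)))
```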
Table 3. Microsoft operating system partition codes
Code | Partition type | Size | FAT type | OS |
---|---|---|---|---|
01h | Primary | 0-15 MB | FAT12 | MS-DOS 2.0 |
04h | Primary | 16-32 MB | FAT16 | MS-DOS 3.0 |
05h | Extended | 0-2 GB | - | MS-DOS 3.3 |
06h | Primary | 32 MB-2 GB | FAT16 | MS-DOS 4.0 |
0Bh | Primary | 512 MB-2 GB | FAT32 | OSR2 |
0Ch | Extended | 512 MB-2 TB | FAT32 | OSR2 |
0Eh | Primary | 32 MB-2 GB | FAT16 | Windows 95 |
0Fh | Extended | 0-2 GB | - | Windows 95 |
The following codes are reserved for operating systems of other companies:
- 02h - CP/M partition;
- 03h - Xenix partition;
- 07h - OS/2 partition (HPFS file system).
Notes:
- Cylinder and sector numbers occupy 10 and 6 bits, respectively:
15 14 13 12 11 10  9  8  7  6  5  4  3  2  1  0
 c  c  c  c  c  c  c  c  c  c  s  s  s  s  s  s
They are arranged so that when you load CX with a 16-bit value, it is ready to call interrupt INT 13h to read the desired portion of the disk. Thus, after reading the Master Boot Record into the sect_buf memory area, the code
CMP byte ptr sect_buf, 80h
will check if the first partition is active, and the code
MOV CX, sect_buf
will load CX to call INT 13h to read the boot sector of partition #1.
- The "relative sector" value at offset 08h in each partition is equivalent to the head, sector and cylinder of the partition's starting address. Relative sector 0 coincides with cylinder 0, head 0, sector 1. The relative sector number increases first for each sector on the head, then for each head, and finally for each cylinder.
Applicable formula:
Rel_sec = (#Cyl * sec_per_track * heads) + (#Head * sec_per_track) + (#Sec - 1)
Partitions start at an even cylinder number, with the exception of the first partition, which can start at cylinder 0, head 0, sector 2 (since sector 1 is occupied by the Master Boot Record).
When the boot sector code of a partition gains control, DS:SI points to the corresponding partition table entry.
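The two notes above can be illustrated with a small calculation. The sketch assumes the usual INT 13h convention for CX (CH holds the low 8 bits of the cylinder, the top two bits of CL hold the high cylinder bits, and the low 6 bits of CL hold the sector) and a disk geometry of 16 heads and 63 sectors per track chosen only for the example; the function names are hypothetical.

```python
def unpack_cx(cx: int):
    """Split a 16-bit INT 13h CX value into (cylinder, sector)."""
    ch, cl = (cx >> 8) & 0xFF, cx & 0xFF
    cylinder = ((cl & 0xC0) << 2) | ch   # 10 bits in total
    sector = cl & 0x3F                   # 6 bits, numbered from 1
    return cylinder, sector


def relative_sector(cyl, head, sec, heads, sec_per_track):
    """Convert cylinder/head/sector into a relative sector number."""
    return cyl * sec_per_track * heads + head * sec_per_track + (sec - 1)


# Cylinder 0, head 0, sector 1 is relative sector 0.
print(relative_sector(0, 0, 1, heads=16, sec_per_track=63))
# Sectors are counted first along a track, then by head, then by cylinder.
print(relative_sector(0, 1, 1, heads=16, sec_per_track=63))
print(relative_sector(1, 0, 1, heads=16, sec_per_track=63))
print(unpack_cx(0x0141))
```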
Boot sector structure
Table 4. Format of the boot sector of a floppy disk or hard disk partition
Offset | Length | Name / value | Content |
---|---|---|---|
00h | 3 | JMP xx xx | NEAR jump to boot code |
03h | 8 | "IBM 4.0" | OEM company name and system version |
0Bh | 2 | SectSiz | number of bytes in sector (always 512); start of BPB |
0Dh | 1 | ClustSiz | number of sectors in the cluster |
0Eh | 2 | ResSecs | number of reserved sectors (sectors before FAT #1) |
10h | 1 | FatCnt | number of FAT tables |
11h | 2 | RootSiz | number of 32-byte elements of the root directory (for FAT32 - 0) |
13h | 2 | TotSecs | total number of sectors on the media (DOS partition) |
15h | 1 | Media | media type (same as 1st byte of FAT) |
16h | 2 | FatSize | number of sectors in one FAT; end of BPB |
18h | 2 | TrkSecs | number of sectors per track |
1Ah | 2 | HeadCnt | number of heads |
1Ch | 4 | HidnSec | number of hidden sectors (used in partition schemes) |
20h | 4 | TotSecs | total sectors if size >32 MB |
24h | 1 | 128 | physical disk number |
25h | 1 | | reserved |
26h | 1 | 29h | extended structure signature |
27h | 4 | | Volume ID (serial number) |
2Bh | 0Bh | | label (NO NAME) |
36h | 8 | | File system ID (FAT12) |
3Eh | | | start of boot code and data |
Notes:
- Types of storage media:
- F0h - floppy disk, 2 sides, 18 sectors per track;
- F8h - hard drive;
- F9h - floppy disk, 2 sides, 15 sectors per track;
- FCh - floppy disk, 1 side, 9 sectors per track;
- FDh - floppy disk, 2 sides, 9 sectors per track;
- FEh - floppy disk, 1 side, 8 sectors per track;
- FFh - floppy disk, 2 sides, 8 sectors per track.
- Use the absolute read INT 25h (DX=0) to read this sector. Alternatively:
- floppy disks: the boot sector is read with BIOS INT 13h at head 0, track 0, sector 1;
- hard disks: read the partition table to obtain the head/track/sector values for the BIOS call.
- The BPB (BIOS Parameter Block) is a subset of the data contained in the boot sector. The "Build BPB" driver request requires the driver to fill in the block marked above. BPB length = 13 bytes.
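As an illustration of the layout in Table 4, the first 24 bytes of a sector (offsets 00h-17h, i.e. the jump, the OEM name and the 13-byte BPB) could be decoded roughly as follows. This is a sketch only: parse_bpb is a hypothetical helper, the dictionary keys reuse the field names from the table, and all multi-byte fields are assumed to be little-endian.

```python
import struct

# 3-byte jump, 8-byte OEM name, then the BPB fields from 0Bh to 17h.
# "<" selects little-endian byte order without padding (24 bytes in total).
_BOOT_HEADER = struct.Struct("<3s8sHBHBHHBH")


def parse_bpb(sector):
    """Decode the jump, OEM name and BPB from the start of a boot sector."""
    (jmp, oem, sect_siz, clust_siz, res_secs, fat_cnt,
     root_siz, tot_secs, media, fat_size) = _BOOT_HEADER.unpack_from(sector, 0)
    return {
        "OEM": oem.decode("ascii", errors="replace"),
        "SectSiz": sect_siz,    # bytes per sector
        "ClustSiz": clust_siz,  # sectors per cluster
        "ResSecs": res_secs,    # sectors before FAT #1
        "FatCnt": fat_cnt,      # number of FAT copies
        "RootSiz": root_siz,    # root directory entries
        "TotSecs": tot_secs,    # total sectors on the media
        "Media": media,         # media type byte
        "FatSize": fat_size,    # sectors in one FAT
    }
```

In practice the 512-byte buffer passed to parse_bpb would come from reading the first sector of a disk image file opened in binary mode.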
Floppy disk parameters table
This 11-byte structure is also known as the "Disk Base Table". Its address is stored in the interrupt vector INT 1Eh (4-byte address at 0:0078). The table specifies some important variables for floppy disk devices. It is initialized by the ROM-BIOS and modified by DOS to improve the performance of floppy disks.
Table 5. Floppy Disk Parameter Table Format
Offset | Length | Content |
---|---|---|
00h | 1 | First byte of the specification: bits 0-3 - head unload time; bits 4-7 - head step rate |
01h | 1 | Second byte of the specification: bit 0 - DMA mode flag; bits 1-7 - head load time |
02h | 1 | Delay before turning off the motor (in “ticks” of the system clock) |
03h | 1 | Sector size (bytes): 0 - 128, 1 - 256, 2 - 512, 3 - 1024 |
04h | 1 | Number of sectors per track |
05h | 1 | Intersector gap length for read/write operations |
06h | 1 | Data area length |
07h | 1 | Intersector gap length for format operation |
08h | 1 | Fill byte for formatting (usually 0F6h, i.e. "Ў") |
09h | 1 | Head settle time (in milliseconds) |
0Ah | 1 | Motor start time (in 1/8 s) |
Hard disk parameters table
This 16-byte structure is located at interrupt vector address INT 41h (4-byte address at 0:0104). The parameters for the second hard drive (if there is one) are located at vector address INT 46h. These tables define some important variables for hard drive operations.
Table 6. Hard disk table format
Offset | Length | Content |
---|---|---|
00h | 2 | Number of cylinders |
02h | 1 | Number of heads |
03h | 2 | Not used (always 0) |
05h | 2 | Precompensation starting cylinder number |
07h | 1 | Maximum ECC block length |
08h | 1 | Control byte: bits 0-2 - not used (always 0); bit 3 - set if the number of heads is more than 8; bit 4 - not used (always 0); bit 5 - set if the manufacturer has placed a defect map on the cylinder with the number “maximum working cylinder + 1”; bit 6 - ECC recheck prohibition; bit 7 - ECC control disabled |
09h | 1 | Not used (always 0) |
0Ah | 1 | Not used (always 0) |
0Bh | 1 | Not used (always 0) |
0Ch | 2 | Landing (parking) zone cylinder number |
0Eh | 1 | Number of sectors per track |
0Fh | 1 | Reserved |
File Allocation Table (FAT)
File size may change over time. If a file could be stored only in adjacent sectors, then whenever the file grew the OS would have to rewrite it completely into another free area of suitable size. To simplify and speed up the operation of adding new data to a file, modern operating systems use a file allocation table (File Allocation Table, abbreviated FAT), which allows a file to be stored in several non-contiguous areas.
When FAT is used, the data area of a logical drive is divided into equally sized sections called clusters. A cluster can consist of one or several sectors located sequentially on the disk. The number of sectors in a cluster must be a power of two (2^N) and can take values from 1 to 64 (the cluster size depends on the type of FAT used and the size of the logical disk).
Each cluster is assigned its own FAT table element. The first two FAT elements are reserved - if there are K data clusters on the disk, then the number of FAT elements will be K+2. The FAT type is determined by the value of K:
- if K < 4085, FAT12 is used;
- if 4084 < K < 65525, FAT16 is used;
- if K > 65524, FAT32 is used.
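The rule can be written down as a short Python sketch, reading the thresholds as 4085 and 65525 data clusters. The function names are hypothetical and serve only as an illustration.

```python
def fat_type(data_clusters):
    """Return the FAT type implied by the number of data clusters K."""
    if data_clusters < 4085:
        return "FAT12"
    if data_clusters < 65525:
        return "FAT16"
    return "FAT32"


def fat_element_count(data_clusters):
    """The first two FAT elements are reserved, so the table has K + 2 elements."""
    return data_clusters + 2


# A 1.44 MB floppy disk has 2847 data clusters.
print(fat_type(2847), fat_element_count(2847))
```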
The names of the FAT types come from the size of the element: a FAT12 element has a size of 12 bits, a FAT16 element 16 bits, and a FAT32 element 32 bits. Please note that in FAT32 the four most significant bits are reserved and are ignored by the OS (that is, only the seven least significant hexadecimal digits of the element are significant).
FAT is a linked list that the OS uses to keep track of the physical location of data on a disk and to find free space for new files.
The directory entry for each file contains the number of the starting element in the FAT, which corresponds to the first cluster in the file's allocation chain. The corresponding FAT element either indicates the end of the chain or refers to the next element, and so on. Example:
This diagram illustrates the basic concepts of FAT. From it it is clear that:
- MYFILE.TXT occupies 10 clusters. The first cluster is cluster 08h, the last cluster is 1Bh. The cluster chain is 08h, 09h, 0Ah, 0Bh, 15h, 16h, 17h, 19h, 1Ah, 1Bh. Each element points to the next element in the chain, and the last element contains a special code (see Table 7).
- Cluster 18h is marked as defective and is not included in the allocation chain.
- Clusters 06h, 07h, 0Ch-14h and 1Ch-1Fh are empty and available for allocation.
- Another chain begins with cluster 02h and ends with cluster 05h. To find out the file name, you need to find the directory entry with the starting cluster number 02h.
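The example can be modelled in a few lines of Python. This is a sketch under stated assumptions: the FAT is represented as a plain list of already decoded elements, the special codes are the usual FAT12 ones (000h for a free cluster, 0FF7h for a defective cluster, 0FF8h-0FFFh for the end of a chain), the second chain is assumed to run through clusters 03h and 04h, and the values placed in the two reserved elements are placeholders. All names are hypothetical.

```python
FREE, BAD, EOC_MIN = 0x000, 0xFF7, 0xFF8  # assumed FAT12 special codes


def build_example_fat():
    """Build a 32-element FAT that matches the MYFILE.TXT example."""
    fat = [FREE] * 0x20
    fat[0], fat[1] = 0xFF8, 0xFFF  # the first two elements are reserved
    chains = (
        [0x02, 0x03, 0x04, 0x05],  # intermediate clusters are an assumption
        [0x08, 0x09, 0x0A, 0x0B, 0x15, 0x16, 0x17, 0x19, 0x1A, 0x1B],
    )
    for chain in chains:
        for current, following in zip(chain, chain[1:]):
            fat[current] = following
        fat[chain[-1]] = 0xFFF  # end-of-chain code
    fat[0x18] = BAD
    return fat


def cluster_chain(fat, start):
    """Follow the linked list from the starting cluster to the end of the chain."""
    chain, seen, cluster = [], set(), start
    while True:
        if cluster in seen:
            raise ValueError("loop in the cluster chain")
        seen.add(cluster)
        chain.append(cluster)
        following = fat[cluster]
        if following >= EOC_MIN:
            return chain
        if following < 2 or following == BAD or following >= len(fat):
            raise ValueError("broken cluster chain")
        cluster = following


fat = build_example_fat()
print([f"{c:02X}h" for c in cluster_chain(fat, 0x08)])
```

The loop check is not part of the FAT format itself; it only protects the sketch against a damaged table.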
Table 7. FAT element values
FAT usually starts at logical sector 1 of the DOS partition (i.e. it can be read by INT 25h with DX=1). In general, you first need to read the boot sector (DX=0) and take the word at offset 0Eh. It indicates how many boot and reserved sectors precede the FAT. Then use this number (usually 1) as the contents of DX to read the FAT via INT 25h.
There may be multiple copies of FAT. Typically two identical copies are maintained. In these cases, all copies are located directly next to each other.
Comment:
- According to a common misconception, it is believed that 16-bit FAT does not allow DOS to work with disks larger than 32 megabytes. In fact, the limitation is that INT 25h/26h is unable to work with SECTOR numbers greater than 65535. Since the sector size is usually 512 bytes, or half a kilobyte, this dictates a 32-megabyte limit. On the other hand, nothing prevents you from having larger sectors, so theoretically DOS can work with any disk.
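The procedure for reading an element from FAT12 is as follows.
- Multiply the cluster number by 3.
- Divide the result by 2 (element length is 1.5 (3/2) bytes).
- Read a 16-bit word from FAT using the result of the previous operation as the address.
- If the element number is even, AND the word read with the mask 0FFFh. If the element number is odd, shift the value to the right by 4 bits. As a result, you will get the desired value of the FAT element.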
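Reading a 12-bit element can be sketched in Python as follows. The sketch assumes that the whole FAT is held in memory as a bytes-like object and that the 16-bit word is read in little-endian order at byte offset (number * 3) / 2; fat12_get is a hypothetical name.

```python
def fat12_get(fat, number):
    """Return the 12-bit FAT element with the given number."""
    offset = (number * 3) // 2                    # multiply by 3, divide by 2
    word = fat[offset] | (fat[offset + 1] << 8)   # 16-bit little-endian word
    if number % 2 == 0:
        return word & 0x0FFF                      # even: low 12 bits
    return word >> 4                              # odd: high 12 bits


# Two reserved elements followed by elements 2 and 3 of a short chain.
fat = bytes([0xF0, 0xFF, 0xFF, 0x03, 0x40, 0x00])
print([hex(fat12_get(fat, n)) for n in range(4)])
```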
Now let's look at the procedure for writing an element to FAT12.
- Multiply the cluster number by 3.
- Divide the result by 2 (element length is 1.5 (3/2) bytes).
- Read a 16-bit word from FAT using the result of the previous operation as the address.
- If the element number is even, AND the word read with the mask 0F000h, and then OR the result with the value of the element being written. If the element number is odd, AND the word read with the mask 000Fh, then shift the value being written left by 4 bits and OR it with the result of the previous operation.
- Write the resulting 16-bit word back to FAT.
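The write procedure can be sketched in the same way. The sketch assumes that the FAT is held in a mutable bytearray and that words are little-endian; fat12_set is a hypothetical name. Note that for an odd element the sketch keeps the low four bits of the word (mask 000Fh), because they belong to the preceding element.

```python
def fat12_set(fat, number, value):
    """Store a 12-bit value in the FAT element with the given number."""
    if not 0 <= value <= 0xFFF:
        raise ValueError("a FAT12 element holds only 12 bits")
    offset = (number * 3) // 2
    word = fat[offset] | (fat[offset + 1] << 8)
    if number % 2 == 0:
        word = (word & 0xF000) | value          # keep the neighbour's high nibble
    else:
        word = (word & 0x000F) | (value << 4)   # keep the neighbour's low nibble
    fat[offset] = word & 0xFF                   # write the word back
    fat[offset + 1] = word >> 8


fat = bytearray(6)
fat12_set(fat, 2, 0x003)
fat12_set(fat, 3, 0x004)
print(fat.hex())
```

Because two elements share the middle byte of every three-byte group, the masks are what keeps a write from damaging the adjacent element.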
Comment:
- A 12-bit element can straddle the boundary between two sectors, so be careful if you are reading the FAT one sector at a time.
16-bit elements are simpler: each element contains the 16-bit number (counted from the beginning of the FAT) of the next element in the chain.
32-bit elements: each element contains the 32-bit number of the next element in the chain.
In assembly language programs, the shift-and-add algorithm is often used instead of the MUL instruction to perform multiplication by 3: the original number is copied, the copy of the number is shifted left one place (multiplying by 2), and then both numbers are added (x + 2x = 3x). Instead of the DIV command, shift right one bit.
The FAT element contains the cluster number, but when working with disks at a low level, the addressable unit of data is the sector, not the cluster.
A floppy disk (or hard disk partition) is structured as follows:
- boot and reserved sectors;
- FAT #1;
- FAT #2;
- root directory (does not exist in FAT32);
- data area.
Each section in this structure has a variable length, and to correctly convert the cluster number to the sector number, you need to know the length of each such section.
To get the starting sector number of a cluster from the cluster number ClustNum (read from the corresponding field in the directory entry or from the FAT chain), you can use the undocumented DOS function 32h, or read the boot sector and apply the following formulas:
root_sectors = (RootSiz * 32) / 512
start_data = ResSecs + (FatSize * FatCnt) + root_sectors
start_sector = start_data + ((ClustNum - 2) * ClustSiz)
where the values of the variables RootSiz, ResSecs, FatSize, FatCnt and ClustSiz are taken from the boot sector (from the BPB).
Set DX=start_sector before the INT 25h read or INT 26h write operation.
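A worked example of these formulas in Python is given below. It is a sketch: cluster_to_sector is a hypothetical helper, its parameters mirror the BPB field names, and the root directory size is rounded up to whole sectors, which gives the same result as the plain division whenever the directory fills its sectors exactly.

```python
def cluster_to_sector(clust_num, res_secs, fat_size, fat_cnt,
                      root_siz, clust_siz, sect_siz=512):
    """Return the logical sector number at which a cluster starts."""
    if clust_num < 2:
        raise ValueError("data clusters are numbered from 2")
    # Sectors occupied by the root directory (32 bytes per element).
    root_sectors = (root_siz * 32 + sect_siz - 1) // sect_siz
    # First sector of the data area: reserved sectors, all FATs, root directory.
    start_data = res_secs + fat_size * fat_cnt + root_sectors
    return start_data + (clust_num - 2) * clust_siz


# 1.44 MB floppy disk: 1 reserved sector, two 9-sector FATs,
# 224 root directory elements, 1 sector per cluster.
print(cluster_to_sector(2, res_secs=1, fat_size=9, fat_cnt=2,
                        root_siz=224, clust_siz=1))
```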
File directories
The file directory is an array of 32-byte elements - file descriptors. From the operating system's point of view, all directories (except the root directory in FAT12 and FAT16 systems) look like files and can contain an arbitrary number of entries.
The Root Directory is the main directory of the disk from which the subdirectory tree begins. For the root directory in FAT12 and FAT16, a special fixed-size space (16 KB) is allocated in the system area of the logical disk, designed to store 512 elements. In a FAT32 system, the root directory is a file of any size.
Table 8. Catalog Item Structure
Offset | Length | Content |
---|---|---|
00h | 11 | Short file name |
0Bh | 1 | File attributes |
0Ch | 1 | *Reserved for Windows NT (must contain 0) |
0Dh | 1 | *Field specifying the file creation time (in tens of milliseconds). The field value can range from 0 to 199 |
0Eh | 2 | *File creation time |
10h | 2 | *File creation date |
12h | 2 | *Date of the last access to the file to write or read data |
14h | 2 | *High word of the file's first cluster number |
16h | 2 | Time of last write operation to file |
18h | 2 | Date of the last write operation to the file |
1Ah | 2 | Low word of the file's first cluster number |
1Ch | 4 | File size in bytes (32-bit number) |
The "*" sign means that the field is processed only in the FAT32 file system. In FAT12 and FAT16 systems, the field is considered reserved and contains the value 0.
The short file name consists of two fields: an 8-byte field containing the actual file name, and a 3-byte field containing the extension. If the file name entered by the user is shorter than eight characters, it is padded with spaces (space code - 20h). If the entered extension is shorter than three characters, it is also padded with spaces.
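The padding can be illustrated with a small Python sketch. The helper names are hypothetical, and converting the name to upper case is an assumption based on the usual DOS convention rather than something stated above.

```python
def to_short_name_field(filename):
    """Build the 11-byte name field: 8 bytes of name plus 3 bytes of extension."""
    name, _, ext = filename.upper().partition(".")
    if not 1 <= len(name) <= 8 or len(ext) > 3 or "." in ext:
        raise ValueError("expected a name of 1-8 and an extension of 0-3 characters")
    # ljust pads with spaces (code 20h) up to the fixed field width.
    return name.ljust(8).encode("ascii") + ext.ljust(3).encode("ascii")


def from_short_name_field(field):
    """Turn the 11-byte field back into the familiar NAME.EXT form."""
    name = field[:8].decode("ascii").rstrip(" ")
    ext = field[8:11].decode("ascii").rstrip(" ")
    return f"{name}.{ext}" if ext else name


print(to_short_name_field("myfile.txt"))
```

Note that the dot itself is not stored in the directory entry; it is implied by the boundary between the two fields.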
Some DOS functions require a file attribute byte as a parameter. The bits of the attribute byte are set to 1 if the file has the corresponding property:
- bit 0 - read only;
- bit 1 - hidden;
- bit 2 - system;
- bit 3 - volume identifier;
- bit 4 - directory;
- bit 5 - archived;
- bits 6 and 7 are reserved (set to 0).
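Decoding the attribute byte comes down to testing individual bits, as in the following Python sketch (the names are hypothetical and chosen for illustration only).

```python
# Bit number and meaning, as listed above.
ATTRIBUTE_BITS = (
    (0, "read-only"),
    (1, "hidden"),
    (2, "system"),
    (3, "volume identifier"),
    (4, "directory"),
    (5, "archived"),
)


def decode_attributes(attr):
    """Return the names of all properties whose bit is set in the attribute byte."""
    return [name for bit, name in ATTRIBUTE_BITS if attr & (1 << bit)]


print(decode_attributes(0x27))
```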
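The file creation time field and the time field of the last write operation to the file have the following format: bits 11-15 contain the hours (0-23), bits 5-10 the minutes (0-59), and bits 0-4 the seconds divided by 2 (0-29).
The date fields have the following format: bits 9-15 contain the year number minus 1980 (valid values from 0 to 127), bits 5-8 the month (1-12), and bits 0-4 the day of the month (1-31). Dates are counted from the beginning of the MS-DOS era, i.e. from 01/01/1980.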
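A short Python sketch of unpacking these 16-bit fields is given below. It assumes the usual MS-DOS layout: hours in bits 11-15, minutes in bits 5-10 and two-second units in bits 0-4 for the time; year minus 1980 in bits 9-15, month in bits 5-8 and day in bits 0-4 for the date. The function names are hypothetical.

```python
def decode_dos_time(value):
    """Return (hours, minutes, seconds) from a 16-bit time field."""
    hours = (value >> 11) & 0x1F
    minutes = (value >> 5) & 0x3F
    seconds = (value & 0x1F) * 2   # stored with a resolution of 2 seconds
    return hours, minutes, seconds


def decode_dos_date(value):
    """Return (year, month, day) from a 16-bit date field."""
    year = 1980 + ((value >> 9) & 0x7F)
    month = (value >> 5) & 0x0F
    day = value & 0x1F
    return year, month, day


def encode_dos_date(year, month, day):
    """Pack a date into the 16-bit field (years 1980-2107)."""
    if not 1980 <= year <= 2107:
        raise ValueError("the year field holds only values 0..127")
    return ((year - 1980) << 9) | (month << 5) | day


print(decode_dos_date(encode_dos_date(1995, 8, 24)))
```

The seven-bit year field is the reason the latest representable year is 1980 + 127 = 2107.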
Long file names
Starting with Windows 95, a file can be assigned (in addition to a short name) a so-called long name. To store a long name, free directory elements adjacent to the main element (the file descriptor) are used. The presence of ones in bits 0-3 of the attribute byte is a sign that a free directory element is used to store a portion of a long file name (this combination is not possible for file and directory descriptors). The short and long file names are unique, i.e. they must not appear twice in the same directory.
A long name is written not in ASCII characters but in the Unicode format, where each national alphabet has a corresponding set of codes. The price to pay for the universality of Unicode is a reduction in information storage density: each character occupies two bytes (a 16-bit word). In the free directory elements, the long name is written cut into fragments (see Table 9).
Table 9. Structure of a directory element storing a fragment of a long file name
The long name is written to the directory before the file descriptor, with the fragments placed in reverse order, starting with the last one.
All directories, with the exception of the root directory, contain special links in the first two elements instead of file descriptors. Element No. 0 contains a pointer to the directory itself, and its name field contains a single dot ("."). Element No. 1 contains a pointer to the parent directory, and its name field contains two dots (".."). If the FAT reference in element No. 1 is zero, then the parent of the current directory is the root directory.
The disk information block is produced by the undocumented DOS function 32h.
All the information contained here can be obtained by reading the boot sector and calling a number of other OS functions with some calculations, but the information block is useful in that it contains all the data together. This is the only call that returns the address of the device driver header.
Table 10. Disk information block format
Offset | Length | Content |
---|---|---|
00h | 1 | Drive number (0=A, 1=B, etc.) |
01h | 1 | Subdevice number from the device header (one driver can manage multiple drives) |
02h | 2 | Sector size in bytes |
04h | 1 | Number of sectors per cluster minus 1 (highest sector number within a cluster) |
05h | 1 | Cluster-to-sector shift count (sectors per cluster as a power of two: 2 for 4 sectors, 3 for 8) |
06h | 2 | Number of reserved sectors (boot sectors; number of the first FAT sector) |
08h | 1 | Number of FAT tables |
09h | 2 | Max. number of elements in the root directory |
0Bh | 2 | Sector number for cluster No. 2 (1st data cluster) |
0Dh | 2 | Total clusters +2 (highest cluster number) |
0Fh | 1 | Number of sectors occupied by one FAT |
10h | 2 | Sector number of the beginning of the root directory |
12h | 4 | Device_header address |
16h | 1 | media_descriptor byte |
17h | 1 | Access flag: 0 if the device was accessed |
18h | 4 | Address of the next disk information block (0FFFFh if the block is the last one) |
Opening mode bit flags:
- bits 0-2: access rights of the process on the network: 000 - read; 001 - write; 010 - read and write;
- bits 4-6: sharing mode: 000 - compatibility mode; 001 - exclusive access (deny all); 010 - deny write; 011 - deny read; 100 - deny none;
- bit 7: inheritance: 1 - the file is private to this process; 0 - the file is inherited by child processes.
If the file attribute byte indicates read-only, it overrides these flags.
The network permissions and sharing mode bits only have an effect when the SHARE program is installed.
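As an illustration, the opening mode byte can be assembled from its three fields as in the following Python sketch. The dictionary keys and the function name are hypothetical labels chosen for readability; only the bit positions and codes come from the list above.

```python
ACCESS = {"read": 0b000, "write": 0b001, "read_write": 0b010}
SHARING = {
    "compatibility": 0b000,
    "deny_all": 0b001,     # exclusive access
    "deny_write": 0b010,
    "deny_read": 0b011,
    "deny_none": 0b100,
}


def open_mode(access, sharing="compatibility", private=False):
    """Combine access rights (bits 0-2), sharing mode (bits 4-6) and inheritance (bit 7)."""
    mode = ACCESS[access] | (SHARING[sharing] << 4)
    if private:
        mode |= 0x80       # the file is not inherited by child processes
    return mode


print(f"{open_mode('read_write', 'deny_write'):02X}h")
```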