
Arkhangelsk State University

Kotlas branch

full-time department

Faculty: technical

Specialty: PGS

Course work

Discipline: computer science

Topic: Disk File Structure

Completed by:

1st year student

Zhubreva Olga

Alexandrovna

Checked:

Introduction

§ 1 The concept of a file system

§ 2 MS-DOS file system

§ 3 Windows 95 file system

§ 4 Windows NT file system

Conclusion

Introduction.

This methodological manual explains the concept of a "file system", one of the most important concepts in the course "Computer Software", and describes the structure of the file systems of the MS-DOS, Windows 95 and Windows NT operating systems.

The structure of the manual follows from this goal: the material is divided into 4 main parts (each presented as a section), and each part is, where necessary, divided into smaller, more detailed parts.

§ 1 The concept of a file system.

1.1. File system definition.

File (from the English word "file": a folder or binder).

A file is a named area of memory on some physical medium, intended for storing information.

The set of operating system facilities that provide access to information on external media is called a file management system, or file system.

The file system is the functional part of the operating system that is responsible for exchanging data with external storage devices.

ORGANIZING ACCESS TO THE FILE

Directory structure

We hope that you have a good idea of how books are stored in a library and, accordingly, of how the desired book is found by its code in the catalogue. Apply that understanding to the way files are stored on a disk and to the way access to them is organized.

Access is the procedure for establishing a connection with memory and with a file located in it in order to write and read data.

The name of the logical drive that precedes the file name in the specification indicates the logical drive on which the file is to be searched for. On the same disk a directory is organized in which the full names of the files are registered together with their characteristics: date and time of creation, size (in bytes) and special attributes. By analogy with a library catalogue, the full name of a file registered in the directory serves as the code by which the operating system finds the location of the file on the disk.

A directory is a catalogue of files that indicates their location on the disk.

A directory can be in one of two states: current (active) or passive. MS DOS remembers the current directory on each logical drive.

The current (active) directory is the directory in which the user is working at the present moment.

A passive directory is a directory with which there is no connection at the present moment.

The MS DOS operating system uses a hierarchical structure (Fig. 9.1) for organizing directories. Each disk always has a single main (root) directory. It is at level 0 of the hierarchy and is denoted by the symbol "\". The root directory is created when the disk is formatted (initialized), has a limited size and cannot be deleted using DOS tools. The main directory may contain other directories and files, which are created by operating system commands and can be removed by the corresponding commands.

Fig. 9.1. Hierarchical structure of directory organization

A parent directory is a directory that has subdirectories. A subdirectory is a directory that is included in another directory.

Thus any directory that contains lower-level directories can be, on the one hand, a parent with respect to them and, on the other hand, subordinate to a higher-level directory. As a rule, if it does not cause confusion, the term "directory" is used to mean either a subdirectory or a parent directory, depending on the context.

Directories on disks are organized as system files. The only exception is the root directory, for which a fixed space is allocated on the disk. Directories can be accessed in the same way as regular files.

Note. The directory structure may contain directories that do not hold any files or subdirectories. Such directories are called empty.

The rules for naming subdirectories are the same as the rules for naming files (see subsection 9.1). To distinguish them formally from files, subdirectories are usually given only a name, although a type can be added according to the same rules as for files.

Access to the contents of a file is organized from the main directory through a chain of subordinate directories (subdirectories) of the i-th level. A directory at any level can store records of both files and lower-level directories.

Fig. 9.2 shows the simplest directory structure, in which the main (level 0) directory stores only records of files and no lower-level directories exist.

Fig. 9.3 shows a hierarchical directory structure in which directories at any level store records of files and of lower-level directories. A transition to a lower-level directory can only be made sequentially, through the subordinate directories.

Fig. 9.2. Simplest directory structure, with no lower-level directories

Fig. 9.3. Typical directory structure containing lower-level directories. Three digits are used to designate a lower-level directory: the first digit is the level number; the second is the serial number of the directory at that level; the third indicates the level at which its name is registered. Each directory has the name CAT followed by these indexes. For example, CAT342 is the name of third-level directory number 4, which is registered in a second-level directory.

You cannot go from the main directory directly to, for example, a level 5 directory. It is necessary to pass through all the preceding higher-level directories.

The principle described above, organizing access to a file through a directory, is the basis of the file system.

The file system is the part of the operating system that manages the placement of files and directories on the disk and access to them.

Closely related to the concept of a file system is the concept of the disk file structure, by which we mean how the main directory, subdirectories, files and the operating system are placed on the disk, and what amounts of sectors, clusters and tracks are allocated to them.

Rules for forming the disk file structure. When creating the file structure of a disk, the MS DOS operating system follows a number of rules:

A file or directory can be registered under the same name in different directories, but only once in any one directory;

The order of file and subdirectory names in the parent directory is arbitrary;

A file can be split into several parts, for which areas of disk space of equal size are allocated on different tracks and sectors.

Path and prompt

Figs. 9.1 - 9.3 show that a file is accessed through a directory by means of the file name registered in it. If the directories form a hierarchical structure, the operating system organizes access to the file depending on the position of the subdirectory in which the name of the required file is registered.

Access to the file can be organized as follows:

If the file name is registered in the current directory, it is sufficient to specify only its name in order to access the file;

If the file name is registered in a passive directory, then, while in the current directory, you must specify the path, i.e. the chain of subordinate directories through which the file is to be accessed.

A path is the chain of subordinate directories that must be traversed along the hierarchical structure to reach the directory in which the desired file is registered. When a path is specified, the directory names are written in order and separated from each other by the \ symbol.

The user interacts with the operating system through the command line displayed on the screen. The command line always begins with a prompt, which ends with the > character. The prompt may display the name of the current drive, the name of the current directory, the current time and date, the path, and delimiter characters.

The operating system prompt is an indication on the display screen that the operating system is ready to accept user commands.

Example 9.8.

A:\>

The current drive is floppy drive A; the current directory is the main directory, as indicated by the \ symbol.

C:\CAT1\CAT2>

The current drive is hard disk C. The current directory is the second-level directory CAT2, which is included in the first-level directory CAT1, which in turn is registered in the main directory.

There are three options for organizing the access path to a file, depending on where the file is registered:

The file is in the current directory (no path). To access the file, it is enough to specify its full name;

The file is in a passive directory at one of the lower levels, subordinate to the current directory. To access the file, you must specify a path that lists the names of all lower-level directories lying on the way (including the directory in which the file is registered);

The file is in a passive directory on a different branch of the hierarchy from the current directory. To access the file, you must specify the path starting from the main directory, i.e. starting with the \ character. The reason is that in a hierarchical structure movement is possible only vertically, from top to bottom. Horizontal transitions from directory to directory are not allowed.

The examples below illustrate the possible path options.

Example 9.9.

Condition: file F1.TXT is registered in the current first-level directory K1 of hard drive C. The prompt C:\K1> is therefore displayed on the screen.

Explanation: in this case there is no path, and to access the file it is enough to specify only its full name, F1.TXT.

Example 9.10.

Condition: file F1.TXT is registered in the second-level directory K2 of hard drive C. The current directory is K1. The prompt C:\K1> is therefore displayed on the screen.

Explanation: in this case the path starts from directory K1 and goes down through its subordinate directory K2. The path from the current directory is therefore given before the full file name: K2\F1.TXT
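The three path options can be sketched in code. The following Python sketch is only an illustration: the nested-dictionary tree, the function name resolve and the directory and file names K3, F2.TXT and F3.TXT are assumptions made for the example and are not part of MS DOS.

```python
# Hypothetical directory tree of drive C:
# a dict stands for a directory, a string for the contents of a file.
TREE = {
    "K1": {
        "F1.TXT": "file in K1",
        "K2": {"F2.TXT": "file in K2"},
    },
    "K3": {"F3.TXT": "file in K3"},
}


def resolve(spec, current):
    """Find a file from its specification and the current directory.

    spec    -- for example "F1.TXT", "K2\\F2.TXT" or "\\K3\\F3.TXT"
    current -- list of directory names from the root to the current directory
    """
    parts = spec.split("\\")
    if parts[0] == "":
        # A leading "\" means the path starts at the main (root) directory.
        chain, parts = [], parts[1:]
    else:
        # Otherwise the path is followed downwards from the current directory.
        chain = list(current)
    *dirs, name = parts
    node = TREE
    for directory in chain + dirs:
        node = node[directory]  # one level at a time; KeyError if it is absent
    return node[name]


current = ["K1"]                          # the prompt would be C:\K1>
print(resolve("F1.TXT", current))         # option 1: file in the current directory
print(resolve("K2\\F2.TXT", current))     # option 2: file below the current directory
print(resolve("\\K3\\F3.TXT", current))   # option 3: file on another branch
```

Note that the loop always descends level by level, which mirrors the rule that a lower-level directory can only be reached sequentially through the directories above it.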

Having become familiar with the concept of a path, let us return to the concept of a file specification introduced in subsection 9.1. There is a short file specification and a full file specification, which includes the path. Fig. 9.4 shows the rules for forming a file specification.

Fig. 9.4. Specification formats (optional parameters are indicated)

Example 9.12.

Short form of the file specification:

C:\KIT.BAS

The file KIT.BAS, containing a BASIC program, is located in the main directory of the hard drive.

Full form of the file specification:

C:\CAT1\CAT2\BOOK1.TXT

The text file BOOK1.TXT is registered in the second-level directory CAT2 of hard drive C.
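As an illustration, a file specification can be split into its parts with the Python standard module ntpath, which applies DOS/Windows path rules on any platform. This is a minimal sketch; the function name split_specification and the dictionary keys are chosen here for the example.

```python
import ntpath  # DOS/Windows-style path handling, usable on any platform


def split_specification(spec):
    """Split a file specification into drive, path, name and type."""
    drive, rest = ntpath.splitdrive(spec)    # drive name, e.g. "C:"
    path, filename = ntpath.split(rest)      # chain of directories, file name
    name, ext = ntpath.splitext(filename)    # name and type (extension)
    return {"drive": drive, "path": path, "name": name, "type": ext.lstrip(".")}


print(split_specification("C:\\KIT.BAS"))
print(split_specification("C:\\CAT1\\CAT2\\BOOK1.TXT"))
```

For the short form the path part is just "\", the main directory; for the full form it is the chain \CAT1\CAT2.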

Structure of directory entries

Let us now look at the structure of the records stored in a directory, which hold information about files and lower-level subdirectories.

A file entry in a directory contains the name and type of the file, the file size in bytes, the date and time of creation, and a number of other parameters that the operating system needs in order to organize access.

The entry for a lower-level subdirectory in the parent directory contains its name, attribute, and date and time of creation.

Let us consider the possible variants of directory contents.

1st option. The directory stores only records of files (Fig. 9.5). A message with the directory name is displayed before the file entries; in this case it is the main directory of floppy disk A. At the end of the directory contents, a message gives the number of files stored on the disk and the free disk space in bytes. For example, for the directory above the following message is displayed:

4 file(s) 359560 bytes free

Here the first number is the number of files on the disk and the second is the amount of free disk space in bytes.

2nd option. The directory stores only entries for lower-level directories (Fig. 9.6). At the end of the directory, as in the previous case, there is a similar entry giving the amount of free disk space.

3rd option. The directory stores records of both files and lower-level directories (Fig. 9.7). This structure shows that the directory contains 3 files and 2 lower-level directories, BASIC and LEXICON. The free space on the disk is 2.6575 MB.

Fig. 9.7. The main directory stores files and subdirectories

The three variants discussed above show the contents of the main directory. The structure of directories at level 1 and below is identical and differs from that of the main directory only in that two entries marked with dots precede the entries for files and lower-level directories (Fig. 9.8). The dots at the beginning mean that the contents of a subdirectory are being displayed: here the first-level directory KNIGA, which contains two text files, SVET and TON.

Directory of C:\KNIGA

| .        |       | 11-12-90 | 09:40 |

| ..       |       | 10-10-91 | 08:30 |

| svet txt | 55700 | 04-04-90 | 10:05 |

| ton txt  | 60300 | 03-05-91 | 11:20 |

2 files 912348 bytes free

Fig. 9.8. Structure of entries in a subdirectory

1.2. File system FAT.

Windows operating systems use the FAT file system, originally developed for DOS, in which every DOS partition and volume has a boot sector and every DOS partition contains two copies of the file allocation table (FAT).

The FAT is a table that records the relationship between the files and folders of a partition and their physical location on the hard drive. At the start of each hard disk partition there are two consecutive copies of the FAT. Like the boot sectors, the FAT is located outside the area of the disk that is visible to the file system.

When written to disk, files do not necessarily occupy a single area equal to their size. Files are usually split into clusters of a fixed size, which may be scattered throughout the partition. As a result, the FAT is not a list of files and their locations but a list of the clusters of the partition and their contents, with each entry ending in a reference to the next cluster occupied by the file.

FAT entries are 12-, 16- or 32-bit hexadecimal numbers; their size is determined by the FDISK program, and their values are generated by the FORMAT program.

All floppy disks and hard disks up to 16 MB in size use 12-bit FAT elements. Hard and removable disks of 16 MB or more usually use 16-bit elements.

The FAT file system was used in all versions of MS-DOS and in the first two releases of OS/2 (versions 1.0 and 1.1). Each logical volume had its own FAT, which served two functions: it held the allocation information for each file in the volume in the form of a linked list of allocation units (clusters), and it indicated which allocation units were free.

When the FAT was invented, it was an excellent solution for managing disk space, mainly because the floppy disks on which it was used were rarely larger than a few MB. The FAT was small enough to remain in memory permanently, which allowed very fast random access to any part of any file.

When FAT was applied to hard drives, the table became too large to stay resident in memory, and system performance degraded. In addition, since information about free disk space was spread across a large number of FAT sectors, allocating file space was inefficient, and file fragmentation proved to be a barrier to high performance.

Furthermore, the use of relatively large clusters on hard disks led to a large amount of unused space, since on average half a cluster was wasted for each file.

For several years Microsoft and IBM tried to extend the life of the FAT file system by removing volume size restrictions, improving allocation strategies, caching path names, and moving tables and buffers into extended memory. These can only be regarded as temporary measures, because the file system was simply not suited to large random-access devices.

§ 2 File system of the MS-DOS operating system.

One of the concepts of the MS DOS file system is the logical disk.

Logical drives:

For DOS, each logical disk is a separate magnetic disk. Each logical disk has its own unique name. The letters of the English alphabet from A to Z (inclusive) are used as logical drive names. The number of logical drives is therefore at most 26.

Letters A and B are reserved strictly for the floppy disk drives (FDD) of the IBM PC.

Starting with the letter C, the logical drives (partitions) of the hard disk drive (HDD, "Winchester") are named.

The pictures show an image of a logical disk.

If a given IBM PC has only one FDD, the letter B is skipped.

Only logical drives A and C can be system drives.

File structure of a logical disk:

To access information on a disk (stored in a file), one needs to know the physical address of the first sector (surface number + track number + sector number), the total number of clusters occupied by the file, the address of the next cluster if the file is larger than one cluster, and so on. For the user all this is obscure, difficult and unnecessary.

MS DOS spares the user this work and does it itself. To provide access to files, the MS DOS file system organizes and maintains a specific file structure on the logical disk.

File structure elements:

Start sector (boot sector),

File allocation table (FAT),

Root directory,

Data area (the remaining free disk space).

These elements are created by special programs (in the MS DOS environment) when the disk is initialized.

Start sector (boot sector):

It holds the information that MS DOS needs in order to work with the disk:

OS ID (if the disk is a system disk),

Disk sector size,

Number of sectors in a cluster,

Number of reserved sectors at the beginning of the disk,

Number of FAT copies on the disk (normally two),

Number of entries in the root directory,

Number of sectors on the disk,

Disk format type,

Number of sectors in the FAT,

Number of sectors per track,

Number of surfaces,

OS boot block.

The FAT follows the start sector.
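The fields listed above correspond closely to the beginning of a FAT12/FAT16 boot sector. The following Python sketch reads them with the standard struct module. It is an illustration under stated assumptions: the byte layout is the commonly documented FAT12/FAT16 one, the field names are chosen here for readability, and the sample sector is made up with values typical of a 1.44 MB floppy disk rather than read from a real disk.

```python
import struct

# First 28 bytes of a FAT12/FAT16 boot sector, little-endian, no padding:
# jump (3), OS ID (8), then the disk parameters.
BOOT_FORMAT = "<3s8sHBHBHHBHHH"
FIELDS = (
    "jump", "oem_id", "bytes_per_sector", "sectors_per_cluster",
    "reserved_sectors", "fat_copies", "root_entries", "total_sectors",
    "media_type", "sectors_per_fat", "sectors_per_track", "surfaces",
)


def parse_boot_sector(sector):
    """Return the leading boot-sector fields as a dictionary."""
    size = struct.calcsize(BOOT_FORMAT)
    values = struct.unpack(BOOT_FORMAT, sector[:size])
    return dict(zip(FIELDS, values))


# Made-up sample sector (assumption: values typical of a 1.44 MB floppy).
sample = struct.pack(BOOT_FORMAT, b"\xeb\x3c\x90", b"MSDOS5.0",
                     512, 1, 1, 2, 224, 2880, 0xF0, 9, 18, 2)
sample = sample.ljust(512, b"\x00")  # pad to a full 512-byte sector

info = parse_boot_sector(sample)
for field in FIELDS[1:]:
    print(field, info[field])
```

To inspect a real disk, the first 512 bytes of a disk image could be passed to parse_boot_sector instead of the sample.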

FAT (File Allocation Table):

MS DOS treats the data area of the disk (see above) as a sequence of numbered clusters.

The FAT is an array of elements that address the clusters of the disk's data area. Each cluster of the data area corresponds to one FAT element.

FAT elements form chains of links to the clusters of a file in the data area.

The FAT is an extremely important element of the file structure. Damage to the FAT can lead to complete or partial loss of information on the entire logical disk. That is why two copies of the FAT are stored on the disk. There are special programs that monitor the state of the FAT and correct errors in it.

Root directory:

This is a specific area of the disk, created when the disk is initialized (formatted), that contains information about the files and directories stored on the disk.

The root directory always exists on a formatted disk, and a disk always has exactly one root directory. The size of the root directory on a given disk is fixed, so the maximum number of files and other (child) directories, or subdirectories, "attached" to it is strictly defined.

Summing up all of the above, we can conclude that MS-DOS is a 16-bit operating system that runs in the processor's real mode.

§ 4 File system of the Windows 95 operating system.

4.1. Background to the creation of FAT 32.

In 1987 a crisis arose in the field of personal computers. The capabilities of the FAT file system, developed by Microsoft ten years earlier for the Standalone Disk Basic interpreter and later adapted for the DOS operating system, had been exhausted. FAT was designed for hard drives of no more than 32 MB, and new HDDs of larger capacity turned out to be completely useless to PC users. Some independent vendors offered their own solutions to this problem, but only with the arrival of DOS 4.0 was the crisis overcome, for a while.

Significant changes to the file system structure in DOS 4.0 allowed the operating system to work with disks of up to 128 MB; subsequent minor additions raised this limit to 2 GB. At the time, this capacity seemed to exceed any imaginable need. However, if the history of personal computers has taught us anything, it is that a capacity that "exceeds any conceivable need" very quickly becomes "barely sufficient for serious work". Indeed, hard drives currently on sale usually have capacities of 2.5 GB and higher, sometimes much higher, and the 2 GB ceiling that once freed us from limitations has become yet another obstacle to be overcome.

4.2. Description of FAT 32.

For Windows 95, Microsoft developed a new extension of the FAT system, FAT32, which was included without any fanfare in OEM Service Release 2.

FAT32 is installed only on new PCs; do not count on getting it when you upgrade to a new version of Windows 95, although Microsoft states that this extension will become part of the main Windows upgrade package.

4.2.1. Disk areas

This file system provides a number of special areas on the disk, set aside during formatting to organize the disk space: the master boot record, the partition table, the boot record, the file allocation table (from which the FAT system takes its name) and the root directory.

At the physical level, disk space is divided into 512-byte areas called sectors. The FAT system allocates space for files in blocks that consist of a whole number of sectors and are called clusters. The number of sectors in a cluster must be a power of two. Microsoft calls these clusters allocation units, and the SCANDISK report gives their size, for example "16,384 bytes in each allocation unit".

4.2.2. FAT chain

The FAT is a database that links the clusters of disk space to files. It has exactly one element for each cluster. The first two elements contain information about the FAT system itself. The third and subsequent elements correspond to clusters of disk space, starting with the first cluster allocated for files. FAT elements can contain several special values, indicating that:

The cluster is free, i.e. not used by any file;

The cluster contains one or more sectors with physical defects and should not be used;

The cluster is the last cluster of a file.

For any cluster that is used by a file but is not its last cluster, the FAT element contains the number of the next cluster occupied by the file.

Each directory, whether the root directory or a subdirectory, is also a database. A DOS directory holds one main entry for each file (in Windows 95, additional entries were introduced for long file names). Unlike the FAT, where each element consists of a single field, a directory entry for a file consists of several fields. Some fields (name, extension, size, date and time) can be displayed on the screen with the DIR command. But the FAT system also provides a field that the DIR command does not display: the field holding the number of the first cluster allocated to the file.

When a program asks the operating system for the contents of a file, the OS looks through the directory entry for that file to find its first cluster. It then consults the FAT element for that cluster to find the next cluster in the chain. By repeating this process until the last cluster of the file is found, the OS determines exactly which clusters belong to the file and in what order. In this way the system can give the program any part of the file it requests. This way of organizing a file is called a FAT chain.
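The walk along a FAT chain described above can be sketched as follows. This is a minimal Python illustration, not the operating system's actual code: the FAT is modelled as a plain list, the marker values follow the usual FAT16 convention (0x0000 free, 0xFFF7 defective, 0xFFF8 and above end of chain), and the toy table is made up.

```python
FREE = 0x0000              # cluster not used by any file
BAD = 0xFFF7               # cluster contains defective sectors
END_OF_CHAIN_MIN = 0xFFF8  # 0xFFF8..0xFFFF mark the last cluster of a file


def cluster_chain(fat, first_cluster):
    """Return the clusters of a file in order, starting from its first cluster."""
    chain = []
    cluster = first_cluster
    while True:
        chain.append(cluster)
        nxt = fat[cluster]            # FAT element for the current cluster
        if nxt >= END_OF_CHAIN_MIN:
            return chain              # last cluster reached
        if nxt in (FREE, BAD) or nxt in chain:
            raise ValueError("damaged FAT chain")
        cluster = nxt


# Toy FAT with 10 elements; elements 0 and 1 describe the FAT itself.
fat = [0xFFF8, 0xFFFF, 5, 0xFFFF, FREE, 3, BAD, FREE, FREE, FREE]

# A directory entry would supply the first cluster; here it is assumed to be 2.
print(cluster_chain(fat, 2))
```

In this toy table the file that starts at cluster 2 continues in cluster 5 and ends in cluster 3, so its clusters are not contiguous on the disk.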

In the FAT system, files are always allocated a whole number of clusters. On a 1.2 GB hard disk with 32 KB clusters, the directory may state that a text file containing the words "hello, world" is only 12 bytes long, but in fact the file occupies 32 KB of disk space. The unused part of the cluster is called slack. In small files almost the entire cluster may be slack, and on average the loss is half a cluster per file.

On an 850 MB hard drive with 16 KB clusters and an average file size of about 50 KB, about 16% of the disk space allocated to files is lost to allocated but unused space.
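The arithmetic behind these figures can be checked with a short Python sketch. The function name allocated_size is chosen here for the example, and the way the 16% figure is reproduced (half a cluster of slack divided by the average file size) is an assumption about how the estimate was obtained.

```python
import math

KB = 1024


def allocated_size(file_size, cluster_size):
    """Disk space occupied by a file: a whole number of clusters."""
    return math.ceil(file_size / cluster_size) * cluster_size


# "hello, world" stored on a disk with 32 KB clusters
size = len("hello, world")
occupied = allocated_size(size, 32 * KB)
print(size, "bytes of data,", occupied, "bytes on disk,", occupied - size, "bytes of slack")

# Average slack of half a cluster, relative to an average file of about 50 KB
cluster = 16 * KB
average_file = 50 * KB
loss_percent = (cluster / 2) / average_file * 100
print(round(loss_percent), "% lost on average")
```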

One way to free up disk space is to use disk compression programs such as DriveSpace, which make the slack space available for use by other files.

4.2.3. Other changes in FAT32

To make it possible to work with the increased number of clusters, the directory entry for each file must allocate 4 bytes for the starting cluster of the file (instead of 2 bytes in the FAT16 system). Traditionally, each directory entry consists of 32 bytes (Fig. 1). In the middle of this entry there are 10 unused bytes (bytes 12 to 21), which Microsoft reserved for its own future needs. Two of them are now allocated as the additional bytes needed to specify the starting cluster in the FAT32 system.
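The 32-byte directory entry can be decoded with the standard struct module, as in the Python sketch below. It is an illustration under assumptions: the field offsets follow the commonly documented FAT layout, in which FAT32 stores the upper two bytes of the starting cluster in bytes 20-21; the sample entry reuses the name, size, date and time of the SVET file from Fig. 9.8, while its attribute byte and cluster number are made up.

```python
import struct

# name (8), type (3), attribute (1), reserved bytes 12-21 (10),
# time (2), date (2), low half of the starting cluster (2), size (4)
ENTRY_FORMAT = "<8s3sB10sHHHI"


def parse_entry(raw):
    """Decode one 32-byte directory entry."""
    name, ext, attr, reserved, time, date, cluster_low, size = struct.unpack(
        ENTRY_FORMAT, raw)
    # FAT32 keeps the upper 16 bits of the starting cluster in bytes 20-21,
    # the last two of the ten formerly reserved bytes.
    cluster_high = struct.unpack("<H", reserved[8:10])[0]
    return {
        "name": name.decode("ascii").rstrip(),
        "type": ext.decode("ascii").rstrip(),
        "attributes": attr,
        "date": (1980 + (date >> 9), (date >> 5) & 0x0F, date & 0x1F),
        "time": (time >> 11, (time >> 5) & 0x3F, (time & 0x1F) * 2),
        "first_cluster": (cluster_high << 16) | cluster_low,
        "size": size,
    }


# Sample entry: SVET.TXT, 55700 bytes, 4 April 1990, 10:05.
# The attribute byte and the cluster number 0x12345 are hypothetical.
date_word = ((1990 - 1980) << 9) | (4 << 5) | 4
time_word = (10 << 11) | (5 << 5) | 0
sample = struct.pack(ENTRY_FORMAT, b"SVET    ", b"TXT", 0x20,
                     bytes(8) + struct.pack("<H", 0x0001),
                     time_word, date_word, 0x2345, 55700)

entry = parse_entry(sample)
print(entry)
```

A cluster number such as 0x12345 does not fit into the 2 bytes available in FAT16, which is why the extra two bytes are needed.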

The operating system has always provided for two copies of the FAT, but only one of them was used. With the move to FAT32 the operating system can work with either copy. Another change is that the root directory, which previously had a fixed size and a strictly defined place on the disk, can now grow freely as needed, like a subdirectory. There is now no limit on the number of entries in the root directory. This is especially important because each long file name uses several directory entries.

The combination of a relocatable root directory and the ability to use both copies of the FAT is a good basis for smooth dynamic resizing of disk partitions, for example shrinking a partition in order to free up space for another operating system. This new approach is less dangerous than those used in third-party programs for resizing disk partitions under FAT16.

From all of the above we can conclude:

MS-DOS was a purely 16-bit operating system and ran in the processor's real mode. In Windows 3.1, part of the code was 16-bit and part was 32-bit. Windows 3.0 supported the processor's real mode; when version 3.1 was developed, it was decided to drop this support.

Windows 95 is a 32-bit operating system that works only in the processor's protected mode. Only some modules contain 16-bit code, for compatibility with MS-DOS mode; Windows 95 uses 32-bit code wherever possible.

§ 5 File system of the Windows NT operating system.

5.1. Brief description of the Windows NT operating system.

At present the global computer industry is developing very rapidly. System performance is increasing, and with it the ability to process large volumes of data.

Operating systems of the MS-DOS class can no longer cope with this flow of data and cannot make full use of the resources of modern computers. For this reason there has recently been a transition to more powerful and more advanced operating systems of the UNIX class, an example of which is Windows NT, released by Microsoft Corporation.

When users see the Microsoft Windows NT operating system for the first time, they notice a clear external resemblance to the familiar interface of Windows 3.x. However, this visible similarity is only a minor part of Windows NT.

Windows NT is a 32-bit operating system with preemptive multitasking. Security features and advanced network services are included as fundamental components of the operating system.

Windows NT also provides compatibility with many other operating systems and file systems, as well as networks.

As shown in the following figure, Windows NT is a modular (more advanced than monolithic) operating system that consists of separate, interconnected, relatively simple modules.

The main modules of Windows NT are (listed from the lowest level of the architecture to the highest): the Hardware Abstraction Layer (HAL), the kernel, the Executive, the protected subsystems and the environment subsystems.

Modular structure of Windows NT

5.2. Windows NT file system.

When Windows NT first came out, it included support for three file systems: the File Allocation Table (FAT), which provided compatibility with MS-DOS; the High Performance File System (HPFS), which provided compatibility with LAN Manager; and a new file system called the New Technology File System (NTFS).

NTFS had a number of advantages over the file systems used at that time on most file servers.

To ensure data integrity, NTFS has a transaction log. This approach does not rule out the loss of information, but it significantly increases the likelihood that the file system will remain accessible even if the integrity of the server is compromised. This is made possible by using the transaction log to track incomplete disk writes at the next Windows NT boot. The transaction log is also used to check the disk for errors, instead of checking every file as is done when file allocation tables are used.

One of the main advantages of NTFS is security. NTFS makes it possible to add access control entries (ACEs) to an access control list (ACL). An ACE contains the identifying name of a group or user and an access mask, which can be used to restrict access to a particular directory or file. This access may include the ability to read, write, delete, execute and even take ownership of files.

An ACL, in turn, is a container that holds one or more ACEs. This makes it possible to restrict the access of particular users or groups of users to specific directories or files.

In addition, NTFS supports long names of up to 255 characters, containing uppercase and lowercase letters in any order. One of the main features of NTFS is the automatic creation of equivalent names that are compatible with MS-DOS.

NTFS also has a compression feature, which first appeared in NT version 3.51. It makes it possible to compress any file, directory or NTFS disk. Unlike MS-DOS compression programs, which create a virtual disk that appears as a hidden file and compress all the data on that disk, Windows NT uses an additional layer of the file subsystem to compress and decompress the required files without creating a virtual disk. This is useful for compressing either a particular part of a disk (for example, a user directory) or files of a particular type (for example, graphics files). The only disadvantage of NTFS compression is its lower compression ratio compared with MS-DOS compression schemes. On the other hand, NTFS compression offers higher reliability and performance.

So, from all of the above we can conclude:

For compatibility with various operating systems, Windows NT supports the FAT file system. In addition, Windows NT has its own file system, NTFS, which is not compatible with FAT 16. This file system has a number of advantages over FAT and offers higher reliability and performance.

Conclusion.

MS-DOS is a 16-bit operating system that runs in the processor's real mode. In Windows 3.1, part of the code is 16-bit and part is 32-bit. Windows 3.0 supported the processor's real mode; when version 3.1 was developed, it was decided to drop this support.

Windows 95 is a 32-bit operating system that works only in the processor's protected mode. The kernel, including memory management and process scheduling, contains only 32-bit code. This reduces overhead and speeds up operation. Only some modules contain 16-bit code, for compatibility with MS-DOS mode. Windows 95 uses 32-bit code wherever possible, which provides increased reliability and fault tolerance. In addition, 16-bit code is used for compatibility with legacy applications and drivers.

Windows NT is not a further development of earlier products. Its architecture was created from scratch, taking into account the requirements for a modern operating system. To keep the new operating system compatible, the Windows NT developers retained the familiar Windows interface and implemented support for existing file systems (such as FAT) and applications (written for MS-DOS and Windows 3.x). The developers also included tools for working with various kinds of networks in Windows NT.

Reliability and robustness are provided by architectural features that protect application programs from being damaged by each other and by the operating system. Windows NT uses fault-tolerant structured exception handling at all architectural levels; it includes the recoverable NTFS file system and provides protection through a built-in security system and advanced memory management techniques.

Users access files by symbolic names. However, human memory limits the number of object names that a user can refer to by name. The hierarchical organization of the namespace allows us to significantly expand these boundaries. This is why most file systems have a hierarchical structure, in which levels are created by allowing a lower-level directory to be contained within a higher-level directory (Figure 19).

Fig. 19. File system hierarchy:

a – single-level organization; b – tree; c – network

The graph describing the directory hierarchy can be a tree or a network. Directories form a tree if a file is allowed to be included in only one directory (Fig. 19, b), and a network - if the file can be included in several directories at once (Fig. 19, c). For example, in MS-DOS and Windows, directories form a tree structure, while in UNIX they form a network structure. In a tree structure, each file is a leaf. The top-level directory is called the root directory, or root.

With this organization, the user does not have to remember the names of all files; it is enough to have a rough idea of which group a particular file belongs to in order to find it by browsing directories one after another. The hierarchical structure is convenient for multi-user work: each user's files are localized in that user's own directory or subtree of directories, and at the same time all files in the system are logically connected.

A special case of a hierarchical structure is a single-level organization, when all files are included in one directory (Fig. 19, a).

File names

All file types have symbolic names. Hierarchically organized file systems typically use three types of file names: simple, compound, and relative.

A simple, or short, symbolic name identifies a file within a single directory. Simple names are assigned to files by users and programmers, who must take into account OS restrictions on both the permitted characters and the length of the name. Until relatively recently, these limits were very narrow. Thus, in the FAT file system, names were limited to the 8.3 scheme (8 characters for the name itself, 3 characters for the extension), and in the s5 file system, supported by many versions of UNIX, a simple symbolic name could not contain more than 14 characters. However, long names are much more convenient for the user, because they make it possible to give files easy-to-remember names that clearly indicate what the file contains. Therefore, modern file systems, as well as improved versions of earlier file systems, tend to support long simple symbolic names. For example, on the NTFS and FAT32 file systems included with the Windows NT operating system, a file name can contain up to 255 characters.

Examples of simple file and directory names:

Supplement to CD 254L in Russian.doc

installable filesystem manager.doc

In hierarchical file systems, different files are allowed to have the same simple symbolic names, provided they belong to different directories. That is, the “many files - one simple name” scheme works here. To uniquely identify a file in such systems, a so-called full name is used.

The full name is a chain of the simple symbolic names of all directories on the path from the root to the given file. Thus, the full name is a compound name in which simple names are separated from each other by the separator accepted in the OS. A forward slash or backslash is often used as the separator, and it is customary not to specify the name of the root directory. In Fig. 19, b, two files have the simple name main.exe, but their compound names /depart/main.exe and /user/anna/main.exe are different.

In a tree file system, there is a one-to-one correspondence between a file and its full name “one file – one full name”. In file systems that have a network structure, a file can be included in several directories, which means it can have several full names; here the correspondence “one file - many full names” is valid. In both cases, the file is uniquely identified by its full name.

A file can also be identified by a relative name. The relative file name is determined through the concept of “current directory”. For each user, at any given time, one of the file system directories is the current directory, and this directory is selected by the user himself upon an OS command. The file system captures the name of the current directory so that it can then use it as a complement to relative names to form the fully qualified file name. When using relative names, the user identifies a file by the chain of directory names through which the route from the current directory to the given file passes. For example, if the current directory is /user, then the relative file name /user/anna/main.exe is anna/main.exe.
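The way a relative name is completed with the current directory can be sketched in a few lines of Python. This is only an illustration using the standard pathlib module; the function names full_name and relative_name are made up for this example and do not belong to any operating system interface.

```python
from pathlib import PurePosixPath


def full_name(current_dir: str, name: str) -> str:
    """Complete a relative name with the current directory; keep full names as they are."""
    path = PurePosixPath(name)
    if path.is_absolute():
        return str(path)
    return str(PurePosixPath(current_dir) / path)


def relative_name(current_dir: str, full: str) -> str:
    """Drop the current-directory prefix from a full name."""
    return str(PurePosixPath(full).relative_to(current_dir))


print(full_name("/user", "anna/main.exe"))            # /user/anna/main.exe
print(relative_name("/user", "/user/anna/main.exe"))  # anna/main.exe
```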

Some operating systems allow you to assign multiple simple names to the same file, which can be interpreted as aliases. In this case, just as in a system with a network structure, the correspondence “one file – many full names” is established, since each simple file name corresponds to at least one full name.

And although the full name uniquely identifies the file, it is easier for the operating system to work with the file if there is a one-to-one correspondence between the files and their names. For this purpose, it assigns a unique name to the file, so that the ratio “one file - one unique name” is valid. The unique name exists along with one or more symbolic names assigned to the file by users or applications. The unique name is a numeric identifier and is intended only for the operating system. An example of such a unique file name is the inode number in UNIX system.

Mounting

In general, a computer system may have several disk devices. Even a typical personal computer usually has one hard drive, one floppy drive, and a CD-ROM drive. Powerful computers are usually equipped with a large number of disk drives holding disk packs. Moreover, even one physical device can be presented by the operating system as several logical devices, in particular by dividing the disk space into partitions. The question arises: how should file storage be organized in a system with several external memory devices?

The first solution is that each device hosts a self-contained file system, that is, the files located on this device are described by a directory tree that is in no way connected to the directory trees on other devices. In this case, to uniquely identify the file, the user must specify the logical device identifier along with the compound symbolic file name. An example of such an autonomous existence of file systems is the MS-DOS operating system, in which the full file name includes the letter identifier of the logical drive. So, when accessing a file located on drive A, the user must specify the name of this drive: A:\privat\letter\uni\let1.doc.

Another option is to organize file storage in which the user is given the opportunity to combine file systems located on different devices into a single file system, described by a single directory tree. This operation is called mounting. Let's look at how this operation is carried out using the UNIX OS as an example.

Among all the logical disk devices available in the system, the operating system distinguishes one device, called the system one. Let there be two file systems located on different logical drives (Fig. 20), and one of the drives is the system drive.

The file system located on the system disk is designated as the root file system. To link the file hierarchies, an existing directory in the root file system is selected, in this example the man directory. Once the mount is complete, the selected man directory becomes the root directory of the second file system. Through this directory, the mounted file system is attached as a subtree to the general tree (Fig. 21).

Fig. 20. Two file systems before mounting

Fig. 21. Shared file system after mounting

Once a shared file system is mounted, there is no logical difference for the user between the root and mounted file systems; in particular, file naming is done in the same way as if it had been a single file system to begin with.
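The effect of mounting on name resolution can be modelled with a small table that maps mount-point directories to file systems. The sketch below is a simplified model, not how UNIX actually implements mounting; the class MountTable and the file system labels "disk1" and "disk2" are hypothetical names chosen for the example.

```python
from pathlib import PurePosixPath


class MountTable:
    """Maps mount-point directories to the file systems attached there."""

    def __init__(self, root_fs: str):
        self.mounts = {PurePosixPath("/"): root_fs}

    def mount(self, fs_name: str, mount_point: str) -> None:
        self.mounts[PurePosixPath(mount_point)] = fs_name

    def resolve(self, full_name: str):
        """Return (file system, path inside that file system)."""
        path = PurePosixPath(full_name)
        # Keep every mount point that is a prefix of the path ...
        candidates = [mp for mp in self.mounts if mp == path or mp in path.parents]
        # ... and choose the longest one.
        best = max(candidates, key=lambda mp: len(mp.parts))
        inner = PurePosixPath("/") / path.relative_to(best)
        return self.mounts[best], str(inner)


table = MountTable("disk1")   # file system on the system disk
table.mount("disk2", "/man")  # second file system attached at /man
print(table.resolve("/man/doc/a.txt"))
print(table.resolve("/user/anna/main.exe"))
```

The user always writes names relative to the single tree; only the table knows that everything under /man lives on another device.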

File attributes

The concept of “file” includes not only the data and name it stores, but also its attributes. File attributes are information that describes the properties of a file. Examples of possible file attributes:

- file type (regular file, directory, special file, etc.);
- file owner;
- file creator;
- password to access the file;
- information about permitted file access operations;
- times of creation, last access and last change;
- current file size;
- maximum file size;
- “read-only” flag;
- “hidden file” flag;
- “system file” flag;
- “archive file” flag;
- “binary/character” attribute;
- “temporary” attribute (remove after the process is completed);
- lock flag;
- length of the record in the file;
- pointer to the key field in the record;
- key length.

The set of file attributes is determined by the specifics of the file system: file systems of different types may use different sets of attributes to characterize files. For example, file systems that support only flat (unstructured) files have no need for the last three attributes in the list, which relate to file structuring. In a single-user OS, the set of attributes lacks characteristics related to users and security, such as the file owner, the file creator, the access password, and information about permitted access to the file.

The user can access attributes using the facilities that the file system provides for this purpose. Typically, the value of any attribute can be read, but only some can be changed. For example, a user can change the permissions of a file (provided they have the necessary authority to do so), but cannot change the creation date or current size of the file.

File attribute values can be stored directly in directories, as is done in the MS-DOS file system (Fig. 22, a). The figure shows the structure of a directory entry containing a simple symbolic name and file attributes. Here the letters indicate the characteristics of the file: R - read-only, A - archived, H - hidden, S - system.

Fig. 22. Directory structure:

a – MS-DOS directory entry structure (32 bytes); b – UNIX OS directory entry structure
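As an illustration of a 32-byte MS-DOS directory entry, the sketch below unpacks one with Python's struct module. The field order used here (8-byte name, 3-byte extension, attribute byte, 10 reserved bytes, time, date, first cluster, size) and the bit values of the R, H, S and A flags follow the classic FAT layout, which the text does not spell out, so treat them as assumptions of the example. The sample entry is invented.

```python
import struct

# Assumed classic FAT layout of one 32-byte directory entry (little-endian).
ENTRY = struct.Struct("<8s3sB10sHHHI")
FLAGS = {0x01: "R", 0x02: "H", 0x04: "S", 0x20: "A"}


def parse_entry(raw: bytes) -> dict:
    name, ext, attr, _reserved, _time, _date, cluster, size = ENTRY.unpack(raw)
    base = name.decode("ascii").rstrip()
    suffix = ext.decode("ascii").rstrip()
    return {
        "name": f"{base}.{suffix}" if suffix else base,
        "attributes": "".join(letter for bit, letter in FLAGS.items() if attr & bit),
        "first_cluster": cluster,
        "size": size,
    }


# Invented sample: LET1.DOC, read-only + archive, 1536 bytes, starting at cluster 2.
sample = ENTRY.pack(b"LET1    ", b"DOC", 0x21, bytes(10), 0, 0, 2, 1536)
print(parse_entry(sample))
```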

Another option is to place attributes in special tables, so that directories contain only references to these tables. This approach is implemented, for example, in the ufs file system of UNIX. In this file system, the directory structure is very simple. The entry for each file contains a short symbolic file name and a pointer to the file's index descriptor (inode), which is the ufs name for the table holding the file's attribute values (Fig. 22, b).

In both versions, directories provide a link between file names and the files themselves. However, the approach of separating the file name from its attributes makes the system more flexible. For example, a file can easily be included in several directories at once. Entries for this file in different directories may have different simple names, but the link field will have the same inode number.
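The idea of several directory entries pointing at one index descriptor can be modelled with two dictionaries. This is a toy model, not the real ufs on-disk format; the inode number 7, the attribute values and the second name tool.exe are invented for the example.

```python
# inode number -> attribute table (one table per file)
inodes = {7: {"size": 1536, "owner": "anna"}}

# directory -> {simple name: inode number}
directories = {
    "/user/anna": {"main.exe": 7},
    "/depart": {"tool.exe": 7},  # a second name for the same file
}


def attributes(directory: str, simple_name: str) -> dict:
    """Follow the directory entry to the file's attribute table."""
    return inodes[directories[directory][simple_name]]


# Both names lead to the very same attribute table.
print(attributes("/user/anna", "main.exe") is attributes("/depart", "tool.exe"))
```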


A variable-length object called file.

A file is a named sequence of bytes of arbitrary length. Since a file can have zero length, creating a file amounts to giving it a name and registering it in the file system; this is one of the functions of the OS.

Usually a single file stores data of one type. In this case, the data type determines the file type.

Since there is no size limit in the file definition, one can imagine a file having 0 bytes (empty file), and a file having any number of bytes.

When defining a file, special attention is paid to the name. It actually carries address data, without which the data stored in the file will not become information due to the lack of a method to access it. In addition to addressing-related functions, a file name can also store information about the type of data contained in it. This is important for automatic tools for working with data, because based on the file name (or rather, its extension), they can automatically determine an adequate method for extracting information from the file.

File structure - the hierarchical structure in which the operating system displays files and directories (folders).

The top of the structure is the name of the storage medium on which the files are saved. Below it, files are grouped into directories (folders), within which nested directories can be created.

Names of external storage media. The disks on which information is stored have their own names: each disk is named with a letter of the Latin alphabet followed by a colon. Floppy drives are always assigned the letters A: and B:. The logical drives of the hard drive are named starting with the letter C:. CD drive names follow all the logical drive names. For example, suppose a floppy drive, a hard drive divided into 3 logical drives, and a CD drive are installed. The letters of the storage media are then: A: - floppy disk drive; C:, D:, E: - logical drives of the hard drive; F: - CD drive.

Logical drive, or volume (English: volume or partition) - a part of the computer's long-term memory that is treated as a single whole for ease of use. The term "logical disk" is used in contrast to "physical disk", which refers to the memory of one specific disk medium.

For the operating system, it does not matter where the data is located: on an optical disc, on a hard drive partition, or on a flash drive. The concept of a logical disk is introduced to unify how these areas of long-term memory are represented.

In addition to the stored information, the volume contains a description of the file system; as a rule, this is a table listing all files and their attributes (File Allocation Table, FAT). The table determines, in particular, in which directory (folder) a particular file is located. Thanks to this, when a file is moved from one folder to another within the same volume, the data is not transferred from one part of the physical disk to another; only the entry in the file allocation table is changed. If a file is transferred from one logical drive to another (even if both logical drives are located on the same physical drive), the data is necessarily transferred physically (it is copied, and the original is deleted if the copy succeeds).

For the same reason, formatting and defragmenting each logical drive does not affect the others.

Directory (folder) - an area of the disk (a special system file) that stores service information about files (name, extension, creation date, size, etc.). Directories at lower levels are nested within directories at higher levels. The higher-level directory (superdirectory) is called the parent directory of the lower-level directories it contains. The top level of nesting in the hierarchical structure is the root directory of the disk (Fig. 1). The directory that the user is currently working with is called the current directory.

The rules for naming a directory are no different from the rules for naming a file, although it is not customary to specify name extensions for directories. When writing a file access path through a system of subdirectories, all intermediate directories are separated by a specific symbol. Many operating systems use "\" (backslash) as this character.

The requirement for a unique file name is obvious: without it, unambiguous access to data cannot be guaranteed. In computer systems, uniqueness of names is enforced automatically: neither the user nor a program can create a file whose full name is identical to an existing one.

When a file is used that is not in the current directory, the program accessing the file needs to indicate where exactly the file is located. This is done by specifying the path to the file.

The path to the file is the name of the medium (disk) followed by a sequence of directory names separated by the “\” character in Windows (the “/” character is used in UNIX-family operating systems). The path specifies the route to the directory in which the desired file is located.

There are two methods of specifying the file path. In the first, each file is given an absolute path name (full file name), consisting of the names of all directories from the root to the one that contains the file, and the name of the file itself. For example, the path C:\Abby\Doc\otchet.doc means that the root directory of disk C: contains a directory Abby, which in turn contains a subdirectory Doc, where the file otchet.doc is located. Absolute path names always begin with the media name and the root directory and are unique. The second method uses a relative path name together with the concept of the current directory. The user can designate one of the directories as the current working directory. In this case, all path names that do not begin with a delimiter character are considered relative and are resolved relative to the current directory. For example, if the current directory is C:\Abby, the file with the absolute path given above can be referred to as Doc\otchet.doc.
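The parts of a Windows path described here (media name, chain of directories, file name) can be inspected with Python's pathlib, which handles Windows-style paths on any platform through PureWindowsPath. The helper describe is a hypothetical name used only for this sketch.

```python
from pathlib import PureWindowsPath


def describe(path_text: str) -> dict:
    """Split a Windows path into media name, directories and file name."""
    path = PureWindowsPath(path_text)
    return {
        "drive": path.drive,
        # Skip the anchor ("C:\") so that only directory names remain.
        "directories": [p for p in path.parent.parts if p != path.anchor],
        "file": path.name,
        "absolute": path.is_absolute(),
    }


print(describe(r"C:\Abby\Doc\otchet.doc"))
print(describe(r"Doc\otchet.doc"))

# A relative name is completed with the current directory.
current = PureWindowsPath(r"C:\Abby")
print(current / r"Doc\otchet.doc")
```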

Because the file structure of a computer can be very large, finding the necessary documents by simply navigating through the file structure is not always convenient. It is usually assumed that every computer user knows (and remembers) the structure of the folders in which they store documents. However, documents are sometimes saved outside this structure. For example, many applications save documents to default folders if the user has forgotten to specify explicitly where the document should be saved. This default folder can be the folder last used for saving, the folder in which the application itself is located, or a service folder such as \My Documents. In such cases, document files can be “lost” in the mass of other data.

The need to search for files arises especially often during setup and maintenance work. A typical case is when, in looking for the source of uncontrolled changes in the operating system, you need to find all the files that have been changed recently. Automatic file search tools are also widely used by specialists who set up computer systems: it is difficult for them to find their way around the file structure of someone else's personal computer, and searching for the necessary files by navigation is not always productive.

The main Windows XP search tool is launched from the Main Menu with the command Start > Find > Files and Folders. Another way to launch it is no less convenient: from any folder window (View > Explorer Bars > Search > Files and Folders, or the F3 key).

The controls on the search panel let you narrow the search area based on what is known about the file name and location. The wildcard characters «*» and «?» are allowed when entering a file name. The character «*» replaces any number of arbitrary characters, and the character «?» replaces any one character. For example, a search for *.txt displays all files with the name extension .txt, and a search for *.??t produces a list of all files with the name extensions .txt, .bat, .dat and so on.
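The behaviour of the two wildcard characters can be tried out with Python's standard fnmatch module, which uses the same meaning for «*» and «?». This is not the Windows search tool itself, only a sketch of the matching rule; the file list is invented, and lower-casing both sides imitates the case-insensitive names of Windows.

```python
from fnmatch import fnmatchcase


def search(names, mask):
    """Return the names that match a mask containing * and ? wildcards."""
    return [n for n in names if fnmatchcase(n.lower(), mask.lower())]


files = ["notes.txt", "setup.bat", "base.dat", "report.doc", "readme.TXT"]
print(search(files, "*.txt"))  # ['notes.txt', 'readme.TXT']
print(search(files, "*.??t"))  # ['notes.txt', 'setup.bat', 'base.dat', 'readme.TXT']
```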

When searching for files with “long” names, you should keep in mind that if the “long” name contains spaces (and this is acceptable), then when creating a search task, such a name should be enclosed in quotes, for example: "Current work.doc".

The search bar has additional hidden controls. They appear when you click on the downward expanding arrow.

· The question “When were the last changes made?” lets you limit the search by the date the file was created, last modified, or opened.

· The question “What is the file size?” lets you limit the search to files of a certain size.

· The item “Extra options” lets you specify the file type, allow hidden files and folders to be searched, and set some other search parameters.

When an unformatted text document is being sought, it is possible to search not only by file attributes but also by content. The desired text can be entered in the field “A word or phrase in a file”.

Searching for a document based on a text fragment does not produce results if it is a document that has formatting, because the formatting codes violate the natural sequence of text character codes. In these cases, you can sometimes use the search tool that comes with the application that formats the documents.

19. Data compression and file archiving.

A characteristic feature of most “classical” data types that people traditionally work with is a certain redundancy. The degree of redundancy depends on the type of data and on the coding system adopted. For example, encoding text in Russian (using the Russian alphabet) yields on average 20-30% more redundancy than encoding equivalent information in English.
Redundancy also plays an important role in information processing. However, when it comes not to processing but to storing or transmitting finished documents, redundancy can be reduced, which produces the effect of data compression.
If compression methods are applied to finished documents, the term data compression is often replaced with the term data archiving, and the software that performs these operations is called an archiver.
Depending on the object in which the data being compressed is located, there are:
- compression (archiving) of files;
- compression (archiving) of folders;
- compression of disks.
If compression changes the content of the data, the method is irreversible: when data is restored from the compressed file, the original sequence is not fully recovered. Such methods are also called lossy (controlled-loss) compression methods. They are applicable only to those types of data for which the formal loss of part of the content does not significantly reduce their usefulness. First of all, this applies to multimedia data: video sequences, music recordings, sound recordings and drawings. Lossy methods usually provide much higher compression ratios than reversible methods, but they cannot be applied to text documents, databases and, especially, program code. Typical lossy compression formats are:
- .JPG for graphic data;
- .MPG for video data;
- .MP3 for audio data.
If data compression only changes its structure, then the compression method is reversible. From the resulting code, you can restore the original array by applying the reverse method. Reversible methods are used to compress any type of data. Typical lossless compression formats are:
- .GIF, .TIF, .PCX and many others for graphics data;
- .AVI for video data;
- .ZIP, .ARJ, .RAR, .LZH, .LH, .CAB and many others for any data type.
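Reversibility is easy to check in practice. The sketch below uses Python's standard zlib module (the DEFLATE method, also used inside .ZIP archives) on an invented, deliberately redundant byte string: the compressed form is shorter, and decompression restores the original array exactly.

```python
import zlib

# Highly redundant sample data, invented for the example.
original = ("file system " * 200).encode("ascii")

packed = zlib.compress(original, 9)  # 9 = strongest compression level
restored = zlib.decompress(packed)

print(len(original), "->", len(packed), "bytes")
print("restored exactly:", restored == original)
```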
The “classical” data compression formats, widely used in everyday computer work, are the .ZIP and .ARJ formats. Recently, the popular .RAR format has been added to them.
The basic functions that most modern archive managers perform include:
- extracting files from archives;
- creation of new archives;
- adding files to an existing archive;
- creation of self-extracting archives;
- creation of distributed archives on low-capacity media;
- testing the integrity of the archive structure;
- full or partial restoration of damaged archives;
- protection of archives from viewing and unauthorized modification.
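Several of the basic functions listed above (creating an archive, adding a file to an existing archive, testing integrity, extracting data) can be demonstrated with Python's standard zipfile module. This is only a sketch of the operations, not a description of how WinZip or WinRAR work internally; the archive is kept in memory and the file names and contents are invented.

```python
import io
import zipfile

buffer = io.BytesIO()  # the archive lives in memory for this example

# Create a new archive.
with zipfile.ZipFile(buffer, "w", compression=zipfile.ZIP_DEFLATED) as archive:
    archive.writestr("docs/report.txt", "quarterly report\n" * 50)

# Add a file to the existing archive.
with zipfile.ZipFile(buffer, "a", compression=zipfile.ZIP_DEFLATED) as archive:
    archive.writestr("docs/notes.txt", "meeting notes\n")

# Test integrity, list the contents and extract one file.
with zipfile.ZipFile(buffer, "r") as archive:
    damaged = archive.testzip()  # None means every member passed its CRC check
    names = archive.namelist()
    notes = archive.read("docs/notes.txt").decode("ascii")

print(damaged, names, notes)
```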
Self-extracting archives. A self-extracting archive is prepared from a regular archive by attaching a small software module to it. The archive itself receives the name extension .EXE, which is typical for executable files.
Distributed archives. Some managers (for example, WinZip) split the archive directly onto floppy disks, while others (for example, WinRAR and WinArj) let you pre-split the archive into fragments of a given size on the hard drive. The fragments can then be copied to external media.
When creating distributed archives, the WinZip manager has an unpleasant feature: each volume carries files with the same names. As a result, the number of the volume stored on each floppy disk cannot be determined from the file name. The WinArj and WinRAR archive managers give every file of a distributed archive a different name and therefore do not create such problems.
Archive protection. In most cases, archives are protected with a password, which is requested when you try to view, unpack or change the archive.
The additional functions of archive managers are service functions that make work more convenient. They are often implemented by connecting additional external utilities and provide:
- viewing files of various formats without extracting them from the archive;
- searching for files and data inside archives;
- installing programs from archives without preliminary unpacking;
- checking the archive for computer viruses before it is unpacked;
- cryptographic protection of archived information;
- decoding of e-mail messages;
- “transparent” compression of .EXE and .DLL executable files;
- creation of self-extracting multi-volume archives;
- selecting or adjusting the compression ratio.

You can double-click on the folder icon, after which Explorer will launch and show you the contents of the selected folder (see Fig. 21.1).

When you double-click a file's icon, the program associated with that file launches and displays its contents. This is not necessarily the program that created the file. For example, graphic files can be opened with a dedicated viewer rather than with the graphics editor that created them.

When you open a program file, the program starts.

Once you open a folder, you will see its contents in the folder window. You can configure Windows so that each folder opens in its own window. Here's how to do it.

1. In the folder window, select Tools > Folder Options.

The Folder Options dialog box appears.

2. On the General tab, select Open each folder in a separate window.

3. Click OK.

When you're done, don't forget to close all folder windows.

View tree structure

The hardest part about working with folders and files is organizing them into what computer scientists call a tree structure. The tree structure is clearly visible on the left side of the Explorer window. This area of the window is called Folders (see Figure 21.1). If you don't see this list, click the Folders button on the toolbar, or select View > Explorer Bars > Folders from the menu.

Using the mouse, you can quickly find any folder in the tree structure, if, of course, you know where to look for it. After clicking on a folder, its contents are displayed on the right in the window.

By clicking on the “+” (plus) sign next to the corresponding folder, you can see all its subfolders, i.e. branch of a tree structure.

By clicking on the “-” (minus) sign next to a folder, you close the corresponding branch of the tree structure.

How to hide a tree structure

When the Folders panel is closed, the Explorer window displays a list of tasks for files and folders, as shown in Fig. 21.2. This list contains basic operations with files in a given folder, transitions to other directories on the computer, and other similar tasks.

The list of tasks depends on the type of folder you are viewing, the selected file, and its type.

Note that any of the taskbars can be shown or hidden by clicking the arrow icon.

The initial sector of the hard disk contains the master boot record, which is loaded into memory and executed.

The last part of this sector contains the partition table: a table of 4 entries of 16 bytes each. This table is manipulated by the FDISK program (or an equivalent utility on another operating system).

During boot, the ROM BIOS loads the master boot record and transfers control to its code. This code reads the partition table to find the partition marked as active. The boot sector of that partition is then read into memory and executed.

Table 1. Structure of the master boot record and partition table

Table 2. Partition descriptor structure

The partition code is used to determine the presence and location of the primary and extended partitions on the disk. Once the desired partition has been located, its size and coordinates can be extracted from the corresponding descriptor fields. If 0 is written in the partition code field, then the descriptor is considered empty, that is, it does not define any partition on the disk.
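Reading the four descriptors can be sketched as follows. The table offset 1BEh and the order of fields inside a descriptor (activity flag, starting head/sector/cylinder, partition code, ending head/sector/cylinder, relative sector, size in sectors) follow the conventional master boot record layout; since Tables 1 and 2 are not reproduced here, these details are assumptions of the example. The sample sector is built by hand, not read from a disk.

```python
import struct

TABLE_OFFSET = 0x1BE                # assumed: table sits at the end of the 512-byte sector
ENTRY = struct.Struct("<B3sB3sII")  # one 16-byte descriptor, little-endian


def read_partition_table(sector: bytes) -> list:
    partitions = []
    for index in range(4):
        start = TABLE_OFFSET + index * ENTRY.size
        flag, _chs_start, code, _chs_end, rel_sector, size = ENTRY.unpack_from(sector, start)
        if code == 0:  # partition code 0 means the descriptor is empty
            continue
        partitions.append({
            "index": index + 1,
            "active": flag == 0x80,
            "code": f"{code:02X}h",
            "first_sector": rel_sector,
            "sectors": size,
        })
    return partitions


# Hand-made sector with one active FAT16 partition (code 06h).
sector = bytearray(512)
ENTRY.pack_into(sector, TABLE_OFFSET, 0x80, b"\x01\x01\x00", 0x06, b"\xfe\x3f\x7f", 63, 2056257)
print(read_partition_table(bytes(sector)))
```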

Table 3. Microsoft operating system partition codes

Code  Partition type  Size          FAT type  OS
01h   Primary         0-15 MB       FAT12     MS-DOS 2.0
04h   Primary         16-32 MB      FAT16     MS-DOS 3.0
05h   Extended        0-2 GB        -         MS-DOS 3.3
06h   Primary         32 MB-2 GB    FAT16     MS-DOS 4.0
0Bh   Primary         512 MB-2 GB   FAT32     OSR2
0Ch   Extended        512 MB-2 TB   FAT32     OSR2
0Eh   Primary         32 MB-2 GB    FAT16     Windows 95
0Fh   Extended        0-2 GB        -         Windows 95

The following codes are reserved for operating systems of other companies:

  • 02h - CP/M partition;
  • 03h - Xenix partition;
  • 07h - OS/2 partition (HPFS file system).

Notes:

  1. Cylinder and sector numbers occupy 10 and 6 bits, respectively:

    15 14 13 12 11 10  9  8  7  6  5  4  3  2  1  0
     c  c  c  c  c  c  c  c  c  c  s  s  s  s  s  s

    They are arranged so that when CX is loaded with the 16-bit value, it is ready for a call to interrupt INT 13h to read the desired portion of the disk. Thus, after the master boot record has been read into the sect_buf memory area, the code

    CMP byte ptr sect_buf, 80h

    checks whether the first partition is active, and the code

    MOV CX, sect_buf

    loads CX for the INT 13h call that reads the boot sector of partition #1.

  2. The "relative sector" value at offset 08h in each partition descriptor is equivalent to the head, sector and cylinder of the partition's starting address. Relative sector 0 coincides with cylinder 0, head 0, sector 1. The relative sector number increases first with each sector on a head, then with each head, and finally with each cylinder.

    Applicable formula:

    Rel_sec = (#Cyl * sec_per_track * heads) + (#Head * sec_per_track) + (#Sec - 1)

    Partitions start at an even cylinder number, with the exception of the first partition, which can start at cylinder 0, head 0, sector 2 (since sector 1 is occupied by the Master Boot Record).

    When the boot sector of a partition gains control, DS:SI points to the corresponding partition table entry.
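The two notes can be checked with a little arithmetic. The sketch below assumes the usual INT 13h register convention (CH holds the low 8 bits of the cylinder, the top two bits of CL hold its high 2 bits, and the low 6 bits of CL hold the sector number); the simplified bit diagram above does not show this split, so it is an assumption of the example. The second function is the relative sector formula, with the number of sectors on one head's track passed as sectors_per_track. The geometry of 16 heads and 63 sectors is invented.

```python
def unpack_cx(cx: int):
    """Split a 16-bit INT 13h CX value into (cylinder, sector)."""
    ch, cl = cx >> 8, cx & 0xFF
    cylinder = ch | ((cl & 0xC0) << 2)  # 8 low bits in CH, 2 high bits in CL
    sector = cl & 0x3F                  # sectors are numbered from 1
    return cylinder, sector


def relative_sector(cylinder, head, sector, heads, sectors_per_track):
    """Relative sector number of a cylinder/head/sector address."""
    return (cylinder * heads + head) * sectors_per_track + (sector - 1)


print(unpack_cx(0x0001))                  # cylinder 0, sector 1
print(relative_sector(0, 0, 1, 16, 63))   # relative sector 0
print(relative_sector(1, 0, 1, 16, 63))   # first sector of cylinder 1
```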

Boot sector structure

Table 4. Format of the boot sector of a floppy disk or hard disk partition

Offset  Length  Field      Contents
00h     3       JMP xx xx  NEAR jump to the boot code
03h     8                  OEM company name and system version ("I" "B" "M" "4" "." "0")
0Bh     2       SectSiz    number of bytes per sector (always 512); start of BPB
0Dh     1       ClustSiz   number of sectors per cluster
0Eh     2       ResSecs    number of reserved sectors (sectors before FAT #1)
10h     1       FatCnt     number of FAT tables
11h     2       RootSiz    number of 32-byte root directory entries (0 for FAT32)
13h     2       TotSecs    total number of sectors on the media (DOS partition)
15h     1       Media      media type (same as the 1st byte of the FAT)
16h     2       FatSize    number of sectors in one FAT; end of BPB
18h     2       TrkSecs    number of sectors per track
1Ah     2       HeadCnt    number of heads
1Ch     4       HidnSec    number of hidden sectors (used in partition schemes)
20h     4       TotSecs    total number of sectors if the size is > 32 MB
24h     1                  physical disk number (128)
25h     1                  reserved
26h     1                  signature of the extended structure (29h)
27h     4                  volume ID (serial number)
2Bh     11                 volume label (NO NAME)
36h     8                  file system ID (FAT12)
3Eh                        start of the boot code and data

Notes:

  1. Types of storage media:
    • F0h - floppy disk, 2 sides, 18 sectors per track;
    • F8h - hard drive;
    • F9h - floppy disk, 2 sides, 15 sectors per track;
    • FCh - floppy disk, 1 side, 9 sectors per track;
    • FDh - floppy disk, 2 sides, 9 sectors per track;
    • FEh - floppy disk, 1 side, 8 sectors per track;
    • FFh - floppy disk, 2 sides, 8 sectors per track.
  2. Use absolute read INT 25h (DX=0) to read this sector, or:
    • floppy disks: boot sector = BIOS INT 13h head 0, track 0, sector 1;
    • hard disks: read the partition table to obtain the head/track/sector values for BIOS INT 13h.
  3. The BPB (BIOS Parameter Block) is a subset of the data contained in the boot sector (root_sector). The "Build BPB" driver request requires the driver to fill in the block marked in Table 4. BPB length = 13 bytes.
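As an illustration of Table 4, the fixed fields from offset 0Bh to 23h can be unpacked with Python's `struct` module. This is a minimal sketch: the function name `parse_boot_sector`, the key `BigTotSecs` (used here for the second TotSecs field at offset 20h) and the sample values (the usual parameters of a 1.44 MB floppy disk) are assumptions made for the example.

```python
import struct

# Fields from offset 0Bh to 23h of the boot sector: little-endian, no padding.
BPB_FORMAT = "<HBHBHHBHHHII"
BPB_FIELDS = ("SectSiz", "ClustSiz", "ResSecs", "FatCnt", "RootSiz", "TotSecs",
              "Media", "FatSize", "TrkSecs", "HeadCnt", "HidnSec", "BigTotSecs")


def parse_boot_sector(sector):
    """Return the numeric Table 4 fields of a boot sector as a dictionary."""
    values = struct.unpack_from(BPB_FORMAT, sector, 0x0B)
    return dict(zip(BPB_FIELDS, values))


# Hypothetical sample sector carrying the parameters of a 1.44 MB floppy disk.
sample = bytearray(512)
struct.pack_into(BPB_FORMAT, sample, 0x0B,
                 512, 1, 1, 2, 224, 2880, 0xF0, 9, 18, 2, 0, 0)

bpb = parse_boot_sector(bytes(sample))
print(bpb["SectSiz"], bpb["FatSize"], hex(bpb["Media"]))
```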

Floppy disk parameters table

This 11-byte structure is also known as the "Disk Base Table". Its address is stored in the INT 1Eh interrupt vector (a 4-byte address at 0:0078). The table specifies some important variables for floppy disk drives. It is initialized by the ROM BIOS and modified by DOS to improve floppy disk performance.

Table 5. Floppy Disk Parameter Table Format

Offset  Length  Content
00h     1       First byte of the specification:
                bits 0-3 - head unload time;
                bits 4-7 - head step time
01h     1       Second byte of the specification:
                bit 0 - DMA mode flag;
                bits 1-7 - head load time
02h     1       Delay before turning off the motor (in "ticks" of the system clock)
03h     1       Sector size (bytes): 0 - 128, 1 - 256, 2 - 512, 3 - 1024
04h     1       Number of sectors per track
05h     1       Intersector gap length for read/write operations
06h     1       Data area length
07h     1       Intersector gap length for the format operation
08h     1       Fill byte for formatting (usually 0F6h, i.e. "Ў")
09h     1       Head settle time (in milliseconds)
0Ah     1       Motor start time (in 1/8 s)

Hard disk parameters table

The address of this 16-byte structure is stored in the INT 41h interrupt vector (a 4-byte address at 0:0104). The address of the parameters for the second hard drive (if there is one) is stored in the INT 46h vector. These tables define some important variables for hard drive operations.

Table 6. Hard disk table format

Offset  Length  Content
00h     2       Number of cylinders
02h     1       Number of heads
03h     2       Not used (always 0)
05h     2       Precompensation starting cylinder number
07h     1       Maximum ECC block length
08h     1       Control byte:
                bits 0-2 - not used (always 0);
                bit 3 - set if the number of heads is more than 8;
                bit 4 - not used (always 0);
                bit 5 - set if the manufacturer has placed a defect map on the cylinder with the number "maximum working cylinder + 1";
                bit 6 - ECC recheck prohibition;
                bit 7 - ECC control disabled
09h     1       Not used (always 0)
0Ah     1       Not used (always 0)
0Bh     1       Not used (always 0)
0Ch     2       Parking zone cylinder number
0Eh     1       Number of sectors per track
0Fh     1       Reserved

File Allocation Table (FAT)

File size may change over time. If a file could be stored only in adjacent sectors, then whenever the file grew, the OS would have to rewrite it completely to another free area of the disk of suitable size. To simplify and speed up appending new data to a file, operating systems use a File Allocation Table (FAT), which allows a file to be stored in several non-contiguous areas.

When FAT is used, the data area of a logical drive is divided into equally sized areas called clusters. A cluster consists of one or more sectors located sequentially on the disk. The number of sectors in a cluster must be a power of two (2^N) and can take values from 1 to 64 (the cluster size depends on the type of FAT used and the size of the logical disk).

Each cluster is assigned its own FAT element. The first two FAT elements are reserved, so if there are K data clusters on the disk, the number of FAT elements is K+2. The FAT type is determined by the value of K:

  1. if K < 4085 - FAT12 is used;
  2. if 4084 < K < 65525 - FAT16 is used;
  3. if K > 65524 - FAT32 is used.
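The three thresholds above translate directly into code. The following is a minimal sketch; the function name `fat_type` is an assumption made for the example.

```python
def fat_type(data_clusters):
    """Return the FAT type implied by the number K of data clusters."""
    if data_clusters < 4085:
        return "FAT12"
    if data_clusters < 65525:
        return "FAT16"
    return "FAT32"


for k in (4084, 4085, 65524, 65525):
    print(k, fat_type(k))
```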

The names of the FAT types come from the size of the element: a FAT12 element is 12 bits long, a FAT16 element 16 bits, and a FAT32 element 32 bits. Note that in FAT32 the four most significant bits are reserved and are ignored by the OS (that is, only the seven low-order hexadecimal digits of the element are significant).

FAT is a linked list that the OS uses to keep track of the physical location of data on a disk and to find free space for new files.

For each file, the file directory (table of contents) holds the number of the starting FAT element, which corresponds to the first cluster in the file's allocation chain. That FAT element either marks the end of the chain or refers to the next element, and so on. Example:

This diagram illustrates the basic concepts of FAT. It shows that:

  1. MYFILE.TXT occupies 10 clusters. The first cluster is 08h, the last cluster is 1Bh. Cluster chain - 08h, 09h, 0Ah, 0Bh, 15h, 16h, 17h, 19h, 1Ah, 1Bh. Each element points to the next element in the chain, and the last element contains a special code (see Table 7).
  2. Cluster 18h is marked as defective and is not included in the allocation chain.
  3. Clusters 06h, 07h, 0Ch-14h and 1Ch-1Fh are empty and available for allocation.
  4. Another chain begins with cluster 02h and ends with cluster 05h. To find out the file name, you need to find the directory element with the starting cluster number 02h.

Table 7. FAT element values

FAT usually starts at logical sector 1 of the DOS partition (i.e. it can be read by INT 25h with DX=1). In general, first read the boot sector (DX=0) and take the word at offset 0Eh, which gives the number of boot and reserved sectors in front of the FAT. Then use this number (usually 1) as the contents of DX to read the FAT via INT 25h.

There may be multiple copies of FAT. Typically two identical copies are maintained. In these cases, all copies are located directly next to each other.

Comment:

  • It is a common misconception that 16-bit FAT prevents DOS from working with disks larger than 32 megabytes. In fact, the limitation is that INT 25h/26h cannot work with sector numbers greater than 65535. Since the sector size is usually 512 bytes, or half a kilobyte, this dictates the 32-megabyte limit (65536 sectors * 512 bytes). On the other hand, nothing prevents larger sectors from being used, so in theory DOS can work with a disk of any size.

The procedure for reading an element from FAT12 is as follows.

  1. Multiply the cluster number by 3.
  2. Divide the result by 2 (element length is 1.5 (3/2) bytes).
  3. Read a 16-bit word from FAT using the result of the previous operation as the address.
  4. If the element number is even, AND the word read with the mask 0FFFh. If the element number is odd, shift the word right by 4 bits. The result is the value of the FAT element.
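The reading procedure can be sketched in Python as follows. The function name `read_fat12` and the six sample bytes are assumptions made for the example; the word is read in little-endian order, as on the x86 processors DOS runs on.

```python
def read_fat12(fat, cluster):
    """Return the 12-bit FAT element for the given cluster number."""
    offset = (cluster * 3) // 2                  # multiply by 3, divide by 2
    word = fat[offset] | (fat[offset + 1] << 8)  # 16-bit little-endian word
    if cluster % 2 == 0:
        return word & 0x0FFF                     # even element: low 12 bits
    return word >> 4                             # odd element: high 12 bits


# Hypothetical start of a FAT: four elements packed into six bytes.
sample_fat = bytes.fromhex("F0FFFF034000")
elements = [read_fat12(sample_fat, n) for n in range(4)]
print([f"{e:03X}h" for e in elements])
```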

Now let's look at the procedure for writing an element to FAT12.

  1. Multiply the cluster number by 3.
  2. Divide the result by 2 (element length is 1.5 (3/2) bytes).
  3. Read a 16-bit word from FAT using the result of the previous operation as the address.
  4. If the element number is even, AND the word read with the mask 0F000h, then OR the result with the value being written. If the element number is odd, AND the word read with the mask 000Fh, then shift the value being written left by 4 bits and OR it with the result of the previous operation.
  5. Write the resulting 16-bit word back to FAT.
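A minimal Python sketch of the writing procedure is given below; the function name `write_fat12` is an assumption. For an odd element the low four bits of the word belong to the preceding element, so the sketch preserves them with the mask 000Fh; for an even element the high four bits belong to the following element and are preserved with the mask 0F000h.

```python
def write_fat12(fat, cluster, value):
    """Store a 12-bit value in the FAT element for the given cluster."""
    offset = (cluster * 3) // 2
    word = fat[offset] | (fat[offset + 1] << 8)
    if cluster % 2 == 0:
        # Even element: keep the top 4 bits (part of the next element).
        word = (word & 0xF000) | (value & 0x0FFF)
    else:
        # Odd element: keep the low 4 bits (part of the previous element).
        word = (word & 0x000F) | ((value & 0x0FFF) << 4)
    fat[offset] = word & 0xFF
    fat[offset + 1] = word >> 8


table = bytearray(6)
write_fat12(table, 2, 0xABC)
write_fat12(table, 3, 0x123)
print(table.hex())
```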

Comment:

  • A 12-bit element can straddle the boundary between two sectors, so be careful if you are reading one FAT sector at a time.
    16-bit elements are simpler - each element contains the 16-bit number (counted from the beginning of the FAT) of the next element in the chain.
    32-bit elements - each element contains the 32-bit number of the next element in the chain.

In assembly language programs, the shift-and-add algorithm is often used instead of the MUL instruction to multiply by 3: the original number is copied, the copy is shifted left by one bit (multiplying by 2), and then both numbers are added (x + 2x = 3x). Instead of the DIV instruction, a right shift by one bit is used.

The FAT element contains the cluster number, but when working with disks at a low level, the addressable unit of data is the sector, not the cluster.

A floppy disk (or hard disk partition) is structured as follows:

  1. boot and reserved sectors;
  2. FAT#1;
  3. FAT #2;
  4. root directory (does not exist in FAT32);
  5. data area.

Each section in this structure has a variable length, and to correctly convert the cluster number to the sector number, you need to know the length of each such section.

To get the starting sector number of a cluster from the cluster number ClustNum (read from the corresponding field of the directory entry or from the FAT chain), you can use the undocumented DOS function 32h, or read the boot sector and apply the following formulas:

    root_sectors = (RootSiz * 32) / 512
    start_data = ResSecs + (FatSize * FatCnt) + root_sectors
    start_sector = start_data + ((ClustNum - 2) * ClustSiz)

where the values of the variables RootSiz, ResSecs, FatSize, FatCnt and ClustSiz are taken from the boot sector or from the BPB.

Set DX=start_sector before the INT 25h read or INT 26h write operation.
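The three formulas can be checked with a short Python sketch. The function name `cluster_to_sector` is an assumption, and the sample values are the usual parameters of a 1.44 MB floppy disk (1 reserved sector, two FATs of 9 sectors each, 224 root directory elements, 1 sector per cluster), chosen only for illustration.

```python
def cluster_to_sector(clust_num, res_secs, fat_size, fat_cnt, root_siz, clust_siz):
    """Return the logical sector number at which cluster `clust_num` starts."""
    root_sectors = (root_siz * 32) // 512
    start_data = res_secs + fat_size * fat_cnt + root_sectors
    # Cluster numbering starts at 2 because the first two FAT elements are reserved.
    return start_data + (clust_num - 2) * clust_siz


# Hypothetical 1.44 MB floppy disk: the data area starts right after
# 1 boot sector, 18 FAT sectors and 14 root directory sectors.
print(cluster_to_sector(2, res_secs=1, fat_size=9, fat_cnt=2,
                        root_siz=224, clust_siz=1))
```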

File directories

The file directory is an array of 32-byte elements - file descriptors. From the operating system's point of view, all directories (except the root directory in FAT12 and FAT16 systems) look like files and can contain an arbitrary number of entries.

The root directory is the main directory of the disk, from which the subdirectory tree begins. In FAT12 and FAT16, a special fixed-size area (16 KB) in the system area of the logical disk is allocated for the root directory; it is designed to store 512 elements. In a FAT32 system, the root directory is a file of any size.

Table 8. Catalog Item Structure

Offset  Length  Content
00h     11      Short file name
0Bh     1       File attributes
0Ch     1       *Reserved for Windows NT (must contain 0)
0Dh     1       *Field refining the file creation time, in tens of milliseconds (value from 0 to 199)
0Eh     2       *File creation time
10h     2       *File creation date
12h     2       *Date of the last access to the file (to write or read data)
14h     2       *High word of the file's first cluster number
16h     2       Time of the last write operation to the file
18h     2       Date of the last write operation to the file
1Ah     2       Low word of the file's first cluster number
1Ch     4       File size in bytes (32-bit number)

The "*" sign means that the field is processed only in the FAT32 file system. In FAT12 and FAT16 systems, the field is considered reserved and contains the value 0.

The short file name consists of two fields: an 8-byte field containing the file name proper and a 3-byte field containing the extension. If the name entered by the user is shorter than eight characters, it is padded with spaces (space code 20h); an extension shorter than three characters is padded with spaces in the same way.
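The padding rule can be illustrated with a small Python sketch. The function name `to_short_name` is hypothetical; the sketch converts the name to upper case, as DOS does for short names, and does not check for characters that are not allowed in file names.

```python
def to_short_name(filename):
    """Build the 11-byte short name field from a NAME.EXT string."""
    if "." in filename:
        name, ext = filename.rsplit(".", 1)
    else:
        name, ext = filename, ""
    if not name or "." in name or len(name) > 8 or len(ext) > 3:
        raise ValueError("not a valid 8.3 file name: " + filename)
    # Pad both parts with spaces (code 20h) to their fixed widths.
    return (name.upper().ljust(8) + ext.upper().ljust(3)).encode("ascii")


print(to_short_name("readme.txt"))
```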

Some DOS functions require a file attribute byte as a parameter. The bits of the attribute byte are set to 1 if the file has the corresponding property:

  • bit 0 - read only;
  • bit 1 - hidden;
  • bit 2 - system;
  • bit 3 - volume identifier;
  • bit 4 - directory;
  • bit 5 - archived;
  • bits 6 and 7 are reserved (set to 0).
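To show how Table 8 and the attribute bits fit together, the sketch below unpacks one 32-byte directory element with Python's `struct` module. The function name `parse_entry`, the dictionary keys and the sample element are assumptions made for the example.

```python
import struct

# Layout of Table 8: little-endian, 32 bytes in total.
ENTRY_FORMAT = "<11sBBBHHHHHHHI"
ATTRIBUTE_NAMES = ("read-only", "hidden", "system",
                   "volume identifier", "directory", "archived")


def parse_entry(raw):
    """Decode the main fields of a 32-byte directory element."""
    (name, attr, _reserved, _tenths, _ctime, _cdate, _adate,
     cluster_hi, _wtime, _wdate, cluster_lo, size) = struct.unpack(ENTRY_FORMAT, raw)
    return {
        "name": name[:8].decode("ascii").rstrip(),
        "ext": name[8:].decode("ascii").rstrip(),
        "attributes": [text for bit, text in enumerate(ATTRIBUTE_NAMES)
                       if attr & (1 << bit)],
        "first_cluster": (cluster_hi << 16) | cluster_lo,
        "size": size,
    }


# Hypothetical element: MYFILE.TXT, read-only and archived, first cluster 08h.
raw = struct.pack(ENTRY_FORMAT, b"MYFILE  TXT", 0x21,
                  0, 0, 0, 0, 0, 0, 0, 0, 0x0008, 5000)
entry = parse_entry(raw)
print(entry)
```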

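The file creation time field and the time field of the last write operation to the file have the following format: bits 11-15 contain the hours, bits 5-10 the minutes, and bits 0-4 the seconds divided by 2.

The date fields have the following format:

    bits 15-9     bits 8-5   bits 4-0
    year - 1980   month      day

When files are created, dates are counted from the beginning of the MS-DOS era, i.e. from 01/01/1980. Bits 9-15 contain the year number minus 1980 (valid values from 0 to 127).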
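A minimal Python sketch of decoding these 16-bit fields is given below. The year bits are as described above; the remaining field boundaries (month in bits 5-8, day in bits 0-4, hours in bits 11-15, minutes in bits 5-10, seconds divided by 2 in bits 0-4) follow the standard FAT layout. The function names and the sample words are assumptions.

```python
def decode_date(word):
    """Return (year, month, day) from a 16-bit FAT date field."""
    return 1980 + (word >> 9), (word >> 5) & 0x0F, word & 0x1F


def decode_time(word):
    """Return (hours, minutes, seconds) from a 16-bit FAT time field."""
    return word >> 11, (word >> 5) & 0x3F, (word & 0x1F) * 2


# Hypothetical sample words.
print(decode_date(0x1F18))
print(decode_time(0x645C))
```

Because only five bits are available for seconds, the time is stored with a resolution of two seconds.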

Long file names

Starting with Windows 95, a file can be assigned a so-called long name in addition to its short name. The long name is stored in free directory elements adjacent to the main element, the file descriptor. Ones in bits 0-3 of the attribute byte indicate that a directory element is used to store a portion of a long file name (this combination is not possible for file and directory descriptors). Short and long file names are unique, i.e. must not appear twice in the same directory.

A long name is written not in ASCII characters but in Unicode, where each national alphabet has a corresponding set of codes. The price of Unicode's universality is a lower storage density: each character occupies two bytes (a 16-bit word). The long name is cut into fragments, which are written to these directory elements (see Table 9).

Table 9. Structure of a directory element storing a fragment of a long file name

The long name is written to the directory first, with the fragments placed in reverse order, starting with the last one:

All directories except the root directory contain special links in their first two elements instead of file descriptors. Element #0 contains a pointer to the directory itself, and its name field contains a single dot ("."). Element #1 contains a pointer to the parent directory, and its name field contains two dots (".."). If the FAT reference in element #1 is zero, the current directory is located in the root directory.

The disk information block is built by the undocumented DOS function 32h.

All the information it contains can be obtained by reading the boot sector and calling a number of other OS functions, with some calculations, but the information block is useful because it holds all the data together. This is the only call that returns the address of the device driver header.

Table 10. Disk information block diagram

Offset  Length  Content
00h     1       Drive number (0=A, 1=B, etc.)
01h     1       Subunit number from the device header (one driver can manage multiple drives)
02h     2       Sector size in bytes
04h     1       Number of sectors per cluster minus 1 (highest sector number within a cluster)
05h     1       Cluster-to-sector shift (sectors per cluster as a power of two: 2 for 4, 3 for 8)
06h     2       Number of reserved sectors (boot sectors; also the number of the first FAT sector)
08h     1       Number of FAT tables
09h     2       Maximum number of elements in the root directory
0Bh     2       Sector number of cluster #2 (the 1st data cluster)
0Dh     2       Total number of clusters + 2 (highest cluster number)
0Fh     1       Number of sectors occupied by one FAT
10h     2       Sector number of the start of the root directory
12h     4       Device header address
16h     1       Media descriptor byte
17h     1       Access flag: 0 if the device has been accessed
18h     4       Address of the next disk information block (0FFFFh if the block is the last one)

Opening mode bit flags:

  1. Bits 0-2: access rights of the process on the network:
    000 - read; 001 - write; 010 - read and write.
  2. Bits 4-6: sharing mode:
    000 - compatibility mode;
    001 - exclusive access to the file;
    010 - deny write;
    011 - deny read;
    100 - deny none.
  3. Bit 7: inheritance:
    1 - the file is private to this process; 0 - the file is inherited by child processes.

If the file attribute byte indicates read-only, it overrides these flags.

The network permissions and sharing mode bits only have an effect when the SHARE program is installed.
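The opening mode byte described above can be assembled and taken apart with a few bit operations. The following Python sketch is only an illustration; the names `make_open_mode` and `split_open_mode` and the dictionary keys are assumptions.

```python
ACCESS = {"read": 0b000, "write": 0b001, "read_write": 0b010}
SHARING = {"compatibility": 0b000, "exclusive": 0b001, "deny_write": 0b010,
           "deny_read": 0b011, "deny_none": 0b100}


def make_open_mode(access, sharing="compatibility", private=False):
    """Combine access rights (bits 0-2), sharing mode (bits 4-6) and bit 7."""
    return ACCESS[access] | (SHARING[sharing] << 4) | (0x80 if private else 0)


def split_open_mode(mode):
    """Return (access bits, sharing bits, private flag) of an opening mode byte."""
    return mode & 0x07, (mode >> 4) & 0x07, bool(mode & 0x80)


mode = make_open_mode("read_write", "deny_write")
print(f"{mode:02X}h", split_open_mode(mode))
```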