Binary coding of text information. Information and information processes. Presentation on the binary coding of information

Slide 1

Slide 2

The concept of “information” and properties of information; Measuring information. Alphabetical approach; Measuring information. Content approach; Presentation and coding of information; Representation of numerical information using number systems; Translation of numbers in positional number systems; Arithmetic operations in positional number systems; Representation of numbers in a computer; Binary coding of information; Storage of information

Slide 3

The concept of “information” and properties of information

The concept of “information” Information in philosophy Information in physics Information in biology Properties of information

Slide 4

What is information?

The word “information” comes from the Latin word informatio, which translates as explanation, presentation. The concept of “information” is fundamental in the computer science course; it cannot be defined through other, simpler concepts.

Slide 5

In the simplest everyday understanding, the term “information” is usually associated with facts, data, and knowledge. Information is transmitted in the form of messages that determine its form and presentation. Examples of messages are a piece of music, a TV show, text printed on a printer, etc. It is assumed that there is a source of information and a recipient of information. A message from the source to the recipient is transmitted through some medium that serves as a communication channel (Fig. 1). The concept of “information” is used in various sciences.

Slide 6

Information in philosophy

Student message

Slide 7

Slide 8

Slide 9

Information Properties

Man is a social being: in order to communicate with other people, he must exchange information with them, and the exchange of information is always carried out in a certain language (Russian, English, etc.). The participants in a discussion must speak the language in which the communication is conducted; then the information will be understandable to all participants in the exchange. The information must be useful; then the discussion acquires practical value. Useless information creates information noise, which makes it difficult to perceive useful information.

Slide 10

The term “mass media” is widely known: the mass media bring information to every member of society. Such information must be reliable and up-to-date. False information misleads members of society and can cause social unrest. Outdated information is useless, which is why no one except historians reads last year's newspapers. In order for a person to navigate the surrounding world correctly, information must be complete and accurate. Obtaining complete and accurate information is the task of science. Mastering scientific knowledge in the learning process allows a person to obtain complete and accurate information about nature, society and technology.

Slide 11

Measuring information. Alphabetical approach

The alphabetic approach is used to measure the amount of information in a text represented as a sequence of characters from some alphabet. This approach is not related to the content of the text. The amount of information in this case is called the information volume of the text, which is proportional to the size of the text - the number of characters that make up the text. This approach to measuring information is sometimes called the volumetric approach.

Slide 12

Each character of the text carries a certain amount of information, called the information weight of the symbol. The information volume of the text is therefore equal to the sum of the information weights of all the characters that make up the text: $I = i_1 + i_2 + \dots + i_K$ (1). Here it is assumed that the text is a sequential chain of numbered characters. In formula (1), $i_1$ denotes the information weight of the first character of the text, $i_2$ the information weight of the second character, and so on; $K$ is the text size, i.e. the total number of characters in the text.

Slide 13

The entire set of different symbols used to write texts is called the alphabet. The size of the alphabet is an integer called the power of the alphabet. It should be borne in mind that the alphabet includes not only the letters of a certain alphabet, but all other symbols that can be used in the text: numbers, punctuation marks, various brackets. Determining the information weights of characters can occur in two approximations: under the assumption of equal probability (equal frequency of occurrence) of any character in the text; taking into account the different probabilities (different frequency of occurrence) of various characters in the text.

Slide 14

Approximation of equal probability of characters in text

If we assume that all characters of the alphabet appear in any text with the same frequency, then the information weight of all characters will be the same. The share of any character in the text is then 1/N of the text, where N is the power of the alphabet. By the definition of probability, this value is equal to the probability of the character appearing in each text position: p = 1/N.

Slide 15

From the standpoint of the alphabetical approach to measuring information, 1 bit is the information weight of a symbol from the binary alphabet. A larger unit of information is the byte: 1 byte is the information weight of a character from an alphabet with a power of 256 (1 byte = 8 bits). To represent texts stored and processed in a computer, an alphabet of 256 symbols is most often used, so 1 character of such text “weighs” 1 byte. 1 KB (kilobyte) = $2^{10}$ bytes = 1024 bytes; 1 MB (megabyte) = $2^{10}$ KB = 1024 KB; 1 GB (gigabyte) = $2^{10}$ MB = 1024 MB.
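As a rough illustration of the alphabetical approach, the sketch below computes the information volume of a text under the equal-probability assumption. The function names are hypothetical, and taking the weight of one symbol as $\log_2 N$ is an assumption that agrees with the 1-bit (N = 2) and 1-byte (N = 256) cases on the slide.

```python
import math


def symbol_weight_bits(alphabet_power: int) -> float:
    """Information weight of one symbol when all symbols are equally likely."""
    return math.log2(alphabet_power)


def text_volume_bits(text: str, alphabet_power: int = 256) -> float:
    """Information volume: number of characters times the weight of one character."""
    return len(text) * symbol_weight_bits(alphabet_power)


bits = text_volume_bits("binary code")  # 11 characters, 8 bits each
print(bits, "bits =", bits / 8, "bytes")  # 88.0 bits = 11.0 bytes
```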

Slide 16

Approximation of different probabilities of characters in text

This approximation takes into account that in real text different characters occur with different frequencies. It follows that the probabilities of different characters appearing in a given position of the text are different and, therefore, their information weights are different. Statistical analysis of Russian texts shows that the frequency of occurrence of the letter “o” is 0.09. This means that for every 100 characters, the letter “o” appears on average 9 times. The same number gives the probability of the letter “o” appearing in a given position in the text: $p_o = 0.09$. It follows that the information weight of the letter “o” in a Russian text is $\log_2 (1/0.09) \approx 3.47393$ bits.
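The figure for the letter “o” can be checked with a few lines of code. This is a minimal sketch; the function name is hypothetical, and it simply evaluates $\log_2 (1/p)$ for a given frequency.

```python
import math


def information_weight(p: float) -> float:
    """Information weight (in bits) of a symbol that occurs with probability p."""
    return math.log2(1 / p)


print(round(information_weight(0.09), 5))  # 3.47393
```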

Slide 17

Measuring information. Content approach

From the standpoint of the content approach to measuring information, the question is how much information a message received by a person contains. The following situation is considered: 1) a person receives a message about some event; the uncertainty of the person’s knowledge about the expected event is known in advance. Uncertainty of knowledge can be expressed either by the number of possible variants of the event or by the probabilities of the expected variants;

Slide 18

2) as a result of receiving the message, the uncertainty of knowledge is removed: one of the possible options has been chosen; 3) the amount of information in the received message, expressed in bits, is calculated by a formula. The formula used depends on the situation, of which there can be two: (a) all possible options for the event are equally probable, and their number is finite and equal to N; (b) the probabilities of the possible variants of the event are different and known in advance: $p_i$, i = 1..N. Here, as before, N is the number of possible options for the event.

Equally probable events

Unequally probable events

Slide 19

If we denote by the letter i the amount of information in the message that one of N equally probable events has occurred, then the values i and N are related to each other by Hartley’s formula: $2^i = N$ (1). The value i is measured in bits. This leads to the following conclusion: 1 bit is the amount of information in a message about one of two equally probable events. Hartley's formula is an exponential equation. If i is the unknown quantity, then the solution to equation (1) is: $i = \log_2 N$ (2)

Slide 20

Task. How much information does the message that the queen of spades was drawn from a deck of cards contain? Solution: the deck has 32 cards. In a shuffled deck, drawing any card is an equally probable event. If i is the amount of information in the message that a specific card (the queen of spades) was drawn, then from Hartley’s equation: $2^i = 32 = 2^5$. Hence: i = 5 bits

Slide 21

Task. How much information does the message that the face with the number 3 came up on a six-sided die contain? Solution: considering any face coming up to be an equally probable event, we write Hartley’s formula: $2^i = 6$. Hence: $i = \log_2 6 \approx 2.585$ bits
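Both tasks can be reproduced with a short sketch of Hartley's formula. The function name is hypothetical; the numbers of outcomes (32 cards, 6 faces) come from the tasks above.

```python
import math


def hartley_bits(n_outcomes: int) -> float:
    """Amount of information (bits) in a message about one of N equally probable events."""
    return math.log2(n_outcomes)


print(hartley_bits(32))           # 5.0   (queen of spades from a 32-card deck)
print(round(hartley_bits(6), 3))  # 2.585 (one face of a six-sided die)
```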

Slide 22

If the probability of some event is p, and i (bits) is the amount of information in the message that this event occurred, then these quantities are related to each other by the formula: $2^i = 1/p$ (*). Solving the exponential equation (*) for i, we get: $i = \log_2 (1/p)$ (**). Formula (**) was proposed by C. Shannon and is therefore called Shannon’s formula

Slide 23

Presentation and coding of information

1. Language as a sign system 2. Representation of information in living organisms 3. Coding of information

Slide 24

Language as a sign system

Language is a specific system of symbolic representation of information. “Language is a set of symbols and a set of rules that determine how to compose meaningful messages from these symbols” (dictionary of school computer science). Since a meaningful message is information, the two definitions coincide.

Diagram “Language of computer science”: LANGUAGE is divided into natural and formal.

Slide 25

Natural languages

Historically developed languages of national speech. Most modern languages are characterized by the presence of oral and written forms of speech. The analysis of natural languages is largely the subject of philological sciences, in particular linguistics. In computer science, natural language analysis is carried out by specialists in the field of artificial intelligence. One of the goals of developing a fifth-generation computer project is to teach the computer to understand natural languages.

Slide 26

Formal languages

Artificially created languages for professional use. They are usually international in nature and in written form. Examples of such languages are mathematics, the language of chemical formulas, and musical notation. Formal languages are characterized by belonging to a limited subject area. The purpose of a formal language is an adequate description of the system of concepts and relationships characteristic of a given subject area.

Slide 27

The following concepts are associated with any language: the alphabet is the set of symbols used; syntax is the rules for writing language constructions; semantics is the meaning of language constructions; pragmatics is the practical consequences of applying a text in the given language. Natural languages are not limited in their application; in this sense, they can be called universal. However, it is not always convenient to use only natural language in highly specialized areas. In such cases, people resort to formal languages. There are examples of languages that are in an intermediate state between natural and formal. Esperanto was created artificially for communication between people of different nationalities. Latin, in our time, has become the formal language of medicine and pharmacology, having lost its function as a spoken language.

Slide 28

Representation of information in living organisms

A person perceives information about the world around him using his senses. Sensitive nerve endings of the sense organs perceive the impact and transmit it to neurons, the circuits of which make up the nervous system. A neuron can be in one of two states: non-excited and excited. An excited neuron generates an electrical impulse that is transmitted throughout the nervous system. The state of a neuron (no impulse, there is an impulse) can be considered as signs of a certain alphabet of the nervous system, with the help of which information is transmitted.

Slide 29

Genetic information largely determines the structure and development of living organisms and is inherited. Genetic information is stored in the cells of organisms in the structure of DNA (deoxyribonucleic acid) molecules. The DNA molecule consists of two chains twisted together into a spiral, built from four nucleotides: A, G, T, C, which form the genetic alphabet. The human DNA molecule includes about 3 billion nucleotide pairs and therefore all information about the human body is encoded in it: its appearance, health or susceptibility to disease, abilities.

Slide 30

Encoding information

The presentation of information occurs in various forms in the process of perception of the environment by living organisms and humans, in the processes of information exchange between man and man, man and computer, computer and computer, and so on. Transforming information from one form of representation to another is called encoding. The entire set of symbols used for encoding is called the encoding alphabet. For example, in computer memory, any information is encoded using a binary alphabet containing only two characters: 0 and 1.

Slide 31

In the process of exchanging information, it is often necessary to perform operations of encoding and decoding information. When you enter an alphabet character into a computer by pressing the corresponding key on the keyboard, the character is encoded, that is, it is converted into computer code. When a character is displayed on a monitor screen or printer, the reverse process occurs - decoding, when from computer code the sign is converted into its graphic image.

Slide 32

Representing numerical information using number systems

Number system Decimal number system Binary number system Positional number systems with arbitrary base

Slide 33

Notation

Numbers are used to record information about the number of objects. Numbers are written using special sign systems called number systems. A number system is a way of representing numbers and the corresponding rules for operating numbers. The various number systems that existed in the past and that are used today can be divided into non-positional and positional. The signs used to write numbers are called digits.

Slide 34

Non-positional number systems

In non-positional number systems, the meaning of a digit does not depend on its position in the number. An example of a non-positional number system is the Roman system (Roman numerals). In the Roman system, Latin letters are used as digits:

I = 1, V = 5, X = 10, L = 50, C = 100, D = 500, M = 1000

In Roman numerals, digits are written from left to right in descending order, and their values are added together. If a smaller digit stands to the left of a larger one, the smaller value is subtracted from the larger.

Slide 35

Slide 36

Slide 37

MCMXCVIII = 1000 + (-100 + 1000) + (-10 + 100) + 5 + 1 + 1 + 1 = 1998
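The add-or-subtract rule for Roman numerals can be sketched in code as follows. The function name is hypothetical, and the sketch assumes the input is a well-formed numeral (it does not validate it).

```python
ROMAN = {"I": 1, "V": 5, "X": 10, "L": 50, "C": 100, "D": 500, "M": 1000}


def roman_to_int(numeral: str) -> int:
    """Add digit values; a smaller digit standing before a larger one is subtracted."""
    total = 0
    for i, ch in enumerate(numeral):
        value = ROMAN[ch]
        if i + 1 < len(numeral) and value < ROMAN[numeral[i + 1]]:
            total -= value
        else:
            total += value
    return total


print(roman_to_int("MCMXCVIII"))  # 1998
```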

Slide 38

Positional number systems

The first positional number system was invented in Ancient Babylon; Babylonian numbering was sexagesimal, that is, it used sixty digits. We still use base 60 when measuring time. In the 19th century, the duodecimal number system became quite widespread. We still often count in dozens: there are two dozen hours in a day, a circle contains thirty dozen degrees, and so on. In positional number systems, the value denoted by a digit in the notation of a number depends on its position. The number of digits used is called the base of the positional number system.

Slide 39

The most common positional number systems today are decimal, binary, octal, and hexadecimal. In positional number systems, the base of the system is equal to the number of digits (signs in its alphabet) and determines how many times the values of identical digits in adjacent positions of the number differ.

Slide 40

Decimal number system

Let's take the decimal number 555 as an example. The digit 5 appears three times: the rightmost 5 represents five units, the second from the right five tens, and the third from the right five hundreds. The position of a digit in a number is called its place (digit position). Places are numbered from right to left, from low-order to high-order. The number 555 is the collapsed form of writing the number. In the expanded form, the multiplication of each digit by the corresponding power of 10 is written explicitly. Thus, $555_{10} = 5 \cdot 10^2 + 5 \cdot 10^1 + 5 \cdot 10^0$.

Slide 41

In general, in the decimal number system, the number $A_{10}$, which contains n integer digits and m fractional digits, is written as: $A_{10} = a_{n-1} \cdot 10^{n-1} + \dots + a_0 \cdot 10^0 + a_{-1} \cdot 10^{-1} + \dots + a_{-m} \cdot 10^{-m}$. The coefficients $a_i$ in this notation are the digits of the decimal number, which in collapsed form is written as: $A_{10} = a_{n-1} a_{n-2} \dots a_0 . a_{-1} \dots a_{-m}$. From these formulas it is clear that multiplying or dividing a decimal number by 10 (the base value) moves the decimal point separating the integer part from the fractional part one place to the right or left, respectively.

Slide 42

Binary number system

In the binary number system, the base is 2, and the alphabet consists of two digits (0 and 1). Consequently, numbers in the binary system in expanded form are written as a sum of powers of base 2 with coefficients, which are the digits 0 or 1. For example, the expanded notation of a binary number may look like this:

Slide 43

In general, in the binary system, the number $A_2$, which contains n integer digits and m fractional digits, is written as: $A_2 = a_{n-1} \cdot 2^{n-1} + \dots + a_0 \cdot 2^0 + a_{-1} \cdot 2^{-1} + \dots + a_{-m} \cdot 2^{-m}$. Collapsed form of a binary number: $A_2 = a_{n-1} a_{n-2} \dots a_0 . a_{-1} \dots a_{-m}$. From these formulas it is clear that multiplying or dividing a binary number by 2 (the base value) moves the point separating the integer part from the fractional part one place to the right or left, respectively.

Slide 44

Positional number systems with arbitrary base

It is possible to use a variety of positional number systems whose base is equal to or greater than 2. In a number system with base q (q-ary number system), numbers in expanded form are written as a sum of powers of the base q with coefficients, which are the digits 0, 1, ..., q-1: $A_q = a_{n-1} \cdot q^{n-1} + \dots + a_0 \cdot q^0 + a_{-1} \cdot q^{-1} + \dots + a_{-m} \cdot q^{-m}$. The coefficients $a_i$ in this notation are the digits of the number written in the q-ary number system.

Slide 45

So, in the octal system the base is eight (q=8). The octal number $A_8 = 673.2_8$, written in collapsed form, looks like this in expanded form: $A_8 = 6 \cdot 8^2 + 7 \cdot 8^1 + 3 \cdot 8^0 + 2 \cdot 8^{-1}$. In the hexadecimal system the base is sixteen (q=16), so the hexadecimal number $A_{16} = 8A.F_{16}$, written in collapsed form, looks like this in expanded form: $A_{16} = 8 \cdot 16^1 + A \cdot 16^0 + F \cdot 16^{-1}$. If we express the hexadecimal digits through their decimal values, the number takes the form: $A_{16} = 8 \cdot 16^1 + 10 \cdot 16^0 + 15 \cdot 16^{-1}$

Slide 46

Translation of numbers in positional number systems

Converting numbers to the decimal number system Converting numbers from the decimal system to binary, octal and hexadecimal Converting numbers from the binary number system to octal and hexadecimal and vice versa

Slide 47

Converting numbers to the decimal number system

Converting numbers in binary, octal, and hexadecimal to decimal is fairly easy. To do this, you need to write down the number in expanded form and calculate its value Converting a number from binary to decimal Converting numbers from octal to decimal Converting numbers from hexadecimal to decimal

Slide 48

Converting a number from binary to decimal

$10.11_2 = 1 \cdot 2^1 + 0 \cdot 2^0 + 1 \cdot 2^{-1} + 1 \cdot 2^{-2} = 2.75_{10}$. Convert the following numbers to the decimal system: $101_2$, $110_2$, $101.01_2$

Slide 49

Converting numbers from octal to decimal

$67.5_8 = 6 \cdot 8^1 + 7 \cdot 8^0 + 5 \cdot 8^{-1} = 55.625_{10}$. Convert the following numbers to the decimal system: $7_8$, $11_8$, $22_8$, $34.12_8$

Slide 50

Converting numbers from hexadecimal to decimal

$19F_{16}$ (F = 15): $19F_{16} = 1 \cdot 16^2 + 9 \cdot 16^1 + 15 \cdot 16^0 = 415_{10}$. Convert the following numbers to the decimal system: $1A_{16}$, $BF_{16}$, $9C.15_{16}$
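Conversion to decimal through the expanded form can be sketched as a short function. The name `to_decimal` and the digit string are illustrative choices, and the sketch assumes the point is used as the separator and that every digit is valid for the given base.

```python
DIGITS = "0123456789ABCDEF"


def to_decimal(number: str, base: int) -> float:
    """Evaluate the expanded form: each digit times the base raised to its position."""
    integer_part, _, fraction_part = number.upper().partition(".")
    value = 0.0
    for power, digit in enumerate(reversed(integer_part)):
        value += DIGITS.index(digit) * base ** power
    for power, digit in enumerate(fraction_part, start=1):
        value += DIGITS.index(digit) * base ** -power
    return value


print(to_decimal("10.11", 2))  # 2.75
print(to_decimal("67.5", 8))   # 55.625
print(to_decimal("19F", 16))   # 415.0
```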

Slide 51

Converting numbers from decimal to binary, octal and hexadecimal

Converting numbers from decimal to binary, octal and hexadecimal is more complex and can be done in different ways. Let's consider one of the conversion algorithms using the example of converting numbers from the decimal system to the binary system. Note that the algorithms for converting integers and proper fractions differ. Algorithm for converting decimal integers into the binary number system. Algorithm for converting proper decimal fractions into the binary number system. Converting numbers from base p to base q

Slide 52

Algorithm for converting integer decimal numbers to binary number system

Successively divide the original decimal integer and the resulting integer quotients by the base of the system until you get a quotient that is less than the divisor, that is, less than 2. Write down the last quotient followed by the resulting remainders in reverse order.

Slide 53

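Convert decimal number 19 to the binary number system

19 : 2 = 9, remainder 1
9 : 2 = 4, remainder 1
4 : 2 = 2, remainder 0
2 : 2 = 1, remainder 0

The last quotient is 1. Writing it first and then the remainders in reverse order gives $19_{10} = 10011_2$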
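The division algorithm for integers can be sketched as below. The function name is hypothetical. Unlike the slide's wording, this version keeps dividing until the quotient reaches zero, which yields the same digits because the final quotient of 1 simply becomes the last remainder.

```python
def int_to_binary(n: int) -> str:
    """Convert a non-negative decimal integer to binary by repeated division by 2."""
    if n == 0:
        return "0"
    remainders = []
    while n > 0:
        n, remainder = divmod(n, 2)
        remainders.append(str(remainder))
    # Remainders are produced from the lowest digit up, so reverse them.
    return "".join(reversed(remainders))


print(int_to_binary(19))  # 10011
```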

Slide 54

Algorithm for converting proper decimal fractions into the binary number system.

Successively multiply the original decimal fraction and the resulting fractional parts of the products by the base of the system (by 2) until a zero fractional part is obtained or the required accuracy is reached. Write down the resulting integer parts of the products in the order in which they were obtained.

Slide 55

Convert $0.75_{10}$ to the binary number system

0.75 × 2 = 1.5, integer part 1
0.5 × 2 = 1.0, integer part 1

$A_2 = 0.a_{-1}a_{-2} = 0.11_2$
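The multiplication algorithm for proper fractions can be sketched in the same spirit. The function name and the `max_digits` cutoff (standing in for "the required accuracy") are assumptions made for illustration.

```python
def fraction_to_binary(x: float, max_digits: int = 16) -> str:
    """Convert a proper decimal fraction (0 <= x < 1) to binary by repeated doubling."""
    digits = []
    while x != 0 and len(digits) < max_digits:
        x *= 2
        whole = int(x)          # integer part of the product is the next binary digit
        digits.append(str(whole))
        x -= whole              # continue with the fractional part only
    return "0." + ("".join(digits) or "0")


print(fraction_to_binary(0.75))  # 0.11
```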

Slide 56

Converting numbers from base p to base q

The conversion of numbers from a positional system with an arbitrary base p to a system with base q is carried out using algorithms similar to those discussed above. Let's consider the algorithm for converting integers using the example of converting the decimal integer $424_{10}$ to the hexadecimal system, that is, from a number system with base p=10 to a number system with base q=16. When executing the algorithm, note that all operations must be carried out in the original number system (in this case, decimal), and the resulting remainders must be written in the digits of the new number system (in this case, hexadecimal).

Slide 57

Let us now consider the algorithm for converting fractional numbers using the example of converting the decimal fraction $A_{10} = 0.625$ into the octal system, that is, from a number system with base p=10 to a number system with base q=8. Numbers containing both integer and fractional parts are converted in two stages: the integer part and the fractional part are each converted separately using the appropriate algorithm. In the final notation of the resulting number, the integer part is separated from the fractional part by a point.
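The two examples from these slides (424 to hexadecimal and 0.625 to octal) can be reproduced with a generalized sketch. The function names, the digit string and the `max_digits` limit are illustrative assumptions; the arithmetic is done in decimal and only the remainders are written as base-q digits, as the text requires.

```python
DIGITS = "0123456789ABCDEF"


def int_to_base(n: int, q: int) -> str:
    """Convert a non-negative decimal integer to base q (2 <= q <= 16)."""
    if n == 0:
        return "0"
    out = []
    while n > 0:
        n, remainder = divmod(n, q)    # arithmetic is carried out in decimal
        out.append(DIGITS[remainder])  # the remainder is written as a base-q digit
    return "".join(reversed(out))


def fraction_to_base(x: float, q: int, max_digits: int = 8) -> str:
    """Convert a proper decimal fraction to base q by repeated multiplication."""
    out = []
    while x != 0 and len(out) < max_digits:
        x *= q
        whole = int(x)
        out.append(DIGITS[whole])
        x -= whole
    return "0." + ("".join(out) or "0")


print(int_to_base(424, 16))        # 1A8
print(fraction_to_base(0.625, 8))  # 0.5
```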

Slide 58

Converting numbers from binary to octal and hexadecimal and vice versa

Converting numbers between number systems whose bases are powers of 2 ($q = 2^n$) can be done using simpler algorithms. Such algorithms can be used to convert numbers between the binary ($q = 2^1$), octal ($q = 2^3$) and hexadecimal ($q = 2^4$) number systems. Converting numbers from binary to octal. Converting numbers from binary to hexadecimal. Converting numbers from octal and hexadecimal number systems to binary.

Slide 59

Converting numbers from binary to octal.

To write binary numbers, two digits are used, that is, each digit position of the number has 2 possible values. We solve the exponential equation $2 = 2^I$. Since $2 = 2^1$, I = 1 bit. Each digit of a binary number contains 1 bit of information. To write octal numbers, eight digits are used, that is, each digit position of the number has 8 possible values. We solve the exponential equation $8 = 2^I$. Since $8 = 2^3$, I = 3 bits. Each digit of an octal number contains 3 bits of information.

Slide 60

So, to convert a binary integer to octal, you need to break it into groups of three digits, from right to left, and then convert each group to an octal digit. If the last (leftmost) group contains fewer than three digits, it must be padded on the left with zeros. Let's convert the binary number $101001_2$ into octal in this way: $101\ 001_2 = 51_8$. To simplify the conversion, you can use the table for converting binary triads (groups of 3 digits) into octal digits.

Slide 61

To convert a fractional binary number (proper fraction) into octal, you need to break it into triads from left to right (not counting the zero before the point) and, if the last (rightmost) group contains fewer than three digits, pad it with zeros on the right. Then replace the triads with octal digits. For example, let's convert the fractional binary number $A_2 = 0.110101_2$ into the octal number system: the triads 110 and 101 give $0.65_8$
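The triad method for both the integer and the fractional part can be sketched as follows. The function name is hypothetical, and the sketch assumes the input contains only the characters 0, 1 and at most one point.

```python
def binary_to_octal(binary: str) -> str:
    """Convert a binary string to octal by grouping bits into triads."""
    integer_part, _, fraction_part = binary.partition(".")
    # Pad the integer part on the left and the fraction on the right to whole triads.
    integer_part = "0" * (-len(integer_part) % 3) + integer_part
    fraction_part = fraction_part + "0" * (-len(fraction_part) % 3)

    def triads_to_digits(bits: str) -> str:
        return "".join(str(int(bits[i:i + 3], 2)) for i in range(0, len(bits), 3))

    result = triads_to_digits(integer_part) or "0"
    if fraction_part:
        result += "." + triads_to_digits(fraction_part)
    return result


print(binary_to_octal("101001"))    # 51
print(binary_to_octal("0.110101"))  # 0.65
```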

Slide 62

Converting numbers from binary to hexadecimal

To write hexadecimal numbers, sixteen digits are used, that is, each digit position of the number has 16 possible values. We solve the exponential equation $16 = 2^I$. Since $16 = 2^4$, I = 4 bits. Each digit of a hexadecimal number contains 4 bits of information.

Slide 63

Thus, to convert a binary integer to hexadecimal, it must be divided into groups of four digits (tetrads), from right to left, and if the last (leftmost) group contains fewer than four digits, it must be padded on the left with zeros. To convert a fractional binary number (proper fraction) to hexadecimal, you need to divide it into tetrads from left to right (not counting the zero before the point) and, if the last (rightmost) group contains fewer than four digits, pad it with zeros on the right. Then replace the tetrads with hexadecimal digits. Conversion table for tetrads to hexadecimal digits
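A tetrad-to-digit conversion table can be generated rather than typed by hand. The sketch below builds such a table and uses it for binary integers only; the names `TETRAD_TABLE` and `binary_to_hex` are illustrative.

```python
# Map every 4-bit group to its hexadecimal digit: "0000" -> "0", ..., "1111" -> "F".
TETRAD_TABLE = {format(d, "04b"): format(d, "X") for d in range(16)}


def binary_to_hex(binary: str) -> str:
    """Convert a binary integer string to hexadecimal by grouping bits into tetrads."""
    padded = "0" * (-len(binary) % 4) + binary  # pad on the left to whole tetrads
    return "".join(TETRAD_TABLE[padded[i:i + 4]] for i in range(0, len(padded), 4))


print(binary_to_hex("101001"))  # 29
```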

Slide 64

Converting numbers from octal and hexadecimal number systems to binary

To convert numbers from octal and hexadecimal number systems to binary, you need to convert the digits of the number into groups of binary digits. To convert from octal to binary, each digit of a number must be converted into a group of three binary digits (triad), and when converting a hexadecimal number, into a group of four digits (tetrad).
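The reverse direction replaces every digit with its triad or tetrad. This is a minimal sketch for integers; the function name is hypothetical, and the groups are separated by spaces only to make them visible.

```python
def to_binary(number: str, base: int) -> str:
    """Expand each octal digit into a triad or each hexadecimal digit into a tetrad."""
    width = {8: 3, 16: 4}[base]
    return " ".join(format(int(digit, base), f"0{width}b") for digit in number)


print(to_binary("51", 8))   # 101 001
print(to_binary("8A", 16))  # 1000 1010
```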

Slide 71

Representing numbers in fixed point format

Integers are stored in computer memory in fixed-point format. In this case, each bit of the memory cell always corresponds to the same digit position of the number, and the “point” is “located” to the right of the least significant digit, that is, outside the bit grid. One memory cell (8 bits) is allocated to store non-negative integers. For example, the number $A_2 = 11110000_2$ will be stored in a memory cell as follows:

Slide 72

The maximum value of a non-negative integer is reached when all bits of the cell contain ones. For an n-bit representation it equals $2^n - 1$. Let us determine the range of numbers that can be stored in RAM in the format of non-negative integers. The minimum number corresponds to eight zeros stored in the eight bits of the memory cell and equals zero. The maximum number corresponds to eight ones and equals $2^8 - 1 = 255$. The range of non-negative integers is therefore from 0 to 255

Slide 73

To store signed integers, two memory cells (16 bits) are allocated, and the most significant (leftmost) bit is allocated to the sign of the number (0 is written to the sign bit if the number is positive, 1 if it is negative). The representation of positive numbers in a computer in sign-magnitude format is called the direct code of the number. For example, the number $2002_{10} = 11111010010_2$ is represented in 16-bit form as follows: 0000 0111 1101 0010. The maximum positive number (taking into account the one bit allocated to the sign) for signed integers in n-bit representation is: $A = 2^{n-1} - 1$

Slide 74

To represent negative numbers, the two's complement code is used. Two's complement allows the arithmetic operation of subtraction to be replaced with addition, which significantly simplifies the work of the processor and increases its performance. The two's complement code of a negative number A stored in n bits is $2^n - |A|$. To obtain the two's complement code of a negative number, you can use a fairly simple algorithm: 1. Write the modulus of the number in direct code in n binary digits. 2. Obtain the inverse code of the number by inverting all bits (replace all ones with zeros and all zeros with ones). 3. Add one to the resulting inverse code.
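The three steps can be traced in a short sketch. The function name is hypothetical, and the choice of -2002 in 16 bits as the worked value is an assumption made here for illustration (it reuses the number from the direct-code example).

```python
def twos_complement(a: int, n_bits: int = 16) -> str:
    """Two's complement code of a negative integer a, following the three steps."""
    magnitude = format(abs(a), f"0{n_bits}b")                        # 1. modulus in direct code
    inverted = "".join("1" if b == "0" else "0" for b in magnitude)  # 2. invert every bit
    value = (int(inverted, 2) + 1) % 2 ** n_bits                     # 3. add one (stay within n bits)
    return format(value, f"0{n_bits}b")


code = twos_complement(-2002)
print(code)                             # 1111100000101110
print(int(code, 2) == 2 ** 16 - 2002)   # True: matches 2^n - |A|
```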

Slide 75

The advantages of representing numbers in a fixed-point format are the simplicity and clarity of the representation of numbers, as well as the simplicity of the algorithms for implementing arithmetic operations. The disadvantage of representing numbers in a fixed point format is the small range of representation of quantities, which is insufficient for solving mathematical, physical, economic and other problems that involve both very small and very large numbers.

Slide 76

Slide 77

Representation of numbers in floating point format

Real numbers are stored and processed in a computer in floating-point format. In this case, the position of the point in the number may change. The floating-point format is based on exponential notation, in which any number can be represented. Thus the number A can be represented in the form: $A = m \cdot q^n$, where m is the mantissa of the number, q is the base of the number system, and n is the order of the number.

Slide 78

In normalized form the mantissa satisfies $1/q \le |m| < 1$. This means that the mantissa must be a proper fraction and have a non-zero first digit after the point. Let's convert the decimal number 555.55, written in natural form, into exponential form with a normalized mantissa: $555.55 = 0.55555 \cdot 10^3$
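Normalization of the mantissa can be sketched as a loop that shifts the point until the mantissa becomes a proper fraction with a non-zero first digit. The function name is hypothetical, and because the sketch uses ordinary floating-point arithmetic the mantissa is only approximately 0.55555.

```python
import math


def normalize(a: float, q: int = 10) -> tuple[float, int]:
    """Return (mantissa, order) with 1/q <= |mantissa| < 1 and a = mantissa * q**order."""
    if a == 0:
        return 0.0, 0
    mantissa, order = abs(a), 0
    while mantissa >= 1:       # shift the point to the left
        mantissa /= q
        order += 1
    while mantissa < 1 / q:    # shift the point to the right
        mantissa *= q
        order -= 1
    return math.copysign(mantissa, a), order


m, n = normalize(555.55)
print(round(m, 5), n)  # 0.55555 3
```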

Slide 83

Data storage

Information encoded using natural and formal languages, as well as information in the form of visual and audio images, is stored in human memory. However, for long-term storage of information, its accumulation and transmission from generation to generation, information carriers are used. (student message)

Slide captions:

1 Binary encoding of symbolic information. 12/17/2015. Prepared by: computer science teacher Ekaterina Sergeevna Kukina, MBOU Secondary School No. 2, Lipetsk

2 In binary encoding of text information, each character is assigned a unique decimal code from 0 to 255 or the corresponding binary code from 00000000 to 11111111. Thus a person distinguishes characters by their shape, and a computer by their code.

3 Using the formula connecting the number of messages N and the amount of information i, you can calculate how much information is needed to encode each character: $N = 2^i$, $256 = 2^8$, so i = 8 bits

4 Assigning a particular binary code to a symbol is a matter of convention, which is recorded in the code table. The first 33 codes (from 0 to 32) correspond not to characters, but to operations (line feed, entering a space, etc.). Codes 33 to 127 are international and correspond to characters of the Latin alphabet, numbers, arithmetic symbols and punctuation marks.
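The correspondence between a character, its decimal code and its 8-bit binary code can be inspected directly. This is a small sketch with a hypothetical helper name; it assumes a single-byte encoding, so each character maps to exactly one byte.

```python
def char_code(ch: str, encoding: str = "ascii") -> tuple[int, str]:
    """Return the decimal code and the 8-bit binary code of a single character."""
    code = ch.encode(encoding)[0]  # single-byte encoding: one character, one byte
    return code, format(code, "08b")


for ch in (" ", "!", "Z"):
    print(repr(ch), *char_code(ch))
# ' ' 32 00100000
# '!' 33 00100001
# 'Z' 90 01011010
```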

5 Codes from 128 to 255 are national, i.e. in national encodings different characters correspond to the same code. There are 5 single-byte encoding tables for Russian letters, so texts created in one encoding will not be displayed correctly in another.

6 Chronologically, one of the first standards for encoding Russian letters on computers was KOI-8 (“Information Exchange Code, 8-bit”). This encoding is used on computers running the UNIX operating system.

7 The most common encoding is the standard Microsoft Windows Cyrillic encoding, abbreviated CP1251 (“CP” stands for “Code Page”). All Windows applications that work with the Russian language support this encoding.

8 To work in the MS-DOS operating system environment, an “alternative” encoding is used, called CP866 in Microsoft terminology.

9 Apple developed its own encoding of Russian letters for Macintosh computers (Mac).

10 The International Organization for Standardization (ISO) has approved another encoding, ISO 8859-5, as a standard for the Russian language.

11 Encoding standards: KOI-8 (UNIX); CP1251 (Microsoft Windows); CP866 (MS-DOS); Mac (Macintosh); ISO 8859-5

12 Character encoding table
Binary code  Decimal code  KOI8  CP1251  CP866               Mac                 ISO
0000 0000    0
...
0000 1000    8             Delete last character (Backspace key)
...
0000 1101    13            Line feed (Enter key)
...
0010 0000    32            Space
0010 0001    33            !
...
0101 1010    90            Z
...
0111 1111    127
...
1000 0000    128           -     b       A                   A                   K
...
1100 0010    194           B     B       -                   -                   T
...
1100 1100    204           L     M       :                   :                   b
...
1101 1101    221           Ш     E       -                   Ё                   N
...
1111 1111    255           b     i       Non-breaking space  Non-breaking space  n

13 Recently a new international standard, Unicode, has appeared. It allocates not one byte but two for each character, so it can encode not 256 but 2^16 = 65,536 different characters. This encoding is supported by editors starting with MS Office 97.
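A small sketch of the two-byte idea. It assumes that the fixed two-byte form described here matches what Python calls `utf-16-be`, which holds for characters of the Basic Multilingual Plane. The helper name `two_byte_code` is illustrative.

```python
def two_byte_code(ch: str) -> str:
    """Hexadecimal form of a character's two-byte code (BMP characters only)."""
    encoded = ch.encode("utf-16-be")
    if len(encoded) != 2:
        raise ValueError("character does not fit in two bytes")
    return encoded.hex()

print(2 ** 16)             # number of distinct two-byte codes
print(two_byte_code("Z"))  # Latin letter
print(two_byte_code("Я"))  # Cyrillic letter
```

Both the Latin and the Cyrillic letter receive their own code in the same table, so no switching between national code pages is needed.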

14 Task 1: identify a character by its numeric code. Launch Notepad. Hold ALT and type 0224 on the numeric keypad; the Cyrillic letter а appears. Repeat this for the numeric codes 0225 to 0233; characters in the CP1251 (Windows) encoding appear. Write them down in your notebook. Hold ALT and type 161 on the numeric keypad; the Cyrillic letter б appears. Repeat this for the numeric codes 160, 169 and 226; characters in the CP866 (MS-DOS) encoding appear. Write them down in your notebook.

15 Task 2: determine the numeric code of a character. Determine the numeric code that must be typed while holding the Alt key to get each of the characters ☼, §, $, ♀. Hint: each code lies in the range from 0 to 50.


2 Contents: binary coding in a computer; analog and discrete forms of information representation; binary coding of graphic images; binary coding of audio; binary coding of video information; binary coding of text information

3 Binary coding in a computer. All information that a computer processes must be represented in binary code using two digits, 0 and 1. These two symbols are called binary digits, or bits. A computer must therefore provide both encoding and decoding. Encoding is the conversion of input information into a form the computer can process, i.e. binary code. Decoding is the conversion of data from binary code into a human-readable form.
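Encoding and decoding can be sketched for a short text as follows. The sample string and the function names `encode` and `decode` are illustrative, and one byte per character (ASCII) is assumed.

```python
def encode(text: str) -> str:
    """Encoding: text -> string of 8-bit binary codes separated by spaces."""
    return " ".join(format(byte, "08b") for byte in text.encode("ascii"))

def decode(bits: str) -> str:
    """Decoding: string of 8-bit binary codes -> human-readable text."""
    return bytes(int(chunk, 2) for chunk in bits.split()).decode("ascii")

bits = encode("Hello!")
print(bits)
print(decode(bits))
```

Decoding the result of encoding returns the original text, which is the property any encoding scheme has to guarantee.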

4 Why binary coding? It is convenient to encode information as a sequence of zeros and ones if these values are thought of as the two possible stable states of an electronic element: 0 is the absence of an electrical signal; 1 is the presence of an electrical signal. The disadvantage of binary coding is that the codes are long. But in engineering it is easier to deal with a large number of simple elements than with a small number of complex ones. The methods of encoding and decoding information in a computer depend first of all on the type of information, that is, on what is to be encoded: numbers, text, graphics or sound.

5 Analog and discrete forms of information representation. A person is able to perceive and store information in the form of images (visual, auditory, tactile, gustatory and olfactory). Visual images can be saved as pictures (drawings, photographs, etc.), and sounds can be recorded on records, magnetic tapes, laser discs and so on. Information, including graphic and audio information, can be represented in analog or discrete form. In an analog representation, a physical quantity takes on an infinite set of values, and its value changes continuously. In a discrete representation, a physical quantity takes on a finite set of values, and its value changes in jumps.

6 Analog and discrete forms of information representation. An example of analog and discrete representation: the position of a body on an inclined plane and on a staircase is specified by the values of its X and Y coordinates. When the body moves along the inclined plane, its coordinates can take an infinite number of continuously changing values from a certain range; when it moves up the staircase, they take only a certain set of values and change in jumps.

7 Sampling. An example of an analog representation of graphic information is a painting, whose color changes continuously; an example of a discrete representation is an image printed on an inkjet printer, which consists of individual dots of different colors. An example of analog storage of sound information is a vinyl record (the sound track changes its shape continuously); an example of discrete storage is an audio CD (whose sound track contains sections with different reflectivity). Graphic and sound information is converted from analog to discrete form by sampling, that is, by splitting a continuous graphic image or a continuous (analog) sound signal into individual elements. Sampling involves encoding, that is, assigning each element a specific value in the form of a code. Sampling is the conversion of continuous images and sound into a set of discrete values in the form of codes.

10 Raster coding. Step 1. Sampling: the image is split into pixels. Step 2. A single color is determined for each pixel. A pixel is the smallest element of an image whose color can be set independently. Resolution is the number of pixels per inch, measured in dots per inch (dpi): screen 96 dpi, print dpi, typography 1200 dpi.

11 Raster coding (True Color). Step 3. From color to numbers: in the RGB model, color = R + G + B (red, green, blue). Examples: R = 218, G = 164, B = 32; R = 135, G = 206, B = 250. Step 4. The numbers are written in the binary system. How much memory is needed to store the color of 1 pixel? How many different colors can be encoded? R: 256 = 2^8 options, which needs 8 bits = 1 byte; R, G and B together: 3 bytes in total (the color depth). Number of colors: 256·256·256 = 16,777,216 (True Color).
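Steps 3 and 4 can be sketched for one of the sample colors. The helper name `rgb_to_bits` is illustrative.

```python
def rgb_to_bits(r: int, g: int, b: int) -> str:
    """Write each color component (0..255) as one 8-bit byte."""
    for component in (r, g, b):
        if not 0 <= component <= 255:
            raise ValueError("each component must fit in one byte")
    return " ".join(format(component, "08b") for component in (r, g, b))

print(rgb_to_bits(218, 164, 32))  # 3 bytes = 24 bits for one pixel
print(256 ** 3)                   # number of colors with 1 byte per component
```

One pixel therefore takes 3 bytes, and the number of distinct colors is the product of the 256 options for each of the three components.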

12 RGB color model. Color images can have different color depths, determined by the number of bits used to encode the color of a point. If the color of one point is encoded with three bits (one bit for each RGB component), 2^3 = 8 different colors are obtained.
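The eight colors of the three-bit palette can be enumerated with a short sketch. The color names are the conventional ones for full-intensity RGB combinations and are not listed on the slide.

```python
from itertools import product

# One bit per component: 0 = component off, 1 = component on at full intensity.
NAMES = {
    (0, 0, 0): "black",   (0, 0, 1): "blue",
    (0, 1, 0): "green",   (0, 1, 1): "cyan",
    (1, 0, 0): "red",     (1, 0, 1): "magenta",
    (1, 1, 0): "yellow",  (1, 1, 1): "white",
}

palette = [(bits, NAMES[bits]) for bits in product((0, 1), repeat=3)]
for (r, g, b), name in palette:
    print(r, g, b, name)
```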

13 True Color. In practice, 3 bytes (i.e. 24 bits) are usually allocated to store the color of each point of a color image in the RGB model: 1 byte (i.e. 8 bits) for the value of each component. Thus each RGB component can take a value in the range from 0 to 255 (2^8 = 256 values in total), and each point of the image can be colored in one of 256·256·256 = 16,777,216 colors. This set of colors is usually called True Color, because the human eye cannot distinguish a greater variety.

14 Calculating the amount of video memory. For an image to be formed on the monitor screen, information about each point (the color code of the point) must be stored in the computer's video memory. Let us calculate the required amount of video memory for one of the graphics modes. On modern computers the screen resolution is usually 1280 x 1024 pixels, i.e. 1280 * 1024 = 1,310,720 points in total. With a color depth of 32 bits per pixel, the required amount of video memory is 32 * 1,310,720 = 41,943,040 bits = 5,242,880 bytes = 5120 KB = 5 MB.
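The same calculation as a minimal sketch, taking 1 KB = 1024 bytes and 1 MB = 1024 KB as on the slide. The function name `video_memory_bytes` is illustrative.

```python
def video_memory_bytes(width: int, height: int, bits_per_pixel: int) -> int:
    """Memory needed to hold one full screen image, in bytes."""
    total_bits = width * height * bits_per_pixel
    return total_bits // 8

size = video_memory_bytes(1280, 1024, 32)
print(size, "bytes")
print(size // 1024, "KB")
print(size // 1024 ** 2, "MB")
```

Changing the three arguments gives the amount of memory for any other graphics mode.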

15 Raster coding (True Color). The CMYK model is subtractive; it is used when preparing images for printing on a professional printer and serves as the basis for four-color printing technology. The color components of this model are the colors obtained by subtracting the primary colors from white: cyan (Cyan) = white - red = green + blue; magenta (Magenta) = white - green = red + blue; yellow (Yellow) = white - blue = red + green. The problem with the CMY color model is that in practice no ink is absolutely pure; each necessarily contains impurities, so overlaying the three inks does not produce pure black. Therefore a pure black component was added to this color model.
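The subtraction from white can be sketched numerically. This is a deliberately naive conversion on 0..255 components; real print workflows rely on color profiles, and taking the black component as the common part of C, M and Y is only one simple assumption. Both function names are illustrative.

```python
def rgb_to_cmy(r: int, g: int, b: int) -> tuple[int, int, int]:
    """Each CMY component is white (255) minus the matching RGB component."""
    return 255 - r, 255 - g, 255 - b

def cmy_to_cmyk(c: int, m: int, y: int) -> tuple[int, int, int, int]:
    """Naive black extraction: move the part shared by C, M and Y into K."""
    k = min(c, m, y)
    return c - k, m - k, y - k, k

print(rgb_to_cmy(255, 0, 0))                  # red is printed as magenta + yellow
print(cmy_to_cmyk(*rgb_to_cmy(0, 0, 0)))      # black uses only the K ink
```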

17 Encoding vector images. A vector image is a set of graphic primitives (point, line, ellipse...). Each primitive is described by mathematical formulas. The encoding depends on the application environment. The advantage of vector graphics is that files storing vector graphics are relatively small. It is also important that vector graphics can be enlarged or reduced without loss of quality.

18 Vector drawings. They are constructed from geometric shapes: segments, polylines, rectangles, circles, ellipses, arcs and smooth curves (Bézier curves). For each shape, the following are stored in memory: its dimensions and coordinates in the drawing; the color and style of its border; the fill color and style (for closed shapes). File formats: WMF (Windows Metafile), CDR (CorelDraw), AI (Adobe Illustrator), FH (FreeHand).

19 Vector drawings. They are the best way to store drawings, diagrams and maps; no information is lost during encoding; there is no distortion when resizing; the file size is smaller and depends on the complexity of the drawing; they are ineffective for photographs and blurry images.

20 Graphics file formats. A graphics file format determines the method of storing information in the file (raster or vector) as well as the form in which it is stored (the compression algorithm used). The most popular raster formats: BMP, GIF, JPEG, TIFF, PNG.

21 Graphics file formats. Bit MaP image (BMP) is a universal raster graphics file format used in the Windows operating system. It is supported by many graphics editors, including Paint. Recommended for storing data and exchanging it with other applications. Tagged Image File Format (TIFF) is a raster graphics file format supported by all major graphics editors and computer platforms. It includes a lossless compression algorithm. It is used to exchange documents between different programs and is recommended for work with publishing systems.

22 Graphics file formats. Graphics Interchange Format (GIF) is a raster graphics file format supported by applications for a variety of operating systems. It includes a lossless compression algorithm that reduces the file size several times. Recommended for storing images created programmatically (diagrams, graphs, etc.) and drawings (such as appliqué) with a limited number of colors (up to 256). Used to place graphic images on Web pages. Portable Network Graphics (PNG) is a raster graphics file format similar to GIF. Joint Photographic Experts Group (JPEG) is a raster graphics file format that implements an effective compression algorithm (the JPEG method) for scanned photographs and illustrations. The compression algorithm reduces the file size by tens of times but leads to irreversible loss of some information. It is supported by applications for various operating systems and is used to place graphic images on Web pages.

23 Questions and tasks: What types of computer images do you know? What is the maximum number of colors that can be used in an image if 3 bits are allocated for each pixel? What do you know about the RGB color model? Calculate the required amount of video memory for graphics mode: screen resolution 800 x 600, color quality 16 bits.

25 Sound coding. Sound is a wave with continuously changing amplitude and frequency: the greater the amplitude, the louder the sound seems to a person; the higher the frequency, the higher the pitch. Complex continuous sound signals can be represented with sufficient accuracy as the sum of a certain number of simple sinusoidal oscillations. Each sinusoid can be precisely specified by a set of numerical parameters (amplitude, phase and frequency), which can be regarded as the code of the sound at a given moment in time.

26 Time sampling of sound In the process of encoding an audio signal, its time sampling is carried out - a continuous wave is divided into separate small time sections and for each such section a certain amplitude value is established. Thus, the continuous dependence of the signal amplitude on time is replaced by a discrete sequence of volume levels

27 The quality of binary audio encoding is determined by the encoding depth and the sampling frequency. The sampling frequency is the number of signal level measurements per unit of time. The encoding depth determines the number of volume levels. Modern sound cards provide a 16-bit audio encoding depth. In this case the number of volume levels is N = 2^I = 2^16 = 65,536.
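Time sampling and the choice of a volume level can be sketched on a single sine wave. The signal (1 Hz), the sampling frequency (8 measurements per second) and the depth (3 bits) are deliberately tiny illustrative values, not taken from the slides; the function name `sample_and_quantize` is hypothetical.

```python
import math

def sample_and_quantize(freq_hz, sample_rate, depth_bits, duration_s):
    """Replace a continuous sine wave by a discrete sequence of volume levels."""
    levels = 2 ** depth_bits                  # N = 2^I volume levels
    n_samples = int(sample_rate * duration_s)
    codes = []
    for n in range(n_samples):
        t = n / sample_rate                   # moment of the n-th measurement
        amplitude = math.sin(2 * math.pi * freq_hz * t)   # value in -1..1
        # map -1..1 onto the integer levels 0..levels-1
        codes.append(round((amplitude + 1) / 2 * (levels - 1)))
    return codes

codes = sample_and_quantize(freq_hz=1, sample_rate=8, depth_bits=3, duration_s=1)
print(codes)
print([format(c, "03b") for c in codes])      # the same levels as binary codes
```

Raising `sample_rate` adds more measurements per second, and raising `depth_bits` makes the steps between volume levels finer; both improve quality and increase the amount of data.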

29 Representation of video information. Processing video information requires a very fast computer system. What is a film from the point of view of computer science? First of all, it is a combination of sound and graphic information. In addition, the effect of movement on the screen is created by an inherently discrete technology: static images are changed rapidly. Studies have shown that if more than 10 to 12 frames change in one second, the human eye perceives the changes as continuous.

30 Representation of video information. If traditional methods of storing information are used, the electronic version of a film turns out to be too large. A fairly obvious improvement is to store the first frame in full (in the literature it is usually called the key frame) and, for the following frames, to store only the differences from the initial frame (difference frames).
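The key frame idea can be sketched on toy data. Frames are modeled here as flat lists of pixel values, and a difference frame as a dictionary of changed positions; both representations and the function names are assumptions made for illustration, not a description of any real video format.

```python
def encode_frames(frames):
    """Keep the first frame whole; for later frames keep only changed pixels."""
    key = frames[0]
    diffs = []
    for frame in frames[1:]:
        # position -> new value, only where the frame differs from the key frame
        diffs.append({i: v for i, (k, v) in enumerate(zip(key, frame)) if k != v})
    return key, diffs

def decode_frames(key, diffs):
    """Rebuild every frame from the key frame and the difference frames."""
    frames = [list(key)]
    for diff in diffs:
        frame = list(key)
        for position, value in diff.items():
            frame[position] = value
        frames.append(frame)
    return frames

frames = [[0, 0, 0, 0], [0, 9, 0, 0], [0, 9, 9, 0]]
key, diffs = encode_frames(frames)
print(key, diffs)
print(decode_frames(key, diffs) == frames)
```

When consecutive frames differ in only a few pixels, the difference frames are much smaller than the full frames they replace.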

31 Some video file formats. There are many different formats for representing video data. Video for Windows is based on universal files with the AVI extension (Audio Video Interleave, i.e. alternating audio and video). Video compression systems that allow some image distortion invisible to the eye in order to increase the compression ratio have recently become increasingly widespread. The best-known standard of this class is MPEG (Moving Picture Experts Group). The methods used in MPEG are not easy to understand and rely on fairly complex mathematics. A technology called DivX (Digital Video Express) has become even more widespread. DivX achieved a compression level that made it possible to fit a high-quality recording of a full-length film onto one CD, compressing a 4.7 GB DVD film to 650 MB.

32 Sound file formats. MIDI is a recording of musical works in the form of commands to a synthesizer; it is compact but does not reproduce the human voice (it corresponds to vector representation in graphics). WAV is a universal sound format that stores full information about the digitized sound (it corresponds to the BMP format in graphics); it occupies a very large amount of memory (15 MB for 1 minute of sound). MP3 is an audio compression format with controlled loss of information that compresses files several times over, depending on the specified bitrate (on average 11 times); even at the highest bitrate, 320 kbit/s, it provides 4 times compression compared to CDs. APE is an audio compression format without loss of information (and therefore of quality), with a compression ratio of about 2.
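The MP3 compression figures can be checked with a rough calculation. It assumes standard audio CD parameters (44,100 samples per second, 16 bits, 2 channels) and takes 128 kbit/s as a typical "average" bitrate; neither value is stated on the slide.

```python
# Assumed audio CD parameters: 44.1 kHz sampling, 16-bit depth, stereo.
CD_BITRATE = 44_100 * 16 * 2   # bits per second of uncompressed CD audio

def compression_ratio(mp3_kbit_s: float) -> float:
    """How many times smaller an MP3 stream is than uncompressed CD audio."""
    return CD_BITRATE / (mp3_kbit_s * 1000)

print(round(compression_ratio(320), 1))  # highest MP3 bitrate
print(round(compression_ratio(128), 1))  # assumed typical bitrate
```

Under these assumptions the ratios come out close to 4 and 11, in line with the figures quoted for MP3.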

33 Multimedia. Multimedia (from the English multi, many, and media, carrier or environment) is a set of computer technologies that simultaneously use several information media: text, graphics, video, photography, animation, sound effects and high-quality sound. The word “multimedia” refers to acting on the user through several information channels simultaneously. Multimedia is the combination of images on a computer screen (including graphic animation and video frames) with text and sound. Multimedia systems are most widespread in education, advertising and entertainment.

35 Binary coding of text information. Since the 1960s, computers have increasingly been used for processing text information, and currently most of the PCs in the world are engaged in processing text information. Traditionally, 1 byte of information is used to encode one character (1 byte = 8 bits).

37 Binary encoding of text information. Encoding consists of assigning each character a unique binary code from 00000000 to 11111111 (or a decimal code from 0 to 255). It is important that assigning a specific code to a character is a matter of agreement, which is fixed in the code table.

38 Encoding table. A table in which all characters of the computer alphabet are assigned serial numbers (codes) is called an encoding table. Different types of computers use different encodings. With the spread of the IBM PC, the ASCII (American Standard Code for Information Interchange) encoding table became an international standard.

39 ASCII encoding table. Only the first half of this table is standard, i.e. the characters with numbers from 0 (00000000) to 127 (01111111). This includes letters of the Latin alphabet, digits, punctuation marks, brackets and some other symbols. The remaining 128 codes are used in different ways in different variants. Russian encodings place characters of the Russian alphabet there. Currently there are 5 different code tables for Russian letters (KOI8, CP1251, CP866, Mac, ISO). The new international Unicode standard, which allocates two bytes for each character, has now become widespread. It can be used to encode 2^16 = 65,536 different characters.

42 The most common encoding currently in use is the Microsoft Windows encoding, abbreviated CP1251 (“CP” stands for “Code Page”).

45 The International Organization for Standardization (ISO) approved another encoding, ISO 8859-5, as a standard for the Russian language.

48 Information volume of a text. Today many people use computer text editors to prepare letters, documents, articles, books, etc. Computer editors mainly work with an alphabet of 256 characters. In this case it is easy to calculate the amount of information in a text. If 1 character of the alphabet carries 1 byte of information, you just need to count the number of characters; the resulting number gives the information volume of the text in bytes. Suppose a small book prepared on a computer contains 150 pages; each page has 40 lines, and each line has 60 characters. Then a page contains 40 x 60 = 2400 bytes of information. The volume of all the information in the book is 2400 x 150 = 360,000 bytes.
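The book example as a minimal sketch, assuming one byte per character as stated above. The function name `text_volume_bytes` is illustrative.

```python
def text_volume_bytes(pages, lines_per_page, chars_per_line, bytes_per_char=1):
    """Information volume of a text when every character takes a fixed size."""
    return pages * lines_per_page * chars_per_line * bytes_per_char

print(text_volume_bytes(1, 40, 60))     # one page
print(text_volume_bytes(150, 40, 60))   # the whole book
```

With a two-byte encoding the same book would take twice as much, which can be checked by passing `bytes_per_char=2`.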

49 Pay attention! Digits are encoded using the ASCII standard in two cases: during input/output and when they appear in text. If numbers take part in calculations, they are converted into a different binary code (see the lesson “Representing numbers in a computer”). Take the number 57. When used in text, each digit is represented by its own code in accordance with the ASCII table; in binary this is 00110101 00110111. When used in calculations, the code of this number is obtained according to the rules for converting to the binary system, which gives 00111001.

50 Questions and tasks: What is the encoding of text information in a computer? Encode your last name, first name, class number using ASCII code. What message is encoded in the Windows-1251 encoding: Assuming that each character is encoded by one byte, estimate the information volume of the following sentence from Pushkin's quatrain: The singer-David was small in stature, But he knocked down Goliath!

51 Questions and tasks: Calculate the required amount of video memory for a graphics mode with a screen resolution of 800 x 600 and a color depth of 16 bits. 1.5 KB of memory were allocated to store a raster image measuring 64*64 pixels. What is the maximum possible number of colors in the image palette? Specify the minimum amount of memory (in KB) sufficient to store any 64*64 pixel raster image if the image is known to have a palette of 256 colors. There is no need to store the palette itself. How many seconds will it take a modem transmitting messages at a bit/s rate to transmit a color raster image of 800*600 pixels, provided that the palette has 16 million colors? A color image measuring 10*10 cm is scanned. The scanner resolution is 1200*1200 dpi and the color depth is 24 bits. What information volume will the resulting graphic file have?

Since the 60s, computers have increasingly begun to be used for processing text information, and currently most of the PCs in the world are engaged in processing text information.

Traditionally, 1 byte of information is used to encode one character (1 byte = 8 bits).

Binary coding of text information

Coding consists of assigning each character a unique binary code from 00000000 to 11111111 (or a decimal code from 0 to 255).

It is important that assigning a specific code to a symbol is a matter of agreement, which is fixed in a code table.

ASCII encoding table

Only the first half of this table is standard, i.e. the characters with numbers from 0 (00000000) to 127 (01111111). This includes letters of the Latin alphabet, digits, punctuation marks, brackets and some other symbols.

The remaining 128 codes are used in different ways. Russian encodings place characters of the Russian alphabet there.

Currently there are 5 different code tables for Russian letters (KOI8, CP1251, CP866, Mac, ISO).

The new international standard Unicode, which allocates two bytes for each character, has now become widespread.

[Tables: standard part of the ASCII table; extended code table]

Note!

Numbers are encoded using the ASCII standard in two cases - during input/output and when they appear in text. If numbers are involved in calculations, then they are converted into another binary code.

Let's take the number 57.

When used in text, each digit is represented by its own code in accordance with the ASCII table. In binary this is 00110101 00110111.

When used in calculations, the code of this number is obtained according to the rules for converting to the binary system, which gives 00111001.
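The two representations of 57 can be compared in a short sketch. The variable names are illustrative, and the number is padded to 8 bits to match the notation used above.

```python
# 57 as text: two characters, each with its own ASCII code
text_code = " ".join(format(byte, "08b") for byte in "57".encode("ascii"))

# 57 as a number used in calculations: converted to the binary system
number_code = format(57, "08b")

print(text_code)
print(number_code)
```

As text the number occupies two bytes, one per digit; as a number it fits in a single byte.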