O'Level Academy

Computer Science 2210

Data Storage

• Data is stored in a computer system to be accessed by the processor.

• Data storage is the process by which a computer system retains information temporarily or permanently.

• This data is usually held in optical or electromagnetic form.

Types of Data Storage:

• There are two types of data storage, i.e. primary and secondary.

• Primary storage holds data in RAM (Random Access Memory), ROM (Read Only Memory), or L1 & L2 cache.

• Secondary storage holds data on hard disks, RAID (Redundant Array of Independent Disks) systems, Zip drives, etc.

• Primary storage is faster to access whereas secondary storage can store more data.

• Primary storage is also known as Main Storage whereas Secondary Storage is also known as Auxiliary Storage.

File Compression:

• It is a process that allows you to package a single file or multiple files to use less disk space.

• There are two types of file compression:

1. Lossy file compression

2. Lossless file compression

Lossless File Compression:

• This file compression allows the original file to be reconstructed when uncompressed.

• It is best for file formats where data loss can damage the information. E.g. account statements, attendance spreadsheets, etc.
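
As a minimal sketch of the lossless idea, the snippet below compresses some text with Python's standard `zlib` module and then decompresses it again. The attendance data is made up for illustration; the point is only that the restored bytes are identical to the original.

```python
import zlib

# Hypothetical sample data: an attendance sheet with many repeated entries.
original = ("Ali,Present\n" * 200).encode("ascii")

compressed = zlib.compress(original)    # lossless compression
restored = zlib.decompress(compressed)  # reconstruct the original bytes

print(len(original), "bytes before,", len(compressed), "bytes after")
print("Identical after decompression:", restored == original)
```

Because nothing is thrown away, every character of the original file comes back exactly.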

Lossy File Compression:

• In contrast to lossless compression, lossy compression removes unnecessary data to make the files smaller.

• The original file cannot be reconstructed.

• It is used where some quality degradation does not harm the information, e.g. MP3 and JPEG.
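
The following is a toy sketch of why lossy compression cannot be undone. It is not how MP3 or JPEG work internally; it simply rounds some made-up sample values into coarse bands (the step size of 16 is an assumption chosen for illustration) and shows that the exact originals cannot be recovered.

```python
# Hypothetical sample values in the range 0-255.
samples = [12, 13, 15, 200, 203, 207, 90, 91]

STEP = 16  # assumed band width; a larger step saves more space but loses more detail

# "Compress": keep only which band each sample falls in (16 bands instead of 256 values).
compressed = [s // STEP for s in samples]

# "Decompress": the best available guess is the middle of each band.
restored = [c * STEP + STEP // 2 for c in compressed]

print(compressed)           # [0, 0, 0, 12, 12, 12, 5, 5]
print(restored)             # [8, 8, 8, 200, 200, 200, 88, 88]
print(restored == samples)  # False
```

The restored values are close to the originals but not identical, which is acceptable for sound or pictures but not for something like an account statement.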

File Formats:

• In computer systems, there are various types of file formats. Following are the ones we will discuss in detail:

- MP3

- MP4

- MIDI

- JPEG

- Text and numbers

MP3:

• MP3 is a technology that compresses music files.

• It is also known as audio compression.

• It compresses a typical music file by about 90%.

• E.g. a 100 MB music file can be converted into an MP3 file with a size of about 10 MB.

• These types of files can be used in cellphones, computers or MP3 players.

• The music files are compressed using a technology known as ‘Perceptual Music Shaping’.

• This technology removes the sounds that the human ear cannot hear, so compression is done by removing some parts of the music without noticeably affecting its overall quality.

• It uses a Lossy Format for compression.

MP4:

• Unlike the MP3 format, the MP4 format can store not only music but also videos, animation, photos, etc.

• Using this format, videos can be streamed over the internet without noticeably compromising the quality.

JPEG:

• JPEG stands for Joint Photographic Experts Group.

• JPEG is an image file format that reduces the image resolution, i.e. pixels per centimeter, so that the image file takes up less space.

• When the image file is compressed, its size is reduced at the cost of quality.

• Since JPEG reduces the file size by sacrificing quality, it is also an example of lossy compression.

• The original quality cannot be recovered once the file is compressed.

MIDI (Musical Instrument Digital Interface):

• It is a standard that allows sound to be represented in binary format.

• It stores the sound description, not the sound itself.

• It stores a series of control messages containing sound events e.g. pitch, volume, and duration.

• When a MIDI-compatible device receives these control messages, it interprets them and reproduces the sound.

• MIDI data can also be compressed; however, it does not need any special compression algorithm.
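
To make the idea of control messages concrete, here is a small sketch that builds two standard MIDI messages as raw bytes: Note On (status byte 0x90 plus the channel number) and Note Off (status byte 0x80 plus the channel number). The helper names `note_on` and `note_off` are made up for this example, and the "velocity" value is what these notes loosely call volume.

```python
def note_on(channel: int, note: int, velocity: int) -> bytes:
    """Build a 3-byte MIDI Note On message: status, pitch, velocity."""
    return bytes([0x90 | channel, note, velocity])

def note_off(channel: int, note: int) -> bytes:
    """Build a 3-byte MIDI Note Off message for the same pitch."""
    return bytes([0x80 | channel, note, 0])

MIDDLE_C = 60  # MIDI note number for middle C

start = note_on(0, MIDDLE_C, 100)  # pitch 60, velocity 100, channel 0
stop = note_off(0, MIDDLE_C)       # sent later; the gap between the two gives the duration

print(start.hex(), stop.hex())  # 903c64 803c00
```

Six bytes are enough to describe one note, which is why a MIDI file is far smaller than a recording of the same music.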

Text & Numbers:

• Text and numbers can be stored in various formats.

• Typically, the text is stored in ASCII.

• However, numbers can be stored in different number formats. E.g. real numbers, date, time, integers, currency, etc.

• Files containing numbers use lossless compression since this type of data cannot be compromised.

• Text can also be compressed, using an algorithm that takes advantage of redundancy (repeated patterns) in the text.

• The compression of text is also lossless.
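
One simple way to see how redundancy can be exploited is run-length encoding, sketched below. This is a deliberately basic technique chosen for illustration, not the complex algorithm real text compressors use: each run of repeated characters is stored once with a count, and decoding gives back exactly the original text.

```python
from itertools import groupby

def rle_encode(text):
    """Replace each run of identical characters with a (character, count) pair."""
    return [(ch, len(list(run))) for ch, run in groupby(text)]

def rle_decode(pairs):
    """Rebuild the original text from the (character, count) pairs."""
    return "".join(ch * count for ch, count in pairs)

text = "AAAABBBCCD"
encoded = rle_encode(text)

print(encoded)                      # [('A', 4), ('B', 3), ('C', 2), ('D', 1)]
print(rle_decode(encoded) == text)  # True
```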

Error Checking Methods

Introduction:

• When you transmit data, there is always a risk of data corruption, caused for example by faults in communication equipment, noise, etc.

• In compressed data, the risk of losing information is higher since redundancy has already been reduced to a minimum to shrink the file.

• Therefore, error control measures are taken to make sure the data that is transferred through communication channels is error-free.

• These error control measures usually contain error detection and correction.

• Error detection detects the errors in the data or message while error correction is the process of reconstruction of the original data.

Error Detection & Correction Methods:

1. Parity

2. Checksum

3. Check Digit

4. Automatic Repeat Request (ARQ)

1. Parity:

• In this error detection method, a parity bit is added to the original message.

• A system that uses even parity counts the occurrences of 1s in the data. It adds a parity bit of 0 if the count is already even, and a parity bit of 1 if the count is odd, so that the total number of 1s becomes even.

• In an odd parity system, the total number of 1s, including the parity bit, needs to be odd.

Example 1:

Consider the 7 data bits 1101100, to which a parity bit is added to complete the byte.

• If this byte is using an even parity system, then the parity bit needs to be ‘0’ since the number of occurrences of 1s is already even.

• However, if it is using an odd parity system then the parity bit needs to be ‘1’ to make the number of occurrences of 1s odd.
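
The rule can be sketched as a short function. The name `parity_bit` and the use of a text string of 0s and 1s are choices made for this illustration only.

```python
def parity_bit(bits: str, scheme: str = "even") -> str:
    """Return the parity bit ('0' or '1') for a string of data bits."""
    ones = bits.count("1")
    if scheme == "even":
        # Total number of 1s (data + parity) must be even.
        return "0" if ones % 2 == 0 else "1"
    # Odd parity: total number of 1s must be odd.
    return "1" if ones % 2 == 0 else "0"

data = "1101100"  # contains four 1s

print(parity_bit(data, "even"))  # 0
print(parity_bit(data, "odd"))   # 1
```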

Example 2:

Now consider the following example bytes and identify the parity system used in each one of them.

• In this byte, the parity system used is odd since the number of occurrences of 1s is odd.

• In this byte, the parity system used is even.

Example 3:

Consider an example in which an even parity system (vertical parity check) is used to transmit 9 bytes of data. The following table shows the data at the receiving end.

• If this table is studied properly then it can be seen that:

• Row 8 has incorrect parity, i.e. the number of occurrences of 1s is not even, so the parity bit should have been 1.

• Column 5 also has an odd number of occurrences of 1s, so its parity bit is wrong as well.

• This information reveals that the error has occurred at the intersection of column 5 and row 8.

• And byte 8 should have been this:

Shortcoming of Parity:

• If an even number of bits in a byte (e.g. 2 bits) are changed during transmission, the parity check cannot detect the error.

• Suppose that, using an even parity system, the following byte has been sent:

• This byte could have been received like this:

• Or like this:

• In both situations, no error would have been flagged since the number of occurrences of 1s has remained even.
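
The weakness can be demonstrated with a short sketch. The byte below is a made-up example (it is not taken from the figures in these notes), and the helper names are hypothetical: flipping one bit is caught, but flipping two bits slips through.

```python
def passes_even_parity(byte: str) -> bool:
    """An even parity check passes when the number of 1s is even."""
    return byte.count("1") % 2 == 0

def flip(byte: str, *positions: int) -> str:
    """Return a copy of the byte with the bits at the given positions inverted."""
    bits = list(byte)
    for p in positions:
        bits[p] = "1" if bits[p] == "0" else "0"
    return "".join(bits)

sent = "01101100"              # hypothetical byte with four 1s (valid even parity)
one_error = flip(sent, 7)      # 01101101: five 1s
two_errors = flip(sent, 6, 7)  # 01101111: six 1s

print(passes_even_parity(sent))        # True
print(passes_even_parity(one_error))   # False (error detected)
print(passes_even_parity(two_errors))  # True  (error NOT detected)
```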

2. Checksum:

• It is an error detection method that sends an additional value with the original data.

• This additional value is known as the checksum.

• It is a fixed-length modular arithmetic sum of the message, e.g. one byte long.

• This sum can be negated by a 1s complement operation before the data stream or message is sent, to help detect errors in the message.

• To understand how it works, assume the checksum is 1 byte in length, i.e. the max value can be 2^8 - 1 = 255.

<= 255:

• If the sum of all the bytes transferred is less than or equal to 255 then the checksum is this sum itself.

>255:

• If the sum of all the bytes transferred is greater than 255 then checksum will be calculated using the following method.

Example 1:

Suppose the sum of the bytes is 1185.

• Since it is greater than 255, we will use the second method.

• First, 1185 will be divided by 256, i.e. 1185/256 = 4.629

• Round this value down to the nearest whole number, i.e. 4.629 rounds down to 4

• Multiply the rounded value by 256, i.e. 4 * 256 = 1024

• Calculate the difference, i.e. 1185 – 1024 = 161, which is the checksum
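
The two cases can be sketched in a few lines. The function name `checksum` is chosen for this illustration; the steps follow the method described above, giving 1185 – (4 * 256) = 161.

```python
def checksum(total: int) -> int:
    """Reduce the sum of all bytes to a 1-byte checksum (0-255)."""
    if total <= 255:
        return total              # small sums are used as they are
    whole = total // 256          # divide by 256 and round down
    return total - whole * 256    # subtract the nearest lower multiple of 256

print(checksum(1185))  # 161
print(checksum(200))   # 200
```

This gives the same answer as taking the remainder after dividing by 256, i.e. `total % 256`.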

Note:

• When data is to be transmitted, its checksum is calculated and attached to the original message before the transmission.

• At the receiving end, the checksum of the received block is again calculated and compared with the transmitted checksum.

• If both checksums are the same, then the data is assumed to be error-free; if they differ, an error has occurred during transmission.
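
As a rough sketch of this send-and-verify process, assume a 1-byte checksum that is simply attached to the end of the message. The function names `send` and `receive` and the sample message are made up for this example.

```python
def checksum(data: bytes) -> int:
    """1-byte checksum: sum of all byte values, modulo 256."""
    return sum(data) % 256

def send(data: bytes) -> bytes:
    """Sender: attach the checksum to the end of the message."""
    return data + bytes([checksum(data)])

def receive(block: bytes) -> bool:
    """Receiver: recalculate the checksum and compare it with the one received."""
    data, received_checksum = block[:-1], block[-1]
    return checksum(data) == received_checksum

block = send(b"HELLO")
corrupted = b"JELLO" + block[-1:]  # first character changed in transit

print(receive(block))      # True
print(receive(corrupted))  # False
```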

3. Check Digit:

• It is an error detection system in which an additional number is added to the series (e.g. account no. etc.) to check the accuracy.

• This number is usually derived from the original series of numbers.

• For example, consider a number 232, the sum of these three digits (2+3+2=7) can be added as the last digit to the original series i.e. 2327.

Example 1:

Consider an ISBN-10 number 0 - 2 0 1 - 5 3 0 8 2 - X of the kind typically used on books, which uses the modulo 11 system. Here X stands for the check digit that we need to find.

• To calculate the value of X, first, we need to find out the position of each digit, counting from 10 for the leftmost digit down to 1 for X.

• Multiply each digit by its position,

(0 × 10) + (2 × 9) + (0 × 8) + (1 × 7) + (5 × 6) + (3 × 5) + (0 × 4) + (8 × 3) + (2 × 2)

= 0 + 18 + 0 + 7 + 30 + 15 + 0 + 24 + 4

= 98

• Divide the total by 11,

98/11
= 8 remainder 10

• Subtract the remainder from 11,

11 – 10
= 1

• This value is your check digit and the final ISBN becomes 0 - 2 0 1 - 5 3 0 8 2 - 1
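
The calculation for the digits 0-201-53082 can be sketched as a function. The name `isbn10_check_digit` is chosen for this illustration. The sketch also handles the case where the result is 10, which ISBN-10 writes as the letter X.

```python
def isbn10_check_digit(first_nine: str) -> str:
    """Return the ISBN-10 check digit for the first nine digits (hyphens are ignored)."""
    digits = [int(ch) for ch in first_nine if ch.isdigit()]
    if len(digits) != 9:
        raise ValueError("expected exactly nine digits")
    # Multiply each digit by its position: 10 for the first digit down to 2 for the ninth.
    total = sum(d * position for d, position in zip(digits, range(10, 1, -1)))
    check = (11 - total % 11) % 11
    return "X" if check == 10 else str(check)

print(isbn10_check_digit("0-201-53082"))  # 1
```

For this number the weighted total is 98, the remainder after dividing by 11 is 10, and 11 – 10 gives a check digit of 1.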

4. Automatic Repeat Request (ARQ):

• This error detection method uses acknowledgment and timeout.

• An acknowledgment is a message sent by the receiver to confirm that the data has been received correctly.

• A timeout is the fixed period of time that the sender waits for the acknowledgment.

• If the sender does not receive the acknowledgment before the timeout, the message is sent again automatically.
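
The resend behaviour can be sketched as a simple simulation. Everything here is hypothetical: `send_with_arq` is a made-up name, no real network is involved, and the list `ack_arrives` just scripts whether the acknowledgment arrives before the timeout on each attempt.

```python
def send_with_arq(message, ack_arrives, max_attempts=5):
    """Resend the message until an acknowledgment arrives; return the number of attempts."""
    for attempt in range(1, max_attempts + 1):
        print(f"Attempt {attempt}: sending {message!r}")
        # Scripted outcome: True means the acknowledgment arrived before the timeout.
        acknowledged = attempt <= len(ack_arrives) and ack_arrives[attempt - 1]
        if acknowledged:
            print("Acknowledgment received")
            return attempt
        print("Timeout: no acknowledgment, sending again")
    raise TimeoutError("no acknowledgment after the maximum number of attempts")

# The first two acknowledgments are lost; the third arrives in time.
attempts = send_with_arq("DATA", [False, False, True])
print(attempts)  # 3
```

A limit on the number of attempts is an assumption added here so that the sender does not keep resending forever.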
