1 Information Representation

Overview of this chapter
Data representation
 Number systems
 Binary, Hexadecimal and denary
 Conversion between number systems
 Signed numbers
 “Sign and magnitude” and “Two’s complement”
 Calculation of signed numbers
 Binary coded decimal (BCD)
 Text coding
 ASCII code
 Unicode
Multimedia
 Images
 Vector graphics
 Bitmaps
 Sound
 Videos
Compression techniques
 Lossy compression
 Lossless compression
Data Representation
Number Systems
The number systems
There are three systems you need to be familiar with:Binary, based 2
 All computer technology is engineered with components which has only 2 states.
 One binary digit is known as a Bit.
Denary, based 10.
 What we use normally in daily lives
Hexadecimal, based 16.
 Represented from 0~F.
 Has shorter length than binary, therefore
 Easier to debug
 Can represent error codes shorter, easy to write down or to remember
Conversion of number systems
From bin/hex to denary

Sum of [every digit * its place value.]

Example for hex to denary:
For hex: 4 C 5 F Place value: 16^3 16^2 16^1 16^0 Denary: 4*(16^3) + 12*(16^2) + 5*(16^1) + 15*(16^0) = 19551

Example for binary to denary:
For binary: 1 0 1 1 Place value: 2^3 2^2 2^1 2^0 Denary: 1*(2^3) + 0*(2^2) + 2*(2^1) + 1*(2^0) = 10
From denary to bin/hex
Steps Start from the most significant place value.
 Divide the number by the place value.
 The quotient will be the digit on that place
 Carry out the remainder with next place until it becomes 0.
Example for denary to hex:
For denary: 260 Hex Place value for first 3 bits: 256 16 1 1. Start from 256. 260 / 256 = 1…4 So first digit is <1>. 2. Carry on calculation with remainder <4>. 4 / 16 = 0…4 So second digit is <0>. 3. Carry on calculation with remainder <4>. 4 / 1 = 4…0 So third digit is 4. 4. The remainder is now 0. Calculation complete. Final answer: 104 in Hex.
Conversion between hex and binary
Hex → Binary: Break down each hex to 4 bits. Join the bits together.
Binary → Hex: Break down to 4bit group. Convert each group to Hex.Example:
For hex 4F3E: 4 = 0100 F = 1111 3 = 0011 E = 1110 Final answer: 0100 1111 0011 1110
Signed numbers
There are three methods to represent signed numbers.
You will need to carry out calculations on: Addition of +ve numbers in <sign and magnitude>
 Addition of +ve and ve numbers in <2’s complement>
Note the characteristics on the graph:
 The three systems represent positive numbers identically.
 In all three systems numbers beginning with ‘1’ is negative.
Representation method
Sign and magnitude method

First bit represent sign(+ or ): 1 means negative and 0 means positive.

Example: evaluate 1 0010 0100 in deanery
1 0010 0100 ↑ Negative 0010 0100 = 33 Final answer: 33
Two’s complement method

The first digit represents the sign also.
 For positive numbers, it’s the same as normal binary.
 For negative numbers, a special algorithm is used.

Conversion
Unsigned binary (+a)  ^   Flip all digits (1>0 and 0>1) This operation reverses sign.   v  One’s complement (a) in 1’s complement  ^ 1   +1 v  Two’s complement (a) in 2’s complement

Example: 36 is 0010 0100 in binary. Find 36 in 2’s complement.
1. Flip the digits from 0>1, 1>0. 0010 0100 > 1101 1011 2. The number is now 1’s complement. Add 1 to get 2’s complement. 1101 1011 > 1101 1100 Final answer: 1101 1100
Calculation
To calculate, convert to 1’s complement first.
You cannot calculate in 2’s complement.Example: Calculate 5736 in binary
1. Carry out standard binary addition 1101 1011  (36) <in 1’s complement> + 0011 1001  (+57) <in 1’s complement> ————————————————————————— 1 0001 0100 ↑ Overflow. Indicates +ve. (If no overflow then ve) Discard the extra overflow digit. 2. Convert back to 2’s complement by adding 1. Final answer: 0001 0101  =(+21)
Binary coded Hexadecimal (BCD)
Another way to represent integers in hexadecimal.
Representation
Convert every digit into 4bit binary, and then join them together.
Example: Convert 0916 to BCD (That’s my birthday BTW)0 = 0000 9 = 1001 1 = 0001 6 = 0110 Final answer: 0000 1001 0001 0110
Calculation
 Do standard binary calculation
 Whenever the 4bit nibble exceeds 1001 (>9 in deanery):
 Add correction nibble <0110> to the nibble
 Carry on the carry bit
Example: 916 + 227
Do the standard binary addition first. 1001 0001 0110  (+916) in BCD + 0010 0010 0111  (+227) in BCD ————————————————————————————————— 1011 0011 1101  This gives (11 2 13), which is wrong. #3 #2 #1 Nibble number 1. Starting with the least significant nibble #1:<1101> 1101 > 1001, so correction needed. 1101 + 0110 = 1 0011 ^ ^  Nibble 1 Carry on Final answer for nibble #1: 0011 Carry on <1> to next calculation. 2. Nibble #2: <0011> 0011 < 1001, so no correction needed. Add on carried nibble, 0011 + 0001 = 0100 Final answer for nibble #2: 0100 3. Nibble #3: <1011> 1011 > 1001, so correction needed. 1011 + 0110 = 1 0001 Final answer for Nibble #3: 0001 Carry on <1> to next nibble. 4. Nibble #4 <0000> 0000 < 1001, so no correction needed Add on carried nibble, 0000 + 0001 = 0001 Final answer for nibble #4: 0001 So final answer for addition: 0001 0001 0100 0011  =(+1143) in BCD #4 #3 #2 #1 Nibble number
Text coding
There are three major coding schemes: EBDIC, ASCII and Unicode.
ASCII (American Standard Code for Information Interchange) Coding
Characteristics
 7bit code
 The most significant bit in the byte is always set to 0, to prevent confusion with Unicode
 Consists of 128 codes:
 Majority: Printing or graphical characters (e.g. ‘a’, ‘&’)
 Few: Control characters (e.g. ‘SOH’ for start of heading),
 Numbers and characters are in sequence
 Upper / Lower case defer by 0x20 (hex:20) / 0010 0000 (in binary)
 i.e Bit 5’s value determines upper / lower case of letter
Unicode
Characteristics
 In code plane 0, it uses 16bit code format
 e.g. U+4EA0
 Aimed to represent every character in the world
 Character code is known as ‘Code point’
 Including every languages
 First 128 characters are ASCII characters
 They always begin with 0 (since max for 8bit is 128)
 …So only one byte is needed to represent ASCII characters in Unicode
Multimedia
Images
There are two ways to represent image: Vector graphics and Bitmaps.
 Vector graphics: A graphic consisting of components defined by
 Geometric formulae (e.g. y=x+1 {5<x<1}) and
 Associating properties (e.g. White color, thickness = 500px)
 Bitmaps
Vector graphic Bitmap Defined by Geometric formulae (e.g. y=x+1 {5<x<1}), defined relatively to the imaginary drawing canvas.<br /><br />Associating properties (e.g. color=white, thickness=500px, fill=black) Pixels. Each pixel is defined with a 2D position vector and color. <br /><br />Color depth shows number of bits used per pixel. More bits can represent more color. At least 8 bits are required for colored image. <br /><br />Resolution shows number of pixels in each row / column image. Higher resolution brings more detailed image. Screen resolution could limit number of pixels displayed. Scaling up Scalable.<br/><br />New calculations are applied when size changes. Lowers quality.<br/><br />Total pixels count don’t change. Individual pixels will be evident. File size Small Big.<br/><br />Size (in bits) = width * length * bit_depth File information Requires a header: resolution and coding scheme. File size calculation
Use 1024 (2^10) when converting file sizes.
Kibi (KB) 2^10 = 1024 bytes
Mebi (MB) 2 ^ 20 = 1048 576 bytes
Gibi (GB) 2 ^ 30 bytes
Sound files
Sampling rate
Definition: number of samples taken per second.
More samples taken per second increases sound quality. However, it also increases file size. Nyquist's therm indicates sampling should be done at least twice the highest frequency in the sample. (20 Hz~20k Hz is the human ear's limit)
Analogue to digital convertor
Blue line: actual sound (Continuous data)
Red line: computer recorded sound (Discrete data)
Horizontal dotted lines: defined sound levels the computer can record.The amplitude of sound cannot be measured accurately by a computer. It is approximated to the nearest defined amplitude. This causes quantization error.
Sampling resolution
Sampling resolution indicates how many bits are used to record a sample of sound. If more bits are used there will be more defined amplitudes. This increases sound quality but also increases the file size. Usually 16 bits are used.
File size calculation
Size(in bits) = time * sampling_resolution * sample_rate * channels
Sound editing softwares
Common functions are:
 Combining sound from different sources
 Fading in or out of sound
 Edit the sound to remove noise and other perfections.
Video
Simply a succession of stilled images.
Refresh rate must be >50 times per second so human eye cannot notice the flicker. Due to bandwidth restrictions 50fps is hard to reach.
Interlaced encoding splits each frame into two parts: Even rows and Odd rows. It first refreshes the even rows, then wait, and refresh the odd rows.. Actual refresh rate is 25fps for each row, but when combined it seems 50fps to the eye.
Progressive encoding is another encoding technique. Each frame is displayed completely in that way.
Compression techniques
Lossy compression and lossless compression are coding techniques to reduce file size.
Lossy compression Lossless compression Data loss Some information is lost.<br/>Original file cannot be recovered. No information loss.<br/>Original file can be recovered through subsequent decoding. Examples Sometimes quality is little or not affected, by reducing information human cannot be aware of.<br/>For example, remove sound data over 20 kHz. Human cannot hear it anyway.<br/>Reduce colour depth from 128 bit to 64 bit has hardly any difference to human eye.<br /><br />Most of the time quality is compromised. Huffan encoding (For texts) <br />The mostoften used characters replaced by a shorter code. <br />Prefix priority prevents ambiguity. None of the codes begin with the sequence of bits representing a shorter code.<br />Now the data becomes a bit stream.<br /> e.g. in ASCII ‘a’ is 0110 0001. <8 bits> Now use ‘10‘ to replace it. <becomes 2 bits> <br />Prefix priority means no other code begins with ’10’.<br />To decode after transmission, simply convert ‘10‘ back to ‘a’.<br /><br />Run length encoding (For bitmap files) <br />Converts sequences of same bit into a code. <br />It defines the bit pattern and the number of times it is repeated.<br /><br />e.g. In a 8bit bitmap file there’s 5 black pixels.<br/>Instead of saying 0000 0000, 0000 0000, 0000 0000, 0000 0000, 0000 0000<br/>We can say 0101, 0000 0000 , meaning five times of colour 0000 0000 is repeated.
 Number systems