INFORMATION THEORY & BASIC TECHNIQUE

Presentation titled "INFORMATION THEORY & BASIC TECHNIQUE". Presentation transcript:

1 INFORMATION THEORY & BASIC TECHNIQUE
CHAPTER 2: INFORMATION THEORY & BASIC TECHNIQUE (Universitas Telkom)

2 CHAPTER 2A: INFORMATION THEORY (Universitas Telkom)

3 Objectives
Know the terms and definitions: the value of information and entropy
Understand the source coding theorem
Understand the channel coding theorem
Be familiar with the Shannon limit

4 VALUE OF INFORMATION
Consider a probabilistic experiment with a discrete random variable S, S = {s1, s2, ..., sN}.
The amount of information produced by the event sk is: I(sk) = log(1/pk) = -log pk
If pk = 1 (the event is certain to occur), then I(sk) = 0: an event that is certain to happen carries no information.
Properties of the information value:
I(sk) >= 0 for 0 <= pk <= 1
I(sk) > I(si) if pk < pi
An event with a smaller probability of occurring carries a larger amount of information when it does occur.
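As a rough illustration of the definition above (a sketch, not from the slides; the probabilities are made up):

import math

def information_value(p, base=2):
    # Self-information I(s) = -log_base(p) of an event with probability p.
    return -math.log(p, base)

print(information_value(1.0))    # 0.0 bits: a certain event carries no information
print(information_value(0.5))    # 1.0 bit
print(information_value(0.125))  # 3.0 bits: rarer events carry more information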

5 VALUE OF INFORMATION
If sk and si are independent, then: I(sk si) = I(sk) + I(si)
The base of the logarithm used to compute the information value in the equations above can vary. For digital systems that use binary numbers, base 2 is used.

6 ENTROPY
Entropy H is the average information value per symbol at the output of a given information source:
H = E[I(sk)] = - sum_k pk log2(pk)   bits/symbol
For a binary source (N = 2) with probabilities of occurrence p and (1 - p):
H = -(p log2(p) + (1 - p) log2(1 - p))

7 ENTROPY
Some notes on entropy:
The "bit" as a unit of information for base 2 is not the same thing as a binary digit.
Entropy connotes uncertainty: the entropy is maximum when the outcome is most uncertain.
Example: for a coin toss with equal probabilities (0.5), the outcome is hard to predict (uncertain), so the entropy is maximum.

8 ENTROPY OF 2 EVENTS
For a binary source with probabilities of occurrence p and 1 - p:
H(p) = -p log2(p) - (1 - p) log2(1 - p), which reaches its maximum of 1 bit/symbol at p = 0.5.

9 ENTROPY: EXAMPLE
Example: compute the entropy (average information value) in bits/character of the 26-letter Latin alphabet when:
a. every letter is equally probable
b. the probabilities are distributed as follows:
p = 0.10 for the letters a, e, o, t
p = 0.07 for the letters h, i, n, r, s
p = 0.02 for the letters c, d, f, l, m, p, u, y
p = 0.01 for the remaining 9 letters
Answer:
a. H = log2(26) = 4.70 bits/character
b. H = -(4 x 0.10 log2(0.10) + 5 x 0.07 log2(0.07) + 8 x 0.02 log2(0.02) + 9 x 0.01 log2(0.01)) = 4.17 bits/character
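A quick numerical check of this example (a sketch using the letter groups and probabilities given on the slide):

import math

def entropy(probs):
    # Average information H = -sum p*log2(p), in bits per symbol.
    return -sum(p * math.log2(p) for p in probs if p > 0)

# (a) 26 equally likely letters
print(entropy([1/26] * 26))                      # ~4.70 bits/character

# (b) 4 letters at 0.10, 5 at 0.07, 8 at 0.02 and the remaining 9 at 0.01
probs = [0.10]*4 + [0.07]*5 + [0.02]*8 + [0.01]*9
print(sum(probs))                                # 1.00 (sanity check)
print(entropy(probs))                            # ~4.17 bits/character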

10 DISCRETE MEMORYLESS CHANNEL (DMC)
Characteristics of a DMC:
Discrete input
Discrete output
The channel output at any instant depends only on the channel input at that same instant, not on earlier or later inputs (the channel has no memory).

11 DISCRETE MEMORYLESS CHANNEL (DMC)
In a memoryless channel the symbols are each independent, so the probability of an output sequence is the product of the per-symbol transition probabilities:
P(Z|U) = p(z1|u1) p(z2|u2) ... p(zN|uN)
Z = z1, z2, ..., zm, ..., zN : output sequence
U = u1, u2, ..., um, ..., uN : input sequence
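A minimal sketch of this memoryless product rule; the binary transition matrix below is an arbitrary illustration, not taken from the slides:

def sequence_probability(transition, inputs, outputs):
    # P(Z|U) = product of p(z_m | u_m) for a memoryless channel.
    p = 1.0
    for u, z in zip(inputs, outputs):
        p *= transition[u][z]
    return p

# Hypothetical binary channel: transition[u][z] = p(z|u)
transition = {0: {0: 0.9, 1: 0.1},
              1: {0: 0.2, 1: 0.8}}
print(sequence_probability(transition, [0, 1, 1, 0], [0, 1, 0, 0]))
# = 0.9 * 0.8 * 0.2 * 0.9 = 0.1296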

12 BINARY SYMMETRIC CHANNEL (BSC)
The BSC is a special case of the DMC in which the input and output are binary (0 and 1).
Conditional (transition) probabilities: p(0|1) = p(1|0) = p and p(1|1) = p(0|0) = 1 - p

13 MUTUAL INFORMATION
The mutual information states the amount of information conveyed when xi is transmitted and yj is received:
I(xi; yj) = log2 [ p(xi|yj) / p(xi) ]
The average mutual information is:
I(X;Y) = sum_i sum_j p(xi, yj) log2 [ p(xi|yj) / p(xi) ]
I(X;Y) involves both a transmitter and a receiver, whereas the entropy H(X) states only the average number of bits per symbol of the source; it says nothing about transmission (it is unrelated to the channel). The expression for I(X;Y) above can be rewritten as:
I(X;Y) = sum_i sum_j p(xi, yj) log2 [ 1 / p(xi) ] - sum_i sum_j p(xi, yj) log2 [ 1 / p(xi|yj) ]

14 EQUIVOCATION
Noting that
H(X) = sum_i sum_j p(xi, yj) log2 [ 1 / p(xi) ]
and
H(X|Y) = sum_i sum_j p(xi, yj) log2 [ 1 / p(xi|yj) ]
it follows that
I(X;Y) = H(X) - H(X|Y)
where H(X|Y) is called the equivocation. Qualitatively, the average amount of information transferred (the mutual information) equals the source entropy H(X) minus the equivocation, which represents the loss in the noisy channel.

15 EQUIVOCATION
The mutual information expression above can also be written as:
I(X;Y) = H(Y) - H(Y|X)

16 MUTUAL INFORMATION I(X;Y) ON THE BSC CHANNEL
With input probability p(x1) = a and crossover probability p:
p(y1) = (1 - a)p + a(1 - p) = a + p - 2ap
Since H(Y|X) = Hb(p) for a BSC, the mutual information becomes
I(X;Y) = H(Y) - H(Y|X) = Hb(a + p - 2ap) - Hb(p)
where Hb(q) = -q log2(q) - (1 - q) log2(1 - q) is the binary entropy function.
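A numerical sketch of this result; the variable names follow the slide (a = p(x1), p = crossover probability), and the sample values are arbitrary:

import math

def h_b(q):
    # Binary entropy function, in bits.
    if q in (0.0, 1.0):
        return 0.0
    return -q * math.log2(q) - (1 - q) * math.log2(1 - q)

def bsc_mutual_information(a, p):
    # I(X;Y) = Hb(a + p - 2ap) - Hb(p) for a BSC with input probability a and crossover p.
    return h_b(a + p - 2 * a * p) - h_b(p)

print(bsc_mutual_information(0.5, 0.1))   # ~0.531 bits
print(bsc_mutual_information(0.5, 0.0))   # 1.0 bit (noiseless channel)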

17 Limitations in designing a DCS
The Nyquist theoretical minimum bandwidth requirement
The Shannon-Hartley capacity theorem (and the Shannon limit)
Government regulations
Technological limitations
Other system requirements (e.g. satellite orbits)

18 Shannon limit
Channel capacity: the maximum data rate at which error-free communication over the channel is possible.
Channel capacity of an AWGN channel (Shannon-Hartley capacity theorem):
C = W log2(1 + S/N)   [bits/s]
where W is the bandwidth in Hz and S/N is the signal-to-noise ratio.
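A small sketch of the Shannon-Hartley formula; the bandwidth and SNR values below are just illustrative:

import math

def channel_capacity(bandwidth_hz, snr_linear):
    # AWGN channel capacity C = W * log2(1 + S/N), in bits per second.
    return bandwidth_hz * math.log2(1 + snr_linear)

# Example: a 3.1 kHz channel with 30 dB SNR
snr = 10 ** (30 / 10)                  # 30 dB -> 1000 (linear)
print(channel_capacity(3100, snr))     # ~30,898 bits/s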

19 Shannon limit
The Shannon theorem puts a limit on the transmission data rate, not on the error probability:
It is theoretically possible to transmit information at any rate R <= C with an arbitrarily small error probability by using a sufficiently complicated coding scheme.
For an information rate R > C, it is not possible to find a code that can achieve an arbitrarily small error probability.

20 Shannon limit
[Figure: normalized capacity C/W in bits/s/Hz versus SNR, with the unattainable region above the capacity curve and the practical region below it.]

21 Shannon limit
There exists a limiting value of Eb/N0 below which there can be no error-free communication at any information rate.
By increasing the bandwidth alone, the capacity cannot be increased to any desired value.

22 Shannon limit
[Figure: W/C in Hz/(bit/s) versus Eb/N0; the Shannon limit at about -1.6 dB separates the practical region from the unattainable region.]
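The -1.6 dB figure can be checked numerically: setting R = C in C = W log2(1 + (Eb/N0)(C/W)) gives Eb/N0 = (2^(C/W) - 1)/(C/W), which approaches ln 2 as C/W goes to 0 (infinite bandwidth). A small sketch:

import math

def ebn0_min(spectral_efficiency):
    # Minimum Eb/N0 (linear) needed at R = C, where spectral_efficiency = C/W in bits/s/Hz,
    # obtained by solving C = W*log2(1 + (Eb/N0)*(C/W)) for Eb/N0.
    return (2 ** spectral_efficiency - 1) / spectral_efficiency

for eff in (2.0, 1.0, 0.1, 0.001):
    print(eff, round(10 * math.log10(ebn0_min(eff)), 2), "dB")
# 2.0 -> 1.76 dB, 1.0 -> 0.0 dB, 0.1 -> -1.44 dB, 0.001 -> -1.59 dB
print(round(10 * math.log10(math.log(2)), 2), "dB")   # limiting value ln 2, about -1.59 dB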

23 Bandwidth efficiency plane
[Figure: bandwidth efficiency R/W in bits/s/Hz versus Eb/N0. The R = C curve (Shannon limit) separates the unattainable region (R > C) from the practical region (R < C). MPSK and MQAM points (M = 2, 4, 8, 16, 64, 256) lie toward the bandwidth-limited region; MFSK points (M = 2, 4, 8, 16) lie toward the power-limited region.]

24 Power and bandwidth limited systems
Two major communication resources: transmit power and channel bandwidth.
In many communication systems, one of these resources is more precious than the other. Hence, systems can be classified as:
Power-limited systems: save power at the expense of bandwidth (for example by using coding schemes)
Bandwidth-limited systems: save bandwidth at the expense of power (for example by using spectrally efficient modulation schemes)

25 CHAPTER 2B: BASIC TECHNIQUE (Universitas Telkom)

26 Interpixel Redundancy
There is often correlation between adjacent pixels, i.e., the values of the neighbors of an observed pixel can often be predicted from the value of the observed pixel.
Coding methods: run-length coding and difference coding.

27 Run-Length Coding
Run-length coding is a very widely used and simple compression technique which does not assume a memoryless source.
We replace runs of symbols (possibly of length one) with (run-length, symbol) pairs.
Run-length coding is a lossless technique.
For images, the maximum run length is the size of a row.

28 Run-Length Coding
Every code word is made up of a pair (g, l) where g is the gray level and l is the number of consecutive pixels with that gray level (the length of the run).
E.g., the row 56 56 56 82 82 82 83 80 80 80 80 56 56 56 56 56 creates the run-length code (56, 3)(82, 3)(83, 1)(80, 4)(56, 5).
The code is calculated row by row.
Very efficient coding for binary data.
It is important to know the position, and the image dimensions must be stored with the coded image.
Used in most fax machines.

29 Run-Length Coding
Suppose we have a sequence of 17 values containing runs. We could code it by saying: we have one 1, three 2's, two 1's, and so on.
In run-length code this would take only 12 values instead of 17.
Run-length coding is of no use if the data contain no runs: a sequence of five distinct values would be coded as five (length, value) pairs, taking ten values.

30 Run-Length Coding
We also have to decide and specify how many spaces (digits) we will allocate for the data value and how many for the run-length value.
In the example above, the values and the run lengths are all less than 10; the spaces are inserted only to explain the principle.
If we did not know the allocation of digits between values and run lengths, the same code could be misread, for example as 11 3's, 22 1's, 53 2's and 14 6's.
It would be inefficient to allocate this space without considering the original data.

31 Run-Length Coding
Runs with different characters: send the actual character together with the run length.
HHHHHHHUFFFFFFFFFYYYYYYYYYYYDGGGGG
code = 7, H, 1, U, 9, F, 11, Y, 1, D, 5, G
Savings in bits (considering ASCII): ? (see the sketch below)
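A minimal run-length coder along these lines (a sketch, not necessarily the slides' exact scheme):

def rle_encode(data):
    # Replace runs of identical symbols with (run_length, symbol) pairs.
    encoded = []
    i = 0
    while i < len(data):
        run = 1
        while i + run < len(data) and data[i + run] == data[i]:
            run += 1
        encoded.append((run, data[i]))
        i += run
    return encoded

def rle_decode(pairs):
    return "".join(symbol * run for run, symbol in pairs)

text = "HHHHHHHUFFFFFFFFFYYYYYYYYYYYDGGGGG"
code = rle_encode(text)
print(code)                       # [(7,'H'), (1,'U'), (9,'F'), (11,'Y'), (1,'D'), (5,'G')]
print(rle_decode(code) == text)   # True: run-length coding is lossless

Under the assumption of 8 bits per ASCII character and 8 bits per run-length count, the original 34 characters take 272 bits while the 6 (run-length, symbol) pairs take 96 bits, a saving of 176 bits.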

32 Run-Length Coding

33 Run-Length Coding

34 Run-Length Coding Facsimile Compression
ITU standard (A4 document, 210 by 297 mm) 1728 pixels per line If 1 bit for each pixel, then over 3 million bits for each page A typical page contains many consecutive white or black pixels -> RLE

35 Run-Length Coding
Run lengths may vary from 0 to 1728 -> many possibilities, and inefficiency with a fixed-size code.
Some runs occur more frequently than others; e.g. most typed pages contain 80% white pixels, and the spacing between letters is fairly consistent => the probabilities of certain runs are predictable => use a frequency-dependent code on the run lengths.

36 Run-Length Coding
Some facsimile compression codes (terminating codes, for runs of fewer than 64 pixels).
[Table: pixels in the run vs. code for white runs and code for black runs.]

37 Run-Length Coding
Some facsimile compression codes (make-up codes, for runs of 64 pixels or more, e.g. 256, 512, ...).
[Table: pixels in the run vs. code for white runs and code for black runs.]
A run of 129 white pixels is coded as a make-up code (128 white) followed by a terminating code (1 white).
The codes have the prefix property (no code word is a prefix of another) and give better compression for long runs.

38 Difference coding
f(xi) = xi          if i = 0
f(xi) = xi - xi-1   if i > 0
E.g., a row of original values and the corresponding code f(xi) (worked out in the sketch below).
The code is calculated row by row.
Both run-length coding and difference coding are reversible, and can be combined with, e.g., Huffman coding.
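A sketch of difference coding and its inverse; the sample row is illustrative, reusing the gray levels from the run-length example above:

def difference_encode(values):
    # f(x_i) = x_i for i = 0, x_i - x_{i-1} for i > 0.
    return [values[0]] + [values[i] - values[i - 1] for i in range(1, len(values))]

def difference_decode(codes):
    values = [codes[0]]
    for d in codes[1:]:
        values.append(values[-1] + d)
    return values

row = [56, 56, 56, 82, 82, 82, 83, 80, 80, 80, 80]
code = difference_encode(row)
print(code)                              # [56, 0, 0, 26, 0, 0, 1, -3, 0, 0, 0]
print(difference_decode(code) == row)    # True: difference coding is reversible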

39 Difference coding : Example

40 Difference coding : Example

41 SCALAR QUANTIZATION
Reduces the amount of data; lossy compression.
If the data take the form of large numbers, they are converted to smaller numbers; not all of the data is used.
If the data to be compressed are analog, quantization is used to sample and digitize them into a small number of levels.
The smaller the number of levels, the better the compression, but also the greater the loss of information.

42 SCALAR QUANTIZATION: EXAMPLE
8-bit data -> delete the LSB -> 7-bit data.
Input data in [0, 255] -> keep only the quantized values 0, s, 2s, ..., ks (the largest multiple of s not exceeding 255):
s = 3 -> output data: 0, 3, 6, 9, 12, ..., 255
s = 4 -> output data: 0, 4, 8, 12, ..., 252, 255
PCM (Pulse Code Modulation) for voice: a 4 kHz (analog) voice signal is sampled at 8000 samples/s and encoded with 8 bits per sample = 64 kbps.
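A rough sketch of a uniform scalar quantizer with step size s, assuming 8-bit input data; the sample values are arbitrary:

def quantize(value, step):
    # Map an input in [0, 255] to the nearest lower multiple of the step size.
    return (value // step) * step

samples = [0, 5, 13, 100, 254, 255]
print([quantize(v, 4) for v in samples])   # [0, 4, 12, 100, 252, 252]
# The discarded differences (5->4, 13->12, 255->252) are the information loss
# that makes scalar quantization a lossy technique.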

43 STATISTICAL METHODS
Use variable-size codes: shorter codes are assigned to symbols that appear more often (have a higher probability of occurrence).
Examples: Morse code, Huffman code, etc.

44 FIXED LENGTH CODE
Each symbol is represented by a fixed-length code.
Example: ASCII code -> code length: 7 bits + 1 parity bit = 8 bits.
Total number of bits = number of characters * 8 bits.

45 VARIABLE SIZE CODE
Assign a code that can be decoded unambiguously.
Assign a code with the minimum average size.
Example: four symbols a1, a2, a3, a4 with probabilities 0.49, 0.25, 0.25 and 0.01.
Entropy = 1.57 bits/symbol; the variable-size code on the slide (codewords such as 1, 01, 000, 001) achieves an average length of 1.77 bits/symbol.
If the symbols had equal probability (0.25 each), the entropy would be 2 bits/symbol.
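A quick check of these numbers (a sketch; the code {1, 01, 000, 001} is one assignment consistent with the codeword lengths on the slide):

import math

probs = [0.49, 0.25, 0.25, 0.01]
code  = ["1", "01", "000", "001"]

entropy = -sum(p * math.log2(p) for p in probs)
avg_len = sum(p * len(c) for p, c in zip(probs, code))

print(round(entropy, 2))       # 1.57 bits/symbol
print(round(avg_len, 2))       # 1.77 bits/symbol average code length
print(math.log2(len(probs)))   # 2.0 bits/symbol if all four symbols were equally likely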

46 Prefix Code (= Prefix-free Code)
A prefix code is a type of code system (typically a variable-length code) distinguished by its "prefix property": no valid code word in the system is a prefix (start) of any other valid code word in the set.
With a prefix code, a receiver can identify each word without requiring a special marker between words.

47 Prefix Code: Example 1
The code with code words {9, 59, 55} has the prefix property; a code consisting of {9, 5, 59, 55} does not, because "5" is a prefix of both "59" and "55".
A prefix code is an example of a uniquely decodable code.
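A small helper to test the prefix property (a sketch; has_prefix_property is an illustrative name, not from the slides):

def has_prefix_property(codewords):
    # True if no codeword is a prefix of another codeword.
    for a in codewords:
        for b in codewords:
            if a != b and b.startswith(a):
                return False
    return True

print(has_prefix_property(["9", "59", "55"]))        # True
print(has_prefix_property(["9", "5", "59", "55"]))   # False: "5" is a prefix of "59" and "55"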

48 Binary Prefix Code
A binary prefix code can be represented as a binary tree.
Its characteristic feature: every symbol (code word) is a leaf node; none is an internal node.

49 Prefix Code: Example 2
Prefix-free code {01, 10, 11, 000, 001}.
If ni = the number of codewords of length i bits, then:
n2 = 3 (there are 3 codewords at level 2)
n3 = 2 (there are 2 codewords at level 3)

50 Prefix Code: Example 3
Code {0, 01, 011, 0111} is not a prefix-free code; neither is code {0, 01, 11}.

51 The Kraft-McMillan Inequality
Theorem 1: If C is a uniquely decodable code consisting of N codewords, then the codeword lengths satisfy the inequality
sum_{j=1..N} b^(-lj) = sum_{i=1..M} ni b^(-i) <= 1
where:
N = number of codewords
lj = length of the j-th codeword (in bits)
ni = number of codewords of length i bits
b = base (here b = 2)
M = maximum codeword length (M bits)

52 Example
{01, 10, 11, 000, 001}
{0, 01, 011, 0111}
{0, 01, 11, 111}
(The Kraft-McMillan sums for these three codes are worked out in the sketch below.)
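A sketch that checks the Kraft-McMillan sums of the three example codes above (kraft_sum is just an illustrative helper):

def kraft_sum(codewords, b=2):
    # Sum of b^(-l_j) over all codewords; <= 1 is necessary for unique decodability.
    return sum(b ** -len(c) for c in codewords)

for code in (["01", "10", "11", "000", "001"],
             ["0", "01", "011", "0111"],
             ["0", "01", "11", "111"]):
    print(code, kraft_sum(code))
# 1.0    -> satisfies the inequality (and is a prefix code)
# 0.9375 -> satisfies the inequality (but is not a prefix code)
# 1.125  -> violates the inequality, so it cannot be uniquely decodable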

53 The Kraft-McMillan Inequality
Every code that does not satisfy the Kraft-McMillan inequality is certainly not a uniquely decodable code.
Every uniquely decodable code satisfies the Kraft-McMillan inequality; it may or may not be a prefix-free code.
A code that satisfies the Kraft-McMillan inequality is not necessarily uniquely decodable.

54 Example
{0, 01, 110, 111}  Decode:
{1, 10, 110, 111}  Decode:

55 The Kraft-McMillan Inequality
Theorem 2: For every set of codeword lengths that satisfies the Kraft-McMillan inequality, a prefix code can always be constructed with those codeword lengths.

56 The Kraft-McMillan Inequality
In other words, every prefix-free code satisfies the Kraft-McMillan inequality, and
for every composition of codeword lengths that satisfies the Kraft-McMillan inequality, a prefix code can be constructed.

57 Example
{0, 01, 110, 111}: not uniquely decodable, but its composition of codeword lengths satisfies the Kraft-McMillan inequality.
{0, 10, 110, 111}: a prefix code with the same composition of codeword lengths as the code above.

58 ASSIGNMENT #1
Write a description paper (complete with the algorithms) about the Tunstall code and the Golomb code, in Bahasa Indonesia.

