IKI10230 Introduction to Computer Organization, Chapter 12: Memory

Sources:
1. Paul Carter, PC Assembly Language
2. Hamacher, Computer Organization, 5th ed.
3. Lecture materials from CS61C/2000 and CS152/1997, UCB

19 May 2004
L. Yohanes Stefanus, Bobby Nazief
Lecture materials:

Memory: Where Data Is Stored

Figure: the five components of a computer. The processor (active) contains control (the “brain”) and the datapath (the “brawn”). Memory (passive) is where programs and data live when running. The devices are input (keyboard, mouse), output (display, printer), and disk (permanent storage).

Any computer, no matter how primitive or advanced, can be divided into five parts:
1. The input devices bring data from the outside world into the computer.
2. These data are kept in the computer’s memory until ...
3. The datapath requests and processes them.
4. The operation of the datapath is controlled by the computer’s controller.
5. The output devices get the data back to the outside world. Without them, none of the work done by the computer would do us any good.

The most common way to connect these five components is a network of buses.

Connection: Memory - Processor
Figure: the processor and memory are connected by three groups of lines.
k-bit address bus (from the MAR): up to 2^k addressable locations.
n-bit data bus (to and from the MDR): word length = n bits.
Control lines: R/W, MFC, etc.

Internal Memory Organization

Memory is organized as an array of memory cells.
Each cell holds 1 bit of information.
A row of cells forms one word.
Example: a 128 x 8 memory contains 128 words, and each word consists of 8 data bits.
Memory capacity: 128 x 8 = 1024 bits.
An address decoder selects which word (row) is accessed: the address is the index of the row in the array.
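
The arithmetic in the 128 x 8 example can be checked with a short Python sketch. The function names `address_bits` and `capacity_bits` are illustrative and do not come from the lecture.

```python
def address_bits(num_words: int) -> int:
    """Number of address lines needed to select one of num_words rows."""
    if num_words < 1:
        raise ValueError("a memory needs at least one word")
    # (num_words - 1).bit_length() equals ceil(log2(num_words)) for num_words >= 1
    return (num_words - 1).bit_length()


def capacity_bits(num_words: int, word_size: int) -> int:
    """Total number of 1-bit cells in a num_words x word_size array."""
    return num_words * word_size


if __name__ == "__main__":
    words, width = 128, 8
    print(address_bits(words))          # 7 address lines
    print(capacity_bits(words, width))  # 1024 bits
```

For a word count that is not a power of two the decoder still needs enough lines to cover every row, which is why the sketch rounds up.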

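Memory Organization: 1-Level-Decode SRAM (128 x 8)

Figure: a 128-word array of memory cells, each word holding 8 data bits. The address decoder takes address lines A0 to A6 and drives word lines W0 to W127. Each column has a pair of bit lines (b7/b7’ ... b1/b1’, b0/b0’) connected to sense/write amps, which are controlled by R/W’ and CS and connect to the input/output lines d7 ... d1, d0.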

Static RAM (SRAM)

SRAM keeps its state (the RAM contents) as long as the power-supply voltage is present.
Very fast: 10 ns.
Low density (bits per chip): it needs 6 transistors per cell, so it is expensive.
Technology of choice for very fast, small-capacity memory: cache.

Review: Static RAM Cell

6-transistor SRAM cell. The latch stores 1 bit of state, and the transistors T act as switches. The figure shows the cell in state 1. The latch can be changed by putting the bit value on b and b’ and pulling the word line (row select) high.

Write:
1. Drive the bit lines according to the bit (e.g. b = 1, b’ = 0).
2. Select the row: the values on b and b’ are stored as the state of the latch.

Read:
1. Precharge (set) both bit lines high.
2. Select the row.
3. The sense amp detects which bit line goes low, which gives the state bit.

The classical SRAM cell consists of two back-to-back inverters that serve as a flip-flop. In the expanded view you can see that it consists of 6 transistors. To write a value into the cell, you need to drive it from both sides. For example, to write a 1, you drive “bit” to 1 while at the same time driving “bit bar” to 0. Once the bit lines are driven to their desired values, you turn on the two access transistors by setting the word line high, so the values on the bit lines are written into the cell.

These are very tiny transistors, so we cannot rely on them to drive the long bit lines effectively during a read. Also, the pull-down devices are usually much stronger than the pull-up devices. So the first step of a read is to charge both bit lines high. Once the bit lines are charged, we turn on the two access transistors, and one of the inverters (the lower one in our example) starts pulling one bit line low while the other bit line remains high. It would take this small inverter a long time to drive the long bit line all the way low, but we do not have to wait that long: all we need is to detect the difference between the two bit lines. Any circuit designer will tell you it is much easier to detect a “differential signal” (bit versus bit bar) than an absolute signal.

Dynamic RAM (DRAM)

Slower than SRAM: access time ~60 ns (fastest: 35 ns).
Nonpersistent: every row must be accessed (refreshed) every ~1 ms.
High density: 1 transistor/bit.
Cheaper than SRAM: ~$1/MByte [2002].
Fragile: sensitive to electrical noise, light, radiation.
Technology of choice for large-capacity, low-cost memory: main memory.

Review: 1-Transistor Memory Cell (DRAM)

Figure: a transistor T, gated by row select, connects a capacitor C to the bit line.
The capacitor stores the state: 1 (charged) or 0 (discharged). It needs refresh!

Write:
1. Drive the bit line.
2. Select the row (T acts as a switch).

Read:
1. Select the row.
2. The sense amp (connected to the bit line) senses the value against a threshold and drives the line accordingly.
3. Write: restore the value (high or low).

Refresh: just do a dummy read of every cell.

The state-of-the-art DRAM cell has only one transistor. The bit is stored in a tiny capacitor. The write operation is very simple: drive the bit line and select the row by turning on the pass transistor. For a read, we precharge the bit line high and then turn on the pass transistor. This causes a small voltage change on the bit line, and a very sensitive amplifier measures this small change with respect to a reference bit line. The stored value is destroyed by the read operation, so an automatic write-back has to be performed at the end of every read.
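
The destructive read, the write-back, and the need for refresh can be illustrated with a toy Python model. This is only a sketch: the class `DramCell`, its methods, and the all-or-nothing leakage after `retention_ms` are assumptions made for illustration, not a description of real circuit behavior.

```python
class DramCell:
    """Toy model of a 1-transistor DRAM cell (illustrative only)."""

    def __init__(self, retention_ms: float = 1.0):
        self.charged = False              # capacitor state: True = 1, False = 0
        self.age_ms = 0.0                 # time since the last write or refresh
        self.retention_ms = retention_ms  # assumed time before the charge is lost

    def write(self, bit: int) -> None:
        """Drive the bit line and select the row: the capacitor takes the value."""
        self.charged = bool(bit)
        self.age_ms = 0.0

    def tick(self, ms: float) -> None:
        """Let time pass; past the retention time the charge has leaked away."""
        self.age_ms += ms
        if self.age_ms > self.retention_ms:
            self.charged = False

    def read(self) -> int:
        """Sense the value (which drains the capacitor), then write it back."""
        value = int(self.charged)
        self.charged = False   # the read itself destroys the stored value
        self.write(value)      # automatic write-back restores it
        return value

    def refresh(self) -> None:
        """A refresh is just a dummy read."""
        self.read()
```

A cell that is refreshed within its retention time keeps its 1; a cell left alone for longer reads back as 0.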

Classical DRAM Organization (square)

Figure: a square RAM cell array. A row decoder, fed by the row address, drives the word (row) select lines. The bit (data) lines run to the column selector and I/O circuits, which are fed by the column address and connect to the data pin. Each intersection represents a 1-T DRAM cell. The row and column address together select 1 bit at a time.

Similar to SRAM, DRAM is organized into rows and columns. But unlike SRAM, which lets you read an entire row at a time as a word, classical DRAM only lets you read out one bit at a time. The reason is to save power as well as area. The DRAM cell is very small and there are a lot of them across each row, so it is very difficult to build a sense amplifier for each column within the area constraint, and a sense amplifier per column would also consume a lot of power. You select the bit you want to read or write by supplying a row address and then a column address. As in SRAM, each row control line is referred to as the word line and each vertical data line is referred to as the bit line.

DRAM-based Memory Systems

Figure: an n-bit address goes to the DRAM controller, which passes it to a 2^n x 1 DRAM chip as row and column addresses on n/2 lines. Data (w bits) pass through bus drivers.

DRAM Operation

Row address (~50 ns):
Set the row address on the address lines and strobe RAS.
The entire row is read and stored in the column latches.
The contents of the row’s memory cells are refreshed.

Column address (~10 ns):
Set the column address on the address lines and strobe CAS.
Access the selected bit.
READ: transfer from the selected column latch to Dout.
WRITE: set the selected column latch to Din.

Rewrite/refresh (~30 ns):
Write back the entire row.
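
Because the row and column addresses share the same address lines, the controller has to split each address into two halves. A minimal Python sketch follows, assuming a square array (an even number of address bits) and assuming the high-order half is the row; the function name `split_address` is hypothetical.

```python
def split_address(addr: int, n_bits: int) -> tuple:
    """Split an n-bit address into (row, column), each n_bits // 2 bits wide."""
    if n_bits % 2 != 0:
        raise ValueError("this sketch assumes a square array (even n_bits)")
    if not 0 <= addr < (1 << n_bits):
        raise ValueError("address out of range")
    half = n_bits // 2
    row = addr >> half              # assumed: high-order bits, sent with RAS
    col = addr & ((1 << half) - 1)  # assumed: low-order bits, sent with CAS
    return row, col


if __name__ == "__main__":
    row, col = split_address(0xABCD, 16)
    print(hex(row), hex(col))  # 0xab 0xcd
```

With a 16-bit address, only 8 address lines are needed on the chip, at the cost of sending the address in two steps.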

DRAM: Timing Performance

Access time = 60 ns < cycle time = 90 ns, because the row has to be rewritten.
Asynchronous model: the memory operation is carried out by the controller circuit, which adds delay. The processor waits until the cycle time has elapsed before issuing another request.

Must refresh periodically:
Perform a complete memory cycle for each row.
Approximately every 1 ms.
Handled in the background by the memory controller.
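
One way to read the 60 ns and 90 ns figures is as sums of the approximate phase times quoted for DRAM operation (~50 ns row, ~10 ns column, ~30 ns rewrite). The Python sketch below does that arithmetic; treating the phases as strictly additive is an assumption, and the function names are illustrative.

```python
ROW_NS = 50      # row address phase (strobe RAS)
COLUMN_NS = 10   # column address phase (strobe CAS)
REWRITE_NS = 30  # write-back of the entire row


def access_time_ns() -> int:
    """Time until the selected bit is available: row phase plus column phase."""
    return ROW_NS + COLUMN_NS


def cycle_time_ns() -> int:
    """Minimum time between two requests: the row must be rewritten first."""
    return access_time_ns() + REWRITE_NS


def max_requests_per_second() -> int:
    """Upper bound on back-to-back requests, one per full cycle."""
    return 1_000_000_000 // cycle_time_ns()


if __name__ == "__main__":
    print(access_time_ns())  # 60
    print(cycle_time_ns())   # 90
```

The gap between the two numbers is the time the processor spends waiting even though its data has already arrived.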

Developments in DRAM Memory Technology

Memory technology: access speed has improved very slowly.
The gap with processor speed keeps widening (processor cycle times are very small, around 1 ns, while memory access is on the order of tens of ns).

Developments in DRAM technology:
The basis stays the same: a 1-transistor memory cell (using a capacitor).
Innovation comes from the way memory is accessed:
cutting access time (e.g. CAS is not required);
burst mode: fetching as much data as possible at once (entire words);
this requires additional circuitry: registers, latches, etc.

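Enhanced Performance DRAMs

Conventional access: row + column. Strobe sequence: RAS CAS RAS CAS ...
Page mode: row + series of columns. Strobe sequence: RAS CAS CAS CAS ... Gives successive bits.
Video RAM: shifts out an entire row sequentially, at video rate.

Figure: a 256x256 cell array with a row address latch (strobed by RAS), sense/write amps, and a column latch and column decoder (strobed by CAS). Address bits A15-A8 (row) and A7-A0 (column) share 8 address lines. The entire row is buffered in the sense/write amps and column latch. R/W’ selects read or write.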

Fast Page Mode Operation

Fast Page Mode (FPM) DRAM:
An N x M “SRAM” register saves a row.
After a row is read into the register, only CAS is needed to access other M-bit blocks on that row.
RAS’ remains asserted while CAS’ is toggled.

EDO DRAM: a more modern FPM DRAM.

Figure: a DRAM array of N rows by N columns, addressed by the row address, feeds an N x M “SRAM” register. The column address selects M bits from the register for the M-bit output.

Timing diagram: RAS’ is asserted once with the row address on A. CAS’ is then toggled four times, each time with a new column address on A, giving the 1st, 2nd, 3rd, and 4th M-bit accesses.

With this register in place, all we need to do is assert RAS to latch in the row address; the entire row is then read out and saved into the register. After that, you only need to provide the column address and assert CAS to access other M-bit blocks within the same row. Note that even though the word “SRAM” is used here, this is no ordinary SRAM. It has to be very small, but the good thing is that it is internal to the DRAM and does not have to drive any external load. This type of operation, where RAS remains asserted while CAS is toggled to bring in a new column address, is called Page Mode operation. The “SRAM” stores the row so that the row access does not have to be repeated. It will become clearer why this is called Page Mode operation when we look into the operation of the SPARCstation 20 memory system.
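
The benefit of page mode can be estimated with a small Python sketch, reusing the approximate phase times quoted earlier for DRAM operation (~50 ns row, ~10 ns column, ~30 ns rewrite). The timing model is a simplification assumed for illustration: conventional access pays a full cycle per access, while page mode pays the row phase and the rewrite once per row. The function names are hypothetical.

```python
def conventional_ns(num_accesses: int, row_ns: int = 50,
                    col_ns: int = 10, rewrite_ns: int = 30) -> int:
    """Every access repeats the full RAS, CAS, rewrite cycle."""
    return num_accesses * (row_ns + col_ns + rewrite_ns)


def page_mode_ns(num_accesses: int, row_ns: int = 50,
                 col_ns: int = 10, rewrite_ns: int = 30) -> int:
    """One RAS for the row, one CAS per access, one rewrite at the end."""
    if num_accesses == 0:
        return 0
    return row_ns + num_accesses * col_ns + rewrite_ns


if __name__ == "__main__":
    # Four accesses to the same row, as in the timing diagram.
    print(conventional_ns(4))  # 360
    print(page_mode_ns(4))     # 120
```

Under these assumptions the saving grows with the number of accesses that fall in the same row, which is exactly the case page mode is designed for.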

SDRAM & DDR SDRAM

SDRAM: Synchronous DRAM
Address and data are buffered in registers.
Burst mode: read/write of different data lengths; CAS signals are provided internally.
Standards: PC100, PC133.

DDR SDRAM: Double-Data-Rate SDRAM
Data is transferred on both edges of the clock.
The cell array is organized in 2 banks, which allows interleaving of word accesses.
Standards: PC2100, PC2300.

RDRAM: Rambus DRAM
High transfer rate using differential signaling.
Memory cells are organized in multiple banks.
Standards: proprietary, owned by Rambus Inc.
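
The effect of transferring data on both clock edges can be shown with a peak-bandwidth calculation in Python. The lecture does not give bus widths or clock rates; the 64-bit (8-byte) module width and the 100 MHz and 133 MHz clocks below are the usual figures for these module types and are stated here as assumptions. The function name is illustrative.

```python
def peak_bandwidth_mb_s(clock_mhz: int, bus_bytes: int = 8,
                        transfers_per_clock: int = 1) -> int:
    """Peak transfer rate in MB/s: clock rate x transfers per clock x bus width."""
    return clock_mhz * transfers_per_clock * bus_bytes


if __name__ == "__main__":
    # SDRAM at 100 MHz, one transfer per clock (PC100).
    print(peak_bandwidth_mb_s(100))                         # 800
    # DDR SDRAM at a nominal 133 MHz, two transfers per clock.
    print(peak_bandwidth_mb_s(133, transfers_per_clock=2))  # 2128
```

The second figure is roughly the 2100 MB/s that the PC2100 name refers to, whereas the PC100 and PC133 names refer to the clock frequency.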

Read-Only Memory

ROM: Read-Only Memory
ROMs are RAMs with data built in when the chip is created.
Usually stores BIOS info. Older uses included storage of bootstrap info.
Write once, by the manufacturer.

PROM: Programmable ROM
A ROM which can be programmed.
Write once, by the user.

EPROM: Erasable PROM
A PROM which can be programmed, and erased by exposure to UV radiation.

EEPROM: Electrically Erasable PROM
A PROM programmed and erased electrically.

Flash: similar to EEPROM
Write in blocks.
Low power consumption, so suitable for battery-driven devices.
Implementations: flash cards; flash drives, which are better than disk (no moving parts, so faster response).

MEMORY HIERARCHY

Memory Hierarchy (1/4)

The processor runs programs very fast:
execution times are on the order of nanoseconds down to picoseconds;
it needs to access the program’s code and data!
Where does the program reside?

Disk:
HUGE capacity (virtually limitless).
VERY slow: runs on the order of milliseconds.

So how do we account for this gap? By using memory hierarchy technology!

Memory Hierarchy (2/4)

Memory (DRAM):
Capacity much larger than registers, smaller than disk (still limited).
Access time ~ nanoseconds, much faster than disk (milliseconds).
Contains a subset of the data on disk (basically the portions of programs that are currently being run).

Fact: large-capacity memory (cheap!) is slow, while small-capacity memory (expensive) is fast.
Solution: provide the (illusion of) large capacity and fast access!

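Levels in memory hierarchy

Figure: levels in the memory hierarchy, from Level 1 (higher, next to the processor) through Level 2, Level 3, ... down to Level n (lower). Going down the levels means increasing distance from the processor and decreasing cost per MB. The figure also indicates the size of memory at each level.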

Memory Hierarchy (4/4)

Levels closer to the processor have these characteristics:
Smaller.
Faster.
They store a subset of the data (e.g. the data that is used most often).
They must be efficient in choosing which data to keep, because space is limited.

The lowest level (usually disk) stores all of the data.

Memory Hierarchy Analogy: Library (1/2)

You’re writing a term paper (processor) at a table in Doe.
Doe Library is equivalent to disk:
essentially limitless capacity;
very slow to retrieve a book.
The table is memory:
smaller capacity, which means you must return a book when the table fills up;
easier and faster to find a book there once you’ve already retrieved it.

Memory Hierarchy Analogy: Library (2/2)

Open books on the table are cache:
smaller capacity: very few open books fit on the table; again, when the table fills up, you must close a book;
much, much faster to retrieve data.
Illusion created: the whole library open on the tabletop.
Keep as many recently used books open on the table as possible, since you are likely to use them again.
Also keep as many books on the table as possible, since that is faster than going to the library.

Why hierarchy works

The Principle of Locality: programs access a relatively small portion of the address space at any instant of time.

Figure: probability of reference plotted against the address space, up to 2^n - 1.

The principle of locality states that programs access a relatively small portion of the address space at any instant of time. This is a bit like real life: we all have a lot of friends, but at any given time most of us can only keep in touch with a small group of them. There are two different types of locality: temporal and spatial.

Temporal locality is locality in time: if an item is referenced, it will tend to be referenced again soon. This is like saying that if you just talked to one of your friends, it is likely that you will talk to him or her again soon. For example, if you just had lunch with a friend, you may say, let’s go to the ball game this Sunday. So you will talk to him again soon.

Spatial locality is locality in space: if an item is referenced, items whose addresses are close by tend to be referenced soon. Using the same analogy, we can usually divide our friends into groups: friends from high school, friends from work, friends from home. Say you just talked to one of your friends from high school and she says something like: “So did you hear so and so just won the lottery?” You will probably say no, I had better give him a call and find out more. This is an example of spatial locality: you just talked to a friend from your high school days, and as a result you end up talking to another high school friend. Or at least, in this case, you hope he still remembers you are his friend.

Memory Hierarchy: How Does it Work?

Temporal locality (locality in time): keep the most recently accessed data items closer to the processor.
Spatial locality (locality in space): move blocks consisting of contiguous words to the upper levels.

Figure: the upper-level memory exchanges data with the processor (to processor, from processor) and holds Blk X; the lower-level memory holds Blk Y. Blocks move between the two levels.

How does the memory hierarchy work? It is rather simple, at least in principle. To take advantage of temporal locality, that is, locality in time, the memory hierarchy keeps the more recently accessed data items closer to the processor, because chances are the processor will access them again soon. To take advantage of spatial locality, we not only move the item that has just been accessed to the upper level, but also move the data items that are adjacent to it.
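
Both ideas can be tried out with a small Python simulation of an upper level that holds a few blocks. This is a sketch under stated assumptions: the replacement policy is least-recently-used (the slide only says to keep recently accessed items close), the upper level can hold any block in any slot, and the function name `hit_rate` is hypothetical.

```python
from collections import OrderedDict


def hit_rate(addresses, block_size: int, num_blocks: int) -> float:
    """Fraction of accesses found in a small upper level holding num_blocks blocks."""
    upper = OrderedDict()  # block number -> None, ordered oldest to newest
    hits = 0
    total = 0
    for addr in addresses:
        total += 1
        block = addr // block_size
        if block in upper:
            hits += 1
            upper.move_to_end(block)       # temporal locality: mark as recently used
        else:
            upper[block] = None            # spatial locality: bring in the whole block
            if len(upper) > num_blocks:
                upper.popitem(last=False)  # evict the least recently used block
    return hits / total if total else 0.0


if __name__ == "__main__":
    sequential = list(range(64))
    print(hit_rate(sequential, block_size=8, num_blocks=4))  # 0.875
    print(hit_rate(sequential, block_size=1, num_blocks=4))  # 0.0
    loop = [0, 1, 2, 3] * 10
    print(hit_rate(loop, block_size=1, num_blocks=4))        # 0.9
```

The sequential scan only benefits when whole blocks are moved (spatial locality), while the repeated loop benefits from keeping recently used items even with one-word blocks (temporal locality).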

Memory Structure in Modern Computer System

By taking advantage of the principle of locality:
Present the user with as much memory as is available in the cheapest technology.
Provide access at the speed offered by the fastest technology.

Figure: the levels of the hierarchy with their approximate speed and size. The processor chip holds control, datapath, registers, and an on-chip cache.
Registers: speed 1s of ns.
Second-level cache (SRAM): speed 10s of ns; size Ks of bytes.
Main memory (DRAM): speed 100s of ns; size Ms of bytes.
Secondary storage (disk): speed 10,000,000s of ns (10s of ms); size Gs of bytes.
Tertiary storage (tape): speed 10,000,000,000s of ns (10s of sec); size Ts of bytes.

The design goal is to present the user with as much memory as is available in the cheapest technology (the disk), while, by taking advantage of the principle of locality, providing an average access speed that is very close to the speed offered by the fastest technology. (We will go over this slide in detail in the next lecture on caches.)
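
The claim that the average access speed can approach that of the fastest technology can be made concrete with a two-level average-access-time calculation. The Python sketch below assumes that a miss costs the upper-level lookup plus the lower-level access; the hit rate and the example latencies are made-up illustrative values, chosen only to match the orders of magnitude on the slide.

```python
def average_access_ns(hit_rate: float, upper_ns: float, lower_ns: float) -> float:
    """Average access time for a two-level hierarchy.

    Every access pays upper_ns; the fraction that misses also pays lower_ns.
    """
    if not 0.0 <= hit_rate <= 1.0:
        raise ValueError("hit_rate must be between 0 and 1")
    return upper_ns + (1.0 - hit_rate) * lower_ns


if __name__ == "__main__":
    # Assumed values: 10 ns cache, 100 ns main memory, 95% of accesses hit.
    print(average_access_ns(0.95, upper_ns=10, lower_ns=100))  # about 15 ns
```

With a high hit rate the average stays close to the upper-level time, even though most of the capacity sits in the slower level.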

How is the hierarchy managed?
Registers ↔ Memory: by the compiler (programmer?)
Cache ↔ Memory: by the hardware
Memory ↔ Disks: by the hardware and operating system (virtual memory), and by the programmer (files)
