IKI10230 Pengantar Organisasi Komputer (Introduction to Computer Organization), Chapter 12: Memory
Published by Yudi Ardian. Modified 2 years ago.
1 IKI10230 Introduction to Computer Organization, Chapter 12: Memory
19 May 2004. L. Yohanes Stefanus, Bobby Nazief
Sources:
1. Paul Carter, PC Assembly Language
2. Hamacher, Computer Organization, 5th ed.
3. Lecture materials from CS61C/2000 and CS152/1997, UCB
Lecture materials:
2 Memory: Where Data Is Stored
[Figure: the five components of a computer. Input devices (keyboard, mouse) and output devices (display, printer) connect to the computer. The processor (active) contains the control ("brain") and the datapath ("brawn"). Memory (passive) is where programs and data live when running. Disk provides permanent storage.]
Notes: Any computer, no matter how primitive or advanced, can be divided into five parts:
1. The input devices bring data from the outside world into the computer.
2. These data are kept in the computer's memory until ...
3. The datapath requests and processes them.
4. The operation of the datapath is controlled by the computer's controller.
5. Getting the data back to the outside world is the job of the output devices. None of the work done by the computer does us any good unless we can get the data back to the outside world.
The most common way to connect these five components is a network of buses.
3 Connection: Memory - Processor
- The MAR drives a k-bit address bus: memory has up to 2^k addressable locations.
- The MDR connects to an n-bit data bus: word length = n bits.
- Control lines: R/W, MFC, etc.
4 Internal Organization of Memory
- Arranged as an array of memory cells.
- Each cell holds 1 bit of information.
- A row of cells forms one word.
- Example: a 128 x 8 memory
  - the memory contains 128 words
  - each word consists of 8 data bits
  - memory capacity: 128 x 8 = 1024 bits
- An address decoder selects which word row is accessed
  - the address is the index of the row in the array
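The 128 x 8 example can be checked with a few lines of Python. This is only an illustrative sketch: the function names `capacity_bits`, `address_bits` and `decode` are made up here and do not come from the slides, and the decoder is modeled as a simple one-hot list rather than real logic gates.

```python
def capacity_bits(words, bits_per_word):
    """Total number of one-bit cells in a words x bits_per_word array."""
    return words * bits_per_word


def address_bits(words):
    """Number of address lines needed to select one of `words` rows."""
    return (words - 1).bit_length()


def decode(address, words):
    """One-hot word-line vector: only the selected row's line is 1."""
    if not 0 <= address < words:
        raise ValueError("address out of range")
    return [1 if row == address else 0 for row in range(words)]


if __name__ == "__main__":
    print(capacity_bits(128, 8))     # 1024 bits
    print(address_bits(128))         # 7 address lines
    print(decode(5, 128).index(1))   # word line 5 is the only one selected
```

With 128 words the decoder needs 7 address lines and drives 128 word lines, of which exactly one is active at a time.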
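5 Memory Organization: 1-Level-Decode SRAM (128 x 8)
[Figure: address lines A0 to A6 feed an address decoder that drives word lines W0 to W127, one per 8-bit data word (128 words). Each column of memory cells has a bit-line pair (b7/b7' ... b1/b1', b0/b0') connected to a sense/write amp, which drives the input/output lines d7 ... d1, d0. Control inputs: R/W' and CS.]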
6 Static RAM (SRAM)
- SRAM retains its state (the RAM contents) as long as the power supply voltage is present.
- Very fast: 10 nanoseconds.
- Low density (bits per chip): needs 6 transistors per cell, so it is expensive.
- The technology of choice for very fast, small-capacity memory: cache.
7 Review: Static RAM Cell
6-Transistor SRAM Cell
[Figure: a latch connected through two transistors T to the bit lines b and b', both gated by the word (row select) line. Example shown: state 1.]
- The latch stores 1 bit of state.
- Transistor T acts as a switch.
- The latch can be changed by:
  - putting the bit value on b and b'
  - pulling the word line high (select)
Write:
1. Drive the bit lines according to the bit (e.g. b = 1, b' = 0).
2. Select the row: the values of b and b' are stored as the latch state.
Read:
1. Precharge (set) the bit lines high.
2. Select the row.
3. The sense amp detects which bit line is low, which gives the state bit.
Notes: The classical SRAM cell looks like this. It consists of two back-to-back inverters that serve as a flip-flop. Here is an expanded view of this cell; you can see it consists of 6 transistors.
To write a value into this cell, you need to drive it from both sides. For example, to write a 1, you drive "bit" to 1 while at the same time driving "bit bar" to 0.
Once the bit lines are driven to their desired values, you turn on these two transistors by setting the word line high, so the values on the bit lines are written into the cell.
Remember that these are very tiny transistors, so we cannot rely on them to drive these long bit lines effectively during a read. Also, the pull-down devices are usually much stronger than the pull-up devices. So the first thing we need to do on a read is to charge both bit lines to a high value.
Once the bit lines are charged high, we turn on these two transistors, so one of the inverters (the lower one in our example) starts pulling one bit line low while the other bit line remains high.
It will take this small inverter a long time to drive the long bit line low, but we don't have to wait that long, since all we need is to detect the difference between the two bit lines. And if you ask any circuit designer, they will tell you it is much easier to detect a "differential signal" (point to bit and bit bar) than to detect an absolute signal.
8 Dynamic RAM (DRAM)
- Slower than SRAM: access time ~60 ns (fastest: 35 ns).
- Nonpersistent: every row must be accessed (refreshed) every ~1 ms.
- High density: 1 transistor/bit.
- Cheaper than SRAM: ~$1/MByte.
- Fragile: sensitive to electrical noise, light, radiation.
- The technology of choice for large-capacity, low-cost memory: main memory.
9 Review: 1-Transistor Memory Cell (DRAM)
[Figure: a transistor T, gated by the row select line, connects the bit line to a capacitor C.]
- The capacitor stores state 1 (charged) or 0 (discharged).
- Needs refresh!
Write:
1. Drive the bit line.
2. Select the row (T acts as a switch).
Read:
1. Select the row.
2. Sense amp (connected to the bit line): senses and drives according to the value (threshold).
3. Write: restore the value (high or low).
Refresh:
- Just do a dummy read to every cell.
Notes: The state-of-the-art DRAM cell has only one transistor. The bit is stored in a tiny capacitor.
The write operation is very simple. Just drive the bit line and select the row by turning on this pass transistor.
For a read, we need to precharge the bit line high and then turn on the pass transistor. This causes a small voltage change on the bit line, and a very sensitive amplifier is used to measure this small voltage change with respect to a reference bit line.
Once again, the value we stored is destroyed by the read operation, so an automatic write-back has to be performed at the end of every read.
10 Classical DRAM Organization (square)
[Figure: a square RAM cell array. A row decoder takes the row address and drives the word (row) select lines; the bit (data) lines run to the column selector and I/O circuits, which take the column address and connect to data. Each intersection represents a 1-T DRAM cell.]
- Row and column address together select 1 bit at a time.
Notes: Similar to SRAM, DRAM is organized into rows and columns. But unlike SRAM, which allows you to read an entire row out at a time as a word, classical DRAM only allows you to read out one bit at a time.
The reason for this is to save power as well as area. Remember that the DRAM cell is very small, so we have a lot of them across horizontally. It would be very difficult to build a sense amplifier for each column due to the area constraint, not to mention that a sense amplifier per column would consume a lot of power.
You select the bit you want to read or write by supplying a row address and then a column address.
Similar to SRAM, each row control line is referred to as the word line and each vertical data line is referred to as the bit line.
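11 DRAM-based Memory Systems
[Figure: an n-bit address enters the DRAM controller, which sends n/2 address lines (row and column addresses) to a 2^n x 1 DRAM chip; a w-bit data path goes through the bus drivers.]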
12 DRAM Operation
Row Address (~50 ns)
- Set the row address on the address lines and strobe RAS.
- The entire row is read and stored in the column latches.
- The contents of the row's memory cells are refreshed.
Column Address (~10 ns)
- Set the column address on the address lines and strobe CAS.
- Access the selected bit:
  - READ: transfer from the selected column latch to Dout
  - WRITE: set the selected column latch to Din
Rewrite/Refresh (~30 ns)
- Write back the entire row.
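The three phases above add up to the access and cycle times quoted for DRAM. A minimal sketch of that arithmetic, using the approximate figures from this slide (the constant and function names are chosen here for illustration only):

```python
# Approximate phase durations taken from the slide, in nanoseconds.
ROW_ACCESS_NS = 50   # set row address, strobe RAS, row read into column latches
COL_ACCESS_NS = 10   # set column address, strobe CAS, select the bit
REWRITE_NS = 30      # write the entire row back


def access_time_ns():
    """Time from presenting the address until the data is available."""
    return ROW_ACCESS_NS + COL_ACCESS_NS


def cycle_time_ns():
    """Minimum time between the starts of two consecutive accesses."""
    return access_time_ns() + REWRITE_NS


if __name__ == "__main__":
    print(access_time_ns())  # 60
    print(cycle_time_ns())   # 90
```

The data is usable after the row and column phases, but the chip cannot start the next access until the row has been written back, which is why the cycle time is longer than the access time.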
13 DRAM: Performance
Timing
- Access time = 60 ns < cycle time = 90 ns.
- The row needs to be rewritten.
- Asynchronous model: memory operations are carried out by a controller circuit, which adds delay; the processor waits until the cycle time has elapsed and only then issues the next request.
Must Refresh Periodically
- Perform a complete memory cycle for each row.
- Approx. every 1 ms.
- Handled in the background by the memory controller.
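A rough estimate of how much time refresh costs can be sketched as follows. The 90 ns cycle time and the ~1 ms refresh interval come from this slide; the row count of 256 is an assumption borrowed from the 256 x 256 cell array shown elsewhere in the deck, and the function name `refresh_overhead` is hypothetical.

```python
def refresh_overhead(rows, cycle_ns, refresh_interval_ns):
    """Fraction of time spent refreshing, if every row needs one full
    memory cycle once per refresh interval."""
    return rows * cycle_ns / refresh_interval_ns


if __name__ == "__main__":
    # Assumed: 256 rows. From the slide: 90 ns cycle, refresh every ~1 ms.
    fraction = refresh_overhead(rows=256, cycle_ns=90, refresh_interval_ns=1_000_000)
    print(f"{fraction:.2%} of memory time goes to refresh")
```

Under these assumptions refresh takes a little over 2% of the available memory cycles, which is why it can be handled in the background by the controller.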
14 Developments in DRAM Memory Technology
Memory technology: access speed improves very slowly
- The gap with processor speed keeps widening (processor cycle is very short, about 1 ns; memory access is on the order of tens of ns).
Developments in DRAM technology
- The basis stays the same: the 1-transistor memory cell (using a capacitor).
- Innovation comes from the way accesses are performed:
  - cutting access time (e.g. CAS not required)
  - burst mode: fetch as much data as possible at once (an entire word)
  - requires additional circuitry: registers, latches, etc.
15 Enhanced Performance DRAMs
Conventional Access
- Row + Col
- RAS CAS RAS CAS ...
Page Mode
- Row + Series of columns
- RAS CAS CAS CAS ...
- Gives successive bits
Video RAM
- Shift out entire row sequentially
- At video rate
[Figure: a 256 x 256 cell array with a row address latch, a column latch and column decoder, and sense/write amps where the entire row is buffered. Eight address lines carry A15-A8 (row) and A7-A0 (col); control signals are RAS, CAS and R/W'.]
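The multiplexed addressing in the figure can be sketched as a simple bit split. This assumes, as the figure labels suggest, that the upper byte A15-A8 is the row address and the lower byte A7-A0 is the column address; the function name `split_address` is illustrative only.

```python
def split_address(addr):
    """Split a 16-bit cell address into (row, column) for a 256 x 256 array.

    Assumption: A15-A8 select the row, A7-A0 select the column.
    """
    if not 0 <= addr <= 0xFFFF:
        raise ValueError("address must fit in 16 bits")
    row = (addr >> 8) & 0xFF   # sent first, latched on RAS
    col = addr & 0xFF          # sent second, latched on CAS
    return row, col


if __name__ == "__main__":
    row, col = split_address(0x12AB)
    print(hex(row), hex(col))  # 0x12 0xab
```

Because the row and column halves are sent one after the other, the chip needs only 8 address pins instead of 16.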
16 Fast Page Mode Operation
Fast Page Mode (FPM) DRAM
- An N x M "SRAM" register saves a row.
- After a row is read into the register, only CAS is needed to access other M-bit blocks on that row.
- RAS' remains asserted while CAS' is toggled.
EDO DRAM
- A more modern FPM DRAM.
[Figure: a DRAM array of N rows by N cols, addressed by the row address, feeds an N x M "SRAM" register; the column address selects M bits for the M-bit output.]
[Timing diagram: RAS' is asserted once with the row address; CAS' is then toggled four times, each with a new column address, giving the 1st, 2nd, 3rd and 4th M-bit accesses.]
Notes: With this register in place, all we need to do is assert RAS to latch in the row address; the entire row is then read out and saved into this register.
After that, you only need to provide the column address and assert CAS to access other M-bit blocks within the same row.
I'd like to point out that even though I use the word "SRAM" here, this is no ordinary SRAM. It has to be very small, but the good thing is that it is internal to the DRAM and does not have to drive any external load.
This type of operation, where RAS remains asserted while CAS is toggled to bring in a new column address, is called Page Mode operation. The row is stored in the SRAM so we don't have to repeat the row access.
It will become clearer why this is called Page Mode operation when we look into the operation of the SPARCstation 20 memory system.
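To see why page mode helps, the address phases can be tallied for a run of accesses to the same row. The sketch below reuses the approximate ~50 ns row and ~10 ns column figures from the DRAM operation slide as assumptions, and it deliberately ignores the rewrite phase, so it compares only the addressing cost. The function names are hypothetical.

```python
ROW_NS = 50   # assumed row-address (RAS) phase
COL_NS = 10   # assumed column-address (CAS) phase


def conventional_ns(accesses):
    """Every access repeats both the row and the column phase."""
    return accesses * (ROW_NS + COL_NS)


def page_mode_ns(accesses):
    """The row is opened once; each access then needs only a column phase."""
    if accesses == 0:
        return 0
    return ROW_NS + accesses * COL_NS


if __name__ == "__main__":
    n = 4  # four M-bit accesses, as in the timing diagram
    print(conventional_ns(n))  # 240
    print(page_mode_ns(n))     # 90
```

The saving grows with the number of accesses that fall in the same row, since the row phase is paid only once.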
17 SDRAM & DDR SDRAM
SDRAM: Synchronous DRAM
- Address and data are buffered in registers.
- Burst mode: reads/writes of different data lengths; CAS signals are provided internally.
- Standards: PC100, PC133
DDR SDRAM: Double-Data-Rate SDRAM
- Data is transferred on both edges of the clock.
- The cell array is organized in 2 banks, which allows interleaved access to words.
- Standards: PC2100, PC2300
RDRAM: Rambus DRAM
- High transfer rate using differential signaling.
- Memory cells are organized in multiple banks.
- Standards: proprietary, owned by Rambus Inc.
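The effect of transferring on both clock edges can be illustrated with a peak-bandwidth estimate. The figures used below are assumptions that do not appear on the slide: a 133 MHz memory clock and a 64-bit (8-byte) module data bus. The function name `peak_bandwidth_mb_s` is made up for this sketch.

```python
def peak_bandwidth_mb_s(clock_mhz, bus_bytes, transfers_per_clock):
    """Peak transfer rate in MB/s: clock rate x bus width x transfers per clock."""
    return clock_mhz * bus_bytes * transfers_per_clock


if __name__ == "__main__":
    # Assumed: 133 MHz clock, 8-byte data bus.
    sdr = peak_bandwidth_mb_s(133, 8, 1)   # one transfer per clock (SDRAM)
    ddr = peak_bandwidth_mb_s(133, 8, 2)   # both clock edges (DDR SDRAM)
    print(sdr)  # 1064
    print(ddr)  # 2128
```

Under these assumptions the DDR figure comes out at roughly 2100 MB/s, twice the single-data-rate result at the same clock.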
18 Read-Only Memory
ROM: Read-Only Memory
- ROMs are RAMs with data built in when the chip is created. Usually stores BIOS info. Older uses included storage of bootstrap info.
- Write once, by manufacturer.
PROM: Programmable ROM
- A ROM which can be programmed.
- Write once, by user.
EPROM: Erasable PROM
- A PROM which can be programmed and erased by exposure to UV radiation.
EEPROM: Electrically Erasable PROM
- A PROM programmed and erased electrically.
Flash
- ~EEPROM
- Write in blocks.
- Low power consumption, so suitable for battery-driven devices.
- Implementation: flash cards; flash drives, which are better than disk (no movable parts, so faster response).
20 Memory Hierarchy (1/4)
Processor
- runs programs
- very fast: execution time on the order of nanoseconds down to picoseconds
- needs to access program code and data! Where does the program live?
Disk
- HUGE capacity (virtually limitless)
- VERY slow: runs on the order of milliseconds
So how do we account for this gap? Use memory hierarchy technology!
21 Memory Hierarchy (2/4)
Memory (DRAM)
- Capacity much larger than registers, smaller than disk (still limited).
- Access time ~ nanoseconds, much faster than disk (milliseconds).
- Contains a subset of the data on disk (basically portions of programs that are currently being run).
Fact: large-capacity memory (cheap!) is slow, while small-capacity memory (expensive) is fast.
Solution: provide (the illusion of) large capacity and fast access!
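22 Levels in memory hierarchy
[Figure: a pyramid of levels below the processor, from Level 1 (higher) through Level 2, Level 3, ... to Level n (lower). Going down: increasing distance from the processor, decreasing cost / MB. The width of each level shows the size of memory at that level.]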
23 Memory Hierarchy (4/4)
Levels closer to the processor are:
- smaller,
- faster,
- hold a subset of the data (e.g. the data that is used often),
- must choose efficiently which data to keep, because space is limited.
The lowest level (usually disk) holds all the data.
24 Memory Hierarchy Analogy: Library (1/2)
You're writing a term paper (Processor) at a table in Doe.
Doe Library is equivalent to disk
- essentially limitless capacity
- very slow to retrieve a book
Table is memory
- smaller capacity: means you must return a book when the table fills up
- easier and faster to find a book there once you've already retrieved it
25 Memory Hierarchy Analogy: Library (2/2)
Open books on the table are cache
- smaller capacity: very few open books fit on the table; again, when the table fills up, you must close a book
- much, much faster to retrieve data
Illusion created: the whole library open on the tabletop
- Keep as many recently used books open on the table as possible, since you are likely to use them again.
- Also keep as many books on the table as possible, since that is faster than going to the library.
26 The Principle of Locality: Why hierarchy works
The Principle of Locality:
- Programs access a relatively small portion of the address space at any instant of time.
[Figure: probability of reference plotted over the address space, from 0 to 2^n - 1.]
Notes: The principle of locality states that programs access a relatively small portion of the address space at any instant of time. This is kind of like real life: we all have a lot of friends, but at any given time most of us can only keep in touch with a small group of them.
There are two different types of locality: temporal and spatial.
Temporal locality is locality in time: if an item is referenced, it will tend to be referenced again soon. This is like saying that if you just talked to one of your friends, it is likely that you will talk to him or her again soon. This makes sense. For example, if you just had lunch with a friend, you may say, let's go to the ball game this Sunday. So you will talk to him again soon.
Spatial locality is locality in space: if an item is referenced, items whose addresses are close by tend to be referenced soon. Once again, using our analogy, we can usually divide our friends into groups: friends from high school, friends from work, friends from home. Let's say you just talked to one of your friends from high school and she says something like: "So did you hear so-and-so just won the lottery?" You will probably say no, I'd better give him a call and find out more.
So this is an example of spatial locality. You just talked to a friend from your high school days. As a result, you end up talking to another high school friend. Or at least in this case, you hope he still remembers you are his friend.
27 Memory Hierarchy: How Does it Work?
- Temporal Locality (Locality in Time): keep the most recently accessed data items closer to the processor.
- Spatial Locality (Locality in Space): move blocks consisting of contiguous words to the upper levels.
[Figure: the processor exchanges data with the upper-level memory, which holds Blk X; the lower-level memory holds Blk Y. Blocks move between the two levels.]
Notes: How does the memory hierarchy work? Well, it is rather simple, at least in principle.
To take advantage of temporal locality, that is, locality in time, the memory hierarchy keeps the more recently accessed data items closer to the processor, because chances are (points to the principle) the processor will access them again soon.
To take advantage of spatial locality, not only do we move the item that has just been accessed to the upper level, but we also move the data items that are adjacent to it.
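The two ideas above can be tried out with a toy simulation of an upper level that holds a few blocks. This is only a sketch under stated assumptions: the block size, the number of blocks, and the least-recently-used replacement rule are all choices made here for illustration and are not specified on the slide. The name `hit_rate` is hypothetical.

```python
from collections import OrderedDict


def hit_rate(addresses, block_size=4, num_blocks=2):
    """Fraction of accesses found in a small upper level.

    On a miss the whole block containing the address is brought in
    (spatial locality). When the upper level is full, the least recently
    used block is evicted (an assumed policy, favoring temporal locality).
    """
    addresses = list(addresses)
    if not addresses:
        return 0.0
    upper = OrderedDict()  # block number -> None, ordered by recency of use
    hits = 0
    for addr in addresses:
        block = addr // block_size
        if block in upper:
            hits += 1
            upper.move_to_end(block)       # mark as most recently used
        else:
            if len(upper) >= num_blocks:
                upper.popitem(last=False)  # evict least recently used block
            upper[block] = None
    return hits / len(addresses)


if __name__ == "__main__":
    print(hit_rate(range(16)))           # sequential walk: spatial locality
    print(hit_rate([0, 1, 2, 3] * 4))    # repeated reuse: temporal locality
    print(hit_rate(range(0, 64, 4)))     # one word per block: no locality
```

A sequential walk misses once per block and then hits on the neighboring words, repeated reuse of a few addresses hits almost every time, and a stride equal to the block size never hits at all.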
28 Memory Structure in Modern Computer System
By taking advantage of the principle of locality:
- Present the user with as much memory as is available in the cheapest technology.
- Provide access at the speed offered by the fastest technology.
[Figure: the processor (control, datapath, registers, on-chip), a second-level cache (SRAM), main memory (DRAM), secondary storage (disk) and tertiary storage (tape). Speed (ns): 1s, 10s, 100s, 10,000,000s (10s ms), 10,000,000,000s (10s sec). Size (bytes): Ks, Ms, Gs, Ts.]
Notes: The design goal is to present the user with as much memory as is available in the cheapest technology (points to the disk). At the same time, by taking advantage of the principle of locality, we would like to provide the user with an average access speed that is very close to the speed offered by the fastest technology. (We will go over this slide in detail in the next lecture on caches.)
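The claim that the average access speed can come close to that of the fastest level can be illustrated with the usual average-access-time estimate: hit time plus miss rate times miss penalty. That formula is not stated on the slide and is brought in here as an assumption, as are the example figures (a 10 ns upper level, a 100 ns penalty for going to the lower level, and the miss rates). The function name is made up for this sketch.

```python
def average_access_time_ns(hit_time_ns, miss_rate, miss_penalty_ns):
    """Average access time for a two-level hierarchy.

    Every access pays the upper level's hit time; the fraction of accesses
    that miss additionally pays the penalty of going to the lower level.
    """
    if not 0.0 <= miss_rate <= 1.0:
        raise ValueError("miss_rate must be between 0 and 1")
    return hit_time_ns + miss_rate * miss_penalty_ns


if __name__ == "__main__":
    # Assumed figures: 10 ns upper level, 100 ns extra on a miss.
    for miss_rate in (0.5, 0.1, 0.02):
        avg = average_access_time_ns(10, miss_rate, 100)
        print(f"miss rate {miss_rate:.0%}: about {avg:.0f} ns on average")
```

The lower the miss rate that locality makes possible, the closer the average gets to the 10 ns of the fast level, even though most of the data lives in the slow one.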
29 How is the hierarchy managed?
Registers ↔ Memory
- by compiler (programmer?)
Cache ↔ Memory
- by the hardware
Memory ↔ Disks
- by the hardware and operating system (virtual memory)
- by the programmer (files)