Memory hierarchyIn computer organisation, the memory hierarchy separates computer storage into a hierarchy based on response time. Since response time, complexity, and capacity are related, the levels may also be distinguished by their performance and controlling technologies. Memory hierarchy affects performance in computer architectural design, algorithm predictions, and lower level programming constructs involving locality of reference. Designing for high performance requires considering the restrictions of the memory hierarchy, i.
Cache coherenceIn computer architecture, cache coherence is the uniformity of shared resource data that ends up stored in multiple local caches. When clients in a system maintain caches of a common memory resource, problems may arise with incoherent data, which is particularly the case with CPUs in a multiprocessing system. In the illustration on the right, consider both the clients have a cached copy of a particular memory block from a previous read.
Instruction pipeliningIn computer engineering, instruction pipelining is a technique for implementing instruction-level parallelism within a single processor. Pipelining attempts to keep every part of the processor busy with some instruction by dividing incoming instructions into a series of sequential steps (the eponymous "pipeline") performed by different processor units with different parts of instructions processed in parallel. In a pipelined computer, instructions flow through the central processing unit (CPU) in stages.
Clock rateIn computing, the clock rate or clock speed typically refers to the frequency at which the clock generator of a processor can generate pulses, which are used to synchronize the operations of its components, and is used as an indicator of the processor's speed. It is measured in the SI unit of frequency hertz (Hz). The clock rate of the first generation of computers was measured in hertz or kilohertz (kHz), the first personal computers (PCs) to arrive throughout the 1970s and 1980s had clock rates measured in megahertz (MHz), and in the 21st century the speed of modern CPUs is commonly advertised in gigahertz (GHz).
Computer architectureIn computer science, computer architecture is a description of the structure of a computer system made from component parts. It can sometimes be a high-level description that ignores details of the implementation. At a more detailed level, the description may include the instruction set architecture design, microarchitecture design, logic design, and implementation. The first documented computer architecture was in the correspondence between Charles Babbage and Ada Lovelace, describing the analytical engine.
Content-addressable memoryContent-addressable memory (CAM) is a special type of computer memory used in certain very-high-speed searching applications. It is also known as associative memory or associative storage and compares input search data against a table of stored data, and returns the address of matching data. CAM is frequently used in networking devices where it speeds up forwarding information base and routing table operations. This kind of associative memory is also used in cache memory.
Back-side busIn personal computer microprocessor architecture, a back-side bus (BSB), or backside bus, was a computer bus used on early Intel platforms to connect the CPU to CPU cache memory, usually off-die L2. If a design utilizes it along with a front-side bus (FSB), it is said to use a dual-bus architecture, or in Intel's terminology Dual Independent Bus (DIB) architecture. The back-side bus architecture evolved when newer processors like the second-generation Pentium III began to incorporate on-die L2 cache, which at the time was advertised as Advanced Transfer Cache, but Intel continued to refer to the Dual Independent Bus till the end of Pentium III.
Bus snoopingBus snooping or bus sniffing is a scheme by which a coherency controller (snooper) in a cache (a snoopy cache) monitors or snoops the bus transactions, and its goal is to maintain a cache coherency in distributed shared memory systems. This scheme was introduced by Ravishankar and Goodman in 1983, under the name "write-once" cache coherency. A cache containing a coherency controller (snooper) is called a snoopy cache.
Classic RISC pipelineIn the history of computer hardware, some early reduced instruction set computer central processing units (RISC CPUs) used a very similar architectural solution, now called a classic RISC pipeline. Those CPUs were: MIPS, SPARC, Motorola 88000, and later the notional CPU DLX invented for education. Each of these classic scalar RISC designs fetches and tries to execute one instruction per cycle. The main common concept of each design is a five-stage execution instruction pipeline.
Branch predictorIn computer architecture, a branch predictor is a digital circuit that tries to guess which way a branch (e.g., an if–then–else structure) will go before this is known definitively. The purpose of the branch predictor is to improve the flow in the instruction pipeline. Branch predictors play a critical role in achieving high performance in many modern pipelined microprocessor architectures. Two-way branching is usually implemented with a conditional jump instruction.
Prefetch input queueFetching the instruction opcodes from program memory well in advance is known as prefetching and it is served by using a prefetch input queue (PIQ). The pre-fetched instructions are stored in a queue. The fetching of opcodes well in advance, prior to their need for execution, increases the overall efficiency of the processor boosting its speed. The processor no longer has to wait for the memory access operations for the subsequent instruction opcode to complete. This architecture was prominently used in the Intel 8086 microprocessor.
Translation lookaside bufferA translation lookaside buffer (TLB) is a memory cache that stores the recent translations of virtual memory to physical memory. It is used to reduce the time taken to access a user memory location. It can be called an address-translation cache. It is a part of the chip's memory-management unit (MMU). A TLB may reside between the CPU and the CPU cache, between CPU cache and the main memory or between the different levels of the multi-level cache.
Locality of referenceIn computer science, locality of reference, also known as the principle of locality, is the tendency of a processor to access the same set of memory locations repetitively over a short period of time. There are two basic types of reference locality - temporal and spatial locality. Temporal locality refers to the reuse of specific data and/or resources within a relatively small time duration. Spatial locality (also termed data locality) refers to the use of data elements within relatively close storage locations.
Cache (computing)In computing, a cache (kæʃ ) is a hardware or software component that stores data so that future requests for that data can be served faster; the data stored in a cache might be the result of an earlier computation or a copy of data stored elsewhere. A cache hit occurs when the requested data can be found in a cache, while a cache miss occurs when it cannot. Cache hits are served by reading data from the cache, which is faster than recomputing a result or reading from a slower data store; thus, the more requests that can be served from the cache, the faster the system performs.
Page replacement algorithmIn a computer operating system that uses paging for virtual memory management, page replacement algorithms decide which memory pages to page out, sometimes called swap out, or write to disk, when a page of memory needs to be allocated. Page replacement happens when a requested page is not in memory (page fault) and a free page cannot be used to satisfy the allocation, either because there are none, or because the number of free pages is lower than some threshold.
Multi-core processorA multi-core processor is a microprocessor on a single integrated circuit with two or more separate processing units, called cores, each of which reads and executes program instructions. The instructions are ordinary CPU instructions (such as add, move data, and branch) but the single processor can run instructions on separate cores at the same time, increasing overall speed for programs that support multithreading or other parallel computing techniques.
Processor registerA processor register is a quickly accessible location available to a computer's processor. Registers usually consist of a small amount of fast storage, although some registers have specific hardware functions, and may be read-only or write-only. In computer architecture, registers are typically addressed by mechanisms other than main memory, but may in some cases be assigned a memory address e.g. DEC PDP-10, ICT 1900.
Pentium IIThe Pentium II brand refers to Intel's sixth-generation microarchitecture ("P6") and x86-compatible microprocessors introduced on May 7, 1997. Containing 7.5 million transistors (27.4 million in the case of the mobile Dixon with 256 KB L2 cache), the Pentium II featured an improved version of the first P6-generation core of the Pentium Pro, which contained 5.5 million transistors. However, its L2 cache subsystem was a downgrade when compared to the Pentium Pros. It is a single-core microprocessor.
Am386The Am386 CPU is a 100%-compatible clone of the Intel 80386 design released by AMD in March 1991. It sold millions of units, positioning AMD as a legitimate competitor to Intel, rather than being merely a second source for x86 CPUs (then termed 8086-family). While the AM386 CPU was essentially ready to be released prior to 1991, Intel kept it tied up in court. Intel learned of the Am386 when both companies hired employees with the same name who coincidentally stayed at the same hotel, which accidentally forwarded a package for AMD to Intel's employee.
Program counterThe program counter (PC), commonly called the instruction pointer (IP) in Intel x86 and Itanium microprocessors, and sometimes called the instruction address register (IAR), the instruction counter, or just part of the instruction sequencer, is a processor register that indicates where a computer is in its program sequence. Usually, the PC is incremented after fetching an instruction, and holds the memory address of ("points to") the next instruction that would be executed.