:: BASIC CPU KNOWLEDGE ::
 
Transistor:

A transistor is the basic component of a microchip and acts as an electrically controlled switch. The maximum clock speed a chip can sustain depends on the transistor switching speed (how rapidly it can switch ON and OFF). In a 3.0GHz microprocessor, the transistors must be able to switch ON and OFF at least 3 billion times per second.

Learn more about transistors here:
http://en.wikipedia.org/wiki/Transistor

 


Clock Speed:

Clock speed refers to the number of pulses generated per second by an oscillator inside the microprocessor. The standard unit is Hz (Hertz). 1Hz means 1 pulse is generated in 1 second, and 1MHz equals 1 million pulses per second.
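
To get a feel for these numbers, here is a small sketch that turns a clock speed into the length of a single pulse. The function name clock_period_ns is made up for this example; the only thing it relies on is that the period is 1 divided by the frequency.

```python
def clock_period_ns(frequency_hz: float) -> float:
    """Return the duration of one clock pulse in nanoseconds."""
    if frequency_hz <= 0:
        raise ValueError("frequency must be positive")
    # period (seconds) = 1 / frequency, and 1 second = 1e9 nanoseconds
    return 1e9 / frequency_hz


# Clock speeds mentioned in this article.
for label, hz in [("1Hz", 1.0), ("1MHz", 1e6), ("3.0GHz", 3.0e9)]:
    print(f"{label}: one pulse every {clock_period_ns(hz):.3f} ns")
```

At 3.0GHz, one pulse lasts only about a third of a nanosecond, which is the time a transistor has to finish switching.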

 

Functional Units:

Functional units are also called the working units of a microprocessor. These units are the "workers" of the microprocessor. There are a few categories of "workers", and each category has its own specific job. The common jobs are:

1. Integer arithmetic - calculations on whole numbers.
2. Logic operations - comparing values and making decisions based on the inputs.
3. Floating-point operations - calculations on numbers with a decimal point.
4. Data transfer operations - moving data into and out of the microprocessor.

The activities of the "workers" (functional units) are determined by their manager, the control unit. With the help of the clock, the control unit coordinates the units and tries to keep all of them busy to get the best performance.

 

Registers

A register is a small temporary storage space inside the microprocessor. It is much faster than L1 cache or L2 cache, because it can be accessed with near-zero latency. Data from cache or RAM must be loaded into registers before it is processed.

EM64T (Intel) or x86-64 (AMD) technology expanded the size of each register from 32 bits to 64 bits. This leads to 2 performance boosts: more storage space per register and higher bandwidth, since each operation can handle twice as much data.

Note that this feature is available to 64-bit software only. Most current programs are NOT 64-bit, so we can't fully utilize the 64-bit features of either the Athlon 64 or the Pentium 4.

 

Address Space

Address space is the maximum amount of memory a processor can address. A 32-bit processor can support up to 2^32 = 4,294,967,296 (about 4.29 billion) addresses, which is 4GB when each address refers to one byte. A 64-bit processor can in theory support up to 2^64 (about 1.84 x 10^19, or 18.4 billion billion) addresses, which is around 16.8 million TB.
However, neither the Pentium 4 nor the Athlon64 uses the full range of address space offered by 64-bit. For example, the Athlon64 supports only 40-bit physical addresses (1TB).
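
The arithmetic above can be checked with a short sketch. It assumes one byte per address and uses binary units (1 GiB = 2^30 bytes, which is what "GB" means in this section); the helper names are made up for the example.

```python
def addressable_bytes(address_bits: int) -> int:
    """Number of distinct byte addresses for a given address width."""
    return 2 ** address_bits


def human_readable(num_bytes: int) -> str:
    """Format a byte count in binary units (KiB = 1024 B, and so on)."""
    units = ["B", "KiB", "MiB", "GiB", "TiB", "PiB", "EiB"]
    value = num_bytes
    for unit in units:
        # Stop at the first unit where the value drops below 1024,
        # or at the largest unit we know about.
        if value < 1024 or unit == units[-1]:
            return f"{value} {unit}"
        value //= 1024
    return f"{value} {units[-1]}"  # not reached; kept for clarity


# 32-bit, Athlon64 physical (40-bit), and full 64-bit address widths.
for bits in (32, 40, 64):
    total = addressable_bytes(bits)
    print(f"{bits}-bit: {total:,} addresses = {human_readable(total)}")
```

Integer division is exact here because every value is a power of two.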



Pipeline

Pipelining in computing increases the overall throughput of the system by breaking up a large batch computation into smaller ones which can be executed independently. -Wikipedia.org

An example (Milo-ice):
I'm sure you guys know how to make Milo-ice. There are a few steps:
1. Take a cup from the rack.
2. Put in some Milo.
3. Put in some condensed milk.
4. Put in some hot water.
5. Stir the drink.
6. Put ice into the cup.

Pipelining basically splits the big task (making Milo-ice) into smaller ones. Why is pipelining necessary? OK, assume there is only one Milo tin, one condensed milk tin, one thermos flask, one spoon for stirring and one container of ice. If one guy is putting Milo powder into the cup, the condensed milk tin, the thermos flask and the rest are idle. We don't want this to happen, because we want to fully utilize every item to speed up the process of making Milo-ice.

Hence, what we do is hire a few more guys and split the big task into smaller ones. If there are 6 guys, each of them does one of the steps above. So while guy X is putting Milo into one cup, another guy can put condensed milk into a different cup at the same time. This can significantly speed up the whole process. This concept leads to pipelining in the microprocessor.

Each stage (one small task) takes one clock cycle. Splitting the work into more stages means each stage does less work and finishes sooner, so more stages allow a higher clock speed. However, we cannot keep adding stages forever; usually the optimum setting is between 5 and 12 stages. With more stages, there is a great penalty when something unexpected (such as a mispredicted branch) occurs: the pipeline has to be flushed and reloaded, and a longer pipeline takes longer to flush and reload.

The scenario can be visualized this way: one guy is stirring the Milo, and suddenly the customer says, "Boss, I want the less sweet one." So the whole Milo-ice process has to be carried out again (flush and reload).
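
The effect of pipelining and of flushes can be put into numbers with a rough sketch. It is a simplified model, not how any real processor is measured: it assumes every stage takes exactly one clock cycle and that each flush costs (stages - 1) extra cycles to refill the pipeline. The function names are made up for this example.

```python
def cycles_without_pipeline(items: int, stages: int) -> int:
    """One worker does every stage of an item before starting the next."""
    return items * stages


def cycles_with_pipeline(items: int, stages: int, flushes: int = 0) -> int:
    """One worker per stage; a new item enters the pipeline every cycle.

    Assumption: each flush throws away the work in progress and costs
    (stages - 1) extra cycles before finished items come out again.
    """
    if items == 0:
        return 0
    fill_and_drain = stages + (items - 1)
    flush_penalty = flushes * (stages - 1)
    return fill_and_drain + flush_penalty


cups = 100  # cups of Milo-ice to make
print("no pipeline, 6 steps:     ", cycles_without_pipeline(cups, 6))
print("6-stage pipeline:         ", cycles_with_pipeline(cups, 6))
print("6-stage, 10 flushes:      ", cycles_with_pipeline(cups, 6, flushes=10))
print("20-stage, 10 flushes:     ", cycles_with_pipeline(cups, 20, flushes=10))
```

In this model the same number of flushes costs the 20-stage pipeline far more cycles than the 6-stage one, which is the penalty described above. The longer pipeline can still come out ahead if its clock is fast enough, since each of its cycles is shorter.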

 

Cache Memory

Cache is temporary memory like RAM, but unlike RAM, cache can operate at a much higher clock speed. Cache runs at the same clock speed as the microprocessor (usually around 2000MHz), while the fastest RAM can only run at 533MHz. However, the trade-off is its limited storage capacity, because cache is very expensive compared to RAM.

The purpose of cache is to feed the microprocessor with data as fast as possible. Without cache, the only choice is to get data from the slower RAM, and the microprocessor is forced to stop working temporarily while it waits for the RAM to send the data. Therefore, a larger cache is generally a plus, because it can hold more data so that the CPU spends less time fetching data from RAM. L1 cache is usually faster than L2 cache because it is smaller and sits closer to the functional units, so the microprocessor needs less time to access the data stored in it.
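
The benefit of cache can be estimated with the usual "average access time" formula: hit time plus miss rate times miss penalty. The sketch below uses made-up cycle counts and miss rates purely for illustration; they are not measurements of any real processor.

```python
def average_access_cycles(hit_cycles: float,
                          miss_rate: float,
                          miss_penalty_cycles: float) -> float:
    """Average cycles per memory access for a single cache level."""
    if not 0.0 <= miss_rate <= 1.0:
        raise ValueError("miss_rate must be between 0 and 1")
    return hit_cycles + miss_rate * miss_penalty_cycles


# Hypothetical numbers: a cache hit costs 2 cycles, going to RAM costs 200.
HIT_CYCLES = 2
RAM_PENALTY_CYCLES = 200

# Assumption for the example: a larger cache misses less often.
for name, miss_rate in [("small cache", 0.10), ("large cache", 0.02)]:
    avg = average_access_cycles(HIT_CYCLES, miss_rate, RAM_PENALTY_CYCLES)
    print(f"{name}: about {avg:.1f} cycles per access on average")
```

Even a small drop in the miss rate makes a big difference, because each trip to RAM is so much slower than a cache hit.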

 

High clock speed (Pentium 4) vs. Low clock speed (Athlon 64)

Q: Why can the Athlon64, with its lower clock speed, compete equally with or even beat higher-clocked Pentium 4 microprocessors?

A: The Athlon64's "workers" are able to do more work in one clock cycle. In fact, the Pentium 4 has very good functional units too, but they are not well balanced; in other words, its functional units cannot cooperate well.
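
The idea is that performance is roughly clock speed multiplied by the work done per clock. Here is a tiny sketch with two imaginary chips; the clock speeds and work-per-clock values are invented for illustration and are NOT real figures for the Athlon64 or the Pentium 4.

```python
def work_per_second(clock_hz: float, work_per_clock: float) -> float:
    """Rough throughput: clock cycles per second times work done per cycle."""
    return clock_hz * work_per_clock


# Hypothetical chips (values are assumptions, not benchmarks).
chips = {
    "Chip A (lower clock, more work per clock)": (2.2e9, 1.5),
    "Chip B (higher clock, less work per clock)": (3.0e9, 1.0),
}

for name, (clock_hz, per_clock) in chips.items():
    total = work_per_second(clock_hz, per_clock)
    print(f"{name}: {total / 1e9:.2f} billion operations per second")
```

With these made-up values, the lower-clocked chip still finishes more work per second, which is the situation the answer above describes.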

Q: Why can't the Athlon64 clock as high as the Pentium 4?

A: It's because the Athlon64 has fewer pipeline stages than the Pentium 4. Given the limits of transistor technology, it cannot be clocked as high as the Pentium 4 without changing the pipeline.

Q: Why don't we increase the Pentium 4's clock to 10GHz or 100GHz?

A: Again, it's due to the transistor limitation. Transistors can't switch ON and OFF fast enough to keep up with such a clock.


Q: Why not increase the work done per clock, if the transistor is the limiting factor?

A: Increasing the work done per clock requires more transistors. Each time a transistor switches, heat is generated. More transistors means more heat generated and more power consumed.

 

Copyright © 2005 Chan Yee Yong a.k.a. charge-n-go

 
