Intel 80x86 CPU

This section is a gross oversimplification of the 80x86 CPU family, together with information on the 32 bit 80x86 processor registers.


Before I get into describing assembly language instructions, I will explain a little about the CPU, memory, and I/O (input/output) devices. These three components are connected by the system bus. Of these, I will go into more detail on CPU registers and the memory stack.

Within your CPU there are a number of processor registers. These small amounts of storage can be accessed faster than storage anywhere else. Almost all calculations on the 80x86 CPU involve a register.
Registers will generally act as a middleman in a calculation.

You may have heard talk of 32 bit and 64 bit processors. Possibly you never understood exactly what that means, but one of the main differences is the range of addresses the processor is able to address. A 32 bit processor can address 2^32 different addresses. In hexadecimal this is 00000000-FFFFFFFF. On the other hand, a 64 bit processor can address 2^64 different addresses. In hexadecimal this is 0000000000000000-FFFFFFFFFFFFFFFF. Yep, that is 16 Fs instead of 8. Notice it is not double the amount, as is so commonly thought. The 64 bit processor can actually address the square of what the 32 bit processor can (i.e. if you take the number of addresses available to a 32 bit processor as x, the 64 bit processor can address x^2).
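To check the arithmetic yourself, here is a small Python sketch. The variable names are my own and are only for illustration.

```python
# Number of distinct addresses reachable with 32 bit and 64 bit addresses.
addresses_32 = 2 ** 32
addresses_64 = 2 ** 64

# The highest address is one less than the count, because counting starts at 0.
highest_32 = f"{addresses_32 - 1:08X}"
highest_64 = f"{addresses_64 - 1:016X}"

print("32 bit:", addresses_32, "addresses, highest is", highest_32)
print("64 bit:", addresses_64, "addresses, highest is", highest_64)

# The 64 bit count is the 32 bit count squared, not doubled.
is_square = addresses_64 == addresses_32 ** 2
is_double = addresses_64 == addresses_32 * 2
print("squared:", is_square, "doubled:", is_double)
```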

You will often see hexadecimal numbers in one of these forms:

- prefixed by 0x
eg. 0x2C8A
- written with an 'h' on the end
eg. 2C8Ah



Decimal numbers may be seen in this form:
- written with a 'd' on the end.
eg. 11402d

Binary numbers may be seen with a 'b' on the end.
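As a rough sketch of how these notations map to the same value, here is a small Python helper. The function name parse_number is hypothetical, and treating a number with no marker as decimal is my own assumption.

```python
def parse_number(text):
    """Convert a number written with a 0x prefix or an h/d/b suffix to an int."""
    text = text.strip()
    if text.lower().startswith("0x"):
        return int(text[2:], 16)      # 0x prefix: hexadecimal
    suffix = text[-1].lower()
    digits = text[:-1]
    if suffix == "h":
        return int(digits, 16)        # h suffix: hexadecimal
    if suffix == "d":
        return int(digits, 10)        # d suffix: decimal
    if suffix == "b":
        return int(digits, 2)         # b suffix: binary
    return int(text, 10)              # assumption: no marker means decimal


# The hexadecimal and decimal examples from the text are the same number.
print(parse_number("0x2C8A"), parse_number("2C8Ah"), parse_number("11402d"))
```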

There are nine 32 bit registers that you will become most familiar with:

EAX
EBX
ECX
EDX
ESI
EDI
EBP
ESP
EIP

Although each register has a purpose, a lot of the time you can use them for whatever you like. However, you must always follow this rule: if you use a register, you must make sure that after you finish using it, it ends up with the value it was holding before you used it.
One particular register from the list above that I would like to point out is EIP. This register holds the address of the next instruction the processor will execute. In fact, in one of my parts on reversing I will demonstrate how this register can be manipulated to control the flow of code execution.

The 32 bit registers described above can each hold 32 bits of data (4 bytes, or a dword).
EBP and ESP are related to the memory stack, which I will be explaining a little bit later.

Note that the 16 and 8 bit registers are NOT separate from the 32 bit registers.




The 80x86 overlays its 32 bit registers with 16 bit registers, and those in turn with 8 bit registers.

The lower parts of EAX, EBX, ECX, and EDX are called AX, BX, CX, and DX respectively. AX, BX, CX, and DX are 16 bit registers, meaning they can hold 16 bits of data (2 bytes, or a word). These four 16 bit registers are also split into higher and lower parts. The higher parts are called AH, BH, CH, and DH. The lower parts are AL, BL, CL, and DL. The four 32 bit registers described above are called general purpose registers.

Each of the other 32 bit registers also has a 16 bit register within it, but only the four general purpose registers are split further into the 8 bit registers mentioned above.



To help this stick in your memory, I will give an example of how the registers interact with each other.


Let us take, for example, the value of EAX to be
7FEDCBA0
Now AX is the lower word (16 bits) of EAX, therefore the value AX holds is CBA0
AH holds the higher part of AX and therefore holds CB
AL holds the lower part of AX and therefore holds A0
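
The EAX example above can be reproduced with a few bit masks and shifts. This is only a Python sketch of the overlay idea. The variable names mirror the register names, and set_al is a hypothetical helper I made up to show that writing to AL also changes EAX.

```python
eax = 0x7FEDCBA0

ax = eax & 0xFFFF        # lower 16 bits of EAX
ah = (ax >> 8) & 0xFF    # higher 8 bits of AX
al = ax & 0xFF           # lower 8 bits of AX

print(f"EAX={eax:08X} AX={ax:04X} AH={ah:02X} AL={al:02X}")


def set_al(eax_value, new_al):
    """Return EAX after its lowest byte (AL) is overwritten."""
    return (eax_value & 0xFFFFFF00) | (new_al & 0xFF)


# Because AL is part of EAX, changing AL changes EAX as well.
eax_after = set_al(eax, 0x11)
print(f"EAX after AL=11: {eax_after:08X}")
```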


Flags

I have left one very special register until the end. This is the EFLAGS register. It is a 32 bit register holding single-bit boolean values. Boolean means the state of the value is either true or false.

Did you ever wonder what happens when a processor is executing instructions? Most of the time, code execution does not simply run straight from top to bottom but jumps around. How do we control where we jump to? We use conditional jumps.

This is the equivalent of a higher level construct such as IF, which eventually boils down to a bunch of conditional jumps.

8 of the bits inside the EFLAGS register are of particular interest to you when reverse engineering. These bits are called flags. A flag can hold two states, 1 or 0. A flag in the state 1 is said to be set and a flag in the state 0 is said to be clear. The conditional jumping of code I mentioned above depends on the state of one or more flags. In fact, I have seen only one conditional jump so far that does not depend on the state of any flag (JECXZ, which jumps when ECX = 0), so hopefully you will realise how important flags are, because flags basically control a program's flow.
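
To make the link between IF and conditional jumps more concrete, here is a toy Python sketch. It is not real assembly: the tiny instruction list, the run function and the "say" operation are all made up for illustration, and only the names cmp, jz and jmp are borrowed from real 80x86 mnemonics.

```python
def run(eax):
    """Mimic 'if EAX == 5' using a compare, a zero flag and jumps."""
    program = [
        ("cmp", 5),            # 0: set the zero flag if EAX - 5 is 0
        ("jz", 4),             # 1: jump to index 4 when the zero flag is set
        ("say", "not five"),   # 2: reached only when the jump was not taken
        ("jmp", 5),            # 3: unconditional jump past the other branch
        ("say", "five"),       # 4: reached only through the conditional jump
    ]
    eip = 0      # index of the next instruction, playing the role of EIP
    zf = 0       # the zero flag
    output = []
    while eip < len(program):
        op, arg = program[eip]
        eip += 1                   # by default, move on to the next instruction
        if op == "cmp":
            zf = int(eax - arg == 0)
        elif op == "jz" and zf:
            eip = arg              # conditional jump: taken only if ZF is set
        elif op == "jmp":
            eip = arg              # unconditional jump: always taken
        elif op == "say":
            output.append(arg)
    return output


print(run(5))
print(run(3))
```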



The 3 most important flags:

Z-Flag - The zero flag is set or cleared depending upon the result of the prior instruction.
If the result of the prior instruction was 0, then the Z-flag is set (1), otherwise the flag is clear (0).

O-Flag - The overflow flag is set when the prior instruction caused the register operated on to undergo an unexpected change in its highest bit, meaning the result does not fit as a signed number.

C-Flag - The carry flag is set if the result of an addition exceeds FFFFFFFF or the result of a subtraction is less than 0.
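
Here is a rough Python sketch of how these three flags come out for a 32 bit addition. The names add32 and MASK32 are hypothetical, and the sketch covers addition only, whereas a real processor updates the flags for many other instructions too.

```python
MASK32 = 0xFFFFFFFF  # largest value a 32 bit register can hold


def add32(a, b):
    """Add two 32 bit values and return (result, flags) for that addition."""
    a &= MASK32
    b &= MASK32
    full = a + b                 # the true sum, which may need more than 32 bits
    result = full & MASK32       # what actually fits in the register

    zf = int(result == 0)        # Z-flag: the result is 0
    cf = int(full > MASK32)      # C-flag: the true sum exceeded FFFFFFFF

    # O-flag: both inputs have the same highest bit, but the result's differs.
    sign_a, sign_b, sign_r = a >> 31, b >> 31, result >> 31
    of = int(sign_a == sign_b and sign_r != sign_a)

    return result, {"ZF": zf, "CF": cf, "OF": of}


print(add32(0xFFFFFFFF, 1))   # wraps around to 0
print(add32(0x7FFFFFFF, 1))   # the highest bit flips from 0 to 1
print(add32(1, 2))            # nothing special happens
```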

And that's all you will need to know about registers and flags to begin reversing.

The Memory Stack.
The memory stack is a part of your computer's memory you can use to temporarily store data. This data could be an address, a constant, etc. Think of the stack as a massive box that is available for you to stack pieces of paper in. When you put something onto the stack, it is called a push. When you remove something from the stack, it is called a pop. Now let's pretend you really had a stack and you wanted to put things on it. If I had just piled up 10 pieces of paper on top of each other in my stack and I wanted to remove the one at the bottom, I can't just pull it out. Instead I have to remove every other piece on top of it first before being able to get at the piece at the bottom. This is the most important rule about stacks: the first thing to be pushed onto the stack is the last to come off. In practice there are a few exceptions to this, but I won't go into detail. This also goes the other way round. Whatever you last put on the stack is the next thing you take off, unless you put something else on first.
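
The push and pop behaviour can be sketched in Python with an ordinary list. The push and pop helpers and the stack list here are hypothetical stand-ins for illustration, not how the processor implements its stack.

```python
stack = []  # stand-in for the memory stack


def push(value):
    """Put a value on top of the stack."""
    stack.append(value)


def pop():
    """Take the value off the top of the stack and return it."""
    return stack.pop()


# Push three pieces of paper, one on top of the other.
for page in ("first", "second", "third"):
    push(page)

# They come off in the reverse order: the last one pushed is the first popped.
removed = [pop(), pop(), pop()]
print(removed)  # ['third', 'second', 'first']
```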