Functions work by manipulating the PC register (program counter), so that execution either jumps to another point in the code or advances sequentially to the next instruction, depending on circumstance. When you call a function, the processor pushes the address of the next instruction (the return address) onto the stack and then jumps to the function that was called. Likewise, when we return from a function, either explicitly via "return" or by reaching the end of the function, that pushed address is popped off the stack and execution jumps back to it.
For instance, say we had this pseudo code, stack and PC:
[code]PC = 0[/code]
[code]TOP_OF_STACK[/code]
[code]
0 main() {
1 myfunction()
2 print "wow"
3 return 0;
}
4 myfunction() {
5 print "hello"
6 return 0
}
[/code]
When the program counter reaches address 1, the processor issues a call instruction, which pushes the number of the next line to be executed onto the stack, so now our stack looks like this:
[code]TOP_OF_STACK
2
[/code]
After the call instruction has pushed the address of the following instruction onto the stack for storage, it jumps to the function by setting the program counter to the function's address. So now we have:
[code]PC = 4[/code]
Since nothing out of the ordinary happens here, the PC is incremented by 1 to execute the next line of code: PC = 5 now, and the print statement is executed. Once that is finished, the PC is incremented again and equals 6.
The return statement takes our stored line number off the stack and forcibly changes our PC to that line number, which changes execution flow. So once the return on line 6 is executed, PC is equal to 2 and the stack looks like this:
[code]TOP_OF_STACK[/code]
Now the processor executes that line of code, which is the print statement, increments PC by one for the next instruction to execute and so forth.
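If you want to poke at the walkthrough above, here is a minimal sketch of it as a toy machine in Python. The instruction names (NOP, CALL, PRINT, RET, HALT) and the run() helper are made up for this sketch and are not real processor instructions; each list index stands in for one of the line numbers in the pseudo code.

```python
# Toy machine: one made-up instruction per "address", matching the pseudo code.
PROGRAM = [
    ("NOP",),            # 0  main() {
    ("CALL", 4),         # 1  myfunction()
    ("PRINT", "wow"),    # 2  print "wow"
    ("HALT",),           # 3  return 0 (main has no caller in this sketch, so stop)
    ("NOP",),            # 4  myfunction() {
    ("PRINT", "hello"),  # 5  print "hello"
    ("RET",),            # 6  return 0
]


def run(program):
    """Execute the toy program; return (printed strings, addresses visited)."""
    pc = 0        # program counter
    stack = []    # holds saved return addresses
    output = []
    trace = []
    while True:
        op = program[pc]
        trace.append(pc)
        if op[0] == "CALL":
            stack.append(pc + 1)  # push the address of the next instruction
            pc = op[1]            # jump to the function
        elif op[0] == "RET":
            pc = stack.pop()      # jump back to the saved address
        elif op[0] == "HALT":
            break
        else:
            if op[0] == "PRINT":
                output.append(op[1])
            pc += 1               # nothing special: fall through to the next line
    return output, trace


if __name__ == "__main__":
    printed, visited = run(PROGRAM)
    print(printed)
    print(visited)
```

The visited addresses show the PC going 0, 1, then jumping to 4, walking 5 and 6, and landing back on 2 once the return pops the stack.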
-------------------------------------------------------------------------------------------------------
The difference between a "break" and a "return" is how they manipulate the program counter. The return keyword takes the return address that was saved when the function was called and jumps back to that position in the code, while break uses a jump whose target is calculated ahead of time to exit a specific block.
For instance:
[code]
0 main() {
1 myfunction()
2 print "wow"
3 return 0;
}
4 myfunction() {
5 for(int i = 0; i < 100000000000000069; i++) {
6 break if i == 69
}
7 print "woweee"
8 return 0
}
[/code]
[code]TOP_OF_STACK
2
[/code]
In this code, when we execute line 5, the processor executes instructions that amount to: check whether i is less than 100000000000000069. If that condition holds, the block of code inside is run, which in this case is a line of code that says if i equals 69 then break from this loop; otherwise i is incremented by one and the check is repeated. The compiler does some math behind the scenes to figure out that if we break from that specific block it should change PC to 7. So we escape the loop early and print "woweee", then execute the return, at which point PC is set back to 2, and so forth.
Now suppose the break was changed to "return if i == 69" instead. Knowing how the return statement works, rather than taking a precalculated PC jump like break, which would change PC to 7, it would pop the old address off the stack and proceed to execute the code at line 2.
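Here is a rough sketch of that difference in Python, tracking the PC and the stack by hand. The run() function and its exit_kind parameter are hypothetical names for this sketch, and the loop is collapsed into a condition check at address 5 and a body at address 6, which is a simplification of what a compiler would actually emit.

```python
LIMIT = 100000000000000069  # loop bound from the pseudo code


def run(exit_kind):
    """exit_kind is "break" or "return"; returns the strings that get printed."""
    output = []
    stack = [2]  # line 1 called myfunction(), so return address 2 was pushed
    pc = 5       # execution is now at the loop inside myfunction
    i = 0
    while pc != 3:  # stop once main reaches its own return on line 3
        if pc == 5:
            # loop condition: run the body while i is below the limit
            pc = 6 if i < LIMIT else 7
        elif pc == 6:
            if i == 69:
                if exit_kind == "break":
                    pc = 7            # fixed target, known ahead of time
                else:
                    pc = stack.pop()  # target read from the stack at run time
            else:
                i += 1
                pc = 5
        elif pc == 7:
            output.append("woweee")
            pc = 8
        elif pc == 8:
            pc = stack.pop()          # normal return at the end of myfunction
        elif pc == 2:
            output.append("wow")
            pc = 3
    return output


if __name__ == "__main__":
    print(run("break"))
    print(run("return"))
```

With "break" the jump lands on line 7, so "woweee" still gets printed before the function returns. With "return" the PC goes straight to the saved address 2 and line 7 never runs.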
---------------------------------------------------------------------------------------------------------------------------------------
This is assembly/processor 101. This should help you understand how the computer actually works when the two instructions are executed.
And yes, it is possible to overwrite the PC stored on the stack to hijack the control flow of the application. This is the goal of stack-based buffer overflow exploits.
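To make that idea concrete, here is a toy model in Python, not a real exploit. The frame layout (a 4-slot buffer sitting right before the saved return address), BUF_SIZE and call_and_return() are all assumptions made up for this sketch; real stack layouts depend on the compiler and platform.

```python
BUF_SIZE = 4  # hypothetical size of a local buffer inside the function's frame


def call_and_return(user_input, return_address=2):
    """Copy user_input into the frame without a bounds check, then 'return'.

    Returns the address the PC would be set to by the return.
    In this toy, user_input may hold at most BUF_SIZE + 1 values;
    anything longer raises IndexError.
    """
    # Toy frame: BUF_SIZE buffer slots followed by the saved return address.
    frame = [0] * BUF_SIZE + [return_address]
    # Unchecked copy: nothing stops the write from running past the buffer.
    for offset, value in enumerate(user_input):
        frame[offset] = value
    # "return": the PC is loaded from the saved slot, whatever it now holds.
    return frame[BUF_SIZE]


if __name__ == "__main__":
    print(call_and_return([1, 2, 3]))        # input fits in the buffer
    print(call_and_return([1, 2, 3, 4, 5]))  # fifth value lands on the saved address
```

When the input fits in the buffer, the function returns to address 2 as usual. One value too many and the return jumps to whatever address the input supplied.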
This post was edited by AbDuCt on Apr 1 2016 12:04am