Fundamentals

Buffer overflows are caused by incorrect program code.

If more data is written to a reserved memory buffer than it can hold, adjacent memory on the stack is overwritten, including saved register values such as the return address. By overwriting the return address with arbitrary data, an attacker can redirect program flow and execute commands.
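The effect can be modeled safely without corrupting a real stack. The following is a minimal sketch, not an exploit: a single byte array stands in for a stack frame, and the names `fake_frame`, `frame_init`, `unchecked_copy`, and `frame_ret`, as well as the sizes, are hypothetical and chosen only for illustration.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define BUF_SIZE 8   /* hypothetical size of the local buffer */
#define RET_SIZE 4   /* size of a 32-bit saved return address */

/* One contiguous array stands in for a stack frame:
 * bytes [0, BUF_SIZE) model the buffer, the next RET_SIZE bytes
 * model the saved return address stored right after it. */
struct fake_frame {
    unsigned char mem[BUF_SIZE + RET_SIZE];
};

/* Fill the model frame: zeroed buffer plus a recognizable return address. */
void frame_init(struct fake_frame *f, uint32_t ret) {
    assert(f != NULL);
    memset(f->mem, 0, sizeof f->mem);
    memcpy(f->mem + BUF_SIZE, &ret, RET_SIZE);
}

/* Copy the way an unchecked strcpy would: BUF_SIZE is ignored.
 * The length is clamped to the whole array so the demo itself
 * never writes outside memory it owns. */
void unchecked_copy(struct fake_frame *f, const unsigned char *src, size_t n) {
    assert(f != NULL && src != NULL);
    if (n > sizeof f->mem)
        n = sizeof f->mem;
    memcpy(f->mem, src, n);
}

/* Read back the modeled saved return address. */
uint32_t frame_ret(const struct fake_frame *f) {
    uint32_t ret;
    assert(f != NULL);
    memcpy(&ret, f->mem + BUF_SIZE, RET_SIZE);
    return ret;
}
```

Copying 8 bytes leaves the modeled return address intact. Copying 12 bytes of 'A' replaces it with 0x41414141, which is the kind of value a debugger typically shows in EIP after a crash caused by an overflow.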

A common cause of buffer overflows is the use of programming languages, such as C and C++, that do not automatically check the bounds of memory buffers or the stack.

Read up on CPU Architecture


Stack-Based Buffer Overflow

Memory exceptions are the operating system's response to an error in existing software or during the execution of software. Most are programming errors that occur in low-level languages like C or C++. A buffer overflow occurs when data is too large to fit into a buffer reserved in memory and spills past its end, so that memory belonging to other functions of the program can be overwritten.

Programs store both data and instructions in memory. For expected user input, a buffer must be reserved beforehand to hold that input. The instructions define the program flow.

Memory layout of a process

When a program is run, the sections of its ELF file are mapped to segments, and those segments are loaded into the memory of the process.

The Buffer

The .text section contains the assembler instructions of the program. This area can be read-only to prevent the process from modifying its own instructions. Any attempt to write to it results in a segmentation fault.

Vulnerable Program

We are going to write a vulnerable program. On a modern system, exploiting it would normally not work because of protections such as Address Space Layout Randomization (ASLR), which randomizes memory addresses, so we disable these protections for this exercise.

Several functions in the C standard library do not protect memory on their own, because they do not check the size of the destination buffer:

  • strcpy

  • gets

  • sprintf

  • scanf

  • strcat
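For contrast, a bounded copy might look like the sketch below. `bounded_copy` is a hypothetical helper, not part of the C library. It relies on the standard `snprintf`, which never writes more bytes than the size it is given.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: copy src into dst without writing more than
 * dst_size bytes. Returns 0 if the whole string fit and -1 if it had
 * to be truncated. dst is always NUL-terminated when dst_size > 0. */
int bounded_copy(char *dst, size_t dst_size, const char *src) {
    assert(dst != NULL && src != NULL);
    if (dst_size == 0)
        return -1;
    /* snprintf writes at most dst_size bytes, terminator included,
     * and returns the length the full string would have needed. */
    int needed = snprintf(dst, dst_size, "%s", src);
    return (needed >= 0 && (size_t)needed < dst_size) ? 0 : -1;
}
```

Given an 8-byte destination, a 16-character input is cut to 7 characters plus the terminator, and the function reports the truncation instead of overwriting adjacent memory.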

Create a vulnerable program bow.c and compile to bow32

We have written a simple program in C.

#include <stdlib.h>
#include <stdio.h>
#include <string.h>

int bowfunc(char *string) {

	char buffer[1024];		/* fixed-size buffer on the stack */
	strcpy(buffer, string);		/* intentionally unsafe: no length check */
	return 1;
}

int main(int argc, char *argv[]) {

	bowfunc(argv[1]);
	printf("Done.\n");
	return 1;
}

Disable ASLR

┌──(root㉿kali)-[/home/kali]
└─# echo 0 > /proc/sys/kernel/randomize_va_space

┌──(root㉿kali)-[/home/kali]
└─# cat /proc/sys/kernel/randomize_va_space
0

Compilation

$ sudo apt install gcc-multilib
$ gcc bow.c -o bow32 -fno-stack-protector -z execstack -m32
$ file bow32 | tr "," "\n"

bow32: ELF 32-bit LSB shared object
 Intel 80386
 version 1 (SYSV)
 dynamically linked
 interpreter /lib/ld-linux.so.2
 for GNU/Linux 3.2.0
 BuildID[sha1]=93dda6b77131deecaadf9d207fdd2e70f47e1071
 not stripped

CPU Registers

Registers are essential components of the CPU. They offer a small amount of storage space where data can be held temporarily. Some registers have a specific function. The general registers are further divided into data registers, pointer registers, and index registers.

Data registers

64-bit   32-bit   Description
RAX      EAX      Accumulator: used in input/output and for arithmetic operations
RBX      EBX      Base: used in indexed addressing
RCX      ECX      Counter: used in rotate instructions and to count loops
RDX      EDX      Data: used for I/O and in multiply and divide operations involving large values

Pointer registers

64-bit   32-bit   Description
RIP      EIP      Instruction Pointer: stores the offset address of the next instruction to be executed
RSP      ESP      Stack Pointer: points to the top of the stack
RBP      EBP      Base Pointer (also known as Stack Base Pointer or Frame Pointer): points to the base of the current stack frame

Stack Frames

The stack starts at a high memory address and grows down toward lower addresses as values are added. The base pointer points to the beginning (base) of the stack, and the stack pointer points to the top of the stack.

When a function is called, it gets its own section of the stack, called a stack frame. The frame holds everything the function needs, such as local variables and saved register values. A stack frame is delimited by the base pointer (EBP) at its start and the stack pointer (ESP) at its end.

Because the stack uses a Last-In-First-Out (LIFO) structure, the first thing done in a new function is saving the old base pointer (EBP). This way, when the function ends, it can return to the previous stack state.

The Prologue

The prologue first uses push to store the previous EBP on the stack. It then creates the new stack frame by copying ESP into EBP. Finally, it makes room for local variables by subtracting from ESP, which moves the stack pointer to the new top of the stack.

(gdb) disas bowfunc 

Dump of assembler code for function bowfunc:
   0x0000054d <+0>:	    push   ebp       # <---- 1. Stores previous EBP
   0x0000054e <+1>:	    mov    ebp,esp   # <---- 2. Creates new Stack Frame
   0x00000550 <+3>:	    push   ebx
   0x00000551 <+4>:	    sub    esp,0x404 # <---- 3. Moves ESP to the top
   <...SNIP...>
   0x00000580 <+51>:	leave  
   0x00000581 <+52>:	ret  

The Epilogue

To get out of the stack frame, the epilogue does the opposite. The leave instruction replaces ESP with the current EBP and then resets EBP to the value that was saved in the prologue. The ret instruction then returns to the caller.

(gdb) disas bowfunc 

Dump of assembler code for function bowfunc:
   0x0000054d <+0>:	    push   ebp       
   0x0000054e <+1>:	    mov    ebp,esp   
   0x00000550 <+3>:	    push   ebx
   0x00000551 <+4>:	    sub    esp,0x404 
   <...SNIP...>
   0x00000580 <+51>:	leave  # <----------------------
   0x00000581 <+52>:	ret    # <--- Leave stack frame
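The prologue and epilogue can be traced on a toy model. The sketch below is a simplification under stated assumptions: the `cpu` struct and its functions are hypothetical, ESP and EBP are slot indices into an array rather than byte addresses, and the extra `push ebx` seen in the disassembly is left out.

```c
#include <assert.h>
#include <stdint.h>

#define STACK_WORDS 64  /* hypothetical size of the model stack */

/* Toy 32-bit machine with only the pieces needed here. */
struct cpu {
    uint32_t stack[STACK_WORDS]; /* one slot per 4-byte word */
    uint32_t esp;                /* index of the top-of-stack slot */
    uint32_t ebp;                /* index of the current frame base */
};

/* Empty stack: ESP sits just past the highest slot. */
void cpu_init(struct cpu *c) {
    c->esp = STACK_WORDS;
    c->ebp = STACK_WORDS;
}

/* push: the stack grows toward lower addresses, so ESP decreases first. */
void push(struct cpu *c, uint32_t value) {
    assert(c->esp > 0);
    c->esp--;
    c->stack[c->esp] = value;
}

/* pop: read the top slot, then move ESP back up. */
uint32_t pop(struct cpu *c) {
    assert(c->esp < STACK_WORDS);
    return c->stack[c->esp++];
}

/* Prologue: push ebp; mov ebp, esp; sub esp, <locals> */
void prologue(struct cpu *c, uint32_t local_words) {
    push(c, c->ebp);        /* 1. store the previous EBP */
    c->ebp = c->esp;        /* 2. create the new stack frame */
    assert(c->esp >= local_words);
    c->esp -= local_words;  /* 3. move ESP to the new top */
}

/* Epilogue (leave): mov esp, ebp; pop ebp */
void epilogue(struct cpu *c) {
    c->esp = c->ebp;        /* discard the locals */
    c->ebp = pop(c);        /* restore the caller's EBP */
}
```

After `epilogue`, ESP and EBP hold the values they had before `prologue`, and the slot on top of the stack is again the one that `ret` would use as the return address.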

Endianness

When data is stored or loaded in memory, the order of the bytes can vary. This is called endianness. There are two main types: little-endian and big-endian.

  • In big-endian, the most significant byte comes first (at the lowest address).

  • In little-endian, the least significant byte comes first (at the lowest address).
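The two byte orders can be compared with a short sketch. The function names `store_le32`, `store_be32`, and `host_is_little_endian` are hypothetical and used only for illustration.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Write value into out[0..3] with the least significant byte first. */
void store_le32(unsigned char out[4], uint32_t value) {
    assert(out != NULL);
    for (int i = 0; i < 4; i++)
        out[i] = (unsigned char)((value >> (8 * i)) & 0xFF);
}

/* Write value into out[0..3] with the most significant byte first. */
void store_be32(unsigned char out[4], uint32_t value) {
    assert(out != NULL);
    for (int i = 0; i < 4; i++)
        out[i] = (unsigned char)((value >> (8 * (3 - i))) & 0xFF);
}

/* Returns 1 if the machine running this code stores the least
 * significant byte of an integer at the lowest address. */
int host_is_little_endian(void) {
    uint32_t probe = 1;
    unsigned char first;
    memcpy(&first, &probe, 1);
    return first == 1;
}
```

For the value 0xAABBCCDD, `store_le32` produces the bytes DD CC BB AA and `store_be32` produces AA BB CC DD. x86 processors are little-endian, so an address placed in an overflow payload for bow32 would have to be written in the reversed, little-endian byte order.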
