Buffer overflow and format string attacks: the basics 2023

Poison_tools


Overview[Buffer overflow and format string attacks]:​

I have come across various analysts who want to know how buffer overflows and format string attacks actually happen. The articles available on the web are usually pitched at an advanced level and dive straight into the details. This series is meant to provide the basic building blocks for those advanced articles. This article is the first in the series and focuses only on buffer overflow attacks. We will cover not only how buffer overflows occur, but also how to defend against them.

What is a buffer overflow?​

Buffer overflow attacks are among the most insidious attacks in information security. They are analogous to pouring water into a bucket: when more water is added than the bucket can hold, the water overflows and spills. The same happens with a buffer overflow, which occurs when more data is written to a variable than it can hold. The excess data then spills into adjacent memory locations. Now you must be asking, “So what?” After all, it’s just data spilling over. But imagine that the adjacent memory holds something the program relies on to decide what to execute next, and the spilled data overwrites it. Before we go a step deeper, let’s refresh our understanding of how a program runs on a computer.

When a program is running, the CPU fetches instructions from memory one at a time. How does the CPU know where the next instruction is? It uses the instruction pointer (IP), which tells the CPU where in memory to fetch the next instruction from. With each fetch, the instruction pointer is advanced so that it points to the following instruction. Whenever the CPU encounters a branch or jump instruction, the IP is set to a completely new memory location, and fetching continues from there.
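
The fetch cycle described above can be sketched as a toy interpreter. This is only an illustration: the four-instruction set, the `run` function and the step cap are all invented for this example and do not correspond to any real CPU.

```c
#include <stddef.h>

/* Hypothetical toy instruction set, invented for this sketch. */
enum opcode { OP_NOP, OP_INC, OP_JMP, OP_HALT };

struct instr {
    enum opcode op;
    size_t target;              /* only meaningful for OP_JMP */
};

/* Runs the program and returns the accumulator.
 * ip plays the role of the instruction pointer. */
int run(const struct instr *prog, size_t len)
{
    size_t ip = 0;
    size_t steps = 0;
    int acc = 0;

    /* The step cap only keeps a looping program from running forever. */
    while (ip < len && steps < 1000) {
        struct instr cur = prog[ip];    /* fetch the instruction at ip */
        ip++;                           /* by default, move to the next one */
        steps++;

        switch (cur.op) {
        case OP_INC:
            acc++;
            break;
        case OP_JMP:
            ip = cur.target;            /* a jump replaces ip entirely */
            break;
        case OP_HALT:
            return acc;
        case OP_NOP:
            break;
        }
    }
    return acc;
}
```

With the program `INC, JMP 3, INC, INC, HALT`, the jump skips the instruction at index 2, so only two increments execute. Whoever controls the value loaded into `ip` controls what runs next, which is why the return pointer discussed below matters so much.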

Now let’s look at a simple program like the one below and its stack representation.

image-296.png

Stack Representation​

Below is the stack representation of a normal stack and a buffer-overflowed stack.

Normal Stack

Before moving on to the Buffer Overflow stack, a few important points about the above stack:

  • The stack is LIFO (last in, first out), i.e. items are pushed onto the top of the stack and popped from the top of the stack.
  • The return pointer contains the address in the calling function at which execution resumes. After the subroutine call is complete, control is transferred back to the calling function at that address.
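
To make the two points above concrete, here is a minimal sketch of a LIFO stack that holds a return address underneath a local variable. The `struct stack` type, the slot count and `simulate_call` are assumptions made for illustration; a real call stack is managed by the CPU and the compiler, not by code like this.

```c
#include <stddef.h>

#define STACK_SLOTS 16

struct stack {
    unsigned long slots[STACK_SLOTS];
    size_t top;                 /* number of slots currently in use */
};

/* Pushes v onto the top of the stack. Returns 0 on success, -1 if full. */
int push(struct stack *s, unsigned long v)
{
    if (s->top == STACK_SLOTS)
        return -1;
    s->slots[s->top++] = v;
    return 0;
}

/* Pops the top value into *out. Returns 0 on success, -1 if empty. */
int pop(struct stack *s, unsigned long *out)
{
    if (s->top == 0)
        return -1;
    *out = s->slots[--s->top];
    return 0;
}

/* Simulates one subroutine call: the return address goes on first,
 * then a local variable. On the way out the local is popped first
 * (last in, first out) and the return address last. The result is
 * the address at which the caller resumes. */
unsigned long simulate_call(struct stack *s, unsigned long return_addr)
{
    unsigned long local = 0;
    unsigned long resume = 0;

    push(s, return_addr);
    push(s, 42UL);              /* stand-in for a local variable */

    pop(s, &local);
    pop(s, &resume);
    return resume;
}
```

As long as nothing tampers with the slot holding the return address, `simulate_call` hands back exactly the address it was given.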

Buffer Overflow Stack

In this case, the user-supplied input is not properly checked by the function and overflows the buffer. Typically, the attacker sends machine-specific code, such as code that launches /bin/sh in this case, together with a new value for the return pointer. Where will this new pointer point? You guessed right: to the address where the attacker’s code sits, so that it runs when the function returns. But as you can see, this attack needs to be very precise in order to overwrite the variables and the return pointer. How does an attacker abuse the buffer? We will see in the next part.
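
The overwrite can be demonstrated safely by simulating the frame inside a single byte array rather than overflowing a real stack. The layout used here (an 8-byte buffer immediately followed by an 8-byte saved return pointer) and all of the names are assumptions for illustration; real frame layouts depend on the compiler and platform.

```c
#include <stdint.h>
#include <string.h>

#define BUF_SIZE 8

/* Simulated stack frame (assumed layout):
 * bytes [0, BUF_SIZE)       the local buffer
 * bytes [BUF_SIZE, end)     the saved return pointer */
struct frame {
    unsigned char mem[BUF_SIZE + sizeof(uint64_t)];
};

void frame_init(struct frame *f, uint64_t ret)
{
    memset(f->mem, 0, BUF_SIZE);
    memcpy(f->mem + BUF_SIZE, &ret, sizeof ret);
}

uint64_t frame_ret(const struct frame *f)
{
    uint64_t ret;
    memcpy(&ret, f->mem + BUF_SIZE, sizeof ret);
    return ret;
}

/* Copies n bytes into the buffer without comparing n to BUF_SIZE,
 * which is the mistake being illustrated. The copy is clamped to the
 * whole simulated frame only so that this demo itself stays within
 * its own array. */
void unchecked_copy(struct frame *f, const unsigned char *in, size_t n)
{
    if (n > sizeof f->mem)
        n = sizeof f->mem;
    memcpy(f->mem, in, n);
}
```

Copying 8 bytes leaves the saved return pointer untouched. Copying 16 bytes, where the last 8 hold a value of the sender's choosing, replaces it, which is the precise overwrite the paragraph above describes.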

Now, as we’ve seen, in order to overwrite the return pointer and make a program execute malicious code, an attacker must craft the input precisely. But how does the attacker do it? An attacker finds and exploits a buffer overflow vulnerability in several steps.

  • The very first step in exploiting a buffer overflow vulnerability is to find it. If an attacker has a binary executable, he can look for calls to unsafe functions. Remember that a buffer overflow attack starts with input provided by the user and a function that copies that input. Attackers typically look for functions such as:
  • strcpy
  • strncpy
  • strcat
  • sprintf
  • scanf
  • fgets
  • gets
  • getws
  • memcpy
  • memmove

All these functions move data between memory locations, and developers often call them without checking that the destination is large enough.
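
As a contrast to the unchecked calls listed above, the following is a minimal sketch of a bounded copy built on the standard `snprintf`. The helper name `copy_bounded` and its return convention are assumptions for this example. Unlike `strcpy`, it never writes past the destination, and unlike `strncpy`, it always leaves the destination NUL-terminated.

```c
#include <stdio.h>
#include <string.h>

/* Copies src into dst, which has room for cap bytes including the
 * terminating NUL. Returns 0 if the whole string fit, -1 if it had
 * to be truncated or the arguments were unusable. */
int copy_bounded(char *dst, size_t cap, const char *src)
{
    int written;

    if (dst == NULL || src == NULL || cap == 0)
        return -1;

    /* snprintf writes at most cap - 1 characters plus a NUL and
     * returns the length the full string would have needed. */
    written = snprintf(dst, cap, "%s", src);
    if (written < 0 || (size_t)written >= cap)
        return -1;
    return 0;
}
```

A caller can treat the -1 result as a signal to reject the input instead of silently working with a truncated copy.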

  • After finding these vulnerable function calls, the attacker tries to figure out how much input is needed to overwrite the return pointer. To do this, the attacker passes a long run of repeated input, such as a series of “A” characters, into the input field and then checks which part of the input overwrote the return pointer.
  • After this step, the attacker pushes the exploit code into memory. An attacker will usually try to invoke a shell and execute arbitrary commands. The exploit code runs with the privileges of the vulnerable program. For this reason, a SUID root program is a very attractive target, because the exploit code will run with root privileges. On Unix systems, attackers typically target programs that run with UID 0, and on Windows, programs that run as SYSTEM.
  • An attacker should also make sure that the exploit code fits in the buffer and does not contain characters that are filtered out.
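
The offset-finding step can be sketched as a simple search. A run of identical “A” characters shows that the return pointer was overwritten but not by which position in the input, so this sketch assumes a marker pattern in which every byte is different, which goes slightly beyond the description above. The names `make_pattern` and `find_offset` are hypothetical, and the pattern is only unique for inputs of up to 256 bytes.

```c
#include <stddef.h>
#include <string.h>

/* Fills out[0..n) with a marker pattern in which byte i holds the
 * value i. Every position is distinct as long as n <= 256. */
void make_pattern(unsigned char *out, size_t n)
{
    size_t i;
    for (i = 0; i < n; i++)
        out[i] = (unsigned char)i;
}

/* Given the bytes observed in the overwritten return pointer, finds
 * where in the pattern they came from. Returns the offset, or -1 if
 * the observed bytes do not occur in the pattern. */
long find_offset(const unsigned char *pattern, size_t n,
                 const unsigned char *observed, size_t width)
{
    size_t i;

    if (width == 0 || width > n)
        return -1;
    for (i = 0; i + width <= n; i++) {
        if (memcmp(pattern + i, observed, width) == 0)
            return (long)i;
    }
    return -1;
}
```

If the bytes found in the return pointer match the pattern at offset 16, then 16 bytes of filler are needed before the value that should land in the return pointer.
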
So the above steps describe how a buffer overflow is actually exploited. We should now explore how to defend against buffer overflow attacks.

  • Check the input size wherever possible, and truncate or reject the input if it is too large.
  • Ensure that a non-executable stack is used. The stack stores function call arguments, return addresses and local variables, but not executable code. So if the stack is made non-executable, most buffer overflow attacks that run code from the stack can be stopped. Windows has a feature called “Data Execution Prevention” (DEP), which is used to make the stack non-executable.
DEP settings are available under System > Advanced > Performance > Settings > Data Execution Prevention.

image-297.png
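
Returning to the first defense in the list, checking the input size: the sketch below reads one line with the standard `fgets`, which never writes more than the given capacity, and reports when the line was too long so the caller can reject it. The function name `read_line_checked` and its return codes are assumptions made for this example.

```c
#include <stdio.h>
#include <string.h>

/* Reads one line from in into buf (capacity cap, including the NUL).
 * Returns  0 if the whole line fit (the newline is stripped),
 *          1 if the line was too long (buf holds the truncated
 *            prefix and the rest of the line is discarded),
 *         -1 on end of input, read error, or unusable arguments. */
int read_line_checked(FILE *in, char *buf, size_t cap)
{
    size_t len;
    int c;

    if (in == NULL || buf == NULL || cap < 2)
        return -1;
    if (fgets(buf, (int)cap, in) == NULL)
        return -1;

    len = strlen(buf);
    if (len > 0 && buf[len - 1] == '\n') {
        buf[len - 1] = '\0';
        return 0;
    }

    /* No newline was stored: either the input ended, the line filled
     * the buffer exactly, or the line was longer than the buffer. */
    c = fgetc(in);
    if (c == EOF || c == '\n')
        return 0;

    /* Too long: discard the remainder so the next read starts clean. */
    while (c != '\n' && c != EOF)
        c = fgetc(in);
    return 1;
}
```

Whether to keep the truncated prefix or reject the line outright is a policy choice for the caller; the return value of 1 makes either possible.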

  • For compile-time protection, some compilers compute a keyed hash of the return pointer (RP) when the RP is pushed onto the stack. This keyed hash value is known as a canary. The canary and the RP are both put on the stack, and when the function is about to return, the system checks that the canary still matches the RP. If it does not, the program does not return from the function (meaning the malicious code cannot be executed) and exits instead.
  • Apply patches whenever the vendor releases them.
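
The canary check described above can be sketched in the same simulated style. Two things here are assumptions rather than a description of any particular compiler: the frame layout (buffer, then canary, then return pointer), and the use of a plain XOR with a secret key as a stand-in for the keyed hash.

```c
#include <stdint.h>
#include <string.h>

#define BUF_SIZE 8

/* Simulated frame (assumed layout): [buffer][canary][return pointer]. */
struct guarded_frame {
    unsigned char mem[BUF_SIZE + 2 * sizeof(uint64_t)];
};

/* Stand-in for the keyed hash: XOR of the return pointer and a key. */
static uint64_t make_canary(uint64_t ret, uint64_t key)
{
    return ret ^ key;
}

/* Function entry: store the canary and the return pointer. */
void guarded_init(struct guarded_frame *f, uint64_t ret, uint64_t key)
{
    uint64_t canary = make_canary(ret, key);

    memset(f->mem, 0, BUF_SIZE);
    memcpy(f->mem + BUF_SIZE, &canary, sizeof canary);
    memcpy(f->mem + BUF_SIZE + sizeof canary, &ret, sizeof ret);
}

/* Function exit: returns 1 if the stored canary still matches the
 * stored return pointer, 0 if either one has been altered. */
int guarded_check(const struct guarded_frame *f, uint64_t key)
{
    uint64_t canary;
    uint64_t ret;

    memcpy(&canary, f->mem + BUF_SIZE, sizeof canary);
    memcpy(&ret, f->mem + BUF_SIZE + sizeof canary, sizeof ret);
    return canary == make_canary(ret, key);
}
```

An overflow that runs from the buffer into the return pointer has to cross the canary first, so without knowing the key the two values stop matching and `guarded_check` returns 0, at which point a real implementation would terminate the program instead of returning.
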

We have now learned the basics of buffer overflows. In the next part of the series, we will learn how format string attacks occur.