World Building of Some Sort
Ever wondered why C is still taught in the introductory courses of most computer-related degrees despite its various problems and vulnerabilities (don’t quote us on this, btw)? Well, we sure as hell don’t know either. One thing is for sure, though: it is a low-level language (meaning it is close to machine language), and it lets us directly manipulate things such as memory allocation. In a sense, studying C helps learners gain a foothold in programming and its general concepts such as object types, memory, loops, conditionals, and libraries. It essentially serves as a must-know language that sits between assembly and high-level programming languages. But, oh boy, is it dangerous.
For this write-up, we shall be HACKERMANS!!! Having been introduced to memory safety vulnerabilities, we are here to simulate a scenario where we will try to exploit memory safety vulnerabilities to gain access to someone’s device or system. Let’s get right into it.
9 + 10 = 21 (Laying Foundations)
Given some code:
1 #include <stdio.h>
2 #include <stdlib.h>
3 #include <memory.h>
4
5 char* read_string(int length) {
6     char input[length];
7     scanf("%s", input);
8     char* ret = malloc(sizeof(int) * length);
9     memcpy(ret, input, length);
10     return ret;
11 }
12
13 int main() {
14     int length = 16;
15     char* in = read_string(length);
16     printf("%s", in);
17 }
We aim to infiltrate the victim’s system by opening its PowerShell with elevated privileges. For this exercise, let us say we were provided some additional "essential" information to accomplish this great heist of ours.
The state of the memory stack at line 7:
0xffffcff0: 0x00000000 0xf7ffcfd4 0x0000002c 0x565561ca
0xffffd000: 0x0000000f 0xffffcff0 0x01000000 0x8ce2b800
0xffffd010: 0x56559000 0xffffd114 0xffffd048 0x56556298
The memory address of input is 0xffffcff0.
The Return Instruction Pointer (rip) is stored at 0xffffd01c.
And lastly, the shellcode that is supposed to open PowerShell:
\x6a\x32\x58\xcd\x80\x89\xc3\x89\xc1\x6a\x47\x58\xcd\x80\x31\xc0
For Legal Reasons…
At first glance, the code is just your typical C code: a function that reads and prints out a string of length 16, with some elven language pertaining to pointers and dynamic memory allocation. But looking at it closely, we start to see vulnerabilities that could have been prevented if memory safety had been on the agenda.
Nevertheless, and for legal reasons: practicing exploitation in any shape or form, memory safety exploits included, is unethical, and this write-up is for educational purposes only. Any offenses committed are solely the responsibility of whoever carries them out, not of this article’s team of writers.
Let’s crack our knuckles as we type into our keyboards while reflecting green numbers in our cool sunglasses under our hoodies.
The Exploitation
Step 1. We need to determine the location of input and of the rip so that we can work out the distance between their memory addresses. Since we have been magically supplied with this information by a black hat friend of ours who lives in their parents’ basement, we can simply calculate how many bytes apart they are:
rip (0xffffd01c) - input (0xffffcff0) = distance
Or in decimal terms:
4294955036 - 4294954992 = 44 bytes
The 16-byte input buffer starts at 0xffffcff0, so it is already counted inside those 44 bytes and must not be added a second time. This means there are 44 characters’ worth of memory for us to traverse, or overflow through, to get from the start of input to the rip; the 4 bytes after that are the rip itself.
We were conveniently handed the addresses of input and rip here. In a scenario where that is not the case, people use tools such as GDB to locate memory addresses or print out machine instructions.
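The subtraction in Step 1 can be double-checked with a few lines of C. This is only a sketch: the helper name bytes_between and the two macros are made up for illustration, and the addresses are simply the ones handed to us above. It shows that the gap between the two addresses is 0x2c, or 44 bytes, with the 16-byte buffer sitting inside that span.

```c
#include <stdint.h>

/* Addresses as given in the write-up (32-bit stack addresses). */
#define INPUT_ADDR 0xffffcff0u
#define RIP_ADDR   0xffffd01cu

/* Hypothetical helper: number of bytes from the start of a buffer up to,
 * but not including, a higher address on the same stack. */
uint32_t bytes_between(uint32_t buffer_start, uint32_t target) {
    return target - buffer_start;
}
```

Calling bytes_between(INPUT_ADDR, RIP_ADDR) gives 44; taking away the 16 bytes of input leaves 28 bytes of other stack contents between the end of the buffer and the rip.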
Step 2. Commence buffer overflow. We shall smash that stack by typing enough filler characters to cover the whole distance from input to the rip that we worked out in Step 1. Any random characters will do for the filler; what matters is that the bytes that come right after it land on the rip. And since our goal is to open PowerShell to infiltrate the system, those bytes overwrite the contents of rip with an address that points to our shellcode, \x6a\x32\x58\xcd\x80\x89\xc3\x89\xc1\x6a\x47\x58\xcd\x80\x31\xc0, which travels in the same crafted string.
Step 3. We are almost in! Run the program and input your crafted string along with the shellcode. When read_string returns, execution jumps to the shellcode instead of back to main, and PowerShell opens on our victim’s system, which gives us the ability to execute arbitrary PowerShell commands. HACKERMANS I’M IN
What did we exploit?
Buffer Overflow - C has no native bounds checking, so attackers can input more data than a buffer can hold and overwrite the values stored at the addresses that follow it.
Stack Smashing - an upgraded version of buffer overflow where we actively seek the address of rip and overwrite it to run the shellcode \x6a\x32\x58\xcd\x80\x89\xc3\x89\xc1\x6a\x47\x58\xcd\x80\x31\xc0. On a little-endian machine such as x86, a word-by-word memory dump like the one shown earlier displays those 16 bytes as 0xcd58326a 0x89c38980 0x58476ac1 0xc03180cd, because the lowest-addressed byte of each 4-byte word is printed last.
Format Vulnerability - in a way, this is being exploited too: scanf is called with a bare %s, which puts no limit on how many characters the user can type (a width such as %15s would cap it for a 16-byte buffer).
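The byte-versus-word view mentioned under Stack Smashing is easy to get backwards, so here is a small sketch of it, assuming a little-endian machine such as x86. The function names load_le32 and bytes_to_le_words are made up for this illustration. It rebuilds the 32-bit words that a memory dump would print from a plain sequence of bytes, lowest address first.

```c
#include <stddef.h>
#include <stdint.h>

/* Assemble the 32-bit value a little-endian CPU would load from p[0..3]:
 * p[0] is the least significant byte, p[3] the most significant. */
uint32_t load_le32(const unsigned char *p) {
    return (uint32_t)p[0]
         | ((uint32_t)p[1] << 8)
         | ((uint32_t)p[2] << 16)
         | ((uint32_t)p[3] << 24);
}

/* Convert a byte string into little-endian words, in ascending address
 * order. Trailing bytes that do not fill a whole word are ignored, and
 * out must have room for n_bytes / 4 words. */
void bytes_to_le_words(const unsigned char *bytes, size_t n_bytes,
                       uint32_t *out) {
    for (size_t i = 0; i + 4 <= n_bytes; i += 4) {
        out[i / 4] = load_le32(bytes + i);
    }
}
```

For example, the four bytes 98 62 55 56 in memory come out as the word 0x56556298, which is the value the stack dump shows in the rip slot at 0xffffd01c.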
Condoms and Mitigations
There is that saying people bring up every time something has already gone wrong: "prevention is better than cure," or whatever that was. Well, not to burst anyone’s bubble, but it applies in this field of work too. Enough said. How can we actually prevent such attacks?
First of all, just use anything BUT C
There are memory-safe languages, such as Python, that check bounds and manage memory for you, so this whole class of bug mostly goes away just through how they handle memory.
Write code with security in mind and practice defensive programming
Any place where the program asks for user input is a potential vulnerability. That said, programmers should practice adding checks, especially bounds checks, so that input cannot overflow its buffer. Take line 7 of the code:
7 scanf("%s", input);
Checking strlen(input) after this call would be too late, because by then scanf has already written past the end of input. The read itself has to be bounded. One option is to replace line 7 with fgets, which never stores more than length bytes, terminator included:
7 if (fgets(input, length, stdin) == NULL) {
8     return NULL;
9 }
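Putting that idea together, a bounded version of read_string could look something like the sketch below. It is one possible rewrite under our own assumptions, not the only fix: the name read_string_bounded and the FILE* parameter are ours (passing the stream in just makes the function easy to try out), and it reads a whole line rather than a single whitespace-delimited word as %s does.

```c
#include <limits.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Read at most length - 1 characters of one line from `in` into a freshly
 * allocated buffer of exactly `length` bytes. Returns NULL on EOF, read
 * error, unusable length, or allocation failure. The caller frees the
 * result. */
char *read_string_bounded(FILE *in, size_t length) {
    if (length < 2 || length > INT_MAX) {
        return NULL;                 /* fgets takes its size as an int */
    }
    char *ret = malloc(length);      /* length bytes, not sizeof(int) * length */
    if (ret == NULL) {
        return NULL;
    }
    if (fgets(ret, (int)length, in) == NULL) {
        free(ret);
        return NULL;
    }
    ret[strcspn(ret, "\n")] = '\0';  /* drop the trailing newline, if any */
    return ret;
}
```

With length set to 16 and a 40-character line typed in, the returned string holds only the first 15 characters; the rest of the line stays unread in the stream instead of spilling onto the stack.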
Use safe libraries
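Instead of sprintf, we could use snprintf.
Instead of strcpy, we could use strncpy (and write the terminating '\0' ourselves, since strncpy does not add one when the source fills the whole destination).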
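To make the two swaps above concrete, here is a minimal sketch of both. The wrapper names copy_bounded and copy_with_strncpy are hypothetical; only snprintf and strncpy themselves come from the standard library.

```c
#include <stdio.h>
#include <string.h>

/* Copy src into dst (capacity dst_size) with snprintf, which always
 * NUL-terminates when dst_size > 0. Returns 0 if src fit, 1 if it was
 * truncated, and -1 on bad arguments or a formatting error. */
int copy_bounded(char *dst, size_t dst_size, const char *src) {
    if (dst == NULL || src == NULL || dst_size == 0) {
        return -1;
    }
    int needed = snprintf(dst, dst_size, "%s", src);
    if (needed < 0) {
        return -1;
    }
    return (size_t)needed >= dst_size ? 1 : 0;
}

/* Same copy with strncpy. strncpy leaves dst unterminated when src has
 * dst_size - 1 or more characters, so the last byte is set explicitly. */
void copy_with_strncpy(char *dst, size_t dst_size, const char *src) {
    if (dst == NULL || src == NULL || dst_size == 0) {
        return;
    }
    strncpy(dst, src, dst_size - 1);
    dst[dst_size - 1] = '\0';
}
```

Copying "0123456789" into an 8-byte buffer with either function leaves "0123456" in the buffer; copy_bounded additionally reports the truncation through its return value.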
Other than those, there are defenses at the compiler and operating system level: compiling the code with hardening options, marking memory pages so that none is both writable and executable, adding a stack canary in front of the saved return address, signing pointers with pointer authentication codes, and, lastly, randomizing the address space layout (ASLR). These are further mitigations we can layer on top to cover these vulnerabilities. Then again, resources are limited, and prioritization may leave some of them out of a project, but at least we know they exist, and that is something we should know.
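Of the mitigations above, the canary is the easiest one to picture in code. The sketch below is a toy simulation only: real canaries are inserted by the compiler and use a random value chosen at program start, whereas here the frame is just a 24-byte array and the guard value, the layout, and all the names are made up for illustration.

```c
#include <stddef.h>
#include <string.h>

/* Toy frame layout: [ 16-byte buffer | 4-byte canary | 4-byte return slot ] */
enum { BUF_SIZE = 16, CANARY_SIZE = 4, FRAME_SIZE = 24 };

/* Fixed guard value for the demo; a real canary is random per run. */
static const unsigned char CANARY[CANARY_SIZE] = {0xde, 0xc0, 0xad, 0x00};

/* Zero the frame and plant the canary right after the buffer. */
void frame_init(unsigned char frame[FRAME_SIZE]) {
    memset(frame, 0, FRAME_SIZE);
    memcpy(frame + BUF_SIZE, CANARY, CANARY_SIZE);
}

/* Simulate an unchecked copy into the buffer: n bytes are written from the
 * start of the frame. n is clamped to FRAME_SIZE so the simulation itself
 * never leaves the array. */
void frame_unchecked_write(unsigned char frame[FRAME_SIZE],
                           const unsigned char *data, size_t n) {
    if (n > FRAME_SIZE) {
        n = FRAME_SIZE;
    }
    memcpy(frame, data, n);
}

/* Returns 1 if the canary is untouched, 0 if something wrote over it. */
int frame_canary_intact(const unsigned char frame[FRAME_SIZE]) {
    return memcmp(frame + BUF_SIZE, CANARY, CANARY_SIZE) == 0;
}
```

Writing 16 bytes leaves the canary intact, while writing all 24 tramples it on the way to the return slot. A program protected this way checks the canary before returning and aborts if it has changed, rather than jumping to whatever landed in the return slot.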