The next examples refer to Linux, but can be easily adapted to the *BSD world.
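So far, we have seen how to execute simple commands using system calls. To obtain our shellcode, we now only have to get the opcodes corresponding to the assembly instructions. There are typically three methods to get the opcodes: writing the machine code by hand, writing the assembly code and then extracting the opcodes from the assembled binary, and writing the code in C and then disassembling the compiled program.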
I don't think this is the right place to talk about ModRM and SIB bytes, memory addressing and so on. So we won't delve here into writing hand-crafted machine code; anyway, you can find all the information you want (and probably more) in [Intel]. So let's take a look now at the other two methods.
The second method is by far the most efficient and widespread, though we will see that all methods lead to the same results. Our first step will be to use the assembly code from the previous "exit.asm" example to write a shellcode that, using the _exit(2) syscall, will make the application exit cleanly. To get the opcodes, we will first assemble the code with nasm and then disassemble the freshly built object file with objdump:
$ nasm -f elf exit.asm
$ objdump -d exit.o

exit.o:     file format elf32-i386

Disassembly of section .text:

00000000 <.text>:
   0:   bb 00 00 00 00          mov    $0x0,%ebx
   5:   b8 01 00 00 00          mov    $0x1,%eax
   a:   cd 80                   int    $0x80
$
The second column contains the opcodes we need. Therefore, we can write our first shellcode and test it with a very simple C program (sc_exit.c) "borrowed" from [Phrack]:
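/* Opcodes taken from the second column of the objdump listing */
char shellcode[] = "\xbb\x00\x00\x00\x00"
                   "\xb8\x01\x00\x00\x00"
                   "\xcd\x80";

int main()
{
        int *ret;                       /* first local variable of main() */

        ret = (int *)&ret + 2;          /* point ret at the saved return address */
        (*ret) = (int)shellcode;        /* overwrite it with the shellcode address */
}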
Though very popular, the above lines may not be that straightforward. Anyway, they simply overwrite the return address of the main() function with the address of the shellcode, in order to execute the shellcode instructions upon exit from main(). After the first declaration, the stack will look like:
Return address | <-- | Return address (pushed by the CALL instruction) to store in EIP upon exit
Saved EBP      | <-- | Saved EBP (to be restored upon exit from the function)
ret            | <-- | First local variable of the main() function
The second statement takes the address of the ret variable and adds 8 bytes (2 dwords) to it, obtaining the location of the return address, i.e. the pointer to the first instruction which will be executed upon exit from the main() function. Finally, the third statement overwrites the return address with the address of the shellcode. At this point, the program exits from the main() function, restores EBP, loads the address of the shellcode into EIP and executes it.
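The "8 bytes (2 dwords)" figure comes straight from C pointer arithmetic: adding 2 to an int pointer moves it by 2 * sizeof(int) bytes. The following is only a sketch of that arithmetic; the helper name and the stand-in array are hypothetical, and it does not touch the real stack frame. Whether the return address really sits two slots above ret depends on the compiler laying out the frame exactly as pictured above, which newer compilers may not do.

```c
#include <assert.h>
#include <stddef.h>

/* Byte distance covered when an int pointer is advanced by `slots`
 * elements, as in the expression `(int *)&ret + 2`.
 * Hypothetical helper, for illustration only. */
size_t int_ptr_advance_bytes(size_t slots)
{
    int frame[8] = {0};                 /* stand-in for a few stack slots */
    char *base;
    char *moved;

    assert(slots < 8);
    base  = (char *)&frame[0];
    moved = (char *)(&frame[0] + slots);
    return (size_t)(moved - base);
}
```

With a 4-byte int, as on i386, int_ptr_advance_bytes(2) returns 8: one slot for ret itself and one for the saved EBP.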
To see all this in operation, we just have to compile sc_exit.c and run it:
$ gcc -o sc_exit sc_exit.c
$ ./sc_exit
$
Let me guess: your mouth is not really wide open in amazement! Anyway, if we want to make sure that it really was our shellcode that made the program exit, we can verify it with strace:
$ strace ./sc_exit
execve("./sc_exit", ["./sc_exit"], [/* 16 vars */]) = 0
uname({sys="Linux", node="Knoppix", ...}) = 0
brk(0) = 0x8049588
old_mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x40017000
access("/etc/ld.so.nohwcap", F_OK) = -1 ENOENT (No such file or directory)
open("/etc/ld.so.preload", O_RDONLY) = -1 ENOENT (No such file or directory)
open("/etc/ld.so.cache", O_RDONLY) = 3
fstat64(3, {st_mode=S_IFREG|0644, st_size=60420, ...}) = 0
old_mmap(NULL, 60420, PROT_READ, MAP_PRIVATE, 3, 0) = 0x40018000
close(3) = 0
access("/etc/ld.so.nohwcap", F_OK) = -1 ENOENT (No such file or directory)
open("/lib/libc.so.6", O_RDONLY) = 3
read(3, "\177ELF\1\1\1\0\0\0\0\0\0\0\0\0\3\0\3\0\1\0\0\0\200^\1"..., 512) = 512
fstat64(3, {st_mode=S_IFREG|0644, st_size=1243792, ...}) = 0
old_mmap(NULL, 1253956, PROT_READ|PROT_EXEC, MAP_PRIVATE, 3, 0) = 0x40027000
old_mmap(0x4014f000, 32768, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED, 3, 0x127000) = 0x4014f000
old_mmap(0x40157000, 8772, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_ANONYMOUS, -1, 0) = 0x40157000
close(3) = 0
munmap(0x40018000, 60420) = 0
_exit(0) = ?
$
On the last line, you can notice our _exit(2) system call.
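For comparison, the same system call can be issued from C through the syscall(2) wrapper, and a parent process can confirm the exit status without strace. This is a minimal sketch, assuming Linux with glibc; the function name is hypothetical and is not part of the original test program.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE             /* for the syscall() declaration */
#endif
#include <assert.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Fork a child that terminates through the raw exit system call (the
 * one our shellcode invokes) and return the exit status seen by the
 * parent, or -1 on error. Hypothetical helper, for illustration only. */
int exit_status_via_raw_syscall(int status)
{
    pid_t pid;
    int wstatus = 0;

    assert(status >= 0 && status <= 255);

    pid = fork();
    if (pid < 0)
        return -1;
    if (pid == 0) {
        syscall(SYS_exit, status);  /* same call as mov eax, 1 / int 0x80 */
        _exit(127);                 /* not reached */
    }
    if (waitpid(pid, &wstatus, 0) < 0)
        return -1;
    return WIFEXITED(wstatus) ? WEXITSTATUS(wstatus) : -1;
}
```

Calling exit_status_via_raw_syscall(0) mirrors what the shellcode does with EBX set to 0.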
Unfortunately, looking at the shellcode, we can notice a little problem: it contains a lot of null bytes. Since the shellcode is often written into a string buffer, the application will treat the first of those bytes as a string terminator, only part of the shellcode will be copied, and the attack will fail. There are two ways to get around this problem.
We will now apply the first one, which consists of replacing the offending instructions with equivalent ones whose opcodes contain no null bytes; we will implement the second later.
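To see why null bytes are fatal, consider what happens when the first shellcode goes through a string function. The sketch below is only an illustration: the function name is hypothetical and strcpy() stands in for whatever copy routine a vulnerable application might use.

```c
#include <assert.h>
#include <string.h>

/* The first shellcode, as extracted from exit.o (12 bytes). */
static const char sc_exit[] =
    "\xbb\x00\x00\x00\x00"      /* mov ebx, 0 */
    "\xb8\x01\x00\x00\x00"      /* mov eax, 1 */
    "\xcd\x80";                 /* int 0x80   */

/* Copy `code` with strcpy(), as a string-handling application would,
 * and return how many payload bytes actually reached `dst`.
 * Hypothetical helper, for illustration only. */
size_t copy_as_string(char *dst, size_t dst_size, const char *code)
{
    assert(dst != NULL && code != NULL);
    assert(strlen(code) < dst_size);

    strcpy(dst, code);          /* stops at the first null byte */
    return strlen(dst);
}
```

Although the shellcode is 12 bytes long, copy_as_string() reports a single byte: the copy stops right after the 0xbb opcode.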
First, the first instruction (mov ebx, 0) can be replaced by the equivalent, and more common (for performance reasons), instruction:
        xor ebx, ebx
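The second instruction, instead, contained all those zeroes because we were using a 32 bit register (EAX): the immediate value 1 is encoded as the 32 bit value 0x00000001, which is stored as the bytes 01 00 00 00 (bytes are in reverse order because Intel® processors are little endian). Therefore, we can solve this problem simply by using an 8 bit register (AL) instead of a 32 bit register. Keep in mind that this only writes the lowest byte of EAX, so the system call number is correct only if the upper 24 bits of EAX are already zero; a more robust shellcode would clear the register first (for instance with xor eax, eax). The instruction becomes: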
        mov al, 1
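The difference between the two encodings can be reproduced by hand. The sketch below builds the bytes of both instructions, using the opcodes visible in the objdump listings (0xb8 for the EAX form, 0xb0 for the AL form); the function names are hypothetical and the code does not depend on the byte order of the machine it runs on.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Encode "mov eax, imm32": opcode 0xb8 followed by the 32 bit immediate
 * in little endian order. Writes 5 bytes and returns that count. */
size_t encode_mov_eax_imm32(uint32_t imm, uint8_t out[5])
{
    int i;

    assert(out != NULL);
    out[0] = 0xb8;
    for (i = 0; i < 4; i++)
        out[1 + i] = (uint8_t)(imm >> (8 * i));   /* lowest byte first */
    return 5;
}

/* Encode "mov al, imm8": opcode 0xb0 followed by a single byte.
 * Writes 2 bytes and returns that count. */
size_t encode_mov_al_imm8(uint8_t imm, uint8_t out[2])
{
    assert(out != NULL);
    out[0] = 0xb0;
    out[1] = imm;
    return 2;
}
```

With an immediate of 1, the first encoder yields b8 01 00 00 00 (three null bytes of padding) and the second yields b0 01.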
Now our assembly code looks like:
        xor ebx, ebx
        mov al, 1
        int 0x80
and the shellcode becomes:
$ nasm -f elf exit2.asm
$ objdump -d exit2.o

exit2.o:     file format elf32-i386

Disassembly of section .text:

00000000 <.text>:
   0:   31 db                   xor    %ebx,%ebx
   2:   b0 01                   mov    $0x1,%al
   4:   cd 80                   int    $0x80
$
which, as you can see, doesn't contain any null bytes!
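Checking by eye works for six bytes, but it is easy to automate. A minimal sketch follows; the function and array names are hypothetical, and the byte values are copied from the two objdump listings above.

```c
#include <assert.h>
#include <stddef.h>

/* Opcodes of the original and of the rewritten exit shellcode. */
static const unsigned char sc_exit_v1[] = {
    0xbb, 0x00, 0x00, 0x00, 0x00,       /* mov ebx, 0   */
    0xb8, 0x01, 0x00, 0x00, 0x00,       /* mov eax, 1   */
    0xcd, 0x80                          /* int 0x80     */
};
static const unsigned char sc_exit_v2[] = {
    0x31, 0xdb,                         /* xor ebx, ebx */
    0xb0, 0x01,                         /* mov al, 1    */
    0xcd, 0x80                          /* int 0x80     */
};

/* Count the null bytes in a buffer of explicit length. */
size_t count_null_bytes(const unsigned char *code, size_t len)
{
    size_t i, n = 0;

    assert(code != NULL);
    for (i = 0; i < len; i++)
        if (code[i] == 0x00)
            n++;
    return n;
}
```

The length is passed explicitly because strlen() would stop at the first null byte. For the first version the count is 7, for the second it is 0.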
Now let's take a look at the other technique to extract the opcodes: writing the program in C and disassembling it. Let's consider, for instance, the binary built from the previous exit.c listing and open it with gdb:
$ gdb ./exit
GNU gdb 6.1-debian
Copyright 2004 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB. Type "show warranty" for details.
This GDB was configured as "i386-linux"...Using host libthread_db library "/lib/libthread_db.so.1".

(gdb) break main
Breakpoint 1 at 0x804836a
(gdb) run
Starting program: /ramdisk/var/tmp/exit

Breakpoint 1, 0x0804836a in main ()
(gdb) disas _exit
Dump of assembler code for function _exit:
0x400ced9c <_exit+0>:   mov    0x4(%esp),%ebx
0x400ceda0 <_exit+4>:   mov    $0xfc,%eax
0x400ceda5 <_exit+9>:   int    $0x80
0x400ceda7 <_exit+11>:  mov    $0x1,%eax
0x400cedac <_exit+16>:  int    $0x80
0x400cedae <_exit+18>:  hlt
0x400cedaf <_exit+19>:  nop
End of assembler dump.
(gdb)
As you can see, the _exit(2) function actually executes two syscalls: first number 0xfc (252), exit_group(2), and then number 1, _exit(2). The exit_group(2) syscall is similar to _exit(2), but it terminates all the threads in the current thread group. Anyway, only the second syscall is required by our shellcode. So let's extract the opcodes with gdb:
(gdb) x/4bx _exit
0x400ced9c <_exit>:     0x8b    0x5c    0x24    0x04
(gdb) x/7bx _exit+11
0x400ceda7 <_exit+11>:  0xb8    0x01    0x00    0x00    0x00    0xcd    0x80
(gdb)
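Bytes dumped this way still have to be turned into a C string before they can be pasted into a test program. A possible helper for that step is sketched below; the function name and its interface are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Write `len` bytes as C hex escapes (\xb8\x01...) into `out`.
 * Returns the number of characters written, excluding the terminator,
 * or 0 if `out` is too small. Hypothetical helper, for illustration. */
size_t bytes_to_c_escapes(const unsigned char *bytes, size_t len,
                          char *out, size_t out_size)
{
    size_t i;

    assert(bytes != NULL && out != NULL);
    if (out_size < len * 4 + 1)
        return 0;

    out[0] = '\0';
    for (i = 0; i < len; i++)           /* 4 characters per byte */
        snprintf(out + i * 4, 5, "\\x%02x", (unsigned)bytes[i]);
    return len * 4;
}
```

Fed with the seven bytes read at _exit+11, it produces the text \xb8\x01\x00\x00\x00\xcd\x80, ready to be placed between double quotes.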
Once again, to make the shellcode work in real-world applications, we will need to remove all those null bytes!