Wednesday, July 14, 2010

Writing a Debugger

You must have debugged your code using gdb before or traced some process with strace, if so, you must wonder how it works ? well it's not magic :) and it's not really that difficult either !

Today I'm going to walk you through writing a debugger, it's a small, yet completely functional, debugger, called mdb, using a mixture of ptrace, libbfd and libopcodes so enjoy... :)

Overview
The task of writing a debugger can be broken down into a set of smaller tasks:
  • Tracing a process.
  • Setting break point.
  • Disassembling instructions.
The main function will open a binary, read the symbol table, run it in a traceable process and then wait for commands from the user to set break points, single step through the code or continue execution normally. We already have a lot to do so let's get started...

Tracing a process:
Tracing a process is examining and controlling the execution of that process, In order for a process to be traced, it must either explicitly express its wish of being traced using ptrace(PTRACE_TRACEME,...) or another process may attach and trace it using ptrace(PTRACE_ATTACH,...).

However, since we're writing a debugger, attaching to a running process is not really an option, since we would like to debug it from the beginning, and we may not always have access to the source code of the process to explicitly start tracing, so what should we do ?

We can call fork(), creating a new process, and then call ptrace(PTRACE_TRACEME,...) on the newly created process, doing so makes it traceable, if we then exec() the binary we wish to trace we end up having our code executing in a traceable process:
pid_t pid;
switch (pid = fork()) {
    case -1: /*error*/
        perror("fork()");
        exit(-1);
    case 0:/*child process*/
        ptrace(PTRACE_TRACEME, NULL, NULL); /*allow process to be traced*/
        execl(path, name, NULL);            /*child will be stopped here*/
        perror("execl()");
        exit(-1);
}
/*parent continues execution here*/ 
After calling ptrace((PTRACE_TRACEME...) any signal, except for SIGKILL, sent to the traced process will cause it to stop executing, Calling exec from the traced process causes a SIGTRAP being sent to it, also causing it to stop.

The parent process can then be notified of the status of the child process using wait() and then resume the execution of the child process using PTRACE_SINGLESTEP or PTRACE_CONT.

Setting break points
Right, so we now have control over the execution of the process, not a very fine one though, a finer control of execution can be achived using breakpoints.

We can use the address of a symbol we wish to break at for setting a breakpoint, however, for obvious reasons, we would like to set break points using the symbol and not it's memory address. In order to do so, we must first read the symbol table of the binary, hopefully the binary in question is not stripped, and save it in hash table, so that when the user wishes to break on a symbol we look up that symbol in the hash and find its address. 

libbfd, part of the binutils, is used to read the symbol table from the binary it also happens to provide hash tables, but, uthash is more flexible: 
/* load symbol table*/
long size;
long nsym;
asymbol **asymtab;

bfd_init();
bfd *abfd = bfd_openr(path, NULL);

bfd_check_format(abfd, bfd_object);
size = bfd_get_symtab_upper_bound(abfd);
asymtab = malloc(size);
nsym = bfd_canonicalize_symtab(abfd, asymtab); /*reads symtab*/

/*create symbol table hash*/
long i;
SymbolTable *symtab=NULL, *symbol;
for (i = 0; i < nsym; i++) {
    symbol = malloc(sizeof(Symbol));
    symbol->sym  = bfd_asymbol_name(asymtab[i]);
    symbol->addr = bfd_asymbol_value(asymtab[i]);
    HASH_ADD_KEYPTR(hh, symtab, symbol->sym, strlen(symbol->sym), symbol);
}
Once we have the address we need a way of making the process stop executing at that address, that is, we need the process to generate a software interrupt or a trap, namly INT 3, which is defined specifically for use by debuggers.

Injecting the opcode of INT 3, which is either 0xCC or 0xCD03 when using 0xCD<imm8>, at the address of the breakpoint should be enough to get the job done:
case BREAK: {/*set break point*/
    /*look up the symbol in the symbol table*/
    HASH_FIND_STR(symtab, arg, symbol);
    if (symbol) {
        /*insert new break point*/
        brp = malloc(sizeof(Symbol));
        brp->sym  = symbol->sym;
        brp->addr = symbol->addr;        
        /*save instruction at eip*/
        brp->opc = ptrace(PTRACE_PEEKTEXT, pid, symbol->addr, NULL);        
        /*insert break point*/
        ptrace(PTRACE_POKETEXT, pid, symbol->addr, 0xcc);
        /*add break point to hash*/
        HASH_ADD_INT(brptab, addr, brp);
        printf("break %lx<%s>\n", brp->addr, brp->sym);
    } else {
        printf("symbol not found <%s>\n", arg);
    }                  
    free(arg);
    break;
} 
However, one minor "detail" is left, injecting opcode at an aribitrary address will most likely miss up the instructions that follow the breakpoint address, causing the processor to raise an exception and our poor process eventually being killed!

The solution is to first backup the instruction at the breakpoint address and then restore it when we reach this break point after the process being stopped, and before executing the next instruction we also need to decrement eip so that it points at the beginning of the restored word:
if (WIFSTOPPED(status)) {
    /*read eip*/
    long rip =  ptrace(PTRACE_PEEKUSER, pid, 8 * RIP, NULL)-1;
    
    /*look up eip in the breakpoint hashtable*/
    HASH_FIND_INT(brptab, &rip, brp);
    if (brp) {
        HASH_DEL(brptab, brp);
        /*restore instruction(s)*/
        ptrace(PTRACE_POKETEXT, pid, brp->addr, brp->opc);
        /*decrement eip*/ 
        ptrace(PTRACE_POKEUSER, pid, 8 * RIP, rip);
        printf("process %d stopped at 0x%lx<%s>\n", pid, rip, brp->sym);
    } else {
        printf("process %d stopped at 0x%lx\n", pid, rip);
    }
}

Disassembling instructions
We would like our debugger to print something readable instead of machine code, to do so we need something that can disassemble machine code from memory, libopcodes, also part of binutils, allows you to disassemble and print machine code of multiple architectures:
/*disassembly routine*/ 
void disassemble(void *buf, unsigned long size)
{
    struct disassemble_info info;
    init_disassemble_info (&info, stdout, fprintf);
    info.mach = bfd_mach_x86_64;
    info.endian = BFD_ENDIAN_LITTLE;
    info.buffer = buf;
    info.buffer_length = size;
    print_insn_i386(0, &info);
    printf("\n");
}
Cool now that we got that out of the way, one small problem left, since this is a CISC architecture then instructions, most likely, are of variable length, they could be 1 byte or 16 bytes long!

Two solutions jump to my mind, we can either let libopcodes decode whole words in this case it will output one or more instruction(s) and some garbage or we can use the program counter to find out the exact length of the instruction.

The program counter, or eip, is a register that keeps count of how many bytes executed so far, so at any given time, eip points to the next instruction to be executed, and when that instruction is executed it gets incremented with the size of the instruction in bytes, if we save the current eip until the next instruction and then subtract it form the new eip we get the size of the previous instruction in bytes, pretty neat, no ?

Not so fast, this might work for normal instructions, branching however, is quite a different story, branch instructions like jmp, will either increment the eip too much, if jumping forward, or decrement it, if jumping backward, if so, we will ignore the eip difference if it's to big or if it's a negative number, okay now we're ready:
case NEXT: {/*next instruction*/
    /*read instruction pointer*/
    long rip =  ptrace(PTRACE_PEEKUSER, pid, 8 * RIP, NULL);
    
    if (old_rip) {
        /*calculate instruction size*/
        long oplen = rip - old_rip;
        if (oplen > 0 && oplen < 16) {
            disassemble(&opcode, oplen);
        }
    }

    /*read two words at eip*/
    opcode[0] = ptrace(PTRACE_PEEKDATA, pid, rip, NULL);
    opcode[1] = ptrace(PTRACE_PEEKDATA, pid, rip+sizeof(long), NULL);
    old_rip = rip;

    /*next instruction*/
    ptrace(PTRACE_SINGLESTEP, pid, NULL, NULL);
    wait(&status);
    break;
}
Test run
Time for fun :) we will debug a small program with mdb, this is the source code for the program:
#include <stdio.h>
#include <string.h>
int say_hello(int x, int y, int z)
{
    printf("Hello\n");
    return 0;
}

int main(int argc, char **argv)
{
    int x = 0x55;
    int y = 0x56;
    int z = 0x57;
    say_hello(0x255, 0x256, 0x257);
    return 0;
}
And this is the complete session:
./mdb test
process 15384 stopped at 0x7f519defcaef
(mdb) break main
breakpoint at 400546<main>
process 15384 stopped at 0x7f519defcaef
(mdb) continue
process 15384 stopped at 0x400546<main>
(mdb) next
process 15384 stopped at 0x400546
(mdb) next
push   %rbp
process 15384 stopped at 0x400549
(mdb) next
mov    %rsp,%rbp
process 15384 stopped at 0x40054d
(mdb) next
sub    $0x20,%rsp
process 15384 stopped at 0x400550
(mdb) next
mov    %edi,-0x14(%rbp)
process 15384 stopped at 0x400554
(mdb) next
mov    %rsi,-0x20(%rbp)
process 15384 stopped at 0x40055b
(mdb) next
movl   $0x55,-0x4(%rbp)
process 15384 stopped at 0x400562
(mdb) next
movl   $0x56,-0x8(%rbp)
process 15384 stopped at 0x400569
(mdb) next
movl   $0x57,-0xc(%rbp)
process 15384 stopped at 0x40056e
(mdb) next
mov    $0x257,%edx
process 15384 stopped at 0x400573
(mdb) next
mov    $0x256,%esi
process 15384 stopped at 0x400578
(mdb) next
mov    $0x255,%edi
process 15384 stopped at 0x400523
(mdb) break say_hello
breakpoint at 400524<say_hello>
process 15384 stopped at 0x400523
(mdb) continue
process 15384 stopped at 0x400524<say_hello>
(mdb) registers
rax 0x7f519def8ec8
rbx 0x0
rcx 0x0
rdx 0x257
rsi 0x256
rdi 0x255
rbp 0x7fff2395c140
rsp 0x7fff2395c118
rip 0x400524
process 15384 stopped at 0x400523
(mdb) kill
process 15384 terminated
The source code of the debugger is around 250 lines it's hosted on mercurial along with uthash, test.c and a Makefile. 
hg clone https://iabdelkader@bitbucket.org/iabdelkader/mdb
A port for x86-32 bit is also available, written by hnd, you may contact him at drona89@live.com for any questions or comments:
http://code.google.com/p/xbugger/
Read more ...

Monday, July 5, 2010

RN-41 Bluetooth Module

Recently I've been playing with the RN-41 bluetooth module from sparkfun, this module is really easy to use, it requires just a couple of connections to work and implements the SPP (bluetooth serial profile) with TTL levels so it's perfect for a pic project :) This is a little how-to on configuring and interfacing with the RN-41 module.
The circuit
This is a typical pic microcontroller circuit with a 3.3v voltage regulated power supply, the pic is running at 4Mhz, connected to the pic's serial port is the RN-41 module, rx->tx and tx->rx. I also wired a 16x2 LCD to print text sent to the RN-41.
Programming the pic
Alright we have the hardware ready now we need to program the pic to read from the RN-41 and print to the LCD, I used the CCS compiler, here's the code

#include <16f876.h>
#use delay(clock=4000000)     /*4Mhz*/
#use RS232(BAUD=9600, XMIT=PIN_C6, RCV=PIN_C7, STREAM=COM_A) 
#fuses XT, NOWDT, NOPROTECT, NOLVP 
#include "flcd.c"            /*flex lcd driver/

char buf[16];
int8 have_string = 0;

/*serial port interrupt service routine*/
#int_rda
void sp_isr()
{ 
    fgets(buf, COM_A);  
    have_string = 1;
}

void main(void)
{
    enable_interrupts(global);    /*enable interrupts*/
    enable_interrupts(int_rda);   /*enable serial port interrupts*/
   
    lcd_init();
    delay_ms(100); /*must delay after initiating the lcd*/
   
    while (1) {             
       if (have_string) {
          have_string = 0;         
          printf(lcd_putc, "\f%s", buf);
       }
    }
}
The code is pretty easy, but anyway, we first enable global and serial port interrupts so we get an interrupt whenever we receive data, you can poll the serial port instead but you should use interrupts because they free up the processor to do other stuff.

Inside the isr, we read a string from the serial port and set the have_string flag which is continuously checked in the main loop, when we have a string we print it to the lcd. After programming the pic, last step is configuring the RN-41.

Configuring the Module
The RN-41 module can be configured either locally, over the serial port, or remotly, over the air, I think the later approach is easier, so that's what I will be doing, I will use python to connect to the RN-41 since it's a lot more faster than doing so in C.

First we need a bdaddr and a channel to connect to the RN-41 module, so go ahead connect the power to the circuit, open a terminal and type:
$hcitool scan
Scanning ...
    00:06:66:04:11:94    FireFly-1194

$sdptool records 00:06:66:04:11:94
Service Name: SPP
Service RecHandle: 0x10000
Service Class ID List:
  "Serial Port" (0x1101)
Protocol Descriptor List:
  "L2CAP" (0x0100)
  "RFCOMM" (0x0003)
    Channel: 1
Language Base Attr List:
  code_ISO639: 0x656e
  encoding:    0x6a
  base_offset: 0x100
...
And we have the bdaddr and channel, time to configure the RN-41, the RN-41 has two modes of operation, a command mode and a data mode to configure the RN-41 we need to enter the command mode and we have to do so within then config time window, 60 seconds by default, open the python interpreter and:
import bluetooth
sock = bluetooth.BluetoothSocket(bluetooth.RFCOMM)
sock.connect(('00:06:66:04:11:94', 1))  # bdaddr, spp channel
sock.send('$$$')                        # config mode, within 60 of connecting
sock.send('ST,255\r\n')                 # enables continuous configuration
sock.send('SU,9600')                    # set baudrate to 9600bps 
sock.send('---\r\n')                    # switch back to data mode 
sock.send('Hello World !\r\n')          # send some text to test
And we're done have fun :)

Notes:
  • The 10k pot controls the LCD contrast.
  • Continous configuration allows you to enter the config mode at any time, if you don't use it don't forget to enter the config mode within the config time window.
  • The flcd driver can be found somewhere at  the CCS forums
  • If you're going to use the same pins for the LCD this is my configuration:
#define LCD_DB7   PIN_B7
#define LCD_DB6   PIN_B6
#define LCD_DB5   PIN_B5
#define LCD_DB4   PIN_B4
#define LCD_E     PIN_B1
#define LCD_RS    PIN_B0
Read more ...