Lab Exercise 2: Adding a Syscall

Description

This lab explores the implementation and use of system calls on Intel and ARM architectures.

Objective

To understand system calls, the ultimate interface between all user processes and the kernel.

Due Dates

You have one week to do this lab. Group grading will take place in the skiff lab on Monday, October 2 and Tuesday, October 3 between 10:00 AM and 3:00 PM. This schedule will be filled in as people sign up.

Procedure

Introduction

Modern multitasking computers run multiple processes simultaneously. To do so reliably, there must be protection mechanisms that prevent buggy or malicious processes from interfering with the operating system or with other processes. Support for this is implemented in hardware (hence the "protected mode" of the 386 CPU). In Linux, the two sides of this scheme are called "kernel mode" and "user mode".

Under this scheme, a user mode process is never allowed to access memory that doesn't belong to it, and this is enforced in hardware. This raises a question: if a process isn't even allowed to access the addresses of kernel functions, then how can it ever call kernel code at all? The answer is that system calls are not made directly. Instead, a user process pushes some parameters onto its stack¹ and triggers a particular software interrupt. The CPU then invokes an interrupt handler in the kernel (the CPU has a table of interrupts and the addresses of their handlers). The handler examines the parameters left by the user process and takes appropriate action. Since the handler was called by the CPU from its interrupt table, it runs in kernel mode, and the user/kernel switch has been made without giving the user process access to any memory other than its own.

The linux syscall mechanism

Having to code whatever assembly is necessary to trigger a software interrupt is less than convenient, so the standard C library (on current Linux systems, this is glibc2, a.k.a. libc6) provides wrapper functions to work the necessary magic. You should go read the Linux intro(2) manpage now, which gives a good introduction to this. In order to implement and use our own syscall, however, we'll have to look a little deeper to see how this is really done.

The kernel header file <linux/unistd.h> provides a set of _syscall[0-4] macros that automatically generate functions that do what is necessary to properly trigger the software interrupt. The digit at the end refers to the number of arguments taken by that particular syscall. For example, this program calls the kernel syscall getuid() directly:

    #include <linux/unistd.h>
    #include <stdio.h>

    /* this macro expands to the definition of a function named "getuid"
     * that takes no arguments and returns int: int getuid(void) */
    _syscall0(int,getuid)

    int main(void)
    {
        printf( "my uid is %d\n", getuid() );
        return 0;
    }

On Intel, the _syscall0 macro expands to something like:

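    int getuid (void)
    {
        long __res;
        /* load __NR_getuid into eax, trap into the kernel, and read the
         * result back from eax */
        __asm__ volatile ( "int $0x80"
                : "=a" (__res)
                : "0" (__NR_getuid) );
        do {
            /* the kernel reports errors as -1..-125; the comparison must be
             * unsigned, or every successful (non-negative) result would
             * also be treated as an error */
            if( (unsigned long)(__res) >= (unsigned long)(-125) )
            {
                errno = -__res;
                __res  = -1;
            }
            return  (int)(__res);
        } while (0);
    }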

This is some gcc voodoo that triggers interrupt 0x80, the software interrupt that Linux uses for system calls on the Intel architecture. The parameter passed is __NR_getuid. Inside the kernel, the interrupt handler ends up (again, for Intel) at the routine system_call in the file arch/i386/kernel/entry.S. This routine looks at the passed __NR_getuid and uses it to index into a table of function pointers, where the actual kernel address of the sys_getuid() routine is found.

Adding a custom syscall

In this lab, we'll add system calls to our skiff kernel that allow you to control the state of the row of eight LEDs on the front of your skiff board. We'll actually add two: one to get their status, and one to set it. But before we make syscalls that do this, we're going to learn to do it without modifying the kernel at all. One philosophy of Linux (and UNIX in general) is for the kernel to provide only the most basic mechanisms necessary to get things done, and to leave it up to userspace programs to actually do them.

In this case, that mechanism already exists. The hardware control interface for the skiff's LEDs is memory I/O. When the skiff's CPU reads and writes certain memory addresses, the I/O is mapped to a non-memory device. Many drivers work this way--video cards work by memory I/O, which is why X servers are userspace processes and not kernel drivers. There is very little X support in the kernel itself. Another example is the skiff's flash memory--there's no "flash driver". Rather, I/O sent to the "flash segment" of the memory bus ends up programming flash rather than poking memory.

Part 1 - LEDs from userspace

In Linux, one way for a process to gain access to physical memory addresses is to mmap(2) the file /dev/mem. This allows processes to reach physical addresses that the kernel's virtual memory system would otherwise keep hidden from them.

Your assignment for part one is to write a userspace program that uses mmap(2) on /dev/mem to provide access to the LED I/O registers. You'll find the manpages for mmap(2), mem(4), and open(2) helpful. I've provided a header file gpio.h to give you the necessary magic numbers, and a skeleton C file led.c to get you started. Your program should print the value of the GPIO control registers in hexadecimal when invoked without any arguments. If given a single argument, your program should write that value to the proper register, allowing the user to control the LEDs.

Part 2 - A syscall for LED control

As you (hopefully) demonstrated above, this part of the lab is completely unnecessary as far as the kernel is concerned. We're only doing it as an exercise to learn about system calls.

In order to add our own syscalls, we'll need to create our own kernel functions and put their addresses in the syscall table. This could be done directly (by modifying the table in entry.S), but that would require recompiling the kernel. Instead, we'll implement the syscalls in a kernel module and have the module modify the syscall table when it's loaded and unloaded².

To do this, we'll select an unused entry in the table and replace it. This technique only works because we know the exact kernel we're working with. A real module might scan for an unused entry and use it, but this dynamic selection would require that applications using that call be recompiled to use the new index into the table. The truth is that the syscall interface is not designed to be modified at all. So for this lab, we'll just choose an arbitrary slot and replace whatever's there. We know that entry will be empty because we're working with our own kernel, but there are no such guarantees for other kernels.

First, you'll need to select some table entries to replace. For the ARM architecture, the syscall table is in arch/arm/kernel/calls.S. Notice the section at the end where a .rept assembler directive fills the end of the table with many sys_ni_syscall entries. This is an error routine that is called to indicate that an invalid syscall number was used. The last 40 or so of the 256 entries are filled this way, so we can choose a couple and hijack them for our own use.
Create a header file that declares the constants you choose. You'll use it in both the module and in an application program that uses our system calls.
Copy the skeleton module from Lab 1 and rename it to something more descriptive than "hello". In the C file for your module, include the header that declares your syscall constants. You'll need to declare extern void *sys_call_table[]; to get access to the system call table from within your module.
You need two system calls, one to get the 16 GPIO_CTL bits and one to set them. They should return (or take) a 16-bit value. Your system calls will function very similarly to the program you wrote in part one, except that you will use a different base address for accessing the I/O ports.
Now modify the init_module() and cleanup_module() functions to add and remove your syscalls from the table when the module is loaded and unloaded. You should also set the GPIO_LED_ENABLE bit when the module is loaded and reset it when the module is removed.
Write a userspace program that does exactly the same thing as the program from part one, except that it uses your new syscalls to do it. Make sure that your kernel behaves properly when your new program is used, whether or not the system call module is loaded.
UPDATE: One point I left out is an undocumented quirk of the _syscall macros. Suppose I've defined a syscall, void sys_myfunc(void), at entry 220 in the syscall table, and I've #defined the GETLED_IDX constant to be 220. I want to generate a function that calls it from a userspace program, so I use _syscall0(void,myfunc). This won't work as is, because the _syscall0 macro expects a #defined constant whose name is __NR_ followed by the function name, in this case __NR_myfunc. You can see where these __NR_ defines are made for existing syscalls in include/asm/unistd.h.

So, to use the _syscall macro in a userspace program for your custom syscall, you'll need something similar to
```
    #define __NR_getled (__NR_SYSCALL_BASE + GETLED_IDX)
```
in your userspace code. Sorry for any confusion that may have caused.

Questions

You don't need to write these questions up, but everyone in your group should be prepared to answer them when your project is evaluated. The answers to some of these haven't been fully covered in class yet, but you should have some idea from your 3210 prerequisites.

The skiff GPIO registers are mapped to physical memory addresses, as is the board's flash memory. The board's physical address space also addresses main memory, of course. What might happen if you accidentally (or purposefully) write bits into parts of the address space that aren't GPIO registers?
The system call registered by your module is added to the system call table when it's loaded. What would happen if you left the pointer in the table after removing the module?
What's the best LED animation program you can write? (optional, but sort of fun :)
Why do you have to use a different GPIO base address in your system call than with /dev/mem?

¹ or into some register. The exact mechanism is architecture-dependent.

² Note that this is only an exercise. Changing the kernel interface is almost always a bad idea because any application written to rely on the new interface will no longer be portable to other systems.