Lab Exercise 2: Adding a Syscall
This lab explores the implementation and use of system calls on Intel and ARM architectures.
The goal is to understand system calls, the ultimate interface between all user processes and the kernel.
You have one week to do this lab. Group grading will take place in the skiff lab on Monday, October 2 and Tuesday, October 3 between 10:00 AM and 3:00 PM. This schedule will be filled in as people .
Introduction
Modern multitasking computers run multiple processes simultaneously. In order to do so reliably, there must be protection mechanisms in place to prevent buggy or malicious processes from interfering with the operating system or other processes. Support for doing this is implemented in hardware (hence the "protected mode" of the 386 CPU). In linux, this scheme is referred to by the names "kernel mode" and "user mode".
Under this scheme, a user mode process is never allowed to access memory that doesn't belong to it, and this is enforced in hardware. This raises a question: if a process isn't even allowed to access the addresses of kernel functions, then how is it to ever call kernel code at all? The answer is that system calls are not made directly. Instead, a user process will push some parameters onto its stack[1] and trigger a particular software interrupt. This invokes an interrupt handler in the kernel (the CPU has a table of interrupts and the addresses of their handlers). The handler examines the parameters left by the user process and takes appropriate action. Since the handler was called by the CPU from its interrupt table, it runs in kernel mode, and the user/kernel switch has been made without giving the user process access to any memory other than its own.
The linux syscall mechanism
Coding whatever assembly is necessary to trigger a software interrupt is less than convenient, so the standard C library (on current linux systems, this is glibc2, a.k.a. libc6) provides wrapper functions to work the necessary magic. You should go read the linux intro(2) manpage now, which gives a good introduction to this. In order to implement and use our own syscall, however, we'll have to look a little deeper to see how this is really done.
The kernel header file <linux/unistd.h> provides a set of _syscall[0-4] macros that automatically generate functions that do what is necessary to properly trigger the software interrupt. The digit at the end refers to the number of arguments taken by that particular syscall. For example, this program calls the kernel syscall getuid() directly:
#include <linux/unistd.h>
#include <errno.h>   /* the generated function sets errno on failure */
#include <stdio.h>

/* This macro expands to the definition of a function named "getuid"
 * that takes no arguments and returns int. */
_syscall0(int,getuid)

int main(void)
{
    printf( "my uid is %d\n", getuid() );
    return 0;
}
On Intel, the _syscall0 macro expands to something like:
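int getuid (void)
{
    long __res;
    __asm__ volatile ( "int $0x80"
        : "=a" (__res)
        : "0" (__NR_getuid) );
    do {
        /* The kernel returns -errno on failure. Compare as unsigned so
         * that only values in the range -125..-1 count as errors. */
        if( (unsigned long)(__res) >= (unsigned long)(-125) )
        {
            errno = -__res;
            __res = -1;
        }
        return __res;
    } while (0);
}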
This is some gcc voodoo that triggers interrupt 0x80, the software interrupt that linux uses for system calls on the Intel architecture. The parameter passed (in the eax register) is __NR_getuid. Inside the kernel, the interrupt handler ends up (again, for Intel) at the routine system_call in the file arch/i386/kernel/entry.S. This routine looks at the passed __NR_getuid and uses it to index into a table of function pointers, where the actual kernel address of the sys_getuid() routine is found.
Adding a custom syscall
In this lab, we'll add system calls to our skiff kernel that allow you to control the state of the row of eight LEDs on the front of your skiff board. We'll actually add two--one to get their status, and one to set it. But before we make syscalls that do this, we're going to learn to do it without modifying the kernel at all. One philosophy of Linux (and UNIX in general) is for the kernel to provide only the most basic mechanisms necessary to get things done, and to leave it up to userspace programs to actually do them.
In this case, that mechanism already exists. The hardware control interface for the skiff's LEDs is memory I/O. When the skiff's CPU reads and writes certain memory addresses, the I/O is mapped to a non-memory device. Many drivers work this way--video cards work by memory I/O, which is why X servers are userspace processes and not kernel drivers. There is very little X support in the kernel itself. Another example is the skiff's flash memory--there's no "flash driver". Rather, I/O sent to the "flash segment" of the memory bus ends up programming flash rather than poking memory.
Part 1 - LEDs from userspace
In linux, one way for a process to gain access to physical memory addresses is to mmap(2) the file /dev/mem. This allows the process to circumvent the kernel's virtual memory system.
Your assignment for part one is to write a userspace program that uses mmap(2) on /dev/mem to provide access to the LED I/O registers. You'll find the manpages for mmap(2), mem(4), and open(2) helpful. I've provided a header file gpio.h to give you the necessary magic numbers, and a skeleton C file led.c to get you started. Your program should print the value of the GPIO control registers in hexadecimal when invoked without any arguments. If given a single argument, your program should write that value to the proper register, allowing the user to control the LEDs.
Part 2 - A syscall for LED control
As you (hopefully) demonstrated above, this part of the lab is completely unnecessary as far as the kernel is concerned. We're only doing it as an exercise to learn about system calls.
In order to add our own syscalls, we'll need to create our own kernel functions and put their addresses in the syscall table. This could be done directly (by modifying the table in entry.S), but that would require recompiling the kernel. Instead, we'll implement the syscalls in a kernel module and have the module modify the syscall table when it's loaded and unloaded[2].
To do this, we'll select an unused entry in the table and replace it. This technique only works because we know the exact kernel we're working with. A real module might scan for an unused entry and use it, but this dynamic selection would require that applications using that call be recompiled to use the new index into the table. The truth is that the syscall interface is not designed to be modified at all. So for this lab, we'll just choose an arbitrary slot and replace whatever's there. We know that entry will be empty because we're working with our own kernel, but there are no such guarantees for other kernels.
UPDATE: One point I left out is an undocumented quirk of the _syscall macros. Suppose I've defined a syscall, void sys_myfunc(void), at entry 220 in the syscall table, and I've #defined the GETLED_IDX constant to be 220. I want to generate a function that calls it from a userspace program, so I use _syscall0(void,myfunc). This won't work as is, because the _syscall0 macro expects there to exist a #defined constant whose name is __NR_ followed by the function name, in this case __NR_myfunc. You can see where these __NR_ defines are made for existing syscalls in include/asm/unistd.h.
So, to use the _syscall macro in a userspace program for your custom syscall, you'll need something similar to
#define __NR_getled (__NR_SYSCALL_BASE + GETLED_IDX)
in your userspace code. Sorry for any confusion that may have caused.
You don't need to write these questions up, but everyone in your group should be prepared to answer them when your project is evaluated. The answers to some of these haven't been fully covered in class yet, but you should have some idea from your 3210 prerequisites.
[1] Or into some register. The exact mechanism is architecture-dependent.