Friday, May 13, 2005

Threading processes or Processing threads?!

The Linux kernel does not really support threads, even though some might say that it does. There are also user-space thread libraries, but they all run into the same basic limitation: within the kernel, threads are treated as processes. This can be seen even with the most widely used pthread library. Suppose you create a process that itself creates 10 threads...

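#include <pthread.h>
#include <unistd.h>

extern void * threadrunner( void *parms );

int main( void )
{
  pthread_t tid[10];

  // start 10 threads, each running threadrunner
  for( int i = 0; i < 10; ++i )
    pthread_create( &tid[i], NULL, threadrunner, NULL );

  // wait for every thread to finish
  for( int i = 0; i < 10; ++i )
    pthread_join( tid[i], NULL );

  return 0;
}

// threadrunner defn
void * threadrunner( void *parms )
{
  (void) parms;   // unused
  sleep(20);
  return NULL;    // a thread function must return a void *
}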

So now each thread sleeps for 20 seconds before it can be joined by the main 'thread'. If you run this in the background and then run ps -u, there will be 11 separate processes with the same name but different PIDs. What has happened is that 10 separate processes were created, and the library provides the necessary abstraction so that they behave as threads. Internally it does this with the help of the 'sys_clone' system call and the process control calls. So pthread_join does something very similar to a waitpid call.
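To make the pthread_join / waitpid comparison a bit more concrete, here is a rough sketch that does the same thing twice: once with a child process collected by waitpid, and once with a thread collected by pthread_join. The helper names wait_for_process, wait_for_thread and return_code are made up for this illustration, and this only shows the analogy from user space, not how any thread library is implemented internally.

```c
#include <pthread.h>
#include <stdint.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: start a child process that exits with `code`
   and return the code collected by waitpid (or -1 on error). */
int wait_for_process( int code )
{
  pid_t pid = fork();
  if( pid == -1 )
    return -1;
  if( pid == 0 )
    _exit( code );              /* child: terminate immediately */

  int status;
  if( waitpid( pid, &status, 0 ) == -1 )
    return -1;
  return WIFEXITED( status ) ? WEXITSTATUS( status ) : -1;
}

/* Thread body: hand the argument straight back as the return value. */
static void * return_code( void *arg )
{
  return arg;
}

/* Hypothetical helper: start a thread that returns `code`
   and return the value collected by pthread_join (or -1 on error). */
int wait_for_thread( int code )
{
  pthread_t tid;
  void *result;

  if( pthread_create( &tid, NULL, return_code, (void *)(intptr_t) code ) != 0 )
    return -1;
  if( pthread_join( tid, &result ) != 0 )
    return -1;
  return (int)(intptr_t) result;
}
```

Calling wait_for_process(7) and wait_for_thread(7) from a main function should both hand back 7: in each case the parent blocks until the other flow of control finishes and then picks up its result. Depending on the toolchain, this may need to be built with -pthread.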

Recently there has been a new POSIX thread library that comes with kernel changes modifying the main process structure within the kernel code.

struct task_struct;

It relies on a 'tgid' (thread group id) field in the struct, alongside the per-task 'pid' that acts as the thread id. The getpid call returns the tgid, so every thread of a process reports the same process id, while the gettid call returns the thread's own id. This implementation is a lot better than the previous one. I guess that it's supported only on the newer 2.6.x kernels.
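If you want to look at these ids from user space, something like the sketch below should do, assuming a kernel and C library with native thread support (for example NPTL on 2.6.x). The struct ids, record_ids and ids_in_new_thread names are invented for this example. Older C libraries have no gettid wrapper, so the sketch goes through syscall(SYS_gettid) instead.

```c
#include <pthread.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <unistd.h>

/* Ids as observed from inside a thread. */
struct ids {
  pid_t pid;   /* what getpid() reports: the process (thread group) id */
  pid_t tid;   /* what the gettid system call reports: this thread's id */
};

/* Thread body: record both ids into the struct passed as argument. */
static void * record_ids( void *arg )
{
  struct ids *out = arg;
  out->pid = getpid();
  out->tid = (pid_t) syscall( SYS_gettid );
  return NULL;
}

/* Hypothetical helper: start one thread, let it fill *out,
   and return 0 on success or -1 on error. */
int ids_in_new_thread( struct ids *out )
{
  pthread_t t;

  if( pthread_create( &t, NULL, record_ids, out ) != 0 )
    return -1;
  return pthread_join( t, NULL ) == 0 ? 0 : -1;
}
```

With native threads, the pid recorded inside the new thread should match getpid() in the main thread, while the tid should be a different number. On the older library described earlier, where every thread is a full process, getpid itself would differ from thread to thread.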

So, all this while, were we threading processes or processing threads?! :-)
