GIL – Global Interpreter Lock

Good man (or woman), do you have a lot to process? Are you using an interpreted language, like Python or Ruby? Did you decide to use parallelism to speed things up? Did you implement it using threads? Did you make it worse, slower? Are you having a hard time trying to figure out what you did wrong?

Chances are you did (mostly) everything right

That’s right! Maybe you did everything right with your threads and your code is still slower than the single-threaded version. If that’s the case, you might have stumbled across the GIL, which stands for Global Interpreter Lock. Fancy name, huh? Well, basically, the GIL is a mutual exclusion lock that ensures that only one thread runs in the interpreter at a time. In practice this means that fully parallel execution of threads is not possible. Python’s default interpreter has a GIL as part of its implementation. Let’s see this with an example:

LIMIT = 50000000

def cycle(n):
    # Count from n up to LIMIT; pure CPU-bound work.
    while n < LIMIT:
        n += 1

cycle(0)

A simple example, but enough to make the point. If we run it and time it:

time python single_thread.py

real 0m2.608s
user 0m2.440s
sys 0m0.004s

Let’s try and make it faster. Let’s divide the work between two threads, each of them doing half the work. We would expect a performance boost.

from threading import Thread

LIMIT = 50000000

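def cycle(n):
    # Count from n up to LIMIT; pure CPU-bound work.
    while n < LIMIT:
        n += 1

# Each thread counts the upper half of the range: half the work each.
t1 = Thread(target=cycle, args=(LIMIT // 2,))
t2 = Thread(target=cycle, args=(LIMIT // 2,))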
t1.start()
t2.start()
t1.join()
t2.join()

And again, if we run it:

time python threaded.py

real 0m5.256s
user 0m6.416s
sys 0m1.092s

As we can see, we didn’t improve it; we actually made it worse. That’s because of what was said earlier: the GIL prevents the interpreter from running multiple threads simultaneously. Instead, the threads take turns, and that switching is controlled by the GIL. The two threads also spend time contending for the lock, which shows up in the higher user and sys times. We can visualize it simply like this:

[Figure: thread_switching]
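
The same comparison can also be made from inside a single script instead of with the shell’s time command. What follows is only a sketch: time_single, time_threaded and the limit parameter are names made up here for illustration, and time.perf_counter assumes Python 3.3 or later. The absolute numbers will differ from the ones above depending on the machine and the interpreter version.

```python
import time
from threading import Thread


def cycle(n, limit):
    # Same busy loop as in the scripts above; the limit is passed in
    # (an assumption of this sketch) so smaller values can be tried.
    while n < limit:
        n += 1


def time_single(limit):
    # Run the whole count in the calling thread and return elapsed seconds.
    start = time.perf_counter()
    cycle(0, limit)
    return time.perf_counter() - start


def time_threaded(limit):
    # Two threads, each counting the upper half of the range.
    threads = [Thread(target=cycle, args=(limit // 2, limit)) for _ in range(2)]
    start = time.perf_counter()
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return time.perf_counter() - start


if __name__ == "__main__":
    # Smaller than the 50000000 used above, to keep the run short.
    limit = 5000000
    print("single thread: %.3fs" % time_single(limit))
    print("two threads:   %.3fs" % time_threaded(limit))
```

Measuring both variants in the same process keeps interpreter start-up time out of the comparison, so any difference comes from the loop and the thread handling.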

What advantages does this kind of implementation give? Well, some, actually:

  • easier implementation, easier memory management;
  • increased speed for single-threaded executions;
  • easier integration with libraries.

Of course it has some drawbacks, and can even behave strangely in multi-core environments (see Inside the Python GIL). It is easy to see that code that needs to do CPU-intensive processing will have a problem. So, how can we get around this? I’ll leave you with a few tips:

  • Use Python’s multiprocessing package. It offers concurrency by using subprocesses instead of threads;
  • Use another implementation. “Python” is used loosely to refer to the default Python implementation, which is written in C (CPython). There are other implementations (for example Jython, written in Java) that do not suffer from the GIL’s effects;
  • Do not use Python at all: you should use the right tool for the job, and Python might not be the one.
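
As an illustration of the first tip, here is a minimal sketch of the threaded example rewritten with multiprocessing.Process, which has the same start/join interface as Thread. The run_in_processes helper, the limit parameter and the return value of cycle are additions made for this sketch and are not part of the scripts above. No timings are claimed here; measure on your own machine.

```python
from multiprocessing import Process

LIMIT = 50000000


def cycle(n, limit=LIMIT):
    # Same busy loop as in the threaded example. The limit parameter and
    # the return value are additions for this sketch.
    while n < limit:
        n += 1
    return n


def run_in_processes(limit=LIMIT):
    # Two worker processes, each counting the upper half of the range,
    # mirroring the two threads used earlier. Each process runs its own
    # interpreter and therefore has its own GIL.
    workers = [Process(target=cycle, args=(limit // 2, limit)) for _ in range(2)]
    for w in workers:
        w.start()
    for w in workers:
        w.join()
    # An exit code of 0 means the worker finished normally.
    return [w.exitcode for w in workers]


if __name__ == "__main__":
    # The guard keeps child processes from re-running this block on
    # platforms that start workers by importing the main module.
    print(run_in_processes())
```

Keep in mind that processes do not share memory the way threads do, so results have to be passed back explicitly (for example through a multiprocessing.Queue) when the workers produce something the parent needs.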

P.S. I want to thank Vitor Torres, Ricardo Sousa and Nuno Silva for the reviews.
