In a world where we hear and talk a lot about making code run concurrently or in parallel, there’s sometimes a bit of confusion between the two. We often use one term when we mean the other, or even use them interchangeably. Let’s shed some light on the matter.
When we say that we have concurrency in our code, we mean that tasks run in overlapping periods of time. That doesn’t mean that they run at the exact same instant. When we have parallel tasks, they really do run at the same time.
In a multi-core world it might seem that concurrency doesn’t make sense, but as with everything, we should choose the right approach for the job at hand. Imagine, for example, a very simple web application where one thread handles requests and another handles database queries: they can run concurrently. Parallelism has become very useful in the Big Data era, where we need to process huge amounts of data.
Let’s look at an example of each, run them, and compare their run times.
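To make the web application scenario a little more concrete, here is a minimal sketch of two threads whose waiting periods overlap. It is only an illustration: `fake_io` and `run_concurrently` are hypothetical names, and `time.sleep` stands in for a real network or database call.

```python
import time
from threading import Thread


def fake_io(name, seconds, log):
    # time.sleep stands in for waiting on a network or database call
    # (an assumption made for illustration, not real I/O).
    log.append((name, "start"))
    time.sleep(seconds)
    log.append((name, "end"))


def run_concurrently(seconds=0.3):
    log = []
    threads = [
        Thread(target=fake_io, args=(name, seconds, log))
        for name in ("requests", "queries")
    ]
    started = time.perf_counter()
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    # Total wall-clock time for both tasks together
    return time.perf_counter() - started, log


if __name__ == "__main__":
    elapsed, log = run_concurrently()
    print(f"two 0.3 s waits took {elapsed:.2f} s in total")
    print(log)
```

Because both tasks spend their time waiting rather than computing, the total should come out close to the length of one wait rather than the sum of both, even though the two threads never need to execute on separate cores.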
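    # concurrent.py
    from threading import Thread

    LIMIT = 50000000

    def cycle(n):
        # Busy loop: count from n up to LIMIT (CPU-bound work)
        while n < LIMIT:
            n += 1

    # Two threads, each counting through half of the range
    t1 = Thread(target=cycle, args=(LIMIT // 2,))
    t2 = Thread(target=cycle, args=(LIMIT // 2,))
    t1.start()
    t2.start()
    t1.join()
    t2.join()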
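    # parallel.py
    from multiprocessing import Process

    LIMIT = 50000000

    def cycle(n):
        # Busy loop: count from n up to LIMIT (CPU-bound work)
        while n < LIMIT:
            n += 1

    if __name__ == "__main__":
        # The guard keeps child processes from re-running this block
        # on platforms that start them by re-importing the module.
        p1 = Process(target=cycle, args=(LIMIT // 2,))
        p2 = Process(target=cycle, args=(LIMIT // 2,))
        p1.start()
        p2.start()
        p1.join()
        p2.join()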
Now, the times to run:
$ time python concurrent.py
$ time python parallel.py
As we can see, the parallel code runs much faster than the concurrent code. According to what was said earlier, that makes sense, doesn’t it? In this example, we can only gain time if the tasks run simultaneously.
Your programming language of choice will give you the tools needed to implement both approaches. Analyze your problem, devise a strategy and start coding!
P.S. Please note that a sequential implementation would run faster than the concurrent one because of Python’s GIL, which lets only one thread execute Python bytecode at a time.
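For comparison, a baseline without threads or processes might look like the sketch below. It reuses `cycle` from the examples above; the `limit` parameter, the `return n` and the `run_sequentially` helper are additions made here for illustration, so that the script can time itself and be tried with smaller values.

```python
import time

LIMIT = 50_000_000


def cycle(n, limit=LIMIT):
    # Same busy loop as in the examples above; the limit parameter
    # and the return value are additions for illustration.
    while n < limit:
        n += 1
    return n


def run_sequentially(limit=LIMIT):
    # Run the two halves one after the other in a single thread.
    started = time.perf_counter()
    cycle(limit // 2, limit)
    cycle(limit // 2, limit)
    return time.perf_counter() - started


if __name__ == "__main__":
    print(f"sequential: {run_sequentially():.2f} s")
```

Running this next to the threaded version on your own machine is the simplest way to check how much the GIL costs for CPU-bound work.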