Thread Programming in Python

In computer science, a thread is the smallest unit of execution that can be scheduled independently, with its own sequence of instructions. In simple terms, a thread is a separate flow of execution, so a program with several threads appears to have several things happening at once. In most Python 3 implementations, however, the different threads do not actually execute at the same time; they only appear to.

The advantage of thread programming is that it lets us run different parts of a program concurrently, which can make the design of the program simpler.

All the threads of a program run inside a single process and share its memory, and each thread can carry out an independent task. Threading alone does not spread Python code across several processors, however. If we want the work to run on multiple processors at the same time, we need to use the multiprocessing module or write that part of the code in a different programming language.

In the CPython implementation of Python, threads interact with the Global Interpreter Lock (GIL), which allows only one thread to execute Python code at a time. For this reason, tasks that spend much of their time waiting for external events are generally good candidates for threading, whereas tasks that need heavy CPU computation may see no speed-up. This holds when the code is written in Python. Threaded code written in C, by contrast, can release the GIL and run concurrently.
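To illustrate why waiting tasks suit threading, here is a minimal sketch. It assumes that time.sleep() is an acceptable stand-in for waiting on an external event; the names wait_for_event and timed_run are hypothetical and chosen only for this illustration.

```python
import threading
import time


def wait_for_event(delay):
    # Stand-in for waiting on I/O; time.sleep() releases the GIL while it waits.
    time.sleep(delay)


def timed_run(thread_count=4, delay=0.2):
    # Start several waiting threads and measure the total wall-clock time.
    start = time.perf_counter()
    threads = [
        threading.Thread(target=wait_for_event, args=(delay,))
        for _ in range(thread_count)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return time.perf_counter() - start


if __name__ == "__main__":
    elapsed = timed_run()
    print(f"4 threads, each waiting 0.2 s, took about {elapsed:.2f} s in total")
```

Because the waits overlap, the total time should be close to a single delay rather than the sum of all four. The exact figure depends on the machine and the operating system scheduler.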

Let us see how to start a thread in Python. 

Starting a Thread:

Now that we have an idea of what a thread is, let’s learn how to make one. The Python Standard Library contains a module named threading, which provides the basic building blocks for working with threads. It encapsulates threads and provides a clean interface to them. To start a separate thread, we first create a Thread instance and then call its .start() method.
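The full listing for this step is not reproduced in this article, so the following is a minimal sketch of what such a program might look like. The name thread_function comes from the text; the log messages, the shortened sleep, and the main() wrapper are assumptions made for illustration.

```python
import logging
import threading
import time


def thread_function(name):
    # Log a message, pause briefly, then log again.
    logging.info("Thread %s: starting", name)
    time.sleep(0.2)  # assumed delay, kept short for illustration
    logging.info("Thread %s: finishing", name)


def main():
    logging.basicConfig(format="%(asctime)s: %(message)s",
                        level=logging.INFO, datefmt="%H:%M:%S")
    logging.info("Main    : before creating thread")
    t = threading.Thread(target=thread_function, args=(1,))
    logging.info("Main    : before running thread")
    t.start()
    logging.info("Main    : waiting for the thread to finish")
    t.join()  # block until thread_function has returned
    logging.info("Main    : all done")
    return t


if __name__ == "__main__":
    main()
```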

If we look around the logging statements, we can see clearly that the main section is creating and starting the thread:

    t = threading.Thread(target=thread_function, args=(1,))
    t.start()

When a Thread is created, it is given a function and a tuple of arguments to pass to that function. In the example above, the thread runs thread_function() with 1 as its argument. The function itself simply logs some messages with a time.sleep() between them.

Working with Multiple Threads: The example code so far has only been working with two threads: the main thread and the one we started with the threading.Thread object. Frequently, we’ll want to start a number of threads and have them do interesting work. Running several threads concurrently within one program is called multi-threading. It can improve the performance of programs that spend much of their time waiting, and Python multi-threading is quite easy to learn.

Let us start understanding multi-threading using the example we used earlier:
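The listing for this step is not reproduced here either, so what follows is a minimal sketch under the same assumptions as before. The helper name run_threads and the log messages are hypothetical.

```python
import logging
import threading
import time


def thread_function(name):
    logging.info("Thread %s: starting", name)
    time.sleep(0.2)  # assumed delay, kept short for illustration
    logging.info("Thread %s: finishing", name)


def run_threads(count=3):
    threads = []
    for index in range(count):
        logging.info("Main    : create and start thread %d", index)
        t = threading.Thread(target=thread_function, args=(index,))
        threads.append(t)  # keep the Thread object so we can join it later
        t.start()
    for index, t in enumerate(threads):
        logging.info("Main    : before joining thread %d", index)
        t.join()
        logging.info("Main    : thread %d done", index)
    return threads


if __name__ == "__main__":
    logging.basicConfig(format="%(asctime)s: %(message)s",
                        level=logging.INFO, datefmt="%H:%M:%S")
    run_threads()
```

Running it several times will typically show the "finishing" messages in varying orders.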

This code works the same way as the earlier example that started a single thread. For each thread, we create a Thread object and then call its .start() method. The program keeps a list of the Thread objects and then waits for each of them using .join(). In the output, the threads may finish in a different order from the one in which they were started, and the order can change from run to run. The "Thread x: finishing" message tells us when each thread is done. The order in which threads run is determined by the operating system, so an algorithm that uses threads must be designed so that it does not depend on any particular ordering.

Thread Pool Executor: Using a ThreadPoolExecutor is an easier way to start up a group of threads. It is part of the Python Standard Library, in the concurrent.futures module. The easiest way to create one is as a context manager, using the with statement, which takes care of creating and tearing down the pool.

Example to illustrate a ThreadPoolExecutor:
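The original listing is not reproduced here. A minimal sketch might look like the following, where the helper name run_pool, the return value of thread_function, and the pool size of 3 are assumptions made for illustration.

```python
import concurrent.futures
import logging
import time


def thread_function(name):
    logging.info("Thread %s: starting", name)
    time.sleep(0.2)  # assumed delay, kept short for illustration
    logging.info("Thread %s: finishing", name)
    return name  # returned only so the caller can inspect the results


def run_pool(max_workers=3):
    # The with block shuts the pool down and waits for the workers on exit.
    with concurrent.futures.ThreadPoolExecutor(max_workers=max_workers) as executor:
        # .map() calls thread_function once per item and yields results
        # in the same order as the inputs.
        results = list(executor.map(thread_function, range(3)))
    return results


if __name__ == "__main__":
    logging.basicConfig(format="%(asctime)s: %(message)s",
                        level=logging.INFO, datefmt="%H:%M:%S")
    print(run_pool())
```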

The above code creates a ThreadPoolExecutor and tells it how many worker threads we want in the pool. It then uses .map() to call the function once for each item in an iterable. When the with block ends, the executor waits for every thread in the pool to finish, which is the equivalent of calling .join() on each of them. It is recommended to use ThreadPoolExecutor as a context manager whenever possible so that you never forget to .join() the threads.

Race Conditions and Synchronisation in Thread Programming:

Race conditions can occur when two or more threads access a shared piece of data or a shared resource and at least one of them modifies it. Race conditions produce confusing results, and because they tend to occur rarely they are very difficult to debug. To solve a race condition, we need to find a way to allow only one thread at a time into the read-modify-write section of our code. The most common way to do this in Python is with a Lock. In some other languages, the same idea is called a mutex. The name comes from MUTual EXclusion, which is exactly what a Lock does. A Lock is an object that acts like a hall pass: only one thread at a time can hold it and enter the read-modify-write section of the code. If any other thread wants to enter at the same time, it has to wait until the current owner of the Lock gives it up.
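To make the read-modify-write problem concrete, here is a minimal sketch of an unprotected update. The class name FalseDatabase is taken from the text; its value attribute, its update() method, the helper run_unsafe, and the artificial time.sleep() that widens the race window are all assumptions made for illustration.

```python
import threading
import time


class FalseDatabase:
    def __init__(self):
        self.value = 0

    def update(self):
        local_copy = self.value   # read the shared value
        local_copy += 1           # modify the private copy
        time.sleep(0.1)           # artificial pause: lets the other thread read too
        self.value = local_copy   # write back, possibly overwriting another update


def run_unsafe():
    database = FalseDatabase()
    threads = [threading.Thread(target=database.update) for _ in range(2)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return database.value


if __name__ == "__main__":
    print("Final value without a lock:", run_unsafe())
```

Two updates were requested, so the correct final value would be 2. With the pause in place, both threads usually read 0 before either writes, so the result is typically 1, although the outcome is not guaranteed either way.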

The basic methods are .acquire() and .release(). A thread calls my_lock.acquire() to get the Lock. If the Lock is already held by another thread, the calling thread has to wait until it is released. A Lock in Python also works as a context manager, so it can be used in a with statement and is released automatically when the with block exits. Let us take the example of the FalseDatabase class and add a Lock to it:
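The original listing is not reproduced here, so the following is a minimal sketch. Only the class name FalseDatabase comes from the text; the attribute names, the method locked_update(), and the helper run_locked are assumptions made for illustration.

```python
import threading
import time


class FalseDatabase:
    def __init__(self):
        self.value = 0
        self._lock = threading.Lock()  # created in the unlocked state

    def locked_update(self):
        # The with statement acquires the lock on entry and releases it on exit,
        # so only one thread at a time runs the read-modify-write section.
        with self._lock:
            local_copy = self.value
            local_copy += 1
            time.sleep(0.05)  # artificial pause; harmless now that the lock is held
            self.value = local_copy


def run_locked(thread_count=2):
    database = FalseDatabase()
    threads = [threading.Thread(target=database.locked_update)
               for _ in range(thread_count)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return database.value


if __name__ == "__main__":
    print("Final value with a lock:", run_locked())
```

With the lock in place, every update is applied, so the final value equals the number of threads.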

The lock is a threading.Lock object, which is created in the unlocked state. The with statement acquires it on entry and releases it on exit.

Objects in Threading: The threading module provides a few more objects that can be handy in different cases. They include the following:

Semaphore: The first Python threading object to look at is threading.Semaphore. A semaphore is a counter with a few special properties. The first property is that its counting is atomic, which means the thread is guaranteed not to be swapped out in the middle of incrementing or decrementing the counter. The internal counter is incremented when you call .release() and decremented when you call .acquire().

The other property is that if a thread calls .acquire() while the counter is zero, the thread blocks until another thread calls .release(). Semaphores are mainly used to protect a resource that has a limited capacity. For example, if you have a pool of connections, a semaphore can limit the size of the pool to a particular number.
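As a rough illustration of limiting access to a resource, the sketch below lets at most two threads use a pretend resource at once. The helper names run_limited and use_resource, the bookkeeping dictionary, and the sleep that stands in for real work are all hypothetical.

```python
import threading
import time


def run_limited(workers=6, limit=2):
    semaphore = threading.Semaphore(limit)   # counter starts at `limit`
    state_lock = threading.Lock()            # protects the bookkeeping below
    state = {"active": 0, "peak": 0}

    def use_resource():
        # Entering the with block calls .acquire(); leaving it calls .release().
        with semaphore:
            with state_lock:
                state["active"] += 1
                state["peak"] = max(state["peak"], state["active"])
            time.sleep(0.05)  # pretend to use the limited resource
            with state_lock:
                state["active"] -= 1

    threads = [threading.Thread(target=use_resource) for _ in range(workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return state["peak"]


if __name__ == "__main__":
    print("Most threads inside at once:", run_limited())
```

The reported peak should never exceed the limit passed to the Semaphore, however many worker threads are started.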

Timer: A threading.Timer schedules a function to be called after a certain amount of time has passed. To create a Timer, we pass the number of seconds to wait and the function to call:

    t = threading.Timer(20.0, my_timer_function)

We start the Timer by calling .start(). The function is called on a new thread at some point after the specified time, but be aware that there is no promise that it will be called at exactly the time we want.

If we want to stop a Timer that we’ve already started, we can cancel it by calling .cancel(). Calling .cancel() after the Timer has triggered does nothing and does not produce an exception.
A Timer can be used to prompt a user for action after a specific amount of time. If the user does the action before the Timer expires, .cancel() can be called.
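The behaviour of .start() and .cancel() can be sketched as follows. The name my_timer_function comes from the text; the helper run_timers, the labels, and the short intervals are assumptions chosen so that the example finishes quickly.

```python
import threading


def run_timers():
    fired = []

    def my_timer_function(label):
        fired.append(label)  # record which timer actually ran

    quick = threading.Timer(0.05, my_timer_function, args=("quick",))
    slow = threading.Timer(30.0, my_timer_function, args=("slow",))
    quick.start()
    slow.start()

    slow.cancel()   # cancelled before it fires, so its function never runs
    quick.join()    # a Timer is a Thread, so we can wait for it to finish
    slow.join()
    quick.cancel()  # cancelling after the timer has fired does nothing
    return fired


if __name__ == "__main__":
    print(run_timers())
```

Only the first timer's function is recorded, because the second timer is cancelled long before its 30 seconds have elapsed.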

Conclusion: In this article, we have covered most of the topics associated with thread programming in Python. We looked at what a thread is and how to create and start one, and then at how to work with multiple threads. We then discussed one of the main difficulties in threading, race conditions, and how to avoid them with the help of synchronisation. Finally, we looked at some of the other objects in the threading module. Python threading allows different parts of a program to run concurrently, and this can simplify its design. If you have some experience in Python and want to speed up a program that spends much of its time waiting, this article should be a suitable starting point.

By Mayank Mishra