Origin of Problem
Computer software was conventionally written for serial computing (see here). This meant that, to solve a problem, an algorithm divided the problem into smaller instructions to be executed sequentially. But technology grows rapidly, and the computing environment with it. Many improvements have been made, resulting in faster processing speeds, bigger storage capacities, and bigger problem scales that can be solved.
The hardware development trend shows that hardware is now scaled horizontally (by increasing the number of machines) rather than vertically (by adding more power to a machine). This trend led to the invention of the multicore processor, the multiprocessor, and even cloud computing. Hence, to make optimal use of today's computer capabilities, which have developed in many ways, new approaches called concurrency and parallelism emerged. You may see the details of the computing environment's development and the importance of these new approaches here.
These three terms describe how a computer system executes or processes tasks. I assume that the reader has fundamental knowledge of programming. This article will explain those terms theoretically, learn from a case study, and explain how they relate to optimizing today's hardware capabilities.
Processing that occurs in the order that it is received
It is obvious that the tasks/jobs are executed one by one, in order. If we had two tasks to be processed by one resource, it would process the first task that came into the queue. Only after the first one finished would it continue with the second task.
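As a minimal sketch (in Python, with task names of my own invention, not from the source), sequential processing is just a FIFO queue drained one task at a time:

```python
from collections import deque

def process_sequentially(tasks):
    """Process tasks one by one, in the order they arrived."""
    queue = deque(tasks)                  # FIFO: first in, first out
    finished = []
    while queue:
        task = queue.popleft()            # take the task at the front
        finished.append(f"done:{task}")   # 'execute' it fully before the next one
    return finished

print(process_sequentially(["task-1", "task-2"]))
```

Each task completes entirely before the next one is even looked at, which is exactly the behavior the definition describes.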
Doing lots of things at once
The word ‘doing’ in the statement is bold because it is the key. Parallelism is literally doing many things at the same time. In a computer, it means more than one task is executed independently at the same time on different processing resources. Hence, the requirements for parallelism are:
- More resources (more than 1) to process the tasks
- Tasks that have no dependency on each other
You may have heard or read about the term embarrassingly parallel. In parallel computing, an embarrassingly parallel workload, or embarrassingly parallel problem, is one that requires little or no effort to separate into a number of parallel tasks.
On the other side, reality shows that not all programming problems are parallelizable (can be executed in parallel). For such problems, no matter how many processing resources we have, they will not affect the processing speed. Such a problem ends up with a single processing resource that executes the task sequentially or concurrently.
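As an illustration of my own (not from the source): repeated hashing is a classic non-parallelizable chain, because round n cannot start before round n−1 produces its output, so extra workers cannot speed it up:

```python
import hashlib

def chained_hash(seed: bytes, rounds: int) -> bytes:
    """Each round consumes the previous round's output, so the rounds
    cannot be distributed across workers: the chain is inherently serial."""
    digest = seed
    for _ in range(rounds):
        digest = hashlib.sha256(digest).digest()
    return digest
```

However many CPUs are available, round 2 simply has nothing to do until round 1 finishes.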
Dealing lots of things at once
The word ‘dealing’ is also bold, to show the difference between concurrency and parallelism. Dealing with many things at once means completing many things at once, whether or not they are executed at the same time. Hence, concurrency can be achieved with one or more processing resources. Dealing with many things at once using one processing resource means doing many things as if they were executed at the same time, by context switching between tasks. On the other side, concurrency with many processing resources means parallelism. In other words, we can do concurrency by doing parallelism, but not vice versa.
The concept of concurrency lets us think about how to structure our program so it executes optimally. It encourages us to break down a long process that contains many or long waiting states into smaller pieces. With this approach, we can execute those pieces concurrently to eliminate waiting time.
So which one is better? How can concurrency and parallelism utilize today's hardware capabilities optimally? To deepen our knowledge and answer those questions, let us learn from a case study. Say we have a task called ‘Do The Laundry’ that needs to be done, covering 4 packets of clothes. Let us omit the washing machine as a limiting factor in this case. After we identify our task, we get the details:
- Do The Laundry
It takes 30 minutes in total for each packet: the first 5 minutes to prepare the clothes to be cleaned, 20 minutes of waiting, and the last 5 minutes to pick the clothes up.
Let us solve this task with several techniques:
Sequential Process
In this condition we have a single task which consists of 4 steps. Each step represents doing the laundry for 1 packet of clothes and is worth 30 minutes. Since it is the only task, I will do it by myself. I will do the first step in the first 30 minutes and jump to the next step after it is finished, and so on: 5 minutes of clothes preparation, 20 minutes of waiting, the last 5 minutes to pick the packet up, and then the same again for the next packet. It takes about 120 minutes to finish the task with a sequential process.
This kind of approach does not take advantage of the number of available processing resources. No matter how many people there are, they cannot help me, since there is only 1 task to be done by 1 person.
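The sequential run above can be simulated as a sketch; to keep it instant, I scale 1 simulated minute down to 1 millisecond:

```python
import time

MINUTE = 0.001  # scale: 1 simulated minute = 1 ms

def wash_packet(packet):
    time.sleep(5 * MINUTE)   # prepare the clothes
    time.sleep(20 * MINUTE)  # wait for the wash
    time.sleep(5 * MINUTE)   # pick the clothes up
    return packet

start = time.perf_counter()
done = [wash_packet(p) for p in range(1, 5)]  # one packet strictly after another
elapsed_minutes = (time.perf_counter() - start) / MINUTE
# elapsed_minutes is ~120: 4 packets x 30 minutes, nothing overlaps
```

Notice that the 20-minute waits count fully toward the total, because the single worker does nothing else during them.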
Parallelism
As I mentioned before, parallelism needs more processing resources (humans in this case) to do many tasks/jobs at once, and it needs a task that can be parallelized.
So let us analyze our task. We have a big, long process called the ‘do the laundry’ task, which covers 4 packets of clothes to be washed. In the sequential process, it takes 4 steps (30 minutes each) to finish the task. But we can see that there is no relationship or dependency between those 4 steps, so we can conclude that this kind of task is parallelizable. We can break this long, big process down into smaller processes. In this case, we will break it down into 2 smaller tasks, each covering 2 packets of clothes (worth 60 minutes). So now we have a 1st and a 2nd task with the same workload.
Since we need more processing resources (humans) to do those new, smaller tasks, I will ask my sister for help: she takes the 2nd task while I take the 1st. With this execution method, the 2 tasks (each consisting of 2 laundry packets) are executed independently by different processing resources (me and my sister). It takes about 60 minutes for both tasks to be done with 2 processing resources using parallelism, which is faster than the sequential process.
So, what do you think? Yes, it can be better. We can parallelize further by breaking the task down into 4 smaller tasks, each consisting of only 1 packet of clothes to be washed (worth 30 minutes). But the requirement is that we do horizontal scaling, that is, increase the number of processing resources (humans in this case). We need to scale up to 4 available processing resources in total to do all those tasks. That gives a total execution time of 30 minutes, with 4 tasks done by 4 processing resources.
Today's computing environment supports this kind of horizontal scaling. Technologies such as the multicore processor, the multiprocessor, and cloud computing were invented to achieve faster processing speeds in computer systems through parallelism.
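The two-person split can be sketched with Python's `concurrent.futures`. A caveat of my own: because this workload is dominated by waiting (sleeping), plain threads overlap fine here; genuinely CPU-bound work in CPython would need processes instead.

```python
import time
from concurrent.futures import ThreadPoolExecutor

MINUTE = 0.001  # scale: 1 simulated minute = 1 ms

def do_laundry(packets):
    """One worker washes their packets one after another."""
    for _ in packets:
        time.sleep(30 * MINUTE)  # 5 prepare + 20 wait + 5 pick up
    return packets

start = time.perf_counter()
with ThreadPoolExecutor(max_workers=2) as pool:  # me and my sister
    results = list(pool.map(do_laundry, [[1, 2], [3, 4]]))
elapsed_minutes = (time.perf_counter() - start) / MINUTE
# both workers run at the same time: ~60 simulated minutes instead of ~120
```

Raising `max_workers` to 4 and splitting into 4 single-packet tasks would bring the total down to ~30 simulated minutes, matching the horizontal-scaling argument above.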
Concurrency
Concurrency can be achieved with only 1 processing resource (human) that deals with many jobs at once. Utilizing a single resource to deal with many things requires an understanding of the problem, so that we can structure our program effectively. In this case, we have analyzed that our ‘do the laundry’ task can be divided into 2–4 smaller tasks/jobs, since there are no dependencies between them. That division pays off in parallelism because we have more than 1 processing resource to execute the tasks. But does it still help if we only have 1 processing resource?
To answer that, we have to analyze the problem more deeply. If we look in more detail at each of the smaller tasks we divided, it consists of 3 steps: preparation, waiting, and picking the clothes up. The waiting step leaves us (as the processing resource) idle for 20 minutes while we could be doing other useful things. Hence, we can do concurrency (handle many things) with only 1 processing resource (person) by utilizing that ‘spare time’.
How it works:
- First, I will do my 1st job.
- I will spend my first 5 minutes on clothes preparation for the 1st job.
- When I reach the waiting step (costing 20 minutes) of the 1st job, I will move on to the 2nd job and start its 5-minute preparation step. Then I will do the same up to the 4th job.
- Finally, the waiting period of each job ends and I can spend the last 5 minutes picking the clothes up, from the 1st job through the 4th.
With this approach, I handle many (4) jobs at once. As the image above shows, the arrow between the task list and the ‘me’ emoticon means that I, as a processing resource, do context switching between tasks. This cuts off the inefficient waiting time of each task by doing another task, resulting in only 45 minutes of total execution time. That is better than the sequential process (120 minutes), and even better than parallelism with 2 processing resources doing 2 smaller tasks at once (60 minutes).
This condition is an analogy for a computer system that has a single processor with a single core. The core switches between tasks/jobs frequently to handle many jobs.
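One way to sketch this single-resource context switching is Python's asyncio: one event loop plays the single ‘person’, the hands-on steps block that person, and the waiting step yields control so another job can start its preparation.

```python
import asyncio
import time

MINUTE = 0.001  # scale: 1 simulated minute = 1 ms

async def wash(job, log):
    log.append(f"prepare-{job}")
    time.sleep(5 * MINUTE)            # prepare: blocking, needs my hands
    await asyncio.sleep(20 * MINUTE)  # wait: yield to the other jobs
    log.append(f"pick-up-{job}")
    time.sleep(5 * MINUTE)            # pick up: blocking, needs my hands again

async def main():
    log = []
    start = time.perf_counter()
    await asyncio.gather(*(wash(j, log) for j in (1, 2, 3, 4)))
    return log, (time.perf_counter() - start) / MINUTE

log, elapsed_minutes = asyncio.run(main())
# all 4 preparations happen back to back, then the 4 pick-ups:
# ~45 simulated minutes instead of 120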
Concurrency with Parallelism
Concurrency also can be achieved through parallelism (with many processing resources). In this case, I will need help from my sister. Each of us will handle 2 jobs. I will handle 1st and 2nd job while my sister will handle 3rd and 4th job.
How it works:
- With this approach, we will do 5 minutes in our first job simultaneously. (Me on 1st job while my sister on 3rd job).
- After move into waiting step in our first job, each of us will jump into our second job, that is 2nd job for me and 4th job for my sister.
- After 5 minutes of our second job, we will jump into waiting step in our second job and resulting both of us idle for a while.
- After my cloth is washed completely in 1st job, i will pick it up. As i finish my 1st job, my cloth in 2nd job is finished to be washed and i will pick it up too. On the other side, my sister will do the same with 3rd and 4th job.
This case is an analogy of a computer system which has multiprocessor or multicore processor (or both). Concurrency will let a resource to work on other job that is parallelizable rather than do nothing (idle). It takes 35 minutes in total. It is clear that this type of execution is the best from other execution method (excluding parallelism with 4 processing resources and 4 smaller tasks).
Concurrency is about dealing with many things at once, while parallelism is about doing many things at once. Dealing many things at once means completing many things at once, regardless whether tasks are executed in the same time or not. Concurrency can be achieved with one processing resource by doing context switching among tasks, while it also can be achieved with many processing resources through parallelism.
Both approaches are invented to utilize optimally nowadays hardware development where resources are scaled horizontally (by increasing the amount of machine), than vertically (by adding more power to machine).
The understanding of those approaches will help us to structure our program to work efficiently. The impact is even bigger as the scale grows. We can do cost efficiency by optimize our software execution rather than ask our company to upgrade or add more processing resources to handle growth of user amount or workload.
Well, hope this article gives you new insight or deepen your knowledge in software engineering field.
Disclaimer : This article is written based on my understanding and quoted from many resources which I mention below. I open to critics, appreciations, and suggestions. Let me know what you think by write your comment below.