Horizontal and Vertical Parallelism

Threading: Parallel but Horizontal

Parallel programming, or more precisely parallel execution, allows code to run in parallel, so that several pieces of code can be executed at virtually the same time. True parallelism in time can only occur when the code runs on more than one processor. If your program runs on a single CPU, the parallelism is only virtual: execution switches from time to time between the tasks.
The programmer cannot determine exactly when such a switch happens. Of course, the schedulers run by the OS and triggered by hardware interrupts are pretty good, so the timing of context switches is already well optimized. But in many cases the programmer knows better in advance when there is a good moment to halt and give other code a chance to run (for example while doing I/O with the hard drive). A very interesting Stack Overflow answer on how and when task switches happen can be found here.
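
As a small, CPython-specific illustration of this: the interpreter only considers switching to another thread roughly every "switch interval"; that interval can be inspected and tuned, but the exact switch point still stays out of the programmer's hands.

import sys

# CPython checks roughly every switch interval whether another thread should
# get a turn; the exact moment of the switch is not under our control.
print(sys.getswitchinterval())  # default: 0.005 seconds (5 ms)
sys.setswitchinterval(0.01)     # ask for less frequent thread switches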


My mental model visualizes this as horizontal parallelism: the function or method is executed virtually in parallel and, in the best case, at the same line of code.

# parallel code using threading
import threading
import time


def bar():

    # This function will be executed virtually in parallel.
    # The context switches, and therefore when each line of code runs,
    # are non-deterministic from the perspective of the function.
    def foo(result):
        b = 0
        for x in range(10):
            b = b + x
            time.sleep(0.1)  # simulate work/waiting; during the sleep other threads get a chance to run
        result.append(b)

    results = []
    # create 2 threads
    threads = [threading.Thread(target=foo, args=(results,)) for _ in range(2)]
    # start the 2 threads "in parallel"
    for t in threads:
        t.start()
    # wait for them
    for t in threads:
        t.join()
    print("result:", results)
    # prints result: [45, 45]

A typical pattern that makes use of this virtual or actual parallelism is the so-called producer/consumer pattern.

If several threads or processes execute exactly the same code, for example the same function, they are often called workers. Workers typically communicate through queues: data comes in through one queue, is processed, and is sent out again through another queue. The producer/consumer model is the classic design pattern for this.

# this is "sync" consumer producer pattern were the parallelism
# will be added by creating more producer oder consumer
# instead a queue one also could use a mutable buffer like a list but needs then a semaphore.
# The queue does this all for us.
def producer(q:queue.Queue):
    for work in range(100):
        time.sleep(0.001)  # extra heavy work
        q.put(work)
    q.put(None)  # poison pill
    print("producer done")


def consumer(q: queue.Queue):
    res = 0
    while True:
        item = q.get()
        if item is None:
            break
        else:
            res = res + item
    print("consumer result", res)


def main():
    q = queue.Queue()
    threads = [threading.Thread(target=target, args=(q,)) for target in [producer, consumer]]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    print("bye")

In traditional multitasking patterns like threading, horizontal parallelism comes at a high memory cost. Each thread needs its own context to run, and that context has to be switched whenever it is time to move to a different task. In times where memory is cheap and the CPU has plenty of bandwidth to memory, this disadvantage tends to disappear as a performance concern. That may also be a reason why threading is still one of the most popular multitasking paradigms. There are many more pitfalls one should consider, but they are not the subject here.
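
To get a feeling for that per-thread memory cost, CPython exposes the stack size reserved for new threads. A small sketch follows; the default and the allowed values are platform dependent, and 256 KiB is just an illustrative choice.

import threading

# Inspect and tune the stack size reserved for each newly created thread.
print(threading.stack_size())     # 0 means "platform default"
threading.stack_size(256 * 1024)  # request a smaller 256 KiB stack for new threads
t = threading.Thread(target=lambda: None)
t.start()
t.join()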
