Structured Concurrency as extension of Go statement considered harmful

alex_ber
11 min read · Nov 18, 2024

General Principles, implementation in Python 3.11 and limitations.

https://vorpus.org/blog/notes-on-structured-concurrency-or-go-statement-considered-harmful/

I urge you to read the whole text via the link above.
As a side note, you can also read Edsger W. Dijkstra’s Go to statement considered harmful and Notes on structured programming, but you don’t have to.

I will try to give you a quick summary of Nathaniel J. Smith’s post above.

Historically, when they were first invented, if statements, loops, function calls, and so on were considered syntactic sugar over goto. For example,

#include <stdio.h>

int main() {
    // Using a for loop to print numbers from 1 to 5
    for (int i = 1; i <= 5; i++) {
        printf("%d\n", i); // Print current value of i
    }

    return 0; // Exit the program
}

was viewed as

#include <stdio.h>

int main() {
    int i = 1; // Initialize counter

start_loop:
    if (i <= 5) { // Condition for the loop
        printf("%d\n", i); // Print current value
        i++; // Increment the counter
        goto start_loop; // Jump back to the start of the loop
    }

    return 0; // Exit the program
}

Later, it was realized that

Any program can be written using branching, loops, and sequential operations.

I will not go into the theory of why (see Turing completeness and the structured program theorem), but it was realized that we can treat branching (if), loops (for, while, do-while), and sequential execution as primitives in our computer language.

As a corollary, goto was no longer required. We can write any program without it.

Side note: Some limited goto-like abilities still exist, for example multiple return points from a function, or break and continue. But they cannot “jump out of the function”: they affect only a local piece of code, at most one function.

Removing goto statements enables new features

Now, I will selectively quote from the note:

For example, Python has some nice syntax for resource cleanup: the with statement. You can write things like:

with open("my_file") as file_handler:
    ...

and it guarantees that the file will be open during the code, but then closed immediately afterward. Most modern languages have some equivalent (RAII, using, try-with-resource, defer, …). And they all assume that control flows in an orderly, structured way. If we used goto to jump into the middle of our with block… what would that even do? Is the file open or not? What if we jumped out again, instead of exiting normally? Would the file get closed? This feature just doesn’t work in any coherent way if your language has goto in it.

Error handling has a similar problem: when something goes wrong, what should your code do? Often the answer is to pass the buck up the stack to your code’s caller, let them figure out how to deal with it. Modern languages have constructs specifically to make this easier, like exceptions, or other forms of automatic error propagation. But your language can only provide this help if it has a stack, and a reliable concept of “caller”…

goto statements: not even once

So goto — the traditional kind that ignores function boundaries — isn’t just the regular kind of bad feature, the kind that’s hard to use correctly. If it were, it might have survived — lots of bad features have. But it’s much worse.

Even if you don’t use goto yourself, merely having it as an option in your language makes everything harder to use. Whenever you start using a third-party library, you can’t treat it as a black box — you have to go read through it all to find out which functions are regular functions, and which ones are idiosyncratic flow control constructs in disguise. This is a serious obstacle to local reasoning. And you lose powerful language features like reliable resource cleanup and automatic error propagation. Better to remove goto entirely, in favor of control flow constructs that follow the “black box” rule.

go statement considered harmful

So that’s the history of goto. Now, how much of this applies to go statements? Well… basically, all of it! The analogy turns out to be shockingly exact.

Go statements break abstraction. Remember how we said that if our language allows goto, then any function might be a goto in disguise? In most concurrency frameworks, go statements cause the exact same problem: whenever you call a function, it might or might not spawn some background task. The function seemed to return, but is it still running in the background? There’s no way to know without reading all its source code, transitively. When will it finish? Hard to say. If you have go statements, then functions are no longer black boxes with respect to control flow. ..

Go statements break automatic resource cleanup. Let’s look again at that with statement example:


with open("my-file") as file_handle:
    ...

Before, we said that we were “guaranteed” that the file will be open while the code is running, and then closed afterwards. But what if the code spawns a background task? Then our guarantee is lost: the operations that look like they’re inside the with block might actually keep running after the with block ends, and then crash because the file gets closed while they’re still using it. And again, you can’t tell from local inspection; to know if this is happening you have to go read the source code to all the functions called inside the … code.

If we want this code to work properly, we need to somehow keep track of any background tasks, and manually arrange for the file to be closed only when they’re finished. It’s doable — unless we’re using some library that doesn’t provide any way to get notified when the task is finished, which is distressingly common (e.g. because it doesn’t expose any task handle that you can join on). But even in the best case, the unstructured control flow means the language can’t help us. We’re back to implementing resource cleanup by hand, like in the bad old days.
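To make the cleanup problem quoted above concrete, here is a minimal sketch using plain threads. The function start_background_reader is a hypothetical stand-in for a library call that quietly spawns a background task; the Event is only there to make the ordering deterministic for the demonstration.

```python
import tempfile
import threading

def start_background_reader(file_handle, may_read, outcome):
    """Hypothetical library call: returns at once, but keeps using file_handle."""
    def background():
        may_read.wait()  # runs only after the caller has left the with block
        try:
            file_handle.read()
            outcome.append("read ok")
        except ValueError as exc:  # raised for I/O on a closed file
            outcome.append(f"failed: {exc}")

    worker = threading.Thread(target=background)
    worker.start()
    return worker

def demo():
    outcome = []
    may_read = threading.Event()
    with tempfile.TemporaryFile("w+") as file_handle:
        worker = start_background_reader(file_handle, may_read, outcome)
    # The with block has ended, so the file is closed while the thread lives on.
    may_read.set()
    worker.join()
    return outcome

print(demo())  # the single entry starts with "failed:"
```

Nothing inside the with block hints that the file will be used after the block ends; only reading start_background_reader reveals it.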

Go statements break error handling. Like we discussed above, modern languages provide powerful tools like exceptions to help us make sure that errors are detected and propagated to the right place. But these tools depend on having a reliable concept of “the current code’s caller”. As soon as you spawn a task or register a callback, that concept is broken. As a result, every mainstream concurrency framework I know of simply gives up. If an error occurs in a background task, and you don’t handle it manually, then the runtime just… drops it on the floor and crosses its fingers that it wasn’t too important. If you’re lucky it might print something on the console…

Of course you can handle errors properly in these systems, by carefully making sure to join every thread, or by building your own error propagation mechanism like errbacks in Twisted or Promise.catch in Javascript. But now you’re writing an ad-hoc, fragile reimplementation of the features your language already has. You’ve lost useful stuff like “tracebacks” and “debuggers”. All it takes is forgetting to call Promise.catch once and suddenly you’re dropping serious errors on the floor without even realizing. And even if you do somehow solve all these problems, you’ll still end up with two redundant systems for doing the same thing.
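The dropped-error behaviour described in the quote can be sketched with the standard threading module. The names failing_job and run_unstructured are made up for this illustration; threading.excepthook is the hook the runtime calls for an uncaught exception in a thread, and it is replaced here only to capture the error instead of printing it.

```python
import threading

def failing_job():
    raise ValueError("lost in the background")

def run_unstructured():
    captured = []
    old_hook = threading.excepthook
    # By default this hook just prints the traceback to stderr.
    threading.excepthook = lambda args: captured.append(args.exc_value)
    try:
        worker = threading.Thread(target=failing_job)
        worker.start()
        worker.join()  # returns normally; the ValueError is not re-raised here
        caller_saw_error = False
    except ValueError:
        caller_saw_error = True
    finally:
        threading.excepthook = old_hook
    return caller_saw_error, captured

seen, captured = run_unstructured()
print(seen, captured)
```

The caller's except clause never runs: the exception reaches only the hook, which is exactly the “drops it on the floor” situation.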

So, as we have seen, spawning a new thread or task with a go-style statement breaks two abstractions:

  • The linear “black box” threading model
  • First-class error handling and interruption

So, what is Structured Concurrency?

The one-line summary is:

When a task splits into several concurrent subtasks, the subtasks must complete before the main task continues.

We can treat what happens inside the subtasks as “black boxes”. We join them and collect all of their results and exceptions before proceeding. Moreover, we can have a cancellation mechanism: for example, if one task raises an exception, the others receive a cancellation signal.

I will not describe how this was solved in Trio. Instead, I will focus on two new Python 3.11 features (asyncio.TaskGroup and ExceptionGroup) that try to address exactly these points, and on their limitations.

Basic Example of asyncio.TaskGroup:

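import asyncio

async def some_async_function():
    await asyncio.sleep(0.1)  # Placeholder for real async work

async def another_async_function():
    await asyncio.sleep(0.2)  # Placeholder for real async work

async def main():
    # The async with block exits only after both tasks have finished
    async with asyncio.TaskGroup() as tg:
        tg.create_task(some_async_function())
        tg.create_task(another_async_function())

asyncio.run(main())

In this example, two asynchronous functions are managed by a single TaskGroup. If either function raises an exception (other than asyncio.CancelledError), the TaskGroup cancels the remaining task, waits for it to finish, and then raises an ExceptionGroup from the async with block.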

Exception Handling with TaskGroup

An essential feature of asyncio.TaskGroup is its approach to exception handling. When a task in the group fails with an exception other than asyncio.CancelledError, the group cancels the remaining tasks, waits for them to finish, and then raises an ExceptionGroup containing the collected exceptions. This makes error handling more predictable when dealing with multiple asynchronous tasks.

import asyncio

async def task1():
    await asyncio.sleep(1)
    raise ValueError('An error in task1')

async def task2():
    await asyncio.sleep(2)
    print('Task2 done')  # Never printed: task2 is cancelled after about 1 second

async def main():
    async with asyncio.TaskGroup() as tg:
        tg.create_task(task1())
        tg.create_task(task2())

    # This point is never reached: task1 failed, task2 was cancelled,
    # and the async with block raised an ExceptionGroup wrapping the ValueError.

asyncio.run(main())

This example demonstrates the exception handling mechanism. Even though task2 would have completed successfully if left alone, the failure of task1 leads to the cancellation of task2, and main() exits with an ExceptionGroup that wraps the ValueError.

Scheduling Tasks in Parallel and Gathering Results

Another significant advantage of using asyncio.TaskGroup is the ability to run tasks concurrently and easily gather their results. Unlike the traditional asyncio.gather, which by default leaves the remaining awaitables running when one of them fails, TaskGroup cancels them, providing a more structured and exception-safe way of managing tasks.

import asyncio

async def compute(x):
    await asyncio.sleep(1)
    return x * x

async def main():
    async with asyncio.TaskGroup() as tg:
        tasks = [tg.create_task(compute(x)) for x in range(5)]

    # Leaving the async with block means every task has already finished,
    # so the results can be read directly, in submission order.
    results = []
    for task in tasks:
        result = task.result()
        results.append(result)
        print(result)  # Print or process each result

    print(results)  # Print all results at the end if needed

asyncio.run(main())

This approach demonstrates how to schedule tasks concurrently and collect their results once the group has completed, showcasing the TaskGroup’s versatility and efficiency in handling multiple async operations.

Limitations

  • Blocking I/O can freeze the event loop, preventing other tasks from executing (this is true for any async function).
  • Tasks without async checkpoints or awaitable points can prevent clean cancellation or proper shutdown.
  • Unresponsive tasks (e.g., infinite loops or synchronous tasks) may not respect cancellation and can stall shutdown.
  • Structured concurrency in TaskGroup requires tasks to be cooperative with the event loop, and any non-async code can lead to delayed shutdowns or poor application performance.

To avoid these limitations, ensure tasks inside the TaskGroup are async, cooperative, and capable of handling cancellation or timeouts properly. Use async-compatible methods for any blocking operations.

Let’s go over the limitations in more detail:

Tasks That Stall or Block Shutdown Can Prevent the Event Loop from Completing

Issue: Tasks that do not complete in a reasonable time, whether due to being unresponsive, having no timeout handling, or performing blocking operations, can prevent the event loop from shutting down properly and stall the entire application. These tasks can:

  • Be unresponsive due to infinite loops, errors that prevent normal termination, or missing cancellation points.
  • Take too long to finish because they don’t have proper timeouts or cancellation mechanisms, causing the application to hang while waiting for them to finish.
  • Block the event loop if they perform blocking I/O (e.g., synchronous file operations, database queries) or CPU-bound synchronous operations (e.g., heavy calculations or long-running loops).

Impact:

  • Delayed Shutdown: If a task takes too long to finish or is stuck in an unresponsive state, the TaskGroup will continue to wait for it, preventing the application from shutting down promptly. The application might hang indefinitely while waiting for tasks to complete or be canceled.
  • Frozen Event Loop: If tasks are blocking the event loop (either through blocking I/O or synchronous CPU-bound work), other tasks cannot execute, and the event loop becomes unresponsive. This leads to poor application performance and a delay in shutdown as the event loop remains blocked.
  • Poor Responsiveness: Blocking tasks or those that take an unreasonably long time to complete can prevent the application from handling incoming requests or other concurrent tasks, impacting the overall system responsiveness.
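The frozen event loop can be observed with a small experiment. In this sketch (all names are illustrative), a heartbeat task tries to tick every 10 ms while another coroutine either blocks with time.sleep or cooperates with asyncio.sleep.

```python
import asyncio
import time

async def blocking_job():
    time.sleep(0.2)           # synchronous sleep: the event loop is frozen

async def cooperative_job():
    await asyncio.sleep(0.2)  # yields to the event loop while waiting

async def count_heartbeats(job):
    ticks = 0

    async def heartbeat():
        nonlocal ticks
        while True:
            await asyncio.sleep(0.01)
            ticks += 1

    beat = asyncio.create_task(heartbeat())
    await job()
    beat.cancel()
    await asyncio.gather(beat, return_exceptions=True)  # wait for the cancellation
    return ticks

async def main():
    return await count_heartbeats(blocking_job), await count_heartbeats(cooperative_job)

blocked, cooperative = asyncio.run(main())
print(blocked, cooperative)
```

While the blocking job runs, the heartbeat gets no chance to tick at all; with the cooperative job it ticks many times (the exact count depends on timer resolution).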

Solution:

  • Timeout Handling: Ensure that tasks have appropriate timeout mechanisms, such as asyncio.timeout (new in Python 3.11) or asyncio.wait_for, which cancel the awaited work when it exceeds a time limit.
  • Cancellation Handling: Design tasks to be cooperative with the event loop by ensuring they reach an await regularly, especially during long-running or computationally intensive operations. If a task catches CancelledError to clean up, it should re-raise it afterwards.
  • Avoid Blocking I/O: For I/O-bound tasks, use asyncio.to_thread or loop.run_in_executor to offload blocking I/O operations to separate threads or processes, freeing up the event loop to handle other tasks.
  • Avoid Synchronous CPU-Bound Work: For CPU-intensive tasks, offload heavy computations to a process pool, for example with loop.run_in_executor and a concurrent.futures.ProcessPoolExecutor. asyncio.to_thread keeps the event loop responsive, but because of the GIL it does not make CPU-bound Python code run in parallel. I will soon write a story about a new API that I have created for offloading such work.

As you can see, subtasks can prevent proper shutdown or block the event loop, whether through unresponsiveness, lack of timeouts, or blocking I/O/CPU work.

Appendix A

Based on https://www.playfulpython.com/python-3-11-exception-groups/

Now, let’s look at another feature of Python 3.11: ExceptionGroup.

In the code below, we use asyncio to read two files.

import asyncio

async def read_file(filename):
    with open(filename) as f:
        data = f.read()
    return data

async def main():
    try:
        async with asyncio.TaskGroup() as g:
            g.create_task(read_file("unknown1.txt"))
            g.create_task(read_file("unknown2.txt"))
        print("All done")
    except* FileNotFoundError as eg:
        for e in eg.exceptions:
            print(e)

asyncio.run(main())

We call read_file twice inside the TaskGroup block to read two files.

Neither file exists, so both tasks fail with FileNotFoundError. TaskGroup wraps both failures in an ExceptionGroup, and we use except* to handle both errors.

As you can see, this is a nice fit for TaskGroups.

Creating Exception Groups

You can create exception groups using the new ExceptionGroup class. It takes a message and a list of exceptions.

def fn():
    e = ExceptionGroup("multiple exceptions",
                       [FileNotFoundError("unknown filename file1.txt"),
                        FileNotFoundError("unknown filename file2.txt"),
                        KeyError("key")])
    raise e

The code above takes two FileNotFoundError and a KeyError and groups them together into an ExceptionGroup. We can then raise the exception group, just like a normal exception.

When we run the code above, we get an output like this:

  + Exception Group Traceback (most recent call last):
  |   File "exgroup.py", line 10, in <module>
  |     fn()
  |   File "exgroup.py", line 8, in fn
  |     raise e
  | ExceptionGroup: multiple exceptions (3 sub-exceptions)
  +-+---------------- 1 ----------------
    | FileNotFoundError: unknown filename file1.txt
    +---------------- 2 ----------------
    | FileNotFoundError: unknown filename file2.txt
    +---------------- 3 ----------------
    | KeyError: 'key'
    +------------------------------------

The traceback here shows all the exceptions that are a part of the group.

Handling Exception Groups

Now that we have seen how to raise an exception group, let us take a look at how to handle such exceptions.

Normally, we use try ... except to handle exceptions. You can certainly do this:

try:
    fn()
except ExceptionGroup as e:
    print(e.exceptions)

The exceptions attribute here contains all the exceptions that are a part of the group. The problem with this is that it makes it difficult to handle the individual exceptions within the group.

For this, Python 3.11 has a new except* syntax. except* can match exceptions that are within an ExceptionGroup:

try:
    fn()
except* FileNotFoundError as eg:
    print("File Not Found Errors:")
    for e in eg.exceptions:
        print(e)
except* KeyError as eg:
    print("Key Errors:")
    for e in eg.exceptions:
        print(e)

Here we use except* to directly match the FileNotFoundError and the KeyError which are within the ExceptionGroup. except* filters the ExceptionGroup and selects those exceptions that match the type specified.

Key differences with try … except

Let us look at a few key points from this snippet:

try:
    fn()
except* FileNotFoundError as eg:
    print("File Not Found Errors:")
    for e in eg.exceptions:
        print(e)
except* KeyError as eg:
    print("Key Errors:")
    for e in eg.exceptions:
        print(e)

First, when we use the as syntax to assign to a variable, then we get an ExceptionGroup object, not a plain exception. In the code above, eg is an ExceptionGroup. For the first except* block, eg will contain all the FileNotFoundErrors from the original exception group. In the second except*, eg will contain all the KeyErrors.

Second, a matching except* block gets executed at most once. In the above code, even if there is more than one FileNotFoundError, the except* FileNotFoundError block will run only once. All the FileNotFoundErrors will be contained in eg and you need to handle all of them in the except* block.

Another important difference between except and except* is that you can match more than one exception block with except*. In the snippet above, the code in except* FileNotFoundError and except* KeyError will both be executed because the exception group contains both types of exceptions.


Written by alex_ber

Senior Software Engineer at Pursway
