Tag: python

Python threads that mysteriously appear to stop executing

I just solved a weird problem that, once understood, actually makes a lot of sense, but would probably be pretty hard to identify without a lot of guesswork.

My scenario, simplified:

  • One thread that runs in an infinite loop, polling a C-implemented function (from Cython), with a five-second timeout, to populate a queue
  • Any number of worker threads that block on the queue (timeout=7.5s) to get events to process
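
The layout above can be sketched roughly as follows. This is a minimal illustration, not the original code: `poll_events` is a hypothetical pure-Python stand-in for the Cython-wrapped C function, and the timeouts in `run_demo` are shortened so it finishes quickly.

```python
import queue
import threading
import time


def poll_events(timeout):
    """Hypothetical stand-in for the C-implemented polling function."""
    time.sleep(timeout)  # time.sleep releases the GIL while it waits
    return ["event"]


def poller(events, stop, timeout=5.0):
    # The single producer: poll with a timeout, push whatever came back.
    while not stop.is_set():
        for event in poll_events(timeout):
            events.put(event)


def worker(events, stop, handled, timeout=7.5):
    # Each worker blocks on the shared queue until an event arrives.
    while not stop.is_set():
        try:
            event = events.get(timeout=timeout)
        except queue.Empty:
            continue  # nothing arrived in time; check the stop flag and retry
        handled.append(event)  # placeholder for real event processing


def run_demo(duration=0.5, workers=3):
    events, stop, handled = queue.Queue(), threading.Event(), []
    threads = [threading.Thread(target=poller, args=(events, stop, 0.05))]
    threads += [
        threading.Thread(target=worker, args=(events, stop, handled, 0.1))
        for _ in range(workers)
    ]
    for thread in threads:
        thread.start()
    time.sleep(duration)
    stop.set()
    for thread in threads:
        thread.join()
    return handled
```

Because the stand-in sleeps with `time.sleep`, the workers in this sketch keep running; the freeze only appears when the polling call holds the GIL for its whole timeout.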

Now, this should seem like a fairly straightforward thing: a handful of threads, each capable of running in isolation, except for a common dependency on a threadsafe queue. The problem, however, is that the worker threads all eventually seemed to freeze, doing nothing while the infinite-looping thread ran fine.

Symptoms included being able to enumerate all threads, printouts confirming that the threads were indeed alive, and freezes that at first seemed to be related to the logging module.

The problem persisted after I commented out every logging statement in the threads, so logging wasn’t the issue. Next, I tried replacing queue.get() with a simple time.sleep(7.5) to see whether the threads were still operating and the queue was at fault. The same behaviour occurred, with threads freezing when they slept. This implied that the problem was related to blocking.

It wasn’t until I started pinging someone uninvolved as a sounding board that the pattern started to make sense: the threads may not be reacquiring the GIL, so they might not ever be able to resume, even after they’re supposed to wake up. I tried waiting for ten minutes and, sure enough, one of the threads showed signs of life.

The problem was that my C polling function never released the GIL, so the entire five-second timeout window was effectively one big instruction to Python. Instead of the wait being overlapped with work in other threads, every other thread was blocked until each poll completed. They only got a chance to run in the brief gaps between polls, governed by the interpreter's default check interval of 100 bytecode instructions, which is why the process took forever.
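
The effect can be reproduced without Cython. The sketch below is an illustration, not the original polling function, and it assumes a standard GIL-enabled CPython on a POSIX system, where passing `None` to the loader exposes the C library's `usleep`. It calls `usleep` once through `ctypes.PyDLL`, which keeps the GIL during the call, and once through `ctypes.CDLL`, which releases it, and counts how often a second thread gets to run in the meantime. The helper name `ticks_during` is made up for this example.

```python
import ctypes
import threading
import time

# Assumption: POSIX, where None means "symbols already loaded in this process".
holds_gil = ctypes.PyDLL(None)     # foreign calls keep the GIL
releases_gil = ctypes.CDLL(None)   # foreign calls release the GIL


def ticks_during(blocking_call, microseconds=300_000):
    """Count how often a background thread runs while blocking_call executes."""
    ticks = 0
    stop = threading.Event()

    def ticker():
        nonlocal ticks
        while not stop.is_set():
            ticks += 1
            time.sleep(0.001)

    thread = threading.Thread(target=ticker, daemon=True)
    thread.start()
    time.sleep(0.05)             # give the ticker time to start
    before = ticks
    blocking_call(microseconds)  # blocks in C for roughly 0.3 seconds
    after = ticks
    stop.set()
    thread.join()
    return after - before


held_ticks = ticks_during(holds_gil.usleep)
released_ticks = ticks_during(releases_gil.usleep)
print("ticks while the GIL was held:    ", held_ticks)
print("ticks while the GIL was released:", released_ticks)
```

While the GIL is held, the ticker thread should barely advance, which matches the frozen workers described above. In Cython, the usual remedy is to wrap the blocking C call in a `with nogil:` block (the C function must be declared `nogil`); the exact fix used here isn't shown.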


Simple to fix, but really, really hard to diagnose when just looking at the obvious symptoms. Hopefully, anyone who reads this will jump to a conclusion faster than I did, since it’s the sort of issue that can be really frustrating in what seems like a common design.

Symmetric encryption in pure Python (Blowfish)

For a project at work, it became obvious that I would need to implement some form of partial encryption between hosts in a self-configuring network. Nothing super-extreme, of course, since all traffic will exist within a closed environment; I just need enough protection to prevent casual observers from finding out enough about the protocol to inject malicious packets and provide some simple handshaking between the components of the system.

Blowfish came to mind as a good scheme for handling this (each host and operator can be considered sufficiently secure, so sharing an application-specific pass-phrase via config-file is an acceptable solution), and a small amount of Googling turned up http://ivoras.sharanet.org/projects/blowfish.html (based on http://felipetonello.com/scripts/python/blowfish.txt, though I doubt that’s the original site), which I ended up using as the basis for my implementation. Actually, ‘basis’ isn’t the right word, since I didn’t change any logic at all. Rather, I just replaced the more antiquated data-types and access methods with more modern equivalents, restructured the layout, and sought to bring things more in line with PEP 8. The result is a slightly faster, leaner implementation that’s a bit more readable.

My code, with an identical interface to Ivan Voras’s version, dual-licensed under the GPLv1 and Artistic Licenses, like the original, is reproduced below: