Threading in Python
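The threading module uses threads, while the multiprocessing module uses processes. The difference is that threads run in the same memory space, while processes have separate memory. This makes it a bit harder to share objects between processes with multiprocessing. Since threads use the same memory, precautions have to be taken or two threads may write to the same memory at the same time. This is what the global interpreter lock (GIL) is for: in CPython it lets only one thread execute Python bytecode at a time. That protects the interpreter's own internals, but it does not make compound operations in your code atomic, so shared data still needs explicit locking.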
Spawning processes is a bit slower than spawning threads. Once they are running, there is not much difference in startup terms; what remains is the cost of passing data between processes.
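The "precautions" mentioned above usually mean a lock around any read-modify-write on shared state. Below is a minimal sketch of that idea; the function name count_with_lock and the dict-based counter are made up for illustration and are not from any library.

```python
import threading


def count_with_lock(num_threads=4, increments=10_000):
    """Increment one shared counter from several threads, guarded by a Lock."""
    counter = {"value": 0}  # shared state: every thread sees the same dict
    lock = threading.Lock()

    def worker():
        for _ in range(increments):
            # "read, add, write back" is not atomic, so serialize it.
            with lock:
                counter["value"] += 1

    threads = [threading.Thread(target=worker) for _ in range(num_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return counter["value"]


if __name__ == "__main__":
    # 4 threads, 10_000 increments each
    print(count_with_lock())
```

Without the lock the final count may come up short, because two threads can read the same old value before either writes back.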
Multiprocessing pros:
- Separate memory space
- Code is usually straightforward
- Takes advantage of multiple CPUs and cores
- Avoids GIL limitations in CPython
- Eliminates most needs for synchronization primitives unless you use shared memory (instead, it is more of a communication model for IPC)
- Child processes are interruptible/killable
- The multiprocessing module includes useful abstractions with an interface much like that of the threading module
- A must with CPython for CPU-bound processing

Multiprocessing cons:
- IPC is a little more complicated, with more overhead (communication model vs. shared memory/objects)
- Larger memory footprint
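To make the "communication model" concrete, here is a rough sketch that runs CPU-bound work in child processes and sends the results back through a multiprocessing.Queue. The names sum_of_squares, worker, run and TOTALS are hypothetical, chosen only for this example.

```python
import multiprocessing as mp

TOTALS = []  # module-level state; each process works on its own copy


def sum_of_squares(n):
    """CPU-bound work: pure Python arithmetic."""
    return sum(i * i for i in range(n))


def worker(n, results):
    value = sum_of_squares(n)
    TOTALS.append(value)     # changes only the child's copy of TOTALS
    results.put((n, value))  # results must be sent back explicitly


def run(sizes):
    results = mp.Queue()
    procs = [mp.Process(target=worker, args=(n, results)) for n in sizes]
    for p in procs:
        p.start()
    # Drain the queue before joining; the timeout avoids waiting forever
    # if a child dies without reporting.
    collected = dict(results.get(timeout=60) for _ in procs)
    for p in procs:
        p.join()
    return collected


if __name__ == "__main__":
    print(run([10, 100, 1000]))
    print(TOTALS)  # unchanged in the parent: memory is not shared
```

The `if __name__ == "__main__":` guard matters here: with the spawn or forkserver start methods, child processes re-import the main module, and unguarded process creation would recurse.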
Threading pros:
- Lightweight, with a low memory footprint
- Shared memory makes access to state from another context easier
- Allows you to easily make responsive UIs
- CPython C extension modules that properly release the GIL will run in parallel
- Great option for I/O-bound applications

Threading cons:
- In CPython, subject to the GIL
- Not interruptible/killable
- If not following a command queue/message pump model (using the queue module, named Queue in Python 2), manual use of synchronization primitives becomes a necessity (decisions are needed about the granularity of locking)
- Code is usually harder to understand and to get right; the potential for race conditions increases dramatically
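The command queue model mentioned in the list can be sketched roughly as follows: worker threads pull jobs from a queue.Queue and push results to another, so the calling code never touches a lock directly. The function run_workers and the doubling "job" are placeholders for illustration.

```python
import queue
import threading


def run_workers(jobs, num_workers=3):
    """Feed jobs to worker threads through a queue and collect the results."""
    tasks = queue.Queue()
    results = queue.Queue()  # queue.Queue handles its own locking
    stop = object()          # sentinel telling a worker to exit

    def worker():
        while True:
            item = tasks.get()
            if item is stop:
                break
            results.put((item, item * 2))  # stand-in for real I/O-bound work

    threads = [threading.Thread(target=worker) for _ in range(num_workers)]
    for t in threads:
        t.start()
    for job in jobs:
        tasks.put(job)
    for _ in threads:
        tasks.put(stop)  # one sentinel per worker
    for t in threads:
        t.join()

    # All workers have exited, so draining the queue here is safe.
    out = {}
    while not results.empty():
        key, value = results.get()
        out[key] = value
    return out


if __name__ == "__main__":
    print(run_workers([1, 2, 3]))
```

Because threads cannot be killed from outside, the sentinel is how each worker is asked to stop cooperatively.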
Pool.apply is like Python 2's built-in apply (removed in Python 3), except that the function call is performed in a separate process. Pool.apply blocks until the function is completed.
Pool.apply_async is also like the built-in apply, except that the call returns immediately instead of waiting for the result. An ApplyResult object (documented as AsyncResult) is returned. You call its get() method to retrieve the result of the function call. The get() method blocks until the function is completed. Thus, pool.apply(func, args, kwargs) is equivalent to pool.apply_async(func, args, kwargs).get().
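The equivalence can be checked with a small sketch like the one below. The function power and the pool size of 2 are arbitrary choices for illustration; the timeout on get() is only a safety net.

```python
from multiprocessing import Pool


def power(base, exponent=2):
    return base ** exponent


def main():
    with Pool(processes=2) as pool:
        # Blocks until a worker has finished the call.
        blocking = pool.apply(power, (3,), {"exponent": 3})

        # Returns a result handle immediately; get() is what blocks.
        handle = pool.apply_async(power, (3,), {"exponent": 3})
        deferred = handle.get(timeout=30)

    assert blocking == deferred
    return blocking, deferred


if __name__ == "__main__":
    print(main())
```

In practice apply_async is the more useful of the two: several calls can be submitted first and their results collected afterwards, so the workers run concurrently.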
pool.apply(f, args): f is executed in only ONE of the workers of the pool, so one process in the pool runs f(*args).
pool.map(f, iterable): this method chops the iterable into a number of chunks, which it submits to the process pool as separate tasks. This lets you take advantage of all the processes in the pool.
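One way to see the difference is to have each call report the PID of the process that ran it, as in this sketch. The helper tag_with_pid is hypothetical, and chunksize=2 is picked only to make the chunking visible.

```python
import os
from multiprocessing import Pool


def tag_with_pid(x):
    """Return the square of x and the PID of the process that computed it."""
    return x * x, os.getpid()


def main():
    with Pool(processes=4) as pool:
        # One call, one task: a single worker runs it.
        single = pool.apply(tag_with_pid, (7,))

        # The iterable is split into chunks handed out as separate tasks.
        mapped = pool.map(tag_with_pid, range(8), chunksize=2)

    squares = [value for value, _ in mapped]  # results keep the input order
    pids = {pid for _, pid in mapped}         # may hold one or several PIDs
    return single, squares, pids


if __name__ == "__main__":
    print(main())
```

How many distinct PIDs show up depends on scheduling; with tasks this small, one worker may end up handling several chunks.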
References:
- Multiprocessing vs Threading Python: summary
- Python - parallelizing CPU-bound tasks with multiprocessing: comparison of actual computation times
- Multiprocessing: process/thread comparison
- Introduction to Python's Multiprocessing Module: how to use map/apply
- Python Multiprocessing Process or Pool for what I am doing?: multiprocessing.Process vs. multiprocessing.Pool
- Python multiprocessing.Pool: when to use apply, apply_async or map?
- Multiprocessing: How to use Pool.map on a function defined in a class?