Implementing a Parallel Processing Workflow

September 09, 2025

The refactoring work is complete, and the results are promising. By leveraging Python’s multiprocessing module to create a pool of data workers, the pipeline is now able to overlap data I/O and preprocessing with model inference. What previously took hours to run on a large dataset now completes in a matter of minutes.
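
The overlap described above could look roughly like the sketch below. It is a minimal illustration, not the project's actual code: `load_and_preprocess`, `run_inference`, `run_pipeline`, the batch size, and the toy data are all hypothetical stand-ins. The worker pool prepares samples in the background while the main process consumes them and runs the (here simulated) model on each full batch.

```python
import multiprocessing as mp


def load_and_preprocess(item_id):
    """Hypothetical stand-in for reading one input and preprocessing it."""
    # A real worker would do file I/O and array manipulation here.
    return [float(item_id + k) for k in range(4)]


def run_inference(batch):
    """Hypothetical stand-in for a model forward pass over one batch."""
    return [sum(sample) for sample in batch]


def run_pipeline(item_ids, num_workers=2, batch_size=4, start_method=None):
    """Preprocess in worker processes while the main process runs inference."""
    # get_context(None) returns the platform's default start method.
    ctx = mp.get_context(start_method)
    predictions = []
    batch = []
    with ctx.Pool(processes=num_workers) as pool:
        # imap yields results in input order as they become ready, so the
        # workers keep preparing later items while this loop runs inference.
        for sample in pool.imap(load_and_preprocess, item_ids):
            batch.append(sample)
            if len(batch) == batch_size:
                predictions.extend(run_inference(batch))
                batch = []
        # Flush the final, possibly partial, batch.
        if batch:
            predictions.extend(run_inference(batch))
    return predictions


if __name__ == "__main__":
    print(run_pipeline(range(10)))
```

Keeping inference in the main process means only one process needs to own the model, and `imap` (rather than `map`) is what lets results stream back before the whole input has been processed.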

GPU utilization has increased significantly, and the performance metrics are finally within the targets I set at the beginning of the project. This architectural change was a major undertaking, but it was essential for building a pipeline that can realistically handle the scale of LSST data.
