RIPPLe: Building a Bridge Between LSST and DeepLense

It's hard to believe seven weeks have flown by. In that time, I've consumed countless cups of green tea and developed a single obsession: getting petabytes of astronomical data ready for deep learning.

When I started this Google Summer of Code project, the mission seemed straightforward enough. I was tasked with building a pipeline to feed data from the Legacy Survey of Space and Time (LSST) into machine learning models for the DeepLense project.

The Vera C. Rubin Observatory, which will conduct the LSST, is a firehose of cosmic data, set to produce 20 terabytes every single night. Buried in that data, we expect to find around 100,000 new gravitational lenses—a massive jump from the few hundred we know of today. Each one is a cosmic magnifying glass that can help us understand the mysteries of dark matter.

But first, you have to find them. That’s where my project, RIPPLe, comes in.

Phase 0: The Foundation 

I look back at who I was in February, happily working my way through Andrew Ng's Coursera courses, and I have to laugh. I had no idea what was coming. 

Getting the environment set up involved:

Installing three different versions of the LSST Science Pipelines.

Realizing lsst-activate would become my most-typed command.

Somehow convincing PyTorch and the LSST stack to coexist peacefully, which felt like a minor miracle.

By the end, all seven environment tests were passing.
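For a flavour of what such a check can look like, here is a minimal sketch. It is not one of the seven tests, which aren't shown in this post; it only asks Python whether both stacks are importable from the same environment. The function name check_environment is made up for illustration, and I'm assuming the two top-level packages of interest are torch and lsst.

```python
from importlib.util import find_spec


def check_environment(modules=("torch", "lsst")):
    """Report which top-level packages are importable in this environment.

    Hypothetical helper: the default module names are assumptions,
    not the contents of the project's real environment tests.
    """
    # find_spec returns None for a missing top-level package
    # instead of raising, so no imports are actually executed here.
    return {name: find_spec(name) is not None for name in modules}


if __name__ == "__main__":
    for name, ok in check_environment().items():
        print(f"{name}: {'found' if ok else 'MISSING'}")
```

Run it after activating the environment: if either line says MISSING, the two stacks are not yet living in the same place.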


Phase 1: Building the Data-Fetching Class

This is where the real work began. I started creating the LsstDataFetcher class, envisioning it as a "simple wrapper" around the Butler, the LSST stack's data access layer.

On the surface, what a user wants is simple:

# Give me a 64x64 pixel image at these coordinates.

fetcher.fetch_cutout(ra=150.0, dec=2.5, size=64)

Under the hood, it is anything but. The code has to convert those coordinates into LSST's unique tract/patch grid system, check for about 15 different ways things can go wrong, handle missing data without crashing, politely retry if the connection fails, and cache everything it possibly can.
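To make that list a little more concrete, here is a rough sketch of three of those jobs: input validation, retrying with backoff, and caching. It is not the actual LsstDataFetcher. The class name CutoutFetcherSketch, the backend callable, and the choice of LookupError for "no data here" are all assumptions made for illustration, and the tract/patch lookup is assumed to happen inside the backend.

```python
import time


class CutoutFetcherSketch:
    """Illustrative wrapper only; not the real LsstDataFetcher."""

    def __init__(self, backend, max_retries=3, base_delay=0.5, sleep=time.sleep):
        # backend: hypothetical callable (ra, dec, size) -> image.
        # It is assumed to raise LookupError when no data covers the
        # position and ConnectionError when the connection drops.
        self._backend = backend
        self._max_retries = max_retries
        self._base_delay = base_delay
        self._sleep = sleep
        self._cache = {}

    def fetch_cutout(self, ra, dec, size=64):
        # Reject obviously bad input before touching the backend.
        if not 0.0 <= ra < 360.0:
            raise ValueError(f"ra out of range: {ra}")
        if not -90.0 <= dec <= 90.0:
            raise ValueError(f"dec out of range: {dec}")
        if size <= 0:
            raise ValueError(f"size must be positive: {size}")

        # Serve repeated requests from the in-memory cache.
        key = (round(ra, 6), round(dec, 6), size)
        if key in self._cache:
            return self._cache[key]

        for attempt in range(self._max_retries + 1):
            try:
                cutout = self._backend(ra, dec, size)
                break
            except LookupError:
                # Missing data is an expected outcome, not a crash.
                cutout = None
                break
            except ConnectionError:
                if attempt == self._max_retries:
                    raise
                # Exponential backoff: 0.5 s, 1 s, 2 s, ...
                self._sleep(self._base_delay * 2 ** attempt)

        self._cache[key] = cutout
        return cutout
```

With a backend plugged in, the call from earlier works unchanged: fetcher.fetch_cutout(ra=150.0, dec=2.5, size=64) returns the image, or None when nothing covers that patch of sky. Passing sleep in as a parameter is just a convenience so the retry logic can be tested without actually waiting.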


So, What Have I Actually Built?

After all that wrestling, the project is starting to look like a real thing. The RIPPLe repository now has a clear structure:

ripple/
├── data_access/
├── preprocessing/
├── models/
└── pipeline/

Links:  https://github.com/ML4SCI/DeepLense/tree/main/DeepLense_Data_Processing_Pipeline_for_the_LSST

[My sanity](404 Not Found)



