CURF Introduction: Machine Learning Optimization at the Large Hadron Collider

A picture of the ATLAS experiment at the Large Hadron Collider

Hello! My name is Bronco York, and I am a sophomore electrical engineering major minoring in physics; in my free time, I enjoy cooking and photography. This semester I have been given an incredible opportunity through the Chancellor's Undergraduate Research Fellowship to work with Dr. Tae Min Hong on the ATLAS experiment at the CERN Large Hadron Collider. For me, this is truly a dream come true: since taking my first physics class junior year of high school, I have loved studying physics and trying to understand our complex universe. My love for physics is part of why I chose electrical engineering as my major, and why I am continuing to take physics classes by pursuing a physics minor. I'm also a fan of the TV show The Big Bang Theory, so I had always heard about the legendary Large Hadron Collider; it's the largest particle accelerator in the world and where scientists discovered the Higgs boson!

Freshman year, I wanted to challenge myself by taking the honors versions of the standard introductory physics classes. Even though I had already seen some of these concepts in high school, these classes went much more in depth, and I was able to learn about interesting physics that I otherwise would not have encountered at all. Honors electromagnetism was taught by my future research mentor, Dr. Tae Min Hong, and it is where I was first exposed to his research. He is part of ATLAS, one of several experiments at the LHC, which collides bunches of protons at extremely high energies, producing a plethora of particles that researchers try to detect using layers of instruments surrounding the collision point.

Currently, the LHC collides proton bunches about 40 million times per second, and each collision generates roughly 1 megabyte of raw data from the various sensors and instrumentation. This means the raw rate of data produced by the ATLAS detector is on the order of tens of terabytes per second. To put that into perspective, a standard computer hard drive might hold one terabyte, and larger drives can store around 20 terabytes; the detector produces enough raw data to fill dozens of such drives every single second. This is an overwhelming amount of data that would be impossible to store with current technology. Therefore, before this data is stored, it first passes through a series of "triggers," which are systems that filter out trivial data and keep only the meaningful collisions.
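To make those numbers concrete, here is a quick back-of-the-envelope calculation in Python, using the approximate figures above (a 40 MHz collision rate and about 1 megabyte per event; the exact values vary in practice):

```python
# Rough estimate of the raw ATLAS data rate, using the approximate
# figures quoted above (real values vary).
collision_rate_hz = 40e6   # ~40 million bunch collisions per second
bytes_per_event = 1e6      # ~1 megabyte of raw data per collision

raw_rate_bytes_per_s = collision_rate_hz * bytes_per_event
print(f"Raw data rate: ~{raw_rate_bytes_per_s / 1e12:.0f} TB per second")
# -> Raw data rate: ~40 TB per second
```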

This trigger system reduces the incoming data rate to only around 1 gigabyte per second, which is still a large amount of data, but orders of magnitude less than the raw rate. This data is then stored and can be analyzed further in the future using more advanced algorithms. The trigger system is where Dr. Hong's research lies: if we can create better and more efficient triggers that filter out the useless data before it is ever stored, then we can reduce the amount of data that has to be stored. We try to accomplish this with a machine learning technique called boosted decision trees (BDTs), optimizing the models for speed. We can then put these models on FPGAs, programmable chips that, unlike a general-purpose CPU, can be configured to perform one specific task extremely quickly. By doing this, we can implement algorithms and methods that were previously too slow for the low-level trigger systems.
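For anyone curious what a boosted decision tree looks like in practice, below is a minimal sketch in Python using scikit-learn on made-up toy data (the feature names and numbers are purely hypothetical). It only illustrates the general technique; the real trigger models are converted into fast FPGA firmware rather than run as Python code.

```python
# Minimal sketch of a boosted decision tree (BDT) classifier.
# Toy data and parameters are hypothetical, for illustration only.
import numpy as np
from sklearn.ensemble import GradientBoostingClassifier

rng = np.random.default_rng(0)

# Pretend each event has two detector features, with "interesting"
# collisions (label 1) shifted away from background (label 0).
background = rng.normal(loc=0.0, scale=1.0, size=(5000, 2))
signal = rng.normal(loc=1.5, scale=1.0, size=(5000, 2))
X = np.vstack([background, signal])
y = np.array([0] * 5000 + [1] * 5000)

# Keep the model small (few, shallow trees): small models are what make
# a hardware implementation fast enough for a low-level trigger.
bdt = GradientBoostingClassifier(n_estimators=20, max_depth=3)
bdt.fit(X, y)

# Score a new event: the probability that it is worth keeping.
print(bdt.predict_proba([[1.2, 0.8]])[0, 1])
```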

For me, this research perfectly combines many of my interests, including electrical engineering, physics, and computer science. Not only will I learn so much through this project, but I will also have the opportunity to work with experts in the field and contribute to the advancement of important physics experiments.
