
S3D Alum Receives Best Student Paper Award
By Marylee Williams
Researchers at Carnegie Mellon University's School of Computer Science and the University of Lisbon co-authored a study that won the Best Student Paper award at this year's International Conference on Software Engineering for Adaptive and Self-Managing Systems (SEAMS).
The research addresses a long-standing problem in retraining machine learning models. In the paper, "Ripple: A Long-Sighted Self-Adaptation Approach To Retrain Machine Learning-Enabled Systems," researchers detail a new method for estimating how a system's performance will evolve over time and how retraining would impact that performance.
The research team from the Software and Societal Systems Department (S3D) included Maria Casimiro, a recent S3D doctoral graduate from CMU Portugal, and Professor David Garlan. The team from Lisbon included Valentim Romão, Paolo Romano and Luís Rodrigues.
More and more applications, from banking to healthcare, use machine learning models, but their performance isn't static. It evolves. If these models aren't updated and retrained, their performance can degrade or they can end up violating laws. For example, if a law passes that changes what kind of data can be collected from a healthcare application, then the machine learning model needs to be updated and potentially retrained. But retraining can be expensive and time-consuming.
To solve this problem, the researchers propose Ripple, short for "Retrain ImPact Predictor for Long-tErm Planning." Currently, machine learning systems decide when to retrain by looking only at what comes next rather than planning for the long term. Ripple allows developers to collect data, analyze it and determine the best time to retrain the model.
The researchers built look-ahead adaptation impact predictors from an adaptation impact dataset, which describes what happens if the machine learning model is retrained on a given day and compares that outcome with retraining on other days. Using Ripple, they can determine when retraining is most efficient, taking into account the characteristics of the new data available for retraining and future changes to the system, such as the cost of compute resources in the cloud.
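As a rough illustration of the long-term planning idea, and not the method from the paper, the short Python sketch below picks a retraining day by comparing predicted performance over a whole planning window against the cost of retraining. The function name, the data layout and all the numbers are hypothetical and chosen only for this example.

```python
from typing import Optional, Sequence


def plan_retrain_day(
    predicted_if_retrained: Sequence[Sequence[float]],
    predicted_no_retrain: Sequence[float],
    retrain_cost: Sequence[float],
) -> Optional[int]:
    """Return the day with the highest net benefit, or None if no day pays off.

    predicted_if_retrained[d][t]: predicted performance on day t if the
        model is retrained on day d (hypothetical predictor output).
    predicted_no_retrain[t]: predicted performance on day t with no retrain.
    retrain_cost[d]: cost of retraining on day d, in the same units.
    """
    best_day: Optional[int] = None
    best_net = 0.0  # retraining must beat doing nothing
    for day, forecast in enumerate(predicted_if_retrained):
        # Sum the predicted gain over the whole window, not just the next day.
        gain = sum(f - b for f, b in zip(forecast, predicted_no_retrain))
        net = gain - retrain_cost[day]
        if net > best_net:
            best_day, best_net = day, net
    return best_day


# Made-up four-day planning window.
no_retrain = [0.90, 0.85, 0.80, 0.75]
if_retrained = [
    [0.92, 0.88, 0.84, 0.80],  # retrain on day 0
    [0.90, 0.93, 0.91, 0.89],  # retrain on day 1
    [0.90, 0.85, 0.94, 0.92],  # retrain on day 2
    [0.90, 0.85, 0.80, 0.95],  # retrain on day 3
]
cost = [0.10, 0.10, 0.05, 0.05]

print(plan_retrain_day(if_retrained, no_retrain, cost))
```

In this made-up data, waiting two days gives the highest net benefit because retraining is cheaper then and the newer data yields a larger predicted improvement. A planner that only looked one day ahead would not see that trade-off.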
The team will present their research this month at the SEAMS conference in Rio de Janeiro.