Carnegie Mellon University

Kästner Publishes Groundbreaking Book on Machine Learning in Production

Bridging the Gap Between ML Models and Real-World Products

The Software and Societal Systems Department (S3D) at Carnegie Mellon University is pleased to announce the publication of Associate Professor Christian Kästner's new book, "Machine Learning in Production: From Models to Products," released on April 8, 2025, by MIT Press. The book exemplifies S3D's mission to understand and improve how computational technologies can better serve societies and communities, tackling complex socio-technical challenges at the intersection of software engineering and societal impact.

While numerous books explain how to train and evaluate machine learning models, and many MLOps texts focus on streamlining model development and deployment, Kästner's work addresses a critical gap in the literature by focusing on how to build actual software products that deliver value to users. The book cites industry research showing that 87% of machine learning projects fail and 53% never make it from prototype to production—underlining the urgent need for better engineering practices in ML implementation.

 

Author's Perspective


Asked about his motivation for writing the textbook, Kästner says it emerged organically from his classroom experience.

"It came out of teaching," Kästner explains. "We all want to teach how to use cool ML/AI technology in our research to improve software engineering tools and practices, but our students actually go out and build ML-powered applications in industry. It became pretty clear that ML in software products raised new challenges that we were not really prepared to address."

The book responds to a critical gap between academic machine learning and real-world implementation. According to Kästner, part of this involved refocusing attention on established software engineering practices in the ML context. "For example, monitoring systems in production and A/B testing was already established in software engineering, but not widely taught. With ML, it's now a crucial skill," notes Kästner.

This practical origin story exemplifies S3D's approach to education and research: identifying real-world challenges through direct engagement with students and industry, then developing rigorous frameworks that bridge theoretical and applied domains.

The 616-page textbook provides a comprehensive guide to the entire development lifecycle, from requirements and design to quality assurance and operations. It brings an engineering mindset to the challenges of building systems that are usable, reliable, scalable, and safe within real-world constraints of uncertainty, incomplete information, and limited resources.

Rooted in Practical Experience and Research Excellence

The book builds on Kästner's experience teaching the popular CMU course 17-645 "Machine Learning in Production" (also cross-listed as 11-695 "AI Engineering"), which has become a cornerstone offering in both the Software Engineering and Machine Learning programs at Carnegie Mellon since its introduction in 2019. The course's impact extends across departments, serving as a requirement for several years in LTI's Master of Science in Artificial Intelligence and Innovation program—a testament to its wide-ranging relevance and CMU's integrated approach to AI education. The course is designed to foster interdisciplinary collaboration between software engineers and data scientists, providing a system-wide perspective on building AI-enabled products.

Central to the book's approach is the concept of developing "T-shaped team members"—professionals who combine deep expertise in one area (the vertical bar of the T) with sufficient breadth across disciplines (the horizontal bar) to collaborate effectively.

"While it would be nice to make every student graduating here an expert in machine learning, software engineering, security and privacy, human-computer interaction, law and policy, entrepreneurship, and so forth — this is simply not realistic," Kästner explains. "The only way to scale expertise beyond the capacity of a single human is to bring experts together."

But mere co-location of experts isn't enough. "When we bring experts together, they need to be able to work together. Beyond their expertise, they need to understand enough of other fields that they can work with experts in those areas," says Kästner. "That's the essence of the model of T-shaped professionals: Teach them enough to understand what they don't know, when to get help, and how to effectively work together in an interdisciplinary team."

This educational philosophy reflects S3D's commitment to preparing students for the complex reality of modern technical work, where most significant challenges span traditional disciplinary boundaries.

"Christian's course has been instrumental in preparing our students to build AI systems that work reliably in the real world," notes Department Head Nicolas Christin. "This book represents the kind of actionable knowledge and educational innovation that is central to S3D's mission, extending that impact well beyond our campus."

The textbook comprises 29 chapters organized into six sections:

1: Setting the Stage
2: Requirements Engineering
3: Architecture and Design
4: Quality Assurance
5: Process and Teams
6: Responsible ML Engineering

It includes extensive case studies, practical exercises, and supplemental resources such as slides and assignments that have been refined through years of classroom testing with both academic and industry practitioners.

Kästner has announced that all author royalties from the book will be donated to Evidence Action, a nonprofit organization that scales evidence-based and cost-effective programs to reduce the burden of poverty.

Industry and Academic Impact

The publication comes at a critical time as organizations across all sectors struggle with the challenges of deploying machine learning systems that are not only technically sound but also responsible and sustainable. This work addresses key concerns highlighted in S3D's vision of tackling complex socio-technical systems and bringing computer science into conversation with social sciences, public policy, and organizational studies.

"Christian's work addresses one of the biggest challenges we see in industry today—moving beyond models and demos to production systems that deliver real value while maintaining safety and reliability,"

"This publication builds on years of research and teaching experience that has already benefited countless students and industry partners."

The book draws from extensive research literature, including meta-studies synthesizing findings from thousands of industry practitioners to identify common challenges in building ML-powered systems. Kästner's recent keynote at the International Conference on AI Engineering (CAIN 2024) emphasized the need for a system-wide perspective when addressing safety, usability, fairness, and security in ML applications.

What Sets This Book Apart

Kästner's book distinguishes itself from other machine learning texts through its holistic, system-wide perspective. While most ML education and research focuses narrowly on model development, this book emphasizes that "machine learning is almost always used as a component in a larger system—often a very important component, but usually still just one among many components."

"The point is not winning benchmarks, but solving real-world problems. Models alone don't do that," Kästner asserts. "A model-centric approach is very limiting. You cannot reason about AI safety at the model level. Safety is fundamentally a system property, and the key to making a system safe is to think about safeguards around unreliable components, which ML models fundamentally are and always will be."

This perspective helps readers understand how ML models interact with other system components and with the environment, addressing real-world challenges like handling model mistakes, achieving system-level quality attributes, and designing for responsible operation.

The book uses concrete scenarios, such as an automated transcription startup, to illustrate the practical challenges of transitioning from academic prototypes to production systems.

"Like in all our examples, the model is really important, but to turn this into a product and a profitable business, you need much more than a good model," Kästner notes. "It is really hard to get from a model that works on a benchmark to a product."

These real-world examples demonstrate how going beyond model accuracy requires addressing user experience, scalability, monitoring, and business viability—precisely the interdisciplinary approach that S3D champions.

Unlike other books in the field that focus primarily on model development or deployment pipelines, "Machine Learning in Production" takes a holistic, system-level approach that integrates software engineering best practices with machine learning techniques. While MLOps tools and practices are covered extensively, they're presented as part of a broader engineering context rather than as standalone solutions. The book uniquely addresses:

Organizational challenges: Bridging communication gaps between data scientists and software engineers
Socio-technical considerations: Addressing fairness, accountability, and transparency in ML systems
Practical implementation: Converting theoretical principles into working systems
Responsible AI: Ensuring ML systems are designed with ethical considerations from the ground up, with seven full chapters dedicated to safety, security, fairness, interpretability, and accountability
Engineering techniques: Applying established software engineering methods like fault tree analysis and hazard analysis to manage ML uncertainty

This comprehensive approach aligns with S3D's interdisciplinary focus and commitment to studying the bigger picture of computing in context.

Availability and Resources

"Machine Learning in Production: From Models to Products" is available through MIT Press and major retailers in both hardcover ($85) and ebook formats. In keeping with Kästner's commitment to open education, the book is also available as an open-access resource under a Creative Commons license.

The complete book, along with supplementary materials including slides, assignments, and an annotated bibliography, is available at https://mlip-cmu.github.io/book/.

Related S3D Research and Initiatives

This publication complements ongoing research initiatives within S3D, including work on software architecture, program analysis, and complex socio-technical systems. The department's focus on both foundational and applied projects creates an environment where technical expertise meets real-world application—precisely the approach advocated in Kästner's book.

The book also serves as a valuable resource for S3D's master's programs in Software Engineering (MSE) and Privacy Engineering (MSIT-PE), both of which emphasize the practical application of technical knowledge to address societal challenges.


About Christian Kästner: Christian Kästner is an Associate Professor of Computer Science at Carnegie Mellon University and serves as the director of the Software Engineering PhD program in the Software and Societal Systems Department. His research focuses on building reliable, explainable software systems, especially those incorporating artificial intelligence components. With over 250 research publications and 12,000+ citations, his work spans software engineering, program analysis, and the intersection of AI and software systems. Kästner is particularly interested in understanding the limits of modularity and the complexity caused by variability in software systems, and in bridging the gap between data scientists and software engineers.

Editor's Note: The Software and Societal Systems Department (S3D) at Carnegie Mellon University explores the vital intersection where software, systems, and society converge. Our interdisciplinary approach allows us to tackle a broad spectrum of research areas, developing tools, policies, and methods that address large-scale societal problems.