book/quarto/contents/core/introduction/introduction.qmd (3 additions, 3 deletions)
@@ -57,7 +57,7 @@ Our investigation begins with the relationship between artificial intelligence a
This historical and technical foundation enables us to formally define this discipline. Following the pattern established by Computer Engineering's emergence from Electrical Engineering and Computer Science, we establish it as a field focused on building reliable, efficient, and scalable machine learning systems across computational platforms. This formal definition addresses both the nomenclature used in practice and the technical scope of what practitioners actually build.
- Building upon this foundation, we introduce the theoretical frameworks that structure the analysis of ML systems throughout this text. The AI Triangle provides a conceptual model for understanding the interdependencies among data, algorithms, and computational infrastructure. We examine the machine learning system lifecycle, contrasting it with traditional software development methodologies to highlight the unique phases of problem formulation, data curation, model development, validation, deployment, and continuous maintenance that characterize ML system engineering.
+ Building upon this foundation, we introduce the theoretical frameworks that structure the analysis of ML systems throughout this text. We develop the AI Triangle framework, which models ML systems as three interdependent components (data, algorithms, and infrastructure) whose interactions determine system capabilities. We examine the machine learning system lifecycle, contrasting it with traditional software development methodologies to highlight the unique phases of problem formulation, data curation, model development, validation, deployment, and continuous maintenance that characterize ML system engineering.
These theoretical frameworks are substantiated through examination of representative deployment scenarios that demonstrate the diversity of engineering requirements across application domains. From autonomous vehicles operating under stringent latency constraints at the network edge to recommendation systems serving billions of users through cloud infrastructure, these case studies illustrate how deployment context shapes system architecture and engineering trade-offs.
@@ -83,9 +83,9 @@ This transformation illustrates why ML has become the dominant approach: In rule
Machine learning systems acquire recognition capabilities through processes that parallel human learning patterns. Object recognition develops through exposure to numerous examples, while natural language processing systems acquire linguistic capabilities through extensive textual analysis. These learning approaches operationalize theories of intelligence developed in AI research, building on mathematical foundations that we establish systematically throughout this text.
- The distinction between AI as research vision and ML as engineering methodology carries significant implications for system design. Modern ML's data-driven approach requires infrastructure capable of collecting, processing, and learning from data at massive scale. Machine learning emerged as a practical approach to artificial intelligence through extensive research and major paradigm shifts[^fn-paradigm-shift], transforming theoretical principles about intelligence into functioning systems that form the algorithmic foundation of today's intelligent capabilities.
+ The distinction between AI as research vision and ML as engineering methodology carries significant implications for system design. Rule-based AI systems scaled with programmer effort, requiring manual encoding of each new capability. Data-driven ML systems scale through computational and data infrastructure, achieving improved performance by expanding training datasets and computational resources rather than through additional programming effort. This transformation elevated systems engineering to a central role: advancement now depends on building infrastructure capable of collecting massive datasets, training models with billions of parameters, and serving predictions at scale. Machine learning emerged as a practical approach to artificial intelligence through this paradigm shift[^fn-paradigm-shift], transforming theoretical principles about intelligence into functioning systems that form the algorithmic foundation of today's intelligent capabilities.
- [^fn-paradigm-shift]: **Paradigm Shift**: A term coined by philosopher Thomas Kuhn in 1962 [@kuhn1962structure] to describe major changes in scientific approach. In AI, the key paradigm shift was moving from symbolic reasoning (encoding human knowledge as rules) to statistical learning (discovering patterns from data). This shift had profound systems implications: rule-based systems scaled with programmer effort, requiring manual encoding of each new rule. Data-driven ML scales with compute and data infrastructure—achieving better performance by adding more GPUs and training data rather than more programmers. This transformation made systems engineering critical: success now depends on building infrastructure to collect massive datasets, train billion-parameter models, and serve predictions at scale, rather than encoding expert knowledge.
+ [^fn-paradigm-shift]: **Paradigm Shift**: A term coined by philosopher Thomas Kuhn in 1962 [@kuhn1962structure] to describe major changes in scientific approach. In AI, the key paradigm shift was moving from symbolic reasoning (encoding human knowledge as rules) to statistical learning (discovering patterns from data). This transformation explains why ML systems engineering emerged as a discipline distinct from traditional software engineering.
[^fn-petabyte-scale]: **Petabyte-Scale Data**: One petabyte equals 1,000 terabytes or roughly 1 million gigabytes—enough to store 13.3 years of HD video or the entire written works of humanity 50 times over. Modern ML systems routinely process petabyte-scale datasets: Meta processes over 4 petabytes of data daily for its recommendation systems, while Google's search index contains hundreds of petabytes of web content. Managing this scale requires distributed storage systems (like HDFS or S3) that shard data across thousands of servers, parallel processing frameworks (like Apache Spark) that coordinate computation across clusters, and sophisticated data engineering pipelines that can validate, transform, and serve data at rates exceeding 100 GB/s. The engineering challenge isn't just storage capacity, but the bandwidth, fault tolerance, and consistency guarantees needed to make petabyte datasets useful for training and inference.
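The throughput claim in the footnote above invites a quick sanity check. A minimal Python sketch, using the round numbers cited in the footnote (1 PB, 100 GB/s) as illustrative assumptions rather than measured values:

```python
# Illustrative arithmetic for the petabyte-scale footnote above.
# The sizes and rates are the round figures cited in the text, not benchmarks.
PETABYTE_GB = 1_000_000   # 1 PB = 1,000 TB, roughly 1 million GB (decimal units)
PIPELINE_GB_S = 100       # the "exceeding 100 GB/s" pipeline rate from the footnote

seconds = PETABYTE_GB / PIPELINE_GB_S
hours = seconds / 3600
print(f"One full pass over 1 PB at {PIPELINE_GB_S} GB/s takes about {hours:.1f} hours")
```

Even at an aggregate 100 GB/s, a single pass over one petabyte takes close to three hours, which is why training pipelines shard data across thousands of servers and read in parallel rather than streaming sequentially.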
book/quarto/contents/core/ml_systems/ml_systems.qmd (1 addition, 5 deletions)
@@ -261,11 +261,7 @@ Compounding these memory challenges, the breakdown of Dennard scaling[^fn-dennar
[^fn-dennard-scaling]: **Dennard Scaling**: Named after Robert Dennard (IBM, 1974), the observation that as transistors became smaller, they could operate at higher frequencies while consuming the same power density. This scaling enabled Moore's Law until 2005, when physics limitations forced the industry toward multi-core architectures and specialized processors like GPUs and TPUs.
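The footnote's claim, that shrinking transistors once kept power density constant, follows from the standard CMOS dynamic-power approximation P = C V² f. A minimal sketch of one ideal Dennard scaling step (the model and the scaling factor k are textbook assumptions, not figures from the diff):

```python
# Ideal Dennard scaling, one generation: linear dimensions shrink by 1/k,
# so capacitance C and voltage V each scale by 1/k while frequency f rises by k.
# Power per transistor then drops by 1/k^2, exactly offsetting the k^2 growth
# in transistors per unit area, so power density stays constant.
def dynamic_power(c, v, f):
    """Standard CMOS dynamic power approximation: P = C * V^2 * f."""
    return c * v**2 * f

k = 1.4  # one classic scaling generation (~0.7x linear shrink)

p_before = dynamic_power(c=1.0, v=1.0, f=1.0)          # baseline transistor
p_after = dynamic_power(c=1.0 / k, v=1.0 / k, f=k)     # scaled transistor
density_gain = k**2                                     # transistors per area

power_density_ratio = (p_after * density_gain) / p_before
print(f"Power density after scaling: {power_density_ratio:.2f}x (ideal Dennard)")
```

After roughly 2005, leakage current and the inability to keep lowering V broke this balance: density kept rising while per-transistor power stopped falling, which is the physics behind the shift to multi-core and specialized accelerators described in the footnote.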
- Beyond power considerations, physical limits impose minimum latencies that no engineering optimization can overcome. The speed of light establishes an inherent 80ms round-trip time between California and Virginia, while internet routing, DNS resolution, and processing overhead typically contribute another 20-420ms. This 100-500ms total latency renders real-time applications infeasible with pure cloud deployment. Network bandwidth faces physical constraints: fiber optic cables have theoretical limits, and wireless communication remains bounded by spectrum availability and signal propagation physics. These communication constraints create hard boundaries that necessitate local processing for latency-sensitive applications and drive edge deployment decisions.
- Heat dissipation emerges as an additional limiting factor as computational density increases. Mobile devices must throttle performance to prevent component damage and maintain user comfort, while data centers require extensive cooling systems that limit placement options and increase operational costs. Thermal constraints create cascading effects: elevated temperatures reduce semiconductor reliability, increase error rates, and accelerate component aging. These thermal realities necessitate trade-offs between computational performance and sustainable operation, driving specialized cooling solutions in cloud environments and ultra-low-power designs in embedded systems.
- These fundamental constraints drove the evolution of the four distinct deployment paradigms outlined in this overview (@sec-ml-systems-deployment-spectrum-38d0). Understanding these core constraints proves essential for selecting appropriate deployment paradigms and establishing realistic performance expectations.
+ As established in @sec-ml-systems-deployment-spectrum-38d0, speed of light latency and thermal constraints compound these memory and power challenges, creating hard boundaries that necessitate local processing for latency-sensitive applications. These physical limitations collectively drove the evolution of the four distinct deployment paradigms examined in this chapter and remain essential considerations for selecting appropriate deployment approaches and establishing realistic performance expectations.
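The speed-of-light floor mentioned above can be estimated with a short propagation-delay sketch. The distance and fiber refractive index here are illustrative assumptions; real fiber routes are longer than the great-circle path, and routing, DNS, and queuing add tens to hundreds of milliseconds on top of pure propagation:

```python
# Propagation-only latency floor for a cross-country round trip.
# Assumed values: great-circle CA-VA distance and the rule of thumb that
# light in optical fiber travels at roughly 2/3 the vacuum speed of light.
C_KM_S = 299_792        # speed of light in vacuum, km/s
FIBER_FRACTION = 2 / 3  # effective speed in fiber relative to vacuum
DISTANCE_KM = 3_900     # rough California-Virginia great-circle distance

one_way_s = DISTANCE_KM / (C_KM_S * FIBER_FRACTION)
rtt_ms = 2 * one_way_s * 1000
print(f"Propagation-only round trip: about {rtt_ms:.0f} ms")
```

Propagation alone yields a few tens of milliseconds; actual route stretch and per-hop processing push observed coast-to-coast round trips higher still, which is why latency-sensitive workloads move to the edge regardless of how fast the cloud hardware is.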
These theoretical constraints manifest in concrete hardware differences across the deployment spectrum. To understand the practical implications of these physical limitations, @tbl-representative-systems provides representative hardware platforms for each category. These examples demonstrate the range of computational resources, power requirements, and cost considerations[^fn-cost-spectrum] across the ML systems spectrum, illustrating the practical implications of each deployment approach.[^fn-pue]