      -> (%5, %rv.6)
  return (%rv)
```

### Intermediate Representation in JAX
248+
249+ The JAX framework facilitates both static and dynamic computational
250+ graphs and employs the Jax Program Representation (Jaxpr) IR. This IR
251+ ensures that the output, not reliant on global variables, depends solely
252+ on the input, with both input and output encapsulating typed
253+ information. Functionality-wise, Jaxpr IR supports an array of features
254+ such as loops, branching, recursion, closure function differentiation,
255+ third-order differentiation, as well as backpropagation and forward
256+ propagation in automatic differentiation.
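
The short sketch below, which assumes only a standard JAX
installation, illustrates one of these features: higher-order
differentiation obtained by composing `jax.grad`.

```python
import jax
import jax.numpy as jnp

# Third-order derivative of sin(x), built by composing jax.grad.
# d^3/dx^3 sin(x) = -cos(x)
d3_sin = jax.grad(jax.grad(jax.grad(jnp.sin)))

print(d3_sin(1.0))  # approximately -cos(1.0) = -0.5403
```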

Jaxpr IR adopts A-normal form (ANF), a style of functional
expression. Its grammar is shown in Code `lst:ANF`.

**lst:ANF**
```
<aexp> ::= NUMBER | STRING | VAR | BOOLEAN | PRIMOP
        |  (lambda (VAR ...) <exp>)
<cexp> ::= (<aexp> <aexp> ...)
        |  (if <aexp> <exp> <exp>)
<exp>  ::= (let ([VAR <cexp>]) <exp>) | <cexp> | <aexp>
```

ANF divides expressions into atomic expressions (aexp) and compound
expressions (cexp). Atomic expressions denote constants, variables,
primitives, and anonymous functions, while compound expressions,
composed of several atomic expressions, can be viewed as invocations
of anonymous or primitive functions. The first element of a cexp is
the function being invoked, and the remaining elements are its
arguments, as the sketch below illustrates.
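
As a minimal sketch (the expression and the temporary name `t0` are
illustrative), converting a nested call into ANF introduces a `let`
that names every intermediate result:

```
(f (g x) 2)            ; nested call: (g x) is not atomic
(let ([t0 (g x)])      ; ANF: name the intermediate result
  (f t0 2))            ; now every argument is an aexp
```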

Code `lst:JaxCode` defines a simple function and prints its Jaxpr.

**lst:JaxCode**
```python
from jax import make_jaxpr
import jax.numpy as jnp

def test_func(x, y):
    ret = x + jnp.sin(y) * 3
    return jnp.sum(ret)

print(make_jaxpr(test_func)(jnp.zeros(8), jnp.ones(8)))
```

The structure of this Jaxpr is shown in Code `lst:JaxPr`.

**lst:JaxPr**
```
{ lambda ; a:f32[8] b:f32[8]. let
    c:f32[8] = sin b
    d:f32[8] = mul c 3.0
    e:f32[8] = add a d
    f:f32[] = reduce_sum[axes=(0,)] e
  in (f,) }
```
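
Control flow also appears explicitly in a Jaxpr. As a hedged sketch
(the function `branch` is illustrative), a branch written with
`jax.lax.cond` is captured as a `cond` primitive when traced:

```python
import jax
import jax.numpy as jnp
from jax import make_jaxpr

def branch(x):
    # Branching must go through jax.lax.cond (not a Python `if`)
    # for the branch to be recorded in the Jaxpr.
    return jax.lax.cond(jnp.sum(x) > 0,
                        lambda v: v * 2.0,  # taken when the predicate holds
                        lambda v: -v,       # taken otherwise
                        x)

print(make_jaxpr(branch)(jnp.ones(4)))
```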

### Intermediate Representation in TensorFlow

TensorFlow uses dataflow programming to execute numerical
computations on dataflow graphs. Under TensorFlow's static graph
mechanism, running a program proceeds through a series of
abstractions and analyses that transform it from higher-level to
lower-level IRs, a process referred to as "lowering".
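
A minimal sketch of the first of these abstractions, assuming
TensorFlow 2.x (the function `f` is illustrative): tracing a
`tf.function` yields a graph-based IR (a `GraphDef`) that later
stages lower further.

```python
import tensorflow as tf

@tf.function
def f(x):
    return tf.reduce_sum(tf.sin(x) * 3.0 + x)

# Tracing the Python function produces a dataflow graph; the printed
# GraphDef is the graph-based IR that subsequent stages lower.
concrete = f.get_concrete_function(tf.TensorSpec([8], tf.float32))
print(concrete.graph.as_graph_def())
```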

To cater to diverse hardware platforms, TensorFlow employs a range of
IR designs. As illustrated in
Figure :numref:`ch04/ch04-tensorflow_ecosystem`, the blue boxes denote
graph-based IRs while the green ones indicate SSA-based IRs. During
IR transformation, each level optimizes its IR independently, without
communicating with the other levels. Because no level is aware of the
optimizations performed at the others, each level must implement its
optimizations as well as it can, which often leads to repeated work
and sub-optimal efficiency. Notably, the transition from graph-based
IRs to SSA-based IRs is a qualitative transformation that incurs
significant cost. The inability to reuse the same optimization code
across levels also hampers development efficiency.

Multi-level IRs therefore present a mixed bag of advantages and
disadvantages. On the plus side, they offer flexible representations,
pass-based optimization at each level, and efficient optimization
algorithms. On the downside, the transformation between different IRs
complicates full compatibility, increasing the engineering workload
and potentially losing information; lower-level optimization can
become difficult when the information it needs was consumed by a
higher-level optimization. To mitigate such information loss,
stricter constraints can be imposed on the optimization sequence.
Additionally, when an optimization could be performed at either of
two adjacent levels, choosing where to implement it is a conundrum
for framework developers. Finally, defining distinct operator
granularities at different levels may affect accuracy to a certain
degree.

![TensorFlow's IR design](../img/ch04/IR-MLIR.png)
:label:`ch04/ch04-tensorflow_ecosystem`

### Multi-Level Intermediate Representation

Multi-Level Intermediate Representation (MLIR) is a unified
infrastructure for building IRs rather than a specific IR. Using the
infrastructure MLIR provides, developers can define IRs to suit their
needs, so MLIR can be interpreted as a "compiler of compilers". It
extends beyond the TensorFlow framework and can be used to construct
IRs linking other languages to backend platforms (such as LLVM).

Although the design of MLIR is heavily influenced by LLVM, MLIR
fosters a more open ecosystem. Because MLIR does not confine
developers to a fixed set of operations or abstraction types, it
offers more latitude to define IRs that solve specific problems. To
support this extensibility, MLIR introduces the concept of
"dialects", a grouping mechanism that places related abstractions
under a unique namespace. Each dialect defines its own operations and
attaches them to the IR, producing an MLIR-typed IR. Within MLIR, the
"operation" is the fundamental unit of abstraction and computation.
Operations can carry application-specific semantics and can represent
all the core IR structures in LLVM: instructions, functions, modules,
and so on.

The MLIR assembly for an operation is illustrated as follows:

```
%t_tensor = "toy.transpose"(%tensor) {inplace = true} : (tensor<2x3xf64>) -> tensor<3x2xf64> loc("example/file/path":12:1)
```

This MLIR operation can be dissected as follows:

- `%t_tensor`: The identifier for the result defined by this
  operation (prefixed with `%` to avoid naming conflicts). An
  operation may define zero or more results, which are represented as
  SSA values.

- `"toy.transpose"`: The operation name. It is usually a unique
  string, with the dialect's namespace as the prefix before the ".".
  Here it refers to the transpose operation in the toy dialect.

- `(%tensor)`: A list of zero or more input operands (or arguments),
  which are SSA values defined by other operations or referring to
  block arguments.

- `{inplace = true}`: A dictionary of zero or more attributes, which
  are special operands that are always constant. Here, a boolean
  attribute named `inplace` with the constant value `true` is
  defined.

- `(tensor<2x3xf64>) -> tensor<3x2xf64>`: The type of the operation
  in functional form, with the input types before the arrow and the
  output types after it. For instance, `tensor<2x3xf64>` denotes a
  tensor with shape `(2, 3)` and data type `float64`.

- `loc("example/file/path":12:1)`: The location in the source code
  from which this operation originated.
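
As a hedged sketch (borrowing `toy.transpose` from MLIR's Toy
tutorial; the function name is illustrative), such an operation
typically appears inside a function, where its operand is itself an
SSA value:

```
func.func @transpose_example(%arg0: tensor<2x3xf64>) -> tensor<3x2xf64> {
  %t_tensor = "toy.transpose"(%arg0) {inplace = true} : (tensor<2x3xf64>) -> tensor<3x2xf64>
  return %t_tensor : tensor<3x2xf64>
}
```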

Because each level's IR design adheres to this assembly format,
transformation across levels is simplified, boosting the efficiency
of IR transformation. Moreover, different levels can interact to
optimize the IRs, so each optimization can be performed at the most
suitable level; there is no need to achieve optimal performance at
every level. By transforming other IRs into the IR at the most
appropriate level and optimizing them there, both optimization and
development efficiency are enhanced. TensorFlow can also employ MLIR
to perform the multi-layer transformation from graph-based IRs to
SSA-based IRs.

### Intermediate Representation in MindSpore

MindSpore adopts a graph-based functional IR known as MindSpore IR
(abbreviated to MindIR). Instead of a multi-level IR structure,
MindIR uses a unified IR that describes the network's logical
structure and operator attributes. This approach eliminates model
disparities across different backends, facilitating connections to
various target machines.

MindIR primarily caters to automatic differential transformation. It
implements a transformation method grounded in a functional
programming framework, so its semantics are close to ANF (A-normal
form). Its defining characteristics include:

1. **Graph-based Representation**. MindSpore represents programs as
   graphs, which are conducive to optimization. MindSpore treats
   functions as first-class elements of a machine learning program:
   they can be invoked recursively, passed as parameters, or returned
   from other functions. This ability paves the way for representing
   a range of control-flow structures.

2. **Purely Functional**. In a purely functional context, a
   function's outcome depends solely on its parameters. Side effects
   arise when a function relies on or affects external state, such as
   global variables, and they can lead to incorrect results if the
   code execution sequence is not strictly maintained. Side effects
   also impede automatic differentiation, hence the requirement for
   pure functions. MindIR can transform representations with side
   effects into purely functional representations (see the sketch
   after this list), ensuring the correct code execution sequence
   while upholding ANF semantics and enabling a higher degree of
   freedom in automatic differentiation.

3. **Closure Representation**. Reverse-mode automatic differentiation
   requires the intermediate results of basic operations to be stored
   in closures and chained together. A closure, the combination of a
   code block and references to its surrounding environment, is
   therefore particularly important. In MindIR, the code block takes
   the form of a function graph, and the surrounding environment is
   interpreted as the context in which the function is invoked.

4. **Strongly Typed**. Each node requires a specific type. This is
   particularly important in machine learning frameworks, where
   operator execution can be time-consuming and detecting errors as
   early as possible saves valuable time. MindIR's type and shape
   inference therefore centers on supporting function invocation and
   higher-order functions.
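
As a minimal, framework-agnostic sketch of the purely functional
rewriting mentioned in item 2 (the function names and the counter
state are illustrative, not MindSpore's API), external state can be
turned into an explicit input and output so that execution order is
fixed by data dependencies alone:

```python
counter = 0

def impure_step(x):
    """Relies on and mutates global state: a side effect."""
    global counter
    counter += 1
    return x * counter

def pure_step(x, counter):
    """The same computation with the state threaded through explicitly."""
    new_counter = counter + 1
    return x * new_counter, new_counter

# The pure version fixes the evaluation order via data dependencies.
y1, c1 = pure_step(3.0, 0)   # (3.0, 1)
y2, c2 = pure_step(3.0, c1)  # (6.0, 2)
```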

Figure :numref:`ch04/ch04-MindIR` outlines the MindIR grammar, which
follows from the characteristics of the MindSpore framework. An ANode
corresponds to an atomic expression in ANF: a ValueNode represents a
constant value, and a ParameterNode represents a formal parameter of
the function. A CNode corresponds to a compound expression in ANF and
represents a function invocation.

![MindIR grammar](../img/ch04/IR-MindIR.png)
:label:`ch04/ch04-MindIR`

The example below in Code `lst:MindSporeCode` offers a deeper look at
MindIR.

**lst:MindSporeCode**
```python
from mindspore import ms_function

def func(x, y):
    return x / y

@ms_function
def test_f(x, y):
    a = x - 1
    b = a + y
    c = b * func(a, b)
    return c
```

The ANF expression corresponding to this function is shown in
Code `lst:MindIR`.

**lst:MindIR**
```
lambda (x, y)
    let a = x - 1 in
    let b = a + y in
    let func = lambda (x, y)
        let ret = x / y in
        ret end in
    let %1 = func(a, b) in
    let c = b * %1 in
    c end
```

In ANF, each expression is bound to a variable by a `let` expression,
and a dependency on the expression's output is represented by
referencing that variable. MindIR, in contrast, packages each
expression as a node and represents dependencies through directed
edges connecting the nodes.
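
As a loose illustrative sketch (the `Node` class is invented for
exposition, not MindSpore's API), the contrast is between naming
values with `let` and letting each node hold direct references to its
input nodes:

```python
from dataclasses import dataclass, field

@dataclass
class Node:
    """A MindIR-style node: inputs are directed edges to other nodes."""
    op: str
    inputs: list = field(default_factory=list)

# c = b * func(a, b) from Code `lst:MindSporeCode`, as a graph.
x = Node("parameter")
y = Node("parameter")
one = Node("constant")
a = Node("sub", [x, one])         # a = x - 1
b = Node("add", [a, y])           # b = a + y
call = Node("call_func", [a, b])  # %1 = func(a, b)
c = Node("mul", [b, call])        # c = b * %1
```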