Commit bb07b77

Fix a minor typo in DTensor overview ipynb file.
PiperOrigin-RevId: 507825081
1 parent 7d9aab3 commit bb07b77


site/en/guide/dtensor_overview.ipynb

Lines changed: 2 additions & 2 deletions
@@ -770,7 +770,7 @@
     "\n",
     "The global matrix product sharded under this scheme can be performed efficiently, by local matmuls that run concurrently, followed by a collective reduction to aggregate the local results. This is also the [canonical way](https://github.com/open-mpi/ompi/blob/ee87ec391f48512d3718fc7c8b13596403a09056/docs/man-openmpi/man3/MPI_Reduce.3.rst?plain=1#L265) of implementing a distributed matrix dot product.\n",
     "\n",
-    "Total number of floating point mul operations is `6 devices * 4 result * 1 = 24`, a factor of 3 reduction compared to the fully replicated case (72) above. The factor of 3 is due to the sharing along `x` mesh dimension with a size of `3` devices.\n",
+    "Total number of floating point mul operations is `6 devices * 4 result * 1 = 24`, a factor of 3 reduction compared to the fully replicated case (72) above. The factor of 3 is due to the sharding along `x` mesh dimension with a size of `3` devices.\n",
     "\n",
     "The reduction of the number of operations run sequentially is the main mechanism with which synchronous model parallelism accelerates training."
     ]
@@ -805,7 +805,7 @@
     "You can perform additional sharding on the inputs, and they are appropriately carried over to the results. For example, you can apply additional sharding of operand `a` along its first axis to the `'y'` mesh dimension. The additional sharding will be carried over to the first axis of the result `c`.\n",
     "\n",
     "\n",
-    "Total number of floating point mul operations is `6 devices * 2 result * 1 = 12`, an additional factor of 2 reduction compared to the case (24) above. The factor of 2 is due to the sharing along `y` mesh dimension with a size of `2` devices."
+    "Total number of floating point mul operations is `6 devices * 2 result * 1 = 12`, an additional factor of 2 reduction compared to the case (24) above. The factor of 2 is due to the sharding along `y` mesh dimension with a size of `2` devices."
     ]
    },
    {
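For reference, here is a minimal sketch of the two sharding schemes the edited passage describes, written against the `tf.experimental.dtensor` API. The 3x2 mesh, the operand values, and the small `from_array` helper are illustrative assumptions inferred from the FLOP counts quoted above (72 → 24 → 12); they are not code taken from this commit or from the notebook.

```python
# A sketch only: the 3x2 mesh, operand values, and `from_array` helper are
# assumptions reconstructed from the FLOP counts quoted in the diff above.
import tensorflow as tf
from tensorflow.experimental import dtensor

# Split one physical CPU into 6 logical devices so a 3x2 mesh fits locally.
phys = tf.config.list_physical_devices("CPU")
tf.config.set_logical_device_configuration(
    phys[0], [tf.config.LogicalDeviceConfiguration()] * 6)

mesh = dtensor.create_mesh([("x", 3), ("y", 2)],
                           devices=[f"CPU:{i}" for i in range(6)])

def from_array(arr, layout):
  """Replicates a host array onto the mesh, then reshards it to `layout`."""
  replicated = dtensor.Layout.replicated(layout.mesh, rank=layout.rank)
  return dtensor.relayout(
      dtensor.copy_to_mesh(tf.constant(arr, dtype=tf.float32), replicated),
      layout)

# Shard the contracted axis on 'x' (size 3): each device multiplies a slice,
# then a collective reduction over 'x' sums the partial products.
# Mul count: 6 devices * 4 result elements * 1 mul each = 24.
a = from_array([[1., 2., 3.], [4., 5., 6.]],
               dtensor.Layout([dtensor.UNSHARDED, "x"], mesh))
b = from_array([[6., 5.], [4., 3.], [2., 1.]],
               dtensor.Layout(["x", dtensor.UNSHARDED], mesh))
c = tf.matmul(a, b)

# Additionally shard a's first axis on 'y' (size 2); that sharding carries
# over to the first axis of the result: 6 devices * 2 result elements = 12.
a2 = dtensor.relayout(a, dtensor.Layout(["y", "x"], mesh))
c2 = tf.matmul(a2, b)
print(dtensor.fetch_layout(c2))  # first result axis is sharded on 'y'
```

Sharding the contracted axis on `x` turns the global matmul into per-device partial products plus a reduction over `x`; the extra `y` sharding of `a`'s first axis partitions the result rows as well, which is why the two reduction factors multiply.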
