Commit b97257b

Merge pull request #13875 from panyx0718/pick-doc
Merge pull request #13819 from panyx0718/doc
2 parents b70c94c + d4e45a3 commit b97257b

3 files changed: +50 -3 lines changed

paddle/fluid/pybind/pybind.cc

Lines changed: 44 additions & 1 deletion
@@ -156,7 +156,50 @@ PYBIND11_PLUGIN(core) {
       .def("_get_double_element", TensorGetElement<double>)
       .def("_dtype", [](Tensor &self) { return ToDataType(self.type()); });
 
-  py::class_<LoDTensor, Tensor>(m, "LoDTensor")
+  py::class_<LoDTensor, Tensor>(m, "LoDTensor", R"DOC(
+    LoDTensor is a Tensor with optional LoD information.
+
+    np.array(lod_tensor) can convert LoDTensor to numpy array.
+    lod_tensor.lod() can retrieve the LoD information.
+
+    LoD is short for Level of Details and is usually used for sequences of
+    varying length. You can skip the following comment if you don't need LoD.
+
+    For example:
+       A LoDTensor X can look like the example below. It contains 2 sequences.
+       The first has length 2 and the second has length 3, as described by x.lod.
+
+       The first tensor dimension 5=2+3 is calculated from LoD if it's available.
+       It is the total number of sequence elements. In X, each element has 2
+       columns, hence [5, 2].
+
+       x.lod  = [[2, 3]]
+       x.data = [[1, 2], [3, 4], // seq 1
+                 [5, 6], [7, 8], [9, 10]] // seq 2
+       x.shape = [5, 2]
+
+       LoD can have multiple levels (for example, a paragraph can have multiple
+       sentences and a sentence can have multiple words). In the following
+       LoDTensor Y, the lod_level is 2. It means there are 2 sequences: the
+       first sequence's length is 2 (it has 2 sub-sequences) and the second one's
+       length is 1. The first sequence's 2 sub-sequences have length 2 and 2,
+       respectively. The second sequence's 1 sub-sequence has length 3.
+
+       y.lod = [[2, 1], [2, 2, 3]]
+       y.shape = [2+2+3, ...]
+
+    Note:
+        In the above description, LoD is length-based. In Paddle's internal
+        implementation, lod is offset-based. Hence, internally,
+        y.lod is represented as [[0, 2, 3], [0, 2, 4, 7]] (the length-based
+        equivalent would be [[2-0, 3-2], [2-0, 4-2, 7-4]]).
+
+        Sometimes LoD is called recursive_sequence_length to be more
+        self-explanatory. In this case, it must be length-based. For historical
+        reasons, when LoD is called lod in the public API, it might be offset-based.
+        Users should be careful about it.
+
+  )DOC")
       .def_buffer(
           [](Tensor &self) -> py::buffer_info { return CastToPyBuffer(self); })
       .def("__init__",
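
The length-based and offset-based LoD forms described in the docstring above can be related with a few lines of plain Python. This is only an illustrative sketch: the helper names (`lengths_to_offsets`, `offsets_to_lengths`, `first_dim`) are hypothetical and are not part of the Paddle API; they simply reproduce the arithmetic the docstring spells out.

```python
from itertools import accumulate


def lengths_to_offsets(length_lod):
    """Convert a length-based LoD to an offset-based one, level by level.

    Hypothetical helper for illustration; not a Paddle function.
    """
    return [[0] + list(accumulate(level)) for level in length_lod]


def offsets_to_lengths(offset_lod):
    """Convert an offset-based LoD back to lengths (differences of neighbours)."""
    return [[b - a for a, b in zip(level, level[1:])] for level in offset_lod]


def first_dim(length_lod):
    """First tensor dimension: total element count at the innermost level."""
    return sum(length_lod[-1])


# Values taken from the docstring examples for X and Y.
x_lod = [[2, 3]]
y_lod = [[2, 1], [2, 2, 3]]

print(first_dim(x_lod))                 # 5, matching x.shape = [5, 2]
print(lengths_to_offsets(y_lod))        # [[0, 2, 3], [0, 2, 4, 7]]
print(offsets_to_lengths([[0, 2, 3], [0, 2, 4, 7]]))  # [[2, 1], [2, 2, 3]]
print(first_dim(y_lod))                 # 7, i.e. 2+2+3
```

Keeping both conversions side by side makes it easier to check which convention a given `lod` value follows before passing it on.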

python/paddle/fluid/layers/io.py

Lines changed: 5 additions & 1 deletion
@@ -55,7 +55,11 @@ def data(name,
     Args:
        name(str): The name/alias of the function
        shape(list): Tuple declaring the shape.
-       append_batch_size(bool): Whether or not to append the data as a batch.
+       append_batch_size(bool):
+          1. If true, it prepends -1 to the shape.
+            For example, if shape=[1], the resulting shape is [-1, 1].
+          2. If shape contains -1, such as shape=[1, -1],
+            append_batch_size is forced to False (it has no effect).
        dtype(int|float): The type of data : float32, float_16, int etc
        type(VarType): The output type. By default it is LOD_TENSOR.
        lod_level(int): The LoD Level. 0 means the input data is not a sequence.
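
The two `append_batch_size` rules in the docstring above can be modelled with a small standalone function. This is a sketch of the documented behaviour only, not the code in `io.py`: the name `resolve_data_shape` is hypothetical, and treating any negative dimension as "contains -1" is an assumption made for the illustration.

```python
def resolve_data_shape(shape, append_batch_size=True):
    """Return the effective shape under the documented append_batch_size rules.

    Hypothetical helper for illustration; not part of the Paddle API.
    """
    shape = list(shape)
    # Rule 2: a negative (unknown) dimension already present in the shape
    # disables append_batch_size. Assumption: any negative value counts.
    if any(dim < 0 for dim in shape):
        append_batch_size = False
    # Rule 1: otherwise, prepend -1 as the batch dimension.
    if append_batch_size:
        shape = [-1] + shape
    return shape


print(resolve_data_shape([1]))        # [-1, 1]
print(resolve_data_shape([1, -1]))    # [1, -1]
print(resolve_data_shape([3, 4], append_batch_size=False))  # [3, 4]
```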

python/paddle/fluid/layers/tensor.py

Lines changed: 1 addition & 1 deletion
@@ -111,7 +111,7 @@ def create_global_var(shape,
                       force_cpu=False,
                       name=None):
     """
-    Create a new variable in the global block(block 0).
+    Create a new tensor variable with value in the global block(block 0).
 
     Args:
         shape(list[int]): shape of the variable
