Commit 2807896

Fix the CPP Data Feeding design document (#9033)
1 parent 3621d9a commit 2807896

1 file changed: +14 -14 lines changed

doc/design/cpp_data_feeding.md

Lines changed: 14 additions & 14 deletions
@@ -1,17 +1,17 @@
 # C++ Data Feeding
 
-In training with Paddle V2 API, data feeding wholly dependents on Python code. To get rid of the Python environment and achieve the goal of "wrapping the whole training by a while loop op" in Paddle Fluid, a C++ data feeding mechanism is required.
+While using Paddle V2 API for Training, data feeding completely depends on the Python code. To get rid of the Python environment and achieve the goal of "wrapping the whole training by a while loop op" in Paddle Fluid, a C++ data feeding mechanism is required.
 
-In this document we show the fundamental design of C++ data feeding process, which includes the data reading, shuffling and batching.
+In this document we show the fundamental design of a C++ data feeding process, which includes data reading, shuffling and batching.
 
 ## Reader
 
-A new concept named 'Reader' is introduced. `Reader` is a series of inherited classes which can be hold by our `Variable` and they are used to read or process file data.
+In order to handle the above mentioned problem, a new concept called 'Reader' is introduced. `Reader` is a series of inherited classes which can be held by our `Variable` and they are used to read or process file data.
 
 
 ### `ReaderBase`
 
-`ReaderBase` is the abstract base class of all readers. It defines the all readers' interfaces.
+`ReaderBase` is the abstract base class for all readers. It defines the interface for all readers.
 
 ```cpp
 class ReaderBase {
@@ -20,10 +20,10 @@ class ReaderBase {
     PADDLE_ENFORCE(!shapes_.empty());
   }
   // Read the next batch of data. (A 'batch' can be only one instance)
-  // If the next batch doesn't exist, the '*out' will be an empty std::vector.
+  // If the next batch doesn't exist, '*out' will be an empty std::vector.
   virtual void ReadNext(std::vector<LoDTensor>* out) = 0;
 
-  // Reinitialize the reader and read the file from the begin.
+  // Reinitialize the reader and read the file from the beginning.
   virtual void ReInit() = 0;
 
   // Get a certain read in data's shape.
@@ -42,36 +42,36 @@ class ReaderBase {
 
 ### `FileReader` and `DecoratedReader`
 
-These two classes are derived from the `ReaderBase` and will further be derived by respective specific readers. That is to say, in our design, there are two kinds of readers: file readers and decorated readers. A file reader reads from a file of some specific format, and yield only one instance of data at a time. e.g. RecordIO reader, jpg reader, .... A decorated reader takes another reader(both file reader and decorated reader are OK) as its 'underlying reader'. It gets data from its underlying reader, does some process on them(shuffling, or batching), then yields processed data. The output data of a decorated reader can be a single instance or a batch. `ShuffleReader` and `BatchReader` are both decorated readers.
+These two classes are derived from the `ReaderBase` and will further be derived by more specific readers. Thus, in our design, there are two kinds of readers: file readers and decorated readers. A file reader reads from a file of some specific format, and yield only one instance of data at a time. For example, RecordIO reader, jpg reader, .... A decorated reader takes another reader(both file reader and decorated reader are OK) as its 'underlying reader'. It gets data from its underlying reader, does some processing on them(shuffling, or batching), then yields processed data. The output data of a decorated reader can be a single instance or a batch. `ShuffleReader` and `BatchReader` are both decorated readers.
 
-All the readers share exactly the same interfaces defined in `ReaderBase`. So they can be decorated for more than one time: We can **shuffle** a reader's outputs and then **batch** the shuffle outputs. The interface consistency also allows related ops use readers without knowing what they are exactly.
+All the readers share exactly the same interface as defined in `ReaderBase`. So they can be decorated for more than one time: We can **shuffle** a reader's outputs and then **batch** the shuffle outputs. The interface consistency also allows related ops use readers without knowing what they are exactly.
 
 
 ### `ReaderHolder`
 
-Different readers belong to different class types. It leads to a problem: How can we drop them into `Variable`s and fetch them out by a unified method? For example, if a Variable holds a `BatchReader`, we can not get it by the following code:
+Different readers belong to different class types. This leads to a problem: How can we drop them into `Variable`s and fetch them out by a unified method? For example, if a Variable holds a `BatchReader`, we can not get it by the following code:
 
 ```cpp
 var->Get<ReaderBase>("batch_reader");
 ```
 
-we have to write:
+We would have to write:
 
 ```cpp
 var->Get<BatchReader>("batch_reader");
 ```
 
-This requires each time getting a reader from a variable we must know the reader's type exactly. It is nearly impossible.
+This requires that in order to get a reader from a variable, every time, we must know the reader's type exactly. This is nearly impossible.
 
-To solve this problem, we introduce `ReaderHolder` as a wrapper. It acts as an empty decorator of `ReaderBase`, which erases reader's type. With `ReaderHolder` we are able to fetch all types of readers by `var->Get<ReaderHolder>("...")` and regard the obtained object as a reader.
+To solve this problem, we introduce `ReaderHolder` as a wrapper. It acts as an empty decorator of `ReaderBase`, which hides reader's type. With `ReaderHolder` we are able to fetch all types of readers by `var->Get<ReaderHolder>("...")` and regard the obtained object as a reader.
 
 ## Related Operators
 
-To create and invoke readers, some now ops are introduced:
+To create and invoke readers, some new ops are introduced:
 
 ### `CreateReaderOp`
 
-Each reader has its creating op. File readers' creating ops have no input and yield the created file reader as its output. Decorated readers' creating ops take the underlying readers as inputs and then yield new decorated readers.
+Each reader has its creation op. File readers' creation ops have no input and yield the created file reader as its output. Decorated readers' creation ops take the underlying readers as inputs and then yield new decorated readers.
 
 ### `ReadOp`
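
Note for readers of this diff: the decorator layering and the type-hiding `ReaderHolder` described in the changed paragraphs can be illustrated with a small standalone sketch. This is not Paddle's implementation. `Sample` is an assumed stand-in for `LoDTensor`, `VectorReader` is a hypothetical in-memory substitute for a file reader, and the constructor signatures, the `Reset` method, and `MakeDemoHolder` are illustrative assumptions. Only the `ReadNext`/`ReInit` interface shape comes from the document.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <utility>
#include <vector>

// Assumption: a plain int stands in for LoDTensor to keep the sketch standalone.
using Sample = int;

// Simplified form of the interface quoted in the diff.
class ReaderBase {
 public:
  virtual ~ReaderBase() = default;
  // Leaves '*out' empty when no further data exists.
  virtual void ReadNext(std::vector<Sample>* out) = 0;
  virtual void ReInit() = 0;
};

// Hypothetical "file reader": yields one instance at a time from memory.
class VectorReader : public ReaderBase {
 public:
  explicit VectorReader(std::vector<Sample> data) : data_(std::move(data)) {}
  void ReadNext(std::vector<Sample>* out) override {
    out->clear();
    if (pos_ < data_.size()) out->push_back(data_[pos_++]);
  }
  void ReInit() override { pos_ = 0; }

 private:
  std::vector<Sample> data_;
  std::size_t pos_ = 0;
};

// Hypothetical decorated reader: groups instances from its underlying reader.
class BatchReader : public ReaderBase {
 public:
  BatchReader(std::shared_ptr<ReaderBase> underlying, std::size_t batch_size)
      : underlying_(std::move(underlying)), batch_size_(batch_size) {}
  void ReadNext(std::vector<Sample>* out) override {
    out->clear();
    std::vector<Sample> one;
    while (out->size() < batch_size_) {
      underlying_->ReadNext(&one);
      if (one.empty()) break;  // underlying reader is exhausted
      out->insert(out->end(), one.begin(), one.end());
    }
  }
  void ReInit() override { underlying_->ReInit(); }

 private:
  std::shared_ptr<ReaderBase> underlying_;
  std::size_t batch_size_;
};

// Wrapper that hides the concrete reader type behind one class.
class ReaderHolder {
 public:
  void Reset(std::shared_ptr<ReaderBase> reader) { reader_ = std::move(reader); }
  void ReadNext(std::vector<Sample>* out) {
    assert(reader_ != nullptr);
    reader_->ReadNext(out);
  }
  void ReInit() {
    assert(reader_ != nullptr);
    reader_->ReInit();
  }

 private:
  std::shared_ptr<ReaderBase> reader_;
};

// Builds a holder around a batch-of-2 reader over the instances 1..5.
ReaderHolder MakeDemoHolder() {
  auto file = std::make_shared<VectorReader>(std::vector<Sample>{1, 2, 3, 4, 5});
  ReaderHolder holder;
  holder.Reset(std::make_shared<BatchReader>(file, 2));
  return holder;
}
```

Code that receives the `ReaderHolder` calls `ReadNext` without knowing whether a file reader or a stack of decorated readers sits behind it, which is the property the `ReaderHolder` paragraph relies on.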
