[Done] Feature/mnist train api by reyoung · Pull Request #971 · PaddlePaddle/Paddle

reyoung · 2016-12-20T15:24:41Z

Under developing. Trying to expose ParameterUpdater APIs.

@jacquesqiao but to make read trainer_config.py to one file can be under developing now.

You can get my feature branch by

git remote add yy_remote https://github.com/reyoung/Paddle.git
git fetch yy_remote feature/mnist_train_api
git checkout FETCH_HEAD
git checkout -b YOUR_LOCAL_BRANCH_NAME

and push your commits into your repo. Thx.

…ist_train_api

…t_train_api

…re/mnist_train_api

jacquesqiao · 2016-12-20T15:57:17Z

ok, I will work based on this, Thanks~

wangkuiyi · 2016-12-20T16:37:13Z

demo/mnist/api_train.py

+def main():
+    api.initPaddle("-use_gpu=false", "-trainer_count=4")  # use 4 cpu cores
+    config = paddle.trainer.config_parser.parse_config(
+        'simple_mnist_network.py', '')


没看明白，我们昨天讨论的目的是用Python API写的Paddle程序都在一个py文件里。为什么这是还是分了两个.py文件呢？

这个PR的目的是用目前的Python API来实现MNIST training，还是用新的API来写demo呢？

这个mnist一个基础，按照昨天我们商量的结果，@reyoung 先提供一个小的demo，我在这个基础上去掉配置文件并做一些封装，@reyoung会修改data provider直接feed数据，在这个基础上把这个mnist demo改造成一个符合讨论结果的api调用方式。

这个PR的目的是用目前的Python API来实现MNIST training，还是用新的API来写demo呢？

用第一步目前的Python API实现，第二步把一堆python文件合成一个，第三步确定Python API究竟是啥样的。

wangkuiyi · 2016-12-20T16:38:29Z

demo/mnist/api_train.py

+    m.start()
+
+    for _ in xrange(100):
+        updater.startPass()


通常startXXX要有对应的endXXX。比如 https://golang.org/pkg/os/exec/#Cmd.Start 有对应的 https://golang.org/pkg/os/exec/#Cmd.Wait。如果意思是“做且做完”，可以叫 run_one_pass，比如 https://golang.org/pkg/os/exec/#Cmd.Run

恩，之前的code开发了一半，发现其他的一些问题，然后交另一个PR了。。

这个后面的code补上了。

wangkuiyi · 2016-12-20T16:40:00Z

demo/mnist/api_train.py

+    updater = api.ParameterUpdater.createLocalUpdater(opt_config)
+    assert isinstance(updater, api.ParameterUpdater)
+    updater.init(m)
+    m.start()


start得有一个宾语。如果是method，有时候class就是宾语，也有时候得明确指定宾语。比如

class BashCommand { public: void Run(); // Run the bash command. void SetStderrPipe(Pipe* p); // Set is the predicate, Pipe is the subject. }

这里比较诡异的是m的类型是GradientMachine，如果method叫 start，我不明白是 "start computing gradient” 还是 "start machine”。根据下文看，貌似是 start_training，那么最好的安排貌似是m的类型起名叫做 Trainer 且method name是 start，这样就成了 start trainer 的意思了。

好，不过用户态的代码应该不这样。。可能类似于

with gradient_machine.enter_training():
gradient_machine.forwardBackward这样。

wangkuiyi · 2016-12-20T16:44:31Z

demo/mnist/api_train.py

+    assert isinstance(m, api.GradientMachine)
+    init_parameter(network=m)
+
+    updater = api.ParameterUpdater.createLocalUpdater(opt_config)


我不常用Python，所以我去看了一下各种style guide里关于naming的描述。以下style guide都要求function names和method names都是 function_name的形式，而不是 createLocalUpdater 的形式。

PEP style guide

Google style guide

Python.net

如果我们的API要被Python社区接受，我估计得是Python style的吧。

ParameterUpdater和createLocalUpdater是直接从cpp中用swig expose出来的结构和方法，所以是cpp中的命名规范。我理解这些都不是直接暴露给用户的，而是经过我们封装一下，统一成python style。

是的，不过这就是swig api的悲剧点之一了。

这个名字直接从C++的头文件PaddleAPI.h自动化翻译过来的，没有办法自定义。

如果是C-API会好很多。

wangkuiyi · 2016-12-20T16:46:09Z

demo/mnist/api_train.py

+    for _ in xrange(100):
+        updater.startPass()
+
+    m.finish()


m.complete_training

另外，finish和complete的区别是“完蛋”和“完美”的区别，比如：

If you married a wrong woman, you are finished.

If you married the right woman, your are completed.

wangkuiyi · 2016-12-20T16:53:13Z

paddle/api/PaddleAPI.h

+  ParameterUpdater();
+
+public:
+  static ParameterUpdater* createLocalUpdater(OptimizationConfig* config);


在 Google C++ Style Guide 里，function/method names 应该是 CreateLocalUpdater。

虽然现在Paddle的C++ naming和Google style不同，但是我建议（至少新代码里）采用Google style，因为一旦有人违反，我们在code review comments里可以贴一个链接（如上），告诉开发者为什么naming需要修正，而不需要一遍又一遍地手工输入解释。

这个条款看起来不一定是非要遵守，因为没有感受到这个条款的优势，目前Paddle中的变量命名规则是.

对于类型名，采用全大写的CamelCase. 例如 GradientMachine, NeuralNetwork.

对于变量名:

普通变量名使用首字母小写的CamelCase, gradientMachine.

类内成员变量，采用首字母小写的CamelCase，同时尾缀为_。比如 gradientMachine_。

对于静态全局变量，采用首字母为g的CamelCase，比如 gSyncThreadPool;

对于namespace的名称，为下划线分割的名字，例如 cpu_avx

Paddle目前有一套完整的命名风格了，所以似乎不需要非要改。

另外，关于变量命名方式的guide，有很多风格(Qt，stl，boost，linux，google)。只要在一个库里面统一，看到某一个名字，能够反应出变量大概是什么，其实就还好了。

我们必须要遵守一个条款。遵守的最大用处，就是不要留给每个人纷争哪些条款值得遵守，否则review就总是会陷入在这类无休止的纷争里。

准守Google style的原因是：它对pros和cons都有分析，并且每一条条款都有一个URL。这样我们code review的时候只需要贴URL，不需要反复解释。

是的。单纯就命名这一点。我们可以遵守Paddle目前使用的条款，而没有必要全改成google风格的。

@reyoung paddle现在有关于naming的条款，可以在code review comments里贴URL的吗？如果没有，我建议就用Google style就好了。

wangkuiyi · 2016-12-20T17:03:08Z

demo/mnist/simple_mnist_network.py

@@ -0,0 +1,16 @@
+from paddle.trainer_config_helpers import *
+
+settings(learning_rate=1e-4, learning_method=AdamOptimizer(), batch_size=1000)


函数名字应该是动词或者动宾短语，settings是一个名词。这里的函数名看上去应该是 config_training_settings吧？

这里估计最后会改。。

wangkuiyi · 2016-12-20T17:10:32Z

我对这个pr的意图不一定理解对了，但是我感觉API的naming convention最好得符合C++和Python社区的大众习惯。所以我的comments里有引自code styles的，也有我关于英语语法的描述。希望这两点得到重视。 @reyoung @jacquesqiao

jacquesqiao · 2016-12-21T00:07:09Z

@wangkuiyi 关于code style的部分非常重要，后面改造过程中还需要持续反馈提供意见，保证我们输出的是符合预期的，封装良好的api style。

wangkuiyi · 2016-12-21T00:10:48Z

@jacquesqiao 谢谢回复。我了解这个pr的上下文了。谢谢大家！ @reyoung

* hidden decorator kwargs in DataProvider.__init__ * also add unit test for this.

…rain_api

…n_api

* hidden decorator kwargs in DataProvider.__init__ * also add unit test for this.

reyoung

这个PR目前还在开发，有一些事情会慢慢修复的。

reyoung · 2016-12-21T06:35:37Z

demo/mnist/api_train.py

+    assert isinstance(m, api.GradientMachine)
+    init_parameter(network=m)
+
+    updater = api.ParameterUpdater.createLocalUpdater(opt_config)


是的，不过这就是swig api的悲剧点之一了。

这个名字直接从C++的头文件PaddleAPI.h自动化翻译过来的，没有办法自定义。

如果是C-API会好很多。

reyoung · 2016-12-21T06:36:55Z

demo/mnist/api_train.py

+    updater = api.ParameterUpdater.createLocalUpdater(opt_config)
+    assert isinstance(updater, api.ParameterUpdater)
+    updater.init(m)
+    m.start()


好，不过用户态的代码应该不这样。。可能类似于

with gradient_machine.enter_training():
gradient_machine.forwardBackward这样。

reyoung · 2016-12-21T06:37:28Z

demo/mnist/api_train.py

+    m.start()
+
+    for _ in xrange(100):
+        updater.startPass()


恩，之前的code开发了一半，发现其他的一些问题，然后交另一个PR了。。

这个后面的code补上了。

reyoung · 2016-12-21T06:38:50Z

demo/mnist/simple_mnist_network.py

@@ -0,0 +1,16 @@
+from paddle.trainer_config_helpers import *
+
+settings(learning_rate=1e-4, learning_method=AdamOptimizer(), batch_size=1000)


这里估计最后会改。。

reyoung · 2016-12-21T06:40:13Z

demo/mnist/api_train.py

+    for _ in xrange(100):
+        updater.startPass()
+
+    m.finish()


reyoung · 2016-12-21T06:49:29Z

paddle/api/PaddleAPI.h

+  ParameterUpdater();
+
+public:
+  static ParameterUpdater* createLocalUpdater(OptimizationConfig* config);


这个条款看起来不一定是非要遵守，因为没有感受到这个条款的优势，目前Paddle中的变量命名规则是.

对于类型名，采用全大写的CamelCase. 例如 GradientMachine, NeuralNetwork.

对于变量名:

普通变量名使用首字母小写的CamelCase, gradientMachine.

类内成员变量，采用首字母小写的CamelCase，同时尾缀为_。比如 gradientMachine_。

对于静态全局变量，采用首字母为g的CamelCase，比如 gSyncThreadPool;

对于namespace的名称，为下划线分割的名字，例如 cpu_avx

Paddle目前有一套完整的命名风格了，所以似乎不需要非要改。

另外，关于变量命名方式的guide，有很多风格(Qt，stl，boost，linux，google)。只要在一个库里面统一，看到某一个名字，能够反应出变量大概是什么，其实就还好了。

…rain_api

…e/mnist_train_api

…rain_api

…nto feature/mnist_train_api

…train_api

reyoung · 2016-12-22T13:25:57Z

@jacquesqiao your 843b63b and 763a30f are cherry picked to my branch.

And @jacquesqiao , maybe you should associate your email address to your github account. Then github will show the correct avatar of your commits.

jacquesqiao · 2016-12-23T05:54:54Z

Done, I have associate my email, thanx

reyoung · 2016-12-23T06:18:44Z

@wangkuiyi 这个PR已经完成了，我们使用SWIG API可以训练MNIST的demo了。

这里并不涉及任何Python用户态的API应该如何设计，只是从裸的SWIG API出发，完成这个功能。

PythonAPI的设计和提取在另一个PR里面做。

jacquesqiao

the remaining work will be done in #1005

…train_api

wangkuiyi · 2016-12-27T03:39:39Z

不好意思，因为我休假，耽误了这个pr 的merge。

我approve了。

但是有一点需要指出：

我们必须要严格遵守一个code style。遵守的最大用处，就是不要留给每个人纷争哪些条款值得遵守，否则review就总是会陷入在这类无休止的纷争里。

准守Google style的原因是：它对pros和cons都有分析，并且每一条条款都有一个URL。这样我们code review的时候只需要贴URL，不需要反复解释。

reyoung · 2016-12-27T03:42:58Z

耽误了这个pr 的merge。

倒是不耽误。因为git的分支很好用。可以有新的PR基于老的PR，互相不影响review。#1005 #1012

我们必须要严格遵守一个code style。

我们可以在paddle这个组织下维护一个code style吧。之前adu那个，可以fork过来，加上一些Paddle特有的style(主要是增加paddle目前代码的命名风格)

wangkuiyi · 2016-12-27T03:53:35Z

@reyoung 好！

因为和ADU同学合作的那个primier，是对google code style的一个修订，基本原则还是google style，所以我们comnents里贴的链接我估计绝大多数还是google style。

* bugfix (PaddlePaddle#967) * bugfix * Update release_note_cn.rst * bug fix (PaddlePaddle#968)

1. typo fix: stm_hidden_size -> lstm_hidden_size. 2. import fix: add missing imports in export_model.py 3. nit: remove unused imports.

reyoung added 6 commits December 20, 2016 17:32

Refine Code

c0e687b

Merge branch 'feature/add_const_in_parameter_updater' into feature/mn…

06944ee

…ist_train_api

Start doing mnist_train_api

8b4cbcf

Merge branch 'feature/clean_gradient_machine_start' into feature/mnis…

ad6cb60

…t_train_api

Add GradientMachine::start/finish to API

025e3e9

Merge branch 'feature/clean_parameter_updater_finish_pass' into featu…

677c79b

…re/mnist_train_api

reyoung assigned jacquesqiao Dec 20, 2016

jacquesqiao self-requested a review December 20, 2016 16:09

wangkuiyi requested changes Dec 20, 2016

View reviewed changes

reyoung added 9 commits December 21, 2016 12:54

Wait for reading data.

27d87db

A tiny fix in PyDataProvider2

9f5e742

* hidden decorator kwargs in DataProvider.__init__ * also add unit test for this.

Merge branch 'feature/fix_param_hidden_in_pydp2' into feature/mnist_t…

ad93b8f

…rain_api

Try to read data in mnist

5f6c4af

Use numpy in DenseScanner.

36d1e61

Merge branch 'feature/fix_swig_dense_scanner' into feature/mnist_trai…

efb5c10

…n_api

Try expose ParamUpdater::update

20249e8

A simplest train file for mnist added.

05ab22c

A tiny fix in PyDataProvider2

1f4f044

* hidden decorator kwargs in DataProvider.__init__ * also add unit test for this.

reyoung commented Dec 21, 2016

View reviewed changes

reyoung added 7 commits December 21, 2016 15:19

Merge branch 'feature/fix_param_hidden_in_pydp2' into feature/mnist_t…

cf5bf5b

…rain_api

Merge branch 'feature/add_const_in_gradient_machine_eval' into featur…

1e6c87b

…e/mnist_train_api

Expose Evaluator API

eaba2e2

Complete a very simple mnist demo.

409a577

Merge branch 'feature/fix_param_hidden_in_pydp2' into feature/mnist_t…

06dc66b

…rain_api

Add AverageOptimizer, Add save parameter

680dd92

Add gitignore

5bca268

reyoung and others added 8 commits December 22, 2016 07:45

Test GPU

f06b64f

Merge branch 'feature/mnist_train_api' of github.com:reyoung/Paddle i…

65e957c

…nto feature/mnist_train_api

Test on GPU

5a68584

Merge branch 'develop' of github.com:baidu/Paddle into feature/mnist_…

16ea66e

…train_api

Add comments.

3a80272

add config_parser in trainer_config_helpers to seperate trainer config

843b63b

add config_parser_utils

763a30f

Remove unnecessary import in api_train.py

9b41b08

reyoung mentioned this pull request Dec 22, 2016

add config_parser in trainer_config_helpers to seperate trainer config reyoung/Paddle#1

Closed

reyoung changed the title ~~Feature/mnist train api~~ [Done] Feature/mnist train api Dec 23, 2016

reyoung mentioned this pull request Dec 23, 2016

Paddle Python API demo [Basically Done] #1005

Closed

3 tasks

jacquesqiao mentioned this pull request Dec 26, 2016

[C-API] expose remote updater api #1012

Closed

jacquesqiao approved these changes Dec 26, 2016

View reviewed changes

Merge branch 'develop' of github.com:baidu/Paddle into feature/mnist_…

f8e4b0b

…train_api

jacquesqiao assigned reyoung Dec 26, 2016

Merge branch 'develop' of github.com:baidu/Paddle into feature/mnist_…

eefe5a7

…train_api

wangkuiyi approved these changes Dec 27, 2016

View reviewed changes

Fix merge errors.

eca4592

jacquesqiao merged commit 22454ed into PaddlePaddle:develop Dec 27, 2016

reyoung deleted the feature/mnist_train_api branch December 27, 2016 05:55

zhhsplendid pushed a commit to zhhsplendid/Paddle that referenced this pull request Sep 25, 2019

Cherrypick2dev (PaddlePaddle#971)

4854d9c

* bugfix (PaddlePaddle#967) * bugfix * Update release_note_cn.rst * bug fix (PaddlePaddle#968)

wangxicoding pushed a commit to wangxicoding/Paddle that referenced this pull request Dec 9, 2021

Fix rnn example. (PaddlePaddle#971)

98e3564

1. typo fix: stm_hidden_size -> lstm_hidden_size. 2. import fix: add missing imports in export_model.py 3. nit: remove unused imports.

lizexu123 pushed a commit to lizexu123/Paddle that referenced this pull request Feb 23, 2024

fix qat docs (PaddlePaddle#971)

7b49bed

		@@ -0,0 +1,16 @@
		from paddle.trainer_config_helpers import *

		settings(learning_rate=1e-4, learning_method=AdamOptimizer(), batch_size=1000)

Conversation

reyoung commented Dec 20, 2016

Uh oh!

jacquesqiao commented Dec 20, 2016

Uh oh!

Choose a reason for hiding this comment

Uh oh!

jacquesqiao Dec 20, 2016 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

wangkuiyi commented Dec 20, 2016 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

jacquesqiao commented Dec 21, 2016

Uh oh!

wangkuiyi commented Dec 21, 2016

Uh oh!

reyoung left a comment

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

reyoung commented Dec 22, 2016 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

jacquesqiao commented Dec 23, 2016

Uh oh!

reyoung commented Dec 23, 2016

Uh oh!

jacquesqiao left a comment • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

wangkuiyi commented Dec 27, 2016

Uh oh!

jacquesqiao Dec 20, 2016 •

edited

Loading

wangkuiyi commented Dec 20, 2016 •

edited

Loading

reyoung commented Dec 22, 2016 •

edited

Loading

jacquesqiao left a comment •

edited

Loading

reyoung commented Dec 27, 2016 •

edited

Loading