zhhsplendid
diff --git a/doc/fluid/api/index_en.rst b/doc/fluid/api/index_en.rst
Lines changed: 1 addition & 1 deletion
diff --git a/doc/fluid/flags/cudnn_cn.rst b/doc/fluid/flags/cudnn_cn.rst
Lines changed: 71 additions & 0 deletions
diff --git a/doc/fluid/flags/cudnn_en.rst b/doc/fluid/flags/cudnn_en.rst
Lines changed: 71 additions & 0 deletions
diff --git a/doc/fluid/flags/data_cn.rst b/doc/fluid/flags/data_cn.rst
Lines changed: 46 additions & 0 deletions
diff --git a/doc/fluid/flags/data_en.rst b/doc/fluid/flags/data_en.rst
Lines changed: 45 additions & 0 deletions
diff --git a/doc/fluid/flags/debug_cn.rst b/doc/fluid/flags/debug_cn.rst
Lines changed: 82 additions & 0 deletions
@@ -5,7 +5,7 @@ API Reference
 ..  toctree::
     :maxdepth: 1
 
-    ../flags_en.rst
+    
     ../api_guides/index_en.rst
     fluid.rst
     average.rst
 
@@ -0,0 +1,71 @@
+
+cudnn
+==================
+
+
+conv_workspace_size_limit
+*******************************************
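+(since 0.13.0)
+
+The workspace size limit, in MB, used when choosing cuDNN convolution algorithms. cuDNN's internal functions pick the fastest algorithm that fits within this memory limit. A larger workspace usually allows faster algorithms to be chosen, but it also increases memory usage significantly. Users need to trade off memory against speed.
+
+Values accepted
+---------------
+Uint64. The default value is 4096, that is, a 4 GB workspace.
+
+Example
+-------
+FLAGS_conv_workspace_size_limit=1024 - sets the workspace size limit for choosing cuDNN convolution algorithms to 1024 MB.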
+
+
+cudnn_batchnorm_spatial_persistent
+*******************************************
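+(since 1.4.0)
+
+Indicates whether to use the new batch normalization mode CUDNN_BATCHNORM_SPATIAL_PERSISTENT in batchnorm.
+
+Values accepted
+---------------
+Bool. The default value is False.
+
+Example
+-------
+FLAGS_cudnn_batchnorm_spatial_persistent=True - enables the CUDNN_BATCHNORM_SPATIAL_PERSISTENT mode.
+
+Note
+-------
+This mode can be faster in some tasks because an optimized path is selected for the CUDNN_DATA_FLOAT and CUDNN_DATA_HALF data types. It is set to False by default because this mode may use scaled atomic integer reduction, which can cause numerical overflow for some input data ranges.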
+
+
+cudnn_deterministic
+*******************************************
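+(since 0.13.0)
+
+cuDNN offers several algorithms for the same operation, and some of them, such as certain convolution algorithms, give non-deterministic results. This flag is used for debugging. It indicates whether to choose the deterministic functions in cuDNN.
+
+Values accepted
+---------------
+Bool. The default value is False.
+
+Example
+-------
+FLAGS_cudnn_deterministic=True - chooses the deterministic functions in cuDNN.
+
+Note
+-------
+This flag currently takes effect in the cuDNN convolution and pooling Operators. Deterministic algorithms may be slower, so this flag is generally used for debugging.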
+
+
+cudnn_exhaustive_search
+*******************************************
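+(since 1.2.0)
+
+Indicates whether to use exhaustive search to choose convolution algorithms. cuDNN has two search methods: heuristic search and exhaustive search. Exhaustive search tries all cuDNN algorithms and chooses the fastest one. This method is very time-consuming, so the chosen algorithm is cached for the given layer specification. Once the layer specification changes (for example, batch size or feature map size), the search runs again.
+
+Values accepted
+---------------
+Bool. The default value is False.
+
+Example
+-------
+FLAGS_cudnn_exhaustive_search=True - uses exhaustive search to choose convolution algorithms.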
@@ -0,0 +1,71 @@
+==================
+cudnn
+==================
+
+
+conv_workspace_size_limit
+*******************************************
+(since 0.13.0)
+
+The workspace size limit, in MB, used when choosing cuDNN convolution algorithms. cuDNN's internal functions pick the fastest algorithm that fits within this memory limit. A larger workspace usually allows faster algorithms to be chosen, but it also increases memory usage significantly. Users need to trade off memory against speed.
+
+Values accepted
+---------------
+Uint64. The default value is 4096, that is, a 4 GB workspace.
+
+Example
+-------
+FLAGS_conv_workspace_size_limit=1024 sets the workspace size limit for choosing cuDNN convolution algorithms to 1024 MB.
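As a rough illustration of how this flag might be set from a Python launcher, the sketch below writes the value into the process environment, which is the mechanism the `FLAGS_...=value` examples on this page imply. The helper name `set_conv_workspace_limit_mb` is hypothetical, and both the binary-megabyte conversion and the assumption that the flag is read from the environment when the framework starts are not confirmed by this page.

```python
import os

MB = 1024 * 1024  # bytes per megabyte, assuming binary megabytes (4096 MB = 4 GB)


def set_conv_workspace_limit_mb(limit_mb: int) -> int:
    """Hypothetical helper: export the flag and return the limit in bytes."""
    if limit_mb < 0:
        raise ValueError("the flag is unsigned, so the limit must be >= 0")
    # Environment values are always strings.
    os.environ["FLAGS_conv_workspace_size_limit"] = str(limit_mb)
    return limit_mb * MB


# Assumption: this runs before the framework reads its flags.
limit_bytes = set_conv_workspace_limit_mb(1024)
print(os.environ["FLAGS_conv_workspace_size_limit"], limit_bytes)
```

Returning the byte count makes it easy to compare the limit against the free device memory reported by whatever monitoring tool is in use.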
+
+
+cudnn_batchnorm_spatial_persistent
+*******************************************
+(since 1.4.0)
+
+Indicates whether to use the new batch normalization mode CUDNN_BATCHNORM_SPATIAL_PERSISTENT in batchnorm.
+
+Values accepted
+---------------
+Bool. The default value is False.
+
+Example
+-------
+FLAGS_cudnn_batchnorm_spatial_persistent=True will enable the CUDNN_BATCHNORM_SPATIAL_PERSISTENT mode.
+
+Note
+-------
+This mode can be faster in some tasks because an optimized path is selected for the CUDNN_DATA_FLOAT and CUDNN_DATA_HALF data types. It is set to False by default because this mode may use scaled atomic integer reduction, which can cause numerical overflow for some input data ranges.
+
+
+cudnn_deterministic
+*******************************************
+(since 0.13.0)
+
+cuDNN offers several algorithms for the same operation, and some of them, such as certain convolution algorithms, give non-deterministic results. This flag is used for debugging. It indicates whether to choose the deterministic functions in cuDNN.
+
+Values accepted
+---------------
+Bool. The default value is False.
+
+Example
+-------
+FLAGS_cudnn_deterministic=True will choose the deterministic functions in cuDNN.
+
+Note
+-------
+This flag currently takes effect in the cuDNN convolution and pooling operators. Deterministic algorithms may be slower, so this flag is generally used for debugging.
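Because deterministic algorithms may be slower, one option is to enable the flag for a single debugging run instead of globally. The sketch below passes the flag to a child process through its environment and leaves the parent environment untouched. The helper `run_with_flags` is hypothetical, and the child command here only echoes the variable; in practice it would be the training script.

```python
import os
import subprocess
import sys


def run_with_flags(argv, flags):
    """Hypothetical helper: run argv with FLAGS_<name> set only for that child."""
    env = dict(os.environ)  # copy, so the parent environment is not modified
    for name, value in flags.items():
        env["FLAGS_" + name] = str(value)
    return subprocess.run(argv, env=env, capture_output=True, text=True, check=True)


# Stand-in for a training script: print the flag as the child process sees it.
child = [sys.executable, "-c",
         "import os; print(os.environ['FLAGS_cudnn_deterministic'])"]
result = run_with_flags(child, {"cudnn_deterministic": True})
seen = result.stdout.strip()
print(seen)
```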
+
+
+cudnn_exhaustive_search
+*******************************************
+(since 1.2.0)
+
+Indicates whether to use exhaustive search to choose convolution algorithms. cuDNN has two search methods: heuristic search and exhaustive search. Exhaustive search tries all cuDNN algorithms and chooses the fastest one. This method is time-consuming, so the chosen algorithm is cached for the given layer specification. Once the layer specification changes (for example, batch size or feature map size), the search runs again.
+
+Values accepted
+---------------
+Bool. The default value is False.
+
+Example
+-------
+FLAGS_cudnn_exhaustive_search=True will use exhaustive search to choose convolution algorithms.
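The search-then-cache behaviour described for `cudnn_exhaustive_search` can be pictured with a toy model. This is not cuDNN's implementation: the class `AlgoCache`, the algorithm names, and the cost functions below are all made up for illustration. The point is only that each distinct layer specification triggers one full search, and repeated specifications are served from the cache.

```python
from typing import Callable, Dict, Tuple

# (batch size, feature-map height, feature-map width)
LayerSpec = Tuple[int, int, int]


class AlgoCache:
    """Toy model: benchmark every candidate once per layer spec, then reuse."""

    def __init__(self, candidates: Dict[str, Callable[[LayerSpec], float]]):
        self.candidates = candidates  # algorithm name -> cost function
        self.cache: Dict[LayerSpec, str] = {}
        self.searches = 0  # number of exhaustive searches performed

    def choose(self, spec: LayerSpec) -> str:
        if spec not in self.cache:
            self.searches += 1
            # Exhaustive search: evaluate all candidates, keep the cheapest.
            self.cache[spec] = min(
                self.candidates, key=lambda name: self.candidates[name](spec)
            )
        return self.cache[spec]


# Made-up cost models standing in for measured run times.
candidates = {
    "direct": lambda s: 1.0 * s[0] * s[1] * s[2],
    "fft": lambda s: 5000.0 + 0.2 * s[0] * s[1] * s[2],
}
picker = AlgoCache(candidates)
first = picker.choose((8, 16, 16))    # new spec: triggers a search
again = picker.choose((8, 16, 16))    # same spec: served from the cache
larger = picker.choose((64, 32, 32))  # changed spec: searches again
print(first, again, larger, picker.searches)
```

In this model a workload whose batch size changes every step would search on every step, which is one reason the search cost matters in practice.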
@@ -0,0 +1,46 @@
+
+numerical computation
+=====================
+
+
+enable_cublas_tensor_op_math
+*******************************************
+(since 1.2.0)
+
+This flag indicates whether to use Tensor Cores, which may cost some precision.
+
+Values accepted
+---------------
+Bool. The default value is False.
+
+Example
+-------
+FLAGS_enable_cublas_tensor_op_math=True - uses Tensor Cores.
+
+
+use_mkldnn
+*******************************************
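+(since 0.13.0)
+
+Selects whether to run with the Intel MKL-DNN (https://github.com/intel/mkl-dnn) library during inference or training.
+Intel(R) Math Kernel Library for Deep Neural Networks (Intel(R) MKL-DNN) is an open-source performance library for deep-learning applications. The library accelerates deep-learning applications and frameworks on Intel(R) architecture. Intel MKL-DNN contains vectorized and threaded building blocks that you can use to implement deep neural networks (DNN) with C and C++ interfaces.
+
+Values accepted
+---------------
+Bool. The default value is False.
+
+Example
+-------
+FLAGS_use_mkldnn=True - enables running with MKL-DNN.
+
+Note
+-------
+FLAGS_use_mkldnn is only used for Python training and inference scripts. To enable MKL-DNN in CAPI, set the option -DWITH_MKLDNN=ON.
+
+Intel MKL-DNN supports Intel 64 architecture and compatible architectures. The library is optimized for systems based on:
+
+- Intel Atom(R) processors with Intel SSE4.1 support
+- 4th, 5th, 6th, 7th, and 8th generation Intel(R) Core(TM) processors
+- Intel(R) Xeon(R) processor E3, E5, and E7 families (formerly Sandy Bridge, Ivy Bridge, Haswell, and Broadwell)
+- Intel(R) Xeon(R) Scalable processors (formerly Skylake and Cascade Lake)
+- Intel(R) Xeon Phi(TM) processors (formerly Knights Landing and Knights Mill)
+- compatible processors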
@@ -0,0 +1,45 @@
+
+data processing
+==================
+
+enable_cublas_tensor_op_math
+*******************************************
+(since 1.2.0)
+
+This flag indicates whether to use Tensor Cores, which may cost some precision.
+
+Values accepted
+---------------
+Bool. The default value is False.
+
+Example
+-------
+FLAGS_enable_cublas_tensor_op_math=True will use Tensor Cores.
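To give a feel for the kind of precision loss mentioned above, the sketch below rounds a value to IEEE half precision and back. It assumes the loss comes from operands being stored in a 16-bit floating-point format; this page does not say which format is used, so treat that as an assumption. The helper `to_half` is hypothetical and uses only the standard library.

```python
import struct


def to_half(x: float) -> float:
    """Round x to IEEE 754 half precision and convert back to a Python float."""
    # The "e" format code packs a 16-bit float (available since Python 3.6).
    return struct.unpack("<e", struct.pack("<e", x))[0]


x = 0.1
x_half = to_half(x)
error = abs(x_half - x)  # absolute rounding error introduced by the 16-bit format
print(x_half, error)
```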
+
+
+use_mkldnn
+*******************************************
+(since 0.13.0)
+
+Selects whether to run with the Intel MKL-DNN (https://github.com/intel/mkl-dnn) library during inference or training.
+
+Intel(R) Math Kernel Library for Deep Neural Networks (Intel(R) MKL-DNN) is an open-source performance library for deep-learning applications. The library accelerates deep-learning applications and frameworks on Intel(R) architecture. Intel MKL-DNN contains vectorized and threaded building blocks that you can use to implement deep neural networks (DNN) with C and C++ interfaces.
+
+Values accepted
+---------------
+Bool. The default value is False.
+
+Example
+-------
+FLAGS_use_mkldnn=True will enable running with MKL-DNN support.
+
+Note
+-------
+FLAGS_use_mkldnn is only used for Python training and inference scripts. To enable MKL-DNN in CAPI, set the build option -DWITH_MKLDNN=ON.
+
+Intel MKL-DNN supports Intel 64 architecture and compatible architectures. The library is optimized for systems based on:
+
+- Intel Atom(R) processors with Intel SSE4.1 support
+- 4th, 5th, 6th, 7th, and 8th generation Intel(R) Core(TM) processors
+- Intel(R) Xeon(R) processor E3, E5, and E7 families (formerly Sandy Bridge, Ivy Bridge, Haswell, and Broadwell)
+- Intel(R) Xeon(R) Scalable processors (formerly Skylake and Cascade Lake)
+- Intel(R) Xeon Phi(TM) processors (formerly Knights Landing and Knights Mill)
+- compatible processors
@@ -0,0 +1,82 @@
+
+debug
+==================
+
+
+check_nan_inf
+********************
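+(since 0.13.0)
+
+Used for debugging. It checks whether the result of an Operator contains NaN or Inf.
+
+Values accepted
+---------------
+Bool. The default value is False.
+
+Example
+-------
+FLAGS_check_nan_inf=True - checks whether the result of an Operator contains NaN or Inf.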
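The kind of check this flag describes can be sketched in plain Python. This is only an illustration of what "contains NaN or Inf" means for a list of output values, not the framework's implementation; the function `find_non_finite` and the sample values are hypothetical.

```python
import math
from typing import Iterable, List, Tuple


def find_non_finite(values: Iterable[float]) -> List[Tuple[int, float]]:
    """Return (index, value) pairs for every NaN or Inf entry."""
    return [
        (i, v) for i, v in enumerate(values) if math.isnan(v) or math.isinf(v)
    ]


# Made-up operator output containing one Inf and one NaN.
outputs = [0.5, float("inf"), -1.25, float("nan")]
bad = find_non_finite(outputs)
bad_indices = [i for i, _ in bad]
print(bad_indices)
```

Reporting the index as well as the value helps locate which element first went wrong, which is usually the question when debugging a diverging model.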
+
+
+cpu_deterministic
+*******************************************
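+(since 0.15.0)
+
+This flag is used for debugging. It indicates whether to make computation results deterministic on the CPU side. In some cases, summing in a different order gives a different result; for example, the result of `a+b+c+d` may differ from the result of `c+a+b+d`.
+
+Values accepted
+---------------
+Bool. The default value is False.
+
+Example
+-------
+FLAGS_cpu_deterministic=True - makes computation results deterministic on the CPU side.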
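The order dependence of floating-point summation mentioned for `cpu_deterministic` is easy to reproduce with double-precision values. The numbers below are chosen for illustration only: next to `1e16`, adjacent doubles are 2 apart, so adding `1.0` to it is lost to rounding.

```python
a, b, c, d = 1e16, 1.0, -1e16, 1.0

# a + b rounds back to 1e16, so b is lost before c cancels a.
left_to_right = ((a + b) + c) + d

# c + a cancels exactly first, so both b and d survive.
reordered = ((c + a) + b) + d

print(left_to_right, reordered)
```

The two sums differ (1.0 versus 2.0) even though they add the same four numbers, which is why a parallel reduction with a varying order can give run-to-run differences.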
+
+
+enable_rpc_profiler
+*******************************************
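+(since 1.0.0)
+
+Indicates whether to enable the RPC profiler.
+
+Values accepted
+----------------
+Bool. The default value is False.
+
+Example
+-------
+FLAGS_enable_rpc_profiler=True - enables the RPC profiler and records the timeline in the profiler file.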
+
+
+multiple_of_cupti_buffer_size
+*******************************************
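+(since 1.4.0)
+
+This flag is used for profiling. It is the multiple applied to the CUPTI device buffer size. If the program crashes during profiling, or an error occurs when loading the timeline file in chrome://tracing, try increasing this value.
+
+Values accepted
+---------------
+Int32. The default value is 1.
+
+Example
+-------
+FLAGS_multiple_of_cupti_buffer_size=1 - sets the multiple of the CUPTI device buffer size to 1.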
+
+
+reader_queue_speed_test_mode
+*******************************************
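+(since 1.1.0)
+
+Sets the pyreader data queue to test mode. In test mode, pyreader caches some data and the executor then reads the cached data, so the reader does not become a bottleneck.
+
+Values accepted
+---------------
+Bool. The default value is False.
+
+Example
+-------
+FLAGS_reader_queue_speed_test_mode=True - enables the pyreader test mode.
+
+Note
+-------
+This flag only takes effect when py_reader is used.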