|
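4 | 4 | A data transformation process that converts raw input into the input (features) the model requires, used to keep the results of offline and online sample generation consistent. |
5 | 5 | Feature generation can also be understood as feature transformation, applied to a single feature or to several features together. We provide many types of FG operators to carry out these transformations. |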
6 | 6 |
|
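7 | | -Feature generation only covers transformations that are needed in both offline and online sample generation. If a transformation applies only in the offline stage, it does not need to be defined as an FG operation. |
| 7 | +Feature generation only covers transformations that are needed in both offline and online sample generation. If a transformation applies only in the offline stage, it does not need to be defined as an FG operation. |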
8 | 8 |
|
9 | 9 | The position of the FG module in the recommender system architecture is shown in the figure below: |
10 | 10 |
|
@@ -55,21 +55,21 @@ The position of the FG module in the recommender system architecture is shown in the figure below: |
55 | 55 |
|
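56 | 56 | The feature transformation operators supported by FG do not map strictly one-to-one onto the features (`Feature Column`) supported by EasyRec; the table below gives an approximate correspondence: |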
57 | 57 |
|
58 | | -| FG operator | EasyRec Feature Column | |
59 | | -|:-------------|:------------------------------------| |
60 | | -| id_feature | IdFeature or TagFeature | |
61 | | -| raw_feature | RawFeature | |
62 | | -| expr_feature | RawFeature | |
63 | | -| combo_feature | IdFeature or TagFeature | |
64 | | -| lookup_feature | RawFeature or IdFeature or TagFeature | |
65 | | -| match_feature | RawFeature or IdFeature or TagFeature | |
66 | | -| overlap_feature | RawFeature | |
67 | | -| sequence_feature | SequenceFeature or TagFeature | |
68 | | -| bm25_feature | RawFeature | |
69 | | -| kv_dot_product | RawFeature | |
70 | | -| tokenize_feature | TagFeature | |
71 | | -| text_normalizer | IdFeature | |
72 | | -| regex_replace_feature | IdFeature | |
| 58 | +| FG operator | EasyRec Feature Column | |
| 59 | +| :-------------------- | :---------------------------------- | |
| 60 | +| id_feature | IdFeature or TagFeature | |
| 61 | +| raw_feature | RawFeature | |
| 62 | +| expr_feature | RawFeature | |
| 63 | +| combo_feature | IdFeature or TagFeature | |
| 64 | +| lookup_feature | RawFeature or IdFeature or TagFeature | |
| 65 | +| match_feature | RawFeature or IdFeature or TagFeature | |
| 66 | +| overlap_feature | RawFeature | |
| 67 | +| sequence_feature | SequenceFeature or TagFeature | |
| 68 | +| bm25_feature | RawFeature | |
| 69 | +| kv_dot_product | RawFeature | |
| 70 | +| tokenize_feature | TagFeature | |
| 71 | +| text_normalizer | IdFeature | |
| 72 | +| regex_replace_feature | IdFeature | |
73 | 73 |
|
74 | 74 | Note: **The output of FG is passed to the EasyRec model; the two are connected in series**. |
75 | 75 |
|
@@ -122,8 +122,8 @@ pai -name easy_rec_ext |
122 | 122 | If it is not, change it with -Dedit_config_json='{"export_config.multi_placeholder":true}' |
123 | 123 |
|
124 | 124 | - If feature_config.features.max_partitions is set, add the following option to reset it: |
125 | | - - -Dedit_config_json='{"feature_config.features\[:\].max_partitions":1}', which can give better performance |
126 | 125 |
|
| 126 | + - -Dedit_config_json='{"feature_config.features\[:\].max_partitions":1}', which can give better performance |
127 | 127 |
|
128 | 128 | #### Feature selection |
129 | 129 |
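The two `-Dedit_config_json` overrides shown in the hunk above are plain JSON objects wrapped in shell quotes, so they can be generated rather than typed by hand. The following is a minimal sketch, not part of EasyRec: the helper name `edit_config_flag` is hypothetical, and merging both overrides into a single JSON object is an assumption that should be checked against the EasyRec version in use. Note that `\[:\]` in the Markdown source is only Markdown escaping; the key itself is written with `[:]`.

```python
import json
import shlex


def edit_config_flag(overrides):
    """Render config-path -> value overrides as a -Dedit_config_json flag.

    Hypothetical helper: it only builds the string, it does not call EasyRec.
    """
    # Compact separators reproduce the spacing used in the documented examples.
    payload = json.dumps(overrides, separators=(",", ":"))
    # shlex.quote wraps the JSON in single quotes so the shell passes it intact.
    return "-Dedit_config_json=" + shlex.quote(payload)


flag = edit_config_flag({
    "export_config.multi_placeholder": True,
    # Assumption: both overrides may be combined in one JSON object.
    "feature_config.features[:].max_partitions": 1,
})
print(flag)
```

The printed flag can be pasted into the `pai -name easy_rec_ext` command line in place of the hand-written options.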
|
|