<Typography variant="h5">Complementary tables of practices</Typography>
<br/>
<Typography variant="body1" align="justify">The following tables complement the taxonomy presented in the previous chart. They are organized by the ML pipeline stages proposed by <a href="https://www.microsoft.com/en-us/research/uploads/prod/2019/03/amershi-icse-2019_Software_Engineering_for_Machine_Learning.pdf">Amershi et al. (2019)</a> (<b><em>Model Requirement</em></b>, <b><em>Data Collection</em></b>, <b><em>Data Cleaning</em></b>, <b><em>Feature Engineering</em></b>, <b><em>Data Labeling</em></b>, <b><em>Model Training</em></b>, <b><em>Model Evaluation</em></b>, <b><em>Model Deployment</em></b>, and <b><em>Model Monitoring</em></b>) plus an extra stage called <b><em>Cross-cutting</em></b>. For each stage, a brief explanation is given and a table with the respective practices is presented. In each table, an identifier is given per practice (this ID matches the ID used in the article). In addition to the ID, the taxonomy's categories are presented along with the descriptions of the practices. Furthermore, we present extra resources: the post(s) related to each practice, external URL(s) related to the post, and extra URLs that help to understand the practices and the ML terminology/concepts associated with them. Kindly note that below each table you will find an explanation of the acronyms used in it.</Typography>
<br/>
<Typography variant="h6" align="left">Model Requirement (MR)</Typography>
<Typography variant="body1" align="justify">In this stage, designers decide the functionalities that should be included in an ML system, their usefulness for new or existing products, and the most appropriate type of ML model for the expected system features <a href="https://www.microsoft.com/en-us/research/uploads/prod/2019/03/amershi-icse-2019_Software_Engineering_for_Machine_Learning.pdf">(Amershi et al., 2019)</a>. Four ML best practices were identified for this stage.</Typography>
<Typography variant="h6" align="left">Data Collection (DC)</Typography>
<Typography variant="body1" align="justify">This second stage encompasses looking for, collecting, and integrating available datasets <a href="https://www.microsoft.com/en-us/research/uploads/prod/2019/03/amershi-icse-2019_Software_Engineering_for_Machine_Learning.pdf">(Amershi et al., 2019)</a>. Datasets can be created from scratch, or existing datasets can be used to train models in a transfer learning fashion. Both scenarios are widely used when creating ML systems. In this stage, seven validated practices were identified. Bear in mind that the identified practices relate to characteristics that the collected data has to meet during/after this process, not to the collection process itself.</Typography>
<Typography variant="h6" align="left">Data Cleaning (DCL)</Typography>
<Typography variant="body1" align="justify">This is the stage with the second-largest number of identified practices (33 in total). In general, this stage involves removing inaccurate or noisy records from a dataset <a href="https://www.microsoft.com/en-us/research/uploads/prod/2019/03/amershi-icse-2019_Software_Engineering_for_Machine_Learning.pdf">(Amershi et al., 2019)</a>. In this case, we present the practices aggregated into three subcategories: <b><em>Exploratory data analysis (EDA)</em></b>, <b><em>Wrangling</em></b>, and <b><em>Data</em></b>.</Typography>
<br/>
<Typography variant="subtitle1" align="left" mb={1}><em>Exploratory data analysis (EDA)</em></Typography>
<Typography variant="h6" align="left">Data Labeling (DL)</Typography>
<Typography variant="body1" align="justify">This phase, in which a ground-truth label is assigned to each sample/record of the dataset <a href="https://www.microsoft.com/en-us/research/uploads/prod/2019/03/amershi-icse-2019_Software_Engineering_for_Machine_Learning.pdf">(Amershi et al., 2019)</a>, is not always required, since some ML approaches do not need it. In particular, ground truth is needed for projects that use supervised or semi-supervised learning, but not for projects that use unsupervised learning. For instance, if snippets of code are to be classified as vulnerable or not, then each snippet should be assigned a label indicating whether it is vulnerable. Two practices were identified in this stage: the first (<em>DL1</em>) was validated by all the experts, while the second (<em>DL2</em>) was validated by three of them.</Typography>
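The ground-truth idea above can be made concrete with a minimal sketch in pure Python; the code snippets and field names below are hypothetical illustrations of the vulnerable-code example, not data from the article.

```python
# Minimal sketch, assuming a binary vulnerability-labeling task as in the
# example above; the snippets and field names are hypothetical.
labeled_dataset = [
    {"snippet": "strcpy(buf, user_input);", "vulnerable": 1},        # unsafe copy
    {"snippet": "strncpy(buf, user_input, len - 1);", "vulnerable": 0},
]

# The ground-truth labels are what a supervised learner trains against.
labels = [record["vulnerable"] for record in labeled_dataset]
```

An unsupervised approach would consume only the `snippet` fields and ignore the labels entirely.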
<Typography variant="h6" align="left">Feature Engineering (FE)</Typography>
<Typography variant="body1" align="justify">This stage of an ML pipeline involves all the activities performed to extract and select informative features (i.e., characteristics/attributes that are useful or relevant) for machine learning models <a href="https://www.microsoft.com/en-us/research/uploads/prod/2019/03/amershi-icse-2019_Software_Engineering_for_Machine_Learning.pdf">(Amershi et al., 2019)</a>. In this stage, 11 validated practices were identified: four of them (<em>FE1</em> - <em>FE4</em>) were validated by all four experts, and the remaining seven (<em>FE5</em> - <em>FE11</em>) were validated by three experts.</Typography>
<Typography variant="h6" align="left">Model Training (MT)</Typography>
<Typography variant="body1" align="justify">This is the ML pipeline stage with the largest number of validated practices, 47 in total. In this stage, machine learning models are trained and tuned using the features selected in the Feature Engineering stage and, if applicable, the labels created/selected during the Data Labeling stage <a href="https://www.microsoft.com/en-us/research/uploads/prod/2019/03/amershi-icse-2019_Software_Engineering_for_Machine_Learning.pdf">(Amershi et al., 2019)</a>. To facilitate the reading of this subsection, the practices are grouped into two subcategories: a Learning phase and a Validation phase. In each subcategory, we present first all the practices that were validated by all four experts, followed by those that were validated by three experts. Note that validation here refers to the usage of a validation set in order to optimize hyper-parameters; it is not related to testing an already trained and tuned model.</Typography>
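The validation-versus-testing distinction above can be sketched in pure Python: a held-out validation set is used only to pick a hyper-parameter (here, k of a toy k-NN classifier). The 1-D dataset and the model are hypothetical illustrations, not material from the article.

```python
# Minimal sketch, assuming a toy 1-D binary-classification dataset: tune the
# hyper-parameter k of a k-nearest-neighbours classifier on a validation set.
def knn_predict(train, x, k):
    # train: list of (feature, label); classify x by majority vote of k nearest.
    neighbors = sorted(train, key=lambda p: abs(p[0] - x))[:k]
    labels = [lbl for _, lbl in neighbors]
    return max(set(labels), key=labels.count)

def accuracy(train, data, k):
    return sum(knn_predict(train, x, k) == y for x, y in data) / len(data)

# Toy data: label 1 roughly when the feature >= 4, with one noisy point (9, 0).
data = [(0, 0), (1, 0), (2, 0), (3, 0), (4, 1), (5, 1), (6, 1), (7, 1), (8, 1), (9, 0)]
train, val = data[::2], data[1::2]  # simple even/odd split into train/validation

# Hyper-parameter optimization: pick the k that maximizes validation accuracy.
best_k = max([1, 3, 5], key=lambda k: accuracy(train, val, k))
```

Testing the final model would then use a third, untouched test set, which is exactly the separation the paragraph insists on.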
<Typography variant="h6" align="left">Model Evaluation (ME)</Typography>
<Typography variant="body1" align="justify">In the model evaluation stage, trained and tuned models are tested. For instance, engineers evaluate the output models on test or safeguard datasets by using pre-defined metrics. In particular cases, for critical domains (e.g., safety-critical applications from the medical domain), this stage involves human evaluation <a href="https://www.microsoft.com/en-us/research/uploads/prod/2019/03/amershi-icse-2019_Software_Engineering_for_Machine_Learning.pdf">(Amershi et al., 2019)</a>. For this stage, eight practices related to model evaluation were identified. However, some other practices that involve or are associated with model evaluation/testing were mentioned before as part of other stages. All the experts validated two practices (<em>ME1</em> - <em>ME2</em>), and six (<em>ME3</em> - <em>ME8</em>) were validated by three experts.</Typography>
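Evaluating on a held-out set with pre-defined metrics can be sketched in a few lines of pure Python; the label vectors below are hypothetical, and precision/recall stand in for whatever metrics a project pre-defines.

```python
# Minimal sketch, assuming binary labels: compute pre-defined metrics
# (precision and recall) for a model's predictions on a held-out test set.
def precision_recall(y_true, y_pred):
    tp = sum(t == 1 and p == 1 for t, p in zip(y_true, y_pred))  # true positives
    fp = sum(t == 0 and p == 1 for t, p in zip(y_true, y_pred))  # false positives
    fn = sum(t == 1 and p == 0 for t, p in zip(y_true, y_pred))  # false negatives
    return tp / (tp + fp), tp / (tp + fn)

y_true = [1, 1, 0, 0, 1]  # hypothetical ground truth of the test set
y_pred = [1, 0, 0, 1, 1]  # hypothetical model predictions
precision, recall = precision_recall(y_true, y_pred)
```

Fixing the metric (and its acceptance threshold) before training is what makes the metric "pre-defined" rather than chosen to flatter the model.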
<Typography variant="h6" align="left">Model Deployment (MD)</Typography>
<Typography variant="body1" align="justify">In this stage, the inference code (i.e., the code used to run the trained model on new inputs) of the previously trained, tuned, and tested model is deployed on a production setup <a href="https://www.microsoft.com/en-us/research/uploads/prod/2019/03/amershi-icse-2019_Software_Engineering_for_Machine_Learning.pdf">(Amershi et al., 2019)</a>. Two practices were identified in this stage, both validated by the four experts.</Typography>
<Typography variant="h6" align="left">Model Monitoring (MM)</Typography>
<Typography variant="body1" align="justify">In this last stage of the ML pipeline, models are continuously monitored for possible errors while being executed in the real world <a href="https://www.microsoft.com/en-us/research/uploads/prod/2019/03/amershi-icse-2019_Software_Engineering_for_Machine_Learning.pdf">(Amershi et al., 2019)</a>. For this stage, two practices related to data deviations were validated by all the experts.</Typography>
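A data deviation check of the kind those practices describe can be sketched in pure Python: compare a summary statistic of a production feature against its training-time value. The threshold and feature values below are hypothetical; a real monitor would use a proper statistical test (e.g., over distributions, not just means).

```python
# Minimal sketch, assuming hypothetical feature values and threshold: flag a
# data deviation when a production feature's mean drifts from training time.
def mean(xs):
    return sum(xs) / len(xs)

def drift_alert(train_values, live_values, max_shift=0.5):
    # Alert when the absolute shift in means exceeds max_shift.
    return abs(mean(live_values) - mean(train_values)) > max_shift

training_feature = [1.0, 1.2, 0.9, 1.1]    # distribution seen at training time
production_feature = [2.0, 2.1, 1.9, 2.2]  # distribution observed in production
```

Here `drift_alert(training_feature, production_feature)` fires because the mean has shifted by about 1.0, well past the (hypothetical) 0.5 threshold.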