PROJECT: Process_Decisions_Optimization
TITLE: Data Sources & Feature Engineering
--------------------------------------------------
OVERVIEW:
This dataset integrates process parameters, raw material and supplier data, operational conditions, setup readiness, and human factors from an industrial manufacturing environment.
The objective is to classify scrap risk levels and derive clear decision rules that improve process stability and quality outcomes.
--------------------------------------------------
DATA SOURCES AND VARIABLE ORIGIN:
1. PROCESS PARAMETERS (PRODUCTION SYSTEM)
Variables:
- press_speed_spm
- ambient_temp_c
Taken from:
Machine control systems and environmental monitoring sensors.
Reason:
Press speed (strokes per minute) and ambient temperature directly affect process stability and material behavior, and therefore scrap generation.
--------------------------------------------------
2. RAW MATERIAL QUALITY (SUPPLIER / INCOMING INSPECTION)
Variables:
- raw_material_hardness_hrb
- critical_supplier_lot
Taken from:
Incoming material inspection records and supplier traceability systems.
Reason:
Material hardness (Rockwell B scale) and supplier lot variability can significantly impact process performance and defect formation.
--------------------------------------------------
3. HUMAN FACTOR (OPERATIONS)
Variables:
- operator_experience_yrs
Taken from:
Operator assignment and experience records.
Reason:
Operator experience influences setup accuracy, adjustments, and response to process deviations.
--------------------------------------------------
4. OPERATIONAL CONDITIONS
Variables:
- shift
- recent_model_change
Taken from:
Production scheduling systems and engineering change records.
Reason:
Shift conditions (Day/Night) and recent product changes introduce variability and instability in the process.
--------------------------------------------------
5. PROCESS CONTROL / READINESS
Variables:
- setup_checklist_complete
Taken from:
Pre-production checklist and setup validation records.
Reason:
Incomplete setups are a major source of defects and process inconsistency.
--------------------------------------------------
6. TARGET VARIABLE (QUALITY RISK CLASSIFICATION)
Variables:
- scrap_risk
Taken from:
Derived classification based on historical scrap performance and process conditions.
Reason:
Categorizes risk levels (e.g., Low / Medium / High) for decision-making purposes.
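The nine variables above can be sketched as a single typed record. This is a minimal illustration, not the project's actual schema: the class name ProcessRecord, the Python types, the treatment of critical_supplier_lot, recent_model_change, and setup_checklist_complete as boolean flags, and the example values are all assumptions. Only the variable names, the Day/Night shift values, and the Low / Medium / High risk levels come from this document.

```python
from dataclasses import dataclass

SHIFTS = ("Day", "Night")
RISK_LEVELS = ("Low", "Medium", "High")


@dataclass(frozen=True)
class ProcessRecord:
    """One production observation (hypothetical schema; types are assumed)."""
    # 1. Process parameters
    press_speed_spm: float
    ambient_temp_c: float
    # 2. Raw material quality
    raw_material_hardness_hrb: float
    critical_supplier_lot: bool       # assumed to be a yes/no flag
    # 3. Human factor
    operator_experience_yrs: float
    # 4. Operational conditions
    shift: str                        # "Day" or "Night"
    recent_model_change: bool         # assumed to be a yes/no flag
    # 5. Process control / readiness
    setup_checklist_complete: bool    # assumed to be a yes/no flag
    # 6. Target variable
    scrap_risk: str                   # "Low", "Medium", or "High"

    def __post_init__(self):
        # Reject values outside the categories named in this document.
        if self.shift not in SHIFTS:
            raise ValueError(f"shift must be one of {SHIFTS}, got {self.shift!r}")
        if self.scrap_risk not in RISK_LEVELS:
            raise ValueError(
                f"scrap_risk must be one of {RISK_LEVELS}, got {self.scrap_risk!r}"
            )
        # Basic sanity checks on numeric fields.
        if self.press_speed_spm <= 0:
            raise ValueError("press_speed_spm must be positive")
        if self.operator_experience_yrs < 0:
            raise ValueError("operator_experience_yrs cannot be negative")


# Made-up values for illustration only; not taken from the dataset.
example = ProcessRecord(
    press_speed_spm=45.0,
    ambient_temp_c=24.5,
    raw_material_hardness_hrb=62.0,
    critical_supplier_lot=False,
    operator_experience_yrs=3.0,
    shift="Night",
    recent_model_change=True,
    setup_checklist_complete=True,
    scrap_risk="Medium",
)
```

Validating category values at construction time is one way to catch mislabeled shifts or risk levels before they reach a model.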
--------------------------------------------------
VARIABLE-BY-VARIABLE SUMMARY:
press_speed_spm
Taken from:
Machine control system
Why it was included:
Higher speeds can increase instability and defect probability.

raw_material_hardness_hrb
Taken from:
Material inspection records
Why it was included:
Material properties affect formability and defect generation.

operator_experience_yrs
Taken from:
HR / operator records
Why it was included:
Experience improves setup quality and process control.

ambient_temp_c
Taken from:
Environmental monitoring
Why it was included:
Temperature affects material and machine behavior.

critical_supplier_lot
Taken from:
Supplier traceability system
Why it was included:
Certain lots may introduce variability or defects.

shift
Taken from:
Production schedule
Why it was included:
Different shifts operate under different conditions.

recent_model_change
Taken from:
Engineering change logs
Why it was included:
Recent changes increase process instability.

setup_checklist_complete
Taken from:
Setup validation records
Why it was included:
Incomplete setups are a key driver of defects.

scrap_risk
Taken from:
Quality classification system
Why it was included:
Defines the outcome for predictive modeling.
--------------------------------------------------
FEATURE ENGINEERING LOGIC:
The dataset combines machine parameters, material quality, human factors, and process readiness to capture conditions that lead to scrap.
Each variable represents a controllable or observable factor within the production system.
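Before a tree-based model can use these variables, the categorical and yes/no fields need a numeric form. The sketch below shows one possible encoding. The function name encode_features, the column order, the Day=0 / Night=1 mapping, and the assumption that the flag columns arrive as booleans are illustrative choices, not something specified in this document.

```python
from typing import Dict, List

# Fixed column order so every row encodes to the same layout (assumed order).
FEATURE_ORDER = [
    "press_speed_spm",
    "ambient_temp_c",
    "raw_material_hardness_hrb",
    "critical_supplier_lot",
    "operator_experience_yrs",
    "shift",
    "recent_model_change",
    "setup_checklist_complete",
]

# Hypothetical mapping; any consistent 0/1 assignment would work for a tree.
SHIFT_CODES = {"Day": 0.0, "Night": 1.0}


def encode_features(row: Dict[str, object]) -> List[float]:
    """Turn one raw row into a numeric feature vector (target excluded)."""
    encoded = []
    for name in FEATURE_ORDER:
        value = row[name]
        if name == "shift":
            encoded.append(SHIFT_CODES[value])   # KeyError on unknown shift
        elif isinstance(value, bool):
            encoded.append(1.0 if value else 0.0)
        else:
            encoded.append(float(value))
    return encoded


# Made-up row for illustration only; not taken from the dataset.
sample_row = {
    "press_speed_spm": 45,
    "ambient_temp_c": 24.5,
    "raw_material_hardness_hrb": 62,
    "critical_supplier_lot": True,
    "operator_experience_yrs": 3,
    "shift": "Night",
    "recent_model_change": False,
    "setup_checklist_complete": True,
}
sample_vector = encode_features(sample_row)
```

The target scrap_risk is deliberately left out of the vector so that features and label stay separate.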
--------------------------------------------------
BUSINESS LOGIC:
Scrap is not random.
It is driven by process conditions, material variability, operator capability, and setup discipline.
Decision trees translate this complexity into clear, actionable rules.
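To make the idea of "actionable rules" concrete, the sketch below walks a small hand-written tree and returns both the predicted risk level and the rule that produced it. The tree structure, the split features chosen, and every threshold are hypothetical placeholders for illustration; they are not results fitted to this dataset.

```python
# Hypothetical tree: each internal node tests "feature <= threshold".
# "left" is taken when the test is true, "right" otherwise.
EXAMPLE_TREE = {
    "feature": "setup_checklist_complete",   # 0 = incomplete, 1 = complete
    "threshold": 0.5,
    "left": {"label": "High"},
    "right": {
        "feature": "press_speed_spm",
        "threshold": 50,                      # placeholder threshold
        "left": {"label": "Low"},
        "right": {
            "feature": "operator_experience_yrs",
            "threshold": 2,                   # placeholder threshold
            "left": {"label": "High"},
            "right": {"label": "Medium"},
        },
    },
}


def classify_with_rule(tree, row):
    """Follow the tree for one row; return (label, human-readable rule)."""
    node = tree
    conditions = []
    while "label" not in node:
        feature, threshold = node["feature"], node["threshold"]
        if row[feature] <= threshold:
            conditions.append(f"{feature} <= {threshold}")
            node = node["left"]
        else:
            conditions.append(f"{feature} > {threshold}")
            node = node["right"]
    return node["label"], " AND ".join(conditions)


# Made-up row for illustration only.
row = {
    "setup_checklist_complete": 1,
    "press_speed_spm": 60,
    "operator_experience_yrs": 1,
}
label, rule = classify_with_rule(EXAMPLE_TREE, row)
print(f"IF {rule} THEN scrap_risk = {label}")
```

Each root-to-leaf path reads as one IF ... THEN rule, which is the form a shop-floor checklist or work instruction can adopt directly.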
--------------------------------------------------
GOAL:
Classify scrap risk to:
- Support decision-making on the shop floor
- Prevent defects before production
- Improve standard work and setup discipline
- Reduce variability and scrap
--------------------------------------------------
NOTE:
Data originates from an industrial production environment, integrating machine data, material inspection, operational records, and setup validation processes.