docs/mlpaths/A1_Intro_to_DataScience_and_ML.md
+21 -21 (21 additions & 21 deletions)
@@ -1,4 +1,4 @@
-[[**Back to ML Paths Home!**](https://ua-datalab.github.io/mlpaths_grids/)]
+[[:arrow_left:**Back to ML Paths Home!**](https://ua-datalab.github.io/mlpaths_grids/)]
 
 ## Introduction to Data Science and Machine Learning 🧭🤖
 
@@ -178,9 +178,9 @@ This quiz helps you self-assess your understanding. Answers can be verified by r
 b) A field of study that gives computers the ability to learn without being explicitly programmed.<br>
 c) The process of using computers to analyze large datasets only.<br>
 d) The development of software that can reason and solve complex problems like humans.<br>
-*(Answer:(1) :eyes:*
+*(Answer:(1) :eyes:)*
 { .annotate }
-1. Answer is: <b>b</>
+1. Answer is: <b>b</b>
 
 
 2. Consider the following Python code using Pandas:
@@ -196,9 +196,9 @@ This quiz helps you self-assess your understanding. Answers can be verified by r
 b) 0<br>
 c) Approximately 2.67<br>
 d) 6<br>
-*(Answer:(2) :eyes:*
+*(Answer:(2) :eyes:)*
 { .annotate }
-2. Answer is: <b>c</>
+2. Answer is: <b>c</b>
 
 
 3. Which Python library is primarily used for creating statistical visualizations like heatmaps and pair plots with concise syntax?<br>
@@ -207,9 +207,9 @@ This quiz helps you self-assess your understanding. Answers can be verified by r
 c) Pandas<br>
 d) Scikit-learn<br>
 *(Answer: b)*
-*(Answer:(3) :eyes:*
+*(Answer:(3) :eyes:)*
 { .annotate }
-3. Answer is: <b>b</>
+3. Answer is: <b>b</b>
 
 
 4. In a typical classification problem, what is the role of the 'target variable'?<br>
@@ -218,69 +218,69 @@ This quiz helps you self-assess your understanding. Answers can be verified by r
 c) It's a numerical value the model tries to estimate.<br>
 d) It's a technique for reducing the number of features.<br>
 *(Answer: b)*
-*(Answer:(4) :eyes:*
+*(Answer:(4) :eyes:)*
 { .annotate }
-4. Answer is: <b>b</>
+4. Answer is: <b>b</b>
 
 5. What is the primary purpose of `train_test_split` in Scikit-learn?<br>
 a) To combine two different datasets into one.<br>
 b) To separate features from the target variable within a single dataset.<br>
 c) To divide a dataset into one part for training the model and another, unseen part for evaluating its performance.<br>
 d) To visualize the distribution of data.<br>
 *(Answer: c)*
-*(Answer:(5) :eyes:*
+*(Answer:(5) :eyes:)*
 { .annotate }
-5. Answer is: <b>c</>
+5. Answer is: <b>c</b>
 
 6. If you want to create a scatter plot in Python to visualize the relationship between 'Height' and 'Weight' columns in a Pandas DataFrame `df`, which line of code is most appropriate using Seaborn?<br>
 a) `sns.histplot(data=df, x='Height', y='Weight')`<br>
 b) `sns.boxplot(data=df, x='Height', y='Weight')`<br>
 c) `sns.scatterplot(data=df, x='Height', y='Weight')`<br>
 d) `df.plot(kind='scatter', x='Height', y='Weight')` (This is Pandas plotting, not Seaborn directly)<br>
 *(Answer: c)*
-*(Answer:(6) :eyes:*
+*(Answer:(6) :eyes:)*
 { .annotate }
-6. Answer is: <b>c</>
+6. Answer is: <b>c</b>
 
 7. You have loaded a dataset into a Pandas DataFrame called `sales_df`. How would you display the first 10 rows of this DataFrame?<br>
 a) `sales_df.show(10)`<br>
 b) `sales_df.display_head(10)`<br>
 c) `sales_df.head(10)`<br>
 d) `sales_df.first(10)`<br>
 *(Answer: c)*
-*(Answer:(7) :eyes:*
+*(Answer:(7) :eyes:)*
 { .annotate }
-7. Answer is: <b>c</>
+7. Answer is: <b>c</b>
 
 8. When you encounter a Python error message that you don't understand while working in a Jupyter Notebook, how can an LLM assist you most effectively?<br>
 a) By automatically fixing the code in your notebook.<br>
 b) By explaining what the error message typically means, suggesting possible causes, and providing examples of how to fix similar errors.<br>
 c) By providing a link to the full Python documentation without context.<br>
 d) By advising you to restart your computer.<br>
 *(Answer: b)*
-*(Answer:(8) :eyes:*
+*(Answer:(8) :eyes:)*
 { .annotate }
-8. Answer is: <b>b</>
+8. Answer is: <b>b</b>
 
 9. What does the `.info()` method in Pandas primarily provide for a DataFrame?<br>
 a) A statistical summary of numerical columns (mean, std, min, max).<br>
 b) The first five rows of the DataFrame.<br>
 c) A concise summary of the DataFrame, including data types of columns and non-null counts.<br>
 d) The correlation matrix of numerical columns.<br>
 *(Answer: c)*
-*(Answer:(8) :eyes:*
+*(Answer:(8) :eyes:)*
 { .annotate }
-8. Answer is: <b>b</>
+8. Answer is: <b>b</b>
 
 10. Which of these tasks falls under the 'Data Cleaning/Preparation' stage of the data science workflow?<br>
 a) Defining business objectives.<br>
 b) Training a machine learning model.<br>
 c) Handling missing values and transforming variables.<br>
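
The Pandas and Scikit-learn items in this quiz (questions 5, 7, and 9) can be tried out with a short sketch. It is an illustration only and not part of the diff: the contents and column names of `sales_df` are invented here, and the hold-out split uses pandas' own `sample` and `drop` as a stand-in for the idea behind `train_test_split`, not scikit-learn's function itself.

```python
import io

import pandas as pd

# Hypothetical data: the quiz only names `sales_df`, so these columns are invented.
sales_df = pd.DataFrame({
    "order_id": list(range(1, 13)),
    "amount": [10.0, 12.5, None, 8.0, 15.0, 9.5, 11.0, None, 14.0, 7.5, 13.0, 10.5],
})

# Question 7: head(n) returns the first n rows as a new DataFrame.
first_ten = sales_df.head(10)
print(first_ten.shape)  # (10, 2)

# Question 9: info() writes a summary (dtypes, non-null counts) to a stream.
# Passing a buffer lets us keep the text instead of only printing it.
buffer = io.StringIO()
sales_df.info(buf=buffer)
summary = buffer.getvalue()
print(summary)

# Question 5 (the idea only): hold out part of the rows for evaluation.
# sample() picks a random 75% for training; drop() keeps the remaining rows.
train_df = sales_df.sample(frac=0.75, random_state=0)
test_df = sales_df.drop(train_df.index)
print(len(train_df), len(test_df))  # 9 3
```

With scikit-learn installed, `train_test_split` performs the same kind of division in one call and can also split features and target arrays together.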