content/blog/pragmatism-over-perfection.md
Lines changed: 3 additions & 11 deletions
@@ -8,19 +8,15 @@ weight: 1
 ---
 
 We as Engineers often chase perfection.
-
 It fuels our curiosity, sharpens our skills, and makes us feel good about the things we build. But at the same time, it can be a double-edged sword, sometimes slowing us down or distracting us from what actually matters: **delivering business value**.
-
 That’s where pragmatism comes in.
 
 This is a story about me getting that lesson reinforced with a simple task I was working on: KYC Image Tagging.
 
 ## The Problem
 
 I had worked with the ML team to develop two models for tagging KYC signature images as **valid** or **invalid** — one based on a CNN, the other on a Decision Tree.
-
 Once trained, we ran both models on a dataset of around **1.2 lakh images**. They disagreed on about **17%** of them — and the only way to figure out which model was better was to manually tag those images and compare.
-
 The ops team was ready to help. All we needed now was a simple interface for them to actually do the tagging.
 
 So I got to work exploring image tagging tools.
@@ -64,19 +60,18 @@ It ticked all our boxes:
 
 My manager signed off, and we presented the options to the stakeholders. They were happy that we could use an existing system to get the job done.
 So, we used Databricks as a tagging tool.
-
 It was in no way the "right" tool for the job, but it worked, and that made it the best one.
 And now it was time to implement this pragmatic solution.
 
-> What is databricks?
+> What is Databricks?
 >
-> Databricks is anallin one platform for analysts, and engineers to manipulate, process and use data, read more [here](https://www.databricks.com/data-intelligence?scid=7018Y000001f8FIQAY&utm_medium=paid+search&utm_source=google&utm_campaign=20782149301&utm_adgroup=152953302702&utm_content=microsite&utm_offer=data-intelligence&utm_ad=724408738477&utm_term=what%20is%20databricks&gad_source=1&gclid=Cj0KCQjw2ZfABhDBARIsAHFTxGwAa41AMcCUzaTbsL60svmAaD4LReAsmqlwm_SMoJYbKgzcDWwEoGAaAi4wEALw_wcB).
+> Databricks is an-all-in one platform for analysts, and engineers to manipulate, process and use data, read more [here](https://www.databricks.com/data-intelligence?scid=7018Y000001f8FIQAY&utm_medium=paid+search&utm_source=google&utm_campaign=20782149301&utm_adgroup=152953302702&utm_content=microsite&utm_offer=data-intelligence&utm_ad=724408738477&utm_term=what%20is%20databricks&gad_source=1&gclid=Cj0KCQjw2ZfABhDBARIsAHFTxGwAa41AMcCUzaTbsL60svmAaD4LReAsmqlwm_SMoJYbKgzcDWwEoGAaAi4wEALw_wcB).
 
 ## Getting it up and Running
 
 I onboarded the Ops team onto Databricks and assigned them the right roles, which was quick and easy since I already had admin access.
-
 Then, I created a simple notebook for them.
+
 Here is what the notebook did:
 
 1. Fetched an image path at random from the table we had where the manual tag field was Null.
@@ -94,13 +89,10 @@ And after a short walkthrough, the Ops team was off tagging the images.
 ## The Outcome
 
 Using existing tools enabled us to get started quickly.
-
 It did introduce a few extra steps for the ops team, like running the notebook cells repeatedly, as compared to the right tools. However, the overhead was minimal, within acceptable limits, and the Ops team found the process simple.
 
 The tagging was completed within **3 weeks**, which gave us the clarity we needed on the model performances.
-
 If I had used the “right” tools, the setup alone would have taken about a week.
-
 In the end, this approach saved us **time, effort, complexity and additional costs** while keeping the stakeholders happy.