Commit 4ff65d3

Answer the PRR questionnaire
1 parent d03b93b commit 4ff65d3

1 file changed: +31 -19 lines changed

keps/sig-network/2595-expanded-dns-config/README.md

Lines changed: 31 additions & 19 deletions
@@ -185,7 +185,8 @@ overage.

 - **What happens if we reenable the feature if it was previously rolled back?**

-It should continue to work as expected.
+New objects with expanded DNS configuration will be accepted by the apiserver
+and new Pods with expanded configuration will be created by the kubelet.

 - **Are there any tests for feature enablement/disablement?**

@@ -195,84 +196,95 @@ We will add unit tests.

 - **How can a rollout fail? Can it impact already running workloads?**

-N/A
+If a kubelet starts with invalid `resolvConf`, new workloads will fail DNS
+lookups.

 - **What specific metrics should inform a rollback?**

-N/A
+If new workloads start to fail DNS lookups due to a corrupted resolv.conf, or
+due to older resolver libraries, that would be an indication to rollback the
+enablement.

 - **Were upgrade and rollback tested? Was the upgrade->downgrade->upgrade path tested?**

-N/A
+We will do test.

 - **Is the rollout accompanied by any deprecations and/or removals of features, APIs, fields of API types, flags, etc.?**

-N/A
+No

 ### Monitoring Requirements

 - **How can an operator determine if the feature is in use by workloads?**

-N/A
+There is no metric to indicate the enablement. The operator has to check if
+there are objects or DNS resolver configuration files with expanded
+configuration to determine if the feature is in use.

 - **What are the SLIs (Service Level Indicators) an operator can use to determine the health of the service?**
 - [ ] Metrics
 - Metric name:
 - [Optional] Aggregation method:
 - Components exposing the metric:
-- [ ] Other (treat as last resort)
-- Details:
-
-N/A
+- [x] Other (treat as last resort)
+- Success of DNS lookups

 - **What are the reasonable SLOs (Service Level Objectives) for the above SLIs?**

-N/A
+DNS lookups should not fail as before the feature was enabled.

 - **Are there any missing metrics that would be useful to have to improve observability of this feature?**

-N/A
+TBD

 ### Dependencies

 - **Does this feature depend on any specific services running in the cluster?**

-N/A
+No

 ### Scalability

 - **Will enabling / using this feature result in any new API calls?**

-N/A
+No

 - **Will enabling / using this feature result in introducing new API types?**

-N/A
+No

 - **Will enabling / using this feature result in any new calls to the cloud provider?**

-N/A
+No

 - **Will enabling / using this feature result in increasing size or count of the existing API objects?**

-N/A
+The sum of the lengths of `PodSpec.DNSConfig.Searches` can be increased to 2048.

 - **Will enabling / using this feature result in increasing time taken by any operations covered by existing SLIs/SLOs?**

-N/A
+The DNS lookup time can be increased, but it will be negligible.

 - **Will enabling / using this feature result in non-negligible increase of resource usage (CPU, RAM, disk, IO, ...) in any components?**

-N/A
+No

 ### Troubleshooting

 - **How does this feature react if the API server and/or etcd is unavailable?**

+N/A
+
 - **What are other known failure modes?**

+N/A
+
 - **What steps should be taken if SLOs are not being met to determine the problem?**

+If DNS lookups fail, you can check error messages. And then, validate the
+kubelet's `resolvConf` if it is corrupted or use newer DNS resolver libraries if
+they are too old.
+
 ## Implementation History

 - 2021-03-26: [Initial
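
The troubleshooting answer in this diff suggests validating the kubelet's `resolvConf`, and the scalability answer mentions a 2048 limit on the summed lengths of `PodSpec.DNSConfig.Searches`. As a rough illustration of what such a check might look like, here is a minimal sketch that summarises the `search` directive of a resolv.conf-style text. It is not part of the commit and not kubelet code: the function names are hypothetical, and measuring the total as the paths joined by single spaces is an assumption about how the limit is counted.

```python
# Sketch only, not kubelet code. The 2048 figure comes from the questionnaire
# answer; joining paths with single spaces to measure the total is an assumption.
ASSUMED_MAX_SEARCH_CHARS = 2048


def search_list_summary(resolv_conf_text):
    """Return (paths, total_chars) for the last `search` line in the text."""
    paths = []
    for raw in resolv_conf_text.splitlines():
        line = raw.strip()
        # Skip blank lines and comments ("#" or ";" at the start of a line).
        if not line or line.startswith("#") or line.startswith(";"):
            continue
        fields = line.split()
        if fields[0] == "search":
            # When several `search` lines exist, the last one wins.
            paths = fields[1:]
    total_chars = len(" ".join(paths))
    return paths, total_chars


def exceeds_limit(resolv_conf_text, max_chars=ASSUMED_MAX_SEARCH_CHARS):
    """Report whether the search list is longer than the assumed limit."""
    _, total = search_list_summary(resolv_conf_text)
    return total > max_chars
```

An operator could read the file named by the kubelet's `resolvConf` setting, pass its contents to `search_list_summary`, and compare the result with what the resolver library on the node is known to support.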
