| 1 | +--- |
| 2 | +status: accepted |
| 3 | +author: @toddbaert |
| 4 | +created: 2025-05-16 |
| 5 | +updated: -- |
| 6 | +--- |
| 7 | + |
| 8 | +# Adoption of Cucumber/Gherkin for `flagd` Testing Suite |
| 9 | + |
| 10 | +This decision document outlines the rationale behind adopting the Cucumber/Gherkin testing framework for the `flagd` project’s testing suite. The goal is to establish a clear, maintainable, and language-agnostic approach for writing integration and behavior-driven tests. |
| 11 | + |
| 12 | +By leveraging Gherkin’s natural language syntax and Cucumber’s mature ecosystem, we aim to improve test clarity and accessibility across teams, enabling both developers and non-developers to contribute to test case development and validation. |
| 13 | + |
| 14 | +## Background |
| 15 | + |
| 16 | +`flagd` is an open-source feature flagging engine that forms a core part of the OpenFeature ecosystem. As such, it has many clients (providers) written in multiple languages, and it needs a robust, readable, and accessible testing framework that allows for scalable behavior-driven testing. |
| 17 | + |
| 18 | +Previously, test cases for `flagd` providers were written in language-specific test frameworks, which created fragmentation and limited contributions from engineers who weren’t familiar with the language in question. Furthermore, validating consistent feature flag behavior across multiple SDKs and environments became increasingly important as adoption grew and in-process evaluation was implemented. |
| 19 | + |
| 20 | +To address this, the engineering team investigated frameworks that would enable: |
| 21 | + |
| 22 | +- Behavior-driven development (BDD) to validate consistent flag evaluation behavior, configuration, and provider lifecycle (connection, etc.). |
| 23 | +- Broad cross-language support to integrate with multiple SDKs and tools. |
| 24 | +- Ease of use for writing, understanding, enhancing, and maintaining tests. |
| 25 | + |
| 26 | +After evaluating our options and experimenting with prototypes, we adopted Cucumber with Gherkin syntax for our testing strategy. |
| 27 | + |
| 28 | +## Requirements |
| 29 | + |
| 30 | +- Must be supported across a wide variety of programming languages. |
| 31 | +- Must offer mature tooling and documentation. |
| 32 | +- Must enable the writing of easily understandable, high-level test cases. |
| 33 | +- Must be open source. |
| 34 | +- Should support automated integration in CI pipelines. |
| 35 | +- Should support parameterized and reusable test definitions. |
| 36 | + |
| 37 | +## Considered Options |
| 38 | + |
| 39 | +- Adoption of Cucumber/Gherkin e2e testing framework |
| 40 | +- No cross-implementation e2e testing framework (rely on unit tests) |
| 41 | +- Custom e2e testing framework, perhaps based on CSV or other tabular input/output assertions |
| 42 | + |
| 43 | +## Proposal |
| 44 | + |
| 45 | +We adopted the Cucumber testing framework, using Gherkin syntax to define feature specifications and test behaviors. Gherkin offers a structured and readable DSL (domain-specific language) that enables concise expression of feature behaviors in plain English, making test scenarios accessible to both technical and non-technical contributors. |
| 46 | + |
| 47 | +We use Cucumber’s tooling in combination with language bindings (e.g., Go, JavaScript, Python) to execute these scenarios across different environments and SDKs. Step definitions are implemented using the idiomatic tools of each language, while test scenarios remain shared and version-controlled. |
| 48 | + |
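As a rough illustration of how shared scenario text binds to per-language step definitions, the sketch below implements a toy step registry in plain Python. It is not the API of Cucumber or of any Cucumber binding; the helper names (`step`, `run_scenario`), the step wording, and the flag key `my-bool-flag` are all assumptions made up for this example.

```python
import re

# Hypothetical mini step registry; a real suite would use a Cucumber binding.
STEPS = []


def step(pattern):
    """Register a function as the handler for step text matching `pattern`."""
    def decorator(func):
        STEPS.append((re.compile(pattern), func))
        return func
    return decorator


# Illustrative flag store standing in for a provider under test (assumption).
FLAGS = {"my-bool-flag": True}


@step(r'a flag with key "(?P<key>[^"]+)"')
def given_flag(ctx, key):
    ctx["key"] = key


@step(r"the flag is evaluated with default (?P<default>true|false)")
def when_evaluated(ctx, default):
    # Fall back to the default when the flag is unknown.
    ctx["value"] = FLAGS.get(ctx["key"], default == "true")


@step(r"the resolved value should be (?P<expected>true|false)")
def then_value(ctx, expected):
    assert ctx["value"] == (expected == "true"), ctx


# Scenario text is shared; only the step functions above are language-specific.
SCENARIO = """
Given a flag with key "my-bool-flag"
When the flag is evaluated with default false
Then the resolved value should be true
"""


def run_scenario(text):
    """Run each step line against the registry and return the final context."""
    ctx = {}
    for line in text.strip().splitlines():
        # Drop the leading Gherkin keyword (Given/When/Then/And/But).
        _keyword, _, body = line.strip().partition(" ")
        for pattern, func in STEPS:
            match = pattern.fullmatch(body)
            if match:
                func(ctx, **match.groupdict())
                break
        else:
            raise LookupError(f"no step definition for: {body!r}")
    return ctx


result = run_scenario(SCENARIO)
```

In this sketch the scenario text never mentions Python; a provider in another language would supply its own three step functions and reuse the same scenario unchanged.
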
| 49 | +### API changes |
| 50 | + |
| 51 | +N/A – this decision does not introduce API-level changes but applies to test infrastructure and development workflow. |
| 52 | + |
| 53 | +### Consequences |
| 54 | + |
| 55 | +#### Pros |
| 56 | + |
| 57 | +- Test scenarios are readable and accessible to a broad range of contributors. |
| 58 | +- Cucumber and Gherkin are supported in most major programming languages. |
| 59 | +- Tests are partially decoupled from the underlying implementation language. |
| 60 | +- Parameterized and reusable test definitions mean new validations and assertions can often be added in providers without writing any code. |
| 61 | + |
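The point about parameterized definitions can be sketched as follows. Gherkin's `Scenario Outline` substitutes each row of an `Examples` table into `<placeholder>` slots in the step text, so a new row yields a new test without new step code. The Python below mimics only that substitution; `expand_outline`, the step wording, and the flag keys are hypothetical names chosen for illustration, not part of any Cucumber binding.

```python
import re


def expand_outline(step_templates, examples):
    """Substitute each Examples row into the <placeholder> slots of the steps."""
    header, *rows = examples
    scenarios = []
    for row in rows:
        values = dict(zip(header, row))
        scenarios.append(
            [re.sub(r"<(\w+)>", lambda m: values[m.group(1)], template)
             for template in step_templates]
        )
    return scenarios


# Step templates as they might appear under a Scenario Outline (assumed wording).
OUTLINE_STEPS = [
    'Given a flag with key "<key>"',
    "When the flag is evaluated with default <default>",
    "Then the resolved value should be <expected>",
]

# First tuple is the Examples header; each later tuple is one concrete case.
EXAMPLES = [
    ("key", "default", "expected"),
    ("my-bool-flag", "false", "true"),
    ("missing-flag", "false", "false"),
]

scenarios = expand_outline(OUTLINE_STEPS, EXAMPLES)
```

Appending one more tuple to `EXAMPLES` produces one more concrete scenario, which is the sense in which validations can be added without touching step code.
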
| 62 | +#### Cons |
| 63 | + |
| 64 | +- Adding a new framework introduces some complexity and a learning curve. |
| 65 | +- In some cases/runtimes, debugging failed Gherkin tests can be more difficult than debugging traditional unit tests. |
| 66 | + |
| 67 | +### Timeline |
| 68 | + |
| 69 | +N/A - this is a retrospective document; the timeline was not recorded. |
| 70 | + |
| 71 | +### Open questions |
| 72 | + |
| 73 | +- Should we enforce Gherkin for all providers? |
| 74 | + |
| 75 | +## More Information |
| 76 | + |
| 77 | +- [flagd Testbed Repository](https://github.com/open-feature/flagd-testbed) |
| 78 | +- [Cucumber Documentation](https://cucumber.io/docs/) |
| 79 | +- [Gherkin Syntax Guide](https://cucumber.io/docs/gherkin/) |
| 80 | +- [flagd GitHub Repository](https://github.com/open-feature/flagd) |
| 81 | +- [OpenFeature Project Overview](https://openfeature.dev/) |