Why Cucumber Exists — and the Problem It Actually Solves
Somewhere in almost every project I’ve worked on, there’s been a moment where a developer, a tester, and a business analyst are staring at the same bug — and none of them can agree on what the feature was supposed to do in the first place. Requirements get lost in email threads. Test cases get written against assumptions that nobody actually verified with the business.
Cucumber was built for exactly that breakdown. Not as a testing tool, really — more as a communication tool that happens to run tests.
The idea is straightforward. Instead of writing test logic that only developers can read, you write scenarios in plain English that anyone on the project can understand. A product owner can read it. A QA engineer writes the automation behind it. A developer implements the feature against it. Everyone is working from the same description.
That shared language is called Gherkin. And once you see a well-written Cucumber scenario for the first time, it’s hard to go back to test code that only makes sense to the person who wrote it.
This guide covers everything you need to go from zero to working Cucumber tests:
- What Cucumber is and the BDD methodology behind it
- How the three-part architecture actually works
- Every Gherkin keyword with honest explanations
- Data-driven testing with Scenario Outline
- Hooks — the feature most beginners skip until they need it badly
- Real code examples, common mistakes, and when NOT to use Cucumber
What Is the Cucumber Framework?
Cucumber is a BDD testing tool that bridges the gap between business requirements and executable test code. The bridge is Gherkin — a structured plain-English syntax where test scenarios read like sentences rather than code.
Compare these two ways of describing the same test:
Technical-only version (Selenium without Cucumber):
driver.findElement(By.id("username")).sendKeys("admin");
driver.findElement(By.id("password")).sendKeys("pass123");
driver.findElement(By.id("loginBtn")).click();
Assert.assertEquals(driver.getTitle(), "Dashboard");
Cucumber version (Gherkin feature file):
Given the user is on the login page
When the user enters valid username and password
Then the user should be redirected to the dashboard
Both test the same thing. The difference is who can read it. The first version is unreadable to anyone without a Java background. The second version? Your product owner can review it. Your client can sign off on it. Your new tester on day one can understand what’s being tested without reading the automation code.
That’s the real value of Cucumber. Not that it makes testing easier — it often makes it more work. The payoff is shared understanding and tests that double as living documentation.
What Is BDD — Behavior-Driven Development?
BDD is a development approach, not a tool. Cucumber is just the most popular tool that implements it.
The core idea: define how the application should behave from the user’s perspective before writing any code. Not ‘what classes do we need’ but ‘what should happen when a user does X?’ That shift in framing changes what gets built and how problems get caught.
In practice, BDD means three things:
Focus on behavior, not implementation. A test that says ‘user clicks button with id=submit’ is implementation-specific. A test that says ‘user submits the registration form’ is behavior-specific. The second one survives UI refactors. The first one breaks every time a developer renames an element.
Shared vocabulary between teams. Given-When-Then is a format both technical and non-technical people can use. It forces clarity. ‘Given the user is logged in’ — okay, what exactly does ‘logged in’ mean here? Answering that question early prevents bugs later.
Requirements as tests. A Cucumber feature file is simultaneously a requirement document and an executable test. It can’t drift apart from the code the way a Word document can, because it runs against the actual application.
Teams that do BDD well tend to have fewer misunderstandings between what was asked for and what was built. Teams that do it badly just end up with verbose test files. The tool isn’t the point — the conversations it forces are.
How Cucumber Works — The Three-Part Architecture
Every Cucumber setup has three components that need to work together. Miss one and nothing runs.
1. The Feature File
This is the plain-English layer. Feature files use the .feature extension and contain your scenarios written in Gherkin. No code here — just human-readable test cases.
Feature: User Login
Scenario: Successful login with valid credentials
Given the user is on the login page
When the user enters valid username and password
Then the user should be redirected to the dashboard
Feature files are meant to be written collaboratively. In teams doing BDD properly, a BA or product owner drafts the scenarios and developers/testers refine them. The test code comes later. The scenario comes first.
2. Step Definitions
Step definitions are where the plain-English steps get wired to actual automation code. Each line in your feature file maps to a method in a step definition file.
Example in JavaScript:
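const assert = require('assert');
const { Given, When, Then } = require('@cucumber/cucumber');
const { By } = require('selenium-webdriver');

// Assumes `driver` is a Selenium WebDriver instance created elsewhere (for example, in a Before hook)
Given('the user is on the login page', async function () {
  await driver.get('https://example.com/login');
});

When('the user enters valid username and password', async function () {
  await driver.findElement(By.id('username')).sendKeys('admin');
  await driver.findElement(By.id('password')).sendKeys('pass123');
});

Then('the user should be redirected to the dashboard', async function () {
  // getTitle() returns a promise, so it has to be awaited before comparing
  assert.equal(await driver.getTitle(), 'Dashboard');
});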
The matching between feature file steps and step definitions is done by text pattern: Cucumber reads the step text and finds the step definition whose pattern (a Cucumber Expression or a regular expression) matches it. Get the text wrong and Cucumber reports an ‘undefined step’ instead of running anything.
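To make the matching rule concrete, here is a toy model of a step registry in plain Node.js. It is only a sketch of the idea, not how Cucumber is implemented internally, and the names `registry`, `defineStep`, and `runStep` are made up for this illustration.

```javascript
// Toy model of step matching. Hypothetical names; not Cucumber's real internals.
const registry = [];

// Register a pattern (a regular expression) together with the code to run.
function defineStep(pattern, fn) {
  registry.push({ pattern, fn });
}

// Find the single definition whose pattern matches the step text and run it.
function runStep(text) {
  const matches = registry.filter(({ pattern }) => pattern.test(text));
  if (matches.length === 0) {
    throw new Error(`Undefined step: "${text}"`);
  }
  if (matches.length > 1) {
    throw new Error(`Ambiguous step: "${text}"`);
  }
  return matches[0].fn();
}

defineStep(/^the user is on the login page$/, () => 'opened login page');

console.log(runStep('the user is on the login page'));

try {
  // One word differs ("sign-in" instead of "login"), so nothing matches.
  runStep('the user is on the sign-in page');
} catch (err) {
  console.log(err.message);
}
```

The second call fails even though the wording is close, which is why step text in feature files and step definitions has to stay in sync.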
One thing to get right from the start: keep step definitions generic enough to be reusable. A step that says ‘the user is on the login page’ works for every scenario involving the login page. A step that says ‘admin user with role=superuser is on the login page at 9am’ is useless outside that one scenario.
3. The Runner File
The runner file tells Cucumber where to find everything and how to run it. In a Java project with JUnit 4 or TestNG, it’s typically a class annotated with @CucumberOptions (JUnit 5 projects configure the Cucumber JUnit Platform engine instead). In a JavaScript project, it’s configured via cucumber.js or package.json.
Minimum configuration it needs:
- Path to your .feature files
- Path to your step definitions
- Report format (html, json, pretty — usually all three)
The runner is also where you filter by tags — so you can run only @smoke scenarios, or exclude @wip ones that aren’t ready yet.
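As a rough illustration, a cucumber-js configuration file covering those three items plus a tag filter might look like the sketch below. The directory layout (`features/`, `features/step_definitions/`, `reports/`) and the tag expression are assumptions made for this example, so adjust them to your own project.

```javascript
// cucumber.js (sketch; every path below is an assumption, not a requirement)
const config = {
  default: {
    // Where the .feature files live
    paths: ['features/**/*.feature'],
    // Where the step definitions and hooks live
    require: ['features/step_definitions/**/*.js'],
    // Console output plus HTML and JSON reports written to files
    format: [
      'progress',
      'html:reports/cucumber-report.html',
      'json:reports/cucumber-report.json',
    ],
    // Run @smoke scenarios, but leave out anything tagged @wip
    tags: '@smoke and not @wip',
  },
};

module.exports = config;
```

Keeping the tag filter in a profile like this means a CI job and a local run can share the same definition of what counts as a smoke test.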
Gherkin Keywords — What Each One Actually Means
Gherkin has seven keywords you’ll use constantly. Knowing what each one signals — not just what it does — makes your scenarios much cleaner.
| Keyword | What It Signals in the Scenario |
| --- | --- |
| Feature | Names the functionality being tested |
| Scenario | One specific test case |
| Given | The precondition or starting state |
| When | The action the user takes |
| Then | The outcome you expect to see |
| And | Chains an additional step onto any Given/When/Then |
| But | Adds a negative/exception condition |
A few things worth knowing that tutorials usually gloss over:
Given/When/Then are interchangeable syntactically. Cucumber doesn’t enforce which keyword you use — you could write ‘When the user is on the login page’ and it would work. But don’t. The keywords communicate intent to the reader. Given sets context, When triggers an action, Then checks an outcome. Respect that.
And chains onto whatever came before it. If the previous step was a When, ‘And’ reads as another When. It’s just there so you don’t write ‘When… When… When…’ which looks awkward.
But is for negative conditions. ‘But the user should not see the admin panel.’ Use it sparingly — if you’re writing a lot of But steps, your scenario probably has too many assertions in it.
Scenario Outline — Data-Driven Testing Without the Repetition
Here’s a situation that comes up constantly: you need to test your login form with multiple credential combinations. Valid user, invalid password, locked account, empty fields. Writing a separate scenario for each one is copy-paste hell — four scenarios that are structurally identical, differing only in the input values.
Scenario Outline was built for this. You write the scenario once with placeholders, then supply the data in an Examples table:
Scenario Outline: Login with multiple credential types
Given the user is on the login page
When the user enters "<username>" and "<password>"
Then the login result should be "<status>"
Examples:
| username | password | status |
| admin | admin123 | success |
| user1 | wrongpwd | failed |
| locked | pass123 | locked |
Cucumber runs this scenario three times — once per row in the Examples table. Each run gets its own entry in the report. If row 2 fails, you can see exactly which input set caused it.
You can have multiple Examples tables under one Scenario Outline, which is useful when you want to separate positive cases from negative cases visually in the feature file. Doesn’t change how it runs — it’s just easier to read.
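One way to picture what happens to a Scenario Outline is as a template expansion: every Examples row produces one concrete scenario with the placeholders filled in. The plain Node.js sketch below mimics that idea for the login outline above. It is a toy model with a made-up helper name (`expandOutline`), not Cucumber’s actual implementation.

```javascript
// Toy model of Scenario Outline expansion (illustrative only).
const outlineSteps = [
  'Given the user is on the login page',
  'When the user enters "<username>" and "<password>"',
  'Then the login result should be "<status>"',
];

const examples = [
  { username: 'admin', password: 'admin123', status: 'success' },
  { username: 'user1', password: 'wrongpwd', status: 'failed' },
  { username: 'locked', password: 'pass123', status: 'locked' },
];

// Replace every <placeholder> in a step with the value from one Examples row.
// Placeholders with no matching column are left untouched.
function expandOutline(steps, rows) {
  return rows.map((row) =>
    steps.map((step) =>
      step.replace(/<([^>]+)>/g, (placeholder, name) =>
        name in row ? row[name] : placeholder
      )
    )
  );
}

const scenarios = expandOutline(outlineSteps, examples);
console.log(scenarios.length); // one concrete scenario per row
console.log(scenarios[1].join('\n'));
```

Because each expanded step carries its values in quotes, a single step definition that uses two `{string}` parameters can serve all three rows.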
Hooks in Cucumber — Setup and Teardown Done Right
Hooks are methods that fire before or after every scenario automatically. Most beginners ignore them until they’ve written the same browser setup code in 30 different step definitions and finally get tired of maintaining it.
The two you’ll use constantly:
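const { Before, After } = require('@cucumber/cucumber');
const { Builder } = require('selenium-webdriver');

let driver;

Before(async function () {
  // Runs before every scenario
  driver = await new Builder().forBrowser('chrome').build();
  console.log('Browser launched');
});

After(async function () {
  // Runs after every scenario; guard in case the browser never started
  if (driver) {
    await driver.quit();
    console.log('Browser closed');
  }
});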
What hooks are actually used for in production test frameworks:
- Browser launch and teardown: the most common use by far
- Screenshot capture on failure: in the After hook, check the scenario status and save a screenshot if it failed
- Test data setup: create a test user in the DB before the scenario, clean it up after
- Logging: open a log entry in Before, close and flush it in After
- Authentication: log in inside Before so individual scenarios don’t have to repeat the login steps
Cucumber also supports tagged hooks — you can make a hook only run for scenarios tagged with @database or @api. This lets you keep expensive setup code out of scenarios that don’t need it.
One gotcha: if your Before hook throws an exception, the scenario’s steps are skipped and the scenario is marked as failed, but the After hook still runs. Make sure your teardown code can handle a scenario that never properly started.
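The ordering behind that gotcha is easier to see in a small simulation. The sketch below is a toy scenario runner in plain Node.js, written only to mimic the behaviour described above. The `runScenario` function and the `world.browser` object are hypothetical and are not part of Cucumber’s API.

```javascript
// Toy scenario runner (hypothetical; it only mimics the ordering described above).
function runScenario({ before, steps, after }) {
  const world = {};
  const log = [];
  let status = 'passed';

  try {
    before(world);
    log.push('before: ok');
  } catch (err) {
    status = 'failed';
    log.push('before: failed');
  }

  for (const step of steps) {
    if (status !== 'passed') {
      log.push('step: skipped');
      continue;
    }
    try {
      step(world);
      log.push('step: ok');
    } catch (err) {
      status = 'failed';
      log.push('step: failed');
    }
  }

  // Teardown runs regardless of what happened earlier.
  after(world);
  log.push('after: ok');
  return { status, log };
}

const outcome = runScenario({
  before: () => {
    throw new Error('browser failed to start');
  },
  steps: [(world) => world.browser.open('/login')],
  // Guarded teardown: only close the browser if Before managed to create it.
  after: (world) => {
    if (world.browser) world.browser.close();
  },
});

console.log(outcome.status, outcome.log);
```

Without the `if (world.browser)` guard, the teardown itself would throw here and bury the original startup failure under a second error.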
Why Teams Actually Choose Cucumber
The reasons people give in presentations and the reasons teams actually stick with Cucumber are somewhat different. Here’s the honest version:
The Reasons That Actually Hold Up
Feature files become the real source of truth. When a BA writes a scenario, gets it reviewed by the dev and QA, and the automation is written to match it — that file is more reliable than any requirement document in Confluence. It can’t get stale because it runs against the live application.
Onboarding is faster. A new tester joining a project with well-written Cucumber scenarios can understand what the application does in hours, not weeks. The feature files read like documentation because they are documentation.
Step reuse adds up. Once you have 50 step definitions covering your core application flows, writing new scenarios becomes fast. The infrastructure already exists — you’re just assembling steps.
Integration is genuinely flexible. Cucumber works with Selenium, Cypress, Playwright, RestAssured, TestNG, and JUnit; the framework doesn’t care what’s underneath. The Gherkin layer stays the same even when you switch the automation tool behind it.
Where It Gets Harder Than Expected
Writing good Gherkin is harder than it looks. Vague steps (‘user performs action’), overly technical steps (‘user clicks element with XPath //div[@class="login"]’), and scenarios that test five things at once are all common mistakes that take time to learn to avoid.
The collaboration model also only works if people actually collaborate. If feature files are written by a single developer and never reviewed by anyone non-technical, you’ve just added a layer of abstraction with none of the communication benefits.
A Complete Working Example — Calculator
Abstract concepts only get you so far. Here’s a full working example from feature file through to step definition so you can see how the pieces connect.
Feature File
Feature: Basic Calculator Operations
Scenario: Add two numbers
Given the calculator is open
When the user adds 3 and 5
Then the result should be 8
Scenario: Subtract two numbers
Given the calculator is open
When the user subtracts 9 from 15
Then the result should be 6
Step Definitions
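const assert = require('assert');
const { Given, When, Then } = require('@cucumber/cucumber');

let result;

Given('the calculator is open', function () {
  result = 0;
});

When('the user adds {int} and {int}', function (a, b) {
  result = a + b;
});

// "subtracts 9 from 15" passes a = 9 and b = 15, so the result is b - a
When('the user subtracts {int} from {int}', function (a, b) {
  result = b - a;
});

Then('the result should be {int}', function (expected) {
  assert.equal(result, expected);
});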
Notice that the Then step is shared across both scenarios — same step definition, works for any expected integer. That’s the reuse pattern in action. The step definitions for add and subtract are different because the action is different, but the assertion step is generic.
Also notice the {int} placeholders — Cucumber’s built-in parameter types. They capture integers automatically from the step text without regex. Cleaner than writing your own patterns.
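To show roughly what `{int}` saves you from writing, the sketch below translates an expression containing `{int}` into a regular expression and converts the captured text to numbers. It is a simplified toy in plain Node.js with hypothetical helper names (`compileExpression`, `extractInts`). Real Cucumber Expressions handle more parameter types and edge cases than this.

```javascript
// Toy translation of an {int} expression into a regular expression.
// Illustrative only; real Cucumber Expressions support more types and rules.
function compileExpression(expression) {
  // Escape regex metacharacters in the literal text, leaving { and } alone.
  const escaped = expression.replace(/[.*+?^$()|[\]\\]/g, '\\$&');
  // Each {int} becomes a capture group for an optionally negative integer.
  const source = escaped.replace(/\{int\}/g, '(-?\\d+)');
  return new RegExp(`^${source}$`);
}

// Match step text against the expression and convert the captures to numbers.
function extractInts(expression, text) {
  const match = compileExpression(expression).exec(text);
  return match ? match.slice(1).map(Number) : null;
}

console.log(
  extractInts('the user subtracts {int} from {int}', 'the user subtracts 9 from 15')
);
console.log(
  extractInts('the result should be {int}', 'the result should be eight')
);
```

The first call yields the two numbers in the order they appear in the step text, which is why the subtract step computes `b - a`. The second returns `null` because "eight" is not an integer literal, so that step would be reported as undefined.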
Writing Cucumber Tests That Don’t Turn Into a Maintenance Nightmare
Bad Cucumber is worse than no Cucumber. Bloated feature files, duplicated step definitions, and scenarios chained together create more confusion than they solve. These are the practices that actually keep things manageable over time.
- Keep scenarios short. If a scenario needs more than five or six steps, it’s probably testing more than one behavior. Split it.
- Write feature files as if a non-technical stakeholder will read them tomorrow — because they might.
- No technical language in feature files. XPath selectors, database queries, API response codes — all of that belongs in step definitions, not feature files.
- Every scenario should be independently runnable. If scenario B only works because scenario A ran first and set up some state, you’ve created a fragile dependency that will break at inconvenient times.
- Parameterize step definitions from the start. A step that accepts parameters is reusable. A step hardcoded to one specific value is a liability.
- Use tags to organize runs — @smoke, @regression, @wip. It costs nothing to add them and saves time every CI run.
- Review feature files as a team. The BA writes, the developer reviews, the tester refines. That’s the loop that makes BDD valuable.
Mistakes Beginners Make — and How to Avoid Them
These mistakes show up on almost every team the first time they use Cucumber. Some of them I’ve made myself.
| Common Mistake | What to Do Instead |
| --- | --- |
| Cramming 10 steps into one scenario | One scenario, one behaviour; split it up |
| Technical code leaking into feature files | Feature files are for plain English; keep code in step definitions |
| Copy-pasting step definitions | Write reusable, parameterised steps from the start |
| Scenarios that depend on each other | Every scenario must be able to run in isolation |
| Vague step names like ‘user does something’ | Be specific: ‘user submits the registration form’ |
| Skipping the Examples table for repetition | Use Scenario Outline when the same flow needs multiple inputs |
The one that causes the most pain long-term: dependent scenarios. You’ll notice it immediately when you try to run a subset of your suite and half the scenarios fail because they relied on state set up by a scenario that didn’t run. Fix it early — retrofitting independent scenarios into a coupled suite is painful.
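The failure mode is easy to reproduce in miniature. The plain Node.js sketch below models scenarios as functions that return true when they pass. The scenario names and the `session` variable are invented for this illustration and have nothing to do with Cucumber’s API.

```javascript
// Illustration of scenario coupling through shared state (hypothetical example).
let session = null;

// The login scenario happens to leave a session behind...
const loginScenario = () => {
  session = { user: 'admin' };
  return true;
};

// ...and this scenario silently relies on it.
const coupledDashboardScenario = () => session.user === 'admin';

// Independent version: the scenario sets up its own precondition.
const independentDashboardScenario = () => {
  const ownSession = { user: 'admin' }; // its own "Given the user is logged in"
  return ownSession.user === 'admin';
};

// Run a list of scenarios and record pass/fail, treating exceptions as failures.
function runAll(scenarios) {
  return scenarios.map((scenario) => {
    try {
      return scenario() ? 'passed' : 'failed';
    } catch (err) {
      return 'failed';
    }
  });
}

console.log(runAll([loginScenario, coupledDashboardScenario])); // full suite
session = null; // simulate a fresh run
console.log(runAll([coupledDashboardScenario]));     // subset without the login scenario
console.log(runAll([independentDashboardScenario])); // subset, independent version
```

The coupled scenario passes in the full run and fails as soon as it runs alone. In a real suite, the usual fix is to move that setup into the scenario’s own Given step or a Before hook.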
When Cucumber Makes Sense — and When It Doesn’t
Cucumber is genuinely great for certain projects and actively counterproductive for others. Choosing it for the wrong reasons wastes everyone’s time.
Use Cucumber When:
- Your project uses Agile or BDD and stakeholders are expected to read test scenarios
- You have a mix of technical and non-technical team members who all need visibility into test coverage
- Feature documentation tends to drift out of sync with reality — Cucumber keeps them aligned because the docs run
- Your application has clear user-facing behaviors that map naturally to Given-When-Then
- You want a single source of truth that both business and engineering teams trust
Skip Cucumber When:
- It’s a small project — 20 tests, one developer, no stakeholder review of test cases. TestNG or JUnit is simpler
- The testing is entirely API or backend — Gherkin doesn’t add much over direct assertions in those layers
- No one on the business side will actually read the feature files. Cucumber without that collaboration is just verbose test infrastructure
- Your team is still learning test automation fundamentals — Cucumber adds an abstraction layer that confuses beginners more than it helps
The honest test: if you can point to a specific non-developer on your project who will regularly read and benefit from Gherkin-style scenarios, use Cucumber. If you can’t, think hard before committing to the overhead.
Final Thoughts
Cucumber is one of those tools that polarizes people. Teams that use it well — where BAs actually write scenarios, developers actually review them, and the feature files actually reflect reality — swear by it. Teams that bolt it on as an afterthought end up with a maintenance burden and no communication benefit.
The tool isn’t magic. The workflow around it is what matters. Get the three-part architecture clean (feature file → step definitions → runner), write your Gherkin at the right level of abstraction, use hooks to handle setup/teardown properly, and the framework holds up well at scale.
If you’re new to Cucumber, start with the calculator example — get it running end to end before touching anything more complex. Once you understand how a Gherkin step maps to a step definition method and then to actual automation code, the rest is just detail.
And if you’re evaluating whether to adopt Cucumber at all: the question isn’t ‘is it a good framework.’ It is. The question is whether your project and team structure will let you use it the way it was designed to be used.