
Introduction to Microservice End-to-End tests with Cucumber
In this first part of the guide, we cover the main concepts of BDD, TDD, Cucumber, and Gherkin. We detail the differences between the techniques and the tools, and how they can help improve the ways of working of your organization.
- What is Cucumber?
- Behavior-Driven Development (or BDD)
- Filling the gap between business requirements and software implementation
- So, what is Cucumber?
- The Gherkin syntax
- Adapting Gherkin with Cucumber Expressions
- How to use Cucumber in Java
What is Cucumber?
Cucumber is a tool to support Behavior-Driven Development (BDD). That means we shouldn’t jump directly into the specifics of this tool without first knowing the core concepts of BDD; we’ll come back to Cucumber later.
Behavior-Driven Development (or BDD)
Behavior-Driven Development (BDD) is a way of working for software teams. It helps people map business requirements to their software implementation, using specific examples that help developers and business analysts better understand the problems to solve.
One of the main goals of BDD is to get people to talk to each other around the system’s use cases. Ideally, during these conversations, somebody should capture the “use case scripts” (what should happen when a user does this or that), so they can be used as documentation to implement the corresponding functionalities. Additionally, these use case scripts can also be automated as tests to verify that the behavior of the system is as expected.
Therefore, you can say your team is following BDD principles when you go through an iterative process to define your system’s features using those three stages – discuss, capture, and write tests. In the Cucumber documentation, you’ll find them referenced as Discovery, Formulation, and Automation.
Differences between BDD and TDD
Since one of the principles of BDD is to come up with tests that you can automate before writing the implementation, BDD and TDD (Test-Driven Development) share a lot of common ground.
To me, BDD is a better, more ambitious approach, because it puts a lot of focus on bringing people together to analyze problems around examples – the human side of IT. TDD, on the other hand, puts the focus on creating the tests before writing the code. When looking for helpful resources online about TDD, it’s common to find articles that jump too quickly into technical details and ways to implement it. The key aspect of these techniques should not really be about implementing the tests before writing the business logic; the most important part is to understand the problem before you start writing code, so you solve the right problem. I’m not saying TDD doesn’t encourage following this discovery process, it’s just that the common definitions and documentation around TDD focus more on the technical side of the process.
The main reason for this difference in focus is, in my opinion, that BDD and TDD are normally applied at different levels of the test pyramid. See Figure 1 below, or save this article for later if you are not familiar with the test pyramid concepts. Let’s focus on the first part of the BDD process: business people, developers, and testers come together in a session to refine the requirements. You want to have these sessions at feature level (in Agile terms, user stories and epics), but also at a higher level, the software vision, to capture the overall requirements of your system. However, during these sessions, you don’t normally zoom in on implementation details like what classes the solution will include and the logic they will contain. The use case scripts you get as an outcome of those sessions aren’t enough to cover all the test cases you might need for Unit Tests. They fit better into the Integration and End-to-End (or Acceptance) test layers.
Nevertheless, you can also write Unit Tests before writing the main logic of your program. You can even leverage tools that are normally associated with a BDD workflow like Cucumber, BDDMockito, or BDDAssertions (AssertJ), and use a Given-When-Then syntax for better readability. But, at this low level, most of the time you don’t need to involve a business analyst or a tester. You don’t need a session to define a Unit Test method to check if a specific Java controller returns a certain HTTP status after a validation rule, for example. That is perfectly fine, because many of these decisions are purely technical, implementation details. You don’t follow the whole BDD process at that level because you probably don’t need it either. Still, you get the TDD benefits when you write the test first because it’ll drive your design based on interfaces and use cases, including error scenarios.
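To make the idea concrete, here is a minimal sketch of a unit-level check structured with Given-When-Then comments. It uses plain Java only, so it runs without any library; in a real project you would write it as a JUnit test and could use BDDMockito's given(...) and AssertJ's BDDAssertions.then(...) for the same structure. The AttemptChecker class and its method are hypothetical names made up for this illustration, they are not taken from the book.

```java
// Hypothetical class under test: decides if a multiplication attempt is right.
class AttemptChecker {
    boolean isCorrect(int factorA, int factorB, int guess) {
        return factorA * factorB == guess;
    }
}

// Plain-Java stand-in for a test class; each method returns true when the
// observed behavior matches the expectation.
class AttemptCheckerTest {

    static boolean correctAttemptIsAccepted() {
        // given
        AttemptChecker checker = new AttemptChecker();
        int factorA = 50, factorB = 60;

        // when
        boolean result = checker.isCorrect(factorA, factorB, 3000);

        // then
        return result;
    }

    static boolean wrongAttemptIsRejected() {
        // given
        AttemptChecker checker = new AttemptChecker();
        int factorA = 50, factorB = 60;

        // when
        boolean result = checker.isCorrect(factorA, factorB, 3010);

        // then
        return !result;
    }
}
```

Nobody needs a discovery session to agree on these two cases. Writing them first still forces you to decide the method signature and the error scenario before the implementation exists.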
So, from a way-of-working perspective, BDD helps you define high-level requirements, whereas TDD helps you write better Unit Tests. You can combine these techniques to achieve the best results. From the tooling perspective, you can use BDD tools all the way down to the Unit Tests, even though you’re not really doing BDD at that level.
What if you write the tests after writing the code?
Some people can’t manage to implement a pure BDD or TDD approach in their organizations, or they simply don’t get used to writing the tests before the code.
That’s not really a big drama. If you’re an experienced developer, you’ve probably worked on many projects where tests are written after finishing the feature implementation. If you still discuss the requirements properly and capture them in any format (like a task in a dashboard), you may not suffer from delivering the wrong solution.
You can still use the BDD tooling, because it improves the readability of your tests, and can act as an eye-opener to catch unhappy scenarios that were not covered in the initial requirements. So, I still recommend using the BDD tools even when you’re writing the tests at the end.

BDD in the book
In the first chapters of the book (remember: this is a standalone extra chapter), we went through BDD very quickly because I didn’t want to distract you from the main topic of those chapters. We wrote Unit Tests before the service implementation following a Given-When-Then syntax. That’s not BDD.
The main reason to skip the Discovery part of BDD in the book is that it’s not easy to simulate how a discussion between business analysts, testers, and developers would occur. We introduced our user stories as if those sessions already happened, to keep a good learning rhythm based on practical examples.
In the future, try to use BDD to capture the requirements of your system, not only functional but also the non-functional ones. Figuring out wishes like capacity, performance, fault-tolerance, etc., will make you choose the proper solution for your overall architecture too. Who knows, maybe you find out that you don’t need a distributed system at all.
Filling the gap between business requirements and software implementation
When you start as a software developer, you assume that some obvious parts of the work have been done correctly, like capturing requirements in a User Story, for example.
It’s somehow obvious, isn’t it? The Product Owner (PO) and/or the Business Analyst (BA) talk to the customer or analyze data and the market needs, to decide which features to implement next. Then, the PO and/or BA have a meeting with developers (and testers, if they’re not the same) to refine the requirements and get an approximate estimation of the required work. They can use this to calculate the Return on Investment (ROI), which doesn’t need to be measured in real currency.
For example, let’s say we want to add a new badge to our gamification logic, “Hero of the day”. The initial requirement says we should give this badge to the first three users within 24 hours that solve 3 multiplication challenges. It’s a simple statement, but it hides some technical challenges, like deciding which timezone to base this logic on, or aggregating existing badges across the whole user base. It’s not easy to implement. What’s the value of this for our users? It’s nice, but probably not as nice as creating three other badges that don’t require cross-user aggregation and dealing with timezones. If we enable this discussion in real life, the PO/BA would probably lean towards adding simpler extra badges.
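The timezone concern is easy to underestimate, so here is a small sketch that shows it with the JDK's java.time API. The HeroOfTheDayWindow class is a hypothetical helper made up for this illustration; the only point is that the same attempt falls on a different calendar day depending on the zone you pick.

```java
import java.time.Instant;
import java.time.LocalDate;
import java.time.ZoneId;

// Hypothetical helper for the "Hero of the day" badge discussion.
class HeroOfTheDayWindow {

    // Returns the calendar day an attempt belongs to in the given time zone.
    static LocalDate dayOf(Instant attemptTime, ZoneId zone) {
        return attemptTime.atZone(zone).toLocalDate();
    }

    // Two attempts compete for the same daily badge only if they fall
    // on the same calendar day in the zone chosen for the rule.
    static boolean sameDay(Instant first, Instant second, ZoneId zone) {
        return dayOf(first, zone).equals(dayOf(second, zone));
    }
}
```

An attempt sent at 2024-01-01T23:30:00Z counts for January 1st if the rule uses UTC, but for January 2nd if it uses Europe/Madrid, which is one hour ahead of UTC in winter. That is exactly the kind of detail that should come up while refining the requirement, before anybody estimates the work.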
Even though that process sounds pretty straightforward and it’s a core concept in Agile, many organizations fail to implement it. Most of the time, it’s just a communication issue: people don’t talk to each other. Some other times, there is a pressure problem: people think they don’t have time to properly analyze the problem at hand, so it’s better to jump into the implementation directly. Discovery sessions, refinements, and similar gatherings simply don’t happen, or they’re not conducted properly. The results are catastrophic: investing too much time in topics that barely have any business value; coming up with solutions that are not what the users wanted; having poor software quality due to cowboy-mode workarounds and hacks that don’t fit into the target architecture, etc.
If you have experienced those issues, or if you trust the arguments above, you’ll understand why people take the time to document and structure a process that gets people talking to each other. There is a clear gap in many IT organizations between business people and developers.
BDD is one of the tools you can experiment with to fill that gap, enabling a conversation about use case examples and how the system should work in those scenarios. To structure the sessions, you can follow some of the ideas detailed on Cucumber’s Discovery Workshop web page: Example Mapping, OOPSI, or Feature Mapping. You can also use your own approach. Any option is good as long as it facilitates the conversation about the requirements and prevents people from jumping too early into possible solutions.
So, what is Cucumber?
Cucumber is a very popular tool to support BDD. From a high-level perspective, it provides two main features:
- A syntax, Gherkin, that you can use to define scenarios, features, and use cases. It follows a Given-When-Then pattern.
- An open-source framework to translate the Gherkin steps to executable code that you can run as tests. It’s available in multiple languages (Java, JavaScript, Ruby, Scala, etc.).
Therefore, Cucumber covers two stages of BDD: writing the use cases, and automating the tests.
The Gherkin syntax
Gherkin is a human-readable language to define Cucumber’s test cases. The main goal of its syntax is that you keep the tests non-technical, so they can be used as the documentation of your system features.
When using BDD, you can define the use case examples using Gherkin, so it’s easier to turn them directly into automated tests: you just need to map these human-friendly sentences to code blocks.
Let’s use an example. This is a real test scenario that we’ll build in this guide:
Scenario: Users solve challenges, they get feedback and their stats.
  Given a new user John
  When he requests a new challenge
  And he sends the correct challenge solution
  Then his stats include 1 correct attempt
Listing 1. Gherkin Scenario example
It’s quite readable, isn’t it? This example use case (or scenario) represents one of the possible user interactions with the system that is built within the book. The user gets a new multiplication challenge that they should try to solve using mental calculation. When they send their solution, they get feedback indicating if it was correct or not, and their statistics are updated.
The Gherkin syntax is simple and defines only a few keywords. The core ones are Given, When, and Then, which we employ to structure the use case definition:
- Given defines the initial context of the system before the user interacts with it.
- When describes the interaction of the user.
- Then defines the expected outcome of the interaction.
There are other keywords that you can use to avoid repetition and make your steps look more natural, like And, But, or simply *.
To describe the different test cases, you use the keywords Scenario or Example. What you write after this keyword won’t be parsed as a step, so it’s really free-form.
Gherkin documents must also start with the keyword Feature. Following it, you should describe the functionality you’re testing from a high-level perspective.
Let’s have a look at a fragment of the Gherkin file that contains one of the features of the system (also available on GitHub) to see these keywords in practice:
Feature: Solve multiplication challenges

  We present users with challenges they should solve using mental calculation
  only. When they're right, we give them score and, in some cases, new badges.
  All attempts from all users are stored, so they can see their historical data.

  Scenario: Users get new attempts.
    Given a new user Mary
    When she requests a new challenge
    Then she gets a mid-complexity multiplication to solve

  Scenario: Users solve challenges, they get feedback and their stats.
    Given a new user John
    And he requests a new challenge
    When he sends the correct challenge solution
    Then his stats include 1 correct attempt
Listing 2. A fragment of the Gherkin file ‘solving_challenges.feature’
Even if you didn’t read the book, you can surely understand what this feature does. The cool part is that this is also our test script. In the code, these are the contents of the solving_challenges.feature file.
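If it helps to see how little structure Gherkin imposes, the sketch below classifies the lines of a feature text by their leading keyword and groups the steps per scenario. It is a deliberately naive illustration written for this guide, not how Cucumber's parser works: the GherkinOutline class is a hypothetical name, and it ignores everything beyond the keywords described above (real Gherkin also supports Background, Scenario Outline, tags, data tables, and more).

```java
import java.util.ArrayList;
import java.util.List;

// Naive, illustrative line classifier for the Gherkin keywords covered so far.
class GherkinOutline {

    static final List<String> STEP_KEYWORDS =
            List.of("Given ", "When ", "Then ", "And ", "But ", "* ");

    static boolean isScenarioHeader(String line) {
        String text = line.trim();
        return text.startsWith("Scenario:") || text.startsWith("Example:");
    }

    static boolean isStep(String line) {
        String text = line.trim();
        return STEP_KEYWORDS.stream().anyMatch(keyword -> text.startsWith(keyword));
    }

    // Groups the steps under the scenario they belong to. The Feature line and
    // its free-form description come before the first scenario and are skipped.
    static List<List<String>> stepsPerScenario(String featureText) {
        List<List<String>> scenarios = new ArrayList<>();
        for (String line : featureText.split("\n")) {
            if (isScenarioHeader(line)) {
                scenarios.add(new ArrayList<>());
            } else if (isStep(line) && !scenarios.isEmpty()) {
                scenarios.get(scenarios.size() - 1).add(line.trim());
            }
        }
        return scenarios;
    }
}
```

Applied to the fragment in Listing 2, this returns two groups: three steps for the first scenario and four for the second. The description under Feature is ignored because it appears before any scenario starts.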
Avoid technical language in Gherkin features
Do not use technical interfaces in Gherkin; keep it readable by a non-developer grandparent. Avoid steps like Given the database includes record A or When the user calls API X. That would defeat the whole purpose of using a human-readable language.

Adapting Gherkin with Cucumber Expressions
Most likely, the scenario shown in Listing 1 is not exactly the same use case definition that you would come up with during a requirement discovery phase. A more realistic outcome would be something like When users send a correct attempt, (then) they get feedback indicating that it’s correct, and their stats are updated with this new correct result. More fluent and natural.
The Scenario shown above has been adapted a bit to make the steps (each line in the scenario definition) reusable across different test cases. In Cucumber, you can achieve this by using Cucumber Expressions to link each step to the method in the code that handles it. This is because Gherkin doesn’t have any tagging to specify which words are arguments (to respect a natural language). We define that at the code level. As an example, the last step is defined in code as:
her/his stats include {int} {correct} attempt(s)
Listing 3. An example of step mapping using Cucumber Expressions
The sentence above is a Cucumber Expression, and it includes two parameters: {int} and {correct}. The first one is a built-in Parameter Type, and there are many options available (e.g. float, string, etc.). The {correct} placeholder is an example of a custom Parameter Type that we can define ourselves. We’ll cover that functionality in more detail throughout this guide.
There are other syntax usages in the expression above. The / character in her/his defines alternative text, so we can use one of the options. The (s), as text inside parentheses, defines an optional part. Here, we use it to make the language look a bit more natural.
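One way to build an intuition for these rules is to write by hand a regular expression that accepts roughly the same sentences as the expression in Listing 3. The sketch below does that with the JDK's java.util.regex package. It is only an approximation for illustration and not what Cucumber generates internally. The StatsStepMatcher class is a hypothetical name, and it assumes that {correct} accepts the words correct and incorrect, which is a guess since that parameter type is not defined in this part of the guide.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hand-written approximation of: her/his stats include {int} {correct} attempt(s)
class StatsStepMatcher {

    // her/his     -> alternative text: (?:her|his)
    // {int}       -> captured integer: (-?\d+)
    // {correct}   -> ASSUMPTION: the words "correct" or "incorrect"
    // attempt(s)  -> optional text: attempts?
    static final Pattern STEP = Pattern.compile(
            "^(?:her|his) stats include (-?\\d+) (correct|incorrect) attempts?$");

    static boolean matches(String stepText) {
        return STEP.matcher(stepText).matches();
    }

    // Value that would be passed as the int argument of the step method.
    static int attemptCount(String stepText) {
        return Integer.parseInt(match(stepText).group(1));
    }

    // Value that would be passed as the boolean argument of the step method.
    static boolean isCorrect(String stepText) {
        return match(stepText).group(2).equals("correct");
    }

    private static Matcher match(String stepText) {
        Matcher matcher = STEP.matcher(stepText);
        if (!matcher.matches()) {
            throw new IllegalArgumentException("Step does not match: " + stepText);
        }
        return matcher;
    }
}
```

With this pattern, both "his stats include 1 correct attempt" and "her stats include 2 incorrect attempts" match, and the two captured groups become the method arguments. Note that the step text doesn't include the Then keyword. The Cucumber Expression is much easier to read than the regular expression, which is the main reason to prefer it.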
How to use Cucumber in Java
Once you’ve written your features in Gherkin, you can execute them as automated tests. As introduced before, Cucumber supports multiple programming languages, Java included. The Java implementation of Cucumber is known as Cucumber JVM.
Each step in the Gherkin features, with its corresponding optional expressions, must be paired to a method in the Java code that implements the logic behind the specific step. In Cucumber JVM, we define these methods inside the Step Definition files. See Listing 4 for an extract of the ChallengeStepDefinitions class, which maps the step we used as an example in Listing 3. We’ll get into the details in the next part of this guide. For now, the key concept is that Cucumber’s @Then annotation is used to provide step definitions.
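import io.cucumber.java.en.Then;

import static org.assertj.core.api.Assertions.assertThat;

public class ChallengeStepDefinitions {
    // ... (other step definitions and the userActor field are omitted)

    // Links the Gherkin step to this method through a Cucumber Expression
    @Then("her/his stats include {int} {correct} attempt(s)")
    public void statsIncludeAttempts(int attemptNumber, boolean correct) {
        var stats = this.userActor.retrieveStats();
        // Count the attempts whose result matches the expected one
        assertThat(stats).satisfies(s ->
            assertThat(s.stream().filter(r -> r.isCorrect() == correct)
                .count()).isEqualTo(attemptNumber)
        );
    }
}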
Listing 4. Mapping Gherkin steps to Java code with Step Definition files
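To picture what pairing a step with a method means, the following sketch implements a tiny registry that maps patterns to handlers and dispatches each step text to the first handler that matches. This is a toy written for this guide with the JDK only. MiniStepRunner and MiniStepRunnerDemo are hypothetical names, it uses plain regular expressions instead of Cucumber Expressions, and it says nothing about how Cucumber JVM is really implemented.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.Consumer;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Toy registry: each entry plays the role of one step definition.
class MiniStepRunner {

    private final Map<Pattern, Consumer<Matcher>> definitions = new LinkedHashMap<>();

    // Registers a handler for the steps that match the given regular expression.
    void define(String regex, Consumer<Matcher> handler) {
        definitions.put(Pattern.compile(regex), handler);
    }

    // Runs the first handler whose pattern matches the whole step text.
    void run(String stepText) {
        for (Map.Entry<Pattern, Consumer<Matcher>> entry : definitions.entrySet()) {
            Matcher matcher = entry.getKey().matcher(stepText);
            if (matcher.matches()) {
                entry.getValue().accept(matcher);
                return;
            }
        }
        throw new IllegalStateException("Undefined step: " + stepText);
    }
}

class MiniStepRunnerDemo {

    // Runs two steps from the example scenario against in-memory state
    // and returns the number of correct attempts recorded.
    static int runScenario() {
        int[] correctAttempts = {0};
        MiniStepRunner runner = new MiniStepRunner();
        runner.define("a new user (\\w+)", m -> correctAttempts[0] = 0);
        runner.define("(?:she|he) sends the correct challenge solution",
                m -> correctAttempts[0]++);

        runner.run("a new user John");
        runner.run("he sends the correct challenge solution");
        return correctAttempts[0];
    }
}
```

The annotations in a Step Definition class serve the same purpose as the define calls here: they declare which sentence a method handles, so the same method can be reused by every scenario that contains a matching step.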
Cucumber JVM integrates with JUnit, so you can run the Gherkin features using this popular Java testing framework. To accomplish this, we need to create a class that works as an entry-point. See Listing 5.
package microservices.book;

import io.cucumber.junit.Cucumber;
import io.cucumber.junit.CucumberOptions;
import org.junit.runner.RunWith;

@RunWith(Cucumber.class)
@CucumberOptions(plugin = {"pretty"})
public class RunCucumberTest {
}
Listing 5. Running Cucumber tests as JUnit tests with RunWith
We use the @RunWith annotation in JUnit to point the framework to the Cucumber class, which is an implementation of a JUnit Runner that knows how to load the .feature files and scan the classes looking for Step Definition annotations (like our example in Listing 4).
The @CucumberOptions annotation can be used to configure the runner. In the example above, we use the plugin option to enable the pretty Cucumber formatter, a built-in plugin that makes the generated reports look nice (other options are, for example, json or junit).
In the next Figure, you can see how these different elements relate to each other in Cucumber JVM.
The next step is to set up a skeleton project, some good use case examples, and start coding our steps. That’s what we’ll cover in Part 2 of this guide.
