Integration Tests Across Services


Component tests verify one service in isolation with stubbed dependencies. Contract tests verify structural compatibility between pairs of services. But sometimes you need something more: two real services talking to each other via real HTTP, against real databases, to verify a critical business flow actually works end to end. That's an integration test in the microservices sense.

What a microservices integration test is

A microservices integration test runs two or more real services — real HTTP servers, real databases — and tests the interaction between them. There are no stubs for the services under test. You use Testcontainers' Network feature or Docker Compose to connect the containers so they can resolve each other by hostname, exactly as they would in production.

This is a narrower definition than the term "integration test" often implies in monolith testing. Here it means: two services, real wire, real state. The test exercises the full call path between them — serialisation, transport, authentication headers, response parsing — not a stub approximation of it.

When to write integration tests

Not every interaction between services deserves an integration test. Reserve them for situations where a cheaper test genuinely cannot do the job:

  • Write one when a new service-to-service interaction is introduced — for example, Order Service starts calling Payment Service for the first time. Verify the full interaction before it ships.
  • Write one when a critical business flow spans two services, such as placing an order that triggers a payment. If this flow breaks in production, the business loses money. That justifies the cost of a slower test.
  • Write one when a contract test passes but you suspect the business logic between two services is still wrong. Pact verifies structural compatibility — correct field names, correct types — but it cannot tell you that Order Service misinterprets Payment Service's "DECLINED" status value and retries indefinitely.
  • Skip them when a contract test covers the same ground. If the concern is purely structural — does the response schema match what the consumer expects? — a contract test is faster and catches the problem earlier in the pipeline. Don't duplicate it with an integration test.
  • Skip them when you are testing error handling that WireMock can simulate just as well. A 503 response or a read timeout from a downstream service does not require that service to actually be running.
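The "DECLINED" case above is worth making concrete, because both versions of the logic accept exactly the same response schema, so a contract test cannot tell them apart. The following is only a sketch: OrderAction, buggyNextAction and nextAction are hypothetical names invented for illustration, and treating unknown statuses as retryable is an assumption, not something any real Payment Service specifies.

```java
// Hypothetical sketch: how an Order Service might turn a payment status string
// into its next step. None of these names come from a real service.
final class PaymentStatusHandling {

    enum OrderAction { CONFIRM_ORDER, MARK_PAYMENT_FAILED, RETRY_LATER }

    // Buggy variant: anything other than "CHARGED" is treated as transient,
    // so a "DECLINED" card is retried again and again.
    static OrderAction buggyNextAction(String paymentStatus) {
        return "CHARGED".equals(paymentStatus)
            ? OrderAction.CONFIRM_ORDER
            : OrderAction.RETRY_LATER;
    }

    // Corrected variant: "DECLINED" is a terminal outcome, not a retryable one.
    // Assumption: unknown statuses are still treated as retryable.
    static OrderAction nextAction(String paymentStatus) {
        switch (paymentStatus) {
            case "CHARGED":
                return OrderAction.CONFIRM_ORDER;
            case "DECLINED":
                return OrderAction.MARK_PAYMENT_FAILED;
            default:
                return OrderAction.RETRY_LATER;
        }
    }

    public static void main(String[] args) {
        System.out.println(buggyNextAction("DECLINED")); // prints RETRY_LATER
        System.out.println(nextAction("DECLINED"));      // prints MARK_PAYMENT_FAILED
    }
}
```

Both methods parse the same field with the same type. Only a test that drives a real declined payment through both services and checks the resulting order state would expose the first variant.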

A real integration test: Order + Payment

Here is a complete example for a Java Spring Boot system where Order Service calls Payment Service via HTTP.

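import static io.restassured.RestAssured.given;
import static java.util.concurrent.TimeUnit.MILLISECONDS;
import static java.util.concurrent.TimeUnit.SECONDS;
import static org.assertj.core.api.Assertions.assertThat;
import static org.awaitility.Awaitility.await;

import io.restassured.response.Response;
import org.junit.jupiter.api.Test;
import org.springframework.boot.test.context.SpringBootTest;
import org.springframework.boot.test.context.SpringBootTest.WebEnvironment;
import org.springframework.boot.test.web.server.LocalServerPort; // Spring Boot 2.7+
import org.springframework.test.context.DynamicPropertyRegistry;
import org.springframework.test.context.DynamicPropertySource;
import org.testcontainers.containers.GenericContainer;
import org.testcontainers.containers.Network;
import org.testcontainers.containers.PostgreSQLContainer;
import org.testcontainers.containers.wait.strategy.Wait;
import org.testcontainers.junit.jupiter.Container;
import org.testcontainers.junit.jupiter.Testcontainers;

// Order Service (the service under primary focus) runs in the test JVM via @SpringBootTest;
// Payment Service and the order database run as containers.
@SpringBootTest(webEnvironment = WebEnvironment.RANDOM_PORT)
@Testcontainers
class OrderPaymentIntegrationTest {

    static Network network = Network.newNetwork();

    @Container
    static GenericContainer<?> paymentService =
        new GenericContainer<>("mycompany/payment-service:latest")
            .withNetwork(network)
            .withNetworkAliases("payment-service")
            .withExposedPorts(8080)
            .waitingFor(Wait.forHttp("/actuator/health").forStatusCode(200));

    @Container
    static PostgreSQLContainer<?> orderDb =
        new PostgreSQLContainer<>("postgres:15")
            .withNetwork(network)
            .withNetworkAliases("order-db");

    @LocalServerPort
    int orderServicePort;

    // Order Service runs on the host, so it reaches the containers through mapped ports.
    // "payment.service.url" is an assumed property name; use whatever your HTTP client reads.
    @DynamicPropertySource
    static void containerProperties(DynamicPropertyRegistry registry) {
        registry.add("spring.datasource.url", orderDb::getJdbcUrl);
        registry.add("spring.datasource.username", orderDb::getUsername);
        registry.add("spring.datasource.password", orderDb::getPassword);
        registry.add("payment.service.url",
            () -> "http://" + paymentService.getHost() + ":" + paymentService.getMappedPort(8080));
    }

    @Test
    void shouldChargePaymentWhenOrderPlaced() {
        String orderPayload = """
            {"userId": 42, "productId": 100, "quantity": 1, "paymentMethod": "card_test_visa"}
        """;

        // Call Order Service on the random port Spring Boot assigned to it
        Response response = given()
            .port(orderServicePort)
            .contentType("application/json")
            .body(orderPayload)
            .post("/orders");

        assertThat(response.statusCode()).isEqualTo(201);
        String orderId = response.jsonPath().getString("id");

        // Poll Payment Service directly to confirm the charge was created
        await().atMost(10, SECONDS).pollInterval(500, MILLISECONDS).untilAsserted(() -> {
            Response payment = given()
                .baseUri("http://" + paymentService.getHost() + ":" + paymentService.getMappedPort(8080))
                .get("/payments?orderId=" + orderId);
            assertThat(payment.statusCode()).isEqualTo(200);
            assertThat(payment.jsonPath().getString("status")).isEqualTo("CHARGED");
        });
    }
}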

A few details in this test matter:

  • Network.newNetwork() creates a Docker bridge network on which containers resolve each other by alias: payment-service resolves to the Payment Service container's IP, much as a Kubernetes DNS entry would. Order Service in this example runs inside the test JVM on the host, not in a container, so it has to be pointed at Payment Service through the mapped host port (paymentService.getMappedPort(8080)). The alias only helps other containers on the same network.
  • waitingFor(Wait.forHttp(...)) blocks the test until Payment Service has finished starting and returns 200 on its health endpoint.
  • await().untilAsserted() (Awaitility) polls because payment processing may be asynchronous: the order placement can return 201 before the payment record exists, so asserting immediately after the POST would produce a false failure.
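To show why polling matters, here is a plain-Java sketch of the poll-until-true idea. It is not Awaitility's implementation, and pollUntil is a hypothetical helper written only for this illustration. The "payment record" is simulated by a condition that becomes true 200 ms after the start.

```java
import java.time.Duration;
import java.util.function.BooleanSupplier;

// Plain-Java sketch of the poll-until-true idea behind Awaitility.
// Illustration only: this is not how Awaitility itself is implemented.
final class PollingSketch {

    // Re-checks the condition every pollInterval until it holds or atMost has elapsed.
    // Returns true if the condition held in time, false on timeout or interruption.
    static boolean pollUntil(Duration atMost, Duration pollInterval, BooleanSupplier condition) {
        long start = System.nanoTime();
        while (true) {
            if (condition.getAsBoolean()) {
                return true;
            }
            if (System.nanoTime() - start >= atMost.toNanos()) {
                return false;
            }
            try {
                Thread.sleep(pollInterval.toMillis());
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // preserve the interrupt flag
                return false;
            }
        }
    }

    public static void main(String[] args) {
        // Simulate a payment record that becomes visible 200 ms after the order is placed.
        long placedAt = System.nanoTime();
        BooleanSupplier chargeVisible =
            () -> System.nanoTime() - placedAt >= Duration.ofMillis(200).toNanos();

        // An immediate check is the false failure described above.
        System.out.println("immediate check: " + chargeVisible.getAsBoolean());
        System.out.println("after polling:   "
            + pollUntil(Duration.ofSeconds(2), Duration.ofMillis(50), chargeVisible));
    }
}
```

In a real test, prefer Awaitility itself: untilAsserted retries on assertion failures and reports the last failure on timeout, which a bare boolean loop like this does not.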

Integration test scope: pairwise is the right level

Resist the temptation to test all ten services together in a single integration test. That is an end-to-end test in disguise — slow, brittle, and difficult to diagnose when it fails. The right scope is pairwise: Order + Payment, Order + Inventory, User + Notification. Each test covers one specific seam and keeps everything else out of the picture.

The pairwise approach also keeps failure diagnosis fast. When an Order + Payment integration test fails, the problem is somewhere in that interaction. When an all-services test fails, the problem could be anywhere.

Run frequency and placement

Integration tests are expensive to run — each one needs real services to start. Plan your pipeline accordingly:

  • Run on every push to main rather than on every pull request. PR pipelines should stay under five minutes; integration tests rarely do.
  • For critical paths — the payment flow, the checkout flow — consider running those specific tests on every PR and the full suite nightly.
  • Expect 2–5 minutes per test class. Container startup dominates. The test method itself may take under a second.
  • Keep integration tests in a separate Maven module (integration-tests/) so they don't bloat the unit and component test cycle. Developers running mvn test locally should not wait for containers to start.
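The numbers above translate into a simple budget check. This sketch assumes test classes run one after another; the five-minute window and the 2–5 minute range come from the list above, while PipelineBudget and classesThatFit are hypothetical names used only here.

```java
// Back-of-envelope budget check for a CI window, assuming sequential execution.
final class PipelineBudget {

    // How many test classes fit into the budget if each takes secondsPerClass.
    static long classesThatFit(long budgetSeconds, long secondsPerClass) {
        if (secondsPerClass <= 0) {
            throw new IllegalArgumentException("secondsPerClass must be positive");
        }
        return budgetSeconds / secondsPerClass; // integer division: partial runs do not count
    }

    public static void main(String[] args) {
        long prBudgetSeconds = 5 * 60; // five-minute PR pipeline

        System.out.println(classesThatFit(prBudgetSeconds, 2 * 60)); // prints 2
        System.out.println(classesThatFit(prBudgetSeconds, 5 * 60)); // prints 1
    }
}
```

At the stated speeds, a PR pipeline has room for one or two integration test classes at most, before counting unit and component tests, which is why only the most critical flows belong there.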

Component Test vs Integration Test

Component test

  • 1 real service: your service runs as a real process
  • External services → WireMock stubs: fast, deterministic, controllable
  • Real database (Testcontainers)
  • Fast: 300-500ms per test
  • Run on every PR

Integration test

  • 2+ real services: both services run as real processes
  • Real service-to-service HTTP: no stubs between the services under test
  • Multiple real databases
  • Slower: 2-5 min per test class
  • Run on main branch or nightly

⚠️ Common mistakes

  • Using integration tests instead of contract tests for structural compatibility. Contract tests are faster and catch field mismatches earlier in the pipeline. Integration tests should be reserved for business logic that spans services — the sequence of calls, the state transitions, the error handling — not for verifying that a response field is named orderId and not order_id. That is contract testing's job.
  • Not using Awaitility (or equivalent) for asynchronous flows. If Service A publishes an event and Service B processes it asynchronously, asserting immediately after the publish will pass locally (fast machine, lucky timing) and fail in CI (slower environment, different scheduling). Always use await().atMost(...).untilAsserted() for any state that might not be visible immediately after the triggering action.
  • Testing all services together when pairwise would suffice. Every additional service in a test multiplies the startup time and introduces new failure points. A test for the Order → Payment interaction does not need the Notification Service running. If Notification Service has a broken Docker image that day, your payment test should not fail because of it.

🎯 Practice task

  1. Pick a critical business flow in your system — or design one — that spans exactly two services. Write it out explicitly: "Order Service calls Payment Service via POST /payments. Payment Service writes a payment record to its own PostgreSQL database. Order Service reads the payment response and updates the order status accordingly."
  2. Set up a Network.newNetwork() in a test class and start a containerised version of the second service (Payment Service or equivalent). Write a minimal sanity test that calls the second service's health endpoint from within the test, using the mapped host port, to confirm the container has started and is reachable from the test.
  3. Write the integration test for the happy path: post a valid order, poll the Payment Service using Awaitility until the charge record appears, and assert the status equals "CHARGED".
  4. Add a test for the failure path: configure the Payment Service to reject the card by using a test card number that triggers a decline (check the service's documentation for the magic value). Assert that the Order Service transitions to a PAYMENT_FAILED status rather than retrying indefinitely.
  5. Measure the total test run time for the class. Compare it to the equivalent component test run time for the same happy path. Calculate how many integration tests at that speed you could run within a five-minute CI window — and use that number to decide which flows actually deserve one.

In the next lesson you will see how Docker Compose makes it practical to define and manage these multi-service environments as a single YAML file, shareable across every developer machine and every CI job.
