Building the Global Street-Level Imagery Platform: How We Test Software at Mapillary

When building a global platform for street-level imagery, there are lots of moving parts in the system that need to work together. At Mapillary, we pay a lot of attention to quality assurance and have built a testing framework that is reliable, developer-friendly, and scales as we keep developing the platform further. Here is how we do it.

Mapillary enables people and organizations across the world to share street-level imagery and access map data for improving maps, cities, and mobility. Just recently, we celebrated half a billion images on the platform. It’s our team’s job to build and operate this platform to enable anyone, anywhere, to access map data at scale.

Besides building the tools and features that people interact with directly, there’s a significant amount of work going on behind the scenes. One big and crucial part of it is testing—making sure that all parts of the system work the way they’re meant to. In this post, we will explain to our fellow developers out there how we automate testing at Mapillary.

Don’t stage

Like most companies out there doing QA automation, at Mapillary we base our test strategy on the test pyramid.

The test pyramid: tests that run through the UI take much more time and resources compared to unit tests. (Illustration based on Martin Fowler)

Unit tests are easy: you don’t really need an environment to run them, you just need the code. Integration and system tests, on the other hand, are a completely different story. You need an environment or test bed in which those tests are going to be executed, since all the real interactions with other systems need to be set up.

Let’s say that you have a web application like mapillary.com/app where users can create an account and you want to automate the test cases for that particular scenario. If we oversimplify the architecture behind it, this means:

  • Web app
  • Web server with an API
  • Database

Given this scenario, a sound integration/system test setup would be:

  1. Integration test cases for the API, where we use a REST client to create users and assert that the data is persisted correctly in the database.
  2. System test for the web app to make sure that users can be created through the UI.

A very common approach is to have a dedicated environment with all the components above deployed—also known as a staging or integration environment. However, a single staging/integration environment has multiple pain points:

  • Multiple developers are running their tests against the same environment, resulting in tainted, duplicated, and corrupt/colliding test data because of a lack of cleaning routines.
  • Undefined consistency over time because of changing tests and stale test data.
  • Multiple builds in the CI pipeline running in parallel can overlap in time and data.
  • Tests in unstable and experimental branches contribute to inconsistent data over time, demanding multiple staging environments (one for each branch).
  • A single long-running staging environment tends to get harder and harder to set up over time, since a full bootstrap is very rarely executed.
  • If tests are run on the developers’ machines, they need to be online and have access to VPN etc. in order to use the staging environment.

At Mapillary, there are around 100 discrete microservices that need integration and system testing. With one or a few staging environments, the setup effort, the tainting of data over time, and the concurrency of test executions made this a blocker for our plans for a developer-friendly testing environment.

Mapillary-in-a-box: staging for the rest of us

Docker-compose for staging

Luckily, docker-compose is really helpful when it comes to test automation. The idea of having a dedicated staging environment is now replaced by disposable docker-compose based environments that are created right before running the tests and then killed once the tests have been executed. To keep things independent of the cloud services we currently use, we run docker containers to the extent that all tests can run offline. This means we have docker containers emulating AWS S3, PostgreSQL, Elasticsearch, Apache Kafka, RabbitMQ, and all other middleware we are using.
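
As a small illustration of what this enables, test code can talk to the emulated middleware over the compose network just like it would talk to the real thing. The sketch below is hypothetical: the fake-s3 service name, port, and credentials are placeholders, not our actual configuration.

import boto3

# Point the S3 client at an S3-compatible emulator container on the
# docker-compose network. Service name, port, and credentials below are
# placeholders for illustration only.
s3 = boto3.client(
    "s3",
    endpoint_url="http://fake-s3:4569",
    aws_access_key_id="test",
    aws_secret_access_key="test",
    region_name="us-east-1",
)

# The tests can now create buckets and upload objects without ever
# touching real AWS infrastructure.
s3.create_bucket(Bucket="test-images")
s3.put_object(Bucket="test-images", Key="sequences/img_001.jpg", Body=b"fake image bytes")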

Just using docker-compose to create on-demand integration environments is not enough. Let’s say that our three components above live in three different repositories that build three different docker images for deployment:

  • mapillary_ui
  • mapillary_api
  • mapillary_db

For automating the integration tests for mapillary_api we would need a target docker-compose.yml that looks like this:

version: '2'

services:
  mapillary-api:
    build:
      context: .
    command: script/run
  mapillary_db:
    image: mapillary/mapillary-db:a49ca5b920406e920ee4d47366f0f1
  • NOTE! We use the commit hash to tag our images on Docker Hub, pulling in the binaries of the services not under test in order to save build time and reuse them.
  • We build the current code under test in order to have a fresh docker image.

This approach requires continuous updates of the reference to the database docker image. Every time we add a new table or a new column to a table, we need a new docker image and hence have to update the compose file to point to the git version of the new docker image for mapillary_db.

Another alternative would be to have something like this:

version: '2'

services:
  mapillary-api:
    build:
      context: .
    command: script/run
  mapillary_db:
    build:
      context: ../mapillary_db

This last example has a couple of problems. To start with, we would have to build two docker containers every time we want to run the tests. The second problem is that we force people to keep their repos in a given hierarchy; otherwise the contexts we use above would not make sense.

Maybe we could live with that for databases, because they don’t usually change too frequently. But take the environment that we need for running UI tests:

version: '2'

services:
  mapillary-ui:
    build:
      context: .
    command: script/run
  mapillary_db:
    image: mapillary/mapillary-db:a49ca5b920406e920ee4d47366f0f1
  mapillary_api:
    image: mapillary/mapillary-api:asdf87a6sd8f7asdf8as7df6a8s7dfa

You know what I mean now, right? The mapillary_api repo changes all the time and we can’t just update our compose file for running mapillary_ui tests every single time that the API changes. Once again, take this to an architecture where you have more than 100 micro services.

So what’s the best way to deal with this? We decided not to version the docker-compose files we need to create the test environments. Instead, we automatically create them when we need to spin up an environment.

The service-dependency-graph

At Mapillary, every app (aka every repo) builds one docker-image (a provider) and defines its services based on the code in that image and the options the image is started with. For every provided service, its dependencies are stated as services (provided by other repos—exactly the same as you define pip or node dependencies).

From examining and compiling the structure of all repos, providers, services, and their dependencies, we construct a graph of dependencies between them, including the current git versions of these, and persist it in Neo4j.

Example of a dependency chain, visualized with Neo4j

From this graph, we dynamically create a docker-compose file based on those dependencies, depending on what kind of tests you need to run. For system tests, the environment will contain all containers needed to fulfill all leaves of the transitive dependency tree of the service under test. For integration tests, only the implementations of the direct dependencies of the current service are needed.
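
To make this concrete, here is a simplified sketch of the idea (not our actual implementation): a hard-coded dependency graph and image tags stand in for what the framework reads from Neo4j, and PyYAML renders the generated compose definition.

import yaml

# Stand-in for the dependency graph stored in Neo4j.
DEPENDENCIES = {
    "mapillary_ui": ["mapillary_api"],
    "mapillary_api": ["mapillary_db"],
    "mapillary_db": [],
}

# Stand-in for the pinned (commit-hash-tagged) images of services not under test.
IMAGES = {
    "mapillary_api": "mapillary/mapillary-api:a49ca5b920406e920ee4d47366f0f1",
    "mapillary_db": "mapillary/mapillary-db:a49ca5b920406e920ee4d47366f0f1",
}

def transitive_dependencies(service):
    """All services reachable from the service under test (for system tests)."""
    seen, stack = set(), list(DEPENDENCIES[service])
    while stack:
        dep = stack.pop()
        if dep not in seen:
            seen.add(dep)
            stack.extend(DEPENDENCIES[dep])
    return seen

def build_compose(service, system_test=True):
    """Generate a docker-compose definition for the given service under test."""
    deps = transitive_dependencies(service) if system_test else set(DEPENDENCIES[service])
    services = {service: {"build": {"context": "."}, "command": "script/run"}}
    for dep in deps:
        services[dep] = {"image": IMAGES[dep]}
    return yaml.safe_dump({"version": "2", "services": services})

print(build_compose("mapillary_ui"))            # system test environment
print(build_compose("mapillary_api", False))    # integration test environment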

The description file for the mapillary_ui service and its dependencies could look like this:

{
  "name": "mapillary_ui_provider",
  "description": "The main Mapillary website",
  "dependencies": [
    {
      "name": "mapillary_api"
    }
  ]
}

While the downstream service mapillary_api depends on the mapillary_db service:

{
  "name": "mapillary_api",
  "description": “Mapillary API",
  "dependencies": [
    {
      "name": "mapillary_db"
    }
  ]
}

In order to create an environment to run system tests for our service mapillary_ui, the framework traverses all its dependencies and creates a docker-compose file that looks like this:

version: '2'

services:
  mapillary-ui:
    build:
      context: .
    command: script/run
  mapillary_db:
    image: mapillary/mapillary-db:a49ca5b920406e920ee4d47366f0f1
  mapillary_api:
    image: mapillary/mapillary-api:a49ca5b920406e920ee4d47366f0f1

In summary, we don’t version the docker-compose files at Mapillary; instead, we version the dependencies for each app and then our framework will automatically create the docker-compose files based on those dependencies.

Last but not least, there are two more major components to our test framework. One of them is mapillary_messi (yep, because of Messi), and the other is the test_runner.

True blackbox testing with test_runner

The test_runner is simple: it is just another docker container that is always added as part of the docker-compose setup. Every time we create a test environment, there’s a test_runner which bind mounts the tests/ folder from the current source context at /source/tests and runs whatever is inside that folder.

So we would have something like:

mapillary_ui
    |— tests
      |— test_ui.py

mapillary_api
    |— tests
      |— test_api.py

Continuing with the previous examples, this is what the docker-compose file looks like with a test_runner for running system tests:

version: '2'

services:
  mapillary-ui:
    build:
      context: .
    command: script/run
  mapillary_api:
    image: mapillary/mapillary-api:a49ca5b920406e920ee4d47366f0f1
  mapillary_db:
    image: mapillary/mapillary-db:a49ca5b920406e920ee4d47366f0f1
  mapillary_test_runner:
    image: mapillary/mapillary_test_runner:a49ca5b920406e920ee4d4
    volumes:
    - ./tests:/source/tests

Docker-compose file for system tests

And this is how an environment for integration tests would look:

version: '2'

services:
  mapillary-api:
    build:
      context: .
    command: script/run
  mapillary_db:
    image: mapillary/mapillary-db:a49ca5b920406e920ee4d47366f0f1
  mapillary_test_runner:
    image: mapillary/mapillary_test_runner:a49ca5b920406e920ee4d4
    volumes:
    - ./tests:/source/tests

Docker-compose file for integration tests

The test runner is really agnostic of the repo: the tests are mounted from the relative path ./tests in the current context. After running the tests, it produces JUnit XML-style results in ./tests_results, which are then gathered into test reports.

If we are working in the mapillary_ui repo and create a test environment, the test_runner’s /source/tests directory will be mounted from mapillary_ui/tests. Likewise, if developing in the mapillary_api repo, the test runner would run the tests in mapillary_api/tests.
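
Conceptually, the test_runner’s entrypoint then boils down to something like the sketch below. It assumes the tests are executed with pytest (which can run the unittest-style test cases shown later) and that results go to /source/tests_results; our actual runner may differ.

import subprocess
import sys

# The repo's tests/ folder is bind-mounted at /source/tests by docker-compose.
# JUnit XML results are written to a tests_results folder so that the CI
# pipeline can collect them into test reports.
result = subprocess.run(
    ["pytest", "/source/tests", "--junitxml=/source/tests_results/results.xml"]
)
sys.exit(result.returncode)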

Repo-language-agnostic testing with mapillary_messi

The last piece of this puzzle is mapillary_messi—our test framework, written in Python. It’s a common library across all repos at Mapillary, used for all integration and system testing. Its main features are to:

  • Provide common domain abstractions (Models)
  • Provide CRUD operations on domain objects to the underlying persistent stores and messaging systems (Drivers)
  • Easily set up and tear down domain objects with persistence in the underlying stores (Fixtures)
  • Wait and retry functions for assertions that deal with eventual consistency in the system (see the sketch after this list)
  • Provide a common test system across all repos, so that tests can be moved when functionality moves between systems and services, and new developers can easily modify a system by understanding and changing the tests first
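
To illustrate the wait-and-retry point referenced above, here is a minimal sketch of what such a helper could look like. The actual mapillary_messi API may differ; the function name below is hypothetical.

import time

def wait_until(assertion, timeout=30, interval=1):
    """Retry an assertion callable until it stops raising AssertionError or we time out."""
    deadline = time.time() + timeout
    while True:
        try:
            return assertion()
        except AssertionError:
            if time.time() >= deadline:
                raise
            time.sleep(interval)

# Usage: wrap an eventually-consistent assertion, e.g.
# wait_until(lambda: self.assertEqual(user, els_driver.search("users", body=q)))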

Test-in-a-box

Going back to our example, let’s say that we want to test that users can be created through mapillary_ui via browser interaction (system tests) and through mapillary_api programmatically (integration tests).

In that case, mapillary_messi provides:

  • A User model to hold the test data
  • Drivers to do CRUD operations on User from mapillary_db (low level) or mapillary_api (high level)
  • Fixtures that can be used by the test case

Here’s an example of how we can use the same model and driver provided by our test library in mapillary_ui (testing with a Chrome browser to create a user).

from selenium.webdriver import Chrome
from mapillary_messi.db.els_base_driver import ElsBaseDriver
from unittest import TestCase
from mapillary_messi.models.user import User

class UiTestCase(TestCase):

    def test_sign_up(self):
        # Model holding the test data
        user = User()

        # Drive the sign-up flow through the UI
        driver = Chrome()
        driver.get("http://mapillary-ui.com")

        driver.find_element_by_id("email_input").send_keys(user.email)
        driver.find_element_by_id("password_input").send_keys(user.password)
        driver.find_element_by_id("signin_btn").click()

        # Assert that the user was persisted in Elasticsearch
        els_driver = ElsBaseDriver()
        q = {"query": {"bool": {"filter": {"term": {"email": user.email}}}}}
        els_user = els_driver.search("users", body=q)

        self.assertEqual(user, els_user)

The same test code can be reused for setting up a User and checking data through mapillary_api:

import requests
from mapillary_messi.db.els_base_driver import ElsBaseDriver
from mapillary_messi.fixtures.user_fixture import UserFixture
from testtools import TestCase  # testtools' TestCase provides useFixture()
from mapillary_messi.models.user import User

class ApiTestCase(TestCase):

    def test_create_users(self):
        # Model holding the test data, set up (and torn down) via a fixture
        user = User()
        user_fixture = UserFixture(user)
        self.useFixture(user_fixture)

        # Create the user through the API
        requests.post("http://mapillary-api:4000/users/", data=user)

        # Assert that the user was persisted in Elasticsearch
        els_driver = ElsBaseDriver()
        q = {"query": {"bool": {"filter": {"term": {"email": user.email}}}}}
        els_user = els_driver.search("users", body=q)

        self.assertEqual(user, els_user)

Waiting before the test!

Running the tests in a CI pipeline is relatively easy. The tricky part is waiting for the different deployed services to be initialized and ready, in the correct startup order, before triggering the tests. Remember that we are starting an environment from scratch; just having all containers running is not enough. Services can take minutes to start up; for instance, databases need to create their schemas before becoming available.

To handle this, we added the concept of a WAIT_PORT to the provider metadata. Each provider has a WAIT_PORT that it starts listening on as soon as its services are ready to be accessed. In our test setup, services wait for their dependencies to become available before initializing, which makes the startup order deterministic and consistent, as opposed to simply starting all containers in the docker-compose file at once.

{
  "name": "mapillary_ui_provider",
  "description": "The main Mapillary website",
  "dependencies": [
    {
      "name": "mapillary_api"
    }
  ],
  "WAIT_PORT": 3000
}
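
On the consuming side, waiting for a dependency’s WAIT_PORT is essentially a TCP poll. A minimal sketch (not our actual implementation) could look like this:

import socket
import time

def wait_for_port(host, port, timeout=300):
    """Block until a TCP connection to host:port succeeds, or raise after the timeout."""
    deadline = time.time() + timeout
    while True:
        try:
            with socket.create_connection((host, port), timeout=5):
                return
        except OSError:
            if time.time() >= deadline:
                raise TimeoutError("%s:%s not ready after %ss" % (host, port, timeout))
            time.sleep(2)

# Example: wait for the mapillary_ui provider above, which declares WAIT_PORT 3000.
# wait_for_port("mapillary-ui", 3000)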

Summary

To give some numbers—this framework allows us to have more than 100 UI test cases consistently running in Jenkins for our web interface and more than 1,000 test cases for our main APIs.

We also improved development speed by giving developers an easy way to create an end-to-end environment for trying out their changes, without touching the production systems, deploying the changes to a staging environment, or waiting for the server build pipeline.

Making changes in unfamiliar code bases is now also much easier, since the tests are easy for all developers to understand and follow a standard way of being set up and run across the whole company.

Considering the rest of the tooling—it is literally one command in localhost and boom.

Happy testing!

/Santiago, Software Engineer
