Ever since we launched the new version of the Mapillary platform in June, we have been busy iterating on features and improving stability, often with help from the Mapillary community, who provide valuable feedback. In this post we summarize some of the most recent changes and updates.
Our mobile apps are the easiest way to capture imagery and upload it directly to Mapillary. We have followed up with various releases since June. For Android, we have released 5 updates since the initial 5.0 release, including:
The Desktop Uploader is the most convenient way to upload larger amounts of images captured with action cameras or other external cameras. Since June we followed up with 5 significant releases covering:
If you’d like to control upload from other tools or scripts, or want to build more complex processing pipelines, our command line interface (CLI) tools have been and remain the way to go. We’ve updated the tools to use the new platform and followed up with various updates since.
The new Mapillary API and vector tiles have been launched, approaching Mapillary data from a different perspective. The API focuses on retrieving information about entities, such as an image, a traffic sign, or a map feature, while the vector tiles are served from a new endpoint that we encourage using for bulk data downloads. Since its initial launch in June we have added the following features to the API:
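As a rough illustration of the two access patterns described above (per-entity lookups through the API, bulk data through vector tiles), here is a minimal Python sketch that only builds request URLs. The endpoint roots, the `fields` and `access_token` parameter names, and the example field names are assumptions for illustration and should be checked against the current API documentation; the tile index calculation is the standard Web Mercator (slippy map) formula.

```python
import math
from urllib.parse import urlencode

# Assumed endpoint roots; verify against the official API documentation.
GRAPH_ROOT = "https://graph.mapillary.com"
TILE_ROOT = "https://tiles.mapillary.com/maps/vtp/mly1_public/2"


def entity_url(entity_id, fields, access_token):
    """Build a URL that asks for selected fields of a single entity."""
    query = urlencode(
        {"access_token": access_token, "fields": ",".join(fields)},
        safe=",",  # keep the comma-separated field list readable
    )
    return f"{GRAPH_ROOT}/{entity_id}?{query}"


def lonlat_to_tile(lon, lat, zoom):
    """Standard Web Mercator tile index for a longitude/latitude pair."""
    n = 2 ** zoom
    x = int((lon + 180.0) / 360.0 * n)
    y = int((1.0 - math.asinh(math.tan(math.radians(lat))) / math.pi) / 2.0 * n)
    return x, y


def tile_url(lon, lat, zoom, access_token):
    """Build a URL for the vector tile that covers the given point."""
    x, y = lonlat_to_tile(lon, lat, zoom)
    query = urlencode({"access_token": access_token})
    return f"{TILE_ROOT}/{zoom}/{x}/{y}?{query}"


if __name__ == "__main__":
    # "123" and "TOKEN" are placeholders, not real identifiers.
    print(entity_url("123", ["id", "geometry"], "TOKEN"))
    print(tile_url(13.0, 55.6, 14, "TOKEN"))
```

The entity URL suits questions about one known object, while the tile URL suits downloading everything in an area; fetching and decoding the responses is left out to keep the sketch free of network access.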
Integrations of Mapillary imagery and data into OSM editing tools (JOSM, iD and RapiD) are essential to effectively and efficiently leverage the community’s Mapillary contributions to improve OpenStreetMap. We’ve updated all the integrations to leverage the new platform and APIs:
MapillaryJS is an interactive, extendable street-level imagery and semantic mapping visualization platform and a reusable component on the web. Our recent 4.0 release brings major improvements across the board. A dedicated post goes into all the details.
When Mapillary joined Facebook, one of our goals was to leverage Facebook’s world-class infrastructure as a solid and scalable foundation for the future of Mapillary. We wanted to ensure scalable and reliable upload and processing, as well as scalable and low-latency access to Mapillary data. This meant more than migrating from Amazon Web Services (AWS) to Facebook systems: we replaced many of the key layers of our system with Facebook’s equivalents and re-architected the system from the ground up.
The biggest change during the transition was to Mapillary’s overall processing model: we moved from a pure streaming approach to a combination of batching and streaming. Historically, we processed everything using Apache Storm (with Trident) and Apache Kafka. When a user uploaded a new image, we scheduled the “affected” geographic areas for processing. As a result, some very active areas were not processed for long periods (hours), because more images kept arriving and the system couldn't "close" the area for processing. In contrast, other areas were reprocessed multiple times. Add the time required to actually run the algorithms or generate vector tiles, and you end up with update and processing times that vary widely by geography.
To work around this and make the system more predictable, we introduced a daily batch model for the parts of the system that needed it most. We now combine streaming and batching, but keep the surfaces they affect separate (e.g. visualization of map data is batched, while imagery processing is still streaming). For users of the platform this means that we keep processing data as soon as it comes in and keep displaying the relevant data, and then, on a daily cadence, we regenerate the tiles for the whole world to ensure consistency. For this data crunching and tile generation we use several of Facebook’s technologies, e.g. Hive, Hadoop and FBLearner Flow.
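To make the scheduling problem above more concrete, here is a toy Python model. It is not the actual Storm/Kafka pipeline. It assumes, purely for illustration, that an area "closes" and gets processed only after a quiet period with no new uploads, and compares that with fixed-cadence batch runs. The function names, the quiet-period rule, and the numbers are all hypothetical.

```python
def debounced_runs(arrivals, quiet_period):
    """Times at which an area is processed if it only closes after
    `quiet_period` minutes without a new upload (assumed rule)."""
    runs = []
    for i, t in enumerate(arrivals):
        is_last = i + 1 == len(arrivals)
        # The area closes only if the next upload is far enough away.
        if is_last or arrivals[i + 1] - t >= quiet_period:
            runs.append(t + quiet_period)
    return runs


def batch_runs(arrivals, cadence):
    """Times of fixed-cadence batch runs that pick up at least one upload."""
    return sorted({(t // cadence + 1) * cadence for t in arrivals})


if __name__ == "__main__":
    busy = list(range(0, 180, 5))   # an upload every 5 minutes for 3 hours
    sparse = [0, 60, 120]           # three isolated uploads

    print(debounced_runs(busy, quiet_period=10))    # one late run
    print(debounced_runs(sparse, quiet_period=10))  # one run per upload
    print(batch_runs(busy, cadence=1440))           # one daily run
```

In this toy model the busy area is not processed until the uploads stop, while the sparse area is reprocessed after every upload. With a fixed cadence the delay no longer depends on how active an area is, which is the predictability the batch model is after.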
For storage and delivery of binary data we moved from Amazon S3 to Facebook’s storage systems and global CDN.
For our public v4 APIs, we're using Facebook-style Graph APIs in the new model and represent everything that third-party developers building applications on top of the Mapillary platform work with as a graph. This model is heavily inspired by GraphQL.
GraphQL is also how our iOS, Android and web apps now interact with the Facebook backend. It allows us to load only what we need, which makes loading faster, and to combine requests so that loading complex data (user/org profiles and feeds, dashboard etc.) takes fewer API requests. This directly contributed to the performance improvements that we’ll show further down.
The benefit of this major refactor of the complete stack from storage to frontend is that:
The performance and stability improvements should now be noticeable across the platform. One example of their impact is the frontend performance of our web app before and after the updates: time to interactive improved by almost an order of magnitude:
/ Till