As the leading digital marketplace for residential and commercial real estate, ImmoScout24 has been successfully bringing together homeowners, real estate agents, tenants, and buyers for over 20 years. ImmoScout24 is continually developing new products to build an ecosystem for real estate in Germany and Austria. The online marketplace is part of the Scout24 Group.
From a manual process to automation
As part of its ongoing platform optimization efforts, ImmoScout24 wanted to help real estate sellers better navigate its digital marketplace offering by creating an intuitive report. From a technology perspective, the report would turn raw data into a one-pager that gives sellers an overview of the data relevant to them.
Today, ImmoScout24 automatically creates a business intelligence (BI) report for sellers on its platform so they can analyze their customer data, evaluate product performance and make better-informed decisions. Before the development of this new BI report, this work was done manually.
Introducing a new approach to data
To create this automated business intelligence report, ThoughtWorks worked with ImmoScout24, pulling data from several products and data sources across the company.
From an organizational perspective, the data team worked independently from the rest of the product teams. Coordinating several product teams across the company with the data teams (engineers and analysts) required extra effort to understand the different team roadmaps and products. Often, the limited availability and complexity of the interactions created bottlenecks. Further, the ThoughtWorks project team first had to build up context and knowledge about how data was collected and stored in the data lake.
Most of the data work done so far had been exploration and market analysis. From a technology perspective, the existing code was treated as a non-production asset since it was only temporary. This meant that core company standards like versioning, testing, continuous deployment and observability were not applied to these types of reports. To develop data-centric products, the ImmoScout24 team needed a new way of working.
To understand the client's pain points, ThoughtWorks collaborated on a daily basis with the data analysts and data engineers. This helped the team understand how existing tools provided by the data platform could be leveraged to build the report. After navigating the data lake independently, the team created the necessary algorithms and a first-level implementation of the reporting logic.
This project is a shining example of how a team tackled a complex, data-intensive topic and managed various dependencies successfully. It entailed us digging deep into our data landscape. We gained a profound understanding of the importance of upfront data validation and close collaboration with data analysts.
Creating an automatic BI report for sellers
Today, ImmoScout24 automatically creates a business intelligence report for sellers on its platform. Sellers can now query monthly customer data, evaluate how products are performing and make better decisions about which products could help them the most.
The business intelligence report is built on top of 20 AWS data pipelines. Some pipelines create the data for the monthly report (figures and month-over-month KPI comparisons), while others create charts and visualizations. Before the development of the BI report, this work was done manually by one person and took nearly two full working days. The reports are now deployed automatically through Jenkins pipelines, without manual intervention.
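The month-over-month KPI comparison mentioned above can be illustrated with a small sketch. This is not ImmoScout24's implementation: the KPI names, the sample values, and the `compare_months` helper are all hypothetical, and a real pipeline would most likely compute this at scale rather than in plain Python.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class KpiComparison:
    name: str
    current: float
    previous: float
    change_pct: Optional[float]  # None when the previous value is zero


def compare_months(current: dict, previous: dict) -> list:
    """Compare each KPI in `current` with the same KPI in `previous`."""
    results = []
    for name in sorted(current):
        cur = float(current[name])
        prev = float(previous.get(name, 0.0))
        # A percentage change is undefined when there is no previous value.
        change = None if prev == 0 else round((cur - prev) / prev * 100, 1)
        results.append(KpiComparison(name, cur, prev, change))
    return results


# Hypothetical KPI values for one seller in two consecutive months.
march = {"listing_views": 1200, "contact_requests": 45}
april = {"listing_views": 1500, "contact_requests": 36}

report_rows = compare_months(april, march)
for row in report_rows:
    print(row)
```

Returning `None` instead of a percentage when the previous month has no value keeps the report from showing a misleading or infinite change for newly introduced KPIs.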
The pipelines include Amazon Kinesis Data Firehose delivery streams, which are used to ingest data via HTTP REST APIs. The team implemented a custom layer to collect the requests into packages. The data is then processed by a complex architecture using Amazon SQS queues, AWS Step Functions, Amazon EventBridge events, and Spark jobs running on Amazon EMR clusters. To achieve high pipeline performance, the team uses Amazon EMR, a managed, scalable service, and has enabled EMR managed scaling to add nodes for large datasets.
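The source does not describe how the custom layer that collects requests into packages works. The following is only a minimal sketch of one common approach, buffering records until a count or byte limit is reached. The class name `RequestBatcher`, the limits, and the `packages` list (which stands in for the actual downstream delivery call) are assumptions made for illustration.

```python
import json


class RequestBatcher:
    """Collects JSON-serializable records and emits them in packages."""

    def __init__(self, max_records=3, max_bytes=1024):
        self.max_records = max_records  # hypothetical limit
        self.max_bytes = max_bytes      # hypothetical limit
        self._buffer = []
        self._size = 0
        self.packages = []  # stands in for the downstream delivery call

    def add(self, record):
        encoded = json.dumps(record).encode("utf-8")
        # Flush first if this record would push the package over the byte limit.
        if self._buffer and self._size + len(encoded) > self.max_bytes:
            self.flush()
        self._buffer.append(encoded)
        self._size += len(encoded)
        # Flush as soon as the package holds the maximum number of records.
        if len(self._buffer) >= self.max_records:
            self.flush()

    def flush(self):
        """Emit the buffered records as one package, if there are any."""
        if self._buffer:
            self.packages.append(self._buffer)
            self._buffer = []
            self._size = 0


batcher = RequestBatcher(max_records=3)
for i in range(7):
    batcher.add({"event_id": i})
batcher.flush()  # emit the final, partially filled package
```

Packaging requests this way trades a little latency for fewer, larger writes to the downstream stream, which is usually the motivation for such a layer.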
The development experience for the data product has also changed significantly. Feedback loops for report changes have become faster. The business logic is better separated across different data pipelines, and it is now possible to run several pieces of business logic in parallel and aggregate their results.
ImmoScout24 regards this successful proof of concept as a best practice, accompanied by a catalogue on its implementation. The project gave the team the chance to influence and start an organization-wide movement toward making data a self-serve asset for product teams. It was also a first step toward implementing a data mesh architecture for the company.