How data engineering is powering Pinterest’s global platform
Pinterest is the visual inspiration platform people around the world use to shop for products personalised to their taste, find ideas and crafts to do offline, and discover the most inspiring creators.
Pinterest began as a tool to help people collect the things they were passionate about online. Today, more than 460 million people visit the platform every month to explore and experience billions of ideas.
Central to powering this platform is data engineering on a vast scale – as Dr. Dave Burgess, VP of Data Engineering at Pinterest, explains.
“In Data Engineering we create and run reliable and efficient planet-scale data platforms and services to accelerate innovation at Pinterest and sustain our business,” he says. “We do everything from online data systems, to logging data, big data and stream processing platforms, analytics and experimentation platforms, machine learning (ML) platforms, and the Pinterest Developer Platform for external developers to build applications using Pinterest APIs.”
As Burgess explains, one of the biggest challenges in data engineering is improving Pinterest’s developer productivity, which is measured through surveys and the time taken to complete tasks: “For example, the time it takes to train and deploy a new machine learning model or run an experiment.”
From the survey results, a developer productivity NPS (Net Promoter Score) is calculated, from +100 to -100. “When I first started at Pinterest four years ago, our developer productivity NPS was -5 and now it’s +65.”
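To illustrate how a score on that scale is derived, here is a minimal sketch of the standard Net Promoter Score calculation. It assumes the conventional 0-10 survey scale, where ratings of 9-10 count as promoters and 0-6 as detractors. The article does not describe Pinterest's actual survey design, so the function name and the sample ratings are purely hypothetical.

```python
def net_promoter_score(ratings):
    """Return the NPS (from -100 to +100) for a list of 0-10 survey ratings.

    Assumes the conventional definition: promoters rate 9-10,
    detractors rate 0-6, and passives (7-8) only affect the total.
    """
    if not ratings:
        raise ValueError("need at least one rating")
    promoters = sum(1 for r in ratings if r >= 9)
    detractors = sum(1 for r in ratings if r <= 6)
    return 100.0 * (promoters - detractors) / len(ratings)


# Hypothetical survey responses, not real Pinterest data.
sample = [10, 9, 9, 8, 7, 6, 3, 10, 9, 5]
print(net_promoter_score(sample))  # 5 promoters, 3 detractors, 10 responses -> 20.0
```

Because the score is the percentage of promoters minus the percentage of detractors, a move from -5 to +65 means promoters went from being slightly outnumbered to leading detractors by 65 percentage points.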
Since joining the business four years ago, Burgess has overseen the replacement of many of Pinterest’s data engineering systems with the latest in open-source software. “We’ve also built machine learning and experimentation platforms on top of our data platform, increased ML Engineering velocity by 10x and run hundreds of new experiments every week,” he adds. “We’ve also democratised our data so that everyone in the company can use data to make decisions, build applications and experiment. All of this has significantly improved our agility, developer productivity, and the products for our customers.”
Pinterest’s machine learning and experimentation platforms
‘Under the covers’, according to Burgess, Pinterest is a ‘massive ML machine’: “We use ML to generate recommendations for our home feed, search results, related products, advertising, and also have augmented reality for our Pinners (the affectionate name we call our users) to see makeup on their face.”
Central to Pinterest’s success is its ML platform. It powers everything from product recommendations and image categorisation to online advertising and spam filtering and, Burgess explains, enables Pinterest’s engineers to be significantly more productive.
“Our ML engineers can iterate much more quickly, building and deploying new ML models in a day, performing offline training to iterate and improve their models offline before testing them with real production traffic, and have production ML systems be automatically monitored and self-healed,” he comments.
One such ML-powered feature is Pinterest Lens, a visual search tool that lets users search for ideas and products using images. The technology behind it is computer vision, which identifies objects in photos and suggests related content so that users can find similar items on Pinterest. These innovations, Burgess explains, are powered by open-source and internal advancements in ML technology.
“Our ML platform is built with a combination of open-source ML technologies, like PyTorch, TensorFlow and MLflow, and tech that integrates with our own big data and online systems,” he explains. “That enables us to train ML models and automatically deploy them into serving systems for ML inference.”
Pinterest is defined by a culture of experimentation. As Burgess describes, its Experimentation Platform promotes data-driven decision-making throughout the organisation and enables teams to test thousands of new ideas.
“Our Experimentation Platform is designed to support rapid iteration and the continuous improvement of our products, and allow us to quickly test and refine new features, user interfaces, and other elements of the user experience. By using data to guide our product development decisions, Pinterest is able to better meet the needs and preferences of our users, as well as increase inspiration.”
A next-generation data warehouse
One of a number of changes made to Pinterest’s data systems is the building of a next-generation data warehouse and the transition to a Data Mesh, an emerging approach to data architecture that aims to address the challenges of managing large and complex data environments. The concept was first introduced in 2019 by Zhamak Dehghani, a software architect at ThoughtWorks.
“At a high level, Data Mesh is a decentralised data architecture that emphasises data ownership and autonomy,” Burgess explains. “Rather than having a central data team manage all the data for an organisation, Data Mesh encourages each business unit or team to take ownership of their own data domains, managing their data in a way that is best suited to their needs.”
This approach involves breaking down data into smaller, more manageable domains that can be owned and managed by individual teams. Each team is responsible for the data within their domain, including defining the schema, ensuring data quality, and providing access to other teams that need to use the data.
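The ownership model described above can be sketched in a few lines of Python. This is an illustrative toy rather than Pinterest's implementation: the `DataDomain` class, its fields, and the team and column names are all hypothetical, chosen only to show one team owning a schema, checking data quality against it, and granting access to other teams.

```python
from dataclasses import dataclass, field


@dataclass
class DataDomain:
    """Hypothetical model of a data domain owned by a single team."""
    name: str
    owner_team: str
    schema: dict                      # column name -> expected Python type
    readers: set = field(default_factory=set)

    def grant_access(self, team):
        """The owning team decides which other teams may read the data."""
        self.readers.add(team)

    def validate(self, row):
        """Return a list of data-quality problems found in one row."""
        problems = []
        for column, expected in self.schema.items():
            if column not in row:
                problems.append(f"missing column: {column}")
            elif not isinstance(row[column], expected):
                problems.append(f"{column}: expected {expected.__name__}")
        return problems


# Hypothetical domain, team names, and columns.
catalogue = DataDomain(
    name="shopping_catalogue",
    owner_team="shopping",
    schema={"product_id": int, "title": str},
)
catalogue.grant_access("ads")

print(catalogue.validate({"product_id": 1, "title": "Lamp"}))  # no problems
print(catalogue.validate({"product_id": "1"}))                 # two problems
```

Keeping the schema, quality checks, and access list together in one object mirrors the idea that each domain team, rather than a central data team, is accountable for all three.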
To enable collaboration and sharing across domains, Pinterest has a catalogue of schemas and metadata stored in Apache DataHub, has standardised its data vocabularies and metrics, has tiered the quality of its data, and has integrated its open-sourced Querybook platform to collaborate and share SQL queries.
“Querybook is an open-source data collaboration platform developed by Pinterest,” Burgess explains. “It has a user-friendly interface for data analysts and engineers to collaborate on data analysis tasks, allowing them to share queries, datasets, and insights with one another. It’s the most popular and highly rated internal tooling platform at Pinterest.”
As Burgess describes, Querybook also supports ad-hoc data analysis, generating visualisations, and even building machine learning models: “We’ve also built a ChatGPT-like interface to automatically generate and execute queries from a text business statement. For example, you could ask it how many daily active users there are on Pinterest over the past month and it will generate a SQL query with the right tables and fields.”
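To make the daily-active-users example concrete, the sketch below shows the kind of SQL such a question might translate into, run against an in-memory SQLite database so that it is self-contained. The `user_activity` table, its columns, and the sample rows are assumptions made for illustration. Pinterest's real tables, fields, and query engine are not described in the article.

```python
import sqlite3

# Hypothetical activity log: one row per user action.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE user_activity (user_id INTEGER, activity_date TEXT)")
conn.executemany(
    "INSERT INTO user_activity VALUES (?, ?)",
    [
        (1, "2023-05-01"), (2, "2023-05-01"), (1, "2023-05-01"),
        (1, "2023-05-02"), (3, "2023-05-02"),
        (2, "2023-04-01"),  # older than one month, so excluded below
    ],
)

# Daily active users for the month ending on :as_of.
DAU_QUERY = """
SELECT activity_date, COUNT(DISTINCT user_id) AS daily_active_users
FROM user_activity
WHERE activity_date >= date(:as_of, '-1 month')
  AND activity_date <= :as_of
GROUP BY activity_date
ORDER BY activity_date
"""

rows = conn.execute(DAU_QUERY, {"as_of": "2023-05-02"}).fetchall()
for activity_date, dau in rows:
    print(activity_date, dau)
```

`COUNT(DISTINCT user_id)` is what makes this a count of active users rather than a count of actions: the user who appears twice on 1 May is counted once.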
“Overall,” Burgess asserts, “Data Mesh represents a new way of thinking about data architecture that helps us to manage our large and complex data environment more effectively, while also fostering greater collaboration and innovation.”
Building a successful partner ecosystem
Pinterest’s Data Engineering department works with a number of third-party partners, including AWS for cloud infrastructure and Percona for MySQL support. It also works on open-source software with companies such as Netflix, Lyft, Airbnb, AWS, Starburst (for Presto/Trino), StarRocks Technologies, and Preset (for Superset), and collaborates closely with the open-source community.
Another of Pinterest’s partners, PingCAP, has assisted with the deployment of its TiDB system, a distributed SQL database engine. The move gave users better data consistency and reduced tail latencies by 30-90%, while cutting hardware instance costs by more than 50%.
“We had been using an older version of HBase for many years which is a scalable open-source, distributed, column-oriented NoSQL database,” Burgess explains. “We’ve made many fixes to HBase over the years to make it fault-tolerant at our scale on AWS, used it for different kinds of use cases, and added a lot of functionality on top.”
“The biggest pain points with this older version of HBase were: the total cost of ownership to maintain and run this; limited functionality, which led to lower engineering productivity and increased application complexity; the lack of data consistency across tables, affecting our users’ experience; and the scalability requirements our internal users wanted to run at.”
This partnership with PingCAP to use TiDB is already reaping benefits, providing better data consistency, a lower total cost of ownership, and more powerful features than the previous solution, HBase.
“As a NewSQL database, TiDB provides a scalable solution in a huge problem space for use cases that need stronger consistency or richer functionalities,” Burgess explains. “It fills in the gap between our existing SQL and NoSQL systems, allowing developers to build storage applications faster without making painful tradeoffs.”
“All these factors combined enable us to more easily build and scale business-critical applications including shopping catalogues, advertising index systems, trust and safety systems and many more.”
What are Pinterest’s main aims for the next five years?
As Burgess describes, central to Pinterest’s plans for the future is innovating and creating new technologies and products that Put Pinners First. “This means enhancing the user experience and driving growth internationally.”
The organisation will also look to improve its advertising products and expand its advertising partnerships with businesses of all sizes, while becoming a more sustainable and socially responsible company. Reducing its environmental impact is part of the latter, as is promoting diversity and inclusion, in addition to supporting causes related to social and environmental issues.
“We will make it easier for Pinners to shop for the things they love. They’ll be able to go from being inspired to making this a reality in their lives,” Burgess adds. “We will also be a more sustainable company, with almost 100% renewable energy for our operations. This includes renewable energy for our offices and data centres.”
With the space moving quickly, making the most of the opportunities presented by developments in ML and AI will also be central to Pinterest’s success going forward.
“This space is changing quickly with the recent advances in Large Language Models, Stable Diffusion, and Transformer models,” Burgess concludes. “We have the ability to generate images and text answers, augment ML models with more data, recognise objects in images, and create an augmented reality. We can also significantly improve our productivity with AI-assisted bots that generate code and answers.”
“There are many applications of this and it’s going to be a game changer.”