At HTAP Summit 2024, Dave Burgess, an industry veteran, angel investor, and ex-VP of Data Engineering at Pinterest, walked through why distributed SQL has emerged as a key component of modern SaaS platforms. As Burgess explained during his keynote, traditional databases like MySQL and PostgreSQL may serve startups initially. However, as businesses scale and encounter complex data demands, they often hit performance and scalability roadblocks.
In this blog, we explore Burgess’s key drivers for distributed SQL adoption along with the essential KPIs for SaaS success. We’ll also discuss the unique advantages distributed SQL provides to solve critical challenges facing data-intensive businesses.
The Growing Need for Scalable SaaS Architectures
Burgess underscored the importance of scalability in the SaaS industry, where both business requirements and technical demands are constantly evolving. SaaS companies, especially those with large-scale platforms like Salesforce or Workday, rely heavily on cloud infrastructure and must contend with unpredictable spikes in usage. With generative AI applications driving rapid growth, the demand for robust data systems is accelerating. Distributed SQL emerges as a powerful solution, providing the scalable, multi-tenant capabilities that these platforms require.
Burgess discussed how the scalability of distributed SQL enables SaaS applications not only to manage growing data volumes but also to handle the complexity of high-traffic periods while maintaining a seamless user experience. He noted that scalable architectures are crucial for supporting customer acquisition, increasing usage, and retaining existing clients, all while maintaining high availability, security, and compliance.
Key KPIs for Successful Modern SaaS Platforms
To determine the technical requirements of a successful SaaS platform, Burgess identified four major KPIs:
- Customer Acquisition: SaaS companies need to attract users through free trials and high conversion rates to paid subscriptions.
- Revenue Growth: A subscription-based model incentivizes SaaS companies to increase both customer volume and usage among existing users.
- Net Dollar Retention: Minimizing churn and maximizing customer retention drive the long-term success of SaaS models.
- Gross Margin: SaaS businesses must optimize their cost-to-revenue ratio, reducing operational expenses while maximizing profitability.
To meet these KPIs, Burgess emphasized that modern SaaS platforms require a scalable, highly available, and cost-effective data architecture. Distributed SQL databases offer the flexibility to handle diverse customer needs, from small businesses to large enterprises, while providing the elasticity to scale seamlessly during peak usage periods.
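To make two of these KPIs concrete, here is a minimal sketch of how net dollar retention and gross margin are commonly calculated. The formulas are widely used definitions rather than something taken from the keynote, and the function names and dollar figures are hypothetical, chosen only for illustration.

```python
def net_dollar_retention(starting_mrr, expansion, contraction, churned):
    """NDR = (starting + expansion - contraction - churned) / starting.

    All inputs are recurring revenue from the SAME cohort of customers
    over the measurement period (an assumption of this sketch).
    """
    if starting_mrr <= 0:
        raise ValueError("starting_mrr must be positive")
    return (starting_mrr + expansion - contraction - churned) / starting_mrr


def gross_margin(revenue, cost_of_revenue):
    """Gross margin = (revenue - cost of revenue) / revenue."""
    if revenue <= 0:
        raise ValueError("revenue must be positive")
    return (revenue - cost_of_revenue) / revenue


# Hypothetical figures, for illustration only.
ndr = net_dollar_retention(
    starting_mrr=100_000, expansion=15_000, contraction=3_000, churned=7_000
)
margin = gross_margin(revenue=500_000, cost_of_revenue=125_000)
print(f"NDR: {ndr:.0%}, gross margin: {margin:.0%}")
```

Infrastructure spend, including the database layer, typically sits inside cost of revenue, which is why data architecture choices show up directly in the gross margin figure.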
The Challenges of Scaling Traditional Database Architectures
Burgess explored the limitations of conventional database systems and why they struggle to meet modern SaaS requirements. He explained that most companies begin with a simple SQL database but quickly outgrow this setup as data volumes increase. To manage the growing demand, these companies often resort to database sharding, automated re-sharding, or adding systems like NoSQL databases for specific needs, such as search or real-time analytics.
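One way to see why manual sharding becomes a burden is to look at what happens when a shard is added. The sketch below assumes a naive hash-and-modulo placement scheme, which is an assumption for illustration and not a scheme described in the keynote; the helper names and customer IDs are hypothetical. Production systems often use consistent hashing or range-based placement to reduce this movement.

```python
import hashlib


def shard_for(key: str, num_shards: int) -> int:
    """Map a key to a shard using a stable hash and modulo placement."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_shards


def fraction_moved(keys, old_shards: int, new_shards: int) -> float:
    """Share of keys whose shard changes when the shard count changes."""
    moved = sum(
        1 for k in keys if shard_for(k, old_shards) != shard_for(k, new_shards)
    )
    return moved / len(keys)


# Hypothetical customer IDs; real keys would come from the application.
keys = [f"customer-{i}" for i in range(10_000)]
moved = fraction_moved(keys, old_shards=4, new_shards=5)
print(f"Keys relocated when growing from 4 to 5 shards: {moved:.0%}")
```

With uniformly distributed hashes, a key stays put only when its hash gives the same remainder modulo 4 and modulo 5, which happens for about one key in five. Roughly 80% of the data would therefore need to move for a single added shard, and that migration work is the kind of overhead the application team ends up owning.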
However, this approach introduces data inconsistency, operational complexity, and substantial maintenance costs. As more systems are added, data engineers spend increasing amounts of time managing schema synchronization and addressing data discrepancies across platforms. This fragmentation also makes it difficult for developers to innovate rapidly and leads to high operational costs. Burgess explained that distributed SQL consolidates these disparate systems, offering scalability and strong consistency without sacrificing performance.
Meeting the Need for High Availability in Modern SaaS Platforms
For modern SaaS platforms, maintaining high availability is paramount. Burgess highlighted how distributed SQL supports high availability through data replication and automated failover across nodes. In cloud environments, where instances or even entire availability zones may occasionally go down, distributed SQL ensures that applications remain operational by redirecting traffic to available replicas. This capability is essential for building user confidence, particularly in mission-critical applications.
Ex-Pinterest VP of Data Engineering Dave Burgess on stage during his keynote at HTAP Summit 2024.
Burgess also addressed the trade-offs between consistency and latency in distributed environments. In high-latency scenarios, such as cross-region databases, a slight consistency lag may be acceptable. However, within the same region, distributed SQL ensures strong consistency, supporting a seamless user experience without compromising performance.
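The failover behavior described in this section can be pictured with a toy routing model. This is a simplified sketch under stated assumptions, not how any particular distributed SQL database implements failover: the `Replica` class, the `route` function, and the node and zone names are all hypothetical, and in a real system this logic lives inside the database rather than in application code.

```python
from dataclasses import dataclass


@dataclass
class Replica:
    name: str
    zone: str
    healthy: bool = True


def route(replicas, preferred_zone):
    """Pick a healthy replica, preferring the caller's zone."""
    healthy = [r for r in replicas if r.healthy]
    if not healthy:
        raise RuntimeError("no healthy replica available")
    # sorted() is stable: replicas in the preferred zone sort first,
    # and the original order is preserved among the rest.
    return sorted(healthy, key=lambda r: r.zone != preferred_zone)[0]


replicas = [
    Replica("node-a", "zone-1"),
    Replica("node-b", "zone-2"),
    Replica("node-c", "zone-3"),
]
before = route(replicas, preferred_zone="zone-1")
replicas[0].healthy = False  # simulate losing zone-1
after = route(replicas, preferred_zone="zone-1")
print(before.name, "->", after.name)
```

The point of the sketch is that the caller's request keeps succeeding after a zone is lost; only the node that serves it changes.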
Ensuring Multi-Tenancy for Scalability in Modern SaaS Platforms
Another essential feature for modern SaaS platforms is multi-tenancy, which allows a single system to support multiple customers while isolating their data and workloads. Burgess noted that distributed SQL excels at separating storage and compute resources for individual tenants, providing data and workload isolation so that tenants cannot access each other’s data. This setup also enables elastic scalability: the system can allocate additional resources to high-demand tenants while keeping costs manageable.
The flexible multi-tenant architecture of distributed SQL empowers SaaS companies to adopt hybrid models. In these scenarios, smaller customers share a database while larger clients have dedicated resources. This approach optimizes operational costs while maintaining data security and performance.
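The hybrid model can be sketched as a simple placement rule. The example below is illustrative only: the size threshold, the cluster naming, and the tenant names are hypothetical assumptions, and a real platform would likely weigh factors such as query load, contract tier, or compliance requirements alongside data size.

```python
SHARED_POOL = "shared-cluster"
DEDICATED_THRESHOLD_GB = 500  # hypothetical cutoff for dedicated resources


def placement_for(tenant_id: str, data_size_gb: float) -> str:
    """Return the resource pool a tenant should be placed in."""
    if data_size_gb >= DEDICATED_THRESHOLD_GB:
        # Large tenants get their own resources for isolation.
        return f"dedicated-{tenant_id}"
    # Small tenants share one pool to keep per-tenant cost low.
    return SHARED_POOL


# Hypothetical tenants and data sizes in GB.
tenants = {"acme": 1200.0, "smallco": 3.5, "midco": 80.0}
placements = {name: placement_for(name, size) for name, size in tenants.items()}
for name, pool in placements.items():
    print(f"{name}: {pool}")
```

Keeping the rule in one function means a tenant that grows past the threshold can be moved to dedicated resources without changing how the rest of the application addresses it.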
Reducing Architectural Complexity with Distributed SQL
Burgess also explored how distributed SQL reduces architectural complexity by consolidating multiple data stores into a single system. Traditional architectures require separate relational, NoSQL, search, and real-time analytics databases, each with its own schema and synchronization requirements. Distributed SQL, on the other hand, integrates these functionalities, streamlining data management and improving consistency.
With distributed SQL, companies can adapt quickly to changing business requirements, because schema changes, data updates, and new features can be implemented without disrupting operations. This simplified architecture not only improves data integrity and availability but also reduces costs and allows engineers to focus on building innovative features rather than maintaining infrastructure.
Real-World Case Studies: Catalyst and Pinterest
Burgess shared two success stories that illustrate the transformative impact of distributed SQL:
- Catalyst: This customer success platform replaced its PostgreSQL and Elasticsearch systems with TiDB, an open source, distributed SQL database. This allowed Catalyst to achieve 10x query performance and significant cost reductions. The company’s shift to distributed SQL also enabled real-time integration and simplified schema management while providing a scalable, unified data platform.
- Pinterest: Facing challenges with its HBase clusters, Pinterest transitioned to TiDB to reduce operational costs and improve performance. By consolidating its HBase workloads onto TiDB’s distributed SQL architecture, Pinterest saw lower latency, greater consistency, and a lower total cost of ownership, freeing engineers to focus on innovation.
These case studies demonstrate how a distributed SQL database like TiDB can address the scalability, performance, and complexity challenges of large-scale modern SaaS platforms.
Conclusion
Distributed SQL databases like TiDB have emerged as a key foundation for scalable and resilient SaaS architectures as they address the critical challenges of traditional databases. By simplifying scalability, enhancing high availability, supporting multi-tenancy, and reducing architectural complexity, distributed SQL databases enable SaaS platforms to meet modern data demands effectively.
As Burgess emphasized, SaaS providers that adopt distributed SQL gain a significant competitive edge by streamlining data operations, improving cost efficiency, and accelerating innovation. For any SaaS company looking to future-proof its data infrastructure, distributed SQL databases offer a compelling solution that provides the performance, flexibility, and reliability needed to succeed in today’s data-intensive, AI-driven landscape.
Want to gain an edge over your competitors with a data architecture built from the ground up to scale modern SaaS applications? Register to watch this entire keynote from the event for additional insights. Happy viewing!