Snowflake Inc.
SNOW · NYSE · United States
Lets a single query pull data from AWS, Azure, and Google Cloud Platform at the same time.
Snowflake runs a platform that lets companies query data stored across AWS, Azure, and Google Cloud Platform at the same time, without copying any of it from one place to another. It does this by maintaining a single central catalog that tracks where every dataset lives across all three clouds and routes each query accordingly. Because the execution plans that direct queries to the right cloud live in that catalog rather than in any individual cloud, the compute engines that actually run those queries can spin up or down freely. The catalog itself, however, cannot be split into pieces without breaking the cross-cloud joins and the data-sharing agreements that customers have formed with each other through the same layer. That indivisibility is also what makes the platform hard to leave: a customer's pipelines, dashboards, and data-sharing contracts are all wired into that one catalog, so switching away means rebuilding the pipelines and tearing up the agreements simultaneously. The sharpest risk runs in the opposite direction: if AWS, Azure, or Google Cloud Platform changes the storage or authentication hooks the catalog depends on, the catalog's synchronization breaks for every customer at once, not just those on the cloud that changed.
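The routing described above can be sketched in miniature. This is a hedged illustration, not Snowflake's actual design or API: the `Catalog` class, the dataset names, and the region labels are all invented. The point it shows is structural, namely that one in-memory catalog resolves every dataset a query touches, so a join spanning three clouds depends on that single record.

```python
# Minimal sketch of a central catalog that routes query scans to the
# cloud holding each dataset. All names here are illustrative.

class Catalog:
    def __init__(self):
        self._location = {}  # dataset name -> cloud region storing it

    def register(self, dataset, cloud):
        self._location[dataset] = cloud

    def plan(self, datasets):
        """Build an execution plan mapping each scanned dataset to a cloud.
        A join touching several clouds resolves all of them in one plan,
        which is why the catalog cannot be split without breaking joins."""
        return {ds: self._location[ds] for ds in datasets}

catalog = Catalog()
catalog.register("orders", "aws-us-east-1")
catalog.register("shipments", "azure-westeurope")
catalog.register("clicks", "gcp-us-central1")

# One query joining data on all three clouds resolves through one catalog.
plan = catalog.plan(["orders", "shipments", "clicks"])
print(plan)
```

Splitting `catalog` into per-cloud pieces would leave no single place where a three-cloud `plan()` call could resolve, which is the indivisibility the paragraph above describes.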
How does this company make money?
Customers are charged based on how many compute credits their virtual warehouses consume while running queries — the more queries they run, the more they pay. On top of that, there are separate charges for how much cloud storage their datasets occupy and for data moving between regions. This means monthly revenue goes up and down with how actively customers are querying data and how fast their datasets are growing.
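As a rough illustration of how those three meters compose into a monthly bill, here is a hedged sketch. The credit price, storage rate, and transfer rate are made-up numbers chosen for readable arithmetic, not Snowflake's published pricing.

```python
# Hypothetical usage-based bill: compute credits plus storage plus
# cross-region transfer. All rates below are illustrative, not real.

def monthly_bill(credits_used, tb_stored, tb_transferred,
                 credit_price=3.00, storage_per_tb=23.00, egress_per_tb=20.00):
    compute = credits_used * credit_price       # scales with query activity
    storage = tb_stored * storage_per_tb        # scales with dataset growth
    transfer = tb_transferred * egress_per_tb   # scales with cross-region movement
    return compute + storage + transfer

# A customer burning 1,000 credits, storing 10 TB, moving 2 TB across regions:
print(monthly_bill(1000, 10, 2))  # 3000 + 230 + 40 = 3270.0
```

Because the compute term dominates at these (invented) rates, revenue tracks query activity first and dataset growth second, matching the variability described above.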
What makes this company hard to replace?
Customers write their data pipelines using SQL extensions and automatic clustering features that are specific to this platform and do not transfer cleanly to a conventional data warehouse. Any data-sharing agreements a customer has formed with other companies run through the platform's cross-cloud metadata layer, so leaving the platform means those agreements break too. The auto-scaling settings that customers have tuned to match their exact workload patterns would need to be rebuilt from scratch on any alternative system, a process that takes months.
What limits this company?
The catalog that routes every query has to be kept as one unified, unbroken record across all customers and all three clouds at once. It cannot be split into smaller pieces — splitting it would break queries that join data sitting on different clouds and would collapse the data-sharing agreements customers have formed with each other through the platform. That means the catalog's ability to keep up with simultaneous queries is the hard ceiling on how much the platform can handle, and that ceiling cannot be raised simply by adding more computing power.
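One way to see why adding computing power cannot raise that ceiling: if every query needs the one catalog, total throughput is the smaller of the fleet's capacity and the catalog's capacity. The numbers and function below are invented for illustration only.

```python
# Toy model: every query needs one pass through the single catalog, so
# total throughput is capped by the catalog's rate no matter how many
# compute engines run. Capacities below are invented, not measured.

CATALOG_LOOKUPS_PER_SEC = 10_000  # fixed: the one indivisible catalog
WORKER_QUERIES_PER_SEC = 500      # per compute engine; engines scale freely

def platform_throughput(num_workers):
    compute_capacity = num_workers * WORKER_QUERIES_PER_SEC
    return min(compute_capacity, CATALOG_LOOKUPS_PER_SEC)

print(platform_throughput(10))    # compute is the limit here
print(platform_throughput(100))   # the catalog becomes the limit
print(platform_throughput(1000))  # more workers no longer help
```

Past the crossover point, every added engine is idle capacity: the hard ceiling is the catalog term inside `min()`, which cannot be sharded without breaking the cross-cloud joins described above.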
What does this company depend on?
The platform cannot run without AWS, Microsoft Azure, and Google Cloud Platform supplying the regional computing and storage infrastructure that holds all customer data. It also depends on Apache Parquet to compress and store that data efficiently, Transport Layer Security to encrypt data moving between clouds, ANSI SQL query parsing engines to interpret incoming queries, and OAuth 2.0 authentication frameworks from each hyperscaler to verify that requests are legitimate.
Who depends on this company?
Fortune 2000 data engineering teams rely on the platform's auto-scaling compute engines to keep their ETL pipelines running — those pipelines would break without it. Business intelligence analysts at retail and financial services companies would lose the real-time data feeds that power their Tableau and Power BI dashboards. AI and machine learning teams whose training datasets live on the platform would find that data inaccessible, stopping model development entirely.
How does this company scale?
New computing engines can be added quickly and cheaply across cloud regions using automated provisioning, so raw compute capacity grows without much friction. The bottleneck that does not go away as the company grows is the central metadata catalog: because it must stay in one piece to keep cross-cloud queries and data-sharing agreements working, it cannot be expanded by simply adding more machines to share the load.
What external forces can significantly affect this company?
European GDPR rules and China's Data Security Law require that certain data never leave specific geographic regions, forcing the company to build and maintain expensive region-specific infrastructure to stay compliant. Federal Reserve stress-testing requirements push financial services customers to spread workloads across multiple clouds, which shapes how the platform is used. Shortages in semiconductor supply can limit the computing capacity that AWS, Azure, and Google Cloud Platform have available in specific regions, creating gaps the platform has no direct control over.
Where is this company structurally vulnerable?
If AWS, Microsoft Azure, or Google Cloud Platform changed the way its storage or authentication interfaces work — even just one of the three — the catalog's routing plans would break for every customer at once, not only the ones on that cloud. This happens because a query joining data across all three clouds references all three API states in a single operation. One unilateral change by one cloud provider is enough to make the platform's core capability stop working.
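The blast radius can be sketched with a toy model: a cross-cloud plan pins the interface version it was built against for every cloud it touches, so one unilateral change invalidates the whole plan. The version strings and function names are invented for illustration.

```python
# Toy model: a cross-cloud execution plan references the interface state
# of every cloud it touches, so one provider's unilateral change breaks
# the entire plan. Version labels here are illustrative, not real.

pinned = {"aws": "v2", "azure": "v2", "gcp": "v2"}  # what plans were built against

def plan_is_valid(clouds_touched, live_versions):
    """A plan survives only if every cloud it touches still matches."""
    return all(live_versions[c] == pinned[c] for c in clouds_touched)

live = {"aws": "v2", "azure": "v2", "gcp": "v2"}
print(plan_is_valid({"aws", "azure", "gcp"}, live))  # all interfaces match

live["azure"] = "v3"  # one provider ships a unilateral interface change
print(plan_is_valid({"aws", "azure", "gcp"}, live))  # the whole join breaks
```

A plan touching only one unchanged cloud would still validate; it is precisely the cross-cloud plans, the platform's core capability, that fail on any single provider's change.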