Data is the new oil, and the cloud is the new pipeline.

Data is critical to the running of a modern business. It is used to make informed decisions, develop new products, and improve existing operations. The cloud is the perfect platform for storing and managing data due to its reliability, scalability, and security. Combined, data and the cloud have the potential to revolutionise the way we live, work, and learn.

Nearly every business interaction generates data – whether via social media, mobile communications, internet of things (IoT) devices, e-Commerce transactions, or any type of digital service. Multiply those interactions by a growing number of connected individuals, connected devices, and interaction points, and the scale is overwhelming.

According to a recent Forbes report, the amount of data created, captured, copied, and consumed in the world increased from 1.2 trillion gigabytes in 2010 to 59 trillion gigabytes in 2020. The IDC (International Data Corporation) estimates that more data will be created in the three years from 2020 to 2023 than was created during the previous 30 years.

In response to the surge of new data, the global cloud computing market is expected to more than double to $832 billion by 2025, according to MarketsandMarkets Research.

How is data stored?

Depending on the format and purpose of the data, it can be stored in one of the following (a brief sketch after the list illustrates the differences):

  1. Databases – store data in a structured format. Databases hold large amounts of data that can be easily accessed or queried.
  2. Data Warehouses – store data that has been pre-processed and structured for analysis. Warehouses typically hold data used for reporting, business intelligence, and data mining. The data has been cleaned, enriched, and transformed so it can act as the “single source of truth” that users can trust.
  3. Data Lakes – store data in its raw format, meaning the data is not pre-processed or structured in any way. Data lakes are often used to hold large amounts of data that may not be immediately useful, but that could be valuable for future analysis.
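
To make these distinctions concrete, the minimal sketch below contrasts structured, queryable storage (a database or warehouse-style table) with a data lake's raw, schema-on-read records. The table, values, and event contents are purely illustrative.

```python
import json
import sqlite3

# Structured storage (database/warehouse style): rows conform to a schema
# and can be queried directly. Table and values here are purely illustrative.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (region TEXT, amount REAL)")
conn.executemany("INSERT INTO sales VALUES (?, ?)",
                 [("EMEA", 1200.0), ("APAC", 950.0)])
total = conn.execute("SELECT SUM(amount) FROM sales").fetchone()[0]
print(f"Total sales from a structured query: {total}")

# Data-lake style: events are kept exactly as they arrived; structure is only
# imposed when the data is read for analysis ("schema on read").
raw_event = '{"user": "a1", "action": "checkout", "value": 49.99}'
event = json.loads(raw_event)
print(f"Raw event parsed on read: {event['action']}")
```

In practice the warehouse would be a cloud service rather than an in-memory SQLite table, but the distinction between querying structured data and parsing raw records on read is the same.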

Companies are increasingly adopting data warehouses and data lakes as they build out big data capabilities. At the same time, there is an ongoing migration of databases, data lakes, and data warehouses from enterprises’ on-premise servers to the cloud.

The cloud can host all of these data storage systems and can be used to implement a variety of architectures that combine databases, data lakes, and data warehouses. This is what provides businesses with a scalable, reliable, and cost-effective way to store their data.

Cloud computing has revolutionised the way businesses store data. In the past, companies could only host websites and applications using on-premise storage on their own servers.

A company’s on-premise computing infrastructure and software reside within its physical offices – typically in its own data centre.

Traditional data storage and analytics solutions are not equipped to handle the quantity of data now being generated. Existing systems are slow, expensive to acquire and maintain, and very difficult to scale. The cloud has been the catalyst for change, with cloud platforms designed to store and analyse large data sets quickly and cost-effectively. The cloud also makes it easier to share data across different systems and applications, leading to a new era of data-driven decision making.

Traditionally, data was stored exclusively in databases (such as Oracle’s databases or SAP’s HANA offering) deployed in companies’ data centres. With the emergence of the public cloud since the 2000s, there has been increasing demand to access data from different databases in one place. A data warehouse can do this, but still does not meet all public cloud data needs – such as creating artificial intelligence (AI) insights.

Data lakes solve this problem by storing raw data that is ingested into AI models to create insights. These insights are housed in a data warehouse to be queried more easily.
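
A minimal sketch of that flow, using hypothetical records and a stand-in scoring function in place of a real AI model, might look like this:

```python
import json
import sqlite3

# Hypothetical raw "data lake" records, stored exactly as they arrived.
raw_events = [
    '{"customer": "a1", "spend": 120.0}',
    '{"customer": "b2", "spend": 45.5}',
]

def churn_risk(event: dict) -> float:
    """Stand-in for an AI model that turns a raw record into an insight."""
    return 1.0 if event["spend"] < 50 else 0.2

# Stand-in for a cloud data warehouse holding the derived insights.
warehouse = sqlite3.connect(":memory:")
warehouse.execute("CREATE TABLE churn_risk (customer TEXT, risk REAL)")

for line in raw_events:
    event = json.loads(line)  # structure applied on read
    warehouse.execute("INSERT INTO churn_risk VALUES (?, ?)",
                      (event["customer"], churn_risk(event)))

# The insights can now be queried like any other warehouse table.
print(warehouse.execute("SELECT * FROM churn_risk ORDER BY risk DESC").fetchall())
```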

Data is mission critical for an enterprise’s business practices, and changing between data architectures requires major financial cost, time, and technical work. Data architecture therefore tends to come with high switching costs – something legacy firms like SAP and Oracle have benefitted from.

One of the most interesting examples of data migration and switching costs is Amazon’s migration from Oracle’s databases to its own AWS services. Despite Amazon owning AWS and offering its own database technologies, it still took the company several years to fully migrate its data from Oracle’s databases to its own. As the velocity of data generation has surged, the speed at which data can be migrated has failed to keep pace.

Upon completion of the migration, Amazon had reduced its database costs by over 60% on top of what was already a heavily discounted rate from Oracle due to its scale. Amazon customers regularly report cost savings of 90% by switching from Oracle to AWS. Amazon’s consumer-facing applications reported performance improvements with latency (delays) reduced by almost 40%.

Amazon’s migration from Oracle’s databases highlights the difficulty in migrating between data architectures and the time taken to complete it. This dynamic has been a key roadblock for companies wanting to migrate on-premise data into public clouds. It is estimated that only around 20% of companies’ workloads have moved to the cloud, offering significant long-term growth runway for public cloud adoption.

Amazon’s AWS and Microsoft’s Azure are the two largest public clouds by a wide margin, with a combined market share of over 50%. Google entered the market far later than Amazon and Microsoft, but has achieved strong growth since its launch, moving into third spot with just over 10% market share exiting 2022.

We anticipate that the next growth leg for public clouds will be underpinned by the development and use of generative AI models. To take advantage of generative AI models, companies will need to migrate more workloads to the public clouds. As Andy Jassy, the CEO of Amazon, said on the company’s second quarter results call, AI models will be brought to the data, not the other way around.

Microsoft, Amazon, and Google’s clouds stand to be among the earliest beneficiaries of the surge in AI model development. Not only are the companies offering developers the tools to build generative AI models, but, most importantly, they have the computing resources needed to train and run those models.

Having gone through the second quarter results released by a range of companies exposed to generative AI, we have highlighted the potential benefit for several companies held by the Odyssey Global Fund and the Odyssey BCI Worldwide Flexible Fund.

Microsoft:

There are few companies that are positioned as strongly as Microsoft to benefit directly and indirectly from the surge in generative AI. The company is in the enviable position of being exposed to the expected increase in cloud consumption (through Azure) and incorporating AI into its enterprise productivity applications.

The company has c.390 million enterprise users of Office 365, with around 13% of users signed up for the premium E5 version of the suite. During the quarter, Microsoft announced the launch of Office 365 Copilot, which adds generative AI capabilities to the Office suite of apps. Copilot will cost $30 per user per month, on top of the base Office 365 subscription price. The service has been rolled out to 600 of the company’s largest customers through an early access program. Feedback from the likes of Emirates, General Motors, Goodyear, and Lumen has pointed to the service being a “game-changer” for employee productivity.

We anticipate that a meaningful proportion of the estimated 50 million Office E5 subscriber base will upgrade to Copilot over the next two years, providing a substantial boost to Office average revenue per user. Should all E5 customers upgrade, we estimate that Microsoft would generate ~$18 billion of incremental revenue. Should 50% of the group’s installed base of Office Commercial subscribers add Copilot, we estimate nearly $70 billion of incremental revenue – equivalent to more than one-third of the revenue generated by the company over the last year.
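
The back-of-the-envelope arithmetic behind those figures, using only the subscriber estimates and the $30 per user per month price quoted above, is sketched below.

```python
# Rough arithmetic behind the incremental-revenue estimates above.
# Subscriber counts are the estimates quoted in this note, not disclosures.
copilot_price_per_user_per_year = 30 * 12                  # $30/user/month

e5_subscribers = 50_000_000                                # ~13% of ~390m users
e5_revenue = e5_subscribers * copilot_price_per_user_per_year
print(f"All E5 users on Copilot: ${e5_revenue / 1e9:.0f}bn")               # ~$18bn

office_commercial_base = 390_000_000
half_base_revenue = 0.5 * office_commercial_base * copilot_price_per_user_per_year
print(f"50% of Office Commercial base: ${half_base_revenue / 1e9:.0f}bn")  # ~$70bn
```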

Amazon:

The company believes that generative AI and large language models (LLMs) form a stack with three layers, and Amazon is investing heavily in each of them:

  1. At the foundational layer are the compute resources needed to train and operate the models. AWS’ EC2 P5 instances are being used to train LLMs and develop AI apps.
  2. The middle layer consists of customisable LLMs. At the heart of this proposition is that only the largest enterprises have the resources to build proprietary models from scratch. AWS’ Bedrock service gives companies access to base models that they can train on their own data, without the threat of proprietary data “leaking” into a general model (a brief sketch of calling a Bedrock model follows this list).
  3. The top layer is where most of the publicity and attention have been focused – the applications that run on top of the LLMs (such as ChatGPT and Bard). Amazon offers CodeWhisperer, an AI-powered coding “companion” that suggests code to help developers build AI applications faster.
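
As a rough illustration of the middle layer, the sketch below calls a foundation model through Amazon Bedrock using the AWS SDK for Python. The region, model ID, prompt, and request fields are assumptions for illustration – request and response formats differ by model provider – and running it requires an AWS account with Bedrock access.

```python
import json

import boto3  # AWS SDK for Python

# Hypothetical sketch: invoke a base model hosted on Amazon Bedrock.
# Region, model ID, and the request body shape are illustrative assumptions.
bedrock = boto3.client("bedrock-runtime", region_name="us-east-1")

response = bedrock.invoke_model(
    modelId="amazon.titan-text-express-v1",  # example model ID
    contentType="application/json",
    accept="application/json",
    body=json.dumps({"inputText": "Summarise last quarter's sales trends."}),
)

# The response body is a stream containing the model's JSON output.
result = json.loads(response["body"].read())
print(result)
```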

Alphabet:

When ChatGPT was launched, it was seen as a threat to Search advertising – a market dominated by Google. However, Google had been developing the same (if not better) capabilities for years and was quick to launch its competitor to ChatGPT – “Bard.” Over the last four years, 80% of searches on Google have not had any ads at the top of the search results, and fewer than 5% of searches have contained more than four sponsored ads at the top of the page.

Nearly all ads are placed on searches with commercial intent, such as searches for “sneakers,” “t-shirt” or “plumber.” Because advertising is concentrated in these commercial queries, LLMs like ChatGPT or Bard pose little threat to commercial search; if anything, they can be used for better ad targeting – which corresponds to a greater ability to monetise.

The company has been progressively enriching ad creation and reach by harnessing the AI capabilities it has developed, and recently disclosed that nearly 80% of advertisers already use at least one AI-powered search advertising product.

Palo Alto:

Following the pandemic shift to work-from-home company policies and the related increase in cloud spending, companies were left increasingly vulnerable to cyber-attacks. Palo Alto and other cybersecurity companies experienced robust demand over the last three years as companies have reinforced their defences against malicious attacks.

The proliferation of AI models and applications creates even more avenues for cyber criminals to infiltrate companies’ networks. New malicious programs (used to infiltrate a company’s network) have increased 20x since 2011 to over 1 billion. Generative models and apps are expected to become the next major tool used by attackers to launch more frequent, more sophisticated, and faster attacks.

In response, Palo Alto has been leveraging its own AI capabilities to automate the detection of, and response to, cyber-attacks in real time. Not long ago, it took an attacker 44 days from initial compromise to exfiltration, giving a company the same length of time to detect, disrupt and potentially prevent an attack. Today, the time to exfiltration has fallen to just hours.

The capabilities developed by Palo Alto not only provide enhanced protection but also ease the pressure on companies’ Security Operations Centres and reduce demand for scarce (and expensive) security analyst resources.

It is estimated that companies need to allocate between 5% and 7% of cloud spend to cybersecurity, providing an inextricable link between growing cloud adoption and cybersecurity spending going forward.

Conclusion:

We believe we are on the cusp of the next leg of cloud workload migration, underpinned by the development of large language models and applications. After several quarters of slowing cloud revenue growth, there were nascent signs in the latest quarter that companies were reaching the end of the cloud optimisation cycle. This is expected to give way to a new cycle of companies migrating more workloads to public clouds to be able to take advantage of generative AI capabilities.

Microsoft, Amazon, Alphabet and Palo Alto are expected to be key beneficiaries of the growth in AI and related cloud spend by companies.