
Helping data storage keep up with the AI revolution

by Priya Shah – Business Editor

Cloudian Pioneers ‘AI-First’ Storage, Bridging Data and GPU Processing

SAN FRANCISCO, CA – October 26, 2023 – Cloudian, a provider of hyperscale data storage solutions, today announced advancements in its platform designed to accelerate artificial intelligence (AI) workloads by minimizing data transfer bottlenecks. The company’s strategy centers on bringing compute power closer to the data source, a paradigm shift from conventional cloud architectures. This move comes as demand for AI applications surges, requiring faster access to massive datasets.

The Distributed Cloud Imperative

According to a recent report by Grand View Research, the global AI market is projected to reach $309.6 billion by 2030, growing at a compound annual growth rate (CAGR) of 36.2% from 2023 to 2030. [https://www.grandviewresearch.com/industry-analysis/artificial-intelligence-market] This exponential growth necessitates a re-evaluation of data storage and processing methods. Cloudian’s founder, Michael Tso, emphasizes the need for a distributed cloud model, extending beyond centralized data centers to encompass edge devices and servers. “That means the end state is a distributed cloud that reaches out to edge devices and servers. You have to bring the cloud to the data, not the data to the cloud,” Tso stated.

Cloudian’s Evolution and Focus on Scalability

Founded in 2012 as an evolution of Gemini Mobile Technologies, Cloudian initially focused on providing scalable, distributed, cloud-compatible data storage. However, the company came to recognize the transformative potential of AI as a primary driver for edge data processing. Tso notes, “What we didn’t see when we first started the company was that AI was going to be the ultimate use case for data on the edge.”

MIT Roots and the Rise of Accelerated Computing

Tso’s research at the Massachusetts Institute of Technology (MIT) over two decades ago laid the groundwork for Cloudian’s current approach. His collaborations with David Clark on disconnected networks – a common characteristic of edge environments – and with Professor William Dally on high-speed interconnects proved prescient. “It’s like my whole life is playing back,” Tso remarked. Professor Dally’s work is now integral to NVIDIA’s GPU architecture, particularly in interchip communication. Tso also highlighted his work with Professor George Papadopoulos on accelerating application software without code rewrites, a challenge Cloudian is addressing in its partnership with NVIDIA.

Object Storage and the Vector Database Advantage

Cloudian utilizes an object storage architecture, storing diverse data types – documents, videos, sensor data – as unique objects with associated metadata. Object storage excels at managing large, unstructured datasets ideal for AI applications. Traditionally, accessing this data for AI models required copying it into a computer’s memory, creating latency and energy inefficiencies. To overcome this, Cloudian introduced a vector database in July 2023, storing data in a format directly usable by AI models. This system computes vector embeddings in real-time as data is ingested, powering applications like recommender systems, search engines, and AI assistants.
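
To make the ingest-time flow concrete, here is a minimal, self-contained sketch of the idea: each object is stored with its metadata, and a vector embedding is computed as the object arrives. The `embed()` function is a toy stand-in for a real embedding model, and nothing here reflects Cloudian’s actual API.

```python
import hashlib
import math

def embed(text: str, dim: int = 8) -> list[float]:
    """Toy stand-in for a real embedding model: hashes the text into a
    fixed-length, unit-normalized vector. A production system would call
    an ML model here instead."""
    digest = hashlib.sha256(text.encode()).digest()
    vec = [b / 255.0 for b in digest[:dim]]
    norm = math.sqrt(sum(v * v for v in vec)) or 1.0
    return [v / norm for v in vec]

class ObjectStoreWithVectors:
    """Minimal sketch of an object store that computes vector embeddings
    at ingest time, so data is immediately usable by AI workloads."""

    def __init__(self) -> None:
        self.objects = {}   # key -> (payload, metadata)
        self.vectors = {}   # key -> embedding

    def put(self, key: str, payload: str, metadata: dict) -> None:
        self.objects[key] = (payload, metadata)
        self.vectors[key] = embed(payload)   # embedding computed on ingest

store = ObjectStoreWithVectors()
store.put("doc-001", "robotic arm torque sensor log", {"type": "sensor"})
store.put("doc-002", "quarterly maintenance report", {"type": "document"})
```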

Did You Know? Vector databases represent data as mathematical vectors, enabling efficient similarity searches crucial for AI applications like image recognition and natural language processing.
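
The similarity search mentioned above boils down to a nearest-neighbor lookup by cosine similarity. Continuing the ingest sketch from the previous example (again a toy illustration, not a production vector index):

```python
def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity: dot product divided by the vectors' magnitudes."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb) if na and nb else 0.0

def search(store: ObjectStoreWithVectors, query: str, top_k: int = 3):
    """Rank stored objects by similarity between their embeddings and the query's."""
    q = embed(query)
    scored = [(key, cosine(q, vec)) for key, vec in store.vectors.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]

print(search(store, "sensor readings from the assembly line"))
```

Real vector databases replace the linear scan above with approximate nearest-neighbor indexes so the lookup stays fast at billions of vectors.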

NVIDIA Partnership and GPU Optimization

Cloudian’s partnership with NVIDIA allows its storage system to interface directly with NVIDIA’s GPUs, further accelerating AI operations and reducing computational costs. NVIDIA initiated contact with Cloudian approximately 18 months ago, recognizing the importance of consistently feeding data to GPUs to maximize their utilization. “Now people are realizing it’s easier to move the AI to the data than it is to move huge datasets. Our storage systems embed a lot of AI functions, so we’re able to pre- and post-process data for AI near where we collect and store the data,” Tso explained.
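
Keeping GPUs fed is, at heart, a pipelining problem: data loading has to overlap with computation so the accelerator never waits on storage. The sketch below illustrates that producer-consumer pattern in framework-free Python with simulated latencies; it is a conceptual illustration, not Cloudian’s or NVIDIA’s actual integration.

```python
import queue
import threading
import time

def fetch_batches(batch_queue: queue.Queue, num_batches: int) -> None:
    """Producer: stream batches from storage while the consumer computes."""
    for i in range(num_batches):
        time.sleep(0.05)          # simulated storage / network latency
        batch_queue.put(f"batch-{i}")
    batch_queue.put(None)         # sentinel: no more data

def run_pipeline(num_batches: int = 10) -> None:
    batch_queue: queue.Queue = queue.Queue(maxsize=4)  # bounded prefetch buffer
    producer = threading.Thread(target=fetch_batches, args=(batch_queue, num_batches))
    producer.start()
    while True:
        batch = batch_queue.get()   # next batch is (ideally) already waiting
        if batch is None:
            break
        time.sleep(0.05)            # simulated GPU compute on the batch
    producer.join()

run_pipeline()
```

The bounded queue is the key detail: the storage layer runs a few batches ahead of the compute so the GPU rarely idles, without unbounded memory growth.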

Real-World Applications and Client Impact

Cloudian currently serves approximately 1,000 companies globally, including manufacturers, financial institutions, healthcare providers, and government agencies. One major automotive manufacturer is leveraging Cloudian’s platform to predict maintenance needs for its robotic assembly line. The National Library of Medicine utilizes Cloudian to store research papers and patents, while the National Cancer Database employs it to store tumor DNA sequences – datasets ripe for AI-driven research into new treatments and insights.

Metric                                      Value
Global AI Market Size (2023 est.)           $110.6 billion
Global AI Market Size (projected 2030)      $309.6 billion
CAGR (2023-2030)                            36.2%
Cloudian Client Base (approx.)              1,000 companies

The Power of Parallel Processing

Tso emphasizes the role of GPUs in pushing the boundaries of AI. While Moore’s Law predicts a doubling of computing power every two years, GPUs achieve greater gains through parallel processing and networking. “GPUs have been an incredible enabler,” Tso stated. “Moore’s Law doubles the amount of compute every two years, but GPUs are able to parallelize operations on chips, so you can network GPUs together and shatter Moore’s Law. That scale is pushing AI to new levels of intelligence, but the only way to make GPUs work hard is to feed them data at the same speed that they compute – and the only way to do that is to get rid of all the layers between them and your data.”
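
To put the scaling argument in rough numbers, the back-of-envelope comparison below contrasts Moore’s Law compounding with the aggregate throughput of networked GPUs; the GPU counts and utilization figures are illustrative assumptions, not benchmarks.

```python
def moores_law_gain(years: float) -> float:
    """Compute growth from transistor scaling alone: roughly 2x every two years."""
    return 2 ** (years / 2)

def parallel_gain(num_gpus: int, utilization: float) -> float:
    """Aggregate throughput from networking GPUs together; utilization captures
    how well the storage layer keeps them fed with data."""
    return num_gpus * utilization

print(f"Moore's Law over 6 years:            {moores_law_gain(6):.0f}x")
print(f"64 GPUs at 90% utilization:          {parallel_gain(64, 0.90):.0f}x")
print(f"64 GPUs starved at 30% utilization:  {parallel_gain(64, 0.30):.0f}x")
```

The third line is the point of Tso’s remark: the parallel hardware only delivers its multiple if the data pipeline can keep it busy.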

Pro Tip: Consider the data gravity principle when designing your AI infrastructure – locate compute resources where the data resides to minimize latency and costs.
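
One quick way to apply the data-gravity rule of thumb is to estimate how long it would take simply to move the dataset to the compute. The sketch below does that arithmetic; the 500 TB dataset and 10 Gbps link are hypothetical placeholders.

```python
def transfer_days(dataset_tb: float, link_gbps: float) -> float:
    """Days needed to move `dataset_tb` terabytes over a sustained
    `link_gbps` gigabit-per-second connection."""
    bits = dataset_tb * 8e12            # terabytes -> bits
    seconds = bits / (link_gbps * 1e9)  # bits / (bits per second)
    return seconds / 86_400

# Hypothetical example: a 500 TB sensor archive over a 10 Gbps link.
print(f"{transfer_days(500, 10):.1f} days just to move the data")
```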

Background: The Evolution of Data Storage

Data storage has evolved significantly from magnetic tapes to hard disk drives (HDDs) and now solid-state drives (SSDs). Object storage represents a further evolution, designed for scalability and for handling unstructured data. The rise of cloud computing has accelerated the adoption of object storage, with services like Amazon S3 and Azure Blob Storage becoming industry standards. However, traditional cloud models often involve significant data transfer costs and latency, prompting the need for distributed cloud solutions like Cloudian’s.
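
Object stores in this lineage typically expose an S3-compatible API, so writing an object with attached metadata looks roughly like the following sketch. The endpoint URL, bucket name, credentials, and object key are placeholders, and the snippet is not a statement of any vendor’s specific integration.

```python
import boto3

# Placeholder endpoint and credentials for an S3-compatible object store.
s3 = boto3.client(
    "s3",
    endpoint_url="https://objects.example.internal",  # hypothetical endpoint
    aws_access_key_id="ACCESS_KEY",
    aws_secret_access_key="SECRET_KEY",
)

# Store a sensor reading as an object, attaching searchable metadata.
s3.put_object(
    Bucket="factory-telemetry",                 # hypothetical bucket
    Key="line-7/robot-12/2023-10-26.json",
    Body=b'{"torque": 41.2, "temp_c": 63.1}',
    Metadata={"source": "robot-12", "type": "sensor"},
)
```

The metadata attached at write time is what later lets AI pipelines select and pre-process the right objects without scanning entire buckets.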

Frequently Asked Questions

  • What is ‘AI-first’ storage? AI-first storage prioritizes data accessibility and processing speed for artificial intelligence workloads, minimizing latency and maximizing GPU utilization.
  • How does Cloudian’s vector database enhance AI performance? By storing data in a vector format, Cloudian’s database allows AI models to perform similarity searches and analyses much faster than with traditional storage methods.
  • What are the benefits of a distributed cloud architecture? A distributed cloud reduces latency, lowers data transfer costs, and improves resilience by bringing compute resources closer to the data source.
  • What types of data can Cloudian’s object storage handle? Cloudian’s platform can store any type of data, including documents, videos, sensor data, and DNA sequences.
  • How does Cloudian’s partnership with NVIDIA benefit customers? The partnership enables direct integration between Cloudian’s storage and NVIDIA’s GPUs, resulting in faster AI operations and reduced computing costs.

What are your biggest challenges in managing data for AI applications? Share your thoughts in the comments below!

Stay informed about the latest advancements in AI and data storage – subscribe to our newsletter for exclusive insights!
