Advice for CIOs: How to Build AI-Powered Storage for Optimal O&M Automation
From management to products, AI powers autonomous-driving storage throughout the data lifecycle.
More enterprises are introducing AI for IT Operations (AIOps) to handle huge data volumes, improve efficiency, and facilitate automated O&M. Today, AI technologies in the storage field are no longer limited to the monitoring and O&M of devices; instead, they are integrated into storage products.
1. Enterprises are using AI to improve O&M automation of storage systems
The explosive growth of data volumes in data centers (DCs) has created new challenges for storage management such as fault location and risk identification. This means that existing O&M methods are no longer sufficient.
According to Gartner, by 2023, 40% of I&O teams in large enterprises will use AI-augmented automation. Enterprises are expected to invest more in AI tech to automate storage O&M in DCs, improving resource management and O&M efficiency with less reliance on human labor.
2. Enterprises and storage vendors are jointly developing a 3-layer AI architecture (Cloud-Center-Device AI)
AI models need large volumes of accumulated training data and continuous optimization to become accurate and reliable. To meet this demand, enterprises are using storage vendors’ AI management tools to build a 3-layer AI architecture that centrally manages storage devices, simplifies infrastructure O&M, and improves efficiency.
Figure 1: 3-layer AI architecture
Device AI: Software and hardware resources on devices are automated. Devices recommend values for configuration items, automatically detect faults and slow disks, collect data for cloud training, and run AI models that are updated from the cloud via online updates or offline imports.
Center AI: Dedicated software can implement unified management on multiple devices in a DC, as well as resource pooling, standardization, and the automation of storage devices. The software is deployed in a private DC and therefore isolated from the extranet for stringent data security controls.
Cloud AI: Powerful cloud-based computing resources are used to train AI models on the training data uploaded from storage devices. Optimized AI models are distributed on demand to DC management software and devices. Cloud management software can also perform remote intelligent O&M on storage devices, although its capabilities are weaker than those of DC management software.
For security purposes, remote O&M is not permitted to modify devices.
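As one illustration of the slow-disk detection mentioned under Device AI, the sketch below flags disks whose average latency is far above the median of their peers. The function name, the sample latencies, and the 3x threshold are assumptions made for illustration; this is not any vendor's actual detection method, which would typically rely on trained models rather than a fixed ratio.

```python
from statistics import mean, median


def find_slow_disks(latency_ms_by_disk, ratio=3.0):
    """Return IDs of disks whose mean latency exceeds `ratio` times the peer median.

    latency_ms_by_disk maps a disk ID to a list of recent I/O latency samples (ms).
    Both the data layout and the default ratio are illustrative assumptions.
    """
    per_disk_mean = {
        disk: mean(samples)
        for disk, samples in latency_ms_by_disk.items()
        if samples  # skip disks that reported no samples
    }
    if not per_disk_mean:
        return []
    baseline = median(per_disk_mean.values())
    return sorted(d for d, m in per_disk_mean.items() if m > ratio * baseline)


# Hypothetical samples: disk "d3" is far slower than its peers.
samples = {
    "d1": [4.0, 5.0, 4.5],
    "d2": [5.0, 5.5, 4.8],
    "d3": [40.0, 55.0, 38.0],
    "d4": [4.2, 4.9, 5.1],
}
print(find_slow_disks(samples))  # ['d3']
```

Comparing against the peer median rather than a fixed latency limit keeps the check meaningful across different disk types and workloads, since the baseline moves with the group.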
3. Storage vendors are building intelligent storage products to optimize device efficiency and reliability
To fit the diverse storage requirements of different applications, storage vendors are integrating AI into storage products to enhance device performance and reliability. Dell EMC storage systems use built-in intelligent tuning and data reduction algorithms for self-optimized storage provisioning and optimal data reduction ratios. NetApp systems can intelligently optimize hardware resource scheduling to accelerate data access. Huawei storage intelligently allocates hardware resources to accelerate data reads and writes, and adjusts data reduction algorithms based on data types to increase data compression rates and reduce the storage cost per unit of data.
In traditional storage, algorithms and data are coupled, and multiple fixed algorithms are distributed across the cache, the scheduling layers, and the storage pools of storage devices. As a result, algorithm parameters must be manually adjusted to ensure efficient access to different types of data. In contrast, intelligent storage incorporates architectural innovations by decoupling algorithms from data. A self-learning and adaptive algorithm library enables autonomous decisions on the layout, scheduling, and reduction of different data types, ensuring efficient and flexible access in diverse data applications.
Figure 2: Algorithm-data decoupling with intelligent storage
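The decoupling idea can be sketched in a few lines: the reduction algorithms sit in a library that is separate from the data path, and a selector picks one per piece of data. The example below is a minimal sketch under stated assumptions. It uses Python's standard compression modules as stand-in algorithms and a simple "try each on a sample" rule in place of a self-learning model; `ALGORITHM_LIBRARY` and `choose_reduction` are hypothetical names, not part of any storage product.

```python
import bz2
import lzma
import os
import zlib

# The algorithm library lives apart from the data path, so entries can be
# added or swapped without touching the code that stores data.
ALGORITHM_LIBRARY = {
    "zlib": zlib.compress,
    "bz2": bz2.compress,
    "lzma": lzma.compress,
}


def choose_reduction(data, sample_size=4096):
    """Pick the library entry that shrinks a leading sample of `data` the most.

    Returns "none" when no algorithm makes the sample smaller, which is
    typical for already-compressed or encrypted data.
    """
    sample = data[:sample_size]
    best_name, best_size = "none", len(sample)
    for name, compress in ALGORITHM_LIBRARY.items():
        size = len(compress(sample))
        if size < best_size:
            best_name, best_size = name, size
    return best_name


text_like = b"timestamp=1 level=INFO msg=ok\n" * 200  # highly repetitive
random_like = os.urandom(4096)                         # effectively incompressible
print(choose_reduction(text_like))
print(choose_reduction(random_like))
```

A real adaptive library would learn from past decisions instead of compressing a sample with every algorithm, but the structure is the same: the decision is made per data type, and the set of algorithms can change independently of the data.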
What we suggest
1. Develop new evaluation elements for storage AI management software
To accelerate enterprise digital transformation, both storage vendors and enterprises must consider how to integrate AI management software into enterprise production and management services. It is recommended that enterprises establish clear evaluation factors and standards for the AI management software provided by storage vendors. This will drive storage vendors to upgrade AI management software based on the core values enterprises care most about.
Evaluation elements should cover the following dimensions:
Responsibility scope: AI is not developed to replace humans, but to assist and strengthen human abilities and contributions by learning and transcending how human beings perceive and respond to the world. It is recommended that enterprises define a responsibility scope for AI, within which storage vendors can upgrade and expand AI capabilities, so that storage AI management remains under enterprise control.
Technical specifications: AI algorithms depend on learning and training. Model understanding and training data volumes determine the error rate of AI inference results. It is recommended that enterprises develop service-specific, quantifiable AI technical specifications that storage vendors can be required to prove, and that they decline to adopt products that fail to meet these specifications.
Capability extension: The evaluation criteria for AI management software should extend from independent capabilities to E2E closed-loop designs. For example, storage disk fault prediction should focus on the closed-loop capabilities of storage management software such as identification, prewarning, proactive isolation, replacement, and data rebalancing.
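The closed-loop criterion for disk fault prediction can be made checkable. The sketch below lists the five stages named above in order and reports which ones a product leaves uncovered. The `Stage` enum and both functions are hypothetical names introduced for illustration, and how an enterprise establishes that a stage is "supported" is outside the scope of this sketch.

```python
from enum import Enum


class Stage(Enum):
    """Stages of the disk fault closed loop, in the order listed in the text."""
    IDENTIFICATION = 1
    PREWARNING = 2
    PROACTIVE_ISOLATION = 3
    REPLACEMENT = 4
    DATA_REBALANCING = 5


def closed_loop_gaps(supported_stages):
    """Return the stages, in loop order, that a product does not cover."""
    supported = set(supported_stages)
    return [stage for stage in Stage if stage not in supported]


def is_closed_loop(supported_stages):
    """A product closes the loop only when every stage is covered."""
    return not closed_loop_gaps(supported_stages)


# Hypothetical product that predicts and warns but leaves the rest to operators.
partial = [Stage.IDENTIFICATION, Stage.PREWARNING]
print(is_closed_loop(partial))                      # False
print([s.name for s in closed_loop_gaps(partial)])  # ['PROACTIVE_ISOLATION', 'REPLACEMENT', 'DATA_REBALANCING']
```

Scoring coverage stage by stage, rather than accepting a single "supports fault prediction" claim, is what distinguishes an E2E evaluation from an assessment of independent capabilities.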
2. Upgrade enterprise tech stacks for storage AI
As AI is deployed at scale in storage devices and management software, enterprise infrastructure management teams need to systematically plan data-centric AI capabilities. Enterprise digital transformation starts with retraining internal staff so that tech stacks expand from storage management alone to E2E automation and enterprise AI capability building.
Internal infrastructure teams can use intelligent storage management software to match storage resources to service requirements, and ensure service agility through standardized and service-oriented resource management. They also need to stay up to date with AI trends, explore management intelligence in intelligent storage infrastructure, and use AI to mine data value and inform business decisions.
Learn more about Huawei’s Data Storage solutions.
Disclaimer: Any views and/or opinions expressed in this post by individual authors or contributors are their personal views and/or opinions and do not necessarily reflect the views and/or opinions of Huawei Technologies.