AI Powers Self-Driving Full-Lifecycle Data Management
Artificial intelligence for IT operations (AIOps) is a popular method for enterprises to improve storage O&M automation. AI technologies are no longer limited to the monitoring and O&M of storage devices; they are also supercharging storage products from the bottom up with intelligence.
Storage vendors are adopting diverse disruptive innovations to optimize storage SLA management
The combination of conventional AI + foundation models helps optimize storage SLA management from diverse dimensions.
Optimized service rollout: Storage resource provisioning and changes are shortened from days to just minutes. Traditionally, service changes require manual solution planning, change script development, and execution. Conventional AI technology enables automatic service simulation that can formulate optimal change solutions. With the addition of AIGC technology, change scripts can also be generated automatically, shortening change periods to mere minutes.
Optimized infrastructure availability: The annual average failure period of the data center is shortened from hours to mere minutes. Conventional AI can help predict performance, capacity, and spare parts faults, reducing the probability of exceptions. In complex exception handling scenarios, storage management systems can also use AI foundation models to quickly strengthen interaction logic and facilitate manual fault location, thus greatly shortening troubleshooting periods.
Optimized cost management: Storage resource utilization is growing from 50% to 60%. Improper resource allocation has consistently been the primary cause of low resource utilization in data centers. AI-based intelligent identification and release of idle resources can protect storage investment. In addition, hot and cold data analysis across the data center optimizes data distribution among different media and migrates cold data in a timely manner, reducing storage costs. One large carrier in Asia Pacific, for example, used Huawei's intelligent storage centralized management software to improve storage resource utilization from 30% to more than 60%.
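The cold data migration described above can be illustrated with a minimal Python sketch. It assumes a simple rule, namely that a volume counts as cold once it has gone unaccessed for a fixed number of days. The Volume fields, the 90-day threshold, and the function names are hypothetical and do not come from any vendor's software. Production systems would likely use richer access statistics.

```python
from dataclasses import dataclass


@dataclass
class Volume:
    name: str
    size_gb: float
    days_since_last_access: int  # hypothetical metric, assumed to be collected elsewhere


def classify_tier(volume, cold_after_days=90):
    """Label a volume 'cold' when it has not been accessed recently (assumed rule)."""
    return "cold" if volume.days_since_last_access >= cold_after_days else "hot"


def plan_migration(volumes, cold_after_days=90):
    """Return the cold volumes and the capacity they would free on the fast tier."""
    cold = [v for v in volumes if classify_tier(v, cold_after_days) == "cold"]
    freed_gb = sum(v.size_gb for v in cold)
    return cold, freed_gb


# Illustrative data only
volumes = [
    Volume("db-logs", 500.0, 3),
    Volume("archive-2021", 2000.0, 400),
    Volume("vm-images", 800.0, 120),
]
cold, freed_gb = plan_migration(volumes)
print([v.name for v in cold], freed_gb)
```

With this sample data, the two volumes that have been idle for more than 90 days are selected for migration to cheaper media, while the recently accessed volume stays on the fast tier.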
AI capabilities need to be fully unlocked to enable an AI management architecture
With the increasing complexity of enterprise IT technology stacks and the continued emergence of new applications such as big data, containers, and multi-cloud, storage must meet increasing usage and management requirements as it serves as the foundation of IT infrastructure. More enterprises use the AI management tools provided by storage vendors to build three-layer management architectures that feature intelligent device management, data centers, and clouds. This simplifies infrastructure management and optimizes management efficiency, all while creating a new AI process of incubation, release, and optimization to better cope with AI transformation.
Figure 1: 3-layer AI architecture
Device management intelligence: Device management software collects basic information for cloud AI model incubation and obtains updated AI models from the cloud via online updates or offline imports. The software is then responsible for using and managing individual storage devices, delivering features like optimal configuration recommendations, slow disk detection, and fault detection for optical modules, disks, and controllers.
Data center intelligence: Data center management software covers a wider scope than device management software. Unified management of multi-vendor storage devices simplifies O&M processes, and intelligent cross-device data scheduling and tiering optimize storage costs. By managing full-stack data center devices, the management software can intelligently analyze application, virtualization, network, and storage resources to diagnose problems in minutes. Unlike cloud intelligence, data center management software is deployed in data centers and is therefore isolated from external networks to meet more stringent data requirements.
Cloud intelligence: The cloud offers powerful computing and storage resources that can continuously perform AI model training and inference on the operational data uploaded by a large number of devices, and distribute optimized AI models to data center management software and devices on demand. Cloud management software also provides diversified O&M methods like mobile applications. However, compared to data center management software, cloud management does not have sufficient full-stack analysis capabilities for data center infrastructure and cannot support device change operations.
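As an illustration of the slow disk detection mentioned under device management intelligence, here is a minimal Python sketch. It swaps in a simple statistical technique (deviation from the peer median, scaled by the median absolute deviation) in place of a trained AI model. The function name, the threshold of 3.0, and the latency figures are assumptions for illustration, not any vendor's method.

```python
from statistics import median


def find_slow_disks(latency_ms, threshold=3.0):
    """Flag disks whose average latency sits far above the peer median.

    latency_ms maps a disk name to its average I/O latency in milliseconds.
    The distance from the median is measured in units of the median absolute
    deviation (MAD), which is robust to the outliers being searched for.
    """
    values = list(latency_ms.values())
    med = median(values)
    mad = median([abs(v - med) for v in values])
    if mad == 0:
        # At least half of the disks report identical latency, so there is
        # no spread to scale by; report nothing rather than guess.
        return []
    return sorted(
        disk for disk, v in latency_ms.items() if (v - med) / mad > threshold
    )


# Illustrative data only
latencies = {"disk0": 4.1, "disk1": 3.9, "disk2": 4.3, "disk3": 19.5, "disk4": 4.0}
print(find_slow_disks(latencies))
```

Comparing each disk against its peers rather than against a fixed limit lets the same check work across device models with different baseline latencies.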
Storage vendors prioritize product intelligence to optimize storage device efficiency and reliability
To meet the diverse storage needs of emerging applications, storage vendors are focusing on product intelligence to boost performance and reliability. For example, Dell EMC storage products have built-in smart tuning and data reduction algorithms to achieve self-configuration optimization and an optimal data reduction ratio. NetApp products deliver intelligent storage resource scheduling and fast data access. Huawei storage systems intelligently provision hardware resources to improve read and write efficiency. In addition, Huawei storage systems intelligently adjust data reduction algorithms for a variety of data types, improving the compression ratio and reducing the average storage cost per unit of data.
Another point is the innovation of intelligent storage architectures. In traditional storage, algorithms and data are coupled. Multiple fixed algorithms are dispersed across the cache, scheduling layers, and storage pools, and these algorithms must be manually configured to ensure the access efficiency of different types of data, resulting in poor flexibility. This has led to designs that decouple intelligent algorithms from data. Self-learning and adaptive algorithm libraries can independently determine the correct layout, scheduling, and reduction of different data types, ensuring optimal data access efficiency.
Figure 2: Algorithm-data decoupling with intelligent storage
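To make the idea of an adaptive algorithm library concrete, the following Python sketch chooses a data reduction algorithm per data sample instead of hard-coding one. It is a simplified stand-in that tries each candidate from the Python standard library and keeps the one with the best compression ratio; it is not a self-learning model and not any vendor's implementation. The ALGORITHMS table and pick_algorithm function are hypothetical names.

```python
import bz2
import lzma
import os
import zlib

# Candidate reduction algorithms kept in one library, separate from the data path
ALGORITHMS = {
    "zlib": zlib.compress,
    "bz2": bz2.compress,
    "lzma": lzma.compress,
}


def pick_algorithm(sample):
    """Try every candidate on a sample and return the best (name, ratio).

    Returns ("none", 1.0) when no candidate shrinks the sample, which is
    typical for data that is already compressed or encrypted.
    """
    best_name, best_ratio = "none", 1.0
    for name, compress in ALGORITHMS.items():
        ratio = len(sample) / len(compress(sample))
        if ratio > best_ratio:
            best_name, best_ratio = name, ratio
    return best_name, best_ratio


# Illustrative samples: repetitive log text versus random bytes
log_sample = b"status=ok latency=4ms\n" * 500
random_sample = os.urandom(4096)

print(pick_algorithm(log_sample))
print(pick_algorithm(random_sample))
```

Because the candidates live in a single table, a new algorithm can be added without touching the code that stores the data, which is the flexibility the decoupled design aims for.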
Clearly define service model indicators and SLA requirements, and develop new evaluation standards systems once new platforms and technologies are introduced
Before introducing AI-related products, enterprises should first evaluate their current and future business needs and establish a multi-dimensional, quantifiable evaluation system that covers requirements for storage capacity, performance, reliability, energy efficiency, resilience, and ecosystem robustness. If enterprises work with multiple storage suppliers, they should also establish supplier capability baselines and dynamically update those baselines as the suppliers' product capabilities improve or deteriorate, ensuring that the intelligent storage devices and management platforms best suited to actual business needs are procured.
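One simple way to make such an evaluation system quantifiable is a weighted score per supplier, sketched below in Python. The dimension keys, the weights, the 0 to 10 scale, and the supplier scores are all hypothetical placeholders; each enterprise would set its own according to business priorities.

```python
# Hypothetical weights per evaluation dimension; they must sum to 1
WEIGHTS = {
    "capacity": 0.15,
    "performance": 0.25,
    "reliability": 0.25,
    "energy_efficiency": 0.10,
    "resilience": 0.15,
    "ecosystem": 0.10,
}


def weighted_score(scores, weights=WEIGHTS):
    """Combine per-dimension scores (assumed 0 to 10) into one weighted total."""
    if abs(sum(weights.values()) - 1.0) > 1e-9:
        raise ValueError("weights must sum to 1")
    return sum(weights[dim] * scores[dim] for dim in weights)


# Illustrative supplier baselines, to be updated as capabilities change
suppliers = {
    "vendor_a": {"capacity": 8, "performance": 9, "reliability": 7,
                 "energy_efficiency": 6, "resilience": 8, "ecosystem": 7},
    "vendor_b": {"capacity": 9, "performance": 7, "reliability": 8,
                 "energy_efficiency": 8, "resilience": 7, "ecosystem": 6},
}

# Rank suppliers from highest to lowest weighted score
ranking = sorted(suppliers, key=lambda name: weighted_score(suppliers[name]), reverse=True)
for name in ranking:
    print(name, round(weighted_score(suppliers[name]), 2))
```

Re-running the ranking whenever a baseline changes keeps the comparison current without redefining the criteria.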
Leverage the AI capabilities provided by storage vendors and work together on continuous AI capability improvement
The application of AI in storage systems greatly improves storage SLAs.
Enterprises should pursue joint innovation with storage vendors to strengthen AI capabilities, thus incubating AI capabilities that are more closely related to their actual service characteristics.
Update the capability model of enterprise IT teams and provide comprehensive pre-training for employees
With the introduction of intelligent storage devices, employees will need to learn and adapt to an increasing number of new technologies. Enterprises need to proactively establish training plans and technical support mechanisms to help their employees better understand and utilize the functions of intelligent storage devices and their management tools. This will ensure that intelligent storage devices deliver their promised functions, achieve better human-machine collaboration, and create more value for enterprises.