Why Mass Unstructured Data Needs a Professional Scale-Out Storage Foundation
Unstructured data accounts for more than 80% of new enterprise data and is increasingly important to production and decision-making.
According to Huawei's Global Industry Vision (GIV) report, the global data volume will reach 180 ZB by 2025, of which over 80% will be unstructured data. About 25% of that unstructured data is predicted to be used for production and decision-making in 2025, a share expected to soar to 80% by 2030.
Trends
New applications will give rise to mass unstructured data, and AI foundation models will accelerate the use of unstructured data in production and decision-making systems
Unstructured enterprise data is rapidly growing from PBs to EBs with the development of new technologies and applications such as 5G, cloud computing, big data, AI, and High Performance Data Analytics (HPDA). This data includes a mix of video, image, and file types.
A major carrier can process up to 15 PB of data on average every day. In HPDA scenarios, a single DNA sequencer, remote sensing satellite, or autonomous-driving training car can generate 8.5 PB, 18 PB, or 180 PB of data every year, respectively.
Production and decision-making systems have also started using unstructured data, and the adoption of AI foundation models across industries is set to expedite this evolution.
In the financial industry, one major bank is using its financial big data platform and AI analysis platform to facilitate online real-time credit extensions. This helps the bank shorten the loan application process from 15 minutes to just 1 minute and improve risky applicant identification accuracy by 80%. In the healthcare industry, the Pangu drug molecule model has learned the chemical structure of 1.7 billion drug-like molecules to help reduce both drug R&D periods and costs.
A growing number of industries are looking for professional-grade scale-out storage solutions for enterprise data centers to efficiently and securely store unstructured data
First, storage must provide sufficient capacity to store more data at minimum costs, footprint, and power consumption.
- Enterprises need to use mass unstructured data, so storage scale and scalability are now top considerations. A single cluster must support thousands of nodes to simplify storage resource allocation and management. In addition, capacity and performance must increase linearly as the number of nodes increases.
- The traditional multi-copy technique is also a capacity barrier to unstructured data storage. To optimize storage space utilization, enterprises need the data reduction techniques that professional scale-out storage provides, such as erasure coding (EC), deduplication, and compression. Replacing general-purpose servers with high-density storage hardware also helps reduce footprint, power consumption, and O&M complexity to achieve optimal TCO.
- Industry players can use professional scale-out storage that integrates software and hardware to provide enterprise customers with end-to-end solutions that deliver high reliability, performance, and scalability. This simplifies deployment, management, and services.
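To make the capacity argument for erasure coding over multi-copy storage concrete, here is a minimal Python sketch that compares usable-capacity ratios. The function names and the specific schemes (3 copies, EC 4+2) are illustrative assumptions, not figures from the GIV report or any particular product, and the sketch ignores deduplication, compression, and metadata overhead.

```python
def replication_efficiency(copies: int) -> float:
    """Fraction of raw capacity that holds unique data when every
    object is stored as `copies` full replicas."""
    if copies < 1:
        raise ValueError("copies must be at least 1")
    return 1 / copies


def ec_efficiency(data_fragments: int, parity_fragments: int) -> float:
    """Fraction of raw capacity that holds unique data under a
    k+m erasure coding scheme (k data fragments, m parity fragments)."""
    if data_fragments < 1 or parity_fragments < 0:
        raise ValueError("need k >= 1 and m >= 0")
    return data_fragments / (data_fragments + parity_fragments)


def raw_capacity_needed(usable_pb: float, efficiency: float) -> float:
    """Raw capacity (PB) required to store `usable_pb` of unique data."""
    return usable_pb / efficiency


if __name__ == "__main__":
    usable = 100.0  # PB of unique data; hypothetical workload size
    three_copy = replication_efficiency(3)
    ec_4_2 = ec_efficiency(4, 2)  # assumed scheme, tolerates 2 lost fragments
    print(f"3-copy : {three_copy:.1%} usable, "
          f"{raw_capacity_needed(usable, three_copy):.0f} PB raw")
    print(f"EC 4+2 : {ec_4_2:.1%} usable, "
          f"{raw_capacity_needed(usable, ec_4_2):.0f} PB raw")
```

Under these assumptions, storing 100 PB of unique data takes roughly 300 PB of raw capacity with three copies but about 150 PB with EC 4+2, while both layouts survive the loss of any two replicas or fragments.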
Second, storage must deliver efficient, on-demand, and policy-driven data mobility both within and between data centers.
- Multi-region, multi-form data center deployments need a data fabric that enables data resources to be shared across regions, clusters, vendors, and forms, with efficient, on-demand data scheduling through a graphical topology view.
- In a data center, professional scale-out storage can be used to implement hot, warm, and cold data tiering that automatically relocates data to different tiers for optimal ROI.
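The hot, warm, and cold tiering described above can be sketched as a simple age-based policy. This is a minimal illustration only: the thresholds (7 and 90 days), the tier names, and the `choose_tier` function are hypothetical, and real scale-out storage systems typically weigh more signals than last-access time.

```python
from datetime import datetime, timedelta

# Hypothetical policy thresholds; actual values depend on the workload.
HOT_MAX_AGE = timedelta(days=7)
WARM_MAX_AGE = timedelta(days=90)


def choose_tier(last_access: datetime, now: datetime) -> str:
    """Pick a storage tier from the time elapsed since the last access."""
    age = now - last_access
    if age <= HOT_MAX_AGE:
        return "hot"    # e.g. all-flash tier
    if age <= WARM_MAX_AGE:
        return "warm"   # e.g. high-density HDD tier
    return "cold"       # e.g. archive tier


if __name__ == "__main__":
    now = datetime(2024, 1, 31)
    # Hypothetical objects mapped to their last-access timestamps.
    objects = {
        "scan_0001.dcm": datetime(2024, 1, 30),
        "training_clip.mp4": datetime(2024, 1, 1),
        "audit_2022.log": datetime(2023, 1, 1),
    }
    for name, last_access in objects.items():
        print(name, "->", choose_tier(last_access, now))
```

A background job could evaluate such a policy periodically and relocate objects whose tier has changed, which is the automatic relocation the tiering bullet refers to.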
Third, storage must ensure premium data usability to easily handle hybrid workloads involving video, audio, image, and text data.
- Mass unstructured data serves a wide variety of applications. Therefore, all-flash scale-out storage specially designed for hybrid workloads is the best choice to prevent data silos and provide both high bandwidth for video, audio, and file scenarios and high IOPS for image, retrieval, and query scenarios. All-flash scale-out storage delivers significantly lower read and write latency than traditional HDD storage for faster data processing.
- In scenarios that involve mass data and hybrid workloads, technologies powered by unstructured data often involve multiple access protocols (such as those for file, object, and HDFS access) in a single data processing flow. To ensure premium usability, professional scale-out storage should reduce data redundancy by implementing multi-protocol interworking without data copying.
- In addition to storing mass unstructured data, storage systems also need to manage it effectively, for example, by accelerating queries and retrieval based on metadata as well as identifying hot and cold data for proper data lifecycle management.
- Storage is also the last line of defense for data, so it must ensure high intrinsic resilience and reliability through ransomware protection, disaster recovery (DR), and backup functions.
Suggestions
- Enterprise IT teams should strengthen their mass unstructured data processing capabilities
- As enterprises use unstructured data more widely, especially in their production and decision-making systems, the ability to efficiently store mass unstructured data and extract its value to help make informed decisions has become a key competitive edge. Enterprise IT teams therefore need to strengthen their mass unstructured data processing capabilities and shift from a structured data-centric focus to the design, planning, and management of mass unstructured data.
- Choose professional scale-out storage to build a foundation for mass unstructured data
- To improve the efficiency of using mass unstructured data for production, use a professional scale-out storage system to build a global unified data storage foundation centered on unstructured data. It is best to choose a scale-out storage system that supports hybrid workloads, multi-protocol interworking (file, object, and HDFS), data reduction, high-density hardware, and all-flash configurations to ensure sufficient capacity, superb data mobility, and premium usability.
Disclaimer: Any views and/or opinions expressed in this post by individual authors or contributors are their personal views and/or opinions and do not necessarily reflect the views and/or opinions of Huawei Technologies.