Since the widespread onset of COVID-19 in March 2020, I.T. organizations have faced one crisis after another. The longest-lasting has been massive supply chain shortages and disruptions for computer hardware, from PCs to servers and other data center equipment. Anecdotally, I.T. managers report that it is taking several months to get a new storage device and 10 months to receive new networking gear. These delays make it difficult to meet demands across the organization for capacity, while consuming ever more of I.T.’s time to manage a complicated global supply chain process.
Of course, supply chain disruptions have also resulted in escalating costs for new technology. To offset these costs, 97% of I.T. departments have undertaken alternative measures such as purchasing cheaper hardware and using an internal chargeback approach for budgeting, according to a survey by GetApp. Nearly 60% said they are coping by refurbishing older hardware.
These are not viable strategies when we’re talking about data center equipment: the servers, storage and networking gear that run the business and must guarantee fast, reliable and secure access to files, databases, applications and other data services. When it comes to I.T. infrastructure, the better approach is to extend the life of your existing hardware, use cloud services when and where you can, and manage data surgically to reduce waste on your most critical hardware assets and free up capacity.
Lengthening the life of your existing infrastructure also means less disruption to existing processes.
One of the highest budget categories in I.T. these days is data storage, which comprises at least 30% of the overall I.T. budget for most organizations, according to I.T. directors and executives surveyed for the Komprise 2022 State of Unstructured Data Management report. Below are three ways to address supply chain shortages without creating undue complexity and cost, such as haphazardly switching vendors or massively expanding capacity by stockpiling.
3 Tactics to Mitigate Supply Chain Hell in Data Storage
Get the Data on Your Data
Demand for data storage technology of course depends on your organization’s data profile, but many enterprises don’t have deep insight into data volumes, where data is stored, growth rates and data characteristics, especially across disparate silos. Gaining that insight entails understanding which departments and individuals are your top data owners and producers, the types and sizes of files that consume the most space on your storage devices (such as clinical images in a healthcare organization), usage metrics (such as time of last access) and storage costs by department.
With more insights on data characteristics, usage patterns and costs, I.T. can make better decisions about data management, such as which data sets meet the requirements for top-tier network attached storage (NAS) and backups. Gathering this data isn’t easy without automation and central visibility across storage, whether it’s located on-premises, at the edge or in the cloud. The data should be easy to view and act on, with the ability to drill down into metrics to see trends and to perform cost modeling of deploying different storage technologies including cloud storage.
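To make this concrete, below is a minimal sketch, assuming a POSIX file share mounted at a hypothetical /mnt/share, of how a basic storage profile can be gathered with a short Python scan. The mount point and the one-year “cold” threshold are illustrative; purpose-built tools do this at far greater scale and add owner, department and cost rollups.

```python
# Minimal sketch: profile a mounted file share by file type, size and last access.
# Assumptions (not from the article): a POSIX path such as /mnt/share, and that
# atime is meaningful on this filesystem (many NAS mounts use relatime or noatime).
import os
import time
from collections import defaultdict
from pathlib import Path

SHARE = Path("/mnt/share")          # hypothetical mount point
COLD_THRESHOLD = 365 * 24 * 3600    # treat "cold" as not accessed in a year

def profile(root: Path):
    by_ext = defaultdict(lambda: {"count": 0, "bytes": 0, "cold_bytes": 0})
    now = time.time()
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = Path(dirpath) / name
            try:
                st = path.stat()
            except OSError:
                continue                      # skip unreadable or vanished files
            bucket = by_ext[path.suffix.lower() or "(none)"]
            bucket["count"] += 1
            bucket["bytes"] += st.st_size
            if now - st.st_atime > COLD_THRESHOLD:
                bucket["cold_bytes"] += st.st_size
    return by_ext

if __name__ == "__main__":
    for ext, stats in sorted(profile(SHARE).items(),
                             key=lambda kv: kv[1]["bytes"], reverse=True):
        print(f"{ext:12} {stats['count']:>9} files "
              f"{stats['bytes'] / 1e9:9.1f} GB total "
              f"{stats['cold_bytes'] / 1e9:9.1f} GB cold")
```

A report like this, run per share, is often enough to show which file types dominate capacity and how much of that capacity has gone untouched for a year.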
Get the Older, Less Valuable Data Off Top Tier Storage and Data Protection — ASAP
A critical next step is to analyze data for usage, such as last access times. Many enterprises are storing petabytes of old or “cold” data, which has not been accessed in over a year, on their most expensive Tier 1 storage devices. Typically, cold data can account for 50% to 80% of all the data stored in Tier 1. By moving cold data to secondary storage or deep archives, including object storage in the cloud, you can free up critical space on your data center hardware, which extends its life and lowers the need to purchase (and wait for) a new NAS device.
If you can tier the cold data transparently, without disruption, it’s a no-brainer. Capacity savings come not only from primary storage but also from all the backup copies and replicas that are no longer needed once the cold data resides on tape or in cloud object storage. And then there are the cost savings, which can be as much as 70% annually.
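To illustrate the mechanics, here is a hedged sketch that assumes the boto3 SDK, an S3-compatible object store and a hypothetical bucket name. It walks a share and copies files not accessed in a year to a deep-archive storage class. Transparent tiering products leave a link or stub behind so users still see the file in place; this simple version does not, so it defaults to a dry run.

```python
# Minimal cold-data tiering sketch. The share path, bucket and prefix are
# hypothetical; a real deployment would also leave a stub/link behind and
# handle retries, permissions and verification.
import os
import time
from pathlib import Path

import boto3

SHARE = Path("/mnt/share")
BUCKET = "example-cold-tier"        # hypothetical bucket
PREFIX = "archive/share"
COLD_AFTER_DAYS = 365
DRY_RUN = True                      # set to False to actually upload

def tier_cold_files(root: Path):
    s3 = None if DRY_RUN else boto3.client("s3")
    cutoff = time.time() - COLD_AFTER_DAYS * 86400
    for dirpath, _dirs, files in os.walk(root):
        for name in files:
            path = Path(dirpath) / name
            try:
                st = path.stat()
            except OSError:
                continue
            if st.st_atime >= cutoff:
                continue                              # still warm, leave in place
            key = f"{PREFIX}/{path.relative_to(root)}"
            if DRY_RUN:
                print(f"would move {path} -> s3://{BUCKET}/{key}")
            else:
                s3.upload_file(str(path), BUCKET, key,
                               ExtraArgs={"StorageClass": "DEEP_ARCHIVE"})

if __name__ == "__main__":
    tier_cold_files(SHARE)
```

The dry-run pass doubles as a quick estimate of how many terabytes would come off the primary tier and out of the backup footprint.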
Manage Data Continuously Over Its Lifecycle
As data volumes have exploded in the last few years, driven primarily by unstructured data such as user files, video, sensor data and images, it is no longer viable to have a one-size-fits-all data management strategy. Data needs will always change unpredictably over time. Organizations need a way to continually monitor data usage and move data sets to where they need to be at the right time. This is not just a metrics discussion but requires regular communication and collaboration with departments. What are the new requirements for compliance, security and auditing? What about analytics needs in the coming quarter and year?
This information helps I.T. departments optimize decisions for ongoing data management while still keeping costs and capacity in mind. For instance, by knowing that the R&D team always wants its data available for re-testing for up to three months after a project, I.T. can keep that data on the NAS for 90 days and then move it to a cloud data lake for long-term storage and potential cloud machine-learning or artificial-intelligence analysis. Even better, some unstructured data management systems can be configured to give authorized end users access to their own shares so they can search for specific files, view data usage or chargeback metrics, and even tag data with key information to make future searches easier.
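One way to capture these agreements is as an explicit, per-department policy that software can evaluate. The sketch below is purely illustrative; the department names, retention windows and archive targets are assumptions for the example, not recommendations or any particular product’s behavior.

```python
# Illustrative per-department lifecycle policy. Departments, retention windows
# and archive targets are assumed values for the example only.
from dataclasses import dataclass
from datetime import date, timedelta
from typing import Optional

@dataclass
class Policy:
    keep_on_nas_days: int      # days to retain on primary NAS after project close
    archive_target: str        # where the data should live afterwards

POLICIES = {
    "r_and_d": Policy(keep_on_nas_days=90,  archive_target="cloud-data-lake"),
    "finance": Policy(keep_on_nas_days=365, archive_target="compliance-archive"),
    "default": Policy(keep_on_nas_days=180, archive_target="cloud-object-store"),
}

def placement(department: str, project_closed: date,
              today: Optional[date] = None) -> str:
    """Return where a project's data should live today under its department policy."""
    today = today or date.today()
    policy = POLICIES.get(department, POLICIES["default"])
    if today <= project_closed + timedelta(days=policy.keep_on_nas_days):
        return "primary-nas"
    return policy.archive_target

# Example: an R&D project that closed 120 days ago belongs in the data lake.
print(placement("r_and_d", date.today() - timedelta(days=120)))
```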
An often-overlooked aspect of data lifecycle management is retention and deletion policies. Every organization should have such policies and a way to enforce them safely and systematically. Policy-based workflows can automatically confine data for review and deletion. Data that no longer has a viable use is a good place to start: files that serve no purpose after a project completes, redundant research copies and ex-employee data.
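Here is a hedged sketch of that confine-review-delete pattern. The source path, quarantine location and 30-day review window are hypothetical; the point is that candidates are moved aside and logged first, then purged only after the review period has passed.

```python
# Illustrative "confine, review, then delete" workflow. Paths and the review
# window are assumptions for the example; a production workflow would notify
# data owners and keep an audit trail of every deletion.
import json
import shutil
import time
from pathlib import Path

SOURCE = Path("/mnt/share/ex_employees")   # hypothetical deletion candidates
QUARANTINE = Path("/mnt/quarantine")
MANIFEST = QUARANTINE / "manifest.jsonl"   # records when each file was confined
REVIEW_DAYS = 30                           # grace period before permanent deletion

def confine(root: Path):
    """Move deletion candidates into quarantine so owners can review them."""
    QUARANTINE.mkdir(parents=True, exist_ok=True)
    with MANIFEST.open("a") as log:
        for path in root.rglob("*"):
            if not path.is_file():
                continue
            dest = QUARANTINE / path.relative_to(root)
            dest.parent.mkdir(parents=True, exist_ok=True)
            shutil.move(str(path), str(dest))
            log.write(json.dumps({"file": str(dest),
                                  "confined_at": time.time()}) + "\n")

def purge_expired():
    """Delete quarantined files whose review window has elapsed."""
    if not MANIFEST.exists():
        return
    cutoff = time.time() - REVIEW_DAYS * 86400
    for line in MANIFEST.read_text().splitlines():
        entry = json.loads(line)
        target = Path(entry["file"])
        if entry["confined_at"] < cutoff and target.exists():
            print(f"deleting {target}")
            target.unlink()
```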
It’s unclear when the current supply chain crisis, which has multiple causes and is affecting many different products beyond I.T., will end. “Shipping lines expect problems to last through the first half of 2023 as increasingly widespread port congestion adds further pressure to already record-low schedule reliability,” according to a report by S&P Global Market Intelligence. As reported in Bloomberg, Michigan State logistics expert Jason Miller expressed concern about the ongoing insufficient supply of labor and raw materials: “At the rate things are going, I hope issues will start to substantially ease by summer 2023, but I’m increasingly worried that prediction may be too optimistic.”
With all this to consider, data storage managers should work closely with vendors to devise smart data management strategies to weather the storm. They will also need to get as analytical as possible about the data in storage so they can extend the life of their data center hardware, move data to the cloud whenever possible, and delete data that is simply wasting space and money.
Kumar Goswami is the CEO and Co-founder of Komprise.