How do I get old and cold data out of my primary storage system?

By | 21. May 2024

Hello everyone,

Almost every administrator knows the problem: it feels like 90% of all data on their storage system is “just old data and garbage anyway”, but this is usually difficult to prove.

NetApp ONTAP can identify old and cold data at the storage (block) level. Since ONTAP 9.4 (and that was a long time ago), you can activate Inactive Data Reporting on your ONTAP storage systems by issuing the following command in the CLI of your ONTAP cluster:

storage aggregate modify -aggregate youraggregatename -is-inactive-data-reporting-enabled true

(In newer ONTAP versions, 9.6 and later, this feature is already activated by default on SSD aggregates.)

And then?

Then you wait a bit… 😉 ONTAP will gradually “notice” which of your data is really being actively used and which is not.

You can then see in System Manager under Storage -> Tiers -> Volumes how much inactive, i.e. cold, data each volume holds. Depending on the version, ONTAP considers a block cold once it has not been accessed for at least 30 days. In my demo system, this adds up to around 230 GB in total.
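If you prefer the CLI over System Manager, the same figures should be available there too. This is a minimal sketch, assuming an aggregate called youraggregatename and an SVM called svm1 (both placeholders). The field names are the ones I know from current ONTAP releases and may differ in yours, so check the command reference for your version. The lines starting with # are annotations only and are not meant to be typed.

```
# Check whether Inactive Data Reporting is enabled on the aggregate (placeholder name)
storage aggregate show -aggregate youraggregatename -fields is-inactive-data-reporting-enabled

# Inactive (cold) user data per aggregate, absolute and in percent
storage aggregate show-space -aggregate youraggregatename -fields performance-tier-inactive-user-data,performance-tier-inactive-user-data-percent

# Inactive (cold) user data per volume of the SVM svm1 (placeholder name)
volume show -vserver svm1 -fields performance-tier-inactive-user-data,performance-tier-inactive-user-data-percent
```

Right after activation, the values may still be empty, because ONTAP first needs the cooling period to classify the blocks.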

Out “in the field”, the values are in the terabyte range. From what I see there, between 70 and 90% of the file data has not been touched for more than 90 days. This is actually a huge “waste” of expensive storage space.

So if you have an all-flash system in use, ask yourself why you might be “wasting” tons of “expensive” SSD storage for old and cold data.

And what do I do with this cold/old data on my All Flash system?

Good question, tough question… You have several options:

Option 0.1: “I simply delete the cold data!”

Hahaha, you know the consequence yourself 😉

Option 1: Leave everything as it is. “I have enough storage space anyway, that’s enough for years”

Congratulations on this option. You and your company have enough money and are showering the IT department with resources. When can I start with you?

Option 2: Tiering to cheaper storage on-premises

You buy another storage system with cheap rotating spindles and store all your cold and old data on it.

“Hi boss, I know we don’t have any money for IT, but I know how we only need to buy a very small amount to… “PISSOFFF!!!!””

Option 3: Tiering the cold data to “The Cloud”

“Hi boss, we’re running out of disk space on our primary storage system. I have an option that solves this problem without CAPEX (capital expenditure), i.e. without investment. With an OPEX (operational expenditure) option as part of a Storage as a Service offering, we can outsource our old and cold data to a provider within Germany. This reduces the storage requirements of our “expensive” all-flash storage in our own data center. We outsource the old and unused data to an ISO 27001 and ISO 9001 certified provider.”

Your boss presents you with the golden oak leaf wreath of the employee of the year, you are promoted, you now drive a Porsche, a coffee fountain with your initials has been erected in the office, there is free beer in your name every evening, the IT department plays Counter-Strike with 3D glasses for eight hours a day and you are invincible as a Halo Master Sergeant.

… Wait a minute… Counter-Strike?… Halo Master Sergeant?…

… Your dream is shattered!

You wake up and realize that you are still in the real world.

But not everything was a dream. Tiering of cold data is real and I’ll show you how fast it works.

Ready to rumble, Master Sergeant?

Okay, let’s go!

In your ONTAP System Manager, click on Storage -> Tiers (1) -> Add cloud tier (2) -> StorageGRID (3).

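Then we fill in the required connection details for the object store, such as the server name, the access key and secret key, and the name of the bucket (container).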
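The same step can also be done in the CLI. This is only a sketch under assumptions: my-sgws, sgws.example.com, fabricpool-bucket and youraggregatename are placeholders I made up, and the key values are deliberately left as placeholders. The lines starting with # are annotations only.

```
# Define the StorageGRID object store (all names and keys are placeholders)
storage aggregate object-store config create -object-store-name my-sgws -provider-type SGWS -server sgws.example.com -container-name fabricpool-bucket -access-key <access-key> -secret-password <secret-key>

# Attach the object store as cloud tier to the aggregate
storage aggregate object-store attach -aggregate youraggregatename -object-store-name my-sgws

# Verify the result
storage aggregate object-store show
```

Keep in mind that, as far as I know, attaching a cloud tier to an aggregate cannot simply be undone afterwards, so try it on a test aggregate first.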

Then I can set the tiering options for each volume individually.

Here I choose between Snapshot copies only, Auto, None and All.

Explanation of the cloud tier settings:

Setting | Explanation
Snapshot copies only | Only snapshots are stored in the cloud tier.
Auto | Cold blocks are swapped out after a certain cooldown time.
None | Nothing is swapped out.
All | All incoming blocks are swapped out immediately. In this and all other cases, however, at least 10% of the metadata remains on the source.
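For those who would rather set the policy per volume in the CLI, here is a minimal sketch. It assumes an SVM called svm1 and a volume called vol1 (both placeholders). The four settings from the table correspond to the values snapshot-only, auto, none and all. The lines starting with # are annotations only.

```
# Set the tiering policy of one volume (svm1 and vol1 are placeholders)
volume modify -vserver svm1 -volume vol1 -tiering-policy auto

# Check which policy is currently active
volume show -vserver svm1 -volume vol1 -fields tiering-policy
```

The cooldown time can be adjusted with the -tiering-minimum-cooling-days parameter of volume modify. As far as I know, this requires the advanced privilege level, so treat that part as an assumption and check it for your ONTAP version.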

Now that I have swapped out some of my volumes using cloud tiering, also known as FabricPool, I can see the direct benefits shortly afterwards. In this example, I was able to outsource just under 55 GB.

Now you’re wondering… Why are there two green arrows in the graphic? Well, that’s quite simple. I can also mirror the target to which I want to replicate my cold data! So it’s not a single point of failure.

But more on that in the next blog post.

I hope I was able to help, inspire or challenge you. Feel free to ask your questions in the comments.

Best regards

André Unterberg aka DerSchmitz
