Backup & Restore
E-3 Editor-in-Chief Peter Färbinger spoke with Alexander Wallner, NetApp Managing Director Germany, Michael Scherf, Member of the Management Board of All for One Steeb, and Martin Finkbeiner, Managing Director of Grandconsult, a subsidiary of All for One Steeb.
The 2015 study "Espionage, sabotage and data theft - business protection in the digital age" conducted by the industry association Bitkom shows that only 49 percent of German companies have an IT emergency plan.
"From our own experience, we can say that these figures are quite realistic, depending on the industry and business model"
Alexander Wallner confirms the current situation at the beginning of the interview.
Research also shows that emergency plans are, unfortunately, often not even worth the paper they are printed on. Michael Scherf from All for One Steeb therefore distinguishes between the IT side and the organizational side:
Anyone who develops an IT contingency plan will also define regular test runs in it. These runs check that the restored data is usable and consistent.
However, the organizational processes surrounding data recovery should also be tested and trained. Particularly in industries that carry out high volumes of transactions or are closely integrated into supply chains, such as large online retailers or companies in the automotive supply industry, every minute counts in an emergency.
At that point there is no time to discuss in detail who should do what.
"Backup and business continuity are basically a central topic in every IT organization, and the CIO's job here is to ensure data security and data availability. So much for the status quo"
says Alexander Wallner.
"What is changing, however, are the business processes and increasingly even the business models - and thus essential framework conditions. In recent years, for example, data has increasingly become a central production factor.
Digital transformation, with its end-to-end digital process chains and highly data-centric business models that analyze customer behavior, for example, means that the message has also reached senior management: IT systems must be fail-safe."
If they do fail, it must be possible to restore them very quickly. IT managers not only need to demonstrate effective business continuity strategies to management; effective measures are also required to comply with data protection regulations.
"Anyone who knows this background would really assume that 100 percent of all companies have detailed business continuity contingency plans in place, right?"
Wallner asks.
In practice, however, things are clearly different.
The K-case
Predictions are difficult, especially when they concern the future. Probably the best precaution for the K-case (from the German "Katastrophenfall", the disaster scenario) is therefore to practice standards and processes.
"With some companies, we run such tests once a quarter, which already provides a high level of confidence that data and operations can be safely recovered.
However, anyone who operates a constantly changing SAP landscape, one that is regularly expanded or continuously updated, should run such exercises more often."
warns Martin Finkbeiner of Grandconsult, a subsidiary of All for One Steeb.
If recovery from the most recent backup fails, older versions must be used. It may then also be necessary to go back to older system patch levels or even to reinstate older hardware components so that the IT environment is compatible with the old backup data.
"In an emergency, however, it is often too late to do so"
Finkbeiner knows from his professional experience.
"Therefore, sufficient precautions should always be taken to ensure that even in the K-case, which can never be completely ruled out, as much as possible goes smoothly right away during data recovery."
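The fallback described above, from the most recent backup to older generations, can be sketched in a few lines. This is a minimal illustration only: the `BackupGeneration` type and the `restore` and `validate` callables are hypothetical placeholders and do not correspond to any product interface.

```python
from dataclasses import dataclass
from datetime import date
from typing import Callable, Iterable, Optional


@dataclass(frozen=True)
class BackupGeneration:
    """One backup generation (hypothetical model for illustration)."""
    taken_on: date
    label: str


def restore_newest_valid(
    generations: Iterable[BackupGeneration],
    restore: Callable[[BackupGeneration], object],
    validate: Callable[[object], bool],
) -> Optional[BackupGeneration]:
    """Try generations from newest to oldest; return the first that validates."""
    for gen in sorted(generations, key=lambda g: g.taken_on, reverse=True):
        try:
            restored = restore(gen)
        except Exception:
            # The restore failed outright (e.g. unreadable medium): try an older one.
            continue
        if validate(restored):
            return gen
    return None  # No generation could be restored and validated.
```

Running such a loop in a regular exercise, rather than for the first time in an emergency, shows in advance how far back one would have to go.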
Surveys have shown that the most common cause of data loss is a hardware or software defect. This can be the failure of a hard disk or a storage controller, but also a temporarily faulty application that does not save its data quite correctly.
Operating errors are also very common, as is data loss caused by a sudden power failure in which the emergency power supply cannot kick in in time.
Malicious software such as viruses or natural forces such as fire and water are significantly less likely to be the cause of data loss.
"In an international survey, the Federal Statistical Office cites errors in IT components as the leading cause of data loss, followed by human error, power outages and weather-related outages"
explains Alexander Wallner from NetApp.
Can the K-case be practiced or automated?
"The question of whether it can be done doesn't even arise"
says NetApp manager Wallner.
"Rather, it's a must to train on the operational and organizational procedures of an IT outage, including data recovery."
The IT-Grundschutz (basic IT protection) catalog of the BSI, the German Federal Office for Information Security, prescribes "exercises for data reconstruction". How these are carried out in practice can vary.
"Some IT organizations test their contingency plans in as many as four test runs per year"
adds Michael Scherf from All for One Steeb.
But he also knows the reality: unfortunately, many IT organizations only get around to performing the most essential steps once a year.
"In the event of a disaster, however, this attitude leads to operational chaos and significantly delayed start-up times"
warns Scherf.
Priorities in view
From a purely technological point of view, there are no hurdles when it comes to backup. Even for the largest SAP Hana scenarios with Hadoop clusters, there are secure and efficient backup systems. To set up the necessary processes within the IT organization, some providers offer free best practice examples and tips on the Internet.
"The biggest threat is basically the day-to-day project work in IT and the efficiency and cost pressures"
warns Martin Finkbeiner.
"This causes activities regarded as unproductive, such as backup, to slip to the bottom of the priority list."
Data backup is something everyone does - right?
"Data backup, in one form or another, is in fact done by just about every company"
Martin Finkbeiner from Grandconsult is convinced. The purpose of backing up data, however, is to be able to restore it very quickly and without errors in an emergency, so that business operations are affected as little as possible.
"It is often not until the restore that it becomes apparent how effective the backup strategy actually is"
says Finkbeiner, stressing that it is the recovery that really matters in the end.
"There are many reasons why a recovery can fail"
as his colleague Michael Scherf knows.
Technical problems can lead to data on a storage medium no longer being readable or a backup not being saved consistently. In modern backup environments from NetApp, for example, magnetic tapes have long been dispensed with entirely.
Particularly tricky are backups that ran cleanly from a purely technical point of view but whose restore delivers unusable results, for example because of errors or an unfavorable combination of factors in the application landscape.
"Restore problems like this also occur again and again in practice. So there are very different factors that can lead to a faulty restore."
adds Alexander Wallner.
False security?
With redundant data storage, emergency data centers with data mirroring, etc., the data is usually very secure. So what's the point of backing up?
"Depending on the business model and industry, data availability requirements vary widely, so backup solutions are still necessary despite redundant storage systems and business continuity concepts"
Alexander Wallner stresses in the E-3 interview.
In addition, the backup also assumes the role of archiving and thus fulfills legal requirements.
"Therefore, companies always need a multi-level concept that ranges from current operational data to long-term archiving"
Wallner explains. Michael Scherf adds:
"In order for an emergency data center, for example, to be able to step in immediately in the event of an emergency, more or less permanent data backup is required, depending on the requirements of the business operation."
In addition, several generations of data backups often have to be kept restorable so that specific parts of a complete backup can be restored selectively.
Not Zero Downtime Everywhere
"Increasingly, business operations can only tolerate a small amount of downtime"
Michael Scherf describes the changing business.
"The window for restarting IT is therefore getting narrower and narrower. The appropriate solutions for this are anything but trivial, but with the involvement of a specialized external service provider, they can still be implemented economically."
Online backup via snapshots that does not interfere with ongoing IT operations, together with very fast and targeted recoverability, is a key requirement here. Scherf knows this from practice:
"An SAP table is quickly corrupted in day-to-day business, and then only this table is to be restored from the last data backup."
Classic magnetic tapes or tape robots are therefore clearly on the retreat for modern backup tasks. Today, IT experts prefer high-availability backup networks.
The time window for backup and recovery, at a time of sharply growing data volumes, is considered the decisive parameter for an effective data protection system. NetApp offers SnapManager and SnapCreator for classic databases.
Both products use snapshot technology for fast and performance-neutral backups. Among other things, these backup tools use the SAP Backint interface and are thus integrated into SAP backup management and monitoring (DB12/DB13).
"There are no blanket data protection concepts, as customers have a wide variety of SLAs"
Alexander Wallner emphasizes.
"It is always important to consistently derive the data protection concept from the requirements of the business operation"
Scherf describes the scenario and raises some questions:
- How long can IT be down without having an unacceptable impact on my business?
- How much data loss is acceptable to me because I can re-enter the data manually if necessary?
- Which applications are absolutely business-critical and can even be designed for zero downtime, and which are not?
- Which penalties from which of my customers take effect, and when, if, for example, as a supplier to the automotive industry I can no longer meet delivery schedule calls on time?
- What kind of revenue losses should I expect if my trading platform fails?
- How much time do I have to make up for the losses later through a correspondingly higher number of transactions?
- Is my IT landscape even designed to handle that higher transaction volume?
"Only when the business case is clearly and robustly defined can the appropriate IT service continuity strategy be derived from the business continuity requirements"
adds his colleague Martin Finkbeiner. Important, says Finkbeiner: The aforementioned basic questions about the requirements of business operations should be revisited periodically.
Especially in times of digital transformation and its enormous business dynamics, experience shows that the same questions are answered fundamentally differently a year later.
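One way to make the answers to the questions above checkable is to record them as numbers and compare them with what the current backup setup actually delivers. The sketch below assumes two figures per application, the maximum tolerable downtime and the maximum tolerable data loss, both in minutes. All class names, field names and values are hypothetical and chosen for illustration only.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class ContinuityRequirement:
    """Business requirement for one application (illustrative model)."""
    application: str
    max_downtime_min: float   # how long IT may be down
    max_data_loss_min: float  # how much recent data may be lost


@dataclass(frozen=True)
class BackupSetup:
    """What the current setup delivers, e.g. as measured in restore tests."""
    backup_interval_min: float   # time between two consecutive backups
    measured_restore_min: float  # duration of the last successful restore test


def requirement_gaps(req: ContinuityRequirement, setup: BackupSetup) -> List[str]:
    """Return a description of every requirement the setup does not meet."""
    problems = []
    # In the worst case, everything since the previous backup is lost.
    if setup.backup_interval_min > req.max_data_loss_min:
        problems.append(
            f"{req.application}: backup every {setup.backup_interval_min} min, "
            f"but only {req.max_data_loss_min} min of data loss is tolerable"
        )
    if setup.measured_restore_min > req.max_downtime_min:
        problems.append(
            f"{req.application}: restore takes {setup.measured_restore_min} min, "
            f"but only {req.max_downtime_min} min of downtime is tolerable"
        )
    return problems
```

Because the requirements change over time, rerunning such a comparison whenever the questions are revisited keeps the backup concept tied to the business case.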
"Data protection becomes especially complex when the volume of data increases and the number of systems that need to be backed up in parallel grows, as is the case with SAP landscape backups"
NetApp manager Wallner knows from his professional experience. In addition, there are the long backup runtimes that put a strain on system operation. With the help of storage-based backup methods, backup times can be minimized, performance losses can be almost completely neutralized, and the simultaneous backup of an entire SAP landscape can be realized.
"If you want to optimize your backup strategy without a lot of administrative effort, use a cloud storage gateway, for example"
says Wallner.
Cloud backup
The AltaVault solution offered by NetApp is available as a physical appliance or virtual machine and handles the transfer of the company's own backup data to any cloud provider or even to a private cloud.
AltaVault can be deployed in any SAP landscape and works with popular backup applications.
Technologically, AltaVault provides access to the cloud in much the same way as a network drive: protocols such as CIFS (Common Internet File System) and NFS (Network File System) allow IT to directly reuse existing backup processes and software. This protects investments already made and speeds up implementation.
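Because such a gateway is addressed like a network drive, an existing backup script can in principle treat it like any other directory. The sketch below assumes the CIFS or NFS share is already mounted at some path. The helper function and all paths are hypothetical and are not part of any NetApp interface.

```python
import shutil
from pathlib import Path


def copy_backup_to_share(backup_file: Path, share_root: Path) -> Path:
    """Copy a finished backup file to a mounted share and check its size."""
    if not share_root.is_dir():
        raise FileNotFoundError(f"share not mounted: {share_root}")
    target = share_root / backup_file.name
    # Ordinary file copy; copy2 also preserves timestamps where possible.
    shutil.copy2(backup_file, target)
    if target.stat().st_size != backup_file.stat().st_size:
        raise IOError(f"size mismatch after copy: {target}")
    return target
```

A call might look like `copy_backup_to_share(Path("/backup/export.dump"), Path("/mnt/cloud-gateway"))`, where both paths are placeholders.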
"In addition, the solution can be deployed with public clouds such as AWS, Azure or Softlayer."
Alexander Wallner is proud of NetApp's expertise. IT service providers offer cloud backup services for companies in various forms and at various quality levels.
"The problem here is the comparability of services, as CIOs also need to look at SLAs when purchasing cloud services, for example"
Wallner emphasizes in the E-3 interview. Evaluating different providers is therefore quite time-consuming for the IT department. This is the background to the "Backup as a Service" (BaaS) offering.
Based on NetApp technologies, authorized service providers offer the complete service of backup to the cloud. The special feature here: NetApp certifies the service of the IT service providers, thus practically taking over the quality control for the enterprise customers.
"In addition, partners may only use data centers in Germany for BaaS"
Wallner explains. Today, ten partners in Germany already offer BaaS services, which are of course also well suited to SAP customers.
Hana Backup
Unlike traditional databases, which primarily read their data from disk or flash, in-memory computing systems like SAP Hana keep most of their data in main memory.
"This creates new demands on the backup infrastructure, as significantly more data needs to be backed up"
Alexander Wallner knows the new technological challenges. This data is typically streamed continuously to a backup system, because classic procedures with daily delta backups no longer work here due to the volume of data.
For a backup in the terabyte range, backing up to disk and then to tape can take several hours. Restoring the data takes a similar amount of time.
"That's why many companies today are working with the concept of snapshots"
Wallner knows.
"Here, backup copies of the operational SAP system are continuously created and backed up without burdening the productive systems. Recovery is thus considerably faster."
NetApp's FAS or AFF storage systems create these snapshots for Hana backup in just a few seconds. An analysis conducted by NetApp among existing Hana customers showed that the average time for a Hana snapshot backup is 19 seconds.
Even complex backups take no more than a minute at customer sites.
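A rough calculation puts these figures in proportion. The data volume and throughput used below are assumed values for illustration, not measurements from the interview; only the 19-second snapshot average comes from the NetApp analysis cited above.

```python
def transfer_hours(volume_tb: float, throughput_mb_per_s: float) -> float:
    """Hours needed to copy volume_tb terabytes at a sustained throughput.

    Uses decimal units (1 TB = 1,000,000 MB).
    """
    if throughput_mb_per_s <= 0:
        raise ValueError("throughput must be positive")
    seconds = volume_tb * 1_000_000 / throughput_mb_per_s
    return seconds / 3600


# Assumed example: 4 TB copied at a sustained 200 MB/s.
hours = transfer_hours(4, 200)
snapshot_seconds = 19  # average reported in the NetApp analysis
print(f"streamed copy: {hours:.1f} h, snapshot: {snapshot_seconds} s")
```

With these assumed values the streamed copy takes roughly five and a half hours. The two numbers measure different things, since a snapshot does not move the full data volume at backup time, but they show why the backup window shrinks so drastically.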
"SAP solutions are mostly business critical"
adds Michael Scherf from All for One Steeb.
"The increased use of Hana is also changing the technology base, which in turn is increasingly being used to redesign existing business processes or even business models."
This changed "big picture" also influences the organization of data backup and data recovery. In addition, the temporary inclusion of resources from the public cloud, such as compute power, offers completely new scaling options.
"Our procedural model for the Restore-Schutzbrief (Restore Protection Letter) therefore ranges from reviewing the requirements of the specific business case, through analyzing and comparing them with the backup processes and technologies already in place and evaluating suitable target scenarios, to implementing them and, above all, providing continuous services during ongoing operations"
Scherf explains the holistic backup and restore model that has been developed.
In conclusion: What advice would this panel give to an existing SAP customer about reviewing their data backup?
"This is where we recommend relying on a service provider to verify and validate backups"
answers Martin Finkbeiner for everyone.
"For this purpose, we have developed our offer for the Restore Protection Letter. The basic idea is:
Backup is not everything, but without validation of the restored backup, everything else is worth nothing. So: no backup without validation of the restore."
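At the file level, validating a restore can be as simple as recording checksums when the backup is taken and comparing them after a test restore. The following is a minimal sketch under assumptions, not the method behind the Restore Protection Letter: the function names and directory layout are hypothetical, and a real SAP restore test would also need consistency checks at the application level.

```python
import hashlib
from pathlib import Path
from typing import Dict, List


def checksum_manifest(root: Path) -> Dict[str, str]:
    """Map each file's path relative to root to its SHA-256 digest."""
    manifest = {}
    for path in sorted(root.rglob("*")):
        if path.is_file():
            digest = hashlib.sha256()
            with path.open("rb") as handle:
                # Read in 1 MiB chunks so large files do not fill memory.
                for chunk in iter(lambda: handle.read(1 << 20), b""):
                    digest.update(chunk)
            manifest[path.relative_to(root).as_posix()] = digest.hexdigest()
    return manifest


def validate_restore(expected: Dict[str, str], restored_root: Path) -> List[str]:
    """Return files that are missing or differ after the restore (empty = valid)."""
    actual = checksum_manifest(restored_root)
    return sorted(
        name for name, digest in expected.items() if actual.get(name) != digest
    )
```

The manifest would be created from the source data at backup time and stored with the backup; an empty result from `validate_restore` after a test restore means every recorded file came back unchanged.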