Are Your SaaS Backups as Secure as Your Production Data?


Conversations about data security tend to diverge into three main threads:

All are valid and necessary conversations for technology organizations of all shapes and sizes. Still, the average company uses 400+ SaaS applications. The same report also uncovered that 56% of IT professionals aren’t aware of their data backup responsibilities. This is alarming, given that 84% of survey respondents said at least 30% of their business-critical data lives inside SaaS applications.

SaaS data isn’t like on-premises or cloud data because you have no ownership over the operating environment and far less ownership of the data itself. Due to those restrictions, creating automated backups, storing them in secure environments, and owning the restoration process are far more complicated engineering tasks.

That inflexibility leads organizations to develop workarounds and manual processes to back up SaaS data, leaving those backups in far less secure environments. That is a shame, because your backups are almost as valuable to attackers as your production data. Organizations that treat SaaS data with less care, even in light of double-digit growth in the usage of SaaS apps, are handing over the keys to their kingdom in more obvious ways than they might expect. With the threat of data loss looming, what is the cost to your business if you don’t move quickly to build a SaaS data recovery plan?

Let’s illustrate a common scenario: Your team has a single GitHub organization where your entire engineering team collaborates on development and deployment projects on several private repositories.

Now, let’s tweak that illustration with a less common addition: You have backups for all of your GitHub data, which includes not only the code in each of those repositories but also metadata like pull request reviews, issues, project management, and more.

In this case, your GitHub backup data won’t contain passwords or personally identifiable information (PII) about your employees besides what they’ve already made public on their GitHub profiles. It also wouldn’t allow an attacker to move laterally to your production servers or services because they haven’t yet found their attack vector or point of intrusion. You’re still not out of the woods, however: backup data of all kinds contains information attackers can learn from, letting them infer how your production environment operates.

Every insecure backup and clone of your private code is remarkably valuable if the attacker only aims to steal intellectual property (IP) or leak confidential information about upcoming features, partnerships, or mergers and acquisitions activity to competitors or for financial fraud.

Your Infrastructure as Code (IaC) and CI/CD configuration files would also be of particular interest, as they identify the topology of your infrastructure, expose your testing infrastructure and deployment stages, and reveal all the cloud providers or third-party services your production services rely on. These configuration files rely on secrets such as passwords or authentication tokens. Even if you’re using a secret management tool to keep the actual secrets out of version control on GitHub, an attacker will be able to quickly identify where to look next, be that HashiCorp Vault, AWS Secrets Manager, Cloud KMS, or one of the many alternatives.
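To make that concrete, here is a minimal sketch of how little effort it takes to mine a backed-up configuration file for hints about where secrets live. You could run something like it against your own backups to see what they give away. The patterns, the function name `find_secret_store_hints`, and the sample configuration are all illustrative assumptions, not an exhaustive or official detection list.

```python
import re

# Illustrative patterns only (assumptions for this sketch, not a complete list).
SECRET_STORE_PATTERNS = {
    "HashiCorp Vault": re.compile(r"\bvault\b|VAULT_ADDR", re.IGNORECASE),
    "AWS Secrets Manager": re.compile(r"secretsmanager", re.IGNORECASE),
    "Cloud KMS": re.compile(r"cloudkms|gcp-?kms", re.IGNORECASE),
}


def find_secret_store_hints(config_text: str) -> dict[str, list[int]]:
    """Return {store name: [1-based line numbers]} for lines mentioning a store."""
    hints: dict[str, list[int]] = {}
    for line_no, line in enumerate(config_text.splitlines(), start=1):
        for store, pattern in SECRET_STORE_PATTERNS.items():
            if pattern.search(line):
                hints.setdefault(store, []).append(line_no)
    return hints


# Hypothetical CI/CD snippet: no secret values, only references to them.
sample = """\
deploy:
  env:
    VAULT_ADDR: https://vault.internal.example
  steps:
    - run: aws secretsmanager get-secret-value --secret-id prod/db
"""

print(find_secret_store_hints(sample))
```

Even though the sample contains no secret values, the references alone tell a reader which secret stores to target next.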

Because you’re also backing up your metadata in this illustration, an insecure implementation leaves your pull requests and issue comments, which you have otherwise hidden inside your private GitHub repositories, available for an attacker to explore. They’ll quickly learn who has privileges to approve and merge code into each repository and explore checklists for deployment or remediation to identify weaknesses.
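As a rough illustration of how quickly that picture comes together, the sketch below groups approving reviewers by repository from backed-up review records. The `user.login` and `state` fields loosely mirror GitHub's pull request review objects, but the `repo` key, the record layout, and the sample names are assumptions about how a backup tool might store this metadata.

```python
from collections import defaultdict


def approvers_by_repo(reviews: list[dict]) -> dict[str, set[str]]:
    """Map each repository to the set of users who approved a pull request in it."""
    approvers: dict[str, set[str]] = defaultdict(set)
    for review in reviews:
        # Only approvals indicate who can sign off on a merge.
        if review.get("state") == "APPROVED":
            approvers[review["repo"]].add(review["user"]["login"])
    return dict(approvers)


# Hypothetical records as they might appear in a metadata backup.
backed_up_reviews = [
    {"repo": "payments-api", "user": {"login": "alice"}, "state": "APPROVED"},
    {"repo": "payments-api", "user": {"login": "bob"}, "state": "COMMENTED"},
    {"repo": "infra", "user": {"login": "carol"}, "state": "APPROVED"},
]

print(approvers_by_repo(backed_up_reviews))
```

A few lines of code are enough to turn raw review history into a shortlist of people worth targeting, which is why this metadata deserves the same protection as the code itself.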

With this information, they can craft a far more targeted attack, either directly against your infrastructure or using social engineering methods, like pretexting, on employees they now understand to have admin-level privileges.

In short, SaaS data has never been more critical to your organization’s hour-by-hour operations. Whether you’re using a code collaboration platform like GitHub, productivity tools like Jira, or even leveraging Confluence as the core provider (and dependency) of an entire brand, you’re beholden to environments you don’t own, with data management practices you can’t fully control, just to keep the lights on.

SaaS data is uniquely vulnerable because, unlike on-premises data, there are two stakeholders: your provider and you. Your provider could experience data loss, like when GitLab lost 300GB of user data in just a few seconds when an engineer accidentally deleted their production database. You could make an honest mistake, like accidentally deleting your instance or uploading a CSV that instantly corrupts every facet of your data.

Awareness is a major concern. In a 2023 report from AppOmni, 85% of the IT and cybersecurity experts they surveyed claimed there is no security problem around SaaS. Yet, 79% of those same folks admitted their organization had identified at least one SaaS-based cybersecurity threat in the last 12 months. The most common incidents were vulnerabilities in user permissions, data exposure, a specific cyber attack, and human error.

At the same time, a report by Oracle and analyst firm ESG uncovered that only 7% of chief information security officers (CISOs) said they fully understand the Shared Responsibility Model, which puts the onus of data security on the user rather than the SaaS provider. Nearly half (49%) of respondents also stated that confusion around that model has resulted in data loss, unauthorized access to data, and even compromised systems.

The answer to any fears about the security of backed-up data is not to ignore backups altogether.

As you explore the landscape of platforms that allow you to back up and restore data from those mission-critical SaaS apps, you should carefully validate these must-haves:

The unique threats to SaaS data are rapidly expanding. Even the tools we think are designed to uncover inefficiencies or automate work we’d rather not do, like third-party AI agents, could be massive data loss incidents in disguise—ones we’ll certainly hear about in the months and years to come.

When you give an AI write access to your SaaS platforms, it might innocently corrupt all your mission-critical data at GPU-accelerated speed. When reports of these situations start popping up en masse, you’ll be glad you tucked your SaaS data away where no one—an attacker or a lost AI—can read it. You’ll be doubly glad it’s also safe and sound when you need it most.

