5 Questions to Ask Before Migrating Your Legal Data to the Cloud

  • Will Pfeifer
  • February 22, 2021

Venio Systems kicked off its monthly #VenioVibes webinar series on Tuesday, Feb. 9 with “5 Questions to Ask Before Migrating Your Legal Data to the Cloud” (learn more about #VenioVibes here). This webinar addressed the security issues that must be considered when deciding how to protect your legal data in the cloud. With increasing amounts of data and vital business functions moving to the cloud, it’s never been more important for people working in eDiscovery to be aware of the security pitfalls and problems they might encounter and the solutions that are available to their organizations.

Last month, Venio Systems launched its Venio Cloud End-To-End eDiscovery SaaS, a new cloud-based platform that offers a self-managed, intuitive, and highly scalable solution to transform the way teams handle eDiscovery (learn more about Venio Cloud here). In the webinar, Venio Systems Chief Technology Officer Arestotle Thapa and VP of Products Ankur Agarwal discussed the security considerations that went into launching Venio Cloud. Issues covered included protecting storage, data transfers, cloud-versus-on-prem, and other topics. The webinar also focused on how the most vital security issue isn’t whether the cloud is safer than on-premises storage; it’s what companies do to train their employees to be secure stewards of the data for which they’re responsible. Finally, webinar attendees learned the 5 important questions to ask when they’re considering moving their data to the cloud.

How do I get my data into and out of the cloud?

When making the decision to move your data into (or out of) cloud-based storage, here are some of the things to consider:

  • Volume of data
  • Number of files
  • Connectivity to the cloud
  • Priority: Do you need all your data moved to the cloud immediately, or would it make more sense to segment your data and move portions at a time, based on specific needs?

There are also different ways to transfer your data. These include:

  • FTP: Works best for relatively small amounts of data, and if users are relatively close to the cloud storage location.
  • Shipping physical data: If the volume of data is too big to transfer via FTP, many cloud providers will ship you servers and allow you to load your data, then ship it back to them for storage and access.
  • Specialized applications: Can help improve the speed of data transfers depending on your location and the location of the storage.
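To decide between a network transfer and shipping physical media, it helps to estimate how long the upload would take. The following is a rough sketch, not a tool discussed in the webinar: the function name is hypothetical, and the 80% link efficiency is an assumed default that you should replace with a measured figure.

```python
def estimate_transfer_hours(volume_gb: float, bandwidth_mbps: float,
                            efficiency: float = 0.8) -> float:
    """Rough hours needed to move volume_gb over a link of bandwidth_mbps.

    efficiency is an assumed fraction of the nominal bandwidth that is
    actually usable (protocol overhead, shared links, and so on).
    """
    if volume_gb < 0 or bandwidth_mbps <= 0 or not 0 < efficiency <= 1:
        raise ValueError("volume must be >= 0, bandwidth > 0, 0 < efficiency <= 1")
    megabits = volume_gb * 8 * 1000  # 1 GB = 8,000 megabits in decimal units
    seconds = megabits / (bandwidth_mbps * efficiency)
    return seconds / 3600


if __name__ == "__main__":
    # Example: 10 TB over a 100 Mbps connection.
    hours = estimate_transfer_hours(10_000, 100)
    print(f"about {hours:.0f} hours ({hours / 24:.1f} days)")
```

If the estimate runs to days or weeks, segmenting the data by priority or shipping physical media becomes the more practical option.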

When building Venio Cloud and running simulations that mimicked users loading data, Venio technicians discovered that the number of files being transferred can be at least as important as how large the files are. For example, transferring a large number of files can take six or seven times longer than the volume alone would suggest, even when the total volume is as little as a single GB. Zipping those same files into a single archive dramatically increased upload speeds.
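As an illustration of bundling many small files into one archive before upload, here is a minimal sketch using Python’s standard `zipfile` module. The function name and paths are hypothetical, and this is not the method Venio Cloud itself uses; it only shows the general idea.

```python
import zipfile
from pathlib import Path


def bundle_directory(source_dir: str, archive_path: str) -> int:
    """Pack every file under source_dir into one zip archive.

    Returns the number of files added. Paths inside the archive are stored
    relative to source_dir so the folder structure survives the round trip.
    """
    source = Path(source_dir)
    archive = Path(archive_path).resolve()
    count = 0
    with zipfile.ZipFile(archive, "w", compression=zipfile.ZIP_DEFLATED) as zf:
        for path in sorted(source.rglob("*")):
            # Skip directories, and skip the archive itself if it lives in source_dir.
            if path.is_file() and path.resolve() != archive:
                zf.write(str(path), arcname=path.relative_to(source).as_posix())
                count += 1
    return count
```

The single archive can then be uploaded as one object and unpacked on the other side, so the transfer pays the per-file overhead once instead of thousands of times.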

Also, it’s advantageous to choose cloud storage located in the same region as the people who will be accessing the data. With more and more users working remotely, having a closer connection to the data stored in the cloud will make your system more responsive and help users be more efficient with less data latency.

How will the cloud solution (and my team/connection) perform?

Once your data is in the cloud, how will your applications perform and how will the data be accessed? Here are some things to consider:

  • Multi-cloud/hybrid vs. centralized deployment: Some clients have specific applications and data behind a firewall while other data is stored in the cloud, and some even have multiple cloud suppliers. 
  • SLAs and performance expectations: Performance expectations can vary depending on when (and how often) reports are generated, how closely (and how many) users are monitoring performance, etc.
  • Location of users: The closer data can be located to the users, the better the performance will be. If a company has users located over a great distance (or even all over the globe), the solution might be to replicate the data on multiple regional servers so they all have close access.
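The location point can be made concrete with a small sketch that picks, for each office, the region with the lowest measured latency. The office names, region names, and latency figures below are made-up placeholders; in practice you would supply your own measurements.

```python
from typing import Dict


def nearest_region(latency_ms: Dict[str, float]) -> str:
    """Return the region name with the lowest measured round-trip latency."""
    if not latency_ms:
        raise ValueError("no latency measurements supplied")
    return min(latency_ms, key=latency_ms.get)


def replication_plan(measurements: Dict[str, Dict[str, float]]) -> Dict[str, str]:
    """Map each office to its lowest-latency region."""
    return {office: nearest_region(latencies)
            for office, latencies in measurements.items()}


if __name__ == "__main__":
    # Hypothetical measurements in milliseconds.
    measured = {
        "Chicago": {"us-east": 25.0, "eu-west": 110.0},
        "London": {"us-east": 85.0, "eu-west": 12.0},
    }
    plan = replication_plan(measured)
    print(plan)
    print("replicate to:", sorted(set(plan.values())))
```

The set of regions that appears in the plan is a reasonable starting list of places to replicate the data.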

Load balancing and replication of data are usually something the cloud provider will take care of. It’s an easy matter for them to ensure the data load is balanced and that the data is replicated as needed. For those who have not deployed and managed virtual environments, this is a great help in filling technical skill gaps.

One of the major concerns in the legal industry regarding cloud-based storage involves backups and recovery: What if I lose access to the cloud? What happens if my data goes down? How will I recover it? How will I restore it? Cloud-based data storage is actually in a better position to deal with these issues than on-premises storage. For an on-premises system, backup has to be done when users are not active because, depending on the volume of data, it can take a long time to back up. That means it can’t be done every day, which means the data is less protected against loss. But in cloud storage, snapshots are taken for backup and replication purposes on a regular basis, and that process does not affect user activity. Also, if an issue arises and recovery becomes necessary, cloud storage allows you to easily go to a specific point in time for data recovery. That is something you can’t do when backing up the system on-premises.
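Point-in-time recovery comes down to finding the most recent snapshot taken at or before the moment you want to return to. The sketch below shows that selection logic only; it assumes you already have a list of snapshot timestamps and does not call any particular cloud provider’s API.

```python
from bisect import bisect_right
from datetime import datetime
from typing import List, Optional


def snapshot_for(snapshots: List[datetime], target: datetime) -> Optional[datetime]:
    """Return the most recent snapshot taken at or before target.

    Returns None when every snapshot is newer than target.
    """
    ordered = sorted(snapshots)
    index = bisect_right(ordered, target)  # count of snapshots <= target
    return ordered[index - 1] if index else None
```

With hourly snapshots, for example, a request to restore to 10:30 would select the 10:00 snapshot, so at most an hour of changes is lost.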

In on-premises storage, backing up data takes time and effort, often as much as a week’s worth of work. In fact, those systems are rarely tested because so much work is involved, which means when a problem arises and recovery is needed, data restoration workflows and systems aren’t ready. But backup recovery in the cloud is very different. Because the data is stored on virtual servers with replication, the data can be recovered and access to data can be restored quickly and easily.

How do I keep cloud data safe?

There are definitely threats to data security that can become more of an issue as data moves from behind an on-premises firewall to the cloud. The two primary issues are:

  • Cryptojacking: In a sense, cryptojacking is more likely with a cloud-based storage solution because unlike disconnected on-premises storage, cloud-based storage must, due to its nature, be connected to something. An on-premises server with users working only within the same network needs no external connection, but the cloud must be linked in order to make the data available to authorized users.

    But security measures can be taken to prevent cryptojacking. The main reason systems remain vulnerable is that users assume the data is completely safe and thus don’t follow basic data security protocols. Because they are accustomed to data being protected by a firewall, they keep those assumptions when data moves to the cloud, which is where cryptojacking becomes a potential threat.
  • Ransomware: What began as a small criminal operation has transformed into a new type of organized crime, with total costs to victims estimated to be as high as $6 trillion, which would make it the biggest transfer of wealth in human history. Some organizations using ransomware have claimed they made as much as $2 billion in less than a year. Those are staggering numbers, and sobering ones if you have data or operations that could be a target.

    On-premises data storage, with backups scheduled relatively irregularly and often in the same datacenter, is actually more prone to ransomware threats. Remote servers dispersed in the cloud for primary and backup storage with replication can make recovery faster and safer, often without having to pay the attackers a dime.  

Overall, however, we found that the data stored in the cloud is often more secure than data in an on-premises server. That’s because cloud storage providers have much greater tools and resources at their disposal, along with the newest technology, to secure data. That doesn’t, however, mean you can automatically assume your cloud provider will take care of all security concerns and you don’t have to take precautions yourself.

The biggest security issues with cloud storage end up being the same human ones that we have on-premises. Users leave backdoors open, fail to be careful with passwords, and, in general, don’t follow obvious security procedures that they would follow otherwise. They’re used to the firewall protection of on-premises servers, so they carry poor security practices into the cloud.

Who has access to my data?

Clearly defining who has access to data, for what purpose, and for how long will go a long way to securing cloud data. There are a number of people and groups who will need access to your data. These include:

  • Employees
  • Clients and/or customers
  • Partners, including outside counsel, witnesses and experts
  • Managers of data
  • Cloud providers
  • Companies hosting data

So how do you provision your data, given that so many people need different levels of access? The right application can offer the flexibility needed to grant access to the data, but careful consideration is required. Who really needs access to the data, and for how long? Two-factor authentication and complex passwords are absolutely necessary to make sure only those who need access are getting it, and you need to ensure that those granted access are not sharing it with anyone else.

In Venio Cloud, steps were taken to ensure that when access had to be shared with a client or outside party, it could be done through secure means via email links, and those links could be turned on or off at any time. What’s more, only the necessary sections of the data would be shared, and those links expire after a set period of time. This was a function of the application, and not something limited to cloud storage. 

If you are storing data in the cloud, you need to be sure that your application can support the necessary security features, including link expiration, the ability to share securely (and share only subsets of data), etc. The system should also have application audit logs, so you know what was done or what data was accessed by whom and when. 
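To make link expiration and revocation concrete, here is a minimal sketch of a signed share token built with Python’s standard `hmac` module. It is an illustration only, not how Venio Cloud implements its links: the function names, the token layout, and the `SECRET_KEY` placeholder are all assumptions.

```python
import hashlib
import hmac
import time
from typing import AbstractSet, Optional

# Hypothetical signing key; a real deployment would load this from a secrets store.
SECRET_KEY = b"replace-me"


def _sign(payload: str, key: bytes) -> str:
    return hmac.new(key, payload.encode("utf-8"), hashlib.sha256).hexdigest()


def make_share_token(document_set: str, expires_at: int,
                     key: bytes = SECRET_KEY) -> str:
    """Create a token granting access to one named subset of the data."""
    payload = f"{document_set}|{expires_at}"
    return f"{payload}|{_sign(payload, key)}"


def check_share_token(token: str, now: Optional[int] = None,
                      revoked: AbstractSet[str] = frozenset(),
                      key: bytes = SECRET_KEY) -> Optional[str]:
    """Return the shared document set if the token is valid, else None."""
    try:
        document_set, expires_text, signature = token.rsplit("|", 2)
        expires_at = int(expires_text)
    except ValueError:
        return None  # malformed token
    expected = _sign(f"{document_set}|{expires_text}", key)
    if not hmac.compare_digest(signature.encode("utf-8"), expected.encode("utf-8")):
        return None  # tampered or signed with another key
    current = int(time.time()) if now is None else now
    if current >= expires_at or token in revoked:
        return None  # expired, or switched off by the owner
    return document_set
```

The token names only one subset of the data, carries its own expiry time, and can be switched off early by adding it to the revoked set, which mirrors the three properties described above.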

Another key element is blocking access to all back-end users. In on-premises applications, back-end access can be a common occurrence. Even though the policy might be to only give access to administrators, people come and people go, and sometimes contractors need access to your machines. Often, companies fail to purge those access rights, leaving their data more vulnerable. Venio Systems solved this problem by logging off every person, only allowing them access for a specified period of time. Only one or two people — both Venio employees — manage data, and everyone else has to prove they have a specific need for access. 

In the end, sharing data has two parts: the application and the process. The application involves how you share the data, including what technology is used, how access is given (and taken away), and what protections are in place.

The process is more philosophical and involves three important questions:

  • Does the user have a genuine need to access the data?
  • What specific part of the data?
  • And for how long should that access be granted?

This also involves keeping track of the data, using audit logs, so there’s a record of who has access to what data. And it means making the decision not to grant access to everyone to ensure the data remains secure.
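The three process questions and the audit trail can be sketched as a small access register: every grant records who, which part of the data, why, and until when, and every check is logged. The class and field names here are hypothetical and meant only to show the shape of the idea.

```python
from dataclasses import dataclass, field
from datetime import datetime, timedelta
from typing import List, Tuple


@dataclass
class AccessGrant:
    user: str
    scope: str            # which specific part of the data
    reason: str           # the stated need for access
    expires_at: datetime  # how long the access lasts


@dataclass
class AccessRegister:
    grants: List[AccessGrant] = field(default_factory=list)
    audit_log: List[Tuple[datetime, str, str, str]] = field(default_factory=list)

    def grant(self, user: str, scope: str, reason: str,
              now: datetime, duration: timedelta) -> AccessGrant:
        if not reason.strip():
            raise ValueError("a grant needs a stated reason")
        new_grant = AccessGrant(user, scope, reason, now + duration)
        self.grants.append(new_grant)
        self.audit_log.append((now, user, scope, "granted"))
        return new_grant

    def is_allowed(self, user: str, scope: str, now: datetime) -> bool:
        allowed = any(g.user == user and g.scope == scope and now < g.expires_at
                      for g in self.grants)
        # Record every check, whether it succeeds or not.
        self.audit_log.append((now, user, scope, "allowed" if allowed else "denied"))
        return allowed
```

Because access is denied unless an unexpired grant exists for that exact user and scope, forgotten accounts lose access on their own instead of waiting for someone to purge them.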

How do I develop a security mindset?

It’s crucial to develop a security mindset from the first day that work begins on an application, whether that application will be deployed on-premises or in the cloud. It’s a requirement at Venio Systems that we held to when developing our own Venio Cloud platform, and it’s one your company should follow as well. Be sure your platform provider follows it, too.

Don’t have a laissez-faire attitude regarding security. Take it seriously. Don’t assume that, because a third-party company is storing your information in the cloud, or because your IT department is taking care of storing that information on servers, that it’s not your responsibility — and your team’s responsibility — to follow secure practices. That means:

  • No default and/or weak passwords: Users should have passwords that meet security standards and change those passwords on a regular basis. And, of course, those passwords should not be shared or reused.
  • Make sure servers are maintained and patches are applied promptly: In many cases, keeping servers secure isn’t a priority. It must be a top priority to prevent cyber attacks and other breaches. Each day, there are millions of attacks on cloud servers, using brute force or other methods to either steal data or steal resources. Not applying patches may leave known vulnerabilities open to attackers. They may use your company’s servers for bitcoin mining, for example, or to send scam emails. Either way, you need to ensure that your servers are protected against such attacks. 
  • Make sure applications and email are secured: Third-party, unauthorized applications running on your servers can conceal ransomware, viruses, or other security threats. Take care to ensure that the apps your employees are using are secured and do not pose threats to your data.
  • Train employees to spot suspicious items: Most attacks start as phishing emails, so you must make sure employees are properly trained in security protocols and know not to open suspicious emails that might conceal ransomware or ask for information or credentials that could be used for social engineering, allowing unauthorized users into your system. They must be trained and tested regularly to ensure compliance.
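The password rule in the list above can be checked automatically. The sketch below is one possible policy check; the minimum length, the 90-day rotation window, and the short list of default passwords are assumptions chosen for illustration, not values given in the webinar.

```python
from datetime import date, timedelta
from typing import List

# Illustrative values only; set these from your own security standard.
DEFAULT_PASSWORDS = {"admin", "password", "changeme", "12345678"}
MIN_LENGTH = 12
MAX_AGE = timedelta(days=90)


def password_problems(password: str, last_changed: date, today: date) -> List[str]:
    """Return a list of policy problems; an empty list means the password passes."""
    problems = []
    if password.lower() in DEFAULT_PASSWORDS:
        problems.append("default password")
    if len(password) < MIN_LENGTH:
        problems.append("too short")
    if today - last_changed > MAX_AGE:
        problems.append("overdue for rotation")
    return problems
```

A check like this catches only the mechanical failures. Sharing and reuse still have to be addressed through training and process.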

Perform a security risk analysis: This analysis, performed regularly by your own internal security team, should cover all potential risk areas, including: 

  • Data: Is it stored safely?
  • Applications: Are they authorized? Are they secured?
  • Servers: Have they been hardened to protect against attacks?
  • Training: Have your employees been trained in security measures? Is that training repeated and/or updated as necessary?

Undergo third-party penetration (pen) testing: Your company’s internal testing is not enough. It can easily miss elements that are not secure enough, or internal biases can skew the results. By bringing in an experienced third-party expert, you can expose potentially dangerous security weaknesses. Such testing is more effective, because it is:

  • Independent
  • Unbiased
  • Based on a greater skillset 
  • Backed by good oversight

Third-party tests are vital because companies tend to overestimate their own security policies and procedures at the same time that they underestimate the threats. If you ask a developer to test his or her own code, chances are they will miss the obvious issues, because they will only check what they already know to be threats. They’ll reconfirm their own ideas while missing the backdoor issues, and they’ll miss the fact that security problems can arise from within the network and not just in the form of external threats. 

Third-party testing is not a one-time thing. It’s a process. Companies must undergo the test, get the results, fix the problems and then test again. And again. The threats are constantly evolving, so the security measures must be evolving as well. When companies slow down testing or stop it altogether, the other side gains a dangerous advantage. 

Security on a home network can also be an issue for a number of reasons: People keep default passwords on routers, making them easy to find. Also, they are not network security experts, meaning it’s incumbent on the company to make sure that they practice safe security. Because so many people are working remotely now, it’s more important than ever for companies to take additional security precautions and provide extra training.

On-Premises vs. Cloud Security

On-Premises                  | Cloud-based
Physical Security            | Secure Site
Local Connections            | Internet Dependent
Behind a Firewall            | Outside of Firewall
Designed for Internal Access | Designed for External Access


But when the cloud arrived, the rules changed. Suddenly, you could be anywhere, and you didn’t have to be physically close to the server holding your data. In many cases, companies using cloud-based storage don’t actually know where the servers are located. There are often multiple servers holding a company’s data, and workflows can move from data center to data center depending on the need. When cloud-based storage became popular, there were many fears initially surrounding it because people didn’t know where the server was located. Until that point, everything had been based on an internal perspective, meaning people felt better if they could actually see the server and felt they could build their own security measures to protect a physical server. Having an on-prem solution gave people a feeling of being in control.

The truth is, whether you’re using an on-prem or cloud-based option, the only things that change are the tools you use and the amount of work you have to do. Your ability to manage some of the elements in the cloud, for example, physical security, is limited. You have no control over the specific security measures that are used. But cloud-based storage is, in all probability, more secure than anything you could provide on your own. The major providers, including Amazon and Google, have been providing secure storage solutions for years, and they are experts in the field who have all the necessary security measures in place. They have no record of unauthorized access, which means fears involving cloud security are unfounded as long as the storage is by a known provider. They have the best security tools available, and when those tools are not enough, they design their own. What’s more, they’ve developed ways to compress and store your data so not only is it more secure, it’s available globally and with low latency.

That means you shouldn’t be worried about cloud security. Instead, what you should be worried about are mistakes by your own internal teams, who is getting access, and when they’re getting it. This goes back to the security mindset your company must have to prevent internal issues from allowing access by external threats.

One advantage a cloud-based solution has over an on-prem one is in the area of connectivity. Even though the data is theoretically closer and more directly linked when it’s stored on-prem, something as simple as a clipped wire can prevent your teams from being able to access it until a repair is made. Cloud-based storage providers, on the other hand, can react faster and reroute the necessary data connections to another data center, giving you almost immediate access to your data, even in the event of a connectivity issue. In fact, many of the legal service providers that Venio Systems serves and who buy an on-prem version of our software actually deploy it on the cloud. That’s because a cloud-based solution tends to be less expensive, faster, more convenient, more reliable and, most importantly, more secure than an on-prem solution.

Summary: Five Questions to Ask

  1. How do I get my data into and out of the cloud?
  2. How will the cloud solution (and my team/connection) perform?
  3. How do I keep cloud data safe?
  4. Who has access to my data?
  5. How do I develop a security mindset?