For many organizations, Data Loss Prevention (DLP) is at once one of the most important components of their security framework and the biggest headache for administrators. Why? Because most risks to data security actually come from within an organization, which means security teams have to classify and monitor data across hundreds – even thousands – of different entry and exit points of a corporate network.
This includes user devices like laptops and mobile devices, email clients, servers, and gateways within the network.
While “DLP” applies to more than email, email has become one of the most important vectors to safeguard.
Why is email the number one threat vector for data loss?
Employees spend 40% of their digital time on email, sending memos, spreadsheets, invoices, and other sensitive information and data (structured and unstructured alike). Combine this with the fact that the underlying technology behind email hasn’t evolved since its inception, and with how easy email is to access (accounts today are available on laptops, smartphones, tablets, smartwatches and even cars), and it’s easy to see why 90% of data breaches start on email.
A major US health insurance provider had to pay out $115 million in a class-action lawsuit after an employee stole the data of over 18,000 members over the course of nine months. How? Via email. The data exfiltrated included the members’ ID numbers, names, social security numbers, and other personal information.
Of course, not all incidents of data loss make headlines. According to Tessian data, over 700 misdirected emails are sent in organizations with 1,000 people every year.
This goes to show that businesses must be vigilant in assessing risk around both data loss and data exfiltration and, in doing so, must implement security measures that decrease their likelihood of suffering a breach.
Unfortunately, that’s easier said than done.
Data sent through email is hard to regulate
As security leaders know, preventing data loss requires not only advanced security tools but also buy-in from the entire organization. Here are three reasons why data sent through email is hard to regulate:
- Billions of emails are sent and received every day. According to research, the figure for business email alone is over 124 billion. That means it’s virtually impossible for IT teams – often resource-constrained themselves – to monitor all of those emails for incidents that could (or do) result in data loss.
- Organizations hold a lot of data. Whether it’s employees’ social security numbers, insurance policies for clients, or bank account details for suppliers, organizations across industries deal with more data than most of us can imagine. What’s more, it’s stored in various ways, from spreadsheets to project proposals. Limiting access to this data is one solution, but IT teams run the risk of limiting employee productivity in doing so.
- People make mistakes and break the rules. Human error is the number one cause of breaches under GDPR. Whether it’s an employee sending an email to the wrong person or a disgruntled employee intentionally exfiltrating data, there are numerous ways in which sensitive data can fall into the wrong hands. Unfortunately, to err is human and even training can’t eliminate this risk entirely.
Data vs. human behavior
When you consider the objective of DLP, you realize there are two distinct approaches to take.
- Data-centric approach: Rule-based solutions use the content of an email to perform analysis. These rules consider keywords, attachments, seniority level, and even the role or department of an employee to identify sensitive information and keep it within the organization.
- Human-centric approach: Instead of focusing only on the data, human-centric approaches like those offered by Tessian seek to understand complex and ever-evolving human relationships in order to protect sensitive information.
While both approaches have their merits, there are some clear shortcomings to a data-centric approach.
Why current DLP solutions are failing
There are several different approaches organizations can take in preventing data loss. But, given the fact that security breaches have increased by 67% in the last five years, it’s worth noting the drawbacks of each solution.
Blocking accounts/domains: In this approach, the company blocks certain domains (typically free mail domains like @gmail.com or @yahoo.com). Why? Addresses on these domains almost always belong to people outside of the organization and, oftentimes, are actually the personal email accounts of employees themselves.
Drawbacks: There are legitimate reasons to send and receive emails from people or organizations outside of your company’s network and with “freemail” domains. Employees might need to communicate with a client or manage freelancers. They may also simply be trying to send documents “home” to work after hours or over the weekend. Unfortunately, it’s not difficult for employees to find workarounds, regardless of their intentions.
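To make the trade-off concrete, here is a minimal sketch of a domain block in Python. The `FREEMAIL_DOMAINS` set, the `is_blocked` function, and the example addresses are all hypothetical and only illustrate the idea; they are not any product's actual rule format.

```python
# Hypothetical blocklist of freemail domains (an assumption for illustration).
FREEMAIL_DOMAINS = {"gmail.com", "yahoo.com"}


def is_blocked(recipient: str) -> bool:
    """Return True if the recipient's domain is on the blocklist."""
    # Take everything after the last "@" and normalize case and whitespace.
    domain = recipient.rsplit("@", 1)[-1].strip().lower()
    return domain in FREEMAIL_DOMAINS


# An employee's personal account is blocked, as intended.
print(is_blocked("jane.doe@gmail.com"))
# A legitimate freelancer on a freemail domain is blocked too.
print(is_blocked("freelancer@yahoo.com"))
# A personal address on a custom domain slips through.
print(is_blocked("jane@my-own-domain.example"))
```

The rule cannot tell the freelancer from the personal account, and it says nothing about addresses on domains it has never heard of, which is where the workarounds come from.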
Blacklisting email addresses: Security teams can create a list of non-authorized email addresses and simply block all emails sent to or received from them.
Drawbacks: Because blacklisting requires constant updating, it’s very time- and resource-intensive. Beyond that, though, this is a very reactive measure. Email addresses will only be added to a blacklist after they’ve been known to be associated with unauthorized communications, which means data exfiltration attempts may be successful before IT and security teams are able to take steps towards remediation.
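The reactive nature of a blacklist can be sketched in a few lines of Python. The `AddressBlacklist` class and the address below are hypothetical, chosen only to show the ordering problem: the first email to an unknown address is always allowed.

```python
class AddressBlacklist:
    """A hand-maintained list of blocked addresses (illustrative only)."""

    def __init__(self) -> None:
        self._blocked: set[str] = set()

    def add(self, address: str) -> None:
        # Someone has to notice the address and add it manually.
        self._blocked.add(address.strip().lower())

    def allows(self, address: str) -> bool:
        return address.strip().lower() not in self._blocked


blacklist = AddressBlacklist()
suspicious = "drop-box@unknown.example"  # hypothetical address

# The first send happens before anyone knows the address is a problem.
first_send_allowed = blacklist.allows(suspicious)

# Only after an incident is the address added, so later sends are stopped.
blacklist.add(suspicious)
second_send_allowed = blacklist.allows(suspicious)

print(first_send_allowed, second_send_allowed)
```

By the time the second check fails, the data from the first email has already left the network.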
Focusing on keywords: This method uses words and phrases to alert administrators to suspicious email activity. For example, IT and security teams can create rules to identify keywords like “social security numbers” or “bank account details”, which then signal that an email should be quarantined or blocked before it is sent.
Drawbacks: The person trying to exfiltrate data – like social security numbers or bank account details – can circumvent keyword tracking tools by sending the email and the attached data in an encrypted form.
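A keyword rule and its blind spot can be sketched as follows. The patterns and the `should_quarantine` function are hypothetical. To keep the example within the Python standard library, Base64 encoding stands in for encryption here; it is not encryption, but it defeats the rule in the same way, because the rule only ever sees the transformed text.

```python
import base64
import re

# Hypothetical keyword rules, including a US SSN-like number format.
SENSITIVE_PATTERNS = [
    re.compile(r"social security number", re.IGNORECASE),
    re.compile(r"bank account details", re.IGNORECASE),
    re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
]


def should_quarantine(body: str) -> bool:
    """Return True if any keyword rule matches the email body."""
    return any(pattern.search(body) for pattern in SENSITIVE_PATTERNS)


plain = "Attached: social security number 123-45-6789"

# Stand-in for encryption (an assumption): the same content, transformed
# so that none of the keywords appear in the text the rule inspects.
encoded = base64.b64encode(plain.encode("utf-8")).decode("ascii")

print(should_quarantine(plain))    # the rule matches the plain text
print(should_quarantine(encoded))  # the rule sees no keywords at all
```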
Tagging data: After classifying data, an organization may attempt to tag sensitive data, allowing administrators to track it as it moves within and outside of a network.
Drawbacks: Again, this system is time- and resource-intensive and relies on employees accurately identifying and tagging all sensitive data. Data could be misclassified or simply overlooked, allowing it to move freely within and out of a network. Additionally, employees often get fatigued with enforced tagging, which could lead them to tag everything as sensitive by default.
You can find more information about email tagging in this guide.
The challenge with all of the above is that they are based on rules. But human behavior can’t be predicted or controlled by rules.
That means that the more effective solution is one that’s adaptable and can discern the variations in human behavior over time. A solution like this relies on machine-intelligent software that learns from historical email data to determine what is and isn’t anomalous in real-time.
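As a rough illustration of what "learning from historical email data" can mean, the sketch below builds a baseline of who each sender has emailed before and flags sends that fall outside it. This is a simple frequency count, not machine learning and not any vendor's method; the `RelationshipBaseline` class, the `min_seen` threshold, and the addresses are all assumptions made for the example.

```python
from collections import Counter


def domain_of(address: str) -> str:
    """Return the lowercased domain part of an email address."""
    return address.rsplit("@", 1)[-1].lower()


class RelationshipBaseline:
    """Counts past (sender, recipient domain) pairs (illustrative only)."""

    def __init__(self, history):
        # history: iterable of (sender, recipient) pairs from past email.
        self._pair_counts = Counter(
            (sender.lower(), domain_of(recipient))
            for sender, recipient in history
        )

    def observe(self, sender: str, recipient: str) -> None:
        # Update the baseline as new legitimate email is sent.
        self._pair_counts[(sender.lower(), domain_of(recipient))] += 1

    def is_anomalous(self, sender: str, recipient: str, min_seen: int = 1) -> bool:
        # Flag a send if this sender has rarely or never emailed the domain.
        seen = self._pair_counts[(sender.lower(), domain_of(recipient))]
        return seen < min_seen


history = [
    ("alice@corp.example", "buyer@supplier.example"),
    ("alice@corp.example", "accounts@supplier.example"),
    ("bob@corp.example", "legal@lawfirm.example"),
]
baseline = RelationshipBaseline(history)

# A new contact at a supplier Alice already works with is not flagged.
print(baseline.is_anomalous("alice@corp.example", "new.contact@supplier.example"))
# A domain Alice has never emailed before is flagged.
print(baseline.is_anomalous("alice@corp.example", "alice.personal@freemail.example"))
```

Unlike a static rule, the baseline changes as relationships change: calling `observe` after a legitimate first contact means later emails to that domain are no longer flagged.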
What’s the best solution?
Tessian uses contextual machine learning to prevent data exfiltration. Our machine learning models look at evolving patterns in data and constantly reclassify email addresses based on changing relationships between employees and third parties like vendors and suppliers.
This way, Tessian can determine whether a communication is legitimate information sharing or exfiltration.
To learn more about data exfiltration and how Tessian is helping organizations like Arm keep data safe, talk to one of our experts today.