A Human-centric Approach to Operational Awareness and Risk Management.

Understanding Pseudonymization and Anonymization

Maintaining an engaged workforce is a requirement in a competitive market. An engaged workforce is more productive. A Gallup study found that companies with highly engaged teams were 21% more profitable, had 41% lower absenteeism, and 59% less turnover.

An important factor in building engagement is building trust. This presents a challenge for organizations defending against insider risk: insider risk solutions must observe how employees interact with data, devices, and applications in order to correlate events that indicate risk. An invasive employee surveillance approach can backfire. One study found that “close performance monitoring (via cameras, data entry, chat, and phone recording) had significant negative effects on workplace attitudes such as job satisfaction and affective commitment.”

In addition, organizations must be careful with how they manage employee data. The General Data Protection Regulation (GDPR), California Consumer Privacy Act (CCPA), and similar regulatory standards have strict rules regarding how organizations must protect data that can be linked to a person’s identity.

Strategies for Protecting Data

To defend against insider threats, it is important to understand the differences between data anonymization and data pseudonymization.


Data is anonymized by removing or encrypting all identifiers that link the data to an individual. This preserves the individual’s privacy while still allowing the data to be used for other purposes. For example, anonymization may simply remove names from the data fields.

Researchers commonly use anonymization to strip personal details from data before processing it for statistical purposes. For example, a healthcare organization may wish to share information about groups of patients with a pharmaceutical company. The pharmaceutical company needs specific data on the individual patients, including test results or diagnoses, age, weight, gender, and other health factors. It does not require the patients’ names, social security numbers, phone numbers, email addresses, or other data that could link a patient to a record. Anonymization strips personal information from the data required for research.
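The healthcare example above can be sketched in a few lines of Python. The field names and the list of direct identifiers are hypothetical, chosen only to illustrate the idea of dropping identifying fields before data is shared:

```python
# Illustrative sketch: anonymize a patient record by removing direct
# identifiers before sharing it for research. Field names are hypothetical.

DIRECT_IDENTIFIERS = {"name", "ssn", "phone", "email"}

def anonymize(record: dict) -> dict:
    """Return a copy of the record with all direct identifiers removed."""
    return {k: v for k, v in record.items() if k not in DIRECT_IDENTIFIERS}

patient = {
    "name": "Jane Doe",
    "ssn": "123-45-6789",
    "age": 54,
    "diagnosis": "hypertension",
}

shared = anonymize(patient)
# shared retains only the research-relevant fields: age and diagnosis
```

Once the identifying fields are gone, nothing in the shared record can be traced back to the patient, which is exactly why this approach falls short for insider risk, as discussed next.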

While useful in the example above, anonymization is not a good solution for insider risk protection. Behavior analytics can help by baselining a group of people with similar roles, but when anomalies occur, it helps to know precisely which user is performing those actions. Simply knowing it was one of a group of hundreds in a similar role won’t prevent the breach. Similarly, blocking all users in that group unnecessarily disrupts legitimate work.


The GDPR encourages the use of pseudonymization and refers to it as “a central feature of ‘data protection by design’”. The GDPR text defines pseudonymization as:

“…the processing of personal data in such a manner that the personal data can no longer be attributed to a specific data subject without the use of additional information…”

Pseudonymization replaces private identifiers with pseudonyms or artificial identifiers. For example, a user’s name could be replaced by a unique random string. This protects the user’s identity while the underlying data remains suitable for data analysis and data processing.

Unlike anonymization, pseudonymization is reversible when necessary. The identifiers are unique and can be looked up in a separate database to identify the associated individual. This is the “additional information” referenced in the definition above. It is critical that organizations maintain these databases separately and protect them from unauthorized access.
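A minimal sketch of this pattern follows. It is an illustration of the general technique (not any particular vendor’s implementation): identifiers are swapped for random tokens, the token-to-identity mapping is kept in a separate store, and re-identification is a deliberate lookup against that store:

```python
import secrets

class Pseudonymizer:
    """Replace identifiers with random tokens. The token-to-identity
    mapping is the 'additional information' and, in practice, must be
    stored separately and access-controlled."""

    def __init__(self):
        self._token_to_id = {}  # pseudonym -> real identifier (keep separate!)
        self._id_to_token = {}  # real identifier -> pseudonym (for consistency)

    def pseudonymize(self, identifier: str) -> str:
        """Return a stable random pseudonym for the identifier."""
        if identifier not in self._id_to_token:
            token = secrets.token_hex(8)
            self._id_to_token[identifier] = token
            self._token_to_id[token] = identifier
        return self._id_to_token[identifier]

    def reidentify(self, pseudonym: str) -> str:
        """Reverse the pseudonym; only authorized staff should reach this."""
        return self._token_to_id[pseudonym]
```

Because the same identifier always maps to the same token, analytics can still correlate a single user’s activity over time without ever seeing who that user is; only a lookup against the separately held mapping reveals the identity.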

Pseudonymization is useful in defending against insider risk. It allows tracking of individual activity, while the removal of personal identifiers reduces the risk of inherent bias and protects the privacy of workers. The ability to reverse pseudonymization – when warranted – allows risk teams to identify and stop individuals who are putting data at risk.

Data Minimization

In the Prepare for the Future of Remote Work report, Gartner recommends that organizations “focus more on outcomes and less on activity.” This is where data minimization becomes a useful strategy. Simply put, an organization cannot expose information it does not possess. From an insider threat perspective, the traditional approach of capturing “everything” – including video streams, image capture, and keystrokes – increases the risk of exposing sensitive information while adding little security value. It also erodes trust between the parties and can hamper employee engagement.

How DTEX uses pseudonymization to protect employee privacy and sensitive data

DTEX takes a privacy-first approach to security. The DTEX InTERCEPT platform collects the minimum amount of data needed to build a forensic audit trail. It significantly reduces the amount of data an organization needs to collect, eliminating the collection of intrusive data sources that are unnecessary for improving security. This enables organizations to identify high-risk events without infringing on the privacy of individuals, while complying with GDPR and CCPA.

DTEX collects application and user metadata, and uses patented pseudonymization techniques on raw data fields, including username, email, IP address, domain name, and device name. When observed indicators of malicious intent warrant unmasking personal identities – with a clear, evidentiary-quality audit trail – the platform requires “dual authorization”: two DTEX administrators (typically one from security and one from HR or legal) must agree to the unmasking and provide justification.
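The dual-authorization concept described above can be sketched as a simple policy check. This is a hypothetical illustration of the idea (not DTEX’s actual implementation): identities are revealed only when two distinct administrators have each approved with a justification:

```python
# Illustrative sketch of a dual-authorization check for unmasking.
# The data shapes here are hypothetical, chosen only for the example.

def unmask(pseudonym: str, approvals: list, mapping: dict) -> str:
    """Reveal the identity behind a pseudonym only if at least two
    distinct administrators approved, each with a justification."""
    approvers = {a["admin"] for a in approvals if a.get("justification")}
    if len(approvers) < 2:
        raise PermissionError("dual authorization required to unmask")
    return mapping[pseudonym]

mapping = {"ab12cd34": "jdoe@example.com"}
approvals = [
    {"admin": "security_lead", "justification": "exfiltration alert"},
    {"admin": "hr_partner", "justification": "policy review"},
]
identity = unmask("ab12cd34", approvals, mapping)
```

A single approval, or approvals lacking a justification, would raise an error instead of revealing the identity, which is what makes the reversal auditable and deliberate rather than routine.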

Rajan Koo, Chief Customer Success Officer for DTEX, wrote more about our patented approach to pseudonymization in a recent blog post.

To learn more about InTERCEPT or to see the platform in action, contact us to book a demo.