PII (Personally Identifiable Information): what is it?

You have probably come across the acronym PII (Personally Identifiable Information) and wondered what it is and how to deal with it in Google Analytics.

PII is any information that can be used to identify a user.

This topic is closely linked to privacy and, especially in the EU, to the GDPR (General Data Protection Regulation).

In this post I’ll show you how to check whether your Google Analytics account is collecting and storing personal information (PII).

Content:
- What information is covered by PII?
- How do I know if I collect PII?
   - PII Collection: Event Category/Action/Label
   - PII Collection: Custom Dimensions
   - PII Collection: Pages
   - PII Collection: Search Terms
   - PII Collection: Data Import
- What to do if I collect PII?
- Conclusions

What Information Is Covered by PII?

Here is a list of examples of data that is considered PII and must not be collected through Google Analytics or Google Tag Manager.

  • First and last name
  • Email address
  • Credit card numbers
  • Telephone number
  • Personal Information from the login page
  • Exact GPS coordinates
  • IP addresses

Collecting this information is strictly prohibited by Google Analytics and, in case of violation, can lead to the permanent deletion of your account.

How Do I Know if I Collect PII?

At this point you may be asking: how can I tell whether my Google Analytics account is collecting personal information from my users?

There are several ways to find out. Below I’ll show you some concrete checks that I run during my Google Analytics audits.

PII Collection: Event Category/Action/Label

I suggest you check the Category/Action/Label hierarchy in the Top Events report and see whether any of the events you have set up collect PII.

Click into each event and check whether it contains email addresses, telephone numbers or any other personal data.
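If you have many events, clicking through each one gets tedious. Here is a minimal sketch of the same check done on exported data, assuming you have exported the Top Events report into rows with `category`, `action` and `label` fields. The sample rows, the function name and the phone pattern are my own illustrative assumptions; the phone pattern in particular is a loose heuristic and will produce false positives (for example on dates or order numbers).

```python
import re

# Email pattern in the spirit of the one used later in this post.
EMAIL_RE = re.compile(r"[a-zA-Z0-9_.-]+@[\da-zA-Z.-]+\.[a-zA-Z.]{2,6}")
# Loose heuristic: optional "+", then at least 9 digits/separators in a row.
PHONE_RE = re.compile(r"\+?\d[\d\s().-]{7,}\d")

def audit_events(rows):
    """Return (field, value, kind) for every event field that looks like PII."""
    findings = []
    for row in rows:
        for field in ("category", "action", "label"):
            value = row.get(field, "")
            if EMAIL_RE.search(value):
                findings.append((field, value, "email"))
            elif PHONE_RE.search(value):
                findings.append((field, value, "phone"))
    return findings

# Hypothetical export of the Top Events report.
event_rows = [
    {"category": "Form", "action": "Submit", "label": "jane.doe@example.com"},
    {"category": "Contact", "action": "Click to call", "label": "+39 333 123 4567"},
    {"category": "Video", "action": "Play", "label": "Intro"},
]
event_findings = audit_events(event_rows)
for field, value, kind in event_findings:
    print(f"{kind} found in event {field}: {value}")
```

Anything this flags still needs a manual look, but it narrows down which events to open in the report.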

PII Collection: Custom Dimensions

You should check that the custom dimensions you created in the account do not collect PII.

Go to Admin > Property and review the custom dimensions defined there.

You can then create a Custom Report with your custom dimensions to see which values are actually being collected. If these values contain personal information, you have to fix the issue as soon as possible.

PII Collection: Pages

Continue your checklist by opening the All Pages report and checking whether any page URL contains PII.

PII can be hidden in query parameters. A quick way to check that information such as email addresses is not ending up in your Google Analytics account is to type the @ symbol into the report filter.

If the filter returns no results, no page URLs containing the @ symbol were found. That’s good!

However, it is possible that you will find some PII. A more precise way to look for email addresses is to use the following RegEx (regular expression), as suggested by CardinalPath:

([a-zA-Z0-9_\.-]+)@([\da-zA-Z\.-]+)\.([a-zA-Z\.]{2,6})
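To see how this expression behaves outside the Google Analytics interface, here is a small sketch that applies it to a list of page paths. It assumes the paths were exported from the All Pages report; the sample paths and the function name are made up for illustration. One thing I added on top of the article's check: the @ symbol may appear URL-encoded as %40, so the sketch decodes each path before matching.

```python
import re
from urllib.parse import unquote

# The regular expression quoted above, unchanged.
EMAIL_RE = re.compile(r"([a-zA-Z0-9_\.-]+)@([\da-zA-Z\.-]+)\.([a-zA-Z\.]{2,6})")

def find_emails_in_paths(paths):
    """Return (path, matched_email) pairs for paths containing an email-like string."""
    hits = []
    for path in paths:
        decoded = unquote(path)  # turns %40 back into @ before matching
        match = EMAIL_RE.search(decoded)
        if match:
            hits.append((path, match.group(0)))
    return hits

# Hypothetical page paths, as they might appear in an export.
page_paths = [
    "/thank-you?email=jane.doe@example.com",
    "/newsletter?user=john%40example.org",
    "/products?category=shoes",
]
hits = find_emails_in_paths(page_paths)
for path, email in hits:
    print(path, "->", email)
```

The second path would be missed by a plain search for @, which is why decoding first is worth the extra line.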

PII Collection: Search Terms

You should also check the Search Terms report. It lists the terms your users type most often into your website’s internal search engine.

Because users can type anything into a search box, this report may contain personal information.
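Credit card numbers are on the PII list above, and a search box is one of the places where users occasionally paste them. As a rough sketch, assuming the search terms have been exported to a plain list, you could flag any term that contains a 13 to 19 digit sequence passing the Luhn checksum used by payment cards. The sample terms and function names are my own assumptions, not part of any Google tool.

```python
import re

# 13 to 19 digits, each optionally followed by a single space or dash.
CARD_CANDIDATE_RE = re.compile(r"(?:\d[ -]?){13,19}")

def luhn_valid(candidate):
    """Return True if the digits in `candidate` pass the Luhn checksum."""
    digits = [int(ch) for ch in candidate if ch.isdigit()]
    total = 0
    # Walk from the rightmost digit, doubling every second one.
    for index, digit in enumerate(reversed(digits)):
        if index % 2 == 1:
            digit *= 2
            if digit > 9:
                digit -= 9
        total += digit
    return bool(digits) and total % 10 == 0

def flag_card_like_terms(terms):
    """Return the search terms that contain a Luhn-valid card-like number."""
    flagged = []
    for term in terms:
        candidates = CARD_CANDIDATE_RE.finditer(term)
        if any(luhn_valid(match.group(0)) for match in candidates):
            flagged.append(term)
    return flagged

# Hypothetical search terms; 4111 1111 1111 1111 is a well-known test number.
search_terms = [
    "red running shoes",
    "4111 1111 1111 1111",
    "order 1234 5678 9012 3456",
]
flagged_terms = flag_card_like_terms(search_terms)
print(flagged_terms)
```

The checksum step keeps long order numbers and similar digit runs from being flagged just because of their length.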

PII Collection: Data Import

Google Analytics lets you import data sets. It’s therefore important to check what kind of data you are about to import, so that no PII gets uploaded.

So remember not to skip this check!
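One way to make this check repeatable is to scan the file before uploading it. The sketch below assumes the import file is a CSV with a header row; the list of suspicious header names, the sample data and the function name are illustrative assumptions that you would adapt to your own files.

```python
import csv
import io
import re

EMAIL_RE = re.compile(r"[a-zA-Z0-9_.-]+@[\da-zA-Z.-]+\.[a-zA-Z.]{2,6}")
# Hypothetical header names that suggest a column holds PII.
SUSPICIOUS_HEADERS = {
    "email", "e-mail", "name", "first_name", "last_name",
    "phone", "telephone", "ip", "ip_address",
}

def audit_import_file(csv_text):
    """Return the sorted names of columns that look like they contain PII."""
    reader = csv.DictReader(io.StringIO(csv_text))
    flagged = set()
    # First pass: column names that hint at personal data.
    for header in reader.fieldnames or []:
        if header.strip().lower() in SUSPICIOUS_HEADERS:
            flagged.add(header)
    # Second pass: values that look like email addresses.
    for row in reader:
        for column, value in row.items():
            if isinstance(value, str) and EMAIL_RE.search(value):
                flagged.add(column)
    return sorted(flagged)

# Hypothetical import file: the "contact" column hides email addresses.
sample_csv = (
    "user_id,contact,phone\n"
    "1001,jane.doe@example.com,555\n"
    "1002,,\n"
)
flagged_columns = audit_import_file(sample_csv)
print(flagged_columns)
```

Checking both the header names and the values matters here, because a harmless-looking column name can still carry personal data.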

What to Do if I Collect PII?

If you notice that you’re collecting personally identifiable information, what should you do? My advice is to set up a meeting with your IT Department to find the best way to stop collecting it.

For some PII, such as the IP address, Google Tag Manager can come to your rescue, especially if you’re using the Universal Analytics version of GA (Google Analytics 4 anonymizes IP addresses automatically).
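To make the idea of IP anonymization concrete, here is a small sketch of what masking an address looks like. It is only an illustration of the concept, not Google's implementation: as far as I understand the Universal Analytics documentation, the anonymization feature sets the last octet of an IPv4 address and the last 80 bits of an IPv6 address to zero, and the function below (a name of my own) reproduces that rule with Python's standard `ipaddress` module.

```python
import ipaddress

def anonymize_ip(address):
    """Zero the last octet (IPv4) or the last 80 bits (IPv6) of an address."""
    ip = ipaddress.ip_address(address)
    # Keep 24 of 32 bits for IPv4, 48 of 128 bits for IPv6.
    prefix = 24 if ip.version == 4 else 48
    network = ipaddress.ip_network(f"{ip}/{prefix}", strict=False)
    return str(network.network_address)

# Addresses from the documentation-only ranges, used as sample input.
print(anonymize_ip("203.0.113.57"))
print(anonymize_ip("2001:db8:85a3:8d3:1319:8a2e:370:7348"))
```

After masking, the address still indicates a rough network location but no longer points to a single connection.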

In general, it’s good practice to coordinate closely with developers to find the most robust solution!

Knowing which parts of the website are collecting certain information is a great starting point to be more effective and find optimal solutions.
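As an example of the kind of fix you might discuss with developers, here is a sketch of redacting email addresses from a URL before it is sent to analytics. It is written in Python only to show the logic; in a real setup the equivalent would run on the website or in the tag manager. The function name, the placeholder value and the sample URLs are all assumptions for illustration.

```python
import re
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

EMAIL_RE = re.compile(r"[a-zA-Z0-9_.-]+@[\da-zA-Z.-]+\.[a-zA-Z.]{2,6}")

def redact_url(url, placeholder="REDACTED"):
    """Replace any query parameter value that contains an email address."""
    parts = urlsplit(url)
    cleaned = []
    # parse_qsl decodes values, so %40 is already an @ when we match.
    for key, value in parse_qsl(parts.query, keep_blank_values=True):
        if EMAIL_RE.search(value):
            value = placeholder
        cleaned.append((key, value))
    return urlunsplit(parts._replace(query=urlencode(cleaned)))

# Hypothetical page URLs.
redacted = redact_url("https://example.com/thank-you?email=jane%40example.com&step=2")
untouched = redact_url("https://example.com/products")
print(redacted)
print(untouched)
```

Redacting the value while keeping the parameter name means your reports still show that the parameter was present, without storing the address itself.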

Conclusions

Privacy is a very important issue, not only in theory but also in practice. As mentioned, if you do not respect Google’s terms, you risk account suspension and other legal problems.

For this reason, my final tips are the following:

  • Coordinate with your Legal Department to understand which data you can collect and which you cannot. At the very least, make them aware of what can and cannot be done in Google Analytics (don’t take it for granted!);
  • Periodically audit your Google Analytics account. Remember: an audit is not just about PII and needs to be more structured. Personal information is an important area, but there are other points to cover as well;
  • Involve the IT Department. The audit helps you pinpoint the critical areas in more detail; by involving IT, you can find better solutions and understand whether Google Tag Manager is enough to correct the collection of some data, or whether you need a more robust solution.
