What is data onboarding?
When I hear the term “onboarding,” the first thing I think about is someone joining a new company and learning about its culture and offering. I guess data onboarding isn’t so different. Data onboarding means taking offline or CRM data that’s tied to home addresses or hashed emails and connecting it with online identifiers such as mobile ad IDs (MAIDs) and cookies. It strengthens and activates customer databases. And for some companies, depending on the format of their hashed email addresses, data onboarding is the one thing standing between them and a catastrophic loss of data viability and investment! If they’re using an older hash format, they should definitely have a look at one of the data onboarding offerings out there.
First things first: what is an email hash?
We’re hardly into this blog post and I’ve already used a piece of jargon: email hashing. And I’ll probably mention the acronym HEMs at some point, which stands for hashed emails. So what is an email hash, or “HEM”? LiveIntent offers a helpful description: an email hash is a code created by running an email address through a hashing algorithm. Whenever you log into a website using your email address, you are recognized throughout your session. Hashes are a standard across the web. This code (i.e., the hash) is designed so that it cannot be reversed, which lets it serve as a pseudonymous customer identifier.
Here are some types of hashing algorithms:
- MD5 (message-digest algorithm 5): The output is always 32 hexadecimal characters long. In recent years, MD5 has lost popularity to the SHA family of hashing functions.
- SHA-1 (secure hash algorithm 1): Designed by the United States National Security Agency and published as a Federal Information Processing Standard. The output is a hexadecimal number 40 characters long.
- SHA-2 (secure hash algorithm 2): The successor to SHA-1, the SHA-2 family consists of multiple hash functions with hash values that vary in size, the most common being SHA-256, which produces an output of 64 hexadecimal characters.
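To make the three formats above a bit more concrete, here is a minimal Python sketch that hashes one made-up email address with each algorithm using the standard `hashlib` module. The helper names (`normalize_email`, `hash_email`) are hypothetical, and the normalization step (trim whitespace, then lowercase) is an assumption about a common convention, not a rule stated in this post. Check what your onboarding partner expects before hashing anything.

```python
import hashlib


def normalize_email(email: str) -> str:
    """Trim whitespace and lowercase the address (assumed convention)."""
    return email.strip().lower()


def hash_email(email: str, algorithm: str = "sha256") -> str:
    """Return the hexadecimal digest of the normalized email address."""
    normalized = normalize_email(email)
    return hashlib.new(algorithm, normalized.encode("utf-8")).hexdigest()


if __name__ == "__main__":
    email = "  Jane.Doe@Example.com "  # made-up address for illustration
    for algorithm in ("md5", "sha1", "sha256"):
        digest = hash_email(email, algorithm)
        # Digest lengths are 32, 40 and 64 hex characters respectively.
        print(f"{algorithm:>6}: {len(digest)} chars  {digest}")
```

Normalizing first matters because the same address typed with different capitalization or stray spaces would otherwise produce completely different hashes, and the two records would never match.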
The benefits of email hashing start with the fact that the email address is a relatively stable ID that represents a known customer. It is often unique to an individual and has a degree of persistence across devices, apps, and browsers. Consumer email addresses are transformed into pseudonymous identifiers that do not directly expose personally identifiable information, making the email hash a valuable people-based identifier.
Let’s spin the clock back 25 years. You run a company in the mid-nineties. How do you gather information about your customers? Most likely, you use simple offline databases, perhaps based on MS Excel. And when it comes to identifying your customers, you have to operate with offline identifiers, such as:
- Email address
- Physical address
- Phone number
These identifiers are, of course, still in play today. And in the cookieless world, their importance has grown significantly. With data onboarding, you can take that offline data online and make the most of it in the digital environment. In order to make this “merging” possible, there has to be a match between the specific user’s profile and their online activity. Data onboarding provides such a match and enables you to see your customers online.
Let’s go back to our 1990s company example. Imagine you wanted to send a new catalog to your customer base in the 1990s. It was pretty straightforward back in the day. You just had to open your Excel database and copy and print all of the physical addresses. In today’s digital world, this matter is a bit more complex. Until recently, third-party cookies and Apple’s IDFA have played the role of the primary online identifiers. And shortly, companies all over the world will have to redesign their addressability solutions, since IDFAs are disappearing and third-party cookies won’t be available anymore. Where does that leave us?
There are new identifiers that can be used for a similar purpose:
- Unified ID 2.0
- Zeotap ID+
- Lotame Panorama
- Various publisher IDs (mobile apps, connected TV IDs, websites, etc.)
One of the main use cases of data onboarding is connecting CRM or offline data with one of the above new identifiers (and third-party cookies, at least until some time in 2022). At the conclusion of this process, a company’s CRM data has been activated.
How does data onboarding work?
Generally speaking, the data onboarding process is based on three steps:
- Uploading data (your company’s first-party data is anonymized and uploaded to your onboarding partner’s system)
- Matching data (the data you uploaded is matched with specific online identifiers)
- Activating data (the last stage is to create addressable audience segments that you can target with your ads and messages)
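The three steps above can be sketched in a few lines of Python. This is a toy illustration only: the function names, the record layout, and the `identity_graph` dictionary (which stands in for an onboarding partner's matching system) are all hypothetical, and SHA-256 is just one possible choice of hash format.

```python
import hashlib


def hem(email: str) -> str:
    """Hash a trimmed, lowercased email with SHA-256 (assumed format)."""
    return hashlib.sha256(email.strip().lower().encode("utf-8")).hexdigest()


def prepare_upload(crm_rows):
    """Step 1: replace raw emails with hashed emails before upload."""
    return [{"hem": hem(row["email"]), "segment": row["segment"]} for row in crm_rows]


def match_identifiers(uploaded_rows, identity_graph):
    """Step 2: keep rows whose hashed email is known to the (hypothetical) graph."""
    matched = []
    for row in uploaded_rows:
        online_ids = identity_graph.get(row["hem"])
        if online_ids:
            matched.append({**row, "online_ids": set(online_ids)})
    return matched


def build_audiences(matched_rows):
    """Step 3: group the matched online identifiers into addressable segments."""
    audiences = {}
    for row in matched_rows:
        audiences.setdefault(row["segment"], set()).update(row["online_ids"])
    return audiences


if __name__ == "__main__":
    crm = [  # made-up CRM rows
        {"email": "Jane.Doe@example.com", "segment": "catalog_buyers"},
        {"email": "john.roe@example.com", "segment": "catalog_buyers"},
    ]
    graph = {hem("jane.doe@example.com"): {"maid:1234", "cookie:abcd"}}
    audiences = build_audiences(match_identifiers(prepare_upload(crm), graph))
    for segment, ids in audiences.items():
        print(segment, sorted(ids))
```

Note that only one of the two made-up customers ends up in the audience, because the toy graph has no entry for the second hashed email. In practice, that gap between uploaded and matched records is what a partner's match rate describes.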
Of course, you will need a data onboarding partner that will help you execute the whole process. In their Google Attribution 360 white paper, Google explains what this process looks like from their perspective.
How can you select the best data onboarding partner?
For starters, ask your future partner how long it will take to activate data. In today’s dynamic world, speed is of paramount importance. Users interact with your brand and offer across different channels and devices. If you can’t keep up the pace, you will miss many opportunities to reach users and engage them with your brand’s message.
Secondly, pay attention to data accuracy. Verify your partner’s addressability capabilities (based on authenticated data). Partner with a company that deterministically and probabilistically matches customer data.
And finally, take care of your datasets. Your data onboarding partner should ensure full transparency when it comes to how your data is collected and used. Moreover, you ought to maintain full control over it.
In their white paper, Google advises taking a closer look at these additional elements:
- Is their methodology transparent? Do they use proven technology and solutions?
- Can they provide you with measurable results and a successful track record?
- Do they have the necessary privacy and cybersecurity measures in place? Do they use consented data?
Getting answers to all of these questions is critical and will help you make an informed decision.
Data onboarding with Roqad Link
We know that data onboarding is all about immediacy, viability, and addressability. That’s why we offer a data onboarding solution that you can implement in 2-4 weeks.
We can take your existing deterministic identity graph or CRM dataset whose keys are Hashed Email Addresses (HEMs) and tie them to mobile ad IDs (MAIDs) and cookies, and then we can connect those to one (or all) of the universal IDs out there or even publisher IDs. We call it “Roqad Link.”
For example, one of our customers has all of their HEMs in MD5, which is an older format. They’re worried that if they don’t switch to one of the newer SHA formats now, they’ll lose access to their hard-won deterministic identifiers! So we’ll use our probabilistic graph to make that switch and then turn the connections into UID 2.0, Zeotap ID+, and Lotame Panorama IDs.
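A quick aside on why this kind of migration needs outside help: a hash can’t be converted from one format to another directly, because that would require the original email address. The sketch below is a simplified illustration and not a description of how Roqad Link works. It assumes a hypothetical `crosswalk` table in which some party that holds the addresses has computed both the MD5 and the SHA-256 digest for each one. All names and data are made up.

```python
import hashlib


def md5_hem(email: str) -> str:
    """MD5 digest of a trimmed, lowercased email (older format)."""
    return hashlib.md5(email.strip().lower().encode("utf-8")).hexdigest()


def sha256_hem(email: str) -> str:
    """SHA-256 digest of a trimmed, lowercased email (newer format)."""
    return hashlib.sha256(email.strip().lower().encode("utf-8")).hexdigest()


def rekey_records(md5_records, crosswalk):
    """Re-key MD5-keyed records to SHA-256 where the crosswalk has a pair.

    Returns the re-keyed records plus the MD5 keys that could not be resolved.
    """
    rekeyed, unresolved = {}, []
    for md5_key, record in md5_records.items():
        sha_key = crosswalk.get(md5_key)
        if sha_key is None:
            unresolved.append(md5_key)
        else:
            rekeyed[sha_key] = record
    return rekeyed, unresolved


if __name__ == "__main__":
    # Made-up customer records that are keyed only by MD5 hashed emails.
    records = {
        md5_hem("jane.doe@example.com"): {"lifetime_value": 420},
        md5_hem("john.roe@example.com"): {"lifetime_value": 99},
    }
    # Hypothetical crosswalk: both digests were computed from the same address.
    crosswalk = {md5_hem("jane.doe@example.com"): sha256_hem("jane.doe@example.com")}
    rekeyed, unresolved = rekey_records(records, crosswalk)
    print(len(rekeyed), "record(s) re-keyed;", len(unresolved), "unresolved")
```

Any record whose MD5 key is missing from the crosswalk stays unresolved, which is exactly the data a company risks losing if it waits too long to migrate.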
The use case for a lot of the big e-commerce players is that they have a deterministic link on a hashed email, but they need help to match it to an online identifier and finally deliver a Zeotap ID.
Other folks in the DMP or CDP business already have the MAIDs (i.e., the online identifiers), but they need to link them to hashed email addresses in Germany, for which we have access to the biggest database. So it’s kind of the reverse of the above use case.
We’re pretty excited about our entry into data onboarding, because it really becomes more than just cross-device. It’s truly identity resolution in a privacy-safe manner.
If you want to find out more, let’s have a chat. You can hit me up on LinkedIn.