Tokenization has become a crucial function in the payments system because it allows companies to better protect their customers’ payment information. We recently chatted with Maarten Sander, CTO at Ginger, which helps financial institutions revamp their payments technology via its platform. Sander built the Ginger Platform, including its tokenization solution, from scratch, so we spoke to him about a host of data-related issues, including tokenization, GDPR, and digital onboarding. Check out the first part of our conversation, and what he had to say on tokenization, below.
What’s tokenization?
At face value, tokenization is simple: “It’s just substituting a piece of valuable data for something that has zero value itself”, says Sander. This data could be, for example, a credit card number. Here, a person’s card number would be replaced by a different set of numbers, which may or may not keep the format of the original card number. Tokens for credit cards often keep the same format so that legacy systems can recognise and process the data.
Universally unique identifiers (UUIDs) are another form tokens can take. A UUID does not share the structure of the original data, because its format is fixed: a 128-bit number. This makes it more difficult to guess the original piece of information, because you can’t tell what type of data it is by looking at the token.
At Ginger, tokens are generated in the form of a UUID, which essentially is a random number created via an algorithm. If a client needs a token that can be validated as a credit card for a legacy system, Ginger uses a different algorithm to generate the token.
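The two token shapes described above can be sketched in a few lines of Python. This is a minimal illustration, not Ginger's implementation: the function names are made up for the example, and it assumes that "can be validated as a credit card" means the token passes the Luhn checksum that most card-number format checks apply.

```python
import secrets
import uuid


def uuid_token() -> str:
    """Return a random (version 4) UUID; it reveals nothing about the data."""
    return str(uuid.uuid4())


def luhn_check_digit(partial: str) -> str:
    """Compute the Luhn check digit for a digit string that lacks one."""
    total = 0
    # Walk right to left; the rightmost digit of `partial` is doubled first.
    for i, ch in enumerate(reversed(partial)):
        d = int(ch)
        if i % 2 == 0:
            d *= 2
            if d > 9:
                d -= 9
        total += d
    return str((10 - total % 10) % 10)


def luhn_valid(number: str) -> bool:
    """True if the full digit string passes the Luhn checksum."""
    return number.isdigit() and luhn_check_digit(number[:-1]) == number[-1]


def card_like_token(length: int = 16) -> str:
    """Random digits plus a Luhn check digit (hypothetical token format)."""
    body = "".join(secrets.choice("0123456789") for _ in range(length - 1))
    return body + luhn_check_digit(body)


print(uuid_token())        # a random 36-character UUID string
print(card_like_token())   # 16 random digits that pass a Luhn check
```

One design caveat: a purely random card-like token could coincide with a real card number, so a production scheme would need some way to tell tokens and real numbers apart. How that is done is outside what this sketch covers.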
So, what’s the advantage of using tokenization for credit card numbers and other sensitive data?
“Tokenization gives you more control over information and it makes it easier to store data centrally,” says Sander.
Who uses tokenization, and who tokenizes?
Tokenization is used by financial institutions, including banks and fintechs, to safely store and retrieve credit card and/or merchant data, but it’s also of use to organisations in insurance, eCommerce, or telecom, for instance. Companies that handle credit card details can also reduce their Payment Card Industry Data Security Standard (PCI-DSS) scope by implementing tokenization.
The entities that provide tokenization are usually called Token Service Providers (TSPs), although this term is most frequently used in the mobile payments industry. TSPs are a critical part of the payment ecosystem: as approved third-party partners, they tokenize and store the sensitive data. Many banks, payment service providers (PSPs), and credit card companies are choosing to become their own TSP, and that is where Ginger helps.
How more complex sets of data can be tokenized
“If you’re tokenizing more than just a single piece of information, like a full address, for example, or a whole customer record,” Sander says, “then it becomes a bit harder, because what piece of information do you tokenize?” Here, the individual pieces of information can be tokenized separately, or the document can be tokenized as a whole. Tokenizing the individual pieces allows for more control, but usually makes implementations more complex.
Ginger is currently working on a new API where customers can specify whether to tokenize an entire document or its individual fields, and where customers can choose the format of the token per field. The latter makes it easier to work with legacy systems that require data to fit specific formats.
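The difference between the two approaches can be shown with a small sketch. It is not Ginger's API, which the article does not describe in detail: the function names, the in-memory `vault` dictionary, and the sample customer record are all assumptions made for illustration.

```python
import json
import uuid

# In-memory stand-in for a token vault: token -> original value.
# A real vault would encrypt values and sit behind access control.
vault: dict[str, str] = {}


def tokenize(value: str) -> str:
    """Store a value and return a random UUID token that points to it."""
    token = str(uuid.uuid4())
    vault[token] = value
    return token


def detokenize(token: str) -> str:
    """Look up the original value; requires access to the vault."""
    return vault[token]


def tokenize_document(doc: dict[str, str]) -> str:
    """One token for the whole record: simple, but all-or-nothing access."""
    return tokenize(json.dumps(doc, sort_keys=True))


def tokenize_fields(doc: dict[str, str], fields: set[str]) -> dict[str, str]:
    """One token per chosen field; the other fields stay readable."""
    return {k: tokenize(v) if k in fields else v for k, v in doc.items()}


customer = {"name": "Jane Doe", "street": "1 Main St", "city": "Amsterdam"}
whole = tokenize_document(customer)                       # a single token
partial = tokenize_fields(customer, {"name", "street"})   # city stays in clear
```

With per-field tokens, a system that only needs the city never has to touch the vault, which is the extra control the text mentions. The cost is more tokens to manage and more decisions about which fields to protect.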
So, where is the tokenized data stored? How is it protected?
Sander succinctly explains how tokenized data is stored at Ginger: “We encrypt the data, store it, and then we generate a token that points to this data.” This practice is common in tokenization.
Ginger stores data in a vault, where the stored information is encrypted. Sander says encryption is normally used in combination with tokenization to keep stored data safe. “If data isn’t encrypted,” Sander explains, “then it’s stored as plain text, which is a huge problem when a breach happens”.
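The encrypt, store, and point-to-it flow can be sketched as follows. This is an assumption-laden illustration rather than Ginger's vault: the `Vault` class is hypothetical, the storage is an in-memory dictionary, and the cipher is a toy keystream built from HMAC-SHA256 so that the example runs with the standard library alone. A real vault would use a vetted authenticated cipher such as AES-GCM and keep its key in dedicated key-management hardware or software.

```python
import hashlib
import hmac
import secrets
import uuid


def _keystream_xor(key: bytes, nonce: bytes, data: bytes) -> bytes:
    """TOY cipher: XOR data with an HMAC-SHA256 counter keystream.

    Illustration only; do not use this in place of a real cipher.
    """
    out = bytearray()
    for offset in range(0, len(data), 32):
        counter = (offset // 32).to_bytes(8, "big")
        block = hmac.new(key, nonce + counter, hashlib.sha256).digest()
        chunk = data[offset:offset + 32]
        out.extend(b ^ k for b, k in zip(chunk, block))
    return bytes(out)


class Vault:
    """Hypothetical in-memory vault: token -> (nonce, ciphertext)."""

    def __init__(self) -> None:
        self._key = secrets.token_bytes(32)   # assumed to live in a KMS or HSM
        self._rows: dict[str, tuple[bytes, bytes]] = {}

    def tokenize(self, value: str) -> str:
        nonce = secrets.token_bytes(16)       # fresh per record
        ciphertext = _keystream_xor(self._key, nonce, value.encode("utf-8"))
        token = str(uuid.uuid4())             # random; carries no data
        self._rows[token] = (nonce, ciphertext)
        return token

    def detokenize(self, token: str) -> str:
        nonce, ciphertext = self._rows[token]
        return _keystream_xor(self._key, nonce, ciphertext).decode("utf-8")
```

The token here is only a pointer. Someone who steals the token learns nothing, and someone who steals the stored rows without the key sees only ciphertext.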
Vaultless tokenization is also possible. This is a method where a vault is not needed to store sensitive data; rather, the data is encoded within the token itself. This encoding is often done via format-preserving encryption, which ensures that the encrypted data looks similar to the original information, i.e., if a piece of information has four characters, the encrypted data will also contain four characters. Encoding can also be carried out by using some form of a lookup table. Sander remarks: “The most important thing to realise is that the original sensitive data can always be derived from the token! Using tokenization with a vault, the sensitive data is not encoded in the token, and retrieving it always requires access to the vault.”
Vaultless tokenization can also help companies become PCI-DSS compliant, but it does not reduce the scope of PCI-DSS certification in the way vault-based tokenization can. With a vault, the sensitive data is stored separately, which places the systems that only handle tokens outside the certification scope. With vaultless tokenization, the information is contained within the token itself, so those systems remain in scope.
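To make the lookup-table idea concrete, here is a deliberately simple sketch. It is not a standard format-preserving encryption scheme and is far too weak for real use: it just derives a secret digit-substitution table per position from a key. All names are invented for the example. What it does show is the property Sander highlights: anyone holding the key can turn the token back into the original, with no vault involved.

```python
import hashlib
import hmac
import random

DIGITS = "0123456789"


def _table(key: bytes, position: int) -> list[str]:
    """Derive a secret permutation of the ten digits for one position."""
    seed = hmac.new(key, str(position).encode(), hashlib.sha256).digest()
    table = list(DIGITS)
    random.Random(seed).shuffle(table)   # deterministic for a given seed
    return table


def tokenize(key: bytes, number: str) -> str:
    """Replace each digit via its position's table; length and format are kept."""
    return "".join(_table(key, i)[int(d)] for i, d in enumerate(number))


def detokenize(key: bytes, token: str) -> str:
    """Invert the substitution; needs only the key, not a vault."""
    return "".join(str(_table(key, i).index(c)) for i, c in enumerate(token))


key = b"demo-key-not-for-production"
token = tokenize(key, "4111111111111111")   # 16 digits in, 16 digits out
original = detokenize(key, token)            # recovers the card number
```

Because the data travels inside the token, every system that stores the token is effectively storing the (encoded) sensitive data, which is why key handling matters so much more here than with a vault.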
What about deleting tokenized information?
Data encryption and decryption are performed using a key. To “delete” the information, you could simply erase the key to the encrypted data. Without the key, the original data can never be recovered, which essentially turns the encrypted data into gibberish. Some prefer this method, called crypto-shredding, to deleting the source data in the vault itself, since it keeps the integrity of the database intact.
For example, if a web shop customer service employee requests the customer details of a person who has said they want to be forgotten under GDPR, with crypto-shredding the system would return a customer record with all the details made unreadable. If the information was deleted at the source, the system may return an error because the expected records are no longer available.
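The web shop scenario can be sketched like this. The `CustomerStore` class and its method names are hypothetical, and the encryption is a one-time pad (a random key as long as the data, XORed with it), chosen only because it needs nothing beyond the standard library. Real systems would more likely use a per-customer key for a standard cipher, held in a key-management service.

```python
import secrets


def xor_bytes(data: bytes, pad: bytes) -> bytes:
    """XOR data with an equally long random pad (one-time pad)."""
    return bytes(d ^ p for d, p in zip(data, pad))


class CustomerStore:
    """Hypothetical store: ciphertext in one table, keys in another."""

    def __init__(self) -> None:
        self.records: dict[str, dict[str, bytes]] = {}   # encrypted fields
        self.keys: dict[str, dict[str, bytes]] = {}      # per-customer keys

    def save(self, customer_id: str, fields: dict[str, str]) -> None:
        self.records[customer_id] = {}
        self.keys[customer_id] = {}
        for name, value in fields.items():
            raw = value.encode("utf-8")
            pad = secrets.token_bytes(len(raw))
            self.keys[customer_id][name] = pad
            self.records[customer_id][name] = xor_bytes(raw, pad)

    def shred(self, customer_id: str) -> None:
        """Erase only the keys; the encrypted rows stay in place."""
        self.keys.pop(customer_id, None)

    def load(self, customer_id: str) -> dict[str, str]:
        row = self.records[customer_id]
        pads = self.keys.get(customer_id)
        if pads is None:                      # keys were shredded
            return {name: "<unreadable>" for name in row}
        return {n: xor_bytes(c, pads[n]).decode("utf-8") for n, c in row.items()}


store = CustomerStore()
store.save("c1", {"name": "Jane Doe", "email": "jane@example.com"})
store.shred("c1")
print(store.load("c1"))   # every field comes back as "<unreadable>"
```

The record still exists after shredding, so lookups and foreign keys keep working, but nothing in it can be read any more.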
Ultimately, tokenization and crypto-shredding are crucial methods for dealing with sensitive data and staying compliant with GDPR. This is becoming increasingly important for capturing the personal data needed for a digital onboarding process.