Last edited on May 18, 2024

How to handle payment data without going crazy over PCI

I first came across "PCI DSS" as a teenager, working at a scrappy payments startup. I knew little about compliance then; I was mostly focused on the tech, fascinated by all the innovations that had to come together to do what this company was doing.

We were selling mobile card terminals (POS) to small merchants in Mexico, similar to Squareup in the US.

We had two teams: one based in the US building the hardware (3D printing the POS case, designing PCBs, etc.) and another in Mexico that took care of local payment processing.

Card numbers were encrypted at the hardware level with multiple layers of AES using the POS unit's unique key. The data was streamed over the iPhone audio jack to the mobile app, and from there it travelled over the internet via HTTPS to our backend for payment processing.

Many complex things were happening. The tech felt clever, even elegant. But all this effort was useless: card information was decrypted in the backend and could be found stored in plain text in multiple places. Encryption held up until the data landed in our backend, where everything unravelled.

Encryption wasn't a problem, or the solution. AES and HTTPS were solid. But we hadn't designed for what happened after: how decrypted data moved, where it lived, and who could touch it. That's where PCI gets hard, and where most systems quietly fail.

Years later, I had to implement PCI compliance myself, and the experience was sobering. You don’t really grasp how fragile your systems are until you try to make them legible to an auditor.

In fast-growing teams, compliance tends to fall into the "we'll do it later" pile, right up until your partner bank tells you that you need to get PCI certified to access the unit economics you need to keep growing and make it to your next round of financing. Or... your company dies. PCI wasn't just a checklist; for my startup, it was the difference between growth and death.

My intention with this post is to share the findings of a meta-analysis so critical that it completely changed how I approached PCI, and that allowed me to keep building momentum without falling into the technical traps you walk into when you think you’re just "moving fast".

What's PCI anyway?

PCI DSS is a security standard introduced by payments companies that businesses must adhere to in order to handle credit card information securely. It applies to you as soon as you actually handle card data.

The data

Cardholder data includes things like:

The card number: Primary Account Number, PAN
Card expiration date
Cardholder name

Sensitive authentication data, like:

Verification code, CVV/CVC
The PIN you type into a payment terminal to approve a transaction, or into an ATM to withdraw cash
The authorisation/signature generated by the card chip (EMV) as the result of approving a transaction with a physical card in a physical terminal (POS).

Good to know

PCI is not strictly a tech thing. It also involves processes, people, etc.
Cardholder data environment (CDE): anything or anyone that is in contact with, or has access to, the data.
- People: for example, operations teams, disputes, fraud prevention, customer support.
- Processes: automated scripts, old manual "first do this, then do that" routines, and the classic "can you send the Excel sheet [that contains sensitive data] to my email?"
- Technology: apps, servers, hardware, infrastructure and third party providers.

Reduce the Scope

Engineers can easily overlook the fact that they're handling sensitive data when building a piece of software. Data can permeate everywhere, making each system it touches a target for compliance.

I went meta and understood that "Never touching card data is the simplest way to be PCI-compliant". This is especially hard when the number of services is expanding, and it gets worse when you're pushing the eng org to move fast: one way or another, the card number ends up tightly coupled to many complex processes. Thus, I made it my goal to reduce the number of services that needed, or could have, access to this data.

We'll be applying tokenisation (turning high-value data into rubbish). I won't get into the details of how tokenisation works, as there is extensive literature that explains it better than I ever could.

If you have questions about this, please get in touch; I'll be happy to answer them. Alternatively, use a managed service.

For this project, we have Go microservices running on Kubernetes. There's a public API for consumers and an internal API, protected by VPNs, where the network partners send notifications.

The design of the system we'll build looks like this:

Intercept incoming requests

We'll have a Reverse Proxy at the API gateway level that will intercept incoming requests and replace sensitive data with tokens.

The process looks like this:

Authenticate the request.
Check for idempotency.
Telemetry: start tracing if no trace/span exists
...
Look for fields in the incoming request that we know carry sensitive information, like "card_number" or "cvc", and replace their values with tokens.

For example, the requester wants to create a new card:


POST /v1/payment_methods

{
  "card_number": "4242424242424242",
  "cvc": "123"
}

Becomes:


POST /v1/payment_methods

{
  "card_number": "tok_1ce85b37fc234451afff384df3c903ba",
  "cvc": "tok_37dd7af618ad494486dcf66c68e2aad3"
}

Only then do we let the request continue on its path. No downstream service will ever touch actual card data, only the token, so it is safe for other services to store the token.
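The replacement step can be sketched in Go roughly like this. It is a minimal illustration, not the gateway we ran: the Tokeniser interface, the sensitiveFields list and the in-memory memoryTokeniser are all names I made up for the example, and it only looks at top-level JSON fields.

```go
package main

import (
	"crypto/rand"
	"encoding/hex"
	"encoding/json"
	"fmt"
)

// Tokeniser is a hypothetical client for the token service.
type Tokeniser interface {
	CreateToken(data string) (string, error)
}

// sensitiveFields lists the top-level JSON keys to replace (assumed for this sketch).
var sensitiveFields = []string{"card_number", "cvc"}

// TokeniseBody replaces the values of known sensitive fields with tokens.
// It fails closed: a sensitive field that is present but not a string is rejected.
func TokeniseBody(body []byte, tok Tokeniser) ([]byte, error) {
	var doc map[string]any
	if err := json.Unmarshal(body, &doc); err != nil {
		return nil, fmt.Errorf("decode body: %w", err)
	}
	for _, field := range sensitiveFields {
		raw, present := doc[field]
		if !present {
			continue
		}
		value, ok := raw.(string)
		if !ok {
			return nil, fmt.Errorf("field %q must be a string", field)
		}
		id, err := tok.CreateToken(value)
		if err != nil {
			return nil, fmt.Errorf("tokenise %q: %w", field, err)
		}
		doc[field] = id
	}
	return json.Marshal(doc)
}

// memoryTokeniser is an in-memory stand-in for the real token service,
// for illustration only. It stores values in plain text.
type memoryTokeniser struct{ store map[string]string }

func (m *memoryTokeniser) CreateToken(data string) (string, error) {
	raw := make([]byte, 16)
	if _, err := rand.Read(raw); err != nil {
		return "", err
	}
	id := "tok_" + hex.EncodeToString(raw)
	m.store[id] = data
	return id, nil
}

func main() {
	tok := &memoryTokeniser{store: map[string]string{}}
	body := []byte(`{"card_number":"4242424242424242","cvc":"123"}`)
	out, err := TokeniseBody(body, tok)
	if err != nil {
		panic(err)
	}
	fmt.Println(string(out)) // both values are now random tok_... ids
}
```

Rejecting a non-string card_number instead of passing it through is a deliberate choice here: if the proxy cannot tokenise a field it knows is sensitive, the request should not reach the services behind it.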

Accessing sensitive information

There will be services where we'll need to use the actual card number and other sensitive information, for example a service that sends the card details to a third party to process a payment.

Instead of allowing this service to pull the decrypted card numbers and risk spreading them around, we'll implement a forward proxy. Rather than calling the provider directly, the caller instructs the forward proxy to make the request. The proxy looks up each token in the token service, replaces the tokenised fields with the corresponding card details, and then forwards the request to the third party.

This proxy will have to:

Block all requests by default.
Only allow explicitly listed URLs to receive a request with the actual card data.
Have access to the token service, where the original values behind the tokens are stored.
Only accept calls from allowed services.
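The default-deny rules above could be checked with something like the following Go sketch. The Policy type, the caller names and the idea of allowing hosts by path prefix are my own assumptions for the example; how the caller's identity is established (mTLS, service accounts, etc.) is out of scope here.

```go
package main

import (
	"fmt"
	"net/url"
	"path"
	"strings"
)

// Policy is a hypothetical default-deny rule set for the forward proxy.
type Policy struct {
	AllowedCallers map[string]bool     // service identities allowed to call the proxy
	AllowedHosts   map[string][]string // destination host -> permitted path prefixes
}

// Allow reports whether caller may send detokenised data to rawURL.
// Anything not explicitly listed is rejected.
func (p Policy) Allow(caller, rawURL string) bool {
	if !p.AllowedCallers[caller] {
		return false
	}
	u, err := url.Parse(rawURL)
	if err != nil || u.Scheme != "https" {
		return false // card data never leaves over plain HTTP
	}
	// Clean the path so "/api/rest/../other" cannot slip past the prefix check.
	cleaned := path.Clean(u.Path)
	for _, prefix := range p.AllowedHosts[u.Hostname()] {
		if strings.HasPrefix(cleaned, prefix) {
			return true
		}
	}
	return false
}

func main() {
	p := Policy{
		AllowedCallers: map[string]bool{"payments-service": true},
		AllowedHosts: map[string][]string{
			"na-gateway.mastercard.com": {"/api/rest/"},
		},
	}
	target := "https://na-gateway.mastercard.com/api/rest/version/43/merchant/1234/order/1234/transaction/1234"

	fmt.Println(p.Allow("payments-service", target))                      // true
	fmt.Println(p.Allow("reporting-service", target))                     // false
	fmt.Println(p.Allow("payments-service", "https://example.com/dump")) // false
}
```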

For example:


POST http://forward_proxy.ns_tokenisation.svc.cluster.local:8080

{
  "type": "http", // or grpc,graphql,etc
  "http": {
    "url": "https://na-gateway.mastercard.com/api/rest/version/43/merchant/1234/order/1234/transaction/1234",
    "method": "PUT",
    "headers": {...},
    "payload": {
      "sourceOfFunds": {
        "cardNumber": "tok_89c34a5056a548f7a61af6f11ec41562",
        "type": "CARD"
      },
      "order": {
        "amount": 123.45,
        "currency": "USD"
      }
    }
  }
}

The proxy will replace the tokens with the corresponding values and execute the request.


PUT https://na-gateway.mastercard.com/api/rest/version/43/merchant/1234/order/1234/transaction/1234
...headers

{
  "sourceOfFunds": {
    "cardNumber": "4242424242424242",
    "type": "CARD"
  },
  "order": {
    "amount": 123.45,
    "currency": "USD"
  }
}

---
Response:
{
  "order.id": "975507c27706",
  "order.amount": 123.45,
  "order.currency": "USD"
}
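The swap the forward proxy performs on the payload can be sketched in Go as a recursive walk over the decoded JSON. This is an illustration under assumptions: the Detokeniser interface and the mapVault stand-in are hypothetical, and I'm assuming every string that starts with "tok_" is a token.

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
	"strings"
)

// Detokeniser is a hypothetical client for the token service.
type Detokeniser interface {
	RetrieveToken(id string) (string, error)
}

// detokenise walks a decoded JSON value and swaps every "tok_..." string
// for the original value behind it. Unknown tokens abort the whole request.
func detokenise(v any, d Detokeniser) (any, error) {
	switch t := v.(type) {
	case string:
		if !strings.HasPrefix(t, "tok_") {
			return t, nil
		}
		plain, err := d.RetrieveToken(t)
		if err != nil {
			return nil, err
		}
		return plain, nil
	case map[string]any:
		for key, child := range t {
			out, err := detokenise(child, d)
			if err != nil {
				return nil, err
			}
			t[key] = out
		}
		return t, nil
	case []any:
		for i, child := range t {
			out, err := detokenise(child, d)
			if err != nil {
				return nil, err
			}
			t[i] = out
		}
		return t, nil
	default:
		return v, nil // numbers, booleans and null pass through untouched
	}
}

// DetokenisePayload decodes the payload, replaces tokens and re-encodes it.
func DetokenisePayload(payload []byte, d Detokeniser) ([]byte, error) {
	dec := json.NewDecoder(bytes.NewReader(payload))
	dec.UseNumber() // keep amounts exactly as they were sent
	var doc any
	if err := dec.Decode(&doc); err != nil {
		return nil, fmt.Errorf("decode payload: %w", err)
	}
	out, err := detokenise(doc, d)
	if err != nil {
		return nil, err
	}
	return json.Marshal(out)
}

// mapVault is an in-memory stand-in for the token service, for illustration only.
type mapVault map[string]string

func (m mapVault) RetrieveToken(id string) (string, error) {
	value, ok := m[id]
	if !ok {
		return "", fmt.Errorf("unknown token %q", id)
	}
	return value, nil
}

func main() {
	vault := mapVault{"tok_89c34a5056a548f7a61af6f11ec41562": "4242424242424242"}
	payload := []byte(`{"sourceOfFunds":{"cardNumber":"tok_89c34a5056a548f7a61af6f11ec41562","type":"CARD"},"order":{"amount":123.45,"currency":"USD"}}`)
	out, err := DetokenisePayload(payload, vault)
	if err != nil {
		panic(err)
	}
	fmt.Println(string(out))
}
```

The detokenised bytes should only ever exist inside the proxy for the duration of the outgoing call; they must not be logged or returned to the caller.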

Tokenisation

In this article I won't get into the details of how tokens are handled or how one should handle the encryption. Banks usually have access to (expensive) hardware security modules (HSMs) that do the encryption part. If you don't have access to one, you could use a managed service like AWS CloudHSM.

The tokenisation service will generate a unique id for each piece of data that needs to be stored and make sure the original sensitive data is stored encrypted at rest. It must have strong access controls and access logs. Only the proxies should be allowed to make requests to it.

The service looks like this:

tokens-service.proto


syntax = "proto3";
package namespace.service.tokens.v1;
option go_package = "github.com/namespace/api/service.tokens/proto;tokenspb";

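service TokensService {
    // CreateToken stores a piece of sensitive data and returns the token id that stands in for it.
    rpc CreateToken(CreateTokenRequest) returns(CreateTokenResponse) {}
    // RetrieveToken returns the original data for a token id. Only the proxies should call it.
    rpc RetrieveToken(RetrieveTokenRequest) returns(RetrieveTokenResponse) {}
}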

message CreateTokenRequest {
    string data = 1;
}

message CreateTokenResponse {
    string id = 1;
}

message RetrieveTokenRequest {
    string id = 1;
}

message RetrieveTokenResponse {
    string data = 1;
}
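To make the two RPCs above more concrete, here is a minimal in-memory sketch in Go of what could sit behind them. Everything in it is an assumption for illustration: the Vault type, the choice of AES-GCM, and the randomly generated key. A real deployment would keep keys in an HSM or a managed key service, persist the records, and add the access controls and logs mentioned earlier.

```go
package main

import (
	"crypto/aes"
	"crypto/cipher"
	"crypto/rand"
	"encoding/hex"
	"errors"
	"fmt"
	"sync"
)

// Vault is a hypothetical in-memory token store that keeps values encrypted.
type Vault struct {
	mu     sync.RWMutex
	aead   cipher.AEAD
	sealed map[string][]byte // token id -> nonce followed by ciphertext
}

// NewVault builds a vault around an AES key of 16, 24 or 32 bytes.
func NewVault(key []byte) (*Vault, error) {
	block, err := aes.NewCipher(key)
	if err != nil {
		return nil, err
	}
	aead, err := cipher.NewGCM(block)
	if err != nil {
		return nil, err
	}
	return &Vault{aead: aead, sealed: map[string][]byte{}}, nil
}

// CreateToken encrypts data and returns a random, meaningless token id.
func (v *Vault) CreateToken(data string) (string, error) {
	idBytes := make([]byte, 16)
	if _, err := rand.Read(idBytes); err != nil {
		return "", err
	}
	id := "tok_" + hex.EncodeToString(idBytes)

	nonce := make([]byte, v.aead.NonceSize())
	if _, err := rand.Read(nonce); err != nil {
		return "", err
	}
	// Prefix the ciphertext with the nonce, and bind the record to its id
	// so a ciphertext cannot be moved under a different token.
	record := v.aead.Seal(nonce, nonce, []byte(data), []byte(id))

	v.mu.Lock()
	v.sealed[id] = record
	v.mu.Unlock()
	return id, nil
}

// RetrieveToken decrypts and returns the data stored under id.
func (v *Vault) RetrieveToken(id string) (string, error) {
	v.mu.RLock()
	record, ok := v.sealed[id]
	v.mu.RUnlock()
	if !ok {
		return "", errors.New("token not found")
	}
	n := v.aead.NonceSize()
	plain, err := v.aead.Open(nil, record[:n], record[n:], []byte(id))
	if err != nil {
		return "", err
	}
	return string(plain), nil
}

func main() {
	key := make([]byte, 32) // demo key only; never generate and hold keys like this in production
	if _, err := rand.Read(key); err != nil {
		panic(err)
	}
	vault, err := NewVault(key)
	if err != nil {
		panic(err)
	}
	id, err := vault.CreateToken("4242424242424242")
	if err != nil {
		panic(err)
	}
	data, err := vault.RetrieveToken(id)
	if err != nil {
		panic(err)
	}
	fmt.Println(id, data == "4242424242424242")
}
```

Because the token id is random rather than derived from the card number, a leaked token reveals nothing about the card it stands for.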