
Data and AI tech combat payment fraud

Ever since Big Tech, fintechs and incumbents shifted towards a more digital and data-driven approach to decision-making, the payments space has emerged as one of the most competitive in the financial sector. Players are competing for customer insights and preferences by mining financial habits and the data they generate.

Even though COVID-19 has severely impacted the retail financial market, eCommerce is expected to account for about 22% of all retail sales by 2023. That’s a sizable market of $6.5 trillion.

commerce graph
eCommerce share of total global retail sales from 2015 to 2023
Given that eCommerce holds a significant percentage of the retail pie, the prediction that online retailers will lose up to $130 billion to payment fraud between 2018 and 2023 is a point of concern for retailers across the world.

In this article, I’d like to discuss various types of online purchase fraud against which data and Machine Learning tools could prove our best defence.

Types of online fraud

Fraud can affect enterprises through lost revenue and negative customer experience, which impacts brand reputation. The diagram below displays the types of fraud and the two specific areas, false positive and false negative cases, that companies should focus on when building a robust fraud prediction capability.

kinds of fraud

Online payment is a prime target for fraudsters because it is so easy to get away with. Verifying the identity of the buyer is a tough task for sellers, especially in the era of the internet and digital interactions. Fraudsters have access to millions of personal credentials available on the dark web, and can hide behind false details to cover their tracks.

Defence against fraud

Here is a set of checks that are typically used to detect and prevent online fraud:

ways of fraud prevention
  • Address Verification Service (AVS) checks the given details against the bank’s records.
  • Card Verification Value (CVV) on a person's card cannot be replicated and isn’t stored at the merchant’s end.
  • Unique electronic device identification can’t be replicated and is easy to track. Also, companies like ThreatMetrix offer services to check device fingerprints against fraud databases.
  • Limiting large transactions is a preventive measure because fraudsters typically make large transactions before the card is blocked. 
  • Payer authentication via 3-D Secure, Mastercard SecureCode and similar services can authenticate the buyer's identity. Digital money transactions are carried out using a secure PIN.
  • High-risk countries are placed on a global watchlist, and orders made from these countries receive extra scrutiny.
  • Lockout mechanisms for particular IPs from which a large number of credit cards were declined within a pre-defined time period, and disabling transactions that fail AVS tests, are viable measures.
  • Risk scoring tools indicate the probability of a transaction being fraudulent. 
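
To make the lockout idea above concrete, here is a minimal Python sketch of a sliding-window check on declined cards per IP. The class name, the threshold of three declines and the ten-minute window are illustrative assumptions of mine, not values taken from any particular fraud product.

```python
from collections import defaultdict, deque


class IpLockout:
    """Lock an IP once it accumulates too many card declines in a time window.

    The threshold and window are hypothetical defaults chosen for illustration.
    """

    def __init__(self, max_declines=3, window_seconds=600):
        self.max_declines = max_declines
        self.window_seconds = window_seconds
        self._declines = defaultdict(deque)  # ip -> timestamps of recent declines

    def record_decline(self, ip, timestamp):
        self._declines[ip].append(timestamp)

    def is_locked(self, ip, now):
        recent = self._declines[ip]
        # Discard declines that have aged out of the window.
        while recent and now - recent[0] > self.window_seconds:
            recent.popleft()
        return len(recent) >= self.max_declines


lockout = IpLockout(max_declines=3, window_seconds=600)
for t in (0, 60, 120):  # three declines within two minutes
    lockout.record_decline("203.0.113.7", t)

locked_now = lockout.is_locked("203.0.113.7", now=130)
locked_later = lockout.is_locked("203.0.113.7", now=1000)
other_ip = lockout.is_locked("198.51.100.2", now=130)
```

In this sketch the IP is locked shortly after the third decline and released once those declines fall outside the window, which limits how long a genuine customer on the same IP stays blocked.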

The evolution of financial fraud prevention

The traditional rules engine is still the go-to approach for a lot of organizations when it comes to preventing fraud. For example, when hackers use stolen cards and bots to generate sales, the conservative approach is to immediately apply web application firewall (WAF) rules and shut down the hackers. Unfortunately, this also blocks genuine transactions that could take place during the same period.

As more of the retail and financial sectors adopt digital and social media channels to conduct their business, this rule-based trend is gradually giving way to more proactive methods of identifying fraud.

With the spread of the internet, extensive user reach and the use of social channels for business, the volume of readily available customer data is on the rise. Businesses are recognizing how machine intelligence can augment fraud prevention approaches like graph visualizations. In fact, reports suggest that ML-based preventive tech is looking at a global investment of up to $10 billion by 2024.

A popular method, graph visualizations are built off of data coming in from social channels. They work like a detective wall to help businesses identify patterns and build a picture of the fraud taking place. 

link analysis
Source: cambridgeintelligence.com
Here is an example of a graph visualization. The above image is a search result for all insurance claims pertaining to one vehicle.

The current claim concerns the vehicle DA53 RMX and the witness Everett Page. The detected connections that are potential red flags are:
  • Link analysis shows the vehicle DA53 RMX was involved in another claim.
  • The witness, Everett Page, shares his residential address with another person, Walter Steward.
  • Walter Steward has made an earlier claim on the same vehicle DA53 RMX. 
These connections alert the insurance officer to carry out further checks into this claim.
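
The same link analysis can be sketched in a few lines of Python. The graph below encodes only the relationships described in the example. The node labels (such as "claim:earlier" and "address:shared") are placeholders I have made up, and a breadth-first search stands in for whatever a commercial graph tool does internally.

```python
from collections import defaultdict, deque

# Relationships from the example; node names are illustrative placeholders.
edges = [
    ("claim:current", "vehicle:DA53 RMX"),
    ("claim:current", "person:Everett Page"),    # witness on the current claim
    ("claim:earlier", "vehicle:DA53 RMX"),
    ("claim:earlier", "person:Walter Steward"),  # claimant on the earlier claim
    ("person:Everett Page", "address:shared"),
    ("person:Walter Steward", "address:shared"),
]

graph = defaultdict(set)
for a, b in edges:
    graph[a].add(b)
    graph[b].add(a)


def shortest_path(graph, start, goal):
    """Breadth-first search returning the shortest chain of links, or None."""
    queue = deque([[start]])
    seen = {start}
    while queue:
        path = queue.popleft()
        node = path[-1]
        if node == goal:
            return path
        for neighbour in sorted(graph[node]):
            if neighbour not in seen:
                seen.add(neighbour)
                queue.append(path + [neighbour])
    return None


# Red flag 1: more than one claim touches the same vehicle.
claims_on_vehicle = sorted(
    n for n in graph["vehicle:DA53 RMX"] if n.startswith("claim:")
)

# Red flag 2: the witness is linked to the earlier claimant.
witness_link = shortest_path(graph, "person:Everett Page", "person:Walter Steward")
```

Here `witness_link` is the chain witness, shared address, earlier claimant, which is exactly the kind of short, unexpected connection an investigator would want surfaced.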

ML powered preventive tech

The volume and velocity of data at risk combined with fraudsters’ sophisticated methods to use stolen data make it imperative that preventive approaches are built to scale quickly and efficiently. 

And, while large enterprises have successfully adopted ML as part of their core defence strategy, it’s critical for small to medium enterprises who are embarking on their digital transformation journeys to understand and appreciate the benefits of investing in ML and AI when addressing these challenges. 

In scenarios where there are huge volumes of retail data to go through, ML allows for better speed, flexibility and precision during fraud analysis. Today, the tech is valued for its capacity to automate fraud detection and predict fraud at scale.

ML models and graph networks are mutually reinforcing (Graph Based Anomaly Detection, GBAD). Businesses can teach their models to flag networks (of densely connected entities) for review and block payments from networks that show accelerated growth to prevent a fraudster using multiple accounts to order goods.
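
As a rough illustration of flagging networks of connected entities, the sketch below groups accounts that share an identifier (a device, card or address) and compares two snapshots to spot fast-growing networks. It is a simple connected-components heuristic rather than a full GBAD algorithm, and the function names, the "acct:" prefix and both thresholds are assumptions made for the example.

```python
from collections import defaultdict


def find(parent, x):
    """Union-find lookup with path compression."""
    while parent[x] != x:
        parent[x] = parent[parent[x]]
        x = parent[x]
    return x


def account_networks(links):
    """Group accounts that are connected through any shared identifier.

    links: iterable of (account, identifier) pairs; accounts start with 'acct:'.
    """
    parent = {}
    for account, identifier in links:
        parent.setdefault(account, account)
        parent.setdefault(identifier, identifier)
        root_a, root_b = find(parent, account), find(parent, identifier)
        if root_a != root_b:
            parent[root_b] = root_a
    groups = defaultdict(set)
    for node in parent:
        if node.startswith("acct:"):
            groups[find(parent, node)].add(node)
    return list(groups.values())


def flag_networks(links_before, links_now, review_size=3, block_growth=2):
    """Block networks that grew quickly; send large ones for review."""
    known = {account for account, _ in links_before}
    decisions = {}
    for network in account_networks(links_now):
        new_accounts = len(network - known)
        if new_accounts >= block_growth:
            action = "block"
        elif len(network) >= review_size:
            action = "review"
        else:
            action = "allow"
        decisions[frozenset(network)] = action
    return decisions


links_before = [("acct:a", "device:1"), ("acct:b", "card:9")]
links_now = links_before + [
    ("acct:c", "device:1"),
    ("acct:d", "device:1"),
    ("acct:d", "addr:x"),
    ("acct:e", "addr:x"),
]
decisions = flag_networks(links_before, links_now)
```

In this toy data, accounts a, c, d and e collapse into one network that gained three new accounts between snapshots, so it is blocked, while account b stays on its own and is allowed.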

Typical ML models used for fraud detection

The supervised model is trained to classify a transaction based on what it has learned from labelled or curated past data. Regression and classification models fall under this category.

classifier models

Classification involves automatically assigning a label to unlabelled input data. These classifications can be binary or multiclass. Spam filtering and sentiment analysis are examples of classifier models.

When dealing with eCommerce data, classifier models sieve through information – customer/user, buyer/payer, recipient and details of the order, including shipping, billing, delivery mode, payment mode and merchant information – to classify transactions based on trained and periodically re-calibrated models. 

Classifier models use algorithms like k-nearest neighbors or logistic regression to determine which side of the fraud decision boundary a new transaction falls on.
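
To show what this looks like in practice, here is a small k-nearest neighbors sketch in plain Python. The two features (a scaled order amount and a scaled distance between billing and shipping address) and all the training rows are invented for illustration. A real model would use far more features and labelled history.

```python
import math
from collections import Counter


def knn_predict(train, query, k=3):
    """Label a transaction by majority vote among its k nearest neighbours.

    train: list of (features, label) pairs; query: a feature tuple.
    """
    nearest = sorted(train, key=lambda row: math.dist(row[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


# Hypothetical, pre-scaled features: (order amount, billing-to-shipping distance)
train = [
    ((0.10, 0.05), "legit"),
    ((0.20, 0.10), "legit"),
    ((0.15, 0.00), "legit"),
    ((0.90, 0.95), "fraud"),
    ((0.85, 0.80), "fraud"),
    ((0.95, 0.90), "fraud"),
]

small_local_order = knn_predict(train, (0.12, 0.08))
large_distant_order = knn_predict(train, (0.88, 0.90))
```

The first query sits among the legitimate examples and the second among the fraudulent ones, so they are labelled accordingly. Because k-nearest neighbors relies on distances, features need to be on comparable scales, which is why the example assumes pre-scaled values.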

The unsupervised model explores data to identify common patterns. Clustering, pattern search, and dimension reduction models fall under this category.

clustering models

Clustering involves grouping transactions that exhibit similar behaviours. These models are used to detect anomalies, where a customer’s digital behaviour differs from normal. 

An example is the alert you get when you sign in to a digital platform from a new device. One of the more popular algorithms used for clustering is k-means.
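
The sketch below shows one way k-means could be used for this kind of anomaly detection: cluster a customer's past sign-ins, then flag a new sign-in that is far from every cluster centre. The two features, the starting centroids and the 0.2 threshold are all assumptions picked to keep the example small.

```python
import math


def kmeans(points, centroids, iterations=10):
    """Plain k-means: assign points to the nearest centroid, then re-average."""
    for _ in range(iterations):
        clusters = [[] for _ in centroids]
        for p in points:
            nearest = min(
                range(len(centroids)), key=lambda i: math.dist(p, centroids[i])
            )
            clusters[nearest].append(p)
        centroids = [
            tuple(sum(axis) / len(cluster) for axis in zip(*cluster))
            if cluster
            else centroid  # keep an empty cluster's centroid where it was
            for cluster, centroid in zip(clusters, centroids)
        ]
    return centroids


def anomaly_score(point, centroids):
    """Distance to the closest cluster centre; larger means more unusual."""
    return min(math.dist(point, c) for c in centroids)


# Hypothetical scaled features: (time of day, distance from usual location)
history = [
    (0.30, 0.02), (0.35, 0.05), (0.32, 0.01),  # morning sign-ins near home
    (0.80, 0.03), (0.85, 0.04), (0.82, 0.02),  # evening sign-ins near home
]
centroids = kmeans(history, centroids=[history[0], history[3]])

threshold = 0.2  # illustrative cut-off, not a recommended value
usual_login_flagged = anomaly_score((0.33, 0.03), centroids) > threshold
new_device_login_flagged = anomaly_score((0.05, 0.90), centroids) > threshold
```

The familiar morning sign-in falls close to an existing cluster and passes, while the sign-in at an unusual hour from a distant location is flagged.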

Neural and deep networks mimic the human brain’s approach to classifying objects and detecting patterns. Compared to other models, a deep learning model benefits from being able to extract relevant features from the data on its own, thereby learning from it.

A few examples of deep learning in action are Facebook’s image recognition, Apple’s speech recognition for Siri, and natural language processing in Google Translate.

neural networks and deep networks

When detecting fraud, deep learning can extract complex patterns with very low error rates. This helps accurately score eCommerce transactions and decide which should be accepted, rejected or sent for manual review. Some of the popular techniques used are multilayer perceptron networks, autoencoders, convolutional neural networks and long short-term memory recurrent neural networks.
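
Whatever model produces the score, the last step is usually a simple mapping from score to action. The sketch below assumes the model outputs a fraud probability between 0 and 1. The function name and the two thresholds are hypothetical and would need tuning against a business's own false positive and false negative costs.

```python
def decide(fraud_probability, review_at=0.3, reject_at=0.8):
    """Map a model's fraud probability to accept, manual review or reject.

    Thresholds are illustrative assumptions, not recommended values.
    """
    if not 0.0 <= fraud_probability <= 1.0:
        raise ValueError("fraud_probability must be between 0 and 1")
    if fraud_probability >= reject_at:
        return "reject"
    if fraud_probability >= review_at:
        return "manual_review"
    return "accept"


decisions = [decide(p) for p in (0.05, 0.45, 0.92)]
```

Lowering `review_at` sends more orders to manual review and catches more fraud at the cost of reviewer time, while raising `reject_at` reduces the number of genuine customers who are turned away.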

Due to the pandemic and social distancing, eCommerce is at the center of most, if not all, global retail action. Unfortunately, this is allowing fraudsters to discern general shopping patterns from the vast amounts of data being produced. In fact, scammers are increasingly using AI tools like deepfakes to carry out their illegal activity.  

However, businesses are also recognizing that fraudsters leave data trails just as customers do. This means a retailer’s best weapon against fraud is data. And the more data businesses are able to track, the more patterns come to light, making it easier to prevent deceptive activity from taking place.

eCommerce organizations have to invest in being able to identify the unique set of data points that come from authentic customers’ browsing, buying and paying behaviour. When this information is coupled with internal system identifiers, such as merchant-defined fields, it becomes nearly impossible for a third party to commit fraud.

Additionally, organizations will also have to invest in basic customer education that helps people understand how their data sharing habits can impact fraud prevention. Our recommendation is to follow the de facto approach of “Even if you trust, always verify” for both system-to-system and human-to-human data exchanges. We believe the use of ML in fraud detection is now not a matter of if, but when.

Disclaimer: The statements and opinions expressed in this article are those of the author(s) and do not necessarily reflect the positions of Thoughtworks.
