ML and credit card fraud

Machine Learning and AI: The Future Possibilities in Business

20 Jun 2018

At Kx25, the international kdb+ user conference held in New York City on May 18th, Alan Rozet, from Kx partner Brainpool, gave a presentation (available on the Kx YouTube channel) about how machine learning (ML) is being used in business today. Alan is a Principal Machine Learning Engineer at Capital One with wide-ranging ML experience across the financial, health care and consumer sectors.

Brainpool is an academic collective of over 250 experienced data scientists with doctorates and master's degrees from top universities around the world. Its goal is to bridge the gap between academia and industry. Brainpool is partnering with Kx on a number of ML consulting projects and is involved in creating internal machine learning training material.

Business demand for ML talent is far outstripping the rate at which trained specialists are entering the workforce. In his talk, Alan cites figures from the McKinsey Global Institute, which estimates that there will be a 50% to 60% gap between the number of trained data scientists and the demand for them in 2018. At the same time, the artificial intelligence market is estimated to grow from $640M in 2016 to $37B in 2025, at a compound annual growth rate of 50%.
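As a quick sanity check on the market figures quoted above, the growth rate implied by two endpoint values can be worked out directly. The sketch below assumes nine compounding years between 2016 and 2025, which is an assumption on our part; the original estimate may use different base years or rounding. The function names `cagr` and `project` are illustrative only.

```python
def cagr(start_value, end_value, years):
    """Compound annual growth rate implied by two endpoint values."""
    return (end_value / start_value) ** (1.0 / years) - 1.0


def project(start_value, rate, years):
    """Value reached by compounding start_value at a fixed annual rate."""
    return start_value * (1.0 + rate) ** years


YEARS = 2025 - 2016  # assumed compounding periods

# Rate implied by growing from $640M to $37B over the assumed period.
implied_rate = cagr(640e6, 37e9, YEARS)

# Where $640M would land after the same period at exactly 50% per year.
projected_at_50 = project(640e6, 0.50, YEARS)

print(f"implied CAGR: {implied_rate:.1%}")
print(f"projection at 50%: ${projected_at_50 / 1e9:.1f}B")
```

Under these assumptions the endpoints imply a rate of roughly 57%, while compounding at exactly 50% gives about $24.6B, so the quoted 50% is best read as a rounded figure.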

Alan presented several real-world business uses for ML, including credit card spend forecasting. In this case, ML was used to manage risk and prevent credit card fraud. To begin, the developers of the application asked themselves, ‘Given a customer with a historical transaction record, how does one predict their aggregate spend month to month?’

Assuming that a single customer might have 50 to 100 transactions per month, aggregated across millions or tens of millions of customers and across entire organizations, this quickly became a Big Data problem. A large part of the project focused on fast batch processing and other back-end resources for model training, which allowed the data scientists to use many instances, or servers, at once for faster training. This supervised learning case predicted aggregate monthly spend per customer.
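To make the shape of the forecasting problem concrete, here is a minimal sketch of the aggregation step followed by a naive baseline forecast. It is not the model from the talk, which is not described in detail. The transaction records, the function names and the three-month moving average are all assumptions for illustration. A supervised model would replace `forecast_next_month` with something trained on many customers.

```python
from collections import defaultdict
from datetime import date

# Hypothetical transaction records: (customer_id, transaction_date, amount).
transactions = [
    ("c1", date(2018, 1, 5), 40.0),
    ("c1", date(2018, 1, 20), 60.0),
    ("c1", date(2018, 2, 3), 120.0),
    ("c1", date(2018, 3, 9), 140.0),
    ("c2", date(2018, 1, 11), 10.0),
    ("c2", date(2018, 2, 14), 30.0),
]


def monthly_spend(records):
    """Sum amounts per customer per (year, month), in chronological order."""
    totals = defaultdict(float)
    for customer, day, amount in records:
        totals[(customer, day.year, day.month)] += amount
    series = defaultdict(list)
    for (customer, year, month), total in sorted(totals.items()):
        series[customer].append(((year, month), total))
    return dict(series)


def forecast_next_month(history, window=3):
    """Baseline forecast: mean of the last `window` monthly totals."""
    recent = [total for _, total in history[-window:]]
    return sum(recent) / len(recent)


spend = monthly_spend(transactions)
forecasts = {customer: forecast_next_month(h) for customer, h in spend.items()}
print(forecasts)
```

Note that months with no transactions are simply absent from the series in this sketch; a production pipeline would likely fill them with zeros before training.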

Alan also looked at the same problem in a slightly different way, in terms of anomalous credit card spend detection. He pointed out that anomaly detection is very much a problem where you need a close interplay between humans and the machine learning model. The data scientist needs to consider what kind of signaling is needed in anticipation of an anomalous event and, once one has happened, who should be alerted and how that decision might change depending on the consumer segment.

For example, if an enormously popular new consumer item, like a ‘Tickle Me Elmo’ toy, comes out and suddenly tons of consumers are purchasing it, there will be widespread new random spends of $100. Is that truly an anomaly that should generate an alert? Compare that with a small business customer who typically spends $10K a month and suddenly has charges that jump to $100K or $1M. What should that alerting look like? Which department should be notified, and how should the credit card company let the customer know?
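One way to picture the segment-dependent alerting questions above is a simple rule that compares current spend with a customer's typical spend and applies a different threshold and recipient per segment. This is only a sketch under stated assumptions: the segment names, thresholds, recipients and the median-ratio rule are all hypothetical and are not taken from the talk.

```python
from statistics import median

# Hypothetical per-segment policy: how large a jump over typical monthly
# spend triggers an alert, and who is notified. All values are illustrative.
SEGMENT_POLICY = {
    "consumer": {"ratio_threshold": 5.0, "notify": "customer"},
    "small_business": {"ratio_threshold": 3.0, "notify": "fraud_team"},
}


def check_month(segment, past_monthly_totals, current_total):
    """Return an alert dict if current spend is anomalous for the segment, else None."""
    policy = SEGMENT_POLICY[segment]
    typical = median(past_monthly_totals)
    if typical <= 0:
        return None  # no usable baseline; a real system would need a fallback
    ratio = current_total / typical
    if ratio >= policy["ratio_threshold"]:
        return {"segment": segment, "ratio": ratio, "notify": policy["notify"]}
    return None


# A small business that typically spends about $10K and jumps to $100K.
business_alert = check_month("small_business", [9_500, 10_000, 10_500], 100_000)

# A consumer who usually spends about $400 a month and adds one $100 toy purchase.
consumer_alert = check_month("consumer", [380, 400, 420], 500)

print(business_alert)
print(consumer_alert)
```

In this toy setup the small business jump raises an alert routed to a fraud team, while the extra $100 consumer purchase does not. Choosing those thresholds and recipients is exactly the human judgment the passage describes.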

In addition to demonstrating how forecasting and anomaly detection can be tied to the proactive case in credit card fraud detection with ML, Alan described a number of ML use cases in other industries. For further insights into how businesses are using ML, watch Alan’s full Kx25 presentation on the Kx YouTube channel.

For more information about ML at Kx, please write to us at


Kx Insights: Machine learning and the value of historical data

2 Aug 2018

Data is being generated at a faster rate now than ever before. IDC has predicted that in 2025, there will be 163 zettabytes of data generated each year—a massive increase from the 16.1 zettabytes created in 2016. These high rates of data generation are partially an outcome of the multitude of sensors found on Internet of Things (IoT) devices, the majority of which are capable of recording data many times per second. IHS estimates that the number of IoT devices in use will increase from 15.4 billion devices in 2015 to 75.4 billion in 2025, indicating that these immense rates of data generation will continue to grow even higher in the years to come.

SEMICON 2018 Snapshot: Data and the Era of AI

24 Jul 2018

By Bill Pierson

The future of the semiconductor industry is looking bright judging by the breadth of new developments, initiatives and innovations on display at SEMICON West in San Francisco this July. Industry leading companies presented the latest technical and business insights into today’s opportunities and challenges, particularly in the areas of smart manufacturing and […]