PAPER PRESENTATION
ON
NEURAL FRAUD DETECTION IN CREDIT CARD OPERATIONS
Abstract:
The prevention of credit card fraud is an important application for prediction techniques. Here we describe the detection of credit card fraud using neural networks. One major obstacle to using neural network training techniques is the high diagnostic quality required. Since only one financial transaction in a thousand is invalid, no prediction success rate below 99.9% is acceptable. Because of these proportions in credit card transactions, completely new concepts had to be developed and tested on real credit card data. This paper shows how neural network techniques can be combined successfully to obtain high fraud coverage.
Using data from a credit card issuer, a neural network based fraud detection system is trained on a large sample of labelled credit card account transactions and tested on a holdout data set consisting of all account activity over a subsequent two-month period. The neural network is trained on examples of fraud due to lost cards, stolen cards, application fraud, counterfeit fraud, mail-order fraud and NRI (non-receipt of issue) fraud. The network detected significantly more fraud accounts, with significantly fewer false positives (reduced by a factor of 20), than rule-based fraud detection procedures.
A special category of stolen cards has become a major problem in recent years: the theft of cards from the mail. This so-called NRI (non-receipt of issue) fraud affects issuers at the time of both new card issues and re-issues. Certain geographic regions of the country are more at risk than others for NRI. In some areas, the problem has been so severe that issuers have used alternative methods of card delivery (courier as opposed to mail), as well as special card activation programs.
Introduction:
Neural networks are specialized computer software that produce non-linear models of complex problems in a way that is fundamentally different from conventional programming. From a large database, neural networks can develop a set of rules to recognize and predict certain conditions. The software learns from experience how to do the task, instead of waiting for programmers to work out the correct relationship. It works best at recognizing, predicting and controlling patterns, as in fraud detection, payment reviews and other areas where large amounts of data are gathered.
The following pages cover these sub-topics:
What is a neural network?
Most common neural network and its example
Why we go for neural networks and engineering approach
Neural networks in practice
Selected application:
Neural fraud detection in credit card operations
Conclusion
References
By the end of this paper, the importance of neural networks should be clear, in the spirit of the following slogan:
“Neural networks do not perform miracles. But if used sensibly they can produce some amazing results.”
What is a Neural Network?
A neural network is a powerful data modeling tool that is able to capture and represent complex input/output relationships. The motivation for the development of neural network technology stemmed from the desire to develop an artificial system that could perform "intelligent" tasks similar to those performed by the human brain. Neural networks resemble the human brain in the following two ways:
A neural network acquires knowledge through learning.
A neural network's knowledge is stored within inter-neuron connection strengths known as synaptic weights.
The true power and advantage of neural networks lie in their ability to represent both linear and non-linear relationships and in their ability to learn these relationships directly from the data being modeled. Traditional linear models are simply inadequate when it comes to modeling data that contains non-linear characteristics.
Most Common Neural Network:
The most common neural network model is the multilayer perceptron (MLP). This type of neural network is known as a supervised network because it requires a desired output in order to learn. The goal of this type of network is to create a model that correctly maps the input to the output using historical data so that the model can then be used to produce the output when the desired output is unknown. A graphical representation of an MLP is shown below.
Figure 1
Block diagram of a multilayer perceptron (MLP) with two hidden layers. The inputs are fed into the input layer and are multiplied by interconnection weights as they pass from the input layer to the first hidden layer. Within the first hidden layer, they are summed and then processed by a nonlinear function (usually the hyperbolic tangent). As the processed data leaves the first hidden layer, it is again multiplied by interconnection weights, then summed and processed by the second hidden layer. Finally, the data is multiplied by interconnection weights and processed one last time within the output layer to produce the neural network output.
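The forward pass described in the caption can be sketched in a few lines of Python. This is only an illustrative sketch: the layer sizes, the random initial weights, the bias terms and the use of the hyperbolic tangent in the output layer are assumptions made for the example, not details given in the paper.

```python
import math
import random

def layer(inputs, weights, biases):
    """Multiply inputs by the interconnection weights, sum, then apply tanh."""
    return [
        math.tanh(sum(w * x for w, x in zip(row, inputs)) + b)
        for row, b in zip(weights, biases)
    ]

def random_layer(n_in, n_out, rng):
    """Hypothetical initialisation: small random weights, zero biases (assumption)."""
    weights = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_out)]
    return weights, [0.0] * n_out

def mlp_forward(x, params):
    """Input -> first hidden layer -> second hidden layer -> output layer."""
    (w1, b1), (w2, b2), (w3, b3) = params
    h1 = layer(x, w1, b1)      # first hidden layer
    h2 = layer(h1, w2, b2)     # second hidden layer
    return layer(h2, w3, b3)   # output layer (tanh here is an assumption)

rng = random.Random(0)
# Assumed sizes: 3 inputs, two hidden layers of 4 units each, 1 output.
params = [random_layer(3, 4, rng), random_layer(4, 4, rng), random_layer(4, 1, rng)]
output = mlp_forward([0.2, -0.1, 0.7], params)
print(output)
```

With untrained random weights the printed value has no meaning; the training procedure described next is what makes the output useful.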
The MLP and many other neural networks learn using an algorithm called backpropagation. With backpropagation, the input data is repeatedly presented to the neural network. With each presentation the output of the neural network is compared to the desired output and an error is computed. This error is then fed back (backpropagated) to the neural network and used to adjust the weights such that the error decreases with each iteration and the neural model gets closer and closer to producing the desired output. This process is known as "training".
Example:
Figure 2
Demonstration of a neural network learning to model the exclusive-or (Xor) data. The Xor data is repeatedly presented to the neural network. With each presentation, the error between the network output and the desired output is computed and fed back to the neural network. The neural network uses this error to adjust its weights such that the error will be decreased. This sequence of events is usually repeated until an acceptable error has been reached or until the network no longer appears to be learning.
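The training loop described above can be sketched as follows. This is a minimal illustration, not the network shown in the figure: the single hidden layer of four tanh units, the linear output unit, the learning rate, the number of epochs and the random seed are all assumptions chosen for the example.

```python
import math
import random

XOR_DATA = [([0.0, 0.0], 0.0), ([0.0, 1.0], 1.0), ([1.0, 0.0], 1.0), ([1.0, 1.0], 0.0)]

def forward(x, w_hidden, w_out):
    """Return hidden activations and output; the last weight in each row is a bias."""
    hidden = [math.tanh(w[0] * x[0] + w[1] * x[1] + w[2]) for w in w_hidden]
    output = sum(w * h for w, h in zip(w_out, hidden)) + w_out[-1]
    return hidden, output

def mean_squared_error(w_hidden, w_out):
    return sum((forward(x, w_hidden, w_out)[1] - t) ** 2 for x, t in XOR_DATA) / len(XOR_DATA)

def train_xor(n_hidden=4, epochs=5000, learning_rate=0.05, seed=1):
    rng = random.Random(seed)
    w_hidden = [[rng.uniform(-1, 1) for _ in range(3)] for _ in range(n_hidden)]
    w_out = [rng.uniform(-1, 1) for _ in range(n_hidden + 1)]
    history = [mean_squared_error(w_hidden, w_out)]
    for _ in range(epochs):
        for x, target in XOR_DATA:
            hidden, output = forward(x, w_hidden, w_out)
            error = output - target  # compare network output with desired output
            # Feed the error back: each hidden unit receives its share of the
            # error, scaled by the tanh derivative 1 - tanh(s)^2.
            hidden_grads = [error * w_out[j] * (1.0 - hidden[j] ** 2) for j in range(n_hidden)]
            for j in range(n_hidden):
                w_out[j] -= learning_rate * error * hidden[j]
                w_hidden[j][0] -= learning_rate * hidden_grads[j] * x[0]
                w_hidden[j][1] -= learning_rate * hidden_grads[j] * x[1]
                w_hidden[j][2] -= learning_rate * hidden_grads[j]
            w_out[-1] -= learning_rate * error
        history.append(mean_squared_error(w_hidden, w_out))
    return w_hidden, w_out, history

w_hidden, w_out, history = train_xor()
print("error before training:", history[0])
print("error after training: ", history[-1])
for x, target in XOR_DATA:
    print(x, target, forward(x, w_hidden, w_out)[1])
```

In practice the loop would stop once the error is acceptable or no longer improving, as the caption describes; a fixed number of epochs is used here only to keep the sketch short.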
Why Use Neural Networks?
Neural networks, with their remarkable ability to derive meaning from complicated or imprecise data, can be used to extract patterns and detect trends that are too complex to be noticed by either humans or other computer techniques. A trained neural network can be thought of as an "expert" in the category of information it has been given to analyse. This expert can then be used to provide projections given new situations of interest and to answer "what if" questions. Other advantages include:
Adaptive Learning: An ability to learn how to do tasks based on the data given for training or initial experience.
Self-Organisation: An artificial neural network (ANN) can create its own organisation or representation of the information it receives during learning time.
Real Time Operation: ANN computations may be carried out in parallel, and special hardware devices are being designed and manufactured which take advantage of this capability.
Fault Tolerance via Redundant Information Coding: Partial destruction of a network leads to a corresponding degradation of performance. However, some network capabilities may be retained even with major network damage.
An Engineering Approach:
A simple neuron:
An artificial neuron is a device with many inputs and one output. The neuron has two modes of operation: the training mode and the using mode. In the training mode, the neuron can be trained to fire (or not) for particular input patterns. In the using mode, when a taught input pattern is detected at the input, its associated output becomes the current output. If the input pattern does not belong to the taught list of input patterns, the firing rule is used to determine whether to fire or not.
Figure 3
A simple neuron
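The two modes of operation can be sketched as a small Python class. The paper does not say which firing rule is used, so the rule below is an assumption made for illustration: for an untaught pattern, the neuron fires if the pattern is closer (in number of differing bits) to a taught firing pattern than to any taught non-firing pattern, and the result is undefined (`None`) on a tie.

```python
class SimpleNeuron:
    """A neuron with binary inputs, a training mode and a using mode."""

    def __init__(self):
        self.taught = {}  # input pattern (tuple of 0/1) -> taught output (1 fires, 0 does not)

    def train(self, pattern, fires):
        """Training mode: associate an input pattern with firing or not firing."""
        self.taught[tuple(pattern)] = 1 if fires else 0

    def use(self, pattern):
        """Using mode: return the taught output, or fall back to the firing rule."""
        pattern = tuple(pattern)
        if pattern in self.taught:
            return self.taught[pattern]
        return self._firing_rule(pattern)

    def _firing_rule(self, pattern):
        """Assumed rule: compare bit distances to the nearest taught patterns."""
        def nearest(output):
            distances = [
                sum(a != b for a, b in zip(pattern, taught_pattern))
                for taught_pattern, taught_output in self.taught.items()
                if taught_output == output
            ]
            return min(distances) if distances else float("inf")

        d_fire, d_rest = nearest(1), nearest(0)
        if d_fire < d_rest:
            return 1
        if d_rest < d_fire:
            return 0
        return None  # tie: the assumed rule gives no decision

neuron = SimpleNeuron()
neuron.train((1, 1, 1), fires=True)
neuron.train((1, 0, 1), fires=True)
neuron.train((0, 0, 0), fires=False)
neuron.train((0, 0, 1), fires=False)

print(neuron.use((1, 1, 1)))  # taught pattern
print(neuron.use((1, 1, 0)))  # untaught: decided by the firing rule
print(neuron.use((0, 1, 1)))  # untaught: equally close to both groups
```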
Neural Networks in Practice:
- Pattern Recognition
- Sales Forecasting
- Industrial Process Control
- Customer Research
- Data Validation
- Risk Management
- Target Marketing
- Optical Character Recognition (OCR)
- Process Modeling and Control
- Machine Diagnostics
- Portfolio Management
- Target Recognition
- Medical Diagnosis
- Credit Rating
- Targeted Marketing
- Voice Recognition
- Financial Forecasting
- Quality Control
- Intelligent Searching
- Fraud Detection
- Recognition of speakers in communications
- Diagnosis of hepatitis
- Recovery of telecommunications from faulty software
- Interpretation of multimeaning Chinese words
- Undersea mine detection
- Texture analysis
Selected Application:
Neural Fraud Detection in Credit Card Operations:
Pattern recognition is certainly one of the most relevant areas of application of neural networks. The range of concrete problems that may fall under this category is very wide. Another area of great practical interest and active research is the classification of what may be called social or economic patterns.
A clear instance is the long-established and widely used family of credit scoring systems, which essentially have to decide whether a given application is creditworthy or should instead be rejected. This problem model extends to many other interesting cases, such as insurance policy decisions, financial ratings, and the acceptance or rejection of credit card operations.
This last problem has, of course, received a great deal of attention under many different approaches, some of them based on neural networks. There are even several commercial fraud detection systems with large neural components. The problem has some peculiar characteristics derived from the different uses that such cards can have; here, however, we concentrate only on the detection of possibly fraudulent operations. By this we mean the use, by an unauthorized person and against the will of the true owner, of a card that may have been lost, stolen or falsified. This possibility has to be taken into account even if the card's main use is as a credit device. Fraud is then the paramount risk issue.
Credit card fraud detection also has two other highly peculiar characteristics. The first one is obviously the very limited time span in which the acceptance or rejection decision has to be made. The second one is the huge amount of credit card operations that have to be processed at a given time. To just give a medium size example, in Spain more than 1.2 million Visa card operations take place in a given day, 98% of them being handled on line. Of course, just very few will be fraudulent, but this just means that the haystack where these needles are to be found is simply enormous.
When considering the information to be used to rate a given operation, two distinct possibilities arise. In the first, which we may call by-owner, operations are rated according to the usage history of the card owner. This approach requires the ability to fetch the pertinent owner's information very quickly from the usually huge databases of all cardholders' historical records. We have to clarify what "fast" means in this setting. Fraud prevention is very important to card issuers, but not to the point of making daily card use impractical or simply inconvenient for hundreds of thousands of customers. When all the time needed for the remote connections and the basic operation processing is taken into account, a fraud detection system usually has no more than a small fraction of a second to perform its task. This allotted time will most likely not be enough for large database queries. This may be alleviated if specially configured, dedicated hardware is available, but that will certainly result in higher start-up and maintenance costs.
In any case, such systems also have additional advantages, the most relevant being the deep and powerful analysis that a customer's past history allows when rating a given operation. These systems may very well work in a sort of "deferred on-line" mode: even if a given operation has to be authorized before a possibly negative rating has been completed, further incoming operations on that card will be effectively blocked. Notice that fraud may be fought over individual transactions, but the fight is won on the sum of them. Moreover, users of stolen or falsified cards will tend to shift their attention from issuers with globally effective prevention systems to others that are less prepared.
There are situations, however, where a "by-owner" system is simply impossible to set up. This is the case when a detection system is to be installed not at an issuer but rather at an "operation hub", that is, a central operation processing center that receives transactions from many sellers, distributes them to each particular card issuer, and relays the issuer's answer back to the originating point of sale. The most important activity of such a hub is to streamline credit card traffic back and forth between sellers and issuers; it therefore holds essentially no owner information. Thus, unlike card issuing institutions, such hubs are not good candidates for a "by-owner" system. An alternative solution is to build a "by-operation" scoring system, that is, a system that uses only the information of the operation itself. When this is complemented with the previous operation history of the associated card over a period of time, variables such as operation frequency, accumulated amounts, acceptance or rejection rates of previous operations, and so on, can be derived. These variables are obviously of great interest when deciding whether the current operation is legal or not.
Such history-derived variables are certainly also used in a "by-owner" system, but the by-operation approach has some clear strong points:
Since only operations from an immediately preceding period of time are considered, data querying is done on relatively small databases, and operation scoring and authorization can be performed in real time, without requiring deferred processing.
The small database size makes it possible to install such a system without large dedicated hardware, thus noticeably reducing its cost.
Its placement at an operation hub allows it to service several issuers simultaneously, without having to install individual systems at each of them.
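As an illustration of the by-operation variables mentioned above (operation frequency, accumulated amounts, rejection rates), the sketch below derives them from a card's recent operations. The `Operation` record, its field names, the 24-hour window and the sample values are hypothetical choices for the example; the paper does not specify the variables or the period actually used.

```python
from dataclasses import dataclass

@dataclass
class Operation:
    card_id: str
    timestamp: float  # seconds, on any common clock
    amount: float
    accepted: bool

def history_features(current, recent_ops, window_seconds=24 * 3600):
    """Derive by-operation variables from the card's operations in a recent window."""
    window = [
        op for op in recent_ops
        if op.card_id == current.card_id
        and 0 <= current.timestamp - op.timestamp <= window_seconds
    ]
    count = len(window)
    rejected = sum(1 for op in window if not op.accepted)
    return {
        "operation_count": count,                                # operation frequency
        "accumulated_amount": sum(op.amount for op in window),   # accumulated amounts
        "rejection_rate": rejected / count if count else 0.0,    # rejection rate
        "current_amount": current.amount,
    }

# Made-up sample operations, for illustration only.
recent = [
    Operation("A", 1000.0, 20.0, True),
    Operation("A", 5000.0, 300.0, False),
    Operation("A", -100000.0, 50.0, True),  # older than the window, ignored
    Operation("B", 5500.0, 75.0, True),     # different card, ignored
]
current = Operation("A", 6000.0, 450.0, True)
features = history_features(current, recent)
print(features)
```

Because only a short window of operations is scanned, the lookup stays small, which is the property the strong points above rely on.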
Fraud detection in credit card operations falls neatly, in principle, within the scope of pattern recognition procedures, and its solution can be sought through the construction of appropriate classifier functions. However, as may also be expected, it has certain characteristics of its own that make it a rather difficult problem. The most important is the great imbalance between good operations and fraudulent ones. This is not surprising at all: card issuers intend to make a profit out of credit card use, and fraud directly diminishes that profit. Thus, they will already have implemented a number of fraud prevention methods. In other words, new fraud detection tools, neural or otherwise, are weapons to be used in a war already being fought.
From the point of view of the construction of neural detectors, this imbalance will certainly make model training rather difficult. In fact, typical fraud rates may well be on the order of one per tens of thousands of operations. This means that, prior to model construction, some kind of data segmentation has to be applied in order to reduce this imbalance in the data sets to be used. Segmentation criteria have to be defined after a thorough statistical analysis of legal and fraudulent traffic among, for instance, the different geographical and sector areas for which independent detection modules are to be built.
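A small numerical sketch shows why this imbalance matters and what segmentation can achieve. The segment names and counts below are invented toy values, not data from the paper: they are chosen so that the overall fraud rate is one per ten thousand while one segment carries all of the fraud.

```python
from collections import defaultdict

def fraud_rates_by_segment(operations):
    """operations: iterable of (segment, is_fraud) pairs -> {segment: fraud rate}."""
    totals = defaultdict(int)
    frauds = defaultdict(int)
    for segment, is_fraud in operations:
        totals[segment] += 1
        frauds[segment] += int(is_fraud)
    return {segment: frauds[segment] / totals[segment] for segment in totals}

# Toy traffic (hypothetical): 100,000 operations, 10 of them fraudulent.
operations = (
    [("region_A/travel", False)] * 990
    + [("region_A/travel", True)] * 10
    + [("region_B/grocery", False)] * 99_000
)

overall_rate = sum(is_fraud for _, is_fraud in operations) / len(operations)
# A "detector" that calls every operation legal is right whenever there is no fraud.
always_legal_accuracy = 1 - overall_rate
rates = fraud_rates_by_segment(operations)

print("overall fraud rate:", overall_rate)
print("accuracy of accepting everything:", always_legal_accuracy)
print("fraud rate per segment:", rates)
```

On the toy data, a detector that accepts everything is almost always right and yet catches no fraud, while a detection module built only for the first segment would be trained on data in which fraud is a hundred times more frequent than in the traffic as a whole.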
If they work properly, neural real-time fraud detection systems will be not only technically feasible but also highly interesting from a purely economic point of view. However, their development has to overcome certain hurdles. First, rather extensive data analysis has to be performed on traffic information to obtain a meaningful set of detection variables. This analysis is also necessary to segment the data effectively, in such a way that the enormous imbalance between legal and fraudulent traffic does not overwhelm the latter to the point of making detection impossible. Moreover, the difficult estimation of prior probabilities and class overlapping may make ordinary multilayer perceptron training essentially useless; it is thus necessary to devise new model building approaches.
Finally, we mention that although it is certainly possible to use the ratings of a neural credit card fraud detection system as the basis for absolute, automated operation acceptance or rejection criteria, an alternative and more realistic approach in by-operation systems is to use those ratings jointly with a card's immediate operation history to make referral decisions; this greatly enhances the performance and reliability of the system. Even if this makes it necessary for the system to work jointly with an Authorization Center, the approach will still attack a significant portion of attempted fraud while making very good sense from a cost-effectiveness point of view.
The model for such a system is represented in the following diagram:
Figure 4
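The referral approach described above can be sketched as a simple decision function. Everything specific here is hypothetical: the score range of 0 to 1, the two thresholds, the use of recent rejections as the history signal and the two outcome labels are assumptions for illustration, not the rules of the system in the diagram.

```python
def referral_decision(score, recent_rejections, high=0.8, low=0.5, max_rejections=2):
    """Combine a neural fraud rating with the card's immediate history.

    score: neural rating in [0, 1], higher meaning more suspicious (assumed scale).
    recent_rejections: rejected operations on this card in the recent window.
    Returns "refer" (send to the Authorization Center) or "authorize".
    """
    if score >= high:
        return "refer"  # the rating alone is alarming enough
    if score >= low and recent_rejections >= max_rejections:
        return "refer"  # a moderate rating backed by a suspicious recent history
    return "authorize"

print(referral_decision(0.9, 0))
print(referral_decision(0.6, 3))
print(referral_decision(0.6, 0))
```

Using the rating to trigger a referral rather than an outright rejection keeps the final decision with the Authorization Center, which is the more cautious arrangement the text argues for.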
Conclusion:
The computing world has a lot to gain from neural networks. Their ability to learn by example makes them very flexible and powerful. Furthermore there is no need to devise an algorithm in order to perform a specific task; i.e. there is no need to understand the internal mechanisms of that task. They are also very well suited for real time systems because of their fast response and computational times which are due to their parallel architecture.
Neural networks also contribute to other areas of research such as neurology and psychology. They are regularly used to model parts of living organisms and to investigate the internal mechanisms of the brain.
Perhaps the most exciting aspect of neural networks is the possibility that some day 'conscious' networks might be produced. A number of scientists argue that consciousness is a 'mechanical' property and that 'conscious' neural networks are a realistic possibility.
Finally, we would like to state that even though neural networks have huge potential, we will only get the best out of them when they are integrated with computing, AI, fuzzy logic and related subjects.
References:
P. Barson, S. Field, N. Davey, G. McAskie, R. Frank: The Detection of Fraud in Mobile Phone Networks; Neural Network World 6, 4, pp. 477-484 (1996)
S. Ghosh, D.L. Reilly: Credit Card Fraud Detection with a Neural Network; Proc. 27th Annual Hawaii Int. Conf. on System Science, IEEE Comp. Soc. Press, Vol. 3, pp. 621-630 (1994)
http://www.neuronet.com/
http://www.neuralworld.com/
http://www.ieee.com/
http://www.google.com/