The main purpose of the Predictive Analytics Service (PAS) is to predict cyber-physical attacks and report these predictions to the Data Layer and to the Dashboard.
The service covers both cyber and physical attacks. For cyber-attacks, the attack types considered include, but are not limited to, DoS, Fuzzers, Backdoors, Exploits, Generic, Reconnaissance, Shellcode, Analysis and Worms. For physical attacks, the service analyses and predicts abnormal and suspicious user behaviour in close proximity to and inside the bank premises, as well as deviations from normal trends, established procedures and internal bank routines. The main module of the service is a toolbox of machine learning (ML) and deep learning (DL) models, which are trained on both labelled and unlabelled data using several ML and DL algorithms.
The Predictive Analytics Service is implemented in Python 3.8 and is based upon existing ML and DL tools. The following frameworks have been used for testing and implementation:
- Keras: Python deep learning framework (https://keras.io/)
- Scikit-learn: Python machine learning library (https://scikit-learn.org/stable/)
- TensorFlow 1.3 (https://github.com/JosephGatto/Deep-Belief-Networks-Tensorflow)
The Predictive Analytics Service is composed of four modules.
- The Data Connectivity Module, which enables the transfer of objects between the Predictive Analytics Service, the Data Layer and the Dashboard.
- The Predictive Analytics Module, which implements the toolbox of predictive models. Several machine learning algorithms have been tested; their appropriateness and applicability are being evaluated and classified. The toolbox includes the following algorithms:
  - k-Nearest Neighbors (kNN)
  - Support Vector Machine (SVM)
  - Logistic Regression (LR)
  - Light Gradient Boosting (LGB)
  - Decision Trees (DT)
  - Random Forest (RF)
  - Long Short-Term Memory network (LSTM)
  - Artificial Neural Network (ANN), implemented as a multilayer perceptron with multiple hidden layers
  - Deep Belief Network (DBN) and Probabilistic Neural Network (PNN)
- The ML and DL Models Module, which consists of trained predictive models that can be accessed via the REST API.
- A local repository (the Model Repository Module), which stores CSV files with pre-processed data for further analysis by the ML and DL models.
The ML and DL Toolbox receives the pre-processed data from the corresponding component as input and prepares it for analysis. It then connects to the Model Repository Module to train and test the ML/DL models with the labelled data. Depending on the quality of that data, it selects the best-performing algorithm through a cross-validation process and uses it to predict the labels of the unlabelled data with the highest possible probability. The newly labelled data is then sent to the Data Layer Connection Module for further mitigation and, if the predicted labels indicate an attack, communicated to the Dashboard. The figure below presents how the Predictive Analytics Service components interact with the Data Layer Module in the FINSEC reference architecture.
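The selection step described above (cross-validating several candidate models on labelled data, keeping the best one, then labelling unlabelled records and reporting only predicted attacks) can be illustrated with a minimal scikit-learn sketch. This is not the service's actual code. It assumes a binary labelling (0 = normal, 1 = attack), uses synthetic data in place of the pre-processed CSV files, includes only a subset of the classical algorithms from the toolbox, and scores candidates with the F1 metric. The function names, the 0.5 alert threshold and the alert dictionary layout are all hypothetical.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.tree import DecisionTreeClassifier


def build_toolbox(seed=0):
    """Hypothetical subset of the toolbox: classical scikit-learn classifiers only."""
    return {
        # Distance- and gradient-based models are scaled first.
        "kNN": make_pipeline(StandardScaler(), KNeighborsClassifier(n_neighbors=5)),
        "LR": make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000)),
        "DT": DecisionTreeClassifier(random_state=seed),
        "RF": RandomForestClassifier(n_estimators=100, random_state=seed),
    }


def select_best_model(toolbox, X_labelled, y_labelled, folds=5):
    """Score every candidate with k-fold cross-validation and refit the best one."""
    scores = {
        name: cross_val_score(model, X_labelled, y_labelled,
                              cv=folds, scoring="f1").mean()
        for name, model in toolbox.items()
    }
    best_name = max(scores, key=scores.get)
    # Refit the winner on all labelled data before using it for prediction.
    best_model = toolbox[best_name].fit(X_labelled, y_labelled)
    return best_name, best_model, scores


def attack_alerts(model, X_unlabelled, threshold=0.5):
    """Keep only the unlabelled records whose attack probability reaches the threshold."""
    attack_col = list(model.classes_).index(1)  # column of the "attack" class
    proba = model.predict_proba(X_unlabelled)[:, attack_col]
    return [
        {"record": int(i), "attack_probability": float(p)}
        for i, p in enumerate(proba)
        if p >= threshold
    ]


# Synthetic stand-in for the pre-processed data (assumption: 0 = normal, 1 = attack).
X, y = make_classification(n_samples=600, n_features=12, n_informative=6,
                           weights=[0.8, 0.2], random_state=0)
X_labelled, y_labelled = X[:500], y[:500]
X_unlabelled = X[500:]

best_name, best_model, scores = select_best_model(build_toolbox(), X_labelled, y_labelled)
alerts = attack_alerts(best_model, X_unlabelled)
print(best_name, {name: round(score, 3) for name, score in scores.items()})
print(len(alerts), "records flagged as attacks")
```

In this sketch only the records in `alerts` would be forwarded for reporting. F1 is used for scoring because attack records are typically the minority class; a deployment might prefer a different metric or threshold.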
More information on the Predictive Analytics Service is available in "Adaptive and Intelligent Data Collection and Analytics for Securing Critical Financial Infrastructure".
The innovation of the solution lies in its ability to predict security indicators, such as vulnerabilities and threats. When properly configured and deployed, it can therefore help organizations increase their security preparedness and deal with abnormal or even unexpected attack situations.