Case study: Digitization of data analysis with AI
Every day, large amounts of data must be analysed at ever shorter intervals, in operations or in the course of a project. The analysis is mostly the basis for decisions or for adjusting the direction of our future actions. Typical questions in this regard are as follows:
Why did the current net interest income change significantly?
What moves my PnL today?
Where do the deviations in the test come from?
Frequently, these questions have to be answered daily, for example with elaborate analyses and research. The analysis is usually carried out with simple technical means (e.g. spreadsheets) or based on experience (expert knowledge) and, in the case of large data sets, often only on a sample basis. In addition, the required evaluation frequencies are increasing, for example for quality-assured intraday analysis, ad hoc reports or tests as part of agile development cycles. The need to increase efficiency in results analysis, for example through partial automation, is clearly evident here. One way of doing this is to use modern data analysis methods.
Based on this, we have examined machine learning (ML) algorithms with regard to their applicability and possible results. We implemented our approach in a data analytics tool and have already applied it successfully at Münchener Hypothekenbank to analyse deviation data from regression tests of a release upgrade of their front office system.
Challenge and Idea
Automating regular analyst activities in order to increase efficiency and frequency is a significant challenge. This primarily involves recurring analysis activities. The aim is to provide a system that carries out repetitive analyses in a short period of time, in a comprehensible manner and with consistent quality according to certain pattern specifications.
Our idea was to implement this with ML methods, which group the data appropriately and create automatic evaluations and visualizations. This way, the identification of the facts at hand and the interrelationships in the data can be made much more resource-efficient for the analysts.
With our approach, the analyst's efficiency increases because the processing of the data and the search for potential patterns are done by the algorithms. The analyst focuses on validating and verifying these patterns. With our tool, pattern recognition can be performed faster, more frequently and for larger amounts of data.
With our Data Analytics tool, we have created an application that makes a range of machine learning algorithms for data analysis, primarily supervised ones, easily available. The user does not need any programming skills and can quickly and flexibly apply the algorithms to the problem at hand.
The algorithms available in the tool include Decision Trees, Random Forests and Support Vector Machines as well as various clustering algorithms. The user can also parameterize these algorithms directly.
The use of these supervised procedures is a central building block for our solution approach, because the goal is the recognition of causes and correlations. The rules for classification applied by the algorithms can thus be clearly understood ex post.
Using the tool rests on the following fundamental assumptions about, and requirements for, the data provided:
Identification of special features
The aim of the application is to provide a justification for suspicious data. This means that the suspicious data must already have been identified.
The additional information on the data is decisive for the analysis, as it is used to group or separate the data.
The result is a model-based separation/classification of the data. The respective subsets are characterized so that a graphical analysis with regard to the special features is possible.
Our tool also offers graphical presentations of the results, such as the following representation of a decision tree. This graphic shows the result from the Decision Tree model, which uses Yes/No decisions to narrow down the characterizing properties of the given features in the suspicious data.
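To illustrate how such Yes/No decisions can be derived from the data, the following minimal Python sketch picks the single question of the form "attribute equals value?" that best separates suspicious from unsuspicious records, using Gini impurity, a common split criterion for decision trees. It is not the implementation used in our tool; the sample trades, the field names (product, currency, suspicious) and the function names are assumptions made purely for illustration. A full decision tree repeats this step recursively on each resulting subset.

```python
from collections import Counter


def gini(labels):
    """Gini impurity of a list of labels (0.0 means the list is pure)."""
    n = len(labels)
    if n == 0:
        return 0.0
    counts = Counter(labels)
    return 1.0 - sum((c / n) ** 2 for c in counts.values())


def best_yes_no_question(records, label_key):
    """Find the 'attribute == value?' question that best separates the labels.

    Returns (attribute, value, weighted_impurity) of the best split,
    or None if no question splits the records into two non-empty parts.
    """
    best = None
    attributes = [k for k in records[0] if k != label_key]
    for attr in attributes:
        for value in sorted({r[attr] for r in records}):
            yes = [r[label_key] for r in records if r[attr] == value]
            no = [r[label_key] for r in records if r[attr] != value]
            if not yes or not no:
                continue  # the question does not split the data
            weighted = (len(yes) * gini(yes) + len(no) * gini(no)) / len(records)
            if best is None or weighted < best[2]:
                best = (attr, value, weighted)
    return best


# Hypothetical records: 'suspicious' marks trades that show a deviation.
trades = [
    {"product": "swap", "currency": "EUR", "suspicious": True},
    {"product": "swap", "currency": "USD", "suspicious": True},
    {"product": "bond", "currency": "EUR", "suspicious": False},
    {"product": "bond", "currency": "USD", "suspicious": False},
    {"product": "loan", "currency": "EUR", "suspicious": False},
]

attribute, value, impurity = best_yes_no_question(trades, "suspicious")
print(f"Best first question: {attribute} == {value!r}? (impurity {impurity:.2f})")
```

On this sample data the sketch selects the question on the product type, since it separates the suspicious records completely from the rest.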
For optimal data connection, even to several systems simultaneously, we built our solution to be completely independent and implemented it as a standalone, Python-based open source solution. This allows a multitude of evaluations and graphical representations with the use of different algorithms and parameter settings. Furthermore, our solution has a batch mode and can therefore be started without human action, e.g. directly after automated reports have been produced. A graphical user interface is also available.
This flexibility significantly expands the field of application, even if individual algorithms are better suited to specific problems. The tool can easily be used in different areas, ranging from daily business analysis to deviation analysis in tests.
The Case Study
Münchener Hypothekenbank, a partner of the Volksbanken and Raiffeisenbanken in the financing of residential and commercial real estate, faces the same challenge as many other banks and financial service providers: the release and test cycles for the software instances in use are becoming ever shorter. Even after a fundamental optimization of the test procedure and the test strategy (see the article "Continuous Testing"), error analysis still required considerable effort and resources.
As part of the case study, we applied our tool for automated error analysis to the data comparison between a test and reference environment. The characterizations of the individual deviations in various technical reports were mapped using static data from the source system.
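The data preparation described above can be sketched as follows. The example compares one report value per trade between the reference and the test environment, marks deviations above a tolerance as suspicious and attaches static attributes to each record. Trade IDs, values, attribute names and the tolerance are hypothetical; the actual reports and data provision at Münchener Hypothekenbank are not shown here.

```python
def label_deviations(reference, test, static_data, tolerance=0.01):
    """Compare report values per trade and attach static attributes.

    reference, test: dicts mapping trade ID to a report value.
    static_data: dict mapping trade ID to a dict of static attributes.
    Returns one row per trade present in both environments.
    """
    rows = []
    for trade_id in sorted(reference.keys() & test.keys()):
        diff = test[trade_id] - reference[trade_id]
        row = {
            "trade_id": trade_id,
            "diff": diff,
            # Deviations above the tolerance are flagged for analysis.
            "suspicious": abs(diff) > tolerance,
        }
        # Static attributes later serve to group or separate the data.
        row.update(static_data.get(trade_id, {}))
        rows.append(row)
    return rows


# Hypothetical report values from the two environments.
reference = {"T1": 100.0, "T2": 250.0, "T3": 75.5}
test = {"T1": 100.0, "T2": 251.5, "T3": 75.5}
static_data = {
    "T1": {"product": "bond"},
    "T2": {"product": "swap"},
    "T3": {"product": "bond"},
}

rows = label_deviations(reference, test, static_data)
for row in rows:
    print(row)
```

Records prepared in this form, with a suspicious flag and descriptive attributes, are the kind of input a classification model needs in order to characterize the deviations.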
In order to embed our procedure at Münchener Hypothekenbank, we initially set up our application on site and verified the technical functionality.
Subsequently, the required data (the business report itself, the technical diff to the reference system, and the system data usually required for root cause analysis) were identified and their availability analysed. The data provision was designed on this basis and its implementation actively supported. As soon as the identified data was available, the Data Analytics tool was put to use and initial test runs were carried out.
Following the successful testing of the tool for the first report, its use was extended in consultation with Münchener Hypothekenbank to include further reports. In total, various other business reports (e.g. cash values, cash flow projections) were integrated and evaluated, thus successfully verifying the extensive usability and flexibility of the Data Analytics Tool.
Finally, the findings and results gained were processed and presented at management level and the decision was made by Münchener Hypothekenbank to use the tool on a regular basis in future.
Elena Moroianu (Principal Business Analyst, Treasury and Capital Markets IT), who managed the case study at Münchener Hypothekenbank, said:
“After implementing the execution of automated testing, the next challenge is making the analysis of this mass of results much more efficient and with as little manual intervention as possible. We see the potential of machine learning mechanisms in achieving a greater turnaround here. With the Finbridge Data-Analytics-Tool, we proved that it is possible to slice and dice the portfolio in only a small fraction of the time we actually spent during the regression cycles of the Summit upgrade project. During the case study, we focused on the Decision Tree Model and we validated that it properly identifies the affected portfolio and therefore can be a powerful tool for impact analysis.”
The range of possible applications, the flexibility and the increase in efficiency that we aimed for with our tool have been confirmed in practice by this case study.
The application we offer, based on machine learning algorithms, already covers a wide range of uses. Our offer around the Data Analytics Tool includes a pre-study to identify where the tool can be applied to increase efficiency. Based on this, we will support you in installing the tool and embedding it in the identified processes.
Our consulting around the application focuses on the optimal embedding of the automated analysis activities into your process landscape. The training of internal employees and a demand-oriented application support are just as natural for us as a customer-oriented solution approach and the implementation of possibly required individual adaptations. In addition, we are happy to support you in projects and/or regular operations.
We hope to have aroused your interest in our data analytics approach and look forward to hearing from you!