Big data and probably approximately correct learning in the presence of noise: Implications For financial risk management



Chinthalapati, VLR, Mitra, S and Serguieva, A
(2019) Big data and probably approximately correct learning in the presence of noise: Implications For financial risk management. International Journal of Artificial Intelligence, 17 (1). pp. 34-56.

This is the latest version of this item.

LivElBig Data_PAC_Noise_RiskMgmt.pdf - Submitted version (364kB)
Big_Data_PAC_Noise_RiskMgmt_Algorithms(Round2).pdf - Author Accepted Manuscript (433kB)
BigDataPACIJAI.pdf - Author Accepted Manuscript (300kB)

Abstract

High-accuracy forecasts are essential to financial risk management, where machine learning algorithms are frequently employed. We derive a new theoretical bound on the sample complexity for Probably Approximately Correct (PAC) learning in the presence of noise that does not require specification of the size of the hypothesis set, |H|. We demonstrate that, for realistic financial applications where |H| is typically infinite, this is contrary to prior theoretical conclusions. We further show that noise, which is a non-trivial component of big data, has a dominating impact on the data size required for PAC learning. Consequently, contrary to current big data trends, we argue that high-quality data is more important than large volumes of data. This paper additionally demonstrates that the level of algorithmic sophistication, specifically the Vapnik-Chervonenkis (VC) dimension, needs to be traded off against data requirements to ensure optimal algorithmic performance. Finally, our new theorem can be applied to a wider range of machine learning algorithms, as it does not impose finite |H| requirements. This paper contributes to theoretical and applied research in the domain of machine learning for financial applications.
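As a rough illustration of why noise can dominate data requirements, the sketch below evaluates the classical finite-|H| sample bound for random classification noise due to Angluin and Laird (1988), m >= 2 / (eps^2 (1 - 2 eta_b)^2) * ln(2|H| / delta). This is not the theorem derived in the paper (which the abstract says avoids the finite |H| requirement); it is only the kind of prior bound the abstract contrasts against. The function name and the example parameter values are assumptions chosen for illustration.

```python
import math


def angluin_laird_sample_bound(epsilon, delta, eta_b, hypothesis_count):
    """Classical sample-size bound for PAC learning with classification noise.

    epsilon          -- target error, in (0, 1)
    delta            -- allowed failure probability, in (0, 1)
    eta_b            -- upper bound on the label-noise rate, in [0, 0.5)
    hypothesis_count -- |H|, which must be finite for this classical bound
    """
    if not (0.0 < epsilon < 1.0 and 0.0 < delta < 1.0):
        raise ValueError("epsilon and delta must lie strictly between 0 and 1")
    if not (0.0 <= eta_b < 0.5):
        raise ValueError("eta_b must lie in [0, 0.5)")
    if hypothesis_count < 1:
        raise ValueError("hypothesis_count must be at least 1")

    # The noise term (1 - 2*eta_b)^2 sits in the denominator, so the bound
    # grows without limit as the noise rate approaches 1/2.
    noise_factor = (1.0 - 2.0 * eta_b) ** 2
    m = 2.0 / (epsilon ** 2 * noise_factor) * math.log(2.0 * hypothesis_count / delta)
    return math.ceil(m)


if __name__ == "__main__":
    # Hypothetical settings: 10% error, 95% confidence, |H| = 1000.
    for eta in (0.0, 0.1, 0.2, 0.3, 0.4):
        print(eta, angluin_laird_sample_bound(0.1, 0.05, eta, 1000))
```

In this classical bound |H| enters only through a logarithm, while the noise rate enters through the factor 1 / (1 - 2 eta_b)^2, so moving from noise-free labels to a 40% noise rate multiplies the required sample size by about 25 regardless of |H|.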

Item Type: Article
Depositing User: Symplectic Admin
Date Deposited: 04 Dec 2018 15:05
Last Modified: 19 Jan 2023 01:10
URI: https://livrepository.liverpool.ac.uk/id/eprint/3029453
