The KDD Process in Big Data Analytics: A Theoretical Approach to Taxpayer Non-Compliance Analysis
Abstract
In the modern business environment, big data analytics and data mining techniques are increasingly recognized as tools for improving fiscal discipline and managing public revenues more efficiently. This paper explores whether the knowledge discovery in databases (KDD) process can be applied to detect patterns of financial behavior that may indicate tax non-compliance. It takes a quantitative approach based on secondary data from ten joint-stock companies in the Federation of Bosnia and Herzegovina for which financial statements and tax debt data are available. The relationship between key financial indicators (EPS, financial stability ratio, total asset turnover ratio, and debt ratio) and the amount of tax debt was examined using descriptive statistics and regression analysis. The results show that lower profitability and poorer financial stability correlate significantly with higher tax debt, while high operational efficiency and debt have a more complex and statistically marginal effect. The findings confirm that publicly available financial data can be used for early identification of risky taxpayers, which opens the way to further development of predictive models in tax analytics.
(Previously published on Sciendo: https://sciendo.com/article/10.2478/jfap-2025-0002, under the CC BY-NC-ND 4.0 license)