Monte Carlo Dependency Estimation
Proceedings of the 31st International Conference on Scientific and Statistical Database Management (SSDBM 2019)
Estimating dependency is a fundamental task in data management. Identifying the relevant variables leads to a better understanding of the data and improves both the runtime and the outcome of data analysis. In this paper, we propose Monte Carlo Dependency Estimation (MCDE), a framework for estimating multivariate dependency. MCDE quantifies dependency as the average discrepancy between marginal and conditional distributions, estimated via Monte Carlo simulations. Based on this framework, we present Mann-Whitney P (MWP), a novel dependency estimator. We show that MWP satisfies a number of desirable properties and demonstrate that it outperforms state-of-the-art multivariate dependency measures.
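The idea of averaging a discrepancy between marginal and conditional distributions over Monte Carlo iterations can be illustrated with a minimal Python sketch. It is not the authors' implementation, and every name and design choice in it is an assumption made for illustration: the functions `average_ranks`, `mann_whitney_two_sided_p`, and `mc_dependency`, the parameters `iterations` and `alpha`, the rule of conditioning on a random contiguous rank slice of each non-reference dimension, the comparison of points inside the slice against points outside it, and the score `1 - p` taken from a normal approximation of the two-sided Mann-Whitney U test without tie correction of the variance.

```python
import math
import random


def average_ranks(values):
    """Return 1-based ranks; tied values share their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for k in range(i, j + 1):
            ranks[order[k]] = (i + j) / 2 + 1
        i = j + 1
    return ranks


def mann_whitney_two_sided_p(a, b):
    """Two-sided p-value of the Mann-Whitney U test (normal approximation).

    Simplification (assumption): the variance is not corrected for ties.
    """
    n1, n2 = len(a), len(b)
    ranks = average_ranks(list(a) + list(b))
    u1 = sum(ranks[:n1]) - n1 * (n1 + 1) / 2
    mean = n1 * n2 / 2
    sd = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    z = abs(u1 - mean) / sd
    return math.erfc(z / math.sqrt(2))


def mc_dependency(columns, iterations=200, alpha=0.5, rng=None):
    """Average discrepancy score over random conditionings (illustrative only).

    columns: list of d >= 2 equally long lists of numbers, one per dimension.
    alpha:   assumed target fraction of points kept by each conditioning.
    """
    rng = rng or random.Random(0)
    d, n = len(columns), len(columns[0])
    # Row indices of each dimension in ascending value order, computed once.
    orders = [sorted(range(n), key=lambda i, c=c: c[i]) for c in columns]
    # Slice width per conditioning dimension, so that about alpha * n rows remain.
    keep = max(2, int(round(n * alpha ** (1 / (d - 1)))))
    total, used = 0.0, 0
    for _ in range(iterations):
        ref = rng.randrange(d)  # reference dimension for this iteration
        selected = set(range(n))
        for dim in range(d):
            if dim == ref:
                continue
            start = rng.randint(0, n - keep)  # random contiguous rank slice
            selected &= set(orders[dim][start:start + keep])
        inside = [columns[ref][i] for i in selected]
        outside = [columns[ref][i] for i in range(n) if i not in selected]
        if len(inside) < 2 or len(outside) < 2:
            continue  # too few points to compare; skip this iteration
        total += 1.0 - mann_whitney_two_sided_p(inside, outside)
        used += 1
    return total / used if used else float("nan")


# Usage on synthetic data: a nearly linear pair versus an independent pair.
data_rng = random.Random(1)
xs = [data_rng.random() for _ in range(300)]
linear = [2 * x + data_rng.gauss(0, 0.05) for x in xs]
noise = [data_rng.random() for _ in xs]
dependent_score = mc_dependency([xs, linear], rng=random.Random(2))
independent_score = mc_dependency([xs, noise], rng=random.Random(2))
print(dependent_score, independent_score)
```

Under these assumptions, the dependent pair should score noticeably higher than the independent pair, because conditioning on one variable shifts the distribution of the other. A rank-based test is used here because it makes no assumption about the shape of the marginal distributions.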