Mining Causality via Information Bottleneck

Computer Science, 2022, Vol. 49, Issue (2): 198-203. doi: 10.11896/jsjkx.210100053

• Database & Big Data & Data Science •

QIAO Jie¹, CAI Rui-chu¹, HAO Zhi-feng²

  1. School of Computer, Guangdong University of Technology, Guangzhou 510006, China
  2. School of Mathematics and Big Data, Foshan University, Foshan, Guangdong 528000, China
  • Received: 2021-01-07  Revised: 2021-06-01  Online: 2022-02-15  Published: 2022-02-23
  • About author: QIAO Jie, born in 1993, Ph.D. student. His main research interests include machine learning and causality.
    CAI Rui-chu, born in 1983, Ph.D., professor, Ph.D. supervisor. His main research interests include artificial intelligence and causality.
  • Supported by:
    National Natural Science Foundation of China (61876043, 61976052).

Abstract: Causal discovery from observational data is a fundamental problem in many disciplines. However, existing approaches, such as constraint-based methods and causal function-based methods, make strong assumptions about the causal mechanism of the data, are applicable only to low-dimensional data, and cannot be applied to scenarios with hidden variables. To this end, we propose a causal discovery method based on the information bottleneck, called the causal information bottleneck. The method divides the causal mechanism into two stages: compression and extraction. In the compression stage, we assume that there is an intermediate compressed hidden variable; in the extraction stage, we extract as much of the correlated information from the effect variable as possible. By deriving a variational upper bound of the causal information bottleneck, we design a causal discovery method based on the variational autoencoder. Experimental results show that the information bottleneck-based method improves accuracy by 10% on synthetic data and 4% on real-world data.
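
The abstract does not state the exact objective, so the following is only a minimal sketch of a generic variational information bottleneck loss in the style of deep VIB, not the authors' causal information bottleneck. It assumes a Gaussian encoder q(z|x) with a standard normal prior and a Gaussian decoder, so that the compression term is a KL divergence (whose average over samples upper-bounds I(X; Z)) and the extraction term is a squared error. The function names, the `beta` trade-off weight, and the per-sample formulation are all illustrative assumptions.

```python
import math


def gaussian_kl_to_standard_normal(mu, log_var):
    """KL( N(mu, diag(exp(log_var))) || N(0, I) ), summed over latent dimensions."""
    return 0.5 * sum(
        math.exp(lv) + m * m - 1.0 - lv for m, lv in zip(mu, log_var)
    )


def vib_loss(mu, log_var, y_true, y_pred, beta=1.0):
    """Hypothetical per-sample variational IB loss (assumption, not the paper's).

    compression: beta * KL(q(z|x) || N(0, I)), penalising information kept about x
    extraction:  0.5 * squared error between y_true and y_pred, i.e. a Gaussian
                 negative log-likelihood up to an additive constant
    """
    compression = gaussian_kl_to_standard_normal(mu, log_var)
    extraction = 0.5 * sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    return extraction + beta * compression


if __name__ == "__main__":
    # Encoder output for one sample: 2-dimensional latent code.
    mu, log_var = [1.0, 0.0], [0.0, 0.0]
    print(vib_loss(mu, log_var, y_true=[2.0], y_pred=[1.0], beta=0.5))
```

Under these assumptions, a larger `beta` forces a more compressed hidden variable, while a smaller `beta` favours predicting the effect variable accurately. How such a loss would be compared across candidate causal directions is not specified in the abstract and is left open here.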

Key words: Causal discovery, Causal information bottleneck, Information bottleneck, Mining causality, Variational autoencoder

CLC Number: TP301.6