T. Tran, D.Q. Phung, and S. Venkatesh. Embedded restricted Boltzmann machines for fusion of mixed data type and applications in social measurements analysis. In Proc. of the 15th International Conference on Information Fusion (FUSION), Singapore, 2012.

Analysis and fusion of social measurements is important to understand what shapes the public's opinion and the sustainability of global development. However, modelling data collected from social responses is challenging because the data is typically complex and heterogeneous, and may take the form of stated facts, subjective assessments, choices, preferences or any combination thereof. Model-wise, these responses are a mixture of data types including binary, categorical, multicategorical, continuous, ordinal, count and rank data. The challenge is therefore to handle mixed data effectively in a unified fusion framework in order to perform inference and analysis. To that end, this paper introduces eRBM (Embedded Restricted Boltzmann Machine), a probabilistic latent variable model that can represent mixed data using a layer of hidden variables transparent across different types of data. The proposed model can comfortably support large-scale data analysis tasks, including distribution modelling, data completion, prediction and visualisation. We demonstrate these versatile features on several moderate and large-scale publicly available social survey datasets.
