T. Tran, D.Q. Phung, and S. Venkatesh.
Embedded restricted Boltzmann machines for fusion of mixed data
type and applications in social measurements analysis.
In Proc. of the 15th International Conference on Information
Fusion (FUSION), Singapore, 2012.
Analysis and fusion of social measurements is important to understand
what shapes public opinion and the sustainability of global
development. However, modeling data collected from social responses
is challenging as the data is typically complex and heterogeneous,
which might take the form of stated facts, subjective assessment,
choices, preferences or any combination thereof. Model-wise, these
responses are a mixture of data types including binary, categorical,
multicategorical, continuous, ordinal, count and rank data. The challenge
is therefore to effectively handle mixed data in a unified fusion
framework in order to perform inference and analysis. To that end,
this paper introduces eRBM (Embedded Restricted Boltzmann Machine),
a probabilistic latent variable model that can represent mixed
data using a layer of hidden variables transparent across different
types of data. The proposed model can comfortably support large-scale
data analysis tasks, including distribution modelling, data completion,
prediction and visualisation. We demonstrate these versatile features
on several moderate- and large-scale publicly available social survey
datasets.