ISCA Archive MLSLP 2012

Semi-supervised learning for text classification using feature affinity regularization

Bin Zhang, Mari Ostendorf

Most conventional semi-supervised learning methods incorporate unlabeled data directly into the training objective. This paper presents an alternative approach that learns feature affinity information from unlabeled data and incorporates that information into the training objective as a regularizer for a maximum entropy model. The regularization favors models in which correlated features have similar weights. The method is evaluated on text classification, where feature affinity can be computed from feature co-occurrences in unlabeled data. Experimental results show that this method consistently outperforms baseline methods.

Index Terms: semi-supervised learning, text classification, maximum entropy, feature affinity matrix, regularization
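
The idea summarized in the abstract can be illustrated with a minimal sketch. The abstract does not give the exact form of the affinity matrix or of the regularizer, so everything below is an assumption made for illustration: the affinity is taken to be cosine-normalized document co-occurrence, the penalty is a graph-Laplacian-style term that grows when strongly affiliated features receive different weights, and the function names (`cooccurrence_affinity`, `affinity_penalty`, `regularized_nll`) and the strength parameter `lam` are hypothetical.

```python
import numpy as np


def cooccurrence_affinity(X_unlabeled):
    """Feature affinity from unlabeled documents (assumed form, not the paper's).

    X_unlabeled is a (documents, features) count or indicator matrix.
    Returns a symmetric (features, features) matrix with a zero diagonal.
    """
    X = (np.asarray(X_unlabeled) > 0).astype(float)
    counts = X.T @ X                     # counts[i, j]: docs containing both i and j
    df = np.diag(counts).copy()          # document frequency of each feature
    denom = np.sqrt(np.outer(df, df))
    A = np.divide(counts, denom, out=np.zeros_like(counts), where=denom > 0)
    np.fill_diagonal(A, 0.0)             # ignore self-affinity
    return A


def affinity_penalty(W, A):
    """0.5 * sum_ij A[i, j] * ||W[:, i] - W[:, j]||^2 for symmetric A.

    W is a (classes, features) weight matrix. The sum is computed through
    the graph Laplacian L = D - A, using the identity tr(W L W^T).
    """
    L = np.diag(A.sum(axis=1)) - A
    return float(np.trace(W @ L @ W.T))


def regularized_nll(W, X, y, A, lam):
    """Mean maximum entropy (softmax) negative log-likelihood plus the penalty."""
    scores = np.asarray(X, dtype=float) @ W.T
    scores -= scores.max(axis=1, keepdims=True)   # numerical stability
    log_probs = scores - np.log(np.exp(scores).sum(axis=1, keepdims=True))
    nll = -log_probs[np.arange(len(y)), y].mean()
    return nll + lam * affinity_penalty(W, A)


# Toy unlabeled corpus: features 0 and 1 always co-occur, feature 2 never does.
X_u = np.array([[1, 1, 0],
                [1, 1, 0],
                [0, 0, 1]])
A = cooccurrence_affinity(X_u)
```

Under these assumptions, a weight matrix that assigns equal weights to features 0 and 1 incurs no penalty, while one that separates them is penalized in proportion to their affinity. The labeled-data likelihood and the unlabeled-data affinity stay decoupled, which is the contrast the abstract draws with methods that place unlabeled data directly in the objective.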