Conditional Random Fields (CRFs) are undirected graphical models that can be used to define the joint probability distribution over label sequences given a set of observation sequences to be labeled. A key advantage of CRFs is their flexibility to include a wide variety of non-independent features of the input. Faced with this freedom, an important question remains: what features should be used?
This thesis describes two techniques for deriving novel features for use in CRF-based phone recognition, extending previous techniques that incorporated multiclass posteriors of phone classes or phonological features estimated by Multi-Layer Perceptrons (MLPs).
The first technique investigates the integration of suprasegmental knowledge into the MLP classification system that is part of the CRF recognizer. CRFs are used to integrate MLP posterior estimates, particularly of phonological features or phonetic classes, which stand in as representations of the acoustics; this thesis shows that incorporating suprasegmental information as part of the MLP classification system augments the acoustic space in a way that benefits phonological-feature-based CRF models. TIMIT phone recognition experiments show a small but statistically significant improvement due to both techniques.
The second technique combines phonological feature scores from two different systems, which gives a statistically significant improvement in CRF-based TIMIT phone recognition, despite a standalone system based on their features performing significantly worse. We then explore the reasons for this improvement by examining different representations of phonological attribute classifiers, in terms of what they are classifying (binary versus n-ary features), the feature definition, the training paradigm and the representation of scoring functions. The analysis leads to the conclusions that different databases give robustness, and that binary-ness, feature definition and score representation do not contribute to the performance improvement.