We describe a method for automatically learning head-transducer models of translation from examples, each consisting of a transcribed spoken utterance and a reference translation of that utterance. The method first searches for a hierarchical alignment (specifically, a synchronized dependency tree) of each training example. The alignments produced are optimal with respect to a cost function that takes into account co-occurrence statistics and the recursive decomposition of the example into aligned substrings. A probabilistic head-transducer model is then constructed from the alignments. We report results of applying the method to English-to-Spanish translation in the domain of air travel information and to English-to-Japanese translation in the domain of telephone operator assistance. We also report on a variation of this model-construction method in which multi-word pairings are used in computing the hierarchical alignments and the head-transducer models.
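To make the notion of co-occurrence statistics concrete, the following is a minimal sketch of one way such statistics could be gathered from the training examples. The abstract does not say which statistic the cost function uses; the phi correlation coefficient below is an assumed stand-in, and the function name `phi_scores` and the toy bitext are hypothetical illustrations, not material from this work.

```python
from collections import Counter
from math import sqrt

def phi_scores(bitext):
    """Phi correlation for each (source word, target word) pair.

    Assumption: phi is used here only as an illustrative association
    measure; the source does not specify the statistic.
    """
    n = len(bitext)
    src_count, tgt_count, pair_count = Counter(), Counter(), Counter()
    for src, tgt in bitext:
        # Count each word at most once per example.
        src_words, tgt_words = set(src.split()), set(tgt.split())
        src_count.update(src_words)
        tgt_count.update(tgt_words)
        pair_count.update((s, t) for s in src_words for t in tgt_words)

    scores = {}
    for (s, t), both in pair_count.items():
        a = both                    # examples containing both words
        b = src_count[s] - both     # source word only
        c = tgt_count[t] - both     # target word only
        d = n - a - b - c           # neither word
        denom = sqrt((a + b) * (a + c) * (b + d) * (c + d))
        scores[(s, t)] = (a * d - b * c) / denom if denom else 0.0
    return scores

# Hypothetical toy examples in the air travel domain.
bitext = [
    ("show flights", "muestre vuelos"),
    ("show fares", "muestre tarifas"),
    ("list flights", "liste vuelos"),
]
scores = phi_scores(bitext)
```

In this sketch, word pairs that consistently appear together across examples receive a high score, and pairs that appear together only incidentally receive a low or negative one. A cost function for alignment search could then favor pairings with strong association, although how the scores are combined with the recursive decomposition is not shown here.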