Skip to Main Content
 

Global Search Box

 
 
 
 

Files

ETD Abstract Container

Abstract Header

Logical specification of finite-state transductions for natural language processing

Vaillette, Nathan

Abstract Details

2004, Doctor of Philosophy, Ohio State University, Linguistics.

This thesis is concerned with the use of a logical language for specifying mappings between strings of symbols; specifically, the regular relations, those which can be computed by finite-state transducers. Because of their efficiency and flexibility, regular relations and finite-state transducers are widely used in Natural Language Processing (NLP) for tasks such as grapheme-to-phoneme conversion, morphological analysis and generation, and shallow syntactic parsing. By exploiting logical representations for finite-state transductions, the technique advocated in this thesis combines efficient processing with the advantages of declarative specification, thus taking a step in the direction of providing finite-state NLP with the best of both worlds.

Previous work has demonstrated how all sets of strings recognized by finite-state automata can be described in monadic second-order logic. A formula of this logic describing a set can be automatically compiled into the finite-state automaton recognizing that set. This technique unfortunately does not carry over to relations on strings without further restrictions, since the class of regular relations lacks certain crucial closure properties. In this thesis we introduce the logical language MSO(SLR), a language for same-length relations, a proper subset of the regular relations which has the necessary closure properties. We discuss how a formula of MSO(SLR) describing a relation can be automatically compiled into the finite-state transducer implementing that relation. Although there are many regular relations which MSO(SLR) cannot describe directly, we show how MSO(SLR) can characterize such relations indirectly by describing aligned representations of them.

To demonstrate the usefulness of MSO(SLR), we use it to define the finite-state conditional replace operator φ → ψ / λ_ρ in a declarative fashion. We argue that this approach improves on previous definitions in terms of clarity, maintainability, extensibility, and formal verifiability. We justify these claims by discussing several extensions and variations of the operator and providing rigorous proofs of correctness for our definitions.

A further demonstration of MSO(SLR)’s usefulness is given in the form of definitions of the rule formalisms used in two-level morphology. As with the replace operator definition, our declarative definitions give us a compiler automatically and make extensions and formal verification easy.

Chris Brew (Advisor)
268 p.

Recommended Citations

Citations

  • Vaillette, N. (2004). Logical specification of finite-state transductions for natural language processing [Doctoral dissertation, Ohio State University]. OhioLINK Electronic Theses and Dissertations Center. http://rave.ohiolink.edu/etdc/view?acc_num=osu1072058657

    APA Style (7th edition)

  • Vaillette, Nathan. Logical specification of finite-state transductions for natural language processing. 2004. Ohio State University, Doctoral dissertation. OhioLINK Electronic Theses and Dissertations Center, http://rave.ohiolink.edu/etdc/view?acc_num=osu1072058657.

    MLA Style (8th edition)

  • Vaillette, Nathan. "Logical specification of finite-state transductions for natural language processing." Doctoral dissertation, Ohio State University, 2004. http://rave.ohiolink.edu/etdc/view?acc_num=osu1072058657

    Chicago Manual of Style (17th edition)