The following six examples provide an idea of the language-like syntax of ULFs. The first two are from the Tatoeba database, the next three are from The Little Prince (which was used for the first AMR-annotated corpus), and the last is from the Web:

  1. Could you dial for me?
    (((pres could.aux-v) you.pro (dial.v {ref1}.pro 
                               (adv-a (for.p me.pro)))) ?)
    
  2. If I were you I would be able to succeed.
    ((if.ps (I.pro ((cf were.v) (= you.pro))))
     (I.pro ((cf will.aux-s) 
          (be.v (able.a (to succeed.v)))))) 
    
  3. He neglected three little bushes
    (he.pro ((past neglect.v) 
          (three.d (little.a (plur bush.n)))))
    
  4. Flowers are weak creatures
    ((k (plur flower.n)) ((pres be.v) 
                       (weak.a (plur creature.n))))
    
  5. My drawing is not a picture of a hat
    ((my.d drawing.n) ((pres be.v) not
                    (= (a.d (picture-of.n (a.d hat.n))))))
    
  6. Very few people still debate the fact that the earth is heating up
    (((fquan (very.mod-a few.a)) (plur person.n)) still.adv-s 
      (debate.v (the.d (n+preds fact.n 
                             (= (that ((the.d |Earth|.n) 
                                       ((pres prog) heat_up.v)))))))))
    

As can be seen, ULF structure quite closely reflects phrase structure; and the type tags of atomic constituents, such as .pro, .v, .p, .a, .d, .n, etc., are intended to echo the part-of-speech origins of these constituents, such as pronoun, verb, preposition, adjective, determiner, noun, etc., respectively. Originally, ULFs contained some \(\lambda\)-abstracts, for example to form a conjunctive predicate from postmodified nouns, but we have introduced syntactic sugar elements that relieve annotators from coding such abstracts. An example is seen in (6): The n+preds macro takes a noun and one or more predicates as complements, and these are expanded into a \(\lambda\)-abstracted conjunctive predicate in postprocessing. As a result, ULFs are relatively amenable to human creation and intuitive interpretation. Moreover, as mentioned in the Introduction, the proximity to surface structure enables NLog-like inference and more.
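To illustrate the kind of postprocessing meant here, the following is a minimal sketch of how an n+preds form might be rewritten as a \(\lambda\)-abstracted conjunctive predicate. It works on ULFs already represented as nested Python lists. The function name `expand_n_preds`, the `:l` lambda marker, the `and.cc` connective, and the generated variable names are illustrative assumptions, not the project's actual postprocessing code or output format.

```python
from itertools import count


def expand_n_preds(tree, fresh=None):
    """Rewrite every (n+preds N P1 ... Pk) as (:l x ((x N) and.cc (x P1) ...)).

    `tree` is a ULF as nested lists of atom strings. The `:l` marker and the
    `and.cc` connective are assumptions made for this sketch.
    """
    fresh = fresh if fresh is not None else count(1)
    if not isinstance(tree, list):
        return tree  # atoms are left unchanged
    # Expand inner macros first, then look at this node.
    parts = [expand_n_preds(child, fresh) for child in tree]
    if len(parts) >= 3 and parts[0] == "n+preds":
        var = f"x{next(fresh)}"  # hypothetical fresh-variable scheme
        conjuncts = [[var, pred] for pred in parts[1:]]
        body = [conjuncts[0]]
        for conjunct in conjuncts[1:]:
            body += ["and.cc", conjunct]
        return [":l", var, body]
    return parts


# The object of "debate" in example (6), with the that-clause abbreviated.
noun_phrase = ["the.d", ["n+preds", "fact.n", ["=", "THAT-CLAUSE"]]]
expanded = expand_n_preds(noun_phrase)
```

Under these assumptions, the noun and each postmodifying predicate are applied to the same bound variable, so the annotator never has to write the abstraction by hand.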

But then isn’t parsing into ULF just another variant of syntactic parsing? The essential difference is that the type tags correspond to broad semantic categories (certain types of model-theoretic functions), and as such enable us to ensure that the type structure of ULFs – their operator-operand combinations – is semantically coherent. Richard Montague’s profoundly influential work can be viewed as demonstrating the crucial importance of paying attention to the semantic types of words and phrases, and as showing that doing so leads to a view of language as very close to logic. As a result, language lends itself to inference, at least to the extent that we can resolve – or are prepared to tolerate – various forms of ambiguity, context-dependence and indexicality.

Our semantic types are neither as high-order nor as “rigid” as Montague’s, but they suffice for maintaining type coherence. In particular, quantification is first-order, i.e., it ranges over individual entities, not over predicates, etc. – though through reification of predicate meanings and sentence meanings, we can “talk about” kinds of things, kinds of actions, propositions, etc., not just ordinary objects.
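Since type coherence is a property of operator-operand combinations, a natural first step in working with ULFs is to read the bracketed form into a tree and recover the type tag on each atom. The following is a minimal sketch of that step; the function names `tokenize`, `parse`, and `type_tags` are our own for this illustration, and the treatment of tags as "whatever follows the last period" is a simplifying assumption.

```python
import re


def tokenize(text):
    """Split a ULF string into parentheses and atoms; |...| names stay whole."""
    return re.findall(r"\(|\)|\|[^|]*\|[^\s()]*|[^\s()]+", text)


def parse(text):
    """Parse one ULF string into nested Python lists of atom strings."""
    tokens = tokenize(text)
    pos = 0

    def read():
        nonlocal pos
        if pos >= len(tokens):
            raise ValueError("unexpected end of input")
        tok = tokens[pos]
        pos += 1
        if tok == "(":
            items = []
            while pos < len(tokens) and tokens[pos] != ")":
                items.append(read())
            if pos >= len(tokens):
                raise ValueError("missing closing parenthesis")
            pos += 1  # consume the ")"
            return items
        if tok == ")":
            raise ValueError("unexpected closing parenthesis")
        return tok

    tree = read()
    if pos != len(tokens):
        raise ValueError("trailing tokens after the first expression")
    return tree


def type_tags(tree):
    """Collect, in order, the suffix tag of every atom that has one."""
    if isinstance(tree, list):
        return [tag for child in tree for tag in type_tags(child)]
    # Assumption: the tag is the text after the last period in the atom.
    return [tree.rsplit(".", 1)[1]] if "." in tree else []


# Example (3): "He neglected three little bushes"
ulf3 = parse("(he.pro ((past neglect.v) (three.d (little.a (plur bush.n)))))")
tags3 = type_tags(ulf3)
```

Operators such as past and plur carry no suffix, so under this assumption they contribute no tag; a type checker would need to treat them separately.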

As soon as we take semantic types seriously in ULFs like the above, we see that certain type-shifting operators are needed to maintain type coherence. For example, in sentence (1) the phrase for me is coded as (adv-a (for.p me.pro)), rather than simply (for.p me.pro). That is because it is functioning here as a predicate modifier, semantically operating on the verbal predicate (dial.v {ref1}.pro) (dial a certain thing). Without the adv-a operator the prepositional phrase is just a 1-place predicate. Its use as a predicate is apparent in contexts like “This puppy is for me”. Note that semantically the 1-place predicate (for.p me.pro) is formed by applying the 2-place predicate for.p to the (individual-denoting) term me.pro. (Viewing \(n\)-place predicates as successively applied to their arguments, each time reducing the adicity, is in keeping with the traditions of Schönfinkel, Church, Curry, Montague, and others – hence “curried” predicates.) If we apply (for.p me.pro) to another argument, such as |Snoopy| (the name of a puppy), we obtain a truth value. So semantically, adv-a is a type-shifting operator of type (predicate \(\rightarrow\) (predicate \(\rightarrow\) predicate)): it maps a predicate to a predicate modifier, where the predicates are 1-place and thus of type (entity \(\rightarrow\) truth value). Of course, the name adv-a is intended to suggest “adverbial”, in recognition of the grammatical distinction between predicative and adverbial uses of prepositional phrases.
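The types in this paragraph can be mimicked with ordinary higher-order functions. The sketch below is a toy model only: the `INTENDED_FOR` table, the entity names, and in particular the conjunctive meaning given to `adv_a` are assumptions chosen so that the types can be exercised. The real meaning of an action modifier such as "for me" is not, in general, a property of the subject.

```python
from typing import Callable

Entity = str
Pred = Callable[[Entity], bool]    # 1-place predicate: entity -> truth value
PredMod = Callable[[Pred], Pred]   # predicate modifier: predicate -> predicate

# Toy model (assumption): which individual each thing is intended for.
INTENDED_FOR = {"Snoopy": "me", "Garfield": "you"}


def for_p(beneficiary: Entity) -> Pred:
    """Curried 2-place predicate: one argument in, a 1-place predicate out."""
    return lambda thing: INTENDED_FOR.get(thing) == beneficiary


def adv_a(pred: Pred) -> PredMod:
    """Shift a predicate into a predicate modifier.

    The conjunction below is a placeholder semantics used only to make the
    type predicate -> (predicate -> predicate) concrete.
    """
    def modifier(verbal: Pred) -> Pred:
        return lambda x: verbal(x) and pred(x)
    return modifier


for_me = for_p("me")                  # (for.p me.pro): a 1-place predicate
snoopy_is_for_me = for_me("Snoopy")   # applying it to |Snoopy| gives a truth value
for_me_adv = adv_a(for_me)            # (adv-a (for.p me.pro)): a predicate modifier
modified = for_me_adv(lambda x: True) # applying the modifier gives a predicate again
```

The point of the sketch is the shape of the calls: `for_p` consumes its arguments one at a time, and `adv_a` returns something that must be applied to a predicate before it can be applied to an entity.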

Then there is the issue of intensionality. For example, (2) is a counterfactual conditional, and the consequent clause “I would be able to succeed” is not evaluated in the actual world, but in a possible world where the (patently false) antecedent is imagined to be true. ULF and deeper LFs derived from it are based on a semantics where sentences are evaluated in possible situations (episodes), whose maxima are possible worlds. Details about syntactic forms and semantic types in the Episodic Logic approach to LF have been provided in many past publications (Hwang, 1992; Hwang & Schubert, 1994; Schubert & Hwang, 2000). This project website also contains separate posts that describe the syntax and semantic types in depth.

We don’t want to dive too deep for this introduction, but we note some further type-shifting operators in the examples to clarify the role of type-shifters in ULF. The operator to (synonym: ka) in (2) shifts a verbal predicate to a kind (type) of action or attribute, which is an abstract individual; k in (4) shifts a nominal predicate to a kind of thing (so the subject here is the abstract kind, flowers, whose instances consist of sets of flowers); and that in (6) produces a reified proposition (again an abstract individual) from a sentence meaning. Through these type shifts, we are able to maintain a simple, classical view of predication, while allowing greater expressivity than the most widely employed logical forms, for example enabling generalized quantification (as in (6)), modification, reification, and other forms of intensionality.
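One way to see why these shifts keep predication simple is that their results can sit wherever an ordinary individual-denoting term can. The following rough sketch checks that property on ULFs given as nested Python lists. The helper `is_term` and its rules are simplifying assumptions of ours: it recognizes pronouns, bare |...| names, simple determiner phrases, and the reifying operators named above, and it does not cover more complex quantifier phrases such as the subject of (6).

```python
REIFIERS = {"k", "ka", "to", "that"}  # reifying operators named in the text


def is_term(expr):
    """Return True if expr counts, on this simplified view, as a term."""
    if isinstance(expr, str):
        is_name = len(expr) > 2 and expr.startswith("|") and expr.endswith("|")
        return expr.endswith(".pro") or is_name
    if len(expr) != 2:
        return False
    head = expr[0]
    if not isinstance(head, str):
        return False
    # A reifier or a determiner applied to one operand yields a term.
    return head in REIFIERS or head.endswith(".d")


flowers_pred = ["plur", "flower.n"]    # a nominal predicate
flowers_kind = ["k", flowers_pred]     # the kind "flowers", as in (4)
succeeding = ["to", "succeed.v"]       # a kind of action, as in (2)
heating = ["that", [["the.d", "|Earth|.n"], [["pres", "prog"], "heat_up.v"]]]
```

Here `flowers_pred` and the bare verb are predicates, while `flowers_kind`, `succeeding`, and `heating` all pass the same test as me.pro or |Snoopy|, which is what lets them serve as arguments of ordinary predicates.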

The positioning of (adv-a (for.p me.pro)) within the verbal predicate it modifies, rather than in prefix-operator position, already indicates a certain looseness in the ULF syntax, as opposed to the rigidity of formal logic. This is unproblematic because we restrict the way operators may combine with operands so that type consistency is assured – and in fact in subsequent processing, any (adv-a (...)) constituents of a verbal predicate are moved so as to immediately precede that predicate. There are a number of further kinds of looseness in ULFs, but we defer further discussion to separate posts.
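The repositioning step mentioned here can be sketched as a small tree rewrite over ULFs represented as nested Python lists. This is an illustration under our own assumptions, not the project's implementation: the helper names are hypothetical, a verbal predicate is recognized only by a .v head (optionally wrapped in pres, past, or cf), and when several (adv-a ...) constituents occur, the first one is placed outermost.

```python
TENSES = {"pres", "past", "cf"}


def is_adv_a(node):
    """True for constituents of the form (adv-a <predicate>)."""
    return isinstance(node, list) and len(node) == 2 and node[0] == "adv-a"


def is_verb_head(node):
    """True for a .v atom, or a tensed verb such as (past neglect.v)."""
    if isinstance(node, str):
        return node.endswith(".v")
    return (len(node) == 2 and node[0] in TENSES
            and isinstance(node[1], str) and node[1].endswith(".v"))


def lift_adv_a(tree):
    """Move (adv-a ...) constituents so they immediately precede the
    verbal predicate they occur in."""
    if not isinstance(tree, list):
        return tree
    parts = [lift_adv_a(child) for child in tree]
    if parts and is_verb_head(parts[0]):
        advs = [p for p in parts if is_adv_a(p)]
        rest = [p for p in parts if not is_adv_a(p)]
        if advs:
            result = rest[0] if len(rest) == 1 else rest
            for adv in reversed(advs):
                result = [adv, result]  # modifier applied in prefix position
            return result
    return parts


# The verbal predicate of example (1).
dial = ["dial.v", "{ref1}.pro", ["adv-a", ["for.p", "me.pro"]]]
lifted = lift_adv_a(dial)
```

After the rewrite, the modifier is in operator position and its operand is the complete verbal predicate (dial.v {ref1}.pro), which matches the type given earlier for the result of adv-a.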

To wrap up this introductory discussion, we note a general concern that might be raised about ULFs. Since they largely conform with surface syntax, they are clearly language-specific. Isn’t the point of semantics to get at the deeper meanings underlying the surface forms of language, and shouldn’t these be somewhat uniform across languages? Our answer is two-fold: First, from a semantic perspective, the ULFs for different languages will have certain essential commonalities, namely, means to express predication, truth-functional and other connectives, generalized quantifiers, predicate and sentence modification, predicate and sentence reification, implicit and explicit reference to events/situations, comparatives, and a few other devices. Surface order is less important than these semantic commonalities. Second, we do think that sentence meanings should be factored into (as far as possible) minimal, separately usable, canonical propositions. This seems plausible both from speculations in cognitive science about “Mentalese”, and from a practical perspective, since canonicalization ensures that connections between ideas can be quickly recognized and used for inference. This canonicalization is the topic of the next post of the ULF introduction. Of course the meaning of sentences “in the wild” can be much more complex and subtle. Our hypothesis is that we can conquer those complexities effectively by starting with a type-coherent surface form, and systematically deriving canonical forms, bringing to bear many different kinds of influential factors. The next subsection elaborates on this view.

Next: ULF Intro 2

References

  1. Hwang, C. H. (1992). A logical approach to narrative understanding [PhD thesis]. University of Alberta.
  2. Hwang, C. H., & Schubert, L. K. (1994). Interpreting Tense, Aspect and Time Adverbials: A Compositional, Unified Approach. Proceedings of the First International Conference on Temporal Logic, 238–264.
  3. Schubert, L. K., & Hwang, C. H. (2000). In L. M. Iwańska & S. C. Shapiro (Eds.), Natural Language Processing and Knowledge Representation (pp. 111–174). MIT Press.