Seminar in Computational Linguistics

  • Date: –14:30
  • Location: 9-3042 &
  • Lecturer: Daniel Dakota
  • Contact person: Gongbo Tang
  • Seminar

What’s in a Span? Evaluating Constituency Span-Based Multilingual Parsing

Detailed analysis of constituency parsing is often not performed, particularly in multilingual settings, where typically only F-scores are reported. As new state-of-the-art parsers have driven a shift from traditional PCFG-based grammars to span-based approaches, there has not been a systematic, detailed examination of how such fundamentally different approaches interact with various treebanks. We attempt a non-trivial analysis of how span-based parsing performs across 11 treebanks in order to examine what overall behaviors this parsing approach exhibits and what role a treebank's specific annotation may play. We find that the parser tends to prefer flatter spans, but that the approach succeeds because it is robust enough to pick up on treebank annotation variation both internally and externally, though sometimes to its own detriment.