Lecture 2: Lexical association measures and hypothesis testing

Pre-lecture readings
Lexical association
Named entities: http://www.nltk.org/book/ch07.html
Information extraction architecture: raw text -> sentence segmentation -> tokenization -> part-of-speech tagging -> entity detection -> relation detection
Chunking: segments and labels multi-token sequences, as illustrated in 2.1.
Noun-phrase (NP) chunking
Tag patterns: describe sequences of tagged words
Chunking with Regular Expressions
Exploring Text Corpora
Chinking: define a chink to be a sequence of tokens that is not included in
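A tag pattern can be tried out without any library. The sketch below is a minimal illustration, not the NLTK implementation: it assumes a single NP pattern (optional determiner, any number of adjectives, then a noun), encodes the tag sequence as a string, and matches it with Python's re module. The function name np_chunk and the output format are made up for this example.

```python
import re

# Assumed NP tag pattern: optional determiner, any adjectives, one noun.
NP_PATTERN = r"(?:<DT>)?(?:<JJ>)*<NN>"


def np_chunk(tagged, pattern=NP_PATTERN):
    """Group tokens whose tag sequence matches `pattern` into NP chunks.

    `tagged` is a list of (word, tag) pairs. The result is a list in which
    each chunk appears as ("NP", [(word, tag), ...]) and every other token
    is left as its original (word, tag) pair.
    """
    # Encode the tag sequence as one string, e.g. "<DT><JJ><NN><VBD>".
    tag_string = "".join(f"<{tag}>" for _, tag in tagged)

    result = []
    next_token = 0  # index of the first token not yet emitted
    for match in re.finditer(pattern, tag_string):
        # Each tag contributes exactly one "<", so counting "<" converts
        # character offsets back into token indices.
        start = tag_string[:match.start()].count("<")
        length = match.group().count("<")
        result.extend(tagged[next_token:start])           # tokens outside any chunk
        result.append(("NP", tagged[start:start + length]))
        next_token = start + length
    result.extend(tagged[next_token:])
    return result


sentence = [("the", "DT"), ("little", "JJ"), ("yellow", "JJ"), ("dog", "NN"),
            ("barked", "VBD"), ("at", "IN"), ("the", "DT"), ("cat", "NN")]
for item in np_chunk(sentence):
    print(item)
```

Because the closing bracket is part of each tag in the encoded string, <NN> does not accidentally match <NNS>; plural nouns would need their own alternative in the pattern.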

CMSC773: HW1

Question 2:
Word order:
Explanation: Word order refers to how the words of a sentence are arranged.
Alexa, when is your birthday? (Alexa answers)
Alexa, when your birthday is? (Alexa answers)
This tests whether Alexa handles incorrect word order.
Inflectional morphology:
Explanation:
Question 3:
Question 4:
Reference: http://statweb.stanford.edu/~serban/116/bayes.pdf
Question 5:
Question 6:
New definitions:
log-entropy weighting
cosine
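For the two new definitions, a small sketch may help. It assumes the common formulation of log-entropy weighting used with term-document matrices: the local weight is log(tf + 1) and the global weight is 1 plus the term's normalized entropy sum over documents. The homework may define it differently, so treat the formula and the function names here as assumptions.

```python
import math


def log_entropy_weights(counts):
    """Apply log-entropy weighting to a term-document count matrix.

    counts[i][j] is the raw count of term i in document j.
    Assumes at least two documents, since the global weight divides by log(n_docs).
    """
    n_docs = len(counts[0])
    weighted = []
    for row in counts:
        total = sum(row)  # global frequency of the term
        entropy_sum = 0.0
        for tf in row:
            if tf > 0:
                p = tf / total
                entropy_sum += p * math.log(p)
        # Global weight: 1 for a term confined to one document,
        # 0 for a term spread evenly over all documents.
        g = 1.0 + entropy_sum / math.log(n_docs)
        # Local weight: log(tf + 1).
        weighted.append([g * math.log(tf + 1) for tf in row])
    return weighted


def cosine(u, v):
    """Cosine of the angle between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)
```

Under this formulation a term that occurs equally often in every document gets weight 0 everywhere, so it contributes nothing to the cosine between two document vectors.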

CMSC773: pre-course knowledge

Dependency Parsing
Dependency: focuses on relations between words
  Typed: a label indicates the relationship between words
  Untyped: only records which words depend on which
Phrase structure: focuses on identifying phrases and their recursive structure
Dependency parsing methods:
  Shift-reduce: predicts from left to right; fast, but slightly less accurate (MaltParser)
  Spanning tree: calculates the full tree at once; slightly more accurate,
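The left-to-right behaviour of shift-reduce parsing can be seen in a small sketch. This is an illustration under assumptions, not MaltParser's code: it uses the arc-standard transition set (SHIFT, LEFT, RIGHT), produces untyped arcs, and takes a hand-written action sequence. A real parser predicts each action with a trained classifier.

```python
def shift_reduce_parse(words, actions):
    """Run arc-standard transitions and return untyped arcs as (head, dependent).

    Words are numbered from 1; index 0 is an artificial ROOT node.
    `actions` is a list of "SHIFT", "LEFT", or "RIGHT" strings.
    """
    stack = [0]                               # starts with ROOT
    buffer = list(range(1, len(words) + 1))   # unread words, left to right
    arcs = []
    for action in actions:
        if action == "SHIFT":
            # Move the next unread word onto the stack.
            stack.append(buffer.pop(0))
        elif action == "LEFT":
            # The second item on the stack becomes a dependent of the top item.
            dependent = stack.pop(-2)
            arcs.append((stack[-1], dependent))
        elif action == "RIGHT":
            # The top item becomes a dependent of the item below it.
            dependent = stack.pop()
            arcs.append((stack[-1], dependent))
        else:
            raise ValueError(f"unknown action: {action}")
    return arcs


words = ["I", "saw", "her"]
actions = ["SHIFT", "SHIFT", "LEFT", "SHIFT", "RIGHT", "RIGHT"]
for head, dep in shift_reduce_parse(words, actions):
    head_word = "ROOT" if head == 0 else words[head - 1]
    print(f"{head_word} -> {words[dep - 1]}")
```

Each word is shifted once and attached once, so the number of transitions grows linearly with sentence length, which is why this family of parsers is fast. The decisions are made greedily, which is one reason for the small accuracy gap noted above.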