Lecture 8: Evaluation

  • Information about midterm
  • PCFG
    • Every derivation starts from the start symbol S
    • ∑_γ Pr(A -> γ | A) = 1
      • The (conditional) probabilities of all expansions of a nonterminal A have to sum to one
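The sum-to-one constraint can be checked mechanically. The following is a minimal sketch, assuming a toy grammar stored as a dictionary; the grammar, its probabilities, and the name `is_proper` are illustrative and not from the lecture.

```python
import math
from collections import defaultdict

# Hypothetical toy grammar: (lhs, rhs) -> Pr(lhs -> rhs | lhs)
rules = {
    ("S", ("NP", "VP")): 1.0,
    ("VP", ("V", "NP")): 1.0,
    ("NP", ("Adj", "N")): 0.3,
    ("NP", ("people",)): 0.4,
    ("NP", ("peanuts",)): 0.3,
    ("V", ("eat",)): 1.0,
    ("Adj", ("roasted",)): 1.0,
    ("N", ("people",)): 0.5,
    ("N", ("peanuts",)): 0.5,
}

def is_proper(rules, tol=1e-9):
    """Return True if, for every left-hand side A, the probabilities
    of all rules A -> gamma sum to one (within tol)."""
    totals = defaultdict(float)
    for (lhs, _rhs), p in rules.items():
        totals[lhs] += p
    return all(math.isclose(t, 1.0, abs_tol=tol) for t in totals.values())
```

Calling `is_proper(rules)` on the toy grammar returns `True`; changing any single probability makes it return `False`.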
    • Likelihood of the observations: Pr(O = o1,o2,…,on | µ)
      • HMM: Forward algorithm
      • PCFG: Inside algorithm (the inside pass of Inside-Outside)
    • Guess Z: argmax_(Z)[ Pr(Z | O, µ) ]
      • HMM: use Viterbi to get Z
      • PCFG: use Viterbi CKY to get Z
      • *Z is the best sequence of states (for a PCFG, the best parse tree)
    • Guess µ: argmax_(µ)[ Pr(O | µ) ]
      • HMM: use Forward-Backward to get µ
      • PCFG: use Inside-Outside to get µ
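For the PCFG side, the likelihood and best-parse computations share one CKY chart and differ only in how alternative analyses are combined (sum versus max). Below is a minimal sketch, assuming a grammar in Chomsky normal form. The toy grammar and the function names are hypothetical, and backpointers for recovering the best tree are omitted for brevity.

```python
from collections import defaultdict

# Hypothetical CNF grammar (illustrative probabilities).
binary = {("S", "NP", "VP"): 1.0, ("VP", "V", "NP"): 1.0, ("NP", "Adj", "N"): 0.3}
lexical = {("NP", "people"): 0.4, ("NP", "peanuts"): 0.3, ("V", "eat"): 1.0,
           ("Adj", "roasted"): 1.0, ("N", "people"): 0.5, ("N", "peanuts"): 0.5}

def cky(words, combine):
    """Fill a CKY chart; chart[i, j, A] scores nonterminal A over words[i:j].
    combine is addition for the inside probability and max for Viterbi."""
    n = len(words)
    chart = defaultdict(float)
    for i, w in enumerate(words):
        for (a, word), p in lexical.items():
            if word == w:
                chart[i, i + 1, a] = combine(chart[i, i + 1, a], p)
    for width in range(2, n + 1):
        for i in range(n - width + 1):
            j = i + width
            for k in range(i + 1, j):  # split point
                for (a, b, c), p in binary.items():
                    left = chart.get((i, k, b), 0.0)
                    right = chart.get((k, j, c), 0.0)
                    if left and right:
                        chart[i, j, a] = combine(chart[i, j, a], p * left * right)
    return chart.get((0, n, "S"), 0.0)

def inside_probability(words):
    """Pr(O | mu): sum over all parses of the sentence."""
    return cky(words, lambda old, new: old + new)

def viterbi_probability(words):
    """Probability of the single best parse."""
    return cky(words, max)
```

On an unambiguous sentence such as `"people eat roasted peanuts".split()` the two functions return the same value. They differ only when a sentence has more than one parse.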
    • Example:
      • Sentence:
        • S
          • NP: people
          • VP
            • V: eats
            • NP
              • adj: roasted
              • N: peanuts
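      • Problem: a plain PCFG does not look at which word fills which position, so both sentences use the same rules and get the same probability:
        • Pr_µ(peanuts eat roasted people) = Pr_µ(people eat roasted peanuts)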
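The equality can be seen by multiplying out the rule probabilities of each derivation. The sketch below assumes a hypothetical toy grammar with made-up probabilities. Both sentences use exactly the same multiset of rules, so the products agree.

```python
import math

# Hypothetical rule probabilities (illustrative only).
prob = {
    "S -> NP VP": 1.0,
    "NP -> N": 0.6,
    "NP -> Adj N": 0.4,
    "VP -> V NP": 1.0,
    "V -> eat": 1.0,
    "Adj -> roasted": 1.0,
    "N -> people": 0.7,
    "N -> peanuts": 0.3,
}

def derivation_probability(subject, obj):
    """Probability of the parse of '<subject> eat roasted <obj>':
    the product of the probabilities of every rule used."""
    used = [
        "S -> NP VP",
        "NP -> N", f"N -> {subject}",
        "VP -> V NP", "V -> eat",
        "NP -> Adj N", "Adj -> roasted", f"N -> {obj}",
    ]
    return math.prod(prob[rule] for rule in used)

sensible = derivation_probability("people", "peanuts")
odd = derivation_probability("peanuts", "people")
```

Swapping subject and object only reorders the factors `N -> people` and `N -> peanuts`, so `sensible` and `odd` are equal up to floating-point rounding.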
      • We can try to generate the head word of each phrase:
        • S (Head: eat)
          • NP (Head: people): people
          • VP (Head: eat)
            • V (Head: eat): eats
            • NP (Head: peanut)
              • adj (Head: N/A): roasted
              • N (Head: peanut): peanuts
      • With heads in the rules, we should have: Pr[ S (Head: eat) -> NP(Head: people) VP(Head: eat) ] > Pr[ S (Head: eat) -> NP(Head: peanut) VP(Head: eat) ]
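One way such an inequality could arise is by estimating lexicalized rule probabilities as relative frequencies. The sketch below uses invented counts purely for illustration. They are not from any treebank, and the function name `rule_probability` is hypothetical.

```python
from collections import Counter

# Hypothetical counts of how S(eat) was expanded in some training data.
counts = Counter({
    ("S(eat)", "NP(people) VP(eat)"): 9,
    ("S(eat)", "NP(peanut) VP(eat)"): 1,
})

def rule_probability(lhs, rhs):
    """Relative-frequency estimate of Pr(lhs -> rhs | lhs)."""
    total = sum(c for (l, _r), c in counts.items() if l == lhs)
    return counts[lhs, rhs] / total

p_people = rule_probability("S(eat)", "NP(people) VP(eat)")
p_peanut = rule_probability("S(eat)", "NP(peanut) VP(eat)")
```

With these toy counts the rule with the subject head `people` gets the larger probability, which is the preference the plain PCFG could not express.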
    • Dependency representation:
      • Sentence:
        • eat
          • people
            • the
          • peanuts
            • roasted
      • Lexical (bottom-up): heads are passed up from the words
      • e.g. NP -> det N
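A dependency structure like the one above can be read off a phrase-structure tree by passing heads upward. This is a minimal sketch, assuming a hypothetical head table (`HEAD_CHILD`) that says which child supplies the head of each phrase. The tree encoding and function name are also illustrative.

```python
# Leaves are (tag, word); phrases are (label, [children]).
tree = ("S", [
    ("NP", [("Det", "the"), ("N", "people")]),
    ("VP", [("V", "eat"),
            ("NP", [("Adj", "roasted"), ("N", "peanuts")])]),
])

# Hypothetical head rules: the child label that provides the head word.
HEAD_CHILD = {"S": "VP", "VP": "V", "NP": "N"}

def head_and_edges(node, edges):
    """Return the head word of node, working bottom-up, and append a
    (head, dependent) edge for every non-head child along the way."""
    label, body = node
    if isinstance(body, str):  # leaf: the word is its own head
        return body
    heads = [head_and_edges(child, edges) for child in body]
    head_index = next(i for i, child in enumerate(body)
                      if child[0] == HEAD_CHILD[label])
    for i, h in enumerate(heads):
        if i != head_index:
            edges.append((heads[head_index], h))
    return heads[head_index]

edges = []
root = head_and_edges(tree, edges)
```

For the example tree, `root` is `eat` and `edges` contains the four head-dependent pairs of the dependency diagram: `eat` governs `people` and `peanuts`, `people` governs `the`, and `peanuts` governs `roasted`.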
  • Evaluation
    • Reference reading: How Evaluation Guides AI Research
      • Intrinsic evaluation
      • Extrinsic evaluation
    • Kappa statistic (agreement evaluation)
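The notes do not say which kappa is meant. Assuming it is Cohen's kappa for two annotators, it corrects the observed agreement P_o for the agreement P_e expected by chance: κ = (P_o − P_e) / (1 − P_e). A minimal sketch follows, with a hypothetical function name.

```python
from collections import Counter

def cohen_kappa(labels_a, labels_b):
    """Cohen's kappa for two annotators who labeled the same items.
    Undefined (division by zero) when chance agreement equals one."""
    n = len(labels_a)
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    freq_a, freq_b = Counter(labels_a), Counter(labels_b)
    # Chance agreement: both annotators pick the same label independently.
    expected = sum(freq_a[c] * freq_b[c] for c in freq_a) / (n * n)
    return (observed - expected) / (1 - expected)
```

For example, with labels `["y", "y", "n", "n"]` and `["y", "n", "n", "n"]` the observed agreement is 0.75, the chance agreement is 0.5, and kappa is 0.5.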
    • Metrics: precision and recall
    • How do we evaluate two structures that could generate the same sentence?
      • Answer: generate more than one output for each input, convert the outputs into a set, and use precision and recall to measure that set against the reference.
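Once outputs are represented as sets, precision and recall reduce to set intersection. The sketch below assumes, for illustration only, that the set items are labeled spans `(label, start, end)`. The notes do not specify what the set elements are.

```python
def precision_recall(predicted, gold):
    """Precision = |predicted ∩ gold| / |predicted|,
    recall = |predicted ∩ gold| / |gold|."""
    predicted, gold = set(predicted), set(gold)
    correct = len(predicted & gold)
    precision = correct / len(predicted) if predicted else 0.0
    recall = correct / len(gold) if gold else 0.0
    return precision, recall

# Hypothetical labeled spans for "people eat roasted peanuts".
gold = {("S", 0, 4), ("NP", 0, 1), ("VP", 1, 4), ("NP", 2, 4)}
predicted = {("S", 0, 4), ("NP", 0, 1), ("VP", 1, 3)}

precision, recall = precision_recall(predicted, gold)
```

Here two of the three predicted items are in the reference set (precision 2/3), and two of the four reference items were found (recall 1/2).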
    • Reader evaluation:
      • If the reader's score agrees with the machine's score, stop
      • Otherwise, let another reader read the essay
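The procedure above can be sketched as a loop over readers. Several details here are assumptions rather than part of the notes: that "agree" means the scores differ by at most a tolerance, that each additional reader is again compared with the machine, and the function name `consult_readers`.

```python
def consult_readers(machine_score, reader_scores, tolerance=0):
    """Consult readers one at a time until one agrees with the machine.
    Returns (agreed, readers_used)."""
    readers_used = 0
    for score in reader_scores:
        readers_used += 1
        if abs(score - machine_score) <= tolerance:
            return True, readers_used  # agreement reached: stop
    return False, readers_used  # ran out of readers without agreement
```

With a machine score of 4 and reader scores `[3, 4, 5]`, the second reader agrees, so the third is never consulted.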
