Lecture 5: Reduced-dimensionality representations for documents: Gibbs sampling and topic models

Watch the new talk and write a summary. Noah Smith: squash network. Main points: the difference between LSA & SVD; Bayesian graphical models; informative priors are useful in the model. Bayesian network: DAG; X1, X2, …, Xn; Po(X1, X2, …, Xn). Generative story: HMM (dependencies). A and B are conditionally independent given C iff P(A,B|C) = P(A|C) * P(B|C) …
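The conditional-independence definition above can be checked numerically on a small joint table. The following is a minimal sketch, not part of the lecture: the function name `is_cond_independent`, the binary variables, and all probability values are assumptions chosen for illustration.

```python
from itertools import product

def is_cond_independent(joint, tol=1e-9):
    """Check whether A and B are conditionally independent given C.

    `joint` maps (a, b, c) -> P(A=a, B=b, C=c). Hypothetical helper.
    """
    a_vals = {a for a, _, _ in joint}
    b_vals = {b for _, b, _ in joint}
    c_vals = {c for _, _, c in joint}
    for c in c_vals:
        p_c = sum(p for (_, _, cc), p in joint.items() if cc == c)
        if p_c == 0:
            continue  # conditioning on a zero-probability event is undefined
        for a, b in product(a_vals, b_vals):
            p_ab = joint.get((a, b, c), 0.0) / p_c
            p_a = sum(joint.get((a, bb, c), 0.0) for bb in b_vals) / p_c
            p_b = sum(joint.get((aa, b, c), 0.0) for aa in a_vals) / p_c
            # Definition: P(A,B|C) = P(A|C) * P(B|C)
            if abs(p_ab - p_a * p_b) > tol:
                return False
    return True

# Joint built as P(C) * P(A|C) * P(B|C), so independence holds by construction.
p_c = {0: 0.4, 1: 0.6}
p_a1 = {0: 0.2, 1: 0.7}  # assumed P(A=1 | C=c)
p_b1 = {0: 0.5, 1: 0.1}  # assumed P(B=1 | C=c)
factored = {}
for a, b, c in product((0, 1), repeat=3):
    pa = p_a1[c] if a == 1 else 1 - p_a1[c]
    pb = p_b1[c] if b == 1 else 1 - p_b1[c]
    factored[(a, b, c)] = p_c[c] * pa * pb

# Counterexample: B always equals A, whatever C is.
copied = {(0, 0, 0): 0.25, (1, 1, 0): 0.25, (0, 0, 1): 0.25, (1, 1, 1): 0.25}

print(is_cond_independent(factored))
print(is_cond_independent(copied))
```

The first table factorizes exactly as the definition requires, while in the second one knowing A fixes B even after C is known.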

Lecture 3: Information Theory

Today’s class is about hypothesis testing, collocations, and info theory. Hypothesis testing: the last lecture covered the methodology. Collocation: “religion war”. PMI, PPMI: PMI = pointwise mutual information, PMI = log2(P(x,y)/(P(x)P(y))) = I(x,y); PPMI = positive PMI = max(0, PMI). Example: Hong Kong. The frequencies of “Hong” and “Kong” are low, but the frequency of “Hong Kong” …
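The PMI and PPMI definitions above translate directly into code. This is a minimal sketch under stated assumptions: the helper names `pmi`, `ppmi`, and `bigram_pmi`, the toy sentence, and the choice to estimate probabilities from raw unigram and adjacent-bigram counts are all illustrative and do not come from the lecture.

```python
import math
from collections import Counter

def pmi(p_xy, p_x, p_y):
    """Pointwise mutual information: log2(P(x,y) / (P(x) P(y)))."""
    return math.log2(p_xy / (p_x * p_y))

def ppmi(p_xy, p_x, p_y):
    """Positive PMI: max(0, PMI)."""
    return max(0.0, pmi(p_xy, p_x, p_y))

def bigram_pmi(tokens):
    """PMI for each adjacent word pair, using relative-frequency estimates."""
    unigrams = Counter(tokens)
    bigrams = Counter(zip(tokens, tokens[1:]))
    n_uni = len(tokens)
    n_bi = len(tokens) - 1
    return {
        (x, y): pmi(count / n_bi, unigrams[x] / n_uni, unigrams[y] / n_uni)
        for (x, y), count in bigrams.items()
    }

# Toy corpus (made up for illustration).
tokens = "hong kong is a city and the city is big and hong kong is busy".split()
scores = bigram_pmi(tokens)

# "hong" and "kong" are rare on their own but always occur together,
# so their PMI is higher than that of a looser pair such as ("city", "is").
print(scores[("hong", "kong")], scores[("city", "is")])

# A pair that co-occurs less often than chance has negative PMI; PPMI clips it to 0.
print(pmi(0.01, 0.2, 0.2), ppmi(0.01, 0.2, 0.2))
```

On real data the probabilities would come from a much larger corpus, and low-count pairs are usually filtered out because PMI overrates rare events.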

The test speed of a neural network?

Basically, the time spent on testing depends on: the complexity of the neural network. For example, the fastest network should be a fully connected network. A CNN should be faster than an LSTM because an LSTM is sequential (sequential = slow). Currently, there are many ways to compress a deep learning model (for example, removing nodes with small weights). The complexity of …
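One way to read "removing nodes with small weights" is magnitude-based neuron pruning. The sketch below is an assumption about what that could look like, not the method the post describes: the function `prune_neurons`, the L1-norm ranking, and the example weights are all hypothetical.

```python
def prune_neurons(weights, keep):
    """Keep the `keep` neurons whose incoming weights have the largest L1 norm.

    `weights` is a list of rows, one row of incoming weights per neuron.
    Returns the pruned rows and the original indices of the kept neurons.
    """
    norms = [sum(abs(w) for w in row) for row in weights]
    ranked = sorted(range(len(weights)), key=lambda i: norms[i], reverse=True)
    kept = sorted(ranked[:keep])  # preserve the original neuron order
    return [weights[i] for i in kept], kept

# Hypothetical layer with four neurons and two inputs each.
layer = [
    [0.9, -1.2],
    [0.01, 0.02],
    [0.5, 0.4],
    [-0.03, 0.0],
]

pruned, kept = prune_neurons(layer, keep=2)
print(kept)
print(pruned)
```

A smaller layer means fewer multiply-adds at test time. In a real network the next layer's weight columns for the removed neurons would also have to be dropped, and accuracy re-checked afterwards.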