Talk: Learning and Efficiency of Outcomes in Games

  • By Eva Tardos
  • Repeated games
    • player’s value/cost is additive over periods while playing
    • players try to learn what is best from past data
    • what can we say about the outcome? how long do players have to stay to ensure OK social welfare?
  • Result: routing, limit for very small users
    • Theorem:
      • In any network with continuous, non-decreasing cost functions and small users,
      • the cost of Nash with rates r_i for all i <= the cost of OPT with rates 2r_i for all i
    • Nash equilibrium: a stable solution where no player has an incentive to deviate.
    • Price of Anarchy = cost of worst Nash equilibrium / cost of social optimum
  • Examples of price of anarchy bounds (a standard one is sketched below)
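A standard illustration is Pigou's two-link network (a well-known example, not spelled out in these notes):

```latex
% Pigou's example: one unit of traffic from s to t over two parallel links,
% with cost functions c_1(x) = 1 (constant) and c_2(x) = x (linear).
% Nash: all traffic takes link 2, so C_Nash = 1 * 1 = 1.
% OPT: split the traffic evenly, C_OPT = 1/2 * 1 + 1/2 * 1/2 = 3/4.
\mathrm{PoA} = \frac{C_{\mathrm{Nash}}}{C_{\mathrm{OPT}}} = \frac{1}{3/4} = \frac{4}{3}
```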
  • Price of anarchy in auctions
    • First-price auction
    • All-pay auction…
    • Other applications include:
      • public goods
      • fair sharing
      • Walrasian Mechanism
  • Repeated game that is slowly changing
    • Dynamic population model
      • at each step t, each player i is replaced with an arbitrary new player with probability p
      • in a population of N players, Np players are replaced in expectation each step (toy simulation below)
      • the population changes all the time: players need to adjust
      • players stay long enough…
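A toy simulation of the replacement model (my own check, not from the talk): with N players and per-step replacement probability p, roughly Np players turn over each step.

```python
import random

N, p, steps = 1000, 0.01, 200
players = list(range(N))
next_id = N
replaced_counts = []
for _ in range(steps):
    replaced = 0
    for i in range(N):
        if random.random() < p:      # player i leaves this step
            players[i] = next_id     # an arbitrary new player arrives
            next_id += 1
            replaced += 1
    replaced_counts.append(replaced)

print(sum(replaced_counts) / steps)  # ≈ N * p = 10 in expectation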
  • Learning in repeated game
    • what is learning?
    • does learning lead to finding a Nash equilibrium?
    • fictitious play = best respond to the past history of the other players; goal: “pre-play” as a way to learn to play Nash
  • Can we find a better idea while the game is being played?
    • Change of focus: the outcome of learning while playing
  • Nash equilibrium of the one-shot game?
    • Nash equilibrium of the one-shot game: stable actions a with no regret for any alternate strategy x:
    • cost_i(x, a_-i) >= cost_i(a) for all alternate strategies x
  • Behavior is far from stable
  • no regret without stability: learning
    • no regret: for any fixed action x (costs in [0,1]):
      • sum_t cost_i(a^t) <= sum_t cost_i(x, a_-i^t) + error
      • error <= √T; if the error is o(T), the algorithm is called no-regret
  • Outcome of no-regret learning in a fixed game
    • limit distribution sigma of play (action vectors a = (a_1, a_2, …, a_n))
  • No-regret learning as a behavior model:
    • Pro:
      • no need for a common prior or rationality assumptions about opponents
      • behavioral assumption: if there is a consistently good strategy, please notice!
      • algorithmic: many simple rules ensure approximately no regret
      • Behavior model ….
  • Distribution of smallest rationalizable multiplicative regret
    • strictly positive regret: the learning phase may be better than no-regret
  • Today (with d options), simple algorithms guarantee (one is sketched below):
    • sum_t cost_i(a^t) <= sum_t cost_i(x, a_-i^t) + √(T log d)
    • sum_t cost_i(a^t) <= (1 + epsilon) sum_t cost_i(x, a_-i^t) + log(d)/epsilon
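A minimal sketch of one such rule, the multiplicative-weights (Hedge) update, which achieves the bounds above; the setup and names are illustrative, not from the talk:

```python
import random

def hedge(cost_rounds, d, epsilon):
    """Hedge / multiplicative weights over d actions.

    cost_rounds yields one length-d list of costs in [0, 1] per round.
    Returns the total cost incurred by the sampled actions.
    """
    weights = [1.0] * d
    total = 0.0
    for costs in cost_rounds:
        norm = sum(weights)
        probs = [w / norm for w in weights]
        action = random.choices(range(d), weights=probs)[0]
        total += costs[action]
        # penalize each action multiplicatively by its observed cost
        weights = [w * (1.0 - epsilon) ** c for w, c in zip(weights, costs)]
    return total

# epsilon ≈ √(log d / T) recovers the additive √(T log d) regret bound;
# a fixed epsilon gives the (1 + epsilon)-multiplicative bound with
# log(d)/epsilon overhead.
```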
  • Quality of learning outcome
  • Proof technique: Smoothness
  • Learning and price of anarchy
  • Learning in dynamic games
    • Dynamic population model
      • at each step t, each player i is replaced with an arbitrary new player with probability p
    • how should they learn from data?
  • Need for adaptive learning
  • Adapting result to dynamic populations
    • inequality we wish to have
  • Change in optimum solution
  • Use differential privacy -> stable solution: a differentially private algorithm's output barely changes when a few players change, so the benchmark solution remains stable as the population churns

How to write MP4 with OpenCV3

When trying to write an MP4 file (H264), I tried the following code:

And I got error saying:

This problem is solved by passing the fourcc as a number directly to cv2.VideoWriter(), i.e.
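A minimal sketch of the fix; the filename, fps, and frame size are placeholders, and 0x00000021 is the numeric fourcc value suggested in the thread referenced below:

```python
import cv2

# Passing the fourcc as a number instead of
# cv2.VideoWriter_fourcc(*'H264') avoids the encoder lookup error here.
fps, size = 30, (1280, 720)
writer = cv2.VideoWriter('output.mp4', 0x00000021, fps, size)

# frames must be uint8 BGR arrays matching `size`:
# writer.write(frame)
writer.release()
```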

reference:

https://devtalk.nvidia.com/default/topic/1029451/jetson-tx2/-python-what-is-the-four-characters-fourcc-code-for-mp4-encoding-on-tx2/

Paper Reading: View Direction and Bandwidth Adaptive 360 Degree Video Streaming using a Two-Tier System

Each segment is coded as a base-tier (BT) chunk, and multiple enhancement-tier (ET) chunks.

BT chunks:

represent the entire 360 view at a low bit rate and are pre-fetched into a long display buffer, which smooths network jitter and guarantees that any desired FOV can be rendered with minimal stalls.

ET chunks:

represent portions of the 360 view (the predicted FOV) at higher bit rates and are fetched into a short buffer, so quality can adapt quickly to view-direction and bandwidth changes.
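A minimal sketch of my reading of the two-tier fetch logic (all names and thresholds are illustrative, not code from the paper):

```python
def next_fetch(bt_buffer_s, et_buffer_s, bandwidth_mbps, predicted_fov):
    """Toy two-tier scheduler: keep a long base-tier buffer for stall
    protection, then spend the remaining bandwidth on short-horizon
    enhancement-tier chunks for the predicted FOV."""
    BT_TARGET_S = 30.0   # long BT buffer smooths network jitter
    ET_TARGET_S = 2.0    # short ET buffer tracks the view direction
    if bt_buffer_s < BT_TARGET_S:
        return ("BT", "entire 360 view at low rate")
    if et_buffer_s < ET_TARGET_S:
        rate = "high" if bandwidth_mbps > 10 else "medium"
        return ("ET", f"{predicted_fov} at {rate} rate")
    return ("idle", None)

print(next_fetch(10.0, 0.5, 20.0, "front"))  # -> ('BT', ...)
print(next_fetch(35.0, 0.5, 20.0, "front"))  # -> ('ET', 'front at high rate')
```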

Facebook 360 video:
https://code.facebook.com/posts/1638767863078802
Assessment:
https://code.facebook.com/posts/2058037817807164

PointNet, PointNet++, and PU-Net

point cloud -> deep network -> classification / segmentation / super-resolution

traditional classification / segmentation:

project the points onto a 2D plane and use 2D classification / segmentation methods

point clouds are unordered sets, so the network must be permutation-invariant (see the sketch below)

point (Vec3) -> feature vector (Vec5) -> normalize (to the bounds of the point cloud)

N points:

segmentation:

features from N points -> N x K scores (each point gets a class)

classification:

features from N points -> K x 1 vector (scores over K classes)
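A minimal numpy sketch of the permutation-invariance idea (not the paper's code; weights and sizes are illustrative): a shared per-point function followed by a symmetric max pool, so the global feature ignores point order.

```python
import numpy as np

rng = np.random.default_rng(0)
N, K = 1024, 10                   # points, classes
W1 = rng.normal(size=(3, 64))     # shared per-point weights (stand-in for an MLP)
W2 = rng.normal(size=(64, K))

points = rng.normal(size=(N, 3))  # unordered point cloud

per_point = np.maximum(points @ W1, 0)  # shared feature per point, N x 64
global_feat = per_point.max(axis=0)     # symmetric max pool -> order-invariant
class_scores = global_feat @ W2         # classification head: K scores

# segmentation: concatenate the global feature back onto each point;
# a further shared MLP would map N x 128 -> N x K per-point scores
seg_input = np.concatenate([per_point, np.tile(global_feat, (N, 1))], axis=1)

# permutation check: shuffling the points leaves the global feature unchanged
shuffled = rng.permutation(points)
assert np.allclose(np.maximum(shuffled @ W1, 0).max(axis=0), global_feat)
```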

 

Lecture 10: Neural Network

  1. Deep learning
  2. Representation learning
  3. Rule-based
    1. high explainability
  4. Linguistic supervision
  5. Semi-supervision
    1. a small set of labeled data
    2. a large set of unlabeled data
  6. Recurrent-level supervision
  7. Language structure

description length DL = size(lexicon) + size(encoding)

  1. lex1
    1. do
    2. the kitty
    3. you
    4. like
    5. see
  2. Lex2
    1. do
    2. you
    3. like
    4. see
    5. the
    6. kitty
  3. How to evaluate the two lexicons? (see the toy computation after this list)
    1. lex1 has 5 words; lex2 has 6 words
    2. Potential sequences
      1. lex1: 1 3 5 2, 5 2, 1 3 4 2
      2. lex2: 1 3 5 2 6, 5 2 6, 1 3 4 2 6
  4. MDL: minimum description length
    1. unsupervised
    2. prosodic bootstrapping
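A toy computation of the comparison above, assuming unit cost per lexicon entry and per encoding token (a real MDL would count bits):

```python
lex1 = ["do", "the kitty", "you", "like", "see"]
lex2 = ["do", "you", "like", "see", "the", "kitty"]

enc1 = [[1, 3, 5, 2], [5, 2], [1, 3, 4, 2]]           # sequences under lex1
enc2 = [[1, 3, 5, 2, 6], [5, 2, 6], [1, 3, 4, 2, 6]]  # sequences under lex2

def dl(lexicon, encodings):
    # DL = size(lexicon) + size(encoding), in unit-cost terms
    return len(lexicon) + sum(len(seq) for seq in encodings)

print(dl(lex1, enc1))  # 5 + 10 = 15
print(dl(lex2, enc2))  # 6 + 13 = 19 -> lex1 yields the shorter description
```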

Boltzmann machine

Lexical space

relatedness vs. similarity

  • use near neighbors: similarity
  • use far neighbors: relatedness

WordSim-353 (WS-353) has both similarity and relatedness subsets

loss function:

 

project:

Part 1: potential methods

  • LDA
  • readability
  • syntactic analysis