06/15/2017

Research:

LeetCode:

Parallel Computing Study:

• Finish watching video of Lesson3: Fundamental GPU Algorithms
• Reduce
• Reduction operator must be: a) binary (two arguments), b) associative
• Scan
• Input:
• input array
• binary associative operator
• identity element [I op a = a]
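The reduce and scan notes above can be sketched serially in C++. This is a minimal illustration, not the parallel GPU version from the lesson: the function names `serial_reduce` and `serial_exclusive_scan` are made up for this sketch, and the scan is assumed to be the exclusive variant, which is where the identity element shows up as the first output.

```cpp
#include <cstddef>
#include <functional>
#include <vector>

// Serial reduce: folds the whole array into one value using a binary,
// associative operator, starting from that operator's identity element.
// (Hypothetical helper name, for illustration only.)
template <typename T, typename Op>
T serial_reduce(const std::vector<T>& in, T identity, Op op) {
    T acc = identity;
    for (const T& x : in) acc = op(acc, x);
    return acc;
}

// Serial exclusive scan: out[i] combines every element before index i,
// so out[0] is the identity element itself.
// (Hypothetical helper name, for illustration only.)
template <typename T, typename Op>
std::vector<T> serial_exclusive_scan(const std::vector<T>& in, T identity, Op op) {
    std::vector<T> out(in.size());
    T acc = identity;
    for (std::size_t i = 0; i < in.size(); ++i) {
        out[i] = acc;          // everything strictly before position i
        acc = op(acc, in[i]);  // fold in the current element for the next slot
    }
    return out;
}
```

With `std::plus<int>()` and identity 0, the input {1, 2, 3, 4} reduces to 10 and scans to {0, 1, 3, 6}.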

Guitar:

• practice

Others:

• O(1): running time is independent of the size of the inputs

Presentation next Monday:

• Prepare the PPT

06/14/2017

Research:

• Implement KFR with Optical Flow in OpenGL
• Cannot export the full model from Unity (only part of the model can be exported), but we can try to test FPS on KroEngine.
• Origin: 90 – 100 FPS
• GausBlur: 62 – 69 FPS
• KFR: 58 – 65 FPS

LeetCode:

Parallel Computing Study:

• Finish watching video of Lesson2
• coalesced: access contiguous memory
• strided: access non-contiguous memory
• floating-point addition is not associative: (a + b) + c != a + (b + c) in general, e.g. a = 1, b = 10^99, c = -10^99
• atomic functions are costly
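The floating-point example above can be checked with a short C++ sketch. It uses `double` rather than `float` because 10^99 is outside the range of a 32-bit float; the helper names `sum_left` and `sum_right` are invented for this illustration.

```cpp
// Groups the first two operands: (a + b) + c.
double sum_left(double a, double b, double c) {
    return (a + b) + c;
}

// Groups the last two operands: a + (b + c).
double sum_right(double a, double b, double c) {
    return a + (b + c);
}
```

With a = 1, b = 1e99, c = -1e99, `sum_left` loses the 1 when it is absorbed into 1e99 and returns 0, while `sum_right` cancels the large terms first and returns 1. This is why the order of operations in a parallel reduction can change a floating-point result.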

Guitar:

• practice.

Others:

Presentation next Monday:

• Prepare the PPT

06/13/2017

Research:

• Measure time of the three scenes (KFR, Origin, GausOptFlow).
• Origin: 32.2 fps
• KFR: 20.4 fps
• GausOptFlow: 20.2 fps

LeetCode:

• 4. Median of Two Sorted Arrays
• Cannot use concatenate + sort because the time complexity exceeds the required O(log(m+n)).
• If the required time complexity is O(log(*)), consider binary search (dichotomy).
• 5. Longest Palindromic Substring
• Manacher
• DP
• s.substr(start, length) returns a std::string (s.substr(i, 1) can also be used to read the ith element as a string, since s[i] is a char)
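One way the binary-search idea for problem 4 could look in C++ is sketched below. This is a common partition-based approach, not necessarily the solution used on that day; the name `median_sorted` is made up, and the sketch assumes both inputs are sorted ascending and at least one of them is non-empty.

```cpp
#include <algorithm>
#include <climits>
#include <vector>

// Median of two sorted arrays by binary-searching a partition of the
// shorter array. Assumes ascending order and at least one element overall.
double median_sorted(const std::vector<int>& a, const std::vector<int>& b) {
    // Search over the shorter array so the cost is O(log(min(m, n))).
    if (a.size() > b.size()) return median_sorted(b, a);
    const int m = static_cast<int>(a.size());
    const int n = static_cast<int>(b.size());
    const int half = (m + n + 1) / 2;  // size of the combined left part
    int lo = 0, hi = m;
    while (lo <= hi) {
        const int i = lo + (hi - lo) / 2;  // elements taken from a
        const int j = half - i;            // elements taken from b
        // Sentinels stand in for "no element" on either side of a cut.
        const int a_left  = (i == 0) ? INT_MIN : a[i - 1];
        const int a_right = (i == m) ? INT_MAX : a[i];
        const int b_left  = (j == 0) ? INT_MIN : b[j - 1];
        const int b_right = (j == n) ? INT_MAX : b[j];
        if (a_left > b_right) {
            hi = i - 1;  // took too many elements from a
        } else if (b_left > a_right) {
            lo = i + 1;  // took too few elements from a
        } else {
            const int left_max = std::max(a_left, b_left);
            if ((m + n) % 2 == 1) return left_max;
            const int right_min = std::min(a_right, b_right);
            return (static_cast<double>(left_max) + right_min) / 2.0;
        }
    }
    return 0.0;  // not reached when the stated assumptions hold
}
```

For example, `median_sorted({1, 3}, {2})` gives 2 and `median_sorted({1, 2}, {3, 4})` gives 2.5.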

Parallel Computing Study:

• Finish watching video of Lesson2

Guitar:

• practice.

Others:

Presentation next Monday:

• Prepare the PPT

06/12/2017

Research:

• Merge the three scenes (KFR, Origin, GausOptFlow) into one program.
• Tried to measure time but hit a problem (the measured times are all the same).
• Tried to export mesh but failed.

LeetCode:

• Cannot use list2num/num2list (overflow); should add digit by digit with a carry, like a full adder.
• 3. Longest Substring Without Repeating Characters
• “dvdf”
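A sliding-window sketch for problem 3 is shown below. The function name `longest_unique_substring` is invented here, and the approach (tracking the last index of each byte value) is one common technique, not necessarily the one used on that day. The “dvdf” case is a useful check because the window has to restart just after the first 'd' rather than at the repeated 'd'.

```cpp
#include <algorithm>
#include <string>
#include <vector>

// Length of the longest substring without repeating characters,
// treating the string as a sequence of bytes.
int longest_unique_substring(const std::string& s) {
    std::vector<int> last(256, -1);  // last index where each byte value was seen
    int best = 0;
    int start = 0;  // left edge of the current window
    for (int i = 0; i < static_cast<int>(s.size()); ++i) {
        const unsigned char c = static_cast<unsigned char>(s[i]);
        // Move the left edge only if the previous occurrence is inside the window.
        if (last[c] >= start) start = last[c] + 1;
        last[c] = i;
        best = std::max(best, i - start + 1);
    }
    return best;
}
```

For “dvdf” this returns 3, corresponding to "vdf".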

Parallel Computing Study:

• Finish watching video of Lesson2

733Code->C++:

• Panorama Maker

Guitar:

• practice.

Others:

Presentation next Monday:

• Prepare the PPT

06/08/2017

Research:

• no research task today~~~lol~~~! Where is varshney?????

LeetCode:

Finished 4 LeetCode problems.

• 228. Summary Ranges (easy)

• 209. Minimum Size Subarray Sum
• 18. 4Sum
• Sort, then squeeze inward from both sides. Actually, it would be faster to store the sums of two numbers in a hash map.
• 373. Find K Pairs with Smallest Sums
• should use priority queue
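The priority-queue idea for problem 373 could be sketched as follows. This assumes both input arrays are sorted ascending, as in the problem statement; the function name `k_smallest_pairs` and the `Entry` tuple layout are choices made for this illustration.

```cpp
#include <cstddef>
#include <functional>
#include <queue>
#include <tuple>
#include <utility>
#include <vector>

// Returns up to k pairs (a[i], b[j]) with the smallest sums.
// Assumes a and b are sorted in ascending order.
std::vector<std::pair<int, int>> k_smallest_pairs(const std::vector<int>& a,
                                                  const std::vector<int>& b,
                                                  std::size_t k) {
    std::vector<std::pair<int, int>> out;
    if (a.empty() || b.empty() || k == 0) return out;

    // Entry layout: (sum, index into a, index into b).
    using Entry = std::tuple<long long, std::size_t, std::size_t>;
    // std::greater turns the default max-heap into a min-heap on the sum.
    std::priority_queue<Entry, std::vector<Entry>, std::greater<Entry>> heap;

    // Seed the heap with each of the first k elements of a paired with b[0].
    for (std::size_t i = 0; i < a.size() && i < k; ++i) {
        heap.emplace(static_cast<long long>(a[i]) + b[0], i, std::size_t{0});
    }

    while (!heap.empty() && out.size() < k) {
        const Entry top = heap.top();
        heap.pop();
        const std::size_t i = std::get<1>(top);
        const std::size_t j = std::get<2>(top);
        out.emplace_back(a[i], b[j]);
        // The next candidate in the same row is a[i] with the next element of b.
        if (j + 1 < b.size()) {
            heap.emplace(static_cast<long long>(a[i]) + b[j + 1], i, j + 1);
        }
    }
    return out;
}
```

The heap never holds more than min(k, a.size()) entries, so the cost is roughly O(k log k) instead of sorting all pairs.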

Parallel Computing Study:

• Finish hw1.
• Finish watching video of Lesson2

Website:

• Installed plugin Crayon Syntax Highlighter, but actually I don’t know how to use it…

Guitar:

• Watch a new tutorial and practice.

06/07/2017

Research:

• no research task today~~~lol~~~! Where is varshney?????

LeetCode:

Finished 5 LeetCode problems.

• Know the difference between vec.emplace_back() and vec.push_back().
• https://stackoverflow.com/questions/4303513/push-back-vs-emplace-back
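A small C++ sketch of the difference between the two calls is given below. The `Point` type and the function names are hypothetical, used only for illustration.

```cpp
#include <vector>

// Hypothetical type with a two-argument constructor.
struct Point {
    int x;
    int y;
    Point(int x_, int y_) : x(x_), y(y_) {}
};

std::vector<Point> build_points() {
    std::vector<Point> pts;
    // push_back takes an existing Point: a temporary is built first,
    // then moved (or copied) into the vector.
    pts.push_back(Point(1, 2));
    // emplace_back forwards its arguments to Point's constructor and
    // builds the element directly in the vector's storage.
    pts.emplace_back(3, 4);
    return pts;
}

std::vector<std::vector<int>> build_rows() {
    std::vector<std::vector<int>> rows;
    // emplace_back can call explicit constructors: this invokes
    // std::vector<int>(3), producing a row of three zeros.
    // rows.push_back(3) would not compile for that reason.
    rows.emplace_back(3);
    return rows;
}
```

The last case is worth remembering: because `emplace_back` can reach explicit constructors, a call like `rows.emplace_back(3)` may compile and do something different from what was intended.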

Parallel Computing Study:

• Watch video of class2
• https://classroom.udacity.com/courses/cs344/lessons/67054969/concepts/673560200923
• Bill Dally mentioned that CPUs are not getting faster… so we rely more on parallelism now.
• Four communication patterns:
• map: all the inputs do the same calculation (efficient in GPU).
• gather: calc average of several pixels.
• scatter: write several memory units with one value.
• stencil: tasks read input from a fixed neighborhood in an array.
• Transpose:
• reorder data elements in memory
• Transpose: AoS [array of structures] vs.  SoA [structure of arrays]
• install CUDA on my PC.
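The AoS vs. SoA transpose noted above can be illustrated with a serial C++ sketch. The `Particle` and `ParticlesSoA` types and the function `aos_to_soa` are hypothetical names chosen for this example; on a GPU the same reordering would be done in parallel.

```cpp
#include <vector>

// Array of structures (AoS): the fields of one element sit next to each other.
struct Particle {
    float x;
    float y;
    float z;
};

// Structure of arrays (SoA): each field is stored contiguously on its own.
struct ParticlesSoA {
    std::vector<float> x;
    std::vector<float> y;
    std::vector<float> z;
};

// Reorders the data from AoS layout into SoA layout.
ParticlesSoA aos_to_soa(const std::vector<Particle>& aos) {
    ParticlesSoA soa;
    soa.x.reserve(aos.size());
    soa.y.reserve(aos.size());
    soa.z.reserve(aos.size());
    for (const Particle& p : aos) {
        soa.x.push_back(p.x);
        soa.y.push_back(p.y);
        soa.z.push_back(p.z);
    }
    return soa;
}
```

With the SoA layout, neighboring threads that each read the same field of neighboring elements touch contiguous memory, which matches the coalesced access pattern.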

Website:

• no

Guitar:

• Watch a new tutorial.