06/15/2017

Research:

LeetCode:

Parallel Computing Study:

  • Finish watching video of Lesson3: Fundamental GPU Algorithms
    • Reduce
      • Reduction operator must be a) binary (two arguments) and b) associative
    • Scan
      • Input:
        • input array
        • binary associative operator
        • identity element [I op a = a]
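Serial reference for the Reduce and Scan notes above, as a minimal sketch. The names reduce_serial and exclusive_scan_serial are my own, and choosing the exclusive variant of scan is an assumption (the notes only list the inputs). On the GPU the same operator and identity would be used, just evaluated in parallel.

```cpp
#include <cstddef>
#include <vector>

// Reduce: fold all elements with a binary, associative operator.
// `identity` must satisfy op(identity, a) == a.
template <typename T, typename Op>
T reduce_serial(const std::vector<T>& in, Op op, T identity) {
    T acc = identity;
    for (const T& x : in) acc = op(acc, x);
    return acc;
}

// Exclusive scan (assumed variant): out[i] combines in[0..i-1],
// so out[0] is the identity element.
template <typename T, typename Op>
std::vector<T> exclusive_scan_serial(const std::vector<T>& in, Op op, T identity) {
    std::vector<T> out(in.size());
    T acc = identity;
    for (std::size_t i = 0; i < in.size(); ++i) {
        out[i] = acc;          // everything strictly before position i
        acc = op(acc, in[i]);  // extend the running result
    }
    return out;
}
```

With addition and identity 0, the input [1, 2, 3, 4] reduces to 10 and scans to [0, 1, 3, 6].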

Guitar:

  • practice

Others:

  • O(1) independent of size of inputs

Presentation next Monday:

  • Prepare the PPT

06/14/2017

Research:

  • Implement KFR with Optical Flow in OpenGL
  • Cannot export the full model from Unity (only part of it can be exported), but we can try to test FPS on KroEngine.
    • Origin: 90 – 100 FPS
    • GausBlur: 62 – 69 FPS
    • KFR: 58 – 65 FPS

LeetCode:

Parallel Computing Study:

  • Finish watching video of Lesson2
    • coalesced: access contiguous memory
    • strided: access non-contiguous memory
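    • floating-point addition is not associative: (a + b) + c != a + (b + c), e.g. a = 1, b = 10^99, c = -10^99 gives 0 on the left and 1 on the right (in double precision; 10^99 does not fit in a 32-bit float)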
    • atomic functions are costly
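Quick check of the non-associativity note above, as a small sketch. The helper names sum_left and sum_right are made up for this illustration, and double is used because 10^99 is out of range for a 32-bit float.

```cpp
// Evaluates (a + b) + c: with a = 1, b = 1e99 the 1 is lost when rounding a + b.
double sum_left(double a, double b, double c) { return (a + b) + c; }

// Evaluates a + (b + c): b and c cancel exactly first, so a survives.
double sum_right(double a, double b, double c) { return a + (b + c); }
```

This matters for parallel reduce: the GPU may group the additions differently from a serial loop, so float sums can differ slightly between the two.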

Guitar:

  • practice.

Others:

Presentation next Monday:

  • Prepare the PPT

06/13/2017

Research:

  • Measure time of the three scenes (KFR, Origin, GausOptFlow).
  • Origin: 32.2 fps
  • KFR: 20.4 fps
  • GausOptFlow: 20.2 fps

LeetCode:

  • 4. Median of Two Sorted Arrays
    • Cannot use concatenate + sort because the time complexity exceeds the required O(log(m+n)).
    • If the required time complexity is O(log n), consider binary search (dichotomy).
  • 5. Longest Palindromic Substring
    • Manacher
    • DP
    • s.substr(start, length) returns a std::string; s.substr(i, 1) reads the i-th character as a string, whereas s[i] is a char.
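Sketch of the binary-search idea for problem 4 above. This is one common way to do it (binary search on how many elements the shorter array contributes to the left half), not necessarily the solution I submitted. The name median_sorted is made up; it assumes both inputs are sorted, not both empty, and contain no INT_MIN/INT_MAX values, since those are used as sentinels.

```cpp
#include <algorithm>
#include <climits>
#include <vector>

// Median of two sorted arrays in O(log(min(m, n))).
// Assumptions: inputs sorted, not both empty, no INT_MIN/INT_MAX elements.
double median_sorted(const std::vector<int>& a, const std::vector<int>& b) {
    if (a.size() > b.size()) return median_sorted(b, a);  // search the shorter one
    const int m = static_cast<int>(a.size());
    const int n = static_cast<int>(b.size());
    int lo = 0, hi = m;
    while (lo <= hi) {
        const int i = lo + (hi - lo) / 2;   // elements of a in the left half
        const int j = (m + n + 1) / 2 - i;  // elements of b in the left half
        const int aLeft  = (i == 0) ? INT_MIN : a[i - 1];
        const int aRight = (i == m) ? INT_MAX : a[i];
        const int bLeft  = (j == 0) ? INT_MIN : b[j - 1];
        const int bRight = (j == n) ? INT_MAX : b[j];
        if (aLeft <= bRight && bLeft <= aRight) {
            // Valid partition: everything on the left <= everything on the right.
            if ((m + n) % 2 == 1) return std::max(aLeft, bLeft);
            return (static_cast<double>(std::max(aLeft, bLeft)) +
                    std::min(aRight, bRight)) / 2.0;
        }
        if (aLeft > bRight) hi = i - 1;  // took too many from a
        else lo = i + 1;                 // took too few from a
    }
    return 0.0;  // not reached for inputs that satisfy the assumptions
}
```

For {1, 3} and {2} the median is 2; for {1, 2} and {3, 4} it is 2.5.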

Parallel Computing Study:

  • Finish watching video of Lesson2

Guitar:

  • practice.

Others:

Presentation next Monday:

  • Prepare the PPT

06/12/2017

Research:

  • Merge the three scenes (KFR, Origin, GausOptFlow) into one program.
  • Tried to measure time but hit a problem: the measured times are all the same.
  • Tried to export mesh but failed.

LeetCode:

  • 2. Add Two Numbers
    • Cannot use list2num/num2list (overflow); should use a full-adder algorithm (add digit by digit with a carry).
  • 3. Longest Substring Without Repeating Characters
    • Tricky test case: “dvdf” (the answer is 3, “vdf”)
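Sketch for problem 3 above, showing why “dvdf” is a useful test: when the second 'd' shows up, the window should restart just after the first 'd', not at the current position. This is a common sliding-window approach, not necessarily what I submitted; the name longest_unique_substring is made up, and it assumes single-byte characters.

```cpp
#include <algorithm>
#include <array>
#include <string>

// Length of the longest substring without repeating characters.
// Assumption: characters are single bytes (256 possible values).
int longest_unique_substring(const std::string& s) {
    std::array<int, 256> last_seen;
    last_seen.fill(-1);  // -1 means "not seen yet"
    int best = 0;
    int start = 0;       // left edge of the current window
    for (int i = 0; i < static_cast<int>(s.size()); ++i) {
        const unsigned char c = static_cast<unsigned char>(s[i]);
        // If c already occurs inside the window, move the left edge past it.
        if (last_seen[c] >= start) start = last_seen[c] + 1;
        last_seen[c] = i;
        best = std::max(best, i - start + 1);
    }
    return best;
}
```

For “dvdf” this returns 3; a version that clears the whole window on a repeat would return 2.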

Parallel Computing Study:

  • Finish watching video of Lesson2

733Code->C++:

  • Panorama Maker

Guitar:

  • practice.

Others:

Presentation next Monday:

  • Prepare the PPT

06/08/2017

Research:

  • no research task today~~~lol~~~! Where is varshney?????

LeetCode:

Finished 4 leetcode problems.

  • 228. Summary Ranges (easy)

  • 209. Minimum Size Subarray Sum
    • Add at the tail and subtract at the head.
  • 18. 4Sum
    • sort, and squeeze from both sides. It may actually be faster to store the sums of pairs of numbers in a hash map.
  • 373. Find K Pairs with Smallest Sums
    • should use priority queue
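Sketch of the "add at the tail and subtract at the head" idea from problem 209 above. The name min_subarray_len is made up, and it assumes target and all elements are positive (as in the problem statement), which is what makes the window shrink safely.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Shortest contiguous subarray with sum >= target, or 0 if none exists.
// Assumption: target > 0 and every element of nums is > 0.
int min_subarray_len(int target, const std::vector<int>& nums) {
    const std::size_t n = nums.size();
    std::size_t best = n + 1;  // sentinel: "no window found yet"
    std::size_t head = 0;
    long long sum = 0;
    for (std::size_t tail = 0; tail < n; ++tail) {
        sum += nums[tail];              // add at the tail
        while (sum >= target) {
            best = std::min(best, tail - head + 1);
            sum -= nums[head++];        // subtract at the head
        }
    }
    return best == n + 1 ? 0 : static_cast<int>(best);
}
```

Each element enters and leaves the window at most once, so the whole pass is O(n).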

Parallel Computing Study:

  • Finish hw1.
  • Finish watching video of Lesson2

Website:

  • Installed plugin Crayon Syntax Highlighter, but actually I don’t know how to use it…

Guitar:

  • Watch a new tutorial and practice.

06/07/2017

Research:

  • no research task today~~~lol~~~! Where is varshney?????

LeetCode:

Finished 5 leetcode problems.

  • Know the difference between vec.emplace_back() and vec.push_back().
    • https://stackoverflow.com/questions/4303513/push-back-vs-emplace-back
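Small illustration of the push_back vs. emplace_back difference noted above. Point and build_points are hypothetical names used only for this example.

```cpp
#include <vector>

// Hypothetical element type, used only for this illustration.
struct Point {
    int x;
    int y;
    Point(int x_, int y_) : x(x_), y(y_) {}
};

std::vector<Point> build_points() {
    std::vector<Point> pts;
    pts.push_back(Point(1, 2));  // builds a temporary Point, then moves/copies it in
    pts.emplace_back(3, 4);      // forwards (3, 4) to the constructor, builds in place
    return pts;
}
```

For cheap types like this the difference is negligible; it mostly matters when constructing the temporary is expensive or the type cannot be copied or moved.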

Parallel Computing Study:

  • Watch video of class2
    • https://classroom.udacity.com/courses/cs344/lessons/67054969/concepts/673560200923
    • Bill Dally mentioned that CPUs are no longer getting faster, so we rely on more parallelism now.
    • Four communication patterns:
      • map: every input element goes through the same computation (efficient on GPUs).
      • gather: each output reads several input locations, e.g. averaging several pixels.
      • scatter: each task writes its value to several output locations.
      • stencil: tasks read input from a fixed neighborhood in an array.
    • Transpose:
      • reorder data elements in memory
      • Transpose: AoS [array of structures] vs.  SoA [structure of arrays]
  • install CUDA on my PC.
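To make the AoS vs. SoA note above concrete, here is a rough host-side sketch of the reordering. Particle, ParticlesSoA, and to_soa are hypothetical names, not from the course.

```cpp
#include <vector>

// Array of structures (AoS): the fields of one record sit next to each other.
struct Particle {  // hypothetical record type
    float x;
    float y;
    float z;
};

// Structure of arrays (SoA): each field is stored contiguously on its own.
struct ParticlesSoA {
    std::vector<float> x, y, z;
};

// Reorders AoS data into SoA layout (the "transpose" of the data layout).
ParticlesSoA to_soa(const std::vector<Particle>& aos) {
    ParticlesSoA soa;
    soa.x.reserve(aos.size());
    soa.y.reserve(aos.size());
    soa.z.reserve(aos.size());
    for (const Particle& p : aos) {
        soa.x.push_back(p.x);
        soa.y.push_back(p.y);
        soa.z.push_back(p.z);
    }
    return soa;
}
```

With SoA, threads that each read the x of neighbouring particles touch contiguous memory, which is the access pattern GPUs prefer.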

Website:

  • no

Guitar: