• Fixed the depth error: the depth buffer should be initialized after the OBJ is loaded.
  • Fixed drawing triangles for the cut line:
    • I wrongly set "layout (triangle_strip, max_vertices = 3) out". A triangle strip of N vertices yields N - 2 triangles, so with max_vertices = 3 only one triangle is output.
  • Should use the vertex, tessellation, geometry, and fragment shaders together now. Let me recall the whole pipeline: https://www.khronos.org/opengl/wiki/Rendering_Pipeline_Overview
  • (Added 09/08) I ran into an error with tessellation:
  • Use the tessellation control shader to determine the tessellation level:
    • How can the level be determined dynamically? This can be done.
    • Ensure that the edge(s) shared between patches use the same tessellation level.
  • Use the tessellation evaluation shader to calculate the area of the triangle.
  • How can the tessellation problem be solved dynamically?
  Central Triangle??
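
One way to get both a dynamic level and matching levels on shared edges is to compute each outer level from the two endpoints of that edge only, so both patches that share the edge arrive at the same number. The sketch below checks this idea on the CPU in Python. The distance-based formula, the `scale` factor, and the `max_level` cap of 64 are assumptions made for illustration, not values taken from these notes.

```python
import math

def edge_level(p0, p1, eye, scale=8.0, max_level=64.0):
    """Tessellation level for one edge, using only the edge's own endpoints.

    The midpoint and the length do not depend on the order of p0 and p1,
    so two patches sharing this edge compute exactly the same level.
    """
    mid = tuple((a + b) / 2.0 for a, b in zip(p0, p1))
    length = math.dist(p0, p1)
    distance = max(math.dist(mid, eye), 1e-6)  # avoid division by zero
    # Hypothetical heuristic: longer and closer edges get more subdivisions.
    return min(max(scale * length / distance, 1.0), max_level)

def triangle_levels(v0, v1, v2, eye):
    """Outer and inner levels for a triangle patch.

    outer[i] is taken for the edge opposite vertex i, which is how
    gl_TessLevelOuter is usually laid out for triangle patches.
    """
    outer = (
        edge_level(v1, v2, eye),
        edge_level(v2, v0, eye),
        edge_level(v0, v1, eye),
    )
    inner = max(outer)  # assumption: the inner level follows the finest edge
    return outer, inner
```

In a real tessellation control shader the same rule would apply: each outer level may depend only on data that both neighbouring patches see identically, such as the two edge vertices and the camera position.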





The texture loads correctly when rendering to the framebuffer. However, when the framebuffer is used, only the bottom-left corner of the texture is displayed.


I loaded a model, which does some texture initialization, before initializing the framebuffer, and the model loading affected the framebuffer initialization.

Should initialize the framebuffer before loading the model.

Task for tomorrow:

Try every simple mesh in the object separately.


Assignment 1: KNN

  • Install scikit-learn, which provides BallTree: sudo apt-get install python-sklearn
  • list.append(): adds a new item to the end of the list.
  • numpy.arange(), sort
  • numpy.reshape(): similar to reshape() in MATLAB, but NumPy fills in row-major order by default while MATLAB is column-major.
  • numpy.median(numpy.array(outs))????
  • defaultdict.get(): returns None (or a given default) for a missing key without inserting it, see https://stackoverflow.com/questions/11041405/why-dict-getkey-instead-of-dictkey
  • ballTree.query(X, k=k_in), http://scikit-learn.org/stable/modules/generated/sklearn.neighbors.BallTree.html
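
The pieces listed above fit together roughly as follows. This is a minimal sketch that uses only the standard library: a brute-force neighbour search stands in for BallTree.query, and statistics.median stands in for numpy.median. The function names and the toy data are made up for illustration.

```python
import math
from statistics import median

def k_nearest_labels(train_points, train_labels, query, k=3):
    """Return the labels of the k training points closest to query.

    Brute-force replacement for a BallTree query: sort all indices by distance.
    """
    order = sorted(
        range(len(train_points)),
        key=lambda i: math.dist(train_points[i], query),
    )
    return [train_labels[i] for i in order[:k]]

def majority_vote(labels):
    """Most frequent label, counted with dict.get() and a default of 0."""
    counts = {}
    for label in labels:
        counts[label] = counts.get(label, 0) + 1  # no KeyError on first sight
    return max(counts, key=counts.get)

if __name__ == "__main__":
    points = [(0.0, 0.0), (0.0, 1.0), (5.0, 5.0), (6.0, 5.0), (5.0, 6.0)]
    labels = [0, 0, 1, 1, 1]
    outs = k_nearest_labels(points, labels, (5.2, 5.1), k=3)
    # For numeric labels the median of the neighbour labels is an alternative
    # to the majority vote.
    print(majority_vote(outs), median(outs))
```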


  1. Find a large model
    1. Can load Sponza
    2. The loading library is not complete, so the .mtl cannot be fully loaded. But the rendering time can exceed 28 ms.
  2. Color:
    1. Patney: people can easily notice the difference if the color is changed.
  Use short int to replace float

*.mtl content explanation

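newmtl mtlName # mtlName is the name of the material

Ka 1.000 1.000 1.000 # ambient color, declared with Ka; colors are defined in RGB, each channel taking a value between 0 and 1

Kd 1.000 1.000 1.000 # diffuse color

Ks 0.000 0.000 0.000 # specular color; black means specular is turned off

Ns 10.000 # specular exponent (shininess), range 0 – 1000

illum 2 # illumination mode

0. Color on, ambient off
1. Color on, ambient on
2. Highlight on
3. Reflection on, ray trace on
4. Transparency: glass on. Reflection: ray trace on
5. Reflection: Fresnel on, ray trace on
6. Transparency: refraction on. Reflection: Fresnel off, ray trace on
7. Transparency: refraction on. Reflection: Fresnel on, ray trace on
8. Reflection on, ray trace off
9. Transparency: glass on. Reflection: ray trace off
10. Casts shadows onto invisible surfaces

d 0.9 # dissolve (opacity); some files use 'd'
Tr 0.9 # others use 'Tr' (transparency, usually Tr = 1 - d)

map_Ka lena.tga # ambient texture map
map_Kd lena.tga # diffuse texture map (in most cases the same as the ambient texture map)
map_Ks lena.tga # specular texture map
map_d lena_alpha.tga # alpha-channel texture map
map_bump lena_bump.tga # bump map
bump lenna_bump.tga # some files use 'bump' instead of 'map_Bump'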
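
Since the loading library does not read the .mtl completely, a small reader for just the statements listed above may help when checking what a file contains. The following is a sketch under simplifying assumptions: material names and texture file names contain no spaces, texture options before the file name are ignored, and any other statement is skipped. The function name parse_mtl is hypothetical.

```python
def parse_mtl(text):
    """Parse a subset of .mtl statements into {material_name: {key: value}}."""
    color_keys = {"Ka", "Kd", "Ks"}
    scalar_keys = {"Ns", "d", "Tr"}
    texture_keys = {"map_ka", "map_kd", "map_ks", "map_d", "map_bump", "bump"}

    materials = {}
    current = None
    for raw_line in text.splitlines():
        line = raw_line.split("#", 1)[0].strip()  # drop trailing comments
        if not line:
            continue
        key, *args = line.split()
        if not args:
            continue
        if key == "newmtl":
            current = materials.setdefault(args[0], {})
        elif current is None:
            continue  # statement before any newmtl: ignore
        elif key in color_keys:
            current[key] = tuple(float(v) for v in args[:3])
        elif key in scalar_keys:
            current[key] = float(args[0])
        elif key == "illum":
            current[key] = int(args[0])
        elif key.lower() in texture_keys:
            # Stored lower-cased because files mix 'map_bump' and 'map_Bump'.
            current[key.lower()] = args[-1]
    return materials
```

Calling parse_mtl on the text of a file gives one dictionary per material, for example materials["mtlName"]["Kd"] for the diffuse color.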

For textures:

.dif are for diffuse

.alpha are for transparency

.spec are for specular reflection

.ddn are tangent-space normal maps



Plan for tomorrow:

  1. Try to download the dependencies of https://github.com/NCCA/Sponza
  2. If that does not work, try to load the textures
  3. Change from single float to short int for peripheral pixels
  Finish the homework for 726 in Python


Low-level optimization of KFR

  1. Optimization of log(|| x – x0, y – y0||)
  2. Optimization of log function
  3. Optimization of fast atan
  4. Make the shader more complex to extend the rendering time to greater than 16ms
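
For the first item, one standard rewrite is worth keeping in mind. It is offered here as an assumption, since these notes do not record which rewrite was actually used. Because log(sqrt(s)) = 0.5 * log(s), the square root inside the norm can be dropped. The Python sketch below only checks the identity numerically; the function names are hypothetical.

```python
import math

def log_distance_naive(x, y, x0, y0):
    """log of the Euclidean distance, computed with an explicit square root."""
    return math.log(math.hypot(x - x0, y - y0))

def log_distance_no_sqrt(x, y, x0, y0):
    """Same value without the square root: log(sqrt(s)) == 0.5 * log(s)."""
    dx = x - x0
    dy = y - y0
    return 0.5 * math.log(dx * dx + dy * dy)
```

Both functions are undefined when (x, y) equals (x0, y0), so a shader version would still need to handle the center pixel separately.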

I will talk about every step in detail

  • Optimization of log(|| x – x0, y – y0||)
    • There is a reduction in rendering time:
      • Original: 52.96 ms
      • 1/2 buffer: 15.39 ms -> 15.57 ms
      • 1/4 buffer: 4.20 ms -> 4.10 ms
  • Optimization of log function
    • The fast log contains at least 5 branches (possibly 5 additions and 5 shifts for a 32-bit calculation)
    • The Nvidia log algorithm is not available online. But on AMD GPUs, log, exp, sin, and cos cost about 4x as much as add/sub. We can guess that Nvidia does no worse than AMD.
      • Reference1: http://www.iquilezles.org/www/articles/palettes/palettes.htm (Iq talking about sin, cos in GLSL)
        • Popular wisdom (especially between old-school coders) is that trigonometric functions are expensive and that therefore it is important to avoid them (by means of LUTs or linear/triangular approximations). Often popular wisdom is wrong – despite the above still holds true in some especial cases (a CPU heavy inner loop) it does not in general: for example, in the GPU, computing a cosine is way, way faster than any attempt to approximate it. So, lets take advantage of this and go with the straight cosine expression.
      • Analysis of AMD GPU: https://seblagarde.wordpress.com/tag/gpu-performance/
        • Full rate (FR): mul, mad, add, sub, and, or, bit shift… Quarter rate (QR): transcendental instructions like rcp, sqrt, rsqrt, cos, sin, log, exp…
      • Discussion about the cost of operations:
        • 1/x, sin(x), cos(x), log2(x), exp2(x), 1/sqrt(x) – 0 or close to 0, as long as they are limited to 1/9 of all total ops (can go up to 1/5 for Maxwell).
  • Optimization of fast atan (so far I have only tried the diamond angle; I will try CORDIC later.)
    • Simple comparison of atan2 and diamond angle.
    • A test of shadertoy: https://www.shadertoy.com/view/lllyR4
  • Make the shader more complex to extend the rendering time to greater than 16ms
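
For the atan2 versus diamond angle comparison, a CPU-side check in Python can look like the sketch below. It uses the commonly cited diamond angle formulation, which maps a direction to a value in roughly [0, 4) using only additions and one division. This is not necessarily the exact variant used in the shader, and the helper atan2_quadrants is a hypothetical name introduced here for the comparison.

```python
import math

def diamond_angle(y, x):
    """Diamond angle of the direction (x, y), one unit per quadrant.

    It increases monotonically with the true angle but is not equal to it.
    """
    if x == 0.0 and y == 0.0:
        return 0.0  # direction undefined at the origin; pick 0 by convention
    if y >= 0.0:
        return y / (x + y) if x >= 0.0 else 1.0 - x / (-x + y)
    return 2.0 - y / (-x - y) if x < 0.0 else 3.0 + x / (x - y)

def atan2_quadrants(y, x):
    """Reference angle from atan2, remapped from (-pi, pi] to quadrant units."""
    angle = math.atan2(y, x)
    if angle < 0.0:
        angle += 2.0 * math.pi
    return angle * 2.0 / math.pi

if __name__ == "__main__":
    for degrees in (0, 30, 45, 90, 135, 200, 300):
        a = math.radians(degrees)
        y, x = math.sin(a), math.cos(a)
        print(degrees, diamond_angle(y, x), atan2_quadrants(y, x))
```

The two values agree exactly on the axes and diagonals and drift apart slightly in between, so the diamond angle is a drop-in replacement only where an order-preserving angle is sufficient.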