August 2017 – Xiaoxu Meng

*.mtl content explanation

newmtl mtlName # mtlName is the name of the material

Ka 1.000 1.000 1.000 # The ambient color is declared with Ka. Colors are defined in RGB, with each channel's value between 0 and 1.

Kd 1.000 1.000 1.000 # diffuse color

Ks 0.000 0.000 0.000 # specular color; if it is black, specular highlights are off

Ns 10.000 # Ns is the specular exponent (shininess), range 0 to 1000

illum 2 # illumination model

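0. Color on, ambient off
1. Color on, ambient on
2. Highlight on
3. Reflection on, ray trace on
4. Transparency: glass on; reflection: ray trace on
5. Reflection: Fresnel on, ray trace on
6. Transparency: refraction on; reflection: Fresnel off, ray trace on
7. Transparency: refraction on; reflection: Fresnel on, ray trace on
8. Reflection on, ray trace off
9. Transparency: glass on; reflection: ray trace off
10. Casts shadows onto invisible surfaces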

d 0.9 # dissolve (opacity); some implementations use 'd'
Tr 0.1 # others use 'Tr', which is usually defined as the inverse (Tr = 1 - d)

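map_Ka lena.tga # ambient texture map
map_Kd lena.tga # diffuse texture map (in most cases the same as the ambient texture map)
map_Ks lena.tga # specular color texture map
map_d lena_alpha.tga # alpha texture map
map_bump lena_bump.tga # bump map
bump lena_bump.tga # some implementations use 'bump' instead of 'map_bump'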
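
The statements above can be read with a few lines of C++. This is a minimal sketch, not a complete MTL loader: the `Material` struct and the `parseMtl` function are hypothetical names introduced here, `Tr` is assumed to mean `1 - d`, and texture options or file names containing spaces are not handled.

```cpp
#include <array>
#include <cassert>
#include <map>
#include <sstream>
#include <string>

// Hypothetical container for the statements listed above (not a full MTL model).
struct Material {
    std::array<float, 3> Ka{{1.0f, 1.0f, 1.0f}};  // ambient color
    std::array<float, 3> Kd{{1.0f, 1.0f, 1.0f}};  // diffuse color
    std::array<float, 3> Ks{{0.0f, 0.0f, 0.0f}};  // specular color
    float Ns = 0.0f;                              // specular exponent
    float d = 1.0f;                               // dissolve (1 = fully opaque)
    int illum = 2;                                // illumination model
    std::map<std::string, std::string> maps;      // e.g. "map_Kd" -> "lena.tga"
};

// Parses MTL text into materials keyed by their newmtl name.
inline std::map<std::string, Material> parseMtl(const std::string& text) {
    std::map<std::string, Material> materials;
    Material* current = nullptr;
    std::istringstream lines(text);
    std::string line;
    while (std::getline(lines, line)) {
        line = line.substr(0, line.find('#'));  // drop trailing comments
        std::istringstream fields(line);
        std::string key;
        if (!(fields >> key)) continue;         // blank or comment-only line
        if (key == "newmtl") {
            std::string name;
            fields >> name;
            current = &materials[name];         // std::map keeps this pointer valid
            continue;
        }
        if (current == nullptr) continue;       // statement before any newmtl
        if (key == "Ka" || key == "Kd" || key == "Ks") {
            std::array<float, 3>& c =
                key == "Ka" ? current->Ka : key == "Kd" ? current->Kd : current->Ks;
            fields >> c[0] >> c[1] >> c[2];
        } else if (key == "Ns") {
            fields >> current->Ns;
        } else if (key == "d") {
            fields >> current->d;
        } else if (key == "Tr") {               // assumption: Tr = 1 - d
            float tr = 0.0f;
            if (fields >> tr) current->d = 1.0f - tr;
        } else if (key == "illum") {
            fields >> current->illum;
        } else if (key.rfind("map_", 0) == 0 || key == "bump") {
            std::string file;
            fields >> file;
            current->maps[key] = file;          // key stored exactly as written
        }
    }
    return materials;
}
```

Calling `parseMtl` on the contents of a .mtl file returns one `Material` per `newmtl` block; unknown statements are silently skipped.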

For textures:

.dif files are diffuse maps

.alpha files are transparency maps

.spec files are specular reflection maps

.ddn files are tangent-space normal maps

.BGR?

08/30/2017

Plan for tomorrow:

Try to download the dependencies of https://github.com/NCCA/Sponza
If that does not work, try to load the textures
Change from single-precision float to short for peripheral pixels
Finish the homework for 726 in Python

OpenGL Commonly Used Models

Crytek Atrium Sponza
Rungholt scene

08/28/2017

Low-level optimization of KFR

Optimization of log(|| x – x0, y – y0||)
Optimization of log function
Optimization of fast atan
Make the shader more complex to extend the rendering time to greater than 16ms

Each step in detail:

Optimization of log(|| x – x0, y – y0||)
- Rendering time is reduced:
  - Original 52.96ms
  - 1/2 buffer: 15.39ms ->15.57ms
  - 1/4 buffer: 4.20ms -> 4.10ms
Optimization of log function
- The fast-log contains at least 5 branches (possibly 5 additions and 5 shifts for a 32-bit calculation).
- The Nvidia log algorithm is not available online. On AMD GPUs, however, log, exp, sin, and cos cost 4x as much as add/sub, and we can guess that Nvidia does no worse than AMD.
  - Reference1: http://www.iquilezles.org/www/articles/palettes/palettes.htm (Iq talking about sin, cos in GLSL)
    - Popular wisdom (especially between old-school coders) is that trigonometric functions are expensive and that therefore it is important to avoid them (by means of LUTs or linear/triangular approximations). Often popular wisdom is wrong – despite the above still holds true in some especial cases (a CPU heavy inner loop) it does not in general: for example, in the GPU, computing a cosine is way, way faster than any attempt to approximate it. So, lets take advantage of this and go with the straight cosine expression.
  - Analysis of AMD GPU: https://seblagarde.wordpress.com/tag/gpu-performance/
    - Full rate (FR): mul, mad, add, sub, and, or, bit shift… Quarter rate (QR): transcendental instructions like rcp, sqrt, rsqrt, cos, sin, log, exp…
  - Discussion about the cost of transcendental instructions:
    - 1/x, sin(x), cos(x), log2(x), exp2(x), 1/sqrt(x) – 0 or close to 0, as long as they are limited to 1/9 of all total ops (can go up to 1/5 for Maxwell).
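
The notes do not show the fast-log code itself. As an assumption, one common integer routine that fits the "5 branches for 32 bits" description is a binary search for the highest set bit. The sketch below (the function name `floorLog2` is hypothetical) shows why such a routine is branch-heavy next to a single quarter-rate hardware `log2`.

```cpp
#include <cassert>
#include <cstdint>

// Returns floor(log2(v)) for v > 0 with five tests, five additions and four shifts.
// Assumption: this stands in for the unspecified fast-log mentioned in the notes.
inline int floorLog2(std::uint32_t v) {
    assert(v != 0);  // log2(0) is undefined
    int r = 0;
    if (v & 0xFFFF0000u) { v >>= 16; r += 16; }  // highest bit in the upper half?
    if (v & 0x0000FF00u) { v >>= 8;  r += 8;  }
    if (v & 0x000000F0u) { v >>= 4;  r += 4;  }
    if (v & 0x0000000Cu) { v >>= 2;  r += 2;  }
    if (v & 0x00000002u) {           r += 1;  }
    return r;
}
```

For example, `floorLog2(1000)` walks through the tests and returns 9. On a GPU each of those tests is a potential divergence point, which is consistent with the conclusion above that the built-in log is hard to beat.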
Optimization of fast atan (I have only tried the diamond angle so far; I will try CORDIC later.)
- Simple comparison of atan2 and the diamond angle.
- A test on Shadertoy: https://www.shadertoy.com/view/lllyR4
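
The notes do not give the diamond-angle formula, so the following is a sketch of the commonly circulated formulation, written in C++ rather than GLSL. It replaces `atan2` with a value in [0, 4) that increases monotonically with the true angle and needs only comparisons and one division. The name `diamondAngle` is an assumption; the result is not proportional to the real angle, so whether the distortion is acceptable for the log-polar mapping still has to be checked.

```cpp
#include <cassert>

// Maps the direction (x, y) to [0, 4): 0 = +x, 1 = +y, 2 = -x, 3 = -y.
// Monotonic in the true angle, but not linear in it.
inline float diamondAngle(float y, float x) {
    assert(x != 0.0f || y != 0.0f);  // the direction of (0, 0) is undefined
    if (y >= 0.0f) {
        return x >= 0.0f ? y / (x + y)             // first quadrant:  [0, 1]
                         : 1.0f - x / (-x + y);    // second quadrant: (1, 2]
    }
    return x < 0.0f ? 2.0f - y / (-x - y)          // third quadrant:  (2, 3)
                    : 3.0f + x / (x - y);          // fourth quadrant: [3, 4)
}
```

The argument order follows `atan2(y, x)`. A direction at 45 degrees gives 0.5, so multiplying by a constant does not recover radians; the value is best suited to comparisons and lookups.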
Make the shader more complex to extend the rendering time to greater than 16ms

08/27/2017

Permutation:

http://blog.csdn.net/hackbuteer1/article/details/6657435

Tomorrow:

CORDIC

FastLog
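Question: Can GLSL do bitwise operations?
- Answer 1: Yes. https://web.cs.ship.edu/~djmoon/cg/cg-notes/cg-glsl-language.pdf
- Answer 2: No. (Both can be right: integer bitwise operators were added in GLSL 1.30, so older versions lack them.)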

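How to build a 5D array in C++

If I want to have Arr[const][variable][const][variable][const], what should I do?

1. Use a 5-layer typedef

typedef int A1[9];   // innermost constant dimension
typedef A1 *A2;      // variable dimension
typedef A2 A3[8];    // constant dimension
typedef A3 *A4;      // variable dimension
typedef A4 A5[7];    // outermost constant dimension

int main()
{
    A5 x = {};       // A5 is already the full 5D type; its pointers start out null
    return 0;
}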

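2. Use only one typedef

typedef int (*(*B[7])[8])[9];   // same type as A5: [7][variable][8][variable][9]

int main()
{
    B y = {};        // an array cannot be initialized with 0; {} nulls all pointers
    return 0;
}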
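
The typedefs only declare the type; the two variable dimensions still have to be allocated before the array can be indexed. The sketch below shows one way to do that with `new[]`. The helper names `allocate` and `release` and the parameters `n1` and `n3` are assumptions introduced for illustration.

```cpp
#include <cassert>
#include <cstddef>

typedef int A1[9];
typedef A1 *A2;      // variable dimension, length n3
typedef A2 A3[8];
typedef A3 *A4;      // variable dimension, length n1
typedef A4 A5[7];

// Allocates both variable dimensions and zero-initializes every int.
inline void allocate(A5& a, std::size_t n1, std::size_t n3) {
    assert(n1 > 0 && n3 > 0);
    for (int i = 0; i < 7; ++i) {
        a[i] = new A3[n1];                     // n1 blocks of 8 pointers
        for (std::size_t j = 0; j < n1; ++j)
            for (int k = 0; k < 8; ++k)
                a[i][j][k] = new A1[n3]();     // n3 rows of 9 zeroed ints
    }
}

// Frees everything allocate() created and nulls the top-level pointers.
inline void release(A5& a, std::size_t n1) {
    for (int i = 0; i < 7; ++i) {
        for (std::size_t j = 0; j < n1; ++j)
            for (int k = 0; k < 8; ++k)
                delete[] a[i][j][k];
        delete[] a[i];
        a[i] = nullptr;
    }
}
```

After `allocate(x, 2, 3)`, an element is addressed as `x[i][j][k][l][m]` with `i < 7`, `j < 2`, `k < 8`, `l < 3`, `m < 9`. In new code a flat `std::vector<int>` with computed indices is usually easier to manage, but the version above keeps the pointer layout that the typedefs describe.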

08/22/2017

Research:

8:30 am – 11:30 am

meet with Var
- need to figure out the advantage of our algorithm
Try to update VS15 to get DirectX SDK

3:00 pm – 6:00 pm

Variance sampling TAA
Write paper

Tomorrow:

Read the push-pull paper
Read the Europe log-polar paper
Think about an elliptical log-polar mapping

Leetcode:

https://leetcode.com/problems/integer-break/description/

Direct3D Resources

Official Tutorial:

https://code.msdn.microsoft.com/Direct3D-Tutorial-Win32-829979ef?SRC=VSIDE

http://blog.csdn.net/xueyedie1234/article/details/51315640

08/18/2017

Decouple the shading rate and the visibility rate from pixels: this leaves room for anti-aliasing and coarse pixel shading.

Texel Shading (shading rate reduction):

We show performance improvements in three ways. First, we show some improvement for the “small triangle problem”. Second, we reuse shading results from previous frames. Third, we enable dynamic spatial shading rate choices, for further speedups.

Visibility: updating visibility at the full frame rate.

Shading rate: dynamically varying the spatial shading rate by simply biasing the mipmap level choice, texel shading and temporal shading reuse

Some reasons for the increased shading cost:

The first is the mapping from pixels to texels
The second source of shading increase is in the caching system.

Process: deferred decoupled shading

Rasterization records texel accesses as shading work rather than running a shade per pixel. Shading is performed by a separate compute stage, which stores the results in a texture. A final stage collects the data from the texture.
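
The three stages can be mimicked on the CPU with a toy model. This is a sketch under strong assumptions, not the paper's implementation: `shadeTexel` is a placeholder shader, each pixel touches exactly one texel, and the shading texture is a hash map. It only illustrates that shading work scales with the number of distinct texels rather than the number of pixels.

```cpp
#include <cassert>
#include <cstddef>
#include <unordered_map>
#include <vector>

// Placeholder shader: any pure function of the texel id would do.
inline float shadeTexel(int texel) { return 0.5f * static_cast<float>(texel); }

struct FrameResult {
    std::vector<float> pixels;  // final value per pixel
    std::size_t shadeCount;     // number of texels that were shaded
};

// pixelToTexel[i] is the texel that pixel i touched during rasterization.
inline FrameResult renderDecoupled(const std::vector<int>& pixelToTexel) {
    // Stage 1 (rasterization): record texel accesses as shading work.
    std::unordered_map<int, float> shadingTexture;
    for (int texel : pixelToTexel) shadingTexture.emplace(texel, 0.0f);

    // Stage 2 (compute): shade each requested texel exactly once.
    for (auto& entry : shadingTexture) entry.second = shadeTexel(entry.first);

    // Stage 3 (collect): every pixel reads its result from the texture.
    FrameResult result{{}, shadingTexture.size()};
    result.pixels.reserve(pixelToTexel.size());
    for (int texel : pixelToTexel) result.pixels.push_back(shadingTexture.at(texel));
    assert(result.pixels.size() == pixelToTexel.size());
    return result;
}
```

With six pixels that touch only three distinct texels, `shadeCount` is 3, not 6. Keeping the map across frames instead of rebuilding it would be the analogue of the temporal reuse mentioned above.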

Object Space Lighting:

Inspired by REYES (Renders Everything You Ever Saw)

Overall process

All objects in the game are submitted for shading and rasterization and queued for processing.
During the submission step, the estimated projected area of each object is calculated, so each object requests a certain amount of shading.
During shading, the system allocates texture space for all objects that require shading. If the total request is more than the available shading space, the shading rate of all objects is progressively scaled down until everything fits.
Material shading occurs, processing each material layer for each object. Results are accumulated into the master shading texture(s).
Mips are calculated on the master shading texture as appropriate.
Rasterization step: each object references the results of the shading step. A 1:1 correspondence is not strictly required, but this feature is rarely used.
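
The allocation step can be sketched as follows. The notes only say that objects are "progressively scaled" until they fit; halving a shared scale factor is an assumption made here, and `fitShadingRequests` is a hypothetical name.

```cpp
#include <cassert>
#include <numeric>
#include <vector>

// Scales every object's requested texel count by one shared factor, halving
// the factor until the total fits into the budget. Returns the factor used.
inline double fitShadingRequests(std::vector<double>& requestedTexels, double budget) {
    assert(budget > 0.0);
    const double total =
        std::accumulate(requestedTexels.begin(), requestedTexels.end(), 0.0);
    double scale = 1.0;
    while (total * scale > budget) scale *= 0.5;  // progressive reduction (assumed step)
    for (double& request : requestedTexels) request *= scale;
    return scale;
}
```

Requests of 4096, 2048 and 2048 texels against a budget of 3000 end up scaled by 0.25, giving 1024, 512 and 512. Because every object gets the same factor, the relative shading rates implied by the projected-area estimates are preserved.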

AMFS:

Our architecture is also the first to support pixel shading at multiple different rates, unrestricted by the tessellation or visibility sampling rates.

automatic shading reuse between triangles in tessellated primitives

we decouple pixel shading from screen space
it allows lazy shading and reuse simultaneously at multiple different frequencies

enables a wider use of tessellation and fine geometry, even at very limited power budgets
