My career has mainly been in visual computing. In the early stage I did some image processing work, and then switched to games. This article is an attempt to give a brief, simple overview of the visual-related areas.
1. What is a pixel?
2. How to see?
3. How to draw?
4. How does the computer think?
5. How does the computer draw?
1. What is a pixel?
There are a lot of answers, depending on the context. We could generally refer to a pixel as:
a. Picture element.
The name pixel is short for "pix element", and pix is short for picture, so a pixel is a picture element. An image is divided into a rectangular grid of discrete dots known as pixels. One common interpretation is that a pixel is the minimal square block in an image. For example, an image with resolution 200x100 has 20,000 pixels: 200 horizontally and 100 vertically.
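As a quick illustration, here is a minimal sketch of that grid idea in Python (NumPy is just one convenient choice; any 2D array of color triples would do):

```python
import numpy as np

# A 200x100 image stored as a grid of pixels: 100 rows, 200 columns,
# 3 color channels (R, G, B) per pixel, one byte per channel.
width, height = 200, 100
image = np.zeros((height, width, 3), dtype=np.uint8)

# Set the pixel at column 10, row 5 to pure red.
image[5, 10] = [255, 0, 0]

print(image.shape)     # (100, 200, 3)
print(height * width)  # 20000 pixels in total
```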
b. A light blob.
From the hardware point of view, this is quite true. A pedestrian traffic light normally consists of small light blobs. Those light blobs can show a green walking-man animation indicating that pedestrians are allowed to pass.
c. Three light blobs.
One answer on Reddit to the question "is a pixel a light blob?":
"In terms of a screen, a pixel is the smallest thing that makes up the image on a screen. It's probably not fair in most cases to call it a bulb - apart from plasma screens, they don't usually light up by themselves.
A pixel itself, on LCD or Plasma screens (LED is a sub-type of LCD) is made up of three sub-pixels, a red, green and blue one - together, the three sub-pixels define the color of the pixel."
2. How to see?
Humans see things through their eyes. The retina receives the light that passes through the lens.
Cameras work similarly. Light passes through the lens and is projected onto the camera's "retina".
A conventional camera's retina is film: it relies entirely on chemical and mechanical processes to form an image.
A digital camera's retina is a semiconductor sensor that records light electronically.
Humans see things passively; a laser scanner sees things actively. One type works by sending out a pulse of light; the reflection returns to the sensor, and the travel time of the light is used to compute the distance.
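As a small sketch of that time-of-flight idea (the function name and the example numbers are just for illustration):

```python
# Minimal time-of-flight sketch: the pulse travels to the object and back,
# so the one-way distance is half the round-trip distance.
SPEED_OF_LIGHT = 299_792_458.0  # meters per second

def distance_from_travel_time(round_trip_seconds: float) -> float:
    return SPEED_OF_LIGHT * round_trip_seconds / 2.0

# A pulse that returns after 66.7 nanoseconds corresponds to roughly 10 meters.
print(distance_from_travel_time(66.7e-9))  # ≈ 10.0
```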
What is seen is stored in a medium. We store it in our memory. Conventional cameras store it on film. Digital sensors (digital cameras, laser scanners) store the data in electronic devices; a digital camera often stores it on an SD card.
3. How to draw?
It is said that Nikola Tesla could project his thoughts back onto his retina. Normal people can try to draw things the way da Vinci did, by hand. The drawing process often goes from coarse to fine, as our memory is very fuzzy.
For conventional cameras, film can be developed into photos. The drawing process reveals the captured information through chemical reactions. A good thing about this approach is that the resolution is not limited to a fixed pixel grid.
A digital camera just copies the data stored on the SD card and lets the screen (a grid of light blobs) display it.
4. How does the computer think?
With more and more sensors available, for example infrared laser scanners, a computer can capture information about the environment quite well.
The computer processes the data captured through sensors. Processing 3D models captured by scanners falls under computational geometry. Processing images falls under computer vision. From images, a computer can derive 3D information; see stereo vision and structure from motion (SfM). These work by computing correspondences between images, then recovering the camera parameters (intrinsic and extrinsic) and the depth information (using triangulation).
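For a feel of the triangulation step, here is a minimal sketch of depth from a rectified stereo pair; the function and the example numbers are hypothetical, and it assumes the standard relation depth = focal length × baseline / disparity:

```python
def depth_from_disparity(focal_px: float, baseline_m: float, disparity_px: float) -> float:
    """Depth of a point seen by a rectified stereo pair.

    focal_px     -- camera focal length, in pixels (intrinsic parameter)
    baseline_m   -- distance between the two camera centers (extrinsic parameter)
    disparity_px -- horizontal shift of the matched point between the two images
    """
    return focal_px * baseline_m / disparity_px

# A point that shifts 20 pixels between two cameras 0.1 m apart,
# with a 700-pixel focal length, is about 3.5 m away.
print(depth_from_disparity(700.0, 0.1, 20.0))  # ≈ 3.5
```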
So a well-equipped computer can see the world much like a human and do some simple thinking, for example detecting simple shapes such as planes.
5. How does the computer draw?
From the data to the computer monitor, how does the computer draw? It depends on the data.
If the data is just an image, drawing is simply a matter of showing the color information through the light blobs.
However, if the data is a model plus color information, then with a lot of computation the computer will render the data into an image and display it. This is called rendering, which falls under computer graphics.
The renderer is the software doing the rendering. The hardware could be the CPU alone, or CPU+GPU. CPU-only rendering is referred to as software rendering.
The rendering process can be treated as a data-processing pipeline: vertex data goes through a series of coordinate transformations, projection, and normalization, gets assigned color information, and is shown on the screen. For more details, refer to the OpenGL rendering pipeline as an example.
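Here is a rough sketch of that pipeline for a single vertex (the function name is made up, and it assumes a combined 4x4 model-view-projection matrix and a top-left screen origin):

```python
import numpy as np

def vertex_to_screen(vertex_xyz, mvp, screen_w, screen_h):
    """Take one vertex from model space to pixel coordinates.

    vertex_xyz -- (x, y, z) position in model space
    mvp        -- 4x4 model-view-projection matrix (a combined transform)
    """
    # Homogeneous coordinate (x, y, z, w).
    v = np.append(np.asarray(vertex_xyz, dtype=float), 1.0)
    clip = mvp @ v                       # coordinate transformation + projection
    ndc = clip[:3] / clip[3]             # perspective divide -> normalized device coords in [-1, 1]
    x = (ndc[0] + 1.0) * 0.5 * screen_w  # viewport transform -> pixel coordinates
    y = (1.0 - ndc[1]) * 0.5 * screen_h  # flip Y: screen origin is top-left
    return int(x), int(y), float(ndc[2])  # keep z for the depth buffer later

# With an identity MVP, the origin lands in the middle of a 200x100 screen.
print(vertex_to_screen((0.0, 0.0, 0.0), np.eye(4), 200, 100))  # (100, 50, 0.0)
```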
Software rendering can be broken down into the following problems.
P1. Given a vertex, how to draw it as a point?
The vertex data goes through a series of transformations to derive its location on the screen, and is then drawn there (as in the vertex_to_screen sketch above).
P2. Given two vertices, how to draw a line segment?
a. Draw each vertex on the screen.
b. Recursively draw the midpoint of the two screen points, or use Bresenham's algorithm as a more efficient approach (see the sketch below).
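A minimal sketch of option (b), using the recursive midpoint idea (the screen here is just a dictionary standing in for a framebuffer; Bresenham's algorithm would replace the recursion in practice):

```python
def draw_point(screen, x, y, color):
    # screen is a dict keyed by (x, y); a real framebuffer would be a 2D array.
    screen[(x, y)] = color

def draw_line_midpoint(screen, p0, p1, color):
    """Recursively plot the midpoint between two screen points."""
    x0, y0 = p0
    x1, y1 = p1
    draw_point(screen, x0, y0, color)
    draw_point(screen, x1, y1, color)
    # Stop when the two endpoints are (nearly) adjacent pixels.
    if abs(x1 - x0) <= 1 and abs(y1 - y0) <= 1:
        return
    mid = ((x0 + x1) // 2, (y0 + y1) // 2)
    draw_line_midpoint(screen, p0, mid, color)
    draw_line_midpoint(screen, mid, p1, color)

screen = {}
draw_line_midpoint(screen, (0, 0), (7, 3), (255, 255, 255))
print(sorted(screen))  # pixels approximating the segment from (0, 0) to (7, 3)
```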
P3. How to draw a triangle?
a. Draw the triangle edges using the line-segment drawing above.
b. Fill in the triangle's interior (for example, sweep the triangle vertically based on the Y value and fill each horizontal line). This is called scan-line fill; a sketch follows.
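A sketch of such a scan-line fill, again with a dictionary standing in for the framebuffer and integer vertex coordinates assumed:

```python
def fill_triangle_scanline(screen, v0, v1, v2, color):
    """Sweep horizontal scan lines from the triangle's lowest Y to its
    highest Y and fill between the edge crossings."""
    verts = [v0, v1, v2]
    y_min = min(v[1] for v in verts)
    y_max = max(v[1] for v in verts)
    for y in range(y_min, y_max + 1):
        xs = []
        # Intersect the scan line with each of the three edges.
        for (ax, ay), (bx, by) in ((v0, v1), (v1, v2), (v2, v0)):
            if ay == by:
                if ay == y:            # horizontal edge lying on this scan line
                    xs += [ax, bx]
                continue
            if min(ay, by) <= y <= max(ay, by):
                t = (y - ay) / (by - ay)
                xs.append(round(ax + t * (bx - ax)))
        if xs:
            for x in range(min(xs), max(xs) + 1):
                screen[(x, y)] = color

screen = {}
fill_triangle_scanline(screen, (0, 0), (8, 0), (4, 5), (255, 0, 0))
```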
P4. How to draw many triangles?
Since triangles might overlap each other and the front-most surface should be the one drawn (assuming the triangles are opaque), we need a buffer that stores the front-most depth value for every pixel. This buffer is referred to as the depth buffer.
For each triangle, do the following:
a. Draw the triangle to a temporary screen.
b. For each pixel on the temporary screen:
- compute the pixel's depth by interpolating the depths of the three triangle vertices;
- if the pixel's depth value is smaller than the depth buffer value, draw it to the main screen and update the depth buffer with the pixel's depth; otherwise, discard it.
A sketch of the depth test follows.
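A sketch of the per-pixel depth test, assuming smaller z means closer to the camera (the convention is a choice):

```python
import math

def draw_pixel_with_depth(screen, depth_buffer, x, y, z, color):
    """Keep only the front-most fragment at (x, y)."""
    if z < depth_buffer.get((x, y), math.inf):
        depth_buffer[(x, y)] = z   # remember the closest depth seen so far
        screen[(x, y)] = color     # and show its color
    # otherwise the fragment is hidden behind something already drawn: discard

screen, depth_buffer = {}, {}
draw_pixel_with_depth(screen, depth_buffer, 10, 5, 0.8, (0, 0, 255))  # far blue fragment
draw_pixel_with_depth(screen, depth_buffer, 10, 5, 0.3, (255, 0, 0))  # nearer red fragment wins
draw_pixel_with_depth(screen, depth_buffer, 10, 5, 0.6, (0, 255, 0))  # hidden, discarded
print(screen[(10, 5)])  # (255, 0, 0)
```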
P5. How to draw a mesh? A mesh is just a set of connected triangles, so draw each triangle as in P4.
P6. How to handle point lighting? There are two common approaches (a sketch follows the list).
a. Flat shading:
a.1 Compute the normal of the triangle.
a.2 Scale the color of each pixel by the cosine of the angle between the light vector and the normal vector: (color.r*ndotL, color.g*ndotL, color.b*ndotL, 1).
b. Gouraud shading:
b.1 For each triangle vertex, compute the normal by averaging the normals of all triangles that share the vertex.
b.2 Same as a.2, applied per vertex.
b.3 Interpolate the color for each interior point while doing the scan-line fill.
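A sketch of the core of both approaches: the face normal (a.1) and the n·L scaling (a.2). For Gouraud shading the same scaling is applied per vertex with averaged normals, and the resulting colors are interpolated during the scan-line fill. NumPy and the example values are just for illustration:

```python
import numpy as np

def triangle_normal(p0, p1, p2):
    """Unit normal of a triangle: cross product of two of its edges."""
    n = np.cross(np.asarray(p1) - np.asarray(p0), np.asarray(p2) - np.asarray(p0))
    return n / np.linalg.norm(n)

def flat_shade(color, normal, light_dir):
    """Scale the color by n.L, the cosine of the angle between the surface
    normal and the (normalized) direction toward the light."""
    l = np.asarray(light_dir, dtype=float)
    l = l / np.linalg.norm(l)
    ndotl = float(max(np.dot(normal, l), 0.0))  # surfaces facing away stay dark
    r, g, b = color
    return (r * ndotl, g * ndotl, b * ndotl, 1.0)

# A triangle in the z=0 plane lit straight down the +z axis is fully lit.
n = triangle_normal((0, 0, 0), (1, 0, 0), (0, 1, 0))
print(flat_shade((1.0, 0.5, 0.25), n, (0, 0, 1)))  # (1.0, 0.5, 0.25, 1.0)
```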
P7. How to apply a texture to a mesh? Each mesh vertex has a texture coordinate, called a UV coordinate.
When drawing a triangle of the mesh:
- the color of each triangle vertex can be looked up in the texture using its given UV coordinate;
- the color of each interior point can be looked up using a UV coordinate derived by interpolating the three vertices' UV coordinates (sketched below).
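A sketch of the texture lookup and the UV interpolation, using barycentric weights for the interior point (nearest-neighbor sampling and the [0, 1] UV convention are assumptions):

```python
import numpy as np

def sample_texture(texture, u, v):
    """Nearest-neighbor texture lookup. texture is an HxWx3 array;
    (u, v) are in [0, 1] with (0, 0) at one corner."""
    h, w, _ = texture.shape
    x = min(int(u * (w - 1)), w - 1)
    y = min(int(v * (h - 1)), h - 1)
    return texture[y, x]

def interpolate_uv(bary, uv0, uv1, uv2):
    """UV of an interior point from its barycentric weights (b0, b1, b2)
    with respect to the triangle's three vertices."""
    b0, b1, b2 = bary
    u = b0 * uv0[0] + b1 * uv1[0] + b2 * uv2[0]
    v = b0 * uv0[1] + b1 * uv1[1] + b2 * uv2[1]
    return u, v

# The centroid's UV is the average of the three corners' UVs.
checker = np.zeros((2, 2, 3), dtype=np.uint8)
checker[0, 1] = checker[1, 0] = 255
uv = interpolate_uv((1/3, 1/3, 1/3), (0.0, 0.0), (1.0, 0.0), (0.0, 1.0))
print(uv, sample_texture(checker, *uv))
```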
In the rendering process there are many coordinate notations, e.g., homogeneous coordinates (x, y, z, w), screen coordinates (x, y), 3D coordinates (x, y, z), and UV coordinates (u, v). Together these notations form a simple alphabet song: U, V, W, X, Y, Z.
CPU+GPU rendering is a more advanced topic that consists of many sub-topics. However, the underlying concepts are the same as in software rendering.
I have gone through a lot of books to study rendering. In the end, the easiest way to understand it may be to:
- understand software rendering
- study the architecture of the GPU and learn how each piece of GPU hardware plays its role by substituting for a software rendering component
- study how the CPU and GPU interact
- treat your eye as a computer camera