Thursday, November 10, 2011

WebGL Pre-Tutorial, Part 2: Drawing a 2d Triangle

This part of the pre-tutorial will show you how to write a very basic WebGL program that draws a 2d triangle on the screen.


Unfortunately, even the simplest WebGL program is somewhat long and involved. Below is the complete code for the program followed by more detailed explanations of the different parts of that code.

<!doctype html>

<canvas width="500" height="500" id="mainCanvas"></canvas>

<script>
function main()
{
   // Configure the canvas to use WebGL
   //
   var gl;
   var canvas = document.getElementById('mainCanvas');
   gl = canvas.getContext('webgl');
   // getContext() returns null instead of throwing when WebGL
   // is unavailable, so check the result explicitly
   if (!gl)
      throw new Error('no WebGL found');

   // Copy an array of data points forming a triangle to the
   // graphics hardware
   //
   var vertices = [
      0.0, 0.5,
      0.5,  -0.5,
      -0.5, -0.5,
   ];
   var buffer = gl.createBuffer();
   gl.bindBuffer(gl.ARRAY_BUFFER, buffer);
   gl.bufferData(gl.ARRAY_BUFFER, new Float32Array(vertices), gl.STATIC_DRAW);

   // Create a simple vertex shader
   //
   var vertCode =
        'attribute vec2 coordinates;' +
        'void main(void) {' +
        '  gl_Position = vec4(coordinates, 0.0, 1.0);' +
        '}';

   var vertShader = gl.createShader(gl.VERTEX_SHADER);
   gl.shaderSource(vertShader, vertCode);
   gl.compileShader(vertShader);
   if (!gl.getShaderParameter(vertShader, gl.COMPILE_STATUS))
      throw new Error(gl.getShaderInfoLog(vertShader));

   // Create a simple fragment shader
   //
   var fragCode =
      'void main(void) {' +
      '   gl_FragColor = vec4(1.0, 1.0, 1.0, 1.0);' +
      '}';

   var fragShader = gl.createShader(gl.FRAGMENT_SHADER);
   gl.shaderSource(fragShader, fragCode);
   gl.compileShader(fragShader);
   if (!gl.getShaderParameter(fragShader, gl.COMPILE_STATUS))
      throw new Error(gl.getShaderInfoLog(fragShader));

   // Put the vertex shader and fragment shader together into
   // a complete program
   //
   var shaderProgram = gl.createProgram();
   gl.attachShader(shaderProgram, vertShader);
   gl.attachShader(shaderProgram, fragShader);
   gl.linkProgram(shaderProgram);
   if (!gl.getProgramParameter(shaderProgram, gl.LINK_STATUS))
      throw new Error(gl.getProgramInfoLog(shaderProgram));

   // Everything we need has now been copied to the graphics
   // hardware, so we can start drawing

   // Clear the drawing surface
   //
   gl.clearColor(0.0, 0.0, 0.0, 1.0);
   gl.clear(gl.COLOR_BUFFER_BIT);

   // Tell WebGL which shader program to use
   //
   gl.useProgram(shaderProgram);

   // Tell WebGL that the data from the array of triangle
   // coordinates that we've already copied to the graphics
   // hardware should be fed to the vertex shader as the
   // parameter "coordinates"
   //
   var coordinatesVar = gl.getAttribLocation(shaderProgram, "coordinates");
   gl.enableVertexAttribArray(coordinatesVar);
   gl.bindBuffer(gl.ARRAY_BUFFER, buffer);
   gl.vertexAttribPointer(coordinatesVar, 2, gl.FLOAT, false, 0, 0);

   // Now we can tell WebGL to draw the 3 points that make
   // up the triangle
   //
   gl.drawArrays(gl.TRIANGLES, 0, 3);
}

window.onload = main;
</script>

The first part of the HTML page creates the HTML element where the actual 3d drawing will occur. WebGL uses the canvas element to define its drawing surface. The same element can also be used in HTML5 for doing 2d drawing.

<canvas width="500" height="500" id="mainCanvas"></canvas>

Then, we get into the actual code. First, we need to configure the canvas for use with WebGL instead of for 2d drawing by grabbing a WebGL context object that lets us invoke WebGL commands on the canvas.

var gl;
var canvas = document.getElementById('mainCanvas');
gl = canvas.getContext('webgl');
// getContext() returns null instead of throwing when WebGL
// is unavailable, so check the result explicitly
if (!gl)
   throw new Error('no WebGL found');

Next, we have an array holding the coordinates of the three points that make up a triangle.
As mentioned in part 1, we need to copy this triangle data to the graphics hardware before we can draw it. This is done by using createBuffer() to tell WebGL that we want to set aside some memory on the graphics hardware for our data, bindBuffer() to select this buffer as the one we want to manipulate, and then bufferData() to actually copy the triangle data to the currently selected buffer on the graphics hardware.

var vertices = [
   0.0, 0.5,
   0.5,  -0.5,
   -0.5, -0.5,
];
var buffer = gl.createBuffer();
gl.bindBuffer(gl.ARRAY_BUFFER, buffer);
gl.bufferData(gl.ARRAY_BUFFER, new Float32Array(vertices), gl.STATIC_DRAW);

The “vertices” array that holds the triangle data is put inside a Float32Array object before being sent to the bufferData() call. This Float32Array object specifies how the array data should be laid out in memory. JavaScript is purposely vague about the exact memory layout of objects, but these details are important when working with graphics hardware. In this case, we specify that the triangle data should be stored as consecutive 32-bit floating point numbers.
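As a small illustration of that layout (not part of the program above), the following sketch inspects the Float32Array directly. The variable names packed, bytesPerValue, and totalBytes are made up for this example.

```javascript
// Illustrative only: inspect how a Float32Array lays out the triangle data.
var vertices = [
   0.0, 0.5,
   0.5, -0.5,
   -0.5, -0.5
];
var packed = new Float32Array(vertices);

// Every element occupies exactly 4 bytes (32 bits)
var bytesPerValue = Float32Array.BYTES_PER_ELEMENT;

// The six values are stored back to back, so the total is 6 * 4 bytes
var totalBytes = packed.byteLength;

console.log(packed.length + ' values, ' + totalBytes + ' bytes');
```

This fixed, gap-free layout is what bufferData() copies to the graphics hardware.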

As mentioned in part 1, the WebGL graphics pipeline requires us to define vertex shader and fragment shader programs in order to draw anything on the screen. We first define the vertex shader program.

var vertCode =
     'attribute vec2 coordinates;' +
     'void main(void) {' +
     '  gl_Position = vec4(coordinates, 0.0, 1.0);' +
     '}';

var vertShader = gl.createShader(gl.VERTEX_SHADER);
gl.shaderSource(vertShader, vertCode);
gl.compileShader(vertShader);
if (!gl.getShaderParameter(vertShader, gl.COMPILE_STATUS))
   throw new Error(gl.getShaderInfoLog(vertShader));

The actual code for the vertex shader is held in a string. The vertex shader lets us move the points of a triangle, or apply other small transformations to them, before they are displayed. Each point that makes up a triangle is given to the vertex shader, and the vertex shader program returns the final position of the point, plus any additional data that it may want to specify. In our case, we don't need to move the points around, but we do need to put the data in the proper form for WebGL. WebGL displays the parts of triangles that fit inside the 3d cube between (-1,-1,-1) and (1,1,1). Our triangle coordinates are only (x,y) values, so we need to specify an extra z value so that WebGL can determine where the triangle is in 3d space. We just use 0 for this z coordinate. In fact, WebGL needs us to specify four values: x, y, z, and a fourth value that is normally 1. So our vertex shader will take the 2d (x,y) coordinates for a point in the triangle and transform them to (x,y,0,1).

attribute vec2 coordinates;
void main(void) {
  gl_Position = vec4(coordinates, 0.0, 1.0);
}
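To make the transformation concrete, here is a rough JavaScript imitation of what the shader does to a single point. It runs on the CPU and is only a sketch. The helpers toClipPosition and insideCube are hypothetical names invented for this example, and the cube test assumes the fourth value is 1.

```javascript
// CPU-side imitation of the vertex shader (hypothetical helper names).
function toClipPosition(coordinates) {
   // Same idea as vec4(coordinates, 0.0, 1.0) in the shader
   return [coordinates[0], coordinates[1], 0.0, 1.0];
}

// True if the point lies inside the cube between (-1,-1,-1) and (1,1,1).
// Assumes the fourth value of the position is 1.
function insideCube(position) {
   for (var i = 0; i < 3; i++) {
      if (position[i] < -1 || position[i] > 1) return false;
   }
   return true;
}

var apex = toClipPosition([0.0, 0.5]);
var apexVisible = insideCube(apex);
var farPointVisible = insideCube(toClipPosition([1.5, 0.0]));
```

The top point of our triangle lands inside the cube, while a point with an x value of 1.5 would fall outside it and be chopped off.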

Next we define the fragment shader.

var fragCode =
   'void main(void) {' +
   '   gl_FragColor = vec4(1.0, 1.0, 1.0, 1.0);' +
   '}';

var fragShader = gl.createShader(gl.FRAGMENT_SHADER);
gl.shaderSource(fragShader, fragCode);
gl.compileShader(fragShader);
if (!gl.getShaderParameter(fragShader, gl.COMPILE_STATUS))
   throw new Error(gl.getShaderInfoLog(fragShader));

A fragment shader controls the color of each pixel making up the triangle. For each pixel, a fragment shader should return four numbers between 0.0 and 1.0 describing the color: the amount of red, the amount of green, the amount of blue, and an alpha value that controls opacity (1.0 is fully opaque). If we look at the code, we can see that our fragment shader is fairly simple. It uses the same color for every pixel: an opaque white.

void main(void) {
   gl_FragColor = vec4(1.0, 1.0, 1.0, 1.0);
}
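For readers more used to CSS colors, the sketch below converts the four shader values into a CSS-style rgba() string. The helper toCssColor is a made-up name for illustration and is not something WebGL provides.

```javascript
// Hypothetical helper: express a shader color (components from 0.0 to 1.0)
// as a CSS-style rgba() string (red, green, blue from 0 to 255).
function toCssColor(rgba) {
   var r = Math.round(rgba[0] * 255);
   var g = Math.round(rgba[1] * 255);
   var b = Math.round(rgba[2] * 255);
   return 'rgba(' + r + ', ' + g + ', ' + b + ', ' + rgba[3] + ')';
}

var white = toCssColor([1.0, 1.0, 1.0, 1.0]);
var opaqueRed = toCssColor([1.0, 0.0, 0.0, 1.0]);
```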

After defining a vertex shader and fragment shader, we need to put these two shaders together in a single program for drawing things.

var shaderProgram = gl.createProgram();
gl.attachShader(shaderProgram, vertShader);
gl.attachShader(shaderProgram, fragShader);
gl.linkProgram(shaderProgram);
if (!gl.getProgramParameter(shaderProgram, gl.LINK_STATUS))
   throw new Error(gl.getProgramInfoLog(shaderProgram));

Now that we've copied the triangle data and shader programs over to the graphics hardware, we're finally ready to do some drawing. First, we clear the drawing surface to black so that the white triangle we're drawing will show up.

gl.clearColor(0.0, 0.0, 0.0, 1.0);
gl.clear(gl.COLOR_BUFFER_BIT);

Then we specify which shader program to use for drawing.

gl.useProgram(shaderProgram);

We then tell WebGL to feed our buffer of triangle data through this shader program. To do this, we need to specify how the values in this array of points should be given to the program. We want our point data to be given to the vertex shader as the “coordinates” variable. So we get a handle for this variable and configure the variable. We then select our buffer of data, and use vertexAttribPointer() to specify that the buffer should be divided into groups of two floating-point numbers, and these numbers should be fed into the shader program as the “coordinates” variable.

var coordinatesVar = gl.getAttribLocation(shaderProgram, "coordinates");
gl.enableVertexAttribArray(coordinatesVar);
gl.bindBuffer(gl.ARRAY_BUFFER, buffer);
gl.vertexAttribPointer(coordinatesVar, 2, gl.FLOAT, false, 0, 0);
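The grouping that vertexAttribPointer() describes can be imitated in plain JavaScript. This is only a sketch of the idea for a tightly packed buffer (a stride of 0 and an offset of 0, as in the call above), and splitIntoAttributes is a hypothetical helper, not a WebGL function.

```javascript
// Hypothetical helper: cut a tightly packed buffer into groups of "size"
// values, the way each vertex receives its own "coordinates" value.
function splitIntoAttributes(data, size) {
   var groups = [];
   for (var i = 0; i + size <= data.length; i += size) {
      groups.push(Array.prototype.slice.call(data, i, i + size));
   }
   return groups;
}

var packed = new Float32Array([0.0, 0.5, 0.5, -0.5, -0.5, -0.5]);
var points = splitIntoAttributes(packed, 2);
// points is [[0, 0.5], [0.5, -0.5], [-0.5, -0.5]]
```

Each of the three groups is what one run of the vertex shader sees as its “coordinates” variable.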

Finally, now that we've properly configured everything, we can instruct WebGL to actually draw something. We tell it to take the three (x,y) points in our array, feed them through the shader program, and draw the result as a triangle.

gl.drawArrays(gl.TRIANGLES, 0, 3);

So this is the end of the pre-tutorial. To include anything more would turn this into an actual tutorial.

Here is part 1 of the pre-tutorial in case you missed it.

Sunday, November 06, 2011

WebGL Pre-Tutorial, Part 1

I've recently tried to learn WebGL, and it turns out that it's pretty complicated. Most of the tutorials that I've found on the web have been geared towards getting the learner doing 3d as quickly as possible by skipping over a lot of details. This is probably what most learners want, but I find that this approach doesn't suit my learning style. I, personally, want to understand the details so that I can write my own 3d code instead of haphazardly cutting and pasting together code snippets that I've found on the Internet. People who just want to do 3d as quickly as possible would probably be better served by using a higher-level 3d graphics engine. The purpose of this pre-tutorial is to fill in some of the details that are left out of other WebGL tutorials. This should make it easier to understand what the code in those tutorials is trying to do. I will try my best to keep the explanations of this pre-tutorial as simple as possible, but WebGL is inherently a pretty low-level API, so it helps if people have some limited systems and 3d programming experience first.

WebGL is the JavaScript version of the 3d library OpenGL ES 2.0. OpenGL ES 2.0 is targeted as a low-level 3d API for phones and other mobile devices. Unlike the full OpenGL for desktop computers, OpenGL ES 2.0 leaves out support for high-precision numbers, which are mostly needed for scientific computation or engineering design, and it leaves out support for a lot of older library calls that aren't used in modern 3d programs. Despite this missing functionality, OpenGL ES 2.0 and WebGL are still great for games, and they are designed to provide good performance on modern 3d graphics hardware.

WebGL is designed for a hardware architecture like that shown below:


The architecture assumes that a machine's graphics hardware is separated from its CPU. The graphics hardware likely has separate memory from the CPU. The graphics processing unit (GPU) specializes in running multiple copies of a single program at the same time. Unlike a normal CPU, these programs must be small and simple, but a GPU can run many, many copies of these programs simultaneously, making it faster than a normal CPU for graphics tasks. With this sort of hardware architecture, one of the biggest bottlenecks for fast 3d programs is the communication between the CPU and GPU.

The general design philosophy of WebGL is based on the idea that this communication between CPU and GPU should be minimized. Instead of repeatedly sending instructions and graphics resources from the CPU to the GPU, all of this data should be copied over only once and kept on the GPU. This minimizes the communication that needs to happen between the CPU and GPU. The communication that does happen is batched together into clumps so that the GPU can work independently from the CPU.

WebGL Graphics Model


WebGL presents a certain abstract model of how the graphics hardware works to the programmer. In the WebGL graphics model, the window or area where 3d graphics are displayed is modeled as a cube. The (x,y,z) values of one corner of the cube are (-1,-1,-1), and the values of the opposite corner are (1,1,1).


An x value of -1 refers to the left side of the window while an x value of 1 refers to the right side of the window. Similarly, a y value of -1 refers to the bottom of the window while a y value of 1 refers to the top of the window. Different z values refer to how close or how far an object is.
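As a rough sketch of how these cube coordinates relate to pixels, the function below maps an (x,y) pair to a pixel position. It assumes the drawing area covers the whole canvas and that pixel rows are counted from the top, as they are for the 2d canvas. The name clipToPixel is made up for this example.

```javascript
// Hypothetical helper: map cube coordinates in [-1, 1] to pixel positions,
// assuming the drawing area covers the whole canvas and that pixel rows
// are counted from the top.
function clipToPixel(x, y, width, height) {
   return {
      px: (x + 1) / 2 * width,
      py: (1 - y) / 2 * height   // y = 1 is the top of the window
   };
}

var center = clipToPixel(0, 0, 500, 500);
var topLeft = clipToPixel(-1, 1, 500, 500);
var bottomRight = clipToPixel(1, -1, 500, 500);
```

For a 500 by 500 canvas, (0,0) lands in the middle at pixel (250,250), and (-1,1) lands at the top-left corner.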

You create graphics by drawing triangles in this cube. If you provide the 3d coordinates of the three corners of a triangle, WebGL will draw them on the window. Since the sides of the cube refer to the sides of the window, WebGL will chop off any parts of the triangle that fall outside of the cube.


Triangles are a little limiting, and it takes a lot of communication overhead to transfer all the triangle coordinates from the CPU to the GPU. To overcome this problem, WebGL offers two mechanisms for programming how these triangles are displayed.

The first mechanism is called the fragment shader. When drawing the pixels that make up a triangle, WebGL runs a little program for each pixel that says what color the pixel should be. You can have the triangle be a single color, rainbow-colored, or even a complex pattern.


The second mechanism is called the vertex shader. The main purpose behind the vertex shader is to reduce the amount of communication between the CPU and GPU. Suppose you have a complex 3d model made up of lots of triangles, and you send all of these triangles to the GPU to draw them. If you then want to move the model over to the left a little bit, you would normally need to change the positions of all the points of all the triangles and then send all those new coordinates from the CPU to the GPU. This is a bit wasteful because the coordinates of all those triangles are already stored at the GPU. You just wanted to move them a bit, but you had to send all the coordinates a second time. With a vertex shader, you can send a program to the GPU that will enable the GPU to rewrite the coordinates of all the triangles itself. This means you only have to send the triangles of the 3d model once, plus a little program for moving the coordinates of the triangles.
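Here is a sketch of what such a "little program" might look like. The shader source uses a uniform named offset, which is an assumption made for this example, and applyOffset is a hypothetical CPU-side imitation of what the GPU would do to each stored point.

```javascript
// GLSL source for a vertex shader that shifts every point by "offset".
// "offset" is a hypothetical uniform chosen for this illustration.
var movingVertCode =
   'attribute vec2 coordinates;' +
   'uniform vec2 offset;' +
   'void main(void) {' +
   '  gl_Position = vec4(coordinates + offset, 0.0, 1.0);' +
   '}';

// CPU-side imitation of what the GPU does for each point it already holds
function applyOffset(points, offset) {
   return points.map(function(p) {
      return [p[0] + offset[0], p[1] + offset[1]];
   });
}

var triangle = [[0.0, 0.5], [0.5, -0.5], [-0.5, -0.5]];
var moved = applyOffset(triangle, [-0.25, 0.0]);
// moved is [[-0.25, 0.5], [0.25, -0.5], [-0.75, -0.5]]
```

In a real program, only the two numbers of the offset would need to be sent to the GPU for each move (for example with gl.getUniformLocation() and gl.uniform2f()), not the whole list of points.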



So the WebGL graphics model can be thought of as a pipeline that processes triangles through different stages. You initially provide WebGL with some triangles to draw. The triangles can be moved around and shifted by vertex shader programs. Any triangles that fit within the cube at (-1,-1,-1) and (1,1,1) will be drawn. WebGL does this by running a fragment shader program for each pixel of the triangle to determine which color to draw there.


Part 2 of the pre-tutorial demonstrates some code for a very basic WebGL program.

Saturday, September 24, 2011

Arabic Support in Java

So a few months back, I started working on an Arabic version of JavaScript, which required me to figure out how to write an Arabic Java program. To my surprise, Java actually has pretty lousy support for BIDI and Arabic.

Now, I always suspected that the old Sun severely understaffed its Java UI group, and I always had this odd feeling that their UI programmers never actually programmed any real applications using their UI frameworks. The fact that Java, the major enterprise programming language of the last decade, still doesn't properly support Arabic even after all these years of development just tells you everything you need to know right there. IBM/Eclipse, with its SWT UI framework for Java, has had proper Arabic support for a long time now. Instead of doing everything alone, why didn't Sun get help from others when developing Java? Why? Mind you, SWT relies heavily on proper OS support for Arabic, while Java's Swing framework is designed in such a way that Sun has to reimplement Arabic support from scratch. And the Unicode character encoding was specifically designed so that text processing of Unicode is easy, but handling Unicode in a GUI is the most awfully complicated piece of code possible. But still, we're at Java 7 now, and they should have been able to get this basic piece of infrastructure supported by now.

It makes me wonder which programming languages computer science students are taught in the Arab world. It can't be Java. What's the point of learning a programming language that can't properly handle the input and display of your native language? I guess it must be C# or something.

Anyway, this is what I found. I was working in Windows in Java 6, so my observations may be specific to this combination.

In AWT, I tried setting the text direction using applyComponentOrientation(), but I could never get that thing to work. The text always stayed left-to-right using AWT widgets. In both AWT and Swing, there's a setLocale() method, but I have yet to figure out whether that method actually does anything. I suspect it doesn't do anything. So basically, I don't think there's any BIDI support in AWT. This is somewhat annoying because Windows has reasonable BIDI and Arabic support, so if Java simply passed on this text orientation information from AWT to Windows, then we could simply use AWT for our UIs, and everyone would be happy. Unfortunately, it doesn't.

In Swing, the applyComponentOrientation() method did seem to properly set the text orientation, so there's some good support for bi-directional text there. Unfortunately, for Arabic text, you also need support for text numeric shaping. Although some Arab nations use western number symbols (which, as you probably know, are called Arabic numerals, which makes things nice and confusing for everybody), most Arab nations have their own number symbols. Unicode, in its infinite wisdom, decided that these symbols will actually be encoded in character streams as western digits 0-9, but the UI will be responsible for automatically substituting in different number shapes when the characters are actually displayed. The Java graphics libraries actually have some support for numeric shaping when displaying text, but it's sort of broken. Java doesn't automatically extract region and locale information from the OS, so it defaults to guessing which numeric shapes it should use. If the string you're displaying starts with numbers, there's no initial context for the numeric shaper to use in guessing which number forms to use, so it will default to Western shapes. Also, the Java numeric shaper only reshapes numbers and forgets to reshape the decimal point and thousand separator. I tried a release candidate for Java 7, and this problem was still there. Does anybody at Sun/Oracle actually use their UI framework in real applications? Anyway, even this broken support for numeric shaping isn't actually enabled in Swing, so you can't display numbers in Arabic. In the Java 7 release candidate, programmers could manually enable Arabic numeric shaping in rich text widgets (i.e. JTextArea), but it wasn't clear whether that support was also added to other widgets like JButton or JLabel, etc. In any case, given the brokenness of Java numeric shaping, that isn't exactly a big win.

So in the end, if you want to build an Arabic GUI in Java, use SWT. Eclipse now has a free GUI builder available called Window Builder, and even though I didn't actually know any SWT, I was able to throw together a SWT GUI in a couple of hours with almost no work. GUI builders are awesome. Previously, I was always concerned that I would lose some flexibility in the code I could write, but the one in Eclipse is a real dream and saves a lot of time. I really should have paid money to buy one years ago.

Saturday, August 13, 2011

Getting Type 1 Postscript Fonts from pdflatex

For the past several years, I've been using pdflatex to generate PDFs from my LaTeX, and it usually works like a charm. All modern versions of pdflatex correctly use type 1 (vector) postscript fonts and not type 3 (bitmap) fonts.

Just today, I was generating a PDF for a camera-ready, but whenever I looked at the list of fonts in the document, I kept seeing type 3 fonts there. I couldn't figure out where those type 3 fonts were coming from. I initially thought they might be from some figures, but that ended up not being the case. After playing with my document for a while, I discovered that the problem was that I was using some non-English characters, which was causing problems for LaTeX. Apparently my default installation of LaTeX had type 1 fonts that only covered the standard English character set (plus some accents). When I used characters like French guillemets quotation marks, LaTeX would need to use type 3 fonts because those were the only versions of the fonts with those characters. Using \usepackage[T1]{fontenc} did nothing since my LaTeX installation simply didn't have the necessary fonts.

Fortunately, there is a package called cm-super that seemed to contain the necessary fonts. I couldn't figure out how to coax my Fedora TexLive distribution to download this package, so I tried to grab it in Windows using MiKTeX. Unfortunately, the package is fairly large, and MiKTeX kept killing my computer while trying to download it. The download would start quickly, but gradually slow down to a crawl while my cpu utilization would go upwards. I suspected MiKTeX was doing something silly like storing the file in memory, but in a linearly expanding buffer of some sort. My Thinkpad X61t has always had heat problems, and running with high cpu usage for such long periods kept causing my laptop to overheat and die before the download would finish. Fortunately, I was able to download the packages plus some index files directly from a mirror, and get MiKTeX to load the package locally instead of from the Internet. Some people also needed to do the usual MiKTeX oddness to get things to work right afterwards.

But everything seemed to work ok once I had cm-super installed. My PDFs contained only type 1 fonts. I really don't have a strong understanding of why cm-super isn't included in my default LaTeX installation. Are the fonts too big? Is the quality of the fonts not as good as the default ones? I have no idea. But it looked ok to me, so I'm going with it.

Sunday, May 08, 2011

How to Use Constructors and Do Inheritance Properly in JavaScript

With all my complaints about how difficult it is to find good documentation about how to do inheritance properly in JavaScript, I thought it might be useful to write a blog post about it, so then it would be easier (for me, at least) to find information about the topic.

Constructors

So, let's suppose you have a program, and you need to have a Student object. Students have a name and an array of courses that they are taking. The quickest way to create a Student object is to use the object literal syntax:

var student = { 
   name: 'Joe', 
   courses: [],
   addCourse: function(course) { this.courses.push(course); }
};

If you need to create many Student objects, you can simply create a function that creates them for you:

function makeStudent(student_name) {
   return {
         name: student_name, 
         courses: [],
         addCourse: function(course) { this.courses.push(course); }
      };
}
var student = makeStudent('Joe');

You can also use a constructor function, which essentially does the same thing:

function Student(student_name) {
   this.name = student_name;
   this.courses = [];
   this.addCourse = function(course) { this.courses.push(course); };
}
var joe = new Student('Joe');
var ann = new Student('Ann');
var sue = new Student('Sue');

There is a small problem with using constructors in this way. When you create many Student objects, the same name, courses, and addCourse properties show up in each Student. Everything still works, but it's not as efficient as it could be.


To make things more efficient, we need all the Student objects to share some properties instead of having separate, identical properties in each Student. For objects to share properties, we need to make use of JavaScript prototypes. In JavaScript, every object has a prototype object. When you read a property, JavaScript first looks to see if the property exists in the current object. If it doesn't, JavaScript then looks for the property along the chain of prototype objects. When you write to a property, the property is changed in the current object and not in the prototype. Different Student objects can share the same prototype. Initially, all the properties of the prototype will appear as properties of the Student. As the Student objects change, they will get their own versions of the properties.

What all this amounts to is that prototypes are useful for describing the initial, "prototypical" view of what a certain type of object should look like. Over time, objects will change and stray from looking like the prototype, but that's fine.

For our Student example, let's first create a generic Student object that can serve as a prototype.

var studentPrototype = {
   name: 'Generic Name',
   courses: [],
   addCourse: function(course) { this.courses.push(course); }
};

Now, in order to create objects that have this object as a prototype, you need to set the prototype field of a constructor function. When you create an object using the constructor, the constructor will set the prototype of the object to whatever its prototype field is set to. This can be a bit confusing: setting the prototype field of a function does not change the prototype of that Function object itself. Traditionally, JavaScript has not provided a variable where you can set or get the prototype of an object (ECMAScript 5 adds Object.getPrototypeOf() and Object.create(), but older browsers lack them). You have to do it all indirectly by using these constructor functions.

var studentPrototype = {
   name: 'Generic Name',
   courses: [],
   addCourse: function(course) { this.courses.push(course); }
};
function Student() {
}
Student.prototype = studentPrototype;

Now, when you create Student objects using the constructor, you will get objects that share a lot of fields.

var joe = new Student();
var ann = new Student();
var sue = new Student();


As you set the names of the Student objects, the prototype remains the same, but the name property of the individual Student objects will be set. This name property in the individual objects essentially "overrides" the name property of the prototype.

joe.name = 'Joe';
sue.name = 'Sue';
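A quick way to check this overriding behavior is to ask each object whether it holds its own name property. The sketch below is self-contained and uses a cut-down prototype with only a name.

```javascript
// Cut-down version of the example: the prototype only holds a name
var studentPrototype = { name: 'Generic Name' };
function Student() {
}
Student.prototype = studentPrototype;

var joe = new Student();
var ann = new Student();

// Reading falls through to the prototype
var nameBefore = joe.name;

// Writing creates a property on joe itself and leaves the prototype alone
joe.name = 'Joe';

var joeHasOwnName = joe.hasOwnProperty('name');
var annHasOwnName = ann.hasOwnProperty('name');
```

After the assignment, joe has its own name while ann still reads 'Generic Name' from the shared prototype.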


You can also set the name inside the constructor function, which will give the same result:

var studentPrototype = {
   name: 'Generic Name',
   courses: [],
   addCourse: function(course) { this.courses.push(course); }
};
function Student(student_name) {
   this.name = student_name;
}
Student.prototype = studentPrototype;

var joe = new Student('Joe');
var sue = new Student('Sue');

Unfortunately, a problem happens though when you start dealing with courses.

joe.addCourse('math');
alert(sue.courses);

Somehow, Sue ends up enrolled in the math course, even though the course was only added to Joe. It seems that when modifying the courses property, the property of the prototype gets modified and not of the Joe object. Since all the Student objects share the same courses property of the prototype, it looks like Sue is taking math.


The problem is that the courses property of Joe is never written to. When addCourse is called, the method reads the courses property and gets an array. The "math" course is then added to this array. The array is modified, but the courses property is never written to, so Joe ends up modifying the prototype's array of courses instead of creating its own copy that isn't shared. To get around this problem, we should create a new array of courses for each Student, and assign it to the courses property in the Student constructor function.
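The difference can be seen side by side in the following sketch. The names BuggyStudent and FixedStudent are made up for this comparison. The first keeps the courses array in the prototype, and the second creates a new array in the constructor.

```javascript
// Version with the problem: the courses array lives in the prototype
function BuggyStudent(student_name) {
   this.name = student_name;
}
BuggyStudent.prototype = {
   courses: [],
   addCourse: function(course) { this.courses.push(course); }
};

// Version with the fix: every object gets its own courses array
function FixedStudent(student_name) {
   this.name = student_name;
   this.courses = [];
}
FixedStudent.prototype = {
   addCourse: function(course) { this.courses.push(course); }
};

var buggyJoe = new BuggyStudent('Joe');
var buggySue = new BuggyStudent('Sue');
buggyJoe.addCourse('math');

var fixedJoe = new FixedStudent('Joe');
var fixedSue = new FixedStudent('Sue');
fixedJoe.addCourse('math');
```

With the first version, buggySue.courses contains 'math' even though only Joe enrolled. With the second, fixedSue.courses stays empty.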

Constructors Summary

When you want to create many objects with the same structure, you should use constructors and prototypes. You should take all the properties of these objects and divide them up into

  • properties that are constant and that can be shared between different instances of an object (e.g. methods)
  • properties that are unique to each object and that can be different in each object (e.g. data fields and pretty much everything that isn't a method)

Constant properties should be put into the prototype of the objects while the other properties should be set in the constructor function.

So suppose we want to be able to make multiple Student objects:

var student = {
   name: 'Joe',
   courses: [],
   addCourse: function(course) { this.courses.push(course); }
};

The method addCourse is the same between all Student objects, so it should appear in the prototype. The other properties will be set in the constructor.

var studentPrototype = {
   addCourse: function(course) { this.courses.push(course); }
};
function Student(student_name) {
   this.name = student_name;
   this.courses = [];
}
Student.prototype = studentPrototype;

var joe = new Student('Joe');


Inheritance

Inheritance in JavaScript is less useful than in class-based object-oriented languages. JavaScript has a dynamic type system, so inheritance is not used to indicate which objects share the same interfaces. Inheritance is only used in JavaScript for code reuse. If you have many different objects that all have the same interface (e.g. in a graphics system, you might have objects for different shapes like lines and circles, and these shapes all understand a draw command), these objects do NOT need to inherit from the same base object unless these objects share code for implementing these interfaces.
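The shapes example might look like the sketch below. Line and Circle are hypothetical constructors that share no base object, yet both can be drawn through the same call because each one simply has a draw method.

```javascript
// Two unrelated constructors that both understand a draw command
function Line(length) {
   this.length = length;
}
Line.prototype.draw = function() {
   return 'line of length ' + this.length;
};

function Circle(radius) {
   this.radius = radius;
}
Circle.prototype.draw = function() {
   return 'circle of radius ' + this.radius;
};

// No common base object is needed to treat them uniformly
var shapes = [new Line(3), new Circle(2)];
var output = shapes.map(function(shape) { return shape.draw(); });
// output is ['line of length 3', 'circle of radius 2']
```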

The easiest way to think about inheritance hierarchies in JavaScript is to ignore it and to simply focus on the concept of prototypes.

Suppose we have a program that makes use of Student objects, which have the fields name and courses in addition to a method called addCourse, as in the previous section:

function Student(student_name) {
   this.name = student_name;
   this.courses = [];
}
Student.prototype.addCourse = function(course) {
   this.courses.push(course);
};

Some of the students are special because they take their courses via the Internet. For these students, we not only need to know their name and the courses they are taking, but also their e-mail information so that lectures and homework can be e-mailed to them. We'll use an InternetStudent object to represent these types of students.

function makeInternetStudent(student_name, student_email) {
   return {
      name: student_name,
      courses: [],
      addCourse: function(course) { this.courses.push(course); },
      email: student_email,
      sendEmail: function(subject, body) { /* ... */ }
   };
}

To make a constructor for an InternetStudent, we need a prototype that serves as a good initial template for what an InternetStudent object looks like. Since an InternetStudent shares most of the properties of a Student object, we can use a Student as the prototype.

function InternetStudent() {
}
InternetStudent.prototype = new Student('GenericName');


There are two new properties in an InternetStudent that aren't in a Student: an email field that varies from student to student and a constant sendEmail method. The field should go in the constructor function while the method should be added to the prototype.

function InternetStudent(student_name, student_email) {
   this.name = student_name;
   this.email = student_email;
}
InternetStudent.prototype = new Student('GenericName');
InternetStudent.prototype.sendEmail = function(subject, body) { ... };

As in the section on constructors, you have to be careful when dealing with prototypes that you don't accidentally share data structures between different objects. In the above example, the courses array is shared between all InternetStudent objects because they all use the version in the prototype even though each InternetStudent should get their own separate array.
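The sharing problem can be seen in a short, self-contained sketch. The students Alice and Bob and the course name are invented for the demonstration, and the sendEmail method is left out because it plays no part in the bug.

```javascript
function Student(student_name) {
   this.name = student_name;
   this.courses = [];
}
Student.prototype.addCourse = function(course) {
   this.courses.push(course);
};

// The flawed constructor: it never creates a courses array, so every
// instance falls back to the single array stored in the prototype
function InternetStudent(student_name, student_email) {
   this.name = student_name;
   this.email = student_email;
}
InternetStudent.prototype = new Student('GenericName');

var alice = new InternetStudent('Alice', 'alice@example.com');
var bob = new InternetStudent('Bob', 'bob@example.com');
alice.addCourse('Algebra');

console.log(bob.courses.length);               // 1, even though Bob added nothing
console.log(alice.courses === bob.courses);    // true
console.log(alice.hasOwnProperty('courses'));  // false
```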

If you recall, the prototype contains all the properties of an object that shouldn't change between different objects while the constructor function initializes the properties that do change from object to object. The constructor function for Student objects creates new name and courses properties for each Student object. We can use that function to create new versions of these properties in the InternetStudent through constructor chaining:

function InternetStudent(student_name, student_email) {
   // Chain the constructor
   Student.call(this, student_name);

   // Initialize the fields that are unique to InternetStudent objects
   this.email = student_email;
}
InternetStudent.prototype = new Student('GenericName');
InternetStudent.prototype.sendEmail = function(subject, body) { ... };

var john = new InternetStudent('John', 'john@example.com');

And we're done.
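To check that chaining really gives each object its own array, here is a small self-contained sketch. The second student, Mary, and the course name are invented, and sendEmail is omitted since it isn't involved.

```javascript
function Student(student_name) {
   this.name = student_name;
   this.courses = [];
}
Student.prototype.addCourse = function(course) {
   this.courses.push(course);
};

function InternetStudent(student_name, student_email) {
   // Runs the Student constructor against the new InternetStudent,
   // creating fresh name and courses properties on it
   Student.call(this, student_name);
   this.email = student_email;
}
InternetStudent.prototype = new Student('GenericName');

var john = new InternetStudent('John', 'john@example.com');
var mary = new InternetStudent('Mary', 'mary@example.com');
john.addCourse('Algebra');

console.log(john.courses);                    // [ 'Algebra' ]
console.log(mary.courses.length);             // 0
console.log(john.hasOwnProperty('courses'));  // true
console.log(john instanceof Student);         // true
```

The prototype still carries its own unused name and courses properties from the new Student('GenericName') call; they are simply hidden by the properties that the chained constructor creates on each object.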


Inheritance Summary

Inheritance is used for code reuse in JavaScript. In JavaScript inheritance, rather than using a plain Object as the prototype in a constructor, you create some other type of object to use as the prototype.

As always with prototypes, you have to be careful not to accidentally share data structures in your objects. This is especially important when inheriting from something other than a plain Object because you may not know all of the internal data structures of the object you are inheriting from. If you are careful to initialize all non-constant data fields in your constructor functions, you can use constructor chaining to properly initialize these inherited internal data structures.

Monday, April 04, 2011

JavaScript Constructors Were a Mistake and Should Be Removed

For a long time now, I've been uncomfortable with JavaScript's syntax for constructors, but I could never quite figure out what was wrong with them. They always seemed sort of cumbersome and rigid. In fact, for most of my code, I created my objects almost exclusively using the object literal syntax instead of using constructors. The object literal syntax for creating objects simply seems so natural and spontaneous in comparison to using constructors.

JavaScript constructors simply feel like a weird hack. They're normal functions except you call them with new. In fact, all functions have extra fields for tracking prototypes that are only used when you use functions as a constructor. It also seems incongruent that JavaScript objects are not class-based and can be dynamically changed at any time, yet you're expected to put all your object construction code in one place in a single function. I would personally avoid using JavaScript constructors in all situations, but you can only make use of inheritance and reuse if you use constructors.

But constructors just end up feeling even more cumbersome once you start using them for inheritance. Theoretically, the inheritance model of JavaScript is very simple. Every object has a prototype object. When you use a property, JavaScript first checks whether the property exists in the current object; if it doesn't, JavaScript then looks for the property in the chain of prototype objects. But when you actually go about trying to use inheritance in JavaScript, everything suddenly becomes really complicated. Every time I want to use inheritance, I always end up having to search on the Internet for tutorials about how to do it because it simply never sticks in my mind. There's something about the way constructors and inheritance are implemented in JavaScript that somehow makes the simple concept of prototype inheritance completely confusing to me. If you search for "JavaScript inheritance," you end up with all sorts of weird frameworks for abstracting away the inheritance process. It's actually somewhat hard to find good documentation on how to do inheritance (actually, the Mozilla docs that I linked to are very confusing as well because their description of constructor chaining is hard to grasp).

I think that part of the problem is that using constructors to create objects simply doesn't fit well with the prototype model of objects. If I'm creating an abstract base object that other things will inherit from, why do I need to create a constructor for allowing anybody to create multiple instances of that object, then use that constructor to make a single instance, and then inherit from that one instance? It just seems weird that if I want a single Employee base object that other objects will inherit from, I need to create an Employee constructor object, create the Employee, and then use this instance as a prototype in other constructor functions.
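The property lookup rule mentioned above is simple enough to write out by hand. The lookup function below is a hypothetical helper that mimics what the language does for ordinary data properties; the objects base and derived are made up, and the chain is set up with the ECMAScript 5 Object.create method.

```javascript
// Walk from the object up through its prototypes until the property is found
function lookup(obj, key) {
   var current = obj;
   while (current !== null) {
      if (Object.prototype.hasOwnProperty.call(current, key)) {
         return current[key];
      }
      current = Object.getPrototypeOf(current);
   }
   return undefined;
}

var base = { greeting: 'hello', name: 'base' };
var derived = Object.create(base);
derived.name = 'derived';

console.log(lookup(derived, 'name'));      // 'derived' (found on the object itself)
console.log(lookup(derived, 'greeting'));  // 'hello' (found on the prototype)
console.log(lookup(derived, 'missing'));   // undefined (not found anywhere)
console.log(derived.greeting === lookup(derived, 'greeting'));  // true
```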

After many years of having these JavaScript issues gnaw at me, I've recently started to realize that my unease with constructors and inheritance wasn't a problem with me but a problem with JavaScript. Constructors are simply a mistake in the language and should be removed. It is, of course, impractical to literally remove constructors from the JavaScript language, but they should be deprecated, and all documentation should cease to mention them in reference to objects. JavaScript constructors attempt to impose a class-based object model onto JavaScript, but JavaScript is prototype-based, so it simply doesn't work. Because JavaScript is somewhat inspired by Java, there was an attempt to bring some of Java's object syntax into the language, but Java's constructors simply don't work in JavaScript. And I know that people really want class-based inheritance in JavaScript, as evidenced by the fact that people keep trying to add it, but class-based inheritance doesn't exist in JavaScript now, so there's no point in trying to fit a square peg in a round hole. Fortunately, ECMAScript 5 has a new Object.create method that offers a reasonable alternative syntax for creating objects. This method can also be retrofitted into older versions of JavaScript (albeit with a performance penalty).
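For reference, a retrofit for older engines might look something like the sketch below. The name createObject is made up, and this is only a simplified stand-in: it handles the prototype argument alone, ignores the optional second argument of Object.create, and cannot create an object with a null prototype.

```javascript
// Simplified stand-in for Object.create(proto) built on a throwaway constructor
function createObject(proto) {
   function Temp() {}
   Temp.prototype = proto;
   return new Temp();
}

var Employee = { department: '' };
var emp = createObject(Employee);
emp.name = 'John';

console.log(Object.getPrototypeOf(emp) === Employee);  // true
console.log(emp.department === '');                    // true (inherited)
console.log(emp.hasOwnProperty('department'));         // false
```

Ironically, the retrofit relies on a constructor internally, but that detail stays hidden from the code that uses it.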

When using Object.create() to create objects, the code for creating an inheritance hierarchy then ends up closely matching the inheritance hierarchy itself:

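var Employee = {
      department : '',
      giveBonus : function(bonus) {...},
      handleVacation : function() {...},
   };

// The second argument to Object.create takes property descriptors rather
// than plain values; writable must be set so the value can be changed later
var WageEmployee = Object.create(Employee, {
      wagePerHour : { value : 10, writable : true, enumerable : true }
   });

var SalaryEmployee = Object.create(Employee, {
      salaryPerYear : { value : 50000, writable : true, enumerable : true }
   });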

When you need to create multiple instances of the same type of object, you can still create a function to do that, instead of an explicit constructor.

function createSalariedEmployee(name, salary) {
   var emp = Object.create(SalaryEmployee);
   emp.name = name;
   emp.salaryPerYear = salary;
   return emp;
}

Sometimes you need to do constructor chaining in order to handle state that cannot safely be stored in a prototype. For these situations, I would suggest having an initialization method in each object that can be chained (similar to Smalltalk conventions).

var Employee = {
   __init__ : function(name) { this.name = name; },
   department : '',
   giveBonus : function(bonus) {...},
   handleVacation : function() {...},
};

// As before, the second argument to Object.create takes property
// descriptors, so the function is wrapped in a value field
var WageEmployee = Object.create(Employee, {
      __init__ : {
         value : function(name, wage) {
            Employee.__init__.call(this, name);
            this.wagePerHour = wage;
         },
         writable : true,
         enumerable : true
      }
   });

var newEmployee = Object.create(WageEmployee);
newEmployee.__init__('John', 10);

I think this approach for creating objects is less confusing than using constructors. It's definitely a lot cleaner for singleton objects and for abstract objects. It may not be that great for flat hierarchies, where programmers want to create many instances of the same object, but it's not a big loss there either.

Friday, April 01, 2011

Scaling Up the Transitway

I was recently late for a train and had to rush across Ottawa from the west end to the train station. Unfortunately, what is normally a quick dash across downtown on the Transitway turned into a grueling crawl. Most of the pain is self-inflicted, though. The Transitway buses ran on dedicated bus lanes through downtown, so the buses didn't have to weave through waves of cars. No, the problem was that there were just too many buses. The Transitway is used above its capacity, so during rush hour, the Transitway becomes a long line of bumper-to-bumper buses. The Transitway in downtown is just a single lane, so there's no passing of other buses, and every bus needs to stop at the bus stops, and then stop at the traffic lights, so you have to wait ages for every bus in front of yours to move forward a bit, load and unload passengers, wait at the light, and move on. If there are five buses in front of yours, you have to wait in turn for each bus to stop, board passengers, and then move on. It's brutal.

Given these capacity problems, it's clear to me that the current model for how to run buses along the Transitway through downtown just isn't very efficient any more. Ideally, more bus lanes would be provided downtown, but that isn't going to happen. The LRT through downtown will solve the capacity problems, but it'll be many years before that gets built, if ever. Until then, it would be nice if we could find some way to scale up the capacity of the Transitway a bit so that it can be used more efficiently.

I think it's time to remove the express buses from the Transitway in downtown. I know that one of the main advantages of a Bus Rapid Transit system like the Transitway is that people can take one bus from downtown and have it zip along a transit corridor, leave the corridor, and then deliver people directly to their houses. Transit users don't have to transfer, and they can time their arrivals at bus stops so that they don't have to wait around much. Unfortunately, insufficient capacity on the Transitway through downtown makes this model extremely inefficient. Traveling through downtown takes much too long, and all the traffic makes the arrival times of buses unpredictable. Previously, I've complained that Larry O'Brien's plan for an underground downtown train line was silly because it eliminates the advantage of express buses (i.e. no transfers) and makes it impossible for transit users to cross the city efficiently. But I'm now thinking that he might have been on to something. The capacity problems in downtown are so bad that the amount of time needed to get through downtown is beginning to outweigh the advantages of being able to take a single bus home from downtown. Having 20 different express buses all lined up with each one having to stop to let a few people on board, delaying all the buses behind it, just doesn't work. Instead, there should be a single bus crossing downtown that moves people to a bus station/terminal where people can then transfer to their express buses. If there's only a single cross-downtown bus available, it would be more efficient because bus users would all simply board the first bus that arrives, and they'll fill the bus to capacity. There'll be no need to have a long line of somewhat full express buses inching across downtown. These cross-downtown buses could then drop passengers off at large bus stations with bus loops where there's space for buses to stop and wait for large numbers of people to board. Hurdman and Bayview probably make the most sense as these bus transfer points. This proposal is similar to the initial phase of the current LRT plan in that there'll be a high capacity transit system just for downtown which focuses only on moving people to transfer points outside of downtown where people can transfer to buses for the rest of their journeys home.

It might also be possible to improve the efficiency of the downtown portions of the Transitway by having some sort of preboarding system. Other BRT systems do this, where they treat their bus systems like a subway. People have to pay before boarding a bus, and go through a turnstile into a loading/unloading zone. Then, when the bus comes, people can just jam themselves into the bus from whatever door is available since everyone in the loading/unloading zone has already paid. As a result, the loading and unloading of buses can be done faster.

Friday, February 18, 2011

HTML5 Audio is Still Too Immature to Use in Games

I made a little HTML game a year and a half ago that made some basic use of HTML5 audio. It worked well enough, though it was very finicky, and I could see HTML5 audio eventually being useful once it matured.

Recently, I reprogrammed an old XNA game I wrote into HTML5 (I never want to go near XNA coding or the XNA community ever again), and I was hoping to program a much more extensive audio engine for the game. Unfortunately, I found HTML5 audio support to be just as finicky as before. Apparently the iPhone/iPad can't play more than one audio stream at once, meaning you can't play music and sound effects at the same time, so it's not possible to make HTML5 games that work well on the iPhone and iPad (basically, all of Apple's talk about HTML5 being superior to Flash on the iPhone and iPad was just propaganda and not reality). Firefox doesn't play short audio files, so the bips, clicks, and other sounds for button presses won't play. Chrome (and possibly Safari too) will not let you play an audio file more than once unless you serve the file from certain types of web servers. The audio in Chrome just sort of dies out after a while anyway, so you don't get any sound after a while. The Internet Explorer 9 Release Candidate seems to only support mp3 files for audio, and using mp3 files in games requires the payment of a licensing fee of $2500 to the mp3 licensing guys. So basically, it's not possible to build a very basic, reliable sound engine for HTML5 games right now.
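The format problem, at least, can be partly worked around by asking the browser which formats it claims to support before choosing a file. The sketch below uses the standard canPlayType method of audio elements; the helper name pickAudioSource and the file names are made up, and this does nothing about the other playback problems described above.

```javascript
// Hypothetical helper: returns the URL of the first candidate whose MIME
// type the audio element says it might be able to play, or null if none
function pickAudioSource(audio, candidates) {
   for (var i = 0; i < candidates.length; i++) {
      // canPlayType returns '', 'maybe', or 'probably'
      if (audio.canPlayType(candidates[i].type) !== '') {
         return candidates[i].url;
      }
   }
   return null;
}

// In a browser, it might be used like this (file names are placeholders):
// var src = pickAudioSource(new Audio(), [
//    { type: 'audio/ogg; codecs="vorbis"', url: 'click.ogg' },
//    { type: 'audio/mpeg', url: 'click.mp3' }
// ]);
```

Even a 'probably' answer only means the browser recognizes the format, not that playback will be reliable.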

This struck me as strange though because there are plenty of HTML5 demos and games that have extensive sound support. How can they pull off reliable sound while I can't?

They cheat, of course. All of these games make use of one of these common JavaScript sound libraries like SoundManager2. The dirty little secret is that these libraries will always use Flash for their sound unless Flash support isn't available. Flash is able to offer reliable, solid sound support across many platforms, so developers are able to build solid sound engines for their HTML5 games on top of it. If you try using pure HTML5 audio, things just fall apart.

Perhaps by this time next year, it will be possible to build a pure HTML5 game with good sound and graphics. I haven't used Firefox 4 yet, but I have high hopes for it. Until then, I think Flash is still the only viable platform for online games.

Monday, January 31, 2011

GWT Isn't a Good Environment for HTML5 Games

Last year, I made a small game in XNA. No one played it, so I've started porting it to be a web HTML5 game. Since the game was originally written in C#, I decided that the easiest way to webify it would be to rewrite the code in Java and then use GWT to translate the code into JavaScript. GWT is a set of tools from Google that let you write web code in statically typed Java. It then translates this code into cross-browser JavaScript for you.

After quickly discarding the GWT UI framework, I found the GWT experience to be much smoother and much nicer than I had originally expected. Unfortunately, I've found that the current incarnation of GWT (GWT 2) doesn't work well for HTML5 games. The problem is that in development mode, GWT doesn't actually translate any of your code into JavaScript. It runs all of your code as regular Java, and then uses an intermediary layer to transfer manipulations of JavaScript objects or the DOM to a browser where the manipulation is done and the result transferred back to the Java world. Initially, I thought this wasn't a big deal because it only causes problems if you make an outrageous number of DOM API calls or decide to store a lot of things in JavaScript objects for some reason. Unfortunately, I found I was doing this a lot in my HTML5 game. I was drawing lots of things to the HTML5 canvas, which requires lots of API calls. I was also using JSON for my save game data, which means you have to store a lot of data in the form of JavaScript objects.

As a result, my game code ran really sluggishly in development mode. This was especially true of Chrome, whose sandbox design means that the Chrome GWT development plugin is particularly slow in transferring data between the browser and Java code. Doing my development with Firefox made things bearable, but I still found that I was optimizing things incorrectly. Things that seemed to be slow when running in development mode (e.g. the JSON game saving code appeared so slow that it would time out the browser) were actually instantaneous when the code was properly compiled down to JavaScript. The overhead of interfacing Java code with a browser's JavaScript engine simply distorts performance information so much that it's hard for a developer to get a good feel for how a game behaves.

I understand why GWT is designed in this way. Most browsers don't expose JavaScript debugger APIs that would let a tool like GWT map lines and variables in JavaScript code to the original Java code that a programmer has written. Fortunately, browsers like Firefox are becoming mature enough platforms to have such APIs, so I'm hopeful that in the future, someone might reprogram GWT to actually translate Java code into JavaScript when in development mode and still let you properly debug it.

In the meantime, I'm going to finish coding up my game in GWT, and then go back to pure JavaScript games. I find the completely freeform, unstructured nature of JavaScript to be unproductive, but I'm wondering whether the static typing of Java is the best way to solve the problem. When I used to code in Smalltalk, everything was also dynamically typed, but it was a fairly productive environment to code in. Smalltalk organizes your code in a very structured way though, so it was easy to navigate the code and find things in it. Currently, the style of JavaScript code that I write is so freeform that even Eclipse with JSDT can't analyze it very well and can only provide me with simple ways to browse it. Perhaps I'll try writing my code in a more structured style to see if a proper code editor can extract useful structure from it, thereby allowing me to navigate and write JavaScript code more productively.

Wednesday, January 05, 2011

Rhino JavaScript security

My programming website programmingbasics.org contains a Java applet with a code interpreter for running user code. Users will not only run their own code, but possibly code from other people as well, meaning that they might be exposed to malicious code. The user is kept safe though because the code interpreter runs as part of an applet, meaning everything runs within the Java security sandbox.

For many years, I've been planning on making a standalone version of my applet that can be easily downloaded and run as an application, but I've been concerned about security issues. I want users to be able to run random code that they've found on the Internet without having to worry about the code infecting their systems with something. Without Java's applet security sandbox, my application would have to create its own sandbox. I always assumed that with Java's multiple layers of security, I would be able to cobble something together. In the end, due to a convoluted API design, it seems that Java's security system is much less flexible than I had originally thought, meaning it's not really possible to do something like lower your own security permissions or to chroot yourself. I think the actual security mechanism in the VM could support this, but the APIs that Java exposes don't let you access such functionality.

The main security issue that I'm trying to protect against is that I want to let users run potentially malicious code in the interpreter. This interpreter has to call into my own code to access certain features. I'm too lazy to properly secure all of my own code, so I want to sandbox the interpreter code from my own code so that potentially malicious code can't muck around with the public fields of my objects and play with my inner classes to trick my code into doing something unsafe. So basically, I need a mechanism that allows me to take part of my own code, declare that I don't trust myself, and lower my permissions for that portion of code.

Based on what I can understand from the security documentation I've read, there are two primary mechanisms that Java uses to secure itself. The first is a namespace mechanism where different threads can be given access to only certain classes (or different versions of classes). This initially sounded like a great way of separating out my code from the interpreter code. My code would simply not be visible to the interpreter code, meaning that I wouldn't have to bother securing my own code. I would only have to create a hardened API for interfacing the interpreter with my own code. The second mechanism is a permissions mechanism where every class has an associated set of permissions. Whenever a potentially dangerous operation is being performed, the permission framework will go through the stack, find the class/code on the stack with the lowest set of permissions, and only allow the operation to proceed if the permissions are sufficiently high to allow it. So for my interpreter thread, as long as I could create a class with no permissions and then slip this class at the base of the interpreter thread's stack frame, then the interpreter wouldn't be able to do malicious things.
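The stack-walking rule is easier to see in a toy model. The sketch below is plain JavaScript and is not the Java API: stack frames are modelled as objects listing the permissions of the code that owns them, and all of the names (isAllowed, the owners, the permission strings) are invented. It captures only the rule described above, that an operation is allowed when every frame on the stack has the permission, and ignores everything else about the real mechanism.

```javascript
// Toy model: an operation is allowed only if every frame on the stack
// belongs to code that holds the requested permission
function isAllowed(stack, permission) {
   for (var i = 0; i < stack.length; i++) {
      if (stack[i].permissions.indexOf(permission) === -1) {
         return false;
      }
   }
   return true;
}

var trustedStack = [
   { owner: 'application', permissions: ['readFile', 'openSocket'] },
   { owner: 'library', permissions: ['readFile', 'openSocket'] }
];

// The same kind of stack, but with a no-permission class slipped in at the base
var sandboxedStack = [
   { owner: 'restrictedRunnable', permissions: [] },
   { owner: 'interpreter', permissions: ['readFile', 'openSocket'] },
   { owner: 'library', permissions: ['readFile', 'openSocket'] }
];

console.log(isAllowed(trustedStack, 'readFile'));    // true
console.log(isAllowed(sandboxedStack, 'readFile'));  // false
```

Because the restricted frame sits at the base, it stays on the stack for the whole life of the thread, so nothing the interpreter calls can regain the permissions.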

So with these two mechanisms, I could use permissions to prevent the interpreter from doing anything bad and use namespaces to prevent the interpreter from tricking my own code into doing bad things. Unfortunately, although this sounds theoretically great, I couldn't quite make the Java APIs do this for me. It seems like the API was mainly designed so that the Java VM and library could secure themselves in applets. If programmers want to use the same mechanisms to secure their own code, they have to jump through a lot of hoops. The main problem seems to be that the Java VM loads the application's code with the system class loader. This means that the application code is basically considered to be as trusted and as secure as Java library code. You can't easily create a new thread with a new namespace with fewer classes and where existing classes are relabeled with lower permissions. It's probably possible to do some crazy class loader voodoo where my code is packaged in a separate jar and the interpreter is in its own jar and then a special bootstrap jar pieces together the other jars in some sort of secure way, but it's messy, hard to debug, and hard to distribute all these jars to end-users (I think this is how Java application servers do their security though).

If I spent enough time thinking about class loaders, I might be able to figure out a way to solve it, but I was able to put together a solution that presumably has similar security but doesn't require so much mental gymnastics. The interpreter I use for programmingbasics.org is the Mozilla Rhino JavaScript engine. The interpreter has a ClassShutter, which restricts which Java classes user scripts can access. Assuming that the Rhino interpreter is properly secured, setting the ClassShutter to prevent access to any Java classes should prevent user code from accessing my own insecure code except through well-defined and secured APIs. This should provide equivalent security to namespaces. I still made use of the Java permissions security mechanism, but that only required me to find a way to use class loaders to load a single class with reduced security. Basically, I created a class that implemented a proxy for java.lang.Runnable and compiled it by hand. I renamed the resulting .class file to a .bin so that the system class loader wouldn't prevent my class loader from seeing the file. I then created a class loader that would intercept attempts to create that class and create a version from the .bin file instead with lower permissions. To make sure you use the version of the class loaded by the custom class loader (the one with reduced permissions) and not the one from the system class loader (the one with full permissions), you have to carefully use reflection to get the class loader to load its version. When creating the interpreter thread, I start the thread off by running this class, thereby inserting these lower permissions at the base of the interpreter's stack frame.