Wednesday, February 20, 2013

Observations about Teaching Programming to Kids

For the past year or so, I've been acting as an assistant at various events that teach programming to kids. As an assistant, I don't actually directly teach anything up front, but stay in the back helping the kids, which allows me to see what material the kids seem to enjoy and understand and what material the kids find boring or complicated. It turns out that teaching programming to kids is tough! There are a lot of complications and constraints that you need to keep in mind when designing a programming for kids. So I'm using this blog post to gather together some of my observations so far.

Setup

  • Programming courses are often taught as "one-off" enrichment opportunities for kids. As a result, you only have 1-2 hours to teach everything that you're going to teach. You can't really teach material that gradually builds up to something advanced and cool. You have one go to teach kids enough to be self-sufficient in some task and you need to inspire them to pursue additional learning on their own.
  • The kids are usually fairly young, definitely pre-highschool. High school kids are usually "too cool" to get dragged off to such events (they'd be scared away by all the younger kids anyway), and are usually mature enough to simply learn programming on their own anyway.
  • Kids are often ok with installing all sorts of weird crap on their computers, so as long as you're there to help with the installation, you can install all sorts of weird IDEs if it helps.
  • On the other hand, some of the kids have really old, crummy computers laden with viruses, so it's better to stick with simple, well-tested things that will work even on old computers. Definitely don't write your own software for such an event because you won't have time to test it on all the weird bizarro configurations that the kids are going to come in with. 10 year old Macintosh, Chromebook, IE5, iPad, Linux-boxen, brand-new ultra-locked down versions of Windows that they don't have the password for, etc.

What to Teach

  • There isn't enough time to learn any advanced concepts, and kids aren't impressed by the inherent "beauty" of certain programming concepts. You need to teach them how to do something concrete that gives them constant feedback on their progress. Teaching things like HTML or graphics works well. 
  • Kids can't draw, so teach them how to use images. Better yet, teach them how to include random images they find on the Internet, then they can really customize things and make their creations more personal to them.
  • Although kids will often find displaying text on the screen boring, kids are easily amazed at stuff that programmers find boring and trivial. Drawing circles, displaying text on the screen, changing the headline on the webpage of a newspaper, putting stuff on the Internet that other people can download. This provides kids with a sense of empowerment. Professional programmers have lost this sense of wonder a long time ago.
  • Kids will have different levels of experience and will progress at different rates. If you only have a single "project," some will finish quickly while others will fall behind. Ideally, you should have an open-ended project where kids who finish early can keep adding their own stuff to their project while other kids are still working through the more basic material. It might also be possible to have a mix of both basic and advanced material in the same course (those who fall behind can skip the advanced material), but I'm not sure how well that would work.
  • Kids are willing to accept things on blind faith and are ok with using stuff they don't understand. They're fine with blindly typing in a magic recipe that makes the computer do interesting things. Most of the things they encounter in life are already like that, so it's ok to them. Cars move around as if by magic, babies are delivered by storks, monsters are under the bed, typing in five lines of incomprehensible math magic makes the computer do something cool. Whatever. Often, you can use this as a short-cut to teach them how to do something cool even if the material is way too advanced for them to understand.
  • Do you know what's even better than displaying some text? Cat pictures. With a teddy bear background. And flashing text. And the images move five pixels to the left when you put your mouse over them. And there's a dinosaur that randomly zips around the screen. And let's have some background music. Awesome. These are kids. They aren't interested in publishing essays on the Internet about saving the environment. They want ninja cat pictures with jetpacks. Give it to them.

How to Teach

  • Teach concepts concretely--don't teach things as abstract concepts. Don't go from abstract to concrete. If possible, try to go concrete to abstract. Get them to type something and see what it does. Then help them generalize the concrete instance to the abstract.
  • It's important to break things down to be as simple as possible. And kids often don't know a lot of math at this age either (don't include math or algebra or decimals or possibly even negative numbers)
  • Try and avoid building advanced material on top of simpler material. Some kids will fall behind on the simpler material and will just be lost for the advanced material.
  • Kids learn by doing, so don't lecture. They'll zone out and get bored. I'm an adult, and even I often zone out and get bored if a lecture goes on too long.
  • Children are pretty trustful of authority, so they'll make an honest attempt to follow along with your lesson regardless of how ill-prepared and incomprehensible you are. So even if you botch things up, it's ok because the kids will blame themselves and not you if they find that they can't understand the material. I'm not sure if this lack of meta-learning might extend to an inability to do self-assessments as well.
  • Kids often don't ask for help if they fall behind. Instead, they'll just sit quietly and fumble along. Along these lines, telling kids to "put their hands up if they're behind or need help" isn't going to work. Many kids would die of embarrassment if they put up their hands. It might be better to ask kids the opposite (put up your hands if you've finished), but I'm not sure if that would be effective either.
  • Be flexible and be observant. Often, the class falls into complete disarray after the first hour is done. There's no need to punish everyone further by plowing ahead. Adapt your material, or if that's not possible, just end it.
  • It's ok to let kids just do their own thing. Kids learn by doing, and if they're happy taking a few things you've taught them and making it their own by using those few small elements to make something meaningful to them, that's great. You've only got an hour or two, there's no need to push them to do harder stuff. Try to adapt your material to what they're doing.
  • Source code is hard to understand (but this is true of adults too). When you show source code, you must SLOWLY go through every single line and statement
  • Many presentations involve the presenter typing up code live on the screen. Doing this helps you slow down and helps kids know where to type things. But the text is usually too small (IDEs often falls apart when you make the text big enough to read from the back of the room), and you still type much faster than a kid can type, so they will usually fall behind.
  • Young kids can't really type. Programming environments like Scratch that have very little typing are great. I can't recommend Scratch enough. It's very consistent and discoverable. If you show kids a few small tricks, and just let them loose, they can usually work their own way into figuring out how to do cool stuff on their own. You only need to intervene every once in a while to show them more advanced tricks they can use or to direct them into building more fruitful things.

Monday, October 01, 2012

Scarborough Subway Plans

During the last few months, talk about building a subway in Scarborough has been cropping up again. Personally, I find this a little worrisome. Although my usual philosophy towards transit is that is that building anything is better than building nothing, building a subway is Scarborough just seems like a big waste of money that will sap money away from more important projects. Subways exist because they provide capacity and speed for transit users. There is no evidence that Scarborough is in need of either. If there were capacity problems, the TTC would be running high capacity buses in dedicated bus lanes, but they're only starting to consider deploying high capacity buses now. If speed were a problem, the TTC would offer a highway 401 bus or have special buses to redirect people to GO train stations, but they aren't. The reality is that suburbs are designed for cars and building a mass transit system there for moving hundreds of thousands of people would be a waste of money. In order to achieve increased capacity and speed in a transit system, you need space. In high density areas, there's no space, so you need to build subways. In the suburbs, there's plenty of space, so there's no need to build subways to achieve capacity and speed goals. For the price of a subway, you could probably just buy out whole suburban neighbourhoods, plow them under, and build a fast surface bus or train instead.

But to make sure I wasn't judging anything unfairly though, I decided that maybe it would be good idea to run some simulations to see what the effect of different subways plans would be. Following the same experimental methodology from my previous simulations, I ran some simulations of the effect of different plans on transit times to downtown. I believe that all of these plans would end up costing around $1bn-$2bn. Following the METROLINX guides, surface LRTs are modeled as having an average speed of 22km/h while grade-separated LRTs and subways are modeled as having an average speed of 32km/h. The future Eglinton LRT is included in the simulations even though it does not offer any benefits to people in Scarborough going downtown (though it would useful for people going to midtown).

Current System

As a baseline, here is a map of transit times of the current transit system in Scarborough. This system is actually unsustainable because the Scarborough RT line from Kennedy station to Scarborough Town Centre is obsolete, over-capacity, and needs replacement.

Transit times to downtown.
Data, imagery and map information provided by 
MapQuest,
Open Street Map and contributors, CC-BY-SA. 

From the map, it can be seen that most transit users in Scarborough need about an hour to reach downtown. Most transit users don't live near a transit line, so they need to take a bus to the Scarborough RT line or to the Yonge line in order to reach downtown (people north of the 401 who don't live on Sheppard generally don't take the Sheppard subway because it only comes once every 7 minutes, so it's faster and more convenient for people to take an express bus direct to the Yonge line). Scarborough is laid out as a grid of major roads. Inside each grid square is a tangle of residential roads that are generally impassable by buses. As such, a transit user must walk to the nearest major grid road and take a bus from there. I suspect that this is part of the reason for the short distance between bus stops (as compared to suburban bus systems in other cities). Although this results in slower buses, transit users already have to walk some distance to reach a major road where buses run, and it would be unfair if they have to walk another half kilometre along the road to reach a bus stop. Buses in Scarborough generally run along these major roads, forming a grid pattern. One exception to this pattern is that many buses are rerouted to Scarborough Town Centre, which is intended as a transit hub for Scarborough.

Sheppard LRT with SRT Revamp and Extension

The default transit extension that will be built by the Ontario government if there is no political interference is an LRT along Sheppard and an extension of the SRT to Sheppard and Markham. The Sheppard LRT is modeled as coming with a six minute frequency, similar to the frequency suggested by METROLINX for the Eglinton LRT. The frequency of the SRT was not adjusted (I think it has a 3-4 minute frequency during rush hour) even though it will likely come less frequently after the revamp because newer high capacity trains will be used on the line after the change.

Transit times to downtown with a Sheppard LRT and SRT extension.
Data, imagery and map information provided by 
MapQuest,
Open Street Map and contributors, CC-BY-SA. 

The Sheppard LRT does not provide any benefit over the existing system. Current Rocket express buses already reach an average speed of around 20km/h, which is similar to the speeds expected to be achieved by the LRT. The Rocket express buses also come more frequently than the LRT, so the LRT won't provide any benefit over the existing bus system in terms of transit times. The LRT will likely be cheaper to run than the bus system though. The SRT extension does seem to save transit users living on Markham south of the 401 about 10 minutes. People who live along Sheppard in Malvern will also see savings of about 5 minutes in transit times. These time savings might disappear though if the revamped SRT is run at a lesser frequency than it currently does since all the time savings of having a faster mode of transport will be lost by having to wait longer for the train to arrive.

Sheppard Subway Extension to Victoria Park

Historically, the long-term plan for a subway in Scarborough is an extension of the Sheppard subway to the Scarborough Town Centre. This would be prohibitively expensive, so current suggestions are that an extension of the Sheppard subway to Victoria Park could be used as an intermediate step to demonstrate the benefits of a full subway.

Transit times to downtown with a Sheppard subway extension to Victoria Park.
Data, imagery and map information provided by 
MapQuest,
Open Street Map and contributors, CC-BY-SA. 

A subway extension to Victoria Park provides a ten minute in transit times for people who live at Victoria Park. People who live on Victoria Park near Sheppard will see a 5-8 minute improvement since they don't have to change buses at Sheppard in order to get to the Sheppard subway at Don Mills. People living on Sheppard west of Agincourt will see a modest improvement in transit times since they can transfer from a bus to a subway at Victoria Park instead of having to cross the 404.

OneCity Subway

Recently, there has been a suggestion that the SRT should be scrapped and the Bloor subway should be extended to Scarborough Town Centre and then up to Sheppard. This will supposedly result in faster travel through Scarborough because no transfer from the SRT to the subway will be needed at Kennedy subway station.

Transit times to downtown with the OneCity extension of the Bloor subway to Sheppard.
Data, imagery and map information provided by 
MapQuest,
Open Street Map and contributors, CC-BY-SA. 

Eliminating the SRT results in a sharp reduction in service in the area around the SRT. There is also a surprising increase in transit times for people living in Agincourt since they no longer have a short bus ride to the SRT. Instead most of the transit benefits of having a nearby transit line are pushed east to the Scarborough hospital and Malvern West.

"Subway" to Pacific Mall

Out of curiosity, I decided to model the effect of scrapping the SRT and building an LRT along the train corridor from Kennedy to Pacific Mall in the north. One problem with building rapid transit lines in Scarborough is that the main transit hub for the area, Scarborough Town Centre, is in the middle of nowhere. It is not near the major transit corridors of Scarborough, so many buses actually have to be redirected from their natural routes in order to reach there. The only major transportation corridor that the Scarborough Town Centre is near is Highway 401. As can be seen from the 2006 strategic plan for the SRT, 75% of SRT users ride a bus to reach it with users mainly coming from Malvern, Malvern West, and the Milliken area. So instead of designing a squiggly line of a system, I decided to see the effect of building the longest, cheapest, grade-separated "subway" possible that would intersect as many existing bus lines as possible--i.e. an LRT along the current train corridor from Kennedy to Pacific Mall. Such a line has more potential for provincial and federal government support since it serves multiple regions (Markham and Scarborough) at a modest cost, can be easily extended later on to the Highway 7 BRT, and much of the electrical infrastructure will be needed in the future anyway for the GO train network. This LRT was modeled as having a frequency of six minutes. Unfortunately, it's not really possible to model the effect of such a transit line accurately since many of the current bus lines are designed to send people to Scarborough Town Centre; whereas, in this system, buses would mainly run east-west so as to dump people off at the rapid transit line running through the middle of Scarborough.

Transit times to downtown with an LRT along the train corridor to Pacific Mall.
Data, imagery and map information provided by 
MapQuest,
Open Street Map and contributors, CC-BY-SA. 

Unfortunately, the simulations show only modest benefits from such a system. Near Sheppard, the train corridor is a little too far from Kennedy and Midland to make walking there worthwhile as compared to taking a bus down to the SRT. At a frequency of every six minutes, transfers to the line take too long, and transit users will often find it easier to simply take a bus directly to the Yonge subway. Arguably, the same thing will happen in a revamped SRT if it has a six minute frequency as well though. I think this scenario is too difficult to simulate using my current approach, so it's impossible to compare it with other approaches.

Conclusion

The pressure to build a subway in Scarborough seems to be based primarily on politics than on a genuine need. They feel that they deserve a transit showpiece comparable to what's available in northern and south-western Toronto. This is not a bad thing, but it means that potential transit plans should be evaluated primarily in terms of minimizing construction and operating costs. The cheapest transit line that satisfies the political needs without damaging the existing transit system too much.

For this reason, I actually think the Sheppard LRT and SRT extension project is probably best. Surprisingly, this project will actually result in worse transit times for people living in Scarborough because the increased capacity of the new lines will mean that transit will likely be less frequent than the current system. The project will likely have lower operating costs than the current system, which is the main benefit of building it. The grid spacing of Scarborough also seems to favour east-west transit lines over north-south transit lines because it's often a shorter walk to get to a north-south bus line (which can then take you to an east-west rapid transit line) than to get to an east-west bus line. To save even more money on construction costs, it might even make sense to abandon the Sheppard LRT and only build the SRT extension since the Sheppard LRT doesn't actually do anything interesting from a transit perspective.

Of the two plans that actually involve building real subways--the OneCity plan and the Victora Park extension--I actually prefer the Victoria Park extension. The OneCity plan is more expensive and the expense mostly goes towards shifting an existing transit corridor 2km to the east for no particular benefit. Overall though, I think both plans are bad because they are expensive, unnecessary, and will likely increase operating costs for the TTC. In any case, neither of these plans are feasible because neither the city nor the province have the means to pay for it. The province has run out of money, and my understanding is that they can only find money for new construction through the use of accounting tricks. They are mainly interested in building self-contained new transit lines that can be leased to private industry or mortgaged, thereby allowing the province to hide the cost of the construction by spreading it over many decades instead paying for construction upfront or something complicated like that. Subway extensions aren't self-contained, so they can't be owned and operated independently by private companies, so they don't qualify for accounting trickery, so the province can't fund them.

I'm still partial to my idea of building a "subway" to Pacific Mall. It's a big transit showpiece that can be built at a modest cost, the surrounding area are likely good candidates for densification, and I believe it would be eligible for provincial funding. Unfortunately, it's hard to simulate the effect of such a massive change to transit, so its effects on transit usage is unclear. It might be a little bit too close to the Yonge subway line, so people might still prefer taking a bus to Yonge and riding the subway downtown as opposed to taking a bus to the train corridor, taking a train to Kennedy, taking a train to Bloor, and then taking the Yonge subway line down into downtown. Also, the abandoning of rapid transit to Scarborough Town Centre is likely too politically controversial for politicians.

Tuesday, August 21, 2012

JavaScript is Over: We Just Don't Know What to Replace It With Yet


Although other people (namely Google) have been advocating this for a while, I've only come around to this way of thinking recently, but I think consensus is gradually building in the community that JavaScript is over. In the early days when people were doing simple little scripting with JavaScript, it was a fine language. But as people start writing larger and larger programs with the language, it's showing its weaknesses, and it's becoming clear that JavaScript simply can't handle programming in the large. In the past, there was some talk that software engineering features could be grafted on to JavaScript (the infamous ECMAScript4), but I think researchers are now beginning to realize that the JavaScript model is fundamentally too messy to make such an effort worthwhile. Just as people don't write databases using bash scripts, you just can't have a large million line program developed by hundreds of people written in JavaScript. It's a maintenance nightmare. Even the most radical of the proposed reforms for JavaScript would only improve the situation in minor ways.

The problem is that JavaScript is simply too dynamic. There are static languages, there are dynamic languages, and then there's JavaScript. Once you get to hundreds of thousands of lines of code written by a team of hundreds of people, it's no longer possible for any one person to understand the entire code base. An individual programmer will at best understand the general shape and structure of the code base and understand the details of only the tiny section of the code base that he or she is working on. As such, they are almost entirely dependent on programming tools to help them work with the code. Since they can't understand the interdependencies of the codebase, they need tools to tell them whether changes will likely result in errors or whether a refactoring is safe or whether certain assumptions about how the code is structured is actually true or not. With statically-typed languages like Java, C#, and C++, many tools exist for finding errors in code and performing refactorings and visualizations. Even with older dynamic languages like LISP or Smalltalk, these sorts of tools existed and were quite effective. JavaScript, though, is part of the modern wave of dyanamic languages that have been designed to be flexible and stuffed with features so as to accomodate a wide variety of programming styles. Unfortunately, this flexibility makes building good JavaScript tools extremely difficult. And without these tools, it's not possible to scale up a language for use with large code bases.

The heart of the problem is that JavaScript is too flexible and too dynamic. Duck-typing can be used to change any object from one type to another at runtime. There are no guarantees about which methods can be called when. There are no constraints as to what can happen within functions or not. In past dynamic languages, there have been constraints on the programming paradigms that allowed programmers and their tools to make assumptions about the behaviour of programs. For example, LISP is considered a very dynamic language, but since it's designed around the functional programming model, functions do not have any mutable state, which makes it easier for programmers and tools to understand LISP code. Similarly, Smalltalk, an early dynamic object-oriented language, did not allow methods to change during runtime, an object's data could only be accessed by the object itself, and most method calls occurred through a well-defined mechanism, making code analysis much simpler. JavaScript, by contrast, supports both functional and object-oriented programming, but has none of the accompanying constraints of these older languages. Without these constraints, it's very hard for programmers or their tools to understand how code works. Even simple queries about where a method is used or what methods are available on an object require complicated pointer analysis to solve, if they are even solvable. I think that the key indicator as to whether a programming language can scale to million line programs is whether there are any reliable refactoring tools available for that language. If no such tools exist, then the language is likely too wild for any programmer or their tools to be able to work with a million lines of such code.

In the past, it was thought that JavaScript would be salvagable by adding new features to the language such as modules, sealed objects, types, etc. I now realize that the problem is that these new features are additive. They add new capabilities to the language; they don't impose any additional constraints on how programs are structured. As such, the problem of analysing and understanding million line JavaScript programs becomes harder with these new features, not easier. Types do not help refactoring tools unless types are mandatory. Unless all objects are sealed, it doesn't help a programmer understand whether an object they are using has the correct interface or not at a certain point in the code. These new features do not add new constraints that reduce the search space that a programming tool needs to analyse to understand a piece of code.

Better programming practices might help a bit. But it's cumbersome and error-prone. For example, an extensive set of unit tests can provide similar error-checking as static type-checking. But the effort required to maintain these unit tests is high, and if any area of the code is missing unit tests, then programmers and their tools can no longer rely on unit tests to check their errors.

Flexibility and expressiveness are wonderful features in small languages, but they cause maintenance nightmares when these languages are scaled to giant million line programs. JavaScript is a very expressive and powerful language, but these same characteristics will prevent it from scaling to the sizes that programmers want. Programmers are beginning to realize this. Unless there's some great advance in programmer tooling that can handle the complexity and flexibility of JavaScript, programmers will have no choice but to start using alternate programming languages. These languages will have to compile down to JavaScript so that they can be deployed in web browsers. JavaScript then becomes only a deployment language or an intermediate representation. The real development is done with another language. It's not clear which of these alternate languages will win out, be it C#, Java, Dart, or perhaps some new language in the future, but I think this will be the future trend. It's probably best if new language features aren't added to future versions of JavaScript, but only new engine features or speed improvements. For example, if everyone is coding in alternate languages, then no one will use new syntactic sugar, but features like weak hash maps will greatly augment the types of alternate languages that can be built on top of JavaScript.

Update 2012 Oct 29

So I've been thinking over the problems with JavaScript as a programming language, and these are some things that I think will likely be changed in any JavaScript replacement.

to be removed: Objects as Associative Arrays
to be replaced by: Dictionary Objects, Weak Hash Maps
Associative arrays are extremely useful when programming. But having every object be an associative array is just too much. If you need an associative array, you should create an AssociativeArray object. There is no need to burden every single object with associative array capabilities. Associative arrays are widely used to store supplementary information in the DOM though, so some sort of weak map functionality will also need to be provided to replace this.


to be removed: Prototype Objects
to be replaced by: Class-based Objects
unsure: Methods are Just Functions Inside Objects
If objects are no longer associative arrays, it weakens the need for prototype based inheritance. Class based object systems are much more well-studied and understood than prototype based systems, so moving to such a system should help the building of new tooling significantly. The fact that JavaScript methods are just data members that point to function objects may or may not be a barrier preventing a move to a class-based system though.  It might be necessary to have separate notions for function and method.



to be removed: Object Literal Syntax
to be replaced by: Anonymous Classes
JavaScript's object literal syntax is a great, easy way to make small, ad hoc, short-lived objects. But once you move to a class-based system, the current behavior of object literals won't work. They'll probably have to be adapted to some sort of anonymous class system.


unsure: Automatic Casts, Automatic Semi-Colon Insertion, Val Scoping, Property Accessors, Variable Function Parameters
JavaScript has a bunch of other features that have long been troublesome such as the ones above. But I'm not sure if these features actually block the deep analysis of JavaScript code, so it might be possible to keep them if people actually want them.

to be added: Modules, More Declarative Syntax
JavaScript currently doesn't have a good mechanism for breaking code into independent chunks. A good module system should fix that. Continuing with that direction, it would also be useful to provide a more declarative syntax for the language because it would allow code and data structures to exist and be analysed independently of other neighboring code.

Thursday, August 16, 2012

Population Density Map for Toronto by Neighbourhood, 2011

Although the City of Toronto publishes all sorts of weird maps of Toronto, they haven't published a population density map using the most recent 2011 census data yet. They have published all sorts of related maps like population growth rate maps, senior maps, and children maps, but no matter how deeply I dig through their demographics website, I can't find a population density map using the 2011 data. But the City did publish the population of each of the 140 neighbourhoods in Toronto, so I thought I'd just generate my own population density map. I also grabbed Toronto's Open Data map information about the shape of each neighbourhood. So, all I needed to do was to parse the neighbourhood data and output it as svg. The shape data used a universal transverse mercator coordinate system, which means I could just calculate the areas of the shape polygons directly without really worrying about the curvature of the Earth and still get reasonable numbers. And from there, it was easy enough to calculate the population density.

Number of people per square kilometre (Contains public sector Datasets made available under the City of Toronto's Open Data Licence v2.0. Users are forbidden to copy this material and/or redisseminate the data, in an original or modified form, for commercial purposes, without the express permission of Statistics Canada. Information on the availability of the wide range of data from Statistics Canada can be obtained from Statistics Canada's Regional Offices, its World Wide Web site at: www.statcan.ca and its toll-free access number 1-800-263-1136.)
I sketched in the approximate locations of the subway (plus future extensions) so as to give the map some identifiable landmarks. The colour range is linear except for St. James Town, which had such a high population density of over 40k/km^2 that it skewed the colour range too much, so I had to treat it specially.

From looking at the City of Toronto's other maps, it looks like they have population data at a much more fine-grained level available, but I was restricted to individual neighbourhoods, which can be quite big at times and which sometimes encompasses mixes of large apartment blocks as well as low-density housing. Also, the data only shows residential density. The City has employment survey data which shows all the employment centres in the Toronto, but they don't have it in a form that I could process and overlay on top of this map.

The map is sort of limited, so it's not too interesting. I'm not sure if any interesting conclusions can be drawn from it. I guess I was previously curious as to why so much attention was being dedicated to improving transit in Scarborough in the east but not to Etobicoke in the west. But from the map, I can now see that the east end is much bigger with a higher population density, so it probably makes sense to focus efforts there. From the map, the case for a downtown relief line that extends north of Bloor in the east doesn't seem to strong since the density doesn't seem that great as compared to the west side, but the Don Valley could be screwing up the density numbers, and the density numbers don't include information about workplaces.

Tuesday, July 31, 2012

Switch Performance in JavaScript

Recently, I've been running the Mozilla Rhino JavaScript interpreter as a JavaScript program in a web browser, and the performance has been awful. I haven't really dug too deeply into the performance yet, but one thing I've been curious about is the effect of the giant switch statement in the middle of the interpreter.

When writing a bytecode interpreter in C or C++, the switch statement is usually the way to go because you can rely on the compiler to use perfect hashing to make the best lookup tables, and switches in general are very light-weight structures that have performance similar to hand-coding a bunch of goto instructions. Since JavaScript optimizations are done at runtime, it's possible that using a giant switch statement with multiple cases handling the different bytecode instructions might not be optimized correctly.

switch(instruction) {   case jump: ...   case add: ...}

So, to test things out, I created a bunch of different test cases inside jsperf, and I tested them out. Here are the results:


The larger the number, the better. The bars represent:
1. dense array
2. dense switch
3. giant sparse array with negative values
4. giant switch with negative values and common paths at the end of the switch
5. giant switch with negative values
6. giant all positive sparse array
7. giant all positive sparse switch and common paths at the end of the switch
8. sparse array with negative values (and lots of lookups of negative values)
9. switch with sparse negative values  (and lots of lookups of negative values)
10. sparse array
11. sparse switch

It's a little bit worrisome that #4 has worse performance than #5 for all the browsers. The difference between the two is that you have a giant switch where usually only 16 cases of the switch are used. In #5, those 16 cases are at the head of the switch while in #4, those 16 cases come at the end of the switch. If the JavaScript engine were using lookup tables properly, then those two cases should have similar performance. Instead, #4 has worse performance than #5, suggesting that the JavaScript engines are actually encoding the switch statement as a series of if..else blocks, which can be slow if you have a lot of cases (which is definitely the case in a bytecode interpreter).

Chrome seems like it doesn't like it when you do lookups of negative values in arrays. It's possible that Chrome is using strings for negative numbers in arrays since the performance is so poor.

Overall, the numbers seem to suggest to me that if you've got hundreds of cases inside a switch statement, then using a lookup table of functions would probably be a better design though you've got to be careful not to use negative indices.

Wednesday, March 14, 2012

Eglinton + Finch LRT vs. Eglinton Subway

Now that I've figured out how to use the TTC schedule data to build maps of transit times in Toronto, I decided that it might be interesting to use my map generator to analyze some of the proposed transit plans for Toronto. Right now there's a lot of controversy over whether the city should build one long subway under Eglinton or build a shorter subway on Eglinton with the savings going towards an LRT on Finch. By generating different maps for these two different transit plans, I could then see for myself the effect of these plans on transit times.

Methodology

I've already described in detail in previous posts how I generated my maps based on the TTC schedules. Simulating the effect of the new transit plans simply involve adding in new fake schedules for the new transit lines. I could create these fake schedules based on the descriptions of stops, headways, and vehicle speeds given in various TTC and Metrolinx reports (many of which I found by digging through Steve Munro's blog). One issue I purposely avoided dealing with though is the fact that existing bus lines are modified whenever new transit lines are introduced. For example, when a new subway is introduced, bus service in the area is drastically reduced because most people will take the subway and the TTC needs the cost savings from eliminating bus service to pay for the operation of the subway. The bus service that is maintained is usually redesigned to funnel as many people as possible to the subway instead of trying to take people directly to their destinations. I considered the possibility of building up new fake routes and schedules for buses near the new transit lines, but it seemed time-consuming and error-prone. Instead, I'll just rely on the reader to use their judgment when interpreting the maps. The maps only show a broad approximation of what might occur if the new transit lines are introduced.

Baseline

Both of these plans will probably take 10 years to finish, so the Spadina subway extension will be done long before these plans will be implemented. Since the Finch LRT is designed to link up with the Spadina extension, it's important to model the Spadina extension in the experiment.

I went to the TTC website for the Spadina subway extension and figured out the approximate positions of the new subway stations. I wasn't sure if I put the stations in the right spots (i.e. beside existing bus stops so that transfers are easy), so to be safe, I changed the simulation to allow people to walk 200m between stops (unlike the 100m limit that I used previously). The different stations were about 1-1.5km apart from each other. Elsewhere on the Yonge-University-Spadina subway line, the subway seemed to be able to travel about 1.2km in about 2 minutes and 15 seconds, so I used that as the time needed to travel between each station on the new extension. I went through the existing schedules for the YUS-subway. Whenever a subway ended its route or began its route at Downsview station, I extended its schedule with stops along the Spadina extension.

Here is a map of the transit times to downtown including the Spadina subway extension:

Transit times to downtown with the Spadina extension
Data, imagery and map information provided by 
MapQuest,
Open Street Map and contributors, CC-BY-SA. 

This map serves as a baseline that we'll add the different Eglinton transit plans to. Remember that this is a map of the travel times needed to get to Queen or Osgoode stations. It shows an average of the latest times you can leave in order to reach one of those stations before 8:55 and 9:00.

Eglinton Subway

Although I was able to find general information about the Eglinton subway, I have problems finding detailed reports about its route. I was able to find information about all the stops for the Eglinton LRT, so I simply took this list of stops and pruned out those stops that were too close together and hence unlikely to be made into a subway station. The information I could gather from various Metrolinx presentations suggested that the Eglinton subway would go along Eglinton from Black Creek to Kennedy and then follow the route of the Scarborough RT to McCowan. The average speed for route was simulated to be 32 km/h with a 6 minute peak-time headway.

Transit times to downtown with the Spadina extension and the Eglinton subway
Data, imagery and map information provided by 
MapQuest,
Open Street Map and contributors, CC-BY-SA. 

Eglinton and Finch LRTs

I was able to find information about the Eglinton LRT stops and simply fed those into the model. The Eglinton LRT goes along Eglinton from Jane to Kennedy. The stops from Keele to Laird are underground while the rest are below ground. When moving between underground stations, I modeled the LRT as moving at 32 km/h. At other times, I modeled the LRT as moving as 22km/h. I kept the 6 minute headways the same as with the Eglinton subway. There is talk that this plan might include rebuilding the Scarborough RT as an LRT that goes from Kennedy to Scarborough Town Centre and then to Sheppard, but I couldn't really easily find any reports on this, so it might just be talk.

For the Finch LRT, I was able to find a list of stops in the Finch BRT plan. I modeled the Finch LRT as moving at 22km/h with 5 minute headways.

Transit times to downtown with the Spadina extension, Eglinton LRT, and Finch LRT
Data, imagery and map information provided by 
MapQuest,
Open Street Map and contributors, CC-BY-SA. 

Commentary

In the end, it turns out it's really hard to tell the difference between these two transit plans. Part of the problem is that my maps only have a new colour for each 20 minute interval, so commute times have to differ by at least 10 minutes before it's even noticeable on the map. Overall though, neither the subway or LRT plans provide large transformative changes compared to the other.

The Eglinton LRT is essentially the same transit line as the Eglinton subway except that it pops up above-ground for a few small sections. These two small sections are rather short and don't have a major effect on commute times. As such, both plans have largely the same effect in the central portions that are underground or near the underground sections.

Both plans provide a significant benefit to people living on Eglinton in the west. They seem to see good improvements in commute times to downtown. The Eglinton LRT extends a little bit further west than the subway, reaching Jane Street instead of ending at Black Creek. The map seems to suggest that this extra stop gives the LRT an additional benefit over the subway for people living west of Jane Street. I'm not sure if the extra 1.5km of LRT actually makes that big a difference in travel times. It might just be an anomaly with the way the LRT schedule lines up with the local bus schedule in the simulation.

Both plans show some small benefit on Eglinton between Yonge and the Don Valley. That area was actually already well served by buses and express buses, so the subway and LRT don't provide huge improvements in commute time to downtown.

The LRT shows some improvements in commute times for people living on Eglinton between the Don Valley and Victoria Park. The subway shows even more improvement for the people who live in that area.

Once you go east of Victoria Park though, things become a little murky. My maps seem to suggest that east of Victoria Park, it's no longer clearcut as to whether you should go downtown via Eglinton or take a bus down to the Bloor Danforth subway and ride that downtown instead. If a subway is built, it looks like it would be slightly better for commuters in the area to take the Eglinton subway over going down to the Bloor Danforth, especially if bus service ends up cut later on to fund the operation of the Eglinton routes. If the LRT is built, then it's probably better to take the bus down to the Bloor-Danforth than to take the LRT.

Once you actually get to Kennedy station though, I think it really becomes a toss-up as to which route you should take. If Metrolinx was overly conservative on the speeds for the Eglinton subway, and the TTC schedule for the Bloor-Danforth subway is optimistic then maybe it makes sense to take the Eglinton. Otherwise, my maps seems to suggest that the Scarborough RT and Bloor-Danforth subway actually already work pretty well together in providing fast service to Scarborough south of the 401. The main benefit of the Eglinton subway was that you wouldn't have to transfer at Kennedy onto the Bloor-Danforth subway not that it would actually get you downtown significantly faster. If the Eglinton LRT is built, then it is definitely better to take the Bloor-Danforth subway downtown than to travel along Eglinton.

One caveat of my discussion so far is that these maps show commute times to downtown. For people in Scarborough traveling to midtown, then there is a noticeable difference between the Eglinton LRT and Eglinton subway (7 extra minutes). For Scarborough people going downtown though, the Eglinton subway doesn't provide a huge benefit.

The main benefit of the Eglinton LRT over the Eglinton subway though is that it frees up money to spend on an LRT along Finch. The maps show that a Finch LRT provides a small improvement in commute times. Compared to existing service, the Spadina subway extension will actually provide a huge benefit in terms of improving commute times to downtown from the Jane and Finch area. The building of an LRT will only provide some modest improvements over the existing bus lines by about 6-7 minutes or so for people living around highway 400. It's not a huge, transformative gain, but that and the more reliable service that comes from having dedicated transit lanes are nice.

Big Picture

To be honest, I was a little disappointed at how little difference there was between the two plans. I was thinking that there would be huge swings in transit times depending on whether you included the full Eglinton subway or Finch LRT. Instead, all this arguing and fury over the plans are mostly about minor 5 minute differences in commute times at very specific points in the city. The colour differences on my maps are so small that I find it hard to tell the difference between actual differences in commute times and calibration problems with my monitor.

Personally, I think the main thing that all this simulation has shown me is how much Toronto's obsession with subways and trains has poisoned its transit planning. The people of Toronto have literally spent decades arguing over how to find the billions of dollars of funding necessary to build new LRTs and subways. Yet small, modest, and cheap improvements that provide real benefits like buying double-decker and articulated buses or setting aside land for future transit corridors have been completely ignored.

I was actually thinking of doing some further simulations comparing the difference between a Sheppard subway extension to Victoria Park vs. a Sheppard LRT, but my current simulation results suggest that it would be pointless to do so. All the experts and consultants already know what the results of these sorts of simulations are. All that bluster and anger are about whether a billion dollars should be spent to let people at Victoria Park get to downtown 10 minutes faster or whether a billion dollars should be more evenly distributed so that people all along Sheppard can zip down Sheppard 5 minutes faster. In the end, I don't think either plan makes a big difference to anyone.

Eglinton Subway vs. Eglinton LRT: Who cares?

So I've spent a day combing through transit reports in order to gather the data needed to simulate the effects of  the Eglinton LRT and Eglinton subway transit plans on commute times to downtown. I generated the necessary maps and am now able to carefully compare their effects.

Conclusion: who cares?

I've had to bring out a microscope to be able tell the difference between these two plans. At maximum zoom, I can detect a subtle difference in shading in a few places. It turns out that neither of these plans are big game changers. Apparently, the laws of physics win out. If you live far away from downtown, then it will take you a long time to commute there. If you live on the edges of the inner suburbs, you may end up saving 10 minutes or so on your one-way commutes. If you live in outer Scarborough, you likely won't see any real difference.

I'll write this up more properly later.

Friday, March 09, 2012

I need to get there by 9ish: Refined Toronto transit time map for the TTC and GO

So I've now refined my program for visualizing the transit times for Toronto. First of all, I've now included GO Transit data in the simulation. As I suspected, it didn't have much of an effect on the results. The distances in Toronto are just to small and GO Transit service is simply too infrequent for it to show up in a simulation of this type. Although taking a GO train might get you downtown faster than riding the bus, because the GO train comes so infrequently, you will likely arrive too early at your destination and end up waiting around. Since this simulation tries to calculate the latest you can leave from a location and still arrive at your destination by 9, the GO train ends up faring poorly. It may be possible that my 100m walk restriction may have resulted in GO train stations being unreachable (since GO Transit is mostly designed for car drivers, most of the stations are surrounded by giant parking lots, and you may have to walk more than 100m from the nearest bus stop to get to the train platform), but I don't really see any evidence of this sort of behaviour on the map, and I think the TTC does try to send buses as close to the platforms as possible. I'm not sure how frequent GO Transit service would have to run before it would show an effect on the map. I'll perhaps have to run a simulation one day to figure that out.

Data, imagery and map information provided by MapQuest,
Open Street Map and contributors, CC-BY-SA.

Another change I made was to discard bus stops from the map that do not have any service between 7am and 9am. These are likely late-night bus stops that don't normally receive service during the day, so it was misleading to include them in the final map of rush hour transit times.

And finally, I no longer simply calculate the best way to get to downtown by 9. I also calculate the best way to get to downtown by 8:55, and I then average the two results. This helps ensure that the data isn't skewed because the schedule shows a perfect transfer of buses (e.g. a bus that only runs every 30 minutes, but just happens to arrive just before another rare bus so that you can get to downtown without waiting--even though in practice, if the timing is just a little bit off, things don't line up and you end up 30 minutes late). By averaging the results for different times, it reduces the magnitude of this effect.

In the end though, the final map visualization looks pretty much the same as the previous one I generated except for the extra GO Transit routes.