Monday, June 18, 2018

Ranking of Racism against Asian Americans at Ivy League Schools

It has long been assumed that the Ivy League schools are racist against Asians, but it's been hard to gauge the extent of the racism. Some universities try to run purely meritocratic admissions systems with few subjective evaluations. For example, Caltech has an Asian enrollment of 43%, but it's not clear whether that's comparable to other schools because of Caltech's heavy engineering focus. Berkeley is a more well-rounded university and has an Asian enrollment of 41%, but it's a public university in a state with a high Asian population, so it's not clear if it's comparable to universities in other parts of the country.

Fortunately, Harvard ran the numbers back in 2013, and they found that if they ranked students solely by academic qualifications, then Asians would make up around 40% of admissions. Even if Harvard continued to set aside spots for athletes and undeserving rich people, then Asians would make up 31%. If extra-curriculars and other subjective measures were included as well, then Asians should still make up around 26% of the student population (even though in 2013, only 19% of the admitted class was Asian). I believe that the Asian applicant population has only grown since 2013.

These numbers agree with the Berkeley and Caltech numbers, so I feel it's safe to use these numbers to do back-of-the-envelope calculations for how racist against Asians each university is. The Harvard numbers should be comparable with other prestigious, well-rounded, private universities that attract students from across the country. So it should be safe to compare the numbers to other Ivy League universities.

So I visited the websites of the Ivy League universities, and grabbed their reported diversity statistics on Asian admissions. The numbers are hard to compare because different universities categorized their students differently. If universities had a separate category for unknown and/or foreign students, then I left them out of the total. If there was a category for multi-racial, I did not include that number in the count of percentage Asians. As a result, the numbers are very noisy, but I think they still give a basis for comparing universities. I think that universities with Asian enrollment in the high 20s or low 30s are demonstrating low amounts of racism against Asians.

So here are the results:

1. Brown (18.16%) - most racist
2. Yale (21%)
3. Dartmouth (21.74%)
4. Harvard (22.2%) - probably worse than Dartmouth, but the numbers are hard to compare
5. Cornell (23.26%)
6. UPenn (23.56%)
7. Princeton (25.29%)
8. Columbia (29%) - least racist

And here are Stanford's numbers even though they aren't an Ivy League university:

Stanford (26.44% Asian)

Initially, the numbers seem to suggest that Brown is the most racist against Asians of these universities. Brown also has the lowest African American enrollment of the Ivy League universities, so they simply fail on diversity in general. They do have a large number of students classified as multi-racial, which muddies the picture, but things still look bad after removing them from the totals. It's possible that Yale or Harvard are, in fact, the worst universities because they don't have a "multi-racial" category and their Asian enrollment percentages are pretty low.

There's a bunch of universities in the middle that seem to be sort of racist. Princeton seems to be the best of the middle.

The least racist, by far, seems to be Columbia, which achieves Asian enrollment in the high 20s, which is the safe zone. Columbia also has strong African American enrollment. It demonstrates that it is possible to have diverse minority enrollment without unduly punishing Asians.

So what's going on? These universities (other than Columbia) are using the implicit bias effect to willfully keep down Asian enrollment. They intentionally add subjective measures into student evaluations that are known to be subject to bias. They then hire admissions officers of dubious qualifications who don't understand the Asian experience or don't like Asians in general, who implicitly prefer applicants who are more like themselves, and who are told to find applicants who match an Ivy League "culture" or "character profile" that runs contrary to stereotypes about Asians. As a result, they end up giving lower subjective scores to Asians. Is the applicant who plays badminton more or less "brave" than the applicant who plays football? Is the atheist applicant more or less kind than the church-going applicant? There is no way of knowing those things, but people will inevitably form an opinion based on their implicit biases.

For example, Harvard gave lower "personality" ratings to Asian applicants in general. I can imagine that, yes, some universities might prefer people with certain personalities over others. But that mix of personalities should be evenly distributed among all people. If one race consistently scores poorly on the personality rating versus all other races, then there's some implicit racism going on there that needs to be fixed. Strangely, no personality deficiencies were found during alumni interviews. The bias only appeared during the ranking of personal qualities by the Admissions Office.

Here are some more in-depth articles if you want a deeper dive into the issues involved: 1, 2, 3.

Tuesday, June 12, 2018

Nan Native Module Asynchronous Callbacks in Electron with GWT

This problem has been causing me frustration for weeks, and I think I've finally figured out what was wrong.

I have a GWT application that I'm running as a desktop application using Electron. To access some Windows services, I wrote a native module in C++ that my JavaScript code can call into to invoke Windows functions. Some of the newest Windows APIs are asynchronous and long-running, so I made use of Nan's AsyncWorker framework to run C++ code in another thread and then call a callback function in JavaScript with the result afterwards.

But the code would always crash. If I executed the commands from the Electron/Chrome debugger console, they would run fine. But if I ran the same instructions in my compiled GWT application, the application would crash when the callback function was invoked from C++.

I spent weeks looking over the code and trying different variations, tearing my hair out, and I could never figure it out. Native modules (much like everything else in node.js and Electron) are underdocumented, but my code looked the same as the examples, and I couldn't find any reports of other people having problems. Maybe I was compiling things incorrectly? Was my build set-up wrong? Maybe mixing in WinRT and managed code was causing problems? But I think I've finally figured things out.

The problem is that the GWT code runs in an iframe, so the callback functions are defined in the iframe, and somehow, this leads to a crash when the C++ code tries to call these callback functions. To solve this problem, I've created a separate JavaScript shim that creates the callback functions in the context of the main web page. My GWT code can call into the main web page to create the callback functions and to pass them to the native module. Then the native module can safely call back into JavaScript from the AsyncWorker without any crashes.
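A minimal sketch of the shim idea (the function name and calling convention here are hypothetical; the real code passes Windows-specific results around):

```javascript
// Hypothetical shim, loaded by the main (top-level) page rather than the
// GWT iframe. Functions created here belong to the main page's JavaScript
// context, so the native module never holds a reference into the iframe.
function makeNativeCallback(iframeCallback) {
  // Return a fresh function owned by the main page that just forwards
  // the native module's result into the iframe's callback.
  return function (err, result) {
    iframeCallback(err, result);
  };
}

// The GWT code (inside the iframe) would reach this via the parent window,
// something like: nativeModule.doAsyncThing(window.parent.makeNativeCallback(cb));
```

The native module then only ever invokes a function that was created in the main page's context, and the forwarding into the iframe happens entirely in JavaScript.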

Side Note: When running Electron in Windows, it seems that the Windows message queue is managed from the main process. So if you have a Windows API that needs to be called from the UI thread, it should probably be called from the main process not the renderer process.

Tuesday, May 01, 2018

ES6 Modules: Limp and Overcooked

I've been eagerly awaiting a module system for JavaScript for many years. Although plans for a standardized module system have been floating around since as far back as ECMAScript 4, it's only become standardized and available during the last year or so. Usually, modules are a pretty intuitive language concept. You briefly look at a couple of examples, then you dive in and start using it, and everything just works. For some reason though, when I tried using ES6 modules in a project, my mind absolutely refused to accept them. I literally spent hours staring at these lines of code, and my brain couldn't do it:
import foo from './library.js';
import {foo} from './library.js';
Both lines of code are valid ES6 Modules code. Only one line is correct though, and it depends on how you've set up your modules. The difference is so confusing and the error messages are so cryptic that I just couldn't get my feeble brain to understand it.

Apparently, the JavaScript module system spent so long in the standardization oven that it has become overcooked and ruined. It's limp and dry and completely unappetizing. ES6 Modules are actually two completely different module systems that have been thrown together into JavaScript with no attempt to unify them at all. What's worse is that the two module systems use very similar syntax, and it's very easy to get things mixed up. Some misplaced squigglies result in you using the wrong module system, one that's incompatible with the library you want because the library was built with the other module system. What's doubly worse is that one of the module systems is already deprecated, and the preferred module system has the more complicated syntax. If one system is preferred, then why does the other one exist? And if the two module systems are different, why couldn't they have two completely different syntaxes?

What's weird is that they could have unified it. There would have been a lot of weird corner cases, but they could have made a consistent syntax. When I see the two lines above, I think of destructuring assignment.
const pair = getFullName();
const [firstName, lastName] = getFullName();

const point = getPoint();
const { x, y } = getPoint();
It's a bit unusual, but it's consistent. My mind could accept that.
import foo from './library.js';
could be for importing everything in the library into an object named foo
import {foo} from './library.js';
could be for importing foo from a library.

But, no, that's not how it works. Instead, the first line is for doing imports from modules built using the default module method, and the second line is for doing imports from modules built using the namespace module method. Oh, you can also build your libraries so that they are compatible with both types of modules, but since the two module systems are completely distinct, you can design your libraries to export completely different things depending on whether they are imported using the default module method or the namespace module method.
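To make the distinction concrete, here's a sketch of a module that feeds both systems at once. The module source is inlined as a data: URL purely so the example is self-contained; in a real project it would live in './library.js':

```javascript
// The same module source, consumed both ways.
const source =
  "export default function foo() { return 'default foo'; }\n" + // default module method
  "export function bar() { return 'named bar'; }";              // namespace module method

const url = 'data:text/javascript,' + encodeURIComponent(source);

// "import foo from './library.js'" binds the module's single default export,
// under whatever name you choose; "import { bar } from './library.js'" binds
// the export literally named bar. Dynamic import exposes both side by side:
import(url).then((mod) => {
  console.log(typeof mod.default); // the default-export side
  console.log(typeof mod.bar);     // the named-export side
});
```

Note that `mod.default()` and `mod.bar()` are completely independent bindings, which is exactly why a library author can make them export entirely different things.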

After several hours of my mind rejecting the ES6 approach to modules, I think I've finally gotten it accepted. I explained to my mind that the ES6 module system is complete garbage, but that's all that there is to eat, and it better not barf it all out like last time. It's not happy about it though.

Saturday, March 10, 2018

Copying Hard Disks with Bootable Windows GPT Partitions Using Linux

This is just a placeholder blog post. I keep intending to do a proper blog post on this topic, but I never get around to it. Unfortunately, I always forget the steps I need to do when I'm in the middle of copying my hard disks and can't consult my notes, so I'm going to just put some placeholder notes here and flesh them out properly later.

Note: Copying hard disks is tricky. I disclaim all responsibility if you use these steps and you lose data or your BIOS becomes corrupted or whatever. This blog post mainly serves as notes to myself.

About GPT Partitions
I still don't fully understand how a modern UEFI system boots from GPT partitions. I think with GPT and UEFI, your hard disk contains multiple partitions, and one of them is a special FAT partition containing boot loader programs with the instructions needed to load an OS from one of the other partitions. That partition is called the EFI System Partition. The UEFI BIOS of a computer will start Windows in one of two ways:
  1. Usually, the BIOS stores the specific UUID label of the boot EFI System Partition and the name of the boot loader program on that partition to load. The BIOS can then quickly load the boot loader and then continue on to load the OS.
  2. When you first install Windows, the BIOS doesn't have that information yet, so the BIOS is able to find the EFI System Partition on the hard disk itself, find the default Windows boot loader on that partition, and then run that to start Windows.
Why It's Tricky Copying Windows
Copying Windows GPT partitions is hard because
  • Windows makes it hard to copy Windows partitions
  • the bootloader program on the FAT partition has to be changed to load things from the new partition
  • the BIOS has to be changed to know about the new bootloader
Steps for Doing the Copy
By default, Windows is configured with settings that let it leave the file system in an inconsistent state on shutdown (in order to have faster shutdowns and bootups), which makes that a bad time to copy it. I tried various methods for disabling that, but the only reliable approach seems to be to turn off hibernation entirely. You need to start a command prompt in Administrator mode. Then run

     powercfg -h off

If you're copying to a smaller hard disk, you might want to use Windows Disk Management (right-click "This PC", choose "Manage", then choose "Disk Management" under the "Storage" category on the left) to shrink your partitions in advance, but that often doesn't work, and Linux can shrink your partitions anyway.

Also make sure that you have a bootable version of a Windows rescue CD or the Windows installation media.

Now, you can start up Linux to start copying your hard disk. I always use the GParted LiveCD to do this. GParted can be slow to start up because it scans through all your hard disks very slowly to find all the partitions. I think it might also do some really slow thing with Windows partitions as well. That scanning step gets slower the bigger your hard disk is, too. But I've found it to be pretty reliable. Use this to copy your partitions to the new hard disk.

msftres Partition
You may find that your hard disk has a msftres (Microsoft Reserved) partition. GParted is unable to copy this partition, but it is not necessary to copy it. It is just space that Microsoft reserves for Windows' own later use (I believe for things like Bitlocker disk encryption metadata). If you do want to keep this msftres partition, you can recreate one manually. First, copy all the partitions before the msftres partition. Then boot into the Windows installation CD (you did remember to make one, right?). Go into the Advanced Repair settings and get to the command prompt. Then use the "diskpart" program to create an msftres partition. I forget the exact steps. You can type things like "help", "help select disk", "help list partition", or "help create partition msr" to get the exact commands you need. But you need to select the new hard disk, then use something like "create partition msr size=128" to create a 128MB msftres partition. Then, you can reboot into GParted to copy the rest of the partitions.

GParted Fix-ups
GParted will make the copied partitions have the exact same IDs as the old partitions. This can be a problem if you keep both the new hard disk and the old hard disk in the same computer (e.g. when moving from a hard disk to an SSD, you might want to keep the old hard disk around in the same computer as a backup). It's not clear which boot partition the BIOS will use when starting up. And when Windows is loaded, the wrong one can potentially start up. And even if the right one starts up, the wrong partition might end up being mapped to the C drive. To be safe, if you intend on keeping both hard disks in your system, you should go in and change the UUIDs of all the partitions on either the new drive or the old drive. I'm not 100% sure about Windows recovery partitions. To have the recovery work properly on the new hard disk, it might be better to have it keep the same UUID, so then generating new UUIDs for the old hard disk may be better. But I could never get that recovery partition to work properly anyway, and I would rather have at least one proper working copy of my hard disk in case the copy goes bad, so maybe it's better to generate new UUIDs for the new hard disk.

GParted also often doesn't copy the partition labels and flags correctly. You can go in and set those manually so that they're the same on both disks. I've forgotten to do this before, and everything still seemed to work, so this might not be necessary.

Update UEFI BIOS with Location of Bootloader
Now that Windows is copied, you need to update the BIOS with which bootloader to use on start-up. If you didn't change the UUIDs of the newly copied partitions, then this might not be necessary since the existing BIOS entry for the location of the old bootloader should still work. You might need to swap hard drive cables to ensure that the new hard drive takes over the old drive number from the old hard drive. Or maybe not? 

If you did change the UUID of the boot partition, then this is definitely necessary. Open a Linux terminal. Become root by using "sudo bash". Then use the "efibootmgr" program to create the necessary entries. 

I can never figure out how to create new BIOS EFI entries using efibootmgr, so I sometimes try to use some other approach to create these new entries. Sometimes, if you start up your system with only your new hard disk, the BIOS won't find the default bootloader, so it will default to searching for the Windows bootloader itself, and it will then add an entry for it in the BIOS itself. Or you can try starting a Windows Recovery CD or installation DVD, going to the command-line repair tools, and trying to use "bcdedit", "bcdboot", or "bootrec /rebuildbcd" to do this. I'm not sure what these programs do, but I think one of them will create a new EFI BIOS entry for the bootloader partition.

Once you have an EFI entry in the BIOS for the bootloader, you can go back to using "efibootmgr" to rearrange the order of your boot entries so that it comes first, which is a lot easier than creating a new entry from scratch.

Update Bootloader with New Location of Windows Partition
Now the BIOS can find the bootloader to start loading Windows, but the bootloader may not point to the correct partition to actually start the OS. Again, you can try starting a Windows Recovery CD or installation DVD, going to the command-line repair tools, and trying to use "bcdedit", "bcdboot", or "bootrec /rebuildbcd" to do this. Again, I'm not sure what these programs do. I don't actually know what the BCD is, and the Microsoft documentation is very vague on that point. I think it refers mostly to the configuration files for the bootloader on the EFI FAT system partition, but I don't know. Sometimes, everything has gone bad, and you need to use "bcdedit" to create a completely new BCD store. In any case, after randomly running some combination of those programs, Windows will somehow fix itself and become bootable.

Checking Windows
When you do manage to successfully boot into Windows again, go into Disk Management to make sure that the correct drive is listed as your boot drive and that it is the C drive. You might also need to manually remove drive letters from some of your recovery drives and other drives. Don't forget to re-enable hibernation by going to the command prompt as an administrator and using

     powercfg -h on

Fixing Up Your Linux Bootloader
I normally use Windows, but I keep a copy of Linux on my drives for occasional use. Normally, I just let Linux install a boot loader into the EFI system partition and add an entry to the BIOS. I change the boot order to normally boot to Windows, but I use the "boot from alternate drive" key on startup to show all the EFI boot entries, and then I manually choose the Linux one. 

After copying a Linux partition to a new hard disk, you have to reinstall the grub bootloader on the EFI system partition to point to the new Linux partition. Reinstalling grub is mostly impossible, so I find it easier to simply reinstall Linux over the old version (I keep my Linux data on a separate /home partition, so that it's safe to do that without losing data).

Sunday, March 04, 2018

Creating an SDF Texture for a Font at Runtime

I was recently trying to implement text for the graphics engine behind Omber. Unfortunately, although I had a complete vector graphics engine, I hadn't gotten around to implementing support for shapes with holes in them, so I couldn't just take the vector representation of each font glyph and render them directly. Instead, I ended up using the standard approach used in many 3d game engines, which is to use Signed Distance Field fonts.

Most people pre-generate their SDF textures, but that didn't really seem feasible to me. I wanted my code to let people use their own fonts in their drawings, so I couldn't precalculate SDF textures for those fonts. Also, international fonts might contain thousands of characters, and it would be too memory intensive to calculate textures for all of them in advance. So I went about trying to figure out how to generate my own SDF textures, and I learned some good lessons about how to do it.

At first, I didn't really understand how SDF worked, so I tried using the dumb approach of drawing a character on a bitmap, and then manually trying to calculate the SDF values. This actually takes a bit of time to code up, it's really slow to run, and the results are sort of poor. I didn't really understand that the shader really only cares about SDF values for the one or two pixels near the edge of a glyph, and the SDF values need to have subpixel accuracy. True, you might see some demos of people adjusting SDF cut-off values to make variations of a font with different font weights. But for normal situations, you only care about SDF values within a pixel or two of the edges of a glyph because when those values are linearly interpolated, you get a good approximation of the angle of the edge in that area. And you really do need subpixel accuracy or your linear interpolation will simply give you back your chunky pixels. To get that sub-pixel accuracy using a raster approach, you would have to draw your glyphs at a really big size and then scale the bitmaps down, but that's even slower and you lose a lot of accuracy.

Instead, it turned out to be both faster and more accurate to generate the SDF directly from the vector representation. I already had a vector graphics engine, which made it easier, but you actually don't need much vector logic. Basically, you only need a few things. You need a way to extract the bezier curves of each glyph. I was working in JavaScript, so typr.js and opentype.js were available libraries for that. Then you need a Bezier subdivider to convert all the bezier curves to lines. Then you take that bag of lines and throw them into a Point-in-Polygon routine (that calculates whether you cross an even or odd number of lines to see if you're inside a polygon or not) to get the sign and a Distance-to-Line routine to get the distance, and you're done. Since you're working with floating-point values, you get very high precision with sub-pixel accuracy. And since you don't have to scan through lots of pixels to calculate distances, it turns out it's really fast. That actually makes sense because even old computers could render vector fonts at a reasonable speed, so it should be possible to calculate a low-resolution SDF quite quickly too.
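For reference, here's a stripped-down sketch of that per-texel computation, assuming the Beziers have already been flattened into line segments (each stored here as [x1, y1, x2, y2]; the names and the sign convention are my own):

```javascript
// Distance from point (px, py) to the line segment (x1, y1)-(x2, y2).
function distanceToSegment(px, py, x1, y1, x2, y2) {
  const dx = x2 - x1, dy = y2 - y1;
  const lenSq = dx * dx + dy * dy;
  // Project the point onto the segment, clamping to the endpoints.
  let t = lenSq === 0 ? 0 : ((px - x1) * dx + (py - y1) * dy) / lenSq;
  t = Math.max(0, Math.min(1, t));
  return Math.hypot(px - (x1 + t * dx), py - (y1 + t * dy));
}

// Even-odd point-in-polygon test: count how many segments a horizontal
// ray from (px, py) crosses; an odd count means we're inside the glyph.
function isInside(px, py, segments) {
  let crossings = 0;
  for (const [x1, y1, x2, y2] of segments) {
    if ((y1 > py) !== (y2 > py)) {
      const xCross = x1 + ((py - y1) / (y2 - y1)) * (x2 - x1);
      if (xCross > px) crossings++;
    }
  }
  return crossings % 2 === 1;
}

// The SDF value for one texel: distance to the nearest outline segment,
// positive inside the glyph and negative outside.
function signedDistance(px, py, segments) {
  let d = Infinity;
  for (const [x1, y1, x2, y2] of segments) {
    d = Math.min(d, distanceToSegment(px, py, x1, y1, x2, y2));
  }
  return isInside(px, py, segments) ? d : -d;
}
```

Since everything stays in floating point, the distances come out with the sub-pixel accuracy the shader's linear interpolation needs, and no large bitmap ever has to be scanned.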


So, yeah. Calculate your SDF textures at runtime straight from the vector representation because it's not much code and it's faster.

Thursday, October 26, 2017

glTF 2.0: I Like It!

Although I'm not a 3d graphics person, I have worked with several 3d file formats. In general, I've been very disappointed in the design of these file formats. But I've finally found a 3d file format that I like. glTF 2.0 is actually pretty nice.

It's a mostly straight-forward, easy to understand file format that's pretty unambiguous. It doesn't try to implement any fancy features like U3D. It doesn't contain weird legacy baggage like X3D or COLLADA. Its design isn't so overly configurable or flexible that it's impossible to know whether what you store in it can be read by other programs like COLLADA or TIFF. It just holds a bunch of triangles and associated data structures. It seems like it was built from the ground up as a proper file format for interchange instead of growing out of some existing system with all sorts of strange behavior based on how the codebase for the original system evolved. It also has good extension points making it easy to store additional application-specific data in a file.

I think part of the reason why it came out so well is that it was originally designed for one purpose only: for sending 3d models to be displayed by WebGL. With a well-defined and basic use case, the designers had the focus to make something straight-forward and easy to work with. With glTF 2.0, the file format has been extended to support more general use cases, but the core use case--holding 3d models--hasn't been diluted by that. Storing 3d models in glTF 2.0 is still clear and concise without a lot of confusion.

I still have a few niggles with it, though, that could be improved. Right now, the file format doesn't have widespread support, but support is starting to grow. Still, given that this is a file format specification, I feel like there should have been at least one proper reference importer/exporter for the file format before it was finalized. There are multiple implementations of the spec, which is good, but none of them are complete and comprehensive and allow for proper bidirectional interfacing with a real 3d application, so it's hard to know whether the files I've created are correct or whether all the corners of the file format have been fully tested.

Some parts of the specification don't really give proper explanations or context for why they are needed. For example, I still don't understand why accessor.min and accessor.max exist. I'm sure there's a good reason, but they just seem like an unnecessary hassle to me. Especially given that it's impossible to exactly encode a 32-bit floating point number as a decimal string, I just can't see what use an inaccurate record of the min and max x,y,z values of some points is. Having more context there would be useful for implementors.

Another example is the different buffer, bufferView, and accessor objects needed to refer to memory. It took me a long time to figure out what the difference was. At first, I thought you could put the data for everything in a single bufferView and just use different accessors to refer to different chunks of it. It was only later, when I read that bufferViews were intended to refer to OpenGL memory buffers, that I finally understood what each level of memory reference is for. The different buffers are meant to refer to different data stored on disk. Usually, you'll only have one buffer, but if you have different models that share data, you can put the shared data in a separate file/buffer that those two models can share. A bufferView refers to a single in-memory chunk of data loaded into memory for a model. So, having a single bufferView for an entire scene would be wrong. You would normally have one or more bufferViews for each 3d object in a scene. In general, when accessing data from a bufferView, you would always read from the start of the bufferView; if you find yourself reading from an offset into the bufferView, then you should probably just use a separate bufferView instead. The accessors describe how to read the individual data fields of a bufferView. Notably, the bufferView contains a byteStride property that allows a bufferView to be broken up into different records or entries, and an accessor describes how the different fields are stored/interleaved inside a record or entry. An accessor's byteOffset is supposed to be used for offsets into a record or entry, not for starting at an offset into a bufferView.
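To make that concrete, here's a sketch (with made-up sizes and a hypothetical file name) of ten interleaved position+normal vertices described at the three levels:

```javascript
// One on-disk buffer, one bufferView of interleaved 24-byte vertex records,
// and two accessors picking out the fields inside each record.
const gltf = {
  buffers: [
    { uri: 'model.bin', byteLength: 240 }   // the data as stored on disk
  ],
  bufferViews: [
    { buffer: 0, byteOffset: 0, byteLength: 240,
      byteStride: 24 }                      // each vertex record is 24 bytes
  ],
  accessors: [
    // position: bytes 0-11 of each record (3 x 4-byte FLOAT, componentType 5126)
    { bufferView: 0, byteOffset: 0, componentType: 5126, type: 'VEC3', count: 10 },
    // normal: bytes 12-23 of each record; the offset is *within* the record
    { bufferView: 0, byteOffset: 12, componentType: 5126, type: 'VEC3', count: 10 }
  ]
};

// Sanity check: record count times stride should cover the whole view.
console.log(gltf.accessors[0].count * gltf.bufferViews[0].byteStride ===
            gltf.bufferViews[0].byteLength);
```

The point is that both accessors share the one bufferView, and their byteOffsets are small offsets inside a 24-byte record, not offsets deep into the view.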

glTF 2.0 also offers a convenient format for storing all the 3d data in a single file, called GLB. The GLB specification is nice in that it's really basic and straight-forward, but its design is a little sloppy. The GLB file format has its total file size encoded in it, which is unnecessary and prevents the data from being streamed. Even if that were fixed, the design of the chunks inside the file also prevents writing the data out in a single stream. All the parts of the file have to be written out separately first, their sizes determined, and only then can they be assembled and written out into a GLB file. This is caused by the fact that there can only be a single buffer chunk, and the JSON chunk (which contains references into the buffer chunk) has to be written out before the buffer chunk.
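A sketch of a GLB writer makes the problem visible: the total length goes in the header, so every chunk has to be materialized and measured before the first byte can be written. (This is a hedged Node Buffer sketch of the container layout, not a full exporter.)

```javascript
// Assemble a GLB container from a glTF JSON object and a binary buffer.
function buildGlb(jsonObj, binBuffer) {
  // Chunks must be padded to 4-byte boundaries: JSON with spaces, BIN with zeros.
  const pad = (buf, fill) => {
    const rem = buf.length % 4;
    return rem ? Buffer.concat([buf, Buffer.alloc(4 - rem, fill)]) : buf;
  };
  const json = pad(Buffer.from(JSON.stringify(jsonObj)), 0x20);
  const bin = pad(binBuffer, 0x00);

  const total = 12 + 8 + json.length + 8 + bin.length;
  const out = Buffer.alloc(total);
  out.writeUInt32LE(0x46546c67, 0); // magic: the ASCII bytes 'glTF'
  out.writeUInt32LE(2, 4);          // container version
  out.writeUInt32LE(total, 8);      // whole-file length: must be known up front
  let off = 12;
  out.writeUInt32LE(json.length, off); off += 4;
  out.writeUInt32LE(0x4e4f534a, off); off += 4; // chunk type 'JSON'
  json.copy(out, off); off += json.length;
  out.writeUInt32LE(bin.length, off); off += 4;
  out.writeUInt32LE(0x004e4942, off); off += 4; // chunk type 'BIN'
  bin.copy(out, off);
  return out;
}
```

Notice that nothing can be emitted until `total`, `json.length`, and `bin.length` are all known, which is exactly why a streaming writer isn't possible.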

Overall though, I really like the glTF 2.0 format. I really hope it gets widespread adoption. I definitely see it displacing the .OBJ format in the long term.

Thursday, July 27, 2017

WKWebView for Clueless Mac Programmers

I've been recently trying to package up my vector design web app Omber as a Macintosh app. Unfortunately, I had zero knowledge about Mac programming. Like, I never owned a Mac. I didn't even know how to get the cursor to go to the start of a line or skip a word using the keyboard without having to look up Stack Overflow. I tried using Electron, but after spending a long time going through various documentation to rebrand and package things (the nw.js documentation is so much better, and always such a joy to read compared to the Electron docs), I wasn't too satisfied with the result. It worked, but it was sort of clunky, and I think there was some weird sandbox thing going on that caused file reading to sometimes work but sometimes not. With Windows, it makes sense to use Electron because the Windows default browser engine has weird behaviour and not everyone has the latest version of Windows. But on the Mac, everyone gets free OS upgrades to the latest version and the browser engine is fairly decent, so there's no need to include a 100MB browser engine with an app. So I figured I could whip together a quick Mac application that's just a window with a web view in it in about the same amount of time it would take to debug the Electron sandbox issues.

<rant about Mac programming>
Programming for the Mac is just like using a Mac. Apple hides important details and tries to force you to do things their way. Apple keeps changing things underneath you, so all the documentation online or in books is always vaguely out of date. It's also expensive. I bought the cheapest Mac mini, with 4GB RAM and a hard disk, for development, thinking I could do mostly command-line stuff, but that's not the case. You really need to work from Xcode, and Xcode is a pig of a program that takes up a lot of RAM and is sort of slow. I almost immediately had to switch to using an external SSD on USB to get any reasonable responsiveness from my system.

Apple is really trying to stuff Swift down everyone's throats, but I opted to go with Objective-C because of my Smalltalk background. It's not bad, except the syntax is somewhat awful. My main issue with it is that part of what makes Smalltalk so productive is that it comes with an advanced IDE that's super fast and makes it easy to browse around large application frameworks to figure out how to use an API. Objective-C comes with an overwhelmingly huge application framework as well, but Xcode is slow and pokey and doesn't come with good facilities for diving through the framework. Code completion is not good enough. There should be a quick way to find how other people call the same method, to check out the documentation for a method, and to check out the inheritance tree. Xcode is more of a traditional IDE with some code completion. It would be nice if Xcode actually labelled all of its inscrutable icons too. No one knows what any of those buttons mean, but using those buttons isn't optional either.

The latest MacOS/OSX versions do include a web view, but I always get the feeling that Safari developers don't really understand web apps and want to discourage people from making them.
I find that they implement just enough features in Safari to support their own uses and then lose all interest in implementing things in a general way that could serve multiple uses. For example, for the longest time, they refused to implement the download attribute on links because Apple didn't need it, so why should anyone else need it? Then, when they did implement it, it initially didn't work on data URLs and blobs because they didn't understand how important those were for web apps. Similarly, the new WKWebView initially could only show files from the Internet and couldn't load anything locally, making it useless for JavaScript downloadable apps. Then, even after they fixed that, things like web workers and XMLHttpRequest are still broken, really limiting its usefulness.
</rant about Mac programming>

Anyway, I found a great blog post that shows how to make a one-window app with a web view in it. It lists every step, so it's easy to follow along even with no understanding of Mac programming. It worked for me, but Xcode has since changed its default app layout to use storyboards, so some of the instructions no longer work, and it used the old WebView, which is very limited. The new WKWebView is better because it allows for JIT compilation of the JavaScript, and it comes with proper facilities for letting the JavaScript send data to native code (the old web view required a bad hack to do that). So here are some updated instructions:
  1. Get Xcode and start it up
  2. Create a new Xcode project
  3. Make a MacOS Cocoa Application
  4. Fill in the appropriate application info, choose Objective-C for the language
  5. That should bring you to the screen where you can adjust the project settings
    1. If you want to run in a sandbox, I think you have to turn on signing. I think Xcode will take care of getting the appropriate certificates for you (I had already gotten them earlier).
    2. At the bottom of the General settings, under "Linked Frameworks and Libraries", you should add the WebKit.framework
    3. In the Capabilities tab, you can turn on the App Sandbox if you want (I think this is needed for the Mac App Store). Be careful, there seems to be a UI bug there. Once you turn on the app sandbox, you can't turn it off from the UI any more.
    4. If you do enable the App Sandbox, you also need to enable "Outgoing Connections (Client)" in the Network category. This is required even if you don't use the network. WKWebView seemed to have problems loading local files if the network entitlement wasn't enabled.
  6. Go to your ViewController.h, and change it to
  7. #import <Cocoa/Cocoa.h>
    #import <WebKit/WebKit.h>
    
    @interface ViewController : NSViewController
    
    @property(strong,nonatomic) WKWebView* webView;
    
    @end
    
  8. When using storyboards, the app delegate doesn't have direct access to the view, so you have to control the view from the view controller instead.
  9. Then go to your ViewController.m. Usually, you would draw a web view in the view of the storyboard and then hook it up to the view controller. Although this is possible with the WKWebView, all the documentation I've seen suggests manually creating the WKWebView instead. I think this might be necessary in order to pass in all the configuration you want for the WKWebView. To manually create the WKWebView, add these methods that show a basic web page:
  10. - (void)loadView {
        [super loadView];
        _webView = [[WKWebView alloc] initWithFrame: 
            [[self view] bounds] ];
        // Let the web view grow and shrink along with the
        // window when it's resized
        [_webView setAutoresizingMask:
            (NSViewWidthSizable | NSViewHeightSizable)];
        [[self view] addSubview:_webView];
        // Instead of adding the web view as a subview as
        // in above, you can also just replace the whole 
        // view with the web view using
        //     [self setView: _webView];
    }
    
    - (void)awakeFromNib {
        [_webView loadRequest:
            [NSURLRequest requestWithURL:
            [NSURL URLWithString:@"https://www.example.com"]]];
    }
    
  11. Now when you run the program, you should see the web page from example.com there.
  12. The next step is to create a directory with all your local web pages that you want to show. Create a folder named html in the Finder (i.e. just a normal folder somewhere outside of Xcode)
  13. Drag that folder onto your project in the file list. Enable "Destination: Copy if needed" and "Added folders: Create folder references"
  14. You should now have an html folder in your project. You can delete the original html folder that you created earlier in the Finder since you no longer need it. (You can confirm that the html folder will be included in your project properly by looking at your project file under the Build Phases tab; the html folder should be listed in the Copy Bundle Resources section)
  15. Create an index.html file in your new html folder. Put some stuff in it.
  16. To show that page, go to your ViewController.m and change the awakeFromNib method to this:
  17. - (void)awakeFromNib {
        NSString *resourcePath = 
            [[NSBundle mainBundle] resourcePath];
        NSString *htmlPath = [resourcePath 
            stringByAppendingString:@"/html/index.html"];
        NSString *htmlDirPath = [resourcePath 
            stringByAppendingString:@"/html/"];
        [_webView
            loadFileURL:[NSURL fileURLWithPath:htmlPath]
            allowingReadAccessToURL:
                [NSURL fileURLWithPath:htmlDirPath isDirectory:YES]];
    }
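
As a bonus, here's how the proper facilities for letting JavaScript send data to native code work. The sketch below is a variant of the loadView method from step 10 that passes in a WKWebViewConfiguration with a script message handler registered on it; the handler name "native" and the log message are just placeholder choices of mine, so adapt them to your own app. The view controller has to declare that it conforms to the WKScriptMessageHandler protocol in ViewController.h:

    @interface ViewController : NSViewController <WKScriptMessageHandler>

    @property(strong,nonatomic) WKWebView* webView;

    @end

Then in ViewController.m:

    - (void)loadView {
        [super loadView];
        // Register a message handler named "native" on the
        // configuration before creating the web view
        WKWebViewConfiguration *config =
            [[WKWebViewConfiguration alloc] init];
        [[config userContentController]
            addScriptMessageHandler:self name:@"native"];
        _webView = [[WKWebView alloc]
            initWithFrame:[[self view] bounds]
            configuration:config];
        [[self view] addSubview:_webView];
    }

    // Called whenever the page posts a message to the
    // "native" handler
    - (void)userContentController:
            (WKUserContentController *)userContentController
        didReceiveScriptMessage:(WKScriptMessage *)message {
        NSLog(@"Message from JavaScript: %@", [message body]);
    }

On the JavaScript side, the page can then call window.webkit.messageHandlers.native.postMessage('hello') to deliver data to the native side, which arrives as the body of the WKScriptMessage.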