Tuesday, September 10, 2019

Swift ASN.1 Decoder for iOS Receipt Validation

If you want to offer in-app purchases in an iOS or macOS app, you need a way to check which purchases have been made. Annoyingly, Apple does not provide developers with any code for doing this. Apple's APIs will give your program a receipt listing what was purchased, but the receipt is encoded in an obscure binary format, and Apple doesn't provide any code for reading it. Apple's reasoning is that withholding this code acts as a very limited form of DRM/copy protection: if every program has custom code for parsing and interpreting the receipt, software pirates need to do extra work to crack each piece of software.

It is true that software piracy is rampant on Android, and it probably exists on iOS too. Some of us aren't really too concerned with this software piracy issue though, and we just want to implement some quick and dirty handling of IAP with the assumption that most software pirates wouldn't have purchased the software anyway. 

Apple's preferred solution is for you to create your own receipt validation server that your apps connect to, which in turn contacts Apple's servers to parse the receipt and confirm that it's valid. This is a bit of a hassle because you have to build an online service, figure out how to keep it running, protect it from hackers, and accept that your app becomes more fragile because it now depends on that service being up.
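For reference, the server-side flow mostly boils down to POSTing the base64-encoded receipt to Apple's verifyReceipt endpoint as JSON. Here's a minimal sketch of building that request; the helper function name is my own, and the "password" field (the app's shared secret) is only needed for auto-renewable subscriptions:

```swift
import Foundation
#if canImport(FoundationNetworking)
import FoundationNetworking  // URLRequest lives here on Linux
#endif

// Hypothetical helper: build the POST request for Apple's verifyReceipt endpoint
func makeVerifyReceiptRequest(receiptData: Data, password: String? = nil) -> URLRequest {
    var request = URLRequest(url: URL(string: "https://buy.itunes.apple.com/verifyReceipt")!)
    request.httpMethod = "POST"
    // The request body is JSON with the receipt encoded in base64
    var body: [String: Any] = ["receipt-data": receiptData.base64EncodedString()]
    if let password = password {
        body["password"] = password  // shared secret for auto-renewable subscriptions
    }
    request.httpBody = try? JSONSerialization.data(withJSONObject: body)
    return request
}
```

During development you'd point this at the sandbox endpoint (https://sandbox.itunes.apple.com/verifyReceipt) instead; Apple returns status 21007 if you send a sandbox receipt to the production endpoint.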

The other solution is to do receipt validation in the app itself. This is annoying because Apple doesn't provide code for parsing the receipt, the receipt stored on the device contains less information than what Apple's servers provide, and iOS doesn't always keep the receipt up-to-date, meaning you often have to go out of your way to refresh it yourself. The most common way to do the receipt parsing is to include a copy of OpenSSL in the app, but that involves some annoying interfacing with C code.

I just wanted something quick and dirty, and I'm not too concerned about doing all the signature checking and whatnot, so I went looking for some simple Objective-C or Swift code for doing the receipt parsing. I searched around online a lot but couldn't find any, so eventually I rolled my own. It's pretty rough, since I only threw it together until it worked well enough for my own app, so use at your own risk. Here it is:

struct Asn1BerTag : CustomStringConvertible {
    var constructed: Bool
    var tagClass: Int
    var tag: Int
    var description: String {
        return String(tagClass) + (constructed ? "C" : "-") + String(tag)
    }
}

struct Asn1Entry {
    let tag : Asn1BerTag
    let data : Data
    let len : Int
}

// TODO: This parser thing is sort of insecure because it doesn't really do bounds-checking on
// anything, but it's only used for reading internal data structures so whatever
class Asn1Parser {
    // Parse a single ASN 1 BER entry
    static func parse(_ data: Data, startIdx: Int = 0) -> Asn1Entry {
        var idx = startIdx
        // Try to parse the tag
        var val = data[idx]
        idx += 1
        let tagClass = Int((val >> 6) & 3)
        let constructed = (val & (1 << 5)) != 0
        var tagVal = Int(val & 0x1F)
        if tagVal == 31 {
            // High tag number form: the tag is encoded base-128 in the
            // following octets, 7 bits per octet
            tagVal = 0
            val = data[idx]
            idx += 1
            while (val & 0x80) != 0 {
                tagVal <<= 7
                tagVal |= Int(val & 0x7F)
                val = data[idx]
                idx += 1
            }
            tagVal <<= 7
            tagVal |= Int(val & 0x7F)
        }
        let tag = Asn1BerTag(constructed: constructed, tagClass: tagClass, tag: tagVal)
        
        // Try to parse the size
        var len = 0
        var nextTag = 0
        val = data[idx]
        idx += 1
        if val & 0x80 == 0 {
            len = Int(val)
            nextTag = idx + len
        } else if val != 0x80 {
            let numOctets = Int(val & 0x7f)
            for _ in 0..<numOctets {
                len <<= 8
                val = data[idx]
                idx += 1
                len |= Int(val) & 0xFF
            }
            nextTag = idx + len
        } else {
            // Indefinite length. Scan until we encounter 2 zero bytes
            var scanIdx = idx
            while !(data[scanIdx] == 0 && data[scanIdx + 1] == 0) {
                scanIdx += 1
            }
            len = scanIdx - idx
            nextTag = scanIdx + 2
        }
        return Asn1Entry(tag: tag, data: data.subdata(in: idx..<(idx + len)), len: nextTag - startIdx)
    }
    
    static func parseSequence(_ data: Data) -> [Asn1Entry] {
        var toReturn : [Asn1Entry] = []
        var idx = 0
        while idx < data.count {
            let entry = Asn1Parser.parse(data, startIdx: idx)
            toReturn.append(entry)
            idx += entry.len
        }
        
        return toReturn
    }
    
    static func parseInteger(_ data: Data) -> Int {
        let len = data.count
        var val = 0
        
        for i in 0..<len {
            val <<= 8
            val |= Int(data[i])
        }
        // Big-endian two's complement: if the sign bit of the first byte is
        // set, subtract 2^(8 * len) to get the (negative) value
        if len > 0 && (data[0] & 0x80) != 0 {
            val -= 1 << (len * 8)
        }
        return val
    }
    
    static func parseObjectIdentifier(_ data:Data, startIdx: Int = 0, len: Int? = nil) -> [Int] {
        let dataLen = len ?? data.count
        var idx = startIdx
        var identifier: [Int] = []
        while idx < startIdx + dataLen {
            var subidentifier = 0
            var val = data[idx]
            idx += 1
            while (val & 0x80) != 0 {
                subidentifier <<= 7
                subidentifier |= Int(val & 0x7F)
                val = data[idx]
                idx += 1
            }
            subidentifier <<= 7
            subidentifier |= Int(val & 0x7F)
            identifier.append(subidentifier)
        }
        
        return identifier
    }
}

class IapReceipt {
    var quantity: Int?
    var product_id: String?
    var transaction_id: String?
    var original_transaction_id: String?
    var purchase_date: Date?
    var original_purchase_date: Date?
    var expires_date: Date?
    var is_in_intro_offer_period: Int?
    var cancellation_date: Date?
    var web_order_line_item_id: Int?
}

class AppReceipt {
    var bundle_id : String?
    var application_version : String?
    var receipt_creation_date: Date?
    var expiration_date: Date?
    var original_application_version : String?
    var iaps: [IapReceipt] = []
}

class ReceiptInsecureChecker {
   
    func parsePkcs7ReceiptForPayload(_ data: Data) -> Data? {
        
        // Root is a sequence (tag 16 is sequence)
        let root = Asn1Parser.parseSequence(data)
        guard root.count == 1 && root[0].tag.tag == 16 else { return nil }
        
        // Inside the sequence is some signed data (tag 6 is object identifier;
        // [42, 840, 113549, 1, 7, 2] is OID 1.2.840.113549.1.7.2, pkcs7-signedData,
        // with the first two components packed into one value: 40 * 1 + 2 = 42)
        let rootSeq = Asn1Parser.parseSequence(root[0].data)
        guard rootSeq.count == 2 && rootSeq[0].tag.tag == 6 && Asn1Parser.parseObjectIdentifier(rootSeq[0].data) == [42, 840, 113549, 1, 7, 2] else { return nil }
        
        // Signed Data contains a sequence
        let signedData = Asn1Parser.parseSequence(rootSeq[1].data)
        guard signedData.count == 1 && signedData[0].tag.tag == 16 else { return nil }
        
        // The third field of the signed data sequence is the actual data
        let signedDataSeq = Asn1Parser.parseSequence(signedData[0].data)
        guard signedDataSeq.count > 3 && signedDataSeq[2].tag.tag == 16 else { return nil }
        
        // The content should be identified as OID 1.2.840.113549.1.7.1 (pkcs7-data)
        let contentData = Asn1Parser.parseSequence(signedDataSeq[2].data)
        guard contentData.count == 2 && contentData[0].tag.tag == 6 && Asn1Parser.parseObjectIdentifier(contentData[0].data) == [42, 840, 113549, 1, 7, 1] else { return nil }
        
        // Payload should just be some bytes (tag 4 is octet string)
        let payload = Asn1Parser.parse(contentData[1].data)
        guard payload.tag.tag == 4 else { return nil }
        
        return payload.data
    }
    
    func parseReceiptAttributes(_ data: Data) -> AppReceipt? {
        var appReceipt = AppReceipt()
        
        // Root is a set (tag 17 is a set)
        let root = Asn1Parser.parse(data)
        guard root.tag.tag == 17 else { return nil }
        
        // Read set entries
        let receiptAttributes = Asn1Parser.parseSequence(root.data)
        
        // Parse each attribute
        for attr in receiptAttributes {
            if attr.tag.tag != 16 { continue }
            let attrEntries = Asn1Parser.parseSequence(attr.data)
            guard attrEntries.count == 3 && attrEntries[0].tag.tag == 2 && attrEntries[1].tag.tag == 2 && attrEntries[2].tag.tag == 4 else { return nil }
            
            let type = Asn1Parser.parseInteger(attrEntries[0].data)
            _ = Asn1Parser.parseInteger(attrEntries[1].data)  // attribute version, currently unused
            let value = attrEntries[2].data
            switch (type) {
            case 2:
                let valEntry = Asn1Parser.parse(value)
                // tag 12 = utf8 string
                guard valEntry.tag.tag == 12 else { break }
                appReceipt.bundle_id = String(bytes: valEntry.data, encoding: .utf8)
            case 3:
                let valEntry = Asn1Parser.parse(value)
                guard valEntry.tag.tag == 12 else { break }
                appReceipt.application_version = String(bytes: valEntry.data, encoding: .utf8)
            case 12:
                let valEntry = Asn1Parser.parse(value)
                // tag 22 = IA5 string (dates are encoded as IA5 strings)
                guard valEntry.tag.tag == 22 else { return nil }
                appReceipt.receipt_creation_date = parseRfc3339Date(String(bytes: valEntry.data, encoding: .utf8) ?? "")
            case 17:
                if let iap = parseIapAttributes(value) {
                    appReceipt.iaps.append(iap)
                }
            case 19:
                let valEntry = Asn1Parser.parse(value)
                guard valEntry.tag.tag == 12 else { break }
                appReceipt.original_application_version = String(bytes: valEntry.data, encoding: .utf8)
            case 21:
                let valEntry = Asn1Parser.parse(value)
                guard valEntry.tag.tag == 22 else { return nil }
                appReceipt.expiration_date = parseRfc3339Date(String(bytes: valEntry.data, encoding: .utf8) ?? "")
            default:
                break
            }
        }
        
        return appReceipt
    }
    
    func parseIapAttributes(_ data: Data) -> IapReceipt? {
        let iap = IapReceipt()
        
        // Root is a set (tag 17 is a set)
        let root = Asn1Parser.parse(data)
        guard root.tag.tag == 17 else { return nil }
        
        // Read set entries
        let receiptAttributes = Asn1Parser.parseSequence(root.data)
        
        // Parse each attribute
        for attr in receiptAttributes {
            if attr.tag.tag != 16 { continue }
            let attrEntries = Asn1Parser.parseSequence(attr.data)
            guard attrEntries.count == 3 && attrEntries[0].tag.tag == 2 && attrEntries[1].tag.tag == 2 && attrEntries[2].tag.tag == 4 else { return nil }
            
            let type = Asn1Parser.parseInteger(attrEntries[0].data)
            _ = Asn1Parser.parseInteger(attrEntries[1].data)  // attribute version, currently unused
            let value = attrEntries[2].data
            switch (type) {
            case 1701:
                let valEntry = Asn1Parser.parse(value)
                guard valEntry.tag.tag == 2 else { return nil }
                iap.quantity = Asn1Parser.parseInteger(valEntry.data)
            case 1702:
                let valEntry = Asn1Parser.parse(value)
                guard valEntry.tag.tag == 12 else { return nil }
                iap.product_id = String(bytes: valEntry.data, encoding: .utf8)
            case 1703:
                let valEntry = Asn1Parser.parse(value)
                guard valEntry.tag.tag == 12 else { return nil }
                iap.transaction_id = String(bytes: valEntry.data, encoding: .utf8)
            case 1704:
                let valEntry = Asn1Parser.parse(value)
                guard valEntry.tag.tag == 22 else { return nil }
                iap.purchase_date = parseRfc3339Date(String(bytes: valEntry.data, encoding: .utf8) ?? "")
            case 1706:
                let valEntry = Asn1Parser.parse(value)
                guard valEntry.tag.tag == 22 else { return nil }
                iap.original_purchase_date = parseRfc3339Date(String(bytes: valEntry.data, encoding: .utf8) ?? "")
            case 1708:
                let valEntry = Asn1Parser.parse(value)
                guard valEntry.tag.tag == 22 else { return nil }
                iap.expires_date = parseRfc3339Date(String(bytes: valEntry.data, encoding: .utf8) ?? "")
            case 1719:
                let valEntry = Asn1Parser.parse(value)
                guard valEntry.tag.tag == 2 else { return nil }
                iap.is_in_intro_offer_period = Asn1Parser.parseInteger(valEntry.data)
            case 1712:
                let valEntry = Asn1Parser.parse(value)
                guard valEntry.tag.tag == 22 else { return nil }
                iap.cancellation_date = parseRfc3339Date(String(bytes: valEntry.data, encoding: .utf8) ?? "")
            case 1711:
                let valEntry = Asn1Parser.parse(value)
                guard valEntry.tag.tag == 2 else { return nil }
                iap.web_order_line_item_id = Asn1Parser.parseInteger(valEntry.data)
            default:
                break
            }
        }
        return iap
    }
    
    func parseRfc3339Date(_ str: String) -> Date? {
        let posixLocale = Locale(identifier: "en_US_POSIX")
        
        let formatter1 = DateFormatter()
        formatter1.locale = posixLocale
        formatter1.dateFormat = "yyyy'-'MM'-'dd'T'HH':'mm':'ssXXXXX"
        formatter1.timeZone = TimeZone(secondsFromGMT: 0)
        
        let result = formatter1.date(from: str)
        if result != nil {
            return result
        }
        
        let formatter2 = DateFormatter()
        formatter2.locale = posixLocale
        formatter2.dateFormat = "yyyy'-'MM'-'dd'T'HH':'mm':'ss.SSSSSSXXXXX"
        formatter2.timeZone = TimeZone(secondsFromGMT: 0)
        
        return formatter2.date(from: str)
    }    
}
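As a quick sanity check of the kind of decoding the parser does, here's a standalone sketch (function name my own) of big-endian two's-complement decoding, which is how BER INTEGER content bytes are interpreted:

```swift
import Foundation

// Decode the content bytes of a BER INTEGER: big-endian two's complement
func decodeBerInteger(_ bytes: [UInt8]) -> Int {
    var val = 0
    for b in bytes {
        val <<= 8
        val |= Int(b)
    }
    // If the sign bit of the first content byte is set, the value is negative
    if let first = bytes.first, first & 0x80 != 0 {
        val -= 1 << (bytes.count * 8)
    }
    return val
}

print(decodeBerInteger([0x05]))        // 5
print(decodeBerInteger([0xFF]))        // -1
print(decodeBerInteger([0x01, 0x2C]))  // 300
```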

To use the code, you would write something like this:

let data = Data(base64Encoded: "... BASE 64 DATA ...")
let receiptChecker = ReceiptInsecureChecker()
let payload = receiptChecker.parsePkcs7ReceiptForPayload(data!)
let appReceipt = receiptChecker.parseReceiptAttributes(payload!)
print(appReceipt!.iaps)
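In a real app, the receipt bytes come from the app bundle rather than a base64 literal. Here's a minimal sketch (helper name my own); if the file is missing, you'd kick off an SKReceiptRefreshRequest from StoreKit to fetch a fresh receipt:

```swift
import Foundation

// Sketch: read the raw PKCS #7 receipt from the app bundle, if one exists.
// appStoreReceiptURL only exists on Apple platforms, hence the #if guard.
func loadReceiptData() -> Data? {
    #if os(iOS) || os(macOS) || os(tvOS) || os(watchOS)
    guard let url = Bundle.main.appStoreReceiptURL,
          FileManager.default.fileExists(atPath: url.path) else {
        return nil  // no receipt on disk yet; ask StoreKit to refresh it
    }
    return try? Data(contentsOf: url)
    #else
    return nil
    #endif
}
```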

Note: I'm not a Swift coder. I only started learning Swift about a month ago, so I apologize if the code is not very Swift-y.

Wednesday, July 31, 2019

CorelDraw Graphics Suite 2019 Review

Since I'm originally from Ottawa, I've always used CorelDRAW for vector graphics. This actually works out well. Since I'm not an artist or designer, I rarely need to do any vector graphics work, so CorelDRAW has worked for me because it comes with a lot of functionality, I could make a one-time purchase of a perpetual license to the software, and I was occasionally able to get good deals when buying it.

I previously used CorelDraw X5, and it did what I needed it to do, but the menus didn't work quite right on Windows 10, so I was looking to upgrade if the upgrade price ever dropped to around $100-$150. The price never dropped that low, though, so I just kept using my old version. Especially since I now make my own vector graphics package, I rarely need CorelDraw except for the occasional obscure feature. Unfortunately, Corel declared that 2019 would be the last year they would offer upgrade pricing on CorelDraw, so I picked up a copy of CorelDraw 2019 since it would be my last chance to get an upgrade.

I have to say that I feel a little disappointed with CorelDraw 2019. CorelDraw has always been a buggy piece of software, but usually it's the new features that are buggy; if you stick with the core vector graphics stuff, it works fine. Usually, the new features would be so buggy as to be unusable, and Corel wouldn't bother fixing them until a later version, so you would have to pay for an upgrade to get working versions of those features. Unfortunately, it seems like they rewrote the core user interface code in this version, so now the core vector graphics functionality is buggy. I suspect it might be related to the fact that they've rewritten things to work on the Mac (previously, CorelDraw was Windows only). This is annoying because CorelDraw 2019 is too buggy for basic vector graphics work, but it likely won't be fixed unless I buy an upgrade to a later version, and Corel isn't going to be selling upgrades any more. I'm sorely tempted to keep using CorelDraw X5.

The bugs are annoying little things, like the screen blanking out if you scroll the window using the scrollbar, requiring you to press ctrl-W to manually refresh the screen. Groups also no longer snap properly to grids. If you try to move a group, CorelDraw will choose one of the objects in the group (I think it's the top one?) and snap that to the grid instead of aligning the group as a whole, which makes grids sort of useless to me. CorelDraw also doesn't let you snap to grids and snap to objects at the same time: it gets confused, tries snapping to objects, and completely ignores any possible grid snapping. If basic functionality like scrolling and snap-to-grid doesn't work, how is anyone supposed to get productive vector graphics work done with CorelDraw?

On top of that, CorelDraw feels slow and sluggish. To be fair, CorelDraw has always felt slow and sluggish, but if you keep using an old version, then after a few years your computer gets fast enough that it feels snappy and usable. Still, I was hoping that Corel would have left well enough alone and stopped meddling with the old code so that it would stay fast. That's not the case. It feels sluggish. After all these years, Corel still has not learned that responsiveness is one of those magic unspoken features that make a graphics package feel good to use. Even though Corel Photo-Paint has many more features than my old copy of Photoshop Elements, I still use Photoshop Elements as my primary paint program because it's just so much faster and more responsive. CorelDraw 2019 also just stops and hangs for a couple of seconds sometimes. I think the saving code is now very slow for some reason, and since CorelDraw autosaves fairly often (due to its buggy nature), it will occasionally become unresponsive while its incredibly slow autosave happens.

In the end, I feel like I've wasted my money. I bought CorelDraw 2019 because it was the last upgrade version they would offer, but it's really buggy and not very usable. These bugs likely won't be fixed until a later version of CorelDraw, which there won't be any upgrade pricing for. Every time I use CorelDraw 2019, I keep wanting to go back to my old CorelDraw X5, which I sometimes do. I think the verdict is: even if you are in a rush to upgrade because this is the last upgrade version available, DON'T get CorelDraw 2019; it's too slow and buggy. If you can find an upgrade to an older version of CorelDraw, that might actually be a better buy. Otherwise, just stick with your old version.

Monday, June 18, 2018

Ranking of Racism against Asian Americans at Ivy League Schools

It has long been assumed that the Ivy League schools are racist against Asians, but it's been hard to understand the extent of the racism. There are some universities that try to run purely meritocratic admissions systems involving fewer subjective evaluations. For example, Caltech has an Asian enrollment of 43%, but it's not clear whether that's comparable to other schools because of the heavy engineering focus of the school. Berkeley is a more well-rounded university and it has an Asian enrollment of 41%, but it's a public university in a state with a high Asian population, so it's not clear if it's comparable to universities in other parts of the country.

Fortunately, Harvard ran the numbers back in 2013, and they found that if they ranked students solely by academic qualifications, then Asians would make up around 40% of admissions. Even if Harvard continued to set aside spots for athletes and undeserving rich people, Asians would make up 31%. If extra-curriculars and other subjective measures were included as well, then Asians should still make up around 26% of the student population (even though in 2013, only 19% of the admitted class was Asian). I believe that the Asian population has only grown since 2013.

These numbers agree with the Berkeley and Caltech numbers, so I feel it's safe to use these numbers to do back-of-the-envelope calculations for how racist against Asians each university is. The Harvard numbers should be comparable with other prestigious, well-rounded, private universities that attract students from across the country. So it should be safe to compare the numbers to other Ivy League universities.

So I visited the websites of the Ivy League universities, and grabbed their reported diversity statistics on Asian admissions. The numbers are hard to compare because different universities categorized their students differently. If universities had a separate category for unknown and/or foreign students, then I left them out of the total. If there was a category for multi-racial, I did not include that number in the count of percentage Asians. As a result, the numbers are very noisy, but I think they still give a basis for comparing universities. I think that universities with Asian enrollment in the high 20s or low 30s are demonstrating low amounts of racism against Asians.

So here are the results:

1. Brown (18.16%) - most racist
2. Yale (21%)
3. Dartmouth (21.74%)
4. Harvard (22.2%) - probably worse than Dartmouth, but the numbers are hard to compare
5. Cornell (23.26%)
6. UPenn (23.56%)
7. Princeton (25.29%)
8. Columbia (29%) - least racist

And here's Stanford's numbers even though they aren't an Ivy League university:

Stanford (26.44% Asian)

Initially, the numbers seem to suggest that Brown is the most racist against Asians of these universities. They also have the lowest African American enrollment of the Ivy League universities, so they just fail on diversity in general. They do have a large number of students classified as multi-racial, which makes things unclear, but things still look bad after removing them from the totals. It's possible that Yale or Harvard are, in fact, the worst because they don't have a "multi-racial" category and their percentage of Asian enrollment is pretty low.

There's a bunch of universities in the middle that seem to be sort of racist. Princeton seems to be the best of the middle.

The least racist, by far, seems to be Columbia, which achieves a high-20s percentage of Asian enrollment, which is in the safe zone. They also have strong African American enrollment. It demonstrates that it is possible to have diverse minority enrollment without unduly punishing Asians.

So what's going on? These universities (other than Columbia) are using the implicit bias effect to willfully keep down Asian enrollment. These universities intentionally add subjective measures into student evaluations that are known to be subject to bias. They then hire admissions officers of dubious qualifications who don't understand the Asian experience or don't like Asians in general, who implicitly prefer applicants who are more like themselves, and who are told to find applicants who match an Ivy League "culture" or "character profile" that runs contrary to Asian stereotypes. As a result, they end up giving lower subjective scores to Asians. Is the applicant who plays badminton more or less "brave" than the applicant who plays football? Is the atheist applicant more or less kind than the church-going applicant? There is no way of knowing these things, but people will inevitably form an opinion based on their implicit biases. For example, Harvard gave lower "personality" ratings to Asian applicants in general. I can imagine that, yes, some universities might prefer people with certain personalities over others. But that mix of personalities should be evenly distributed among all people. If one race consistently scores poorly on the personality rating versus all other races, then there's some implicit racism going on that needs to be fixed. Strangely, no personality deficiencies were found during alumni interviews. The bias only appeared during the ranking of personal qualities by the Admissions Office.

Here are some more in-depth articles if you want a deeper dive into the issues involved: 1, 2, 3.

Tuesday, June 12, 2018

Nan Native Module Asynchronous Callbacks in Electron with GWT

This problem has been causing me frustration for weeks, and I think I've finally figured out what was wrong.

I have a GWT application that I'm running as a desktop application using Electron. To access some Windows services, I wrote a native module in C++ that my JavaScript code can call into. Some of the newest Windows APIs are asynchronous and long-running, so I made use of Nan's AsyncWorker framework to run the C++ code on another thread and then call a callback function in JavaScript with the result afterwards.

But the code would always crash. If I executed the commands from the Electron/Chrome debugger console, it would run fine. But if I ran the same instructions in my compiled GWT application, the application would crash when the callback function is invoked from C++.

I spent weeks looking over the code and trying different variations, tearing my hair out, and I could never figure it out. Native modules (much like everything else in node.js and electron) are underdocumented, but my code looked the same as the examples, and I couldn't find any reports of other people having problems. Maybe I was compiling things incorrectly? Was my build set-up wrong? Maybe mixing in winrt and managed code was causing problems? But I think I've finally figured things out.

The problem is that the GWT code runs in an iframe, so the callback functions are defined in the iframe, and somehow, this leads to a crash when the C++ code tries to call these callback functions. To solve this problem, I've created a separate JavaScript shim that creates the callback functions in the context of the main web page. My GWT code can call into the main web page to create the callback functions and to pass them to the native module. Then the native module can safely call back into JavaScript from the AsyncWorker without any crashes.

Side Note: When running Electron in Windows, it seems that the Windows message queue is managed from the main process. So if you have a Windows API that needs to be called from the UI thread, it should probably be called from the main process not the renderer process.

Tuesday, May 01, 2018

ES6 Modules: Limp and Overcooked

I've been eagerly awaiting a module system for JavaScript for many years. Although plans for a standardized module system have been floating around as far back as ECMAScript 4, it has only become standardized and widely available during the last year or so. Usually, modules are a pretty intuitive language concept: you briefly look at a couple of examples, then you dive in and start using them, and everything just works. For some reason, though, when I tried using ES6 modules in a project, my mind absolutely refused to accept them. I literally spent hours staring at these lines of code, and my brain couldn't do it:
import foo from './library.js';
import {foo} from './library.js';
Both lines of code are valid ES6 Modules code. Only one line is correct though, and it depends on how you've set-up your modules. The difference is so confusing and the error messages are so cryptic that I just couldn't get my feeble brain to understand it.

Apparently, the JavaScript module system spent so long in the standardization oven that it has become overcooked and ruined. It's limp and dry and completely unappetizing. ES6 Modules are actually two completely different module systems that have been thrown into JavaScript with no attempt to unify them at all. What's worse is that the two module systems use very similar syntax, so it's very easy to get things mixed up. Some misplaced squigglies result in you using the wrong module system, one that's incompatible with the library you want because the library was built with the other. What's doubly worse is that one of the module systems is already deprecated, and the preferred module system has the more complicated syntax. If one system is preferred, then why does the other one exist? If the two module systems are different, why couldn't they have two completely different syntaxes?

What's weird is that they could have unified it. There would have been a lot of weird corner cases, but they could have made a consistent syntax. When I see the two lines above, I think of destructuring assignment.
pair = getFullName();
[firstName, lastName] = getFullName();

point = getPoint();
({x, y} = getPoint());  // parens needed so the braces aren't parsed as a block
It's a bit unusual, but it's consistent. My mind could accept that.
import foo from './library.js';
could be for importing everything in the library into an object named foo
import {foo} from './library.js';
could be for importing foo from a library.

But, no, that's not how it works. Instead, the first line is for doing imports from modules built using the default module method, and the second line is for doing imports from modules built using the namespace module method. Oh, you can also build your libraries so that they are compatible with both types of modules, but since the two module systems are completely distinct, you can design your libraries to export completely different things depending on whether they are imported using the default module method or the namespace module method.

After several hours of my mind rejecting the ES6 approach to modules, I think I've finally gotten it accepted. I explained to my mind that the ES6 module system is complete garbage, but that's all that there is to eat, and it better not barf it all out like last time. It's not happy about it though.

Saturday, March 10, 2018

Copying Hard Disks with Bootable Windows GPT Partitions Using Linux

This is just a placeholder blog post. I keep intending to do a proper blog post on this topic, but I never get around to it. Unfortunately, I always forget the steps I need to do when I'm in the middle of copying my hard disks and can't consult my notes, so I'm going to just put some placeholder notes here and flesh them out properly later.

Note: Copying hard disks is tricky. I disclaim all responsibility if you use these steps and you lose data or your BIOS becomes corrupted or whatever. This blog post mainly serves as notes to myself.

About GPT Partitions
I still don't fully understand how a modern UEFI system boots from GPT partitions. I think with GPT and UEFI, your hard disk contains multiple partitions. One of them is a special FAT partition holding boot loader programs that know how to load an OS from one of the other partitions; it's called the EFI System Partition. The UEFI BIOS of a computer will start Windows in one of two ways:
  1. Usually, the BIOS stores the specific UUID label of the boot EFI System Partition and the name of the boot loader program on that partition to load. The BIOS can then quickly load the boot loader and then continue on to load the OS.
  2. When you first install Windows, the BIOS doesn't have that information yet, so the BIOS is able to find the EFI System Partition on the hard disk itself, find the default Windows boot loader on that partition, and then run that to start Windows.
Why It's Tricky Copying Windows
Copying Windows GPT partitions is hard because:
  • Windows makes it hard to copy Windows partitions
  • the bootloader program on the FAT partition has to be changed to load things from the new partition
  • the BIOS has to be changed to know about the new bootloader
Steps for Doing the Copy
By default, Windows is configured with settings that let it leave the file system in an inconsistent state on shutdown (in order to have faster shutdowns and bootups), which makes that a bad time to copy it. I tried various methods for disabling that, but the only reliable approach seems to be to turn off hibernation entirely. Start a command prompt in Administrator mode, then run

     powercfg -h off

If you're copying to a smaller hard disk, you might want to use Windows Disk Management (right-click "This PC", choose "Manage", then choose "Disk Management" under the "Storage" category on the left) to shrink your partitions in advance, but that often doesn't work, and Linux can shrink your partitions anyway.

Also make sure that you have a bootable version of a Windows rescue CD or the Windows installation media.

Now you can start up Linux to begin copying your hard disk. I always use the GParted LiveCD to do this. GParted can be slow to start up because it scans through all your hard disks very slowly to find all the partitions. I think it might also do some really slow thing with Windows partitions as well. That scanning step also gets slower the bigger your hard disk is. But I've found it to be pretty reliable. Use it to copy your partitions to the new hard disk.

msftres Partition
You may find that your hard disk has a msftres (Microsoft reserved) partition. GParted is unable to copy this partition, but it is not necessary to copy it. It is just space that Microsoft reserves for future use by Windows (Bitlocker disk encryption, for example, can store data there). If you do want to keep this msftres partition, you can manually create one. First, copy all the partitions that come before the msftres partition. Then boot into the Windows installation CD (you did remember to make one, right?). Go into the Advanced Repair settings and get to the command prompt. Then use the "diskpart" program to create an msftres partition. I forget the exact steps. You can type things like "help", "help select disk", "help list partition", or "help create partition msr" to get the exact commands you need. But you need to select the new hard disk, then use something like "create partition msr size=128" to create a 128MB msftres partition. Then you can reboot into GParted to copy the rest of the partitions.
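From memory, the whole diskpart session looks roughly like this. This is a sketch, not a recipe: the disk number is an assumption, so double-check the output of "list disk" before selecting anything, because selecting the wrong disk is how you lose data.

```
rem Start "diskpart" from the recovery command prompt, then run these inside it.
list disk
rem Assumption: disk 1 is the new drive -- identify it by its size first.
select disk 1
rem Confirm the partitions you've copied so far are present.
list partition
rem Create a 128MB Microsoft reserved partition in the gap you left for it.
create partition msr size=128
exit
```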

GParted Fix-ups
GParted will make the copied partitions have the exact same IDs as the old partitions. This can be a problem if you keep both the new hard disk and the old hard disk in the same computer (e.g. when moving from a hard disk to an SSD, you might want to keep the old hard disk around in the same computer as a backup). It's not clear which boot partition the BIOS will use when starting up. And when Windows is loaded, the wrong one can potentially start up. And even if the right one starts up, the wrong partition might end up being mapped to the C drive. To be safe, if you intend to keep both hard disks in your system, you should go in and change the UUIDs of all the partitions on either the new drive or the old drive.

I'm not 100% sure about Windows recovery partitions. To have recovery work properly on the new hard disk, it might be better to have it keep the same UUID, so generating new UUIDs for the old hard disk may be better. But I could never get that recovery partition to work properly anyway, and I would rather have at least one proper working copy of my hard disk in case the copy goes bad, so maybe it's better to generate new UUIDs for the new hard disk.
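If you do decide to generate new UUIDs, one way to do it from Linux is the sgdisk tool (from the gdisk package), which can randomize all of a disk's GUIDs in one shot. A sketch, where /dev/sdb is an assumption for the drive you want to change:

```shell
# Randomize the disk GUID and all partition unique GUIDs on one drive.
# /dev/sdb is an assumption -- check "lsblk" output for the right device.
sudo sgdisk --randomize-guids /dev/sdb
```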

GParted also often doesn't copy the partition labels and flags correctly. You can go in and set those manually so that they're the same on both disks. I've forgotten to do this before, and everything still seemed to work, so this might not be necessary.

Update UEFI BIOS with Location of Bootloader
Now that Windows is copied, you need to update the BIOS with which bootloader to use on start-up. If you didn't change the UUIDs of the newly copied partitions, then this might not be necessary since the existing BIOS entry for the location of the old bootloader should still work. You might need to swap hard drive cables to ensure that the new hard drive takes over the old drive number from the old hard drive. Or maybe not? 

If you did change the UUID of the boot partition, then this is definitely necessary. Open a Linux terminal. Become root by using "sudo bash". Then use the "efibootmgr" program to create the necessary entries. 

I can never figure out how to create new BIOS EFI entries using efibootmgr, so I sometimes try some other approach to create these new entries. Sometimes, if you start up your system with only your new hard disk, the BIOS won't find a valid boot entry, so it will fall back to searching the disk for the Windows bootloader itself, and it will then add an entry for it in the BIOS. Or you can try starting a Windows Recovery CD or installation DVD, going to the command-line repair tools, and trying to use "bcdedit", "bcdboot", or "bootrec /rebuildbcd" to do this. I'm not sure exactly what these programs do, but I think one of them will create a new EFI BIOS entry for the bootloader partition.

Once you have an EFI entry in the BIOS for the bootloader, you can go back to using "efibootmgr" to rearrange the order of your boot entries so that it comes first, which is a lot easier than creating a new entry from scratch.
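For reference, the efibootmgr invocations look roughly like this. The device, partition number, and entry numbers here are assumptions; run the listing command first to see what your system actually has.

```shell
# Show the current boot entries and their numbers (run as root).
efibootmgr -v

# Create a new entry for the Windows bootloader on the new disk.
# /dev/sda and partition 1 are assumptions -- use the device and partition
# number of the new EFI System Partition.
efibootmgr -c -d /dev/sda -p 1 -L "Windows Boot Manager" \
    -l '\EFI\Microsoft\Boot\bootmgfw.efi'

# Put the new entry (here 0003) first in the boot order.
efibootmgr -o 0003,0001,0002
```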

Update Bootloader with New Location of Windows Partition
Now the BIOS can find the bootloader to start loading Windows, but the bootloader may not point to the correct partition to actually start the OS. Again, you can try starting a Windows Recovery CD or installation DVD, going to the command-line repair tools, and trying to use "bcdedit", "bcdboot", or "bootrec /rebuildbcd" to fix this. Again, I'm not sure exactly what these programs do. The BCD is the Boot Configuration Data store, but the Microsoft documentation is very vague about what that actually is. I think it refers mostly to the configuration files for the bootloader on the EFI FAT system partition, but I don't know. Sometimes everything has gone bad, and you need to use "bcdedit" to start a completely new BCD store. In any case, after randomly running some combination of those programs, Windows will somehow fix itself and become bootable.
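For what it's worth, the combination that roughly corresponds to "rebuild the boot files and the BCD store" looks like this, run from the recovery command prompt. The S: drive letter is an assumption; you have to assign a letter to the EFI System Partition with diskpart first.

```
rem Assign a letter (here S:) to the EFI System Partition in diskpart first
rem ("select partition", then "assign letter=S"). This copies the Windows
rem boot files from the install on C: to the EFI partition and rebuilds
rem the BCD store there.
bcdboot C:\Windows /s S: /f UEFI

rem Alternatively, scan the disks for Windows installations and rebuild
rem the BCD store from whatever it finds.
bootrec /rebuildbcd
```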

Checking Windows
When you do manage to successfully boot into Windows again, go into Disk Management to make sure that the correct drive is listed as your boot drive and that it is the C drive. You might also need to manually remove drive letters from some of your recovery drives and other drives. Don't forget to reenable hibernation by going to the command prompt as an administrator and using

     powercfg -h on

Fixing Up Your Linux Bootloader
I normally use Windows, but I keep a copy of Linux on my drives for occasional use. Normally, I just let Linux install a boot loader into the EFI system partition and add an entry to the BIOS. I change the boot order to normally boot to Windows, but I use the "boot from alternate drive" key on startup to show all the EFI boot entries, and then I manually choose the Linux one. 

After copying a Linux partition to a new hard disk, you have to reinstall the grub bootloader on the EFI system partition to point to the new Linux partition. Reinstalling grub is mostly impossible, so I find it easier to simply reinstall Linux over the old version (I keep my Linux data on a separate /home partition, so that it's safe to do that without losing data).
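For reference, the textbook approach to reinstalling grub is to chroot into the copied Linux system from a live CD and rerun grub-install. This is a sketch only; the device names, mount points, and bootloader id are assumptions for a typical Ubuntu-style layout.

```shell
# From a live CD: mount the copied Linux root and EFI partitions, bind-mount
# the system directories, and chroot in. Device names are assumptions.
sudo mount /dev/sdb2 /mnt
sudo mount /dev/sdb1 /mnt/boot/efi
for d in dev proc sys; do sudo mount --bind /$d /mnt/$d; done
sudo chroot /mnt

# Inside the chroot: reinstall grub into the EFI partition and regenerate its
# config so the menu entries point at the new partition UUIDs.
grub-install --target=x86_64-efi --efi-directory=/boot/efi --bootloader-id=ubuntu
update-grub
```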

Sunday, March 04, 2018

Creating an SDF Texture for a Font at Runtime

I was recently trying to implement text for the graphics engine behind Omber. Unfortunately, although I had a complete vector graphics engine, I hadn't gotten around to implementing support for shapes with holes in them, so I couldn't just take the vector representation of each font glyph and render it directly. Instead, I ended up using the standard approach used in many 3D game engines, which is to use Signed Distance Field (SDF) fonts.

Most people pre-generate their SDF textures, but that didn't really seem feasible to me. I wanted my code to let people use their own fonts in their drawings, so I couldn't precalculate SDF textures for those fonts. Also, international fonts might contain thousands of characters, and it would be too memory intensive to calculate textures for all of them in advance. So I went about trying to figure out how to generate my own SDF textures, and I learned some good lessons about how to do it.

At first, I didn't really understand how SDF worked, so I tried using the dumb approach of drawing a character on a bitmap, and then manually trying to calculate the SDF values. This actually takes a bit of time to code up, it's really slow to run, and the results are sort of poor. I didn't really understand that the shader really only cares about SDF values for the one or two pixels near the edge of a glyph, and the SDF values need to have subpixel accuracy. True, you might see some demos of people adjusting SDF cut-off values to make variations of a font with different font weights. But for normal situations, you only care about SDF values within a pixel or two of the edges of a glyph because when those values are linearly interpolated, you get a good approximation of the angle of the edge in that area. And you really do need subpixel accuracy or your linear interpolation will simply give you back your chunky pixels. To get that sub-pixel accuracy using a raster approach, you would have to draw your glyphs at a really big size and then scale the bitmaps down, but that's even slower and you lose a lot of accuracy.

Instead, it turned out to be both faster and more accurate to generate the SDF directly from the vector representation. I already had a vector graphics engine, which made it easier, but you actually don't need much vector logic. Basically, you only need a few things. You need a way to extract the Bezier curves of each glyph. I was working in JavaScript, so typr.js and opentype.js were available libraries for that. Then you need a Bezier subdivider to convert all the curves to lines. Then you take that bag of lines and throw them into a Point-in-Polygon routine (that calculates whether you cross an even or odd number of lines to see if you're inside a polygon or not) to get the sign and a Distance-to-Line routine to get the distance, and you're done. Since you're working with floating-point values, you get very high precision with sub-pixel accuracy. And since you don't have to scan through lots of pixels to calculate distances, it turns out it's really fast. That actually makes sense because even old computers could render vector fonts at a reasonable speed, so it should be possible to calculate a low-resolution SDF quite quickly too.
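The core of that computation fits in surprisingly little code. Here's a minimal JavaScript sketch with hypothetical helper names; it assumes the glyph outline has already been subdivided into line segments:

```javascript
// Signed distance from a point to a glyph outline given as flattened line
// segments. The sign comes from an even-odd crossing test; the magnitude is
// the minimum distance to any segment. Segment format: [x1, y1, x2, y2].

function distToSegment(px, py, x1, y1, x2, y2) {
  const dx = x2 - x1, dy = y2 - y1;
  const lenSq = dx * dx + dy * dy;
  // Project the point onto the segment, clamping to the endpoints.
  let t = lenSq === 0 ? 0 : ((px - x1) * dx + (py - y1) * dy) / lenSq;
  t = Math.max(0, Math.min(1, t));
  const cx = x1 + t * dx, cy = y1 + t * dy;
  return Math.hypot(px - cx, py - cy);
}

function signedDistance(px, py, segments) {
  let minDist = Infinity;
  let crossings = 0;
  for (const [x1, y1, x2, y2] of segments) {
    minDist = Math.min(minDist, distToSegment(px, py, x1, y1, x2, y2));
    // Even-odd test: count crossings of a horizontal ray going right.
    if ((y1 > py) !== (y2 > py)) {
      const xCross = x1 + ((py - y1) / (y2 - y1)) * (x2 - x1);
      if (xCross > px) crossings++;
    }
  }
  // Inside (odd crossings) is positive; outside is negative.
  return crossings % 2 === 1 ? minDist : -minDist;
}

// Example: the outline of a unit square.
const square = [[0, 0, 1, 0], [1, 0, 1, 1], [1, 1, 0, 1], [0, 1, 0, 0]];
console.log(signedDistance(0.5, 0.5, square)); // 0.5 (inside, 0.5 from an edge)
console.log(signedDistance(2, 0.5, square));   // -1 (outside, 1 from the right edge)
```

To build the actual texture, you'd evaluate signedDistance at each texel center of a small grid around the glyph and remap the values into the 0–255 range your shader expects.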


So, yeah. Calculate your SDF textures at runtime straight from the vector representation because it's not much code and it's faster.