Monday, December 09, 2024

How Long Do You Have to Wait for a Bus?

If you go down to the bus stop to catch the next bus, how long do you have to wait? If buses come every 5 minutes, then you have to wait on average 2.5 minutes for the bus to arrive. But what happens if the buses come unevenly? What happens if the buses get bunched up, so that three buses come with only 1 minute between each bus and then it takes 13 minutes for the next bus to arrive? If you just take a simple average of the time between buses:

$$\frac{1 + 1 + 13}{3}$$

You will calculate that there's, on average, a 5 minute gap between buses, which again suggests that you have to wait, on average, 2.5 minutes for the bus to arrive. But that doesn't seem right. And it's not.

Let's look more deeply into this problem and see if we can calculate the wait time for the bus more correctly.

Total Wait Time

To build some intuition about the problem, let's consider a bus stop where people arrive every 2 minutes. Buses come to the bus stop at uneven times though.

Depending on when you arrive, you may have to wait longer for the bus. If there's a larger gap between buses, there are more people who have to wait too.

If we want to calculate the total amount of time that people have to wait for the bus, we just have to add up the times from when each person arrives to the time when the bus comes. To make this more clear, we'll draw a little stick figure for a person having to wait one minute for the bus.

In the graph, we can see that over time, the number of people waiting stacks up until a bus comes. It forms a bit of a staircase or triangle. If we add up the number of stick figures in the graph, we can see that there are 14 stick figures, so people waited for a total of 14 minutes for the bus. Over the time period, there were 6 people who came to the bus stop, so the average amount of time each person had to wait was

$$\frac{14 \text{ minutes waited}}{6 \text{ people}} = 2.3 \text{ minutes / person}$$

Now that we have some intuition about calculating bus wait times, let's try generalizing this approach.

Generalization

Every bus stop has people arriving at different times, so to generalize over them, let's assume that people arrive at the bus stop at a constant rate a. As people arrive, they wait for the bus until it arrives. When you graph this, you get a lot of right angle triangles. The triangles have a slope a that matches the arrival rate of passengers.

The total time that people spend waiting for buses is the area under the triangles. We can calculate the dimensions of the triangles using a.

Knowing the dimensions of the triangles, we can then calculate the area of the triangles. 

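$$\text{total time waited} = \sum_{\text{all triangles}} \frac{1}{2} \cdot \text{base} \cdot \text{height} = \frac{1}{2} \cdot 5 \cdot 5a + \frac{1}{2} \cdot 4 \cdot 4a + \frac{1}{2} \cdot 2 \cdot 2a = \frac{45a}{2}$$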

Or if we want to generalize this further for a bunch of buses where we know the time duration between each bus:

$$\sum_{\text{all buses}} \frac{1}{2} \cdot \text{duration}^2 \cdot a$$

Knowing the total wait time that all people wait for their bus, we can just divide the total wait time by the number of people to get the average wait time per person.

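$$\text{number of people} = \text{time range} \cdot \text{arrival rate}$$

$$\text{average wait time} = \frac{\text{total wait time}}{\text{number of people}} = \frac{\frac{45a}{2}}{(5 + 4 + 2)a} = \frac{45}{22}$$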

Or if we generalize things to a bunch of buses where we know the time duration between each bus:

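$$\text{number of people} = \text{time range} \cdot \text{arrival rate} = \sum_{\text{all buses}} \text{duration} \cdot a$$

$$\text{average wait time} = \frac{\text{total wait time}}{\text{number of people}} = \frac{\sum_{\text{all buses}} \frac{1}{2} \text{duration}^2 \cdot a}{\sum_{\text{all buses}} \text{duration} \cdot a} = \frac{\sum_{\text{all buses}} \text{duration}^2}{2 \cdot \sum_{\text{all buses}} \text{duration}}$$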

Notice that the average wait time does not involve the arrival rate of new people at the bus stop a.
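As a quick sanity check on the formula (a sketch, with n and d introduced here purely for illustration): if there are n buses and every gap has the same duration d, the expression should collapse to the familiar "half the gap" answer.

```latex
\text{average wait time}
  = \frac{\sum_{i=1}^{n} d^2}{2 \cdot \sum_{i=1}^{n} d}
  = \frac{n d^2}{2 n d}
  = \frac{d}{2}
```

With d = 5 minutes, this gives the 2.5 minutes from the introduction, so the general formula agrees with the evenly spaced case.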

So now that we have a way of calculating average wait times for a bus, let's look again at the problem posed in the introduction. How long do you have to wait for a bus if the buses are bunched up and the spacing between three buses is 13 minutes, 1 minute, and 1 minute?

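$$\text{average wait time} = \frac{\sum_{\text{all buses}} \text{duration}^2}{2 \cdot \sum_{\text{all buses}} \text{duration}} = \frac{13^2 + 1^2 + 1^2}{2 \cdot (13 + 1 + 1)} = 5.7 \text{ minutes}$$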

That's much longer than the 2.5 minutes that we calculated using an incorrect simple approach. If buses are severely bunched up, you will have to wait more than 2x longer for a bus than if the buses were evenly spaced.
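The formula is easy to turn into a few lines of code. Here is a minimal Swift sketch; the function name `averageWaitTime` and the choice to return 0 for an empty schedule are my own assumptions, not anything from the derivation above.

```swift
/// Average wait for a rider who arrives at a uniformly random time,
/// given the gaps (in minutes) between consecutive buses.
/// Computes sum(gap^2) / (2 * sum(gap)).
func averageWaitTime(gaps: [Double]) -> Double {
    let totalTime = gaps.reduce(0, +)
    // Assumption: with no buses (or zero total time) there is nothing to average, so return 0.
    guard totalTime > 0 else { return 0 }
    let sumOfSquares = gaps.reduce(0) { $0 + $1 * $1 }
    return sumOfSquares / (2 * totalTime)
}

let even = averageWaitTime(gaps: [5, 5, 5])       // evenly spaced buses
let bunched = averageWaitTime(gaps: [13, 1, 1])   // bunched buses from the introduction
print(even, bunched)
```

Both schedules have the same 5 minute average gap, but the squared term means the single 13 minute gap dominates the result for the bunched schedule.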

Alternate Formulation

To check if our formulation is correct, we can try calculating the average wait time for buses a different way and see if we end up with the same formula. 

Suppose that during a certain period of time, buses can come at different times. You can arrive at any point during that time period.

If we want to create a graph of how long you have to wait at the bus stop, it's pretty easy. The wait time is equal to the time until the next bus arrives. So if you happen to arrive 4 minutes before the next bus, then you'll have to wait 4 minutes. If you arrive 2 minutes before the next bus, you'll have to wait 2 minutes.

You'll notice that the graph of wait times creates a similar set of triangles to the graphs we calculated in our previous formulation.

Now that we have a graph of how long we would need to wait depending on when we arrive, we can calculate the average wait time. We can do that using some calculus.

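$$\begin{aligned}
\text{average wait time} &= \frac{1}{b - a} \int_a^b f(t)\,dt \quad \text{where range } [a, b] \text{ is the time period you are calculating an average over} \\
&= \frac{1}{\sum_{\text{all buses}} \text{duration}} \sum_{\text{all buses}} \frac{1}{2} \text{duration}^2
\end{aligned}$$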

Integrating over these triangles is the same as taking the area of the triangles. So according to this formulation, the average wait time for a bus results in the same expression as what we determined from the other formulation.
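The integral can also be checked numerically without any calculus. The following Swift sketch samples arrival times at small, equal steps and averages the wait until the next bus. The function name `sampledAverageWait` and the 0.01 minute step size are illustrative assumptions.

```swift
/// Approximates (1 / (b - a)) * integral of f(t) dt, where f(t) is the wait
/// until the next bus, by sampling arrival times at the midpoints of equal steps.
func sampledAverageWait(gaps: [Double], step: Double = 0.01) -> Double {
    // Turn the gaps into absolute bus times, e.g. [13, 1, 1] -> [13, 14, 15].
    var busTimes: [Double] = []
    var clock = 0.0
    for gap in gaps {
        clock += gap
        busTimes.append(clock)
    }
    guard let lastBus = busTimes.last, lastBus > 0, step > 0 else { return 0 }

    let sampleCount = Int((lastBus / step).rounded())
    guard sampleCount > 0 else { return 0 }

    var totalWait = 0.0
    for i in 0..<sampleCount {
        let arrival = (Double(i) + 0.5) * step
        // f(arrival): time until the first bus at or after this arrival.
        let nextBus = busTimes.first(where: { $0 >= arrival }) ?? lastBus
        totalWait += nextBus - arrival
    }
    return totalWait / Double(sampleCount)
}

let sampled = sampledAverageWait(gaps: [13, 1, 1])
let formula = (13.0 * 13.0 + 1.0 + 1.0) / (2.0 * (13.0 + 1.0 + 1.0))
print(sampled, formula)
```

For the bunched schedule from the introduction, the sampled average and the closed-form expression should agree to within floating-point error, since the wait-time graph is made of straight line segments.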

Conclusion

As we can see, calculating how long you have to wait for a bus does take some care, but the resulting math isn't too burdensome.

Tuesday, March 26, 2024

I was DDOSed and I didn't notice

For the past 15-20 years, I've been running a little website called Programming Basics that teaches programming to kids. It's an ancient, very amateur website that clearly shows its 20-year age too. It has amateur programmer art. It has JavaScript popovers and other archaic HTML stuff. It has PDFs of handouts that teachers can print out. Its main distinguishing characteristic is that it still works after nearly 20 years, and I haven't taken it down for some reason. It's very obscure, and it receives a very small amount of traffic.

And I think someone tried to DDOS it, but I'm not sure.

I host that website on Amazon Web Services (AWS), and when I looked at the bill a few weeks ago, I noticed that the charges seemed elevated. The per month charges have been pretty much the same for a decade, so I thought I must have accidentally left a cloud computer running or something. But after digging through my billing reports, the charges seemed to be caused by unusual web traffic to my Programming Basics website.

Digging into the general usage statistics provided by AWS, it seems I received 126,755,687 requests for the page "/" on February 28. The day before, I only received 209 requests for that page. The "/" page automatically redirects to the "/en/" page. So typically, I should receive a similar number of requests to the "/en/" page as to the "/" page or even a little more since bookmarks and search engines will usually go directly to the "/en/" page. Instead, I received 785,650 requests for the "/en/" page. That's incredibly high, but it's strange that while most of the clients didn't bother following the redirect, some did. The ones that didn't follow the redirect were obviously basic traffic generating bots that simply generate requests, but why were some bots coded up differently to follow redirects? Accesses to other webpages on the website and accesses on other days seemed fairly reasonable. I wonder why that attack was only for the root page of the website, especially considering that the page was essentially blank? Wouldn't it have been better to access a page with a larger file size? Or a spread of different pages or even non-existent pages? I suppose it doesn't matter since the whole website is a static website anyway.

Digging into hourly usage statistics, it seems that almost all the requests happened in a one hour period:

28.05 GB was used right at 12:00 UTC time. The timing is a little odd. I suppose the attack was scheduled in advance to occur right at 12. But why was it so short? Did AWS recognize an attack was occurring and block it? Or did the attackers only purchase a small DDOS attack, so it couldn't be sustained? Or did the attackers realize that they attacked the wrong target or that it was pointless trying to attack a website hosted by Amazon and call it off?

I'm too lazy to download the gigabytes of logs and do a proper analysis, but when I took a look at one or two log files, it seems like the attack happened around 12:50 UTC and lasted only 3-4 minutes, so maybe it could have been manually triggered after all. But if it was manually triggered, maybe the attacker would manually visit the website to verify if the attack was working or not. If so, I could search through the logs, and maybe I could pick out the request that comes directly from the attacker's computer. Of course, maybe the botnet automatically monitors its own effectiveness. That might explain why some of the requests followed the redirect while others did not. The requests that followed the redirect were actually trying to verify the effectiveness of the attack by sending a normal request to the website and measuring its response time.
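To get a rough sense of scale, the numbers above can be plugged into a little arithmetic. This is only a back-of-the-envelope Swift sketch: it assumes that essentially all of the day's "/" requests arrived inside the 3-4 minute window, that the 28.05 GB was almost entirely responses to those requests, and that 1 GB means 10^9 bytes. None of that is confirmed by the logs.

```swift
// Figures quoted in the post.
let rootRequests = 126_755_687.0      // requests for "/" on February 28
let transferredBytes = 28.05e9        // 28.05 GB, assuming 1 GB = 10^9 bytes

// Assumption: essentially all of those requests arrived during the attack window.
func requestsPerSecond(total: Double, windowMinutes: Double) -> Double {
    return total / (windowMinutes * 60)
}

let rateIfThreeMinutes = requestsPerSecond(total: rootRequests, windowMinutes: 3)
let rateIfFourMinutes = requestsPerSecond(total: rootRequests, windowMinutes: 4)
let bytesPerRequest = transferredBytes / rootRequests
print(Int(rateIfFourMinutes), Int(rateIfThreeMinutes), Int(bytesPerRequest))
```

Under those assumptions, that works out to somewhere between roughly 500,000 and 700,000 requests per second, at a couple hundred bytes per request, which would fit with the root page being an essentially blank redirect.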

AWS provides summary statistics of the country where requests came from. The accesses seem to be spread pretty widely geographically, so it really was a distributed botnet. Here's a breakdown of the top few countries where the accesses came from:

  1. United States 27,235,432
  2. Bulgaria 15,866,047
  3. Turkey 6,927,133
  4. France 6,166,693
  5. Germany 6,141,788
  6. Indonesia 6,115,375
  7. Netherlands 5,347,080
  8. Canada 5,156,815
  9. Australia 4,458,130
  10. China 3,511,069
  11. India 3,093,340
  12. Japan 3,039,478
  13. Vietnam 2,406,960
  14. Brazil 2,321,232
  15. Russian Federation 1,859,764
  16. Iran, Islamic Republic of 1,631,397
  17. Colombia 1,510,684
  18. United Kingdom 1,485,535
  19. Korea, Republic of 1,323,381
  20. Bangladesh 1,219,066
  21. Spain 1,170,194
  22. Thailand 1,081,853
  23. Finland 1,052,344
  24. Ecuador 921,248
  25. Poland 867,562
  26. Argentina 843,783
  27. Ukraine 842,882
  28. Mexico 838,305
  29. Hungary 719,549
  30. South Africa 669,794
  31. Philippines 610,215
  32. Kazakhstan 599,328
  33. Italy 540,665
  34. Luxembourg 532,379
  35. Chile 512,108
  36. Libya 486,914
  37. Venezuela, Bolivarian Republic of 471,266
  38. Singapore 455,583
  39. Ireland 435,134
  40. Peru 396,041
  41. Latvia 357,309
  42. Dominican Republic 353,845
  43. Sri Lanka 314,126
  44. Norway 298,881
  45. Albania 268,222
  46. Myanmar 235,063
  47. ...

I just looked at a few log entries of requests, and it seems like some clients only made a few requests while other clients would submit hundreds of requests, all within the span of a few seconds. Just randomly grabbing a few IPs from the logs and doing IP lookups, it looks like the requests were coming from compromised servers in various data centers. I can see Hurricane Electric, Heymman Servers, BelCloud, RK Telecom, Maxnet Telecom, Min Proxy Company--just a lot of servers from all over. I wonder if these are compromised servers or cloud servers bought using stolen credit cards. Is it possible to report these servers to the service providers as being compromised so that they can be taken down and fixed?
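If I ever did download the logs, tallying requests per client would be a short script. Here is a hedged Swift sketch of what that might look like. I haven't said which log format AWS produced here, so the function takes the position of the client address as a hypothetical `ipFieldIndex` parameter, and the sample lines are made up (using documentation-range IP addresses), not real log data.

```swift
/// Counts how many log lines each client produced.
/// `ipFieldIndex` is a hypothetical parameter: the column that holds the
/// client address depends on the log format, so the caller supplies it.
func requestCounts(logLines: [String], ipFieldIndex: Int) -> [String: Int] {
    var counts: [String: Int] = [:]
    for line in logLines {
        // Skip comment/header lines, which many log formats mark with '#'.
        if line.hasPrefix("#") { continue }
        let fields = line.split(whereSeparator: { $0 == " " || $0 == "\t" })
        guard ipFieldIndex >= 0, ipFieldIndex < fields.count else { continue }
        counts[String(fields[ipFieldIndex]), default: 0] += 1
    }
    return counts
}

// Made-up sample lines for illustration only.
let sampleLines = [
    "2024-02-28 12:50:01 203.0.113.7 GET /",
    "2024-02-28 12:50:01 203.0.113.7 GET /",
    "2024-02-28 12:50:02 198.51.100.9 GET /",
]
let counts = requestCounts(logLines: sampleLines, ipFieldIndex: 2)
// Busiest clients first.
let busiest = counts.sorted { $0.value > $1.value }
```

Sorting the tally by count would make it easy to separate the clients that sent hundreds of requests from the ones that only sent a few.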

So overall, it does seem like I was the victim of a DDOS, but since everything is cloud-hosted, I didn't really notice at all. To be honest, the site is so off my radar, I don't think I would have noticed even if the DDOS really had taken down the website and made it inaccessible. Honestly, I can't fathom why someone would want to run a distributed denial of service against the website. It's a really insignificant site with little traffic and no commercial value, so there can't be any commercial reason to try to knock it off the Internet. I don't think any Internet scammers sent me any threats or extortion messages asking for money to avoid a DDOS. I even looked into my spam folders and didn't see anything there. Maybe I pissed someone off on the Internet and they decided to attack back by hiring a DDOS service, but I don't think I annoyed anyone on February 28th. Perhaps I annoyed someone before then, and they only scheduled the attack for later, but that seems to defeat the point of a DDOS if it just seems random and doesn't cause me to fear that hackers are out to get me. 

So a DDOS attack was made against my website. But it was over in three minutes. I don't know why it ended so quickly. I didn't even know about it until a week later. I don't know why I was attacked. It's all just very confusing and mysterious.

Thursday, June 08, 2023

Using My Own Programming Language in a Game Jam


For the past while, I've been working on creating a new programming language called Plom. This is a bit unusual because I don't actually believe in creating new programming languages. I think there are enough programming languages. I don't think there's much benefit to creating new ones beyond personal satisfaction. And I even dedicated a whole PhD to researching how to add new features to existing programming languages so that you don't have to create new ones.

The reason I ended up making a new programming language was that I was interested in getting more people into programming by making it easier for people to program using cellphones. In a world where more and more people have cellphones, we need to bring programming to cellphones instead of expecting people to buy computers to do any programming. But then I faced a standard research dilemma of whether I should adapt an existing programming language resulting in something more practical but full of compromises or whether I should design a new language entirely from scratch giving me more avenues to explore to find the best solution but resulting in something less practical for real-world use. In this case, I did decide that making a new programming language would simply offer considerably more gains over adapting an existing language, and I felt like it was important to show as many gains as possible. I need to convince others that this direction in programming language design is important, and I feel that I need to show clear improvements over existing techniques in order to do that.

When creating a new programming language, I have a theory that it's important that the programming language be usable for real programming. It doesn't matter if it's a toy programming language or an educational programming language or whatever. To fully understand the main problems and issues behind a new programming language, it must be used. It's also important for marketing because the primary evangelists behind programming languages are other programmers. So it doesn't matter if a programming language is intended for a non-programmer audience, the people who will evangelize the language to them are, in fact, programmers. As such, the programming language must be usable for real programming by real programmers if you want it to gain any traction.

So all this is a long-winded explanation for how I ended up game jamming with my own programming language. I've been working on my Plom programming language for a while, and though it's still in rough shape, I felt that I really needed to start using it for something real to get a feel for the real issues facing the language and where it needed improvement. If I were to just make some toy programs in a relaxed environment, I would end up working in starts and stops. Seeing one annoyance in the language, then taking a break to work on it, and then switching back and forth, again and again. But game jams are useful because you have a limited time and you must absolutely focus on the language to drive it through any problems and issues. Instead of taking a break to fix an annoyance, I had to keep programming despite the annoyance, which sometimes revealed deeper, more important issues. So I entered a weekend game jam with the intention to make a game entirely using Plom, allowing me to focus entirely on Plom for one weekend so that I could understand its strengths and weaknesses. I spent the weeks before the game jam making sure that Plom had at least rudimentary support for being used in a game jam such as basic support for importing external resources (like images for games), a rudimentary runtime so that code could be run, and the ability to export everything as a game. I then installed everything on my iPad and Android phone and headed off to the jam.

Overall, programming with Plom worked better and worse than expected. 

It worked better than expected in that I was actually able to finish making a small game with the language. I kept worrying that I would encounter some fundamental flaw in the language that would make the project go awry. I was sure that I must have overlooked some implementation detail that would cause the language to behave unreliably, requiring hours and hours to debug, derailing things. In the end though, the language itself seemed to work fine, and it behaved as it was designed to. Its implementation seemed to scale properly to support a small, real program, and its performance was adequate. PlomGit, the little git version control app I wrote earlier, was solid, and I was able to move between developing on Android, iOS, and web depending on where I was without any complications. I did most of my programming on my iPad because it was a faster machine, but I did do some programming on my phone on the subway too. I really only had to stop game jamming and focus on improving Plom once, which was to add support for importing Plom projects made in iOS/Android directly into the web version, so that I could debug them more easily. It wasn't a fundamental issue with Plom itself. So overall, I think the general design that I have for Plom seems solid.

Plom worked much worse than expected in that some aspects of Plom simply weren't ready for real programming yet. Mainly, the error reporting and logging were totally inadequate. I knew of this shortcoming when entering the game jam, but I expected to be able to set up a full build environment for Plom at the game jam where I could debug Plom and dig into its internals at the jam site to figure out what was going wrong. But the computers at George Brown College were shared, and I didn't want to store my ssh keys there, and installing Xcode on those machines to allow me to deploy new iOS versions seemed a little dubious (curse you, Apple and your closed developer ecosystem that requires complicated pipelines and signing credentials just to deploy small programs). So even if I made small coding errors in my game, there was inadequate feedback about where the errors were, and I wouldn't be able to properly debug it until I brought the code home and ran the entire Plom environment in a Chrome debugger (I've never really been able to get the Safari debugger to work properly with GWT code running in frames, so debugging the Plom environment in iOS never really worked for me). As a result, at the beginning, when I wasn't really confident in which parts of Plom were reliable or not and didn't have much experience with programming in Plom, I made several coding mistakes that weren't obvious, couldn't track them down, and pretty much became stuck for hours at a time because I didn't know whether the bugs were in Plom or my game code or somewhere else. In fact, by halfway through the jam, I still couldn't reliably display images on the screen, let alone make a game. I then had to discard my initial plan for a game and come up with an entirely different game idea that was much simpler because there was no way I could make my original game idea with the pace I was proceeding at. 
By the end of the second day, I was finally able to get some small things running and with a game idea that had a more manageable, smaller scope, I didn't feel so stressed out. By the third day, I was more confident in using Plom, and I was more confident that any bugs I encountered were bugs in the game and not with Plom, so I was really able to focus my efforts and finish the game.

So overall, the game jam was pretty stressful. Usually with game jams, I spend the first day coming up with an idea and programming the basic groundwork for the game. I then have the game playable by the end of the second day. And I spend the third day polishing the game up so that it's enjoyable to play. With this game jam, I went in with a simple game idea already and I spent the first day getting used to Plom and trying to draw sprites onto the screen. By the second day, I was still fighting with getting a basic game framework going, and disheartened by how things were going, I had to change game ideas. I was even making a new Array implementation for Plom that could interface better with JavaScript code. It was only on the final day that I was able to program most of the functionality of the game itself. I always felt like I was behind and struggling to catch up. But I did catch up, and though the game isn't remarkable by any means, it is a real program and it was made with Plom. Plom still has a long road before it's a usable language. During the jam, I must have made 2-3 pages of notes of things that needed improvement. But I'm encouraged by how things went, and I'm beginning to think that Plom might actually work as a language.


Wednesday, November 30, 2022

Steam Deck is Like the DOS Era All Over Again

I recently purchased a Steam Deck for my parents, hoping that it would be an easy to use gaming machine for the occasional times when my parents want to game. The promise of the Steam Deck was that it would be an easy to use gaming machine like a gaming console, but for PC games. Instead, I've found the Steam Deck to be like DOS-era gaming where I have to spend huge amounts of time doing configuration and setup, and afterwards, everything is still sort of fiddly and difficult to use. 

In the end, I now realize that the Steam Deck is not actually for PC gaming. It can play PC games, but the hardware and software have been specifically designed as a new gaming console designed specifically to play Steam Deck games for Steam Deck gamers. What I mean by that is that the Steam Deck isn't really designed for more general gamers, and it really isn't designed for non-Steam Deck games. I had a whole library of old Steam games that I've accumulated through Humble Bundles and elsewhere over the years, and I assumed that they would work okay on the Steam Deck. In fact, the experience of playing these games on the Steam Deck isn't that great. The Steam Deck is designed for playing Steam Deck games--games that have been customized and programmed specifically for running on the Steam Deck. If you have a lot of those games, then that's great. I think those are mostly action-oriented games, especially if they have been ported from other gaming consoles. The Steam Deck is also not designed for casual gamers. To use the Steam Deck, you have to learn a bunch of UI quirks and memorize several shortcuts. Non-tech-savvy people will never remember all these things and will become frustrated by the device. A lot of fit and polish issues needed for a general audience are lacking. For example, just turning on the device is a little complicated. There's a one or two second delay between pressing the power button and anything showing up on the screen. So when I press the button, I can never figure out whether the press was registered, and whether I should press the button again, which might turn it off, or long-press it to actually turn it on or whatever (a lot of other UI actions have long delays with insufficient feedback like that too--I'm looking at you, "return to game mode"). And when it does finish booting up, it dumps you on a non-customizable "home" screen, which doesn't actually list the games that you can play on your device. 
Instead, it lists a jumble of games that you recently purchased on Steam, some that you've played recently, etc. You have to press an unmarked shortcut (the B button) or navigate through the Steam menu to get to the games list, and then you have to navigate around that to get to your list of installed games. There's no way my parents or young kids will remember all those steps to get to their favorite game. You would think that this convoluted UI is a scheme to get you to buy more Steam games, but that's not the case either because you have to navigate the menus to get to the Steam store as well. I just don't understand why the UI is this way.

I've watched several videos about how the Steam Deck can be used as a computer. In fact, it only makes a suitable computer if you plug in a monitor and mouse and keyboard. The Steam Deck designers did not bother refining that aspect of the experience to make it practical if you're using just the Steam Deck itself. For example, I'm not sure if the hardware digitizer is poor quality or the touch drivers are poor, but all touchscreen actions are pretty janky. Swiping to scroll in web browsers and elsewhere doesn't really work smoothly. The virtual touch keyboard always misses key presses, so you can't really type quickly using it. I'm not sure if the soft keyboard is part of the OS or if it's a Steam thing because in some programs, the program loses keyboard focus when I'm in the virtual keyboard, which is annoying. There's no dedicated button to pull up the soft keyboard. Instead, you need to use the Steam-X shortcut, which normal people won't remember. That shortcut is also a hassle because it requires two hands to press (a good portable device should be usable one-handed), and I often end up accidentally pressing the grip buttons on the back of the device when I have to shift my hands over. When using the trackpad like a mouse, the R2 trigger is used for left-click, and the L2 trigger is used for right-click, which is going to throw beginners off. Also, the L2 and R2 triggers are analog triggers, so it's a pain having to squeeze them all the way down just to do a mouse-click. In particular, double-clicking is a real pain, and sometimes, I have to shift my hand a bit to fully depress the trigger, causing my thumb to shift on the trackpad a bit, moving my mouse pointer before clicking. Personally, I think R1 for left-click, and R2 for right-click might have been better. 
You can install your own programs and games, but Steam discourages that, requiring you to add 4 different pieces of artwork in 3 different locations to get your own programs to integrate nicely with the Steam interface. 

Playing games that aren't optimized for the Steam Deck isn't too great either. Part of the problem is that the device is optimized for Steam Deck games at the expense of being good for general PC gaming. For example, besides the keyboard being janky, the Steam Deck doesn't have enough buttons for it to act like both a mouse and a game controller at the same time. With non-Steam PC games, there's an assumption that even if you have a gamepad, you might sometimes have to click on things or type things to configure things. But the Steam Deck can't be configured as both. You have to go into a mouse mode to do your mouse things, then switch back to controller mode to do your controller things. And there's no button for doing that switch, so you have to navigate menus or whatever every time you need to switch. If the Steam Deck were designed for general PC gaming, they would have lost one of the trackpads and had dedicated left-click/right-click mouse buttons, plus a keyboard button. That way, you could easily switch between mouse/gamepad/keyboard for non-Steam Deck games without much hassle. Instead, the games need to be customized specifically for Steam Deck to work well.

I still think the Steam Deck is a nice device, but a lot of the hype oversells what it is. It's not easy to use like a gaming console at all. It isn't great at general PC gaming, and you aren't going to pull out your old collection of Steam strategy games or whatever to play on it. It's not a general computer. You aren't even going to browse the web with it. I think there's still a lot of room for other manufacturers like GPD, AYN etc. to make better devices that are easier to use and better for gaming.

Tuesday, September 10, 2019

Swift ASN.1 Decoder for iOS Receipt Validation

If you want to have in-app purchases in an iOS or macOS app, you need a way to check what purchases have been made. Annoyingly, Apple does not provide developers with any code for doing this. Apple's APIs will give your program a receipt, listing what was purchased, but the receipt is encoded in a weird format, and Apple doesn't provide any code for reading this format. Apple's reasoning is that not providing code for this is like a very limited form of DRM/copy protection. If every program has custom code for parsing and interpreting the receipt, software pirates will need to do extra work to crack your software.

It is true that software piracy is rampant on Android, and it probably exists on iOS too. Some of us aren't really too concerned with this software piracy issue though, and we just want to implement some quick and dirty handling of IAP with the assumption that most software pirates wouldn't have purchased the software anyway. 

Apple's preferred solution is for you to create your own receipt validation server that your programs can connect to, which will then contact Apple's servers to parse the receipt and to confirm that it's valid. This is a bit of hassle because you have to make an online service, figure out how to keep it running, protect it from hackers, and make your app more fragile because it will always be connecting to this online service.

The other solution is to do receipt validation on the app itself. This is annoying because Apple doesn't provide code for parsing the receipt, the receipt stored on the app contains less information than what Apple provides to servers, and iOS doesn't really bother to keep the receipt up-to-date all the time meaning you often have to go out of your way to update the receipt yourself. The most common way to do the receipt parsing is to just include a copy of OpenSSL in the app, but that involves some annoying interfacing with C code.
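For a sense of what parsing the receipt involves: the receipt is ASN.1 data, and every ASN.1 BER element starts with an identifier byte that packs a class, a "constructed" flag, and a tag number. Here is a small Swift sketch that decodes just that first byte. The `IdentifierOctet` type and `decodeIdentifierOctet` function are names made up for this illustration, and it ignores multi-byte tags entirely.

```swift
/// Fields packed into the first (identifier) octet of a BER element.
struct IdentifierOctet {
    let tagClass: UInt8      // top 2 bits: 0 universal, 1 application, 2 context-specific, 3 private
    let constructed: Bool    // next bit: set when the element contains nested elements
    let tagNumber: UInt8     // low 5 bits: tag number, or 31 when the tag continues in later octets
}

func decodeIdentifierOctet(_ byte: UInt8) -> IdentifierOctet {
    return IdentifierOctet(
        tagClass: (byte >> 6) & 0x03,
        constructed: (byte & 0x20) != 0,
        tagNumber: byte & 0x1F
    )
}

let sequence = decodeIdentifierOctet(0x30)  // SEQUENCE: universal, constructed, tag 16
let integer = decodeIdentifierOctet(0x02)   // INTEGER: universal, primitive, tag 2
```

After the identifier comes a length and then the contents, which is the tag-length-value structure that any receipt parser has to walk.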

I just wanted something quick & dirty, and I'm not too concerned about doing all the signature checking and whatnot, so I just wanted some simpler Objective-C or Swift code online for doing receipt parsing. I tried looking around online a lot, but I couldn't find one, so eventually, I just rolled my own. It's pretty rough since I just threw it together until it worked just enough that it would work for my own app, so use at your own risk. Here it is:

struct Asn1BerTag : CustomStringConvertible {
    var constructed: Bool
    var tagClass: Int
    var tag: Int
    var description: String {
        return String(tagClass) + (constructed ? "C": "-") + String(tag);
    }
}

struct Asn1Entry {
    let tag : Asn1BerTag
    let data : Data
    let len : Int
}

// TODO: This parser thing is sort of insecure because it doesn't really do bounds-checking on
// anything, but it's only used for reading internal data structures so whatever
class Asn1Parser {
    // Parse a single ASN 1 BER entry
    static func parse(_ data: Data, startIdx: Int = 0) -> Asn1Entry {
        var idx = startIdx
        // Try to parse the tag
        var val = data[idx]
        idx += 1
        let tagClass = Int((val >> 6) & 3)
        let constructed = (val & (1 << 5)) != 0
        var tagVal = Int(val & 0x1F)
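        if tagVal == 31 {
            // High-tag-number form: the real tag follows in base-128 octets
            // (7 bits each), so discard the 31 marker and shift by 7
            tagVal = 0
            val = data[idx]
            idx += 1
            while (val & 0x80) != 0 {
                tagVal <<= 7
                tagVal |= Int(val & 0x7F)
                val = data[idx]
                idx += 1
            }
            tagVal <<= 7
            tagVal |= Int(val & 0x7F)
        }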
        let tag = Asn1BerTag(constructed: constructed, tagClass: tagClass, tag: tagVal)
        
        // Try to parse the size
        var len = 0
        var nextTag = 0
        val = data[idx]
        idx += 1
        if val & 0x80 == 0 {
            len = Int(val)
            nextTag = idx + len
        } else if val != 0x80 {
            let numOctets = Int(val & 0x7f)
            for _ in 0..<numOctets {
                len <<= 8
                val = data[idx]
                idx += 1
                len |= Int(val) & 0xFF
            }
            nextTag = idx + len
        } else {
            // Indefinite length. Scan until we encounter 2 zero bytes
            var scanIdx = idx
            while data[scanIdx] != 0 || data[scanIdx+1] != 0 {
                scanIdx += 1
            }
            len = scanIdx - idx
            nextTag = scanIdx + 2
        }
        return Asn1Entry(tag: tag, data: data.subdata(in: idx..<(idx + len)), len: nextTag - startIdx)
    }
    
    static func parseSequence(_ data: Data) -> [Asn1Entry] {
        var toReturn : [Asn1Entry] = []
        var idx = 0
        while idx < data.count {
            let entry = Asn1Parser.parse(data, startIdx: idx)
            toReturn.append(entry)
            idx += entry.len
        }
        
        return toReturn
    }
    
    static func parseInteger(_ data: Data) -> Int {
        let len = data.count
        var val = 0
        
        for i in 0..<len {
            if i == 0 {
                val = Int(data[i] & 0x7F)
            } else {
                val <<= 8
                val |= Int(data[i])
            }
        }
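        if len > 0 && data[0] & 0x80 != 0 {
            // The top bit of a two's complement number has weight -2^(8*len - 1);
            // the loop above masked it off, so subtract it back here
            let complement = 1 << (len * 8 - 1)
            val -= complement
        }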
        return val
    }
    
    static func parseObjectIdentifier(_ data:Data, startIdx: Int = 0, len: Int? = nil) -> [Int] {
        let dataLen = len ?? data.count
        var idx = startIdx
        var identifier: [Int] = []
        while idx < startIdx + dataLen {
            var subidentifier = 0
            var val = data[idx]
            idx += 1
            while (val & 0x80) != 0 {
                subidentifier <<= 7
                subidentifier |= Int(val & 0x7F)
                val = data[idx]
                idx += 1
            }
            subidentifier <<= 7
            subidentifier |= Int(val & 0x7F)
            identifier.append(subidentifier)
        }
        
        return identifier
    }
}

class IapReceipt {
    var quantity: Int?
    var product_id: String?
    var transaction_id: String?
    var original_transaction_id: String?
    var purchase_date: Date?
    var original_purchase_date: Date?
    var expires_date: Date?
    var is_in_intro_offer_period: Int?
    var cancellation_date: Date?
    var web_order_line_item_id: Int?
}

class AppReceipt {
    var bundle_id : String?
    var application_version : String?
    var receipt_creation_date: Date?
    var expiration_date: Date?
    var original_application_version : String?
    var iaps: [IapReceipt] = []
}

class ReceiptInsecureChecker {
   
    func parsePkcs7ReceiptForPayload(_ data: Data) -> Data? {
        
        // Root is a sequence (tag 16 is sequence)
        let root = Asn1Parser.parseSequence(data)
        guard root.count == 1 && root[0].tag.tag == 16 else { return nil }
        
        // Inside the sequence is some signed data (tag 6 is object identifier)
        let rootSeq = Asn1Parser.parseSequence(root[0].data)
        guard rootSeq.count == 2 && rootSeq[0].tag.tag == 6 && Asn1Parser.parseObjectIdentifier(rootSeq[0].data) == [42, 840, 113549, 1, 7, 2] else { return nil }
        
        // Signed Data contains a sequence
        let signedData = Asn1Parser.parseSequence(rootSeq[1].data)
        guard signedData.count == 1 && signedData[0].tag.tag == 16 else { return nil }
        
        // The third field of the signed data sequence is the actual data
        let signedDataSeq = Asn1Parser.parseSequence(signedData[0].data)
        guard signedDataSeq.count > 3 && signedDataSeq[2].tag.tag == 16 else { return nil }
        
        // The content data should be tagged correctly
        let contentData = Asn1Parser.parseSequence(signedDataSeq[2].data)
        guard contentData.count == 2 && contentData[0].tag.tag == 6 && Asn1Parser.parseObjectIdentifier(contentData[0].data) == [42, 840, 113549, 1, 7, 1] else { return nil }
        
        // Payload should just be some bytes (tag 4 is octet string)
        let payload = Asn1Parser.parse(contentData[1].data)
        guard payload.tag.tag == 4 else { return nil }
        
        return payload.data
    }
    
    func parseReceiptAttributes(_ data: Data) -> AppReceipt? {
        let appReceipt = AppReceipt()
        
        // Root is a set (tag 17 is a set)
        let root = Asn1Parser.parse(data)
        guard root.tag.tag == 17 else { return nil }
        
        // Read set entries
        let receiptAttributes = Asn1Parser.parseSequence(root.data)
        
        // Parse each attribute
        for attr in receiptAttributes {
            if attr.tag.tag != 16 { continue }
            let attrEntries = Asn1Parser.parseSequence(attr.data)
            guard attrEntries.count == 3 && attrEntries[0].tag.tag == 2 && attrEntries[1].tag.tag == 2 && attrEntries[2].tag.tag == 4 else { return nil }
            
            let type = Asn1Parser.parseInteger(attrEntries[0].data)
            let version = Asn1Parser.parseInteger(attrEntries[1].data)
            let value = attrEntries[2].data
            switch (type) {
            case 2:
                let valEntry = Asn1Parser.parse(value)
                // tag 12 = utf8 string
                guard valEntry.tag.tag == 12 else { break }
                appReceipt.bundle_id = String(bytes: valEntry.data, encoding: .utf8)
            case 3:
                let valEntry = Asn1Parser.parse(value)
                guard valEntry.tag.tag == 12 else { break }
                appReceipt.application_version = String(bytes: valEntry.data, encoding: .utf8)
            case 12:
                let valEntry = Asn1Parser.parse(value)
                guard valEntry.tag.tag == 22 else { return nil }
                appReceipt.receipt_creation_date = parseRfc3339Date(String(bytes: valEntry.data, encoding: .utf8) ?? "")
            case 17:
                if let iap = parseIapAttributes(value) {
                    appReceipt.iaps.append(iap)
                }
            case 19:
                let valEntry = Asn1Parser.parse(value)
                guard valEntry.tag.tag == 12 else { break }
                appReceipt.original_application_version = String(bytes: valEntry.data, encoding: .utf8)
            case 21:
                let valEntry = Asn1Parser.parse(value)
                guard valEntry.tag.tag == 22 else { return nil }
                appReceipt.expiration_date = parseRfc3339Date(String(bytes: valEntry.data, encoding: .utf8) ?? "")
            default:
                break
            }
        }
        
        return appReceipt
    }
    
    func parseIapAttributes(_ data: Data) -> IapReceipt? {
        let iap = IapReceipt()
        
        // Root is a set (tag 17 is a set)
        let root = Asn1Parser.parse(data)
        guard root.tag.tag == 17 else { return nil }
        
        // Read set entries
        let receiptAttributes = Asn1Parser.parseSequence(root.data)
        
        // Parse each attribute
        for attr in receiptAttributes {
            if attr.tag.tag != 16 { continue }
            let attrEntries = Asn1Parser.parseSequence(attr.data)
            guard attrEntries.count == 3 && attrEntries[0].tag.tag == 2 && attrEntries[1].tag.tag == 2 && attrEntries[2].tag.tag == 4 else { return nil }
            
            let type = Asn1Parser.parseInteger(attrEntries[0].data)
            let version = Asn1Parser.parseInteger(attrEntries[1].data)
            let value = attrEntries[2].data
            switch (type) {
            case 1701:
                let valEntry = Asn1Parser.parse(value)
                guard valEntry.tag.tag == 2 else { return nil }
                iap.quantity = Asn1Parser.parseInteger(valEntry.data)
            case 1702:
                let valEntry = Asn1Parser.parse(value)
                guard valEntry.tag.tag == 12 else { return nil }
                iap.product_id = String(bytes: valEntry.data, encoding: .utf8)
            case 1703:
                let valEntry = Asn1Parser.parse(value)
                guard valEntry.tag.tag == 12 else { return nil }
                iap.transaction_id = String(bytes: valEntry.data, encoding: .utf8)
            case 1704:
                let valEntry = Asn1Parser.parse(value)
                guard valEntry.tag.tag == 22 else { return nil }
                iap.purchase_date = parseRfc3339Date(String(bytes: valEntry.data, encoding: .utf8) ?? "")
            case 1706:
                let valEntry = Asn1Parser.parse(value)
                guard valEntry.tag.tag == 22 else { return nil }
                iap.original_purchase_date = parseRfc3339Date(String(bytes: valEntry.data, encoding: .utf8) ?? "")
            case 1708:
                let valEntry = Asn1Parser.parse(value)
                guard valEntry.tag.tag == 22 else { return nil }
                iap.expires_date = parseRfc3339Date(String(bytes: valEntry.data, encoding: .utf8) ?? "")
            case 1719:
                let valEntry = Asn1Parser.parse(value)
                guard valEntry.tag.tag == 2 else { return nil }
                iap.is_in_intro_offer_period = Asn1Parser.parseInteger(valEntry.data)
            case 1712:
                let valEntry = Asn1Parser.parse(value)
                guard valEntry.tag.tag == 22 else { return nil }
                iap.cancellation_date = parseRfc3339Date(String(bytes: valEntry.data, encoding: .utf8) ?? "")
            case 1711:
                let valEntry = Asn1Parser.parse(value)
                guard valEntry.tag.tag == 2 else { return nil }
                iap.web_order_line_item_id = Asn1Parser.parseInteger(valEntry.data)
            default:
                break
            }
        }
        return iap
    }
    
    func parseRfc3339Date(_ str: String) -> Date? {
        let posixLocale = Locale(identifier: "en_US_POSIX")
        
        let formatter1 = DateFormatter()
        formatter1.locale = posixLocale
        formatter1.dateFormat = "yyyy'-'MM'-'dd'T'HH':'mm':'ssXXXXX"
        formatter1.timeZone = TimeZone(secondsFromGMT: 0)
        
        let result = formatter1.date(from: str)
        if result != nil {
            return result
        }
        
        let formatter2 = DateFormatter()
        formatter2.locale = posixLocale
        formatter2.dateFormat = "yyyy'-'MM'-'dd'T'HH':'mm':'ss.SSSSSSXXXXX"
        formatter2.timeZone = TimeZone(secondsFromGMT: 0)
        
        return formatter2.date(from: str)
    }    
}

To use the code, you would write something like this:

let receiptChecker = ReceiptInsecureChecker()
if let data = Data(base64Encoded: "... BASE 64 DATA ..."),
   let payload = receiptChecker.parsePkcs7ReceiptForPayload(data),
   let appReceipt = receiptChecker.parseReceiptAttributes(payload) {
    print(appReceipt.iaps)
}
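
If the tag/length/value layout that Asn1Parser walks through is unfamiliar, here is a rough standalone sketch of reading a single BER header from a hand-written byte array. The names TlvHeader and readHeader are made up for this illustration and are not part of the code above. Unlike the parser above, the sketch returns nil on truncated input, and it deliberately skips high tag numbers and indefinite lengths.

```swift
// Hypothetical helper type for this sketch; not part of the receipt parser.
struct TlvHeader {
    let tagClass: Int      // top two bits of the first byte
    let constructed: Bool  // bit 5 of the first byte
    let tagNumber: Int     // low five bits (short form only)
    let valueStart: Int    // index of the first value byte
    let valueLength: Int   // number of value bytes
}

// Reads one BER header at `start`. Returns nil instead of crashing when the
// input is truncated or uses a form this sketch does not handle
// (high tag numbers, indefinite lengths, lengths wider than 4 bytes).
func readHeader(_ bytes: [UInt8], at start: Int) -> TlvHeader? {
    guard start >= 0, start + 1 < bytes.count else { return nil }
    let first = bytes[start]
    let tagNumber = Int(first & 0x1F)
    guard tagNumber != 31 else { return nil }  // high-tag-number form

    var idx = start + 1
    let lengthByte = bytes[idx]
    idx += 1
    var length = 0
    if lengthByte & 0x80 == 0 {
        // Short form: the byte itself is the length (0...127)
        length = Int(lengthByte)
    } else {
        // Long form: the low 7 bits give the number of length bytes that follow
        let count = Int(lengthByte & 0x7F)
        guard count > 0, count <= 4, idx + count <= bytes.count else { return nil }
        for _ in 0..<count {
            length = (length << 8) | Int(bytes[idx])
            idx += 1
        }
    }
    guard idx + length <= bytes.count else { return nil }
    return TlvHeader(tagClass: Int(first >> 6),
                     constructed: first & 0x20 != 0,
                     tagNumber: tagNumber,
                     valueStart: idx,
                     valueLength: length)
}

// SEQUENCE (0x30) of 3 bytes, containing INTEGER (0x02) of 1 byte, value 5
let sample: [UInt8] = [0x30, 0x03, 0x02, 0x01, 0x05]
if let outer = readHeader(sample, at: 0),
   let inner = readHeader(sample, at: outer.valueStart) {
    print(outer.tagNumber, outer.constructed, outer.valueLength)  // 16 true 3
    print(inner.tagNumber, inner.constructed, sample[inner.valueStart])  // 2 false 5
}
```

Tag 16 with the constructed bit set is the same "sequence" check that parsePkcs7ReceiptForPayload makes on the root entry.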

Note: I'm not a Swift coder. I only started learning Swift about a month ago, so I apologize if the code is not very Swift-y.
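
As a possible alternative to the two-formatter fallback in parseRfc3339Date, Foundation's ISO8601DateFormatter can probably do the same job. This is only a sketch: parseReceiptDate is a hypothetical name, and it assumes that every date string in the receipt ends in "Z" or a numeric offset and that the app targets an OS version where the .withFractionalSeconds option is available. I have not checked it against every date format a real receipt can contain.

```swift
import Foundation

// Hypothetical replacement for the two-formatter fallback; assumes every
// date string ends in "Z" or a numeric offset such as "-07:00".
func parseReceiptDate(_ str: String) -> Date? {
    let plain = ISO8601DateFormatter()
    plain.formatOptions = [.withInternetDateTime]
    if let date = plain.date(from: str) {
        return date
    }
    // Second attempt for strings that carry fractional seconds
    let fractional = ISO8601DateFormatter()
    fractional.formatOptions = [.withInternetDateTime, .withFractionalSeconds]
    return fractional.date(from: str)
}

let whole = parseReceiptDate("2019-07-31T12:34:56Z")
let withMillis = parseReceiptDate("2019-07-31T12:34:56.500Z")
let garbage = parseReceiptDate("not a date")
print(whole as Any, withMillis as Any, garbage as Any)
```

Like the original function, it returns nil for anything it cannot parse, so callers can keep treating the date fields as optionals.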

Wednesday, July 31, 2019

CorelDraw Graphics Suite 2019 Review

Since I'm originally from Ottawa, I've always used CorelDRAW for vector graphics. This actually works out well. Since I'm not an artist or designer, I rarely need to do any vector graphics work, so CorelDRAW has worked for me because it comes with a lot of functionality, I could make a one-time purchase of a perpetual license to the software, and I was occasionally able to get good deals when buying it.

I previously used CorelDraw X5, and it did what I needed it to do, but the menus didn't work quite right on Windows 10, so I was looking to upgrade if the upgrade price ever dropped to around $100-$150 or so, but the price never dropped that low, so I just kept using my old version. Especially since I now make my own vector graphics package, I rarely needed CorelDraw except for some occasional obscure feature. Unfortunately, Corel declared that 2019 would be the last year they would offer upgrade pricing on CorelDraw, so I decided to pick up a copy of CorelDraw 2019 since it would be my last chance to get an upgrade.

I have to say that I feel a little disappointed with CorelDraw 2019. CorelDraw has always been a buggy piece of software, but usually it's the new features that are buggy, and if you stick with the core vector graphics stuff, then it works fine. Usually, the new features would be so buggy that they would be unusable, but Corel wouldn't bother fixing them until a later version, so you would just have to pay for an upgrade to fix those bugs and get working versions of the new features. Unfortunately, it seems like they rewrote the core user interface code in this version, so now the core vector graphics functionality is buggy. I suspect it might be related to the fact that they've rewritten stuff so that it works on the Mac (previously, CorelDraw was Windows only). This is annoying to me because CorelDraw 2019 is too buggy for basic vector graphics work, and it likely won't be fixed unless I buy an upgrade to a later version, but Corel isn't going to be selling upgrades any more. I'm sorely tempted to keep using CorelDraw X5. The bugs are just annoying little things like the screen blanking out if you scroll the window using the scrollbar, requiring you to press ctrl-W to manually refresh the screen. Groups also no longer snap properly to grids. If you try to move a group, CorelDraw will choose one of the objects of the group (I think it's the top one?) and snap that to the grid instead of aligning the group as a whole. This makes grids sort of useless to me. CorelDraw also doesn't let you snap to grids and snap to objects at the same time. It gets confused and tries snapping to objects, and will completely ignore any possible grid snapping you can do. If basic functionality like scrolling and snap to grid doesn't work, then how is anyone supposed to get any productive vector graphics work done with CorelDraw?

On top of that, CorelDraw feels slow and sluggish. To be fair, CorelDraw has always felt slow and sluggish, but if you keep using an old version, then after a few years, your computer gets fast enough that it feels snappy and usable. Still, I was hoping that Corel would have left well enough alone and stopped meddling with the old code so that it would stay fast. That's not the case. It feels sluggish. After all these years, Corel still has not learned that responsiveness is one of those magic unspoken features that make a graphics package feel good to use. Even though Corel Photo-Paint has many more features than my old copy of Photoshop Elements, I still use Photoshop Elements as my primary paint program because it's just so much faster and more responsive. CorelDraw 2019 also just stops and hangs for a couple of seconds sometimes. I think it might be that the saving code is now very slow for some reason. Since CorelDraw autosaves fairly often (due to its buggy nature), I think CorelDraw will just occasionally become unresponsive while its incredibly slow autosave happens.

In the end, I feel like I've wasted my money. I bought CorelDraw 2019 because it was the last upgrade version they would offer. But CorelDraw 2019 is really buggy and not very usable. These bugs likely won't be fixed until a later version of CorelDraw, for which there won't be any upgrade pricing available. Every time I use CorelDraw 2019, I keep wanting to go back to using my old version of CorelDraw X5 instead, which I sometimes do. I think the verdict is that if you are in a rush to upgrade CorelDraw because it's the last upgrade version available, DON'T get CorelDraw 2019 because it's too slow and buggy. If you can find an upgrade to an older version of CorelDraw, that might actually be a better choice to buy. Otherwise, just stick to your old version.

Monday, June 18, 2018

Ranking of Racism against Asian Americans at Ivy League Schools

It has long been assumed that the Ivy League schools are racist against Asians, but it's been hard to understand the extent of the racism. There are some universities that try to run purely meritocratic admissions systems involving fewer subjective evaluations. For example, Caltech has an Asian enrollment of 43%, but it's not clear whether that's comparable to other schools because of the heavy engineering focus of the school. Berkeley is a more well-rounded university and it has an Asian enrollment of 41%, but it's a public university in a state with a high Asian population, so it's not clear if it's comparable to universities in other parts of the country.

Fortunately, Harvard ran the numbers back in 2013, and they found that if they ranked students solely by academic qualifications, then Asians would make up around 40% of the admissions. Even if Harvard continued to set aside spots for athletes and undeserving rich people, then Asians would make up 31%. If extra-curriculars and other subjective measures were included as well, then Asians should still make up around 26% of the student population (even though in 2013, only 19% of the admitted class was Asian). I believe that the Asian population has only grown since 2013.

These numbers agree with the Berkeley and Caltech numbers, so I feel it's safe to use these numbers to do back-of-the-envelope calculations for how racist against Asians each university is. The Harvard numbers should be comparable with other prestigious, well-rounded, private universities that attract students from across the country. So it should be safe to compare the numbers to other Ivy League universities.

So I visited the websites of the Ivy League universities, and grabbed their reported diversity statistics on Asian admissions. The numbers are hard to compare because different universities categorized their students differently. If universities had a separate category for unknown and/or foreign students, then I left them out of the total. If there was a category for multi-racial, I did not include that number in the count of percentage Asians. As a result, the numbers are very noisy, but I think they still give a basis for comparing universities. I think that universities with Asian enrollment in the high 20s or low 30s are demonstrating low amounts of racism against Asians.
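
To make the counting rules above concrete, here is a small sketch of the arithmetic. Every count in it is an invented placeholder and not any university's data, and the names EnrollmentCounts and asianShare are made up for the example. It also assumes one particular reading of the rules: the unknown/foreign category is dropped from the total, while multi-racial students stay in the total but are not counted as Asian.

```swift
// All counts below are invented placeholders, not any university's data.
struct EnrollmentCounts {
    var asian: Int
    var multiRacial: Int
    var otherKnown: Int        // every other reported category
    var unknownOrForeign: Int  // excluded from the total entirely
}

// Percentage of Asian students under the assumed rules: the unknown/foreign
// bucket is dropped from the denominator, and multi-racial students stay in
// the denominator but are not counted as Asian.
func asianShare(_ c: EnrollmentCounts) -> Double {
    let total = c.asian + c.multiRacial + c.otherKnown
    guard total > 0 else { return 0 }
    return 100.0 * Double(c.asian) / Double(total)
}

let example = EnrollmentCounts(asian: 200, multiRacial: 50,
                               otherKnown: 750, unknownOrForeign: 120)
print(asianShare(example))  // 20.0
```

Moving the multi-racial count in or out of the total shifts the percentage, which is one reason the numbers are hard to compare between universities.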

So here are the results:

1. Brown (18.16%) - most racist
2. Yale (21%)
3. Dartmouth (21.74%)
4. Harvard (22.2%) - probably worse than Dartmouth, but the numbers are hard to compare
5. Cornell (23.26%)
6. UPenn (23.56%)
7. Princeton (25.29%)
8. Columbia (29%) - least racist

And here are Stanford's numbers even though they aren't an Ivy League university:

Stanford (26.44% Asian)

Initially, the numbers seem to suggest that Brown is the most racist against Asians of the universities. They also have the lowest enrollment of African Americans among the Ivy League universities, so they just fail on diversity in general. They do have a large number of students classified as multi-racial, which makes things unclear, but things still look bad after removing them from the totals. It's possible that Yale or Harvard are, in fact, the worst universities because they don't have a "multi-racial" category and their percentage of Asian enrollment is pretty low.

There's a bunch of universities in the middle that seem to be sort of racist. Princeton seems to be the best of the middle.

The least racist, by far, seems to be Columbia, which achieves Asian enrollment in the high 20s, which is the safe zone. They have strong African American enrollment as well. It demonstrates that it is possible to have diverse minority enrollment without unduly punishing Asians.

So what's going on? These universities (other than Columbia) are using the implicit bias effect to willfully keep down Asian enrollment. These universities intentionally add subjective measures into student evaluations that are known to be subject to bias. They then hire admissions officers of dubious qualifications who don't understand the Asian experience or don't like Asians in general, who implicitly prefer applicants who are more like themselves, and who are told to find applicants who match an Ivy League "culture" or "character profile" that runs contrary to Asian stereotypes and biases. As a result, they end up giving lower subjective scores to Asians. Is the applicant who plays badminton more or less "brave" than the applicant who plays football? Is the atheist applicant more or less kind than the church-going applicant? There is no way of knowing those things, but people will inevitably form an opinion based on their implicit biases. For example, Harvard gave lower "personality" ratings to Asian applicants in general. I can imagine that, yes, some universities might prefer people with certain personalities over others. But that mix of personalities should be evenly distributed among all people. If one race consistently scores poorly on the personality rating versus all other races, then there's some implicit racism going on there that needs to be fixed. Strangely, no personality deficiencies were found during alumni interviews. The bias only appeared during the ranking of personal qualities by the Admissions Office.

Here are some more in-depth articles if you want a deeper dive in the issues involved: 1, 2, 3.