The Universal Translator, as seen in Star Trek, regularly raises questions.

For example, why do the users not hear double voices? But more commonly, how do characters keep saying untranslated klingon words? How can they keep saying p’takh? How can Worf call his embarrassing pimple a gorch without the UT biting him in his klingon ass and making him call it a pimple?

@RikerGoogling has repeated the question a couple times, why the UT doesn’t seem to work on Klingonese. I keep replying, it’s intentional on the speaker’s part. But let me go into way too much detail.

How the UT determines intent, I don’t know. It doesn’t matter. Maybe something else makes it skip those words, but think about it: how would you explain something about a particular language when every word you say ends up rendered in English, or whatever the listener speaks? It’s not just Klingonese either. Picard would sometimes curse in French, right there on the bridge. Likewise, the senior crew singing “For He’s a Jolly Good Fellow” in Klingonese? They meant for it to remain the way they said it, so it was. Even if their manual translation was a bit rough.

So what about the lack of overlapping sounds? Well, I don’t care what you think about Discovery, but one particular scene stood out to me.

You know those channel logos in the corner of the TV screen? Or for you viewers of certain Japanese imports, the (much more obvious) clock? Eventually, you kinda stop seeing them. They’re still there, but you don’t actively register them like you do the rest of whatever it is you’re watching. Or in a cheap dub where the original voice tracks are faintly heard underneath the louder dub track. Background noise, this mild headache I’ve had since before I care to remember, it all just gets tuned out. And that brings me to that one scene in Discovery.

Michael Burnham sneaks onto the klingon vessel and eventually takes out her communicator, enabling its UT feature. We hear the communicator repeat what the klingons say, in English, with a slight delay and text-to-speech quality. Revealing herself, Burnham addresses the leader in English, only for the communicator to echo her words in Klingonese. And shortly into the conversation, the delay disappears and in fact, the klingon leader himself starts speaking fluent English.

I think that’s meant to represent Burnham, but perhaps mostly the viewers, getting used to the UT’s effect. Eventually, you don’t even hear the original lines any more, and your mind just ignores the delay.

There’s only one counter-argument that comes to mind: people without universal translators who listen to people with. Ferengi speaking their own language to 20th century humans, who (once the devices are fixed) can understand the ferengi perfectly well? All the above can’t explain that.

But at least the thing where they keep speaking untranslated klingon is covered.

No start!

Back in February there was this whole deal on Twitter about the Konami Code, when the person who introduced the thing passed away, and I wondered if I was an asshole for being bothered by all of these people including the Konami PR guys posting the Code… with a start at the end.

Because the Code, as Kazuhisa Hashimoto originally introduced it, never included the start button. And I brought receipts! So here’s a quick rehash of what I wrote the day after.

This is part of the 6502 code for Contra, where it determines how many lives you should start the game with. As you can see the number of lives is stored at address $32 and is set to either 2 or 29 depending on the value of address $24, where it tracks if the Code has been entered.

This is a view of the actual memory of the NES, or at least the relevant part of it. The right half had been cut off for size in the original tweets, but none of the values we’re looking at are on that side anyway. Focus is on $24, the Code flag, which is unset.

Entering the Code flips $24 to 1 when you press A. The only thing the start button does is… start the game. That includes running the code above to initialize the lives counter.
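For those who don't read 6502, here's a rough Python sketch of what that boils down to. Only the two addresses and the values 2 and 29 come from the game; the function name and the bytearray standing in for NES RAM are my own inventions for illustration.

```python
CODE_FLAG = 0x24   # set to 1 once the Code has been entered
LIVES = 0x32       # the lives counter

def start_game(ram):
    """What pressing start amounts to, as far as the lives counter goes."""
    ram[LIVES] = 29 if ram[CODE_FLAG] else 2

ram = bytearray(0x100)   # stand-in for the zero page, everything unset
start_game(ram)
print(ram[LIVES])        # 2

ram[CODE_FLAG] = 1       # as if the Code had just been finished with A
start_game(ram)
print(ram[LIVES])        # 29
```

Note that start only ever reads the flag here. It's not part of what sets it.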

“But Kawa, what about Gradius?” you might ask. Well, you go start up Gradius and enter the code, and tell me what you see. You begin the game, press start to pause, enter the code, and press start to unpause. Don’t enter those button presses too quickly or you might not catch on and fool yourself! Turns out the options and such appear the moment you press A.

Here’s the RAM for Gradius during a pause. $33 tracks how far along the Code entry you are and ranges from zero up to nine. When it’s zero, the next button expected is up. When it’s nine, the next expected input is A. You finish, $33 equals ten, and your loadout changes on the spot:

Also the tracker is reset to zero so you can enter it again. Now, in Gradius you can freely make a mistake and try again because the tracker just resets to zero when you mistype. In Contra, the tracker ($3F so it’s not visible in those screenshots) is set to 255 as a “lock” value when you mess up and you don’t get to try again.
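Here's how I'd sketch the two trackers in Python. This is not a transcription of either game's code, just the behavior described above. The function names are made up, and what Contra does with its tracker after a successful entry is an assumption on my part (I simply lock it).

```python
KONAMI_CODE = ["up", "up", "down", "down", "left", "right", "left", "right", "b", "a"]
LOCKED = 255  # Contra's "you blew it" value

def gradius_step(tracker, button):
    """Return (new_tracker, code_completed) for one button press."""
    if button != KONAMI_CODE[tracker]:
        return 0, False              # mistake: just start over
    tracker += 1
    if tracker == len(KONAMI_CODE):  # that was the A
        return 0, True               # reset so the Code can be entered again
    return tracker, False

def contra_step(tracker, flag, button):
    """Return (new_tracker, new_flag) for one button press."""
    if tracker == LOCKED:
        return tracker, flag         # no second chances
    if button != KONAMI_CODE[tracker]:
        return LOCKED, flag
    tracker += 1
    if tracker == len(KONAMI_CODE):
        return LOCKED, 1             # assumption: nothing left to track afterwards
    return tracker, flag
```

Ten entries in that list, and start isn't one of them.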

But yeah, no start in the Konami Code. Your tattoo is now ruined.


How to SCI – Old vs New

To make an SCI game today, you can just grab a copy of SCI Companion and use that to import graphics and music, do the scripting, text… basically everything but drawing bitmap-based backgrounds. You get a copy of the interpreter matching the template you chose, the system scripts are all set up for you, and you can just hit Compile, watch it work, and it’ll even automatically run the game for you in DOSBox, perhaps even starting you off in a specific room.

But dear lord do we have it easy nowadays.

In the old days, making an SCI game involved several separate utilities, many of them interface-less command line tools, and a particular network setup. That is, the tools expect to be invoked from a specific hard drive letter, as they are provided from one point of the network. There’s another where the system programmers keep the latest builds of the interpreter and system scripts, and the team for a given game has a batch script file to pull the latest into that game’s working directory. Writing the actual script code is roughly the same as it is now, but instead of a dedicated script editor they mostly used Brief. To test their changes, the programmers had to invoke SC, the Script Compiler. Given that Brief was apparently pretty extensible, this could probably be done from there.

While we mostly work directly on RESOURCE.### files, Sierra’s games were developed in what I call loose-leaf style. Each type of resource was stored in its own folder, and a “wherefile” specified where each of them could be found. It was basically just RESOURCE.CFG by another name, really. And that name was literally WHERE. Turns out you can specify which configuration file you want the interpreter to use.


To make the game, they didn’t use makefiles. They used batch scripts that invoked SC and compiled the .sc source files to .scr files in the SCRIPT directory, and copied over the script resources from SYSTEM.

Given how relative paths work, running another particular batch file would run the interpreter from one directory while in another, from which point the paths given above can be considered valid. Since they didn’t have the “start at the room specified in this file” feature that SCI Companion’s template game adds, we get the game-specific debug modes that ask for a room number on startup, as extensively documented elsewhere.

And then, when the game is considered fit to ship, they build a list of which resources go on which disk, pass that to yet another command-line tool, which goes through all that and produces the RESOURCE.### files. Copy the result and there you go.

That list does not need to include all resources though. Indeed, as there are things that are included in the game data but left unused, there are some things that never got on a release disk in the first place. Let’s just say some things in the Larry source assets are even raunchier than you’d expect.

On object lists, meta-tiles, and Mario

A fun fact about most 2D Super Mario platform games is that they all share a common way of storing their level data. A common paradigm as it were. Only the Game Boy games don’t.

If ROM-based games load so fast compared to disk-based games, why does Super Mario Bros 1 make you wait on a mostly-black screen before you get to play a given level? Why does Super Mario World? Surely it’s doing more than just sitting idly?

The answer? Besides graphics in the case of later games, it’s converting the level map from one format to another. From a list of objects to a tile map, to be precise. That brick-block-brick-block-brick line we all know and love from SMB1’s world 1-1 for example? Five tiles, but only three objects. First, a brick object set to five tiles wide. Then two separate question block objects that overlap the five bricks. On load, these objects are rendered into a tile map.
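To make that concrete, here's a minimal Python sketch of rendering an object list into a tile map. The tuple layout, the coordinates, and the names are all made up for the example and have nothing to do with the actual byte format any of these games use. The one thing it does show faithfully is that later objects simply overwrite earlier ones.

```python
BRICK, QUESTION, EMPTY = "brick", "question", "."

# (kind, x, y, width), with coordinates invented for the example
objects = [
    (BRICK, 2, 0, 5),      # one object, five tiles wide
    (QUESTION, 3, 0, 1),   # drawn on top of the second brick
    (QUESTION, 5, 0, 1),   # drawn on top of the fourth brick
]

def render(objects, width, height):
    """Render the object list, in order, into a fresh tile map."""
    tile_map = [[EMPTY] * width for _ in range(height)]
    for kind, x, y, w in objects:
        for dx in range(w):
            tile_map[y][x + dx] = kind
    return tile_map

tile_map = render(objects, 9, 1)
print(tile_map[0][2:7])
# ['brick', 'question', 'brick', 'question', 'brick']
```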

(Now, SMB1 didn’t have the space to hold an entire converted level in memory and only had a screen or two at once, which is why you can’t backtrack. So in SMB1’s case, it does in fact sit idly. Thanks to NovaSquirrel for mentioning that.)

While the NES has 8×8 pixel tiles, the map this object list is rendered to has 16×16 pixel tiles. It is what some would call a meta tile map, where each entry itself refers to a different data structure that says “meta tile 2 has this color palette and is built from these four tiles”. That’s the map format a great many games of the era and later use. When an area is about to scroll into view, that tile map is then quickly converted to VRAM-native tiles. And that’s how you can have a set of three coins be defined as one object, yet pick each coin up separately, or have a strip of bricks that you can individually break. And since that alters the big tile map in memory, if you were to backtrack (even though you can’t do that in SMB1, as mentioned, but you can in the later games) those coins and bricks would not reappear.
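A rough sketch of that second step in Python, assuming a meta tile table shaped the way I just described. Every tile number and palette index here is invented, and so are the names. The point is only that picking up a coin changes one entry in the big map, and the hardware-sized tiles follow from that whenever the area gets drawn.

```python
# meta tile id -> (palette, (top-left, top-right, bottom-left, bottom-right))
# All numbers are made up for illustration.
META_TILES = {
    0: (0, (0x24, 0x24, 0x24, 0x24)),   # empty sky
    1: (1, (0x45, 0x45, 0x47, 0x47)),   # brick
    2: (3, (0xA5, 0xA6, 0xA7, 0xA8)),   # coin
}

def expand_row(meta_row):
    """Turn one row of 16x16 meta tiles into two rows of 8x8 tile numbers."""
    top, bottom = [], []
    for meta_id in meta_row:
        _palette, (tl, tr, bl, br) = META_TILES[meta_id]
        top += [tl, tr]
        bottom += [bl, br]
    return top, bottom

level = [[2, 2, 2]]   # one "three coins" object, already rendered to the map
level[0][1] = 0       # pick up the middle coin: only the map entry changes
top, bottom = expand_row(level[0])
```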

Sprite objects come in a separate list, usually after the level geometry, and at least for the “classic” games they are subdivided into pages, about a screen wide. They’re only instantiated when their page is just off-screen, and when they despawn again they aren’t marked as properly dead. Which is why if you knock out a koopa trooper but leave him there, go about a screen away then double back, the trooper will be back in his starting position and perfectly fine. Your leaving that page made him despawn without being marked as properly dead.
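Here's a toy version of that in Python. It's a sketch under my own assumptions: the "within one page of the camera" rule, the field names, and the numbers are all made up, and real games are fussier about distances. What matters is that the level's sprite list is read-only and the live copy is thrown away on despawn.

```python
SPRITE_LIST = [{"kind": "koopa", "page": 3, "x": 52}]  # level data, never modified

active = {}   # index into SPRITE_LIST -> live state

def update_spawns(camera_page):
    """Spawn sprites whose page is near the camera, despawn the rest."""
    for i, entry in enumerate(SPRITE_LIST):
        near = abs(entry["page"] - camera_page) <= 1
        if near and i not in active:
            active[i] = {"x": entry["x"], "state": "walking"}
        elif not near and i in active:
            del active[i]   # gone, and nothing remembers what happened to him

update_spawns(3)                     # the koopa's page comes near: spawn
active[0]["state"] = "knocked out"   # stomp him, leave him lying there
update_spawns(5)                     # wander off: he despawns
update_spawns(3)                     # double back
print(active[0])                     # {'x': 52, 'state': 'walking'}
```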

Now, the Super Mario Land games… they do what they want. SML1 for example subdivides levels into screens, which are lists of strips that can be reused at will. I think the screens themselves can also be reused. And that is then converted to a regular tile map. The original Legend of Zelda used a similar strip-based layout.

I think I remember Super Mario Land 2 used straight-up 16×16 pixel tile maps for its level geometry. Both of these methods are still better than storing several screens worth of tile map in its native size.

Using straight-up tile maps of any resolution is of course a common technique used by many games. As a rule, the larger your levels can be the larger you want your meta tiles to be. Sonic the Hedgehog has positively huge meta tiles, themselves defined in terms of smaller tiles, since your average speed almost requires levels be large to accommodate. And it makes constructing those loops easy as a bonus. Most NES games tend to have 16×16 pixel meta tiles though, because of the attribute map being that size.

Ball Road

The dictionary for AGI and SCI games’ text parser input is stored in alphabetical order. This allows a prefix-based compression:

  • another
  • any
  • appear
  • appearance
  • apple
  • at
  • attack
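As a quick illustration, here's the general idea in Python. To be clear, this is neither the AGI nor the SCI file format, just the trick they have in common: each word is stored as "how many letters to keep from the previous word" plus whatever comes after. The function names are mine.

```python
def compress(words):
    """Store each word as (letters shared with the previous word, the rest)."""
    out, prev = [], ""
    for word in words:
        shared = 0
        while shared < min(len(prev), len(word)) and prev[shared] == word[shared]:
            shared += 1
        out.append((shared, word[shared:]))
        prev = word
    return out

def decompress(entries):
    """Rebuild the word list from (shared, suffix) pairs."""
    words, prev = [], ""
    for shared, suffix in entries:
        prev = prev[:shared] + suffix
        words.append(prev)
    return words

words = ["another", "any", "appear", "appearance", "apple", "at", "attack"]
print(compress(words))
# [(0, 'another'), (2, 'y'), (1, 'ppear'), (6, 'ance'), (3, 'le'), (1, 't'), (2, 'tack')]
```

This only pays off because the list is sorted, so neighbors tend to start the same way.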

Though the formats for the two engines’ dictionaries are completely different, they share this one aspect. Each of these words is then assigned a group number which is then used to store the said specs. I’ve written about that before. The thing is that when you decompile a game script, you can’t tell which synonym from a given group was originally used. And that’s why when you decompile Leisure Suit Larry 2 and look in the scripts regarding really any female character you’ll see them being called bimbos.

(if (or (Said 'call/bimbo,agent') (Said 'get,buy/ticket'))

That is of course because “bimbo” is in the same group (#42) as “woman” and “lady”, but alphabetically comes before them. “Agent” is in its own group (#50) together with various other jobs. You can call this particular woman either by gender or by profession. You can even call her a KGB agent and the game will allow it. By that same token, “call” is in the same group as the “talk” you’d expect to see here (#11).

But the decompiler has little to no idea of these things.

I have recently acquired the full source code for Larry 2, and that shows a slightly different, more sensible word choice:

(if (or (Said 'talk/girl, clerk') (Said 'get, buy/ticket'))

That is of course because these are the actual word groups being used here:

11 42 50

You can see how alphabetical order would mess that up.
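In Python terms, and assuming the decompiler simply takes the first word it finds for a group in the sorted dictionary, it goes something like this. The words and group numbers are the ones mentioned above; the tiny vocabulary and the function name are only there for the example, and the real groups hold plenty more words.

```python
# A sliver of the Larry 2 vocabulary: word -> group number
VOCAB = {
    "call": 11, "talk": 11,
    "bimbo": 42, "girl": 42, "lady": 42, "woman": 42,
    "agent": 50, "clerk": 50,
}

def decompiled_word(group):
    """The alphabetically first word in a group, which is all a decompiler sees."""
    return min(word for word, g in VOCAB.items() if g == group)

compiled = [VOCAB[w] for w in ("talk", "girl", "clerk")]
print(compiled)                                 # [11, 42, 50]
print([decompiled_word(g) for g in compiled])   # ['call', 'bimbo', 'agent']
```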

And by that same token I can now safely say that the debug cheat code in Larry 3 is not in fact “ascot backdrop”.

“Backdrop” in Larry 3 is in group #1063 together with “put”, “drop”, “release”, “set”, “stash”, and various other “put something here” verbs. You know by now how the smart fella who discovered the debug cheat may have gone about it, and how “backdrop” would be the first word in that group. The canonical phrase however, is

Ascot Place

Because of course if I have the Larry 2 code why wouldn’t I have Larry 3 as well?

  ((Said 'ascot/place')
    (^= debugging TRUE)
    (if debugging
      (Print "Hi, Al!")
      (Print "\"Goodbye.\"")

The question is… why is this the debug phrase?

And the answer? It’s a callback to Larry 2:

And just like that this post’s title makes a little sense.

More pronoun problems in Ranma fanfics

(Edited from a Twitter rant in ten parts.)

One thing I find linguistically interesting about Ranma ½ fan fiction, especially most of the more recent works, is that they make a big fucking deal of Ranma’s pronouns. Mind you, these stories are set in Japan, starring Japanese characters, speaking Japanese. It’s all just rendered in English because Internet.

Third-person pronouns in English have genders. He/she and such, you know the ones. Since these stories are written in English, you’ll often find characters refer to Ranma with one pronoun or the other. No problem there, the original manga and anime do it too. But there’s a twist.

There are lots of stories about Ranma being transgender, especially in recent years as far as I’ve seen. Which is totally understandable, really. That’s not the problem. Write about transgender Ranma all you want. The problem, at least to me, is when characters start mentioning how other characters use this or that pronoun to refer to Ranma.

(This of course applies not just to Ranma but to any other character who shares the same curse. Let’s keep it simple, though.)

Worse, for the purpose of this rant, is this one story where Ranma joins a support group for LGBTQ people and the members all introduce themselves and state their pronouns. See, if these are Japanese characters (they are) speaking Japanese (this is implied), and my research is correct (I can only hope), that is literally not a thing they could do.

Where in English it is the third person pronouns that are gendered (he/him, she/her), Japanese has them in the first person. Ano hito, yatsu, and koitsu, those are all gender-neutral. Boku, watashi, and atai, are all gendered. And that’s just a small sample of first person pronouns.

So the very first time someone like Ranma opens his pie hole and speaks of himself, he’ll use whatever pronoun he wants. That’d be ore, a very manly one, as in “ore wa otoko da,” “I’m a guy.” It’s when Ranma uses a feminine pronoun that the eyebrows rise. Mind the phrasing there!

I was reminded of the Twitter rant this post is adapted from by another fanfic I read last night, where Genma caught himself thinking about his recently-cursed child with female pronouns. As in, the English third-person ones. It didn’t do much to damage the scene or anything but I felt mildly distracted by the idea that a Japanese man would think in English terms.

There are in fact fanfics, written in English, where Ranma will say something and maybe there’s something about the phrasing in English, and another character remarks that Ranma used a feminine pronoun, perhaps even saying the pronoun itself in Japanese, in the middle of a story otherwise written in English. Just as an example: “Ranma used atashi just now instead of ore, and he’s not trying to trick Ryōga. Something’s going on here.” Something like that. It’s quite interesting how you might handle this difference.

The episode Am I Pretty comes to mind, where Ranma’s entire way of talking changes right along with his first-person pronoun. I only watched it in Japanese, but I’d imagine the dub only has his way of talking change. If you’ve seen it dubbed, feel free to let me know how they handled it in the comments.

Suffice it to say, as weird as pronouns can get in one language, it gets so much weirder when there’s two in play.

Some personal notes on the case of byuu v. Google

Put simply: if the warnings were to escalate for not being addressed, I could see my entire website potentially blocked, which is something I’ve personally seen happen to a friend of mine who also hosted a completely safe binary: EliteMap, which was an editor for Pokemon games.

— byuu, Google Safe Browsing

That was me. I am the friend. Now, byuu described an issue somewhat different from mine. That article is about the higan multi-console emulator and how its latest release was googleblocked for being an “uncommon” download. No shit, we all cry out, it was just released that day. Now, byuu’s fear of the whole damn site being eventually turned red isn’t unfounded but I’d still like to describe why exactly mine was.

EliteMap was not an uncommon download. I’ve had this very subdomain for years now, and one of the first versions of my site had a custom-made content system with a matching uploader. That uploader happened to use the filebin directory, and I used it to release EliteMap 3.7 way back then. The problem was that DJ Bouché and I made it in Visual Basic 6, and I wanted to keep the download size down a bit so I compressed all the executables that made it up with UPX.

Bad move.

Some antivirus applications were a bit notorious even back then and the contents of elitemap37.zip were considered harmful by over-eager heuristics that thought if a Win32 executable is UPX-compressed, it must be hiding something. Oh well, no biggie. Just put up a note saying Avast sucks and get on with it, right?

I’ve replaced my site several times since then, though I still have backups. The one constant that I never removed for long? filebin/elitemap37.zip and filebin/sappy12.exe. I did that once, and soon enough people contacted me about it. So I put it back. History demanded it.

Then some time back, Google struck. I got word on the search console that elitemap37.zip was considered harmful and should be removed. I appealed, stating that the file is perfectly fine and AV are being silly about it… and it wasn’t long until my entire site was flagged and every page you’d try to open would turn the whole damn window red with danger. Twice, even.

So I heaved a heavy sigh and finally, after what might be ten years or more, deleted the damn file. And Sappy 1.2 along with it just in case. Sent word back to Google about having “fixed” it, and the red flag was lifted.

Happy new year to all of you, and a hearty fuck you to Google 🖕

Priorities Revisited

I just spent way too long drawing little hexagons. Here’s why.

As you know by now, SCI0 up to SCI11 use vector-based priority and control screens, whereas SCI2 introduced bitmap-based priority screens and removed the control screen entirely. That means when I render a background for The Dating Pool, I have to trace out everything you can stand behind, by hand, as vectors. And a while back I replaced the simple railing on the space station scenes with hexagons, so I had to re-vector the priority screen to match.

I didn’t draw any hexagons where the booths are because you can’t stand close enough for it to matter. Saves a lotta work for me. That’s still a lot of Line and Fill commands though. And that’s just one screen up there. There’s three.

Here’s what it’d look like if we targeted SCI2, hypothetically:

Much easier to produce, perhaps. I could just load it into my editor of choice, select everything that’d be closer, and cut that out onto a new layer. Rinse and repeat. And despite appearances here, the priority layers wouldn’t all be the full screen size either. They’d actually only be… 320×17 pixels for the one and 320×22 pixels for the other. Could be even smaller if I cut the booth layer into three separate pieces, perhaps.

Note that the background layer is completely missing the colors from the railing and booth layers. Why compress that twice or more after all?

Shampoo, Cologne, Mousse

(Edited from a Twitter rant in nine parts.)

I’ve complained about Shampoo’s name and how fanfic writers tend to write her “original Chinese name” often enough. I’d like to discuss Cologne and Mousse this time.

Now then. Knowing that Shampoo is the only one of the three with a name in actual Chinese characters, Cologne and Mousse are only ever written in katakana: コロン and ムース. Koron and Mūsu. Simple, right? That’s basically exactly how the products would be pronounced, just like with Shampoo.

Problem #1: Cologne’s fandom name is usually written as Khu Lon. Sometimes I think without the H. First of all, I can find no romanization scheme where khu is a valid sound, written that way. There’s khu but that’s in a scheme that’s not even used, and kuh isn’t quite it either. Second, it seems to me to be simply the wrong sound. The vowel is all wrong!

Problem #2: Mousse is usually given the name Mu Tsu, Mu Tse, and in this one doorstopper I’m slogging through, Mse Tsu. I already covered how absurdly wrong that third one is. If these were at all right, wouldn’t his name be written ムー, tsu? But it’s not. There is no T sound in either his established katakana writing or in his name as spoken.

Shampoo’s name, as covered before, is shanpū in Japanese, with a lengthened -u. It’s shānpū in Chinese, where I believe the accent marks denote tone? Correct me on that if I’m wrong. So at worst the length of the -u is different. Likewise, Ranma’s name “translates” to Chinese as luanma. Exactly the same kanji (乱馬) and all that. Readings are cool like that. Between the well-known L/R difficulty and an easily drowned out u, that’s also basically the same when pronounced.

Why then, in the name of logic, should Cologne and Mousse have their names so different? At least back when they started, fanfic authors had no way to look this shit up. There was no Wiktionary or such back then. You really have no excuse now. You can do better than this. If you were to tell me “but those are their names”, you basically admit to being both uninformed and too lazy. “That’s how we’ve always called it in the fandom” is arguably better, but again you can do better.

Since we can look shit up, let’s actually look this shit up! What is “eau de cologne” in Mandarin Chinese? It’s 科隆香水, kēlóng xiāngshuǐ. “Mousse” as in the hair product? That’s 摩絲, mósī.

Shānpū, Kēlóng, Mósī. Research over.

But wait. Kēlóng? Ke? Didn’t I say earlier that the vowel is all wrong? I did. It’s a romanization difference. The zhuyin is ㄎㄜ. I gave their names just now in hanyu pinyin. In Wade-Giles, it’s written with an o. Simple. Mousse’s ㄇㄛ is mo in both.

Just for fun, I thought I’d look up some common fanmazon names, but reconsidered when I found Chinese has a perfectly good word for perfume. Oh well, I’d made my point already 🤷‍♀️

More regarding Interrupt 21

Last time I explained how your standard file rename function as seen in MS-DOS worked. You’d set up two CPU registers with pointers to the old and new names, set AH to 0x56, and call Int 0x21. Easy, right? And then I went into detail on how malformed inputs were handled. They weren’t handled too well, and DOSBox does it differently from MS-DOS on top of that.

But what if we had a file system and rename function that did support spaces? Maybe more than eight characters, even? In mixed case?

That is of course VFAT, an extension to regular FAT16 available in Windows 95, NT 3.5, and later. With a VFAT driver, most of the old file operations available from Int 0x21 had counterparts installed that generally took the same arguments and had the same numbers, but accepted long filenames.

So to rename a file with long filename support, you’d do exactly what you’d do before but instead of setting AH to 0x56 you’d set AX to 0x7156. Assuming Windows is running and we use the same inputs as last time, your file will now be named hello world.txt. And that’s all it takes, even if it’s a pure DOS program doing it.

Which raises a question. How do you make a pure DOS program that handles files that may have long names, may be run from Windows, and should not drop any of those long names if it is in fact running in Windows? Well, it turns out all those LFN functions, the ones starting with 0x71, reset AX to 0x7100 if they’re not installed. A trick of the system, I suppose. So what you could do for your LFN-enabled rename function is try to use 0x7156, see if AX has reset to 0x7100, and if it has, you try again with AH set to 0x56. In other words, it’s time to bring back the rename function from SCI11… or rather a branch of SCI11+ that I’ve been working on.

rename	proc	oldName:ptr byte, newName:ptr byte
	mov	dx, oldName	; ds:dx = old name
	push	ds
	pop	es
	mov	di, newName	; es:di = new name
	mov	ax, 7156h	; LFN Rename
	int	21h
	.if	ax == 7100h	; LFN failed, try DOS 2.0 version
		mov	ah, 56h
		int	21h
	.endif
	.if	carry?		; report failure as zero
		xor	ax, ax
	.endif
	ret
rename	endp

It’s that easy. Of course, this is old-school MASM code which has some nice things like .if but that’s just sugar to avoid having to write compares and branches — the concept should be clear enough. An attempt to rename a file to Introduction.txt will result in exactly that on Windows, or transparently collapse to introduc.txt on plain DOS.

Note that in the actual SCI11+ code, if you’re crazy enough to look it up, there’s an extra function I made that’s called right before the DOS 2.0 rename call that replaces all spaces with underscores, which renders them about 100% not as confusing and untouchable as the one shown last time. I left that part out for brevity.
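If MASM isn't your thing, here's the same control flow as a Python sketch, underscore step included. Obviously Python can't call Int 0x21, so the int21 argument is a made-up stand-in that takes AX plus the two names and returns AX and the carry flag. The fake plain-DOS handler below it is equally hypothetical and only exists so the thing can be run.

```python
LFN_RENAME = 0x7156
LFN_UNSUPPORTED = 0x7100

def rename(int21, old_name, new_name):
    """Try the LFN rename first, fall back to the DOS 2.0 one. True on success."""
    ax, carry = int21(LFN_RENAME, old_name, new_name)
    if ax == LFN_UNSUPPORTED:
        ax = (0x56 << 8) | (ax & 0xFF)           # mov ah, 56h
        new_name = new_name.replace(" ", "_")    # the step I left out above
        ax, carry = int21(ax, old_name, new_name)
    return not carry

calls = []

def fake_plain_dos(ax, old_name, new_name):
    """Hypothetical plain DOS: anything in the 0x71xx range isn't installed."""
    if ax >> 8 == 0x71:
        return LFN_UNSUPPORTED, True
    calls.append((ax, old_name, new_name))
    return 0, False

print(rename(fake_plain_dos, "a.txt", "hello world.txt"))   # True
print(calls)   # [(22016, 'a.txt', 'hello_world.txt')]
```

22016 is 0x5600, so the second call really is the old rename.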

