Logo Pending


Snappy and Clear

(This was originally posted as a Twitter thread.)

I’d like to talk about bad words for a moment. Specifically, words used as tags and such for porn site content. So there’s your content warning right there.

“Dickgirl”. (pause for audience gasp) There are those who consider this word to be a Bad Word™. I respectfully disagree. I think this word is snappy and clear, like a content tag word ought to be.

(Now, I’d like to interrupt this repost to clarify that this is not about how it’s somehow not bad. It’s about how on another level it’s good. This was never meant to convince anyone otherwise, as we all know this to be impossible. With that in mind, back to the repost.)

Imagine, if you will, a fresh-faced pervert’s first go on the Internet. Our freshly-hatched pervert finds a porn site with a comprehensive tagging system, that includes hentai and its related genres and tropes. He finds an image set or a comic or such tagged “dickgirl, bukkake” among other things. What do you suppose this person, who has never seen these two words before (I know, incredible), expects to find upon reading these tags and opening the comic?

At least one chick with a dick, and the other thing is a surprise.

Y’see, when you see the word “dickgirl”, there’s only a few things you can take it to possibly mean, and only one or two make enough sense to likely be correct. It refers, of course, to a girl with a dick. It’s a hentai comic, it’s allowed to have weird shit okay?

But yeah. Two short syllables, each a perfectly clear word on their own, and there’s very little doubt as to what it means. The opposite, “cuntboy”, is exactly the same in all regards. No need to repeat myself there. “Bukkake” on the other hand… oh boy. Oooh boy! First of all, it’s Japanese. Our (I’ll remind you) uninformed perv doesn’t speak the language, so he has nothing to go on. Second of all, it’s got a non-sexy meaning too, that came earlier. Something about pouring water on your noodles from higher up? It’s not that snappy, and it’s certainly unclear as all get out. If our perv were Japanese or otherwise familiar with the other meaning, he might justifiably think “oh, the dickgirls are gonna eat noodles afterwards.” Boy is he in for a surprise!

You could say the same about “futanari”. Again, Japanese. Means, roughly, “two forms”. And frankly, reading that I’d sooner think of giant fighting robots that turn into jet planes and such than a girl with too many bits down her panties.

And please, don’t even think of suggesting “pre-op transwoman”. It’s not at all snappy, and it’s frankly too specific. Surprising, I know. Not all female-presenting dick-having characters in these stories are trans, okay? Like 99% aren’t!

(“you could also argue that while it’s definitely a bad idea to call real-life trans women by porn terms, it’s probably a lesser bad idea to call porn by the current words used by trans women, in that it may lead for those terms to become sexualized/pornographic.” “Exactly why you just keep calling dickgirl porn dickgirl porn.”)

Anyway thanks for listening to my TED talk. I’m Kawa, hobbyist linguist, and all-round lazy bastard.


AGI versus SCI – A Comparison

AGI, which Sierra used up until around the release of King’s Quest IV – The Perils of Rosella, was a strictly linear scripting language with a fairly simple bytecode. Much simpler than SCI’s object-oriented virtual machine. But how do they compare?

Inspired by this one particular book that’s ostensibly about SCI but contains only AGI snippets from what I’ve seen, here’s the first playable room in King’s Quest IV, and how it’s initialized. I’ve left out a bunch of stuff for clarity. Because it’s so simple, AGI bytecode is pretty easy to decompile:

if (justEntered)
{
  set.horizon(84);
 
  if (!nightTime) { v152 = 1; }
  else { v152 = 101; }
  load.pic(v152);
  draw.pic(v152);
  discard.pic(v152);
 
  //Place Ego according to the previous room
  if (lastRoom == 1) { position(ego, 141, 82); }
  if (lastRoom == 2) { position(ego, 107, 82); }
  if (lastRoom == 9) { position(ego, 96, 82); }
  if (lastRoom == 10) { position(ego, 80, 82); }
  if ((lastRoom == 11 || lastRoom == 15)) { position(ego, 70, 82); }
  if ((lastRoom == 12 || lastRoom == 14)) { position(ego, 60, 82); }
 
  //Add some waves in the water.
  animate.obj(o3);
  load.view(55);
  set.view(o3, 55);
  set.loop(o3, 4);
  set.priority(o3, 5);
  ignore.objs(o3);
  cycle.time(o3, 3);
  position(o3, 64, 152);
  draw(o3);
 
  draw(ego);
  show.pic();
}

Hmm. Compare that to its SCI equivalent. SCI bytecode is much harder to decompile. You can disassemble it, but until SCI Companion came out it wasn’t possible to decompile it. Still here we are:

(instance Room1 of Room
  (properties
    picture 1
    north 25
    south 7
    west 31
    east 2
    horizon 100
  )
 
  (method (init)
    (if gNightTime (= picture 101))
    (super init:)
    (self setRegions: 503 501 504 506)
    (wave1
      isExtra: 1
      view: 665
      loop: 0
      cel: 0
      posn: 203 76
      setPri: 0
      ignoreActors:
      cycleSpeed: 3
      init:
    )
    ; Other waves left out for clarity
    (waves add: wave1 wave2 wave3)
    (wave1 setScript: waveActions)
 
    ; This part is simplified significantly.
    (switch gPreviousRoom
      (south (gEgo posn: 225 188))
      (north (gEgo x: 225 y: (+ horizon 1)))
      (0 (gEgo x: 220 y: 135))
      (east (gEgo x: 318)) ; Y stays the same.
    )
    (gEgo init:)
  )
)

You might notice that there’s no equivalent to draw.pic and discard.pic and such. The Room class handles that by itself the moment Room1 calls (super init:).
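If the SCI syntax gets in the way, here's a rough Python analogy of that last point. It's only a sketch: the class and attribute names are made up for illustration, and the base class just records which picture it would have drawn, which is nowhere near what Sierra's actual Room class does.

```python
class Room:
    """Hypothetical stand-in for the SCI Room base class."""
    picture = 0
    horizon = 0

    def __init__(self):
        self.drawn = []  # pictures "drawn" so far, for demonstration only

    def init(self):
        # Stand-in for what AGI spells out as load.pic / draw.pic / discard.pic.
        self.drawn.append(self.picture)


class Room1(Room):
    """The room only declares what is different about it."""
    picture = 1
    horizon = 100

    def __init__(self, night_time=False):
        super().__init__()
        self.night_time = night_time

    def init(self):
        # Pick the night picture first, then let the base class do the drawing.
        if self.night_time:
            self.picture = 101
        super().init()


day = Room1()
day.init()
night = Room1(night_time=True)
night.init()
```

The room never touches the drawing itself; it tweaks a property and hands over to the base class, which is the same division of labour as in the SCI listing.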


Going in-depth on SCI map files

One thing that stood out to me when I looked at my collection of Sierra SCI games is that basically all the 16-color and early 256-color games (that is, SCI0, SCI01, SCI1) have multiple resource volumes, but the SCI11 games do not. That's not because a lot of them came on CD, either: the diskette versions, once installed, had a single resource volume too.

Why do you suppose that is? Why do the diskettes for SCI11 games contain a single resource.000 file that’s been split across them all for the install script to merge back together on the hard drive?

Because the map format doesn’t allow it.

The resource.map file specifies which resources can be found on which disks. For SCI0, it’s a list of six-byte entries. The first two bytes encode the type (upper five bits) and number (lower 11 bits); the next four hold the volume number (upper six bits) and the absolute offset in the volume (the remaining 26 bits). If a resource appears on multiple disks, it’s listed once for every appearance. In SCI01, there are only four bits for the volume number, trading the number of volumes for a larger offset range.
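To make the bit-twiddling concrete, here's a minimal Python sketch of unpacking one such six-byte entry. The function name and the volume_bits parameter are mine, and I'm assuming both fields are stored little-endian, as you'd expect on a DOS machine. Treat it as an illustration, not a reference implementation.

```python
import struct


def decode_sci0_entry(entry, volume_bits=6):
    """Unpack one six-byte map entry.

    volume_bits is 6 for SCI0 and 4 for SCI01, going by the description above.
    Little-endian byte order is an assumption.
    """
    ident, location = struct.unpack("<HI", entry)
    res_type = ident >> 11        # upper five bits
    number = ident & 0x07FF       # lower 11 bits
    offset_bits = 32 - volume_bits
    volume = location >> offset_bits
    offset = location & ((1 << offset_bits) - 1)
    return res_type, number, volume, offset


# A made-up entry: type 4, number 5, volume 2, offset 0x1234.
sample = struct.pack("<HI", (4 << 11) | 5, (2 << 26) | 0x1234)
decoded = decode_sci0_entry(sample)
```

The same function covers SCI01 by passing volume_bits=4, which is really the only difference between the two layouts.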

In the later versions, the map file starts off with a list of offsets for each type listed in the map. With this list and a contract that the map entries are sorted by type, the interpreter can look up a given type of resource much faster. Since we already know that this part of the list only contains resources of a given type, we can use the full 16-bit range for the numbers. In SCI1, the next four bytes are just like in SCI01. In SCI11, however, there are only three bytes, and there’s no volume number. Then in SCI2 it’s a straight-up plain 32-bit offset value.

The trick with SCI11 having only a 24-bit range for its offsets is that the stored value is shifted left by one bit to get the real offset. Resources must be aligned on a two-byte boundary, so the offset range is effectively doubled, from 16 MB to 32 MB.

As a practical example, let’s look at The Dating Pool. If I open its resource.map in my favorite hex editor and try to look up the second view, it’d go like this.

The first few bytes are 80 2B 00 81 EE 00 82 BB 01. We know the format, so this means that views (type 80) start at offset 002B and background pictures (81) start at offset 00EE, which means the views end there. Skipping ahead to the given offset, we see this: 00 00 00 00 00 05 00 02 26 00. Each entry is five bytes: a two-byte number followed by a three-byte offset, both little-endian. Obviously the first resource (view 0) is at offset zero. Splitting this up into handy chunks, 0000 000000 0005 002602, we see that view 5 (which is the second one in the game data) is at offset 2602 << 1 = 4C04. Open up resource.000 and go there, and the first thing we see is confirmation: 80 05 00 36 44 36 44 00 00. View number 5, 4436 (hex) bytes in size both unpacked and packed, no compression, followed by the actual view data. Then there’s a single null byte for padding, and view 10 begins.
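The same lookup can be done in a few lines of Python, using nothing but the bytes quoted above. This is a sketch under my own naming. In particular, I can't tell from this example which of the two size fields is the packed one, since they're equal, so they're just called size_a and size_b.

```python
import struct


def parse_type_directory(header):
    """Split the leading directory into (type, offset) pairs, three bytes each."""
    return [struct.unpack_from("<BH", header, i)
            for i in range(0, len(header) - 2, 3)]


def decode_sci11_entry(entry):
    """Decode a five-byte entry: 16-bit number, then a 24-bit stored offset."""
    number, low, high = struct.unpack("<HHB", entry)
    stored = low | (high << 16)
    return number, stored << 1  # real offset is the stored value times two


def decode_volume_header(raw):
    """Decode the nine bytes found in front of the resource data."""
    res_type, number, size_a, size_b, compression = struct.unpack("<BHHHH", raw)
    return res_type, number, size_a, size_b, compression


directory = parse_type_directory(bytes.fromhex("80 2B 00 81 EE 00 82 BB 01"))
entries = bytes.fromhex("00 00 00 00 00 05 00 02 26 00")
first = decode_sci11_entry(entries[0:5])
second = decode_sci11_entry(entries[5:10])
in_volume = decode_volume_header(bytes.fromhex("80 05 00 36 44 36 44 00 00"))
```

Here second comes out as view 5 at offset 0x4C04, matching the manual walk through the hex editor.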


Platforms: a follow-up

The good news: it turns out I have an older copy of that C#/XNA platform game. Possibly the very same version in the screenshot.

That’s this one, for the record:

It’s so old, it’s made in VS 2008 and has a particle effect renderer that I scrapped after it wouldn’t upgrade to 2010.

And after testing it for a bit, I couldn’t find any in-your-face game breaking problems with the collision detection. It could stand to be perhaps a bit slower to better match the scale of things and fix the part where you can “hover” on the ceiling and…

Ah, right. That part. The part where you can’t actually go further to the left without jumping, and if you stand on the left half, face right, and then jump straight up you skip to the right.

I remember now.


Animation Shop

December 23rd last year, I was working on some example material for my seqmaker tool, a converter/player that turns image sequences into SEQ files that SCI11 can play. The DOS versions of King’s Quest 6 and Gabriel Knight 1 used them extensively. At first I wanted to use a GIF sequence from Prince of Persia, the original one, because it has very few colors (32) and the only animated elements in that sequence are the Prince and a touchplate.

The only tool I had available to me to manipulate the GIF was ffmpeg. It wasn’t very helpful as after I’d made the SEQ file and watched it in examine mode, which blanks out everything outside of the changed region, I found that there was very subtle dithering.

One would expect, when the Prince starts to turn around, that he’d be the only thing stored, much like this:

“No. Fuck you and the horse you dithered in on.”

This is simply not acceptable!

To think I’d already spent too long manually undithering the solid black background! So in the end I went with the first shot from Hotel Mario instead. Worked out great.

Now, years ago I used to have a program called Jasc Animation Shop. By the creators of Paint Shop Pro, which I use to this day. The old 8.0 version from Jasc, before Corel bent it over. Importantly, Animation Shop was meant for GIFs, but also supported AVI and could import from sequences. Even more importantlier, it could export to GIF with full quality control. Or at least full enough to ensure it wouldn’t add dither noise. Today I found a copy and tried the thing again for a laugh.

Beautiful.


Platform issues

Just about ten years ago, give or take about twenty days, I posted a very happy if short post on my old blog. It was a happy little song accompanying this screenshot:

C:\>If you're happy and you know it, syntax error
Syntax error.
C:\>If you're happy and you know it, syntax error
Syntax error.
C:\>If you're happy and you know it and you really want to show it if you're happy and you know it, syntax error
Syntax error.

You can tell it’s fuckin’ old cos it’s the Watercolor theme I used to have in Windows XP. And the level editor is made in Visual Basic 6.

I’d added things like scrolling and even moving platforms later on. I think the main character’d been replaced by an edited Mega Man ZX Aile by then? And I was so happy with my progress.

Then I somehow lost all the code.

I still have other projects from that time period, like a Touhou clone that featured Cirno as the player years before Fairy Wars came out, but not this. So weird, and such a shame.

Over time, I’d try again. By then I’d switched to C#/XNA and I got pretty far I liked to think.

I’m not sure how, but things went downhill fast once I tried to fix a bunch of obvious problems. I can’t even remember how it all happened. At least this one I still have the code for… albeit sorely broken code. I think this one copy I have around here has no collision detection at all, and this other one is totally broken. I think this screenshot is from just before I tried to implement slopes and found the first obvious wrong thing.

I never did get slopes in either game.

I feel like I should either take this and try to fix it, or take a step back and try it in C/SDL again. But either way, this fear of failure — I’m not even sure if that’s what it is but whatever — is holding me back from trying. I’ve fucked it up too often, maybe.

…but damn that was a good theme.


What even is an adventure game?

That’s a question SCI doesn’t even bother asking.

Weird, I know, considering it’s an adventure game engine.

Or is it? Turns out it’s really not. The “adventure game” part is almost entirely a matter of scripts.

In SCI proper, there is no concept of a player, of a room, of basically anything concrete. The engine only cares about a few things: a handful of data types like List, which it provides function calls to work with; that the Cast list’s contents are View objects in a duck-typing sense; and that this list is global variable #5. Also, script zero, export zero is the starting point. From there, you’re mostly on your own.

A room in SCI is just a particular kind of object that sets up the things you can find inside, handles room-specific inputs, and… that’s basically it. The SCI engine doesn’t even know the difference between an abstract View (some possibly-animated non-background element), an atmospheric effect, an NPC, or the player character. Note that all of those things are still View, in a class-inheritance sense.

The biggest blow to the idea that SCI is an adventure game engine has to be the board games though. Between Jones in the Fast Lane and the Hoyle series, I don’t think anyone could claim otherwise for long.

Contrast that with a certain other engine or two.


Son of a Submariner

This was originally posted on my old blog, around February 23rd 2009.

Looking at the old Squaresoft RPGs on the Super NES, anybody who even came close to a classic Mac might’ve noticed that the font used in most (if not all) of these RPGs is remarkably, but not quite, similar to the Mac’s system font, Chicago.

The top half is taken straight from a screenshot of Final Fantasy 3 (US) and the bottom is a quick imitation in Chicago.

One of the first things you may notice is that the Square font is a little shorter and has wider spacing. Here are the metrics for both:

Metric        Square  Chicago  Notes
X-height      6       7        The height of a lowercase X, green block in the image.
Ascent        8       9        The distance between the baseline and the top of the glyph that reaches farthest from the baseline, blue block in the image.
Descent       3       3        The distance between the baseline and the bottom of the glyph that reaches farthest down, magenta block in the image.
Body size     11      12       Total of ascent and descent (the x-height is already part of the ascent).
Line spacing  4       4        Distance between lines, purple block in the image.
Kerning       2       1        Distance between characters. Square’s font does not do ligatures.
Space width   7       6        Width of the space character.

These metrics do not include the drop shadow. Some shapes are different, but if you take Chicago and make it one pixel shorter, you could really fool the SNES fans.

Note that the GBA remakes have their own font(s) with a much lower weight than Chicago, allowing more text on fewer lines and a smaller screen.

With all this, can you really blame me for using Chicago whenever I feel like making jokes about Squaresoft SNES RPGs?


Exodus

Not Ultima or even God forbid the Bible. I mean of course the one following the Tumblr porn ban. I’ve mentioned it on Discord once or twice but when you’re using a service like Tumblr to host and serve your content, a service provided by a company you can’t meaningfully control, you run the risk. The service will eventually cease to be (GeoCities says hello), the company decides to make one too many dumb functionality changes, or as happened now the advertisers force the company’s hand and they do the most fucktarded thing imaginable, in the worst way.

But to say you’ll just jump ship from the one company’s service to another? That’s just sitting and waiting for the other service to take a turn fucking up.

That’s why you sigh, fork over a tenner a month, if even that much, and get your own damn server. Do what you want, host your own blog and all its content. Use the services, untrustworthy as they are, to announce the content.


On fonts

SCI, being rooted in MS-DOS and from a time before Unicode (fun fact, the first draft proposal dates back to 1988, when King’s Quest 4 came out as the first SCI game), uses an 8-bit string format. That is, each character in a string is one byte, and that’s all it can be. That makes strings one of very few standard data types in SCI that aren’t 16 bits wide, and they require a dedicated kernel call to manipulate (as seen in KQ4 Copy Protection), but that’s not the point here.

American releases of SCI games would normally have font resources ranging only up to 128 characters, with the Sierra logo at 0x01 and ~ at 0x7E. The interpreter only cares about newlines; all other characters are considered printable. European releases would usually include not 256 but 226 characters, up to ß at 0xE1, basically copying code page 437 but leaving out the graphical elements among others. This means, of course, that a Russian translation of such a game would require another custom font copying code page 866 instead.

And then there’s the whole thing where SCI Companion uses the Win-1252 code page (it’s not exactly a Unicode application) which makes translated games look pretty wild:

Ich glaube Dir gern, daá Du das tun m”chtest!

That doesn’t look quite right. That’s supposed to be “Ich glaube Dir gern, daß Du das tun möchtest!” And indeed, comparing things between DOS-437 and Win-1252, we see that á and ß are both encoded in the same byte value.
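You can reproduce the mix-up with Python's built-in codecs, which is a handy way to check any other suspicious character too. This is just a demonstration of the code page mismatch, not anything SCI Companion itself does internally.

```python
original = "Ich glaube Dir gern, daß Du das tun möchtest!"

# What the game data holds: one byte per character, code page 437.
stored = original.encode("cp437")

# What you get when those same bytes are read as Windows-1252.
garbled = stored.decode("cp1252")

print(garbled)
```

The ß lands on byte 0xE1 and the ö on 0x94; Win-1252 reads those as á and a closing curly quote, respectively.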

That’s the kind of bullshit Unicode was made for, isn’t it?

So what I did for my SCI11+ project, of which one version is used in The Dating Pool, is to add optional basic Unicode support, so you can write text data in UTF-8 and not have to worry about things all that much. There are however two major problems with this idea. One of them is that SCI Companion is not a Unicode-aware program, so you can’t use that to write the text data. That’s easily solved with external resource editors that are. The second is more insidious — the fonts.

What I found out about SCI font resources is this: their header fields are way too wide.

typedef struct
{
  word lowChar;
  word highChar;
  word pointSize;
  word charRecs[0];
} Font;

lowChar is always zero, but the interpreter does acknowledge it. highChar is, as discussed above, always 128, 226, or 256. The fact that it’s exclusive is basically the only reason to have it not be a char-type value.

.if bx < es:[si].highChar && bx >= es:[si].lowChar

See? Exclusive. But the takeaway here is obvious. SCI font files can contain up to 65,535 characters. That’s enough to cover the Basic Multilingual Plane. As such, I’ve added handling of double- and triple-byte UTF-8 sequences to SCI11+. I’ve tested it, too:

Switching to another, non-extended font, I expected to see "Tend ", and that’s exactly what I got. The routine I linked to would decode an ō and dutifully pass it on down to StdChar, which would see that 0x14D is way higher than 226 and simply draw a blank.

(Now, between the first draft of this post and its publication, I’ve further enhanced this system to not decode anything if the font has fewer than 256 characters, falling back to code page 437 or whatever, just not doing anything special.)
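For the curious, the whole idea boils down to something like the following Python sketch. It is not the actual SCI11+ code (that lives in the interpreter, not in Python); the function names are mine, little-endian header words are an assumption, and the decoder does no validation of continuation bytes.

```python
import struct


def parse_font_header(data):
    """Read lowChar, highChar and pointSize from the start of a font resource."""
    return struct.unpack_from("<HHH", data, 0)


def has_glyph(code, low_char, high_char):
    """Same test as the interpreter: lowChar inclusive, highChar exclusive."""
    return low_char <= code < high_char


def decode_text(raw, low_char, high_char):
    """Turn raw string bytes into character codes for the given font."""
    if high_char - low_char < 256:
        return list(raw)  # small font: leave the bytes alone
    codes = []
    i = 0
    while i < len(raw):
        b = raw[i]
        if (b & 0xE0) == 0xC0 and i + 1 < len(raw):
            # Two-byte sequence: 110xxxxx 10xxxxxx
            code, size = ((b & 0x1F) << 6) | (raw[i + 1] & 0x3F), 2
        elif (b & 0xF0) == 0xE0 and i + 2 < len(raw):
            # Three-byte sequence: 1110xxxx 10xxxxxx 10xxxxxx
            code = ((b & 0x0F) << 12) | ((raw[i + 1] & 0x3F) << 6) | (raw[i + 2] & 0x3F)
            size = 3
        else:
            code, size = b, 1  # ASCII, or anything we don't understand
        codes.append(code)
        i += size
    return codes


text = "Tendō".encode("utf-8")
extended = decode_text(text, 0, 0x250)  # hypothetical extended font
plain = decode_text(text, 0, 226)       # European-style font
```

With the made-up 0x250-character font the ō comes out as a single code, 0x14D; with the 226-character font the bytes pass through untouched and it's up to the code page to make sense of them.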

That leaves one last issue, which is mostly a matter of wasted space. I like my quotation marks to be proper curly, and in Win-1252 as The Dating Pool uses (because why shouldn’t it?) this is easy: just draw a “ and ” in the font at 0x93 and 0x94, and be done with it. But in Unicode, these two characters are part of the General Punctuation block, which starts all the way at U+2000. That would mean defining up to that many dummy characters, each costing a two-byte pointer, two size bytes, and a single byte with at best one bit set.

That’s bullshit.

As such, I’d propose to cheat like hell and move the General Punctuation block so it covers the much earlier Combining Diacritical Marks block. It’d be way too much of a nightmare to support those. So while measuring and drawing, detect if you’re in the 0x2000 to 0x206F range, subtract 0x2000, add 0x300, and use that character instead. Or have the custom resource tools that we’d need anyway do it.
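The remapping itself is tiny. A Python sketch, with a function name I made up, would look like this:

```python
GENERAL_PUNCTUATION = range(0x2000, 0x2070)  # U+2000 through U+206F
COMBINING_MARKS_BASE = 0x300                 # where the block gets moved to


def remap_punctuation(code):
    """Fold General Punctuation down onto the Combining Diacritical Marks slots."""
    if code in GENERAL_PUNCTUATION:
        return code - 0x2000 + COMBINING_MARKS_BASE
    return code


left_quote = remap_punctuation(0x201C)   # opening curly quote
right_quote = remap_punctuation(0x201D)  # closing curly quote
```

Both blocks happen to be 0x70 code points long, so the one fits exactly over the other and nothing else gets displaced.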

(Again, just before publication, I came up with an idea to have a new font generation tool that takes a bitmap of the font and converts it. The trick for space-saving is that it would recognize graphics it’d already processed and simply place a pointer to the first one. Instead of five bytes per dummy, it’d use only two for all but the first. Savings!)
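The space-saving part is easy to sketch in Python. To be clear, this is not the tool itself, and the layout here (a table of two-byte pointers followed by width, height and bitmap for each distinct glyph, with pointers relative to the start of the glyph data) is a simplification I'm assuming for the sake of the example.

```python
import struct


def pack_glyphs(glyphs):
    """Pack (width, height, bitmap) tuples, storing identical glyphs only once."""
    seen = {}           # record bytes -> offset of its first occurrence
    pointers = []
    blob = bytearray()
    for width, height, bitmap in glyphs:
        record = bytes([width, height]) + bitmap
        if record not in seen:
            seen[record] = len(blob)
            blob += record
        pointers.append(seen[record])
    table = struct.pack("<%dH" % len(pointers), *pointers)
    return table, bytes(blob)


# A hundred identical one-pixel dummies, as in the wasted-space example.
dummy = (1, 1, b"\x00")
table, blob = pack_glyphs([dummy] * 100)
naive_size = 100 * (2 + len(bytes([1, 1]) + dummy[2]))
packed_size = len(table) + len(blob)
```

For those hundred dummies that's 203 bytes instead of 500: five for the first, two for every one after it.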

(Update: I made the thing.)

At any rate, your input is appreciated.

Except for yours, Covarr.
