Logo Pending

String literals

In various programming languages, different quotation marks and such can mean different things. In C/C++ for example, "this" is considered a plain string literal. It can contain various escape codes (\x69, \n, et al), and escaped double quotes (\"), but not raw newlines. A single-quoted literal is not a string at all, but a character literal. In C# meanwhile we have the plain double-quoted string and single-quoted character but also @"this", a variation that does allow raw newlines at the cost of not allowing escapes. In PHP meanwhile, we have double-quoted strings that, in contrast to C, can have raw newlines but also have variable interpolation — "Hi $name" will appear as Hi Mark, assuming that is that variable’s value at the time. Single-quoted strings in PHP don’t do escape sequences or interpolation, and then there’s “heredoc” strings.

SCI also has different types of string literals. Or rather, had, depending on which version you targeted. Double-quoted strings could have escape codes (\42, note the lack of a letter, and of course \n) and raw newlines, but whitespace was folded away on compilation so raw newlines and tabs were entirely for code readability’s sake, requiring a \n at the end of each line that had to have a line break. You could also use curly braces for strings, {like this}, that had exactly the same rules and limitations as double-quotes, except for one difference in storage.

Any string literal in curly braces would be stored as-is in the script resource, while double-quote strings would be stored in a matching text resource and replaced in the compiled code with a look-up key:

(Print "This is an example" #title {Kawa says})

(Yes, I’m aware that the syntax highlighter doesn’t pick up on the braces.)

Assuming this is script #42 just as an example, and this is the first place a double-quoted string appears, the above would be transformed like so:

(Print 42 0 #title {Kawa says})

The original string will be stored in a separate text resource with the same number as the script. This helps cut down memory use.

(Bonus banter: there’s a bug in the original SCI interpreters that was introduced when they added the Message resource format involving hexadecimal numbers where they accidentally used "01234567890ABCDEF", with an extra zero. This messes up any attempt to use a good third of the character set in a message resource, but not an inline string literal, so having \0E in a string literal will produce the intended while the same thing in a message will produce ¤ instead. In SCI11+, this has been corrected to just "0123456789ABCDEF".)

[ ] 2 Comments on String literals

Tracing in SCI2

Where 16-bit versions of SCI had two blank spaces in their PMachine instruction set, SCI2 introduced the _line_ and _file_ instructions, which the compiler could inject into the final bytecode so that the built-in debugger could then work out exactly which source code line matched the current instruction. Here’s how that goes down in practice.

First we make two test scripts, 42.SC and 69.SC:

(script# 42)
	Test 1
(procedure (Test)
	(Display "This is Test, (42 1).")
(script# 69)
	TestA 0
	TestB 1
	Test 42 1
(procedure (TestA)
	(Display "This is TestA, about to call TestB.")
	(Display "Back in TestA, gonna call (42 1).")
	(Display "Back in TestA.")
(procedure (TestB)
	(Display "This is TestB.")

This may look a mite different from SCI Companion code because despite everything, they are not the same. Now, compiling them both in SC version 4.100, from January 12 1995, and then pulling them back through a disassembler and annotating it a bit, we get this output:

; 42.SC
Test:	_line_	11	;(procedure (Test)
	_file_	"42.sc"
	_line_	12	;	(Display "Hello my darling.")
	lofsa	$6
	callk	Display, 2
	_line_	13	;)
; 69.SC
TestA:	_line_	17	;(procedure (TestA)
	_file_	"69.sc"
	_line_	18	;	(Display "This is TestA, about to call TestB.")
	lofsa	$6
	callk	Display, 2
	_line_	19	;	(TestB)
	call	TestB, 0
	_line_	20	;	(Display "Back in TestA, gonna call (42 1).")
	lofsa	$2a
	callk	Display, 2
	_line_	21	;	(Test)
	calle	Test, 0
	_line_	22	;	(Display "Back in TestA.")
	lofsa	$4a
	callk	Display, 2
	_line_	23	;)
TestB:	_line_	25	;(procedure (TestB)
	_file_	"69.sc"
	_line_	26	;	(Display "This is TestB.")
	lofsa	$59
	callk	Display, 2
	_line_	27	;)

Every time the PMachine encounters a _file_ opcode, it grabs a null-terminated string from the bytecode stream and places it into pm.curSourceFile. Likewise, _line_ takes a 16-bit number and places it into pm.curSourceLineNum. The built-in debugger can then notice when these two values change, find the source file, and display the correct line of code.

But there’s one tiny detail that threw me off initially. Can you see it?

When TestA calls Test, the current source file changes to 69.sc, but it doesn’t change back afterwards.

Although the SCI2 source I have here doesn’t seem to call it, there is in fact a pair of functions to push and pop debug state, preserving the value of pm.curSourceFile and pm.curSourceLineNum across module calls. Which is quite obvious when you think about it. The alternative I can see would be to insert another _file_ opcode after each out-of-module call.

[ ] Leave a Comment

Separating the Game and Engine

(Originally written 2017-10-07)

To most children of the nineties, “adventure game” was synonymous with “Sierra” and “LucasArts”. When you look at Sierra games like the King’s QuestSpace QuestPolice Quest, and Leisure Suit Larry series, you notice they’re all alike in one particular aspect beyond mere presentation, beyond the fact that they all control the same way: they are all third-person point-and-click adventure games.

Now that’s simplifying a little, since the earliest entries in those series weren’t point-and-click but instead had a text parser. But that doesn’t matter here, not very much.

King's Quest 5 - Crispin's house

We have our background image, our animated main character (henceforth referred to as Ego, in keeping with the script code), other animated characters where called for, a mouse cursor, and the status line. In many Sierra games from King’s Quest 5 on the status line is blank. In earlier games it was not, often displaying the game name and player’s score.

Regardless, we can control Ego in a particular way, and by placing the mouse cursor on the status line, we can summon an icon bar:

Leisure Suit Larry 1 VGA remake, Lefty's Bar exterior. The icon bar is showing.

Click a verb on the left half, then click somewhere in the game screen to act accordingly. Simple. The other three icons open your inventory, allow you to change settings and save/restore, and explain the other buttons. Almost all point-and-click Sierra games work like this.

The important question is, how much of all this is part of the engine, and how much is part of the game? Let’s find out!

As it turns out, the only common aspects that are part of the engine are:

  • The ability to draw backgrounds.
  • The ability to draw animated elements on those backgrounds, while also letting parts of it obscure them.
  • The ability to play sound effects and music on a variety of sound hardware of the time.
  • The ability to draw text.
  • Mouse, keyboard, and joystick input.
  • The ability to save and restore the game state.
  • A simple graphical user interface — windows, buttons, text fields and such.
  • The ability to overrule the way windows look, to change their border style in script.
  • The ability to draw a status line.
  • The ability to tell how to go from one point to another, in various ways.
  • For the later games, the ability to do various color tricks.
  • For the old SCI0 games, the ability to use a pull-down menu bar.
  • For the old SCI0 games, the ability to try and transform an English sentence into recognizable keywords.

That leaves this plethora as purely scripted:

  • What an animated character is.
  • What Ego is and what they can do that other animated characters can’t.
  • What an inventory item is.
  • How to respond to a click.
  • What the icon bar is, and what buttons are on it.
  • What a room is.
  • What literally anything being an abstract “object” is. This covers a lot of ground.
  • For the old SCI0 games, what goes into the menu bar and how to react to its use.
  • For the old SCI0 games, how to invoke the text parser, and how to respond to its results.

The game code can’t tell if you’re using the 320×200 256-color VGA driver or the 640×480 dithered EGA driver, only the abstract “how many colors do we have”, or “how many voices can we play at once” instead of knowing that we specifically use the PC Speaker for music output, or the Roland MT-32.

Those games I listed at the start are all third-person point-and-click adventure games because they all share a common set of scripts that define what an adventure game is. As such, not all Sierra SCI games are adventure games:

Jones in the Fast Lane, employment officeCastle of Dr. Brain, robot programming puzzle
One is a board game, the other a series of puzzles. Dr. Brain shares the user interface scripts common to the rest of them, but Jones is nothing alike. And yet, all of them are the same game engine. The script code is in the same format throughout, as are the audiovisual elements. System calls for the script code to use are the same, all throughout.

Sierra’s Creative Interpreter is not a third-person point-and-click adventure game engine, is what I’m saying.

You could remake Myst in it, after all. If you wanted.

[ ] 1 Comment on Separating the Game and Engine

Script resources – a dyad in the Force, as it were

ZvikaZ recently ran into an issue trying to hack Quest for Glory 1 VGA where they edited a particular script, and it worked fine, but when they then exported the .scr file and put it in a clean QFG1 folder, it broke in a particular way. One particular phrase stood out to me in particular:

There are ‘ch’ strings instead of the numerical values

I had a feeling what the problem might’ve been when I started reading the post but when I saw that part I knew exactly what happened.

Quest for Glory 1 VGA is an SCI11 game. That means the scripts are split up into .scr and .hep pairs, and ZvikaZ only copied the one file instead of both. One of them contains the actual script bytecode, but the other contains the amount of local variables, their default values, information on all the objects in the script, and all the text string literals in the script. It’s called a heap resource because that’s where it’s loaded.

Originally, the script and heap resources were one and the same. When a given script needed to be loaded, it would be loaded into heap memory and kept there until unloaded. And as explained before, a saved game is basically a compressed dump of the entire heap memory area, while hunk space contains all the other resources that the scripts, in turn, refer to. Now imagine for a second a script resource with a single class in it, with a single particularly big method, so that a mere fraction of the script resource describes the class, and contains any near strings and such, and all the rest of it is bytecode. Once loaded, the bytecode can’t be changed — only the class properties and any local variables can be, but all of that bytecode is still part of the heap. There’s only so much heap space available to a game, so as long as that script is resident, that bytecode will take up precious space.

SCI11 split the script resources up so that the bytecode parts would be kept in hunk space instead, swapped in from disk when actually needed by something from the script definitions in heap space. All that space taken up by PMachine bytecode is suddenly no longer part of the heap and this bad boy can fit so many script resources at once. And if your scripts use far text instead of near — text resources referenced by a module/line tuple that get loaded into hunk space, instead of "quoted strings like this" that are part of the script’s heap resource) anything those scripts try to say automatically also doesn’t take as much space. You trade a two-byte pointer for a four-byte tuple, but those numbers in turn may refer to a string of who knows what length. Savings!

ZvikaZ’s target was the script resource for QFG1‘s character creation screen, whose first class is a Room named chAlloc. That name appears in the heap resource. When ZvikaZ changed the script code and recompiled, the heap resource had its contents changed, including where exactly in the file the room’s definition started. Whatever mixed-up monstrosity resulted when ZvikaZ then tried to run the altered 203.scr against an untouched 203.hep didn’t function and notably printed ch instead of numerical statistics.

I’m honestly a little impressed it didn’t “oops” on the spot.

[ ] 1 Comment on Script resources – a dyad in the Force, as it were

SCI versions and naming

Did Sierra ever call the various versions of SCI the same names we use? We being the fans, the tool creators, and the ScummVM developers?

It’s unlikely.

One thing to keep in mind is that the interpreter was in near-constant development by one team, while other teams made the games. Every so often the game developers would pull in the latest interpreter and system scripts from a network share. Another thing to keep in mind is that the version numbers are a little weird in places, and that the games themselves had their own version numbers on top of that, so for example you could have King’s Quest 4 version 1.000.106 running on SCI 0.000.274, but also KQ4 1.000.111 on the same interpreter, released five days later, and the later update with the changed graphics that was version 1.006.003 running on SCI 0.000.502.

The first generation of SCI, the one we call “SCI0”, had versions starting with “0.000”, such as the KQ4 example above. This covers every single 16-color, parser-based, English-only game, with the lone exception of the Police Quest 2 PC-98 release. That was version “x.yyy.zzz”, no joke. This generation can also be subdivided into two blocks, where versions up to 0.000.343 had green button controls instead of using whatever the window color was set to, covering the ’88 versions of KQ4 and the first version of LSL2, and the rest covered all the other games.

What we call SCI01 had versions starting with “S.old”. At least “x.yyy.zzz” has the placeholder excuse but whatever. SCI01 games were just like SCI0 on the surface, but had support for multiple languages (previously introduced in version x.yyy.zzz), and saw no more releases than ’88 SCI0 — six of ’em. So technically there’s nothing about that version string to inspire “SCI01”, besides perhaps KQ1 using “S.old.010″ 🤔

Next up was SCI1, which came in both EGA and VGA and usually had versions starting with “1.000″. There is one game, Quest for Glory 2, with five different interpreter versions that still had the text parser (and technically one Christmas card) before it was removed in favor of the icon bar. Some SCI1 games again have interpreters with very strange versions — it appears Eco Quest and Space Quest 4, among others, had some Special Needs™, given interpreter version “1.ECO.013” and “1.SQ4.057″. But on the whole you could still tell from the first character in the version that these were SCI1 interpreters.

SCI11 removed the multi-language support in favor of things like scaling sprites and the Message resource type. All SCI11 interpreters in the wild use versions starting with”1.001“, except for the ones used in Laura Bow 2 (“2.000.274”), Quest for Glory 3 (“L.rry.083”), and Freddy Pharkas (“l.cfs.081”), among a straggler or three.

Up to now these were 16-bit real-mode applications. SCI2, with versions starting “2.000” was a 32-bit protected mode application instead, with the ability to use much more memory and run in a SuperVGA video mode. No SCI2 interpreter found in the wild seems to stray from this version pattern, mostly because all SCI2 games use version 2.000.000. SCI21, in turn, runs on interpreter version 2.100.002, although there are technically three different sub-versions of 2.100.002. That’s not confusing at all. And finally, SCI3 was only seen in interpreter version 3.000.000.

I’m thinking after the switch to 32-bits, they must’ve stopped automatically bumping version numbers on build.

So what does Sierra call them, then? Well, sources say that Sierra called the 32-bit interpreters SCI32, and the source code archive that I based SCI11+ on was SCI16.ZIP. But none of the changelogs and such seem to refer to SCI0, SCI1, or whatever.



Happy slightly belated new year 🥂

[ , ] Leave a Comment

Server move?

Well, that took the better part of my evening!

So I got an email earlier today from the provider behind my VPS saying they’re gonna stop doing VPS, that I should switch to a KVM, and they left me a coupon for six billing periods (that is, months) of free service for said KMS. Which is nicer than what Centarra did way back then, keeping a hundred euros in service credit for themselves.

And even though I barely have a clue what the difference is between a VPS and a KVM, I did manage to do it with hopefully as few hiccups as I can manage.

Thanks to Dark Kirb for putting my mind at ease, and to Emuz for handling the domain part.

Leave a Comment

Objects, functions, properties, and methods

Whether you’re trying to interpret SCI code in its source form, or compile it into bytecode, there are some inferences to make. Consider the following statements:

(foo1 bar:)
(foo2 bar: 42)
(foo3 69)

You can have object references, kernel calls, and local function calls, and those object references can be local instances or pointers which in turn can be stored in global variables, local variables, variables temporary to the current function or method, or properties of the current method’s object. How would you determine what each foo is?

First, you can see if there is a second item in the expression. If that item ends in :, like in the first two cases, you know that’s a selector so the identifier at the start must be an object reference of some sort. If it’s not, like in the third case, or if there is no second item at all, it must be a function or kernel call since anything else would be an error.

For the first two examples, we now know that foo must be an object. Having looked through the whole script before, we already have a list of all the parameters, temporary variables, local variables, and those from script 0, which are global. If either of those contains an item by that name, we know it’s a pointer to dereference. If it’s a local or imported object’s name, we’d be able to find that as well and can continue on. If it doesn’t appear in any of these six lists, the source code is in error.

For the other two examples, we know it must be a function or kernel call. There are three lists to check this time, being the local functions, imported functions, and kernels. Other than that, things are the same as before.

That leaves the matter of selectors. They can refer to either properties or methods, which are… actually rather trivial to tell apart considering the objects have two dictionaries, one for each type. Objects may have superclass chains reaching all the way to the Base Object and inherit properties and methods from those superclasses, but you might consider folding those superclasses’ dictionaries into the object instance’s so there’s only two to scan through.

Let’s say bar is a property. The first example would then mean “take the foo object and return its bar property’s value. Likewise the second would mean “set it to this expression.” If it’s a method, you’re given a pointer to that method’s code (which may be unique to that instance, having overwritten whatever the superclass chain started with) and you can pass it each non-selector argument in turn, until the next selector. I wrote about that before.

[ ] Leave a Comment

SCI decompilation and Weird Loops

They’re not really that weird on the face of it, but that depends on who’s looking.

The decompiler in SCI Companion is a work of art. You can tell because I didn’t write it. (I only fixed a thing or two.) But there are some things that it can’t figure out, and when that happens the function or method body is replaced with a raw asm block. For example, the copy protection in Laura Bow 2 – The Dagger of Amon Ra has loop in it that SCI Companion can’t hack.

It’s a bit much to take in but the important bits are as follows: this code block (rm18::init) has four discrete segments. The first isn’t shown here and sets up a few simple things. The second (code_0087) is a regular loop, where temp0 counts up from zero to eleven. When it hits twelve, the loop is broken and we go to section three, code_00ad. Section three is a weird loop. If you look at the check at the top we see this:

pushi #size
pushi 0
lofsa tempList
send 4
bnt code_0116

Which basically means that when (tempList size?) returns zero/is false, we skip to section four. At the bottom of the section, right before the label for code_0116, there’s the command that makes section three a loop; jmp code_00ad.

So that means that section three keeps repeating until tempList is out of items. Section two put a bunch of values in it, and section three then takes items out at random and puts them into goodList, effectively randomizing the order. The items, incidentally, are the tiles depicting the various Egyptian gods that the copy protection is all about, clones of egyptProp given increasing cel values. Section three positions them as they’re added to goodList. It’s a good routine, Brent.

The problem that trips up SCI Companion and makes it spit out the stuff in those two pictures is that it doesn’t recognize the second loop for what it is. Counting from one value to another by a given increment? Easy. Iterating over a collection? It can figure those out. But picking items from a bag until it’s empty? That’s not on the menu.

To make this decompile, then, we first need to break the loop by commenting out that last jmp command. A single ; suffices. Compile the script resource, then go back and re-decompile it. A conditional loop, of course, consists of a check and a jump. We removed the jump so now it’s just the check:

(method (init &tmp i theTile theX theY)
  (LoadMany rsVIEW 18) ; load the tiles
  (super init:)
  (gGame handsOn:)
  (gIconBar disable: 0 1 3 4 5 6 7)
  (goodList add:)
  (tempList add:)
  (= theX -32)
  (= theY 46)
  (= i 0)
  ; Instantiate twelve tiles, with increasing cel numbers.
  (while (< i 12)
    (tempList add: ((egyptProp new:) cel: i yourself:))
    (++ i)
  ; This should be "(while (tempList size?)" but we removed the jump, remember?
  (if (tempList size?)
    ; Pick a tile number.
    (= i (Random 0 (- (tempList size?) 1)))
    ; Get the i-th tile.
    (= theTile (tempList at: i))
    ; Set up the tile's position on the grid and add it to goodList.
    (goodList add:
         x: (= theX (+ theX 48))
         y: theY
    ; Once we're halfway through, CRLF to the next row.
    (if (== (goodList size?) 6)
      (= theX -32)
      (= theY 111)
    ; Actually remove the tile from tempList so we won't pick it again.
    (tempList delete: theTile)
  ; Section four
  (gGame handsOff:)
  (self setScript: sInitEm)

The cool part is that once we replace that if with a while and compile the script, the result is effectively the same as the original. Only some of the specific opcode choices are different. For example, the original uses the two-byte pushi 1 throughout (also 0 and 2), but SCI Companion’s script compiler prefers to use the one-byte push1 there. The same values are pushed regardless.

[ , , , ] Leave a Comment

Print, PrintD, and the other Print

So as established, there are three different Prints in SCI.

  • Print the function, in SCI0
  • PrintD the function, supplementing Print, in SCI1
  • Print the class, in SCI11.

They each have their own strengths and weaknesses, of course.

Max text items 1 infinite
Max buttons 6 infinite
Max icons 1 infinite
Max input fields 1 infinite
Animated icons¹ yes no yes
Size to fit yes
Size to max width yes no yes
Auto-dismiss yes no yes
Auto-layout² yes no
Position yes
Font yes
Text tuples³ yes no yes

¹: Animated icons require the ability to pass a reference to a DCIcon object instead of a view/loop/cel tuples.

²: Items added by PrintD flow to the right with a four pixel margin. Pass the #new command argument to reset the flow to the left edge and below the last item, or the #x/#y modifiers to shift the last item’s position. In the Print class, every item added must be manually positioned as everything defaults to the top-left. The Print function has its limits specifically because it automatically lays out the controls.

³: The Print class being from SCI11, it takes noun/verb/case/seq tuples.

That comes down to the following actions:

SCI0 Print: required string or tuple for text (may be empty), mode, font, width, time, title, at, draw, edit, button (up to six times), icon, dispose, window, first

SCI1 PrintD: new, at, title, first, text, button, icon, edit, x, y

SCI11 Print: addButton, addEdit, addIcon, addText, addTextF, addTitle, posn methods, plus mode, font, width, ticks, modeless, and saveCursor properties

…And then I messed everything up by rewriting PrintD as a wrapper around the Print class so it runs on SCI11, adding everything but auto-dismiss and animated icon support. It’s available from my SCI stash, of course. Hell, by this time tomorrow those last two things may well be included.

SCI11 PrintD: all of SCI1’s, plus modNum, cue, font, and width

[ ] Leave a Comment

A key difference between SCI0/1 and SCI11

Oddly, this one’s entirely script based.

I was reminded of this when I watched one of Space Quest Historian’s videos, where he played the Space Quest 2 remake by Infamous Quests. Now, one thing he mentions about the version he plays in that video is that besides the narrator being muted so he can narrate it himself, well…

but recently Steve Alexander, who was one of the co-CEOs of Infamous Quests had a little peek around the old game files and decided to spruce it up a bit, fix some old bugs, add some you know, touch-ups here and there. Most importantly, at least according to him, replace the standard AGS font with something that looks a little more Sierra-ish. And that’s the version I’m going to play.

All well and good but then the main menu appeared and I noticed just how important this font thing seemed to be.

One thing immediately came to mind when I saw this. Notice how the button cuts into the caption? If I was a betting cat, I’d say the original version’s main font was just slightly shorter than this replacement. Also when I prepared the image I noticed each of these buttons has a different height, but that’s not the issue here — that’s just sloppy work.

This thing with the buttons cutting into the message text? It can’t happen accidentally in SCI0/1, but it can easily happen in SCI11. How is that?

In SCI1, the system scripts include a PrintD function. It’s pretty powerful on the face of it:

	"Would you like to skip\nthe introduction or\nwatch the whole thing?"
	#at 100 60
	#button "Skip it" 0
	#button "Watch it" 1
	#button "Restore a Game" 2

And that gives you this:

The function itself will track how large every item should be as they are defined, and adjust the final size of the window accordingly. Just to simplify the explanation a bit. And then it’ll return the value of a given button so the game knows how to react.

(Update: SCI0 had Print which worked not entirely unlike this, and SCI1 added PrintD as a more streamlined variant with #new support, because…)

In SCI11, Print is now an object. You have to call a bunch of methods on it, in sequence, then finish with an init: call, and that will trigger its display and return the value chosen. But one important distinction is that addText:addButton:, and its ilk… don’t automatically position themselves. You have to do that yourself.

That’s why these games came with a built-in dialog creation tool, among others. It’d create code like this:

; DialogEditor v1.0
; by Brian K. Hughes
	posn:		0 28,
	font:		0,
	addText:	{bluh bluh} 71 18,
	addButton:	0 {button} 71 30,
	addButton:	1 {button} 0 0,

The Sierra programmers could then integrate that into their game scripts. But importantly, those manually-positioned buttons could overlap other controls.

(Update: … PrintD also had a dialog editor made for it. This is pretty explicitly stated in the changelogs.)

AGS dialog boxes also use manual positioning, just with a nice WYSIWYG form editor. So if you take a perfectly good (yeah right) introduction menu and change the font for one with a higher X-height and don’t adjust things… you get that SQ2 window.

Update just because: this is what the SQ4 intro menu would look like with font.001 instead. That is, with SCI1’s PrintD and its automatic layout.

[ , , , ] Leave a Comment