Logo Pending


Don’t use the auto-tracer

As an addendum to my previous post on SCI smells, here’s why you shouldn’t use the auto-trace function. Check it out.

This is your brain:

This is your brain on drugs:

Don’t use the auto-tracer. It has no earthly clue how to efficiently import a source image, especially one with a lot of dithering in it. Plus, you don’t get any priority or control screen data from it, so you’d have to trace those out anyway.

Use tracing images, sure. Get a nice pencil sketch and go to town on it.

We already know they did that back in the day from this hidden background asset in Quest for Glory 2:


Smells like SCI Spirit

If you’re going to make an adventure game in SCI Companion (or gods forbid SCI Studio) there are certain things that you might want to look out for. Missing them may make your game look somewhat lazy to those who know what to look for.

SCI0

  1. “Colors” setting: since the template is based on Leisure Suit Larry 3, it inherited the option to specify any combination of 16 window background and text colors.
  2. MT-32 patch: easily missed depending on what you’re testing with, early SCI0 fan games mightn’t have had any MT-32 testing done at all. Thus you might miss that the MT-32 patches are straight from Larry 3 (again), including the startup messages, and your background music may not sound as good as one might hope, with mismatched instruments.
  3. BriPro logo: instead of the Sierra mountain logo, the first character in three out of five included fonts is the letters BP, for Brian Provinciano.
  4. Template Guy: as I like to call him. The dude with the blue shirt and gray pants. He’s the default player character.

You might consider all that to be in the past, considering it literally is, but there are still projects actively in development that target SCI0/01.

SCI11

  1. Blueish grays: since this template is based on Space Quest 5, the original version used to have SQ5’s palette as opposed to, say, the default SCI palette. The later template, before I even all but took over the project, had this replaced.
  2. Template Guy: he’s back, and I myself specifically made him so you wouldn’t be stuck with Roger Wilco. Same things apply as in SCI0 though.
  3. The font: the SCI11 template comes with three common fonts and one specific to SQ5, and uses the SQ5 one as the default. It has no distinction between upper and lower case.
  4. Status line: even to this day, the template game includes SQ5’s custom-drawn status line, with the raised border.
  5. Inefficient polygons: a bit technical, might have to go into detail in another post, but early versions of SCI Companion used a rather hackish way to import external polygon data (where can you go, like in my pathfinder woes) which ate a lot of memory for no good reason. Later versions allow you to use the &getpoly command that I added, a bit of syntactic sugar that imports the polygon data in such a way you can’t tell it apart from Sierra’s code if you were to decompile it afterwards.

There’s no BriPro logo in the SCI11 template’s fonts, it’s the Sierra logo, but on the other hand only Leisure Suit Larry 6 has a menu bar despite being an SCI11 game, so you’d rarely if ever get to see it. The MT-32 patch is Sierra’s own near-GM patch, and includes a “SIERRA ON-LINE” message.

 


Rules for UTF-8 in SCI11+

  1. It won’t work in ScummVM yet, as nothing uses it yet so I see no reason to add it. Most of the rest of SCI11+’s gimmicks do though.
  2. The internal “draw a string” function, used to write literally anything on to the screen, is where the magic happens: if the current port’s current font has more than 256 glyphs in it, the input string is interpreted as UTF-8. If it does not, things work exactly as usual.
  3. Because combining characters and glyph substitution are not supported, and general punctuation like “” sits all the way up in the 2000–2044 range, the General Punctuation block’s glyphs take the place of Combining Diacritical Marks at 0300–0344.
  4. Similarly, CJK Symbols, Hiragana, and Katakana are moved from 3000–30FF to 0200–02FF, where some Latin Extended-B, IPA Extensions, and Spacing Modifiers should go.
  5. Those last two points apply to the font data, not the actual text.
  6. The new kernel functions UTF8to16 and UTF16to8 will always consider their inputs to be in Unicode, no matter what the current port’s current font says. Unless you built an SCI11+ with UTF-8 support disabled, in which case none of the above applies and all these two functions do is turn 8-bit values into 16-bit.
  7. The kernel function to turn a string lower or upper case, StrCase, unlike the two I just described, will check the current font, same as the “draw a string” function, and act like it used to if that font doesn’t call for UTF-8.
  8. The functions to get the lower or upper case version of a character that StrCase ends up using, tolower and toupper, have been extended to cover the full 256-character range. Several maps are included and one can be chosen at build time. We have maps for code page 437, Win-1252, ISO 8859-1, and a fair bit of Unicode.
  9. In general, SCI11+ can be considered to use Unicode 1.1 on account of SCI 1.001.100 dating from 1993, going by the version numbers and release dates for Freddy Pharkas (1.001.095) and Leisure Suit Larry 6 (1.001.115).
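
Rules 2 through 5 can be sketched in a few lines of Python. This is only an illustration of the rules as listed, not the SCI11+ source: the function names are made up, and the glyph count is passed in directly instead of being read from a real font resource.

```python
def glyph_index(codepoint: int) -> int:
    """Map a Unicode code point to its slot in the font data (rules 3 to 5)."""
    if 0x2000 <= codepoint <= 0x2044:
        # General Punctuation is stored where Combining Diacritical Marks would be.
        return codepoint - 0x2000 + 0x0300
    if 0x3000 <= codepoint <= 0x30FF:
        # CJK Symbols, Hiragana, and Katakana are stored at 0200-02FF.
        return codepoint - 0x3000 + 0x0200
    return codepoint


def glyphs_for(raw: bytes, font_glyph_count: int) -> list[int]:
    """Turn the bytes of a string into glyph slots, as in rule 2."""
    if font_glyph_count > 256:
        # Big font: the input is interpreted as UTF-8.
        return [glyph_index(ord(ch)) for ch in raw.decode("utf-8")]
    # Small font: one byte is one glyph, exactly as usual.
    return list(raw)
```

Only the font lookup is remapped here. The text itself keeps its real code points, which is the point of rule 5.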

Sluiceboxes and SetPorts

Today I got a delightful (and long) email from sluicebox, of the ScummVM SCI team. He wrote about a lot of things but one thing stood out and he’s right, I should write about it.

Remember when I fixed the imitation AGI windows in Space Quest 4? There’s something very strange going on there that sluicebox pointed out in the email.

If you’ll remember:

(method (open &tmp port temp1)
  ; temp0 was unused so we're taking it for proper SetPorting.
  (= color gColor)
  (= back gBack)
  ; Set our type to ONLY wCustom, not wCustom|wNoSave, and open.
  (= type 128)
  (super open:)
  ; Nothing will have appeared because wCustom don't draw anything, but a port has been set up!
  ; Switch to drawing on the whole screen but also *save the window's port*.
  (= port (SetPort 0))
 
  (= temp1 1)
  ; ...
  (Graph grUPDATE_BOX lsTop lsLeft lsBottom lsRight 1)
 
  ; Reset to the window's port.
  (SetPort port)
)

But if you look at this GitHub commit from ScummVM, you’ll see this interesting description:

SSCI doesn’t return zero; it doesn’t return anything. This shouldn’t affect any games since no scripts should depend on a non-existent return value, but this discrepancy came up while investigating a fan script that accidentally relies on this.

So I checked the leaked source code that I made SCI11+ from.

global KERNEL(SetPort)
{
	if (argCount >= 6)
	{
		picWind->port.portRect.top = arg(1);
		picWind->port.portRect.left = arg(2);
		picWind->port.portRect.bottom = arg(3);
		picWind->port.portRect.right = arg(4);
		picWind->port.origin.v = arg(5);
		picWind->port.origin.h = arg(6);
		if (argCount >= 7)
			InitPicture();
	}
	else
	{
		if (arg(1))
		{
			if ((arg(1)) == -1)
				RSetPort(menuPort);
			else
				RSetPort((RGrafPort*)Native(arg(1)));
		}
		else
		{
			RSetPort((RGrafPort*)RGetWmgrPort());
		}
	}
}

No return value, which is obvious really because the KERNEL define expands to a void function. Return values are instead handled by setting the acc global variable. So let’s dig a little deeper.

RSetPort proc	pPtr:word
	mov	ax, pPtr
	mov	rThePort, ax
	ret
RSetPort endp

Nothing. It sets the rThePort global and that’s all. There’s an RGetPort function right above that does the opposite, but nothing in the kernel function calls that.

Looking back at my description of BorderWindows, there’s an important difference:

(= oldPort (GetPort))
(SetPort 0)
(Graph grUPDATE_BOX lsTop lsLeft lsBottom lsRight VISUAL)
(SetPort oldPort)

It’s very interesting indeed how this happened to Just Work. Even so, I should probably go back and correct that SQ4 script.
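
To make the difference concrete, here is a toy model in Python. It is not how SSCI is written: the class and method names are invented, and all it models is the one fact from the source above, that a kernel call "returns" whatever is in acc when it finishes. A kernel that never writes acc hands back whatever was left there.

```python
class ToyPMachine:
    """Hypothetical stand-in for the interpreter state that matters here."""

    def __init__(self):
        self.acc = 0                      # the accumulator
        self.current_port = "window-port"

    def kernel_get_port(self):
        # GetPort reports its result by writing acc.
        self.acc = self.current_port

    def kernel_set_port(self, port):
        # SetPort changes the port and leaves acc alone.
        self.current_port = port

    def callk(self, kernel, *args):
        # The script sees acc as the value of the call expression.
        kernel(*args)
        return self.acc


vm = ToyPMachine()
vm.acc = "stale"                                   # left over from earlier code
saved = vm.callk(vm.kernel_set_port, "wmgr-port")  # like (= port (SetPort 0))
```

In this model `saved` ends up as the stale accumulator value, not the previous port. Whether that stale value is usable depends entirely on what ran just before the call, and that is the accident the SQ4 script was relying on.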


Pattern pen implementation differences

While looking into something unrelated in Space Quest 3, I noticed that the dirt on the right of the starting screen was drawn differently between DOS SCI and ScummVM. Today I looked into it a little closer, comparing SCI proper, ScummVM, SCI Companion, and SCI Viewer.

Damn, that’s some really tiny differences that you’re not gonna spot just like this. But here they are:

  • In most of them, the mound on the right looks like this:
  • Except in ScummVM, where it looks like this: (and now you know why I looked into this)
  • Below the column in the middle looks like this in SCI and SV:
  • But it looks more like this in SCI Companion and ScummVM alike:
  • The heap on the left is also mildly affected, looking like this in SCI and SV:
  • But it looks like this in SCI Companion and ScummVM:
  • And finally, SV, renowned for being Very Good At This, breaks the one rule — you don’t get to draw white on non-white:

So yeah, a slight difference in where a window border is drawn is the least of your problems.

Update: ScummVM had its pattern table corrected this week. Guess I’ll have to check out the latest nightly, huh? And yes, it does match SCI proper now. Good job everyone!

Bonus update: sluicebox suggested comparing against SCI Studio. Here you go, friend: the mound on the right looks like this in SCI Studio, a distinctive variation, and the bit under the pillar and to the left matched SCI Companion and ScummVM (past tense now).


String literals

In various programming languages, different quotation marks and such can mean different things. In C/C++ for example, "this" is considered a plain string literal. It can contain various escape codes (\x69, \n, et al), and escaped double quotes (\"), but not raw newlines. A single-quoted literal is not a string at all, but a character literal. In C# meanwhile we have the plain double-quoted string and single-quoted character but also @"this", a variation that does allow raw newlines at the cost of not allowing escapes. In PHP meanwhile, we have double-quoted strings that, in contrast to C, can have raw newlines but also have variable interpolation — "Hi $name" will appear as Hi Mark, assuming that is that variable’s value at the time. Single-quoted strings in PHP don’t do escape sequences or interpolation, and then there’s “heredoc” strings.

SCI also has different types of string literals. Or rather, had, depending on which version you targeted. Double-quoted strings could have escape codes (\42, note the lack of a letter, and of course \n) and raw newlines, but whitespace was folded away on compilation so raw newlines and tabs were entirely for code readability’s sake, requiring a \n at the end of each line that had to have a line break. You could also use curly braces for strings, {like this}, that had exactly the same rules and limitations as double-quotes, except for one difference in storage.

Any string literal in curly braces would be stored as-is in the script resource, while double-quote strings would be stored in a matching text resource and replaced in the compiled code with a look-up key:

(Print "This is an example" #title {Kawa says})

(Yes, I’m aware that the syntax highlighter doesn’t pick up on the braces.)

Assuming this is script #42 just as an example, and this is the first place a double-quoted string appears, the above would be transformed like so:

(Print 42 0 #title {Kawa says})

The original string will be stored in a separate text resource with the same number as the script. This helps cut down memory use.
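
As a rough sketch of that transformation, here it is in Python. This is nothing like the real compiler: it assumes no escape codes, does no whitespace folding, and would trip over a double quote inside curly braces. The function name is made up.

```python
import re


def extract_far_text(source: str, script_number: int):
    """Move double-quoted strings into a text resource, leave braces alone."""
    text_resource = []

    def replace(match):
        # Store the string and leave a (script number, index) key behind.
        text_resource.append(match.group(1))
        return f"{script_number} {len(text_resource) - 1}"

    compiled = re.sub(r'"([^"]*)"', replace, source)
    return compiled, text_resource


compiled, texts = extract_far_text(
    '(Print "This is an example" #title {Kawa says})', 42
)
```

Here `compiled` comes out as the transformed Print call from the example, and `texts` holds the one string that would go into text resource 42.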

(Bonus banter: there’s a bug in the original SCI interpreters, introduced when they added the Message resource format, involving hexadecimal numbers: they accidentally used "01234567890ABCDEF", with an extra zero. This messes up any attempt to use a good third of the character set in a message resource, but not in an inline string literal, so having \0E in a string literal will produce the intended character while the same thing in a message will produce ¤ instead. In SCI11+, this has been corrected to just "0123456789ABCDEF".)
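
Here is why the extra zero matters, as a small Python sketch. It assumes the digit value is found by looking up each character's position in that string, which is a guess at the mechanism, not a quote from the interpreter source.

```python
BUGGY_DIGITS = "01234567890ABCDEF"  # extra zero pushes A-F up by one
FIXED_DIGITS = "0123456789ABCDEF"


def parse_hex_pair(pair: str, digits: str) -> int:
    """Convert two hex characters to a number by position in the digit string."""
    high = digits.index(pair[0])
    low = digits.index(pair[1])
    return high * 16 + low
```

With `FIXED_DIGITS`, `parse_hex_pair("0E", ...)` gives 14. With `BUGGY_DIGITS`, the E is found one position further along and the result is 15, so the neighbouring character shows up instead. Pairs made only of decimal digits are unaffected.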


Tracing in SCI2

Where 16-bit versions of SCI had two blank spaces in their PMachine instruction set, SCI2 introduced the _line_ and _file_ instructions, which the compiler could inject into the final bytecode so that the built-in debugger could then work out exactly which source code line matched the current instruction. Here’s how that goes down in practice.

First we make two test scripts, 42.SC and 69.SC:

(script# 42)
 
(procedure
	Test
)
 
(public
	Test 1
)
 
(procedure (Test)
	(Display "This is Test, (42 1).")
)

(script# 69)
 
(procedure
	TestA
	TestB
)
 
(public
	TestA 0
	TestB 1
)
 
(extern
	Test 42 1
)
 
(procedure (TestA)
	(Display "This is TestA, about to call TestB.")
	(TestB)
	(Display "Back in TestA, gonna call (42 1).")
	(Test)
	(Display "Back in TestA.")
)
 
(procedure (TestB)
	(Display "This is TestB.")
)

This may look a mite different from SCI Companion code because despite everything, they are not the same. Now, compiling them both in SC version 4.100, from January 12 1995, and then pulling them back through a disassembler and annotating it a bit, we get this output:

; 42.SC
;-------
Test:	_line_	11	;(procedure (Test)
	_file_	"42.sc"
	_line_	12	;	(Display "This is Test, (42 1).")
	push1
	lofsa	$6
	push
	callk	Display, 2
	bnot
	_line_	13	;)
	ret
 
; 69.SC
;-------
TestA:	_line_	17	;(procedure (TestA)
	_file_	"69.sc"
	_line_	18	;	(Display "This is TestA, about to call TestB.")
	push1
	lofsa	$6
	push
	callk	Display, 2
	_line_	19	;	(TestB)
	push0
	call	TestB, 0
	_line_	20	;	(Display "Back in TestA, gonna call (42 1).")
	push1
	lofsa	$2a
	push
	callk	Display, 2
	_line_	21	;	(Test)
	push0
	calle	Test, 0
	_line_	22	;	(Display "Back in TestA.")
	push1
	lofsa	$4a
	push
	callk	Display, 2
	_line_	23	;)
	ret
 
TestB:	_line_	25	;(procedure (TestB)
	_file_	"69.sc"
	_line_	26	;	(Display "This is TestB.")
	push1
	lofsa	$59
	push
	callk	Display, 2
	_line_	27	;)
	ret

Every time the PMachine encounters a _file_ opcode, it grabs a null-terminated string from the bytecode stream and places it into pm.curSourceFile. Likewise, _line_ takes a 16-bit number and places it into pm.curSourceLineNum. The built-in debugger can then notice when these two values change, find the source file, and display the correct line of code.
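
A minimal sketch of that bookkeeping in Python, assuming the bytecode has already been decoded into (opcode, operand) pairs. The class and function names are invented, and real opcode numbers are left out on purpose.

```python
from dataclasses import dataclass


@dataclass
class DebugState:
    cur_source_file: str = ""
    cur_source_line_num: int = 0


def run(instructions, state):
    """Walk decoded instructions and record where each real opcode came from."""
    trace = []
    for opcode, operand in instructions:
        if opcode == "_file_":
            state.cur_source_file = operand
        elif opcode == "_line_":
            state.cur_source_line_num = operand
        else:
            # Anything else is a real instruction a debugger could stop on.
            trace.append(
                (state.cur_source_file, state.cur_source_line_num, opcode)
            )
    return trace


test_b = [
    ("_line_", 25), ("_file_", "69.sc"), ("_line_", 26),
    ("push1", None), ("lofsa", 0x59), ("push", None), ("callk", "Display"),
    ("_line_", 27), ("ret", None),
]
trace = run(test_b, DebugState())
```

Fed the TestB listing, every real instruction in `trace` is tagged with 69.sc and either line 26 or 27, which is all a debugger needs to pick the right line to highlight.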

But there’s one tiny detail that threw me off initially. Can you see it?

When TestA calls Test, the current source file changes to 42.sc, but there is no _file_ opcode after the call to change it back to 69.sc afterwards.

Although the SCI2 source I have here doesn’t seem to call it, there is in fact a pair of functions to push and pop debug state, preserving the value of pm.curSourceFile and pm.curSourceLineNum across module calls. Which is quite obvious when you think about it. The alternative I can see would be to insert another _file_ opcode after each out-of-module call.
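
What such a push/pop pair would do can be sketched like this in Python. The names are hypothetical and the call itself is faked with plain assignments, so take it as a picture of the idea rather than of the actual SCI2 functions.

```python
class DebugTracker:
    """Hypothetical holder for the current file and line, with a save stack."""

    def __init__(self):
        self.file = ""
        self.line = 0
        self._saved = []

    def push(self):
        # Remember where we were before an out-of-module call.
        self._saved.append((self.file, self.line))

    def pop(self):
        # Restore the caller's position after the call returns.
        self.file, self.line = self._saved.pop()


dbg = DebugTracker()
dbg.file, dbg.line = "69.sc", 21   # TestA, about to call Test
dbg.push()
dbg.file, dbg.line = "42.sc", 12   # inside Test, set by its own opcodes
inside = (dbg.file, dbg.line)
dbg.pop()
after = (dbg.file, dbg.line)
```

After the pop, the tracker is back at 69.sc line 21 without needing an extra _file_ opcode in the caller's bytecode.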


Script resources – a dyad in the Force, as it were

ZvikaZ recently ran into an issue trying to hack Quest for Glory 1 VGA where they edited a particular script, and it worked fine, but when they then exported the .scr file and put it in a clean QFG1 folder, it broke in a particular way. One phrase stood out to me in particular:

There are ‘ch’ strings instead of the numerical values

I had a feeling what the problem might’ve been when I started reading the post but when I saw that part I knew exactly what happened.

Quest for Glory 1 VGA is an SCI11 game. That means the scripts are split up into .scr and .hep pairs, and ZvikaZ only copied the one file instead of both. One of them contains the actual script bytecode, but the other contains the number of local variables, their default values, information on all the objects in the script, and all the text string literals in the script. It’s called a heap resource because that’s where it’s loaded.

Originally, the script and heap resources were one and the same. When a given script needed to be loaded, it would be loaded into heap memory and kept there until unloaded. And as explained before, a saved game is basically a compressed dump of the entire heap memory area, while hunk space contains all the other resources that the scripts, in turn, refer to. Now imagine for a second a script resource with a single class in it, with a single particularly big method, so that a mere fraction of the script resource describes the class, and contains any near strings and such, and all the rest of it is bytecode. Once loaded, the bytecode can’t be changed — only the class properties and any local variables can be, but all of that bytecode is still part of the heap. There’s only so much heap space available to a game, so as long as that script is resident, that bytecode will take up precious space.

SCI11 split the script resources up so that the bytecode parts would be kept in hunk space instead, swapped in from disk when actually needed by something from the script definitions in heap space. All that space taken up by PMachine bytecode is suddenly no longer part of the heap and this bad boy can fit so many script resources at once. And if your scripts use far text instead of near (text resources referenced by a module/line tuple that get loaded into hunk space, instead of "quoted strings like this" that are part of the script’s heap resource), anything those scripts try to say automatically also doesn’t take as much heap space. You trade a two-byte pointer for a four-byte tuple, but those numbers in turn may refer to a string of who knows what length. Savings!

ZvikaZ’s target was the script resource for QFG1’s character creation screen, whose first class is a Room named chAlloc. That name appears in the heap resource. When ZvikaZ changed the script code and recompiled, the heap resource had its contents changed, including where exactly in the file the room’s definition started. Whatever mixed-up monstrosity resulted when ZvikaZ then tried to run the altered 203.scr against an untouched 203.hep didn’t function and notably printed ch instead of numerical statistics.
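
A toy illustration of how a mismatched pair goes wrong, in Python. This is not a reconstruction of what happened inside 203.hep: the layout is nothing like the real heap format, and the "%d" string is made up. It only shows that offsets baked in against one heap point at something else in another.

```python
def build_heap(strings):
    """Pack null-terminated strings into a blob and note where each starts."""
    blob = bytearray()
    offsets = {}
    for s in strings:
        offsets[s] = len(blob)
        blob += s.encode("ascii") + b"\0"
    return bytes(blob), offsets


def read_string(heap: bytes, offset: int) -> str:
    """Read the null-terminated string starting at the given offset."""
    end = heap.index(b"\0", offset)
    return heap[offset:end].decode("ascii")


# The untouched heap, and the one produced by the recompile.
old_heap, old_offsets = build_heap(["chAlloc", "%d"])
new_heap, new_offsets = build_heap(["%d", "chAlloc"])

# The recompiled script was built against the new offsets.
wanted = new_offsets["%d"]
```

Reading `wanted` from `new_heap` gives the string the script expects. Reading the same offset from `old_heap` gives "chAlloc", which is the general flavor of the breakage even if the real details differ.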

I’m honestly a little impressed it didn’t “oops” on the spot.


SCI versions and naming

Did Sierra ever call the various versions of SCI the same names we use? We being the fans, the tool creators, and the ScummVM developers?

It’s unlikely.

One thing to keep in mind is that the interpreter was in near-constant development by one team, while other teams made the games. Every so often the game developers would pull in the latest interpreter and system scripts from a network share. Another thing to keep in mind is that the version numbers are a little weird in places, and that the games themselves had their own version numbers on top of that, so for example you could have King’s Quest 4 version 1.000.106 running on SCI 0.000.274, but also KQ4 1.000.111 on the same interpreter, released five days later, and the later update with the changed graphics that was version 1.006.003 running on SCI 0.000.502.

The first generation of SCI, the one we call “SCI0”, had versions starting with “0.000”, such as the KQ4 example above. This covers every single 16-color, parser-based, English-only game, with the lone exception of the Police Quest 2 PC-98 release. That was version “x.yyy.zzz”, no joke. This generation can also be subdivided into two blocks, where versions up to 0.000.343 had green button controls instead of using whatever the window color was set to, covering the ’88 versions of KQ4 and the first version of LSL2, and the rest covered all the other games.

What we call SCI01 had versions starting with “S.old”. At least “x.yyy.zzz” has the placeholder excuse but whatever. SCI01 games were just like SCI0 on the surface, but had support for multiple languages (previously introduced in version x.yyy.zzz), and saw no more releases than ’88 SCI0: six of ’em. So technically there’s nothing about that version string to inspire “SCI01”, besides perhaps KQ1 using “S.old.010” 🤔

Next up was SCI1, which came in both EGA and VGA and usually had versions starting with “1.000”. There is one game, Quest for Glory 2, with five different interpreter versions that still had the text parser (and technically one Christmas card) before it was removed in favor of the icon bar. Some SCI1 games again have interpreters with very strange versions: it appears Eco Quest and Space Quest 4, among others, had some Special Needs™, given interpreter version “1.ECO.013” and “1.SQ4.057”. But on the whole you could still tell from the first character in the version that these were SCI1 interpreters.

SCI11 removed the multi-language support in favor of things like scaling sprites and the Message resource type. All SCI11 interpreters in the wild use versions starting with “1.001”, except for the ones used in Laura Bow 2 (“2.000.274”), Quest for Glory 3 (“L.rry.083”), and Freddy Pharkas (“l.cfs.081”), among a straggler or three.

Up to now these were 16-bit real-mode applications. SCI2, with versions starting “2.000” was a 32-bit protected mode application instead, with the ability to use much more memory and run in a SuperVGA video mode. No SCI2 interpreter found in the wild seems to stray from this version pattern, mostly because all SCI2 games use version 2.000.000. SCI21, in turn, runs on interpreter version 2.100.002, although there are technically three different sub-versions of 2.100.002. That’s not confusing at all. And finally, SCI3 was only seen in interpreter version 3.000.000.
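
Pulling the version patterns from this post together, a lookup could be sketched like this in Python. The tables only contain what is mentioned in this post, so the "straggler or three" are not covered, and the function name is made up.

```python
# Oddballs named in the post, checked before the general prefixes.
EXCEPTIONS = {
    "x.yyy.zzz": "SCI0",   # Police Quest 2, PC-98
    "2.000.274": "SCI11",  # Laura Bow 2
    "L.rry.083": "SCI11",  # Quest for Glory 3
    "l.cfs.081": "SCI11",  # Freddy Pharkas
}

# Order matters: "1.001" has to be tried before the catch-all "1.".
PREFIXES = [
    ("0.000", "SCI0"),
    ("S.old", "SCI01"),
    ("1.001", "SCI11"),
    ("1.", "SCI1"),        # covers 1.000 as well as 1.ECO and 1.SQ4
    ("2.000", "SCI2"),
    ("2.100", "SCI21"),
    ("3.000", "SCI3"),
]


def generation(version: str):
    """Return the fan name for an interpreter version, or None if unknown."""
    if version in EXCEPTIONS:
        return EXCEPTIONS[version]
    for prefix, name in PREFIXES:
        if version.startswith(prefix):
            return name
    return None
```

For example, `generation("1.SQ4.057")` gives "SCI1" and `generation("2.000.274")` gives "SCI11", because the exceptions are consulted first.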

I’m thinking after the switch to 32 bits, they must’ve stopped automatically bumping version numbers on build.

So what does Sierra call them, then? Well, sources say that Sierra called the 32-bit interpreters SCI32, and the source code archive that I based SCI11+ on was SCI16.ZIP. But none of the changelogs and such seem to refer to SCI0, SCI1, or whatever.

 

 

Happy slightly belated new year 🥂


Objects, functions, properties, and methods

Whether you’re trying to interpret SCI code in its source form, or compile it into bytecode, there are some inferences to make. Consider the following statements:

(foo1 bar:)
(foo2 bar: 42)
(foo3 69)
(foo4)

You can have object references, kernel calls, and local function calls, and those object references can be local instances or pointers which in turn can be stored in global variables, local variables, variables temporary to the current function or method, or properties of the current method’s object. How would you determine what each foo is?

First, you can see if there is a second item in the expression. If that item ends in :, like in the first two cases, you know that’s a selector so the identifier at the start must be an object reference of some sort. If it’s not, like in the third case, or if there is no second item at all, it must be a function or kernel call since anything else would be an error.

For the first two examples, we now know that foo must be an object. Having looked through the whole script before, we already have a list of all the parameters, temporary variables, local variables, and those from script 0, which are global. If any of those contains an item by that name, we know it’s a pointer to dereference. If it’s a local or imported object’s name, we’d be able to find that as well and can continue on. If it doesn’t appear in any of these six lists, the source code is in error.

For the other two examples, we know it must be a function or kernel call. There are three lists to check this time, being the local functions, imported functions, and kernels. Other than that, things are the same as before.
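
The decision process so far can be sketched in Python. Expressions are assumed to be already split into a list of tokens, and the scope dictionary with its key names is made up for the example.

```python
VARIABLE_LISTS = ("parameters", "temps", "locals", "globals")
OBJECT_LISTS = ("local objects", "imported objects")
CALL_LISTS = ("local functions", "imported functions", "kernels")


def classify_head(expr, scope):
    """Work out what the first item of an expression refers to."""
    head = expr[0]
    second = expr[1] if len(expr) > 1 else None
    if isinstance(second, str) and second.endswith(":"):
        # A selector follows, so the head must be an object reference.
        for kind in VARIABLE_LISTS:
            if head in scope[kind]:
                return ("pointer", kind)
        for kind in OBJECT_LISTS:
            if head in scope[kind]:
                return ("object", kind)
        raise NameError(f"{head} is not a known object")
    # No selector, so it has to be a function or kernel call.
    for kind in CALL_LISTS:
        if head in scope[kind]:
            return ("call", kind)
    raise NameError(f"{head} is not a known function")


scope = {
    "parameters": set(), "temps": {"foo1"}, "locals": set(), "globals": set(),
    "local objects": {"foo2"}, "imported objects": set(),
    "local functions": {"foo4"}, "imported functions": set(),
    "kernels": {"foo3"},
}
```

With that scope, `classify_head(["foo1", "bar:"], scope)` reports a pointer held in a temp, and `classify_head(["foo3", 69], scope)` reports a kernel call.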

That leaves the matter of selectors. They can refer to either properties or methods, which are… actually rather trivial to tell apart considering the objects have two dictionaries, one for each type. Objects may have superclass chains reaching all the way to the Base Object and inherit properties and methods from those superclasses, but you might consider folding those superclasses’ dictionaries into the object instance’s so there’s only two to scan through.

Let’s say bar is a property. The first example would then mean “take the foo object and return its bar property’s value.” Likewise the second would mean “set it to this expression.” If it’s a method, you’re given a pointer to that method’s code (which may be unique to that instance, having overridden whatever the superclass chain started with) and you can pass it each non-selector argument in turn, until the next selector. I wrote about that before.
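
Here is a small Python sketch of the folding idea and the property-or-method decision. It has nothing to do with how objects are laid out in a real script resource: the class and function names are invented, and only one selector per send is handled.

```python
class SciClass:
    """Hypothetical object with a property dictionary and a method dictionary."""

    def __init__(self, name, superclass=None, properties=None, methods=None):
        self.name = name
        self.superclass = superclass
        self.properties = dict(properties or {})
        self.methods = dict(methods or {})

    def folded(self):
        """Merge the superclass chain into two dictionaries, nearest wins."""
        if self.superclass is None:
            props, methods = {}, {}
        else:
            props, methods = self.superclass.folded()
        props.update(self.properties)
        methods.update(self.methods)
        return props, methods


def send(obj, selector, *args):
    props, methods = obj.folded()
    if selector in props:
        if not args:
            return props[selector]           # (foo bar:) reads the property
        obj.properties[selector] = args[0]   # (foo bar: 42) sets it
        return args[0]
    if selector in methods:
        return methods[selector](obj, *args)  # call with the given arguments
    raise AttributeError(f"{obj.name} has no selector {selector}")


base = SciClass("Obj", properties={"bar": 1},
                methods={"double": lambda self, n: n * 2})
foo = SciClass("foo", superclass=base)
```

Reading `bar` on `foo` falls through to the superclass value, setting it only changes `foo`, and `double` is found through the folded method dictionary.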
