String functions in SCI11 and SCI2 compared

Leaving out the oddly-named StrSplit in SCI01, let’s get into the other string functions we’ve got. I have an idea that I’d like to ponder, y’see?

First up, in the old 16-bit SCI, or at least SCI11, we have the following kernel functions:

(StrCmp strA strB) Compares strA to strB until a null in strA or a mismatch. Returns 0 if the two strings match, something lower than zero if the first mismatched character is lower in strA, something higher than zero if it’s lower in strB.
(StrCmp strA strB maxLen) Same as (StrCmp strA strB), but only up to the first maxLen characters.
(StrLen str) Returns the number of characters in str.
(StrCpy strDest strSrc) Copies characters from strSrc into strDest, up to and including the null terminator. It’s up to you to ensure it fits.
(StrCpy strDest strSrc maxLen) If maxLen is positive, copies characters from strSrc to strDest up to and including the null terminator or up to maxLen characters, whichever comes first. A terminator is ensured. If maxLen is negative, simply copies that many characters (the absolute value of maxLen) and damn the terminators.
(StrEnd str) Returns a pointer to the end of str. Effectively, str += strlen(str); in C.
(StrCat strA strB) Appends strB at the end of strA. It’s up to you to ensure this fits.
(StrAt str pos) Returns the character at pos in str.
(StrAt str pos newChar) Same as (StrAt str pos), but places newChar at pos, returning what was there.
(Format strDest format args...) Takes the format string and all the args, and prints it all to strDest. The format and any args for a %s placeholder can also be far text pairs.
(ReadNumber str) Tries to parse str as a string of digits and returns their value.
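To make the three StrCpy behaviours a bit more concrete, here is a rough Python model working on a fixed-size bytearray. It is only a sketch of the description above, not the interpreter's code. The name str_cpy is made up, and two details are assumptions: that the ensured terminator lands right after the copied characters, and that a maxLen of zero copies nothing.

```python
from typing import Optional


def str_cpy(dest: bytearray, src: bytes, max_len: Optional[int] = None) -> bytearray:
    """Model the StrCpy variants described above on a fixed-size buffer."""
    end = src.find(0)
    text = src if end < 0 else src[:end]  # characters before the null terminator

    if max_len is None:
        # Two-argument form: everything plus the terminator.
        data = text + b"\0"
    elif max_len > 0:
        # Positive maxLen: at most max_len characters, terminator always written.
        # Assumption: the terminator goes right after the copied characters.
        data = text[:max_len] + b"\0"
    else:
        # Negative maxLen: raw copy of -max_len bytes, no terminator added.
        # Assumption: zero ends up here too and copies nothing.
        data = src[:-max_len]

    if len(data) > len(dest):
        # The real thing would happily overrun the buffer; the model refuses.
        raise ValueError("copy does not fit in dest")
    dest[:len(data)] = data
    return dest
```

With an eight-byte buffer full of "X", copying "abcdef" with a maxLen of 3 leaves "abc", a null, and four untouched "X" bytes, while a maxLen of -3 leaves "abc" followed by five "X" bytes and no terminator.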

That’s a fair amount. It’s nice to have StrAt when you consider all numbers are inherently 16 bits wide and as such you can’t just manually work your way around a string. We’ve seen it around in hash calculations and dropcaps.
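To see why StrAt is handy, here is a small Python sketch of what it saves you from doing by hand. It assumes a string stored as a run of 16-bit values with two characters packed into each, low byte first; that layout and the name str_at are assumptions for illustration, not something taken from the interpreter.

```python
from typing import List, Optional


def str_at(words: List[int], pos: int, new_char: Optional[int] = None) -> int:
    """Return the character at pos; optionally replace it with new_char."""
    index, is_high = divmod(pos, 2)  # which 16-bit word, and which half of it
    shift = 8 if is_high else 0      # assumption: low byte holds the first character
    old = (words[index] >> shift) & 0xFF
    if new_char is not None:
        cleared = words[index] & ~(0xFF << shift) & 0xFFFF
        words[index] = cleared | ((new_char & 0xFF) << shift)
    return old
```

Given the words 0x6548 and 0x006C, which spell "Hel" under this layout, asking for position 1 yields 0x65 ("e"), and all the shifting and masking stays hidden behind one call.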

As an aside, the Format entry mentions far text pairs. Those refer to text resources, where instead of doing something like (Display "Hello World!") you’d do something like (Display 100 4) and have a text resource #100, where line #4 is “Hello World!”. This allows for more efficient memory use and ease of translation. In SCI0, you could only have up to 1000 resources of each type, from 0 to 999, while a script’s internal strings would be referenced with pointers that are always higher than 1000. This lets both the interpreter and scripts tell the two apart and fetch the actual string when called for. In the original SC compiler, there were in fact two ways to write strings. You could use "double quotes" as usual, or {curly braces}. One of these would be left as “near” strings in the script resource, the other would be automagically compiled into the script’s matching text resource as “far” strings. Neither SCI Companion nor SCI Studio supports this distinction, and you can write any string in either style. I personally prefer the quotes.
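The near-or-far check boils down to one comparison against 1000. A minimal Python sketch, with everything in it hypothetical: the resolve_text name, the two lookup tables standing in for text resources and script memory, and the assumption that line numbers count from zero.

```python
# Hypothetical stand-ins for text resources and for strings in script memory.
TEXT_RESOURCES = {100: ["", "", "", "", "Hello World!"]}
NEAR_STRINGS = {0x2F40: "A near string"}


def resolve_text(args):
    """Take one string off the front of args; return (string, remaining args)."""
    first = args[0]
    if first < 1000:
        # Too small to be a pointer, so it's a resource number and the
        # next argument is the line within that resource: a far text pair.
        return TEXT_RESOURCES[first][args[1]], args[2:]
    # Anything else is treated as a pointer to a near string.
    return NEAR_STRINGS[first], args[1:]
```

So resolve_text([100, 4]) eats both numbers and yields "Hello World!", while resolve_text([0x2F40, 7]) only eats the pointer and leaves the 7 for whatever argument comes next.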

Now, in SCI2 and later, most of these separate kernel calls were consolidated into a single one with a bunch of subcommands: String. A few of these are wrappers around the Array kernel call, considering SCI2 strings are implemented as arrays of type string, but there are plenty of proper string functions. Any function that may resize the string returns its new address.

(String StrNew size) Creates a new string data block (array of type String) of the given size.
(String StrSize str) Returns the size of the string.
(String StrAt str pos) Returns the character at pos in the string, or zero if it’s not that long.
(String StrAtPut str pos newChar) Sets the character at pos in the string, resizing it if it’s not that long.
(String StrFree str) Deallocates the string data block’s memory space.
(String StrFill str startPos length fillVal) Sets a whole range in the string to the given fillVal, resizing if needed.
(String StrCpy strDest destPos strSrc srcPos len) Copies a chunk of characters from strSrc to strDest, resizing if needed.
(String StrCmp strA strB) Compares strA and strB, as in SCI11.
(String StrCmp strA strB maxLen) Compares strA and strB up to maxLen, as in SCI11.
(String StrDup str) Duplicates the string block and returns the address of the duplicate.
(String StrGetData str) Returns a pointer to the string’s actual data.
(String StrLen str) Returns the length of the string’s actual data, up to the null terminator, as opposed to its containing array’s capacity.
(String StrFormat format args...) Takes the format and all args, printing it all to a new string, then returns the address of that new string.
(String StrFormatAt strDest format args...) Same as StrFormat but you provide an existing string to format to.
(String StrToInt str) Tries to parse str as a string of digits and returns their value.
(String StrTrim str flags) Removes whitespace from str. If flags is 1, all whitespace at the end is removed. If it’s 4, all whitespace at the front is removed. If it’s 2, all whitespace in between is removed. These values can be combined.
(String StrTrim str flags notThis) Same, but doesn’t consider notThis to be whitespace.
(String StrUpr str) Converts the string to uppercase.
(String StrLwr str) Converts the string to lowercase.
(String StrTrn strSrc strSrcPat strDestPat strDest) I honestly haven’t a clue. I never understood this one.
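The StrTrim flags are the fiddliest of the bunch, so here is a hedged Python model of how they might combine. The constant names, the str_trim name, the exact set of whitespace characters, and the handling of an all-whitespace string are all assumptions; only the values 1, 2 and 4 and the notThis exception come from the list above.

```python
TRIM_END = 1     # whitespace at the end
TRIM_MIDDLE = 2  # whitespace in between
TRIM_START = 4   # whitespace at the front

WHITESPACE = " \t\r\n"  # assumption: what counts as whitespace


def str_trim(text: str, flags: int, not_this: str = "") -> str:
    """Remove whitespace from text according to the flag bits."""
    def is_ws(ch: str) -> bool:
        return ch in WHITESPACE and ch != not_this

    # Find the span between the first and last non-whitespace characters.
    start = 0
    while start < len(text) and is_ws(text[start]):
        start += 1
    if start == len(text):
        # Assumption: an all-whitespace string empties out when trimming either end.
        return "" if flags & (TRIM_START | TRIM_END) else text
    end = len(text)
    while is_ws(text[end - 1]):
        end -= 1

    lead, middle, tail = text[:start], text[start:end], text[end:]
    if flags & TRIM_START:
        lead = ""
    if flags & TRIM_MIDDLE:
        middle = "".join(ch for ch in middle if not is_ws(ch))
    if flags & TRIM_END:
        tail = ""
    return lead + middle + tail
```

Under these assumptions, trimming "  a b  " with flags 5 gives "a b", with flags 2 gives "  ab  ", and with flags 7 gives "ab".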

Now consider the following: these are all one and the same kernel call, and they include some functions that aren’t in the 16-bit interpreters such as case-folding and trimming. Wouldn’t it be nice? They don’t even have to be based on arrays, even if that’s a feature I’ve been working on backporting to SCI11+.
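For what "one and the same kernel call" means in practice, here is a tiny Python sketch of a single entry point that dispatches on a subcommand. It is purely illustrative: string_kernel and the handler table are hypothetical, and names stand in for the real subcommand numbers. Only three subcommands are modelled, on plain Python strings rather than arrays.

```python
# Hypothetical handler table; keys stand in for the subcommand numbers.
STRING_SUBCOMMANDS = {
    "StrLen": lambda s: len(s.split("\0", 1)[0]),  # length up to the terminator
    "StrUpr": lambda s: s.upper(),
    "StrLwr": lambda s: s.lower(),
}


def string_kernel(subcommand: str, *args):
    """Single entry point: look up the subcommand and hand it the remaining args."""
    try:
        handler = STRING_SUBCOMMANDS[subcommand]
    except KeyError:
        raise ValueError("unknown String subcommand: " + subcommand)
    return handler(*args)
```

Adding something like trimming would then be one more table entry rather than one more kernel call.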
