

Shampoo, Cologne, Mousse

(Edited from a Twitter rant in nine parts.)

I’ve complained about Shampoo’s name and how fanfic writers tend to write her “original Chinese name” often enough. I’d like to discuss Cologne and Mousse this time.

Now then. Knowing that Shampoo is the only one of the three with a name in actual Chinese characters, Cologne and Mousse are only ever written in katakana: コロン and ムース. Koron and Mūsu. Simple, right? That’s basically exactly how the products would be pronounced, just like with Shampoo.

Problem #1: Cologne’s fandom name is usually written as Khu Lon. Sometimes I think without the H. First of all, I can find no romanization scheme where khu is a valid sound, written that way. There’s khu but that’s in a scheme that’s not even used, and kuh isn’t quite it either. Second, it seems to me to be simply the wrong sound. The vowel is all wrong!

Problem #2: Mousse is usually given the name Mu Tsu, Mu Tse, and in this one doorstopper I’m slogging through, Mse Tsu. I already covered how absurdly wrong that third one is. If these were at all right, wouldn’t his name be written ムー, tsu? But it’s not. There is no T sound in either his established katakana writing or in his name as spoken.

Shampoo’s name, as covered before, is shanpū in Japanese, with a lengthened -u. It’s shānpū in Chinese, where I believe the accent marks denote tone? Correct me on that if I’m wrong. So at worst the length of the -u is different. Likewise, Ranma’s name “translates” to Chinese as luanma. Exactly the same kanji (乱馬) and all that. Readings are cool like that. Between the well-known L/R difficulty and an easily drowned out u, that’s also basically the same when pronounced.

Why then, in the name of logic, should Cologne and Mousse have their names so different? At least back when they started, fanfic authors had no way to look this shit up; there was no Wiktionary or such back then. You really have no excuse now. You can do better than this. If you were to tell me “but those are their names”, you basically admit to being both uninformed and too lazy. “That’s how we’ve always called it in the fandom” is arguably better, but again you can do better.

Since we can look shit up, let’s actually look this shit up! What is “eau de cologne” in Mandarin Chinese? It’s 科隆香水, kēlóng xiāngshuǐ. “Mousse” as in the hair product? That’s 摩絲, mósī.

Shānpū, Kēlóng, Mósī. Research over.

But wait. Kēlóng? Ke? Didn’t I say earlier that the vowel is all wrong? I did. It’s a romanization difference. The zhuyin is ㄎㄜ. I gave their names just now in hanyu pinyin. In Wade-Giles, it’s written with an o. Simple. Mousse’s ㄇㄛ is mo in both.

Just for fun, I thought I’d look up some common fanmazon names, but reconsidered when I found Chinese has a perfectly good word for perfume. Oh well, I’d made my point already 🤷‍♀️


More regarding Interrupt 21

Last time I explained how your standard file rename function as seen in MS-DOS worked. You’d set up two CPU registers with pointers to the old and new names, set AH to 0x56, and called Int 0x21. Easy, right? And then I went into detail on how malformed inputs were handled. They weren’t handled too well, and DOSBox does it differently from MS-DOS on top of that.

But what if we had a file system and rename function that did support spaces? Maybe more than eight characters, even? In mixed case?

That is of course VFAT, an extension to regular FAT16 available in Windows 95, NT 3.5, and later. With a VFAT driver, most of the old file operations available from Int 0x21 had counterparts installed that generally took the same arguments and had the same numbers, but accepted long filenames.

So to rename a file with long filename support, you’d do exactly what you’d do before but instead of setting AH to 0x56 you’d set AX to 0x7156. Assuming Windows is running and we use the same inputs as last time, your file will now be named hello world.txt. And that’s all it takes, even if it’s a pure DOS program doing it.

Which raises a question. How do you make a pure DOS program that handles files that may have long names, may be run from Windows, and should not drop any of those long names if it is in fact running in Windows? Well, it turns out all those LFN functions (the ones starting with 0x71) reset AX to 0x7100 if they’re not installed. A trick of the system, I suppose. So what you could do for your LFN-enabled rename function is try to use 0x7156, see if AX has reset to 0x7100, and if it has, try again with AH set to 0x56. In other words, it’s time to bring back the rename function from SCI11… or rather a branch of SCI11+ that I’ve been working on.

rename	proc	oldName:ptr byte, newName:ptr byte
	mov	dx, oldName	; ds:dx = old name
	push	ds
	pop	es
	mov	di, newName	; es:di = new name
 
	mov	ax, 7156h	; LFN Rename
	int	21h
	.if	ax == 7100h	; LFN failed, try DOS 2.0 version
		mov	ah, 56h
		int	21h
	.endif
 
	.if	carry?
		xor	ax, ax
	.endif
	ret
rename	endp

It’s that easy. Of course, this is old-school MASM code which has some nice things like .if but that’s just sugar to avoid having to write compares and branches — the concept should be clear enough. An attempt to rename a file to Introduction.txt will result in exactly that on Windows, or transparently collapse to introduc.txt on plain DOS.
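For those who don't read MASM, here's a rough C sketch of the same try-LFN-then-fall-back idea. It can't issue real interrupts, so the call is abstracted behind a hypothetical `int21_fn` callback, and the three `fake_*` handlers are made-up stand-ins for testing. The return convention (0 on success, the DOS error code otherwise) is also my assumption for this sketch and is not taken from the listing above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* What one simulated INT 21h call hands back: AX and the carry flag. */
typedef struct { uint16_t ax; int carry; } dos_result;

/* Hypothetical stand-in for "load the registers, execute INT 21h". */
typedef dos_result (*int21_fn)(uint16_t ax, const char *old_name,
                               const char *new_name);

/* Try the LFN rename first; if AX comes back as 0x7100 the LFN API is
   not installed, so retry with the classic DOS 2.0 call.
   Returns 0 on success, otherwise the error code left in AX. */
uint16_t rename_with_fallback(int21_fn int21, const char *old_name,
                              const char *new_name)
{
    assert(int21 != NULL);
    dos_result r = int21(0x7156, old_name, new_name);
    if (r.ax == 0x7100) {
        /* Same effect as "mov ah, 56h" on AX = 0x7100: AX = 0x5600. */
        r = int21(0x5600, old_name, new_name);
    }
    return r.carry ? r.ax : 0;
}

/* Fake "Windows": the LFN call is installed and succeeds. */
dos_result fake_windows(uint16_t ax, const char *o, const char *n)
{
    (void)o; (void)n;
    if (ax == 0x7156) return (dos_result){ 0, 0 };
    return (dos_result){ 1, 1 };   /* anything else: invalid function */
}

/* Fake "plain DOS": no LFN API, but the classic call succeeds. */
dos_result fake_plain_dos(uint16_t ax, const char *o, const char *n)
{
    (void)o; (void)n;
    if ((ax >> 8) == 0x71) return (dos_result){ 0x7100, 1 };
    return (dos_result){ 0, 0 };
}

/* Fake "plain DOS" where the old file does not exist (error 2). */
dos_result fake_dos_not_found(uint16_t ax, const char *o, const char *n)
{
    (void)o; (void)n;
    if ((ax >> 8) == 0x71) return (dos_result){ 0x7100, 1 };
    return (dos_result){ 2, 1 };
}
```

Calling `rename_with_fallback(fake_plain_dos, "boop.txt", "Introduction.txt")` goes through both calls and reports success, while the `fake_windows` handler never sees the second call at all.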

Note that in the actual SCI11+ code, if you’re crazy enough to look it up, there’s an extra function I made that’s called right before the DOS 2.0 rename call and replaces all spaces with underscores, which makes the resulting files a whole lot less confusing and untouchable than the one shown last time. I left that part out for brevity.


This is why you sanitize your inputs, 1983 edition

(This is heavily expanded from a few Twitter posts of mine.)

When you write an application that has to rename a file, you have your chosen language and platform’s standard library to do the heavy lifting for you. For example in C it’s usually int rename(const char* oldName, const char* newName), and a bunch of other languages follow suit. Why not, it’s a good function! But what does rename actually do?

In MS-DOS, this’d be handled by Interrupt 0x21, subfunction AH 0x56. By which I mean it’d set two specific processor registers (as mentioned in Save Early, Save How) to point to the old and new file names, set the AH register to 0x56, and execute the INT 0x21 instruction. A function installed by MS-DOS will then take over, doing the actual renaming, possibly returning an error value which the C function can immediately use as its return value. Since SCI has its own “need-to-use” library…

rename	proc	oldName:ptr byte, newName:ptr byte
	mov	dx, oldName	; ds:dx = old name
	push	ds
	pop	es
	mov	di, newName	; es:di = new name
 
	mov	ah, 56h
	int	21h
 
	.if	carry?
		xor	ax, ax
	.endif
	ret
rename	endp

(Full disclosure: the SCI code actually includes a dos macro to save the programmers some typing. I unrolled it here for illustration purposes.)

All of this pretty much matches what you can find on Ralf Brown’s list. Given a suitable function prototype in C such as the one in the second paragraph, SCI can now call its own rename function as it desires.

Enough about SCI though, its function as a practical example is at an end.

But what if you gave it bad inputs? Sure, if the old name doesn’t refer to an existing file it will return 2 “file not found”, but what if the new name isn’t quite valid? Remember, this is MS-DOS; we don’t have the luxury of long file names here. It’s 8.3 or bust. I don’t see any sanity checks in the above function, and Brown’s documentation only speaks of splats.

So what happens if we have a file boop.txt and call rename("boop.txt", "hello world.txt")?

In DOSBox, you’d end up with a file hellowor.txt. You are free to further manipulate this file in any way you please. The command line won’t choke on it, file managers won’t get confused. If you wanted to manually rename it back to boop.txt from the command line, ren hellowor.txt boop.txt will work perfectly fine.

This is actually not true in real MS-DOS. If your program were to run on a real MS-DOS installation, you’d end up with hello wo.txt, an 8.3 file with a space in it. And no contemporary file manager I’ve seen can handle that. The ren command built into command.com can’t parse it — ren hello wo.txt boop.txt is three arguments where ren expects only two, and the first isn’t an existing file’s name that it can change to wo.txt.
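To make the difference concrete, here's a small C sketch that reproduces the two outcomes described above. It is not the actual logic of DOSBox or MS-DOS, just my guess at the simplest rule that gives the same results: cut the base name to eight characters, either dropping spaces first or keeping them. The function name `shorten_83` and its `strip_spaces` switch are invented for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Shorten `name` to at most 8 base characters plus an extension of at
   most 3 characters. With strip_spaces set, spaces in the base name are
   dropped before counting (the outcome seen in DOSBox); without it they
   are kept (the outcome seen in real MS-DOS).
   `out` must hold 13 bytes: 8 + '.' + 3 + NUL. */
void shorten_83(const char *name, int strip_spaces, char out[13])
{
    assert(name != NULL && out != NULL);

    const char *dot = strrchr(name, '.');
    size_t base_len = dot ? (size_t)(dot - name) : strlen(name);
    size_t n = 0;

    /* Copy up to eight base-name characters. */
    for (size_t i = 0; i < base_len && n < 8; i++) {
        if (strip_spaces && name[i] == ' ')
            continue;
        out[n++] = name[i];
    }

    /* Copy the dot and up to three extension characters, if any. */
    if (dot) {
        out[n++] = '.';
        for (size_t i = 1; i <= 3 && dot[i] != '\0'; i++)
            out[n++] = dot[i];
    }
    out[n] = '\0';
}
```

With `"hello world.txt"` as input, `strip_spaces = 1` gives `hellowor.txt` and `strip_spaces = 0` gives `hello wo.txt`, the troublemaker.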

In cmd.exe of course you can use double quotes to make it unambiguously two arguments, but this isn’t cmd.exe. What about some file managers though? I have two, Norton Commander and its big brother Norton Desktop.

In Norton Commander, the file list shows hello wo.txt, and its rename function can handle it. So can the built-in editor and viewer. Top marks for Norton Commander!

Norton Desktop on the other hand is not so sturdy. It can show the file in the list but that’s all. Trying to rename it back to boop.txt reveals the incorrectness of the source file’s name quite succinctly:

Technically, this is true. You’re not supposed to have spaces in the middle of a FAT 8.3 file name. If a file has fewer than eight characters before the dot, it’s secretly padded with spaces, and so are the three extension characters. And the dot isn’t even stored: the true name as written in the FAT directory would be BOOP    TXT. But that’s just one way Norton Desktop trips. Its viewer seems to be passed the nonexistent hello. It shrugs and asks which existing file we want to open. Its editor is given the same argument(s?) and lets us edit a brand new file named hello. In Norton Desktop’s world, it can see the file, but it can’t do much with it.
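The padding is easy to show in a few lines of C. This is only a sketch of the 11-byte layout (eight name bytes, three extension bytes, upper-cased, space-padded, no dot); `to_fat_name` is a name I made up, and it does no validation of which characters are legal.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Turn a name like "boop.txt" into the 11-byte form kept in a FAT
   directory entry. `out` must hold 12 bytes (11 + NUL terminator). */
void to_fat_name(const char *name, char out[12])
{
    assert(name != NULL && out != NULL);

    /* Start with all spaces so short names end up padded. */
    memset(out, ' ', 11);
    out[11] = '\0';

    const char *dot = strrchr(name, '.');
    size_t base_len = dot ? (size_t)(dot - name) : strlen(name);
    if (base_len > 8)
        base_len = 8;

    /* Base name goes in bytes 0..7. */
    for (size_t i = 0; i < base_len; i++)
        out[i] = (char)toupper((unsigned char)name[i]);

    /* Extension goes in bytes 8..10; the dot itself is never stored. */
    if (dot) {
        const char *ext = dot + 1;
        for (size_t i = 0; i < 3 && ext[i] != '\0'; i++)
            out[8 + i] = (char)toupper((unsigned char)ext[i]);
    }
}
```

Feeding it `"boop.txt"` yields `BOOP    TXT`, and the problem file `"hello wo.txt"` becomes `HELLO WOTXT`, space and all.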

What about a contemporary Windows? Can, let’s say, the Notepad from Windows 3.1 handle this file? Okay, so technically this is commdlg.dll talking, but we’re playing for effect here.

Of course not, what did you expect by now!? Norton Commander only worked because it didn’t care enough! Would you really think one of the companies who made the FAT file system would blithely ignore one of the cardinal rules at the time?

Pshaw!

Next time, we gettin’ hacky.

 

…Wait, hold up. Why does it say 1983 in the title? Well, if you notice on Ralf Brown’s site the rename function was introduced in DOS 2, which was first released in 1983. And so was I.


Snappy and Clear

(This was originally posted as a Twitter thread.)

I’d like to talk about bad words for a moment. Specifically, words used as tags and such for porn site content. So there’s your content warning right there.

“Dickgirl”. (pause for audience gasp) There are those who consider this word to be a Bad Word™. I respectfully disagree. I think this word is snappy and clear, like a content tag word ought to be.

(Now, I’d like to interrupt this repost to clarify that this is not about how it’s somehow not bad. It’s about how on another level it’s good. This was never meant to convince anyone otherwise, as we all know this to be impossible. With that in mind, back to the repost.)

Imagine, if you will, a fresh-faced pervert’s first go on the Internet. Our freshly-hatched pervert finds a porn site with a comprehensive tagging system that includes hentai and its related genres and tropes. He finds an image set or a comic or such tagged “dickgirl, bukkake” among other things. What do you suppose this person, who has never seen these two words before (I know, incredible), expects to find upon reading these tags and opening the comic?

At least one chick with a dick, and the other thing is a surprise.

Y’see, when you see the word “dickgirl”, there’s only a few things you can take it to possibly mean, and only one or two make enough sense to likely be correct. It refers, of course, to a girl with a dick. It’s a hentai comic, it’s allowed to have weird shit okay?

But yeah. Two short syllables, each a perfectly clear word on its own, and there’s very little doubt as to what it means. The opposite, “cuntboy”, is exactly the same in all regards. No need to repeat myself there. “Bukkake” on the other hand… oh boy. Oooh boy! First of all, it’s Japanese. Our (I’ll remind you) uninformed perv doesn’t speak the language, so he has nothing to go on. Second of all, it’s got a non-sexy meaning too, that came earlier. Something about pouring water on your noodles from higher up? It’s not that snappy, and it’s certainly unclear as all get out. If our perv were Japanese or otherwise familiar with the other meaning, he might justifiably think “oh, the dickgirls are gonna eat noodles afterwards.” Boy is he in for a surprise!

You could say the same about “futanari”. Again, Japanese. Means, roughly, “two forms”. And frankly, reading that I’d sooner think of giant fighting robots that turn into jet planes and such than a girl with too many bits down her panties.

And please, don’t even think of suggesting “pre-op transwoman”. It’s not at all snappy, and it’s frankly too specific. Surprising, I know. Not all female-presenting dick-having characters in these stories are trans, okay? Like 99% aren’t!

(“you could also argue that while it’s definitely a bad idea to call real-life trans women by porn terms, it’s probably a lesser bad idea to call porn by the current words used by trans women, in that it may lead for those terms to become sexualized/pornographic.” “Exactly why you just keep calling dickgirl porn dickgirl porn.”)

Anyway thanks for listening to my TED talk. I’m Kawa, hobbyist linguist, and all-round lazy bastard.
