Logo Pending


More regarding Interrupt 21

Last time I explained how your standard file rename function as seen in MS-DOS worked. You’d set up two CPU registers with pointers to the old and new names, set AH to 0x56, and called Int 0x21. Easy, right? And then I went into detail on how malformed inputs were handled. They weren’t handled too well, and DOSBox does it differently from MS-DOS on top of that.

But what if we had a file system and rename function that did support spaces? Maybe more than eight characters, even? In mixed case?

That is of course VFAT, an extension to regular FAT16 available in Windows 95, NT 3.5, and later. With a VFAT driver, most of the old file operations available from Int 0x21 had counterparts installed that generally took the same arguments and had the same numbers, but accepted long filenames.

So to rename a file with long filename support, you’d do exactly what you’d do before but instead of setting AH to 0x56 you’d set AX to 0x7156. Assuming Windows is running and we use the same inputs as last time, your file will now be named hello world.txt. And that’s all that takes, even if it’s a pure DOS program doing it.

Which raises a question. How do you make a pure DOS program that handles files that may have long names, may be run from Windows, and should not drop any of those long names if it is in fact running in Windows? Well, it turns out all those LFN functions — the ones starting with 0x71, all reset AX to 0x7100 if they’re not installed. A trick of the system, I suppose. So what you could do for your LFN-enabled rename function is try to use 0x7156, see if AX has reset to 0x7100, and if it has, you try again with AH set to 0x56. In other words, it’s time to bring back the rename function from SCI11… or rather a branch of SCI11+ that I’ve been working on.

rename	proc	oldName:ptr byte, newName:ptr byte
	mov	dx, oldName	; ds:dx = old name
	push	ds
	pop	es
	mov	di, newName	; es:di = new name
 
	mov	ax, 7156h	; LFN Rename
	int	21h
	.if	ax == 7100h	; LFN failed, try DOS 2.0 version
		mov	ah, 56h
		int	21h
	.endif
 
	.if	carry?
		xor	ax, ax
	.endif
	ret
rename	endp

It’s that easy. Of course, this is old-school MASM code which has some nice things like .if but that’s just sugar to avoid having to write compares and branches — the concept should be clear enough. An attempt to rename a file to Introduction.txt will result in exactly that on Windows, or transparently collapse to introduc.txt on plain DOS.

Note that in the actual SCI11+ code, if you’re crazy enough to look it up, there’s an extra function I made that’s called right before the DOS 2.0 rename call that replaces all spaces with underscores, which renders them about 100% not as confusing and untouchable as the one shown last time. I left that part out for brevity.

[ , ] Leave a Comment

This is why you sanitize your inputs, 1983 edition

(This is heavily expanded from a few Twitter posts of mine.)

When you write an application that has to rename a file, you have your chosen language and platform’s standard library to do the heavy lifting for you. For example in C it’s usually int rename(const char* oldName, const char* newName), and a bunch of other languages follow suit. Why not, it’s a good function! But what does rename actually do?

In MS-DOS, this’d be handled by Interrupt 0x21, subfunction AH 0x56. By which I mean it’d set two specific processor registers (as mentioned in Save Early, Save How) to point to the old and new file names, set the AH register to 0x56, and execute the INT 0x21 instruction. A function installed by MS-DOS will then take over, doing the actual renaming, possibly returning an error value which the C function can immediately use as its’ return value. Since SCI has its own “need-to-use” library…

rename	proc	oldName:ptr byte, newName:ptr byte
	mov	dx, oldName	; ds:dx = old name
	push	ds
	pop	es
	mov	di, newName	; es:di = new name
 
	mov	ah, 56h
	int	21h
 
	.if	carry?
		xor	ax, ax
	.endif
	ret
rename	endp

(Full disclosure: the SCI code actually includes a dos macro to save the programmers some typing. I unrolled it here for illustration purposes.)

All of this pretty much matches what you can find on Ralph Brown’s list. Given a suitable function prototype in C such as the one in the second paragraph, SCI can now call its own rename function as it desires.

Enough about SCI though, its function as a practical example is at an end.

But what if you gave it bad inputs? Sure, if the old name doesn’t refer to an existing file it will return 2 “file not found”, but what if the new name isn’t quite valid? Remember, this is MS-DOS; we don’t have the luxury of long file names here. It’s 8.3 or bust. I don’t see any sanity checks in the above function, and Brown’s documentation only speaks of splats.

So what happens if we have a file boop.txt and call rename("boop.txt", "hello world.txt")?

In DOSBox, you’d end up with a file hellowor.txt. You are free to further manipulate this file in any way you please. The command line won’t choke on it, file managers won’t get confused. If you wanted to manually rename it back to boop.txt from the command line, ren hellowor.txt boop.txt will work perfectly fine.

This is actually not true in real MS-DOS. If your program were to run on a real MS-DOS installation, you’d end up with hello wo.txt, an 8.3 file with a space in it. And no contemporary file manager I’ve seen can handle that. The ren command built into command.com can’t parse it — ren hello wo.txt boop.txt is three arguments where ren expects only two, and the first isn’t an existing file’s name that it can change to wo.txt.

In cmd.exe of course you can use double quotes to make it unambiguously two arguments, but this isn’t cmd.exe. What about some file managers though? I have two, Norton Commander and its big brother Norton Desktop.

In Norton Commander, the file list shows hello wo.txt, and its rename function can handle it. So can the built-in editor and viewer. Top marks for Norton Commander!

Norton Desktop on the other hand is not so sturdy. It can show the file in the list but that’s all. Trying to rename it back to boop.txt reveals the incorrectness of the source file’s name quite succinctly:

Technically, this is true. You’re not supposed to have spaces in the middle of a FAT 8.3 file name. If a file has less than eight characters before the dot, it’s secretly padded with spaces, and so are the three extension characters. And the dot isn’t even — the true name as written in the FAT directory would be BOOP    TXT. But that’s just one way Norton Desktop trips. Its viewer seems to be passed the nonexistent hello. It shrugs and asks which existing file we want to open. Its editor is given the same argument(s?) and lets us edit a brand new file named hello. In Norton Desktop’s world, it can see the file, but it can’t do much with it.

What about a contemporary Windows? Can, let’s say, the Notepad from Windows 3.1 handle this file? Okay, so technically this is commdlg.dll talking, but we’re playing for effect here.

Of course not, what did you expect by now!? Norton Commander only worked because it didn’t care enough! Would you really think one of the companies who made the FAT file system would blithely ignore one of the cardinal rules at the time?

Pshaw!

Next time, we gettin’ hacky.

 

…Wait, hold up. Why does it say 1983 in the title? Well, if you notice on Ralph Brown’s site the rename function was introduced in DOS 2, which was first released in 1983. And so was I.

[ , ] Leave a Comment

Snappy and Clear

(This was originally posted as a Twitter thread.)

I’d like to talk about bad words for a moment. Specifically, words used as tags and such for porn site content. So there’s your content warning right there.

“Dickgirl”. (pause for audience gasp) There are those who consider this word to be a Bad Word™. I respectfully disagree. I think this word is snappy and clear, like a content tag word ought to be.

(Now, I’d like to interrupt this repost to clarify that this is not about how it’s somehow not bad. It’s about how on another level it’s good. This was never meant to convince anyone otherwise, as we all know this to be impossible. With that in mind, back to the repost.)

Imagine, if you will, a fresh-faced pervert’s first go on the Internet. Our freshly-hatched pervert finds a porn site with a comprehensive tagging system, that includes hentai and its related genres and tropes. He finds an image set or a comic or such tagged “dickgirl, bukakke” among other things. What do you suppose this person, who has never seen these two words before (I know, incredible), expects to find upon reading these tags and opening the comic?

At least one chick with a dick, and the other thing is a surprise.

Y’see, when you see the word “dickgirl”, there’s only a few things you can take it to possibly mean, and only one or two make enough sense to likely be correct. It refers, of course, to a girl with a dick. It’s a hentai comic, it’s allowed to have weird shit okay?

But yeah. Two short syllables, each a perfectly clear word on their own, and there’s very little doubt as to what it means. The opposite, “cuntboy”, is exactly the same in all regards. No need to repeat myself there. “Bukakke” on the other hand… oh boy. Oooh boy! First of all, it’s Japanese. Our (I’ll remind you) uninformed perv doesn’t speak the language, so he has nothing to go on. Second of all, it’s got a non-sexy meaning too, that came earlier. Something about pouring water on your noodles from higher up? It’s not that snappy, and it’s certainly unclear as all get out. If our perv were Japanese or otherwise familiar with the other meaning, he might justifiably think “oh, the dickgirls are gonna eat noodles afterwards.” Boy is he in for a surprise!

You could say the same about “futanari”. Again, Japanese. Means, roughly, “two forms”. And frankly, reading that I’d sooner think of giant fighting robots that turn into jet planes and such than a girl with too many bits down her panties.

And please, don’t even think of suggesting “pre-op transwoman”. It’s not at all snappy, and it’s frankly too specific. Surprising, I know. Not all female-presenting dick-having characters in these stories are trans, okay? Like 99% aren’t!

(“you could also argue that while it’s definitely a bad idea to call real-life trans women by porn terms, it’s probably a lesser bad idea to call porn by the current words used by trans women, in that it may lead for those terms to become sexualized/pornographic.” “Exactly why you just keep calling dickgirl porn dickgirl porn.”)

Anyway thanks for listening to my TED talk. I’m Kawa, hobbyist linguist, and all-round lazy bastard.

[ ] Leave a Comment