AGI Dictionary Cleaner (fix for annoying w-word problem)

sonneveld Hi,

This is a quickly knocked-up cleaner for AGI dictionary files. It fixes the corrupted words.tok files that have occurred for some people. All it does for now is sort the words, but I'll add some other needed features later and let you know.

Available for download HERE

I have tested it with Tomas Unosson's words.tok and it produces a clean, workable file. The output is also exactly the same as what agi studio produces when it's not affected by the bug.

Another thing, please don't post this on other websites yet until I clean things up a bit more. Source is available in the public domain, so do whatever you wish. And yes, I know the name of the executable sounds a bit iffy.. but I thought it was funny. :)

Also, PLEASE let me know if it works for you and if it fixes any problems you may have had.

To use (from docs):
Run dictclean.exe in the same directory as "words.tok"
Check out "words.tok.new" for the new file.
You will have to rename it to "words.tok" for agi studio or AGI to find it, though.
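
If you'd rather not rename by hand, the swap can be scripted. This is only a rough sketch in Python, not part of dictclean: the function name install_cleaned and the backup name "words.tok.bak" are made up for illustration, and it assumes dictclean.exe has already written "words.tok.new" into the game directory.

```python
import shutil
from pathlib import Path


def install_cleaned(game_dir: str) -> None:
    """Back up words.tok, then move words.tok.new into its place."""
    game = Path(game_dir)
    original = game / "words.tok"
    cleaned = game / "words.tok.new"
    # Hypothetical backup name; dictclean itself does not create this file.
    backup = game / "words.tok.bak"

    if not cleaned.exists():
        raise FileNotFoundError(f"{cleaned} not found; run dictclean.exe first")
    if original.exists():
        # Keep the old (possibly corrupted) dictionary in case something goes wrong.
        shutil.copy2(original, backup)
    # Rename words.tok.new to words.tok so agi studio or AGI can find it.
    cleaned.replace(original)
```

Call it as install_cleaned(".") from the game directory. Keeping the backup means you can still send the original file along if the cleaned one misbehaves.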

If you get an "assertion failed" error, it may just be because "words.tok" isn't in the same directory as the executable. If the error still continues, please send me the full details of the error and I will try to fix it.

- Nick
lemur You are gold Nick! It worked perfectly for the games I'm writing. I'll let you know if I find something strange...

Well done, I'm happy as a tabledancer!
sonneveld I've got a few other things planned. For instance, you're not meant to have certain characters in words, and you can't have more than one space or agi won't detect the word. The cleaner should fix this as well.

- Nick
sonneveld Version 1.1 is complete and can be found HERE TOO.

Besides better error messages and maybe some command line options, I think it's feature complete.

1.1 2002-10-9
more cleaning functions:
- excess separators removed
- illegal characters removed (agi studio supports this to a degree)
- duplicates (after removing characters) removed
- remove words that don't start with a letter
- letters converted to lower case (agi studio supports this)
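
For anyone curious, the cleaning steps listed above can be sketched roughly like this in Python. This is not the actual dictclean source: the names clean_word and clean_words are made up, it works on a plain list of words and ignores whatever else words.tok stores alongside each word, and the set of "illegal" characters is an assumption (anything other than a-z, 0-9 and space), since the real list isn't given here.

```python
import re

# Assumption: only lowercase letters, digits and spaces are legal.
# The real tool's list of illegal characters is not stated in this thread.
ILLEGAL = re.compile(r"[^a-z0-9 ]")


def clean_word(word: str) -> str:
    """Lower-case a word, drop illegal characters, collapse extra spaces."""
    word = word.lower()
    word = ILLEGAL.sub("", word)
    # split()/join removes leading, trailing and repeated separators.
    return " ".join(word.split())


def clean_words(words):
    """Return the cleaned words, de-duplicated and sorted."""
    cleaned = set()  # a set drops duplicates that appear after cleaning
    for raw in words:
        word = clean_word(raw)
        # Skip empty results and words that don't start with a letter.
        if word and word[0].isalpha():
            cleaned.add(word)
    return sorted(cleaned)


print(clean_words(["Look", "look!", "pick   up", "2nd"]))
# prints ['look', 'pick up']
```

Duplicates are removed after the characters are stripped, which is why "Look" and "look!" end up as a single entry.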

documentation on words.tok format in source

- Nick