Pocket Plane Group

Friends and Neighbors => Weimer Republic (WeiDU.org) => WeiDU => Topic started by: FredSRichardson on May 20, 2006, 12:09:46 AM

Title: iwd1 compressed bif files (cbf)
Post by: FredSRichardson on May 20, 2006, 12:09:46 AM

I know this has come up before, but I thought I'd try to take on the problem of extracting resources from IWD1 .cbf files. WeiDU already has the code for .bifc compression. Does anyone know the details of the cbf format? I looked around to see if I could find the WinBiff source code, but only found the binary.

Thanks,

-Fred

Title: Re: iwd1 compressed bif files (cbf)
Post by: devSin on May 20, 2006, 12:42:29 AM

I used to think they were the same. Whereas BIFC uses variable-sized blocks (start loading resources without having to first decompress the entire BIFF, I guess), CBF is just misappropriated SAV (probably why each CBF only contains the BAMs for one animation). Anything that lets you decompress SAVs should give you most of what you need.

EDIT: the old CBF/SAV decompressor (http://www.ugcs.caltech.edu/~jedwin/un_sav.c) is still around.

Title: Re: iwd1 compressed bif files (cbf)
Post by: FredSRichardson on May 20, 2006, 01:04:56 AM

Quote from: devSin on May 20, 2006, 12:42:29 AM

I used to think they were the same. Whereas BIFC uses variable-sized blocks (start loading resources without having to first decompress the entire BIFF, I guess), CBF is just misappropriated SAV (probably why each CBF only contains the BAMs for one animation). Anything that lets you decompress SAVs should give you most of what you need.

EDIT: the old CBF/SAV decompressor (http://www.ugcs.caltech.edu/~jedwin/un_sav.c) is still around.

Hey, thanks for that link!

Actually, before checking back here I downloaded this un_bifc.c (http://www.ugcs.caltech.edu/~jedwin/un_bifc.c) and hacked it to work with CBFs (based on the venerable file info on that site). Hacking WeiDU might be a bit different. It looks like the CBF files are generally one block whereas the BIFC files can be many blocks. For WeiDU the only real complication is that BIFC files (which WeiDU can handle) are detected by their header (which has "BIFC" in it), whereas a CBF is detected by the extension (<file>.bif doesn't exist but <file>.cbf does). Once that's determined, decompression is easy. I'll see if there's a less-than-ugly way to hack this into biff.ml.
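A minimal sketch of that detection rule, in Python rather than WeiDU's OCaml. The function name `find_biff` and the three kind labels are made up for illustration; only the "BIFC" signature check and the fall-back to the .cbf extension come from the description above.

```python
import os

def find_biff(directory, stem):
    """Return (path, kind) for an archive, where kind is 'biff', 'bifc' or 'cbf'.

    Hypothetical helper: the name and the return shape are assumptions.
    """
    bif = os.path.join(directory, stem + ".bif")
    cbf = os.path.join(directory, stem + ".cbf")
    if os.path.exists(bif):
        with open(bif, "rb") as f:
            signature = f.read(4)
        # A BIFC archive announces itself in its header.
        return (bif, "bifc" if signature == b"BIFC" else "biff")
    if os.path.exists(cbf):
        # A CBF is recognised only because <file>.bif is missing
        # and <file>.cbf is present.
        return (cbf, "cbf")
    return (None, None)
```

Note that this uses a plain case-sensitive existence check, so on Linux it would miss FOO.CBF when asked for foo.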

-Fred

Title: Re: iwd1 compressed bif files (cbf)
Post by: devSin on May 20, 2006, 01:13:11 AM

Yeah, anything that does un-gzip should work.

I believe the header for CBF is BIFF (or "BIF ") and was unique to the format (it's different than uncompressed BIFF, and not BIFC, but I don't recall the version (probably 1.0)).

EDIT: NI (infinity.resource.key.BIFFArchive) says it's "BIF "
EDIT: "V1.0"

Title: Re: iwd1 compressed bif files (cbf)
Post by: the bigg on May 20, 2006, 06:24:29 AM

'Kay, since I don't have IWD, I'll have to wait for you to come up with something :)

Title: Re: iwd1 compressed bif files (cbf)
Post by: FredSRichardson on May 20, 2006, 09:25:32 AM

Quote from: the bigg on May 20, 2006, 06:24:29 AM

'Kay, since I don't have IWD, I'll have to wait for you to come up with something :)

No troubles, I think I'm the only one who cares right now so makes sense for me to do it.

The CBF format is pretty much what the old documentation project says it is. OCaml "compress" should substitute for zlib just fine.

-Fred

Title: Re: iwd1 compressed bif files (cbf)
Post by: FredSRichardson on May 20, 2006, 01:15:09 PM

Oh dear, I've hit my first hurdle.

The compressed biff files that WeiDU currently handles are manageable because they're compressed in relatively small sequential blocks. The CBF files in IWD1 are compressed in one big single block. "Loading" a CBF can be done pretty efficiently as the zlib routines will correctly decompress up to some number of bytes from the beginning of a file (getting WeiDU to handle this might be a pain). The problem occurs when you have to extract a file. The choice comes down to which resource you spend: processing time, disk space or memory usage.

Memory usage: this is the easiest to implement. Just decompress the whole biff in memory and extract the resource from some offset.
Disk space: this is what Infinity Engine does. Decompress the CBF file to the cache directory if it's not already there and then treat it like a normal biff.
Processing time: this is probably the most complicated way but also probably the most preferable. Compressed files aren't really random-access, but you can start at the beginning of a file, and decompress chunks until you reach some offset in the uncompressed data that you're looking for.

It's tempting just to report a message "File foo.bif is apparently compressed as file foo.cbf which WeiDU cannot currently read directly, please uncompress foo.cbf using an external utility." Implementing the third option isn't easy as it means writing a fair chunk of code (though we could also use the UnZip module from ExtLib). I'll have to think about this a bit more given that it's not the most useful functionality in the world.

Title: Re: iwd1 compressed bif files (cbf)
Post by: the bigg on May 20, 2006, 01:25:30 PM

WeiDU already takes up lots of RAM, so the in-memory option isn't advisable. #2 should be decently easy to do anyway - Load.load_bif_in_game would open a biff file and read the offset table, adding this info into the game.loaded_biff hashtable. You should be able to hack this so that cbf files are decompressed, stored in the cache, and then you return the file descriptor pointing there. You can even Unix.unlink the resulting file at the end of the weidu process in main.ml.

Title: Re: iwd1 compressed bif files (cbf)
Post by: FredSRichardson on May 20, 2006, 02:53:26 PM

Quote from: the bigg on May 20, 2006, 01:25:30 PM

WeiDU already takes up lots of RAM space, so this isn't advisable. #2 should be decently easy to do anyway - Load.load_bif_in_game would open a biff file and read the offset table, adding these info into the game.loaded_biff hashtable. You should be able to hack this so that cbf files are decompressed, stored in the cache, and then you return the file descriptor pointing there. You can even Unix.unlink the resulting file at the end of the weidu process in main.ml.

WeiDU doesn't quite have all the ZLib functionality even for that. The "mlgz_uncompress" routine that's there calls "uncompress", which only works on a full in-memory block of compressed data.

What I'd have to do is make a new function that reads and tosses smallish chunks of uncompressed data until some specified offset is reached and then continues while populating a specified buffer up to some specified length. This isn't so bad using ZLib's stream routines (inflateInit/inflate/inflateEnd), but implementing that is essentially going with the 3rd option above. I'm not saying it's not worth doing, but it's a lot more work than printing out an error message telling the user to decompress their biff files ;)
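For illustration, here is roughly what that skip-then-collect loop looks like. This is a sketch in Python, whose `zlib.decompressobj` wraps the same inflate stream routines, not the C that would go into WeiDU. The name `read_range` and the 4 KB input chunk size are assumptions.

```python
import zlib

CHUNK = 4096  # compressed bytes fed to the inflater per step (arbitrary choice)

def read_range(stream, offset, length):
    """Inflate `stream` from its current position, discard the first `offset`
    bytes of output, and return the next `length` bytes (fewer if the data
    runs out first)."""
    inflater = zlib.decompressobj()
    out = bytearray()
    seen = 0              # uncompressed bytes produced so far
    end = offset + length
    while seen < end and not inflater.eof:
        block = stream.read(CHUNK)
        if not block:
            break         # input ended early; return what we have
        data = inflater.decompress(block)
        # Keep only the part of this piece that overlaps [offset, end).
        lo = max(offset - seen, 0)
        hi = min(end - seen, len(data))
        if lo < hi:
            out += data[lo:hi]
        seen += len(data)
    return bytes(out)
```

Memory use stays bounded by what one input chunk inflates to plus the requested length, but every extraction pays for inflating everything before `offset`, which is the processing-time cost of the third option.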

Actually, that's not true. I could also create a function that just takes arguments like "mlgz_cbf2bif cbf_file out_dir" and returns the path to the decompressed file in "out_dir" and follow your recommendation from there. At first I thought it would be helpful to have the "load" routine operate on the compressed file to get the directory information, but now I see that "load" is never done unless you're actually going to access archive members so you may as well decompress then 'n there.

So, yes, this approach would lead to the smallest amount of mess in biff.ml and the least amount of headache since it would use well tested code for reading the uncompressed biff file. I'll try that out.

Title: Re: iwd1 compressed bif files (cbf)
Post by: FredSRichardson on May 20, 2006, 06:37:53 PM

Okay, I hacked together a cbf2bif conversion routine. Unfortunately I couldn't use the code that's already out there (un_sav and un_bifc), since they both use the in-memory routine. Fortunately I could just copy an example from the zlib docs that works pretty well (and does all their recommended error checking).

I'm going to add a routine "mlgz_cbf2bif(in_cbf_file, out_bif_file)" to zlib.c so the uncompressing can be done in WeiDU.
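A rough idea of what such a routine does, written in Python for brevity instead of the C that would live in zlib.c. The `data_offset` parameter is an assumption: the thread doesn't spell out the CBF header layout, so this sketch takes the position of the compressed data as an argument rather than guessing at it. `inflate_to_file` is a hypothetical helper name.

```python
import zlib

def inflate_to_file(src, dst, chunk_size=16384):
    """Inflate the zlib stream at src's current position into dst.
    Returns the number of uncompressed bytes written."""
    inflater = zlib.decompressobj()
    written = 0
    while not inflater.eof:
        block = src.read(chunk_size)
        if not block:
            raise ValueError("input ended before the zlib stream did")
        data = inflater.decompress(block)  # raises zlib.error on corrupt data
        dst.write(data)
        written += len(data)
    return written

def cbf2bif(in_cbf_file, out_bif_file, data_offset):
    """Decompress a CBF to a plain BIF file without holding it all in memory.
    data_offset (assumed to be known by the caller) is where the compressed
    data starts, i.e. the size of the CBF header."""
    with open(in_cbf_file, "rb") as src, open(out_bif_file, "wb") as dst:
        src.seek(data_offset)
        return inflate_to_file(src, dst)
```

Only one input chunk and its inflated output are in memory at a time, which is the point of using the stream interface over the one-shot "uncompress" call.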

I can also add code that uses BIF files in the "/cache" directory when they don't exist anywhere else (so this would work even if the compressed BIF has some other extension besides CBF and someone leaves an uncompressed version in the cache directory).

Aside from that it's all pretty straightforward. When I create a file in the /cache, I'll make sure I clean it up.

-Fred

Title: Re: iwd1 compressed bif files (cbf)
Post by: FredSRichardson on May 21, 2006, 09:10:59 AM

Well, I hacked this together and have it all working except for the cache cleanup part.

Title: Re: iwd1 compressed bif files (cbf)
Post by: the bigg on May 21, 2006, 09:26:01 AM

Cool - if you need help with this, let me know.

Title: Re: iwd1 compressed bif files (cbf)
Post by: FredSRichardson on May 21, 2006, 11:22:33 AM

Quote from: the bigg on May 21, 2006, 09:26:01 AM

Cool - if you need help with this, let me know.

Hey, before you change your mind I'll take you up on that :D

All that's needed is appending to a list of the files that were created in the cache, so they can be cleaned up at exit.

Along the way, I noticed some things I was wondering about. In load.ml, WeiDU has to handle the problem of case-sensitive file access. In order to check for the existence of a file, it has to list all the files in the parent directory and compare them against the candidate file in a case-insensitive way. Other parts of the WeiDU code use a routine "file_exists" which is case sensitive. Is it the case that files created by WeiDU are always one case (say lower)? It looks like this is what the "case_ins*.ml" packages are meant to take care of. I guess I came across this because I was looking for a CBF file, which is really a game file, so it has to use the more complicated directory scanning routine in "load.ml".

But I noticed that the Override folder isn't handled in the same way, and it can have files with all sorts of mixed cases. So for folks running on Linux, do you just make all your files lower case or something?

It's not very appealing to have to scan a directory every time you want to check for a file, and that's probably why "file_exists" is used everywhere, but we could add a hash table and scan only when we haven't already hashed the entries of a directory. The hash table probably wouldn't get that big, you could probably store over 50,000 entries before hitting 1Meg of memory.
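To make the idea concrete, a sketch of such a cache (in Python, not WeiDU's OCaml; `find_case_insensitive` and `_dir_cache` are hypothetical names). Each directory is listed at most once, and later lookups are plain dictionary hits.

```python
import os

# directory path -> {lower-cased name: name as actually stored on disk}
_dir_cache = {}

def find_case_insensitive(directory, name):
    """Return the on-disk spelling of `name` inside `directory`, or None."""
    entries = _dir_cache.get(directory)
    if entries is None:
        try:
            listing = os.listdir(directory)
        except OSError:
            listing = []  # a missing or unreadable directory has no entries
        entries = {entry.lower(): entry for entry in listing}
        _dir_cache[directory] = entries
    return entries.get(name.lower())
```

One catch with this approach: the cached listing goes stale if files are created or deleted in the directory after the first scan, so anything the program writes itself would also have to be added to the table.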

Anyway, here's my patch for handling CBF files (without the needed cleanup):