Pocket Plane Group

Friends and Neighbors => Weimer Republic (WeiDU.org) => WeiDU => Topic started by: FredSRichardson on May 20, 2006, 12:09:46 AM

Title: iwd1 compressed bif files (cbf)
Post by: FredSRichardson on May 20, 2006, 12:09:46 AM
I know this has come up before, but I thought I'd try and take on the problem of extracting resources from IWD1 .cbf files.  WeiDU already has the code for .bifc compression.  Does anyone know the details on the cbf format?  I looked around to see if I could find WinBiff source code, but only found the binary.

Thanks,

-Fred
Title: Re: iwd1 compressed bif files (cbf)
Post by: devSin on May 20, 2006, 12:42:29 AM
I used to think they were the same. Whereas BIFC uses variable sized blocks (start loading resources without having to first decompress the entire BIFF, I guess), CBF is just misappropriated SAV (probably why each CBF only contains the BAMs for one animation). Anything that lets you decompress SAVs should give you most of what you need.

EDIT: the old CBF/SAV decompressor (http://www.ugcs.caltech.edu/~jedwin/un_sav.c) is still around.
Title: Re: iwd1 compressed bif files (cbf)
Post by: FredSRichardson on May 20, 2006, 01:04:56 AM
Hey, thanks for that link!

Actually, before checking back here I downloaded this un_bifc.c (http://www.ugcs.caltech.edu/~jedwin/un_bifc.c) and hacked it to work with CBF's (based on the venerable file info on that site).  Hacking WeiDU might be a bit different.  It looks like the CBF files are generally one block whereas the BIFC files can be many blocks.  For WeiDU the only real complication is that BIFC files (which WeiDU can handle) are detected by their header (which has "BIFC" in it), whereas a CBF is detected by the extension (<file>.bif doesn't exist but <file>.cbf does).  Once that's determined, decompression is easy.  I'll see if there's a less than ugly way to hack this into biff.ml

-Fred
Title: Re: iwd1 compressed bif files (cbf)
Post by: devSin on May 20, 2006, 01:13:11 AM
Yeah, anything that does un-gzip should work.

I believe the header for CBF is BIFF (or "BIF ") and was unique to the format (it's different than uncompressed BIFF, and not BIFC, but I don't recall the version (probably 1.0)).

EDIT: NI (infinity.resource.key.BIFFArchive) says it's "BIF "
EDIT: "V1.0"
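
To make the header discussion concrete, here is a minimal sketch in C of telling the three container types apart by their first four bytes. The enum and function names are hypothetical, and the "BIFF" signature for uncompressed files is an assumption taken from the usual format documentation rather than from this thread.

```c
#include <string.h>

/* Hypothetical names; not part of WeiDU. */
enum bif_kind { BIF_UNKNOWN, BIF_PLAIN, BIF_BLOCK_COMPRESSED, BIF_CBF };

/* Classify a file from its first four header bytes.
   "BIF " (note the trailing space) and "BIFC" come from this thread;
   "BIFF" for uncompressed files is an assumption. */
enum bif_kind classify_bif(const unsigned char *hdr, size_t len)
{
    if (len < 4) return BIF_UNKNOWN;
    if (memcmp(hdr, "BIF ", 4) == 0) return BIF_CBF;
    if (memcmp(hdr, "BIFC", 4) == 0) return BIF_BLOCK_COMPRESSED;
    if (memcmp(hdr, "BIFF", 4) == 0) return BIF_PLAIN;
    return BIF_UNKNOWN;
}
```

Checking the signature this way would let a loader recognise a CBF even if it had been renamed to .bif.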
Title: Re: iwd1 compressed bif files (cbf)
Post by: the bigg on May 20, 2006, 06:24:29 AM
'Kay, since I don't have IWD, I'll have to wait for you to come up with something  :)
Title: Re: iwd1 compressed bif files (cbf)
Post by: FredSRichardson on May 20, 2006, 09:25:32 AM
No troubles, I think I'm the only one who cares right now so it makes sense for me to do it.

The CBF format is pretty much what the old documentation project says it is.  OCaml "compress" should substitute for zlib just fine.

-Fred
Title: Re: iwd1 compressed bif files (cbf)
Post by: FredSRichardson on May 20, 2006, 01:15:09 PM
Oh dear, I've hit my first hurdle.

The compressed biff files that WeiDU currently handles are manageable because they're compressed in relatively small sequential blocks.  The CBF files in IWD1 are compressed in one big single block.  "Loading" a CBF can be done pretty efficiently as the zlib routines will correctly decompress up to some number of bytes from the beginning of a file (getting WeiDU to handle this might be a pain).  The problem occurs when you have to extract a file.  The resource options are processing time, disk space or memory usage.
It's tempting just to report a message "File foo.bif is apparently compressed as file foo.cbf which WeiDU cannot currently read directly, please uncompress foo.cbf using an external utility."  Implementing the third option isn't easy as it means implementing a fair chunk of code (though we could also use the UnZip module from ExtLib).  I'll have to think about this a bit more given that it's not the most useful functionality in the world.
Title: Re: iwd1 compressed bif files (cbf)
Post by: the bigg on May 20, 2006, 01:25:30 PM
WeiDU already takes up lots of RAM space, so this isn't advisable. #2 should be decently easy to do anyway - Load.load_bif_in_game would open a biff file and read the offset table, adding this info into the game.loaded_biff hashtable. You should be able to hack this so that cbf files are decompressed, stored in the cache, and then you return the file descriptor pointing there. You can even Unix.unlink the resulting file at the end of the weidu process in main.ml.
Title: Re: iwd1 compressed bif files (cbf)
Post by: FredSRichardson on May 20, 2006, 02:53:26 PM
WeiDU doesn't quite have all the ZLib functionality even for that.  The "mlgz_uncompress" routine that's there calls "uncompress" which works only on a full in-memory block of compressed data.

What I'd have to do is make a new function that reads and tosses smallish chunks of uncompressed data until some specified offset is reached and then continue while populating a specified buffer up to some specified length.  This isn't so bad using ZLib's stream routines (inflateInit/inflate/inflateEnd), but implementing that is essentially going with the 3rd option above.  I'm not saying it's not worth doing, but it's a lot more work than printing out an error message telling the user to decompress their biff files ;)

Actually, that's not true.  I could also create a function that just takes arguments like "mlgz_cbf2bif cbf_file out_dir" and returns the path to the decompressed file in "out_dir" and follow your recommendation from there.  At first I thought it would be helpful to have the "load" routine operate on the compressed file to get the directory information, but now I see that "load" is never done unless you're actually going to access archive members so you may as well decompress then 'n there.

So, yes, this approach would lead to the smallest amount of mess in biff.ml and the least amount of headache since it would use well tested code for reading the uncompressed biff file.  I'll try that out.
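
For reference, the "read and toss chunks until the offset is reached" idea described in this post can be sketched in C as below. This is only an illustration under assumptions: chunk_fn is a hypothetical callback standing in for a zlib inflate() loop, and mem_next is a stand-in source that serves bytes from memory so the skipping logic can be tried without zlib.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical callback: writes up to cap bytes of *decompressed* data
   into out and returns how many it wrote (0 at end of stream). In real
   code this would wrap zlib's inflate() loop. */
typedef size_t (*chunk_fn)(void *ctx, unsigned char *out, size_t cap);

/* Copy `length` bytes starting at decompressed offset `offset` into dest,
   discarding everything before it. Returns the number of bytes copied. */
size_t read_range(chunk_fn next, void *ctx,
                  size_t offset, size_t length, unsigned char *dest)
{
    unsigned char buf[4096];
    size_t pos = 0;      /* decompressed bytes seen so far */
    size_t copied = 0;

    while (copied < length) {
        size_t got = next(ctx, buf, sizeof buf);
        if (got == 0) break;                     /* stream ended early */
        size_t start = 0;
        if (pos < offset) {                      /* still skipping */
            size_t skip = offset - pos;
            start = skip < got ? skip : got;
        }
        size_t avail = got - start;
        size_t want = length - copied;
        size_t n = avail < want ? avail : want;
        memcpy(dest + copied, buf + start, n);
        copied += n;
        pos += got;
    }
    return copied;
}

/* Stand-in source that serves bytes from memory in small pieces. */
struct mem_src { const unsigned char *data; size_t len, pos; };

size_t mem_next(void *ctx, unsigned char *out, size_t cap)
{
    struct mem_src *s = ctx;
    size_t n = s->len - s->pos;
    if (n > cap) n = cap;
    if (n > 3) n = 3;             /* tiny chunks to exercise the skipping */
    memcpy(out, s->data + s->pos, n);
    s->pos += n;
    return n;
}
```

The cost is visible in the loop: every extraction re-reads the stream from the start, which is the processing-time trade-off that makes decompressing once to the cache attractive.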
Title: Re: iwd1 compressed bif files (cbf)
Post by: FredSRichardson on May 20, 2006, 06:37:53 PM
Okay, I hacked together a cbf2bif conversion routine.  I (unfortunately) couldn't use the code that's already out there (un_sav and un_bifc).  They both use the in-memory routine, but fortunately I could just copy an example from the zlib docs that works pretty well (and does all their recommended error checking).

I'm going to add a routine "mlgz_cbf2bif(in_cbf_file, out_bif_file)" to zlib.c so the uncompressing can be done in WeiDU.

I can also add code that uses BIF files in the "/cache" directory when they don't exist anywhere else (so this would work even if the compressed BIF has some other extension besides CBF and someone leaves an uncompressed version in the cache directory).

Aside from that it's all pretty straightforward.  When I create a file in the /cache, I'll make sure I clean it up.

-Fred
Title: Re: iwd1 compressed bif files (cbf)
Post by: FredSRichardson on May 21, 2006, 09:10:59 AM
Well, I hacked this together and have it all working except for the cache cleanup part.
Title: Re: iwd1 compressed bif files (cbf)
Post by: the bigg on May 21, 2006, 09:26:01 AM
Cool - if you need help with this, let me know.
Title: Re: iwd1 compressed bif files (cbf)
Post by: FredSRichardson on May 21, 2006, 11:22:33 AM
Hey, before you change your mind I'll take you up on that :D

All that's needed is appending to a list of files that were created in the cache for cleanup.
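
The cleanup list mentioned above could look roughly like the following C sketch. The names register_temp_file and cleanup_temp_files are hypothetical, and WeiDU itself would more likely keep such a list on the OCaml side; this only illustrates the bookkeeping.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical bookkeeping for files created in the cache directory. */
struct temp_node { char *path; struct temp_node *next; };
static struct temp_node *temp_files = NULL;

/* Remember a file so it can be deleted later. Returns 0 on success. */
int register_temp_file(const char *path)
{
    struct temp_node *n = malloc(sizeof *n);
    if (!n) return -1;
    n->path = malloc(strlen(path) + 1);
    if (!n->path) { free(n); return -1; }
    strcpy(n->path, path);
    n->next = temp_files;
    temp_files = n;
    return 0;
}

/* Delete every registered file and empty the list. */
void cleanup_temp_files(void)
{
    while (temp_files) {
        struct temp_node *n = temp_files;
        temp_files = n->next;
        remove(n->path);          /* errors ignored: file may already be gone */
        free(n->path);
        free(n);
    }
}
```

Calling atexit(cleanup_temp_files) once at startup would run the cleanup on normal program exit.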

Along the way, I noticed some things I was wondering about.  In load.ml, WeiDU has to handle the problem of case-sensitive file access.  In order to check for the existence of a file, it has to list all the files in the parent directory and compare them against the candidate file in a case-insensitive way.  Other parts of the WeiDU code use a routine "file_exists" which is case sensitive.  Is it the case that files created by WeiDU are always one case (say lower)?  It looks like this is what the "case_ins*.ml" packages are meant to take care of.  I guess I came across this because I was looking for a CBF file which is really a Game file so it has to use the more complicated directory scanning routine in "load.ml".

But I noticed that the Override folder isn't handled in the same way, and it can have files with all sorts of mixed cases.  So for folks running on Linux, do you just make all your files lower case or something?

It's not very appealing to have to scan a directory every time you want to check for a file, and that's probably why "file_exists" is used everywhere, but we could add a hash table and scan only when we haven't already hashed the entries of a directory.  The hash table probably wouldn't get that big, you could probably store over 50,000 entries before hitting 1Meg of memory.
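
A rough C sketch of the "scan once, then look up" idea follows. To keep it short it uses a sorted array with binary search rather than a real hash table, and the names (dir_index, dir_index_build, dir_index_contains) are hypothetical. In practice the entries would come from one readdir() pass over the directory.

```c
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical index of one directory's entries, lower-cased and sorted. */
struct dir_index { char **names; size_t count; };

static int cmp_names(const void *a, const void *b)
{
    return strcmp(*(char *const *)a, *(char *const *)b);
}

static char *lower_copy(const char *s)
{
    size_t n = strlen(s);
    char *out = malloc(n + 1);
    if (!out) return NULL;
    for (size_t i = 0; i <= n; i++)
        out[i] = (char)tolower((unsigned char)s[i]);
    return out;
}

/* Build the index once from a directory listing (e.g. from readdir()). */
int dir_index_build(struct dir_index *ix, const char *const *entries, size_t n)
{
    ix->names = malloc(n ? n * sizeof *ix->names : 1);
    ix->count = 0;
    if (!ix->names) return -1;
    for (size_t i = 0; i < n; i++) {
        char *c = lower_copy(entries[i]);
        if (!c) return -1;
        ix->names[ix->count++] = c;
    }
    qsort(ix->names, ix->count, sizeof *ix->names, cmp_names);
    return 0;
}

/* Case-insensitive existence check without touching the disk again. */
int dir_index_contains(const struct dir_index *ix, const char *name)
{
    char *key = lower_copy(name);
    if (!key) return 0;
    int found = bsearch(&key, ix->names, ix->count,
                        sizeof *ix->names, cmp_names) != NULL;
    free(key);
    return found;
}
```

An index like this goes stale if files are added to the directory after the scan, so anything that writes into a scanned directory would also need to update it.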

Anyway, here's my patch for handling CBF files (without the needed cleanup):
Code: [Select]
--- ./WeiDU-192.orig/src/load.ml 2006-04-10 16:58:53.000000000 -0400
+++ ./WeiDU-192/src/load.ml 2006-05-21 12:17:58.171875000 -0400
@@ -314,25 +314,42 @@
 
 let skip_next_load_error = ref false
 
+external cbf2bif : string -> string -> int
+    = "mlgz_cbf2bif"
+
 let load_bif_in_game game bif_file =
     if Hashtbl.mem game.loaded_biffs bif_file then
       Hashtbl.find game.loaded_biffs bif_file (* already here *)
     else begin
       (* we must load the BIF *)
-      let biff_path =
-        let rec trial lst =
+      let biff_path = begin
+        let rec trial f lst =
           match lst with
-            [] -> find_file_in_path game.game_path bif_file
+            [] -> find_file_in_path game.game_path f
           | hd :: tl ->
-            let perhaps = find_file_in_path hd bif_file in
-            log_only "BIFF may be in hard-drive CD-path [%s]\n" perhaps ;
-            if file_exists perhaps then
-              perhaps
-            else trial tl
+              let perhaps = find_file_in_path hd f in
+              log_only "BIFF may be in hard-drive CD-path [%s]\n" perhaps ;
+              if file_exists perhaps then
+                perhaps
+              else trial f tl
         in
-        trial (game.cd_path_list)
-      in
+        (* Check to see if the bif file exists, if it doesn't try for a .CBF file *)
+        let bf = trial bif_file (game.cd_path_list @ [ game.game_path ^ "/cache" ] ) in
+        if file_exists bf then
+          bf
+        else begin
+          let cbf = Filename.chop_extension bif_file ^ ".cbf" in
+          let cbf_file = trial cbf (game.cd_path_list) in
+          if file_exists cbf_file then
+            let cache_file = game.game_path ^ "/cache/" ^ bif_file in
+            let sz = cbf2bif cbf_file cache_file in
+            let _ = log_and_print "[%s] decompressed bif file %d bytes\n" cbf_file sz in
+            cache_file
+          else
+            bf
+        end
+      end in
       let the_biff = Biff.load_biff biff_path in
       Hashtbl.add game.loaded_biffs bif_file the_biff ;
       the_biff
--- ./WeiDU-192.orig/zlib/zlib.c 2003-06-02 06:08:24.000000000 -0400
+++ ./WeiDU-192/zlib/zlib.c 2006-05-20 19:35:08.156250000 -0400
@@ -85,3 +85,211 @@
    */
   return v_ret ;
 }
+
+/* zerr() and def() are copied directly from zlib example code. */
+
+#if defined(MSDOS) || defined(OS2) || defined(WIN32) || defined(__CYGWIN__)
+#  include <fcntl.h>
+#  include <io.h>
+#  define SET_BINARY_MODE(file) setmode(fileno(file), O_BINARY)
+#else
+#  define SET_BINARY_MODE(file)
+#endif
+
+#define CHUNK 16384
+
+
+/* Raise an exception for a zlib or i/o error */
+void mlgz_zerr(int ret)
+{
+    switch (ret) {
+    case Z_ERRNO:
+        raise_sys_error(copy_string(strerror(errno))) ;
+        break;
+    case Z_STREAM_ERROR:
+        raise_mlgz_exn("invalid compression level");
+        break;
+    case Z_DATA_ERROR:
+        raise_mlgz_exn("invalid or incomplete deflate data");
+        break;
+    case Z_MEM_ERROR:
+        raise_out_of_memory() ;
+        break;
+    case Z_VERSION_ERROR:
+        raise_mlgz_exn("zlib version mismatch!");
+    }
+}
+
+/* Yes, this is bad practice: */
+static char errstr[1024];
+
+/* This is a bit better for error checking: */
+int fread_check(void* b, size_t sz, size_t cnt, FILE* fp, const char* fn)
+{
+    if (fread(b, sz, cnt, fp) != cnt) {
+        sprintf(errstr, "Failed to read %lu bytes from file %s", (unsigned long)(sz*cnt), fn);
+        raise_mlgz_exn(errstr);
+        return 1;
+    }
+    return 0;
+}
+
+/* Decompress from file source to file dest until stream ends or EOF.
+   inf() returns Z_OK on success, Z_MEM_ERROR if memory could not be
+   allocated for processing, Z_DATA_ERROR if the deflate data is
+   invalid or incomplete, Z_VERSION_ERROR if the version of zlib.h and
+   the version of the library linked do not match, or Z_ERRNO if there
+   is an error reading or writing the files. */
+int inf(FILE *source, FILE *dest)
+{
+    int ret;
+    unsigned have;
+    z_stream strm;
+    unsigned char in[CHUNK];
+    unsigned char out[CHUNK];
+   
+    /* allocate inflate state */
+    strm.zalloc = Z_NULL;
+    strm.zfree = Z_NULL;
+    strm.opaque = Z_NULL;
+    strm.avail_in = 0;
+    strm.next_in = Z_NULL;
+    ret = inflateInit(&strm);
+    if (ret != Z_OK)
+        return ret;
+    /* decompress until deflate stream ends or end of file */
+    do {
+        strm.avail_in = fread(in, 1, CHUNK, source);
+        if (ferror(source)) {
+            (void)inflateEnd(&strm);
+            return Z_ERRNO;
+        }
+        if (strm.avail_in == 0)
+            break;
+        strm.next_in = in;
+        /* run inflate() on input until output buffer not full */
+        do {
+            strm.avail_out = CHUNK;
+            strm.next_out = out;
+            ret = inflate(&strm, Z_NO_FLUSH);
+            /* assert(ret != Z_STREAM_ERROR); */  /* state not clobbered */ /* no asserts for Ocaml */
+            switch (ret) {
+            case Z_NEED_DICT:
+                ret = Z_DATA_ERROR;     /* and fall through */
+            case Z_DATA_ERROR:
+            case Z_MEM_ERROR:
+                (void)inflateEnd(&strm);
+                return ret;
+            }
+            have = CHUNK - strm.avail_out;
+            if (fwrite(out, 1, have, dest) != have || ferror(dest)) {
+                (void)inflateEnd(&strm);
+                return Z_ERRNO;
+            }
+        } while (strm.avail_out == 0);
+        /* done when inflate() says it's done */
+    } while (ret != Z_STREAM_END);
+    /* clean up and return */
+    (void)inflateEnd(&strm);
+    return ret == Z_STREAM_END ? Z_OK : Z_DATA_ERROR;
+}
+
+/*
+   Uncompresses a CBF file to a specified BIF file.  Return 0 on failure or the
+   number of uncompressed bytes on success.
+ */
+value mlgz_cbf2bif(value _cbf_file, value _bif_file)
+{
+    const char* cbf_file = String_val(_cbf_file);
+    const char* bif_file = String_val(_bif_file);
+    FILE *cbf_fp;
+    FILE *bif_fp;
+    char sigver[9];
+    int bif_file_len;
+    unsigned int cmplen, uncmplen;   /* 4-byte fields: fread() below fills exactly 4 bytes of each */
+    int zret;
+   
+
+    if (!(cbf_fp = fopen(cbf_file, "rb"))) {
+        sprintf(errstr, "failure opening file: %s", cbf_file);
+        raise_mlgz_exn(errstr);
+        return Val_int(0);
+    }
+    if (!(bif_fp = fopen(bif_file, "wb"))) {
+        fclose(cbf_fp);
+        sprintf(errstr, "failure opening file: %s", bif_file);
+        raise_mlgz_exn(errstr);
+        return Val_int(0);
+    }
+
+    sigver[8] = 0;
+    if (fread_check(sigver, 1, 8, cbf_fp, cbf_file)) {
+        fclose(cbf_fp);
+        fclose(bif_fp);
+        return Val_int(0);
+    }
+   
+    if (strcmp(sigver, "BIF V1.0")) {
+        fclose(cbf_fp);
+        fclose(bif_fp);
+        sprintf(errstr, "incorrect CBF header for file %s", cbf_file);
+        raise_mlgz_exn(errstr);
+        return Val_int(0);
+    }
+
+    if (fread_check(&bif_file_len, 4, 1, cbf_fp, cbf_file)) {
+        fclose(cbf_fp);
+        fclose(bif_fp);
+        return Val_int(0);
+    }
+   
+    if (bif_file_len <=0 || bif_file_len > 128) {
+        fclose(cbf_fp);
+        fclose(bif_fp);
+        sprintf(errstr, "corrupt CBF file %s", cbf_file);
+        raise_mlgz_exn(errstr);
+        return Val_int(0);
+    }
+
+    /* Seek ahead past embedded file name, doesn't really matter what it is */
+    if (fseek(cbf_fp, bif_file_len, SEEK_CUR)) {
+        fclose(cbf_fp);
+        fclose(bif_fp);
+        sprintf(errstr, "failure seeking %d bytes into file %s", bif_file_len, cbf_file);
+        raise_mlgz_exn(errstr);
+        return Val_int(0);
+    }
+
+    if (fread_check(&uncmplen, 4, 1, cbf_fp, cbf_file)) {
+        fclose(cbf_fp);
+        fclose(bif_fp);
+        return Val_int(0);
+    }
+   
+    if (fread_check(&cmplen, 4, 1, cbf_fp, cbf_file)) {
+        fclose(cbf_fp);
+        fclose(bif_fp);
+        return Val_int(0);
+    }
+    /* printf("CBF %s (%ld bytes) -> BIF %s [%ld bytes]", cbf_file, cmplen, bif_file, uncmplen); */
+
+    if ((zret=inf(cbf_fp, bif_fp)) != Z_OK) {
+        fclose(cbf_fp);
+        fclose(bif_fp);
+        mlgz_zerr(zret);
+        return Val_int(0);
+    }
+
+    if (fclose(cbf_fp)) {
+        fclose(bif_fp);
+        sprintf(errstr, "failure closing file %s", cbf_file);
+        raise_mlgz_exn(errstr);
+        return Val_int(0);
+    }
+    if (fclose(bif_fp)) {
+        sprintf(errstr, "failure closing file %s", bif_file);
+        raise_mlgz_exn(errstr);
+        return Val_int(0);
+    }
+    return Val_int(uncmplen);
+}
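
The header layout that the patch walks through (8-byte signature, 4-byte name length, embedded file name, 4-byte uncompressed length, 4-byte compressed length, then the zlib stream) can be summarised in a small standalone C sketch. The struct and function names are hypothetical, and unlike the patch it parses from a memory buffer and assembles the length fields byte by byte, so the result does not depend on the host's byte order.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical struct; field names are for illustration only. */
struct cbf_header {
    unsigned long name_len;     /* length of embedded file name */
    unsigned long uncmp_len;    /* size of the decompressed BIF */
    unsigned long cmp_len;      /* size of the compressed payload */
    size_t data_offset;         /* where the zlib stream starts */
};

/* Read a 32-bit little-endian value regardless of host byte order. */
static unsigned long le32(const unsigned char *p)
{
    return (unsigned long)p[0] | ((unsigned long)p[1] << 8) |
           ((unsigned long)p[2] << 16) | ((unsigned long)p[3] << 24);
}

/* Parse a CBF header from an in-memory buffer. Returns 0 on success. */
int parse_cbf_header(const unsigned char *buf, size_t len, struct cbf_header *h)
{
    if (len < 12 || memcmp(buf, "BIF V1.0", 8) != 0) return -1;
    h->name_len = le32(buf + 8);
    if (h->name_len == 0 || h->name_len > 128) return -1;  /* same bound as the patch */
    if (len < 12 + h->name_len + 8) return -1;
    h->uncmp_len = le32(buf + 12 + h->name_len);
    h->cmp_len = le32(buf + 16 + h->name_len);
    h->data_offset = 20 + (size_t)h->name_len;
    return 0;
}
```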
Title: Re: iwd1 compressed bif files (cbf)
Post by: devSin on May 21, 2006, 01:13:33 PM
Quote
It's not very appealing to have to scan a directory every time you want to check for a file, and that's probably why "file_exists" is used everywhere, but we could add a hash table and scan only when we haven't already hashed the entries of a directory.  The hash table probably wouldn't get that big, you could probably store over 50,000 entries before hitting 1Meg of memory.
Sounds like a good idea.

I think the idea is that Linux users have to manually change the case of their files. I'm not sure about the other stuff. If it's loading game files, it doesn't make much difference (the end result is apparently case-insensitive), otherwise most WeiDU files (created internally) are mixed or all upper case.
Title: Re: iwd1 compressed bif files (cbf)
Post by: devSin on May 21, 2006, 01:34:29 PM
And now that I look at the code, this isn't going to work on Mac OS X. :-(
Title: Re: iwd1 compressed bif files (cbf)
Post by: the bigg on May 21, 2006, 04:44:32 PM
Quote
All that's needed is appending to a list of files that were created in the cache for cleanup.
I can code this if you have problems, yeah  :-*

Quote
Along the way, I noticed some things I was wondering about.  In load.ml, WeiDU has to handle the problem of case-sensitive file access.  In order to check for the existence of a file, it has to list all the files in the parent directory and compare them against the candidate file in a case-insensitive way.  Other parts of the WeiDU code use a routine "file_exists" which is case sensitive.  Is it the case that files created by WeiDU are always one case (say lower)?  It looks like this is what the "case_ins*.ml" packages are meant to take care of.  I guess I came across this because I was looking for a CBF file which is really a Game file so it has to use the more complicated directory scanning routine in "load.ml".

But I noticed that the Override folder isn't handled in the same way, and it can have files with all sorts of mixed cases.  So for folks running on Linux, do you just make all your files lower case or something?
WesDU used find_file_in_path in a couple of places and simple calls everywhere else. Starting from 190 (?), I coded in the case_ins modules:
On Windows, file names are evaluated as they are.
On OSX, every \ is turned to a / before making FS calls.
On Linux, every \ is turned to a /, and the file name is lowercased, to avoid problems (such as using find_file_in_path everywhere). The Linux distro contains a file called to_lower.sh to turn automagically all files to lowercase  :)
Perhaps I can forget all the find_file_in_path stuff, since now I assume that the FS is case-insensitive or everything is mapped to lowercase, but I don't feel like taking extra risks.

Quote
Anyway, here's my patch for handling CBF files (without the needed cleanup):
<snip>
Thanks  :D

Quote
And now that I look at the code, this isn't going to work on Mac OS X. :-(
Uh, why?
Title: Re: iwd1 compressed bif files (cbf)
Post by: devSin on May 21, 2006, 04:57:27 PM
Quote
Uh, why?
Byte order.
Title: Re: iwd1 compressed bif files (cbf)
Post by: the bigg on May 21, 2006, 05:10:45 PM
Crap.

EDIT: Aren't CBFs IWD1-only? And is there a Mac version of IWD at all?
Title: Re: iwd1 compressed bif files (cbf)
Post by: devSin on May 21, 2006, 05:57:44 PM
There is a Mac version of ID (not HoW or ID2, though). With the same data files as the PC version. I don't know if BIS used CBF in ID2 (although, I can't see why they would, given the advantages of BIFC over SAV).
Title: Re: iwd1 compressed bif files (cbf)
Post by: FredSRichardson on May 21, 2006, 07:50:50 PM
Hmm... well if there are CBF files on the MacOS, those should decompress correctly.  If BIS has the MacOS version of the IE engine do byte-swapping (i.e. the data files aren't byte-swapped), then yes you'll have quite a challenge adding byte-swapping to WeiDU in general.  It could probably be abstracted in most places in int_of_str_off/short_of_str_off and write_int/write_short.  But for the code I submitted I don't see a specific byte-swapping problem.

I've only seen CBFs in IWD1.  IWD2 uses plain old BIF (if you have the disk space, then it does save time).

-Fred

EDIT: Oh wait, I see what you mean.  Yes, if the MacOS CBFs are the same as the Windows ones, then I'll have to add a byte-swapping routine.  What macro(s) does GCC define on MacOS?  I'm happy to add byte-swapping through a #define if you want.
Title: Re: iwd1 compressed bif files (cbf)
Post by: devSin on May 21, 2006, 08:09:55 PM
Yeah. The data is little endian, so it has to go to big endian on read, and back to little on write. I assume WeiDU gets it from OCaml, but I've honestly never looked.

Macros should be pretty close to any other gcc variant. You can get a semi-complete list here (http://developer.apple.com/documentation/DeveloperTools/gcc-3.3/cpp/Predefined-Macros.html#Predefined-Macros), and there shouldn't be any surprises (IIRC, it's a straight copy of normal gcc docs).
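
As an illustration of "big endian on read, back to little on write", here is a small C sketch that detects the host byte order at run time and swaps only when needed. This is an alternative to the compiler-macro approach discussed in the thread, not what WeiDU does; all function names are hypothetical.

```c
#include <stdint.h>
#include <string.h>

/* Swap the four bytes of a 32-bit value. */
static uint32_t swap32(uint32_t v)
{
    return (v >> 24) | ((v >> 8) & 0x0000FF00u) |
           ((v << 8) & 0x00FF0000u) | (v << 24);
}

/* Detect host byte order at run time instead of via compiler macros. */
static int host_is_big_endian(void)
{
    const uint32_t probe = 1;
    unsigned char first;
    memcpy(&first, &probe, 1);
    return first == 0;
}

/* Convert a value that was fread() straight from a little-endian file. */
uint32_t le32_to_host(uint32_t raw)
{
    return host_is_big_endian() ? swap32(raw) : raw;
}

/* The inverse, for writing: host order back to little-endian bytes. */
uint32_t host_to_le32(uint32_t v)
{
    return host_is_big_endian() ? swap32(v) : v;
}
```

On a little-endian machine both conversions are no-ops, so the same source builds unchanged on x86 and PowerPC.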
Title: Re: iwd1 compressed bif files (cbf)
Post by: FredSRichardson on May 21, 2006, 08:15:00 PM
Quote
WesDU used find_file_in_path in a couple of places and simple calls everywhere else. Starting from 190 (?), I coded in the case_ins modules:
On Windows, file names are evaluated as they are.
On OSX, every \ is turned to a / before making FS calls.
On Linux, every \ is turned to a /, and the file name is lowercased, to avoid problems (such as using find_file_in_path everywhere). The Linux distro contains a file called to_lower.sh to turn automagically all files to lowercase  :)
Perhaps I can forget all the find_file_in_path stuff, since now I assume that the FS is case-insensitive or everything is mapped to lowercase, but I don't feel like taking extra risks.

I guess if copy_to_override and the like all transform file names to lower-case on Linux (and all the game files are already lower-case) then case_ins should do the trick.  "file_exists" should probably also go in there:
Code: [Select]
let sys_file_exists s = Sys.file_exists (String.lowercase (backslash_to_slash s)) ;
and replace "file_exists".

But I agree, there's definitely some risk associated with taking out find_file_in_path (maybe early versions of OCaml/WeiDU didn't have these library calls).
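
For readers who prefer C, the normalisation behind the sys_file_exists idea above (turn backslashes into slashes, lower-case the name, then test for existence) can be sketched as follows. The function names are hypothetical, and the lower-casing step reflects the Linux behaviour described in this thread, not what happens on Windows or OS X.

```c
#include <ctype.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Return a newly allocated copy of path with '\\' turned into '/' and
   all letters lower-cased. The caller frees the result. */
char *normalize_path(const char *path)
{
    size_t n = strlen(path);
    char *out = malloc(n + 1);
    if (!out) return NULL;
    for (size_t i = 0; i <= n; i++) {
        unsigned char c = (unsigned char)path[i];
        out[i] = (c == '\\') ? '/' : (char)tolower(c);
    }
    return out;
}

/* Hypothetical C counterpart of the sys_file_exists idea: normalize,
   then test whether the file can be opened for reading. */
int normalized_file_exists(const char *path)
{
    char *p = normalize_path(path);
    if (!p) return 0;
    FILE *f = fopen(p, "rb");
    free(p);
    if (!f) return 0;
    fclose(f);
    return 1;
}
```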

-Fred
Title: Re: iwd1 compressed bif files (cbf)
Post by: FredSRichardson on May 21, 2006, 08:25:51 PM
Quote
Yeah. The data is little endian, so it has to go to big endian on read, and back to little on write. I assume WeiDU gets it from OCaml, but I've honestly never looked.

Macros should be pretty close to any other gcc variant. You can get a semi-complete list here (http://developer.apple.com/documentation/DeveloperTools/gcc-3.3/cpp/Predefined-Macros.html#Predefined-Macros), and there shouldn't be any surprises (IIRC, it's a straight copy of normal gcc docs).
Okay, it sounds like WeiDU has already abstracted byte-swapping then.

For the C code it's not a problem.  The macros I need are not general ones, but OS/arch-specific ones.  For example, GCC on Cygwin defines "__CYGWIN__" and "_X86_".  Try running the command below and send me the output:

Code: [Select]
> echo 'int main() {}' > tmp.c
> cpp -dD  tmp.c
# 1 "tmp.c"
# 1 "<built-in>"
#define __STDC_HOSTED__ 1
#define __GNUC__ 3
#define __GNUC_MINOR__ 4
#define __GNUC_PATCHLEVEL__ 4
#define __SIZE_TYPE__ unsigned int
#define __PTRDIFF_TYPE__ int
#define __WCHAR_TYPE__ short unsigned int
#define __WINT_TYPE__ unsigned int
#define __GXX_ABI_VERSION 1002
#define __USING_SJLJ_EXCEPTIONS__ 1
#define __SCHAR_MAX__ 127
#define __SHRT_MAX__ 32767
#define __INT_MAX__ 2147483647
#define __LONG_MAX__ 2147483647L
#define __LONG_LONG_MAX__ 9223372036854775807LL
#define __WCHAR_MAX__ 65535U
#define __CHAR_BIT__ 8
#define __FLT_EVAL_METHOD__ 2
#define __FLT_RADIX__ 2
#define __FLT_MANT_DIG__ 24
#define __FLT_DIG__ 6
#define __FLT_MIN_EXP__ (-125)
#define __FLT_MIN_10_EXP__ (-37)
#define __FLT_MAX_EXP__ 128
#define __FLT_MAX_10_EXP__ 38
#define __FLT_MAX__ 3.40282347e+38F
#define __FLT_MIN__ 1.17549435e-38F
#define __FLT_EPSILON__ 1.19209290e-7F
#define __FLT_DENORM_MIN__ 1.40129846e-45F
#define __FLT_HAS_INFINITY__ 1
#define __FLT_HAS_QUIET_NAN__ 1
#define __DBL_MANT_DIG__ 53
#define __DBL_DIG__ 15
#define __DBL_MIN_EXP__ (-1021)
#define __DBL_MIN_10_EXP__ (-307)
#define __DBL_MAX_EXP__ 1024
#define __DBL_MAX_10_EXP__ 308
#define __DBL_MAX__ 1.7976931348623157e+308
#define __DBL_MIN__ 2.2250738585072014e-308
#define __DBL_EPSILON__ 2.2204460492503131e-16
#define __DBL_DENORM_MIN__ 4.9406564584124654e-324
#define __DBL_HAS_INFINITY__ 1
#define __DBL_HAS_QUIET_NAN__ 1
#define __LDBL_MANT_DIG__ 64
#define __LDBL_DIG__ 18
#define __LDBL_MIN_EXP__ (-16381)
#define __LDBL_MIN_10_EXP__ (-4931)
#define __LDBL_MAX_EXP__ 16384
#define __LDBL_MAX_10_EXP__ 4932
#define __DECIMAL_DIG__ 21
#define __LDBL_MAX__ 1.18973149535723176502e+4932L
#define __LDBL_MIN__ 3.36210314311209350626e-4932L
#define __LDBL_EPSILON__ 1.08420217248550443401e-19L
#define __LDBL_DENORM_MIN__ 3.64519953188247460253e-4951L
#define __LDBL_HAS_INFINITY__ 1
#define __LDBL_HAS_QUIET_NAN__ 1
#define __REGISTER_PREFIX__
#define __USER_LABEL_PREFIX__ _
#define __VERSION__ "3.4.4 (cygming special) (gdc 0.12, using dmd 0.125)"
#define __NO_INLINE__ 1
#define __FINITE_MATH_ONLY__ 0


#define __i386 1
#define __i386__ 1
#define i386 1
#define __tune_i686__ 1
#define __tune_pentiumpro__ 1
#define _X86_ 1

#define __stdcall __attribute__((__stdcall__))
#define __fastcall __attribute__((__fastcall__))
#define __cdecl __attribute__((__cdecl__))
#define __declspec(x) __attribute__((x))
#define _stdcall __attribute__((__stdcall__))
#define _fastcall __attribute__((__fastcall__))
#define _cdecl __attribute__((__cdecl__))
# 1 "<command line>"
#define __CYGWIN32__ 1
#define __CYGWIN__ 1
#define unix 1
#define __unix__ 1
#define __unix 1
# 1 "tmp.c"
int main() {}
>
Title: Re: iwd1 compressed bif files (cbf)
Post by: devSin on May 21, 2006, 08:29:51 PM
I wanted to do it the lazy way, sorry. :-) You'll want to use __APPLE__ and __BIG_ENDIAN__ (as opposed to __ppc__).

The list
Code: [Select]
# 1 "tmp.c"
# 1 "<built-in>"
#define __STDC_HOSTED__ 1
#define __GNUC__ 4
#define __GNUC_MINOR__ 0
#define __GNUC_PATCHLEVEL__ 1
#define __APPLE_CC__ 5247
#define __SIZE_TYPE__ long unsigned int
#define __PTRDIFF_TYPE__ int
#define __WCHAR_TYPE__ int
#define __WINT_TYPE__ int
#define __INTMAX_TYPE__ long long int
#define __UINTMAX_TYPE__ long long unsigned int
#define __GXX_ABI_VERSION 1002
#define __SCHAR_MAX__ 127
#define __SHRT_MAX__ 32767
#define __INT_MAX__ 2147483647
#define __LONG_MAX__ 2147483647L
#define __LONG_LONG_MAX__ 9223372036854775807LL
#define __WCHAR_MAX__ 2147483647
#define __CHAR_BIT__ 8
#define __INTMAX_MAX__ 9223372036854775807LL
#define __FLT_EVAL_METHOD__ 0
#define __FLT_RADIX__ 2
#define __FLT_MANT_DIG__ 24
#define __FLT_DIG__ 6
#define __FLT_MIN_EXP__ (-125)
#define __FLT_MIN_10_EXP__ (-37)
#define __FLT_MAX_EXP__ 128
#define __FLT_MAX_10_EXP__ 38
#define __FLT_MAX__ 3.40282347e+38F
#define __FLT_MIN__ 1.17549435e-38F
#define __FLT_EPSILON__ 1.19209290e-7F
#define __FLT_DENORM_MIN__ 1.40129846e-45F
#define __FLT_HAS_INFINITY__ 1
#define __FLT_HAS_QUIET_NAN__ 1
#define __DBL_MANT_DIG__ 53
#define __DBL_DIG__ 15
#define __DBL_MIN_EXP__ (-1021)
#define __DBL_MIN_10_EXP__ (-307)
#define __DBL_MAX_EXP__ 1024
#define __DBL_MAX_10_EXP__ 308
#define __DBL_MAX__ 1.7976931348623157e+308
#define __DBL_MIN__ 2.2250738585072014e-308
#define __DBL_EPSILON__ 2.2204460492503131e-16
#define __DBL_DENORM_MIN__ 4.9406564584124654e-324
#define __DBL_HAS_INFINITY__ 1
#define __DBL_HAS_QUIET_NAN__ 1
#define __LDBL_MANT_DIG__ 106
#define __LDBL_DIG__ 31
#define __LDBL_MIN_EXP__ (-968)
#define __LDBL_MIN_10_EXP__ (-291)
#define __LDBL_MAX_EXP__ 1024
#define __LDBL_MAX_10_EXP__ 308
#define __DECIMAL_DIG__ 33
#define __LDBL_MAX__ 1.79769313486231580793728971405301e+308L
#define __LDBL_MIN__ 2.00416836000897277799610805135016e-292L
#define __LDBL_EPSILON__ 4.94065645841246544176568792868221e-324L
#define __LDBL_DENORM_MIN__ 4.94065645841246544176568792868221e-324L
#define __LDBL_HAS_INFINITY__ 1
#define __LDBL_HAS_QUIET_NAN__ 1
#define __REGISTER_PREFIX__
#define __USER_LABEL_PREFIX__ _
#define __VERSION__ "4.0.1 (Apple Computer, Inc. build 5247)"
#define __NO_INLINE__ 1
#define __FINITE_MATH_ONLY__ 0
#define _ARCH_PPC 1
#define __BIG_ENDIAN__ 1
#define _BIG_ENDIAN 1

#define __LONG_DOUBLE_128__ 1
#define __ppc__ 1
#define __POWERPC__ 1
#define __NATURAL_ALIGNMENT__ 1
#define __MACH__ 1
#define __APPLE__ 1
#define __strong
#define __weak
#define __PIC__ 1
# 1 "<command line>"
#define __DYNAMIC__ 1
# 1 "tmp.c"
int main()
Title: Re: iwd1 compressed bif files (cbf)
Post by: FredSRichardson on May 21, 2006, 09:39:11 PM
Yeah, <endian.h> isn't found in the include path when I compile with OCaml.  I think I'll have to use the __ppc__ macro (that's pretty common to all Macs, and it's not like we can run IE on other big-endian architectures).

Can you try this code out?  You'll need a CBF file to test it with (and a reference BIF file to test the results with).  If you need these, let me know and we can figure out a way for me to get them to you.

Code: [Select]
/* $Id: un_bifc.c,v 1.1 2000/08/18 23:37:01 jedwin Exp $ */

/*
 * un_bifc: unpack a compressed .bif file (BG2 style)
 *
 * This is a sample program from the Infinity Engine File Format Hacking
 * Project.  Use it as you like.  Author assumes no responsibility, yada yada
 * yada.
 */

#include <zlib.h>
#include <stdio.h>
#include <stdlib.h>
#include <assert.h>
#include <strings.h>


int
fread_check(void* b, size_t sz, size_t cnt, FILE* fp, const char* fn)
{
    if (fread(b, sz, cnt, fp) != cnt) {
        fprintf(stderr, "Failed to read %d bytes from %s\n", sz*cnt, fn);
        return 1;
    }
    return 0;
}

/* BS for byte-swap */
#define BS_4BYTE(a) ((((a)&0xFF000000)>>24)|(((a)&0x00FF0000)>>8)|(((a)&0x0000FF00)<<8)|(((a)&0x000000FF)<<24))
#define BS_2BYTE(a) ((((a)&0xFF00)>>8)|(((a)&0x00FF)<<8)

int fread_uint(unsigned int* i, FILE* fp, const char* fn)
{
    if (fread_check(&i, 4, 1, fp, fn))
        return 1;
#ifdef __ppc__
    i = BS_4BYTE(i);
#endif
    return 0;
}


/* zerr() and def() are copied directly from zlib example code. */

#if defined(MSDOS) || defined(OS2) || defined(WIN32) || defined(__CYGWIN__)
#  include <fcntl.h>
#  include <io.h>
#  define SET_BINARY_MODE(file) setmode(fileno(file), O_BINARY)
#else
#  define SET_BINARY_MODE(file)
#endif

#define CHUNK 16384


/* report a zlib or i/o error */
void zerr(int ret)
{
    fputs("zpipe: ", stderr);
    switch (ret) {
    case Z_ERRNO:
        if (ferror(stdin))
            fputs("error reading stdin\n", stderr);
        if (ferror(stdout))
            fputs("error writing stdout\n", stderr);
        break;
    case Z_STREAM_ERROR:
        fputs("invalid compression level\n", stderr);
        break;
    case Z_DATA_ERROR:
        fputs("invalid or incomplete deflate data\n", stderr);
        break;
    case Z_MEM_ERROR:
        fputs("out of memory\n", stderr);
        break;
    case Z_VERSION_ERROR:
        fputs("zlib version mismatch!\n", stderr);
    }
}

/* Decompress from file source to file dest until stream ends or EOF.
   inf() returns Z_OK on success, Z_MEM_ERROR if memory could not be
   allocated for processing, Z_DATA_ERROR if the deflate data is
   invalid or incomplete, Z_VERSION_ERROR if the version of zlib.h and
   the version of the library linked do not match, or Z_ERRNO if there
   is an error reading or writing the files. */
int inf(FILE *source, FILE *dest)
{
    int ret;
    unsigned have;
    z_stream strm;
    unsigned char in[CHUNK];
    unsigned char out[CHUNK];
   
    /* allocate inflate state */
    strm.zalloc = Z_NULL;
    strm.zfree = Z_NULL;
    strm.opaque = Z_NULL;
    strm.avail_in = 0;
    strm.next_in = Z_NULL;
    ret = inflateInit(&strm);
    if (ret != Z_OK)
        return ret;
    /* decompress until deflate stream ends or end of file */
    do {
        strm.avail_in = fread(in, 1, CHUNK, source);
        if (ferror(source)) {
            (void)inflateEnd(&strm);
            return Z_ERRNO;
        }
        if (strm.avail_in == 0)
            break;
        strm.next_in = in;
        /* run inflate() on input until output buffer not full */
        do {
            strm.avail_out = CHUNK;
            strm.next_out = out;
            ret = inflate(&strm, Z_NO_FLUSH);
            assert(ret != Z_STREAM_ERROR);  /* state not clobbered */
            switch (ret) {
            case Z_NEED_DICT:
                ret = Z_DATA_ERROR;     /* and fall through */
            case Z_DATA_ERROR:
            case Z_MEM_ERROR:
                (void)inflateEnd(&strm);
                return ret;
            }
            have = CHUNK - strm.avail_out;
            if (fwrite(out, 1, have, dest) != have || ferror(dest)) {
                (void)inflateEnd(&strm);
                return Z_ERRNO;
            }
        } while (strm.avail_out == 0);
        /* done when inflate() says it's done */
    } while (ret != Z_STREAM_END);
    /* clean up and return */
    (void)inflateEnd(&strm);
    return ret == Z_STREAM_END ? Z_OK : Z_DATA_ERROR;
}

int
cbf2bif(const char* cbf_file, const char* bif_file)
{
    FILE *cbf_fp;
    FILE *bif_fp;
    char sigver[9];
    unsigned int bif_file_len, cmplen, uncmplen;
    int zret;
   
    if (!(cbf_fp = fopen(cbf_file, "rb"))) {
        fprintf(stderr, "Failure opening file: %s\n", cbf_file);
        return 1;
    }
    if (!(bif_fp = fopen(bif_file, "wb"))) {
        fclose(cbf_fp);
        fprintf(stderr, "Failure opening file: %s\n", bif_file);
        return 1;
    }

    sigver[8] = 0;
    if (fread_check(sigver, 1, 8, cbf_fp, cbf_file)) {
        fclose(cbf_fp);
        fclose(bif_fp);
        return 1;
    }
   
    if (strcmp(sigver, "BIF V1.0")) {
        fprintf(stderr, "Incorrect CBF header for file %s\n", cbf_file);
        fclose(cbf_fp);
        fclose(bif_fp);
        return 1;
    }

    if (fread_uint(&bif_file_len, cbf_fp, cbf_file)) {
        fclose(cbf_fp);
        fclose(bif_fp);
        return 1;
    }
   
    if (bif_file_len <=0 || bif_file_len > 128) {
        fclose(cbf_fp);
        fclose(bif_fp);
        fprintf(stderr, "Corrupt CBF file %s\n", cbf_file);
        return 1;
    }

    if (fseek(cbf_fp, bif_file_len, SEEK_CUR)) {
        fclose(cbf_fp);
        fclose(bif_fp);
        fprintf(stderr, "Failed to seek ahead in file %s\n", cbf_file);
        return 1;
    }

    if (fread_uint(&uncmplen, cbf_fp, cbf_file)) {
        fclose(cbf_fp);
        fclose(bif_fp);
        fprintf(stderr, "Failed to seek ahead in file %s\n", cbf_file);
        return 1;
    }

    if (fread_uint(&cmplen, cbf_fp, cbf_file)) {
        fclose(cbf_fp);
        fclose(bif_fp);
        fprintf(stderr, "Failed to seek ahead in file %s\n", cbf_file);
        return 1;
    }

    printf("CBF %s (%d bytes) -> BIF %s [%d bytes]\n", cbf_file, cmplen, bif_file, uncmplen);


    if ((zret=inf(cbf_fp, bif_fp)) != Z_OK) {
        fclose(cbf_fp);
        fclose(bif_fp);
        zerr(zret);
        return 1;
    }

    if (fclose(cbf_fp)) {
        fclose(bif_fp);
        fprintf(stderr, "Failure closing file %s\n", cbf_file);
        return 1;
    }
    if (fclose(bif_fp)) {
        fprintf(stderr, "Failure closing file %s\n", bif_file);
        return 1;
    }
   
    return 0;
}


int main( int argc, char *argv[] )
{
    if (argc != 3) {
        fprintf(stderr, "Usage: %s <in-cbf-file> <out-bif-file>\n", argv[0]);
        exit(1);
    }
   
    if (cbf2bif(argv[1], argv[2])) {
        fprintf(stderr, "Conversion failed.\n");
        exit(1);
    }
    return 0;
}
Title: Re: iwd1 compressed bif files (cbf)
Post by: devSin on May 21, 2006, 09:48:27 PM
You may want to check both __ppc64__ and __ppc__ (I can't remember if the G5 even defines __ppc__, or just the 64).

I'll try it with one of the CBFs on ID disc 2.

What use for the reference BIFF?
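
On the macro question: the predefined-macro dump earlier in the thread shows __BIG_ENDIAN__ alongside __ppc__, so a guard can test several names at once. The sketch below is only a guess at what a defensive check might look like. HOST_BIG_ENDIAN and host_is_big_endian_runtime are made-up names, and __BYTE_ORDER__ is only provided by newer GCC and Clang releases, not by the Apple GCC 4.0.1 shown above. The runtime probe is there as a cross-check.

```c
#include <stdint.h>

/* Compile-time guess: any one of these names marks a big-endian target. */
#if defined(__ppc__) || defined(__ppc64__) || defined(__BIG_ENDIAN__) || \
    (defined(__BYTE_ORDER__) && defined(__ORDER_BIG_ENDIAN__) && \
     __BYTE_ORDER__ == __ORDER_BIG_ENDIAN__)
#  define HOST_BIG_ENDIAN 1
#else
#  define HOST_BIG_ENDIAN 0
#endif

/* Runtime probe: look at the first byte of a known 16-bit value. */
int host_is_big_endian_runtime(void)
{
    const uint16_t probe = 0x0102;
    return *(const unsigned char *)&probe == 0x01;
}
```

If the two ever disagree, the macro list is missing the name that the compiler in question actually defines.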
Title: Re: iwd1 compressed bif files (cbf)
Post by: devSin on May 21, 2006, 10:02:13 PM
Code: [Select]
#define BS_4BYTE(a) ((((a)&0xFF000000)>>24)|(((a)&0x00FF0000)>>8)|(((a)&0x0000FF00)<<8)|(((a)&0x000000FF)<<24))
#define BS_2BYTE(a) ((((a)&0xFF00)>>8)|(((a)&0x00FF)<<8)
These will need to be cast. I just hacked
Code: [Select]
#define BS_4BYTE(a) ((uint32_t)(((uint32_t)(a)&0xFF000000)>>24) | \
                   (((uint32_t)(a)&0x00FF0000)>>8)| \
                   (((uint32_t)(a)&0x0000FF00)<<8)| \
                   (((uint32_t)(a)&0x000000FF)<<24))
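
A note on the casts: they make the macro accept anything, including a pointer passed by mistake. A small function taking uint32_t keeps the compiler's type checking while doing the same swap. This is a sketch only; swap32 is an illustrative name, not something from un_bifc.c.

```c
#include <stdint.h>

/* Reverse the byte order of a 32-bit value. */
uint32_t swap32(uint32_t v)
{
    return ((v & 0xFF000000u) >> 24) |
           ((v & 0x00FF0000u) >> 8)  |
           ((v & 0x0000FF00u) << 8)  |
           ((v & 0x000000FFu) << 24);
}
```

Calling swap32 with a pointer argument draws a diagnostic, whereas the cast version compiles silently and swaps the address.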
Title: Re: iwd1 compressed bif files (cbf)
Post by: devSin on May 21, 2006, 10:09:21 PM
Fails with "corrupt CBF file"

I'm using AR200B.CBF from CD2/data.
MD5 (AR200B.CBF) = 89ef1caf7d916024f66d41f14c166fdb
Title: Re: iwd1 compressed bif files (cbf)
Post by: FredSRichardson on May 21, 2006, 11:45:47 PM
Try commenting out the byte-swapping code (I only use BS_4BYTE in one place) and see what happens.  Maybe the CBF files on the Mac were created on a Mac.

-Fred
EDIT: I take it back.  My file has the same md5sum, so they must have been compressed on the same arch.
Title: Re: iwd1 compressed bif files (cbf)
Post by: devSin on May 21, 2006, 11:50:56 PM
un_sav.c works for them. un_sav requires 3 additional lines to work on PPC. Assuming OSSwapInt32 is more or less identical to the (uint32_t)macro in un_cbf (it is):
Code: [Select]
#include <zlib.h>
#include <stdio.h>
#include <stdlib.h>

int extractFile( FILE *fIn )
{
  unsigned long namelen;
  char namebuf[1024];
  unsigned long cmplen, uncmplen;
  FILE *fOut;
  void *destBuf, *srcBuf;
  unsigned long offset;

  offset = ftell( fIn );
  printf( "un-saving at offset 0x%08lx\n", offset );
  if ( fread( &namelen, 4, 1, fIn ) != 1 ) return 0;
/* BYTE SWAP */
  namelen = OSSwapInt32(namelen); // swap it
  if ( fread( namebuf, 1, namelen, fIn ) != namelen ) return 0;
  if ( fread( &uncmplen, 4, 1, fIn ) != 1 ) return 0;
/* BYTE SWAP */
  uncmplen = OSSwapInt32(uncmplen); // swap it
  if ( fread( &cmplen, 4, 1, fIn ) != 1 ) return 0;
/* BYTE SWAP */
  cmplen = OSSwapInt32(cmplen); // swap it

  fOut = fopen( namebuf, "wb" ); // char array - no swap
 
//  everything else is up to gzip (it will read the little endian chunks and
// spit them out correctly)
  if ( fOut == NULL ) return 0;
  srcBuf = malloc( cmplen );
  destBuf = malloc( uncmplen );
  if ( !srcBuf || !destBuf || fread( srcBuf, 1, cmplen, fIn )!=cmplen )
    {
      fclose( fOut );
      if ( destBuf ) free( destBuf );
      if ( srcBuf ) free( srcBuf );
      return 0;
    }
  uncompress( destBuf, &uncmplen, srcBuf, cmplen );
  fwrite( destBuf, 1, uncmplen, fOut );
  fclose( fOut );
  free( destBuf );
  free( srcBuf );
  return 1;
}

void unSav( const char *filename )
{
  char signature[4], version[4];
  FILE *fIn = fopen( filename, "rb" );
  if ( !fIn ) return;
  fread( signature, 1, 4, fIn ); // garbage
  fread( version, 1, 4, fIn );   // read
  while( extractFile( fIn ) ); // nothing Mac specific; let 'er rip
  fclose( fIn );
}

int main( int c, char **v )
{
  unSav( v[1] );
  return 0;
}
This code will successfully decompress the CBF. un_bifc (and probably the bam one) is similarly easy.
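
Going by the reads in un_sav.c and the un_bifc-based converter posted earlier, the CBF layout appears to be: 8 bytes of signature and version ("BIF V1.0"), a 32-bit little-endian name length, the name itself, then 32-bit uncompressed and compressed lengths, followed by the zlib stream. Here is a sketch that parses this from a memory buffer. The struct and function names are hypothetical, and the 128-byte cap on the name length is simply the sanity check used in the converter above, not a documented limit.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

typedef struct {
    uint32_t name_len;    /* length of the embedded file name */
    uint32_t uncmplen;    /* size of the BIF after inflation */
    uint32_t cmplen;      /* size of the zlib stream */
    size_t   data_offset; /* where the zlib stream starts */
} cbf_header;

/* Decode a little-endian 32-bit value on any host. */
uint32_t le32(const unsigned char *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Returns 0 on success, -1 if the buffer does not look like a CBF header. */
int parse_cbf_header(const unsigned char *buf, size_t len, cbf_header *out)
{
    if (len < 12 || memcmp(buf, "BIF V1.0", 8) != 0)
        return -1;
    out->name_len = le32(buf + 8);
    if (out->name_len == 0 || out->name_len > 128)
        return -1;
    if (len < 12 + (size_t)out->name_len + 8)
        return -1;
    out->uncmplen = le32(buf + 12 + out->name_len);
    out->cmplen = le32(buf + 12 + out->name_len + 4);
    out->data_offset = 12 + (size_t)out->name_len + 8;
    return 0;
}
```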
Title: Re: iwd1 compressed bif files (cbf)
Post by: FredSRichardson on May 21, 2006, 11:57:44 PM
Ooops, I didn't test my code very well.  Try this version:

Code: [Select]
/* cbf2bif.c: Based on jedwin un_bifc, hacked by frichard */

/*
 * Small program that unpacks a CBF file as a BIF file with minimal memory overhead.
 */

#include <zlib.h>
#include <stdio.h>
#include <stdlib.h>
#include <stdint.h>   /* uint32_t */
#include <string.h>   /* strcmp */
#include <assert.h>


int
fread_check(void* b, size_t sz, size_t cnt, FILE* fp, const char* fn)
{
    if (fread(b, sz, cnt, fp) != cnt) {
        fprintf(stderr, "Failed to read %lu bytes from %s\n", (unsigned long)(sz*cnt), fn);
        return 1;
    }
    return 0;
}

/* BS for byte-swap */
#define BS_4BYTE(a) ((uint32_t)(((uint32_t)(a)&0xFF000000)>>24) | \
                   (((uint32_t)(a)&0x00FF0000)>>8)| \
                   (((uint32_t)(a)&0x0000FF00)<<8)| \
                   (((uint32_t)(a)&0x000000FF)<<24))
#define BS_2BYTE(a) ((uint32_t)((((uint32_t)(a)&0xFF00)>>8)|(((uint32_t)(a)&0x00FF)<<8)))

int fread_uint(uint32_t* i, FILE* fp, const char* fn)
{
    if (fread_check(i, 4, 1, fp, fn))
        return 1;
#ifdef __ppc__
    *i = BS_4BYTE(*i);
#endif
    return 0;
}


/* zerr() and inf() are copied directly from zlib example code. */

#if defined(MSDOS) || defined(OS2) || defined(WIN32) || defined(__CYGWIN__)
#  include <fcntl.h>
#  include <io.h>
#  define SET_BINARY_MODE(file) setmode(fileno(file), O_BINARY)
#else
#  define SET_BINARY_MODE(file)
#endif

#define CHUNK 16384


/* report a zlib or i/o error */
void zerr(int ret)
{
    fputs("zpipe: ", stderr);
    switch (ret) {
    case Z_ERRNO:
        if (ferror(stdin))
            fputs("error reading stdin\n", stderr);
        if (ferror(stdout))
            fputs("error writing stdout\n", stderr);
        break;
    case Z_STREAM_ERROR:
        fputs("invalid compression level\n", stderr);
        break;
    case Z_DATA_ERROR:
        fputs("invalid or incomplete deflate data\n", stderr);
        break;
    case Z_MEM_ERROR:
        fputs("out of memory\n", stderr);
        break;
    case Z_VERSION_ERROR:
        fputs("zlib version mismatch!\n", stderr);
    }
}

/* Decompress from file source to file dest until stream ends or EOF.
   inf() returns Z_OK on success, Z_MEM_ERROR if memory could not be
   allocated for processing, Z_DATA_ERROR if the deflate data is
   invalid or incomplete, Z_VERSION_ERROR if the version of zlib.h and
   the version of the library linked do not match, or Z_ERRNO if there
   is an error reading or writing the files. */
int inf(FILE *source, FILE *dest)
{
    int ret;
    unsigned have;
    z_stream strm;
    unsigned char in[CHUNK];
    unsigned char out[CHUNK];
   
    /* allocate inflate state */
    strm.zalloc = Z_NULL;
    strm.zfree = Z_NULL;
    strm.opaque = Z_NULL;
    strm.avail_in = 0;
    strm.next_in = Z_NULL;
    ret = inflateInit(&strm);
    if (ret != Z_OK)
        return ret;
    /* decompress until deflate stream ends or end of file */
    do {
        strm.avail_in = fread(in, 1, CHUNK, source);
        if (ferror(source)) {
            (void)inflateEnd(&strm);
            return Z_ERRNO;
        }
        if (strm.avail_in == 0)
            break;
        strm.next_in = in;
        /* run inflate() on input until output buffer not full */
        do {
            strm.avail_out = CHUNK;
            strm.next_out = out;
            ret = inflate(&strm, Z_NO_FLUSH);
            assert(ret != Z_STREAM_ERROR);  /* state not clobbered */
            switch (ret) {
            case Z_NEED_DICT:
                ret = Z_DATA_ERROR;     /* and fall through */
            case Z_DATA_ERROR:
            case Z_MEM_ERROR:
                (void)inflateEnd(&strm);
                return ret;
            }
            have = CHUNK - strm.avail_out;
            if (fwrite(out, 1, have, dest) != have || ferror(dest)) {
                (void)inflateEnd(&strm);
                return Z_ERRNO;
            }
        } while (strm.avail_out == 0);
        /* done when inflate() says it's done */
    } while (ret != Z_STREAM_END);
    /* clean up and return */
    (void)inflateEnd(&strm);
    return ret == Z_STREAM_END ? Z_OK : Z_DATA_ERROR;
}

int
cbf2bif(const char* cbf_file, const char* bif_file)
{
    FILE *cbf_fp;
    FILE *bif_fp;
    char sigver[9];
    uint32_t bif_file_len, cmplen, uncmplen;
    int zret;
   
    if (!(cbf_fp = fopen(cbf_file, "rb"))) {
        fprintf(stderr, "Failure opening file: %s\n", cbf_file);
        return 1;
    }
    if (!(bif_fp = fopen(bif_file, "wb"))) {
        fclose(cbf_fp);
        fprintf(stderr, "Failure opening file: %s\n", bif_file);
        return 1;
    }

    sigver[8] = 0;
    if (fread_check(sigver, 1, 8, cbf_fp, cbf_file)) {
        fclose(cbf_fp);
        fclose(bif_fp);
        return 1;
    }
   
    if (strcmp(sigver, "BIF V1.0")) {
        fprintf(stderr, "Incorrect CBF header for file %s\n", cbf_file);
        fclose(cbf_fp);
        fclose(bif_fp);
        return 1;
    }

    if (fread_uint(&bif_file_len, cbf_fp, cbf_file)) {
        fclose(cbf_fp);
        fclose(bif_fp);
        return 1;
    }
   
    if (bif_file_len <=0 || bif_file_len > 128) {
        fclose(cbf_fp);
        fclose(bif_fp);
        fprintf(stderr, "Corrupt CBF file %s\n", cbf_file);
        return 1;
    }

    if (fseek(cbf_fp, bif_file_len, SEEK_CUR)) {
        fclose(cbf_fp);
        fclose(bif_fp);
        fprintf(stderr, "Failed to seek ahead in file %s\n", cbf_file);
        return 1;
    }

    if (fread_uint(&uncmplen, cbf_fp, cbf_file)) {
        fclose(cbf_fp);
        fclose(bif_fp);
        fprintf(stderr, "Failed to seek ahead in file %s\n", cbf_file);
        return 1;
    }

    if (fread_uint(&cmplen, cbf_fp, cbf_file)) {
        fclose(cbf_fp);
        fclose(bif_fp);
        fprintf(stderr, "Failed to seek ahead in file %s\n", cbf_file);
        return 1;
    }

    printf("CBF %s (%d bytes) -> BIF %s [%d bytes]\n", cbf_file, (int)cmplen, bif_file, (int)uncmplen);


    if ((zret=inf(cbf_fp, bif_fp)) != Z_OK) {
        fclose(cbf_fp);
        fclose(bif_fp);
        zerr(zret);
        return 1;
    }

    if (fclose(cbf_fp)) {
        fclose(bif_fp);
        fprintf(stderr, "Failure closing file %s\n", cbf_file);
        return 1;
    }
    if (fclose(bif_fp)) {
        fprintf(stderr, "Failure closing file %s\n", bif_file);
        return 1;
    }
   
    return 0;
}


int main( int argc, char *argv[] )
{
    if (argc != 3) {
        fprintf(stderr, "Usage: %s <in-cbf-file> <out-bif-file>\n", argv[0]);
        exit(1);
    }
   
    if (cbf2bif(argv[1], argv[2])) {
        fprintf(stderr, "Conversion failed.\n");
        exit(1);
    }
    return 0;
}
Title: Re: iwd1 compressed bif files (cbf)
Post by: FredSRichardson on May 22, 2006, 12:05:11 AM
un_sav.c works for them. un_sav requires 3 additional lines to work on PPC. Assuming OSSwapInt32 is more or less identical to the (uint32_t)macro in un_cbf (it is):

Yes, un_sav (and un_bifc) have the problem that they use "uncompress", which loads the whole file into memory.  Some of these CBF files can be quite big, so this is a problem for WeiDU.  Try out the newer version I just posted.

OSSwapInt32?  I probably should've used that instead of rolling my own.  I'm glad to see at least one vendor is including standard byte-swapping macros :)

-Fred
Title: Re: iwd1 compressed bif files (cbf)
Post by: devSin on May 22, 2006, 12:08:40 AM
Ooops, I didn't test my code very well.  Try this version:
The result
Code: [Select]
Mainframe:~/Desktop akay$ ./a.out AR200B.CBF AR200B.CBF.uncompressed
CBF AR200B.CBF (1145074 bytes) -> BIF AR200B.CBF.uncompressed [1939843 bytes]
Mainframe:~/Desktop akay$ MD5 /Users/akay/Desktop/AR200B.CBF.uncompressed
MD5 (/Users/akay/Desktop/AR200B.CBF.uncompressed) = 59470dd75d3914b052ecf6dc956ccd30
I'd change the following to uint16_t
Code: [Select]
#define BS_2BYTE(a) ((uint32_t)((((uint32_t)(a)&0xFF00)>>8)|(((uint32_t)(a)&0x00FF)<<8)))
not least because it's such a vital part of the program. ;-)

Assuming our MD5s match, it looks to work OK here!
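
For completeness, the uint16_t suggestion could look like this when written as a function instead of a macro. It is a sketch with an illustrative name (swap16); nothing in the posted program calls it.

```c
#include <stdint.h>

/* Reverse the two bytes of a 16-bit value; the result stays 16 bits wide. */
uint16_t swap16(uint16_t v)
{
    return (uint16_t)((uint16_t)(v >> 8) | (uint16_t)(v << 8));
}
```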
Title: Re: iwd1 compressed bif files (cbf)
Post by: devSin on May 22, 2006, 12:13:56 AM
OSSwapInt32?  I probably should've used that instead of rolling my own.  I'm glad to see at least one vendor is including standard byte-swapping macros :)
You'd want OSSwapConstInt32(x) and OSSwapConstInt16(x). Mac OS X has macros to swap to little or big endian (generic macros and inline assembly), and limited support for pdp, as well as the posix standard (nh)to(hn)(sl). I use OSSwap* when I'm stealing code from others. :D
Title: Re: iwd1 compressed bif files (cbf)
Post by: FredSRichardson on May 22, 2006, 12:15:31 AM
Ooops, I didn't test my code very well.  Try this version:
The result
Code: [Select]
Mainframe:~/Desktop akay$ ./a.out AR200B.CBF AR200B.CBF.uncompressed
CBF AR200B.CBF (1145074 bytes) -> BIF AR200B.CBF.uncompressed [1939843 bytes]
Mainframe:~/Desktop akay$ MD5 /Users/akay/Desktop/AR200B.CBF.uncompressed
MD5 (/Users/akay/Desktop/AR200B.CBF.uncompressed) = 59470dd75d3914b052ecf6dc956ccd30
I'd change the following to uint16_t
Code: [Select]
#define BS_2BYTE(a) ((uint32_t)((((uint32_t)(a)&0xFF00)>>8)|(((uint32_t)(a)&0x00FF)<<8)))
not least because it's such a vital part of the program. ;-)

Assuming our MD5s match, it looks to work OK here!
Ooops, good catch!  More sloppy coding on my part (I probably shouldn't write code I don't use :D).

BTW: I've found that the way to be happiest with IWD1 is to unpack all the CBF files to BIF files and put them into the "data" directory.  That is, if disk space is no problem.

For IWD1Tutu, I have to make sure the CBF files are all un-packed.  I'm referencing resources directly in the IWD1 BIF files from IWD2 (by making a mess of the Chitin.key).  I haven't checked to see if IWD2 can handle CBF files.
Title: Re: iwd1 compressed bif files (cbf)
Post by: devSin on May 22, 2006, 12:24:47 AM
The only thing left that you may want to do is #ifdef __ppc64__ (I compile the Mac OS X releases on a frumpy G4, though, so no difference to me). I wish I knew for sure if 64-bit PPC also defined __ppc__, sorry.

I'd toyed with decompressing all the CBFs and BIFCs, but during gameplay, ID isn't intensive enough to really need it, and BG2 can decompress the BIFC archive faster than a slow machine could ever load the required resources, so I just put up with any overhead the unzip requires. For editing, though, I'd have to agree, but luckily most mortals don't ever need to deal with the compressed resources.

It would probably be most beneficial if you do some magic to prevent the game trying to cache the data (the increased copy would kill off some of the benefit), but I haven't checked to see if any solutions work on Mac OS X (I always assumed that the game recreates cache/data if it doesn't exist, but maybe not).
Title: Re: iwd1 compressed bif files (cbf)
Post by: FredSRichardson on May 22, 2006, 12:44:38 AM
OSSwapInt32?  I probably should've used that instead of rolling my own.  I'm glad to see at least one vendor is including standard byte-swapping macros :)
You'd want OSSwapConstInt32(x) and OSSwapConstInt16(x). Mac OS X has macros to swap to little or big endian (generic macros and inline assembly), and limited support for pdp, as well as the posix standard (nh)to(hn)(sl). I use OSSwap* when I'm stealing code from others. :D
Well, since the only big-endian architecture we have to worry about is the Mac, then I'll go ahead and make that change.  I guess the [nh]to[hn][sl] routines can't be used in this case since "net" is always big-endian (and "host" is native).  Now a PDP port of WeiDU, that could be interesting :)
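
Strictly speaking, the [nh]to[hn][sl] routines can still be pressed into service. ntohl always yields the big-endian reading of the four bytes regardless of host, so one unconditional swap afterwards gives the little-endian reading on every host. A sketch, assuming a POSIX <arpa/inet.h> is available; swap32 and le32_from_raw are illustrative names:

```c
#include <arpa/inet.h>
#include <stdint.h>

/* Reverse the byte order of a 32-bit value. */
uint32_t swap32(uint32_t v)
{
    return ((v & 0xFF000000u) >> 24) | ((v & 0x00FF0000u) >> 8) |
           ((v & 0x0000FF00u) << 8)  | ((v & 0x000000FFu) << 24);
}

/* raw holds four file bytes exactly as fread() stored them.
   ntohl() gives their big-endian value on any host; swapping that
   gives their little-endian value, which is what the file format uses. */
uint32_t le32_from_raw(uint32_t raw)
{
    return swap32(ntohl(raw));
}
```

This needs no #ifdef at all, at the cost of a redundant double swap on little-endian machines.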
Title: Re: iwd1 compressed bif files (cbf)
Post by: FredSRichardson on May 22, 2006, 12:49:02 AM
The only thing left that you may want to do is #ifdef __ppc64__ (I compile the Mac OS X releases on a frumpy G4, though, so no difference to me). I wish I knew for sure if 64-bit PPC also defined __ppc__, sorry.

I think it's safe to say that __ppc__ is defined on all Macs unless you're running an Intel Mac.  I think it's equivalent to __x86__ (even Cygwin/GCC on my dual-core Athlon X2 defines this).
Title: Re: iwd1 compressed bif files (cbf)
Post by: devSin on May 22, 2006, 12:52:20 AM
OSSwapInt32?  I probably should've used that instead of rolling my own.  I'm glad to see at least one vendor is including standard byte-swapping macros :)
You'd want OSSwapConstInt32(x) and OSSwapConstInt16(x). Mac OS X has macros to swap to little or big endian (generic macros and inline assembly), and limited support for pdp, as well as the posix standard (nh)to(hn)(sl). I use OSSwap* when I'm stealing code from others. :D
Well, since the only big-endian architecture we have to worry about is the Mac, then I'll go ahead and make that change.  I guess the [nh]to[hn][sl] routines can't be used in this case since "net" is always big-endian (and "host" is native).  Now a PDP port of WeiDU, that could be interesting :)
Yes. The order of bytes shall be influenced by the phases of the moon. ;-)

Anyway, if there's ever cause for a Sparc port or somesuch, it'll be easy enough to roll back out into zlib.c.

Quote
I think it's safe to say that __ppc__ is defined on all Macs unless you're running an Intel Mac.  I think it's equivalent to __x86__ (even Cygwin/GCC on my dual-core Athlon X2 defines this).
I don't want to look at the tech docs, but I'm fairly sure that Apple's recommendation is that __ppc__ should not be defined on 64-bit systems.

EDIT: What am I thinking? WeiDU will never be compiled as a 64-bit executable. :-/

Yeah, __ppc__ should be sufficient.
Title: Re: iwd1 compressed bif files (cbf)
Post by: FredSRichardson on May 22, 2006, 08:55:04 AM
Quote
I think it's safe to say that __ppc__ is defined on all Macs unless you're running an Intel Mac.  I think it's equivalent to __x86__ (even Cygwin/GCC on my dual-core Athlon X2 defines this).
I don't want to look at the tech docs, but I'm fairly sure that Apple's recommendation is that __ppc__ should not be defined on 64-bit systems.

EDIT: What am I thinking? WeiDU will never be compiled as a 64-bit executable. :-/

Yeah, __ppc__ should be sufficient.
Yes, you're right on both counts!  I'll do the right thing and put in both macros anyway.  I'll post another patch here shortly :)

EDIT: I just realized that it's possible that WeiDU will get ported to more OSes with GemRB.  So I'll play it safe and try to do the right thing :)
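
One way to play it safe on any architecture is to skip the swap macros and build the value from individual bytes, which gives the same result on little-endian and big-endian hosts alike. This is a minimal sketch, not what the patch does; le32_decode and fread_le32 are illustrative names.

```c
#include <stdint.h>
#include <stdio.h>

/* Assemble a little-endian 32-bit value from four bytes. */
uint32_t le32_decode(const unsigned char b[4])
{
    return (uint32_t)b[0] |
           ((uint32_t)b[1] << 8) |
           ((uint32_t)b[2] << 16) |
           ((uint32_t)b[3] << 24);
}

/* Read one little-endian 32-bit value from fp.
   Returns 0 on success, 1 on a short read. */
int fread_le32(uint32_t *out, FILE *fp)
{
    unsigned char b[4];
    if (fread(b, 1, 4, fp) != 4)
        return 1;
    *out = le32_decode(b);
    return 0;
}
```

With this, no #ifdef on __ppc__ or __ppc64__ is needed in the reader at all.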
Title: Re: iwd1 compressed bif files (cbf)
Post by: the bigg on May 22, 2006, 09:27:59 AM
Not to be lazy, but, uh, did you update the diff?  :)
Title: Re: iwd1 compressed bif files (cbf)
Post by: FredSRichardson on May 22, 2006, 09:32:38 AM
Not to be lazy, but, uh, did you update the diff?  :)
I was just getting to that :)

Still no cleanup code, and the load.ml patch is the same.  This just adds MacOS big-endian support (and a few minor cleanups):

Code: [Select]
diff -u -r ./WeiDU-192.orig/src/load.ml ./WeiDU-192/src/load.ml
--- ./WeiDU-192.orig/src/load.ml 2006-04-10 16:58:53.000000000 -0400
+++ ./WeiDU-192/src/load.ml 2006-05-21 12:17:58.171875000 -0400
@@ -314,24 +314,41 @@
 
 let skip_next_load_error = ref false
 
+external cbf2bif : string -> string -> int
+    = "mlgz_cbf2bif"
+
 let load_bif_in_game game bif_file =
     if Hashtbl.mem game.loaded_biffs bif_file then
       Hashtbl.find game.loaded_biffs bif_file (* already here *)
     else begin
       (* we must load the BIF *)
-      let biff_path =
-        let rec trial lst =
+      let biff_path = begin
+        let rec trial f lst =
           match lst with
-            [] -> find_file_in_path game.game_path bif_file
+            [] -> find_file_in_path game.game_path f
           | hd :: tl ->
-            let perhaps = find_file_in_path hd bif_file in
-            log_only "BIFF may be in hard-drive CD-path [%s]\n" perhaps ;
-            if file_exists perhaps then
-              perhaps
-            else trial tl
+              let perhaps = find_file_in_path hd f in
+              log_only "BIFF may be in hard-drive CD-path [%s]\n" perhaps ;
+              if file_exists perhaps then
+                perhaps
+              else trial f tl
         in
-        trial (game.cd_path_list)
-      in
+        (* Check to see if the bif file exists, if it doesn't try for a .CBF file *)
+        let bf = trial bif_file (game.cd_path_list @ [ game.game_path ^ "/cache" ] ) in
+        if file_exists bf then
+          bf
+        else begin
+          let cbf = Filename.chop_extension bif_file ^ ".cbf" in
+          let cbf_file = trial cbf (game.cd_path_list) in
+          if file_exists cbf_file then
+            let cache_file = game.game_path ^ "/cache/" ^ bif_file in
+            let sz = cbf2bif cbf_file cache_file in
+            let _ = log_and_print "[%s] decompressed bif file %d bytes\n" cbf_file sz in
+            cache_file
+          else
+            bf
+        end
+      end in
       let the_biff = Biff.load_biff biff_path in
       Hashtbl.add game.loaded_biffs bif_file the_biff ;
       the_biff
diff -u -r ./WeiDU-192.orig/zlib/zlib.c ./WeiDU-192/zlib/zlib.c
--- ./WeiDU-192.orig/zlib/zlib.c 2003-06-02 06:08:24.000000000 -0400
+++ ./WeiDU-192/zlib/zlib.c 2006-05-22 10:16:54.250000000 -0400
@@ -85,3 +85,221 @@
    */
   return v_ret ;
 }
+
+/* zerr() and inf() are copied directly from zlib example code. */
+
+#if defined(MSDOS) || defined(OS2) || defined(WIN32) || defined(__CYGWIN__)
+#  include <fcntl.h>
+#  include <io.h>
+#  define SET_BINARY_MODE(file) setmode(fileno(file), O_BINARY)
+#else
+#  define SET_BINARY_MODE(file)
+#endif
+
+#define CHUNK 16384
+
+
+/* Raise an exception for a zlib or i/o error */
+void mlgz_zerr(int ret)
+{
+    switch (ret) {
+    case Z_ERRNO:
+        raise_sys_error(copy_string(strerror(errno))) ;
+        break;
+    case Z_STREAM_ERROR:
+        raise_mlgz_exn("invalid compression level");
+        break;
+    case Z_DATA_ERROR:
+        raise_mlgz_exn("invalid or incomplete deflate data");
+        break;
+    case Z_MEM_ERROR:
+        raise_out_of_memory() ;
+        break;
+    case Z_VERSION_ERROR:
+        raise_mlgz_exn("zlib version mismatch!");
+    }
+}
+
+/* Yes, this is bad practice.  I compensated by using "%.256s" instead of just "%s": */
+static char errstr[1024];
+
+/* This is a bit better for error checking: */
+int fread_check(void* b, size_t sz, size_t cnt, FILE* fp, const char* fn)
+{
+    if (fread(b, sz, cnt, fp) != cnt) {
+        sprintf(errstr, "Failed to read %lu bytes from file %.256s", (unsigned long)(sz*cnt), fn);
+        raise_mlgz_exn(errstr);
+        return 1;
+    }
+    return 0;
+}
+
+int fread_uint(uint32_t* i, FILE* fp, const char* fn)
+{
+    if (fread_check(i, 4, 1, fp, fn))
+        return 1;
+#if defined(__ppc__) || defined(__ppc64__)
+    *i = OSSwapInt32(*i);
+#endif
+    return 0;
+}
+
+
+/* Decompress from file source to file dest until stream ends or EOF.
+   inf() returns Z_OK on success, Z_MEM_ERROR if memory could not be
+   allocated for processing, Z_DATA_ERROR if the deflate data is
+   invalid or incomplete, Z_VERSION_ERROR if the version of zlib.h and
+   the version of the library linked do not match, or Z_ERRNO if there
+   is an error reading or writing the files. */
+int inf(FILE *source, FILE *dest)
+{
+    int ret;
+    unsigned have;
+    z_stream strm;
+    unsigned char in[CHUNK];
+    unsigned char out[CHUNK];
+   
+    /* allocate inflate state */
+    strm.zalloc = Z_NULL;
+    strm.zfree = Z_NULL;
+    strm.opaque = Z_NULL;
+    strm.avail_in = 0;
+    strm.next_in = Z_NULL;
+    ret = inflateInit(&strm);
+    if (ret != Z_OK)
+        return ret;
+    /* decompress until deflate stream ends or end of file */
+    do {
+        strm.avail_in = fread(in, 1, CHUNK, source);
+        if (ferror(source)) {
+            (void)inflateEnd(&strm);
+            return Z_ERRNO;
+        }
+        if (strm.avail_in == 0)
+            break;
+        strm.next_in = in;
+        /* run inflate() on input until output buffer not full */
+        do {
+            strm.avail_out = CHUNK;
+            strm.next_out = out;
+            ret = inflate(&strm, Z_NO_FLUSH);
+            /* assert(ret != Z_STREAM_ERROR); */  /* state not clobbered */ /* no asserts for Ocaml */
+            switch (ret) {
+            case Z_NEED_DICT:
+                ret = Z_DATA_ERROR;     /* and fall through */
+            case Z_DATA_ERROR:
+            case Z_MEM_ERROR:
+                (void)inflateEnd(&strm);
+                return ret;
+            }
+            have = CHUNK - strm.avail_out;
+            if (fwrite(out, 1, have, dest) != have || ferror(dest)) {
+                (void)inflateEnd(&strm);
+                return Z_ERRNO;
+            }
+        } while (strm.avail_out == 0);
+        /* done when inflate() says it's done */
+    } while (ret != Z_STREAM_END);
+    /* clean up and return */
+    (void)inflateEnd(&strm);
+    return ret == Z_STREAM_END ? Z_OK : Z_DATA_ERROR;
+}
+
+/*
+   Uncompresses a CBF file to a specified BIF file.  Return 0 on failure or the
+   number of uncompressed bytes on success.
+ */
+value mlgz_cbf2bif(value _cbf_file, value _bif_file)
+{
+    const char* cbf_file = String_val(_cbf_file);
+    const char* bif_file = String_val(_bif_file);
+    FILE *cbf_fp;
+    FILE *bif_fp;
+    char sigver[9];
+    uint32_t bif_file_len, cmplen, uncmplen;
+    int zret;
+   
+
+    if (!(cbf_fp = fopen(cbf_file, "rb"))) {
+        sprintf(errstr, "failure opening file: %.256s", cbf_file);
+        raise_mlgz_exn(errstr);
+        return Val_int(0);
+    }
+    if (!(bif_fp = fopen(bif_file, "wb"))) {
+        fclose(cbf_fp);
+        sprintf(errstr, "failure opening file: %.256s", bif_file);
+        raise_mlgz_exn(errstr);
+        return Val_int(0);
+    }
+
+    sigver[8] = 0;
+    if (fread_check(sigver, 1, 8, cbf_fp, cbf_file)) {
+        fclose(cbf_fp);
+        fclose(bif_fp);
+        return Val_int(0);
+    }
+   
+    if (strcmp(sigver, "BIF V1.0")) {
+        fclose(cbf_fp);
+        fclose(bif_fp);
+        sprintf(errstr, "incorrect CBF header for file %.256s", cbf_file);
+        raise_mlgz_exn(errstr);
+        return Val_int(0);
+    }
+
+    if (fread_uint(&bif_file_len, cbf_fp, cbf_file)) {
+        fclose(cbf_fp);
+        fclose(bif_fp);
+        return Val_int(0);
+    }
+   
+    if (bif_file_len <=0 || bif_file_len > 128) {
+        fclose(cbf_fp);
+        fclose(bif_fp);
+        sprintf(errstr, "corrupt CBF file %.256s", cbf_file);
+        raise_mlgz_exn(errstr);
+        return Val_int(0);
+    }
+
+    /* Seek ahead past embedded file name, doesn't really matter what it is */
+    if (fseek(cbf_fp, bif_file_len, SEEK_CUR)) {
+        fclose(cbf_fp);
+        fclose(bif_fp);
+        sprintf(errstr, "failure seeking %d bytes into file %.256s", bif_file_len, cbf_file);
+        raise_mlgz_exn(errstr);
+        return Val_int(0);
+    }
+
+    if (fread_uint(&uncmplen, cbf_fp, cbf_file)) {
+        fclose(cbf_fp);
+        fclose(bif_fp);
+        return Val_int(0);
+    }
+   
+    if (fread_uint(&cmplen, cbf_fp, cbf_file)) {
+        fclose(cbf_fp);
+        fclose(bif_fp);
+        return Val_int(0);
+    }
+    /* printf("CBF %s (%ld bytes) -> BIF %s [%ld bytes]", cbf_file, cmplen, bif_file, uncmplen); */
+
+    if ((zret=inf(cbf_fp, bif_fp)) != Z_OK) {
+        fclose(cbf_fp);
+        fclose(bif_fp);
+        mlgz_zerr(zret);
+        return Val_int(0);
+    }
+
+    if (fclose(cbf_fp)) {
+        fclose(bif_fp);
+        sprintf(errstr, "failure closing file %.256s", cbf_file);
+        raise_mlgz_exn(errstr);
+        return Val_int(0);
+    }
+    if (fclose(bif_fp)) {
+        sprintf(errstr, "failure closing file %.256s", bif_file);
+        raise_mlgz_exn(errstr);
+        return Val_int(0);
+    }
+    return Val_int(uncmplen);
+}
Title: Re: iwd1 compressed bif files (cbf)
Post by: the bigg on May 22, 2006, 09:39:58 AM
Still no cleanup code, and the load.ml patch is the same.  This just adds MacOS big-endian support (and a few minor cleanups):
Ok, thanks. The cache cleanup code should be straightforward - would you prefer me to code it?
BTW, since I saw that mentioned, you shouldn't do
Code: [Select]
file_exists (String.lowercase (Arch.backslash_to_slash file))
[..]
open_in (String.lowercase (Arch.backslash_to_slash file))
etc.
but use the Case_ins module instead:
Code: [Select]
file_exists file (* already case_insed by Util *)
[..]
Case_ins.perv_open_in file;
etc.
Dev specifically requested that OSX doesn't get the lowercasing  ;)
Title: Re: iwd1 compressed bif files (cbf)
Post by: FredSRichardson on May 22, 2006, 09:51:48 AM
If you don't mind handling the cache cleanup coding that would be great.  I kind of lost interest when I realized this wasn't quite going to solve my problem with IWD1Tutu (long story).

Do you think it would be a good idea to have a sys_file_exists in case_ins*.ml?  I didn't include this small patch, but this seems like a logical addition:
Code: [Select]
diff -u -r ./WeiDU-192.orig/src/case_ins_linux.ml ./WeiDU-192/src/case_ins_linux.ml
--- ./WeiDU-192.orig/src/case_ins_linux.ml 2006-03-18 19:30:18.000000000 -0500
+++ ./WeiDU-192/src/case_ins_linux.ml 2006-05-20 11:29:09.000000000 -0400
@@ -16,3 +16,5 @@
 let unix_unlink s = Unix.unlink (String.lowercase (backslash_to_slash s)) ;;
 let unix_mkdir s p = Unix.mkdir (String.lowercase (backslash_to_slash s)) p ;;
 let unix_opendir s = Unix.opendir (String.lowercase (backslash_to_slash s)) ;;
+
+let sys_file_exists s = Sys.file_exists (String.lowercase (backslash_to_slash s)) ;;
diff -u -r ./WeiDU-192.orig/src/case_ins_mac.ml ./WeiDU-192/src/case_ins_mac.ml
--- ./WeiDU-192.orig/src/case_ins_mac.ml 2006-03-18 19:30:34.000000000 -0500
+++ ./WeiDU-192/src/case_ins_mac.ml 2006-05-20 11:29:33.187500000 -0400
@@ -16,3 +16,5 @@
 let unix_unlink s = Unix.unlink (backslash_to_slash s) ;;
 let unix_mkdir s p = Unix.mkdir (backslash_to_slash s) p ;;
 let unix_opendir s = Unix.opendir (backslash_to_slash s) ;;
+
+let sys_file_exists s = Sys.file_exists (backslash_to_slash s) ;;
diff -u -r ./WeiDU-192.orig/src/case_ins_win.ml ./WeiDU-192/src/case_ins_win.ml
--- ./WeiDU-192.orig/src/case_ins_win.ml 2006-04-10 17:01:05.000000000 -0400
+++ ./WeiDU-192/src/case_ins_win.ml 2006-05-20 11:28:37.218750000 -0400
@@ -12,3 +12,5 @@
 let unix_unlink s = Unix.unlink s ;;
 let unix_mkdir s p = Unix.mkdir s p ;;
 let unix_opendir s = Unix.opendir s ;;
+
+let sys_file_exists s = Sys.file_exists s ;;

file_exists in util.ml doesn't take any measures to transform the file name (maybe it's important to use the "file_exists and has non-zero bytes" but I can't see why):
Code: [Select]
let file_exists name = (file_size name >= 0)
Title: Re: iwd1 compressed bif files (cbf)
Post by: the bigg on May 22, 2006, 09:57:37 AM
Well, file_size is

Code: [Select]
let file_size name =
  try
    let stats = Case_ins.unix_stat name in
    stats.Unix.st_size
  with _ ->  -1

So it's already Case_insed. As for the >=0, it was added by Wes because _ and it's needed by a host of already existing mods, so :(
Title: Re: iwd1 compressed bif files (cbf)
Post by: FredSRichardson on May 22, 2006, 10:04:55 AM
Yes, I see your point there  and I should've read a bit more of the code. :)

And there's really no reason to fix what's not broken so I'll drop the topic.  Having a test for "exists and is non-zero bytes" is probably safer anyway (since a zero-byte file is almost always going to be corrupt for this program).
Title: Re: iwd1 compressed bif files (cbf)
Post by: FredSRichardson on May 23, 2006, 11:59:10 AM
WARNING I've found at least one bug in the old patch (also, the exception handling was broken in the old compress code).  I'll get a new patch out soon.
Title: Re: iwd1 compressed bif files (cbf)
Post by: FredSRichardson on May 23, 2006, 12:43:41 PM
Okay, this should work a bit better.  If you have IWD1, you can test it with:
Code: [Select]
> ./weidu.asm.exe --biff-get AR3401.WED
[C:\cygwin\usr\src\WeiDU-192.tmp\weidu.asm.exe] WeiDU version 192
[C:\Program Files\Black Isle\Icewind Dale/CHITIN.KEY] 268 BIFFs, 19992 resources
[C:\Program Files\Black Isle\Icewind Dale/dialog.tlk] 34502 string entries
[C:\cygwin\usr\src\WeiDU-192.tmp\weidu.asm.exe] Using scripting style "IWD1"
[C:\Program Files\Black Isle\Icewind Dale\CD2\/data/AR3401.cbf] decompressed bif file 2095712 bytes
[C:\Program Files\Black Isle\Icewind Dale/cache/data/AR3401.bif] 2095712 bytes, 6 files, 1 tilesets
[./AR3401.WED] created from [C:\Program Files\Black Isle\Icewind Dale/data/AR3401.bif]
>

You can see that it reports decompressing the CBF files to the Cache directory.  Note that it will only do this on an operation that grabs resources from CBF files.  I noticed that 'weidu --biff <bif-file>' is a bit broken.  It doesn't report an error if it can't find <bif-file> in the game keys, and <bif-file> must match the key entry exactly.  Here's an example:
Code: [Select]
> ./weidu.asm.exe --biff 'data\AR3401.bif'
eidu.asm.exe] WeiDU version 192
nd Dale/CHITIN.KEY] 268 BIFFs, 19992 resources
nd Dale/dialog.tlk] 34502 string entries
eidu.asm.exe] Using scripting style "IWD1"
1.TIS at index 1
1.WED at index 0
T.BMP at index 1
M.BMP at index 2
R.BMP at index 3
1.MOS at index 4
A.WAV at index 5
> ./weidu.asm.exe --biff 'data/AR3401.bif'
eidu.asm.exe] WeiDU version 192
nd Dale/CHITIN.KEY] 268 BIFFs, 19992 resources
nd Dale/dialog.tlk] 34502 string entries
eidu.asm.exe] Using scripting style "IWD1"
> ./weidu.asm.exe --biff 'AR3401.bif'
eidu.asm.exe] WeiDU version 192
nd Dale/CHITIN.KEY] 268 BIFFs, 19992 resources
nd Dale/dialog.tlk] 34502 string entries
eidu.asm.exe] Using scripting style "IWD1"
>

Fixing this is a bit of a pain.  I only cared because I was looking for examples to test this new patch with.
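
For what it's worth, the "must match the key entry exactly" part could in principle be relaxed with a comparison that ignores case and slash direction. The following is only a rough sketch in plain C, not WeiDU code: fold_path_char and biff_name_matches are made-up names, and it assumes key entries are plain ASCII paths. It would make 'data/AR3401.bif' match 'data\AR3401.bif', but it does nothing for the bare 'AR3401.bif' case.

```c
#include <ctype.h>

/* Hypothetical helper: fold one path character for comparison.
   Backslashes become forward slashes, letters become lowercase. */
static int fold_path_char(unsigned char c)
{
    if (c == '\\')
        return '/';
    return tolower(c);
}

/* Hypothetical helper: returns 1 if the two names are equal when case
   and separator style are ignored, 0 otherwise. */
static int biff_name_matches(const char *a, const char *b)
{
    while (*a && *b) {
        if (fold_path_char((unsigned char)*a) !=
            fold_path_char((unsigned char)*b))
            return 0;
        a++;
        b++;
    }
    /* Equal only if both strings ended at the same time. */
    return *a == '\0' && *b == '\0';
}
```

This only concerns comparing a command-line argument against key entries; it does not touch how files are opened on disk.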

Here's that patch:

Code: [Select]
diff -b -w -x '*.exe' -x obj -x '*.txt' -x 'iw12*' -x '*~' -x  -Naur ./WeiDU-192.orig/Makefile ./WeiDU-192.tmp/Makefile
--- ./WeiDU-192.orig/Makefile 2006-04-24 15:14:58.000000000 -0400
+++ ./WeiDU-192/Makefile 2006-05-23 13:34:23.950168600 -0400
@@ -135,7 +135,7 @@
         baflexer bafparser \
         baflexer_old bafparser_old \
         diff tp dlexer dparser \
-        automate kit
+        automate kit cbif
 ifdef ITEMLIST
 WEIDU_BASE_MODULES  += pretty itemlist
 endif
diff -b -w -x '*.exe' -x obj -x '*.txt' -x 'iw12*' -x '*~' -x  -Naur ./WeiDU-192.orig/src/biff.ml ./WeiDU-192/src/biff.ml
--- ./WeiDU-192.orig/src/biff.ml 2006-03-30 16:42:25.000000000 -0500
+++ ./WeiDU-192/src/biff.ml 2006-05-23 13:26:48.106418600 -0400
@@ -9,6 +9,7 @@
 (* Infinity Engine [BIF] *)
 open Util
 open Key
+open Cbif
 
 type biff_file = {
   res_loc       : int ;
@@ -204,10 +205,6 @@
   }
 end
 
-(* comment this out if you don't have zlib *)
-external uncompress : string -> pos:int -> clen:int -> ulen: int -> string
-  = "mlgz_uncompress"
-
 (* reads 'size' bytes that would start at location 'start' in this BIFF
  * if it were not compressed! *)
 let read_compressed_biff_internal fd filename start size chunk_fun =
@@ -215,7 +212,7 @@
   let unc_offset = ref 0 in
 
   (* buffer holds the uncompressed bytes [start_unc,end_unc]  *)
-  let result = Buffer.create size in
+  let (*result*) _ = Buffer.create size in
   let start_unc_offset = ref 0 in
   let end_unc_offset = ref 0 in
 
@@ -248,7 +245,7 @@
       let _ = Unix.lseek fd (!cmp_offset+8) Unix.SEEK_SET in
       let cmp_buff = String.create cmplen in
       my_read cmplen fd cmp_buff filename ;
-      let uncmp = uncompress cmp_buff 0 cmplen uncmplen in
+      let uncmp = Cbif.uncompress cmp_buff 0 cmplen uncmplen in
       (*
       if (String.length uncmp <> uncmplen) then begin
         log_and_print "ERROR: [%s] chunk at offset %d was supposed to have %d bytes of compressed data that expanded to %d, but in reality they expanded to %d"
diff -b -w -x '*.exe' -x obj -x '*.txt' -x 'iw12*' -x '*~' -x  -Naur ./WeiDU-192.orig/src/cbif.ml ./WeiDU-192/src/cbif.ml
--- ./WeiDU-192.orig/src/cbif.ml 1969-12-31 19:00:00.000000000 -0500
+++ ./WeiDU-192/src/cbif.ml 2006-05-23 13:26:48.122043600 -0400
@@ -0,0 +1,12 @@
+(* Decompression routines for compressed bif files *)
+
+exception Error of string
+
+let _ = Callback.register_exception "mlgz_exn" (Error "")
+
+external cbf2bif : string -> string -> int
+    = "mlgz_cbf2bif"
+
+external uncompress : string -> pos:int -> clen:int -> ulen: int -> string
+  = "mlgz_uncompress"
+
diff -b -w -x '*.exe' -x obj -x '*.txt' -x 'iw12*' -x '*~' -x  -Naur ./WeiDU-192.orig/src/load.ml ./WeiDU-192/src/load.ml
--- ./WeiDU-192.orig/src/load.ml 2006-04-10 16:58:53.000000000 -0400
+++ ./WeiDU-192/src/load.ml 2006-05-23 13:26:48.122043600 -0400
@@ -8,6 +8,7 @@
 It was originally taken from Westley Weimer's WeiDU 185. *)
 
 open Util
+open Cbif
 
 let registry_game_paths () =
   let str_list = "." :: !Arch.registry_paths in
@@ -319,19 +320,33 @@
       Hashtbl.find game.loaded_biffs bif_file (* already here *)
     else begin
       (* we must load the BIF *)
-      let biff_path =
-        let rec trial lst =
+      let biff_path = begin
+        let rec trial f lst =
           match lst with
-            [] -> find_file_in_path game.game_path bif_file
+            [] -> find_file_in_path game.game_path f
           | hd :: tl ->
-            let perhaps = find_file_in_path hd bif_file in
+              let perhaps = find_file_in_path hd f in
             log_only "BIFF may be in hard-drive CD-path [%s]\n" perhaps ;
             if file_exists perhaps then
               perhaps
-            else trial tl
-        in
-        trial (game.cd_path_list)
+              else trial f tl
       in
+        (* Check to see if the bif file exists, if it doesn't try for a .CBF file *)
+        let bf = trial bif_file (game.cd_path_list @ [ game.game_path ^ "/cache" ] ) in
+        if file_exists bf then
+          bf
+        else begin
+          let cbf = Filename.chop_extension bif_file ^ ".cbf" in
+          let cbf_file = trial cbf (game.cd_path_list) in
+          if file_exists cbf_file then
+            let cache_file = game.game_path ^ "/cache/" ^ bif_file in
+            let sz = Cbif.cbf2bif cbf_file cache_file in
+            let _ = log_and_print "[%s] decompressed bif file %d bytes\n" cbf_file sz in
+            cache_file
+          else
+            bf
+        end
+      end in
       let the_biff = Biff.load_biff biff_path in
       Hashtbl.add game.loaded_biffs bif_file the_biff ;
       the_biff
diff -b -w -x '*.exe' -x obj -x '*.txt' -x 'iw12*' -x '*~' -x  -Naur ./WeiDU-192.orig/zlib/zlib.c ./WeiDU-192/zlib/zlib.c
--- ./WeiDU-192.orig/zlib/zlib.c 2003-06-02 06:08:24.000000000 -0400
+++ ./WeiDU-192/zlib/zlib.c 2006-05-23 13:26:48.122043600 -0400
@@ -19,12 +19,13 @@
   static value * exn = NULL;
   if(exn == NULL)
     exn = caml_named_value ("mlgz_exn");
-  raise_with_string(*exn, (char *)msg) ;
+  caml_raise_with_string(*exn, (char *)msg) ;
 }
 
-value mlgz_uncompress(value v_src, value v_pos, value v_len, value unc_len)
+CAMLprim value mlgz_uncompress(value v_src, value v_pos, value v_len, value unc_len)
 {
-  value v_ret;
+  CAMLparam4(v_src, v_pos, v_len, unc_len);
+  CAMLlocal1(v_ret);
   int level, pos, len, out_buf_len, r;
   uLong out_len;
   const char *in_buf;
@@ -42,7 +43,7 @@
    out_buf = malloc(out_buf_len);
    */
 
-  v_ret = alloc_string(Int_val(unc_len));
+  v_ret = caml_alloc_string(Int_val(unc_len));
   out_buf = String_val(v_ret);
 
   if(out_buf == NULL)
@@ -55,7 +56,6 @@
     } else if(r == Z_BUF_ERROR) {
       char *new_buf;
 
-      printf("uncompress 1\n"); fflush(stdout);
       raise_mlgz_exn("uncompress");
       out_buf_len *= 2;
       new_buf = realloc(out_buf, out_buf_len);
@@ -65,11 +65,9 @@
       }
       out_buf = new_buf;
     } else if(r == Z_MEM_ERROR) {
-      printf("uncompress 3\n"); fflush(stdout);
       free(out_buf);
       raise_out_of_memory();
     } else {
-      printf("WeiDU: ZLIB: Warning! Problem decompressing block.\n");
       fflush(stdout);
       out_len = Int_val(unc_len);
       break;
@@ -83,5 +81,224 @@
    memcpy(String_val(v_ret), out_buf, out_len);
    free(out_buf);
    */
-  return v_ret ;
+  CAMLreturn (v_ret) ;
+}
+
+/* zerr() and inf() are copied directly from zlib example code. */
+
+#if defined(MSDOS) || defined(OS2) || defined(WIN32) || defined(__CYGWIN__)
+#  include <fcntl.h>
+#  include <io.h>
+#  define SET_BINARY_MODE(file) setmode(fileno(file), O_BINARY)
+#else
+#  define SET_BINARY_MODE(file)
+#endif
+
+#define CHUNK 16384
+
+
+/* Raise an exception for a zlib or i/o error */
+void mlgz_zerr(int ret)
+{
+    switch (ret) {
+    case Z_ERRNO:
+        raise_sys_error(copy_string(strerror(errno))) ;
+        break;
+    case Z_STREAM_ERROR:
+        raise_mlgz_exn("invalid compression level");
+        break;
+    case Z_DATA_ERROR:
+        raise_mlgz_exn("invalid or incomplete deflate data");
+        break;
+    case Z_MEM_ERROR:
+        raise_out_of_memory() ;
+        break;
+    case Z_VERSION_ERROR:
+        raise_mlgz_exn("zlib version mismatch!");
+    }
+}
+
+/* Yes, this is bad practice.  I compensated by using "%.256s" instead of just "%s": */
+static char errstr[1024];
+
+/* This is a bit better for error checking: */
+int fread_check(void* b, size_t sz, size_t cnt, FILE* fp, const char* fn)
+{
+    if (fread(b, sz, cnt, fp) != cnt) {
+        sprintf(errstr, "Failed to read %lu bytes from file %.256s", (unsigned long)(sz*cnt), fn);
+        raise_mlgz_exn(errstr);
+        return 1;
+    }
+    return 0;
+}
+
+int fread_uint(uint32_t* i, FILE* fp, const char* fn)
+{
+    if (fread_check(i, 4, 1, fp, fn))
+        return 1;
+#if defined(__ppc__) || defined(__ppc64__)
+    *i = OSSwapInt32(*i);
+#endif
+    return 0;
+}
+
+
+/* Decompress from file source to file dest until stream ends or EOF.
+   inf() returns Z_OK on success, Z_MEM_ERROR if memory could not be
+   allocated for processing, Z_DATA_ERROR if the deflate data is
+   invalid or incomplete, Z_VERSION_ERROR if the version of zlib.h and
+   the version of the library linked do not match, or Z_ERRNO if there
+   is an error reading or writing the files. */
+int inf(FILE *source, FILE *dest)
+{
+    int ret;
+    unsigned have;
+    z_stream strm;
+    unsigned char in[CHUNK];
+    unsigned char out[CHUNK];
+   
+    /* allocate inflate state */
+    strm.zalloc = Z_NULL;
+    strm.zfree = Z_NULL;
+    strm.opaque = Z_NULL;
+    strm.avail_in = 0;
+    strm.next_in = Z_NULL;
+    ret = inflateInit(&strm);
+    if (ret != Z_OK)
+        return ret;
+    /* decompress until deflate stream ends or end of file */
+    do {
+        strm.avail_in = fread(in, 1, CHUNK, source);
+        if (ferror(source)) {
+            (void)inflateEnd(&strm);
+            return Z_ERRNO;
+        }
+        if (strm.avail_in == 0)
+            break;
+        strm.next_in = in;
+        /* run inflate() on input until output buffer not full */
+        do {
+            strm.avail_out = CHUNK;
+            strm.next_out = out;
+            ret = inflate(&strm, Z_NO_FLUSH);
+            /* assert(ret != Z_STREAM_ERROR); */  /* state not clobbered */ /* no asserts for Ocaml */
+            switch (ret) {
+            case Z_NEED_DICT:
+                ret = Z_DATA_ERROR;     /* and fall through */
+            case Z_DATA_ERROR:
+            case Z_MEM_ERROR:
+                (void)inflateEnd(&strm);
+                return ret;
+            }
+            have = CHUNK - strm.avail_out;
+            if (fwrite(out, 1, have, dest) != have || ferror(dest)) {
+                (void)inflateEnd(&strm);
+                return Z_ERRNO;
+            }
+        } while (strm.avail_out == 0);
+        /* done when inflate() says it's done */
+    } while (ret != Z_STREAM_END);
+    /* clean up and return */
+    (void)inflateEnd(&strm);
+    return ret == Z_STREAM_END ? Z_OK : Z_DATA_ERROR;
+}
+
+/*
+   Uncompresses a CBF file to a specified BIF file.  Returns 0 on failure or the
+   number of uncompressed bytes on success.
+ */
+CAMLprim value mlgz_cbf2bif(value _cbf_file, value _bif_file)
+{
+  CAMLparam2(_cbf_file, _bif_file);
+  const char* cbf_file = String_val(_cbf_file);
+  const char* bif_file = String_val(_bif_file);
+  FILE *cbf_fp;
+  FILE *bif_fp;
+  char sigver[9];
+  uint32_t bif_file_len, cmplen, uncmplen;
+  int zret;
+   
+  if (!(cbf_fp = fopen(cbf_file, "rb"))) {
+    sprintf(errstr, "failure opening file: %.256s", cbf_file);
+    raise_mlgz_exn(errstr);
+    CAMLreturn(Val_int(0));
+  }
+
+  if (!(bif_fp = fopen(bif_file, "wb"))) {
+    fclose(cbf_fp);
+    sprintf(errstr, "failure opening file: %.256s", bif_file);
+    raise_mlgz_exn(errstr);
+    CAMLreturn(Val_int(0));
+  }
+
+  sigver[8] = 0;
+  if (fread_check(sigver, 1, 8, cbf_fp, cbf_file)) {
+    fclose(cbf_fp);
+    fclose(bif_fp);
+    CAMLreturn(Val_int(0));
+  }
+   
+  if (strcmp(sigver, "BIF V1.0")) {
+    fclose(cbf_fp);
+    fclose(bif_fp);
+    sprintf(errstr, "incorrect CBF header for file %.256s", cbf_file);
+    raise_mlgz_exn(errstr);
+    CAMLreturn(Val_int(0));
+  }
+
+  if (fread_uint(&bif_file_len, cbf_fp, cbf_file)) {
+    fclose(cbf_fp);
+    fclose(bif_fp);
+    CAMLreturn(Val_int(0));
+  }
+   
+  if (bif_file_len <=0 || bif_file_len > 128) {
+    fclose(cbf_fp);
+    fclose(bif_fp);
+    sprintf(errstr, "corrupt CBF file %.256s", cbf_file);
+    raise_mlgz_exn(errstr);
+    CAMLreturn(Val_int(0));
+  }
+
+  /* Seek ahead past embedded file name, doesn't really matter what it is */
+  if (fseek(cbf_fp, bif_file_len, SEEK_CUR)) {
+    fclose(cbf_fp);
+    fclose(bif_fp);
+    sprintf(errstr, "failure seeking %d bytes into file %.256s", bif_file_len, cbf_file);
+    raise_mlgz_exn(errstr);
+    CAMLreturn(Val_int(0));
+  }
+
+  if (fread_uint(&uncmplen, cbf_fp, cbf_file)) {
+    fclose(cbf_fp);
+    fclose(bif_fp);
+    CAMLreturn(Val_int(0));
+  }
+   
+  if (fread_uint(&cmplen, cbf_fp, cbf_file)) {
+    fclose(cbf_fp);
+    fclose(bif_fp);
+    CAMLreturn(Val_int(0));
+  }
+  /* printf("CBF %s (%ld bytes) -> BIF %s [%ld bytes]", cbf_file, cmplen, bif_file, uncmplen); */
+
+  if ((zret=inf(cbf_fp, bif_fp)) != Z_OK) {
+    fclose(cbf_fp);
+    fclose(bif_fp);
+    mlgz_zerr(zret);
+    CAMLreturn(Val_int(0));
+  }
+
+  if (fclose(cbf_fp)) {
+    fclose(bif_fp);
+    sprintf(errstr, "failure closing file %.256s", cbf_file);
+    raise_mlgz_exn(errstr);
+    CAMLreturn(Val_int(0));
+  }
+  if (fclose(bif_fp)) {
+    sprintf(errstr, "failure closing file %.256s", bif_file);
+    raise_mlgz_exn(errstr);
+    CAMLreturn(Val_int(0));
+  }
+  CAMLreturn(Val_int(uncmplen));
 }
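
For anyone who wants to check the header walk in mlgz_cbf2bif without touching files, here is a small stand-alone sketch of the same layout: the 8-byte "BIF V1.0" signature, a 32-bit name length, the embedded file name, then the uncompressed and compressed lengths, followed by the zlib data. The names read_le32, cbf_header and parse_cbf_header are made up for illustration and are not part of the patch. It also assumes the 32-bit fields are little-endian on disk and decodes them byte by byte, which is a different technique from the OSSwapInt32 swap under __ppc__ used in the patch but gives the same result on any host.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: decode a little-endian 32-bit value byte by byte,
   so the result does not depend on the host's endianness. */
static uint32_t read_le32(const unsigned char *p)
{
    return (uint32_t)p[0]
         | ((uint32_t)p[1] << 8)
         | ((uint32_t)p[2] << 16)
         | ((uint32_t)p[3] << 24);
}

/* Hypothetical struct holding the fields mlgz_cbf2bif reads. */
typedef struct {
    uint32_t name_len;    /* length of the embedded BIF file name */
    uint32_t uncmplen;    /* size of the BIF after decompression */
    uint32_t cmplen;      /* size of the zlib stream that follows */
    size_t   data_offset; /* offset of the first compressed byte */
} cbf_header;

/* Returns 0 on success, -1 if the buffer is too short or malformed.
   The 1..128 bound on the name length mirrors the check in the patch. */
static int parse_cbf_header(const unsigned char *buf, size_t len,
                            cbf_header *h)
{
    size_t pos;

    if (len < 12 || memcmp(buf, "BIF V1.0", 8) != 0)
        return -1;
    pos = 8;
    h->name_len = read_le32(buf + pos);
    pos += 4;
    if (h->name_len == 0 || h->name_len > 128)
        return -1;
    /* Need the name plus two 32-bit length fields. */
    if (len - pos < (size_t)h->name_len + 8)
        return -1;
    pos += h->name_len;   /* skip the embedded file name */
    h->uncmplen = read_le32(buf + pos);
    pos += 4;
    h->cmplen = read_le32(buf + pos);
    pos += 4;
    h->data_offset = pos;
    return 0;
}
```

Working on a buffer instead of a FILE* keeps the sketch free of the close-on-every-error-path boilerplate, which is most of the length of the real function.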
Title: Re: iwd1 compressed bif files (cbf)
Post by: the bigg on May 23, 2006, 01:16:33 PM
Patch applied.
Title: Re: iwd1 compressed bif files (cbf)
Post by: the bigg on May 23, 2006, 01:36:09 PM
Patch for cleanup is attached. Assumes your latest cbif patch is already applied.
Title: Re: iwd1 compressed bif files (cbf)
Post by: FredSRichardson on May 23, 2006, 03:39:24 PM
Thank you for this! :)  You actually don't need to test for the existence of the cache file:
Code: [Select]
+             if not (file_exists cache_file) then Queue.add cache_file cbifs_to_rem;
I added "cache" to the CD search list so that this code isn't reached if one already exists (I tested this out since I had a few cache files left over by the IE).
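
To spell out the two path manipulations the load.ml patch relies on (swap the .bif extension for .cbf, and build the cache location under the game path), here is a rough C sketch. The function names bif_to_cbf_name and cache_path_for are made up for illustration; the real code is the OCaml in load.ml, which uses Filename.chop_extension and string concatenation.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: copy `bif` into `out` with its extension replaced
   by ".cbf".  Returns 0 on success, -1 if the last path component has no
   extension or `out` is too small. */
static int bif_to_cbf_name(const char *bif, char *out, size_t outsz)
{
    const char *dot = strrchr(bif, '.');
    const char *sep = strrchr(bif, '/');
    const char *bs  = strrchr(bif, '\\');
    size_t stem;

    /* Treat whichever separator comes last as the end of the directory. */
    if (bs != NULL && (sep == NULL || bs > sep))
        sep = bs;
    if (dot == NULL || (sep != NULL && dot < sep))
        return -1;
    stem = (size_t)(dot - bif);
    if (stem + sizeof ".cbf" > outsz)
        return -1;
    memcpy(out, bif, stem);
    memcpy(out + stem, ".cbf", sizeof ".cbf"); /* includes the NUL */
    return 0;
}

/* Hypothetical helper: build "<game_path>/cache/<bif>".
   Returns 0 on success, -1 if the result does not fit in `out`. */
static int cache_path_for(const char *game_path, const char *bif,
                          char *out, size_t outsz)
{
    int n = snprintf(out, outsz, "%s/cache/%s", game_path, bif);
    return (n < 0 || (size_t)n >= outsz) ? -1 : 0;
}
```

With the cache directory already in the search list, the .cbf branch is only taken when no .bif turns up anywhere, cached copy included.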

-Fred