Pocket Plane Group

Friends and Neighbors => Weimer Republic (WeiDU.org) => WeiDU => Topic started by: Wisp on September 20, 2018, 09:41:42 AM

Title: Updates to HANDLE_CHARSETS
Post by: Wisp on September 20, 2018, 09:41:42 AM
I've made a few additions to HANDLE_CHARSETS, due for the next release:

The function can put output files in a separate directory with the option out_path. Normally, output files reversibly overwrite the originals, which makes the conversion transparent and makes it easier to update mods to be EE-compatible. With out_path, you can have the function leave the originals unchanged, at the cost of having to accommodate this in your TP2 code. If you use out_path, HANDLE_CHARSETS cannot reasonably know whether it has already run and further conversions are redundant, so it will run every time it's called, and it's up to you to limit redundancy.
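To make the two output modes concrete, here is a rough Python sketch of the idea only. It is not WeiDU code and says nothing about how HANDLE_CHARSETS is implemented; the function name convert_file, its parameters, and the ".bak" backup are all hypothetical stand-ins (WeiDU has its own backup mechanism).

```python
import shutil
from pathlib import Path
from typing import Optional


def convert_file(src: Path, from_enc: str, to_enc: str,
                 out_dir: Optional[Path] = None) -> Path:
    """Re-encode src and return the path of the converted file.

    Hypothetical helper for illustration; not part of WeiDU.
    """
    # Work on raw bytes so that line endings are left exactly as they were.
    text = src.read_bytes().decode(from_enc)
    if out_dir is None:
        # Default mode: overwrite the original, but keep a backup copy
        # so the change can be undone (assumed stand-in for "reversibly").
        shutil.copy2(src, src.with_name(src.name + ".bak"))
        dest = src
    else:
        # out_path-style mode: the original is never touched.
        out_dir.mkdir(parents=True, exist_ok=True)
        dest = out_dir / src.name
    dest.write_bytes(text.encode(to_enc))
    return dest
```

In the second mode the caller has to read from out_dir rather than from the original location, which is the accommodation mentioned above. Nothing in this sketch records that a conversion has already happened either, so calling it again simply converts again.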

The function can run in reverse and convert UTF-8 into language-dependent character sets, for installation of EE-era mods on the original/legacy editions of the games.

The changes are live on GitHub, for those who wish to peruse.

Title: Re: Updates to HANDLE_CHARSETS
Post by: AL|EN on October 01, 2018, 03:30:40 AM

Let's assume that there is a new function called HANDLE_UTF8:

- WeiDU knows/can detect the source file encoding: UTF-8 without BOM
- WeiDU knows from the charset table that the correct code page for ".*polish.*", ".*polski.*" and ".*pl_PL.*" is CP1250
- WeiDU knows the type of the game (EE or Classic)

Case 1: installing the mod on an EE game
- WeiDU detects EE + UTF-8, so there is no need to do anything

Case 2: installing the mod on a Classic game
- WeiDU detects a Classic game
- the selected mod translation was 'polish', so the TRA files are converted to CP1250

Is this possible to implement? Or is it out of scope for the current HANDLE_CHARSETS function?
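The two cases above can be sketched roughly as follows. This is an illustration in Python, not WeiDU or TP2 code: CHARSET_TABLE, lookup_charset and prepare_tra are hypothetical names, and the table contains only the Polish row mentioned in this post.

```python
import re
from typing import Optional

# Hypothetical excerpt of a charset table: language pattern -> legacy code page.
# Only the Polish patterns and CP1250 are taken from the post above.
CHARSET_TABLE = [
    (re.compile(r".*polish.*"), "cp1250"),
    (re.compile(r".*polski.*"), "cp1250"),
    (re.compile(r".*pl_PL.*"), "cp1250"),
]


def lookup_charset(language: str) -> Optional[str]:
    """Return the code page for a language name, or None if it is unknown."""
    for pattern, charset in CHARSET_TABLE:
        if pattern.fullmatch(language):
            return charset
    return None


def prepare_tra(utf8_bytes: bytes, language: str, game_is_ee: bool) -> bytes:
    """Return the bytes a TRA file should contain for the target game."""
    if game_is_ee:
        # Case 1: EE game and UTF-8 source, nothing to do.
        return utf8_bytes
    # Case 2: Classic game, convert to the language's code page.
    charset = lookup_charset(language)
    if charset is None:
        raise LookupError(f"no charset known for language {language!r}")
    return utf8_bytes.decode("utf-8").encode(charset)
```

Note that the sketch takes the source encoding as given (the input is assumed to be UTF-8) rather than detecting it.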

Title: Re: Updates to HANDLE_CHARSETS
Post by: Wisp on October 01, 2018, 11:22:29 AM
That's what the newly implemented from_utf8 option does. With it, you declare that your source files are in UTF-8, and if your mod is being installed on the original editions, the TRA files are converted into the character set specified by the charset table (or the one that is inferred).

Title: Re: Updates to HANDLE_CHARSETS
Post by: AL|EN on October 01, 2018, 02:11:24 PM
Right, but since my files are encoded as UTF-8 and WeiDU can detect it, why declare from_utf8 at all? The intention of the new function would be to use only infer_charsets, as defined by WeiDU. Why declare unnecessary things if the starting conditions, like source encoding, game type and destination encoding (from %LANGUAGE%), are known? It's not only about the conversion ability; it's about having a function which has all the necessary things already set as defaults, like infer_charsets=1, default_language=0, tra_path=~%MOD_FOLDER%/lang~. You see where I'm going?

Title: Re: Updates to HANDLE_CHARSETS
Post by: Wisp on October 01, 2018, 05:28:06 PM
WeiDU can't detect character sets, which is why it needs to be declared.
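As a general illustration of why an encoding usually has to be declared, at least for the legacy single-byte character sets: the same bytes decode without error under several code pages, and nothing in the bytes says which reading was intended. The snippet below is plain Python and says nothing about WeiDU's internals; the function name decodings is made up for the example.

```python
from typing import Dict, Iterable


def decodings(raw: bytes, codecs: Iterable[str]) -> Dict[str, str]:
    """Decode the same bytes under each codec and collect the results."""
    return {codec: raw.decode(codec) for codec in codecs}


# Bytes as a Polish translator might have saved them in CP1250.
raw = "żółw".encode("cp1250")

# Each code page yields a different, equally "valid" string.
for codec, text in decodings(raw, ["cp1250", "cp1252", "cp1251"]).items():
    print(codec, text)
```

Only the CP1250 reading gives back the original word; the other two decode just as cleanly but produce different characters.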

As for why the defaults are the way they are: tra_path varies wildly between mods (/tra, /lang or /language(s), just for a few examples). Other options (like default_language) cannot safely be assumed, because that would presuppose things about the way the mod is structured and result in undesirable behaviour if things were not so. The option infer_charsets could perhaps have been on by default, but in practical terms, there have been regressions in the past involving variables that were not 0 by default being altered to be so. I also wanted it to be opt-in, so that the modder would think about it and about how the files are encoded. That might not have had the desired effect.