Author Topic: Proposal for BGEE TLK compatibility  (Read 9530 times)

Offline Wisp

  • Moderator
  • Planewalker
  • *****
  • Posts: 1176
Proposal for BGEE TLK compatibility
« on: February 17, 2013, 02:50:22 PM »
I'm posting this in the interest of soliciting input, comments and hole-poking. If you have questions after reading this, I encourage you to ask them. Perhaps it was something I neglected to consider. Behaviour on non-EE platforms would remain unchanged. The issue with character encodings will be dealt with separately.

See here for the latest.

Reading from the TLKs:

WeiDU would attempt to parse baldur.ini in order to obtain the user-selected language. Should this for any reason fail, WeiDU will default to reading the English TLK. This could be overridden on the command line in the usual manner.

Baldur.ini stores this information as e.g.,
Code: [Select]
'Language', 'Text', 'de_DE'
and the game defaults to English if no such row exists.
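
A rough Python sketch of that lookup, for illustration only. It assumes the row appears in baldur.ini exactly in the shape quoted above; the function name `selected_language` and the `en_US` fallback code are placeholders for this example, not anything WeiDU defines.

```python
import re

# Assumed row shape, taken from the example above: 'Language', 'Text', 'de_DE'
LANGUAGE_ROW = re.compile(r"'Language'\s*,\s*'Text'\s*,\s*'([^']*)'")


def selected_language(ini_text, default="en_US"):
    """Return the language code stored in baldur.ini, or the default.

    The default stands in for "English"; the exact code is an assumption.
    """
    match = LANGUAGE_ROW.search(ini_text)
    if match and match.group(1):
        return match.group(1)
    # No such row (or an empty value): fall back to English.
    return default
```

A command-line override would simply take precedence over whatever this returns.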


Writing to the TLKs:

WeiDU would write the install-time strings to all TLKs detected in subdirectories of the lang/ directory. This means that if you install the mod in German, you would have German strings in your English, French etc. TLKs, and if you wanted to switch to a different language you would need to reinstall your mods and, if applicable, make the appropriate fixes to your GAM/SAV files (use NPC string fixers etc.).

STRING_SETs would be backed up and uninstalled in the current manner (by saving and restoring the old string verbatim), except that the strings from each TLK present under lang/ would be backed up to a corresponding directory array among the component's backup files. I.e., instead of STRING_SET strings being backed up to [BACKUP-DIR]/[COMPONENT-NUMBER]/[UNSETSTR-FILES] the English strings would be backed up to [BACKUP-DIR]/[COMPONENT-NUMBER]/en_us/[UNSETSTR-FILES], German strings would be backed up to [BACKUP-DIR]/[COMPONENT-NUMBER]/de_de/[UNSETSTR-FILES] and so on.
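
The per-language backup layout above could be sketched like this in Python. The function name and the sample backup directory are hypothetical, and lower-casing the language directory is an assumption based on the en_us/de_de examples.

```python
from pathlib import PurePosixPath


def unsetstr_backup_dir(backup_dir, component_number, lang_dir):
    """Build [BACKUP-DIR]/[COMPONENT-NUMBER]/<language>/ for one TLK.

    Lower-casing the language directory mirrors the en_us/de_de examples
    and is an assumption, not confirmed behaviour.
    """
    return PurePosixPath(backup_dir) / str(component_number) / lang_dir.lower()


# One backup directory per language found under lang/ (sample names only).
backup_dirs = [unsetstr_backup_dir("mymod/backup", 0, lang) for lang in ("en_US", "de_DE")]
```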



Rejected schemes:

Only writing to a single TLK. Because the user installs a mod, changes language and the zombie apocalypse ensues.

Writing each language to their respective TLK. Because WeiDU does not know what the install-time language is and TP2 language declarations are completely unreliable and unsuitable for inferring that information. Mods are also available in languages not offered for BGEE and I want to keep this simple and non-buggy.
« Last Edit: February 26, 2013, 03:03:11 PM by Wisp »

Offline GeN1e

  • Planewalker
  • *****
  • Posts: 267
  • Gender: Male
Re: Proposal for BGEE TLK compatibility
« Reply #1 on: February 17, 2013, 04:15:34 PM »
Reading - nice and sound.

Writing - I suppose one should expect that kind of behavior, if they intend to play each chapter in different language (I find the idea absurd).

Writing each language to their respective TLK. Because WeiDU does not know what the install-time language is and TP2 language declarations are completely unreliable and unsuitable for inferring that information. Mods are also available in languages not offered for BGEE and I want to keep this simple and non-buggy.
I was about to propose the introduction of multi-language TP2 command, where a modder could define %LANGUAGE% <-> TLK pairs, overriding the user-selected LANGUAGE (used as a default for all TLKs) for specified TLKs.
But then again, who would really care to play the game while switching language every launch?

Offline DavidW

  • Planewalker
  • *****
  • Posts: 316
Re: Proposal for BGEE TLK compatibility
« Reply #2 on: February 18, 2013, 02:48:21 AM »
A couple of thoughts:

(i) If you think that writing different languages to different TLK files is, other things being equal, desirable, you could always introduce a new unambiguous TP2 format and have a LANGUAGE_IS_BGEE_OPTIMIZED flag to indicate that it's being used. (I don't know if it's actually desirable or not.)

(ii) Is the zombie apocalypse really so terrible? There's (iirc) no user-friendly way to change language within BG:EE other than reinstalling. So the only people exposed to apocalypse are those who (a) change language mid-game; (b) know enough about the game to do so by editing the ini file; (c) don't realise that doing something radical like that with mods installed is a bad idea. The intersection of (a)-(c) is quite small. Having said that, the only downsides of your strategy compared to writing to one dialog file are (i) longer install times (presumably negligible in practice) and (ii) a certain inelegance (than which there are worse things).

Offline CamDawg

  • Infidel
  • Planewalker
  • *****
  • Posts: 859
  • Dreaming of a red Xmas
    • The Gibberlings Three
Re: Proposal for BGEE TLK compatibility
« Reply #3 on: February 18, 2013, 10:56:19 AM »
The only conceivable pitfall I can see is if one language uses characters that may fubar a different language's tlk file--I think every tlk is encoded with the same charset so I'm not sure even that's possible. I understand (and agree with) the decision to write to all tlk files; my only suggestion is that you may want to just use empty/placeholder strings for the non-active ones. Like David and GeN1e mention, players are unlikely to be switching languages much, and I'd say it's on them if they do so.

Offline Wisp

  • Moderator
  • Planewalker
  • *****
  • Posts: 1176
Re: Proposal for BGEE TLK compatibility
« Reply #4 on: February 18, 2013, 03:19:30 PM »
There's (iirc) no user-friendly way to change language within BG:EE other than reinstalling.
You can do it with Options->Language->Restart game. I don't know how common the practice is, but it is easy to do.

The only conceivable pitfall I can see is if one language uses characters that may fubar a different language's tlk file--I think every tlk is encoded with the same charset so I'm not sure even that's possible.
Yeah, they use UTF-8 across the board.


Edit:
my only suggestion is that you may want to just use empty/placeholder strings for the non-active ones. Like David and GeN1e mention, players are unlikely to be switching languages much, and I'd say it's on them if they do so.
I'm not sure that would be practical. WeiDU does not add duplicate strings, so the placeholders would have to have the same uniqueness as the real strings or the other TLKs would not have the same number of strings as the active TLK.

Edit 2:
Oh, crap. Suppose that the stringset of the mod partially overlaps the stringset of the active TLK, but does not overlap the stringsets of the other TLKs (or vice versa). Back to the drawing board, I guess.
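
To make the overlap problem concrete, here is a toy Python model. It assumes a TLK is just a list of strings and that appending reuses an existing identical entry, as described above; `append_string` and the sample strings are made up for the example.

```python
from typing import List


def append_string(tlk: List[str], text: str) -> int:
    """Return the strref for text, reusing an identical existing entry."""
    if text in tlk:
        return tlk.index(text)
    tlk.append(text)
    return len(tlk) - 1


# Toy TLKs: same length, different languages.
english = ["Buckler", "Sword"]
german = ["Faustschild", "Schwert"]

# The mod is installed in English and written to both TLKs.
mod_strings = ["Buckler", "Enchanted Buckler"]
refs_en = [append_string(english, s) for s in mod_strings]
refs_de = [append_string(german, s) for s in mod_strings]
```

Here "Buckler" is reused in the English TLK but appended to the German one, so `refs_en` is `[0, 2]` while `refs_de` is `[2, 3]`: the same mod string ends up with different strrefs depending on the TLK.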
« Last Edit: February 18, 2013, 04:05:06 PM by Wisp »

Offline GeN1e

  • Planewalker
  • *****
  • Posts: 267
  • Gender: Male
Re: Proposal for BGEE TLK compatibility
« Reply #5 on: February 18, 2013, 05:33:34 PM »
I think there is another "Oh, crap" here - what happens to string patching? Some of BG2Tweaks' and Item Revisions' components rely heavily on reading a string, REPLACE_TEXTUALLY-ing something within, and then SAY_EVALUATED-ing it back.

And with some imagination, worse stuff may happen when using REPLACE_EVALUATE. E.g. I use it in a couple of IR's components to log encountered bugs in descriptions, but in theory it may as well affect the resulting installation depending on what language has been chosen. An exaggeration it is, most likely, but better safe than sorry?

I'd take zombies then - since it's hardly much worse than the different stringset overlapping, and warn modders to warn players to not change the language without re-installing the mod.

Offline Kaeloree

  • Planewalker
  • *****
  • Posts: 109
Re: Proposal for BGEE TLK compatibility
« Reply #6 on: February 18, 2013, 07:06:50 PM »
I think I'd also say zombies.

As long as we make it very clear to players that they can't change language mid-game, it should be fine.

Offline Wisp

  • Moderator
  • Planewalker
  • *****
  • Posts: 1176
Re: Proposal for BGEE TLK compatibility
« Reply #7 on: February 19, 2013, 07:21:50 PM »
I think there is another "Oh, crap" here - what happens to string patching? Some of BG2Tweaks' and Item Revisions' components rely heavily on reading a string, REPLACE_TEXTUALLY-ing something within, and then SAY_EVALUATED-ing it back.

And with some imagination, worse stuff may happen when using REPLACE_EVALUATE. E.g. I use it in a couple of IR's components to log encountered bugs in descriptions, but in theory it may as well affect the resulting installation depending on what language has been chosen. An exaggeration it is, most likely, but better safe than sorry?
Just for the record, this behaviour would remain unchanged under any scheme that gets past the idea stage.

Offline Isaya

  • Planewalker
  • *****
  • Posts: 47
Re: Proposal for BGEE TLK compatibility
« Reply #8 on: February 23, 2013, 09:07:58 AM »
So the only people exposed to apocalypse are those who (a) change language mid-game; (b) know enough about the game to do so by editing the ini file; (c) don't realise that doing something radical like that with mods installed is a bad idea. The intersection of (a)-(c) is quite small.
Like David and GeN1e mention, players are unlikely to be switching languages much, and I'd say it's on them if they do so.
I do not agree with that. With old games, the only way to change language was to overwrite the dialog.tlk file. So you knew what you did, as a player, even if you didn't realize that mods were adding texts to that file. With BGEE, all you need to do is go into a menu and select another language. And that just works, so people may get used to it for some reason (I am using this when a translation does not look right). Since this behaviour is currently broken by installing mods, I strongly support Wisp's idea of updating all languages at the same time. The player knows he/she installed the mod in a specific language so we can assume there won't be complaints that switching language didn't apply to the mod content.

However, there is a pitfall to updating all languages: currently the English dialog.tlk is always ahead of other languages. Typically it has more strings than other languages. For instance, version 1.0.2014 goes up to @32149 in English and only @32140 in French (and the last two strings are even empty, contrary to English).
If you want to allow switching between languages, I would suggest considering that the English language is always the reference to determine what is the first available StrRef. Then WeiDU should fill the missing strings in other languages, if needed, before adding the mod strings.

I think there is another "Oh, crap" here - what happens to string patching? Some of BG2Tweaks' and Item Revisions' components rely heavily on reading a string, REPLACE_TEXTUALLY-ing something within, and then SAY_EVALUATED-ing it back.

And with some imagination, worse stuff may happen when using REPLACE_EVALUATE. E.g. I use it in a couple of IR's components to log encountered bugs in descriptions, but in theory it may as well affect the resulting installation depending on what language has been chosen. An exaggeration it is, most likely, but better safe than sorry?
This issue is not specific to BGEE and its multiple languages. People who use mods that rely on REPLACE_TEXTUALLY in languages other than the one the mod was written in are already likely to face such issues. Because, in another language, there may be various ways of writing something compared to English, so French may require a regexp to catch something while it wasn't anticipated by the mod author. Besides, people who install IR in English because there is no translation in their language, even if the game exists in that language, don't get the text updated either.
BGEE will not bring specific issues regarding these cases, I think.

Should new options be added to WeiDU language related instructions in order to handle character encoding, I believe that adding the ability to specify the original encoding of the tra files for each language would help implement an automated way to convert existing tra files, which use various Windows encodings, to UTF-8 on the fly at install time. Something like
Code: [Select]
LANGUAGE_EE ~Francais~
                      ~CP1252~
                      ~french~
                      ~bg1npc/tra/french/setup.tra~
CP1252 being the code used in iconv for the corresponding encoding. I am assuming that, given the history of mods, the reference encoding for the tra files included in the mod would be the "old" 8-bit one and that WeiDU would only convert if the game is BGEE.
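
The conversion step itself is small. A minimal Python sketch, assuming the declared encoding name maps onto a codec Python knows (it does for `cp1252`); the function name is illustrative and this is not how WeiDU is implemented.

```python
def tra_bytes_to_utf8(raw: bytes, source_encoding: str = "cp1252") -> bytes:
    """Re-encode the raw bytes of a tra file from its declared encoding to UTF-8.

    Raises UnicodeDecodeError if the bytes are not valid in source_encoding,
    which would be a hint that the declaration is wrong.
    """
    return raw.decode(source_encoding).encode("utf-8")


# "Français" as stored by a CP1252 editor: the c-cedilla is the single byte 0xE7.
converted = tra_bytes_to_utf8(b"Fran\xe7ais")
```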

For reference, here is what I experimented with the encoding issues and how to overcome them.

Offline Mike1072

  • Planewalker
  • *****
  • Posts: 298
  • Gender: Male
Re: Proposal for BGEE TLK compatibility
« Reply #9 on: February 23, 2013, 10:02:16 PM »
Writing each language would be the most desirable and, by far, the most complicated.  WeiDU languages could be linked to the language codes BGEE uses by adding an optional flag to WeiDU's LANGUAGE declarations.  Non-BGEE mods wouldn't need to be updated, so it's a reasonable change.

However, that wouldn't address how WeiDU should handle commands that retrieve tra or string references (SPRINT, READ_STRREF, GET_STRREF), which would essentially need to return multiple results (or have new commands added to access the other results).

Oh, crap. Suppose that the stringset of the mod partially overlaps the stringset of the active TLK, but does not overlap the stringsets of the other TLKs (or vice versa). Back to the drawing board, I guess.
This would need to be solved for any non-explodey solution to work.  WeiDU would have to be able to insert duplicates of a string into dialog.tlk, when not doing so would cause two languages' strings to map to different string references.

The explodey solution would be simplest.  Implementing it would not prevent us revisiting the issue later.

Offline Andrea C.

  • Planewalker
  • *****
  • Posts: 80
  • Gender: Male
Re: Proposal for BGEE TLK compatibility
« Reply #10 on: February 25, 2013, 01:34:37 PM »
What if a mod was available in more than one language? Would it be possible to make it an option to install more than one language at once on BG:EE?

Offline Wisp

  • Moderator
  • Planewalker
  • *****
  • Posts: 1176
Re: Proposal for BGEE TLK compatibility
« Reply #11 on: February 25, 2013, 08:07:44 PM »
However, there is a pitfall to updating all languages: currently the English dialog.tlk is always ahead of other languages. Typically it has more strings than other languages. For instance, version 1.0.2014 goes up to @32149 in English and only @32140 in French (and the last two strings are even empty, contrary to English).
I did not know that. Yeah, it needs to be accounted for. WeiDU already pads the shorter of dialog.tlk/dialogf.tlk, so you can extend that to pad the non-longest TLKs of all languages.

Should new options be added to WeiDU language related instructions in order to handle character encoding, I believed adding the ability to specify the original encoding of the tra files for each language would help implementing an automated way to convert existing tra files, that use various Windows encoding, to UTF-8 on the fly at install time.
I have considered something like it, but it would be better to declare the encoding on a per-file basis, I think. Aside from the greater flexibility, it also puts the declaration within reach of the translator, who is the one saving the tra file and choosing the encoding.

What if a mod was available in more than one language? Would it be possible to make it an option to install more than one language at once on BG:EE?
This is complicated by the need to keep all strrefs deterministic across all languages, a difficulty further compounded by the TLK-wiping patches. Currently it is not considered for implementation, barring good solutions from the gallery. Having to update everything to be able to cope with multiple parallel strings is a pretty strong disincentive too.


New proposal, ETA tomorrow.

Offline Wisp

  • Moderator
  • Planewalker
  • *****
  • Posts: 1176
Re: Proposal for BGEE TLK compatibility
« Reply #12 on: February 26, 2013, 02:59:58 PM »
Proposal the second:

General:
The language declaration is extended to take an additional argument. This argument is an IETF language tag and may correspond to the name of one of the subdirectories of the lang/ directory. The argument is case-insensitive and hyphens and underscores are interchangeable. The old format for language declarations remains valid. If the language tag (the new argument) corresponds to one of the lang/ subdirectories, the TLK(s) in that subdirectory are considered to be the active TLK(s). If the old format for language declarations is used, or the language tag does not correspond to any of the subdirectories, WeiDU attempts to parse baldur.ini to find out which is the user-selected language (defaulting to English). The TLK(s) so obtained are then considered to be the active TLK(s).
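
The matching rule (case-insensitive, hyphens and underscores interchangeable) could look roughly like this in Python. Both function names are hypothetical, and returning `None` stands in for "fall back to baldur.ini".

```python
def normalise_tag(tag):
    """Fold case and treat '-' and '_' as the same separator."""
    return tag.replace("-", "_").lower()


def match_lang_dir(tag, lang_dirs):
    """Return the lang/ subdirectory matching the tag, or None.

    None means no match, in which case the caller would fall back to
    the language read from baldur.ini.
    """
    wanted = normalise_tag(tag)
    for name in lang_dirs:
        if normalise_tag(name) == wanted:
            return name
    return None
```

With this, `match_lang_dir("de-DE", ["en_US", "de_DE"])` finds `de_DE`, while a tag for a language the game does not ship simply yields `None`.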

Pre-processing:
Pre-processing starts by comparing dialog.tlk and dialogf.tlk for all languages and padding the shorter one if lengths differ. (This is current behaviour, sans the multi-language bit.) The longest TLK of all languages is found, and the shorter TLKs, if any, are padded to the same length.
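
As a sketch of the padding step, in Python and with each TLK modelled as a plain list of strings (an assumption for illustration; the function name and the empty filler string are placeholders too):

```python
from typing import Dict, List


def pad_to_longest(tlks: Dict[str, List[str]], filler: str = "") -> int:
    """Pad every TLK in place to the length of the longest one.

    Returns that length, which is also the first free strref afterwards.
    """
    longest = max((len(strings) for strings in tlks.values()), default=0)
    for strings in tlks.values():
        strings.extend([filler] * (longest - len(strings)))
    return longest


tlks = {"en_US": ["a", "b", "c"], "fr_FR": ["x"]}
first_free = pad_to_longest(tlks)
```

After the call every language has the same number of entries, so a newly appended string gets the same strref whichever TLK it lands in.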


Reading from the TLKs:
Reads are only made from the active dialog.TLK, effectively the same as what we have now.


Writing to the TLKs:
STRING_SETs are only made to the active TLK(s). The uninstall information for STRING_SETs is augmented with information on which TLK was written to. STRING_SETs are uninstalled from this/these TLK(s) regardless of which TLK(s) is/are the active ones. If this augmenting uninstall information is missing (say, because the mod was installed with WeiDU 231), STRING_SETs are uninstalled from the TLK(s) in the root game directory, if any, or the TLK(s) obtained from baldur.ini (or maybe we should drop it if the TLK(s) in the root dir are missing?).

New strings are appended only to the active TLK(s).


Post-processing:
Post-processing is done when the active TLK(s) are ready to be written to disk. The inactive TLKs are padded to be of the same length as the active TLK(s).

Offline DavidW

  • Planewalker
  • *****
  • Posts: 316
Re: Proposal for BGEE TLK compatibility
« Reply #13 on: February 27, 2013, 08:48:07 AM »
From a user perspective, that sounds great - but it also sounds like a nightmare for you to code.

(If you wanted a nice minimal alternative, you could find some string that displays on the start screen and just change it in every TLK but the active one into some warning that you shouldn't change languages with mods installed!)

Offline Wisp

  • Moderator
  • Planewalker
  • *****
  • Posts: 1176
Re: Proposal for BGEE TLK compatibility
« Reply #14 on: February 27, 2013, 11:11:00 AM »
Any touching of the code on account of multiple TLKs would entail a fair amount of work. This is not that much more. While I'm rooting around in there I may as well do it right.

Offline Mike1072

  • Planewalker
  • *****
  • Posts: 298
  • Gender: Male
Re: Proposal for BGEE TLK compatibility
« Reply #15 on: February 27, 2013, 11:42:04 PM »
Would the post-processing padding copy the new strings from the active .tlk file to the others or append empty strings to the others?

There are users who prefer to install mods in Language A but will install them in Language B when A is not available.  If the padding consisted of empty strings, after installing two mods in different languages, neither of the .tlk files would be complete.
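
A toy Python run of that scenario, assuming empty-string padding and TLKs modelled as lists; `install_mod` and the sample strings are invented for the example.

```python
def install_mod(tlks, active, new_strings, filler=""):
    """Append new_strings to the active TLK, then pad the others with filler."""
    tlks[active].extend(new_strings)
    target = len(tlks[active])
    for strings in tlks.values():
        # A non-positive count yields an empty list, so nothing is added
        # to TLKs that are already long enough.
        strings.extend([filler] * (target - len(strings)))


tlks = {"en_us": ["Hello"], "fr_fr": ["Bonjour"]}
install_mod(tlks, "en_us", ["Mod A line"])   # mod A only exists in English
install_mod(tlks, "fr_fr", ["Mod B line"])   # mod B installed to the French TLK
```

The English TLK ends up as `["Hello", "Mod A line", ""]` and the French one as `["Bonjour", "", "Mod B line"]`, so each file is missing one mod's text.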

Offline GeN1e

  • Planewalker
  • *****
  • Posts: 267
  • Gender: Male
Re: Proposal for BGEE TLK compatibility
« Reply #16 on: February 28, 2013, 02:46:08 PM »
Would the post-processing padding copy the new strings from the active .tlk file to the others or append empty strings to the others?
Thinking very carefully, wouldn't copying only brand new English (or whatever) strings to the other TLKs result in correct overlapping? E.g. if we have SAY'ed "Buckler" somewhere, then it would set the value to #13709 instead of appending new entry - and #13709 should have already been provided with the translation in other languages.

There will still be room for language differences, but for the most part installing a mod in any language could then automatically provide the correct translation for re-used strings, leaving only new ones in gibberish.

Offline Wisp

  • Moderator
  • Planewalker
  • *****
  • Posts: 1176
Re: Proposal for BGEE TLK compatibility
« Reply #17 on: March 02, 2013, 04:03:00 PM »
There are users who prefer to install mods in Language A but will install them in Language B when A is not available.  If the padding consisted of empty strings, after installing two mods in different languages, neither of the .tlk files would be complete.
Good point.

Offline Wisp

  • Moderator
  • Planewalker
  • *****
  • Posts: 1176
Re: Proposal for BGEE TLK compatibility
« Reply #18 on: March 05, 2013, 04:34:42 PM »
Proposal the third:

General:
If it has not already been done and the results saved, the user is prompted for which TLK set he/she would like to install to (or uninstall from). The result is saved to a config file among the other BGEE files (saves, baldur.ini and such) and reused when any subsequent mods are installed. If the file is deleted, WeiDU asks again. Two command-line options are added, something like "--use-tlk en_us" and "--force-tlk en_us". The first sets the TLK path if unset but is disregarded otherwise, the second sets the TLK path even if it is already set. Both obviously inhibit the prompting. If the argument is invalid (referring to a non-existent directory), WeiDU errors out. We call the TLK(s) so selected the active TLK(s).
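
The precedence between the saved choice and the two options could be sketched as below (Python). The function name is hypothetical, and validating every supplied argument up front, even one that ends up disregarded, is an assumption; the proposal only says an invalid argument is an error.

```python
from typing import Optional, Sequence


def resolve_active_tlk(
    saved: Optional[str],
    use_tlk: Optional[str],
    force_tlk: Optional[str],
    available: Sequence[str],
) -> Optional[str]:
    """Pick the active TLK directory; None means the user must be prompted."""
    # An argument naming a non-existent directory is an error.
    for arg in (use_tlk, force_tlk):
        if arg is not None and arg not in available:
            raise ValueError("no such TLK directory: " + arg)
    if force_tlk is not None:
        return force_tlk      # overrides a previously saved choice
    if saved is not None:
        return saved          # --use-tlk is disregarded once a choice is saved
    return use_tlk            # sets the choice if unset; None triggers the prompt
```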

The rest works as per proposal #2 (except for the bit about falling back on the TLK(s) obtained from baldur.ini; instead we fall back on the user-specified TLK(s)). We pad with filler strings, not install strings.

(No, it did not take me a week to come up with this.)

Offline GeN1e

  • Planewalker
  • *****
  • Posts: 267
  • Gender: Male
Re: Proposal for BGEE TLK compatibility
« Reply #19 on: March 06, 2013, 12:03:04 AM »
I think a user capable of understanding what in the Nine Hells the TLK files are about, should also be able to edit the TP2 language arguments.
Otherwise you'll just add extra confusion to the installation process, which, simple as it may seem to us, is pretty complicated to others. I've seen enough players preferring to torrent a pre-installed mega-mod game, than read through the BWP manual. Possibly - for I've never researched the matter - it is a Russia-specific issue, but I bet you'll find it spreading when people are presented with more questions they don't really understand.

I vote for the second proposal, padding is done with install strings.

Offline Wisp

  • Moderator
  • Planewalker
  • *****
  • Posts: 1176
Re: Proposal for BGEE TLK compatibility
« Reply #20 on: March 06, 2013, 10:35:25 AM »
I was thinking more like
Quote
Which language file would you like to install to?
[1] English
[2] French
...
[N] foo_bar (language foo as spoken in country bar, so presented because a new translation was released and WeiDU has yet to be updated)

This would be followed by the usual "choose your language", which could be altered to be a little more clear about what it concerns.

Admittedly there is some slight overlap, but as I see it, it is a robust and unconstraining way of mapping the unknown set of mod-languages onto the set of game languages. It is also something you do once, that is applicable to all mods, rather than something you have to do for each mod (and possibly do slightly differently for each).

Offline Miloch

  • Barbarian
  • Planewalker
  • *****
  • Posts: 1032
  • Gender: Male
Re: Proposal for BGEE TLK compatibility
« Reply #21 on: March 06, 2013, 05:01:13 PM »
I think that last proposal is sound. There's a difference between game language and mod language. So any ETA on a next version? ;)

Offline Wisp

  • Moderator
  • Planewalker
  • *****
  • Posts: 1176
Re: Proposal for BGEE TLK compatibility
« Reply #22 on: March 07, 2013, 06:13:30 AM »
So any ETA on a next version? ;)
Not yet. You can have another beta, if you like. You people like betas, right?

Another problem with proposal #2 that I neglected to mention is what to do with STRING_SETs. If you pad with install strings, STRING_SETs will still be missing, so if two mods install to different TLKs, you still have the incompleteness problem. It's not insurmountable, as you could probably just STRING_SET to all TLKs, as per proposal #1, but it does lead to a more complicated (and thus problem-prone) implementation.

Offline Miloch

  • Barbarian
  • Planewalker
  • *****
  • Posts: 1032
  • Gender: Male
Re: Proposal for BGEE TLK compatibility
« Reply #23 on: July 26, 2013, 01:41:01 PM »
You can have another beta, if you like. You people like betas, right?
Yes, we like betas... thanks for the latest one. I'm willing to test a beta version of this functionality anyway, and there's been a lot of demand for a solution. We've been discussing it elsewhere and I think the best solution is the simplest one for now. If we need complexities (e.g. mass STRING_SET padding) down the road, maybe those can be implemented later but aren't strictly needed at present IMO.

Edit: Also, we were wondering whether the functionality to recognize saved games in "My Documents" is in - do SAVE_DIRECTORY and MPSAVE_DIRECTORY require explicit arguments or can those paths be determined from the OS registry or .ini files somehow?
« Last Edit: July 26, 2013, 01:50:33 PM by Miloch »

Offline GeN1e

  • Planewalker
  • *****
  • Posts: 267
  • Gender: Male
Re: Proposal for BGEE TLK compatibility
« Reply #24 on: July 26, 2013, 02:09:50 PM »
As Miloch says, all things considered, the third solution is probably the best. I would just change the prompt to something more exhaustive.

Quote
WeiDU has detected that your game comes with the multi-language support, allowing you to switch the game language in the middle of playing. Due to overcomplicated technical reasons, it is not possible to install third-party mods - such as this - and retain the language switch function without introducing severe glitches, such as the in-game text becoming incorrect or outright missing - suffice to say that over the course of its development WeiDU has never been optimized to simultaneously support more than one language.

In light of the abovementioned, it is necessary that you decide which of the game's pre-existing language files will be used for the purpose of installing third-party mods from now on. Changing the language in game options to another one will result in ill behavior as described above.

Which game language file would you like to use for installing mods?
  • The currently selected language (English)
  • [1] English
    [2] French
    ...
[N] foo_bar (language foo as spoken in country bar, so presented because a new translation was released and WeiDU has yet to be updated)

 
