Author Topic: regexp newline  (Read 2033 times)

Offline GeN1e

regexp newline
« on: February 19, 2010, 03:19:13 PM »
Quote from: area_actors.txt
AR0530 
DEMPRI    Pitre                         xxxxxxxxxxxxxxxxxxxxxxxx
DEMFIG01  Falahar                       xxxxxxxxxxxxxxxxxxxxxxxx
DEMFIG02  Valeria                       xxxxxxxxxxxxxxxxxxxxxxxx
DEMMAG    Dracandros                    xxxxxxxxxxxxxxxxxxxxxxxx
AR2014 
TRGENI01  Khan Zahraa                   xxxxxxxxxxxxxxxxxxxxxxxx
TRGENI03  Faafirah                      xxxxxxxxxxxxxxxxxxxxxxxx


Code: [Select]
<<<<<<<<dummy
>>>>>>>>
COPY dummy dummy.txt
COPY - area_actors.txt area_actors.txt
  REPLACE_EVALUATE ~\(.+\)~ BEGIN
    PATCH_IF STRING_LENGTH ~%MATCH1%~ < 11 BEGIN

      SPRINT areafile ~%MATCH1%~

      INNER_ACTION BEGIN
        APPEND_OUTER dummy.txt ~~~~~SET $"%areafile%"("%actor%")= ~%schedule%~ ~~~~~
      END
    END
  END ~~
Yields this -
Quote
SET $"AR0530 
"("%actor%")= ~%schedule%~
SET $"AR2014 
"("%actor%")= ~%schedule%~ 
That's bad. Replacing SPRINT areafile ~%MATCH1%~ with this:
Code: [Select]
      INNER_PATCH ~%MATCH1%~ BEGIN
        FOR (i=0;i<STRING_LENGTH ~%MATCH1%~;i+=1) BEGIN
          READ_ASCII i char (1)
          PATCH_IF (~%char%~ STRING_COMPARE_REGEXP ~[a-zA-Z0-9]~) BEGIN
            WRITE_BYTE i 0
          END
        END
        READ_ASCII 0 areafile (8) NULL
      END
yields the correct result:

Quote
SET $"AR0530"("%actor%")= ~%schedule%~
SET $"AR2014"("%actor%")= ~%schedule%~ 
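For readers less used to the byte-patching idiom, the workaround above can be sketched outside WeiDU. This is a rough Python analogue, not WeiDU code, and the function name clean_area_name is hypothetical. It overwrites every byte that is not a letter or digit with NUL, then reads at most 8 bytes up to the first NUL, which mirrors the READ_ASCII 0 areafile (8) NULL step.

```python
import re

# Same character class as the STRING_COMPARE_REGEXP test in the workaround.
ALNUM = re.compile(rb"[a-zA-Z0-9]")

def clean_area_name(match: bytes, width: int = 8) -> str:
    """Zero out non-alphanumeric bytes, then read up to the first NUL."""
    buf = bytearray(match)
    for i in range(len(buf)):
        if not ALNUM.match(bytes(buf[i:i + 1])):
            buf[i] = 0  # analogue of WRITE_BYTE i 0
    head = bytes(buf[:width])  # analogue of reading 8 bytes
    return head.split(b"\x00", 1)[0].decode("ascii")

# A match that dragged along a trailing space and carriage return.
print(clean_area_name(b"AR0530 \r"))
```

Note that this approach cuts the name at the first non-alphanumeric byte, so it only suits lines where the wanted text comes first.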

I would assume that, for whatever reason, ~\(.+\)~ catches the newline character. But I thought a dot matches anything except a newline?

Also, doesn't the readme mention that ~~~~~ lets you store any character, including %, in the string? I'm not complaining, quite the opposite in fact, as otherwise it would have taken more work to do it right, but it's still strange.
« Last Edit: February 19, 2010, 03:20:48 PM by GeN1e »

Offline the bigg

Re: regexp newline
« Reply #1 on: February 19, 2010, 03:39:02 PM »
Maybe it's catching the \r? Either way, OCaml regexps are way beyond broken, and I can't change how they work.

The readme actually mentions ~~~~~ (it has been in since 168 or so) - just search it for ~~~~~. Of course, it isn't easy to find if you don't already know what to look for, but then what is?
Author or Co-Author: WeiDU (http://j.mp/bLtjOn) - Widescreen (http://j.mp/aKAiqG) - Generalized Biffing (http://j.mp/aVgw3U) - Refinements (http://j.mp/bLHoCc) - TB#Tweaks (http://j.mp/ba02Eg) - IWD2Tweaks (http://j.mp/98OFYY) - TB#Characters (http://j.mp/ak8J55) - Traify Tool (http://j.mp/g1Ry9A) - Some mods that I won't mention in public
Maintainer: Semi-Multi Clerics (http://j.mp/9UeIwB) - Nalia Mod (http://j.mp/dng9l0) - Nvidia Fix (http://j.mp/aRWjjg)
Code dumps: Detect custom secondary types (http://j.mp/hVzzXG) - Stutter Investigator (http://j.mp/gdtBn8)

If possible, send diffs, translations and other contributions using Git (http://j.mp/aBZFrq).

Offline Mike1072

Re: regexp newline
« Reply #2 on: February 19, 2010, 04:28:30 PM »
Quote from: the bigg
Maybe it's catching the \r?
Definitely looks like it.

I did a straight-up REPLACE_TEXTUALLY ~\(.+\)~ ~@\1!~ on
Code: [Select]
A
B
C
with the 3 styles of newlines.

With Linux line endings (\n), it gave the expected
Code: [Select]
@A!\n
@B!\n
@C!\n

With Mac line endings (\r), it gave
Code: [Select]
@A\r
B\r
C\r
!

And with Windows line endings (\r\n), it gave
Code: [Select]
@A\r!\n
@B\r!\n
@C\r!\n

Is it possible to do any pre-processing on regular expressions before they get sent to OCaml?  Turn dots that aren't escaped or in character sets into [^%mnl%%lnl%]?  What I'd really like are some \t, \r, \n shortcuts, so we don't have to use variables....
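The three results above are what you would expect if the dot excludes only \n (0x0a) and treats \r (0x0d) as an ordinary character. Python's re module happens to have the same default for the dot, so it can stand in for a quick check. This is an analogy only: it is not the OCaml engine WeiDU uses, and the helper names are made up for the illustration. The second helper uses an explicit [^\r\n] class, the same idea as [^%mnl%%lnl%].

```python
import re

def wrap_dot(text: str) -> str:
    """Analogue of REPLACE_TEXTUALLY ~\\(.+\\)~ ~@\\1!~ with a plain dot."""
    return re.sub(r"(.+)", r"@\1!", text)

def wrap_no_cr(text: str) -> str:
    """Same replacement, but the class excludes both \\r and \\n."""
    return re.sub(r"([^\r\n]+)", r"@\1!", text)

samples = {
    "linux": "A\nB\nC\n",
    "mac": "A\rB\rC\r",
    "windows": "A\r\nB\r\nC\r\n",
}

for name, text in samples.items():
    # repr() makes the \r and \n characters visible in the output.
    print(name, repr(wrap_dot(text)), repr(wrap_no_cr(text)))
```

With the explicit class, the marker lands directly after each letter for all three newline styles, which is the behaviour the proposed dot substitution is after.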

Offline the bigg

Re: regexp newline
« Reply #3 on: February 19, 2010, 04:32:22 PM »
Regexp preprocessing is a compatibility bomb waiting to explode.

Offline Mike1072

Re: regexp newline
« Reply #4 on: February 19, 2010, 05:29:41 PM »
Sounds exciting!

Offline the bigg

Re: regexp newline
« Reply #5 on: February 19, 2010, 05:33:29 PM »

Offline cmorgan

Re: regexp newline
« Reply #6 on: February 19, 2010, 08:32:18 PM »
Check CamDawg's stuff in Fixpack and bg1noc and some other places - he adds


D:\ie_modding\BG2_Fixpack-v8\bg2fixpack\lib\extra_regexp_vars.tph
Code: [Select]
OUTER_INNER_PATCH ~12~ BEGIN
  WRITE_BYTE 1 0x09
  READ_ASCII 1 tab (1)  // 0x09, tab
  WRITE_BYTE 1 0x0a
  READ_ASCII 1 lnl (1)  // 0x0a, Linux
  WRITE_BYTE 0 0x0d
  READ_ASCII 0 mnl (1)  // 0x0d, Mac
  READ_ASCII 0 wnl (2)  // 0x0d0a, Windows
END

Will this help? (at least you will get a good laugh at my misunderstanding :D )
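In case the byte juggling in that snippet is not obvious: it starts from the two-byte string "12", overwrites bytes in place, and reads them back as one- or two-character strings. A rough Python analogue (an illustration only, not WeiDU; the function name make_regexp_vars is hypothetical) might look like this:

```python
def make_regexp_vars() -> dict:
    """Build tab / newline strings by patching a 2-byte scratch buffer."""
    buf = bytearray(b"12")       # analogue of OUTER_INNER_PATCH ~12~
    out = {}

    buf[1] = 0x09                # WRITE_BYTE 1 0x09
    out["tab"] = bytes(buf[1:2]) # READ_ASCII 1 tab (1)

    buf[1] = 0x0A                # WRITE_BYTE 1 0x0a
    out["lnl"] = bytes(buf[1:2]) # READ_ASCII 1 lnl (1)

    buf[0] = 0x0D                # WRITE_BYTE 0 0x0d
    out["mnl"] = bytes(buf[0:1]) # READ_ASCII 0 mnl (1)

    # Bytes 0 and 1 now hold 0x0d 0x0a, so a 2-byte read gives CRLF.
    out["wnl"] = bytes(buf[0:2]) # READ_ASCII 0 wnl (2)
    return out

print(make_regexp_vars())
```

The order of the writes matters: wnl only comes out as CRLF because the line feed was left in byte 1 before the carriage return was written to byte 0.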

Offline the bigg

Re: regexp newline
« Reply #7 on: February 20, 2010, 05:13:40 AM »
Mike is asking for that to be performed automatically, actually.

erik

  • Guest
Re: regexp newline
« Reply #8 on: February 22, 2010, 10:18:16 AM »
Definitely going to explode hairily, if changed.

 
