Topic: Using advanced regular expressions in WeiDU

Argent77
« on: March 29, 2017, 06:23:28 PM »
Is it possible to use some kind of "inverse matching" in regular expressions?

I tried to match a whole Lua function definition (starting with "function myFunction()" and ending with the first instance of "end") with the following expression:
Code: [Select]
~[ %TAB%]*function[ %TAB%]+myFunction()\(^\(\(?!end\).\)*[%WNL%]+\)+[ %TAB%]*end~
but didn't succeed.

It works after a fashion with
Code: [Select]
~[ %TAB%]*function[ %TAB%]+myFunction()\(.*[%WNL%]+\)+[ %TAB%]*end~
However, it matches up to the last instance of "end" in the whole file, which removes about 50% of the content. It would work with a non-greedy quantifier, but I can't find out how to use one in WeiDU.

Does anyone have more ideas?
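
The over-match described above is what greedy repetition does: the group keeps consuming lines and only backs off as far as the last "end" in the file. A minimal sketch in Python (using Python's `re`, not WeiDU's regex engine) reproduces it and contrasts it with a two-step search that finds the header first and then the first "end" after it. The sample text and the helper names `greedy_span` and `first_end_span` are made up for illustration.

```python
import re

# Hypothetical stand-in for a small piece of UI.menu (not taken from the real file).
SAMPLE = (
    "function myFunction()\n"
    "  doSomething()\n"
    "end\n"
    "\n"
    "function otherFunction()\n"
    "  doSomethingElse()\n"
    "end\n"
)

# Greedy repetition of "any line", the Python counterpart of \(.*[%WNL%]+\)+ above.
GREEDY = re.compile(r"[ \t]*function[ \t]+myFunction\(\)(?:.*[\r\n]+)+[ \t]*end")


def greedy_span(text):
    """Return the text consumed by the greedy pattern, or None if it does not match."""
    match = GREEDY.search(text)
    return match.group(0) if match else None


def first_end_span(text):
    """Two-step search: locate the header, then the first 'end' after it."""
    header = "function myFunction()"
    start = text.find(header)
    if start < 0:
        return None
    stop = text.find("end", start + len(header))
    if stop < 0:
        return None
    return text[start:stop + len("end")]
```

Here `greedy_span(SAMPLE)` runs on through `otherFunction`, while `first_end_span(SAMPLE)` stops at the first "end".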

GeN1e
« Reply #1 on: March 29, 2017, 08:23:30 PM »
Does wrapping it in an additional set of \(\) help?
Code: [Select]
~\([ %TAB%]*function[ %TAB%]+myFunction()\(.*[%WNL%]+\)+[ %TAB%]*end\)~
I distinctly remember matching individual IF THEN END blocks in scripts without issues, although I don't remember the exact structure.
« Last Edit: March 29, 2017, 08:25:12 PM by GeN1e »

Magus_BGforge
« Reply #2 on: March 30, 2017, 07:42:14 AM »
A function in Lua can contain an end that is not the function's own end (for example, the end of a nested if block), so it would be imperfect anyway. It might be better to look for a workaround that doesn't involve parsing Lua with regex.
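
To make that concrete, here is a small Python sketch (not WeiDU code) of what goes wrong when a search stops at the first "end". The Lua fragment and the helper `cut_at_first_end` are hypothetical.

```python
# Hypothetical Lua fragment; the names are made up for illustration.
LUA_SNIPPET = (
    "function myFunction()\n"
    "  if ready then\n"
    "    run()\n"
    "  end\n"
    "end\n"
)


def cut_at_first_end(text, header):
    """Return the text from header up to and including the first 'end' after it."""
    start = text.find(header)
    if start < 0:
        return None
    stop = text.find("end", start + len(header))
    if stop < 0:
        return None
    return text[start:stop + len("end")]


# The cut stops at the "end" that closes the if block, not the function.
truncated = cut_at_first_end(LUA_SNIPPET, "function myFunction()")
```

The result contains only one of the two "end" keywords, so the function's own closing "end" is left behind in the file. A plain substring search has a second weakness: identifiers such as "append" also contain "end".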

Argent77
« Reply #3 on: March 30, 2017, 08:16:39 AM »
Quote from: GeN1e
Does wrapping it in an additional set of \(\) help?

That doesn't work either. I have also tried escaping/unescaping all kinds of control characters without success.

Quote from: Magus_BGforge
A function in Lua can contain an end that is not an actual end, so it would be imperfect anyway. Maybe it would be better to look for a workaround that doesn't include parsing Lua with regex.

Yes, I know. Actually, one of the functions I intend to replace contains several "if" blocks which end with "end". I'm using additional matches in that case. A perfect solution would require a full-featured Lua parser, but that's probably overkill for a simple UI modification and wouldn't easily work with UI.MENU, which only contains embedded Lua code fragments.

I have found a solution that is more cumbersome, but works:
Code: [Select]
COPY_EXISTING ~ui.menu~ ~override~
  TEXT_SPRINT text ~replacement text without final "end" line~
  SET textSize = STRING_LENGTH ~%text%~
  SET startIndex = INDEX_BUFFER ( ~[ %TAB%]*function[ %TAB%]+myFunction()~ )
  PATCH_IF (startIndex >= 0) BEGIN
    SET endIndex = INDEX_BUFFER ( ~[ %TAB%]*end~ startIndex )
    PATCH_IF (endIndex > startIndex) BEGIN
      SET searchSize = endIndex - startIndex
      PATCH_IF (textSize > searchSize) BEGIN
        INSERT_BYTES startIndex (textSize - searchSize)
      END ELSE PATCH_IF (textSize < searchSize) BEGIN
        DELETE_BYTES startIndex (searchSize - textSize)
      END
      WRITE_ASCIIE startIndex ~%text%~ (textSize)
    END
  END
« Last Edit: March 30, 2017, 08:38:57 AM by Argent77 »

Wisp
  • Moderator
« Reply #4 on: March 30, 2017, 03:17:59 PM »
To answer the original question: WeiDU uses OCaml's regexps, and OCaml does not use standard regexps (unless you consider Emacs a standard). What's in the Readme is what's available. There is third-party code for, e.g., PCRE, but I can't draw on it: as things are today, there would be licence conflicts, and like most modern OCaml projects, those libraries use toolchains that probably don't work too well (if at all) on Windows.
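
For comparison, the two features the first post was reaching for are a lazy quantifier and a negative lookahead (the `\(?!end\)` attempt). Both are PCRE-style extensions. The sketch below shows them in Python's `re`, which supports them; to my knowledge OCaml's Str syntax has no equivalent of either, which fits the answer above. The sample text and the helper `match_text` are made up for illustration.

```python
import re

# Hypothetical sample text, not taken from a real UI.menu.
SAMPLE = (
    "function myFunction()\n"
    "  doSomething()\n"
    "end\n"
    "\n"
    "function otherFunction()\n"
    "end\n"
)

# Lazy quantifier: .*? expands only as far as needed to reach an "end".
LAZY = re.compile(r"function[ \t]+myFunction\(\).*?end", re.DOTALL)

# Negative lookahead: consume a character only if "end" does not start at it.
TEMPERED = re.compile(r"function[ \t]+myFunction\(\)(?:(?!end).)*end", re.DOTALL)


def match_text(pattern, text):
    """Return the matched text for pattern, or None if there is no match."""
    match = pattern.search(text)
    return match.group(0) if match else None
```

Both patterns stop at the first "end" after the header. Neither solves the nested-block problem raised earlier in the thread.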

Galactygon
« Reply #5 on: April 16, 2017, 02:47:43 PM »
I wrote a function that replaces a function block in UI.menu (no pun intended) and takes nested if blocks and for loops into account; it turned out to be less hassle than expected. See the code below, with an inlined example of a new UI block.

Code: [Select]
DEFINE_PATCH_FUNCTION "REPLACE_UI_BLOCK"
  INT_VAR replaced_ui_block = 0
  STR_VAR new_block = ""
    find_block = ""
    find_bound = "\([^a-z0-9]if[^a-z0-9].*[^a-z0-9]then\|[^a-z0-9]for[^a-z0-9].*[^a-z0-9]do\|[^a-z0-9]end[^a-z0-9]\)"
  RET replaced_ui_block // set to 1 if replacement is successful
BEGIN
  // Truncate any blank spaces/lines at the end of input variable
  INNER_PATCH_SAVE new_block "%new_block%" BEGIN
    PATCH_IF BUFFER_LENGTH > 0 BEGIN // truncate at end
      READ_ASCII (BUFFER_LENGTH - 1) last (1)
      WHILE "%last%" STRING_MATCHES_REGEXP "[ %TAB%%LNL%%MNL%%WNL%]" = 0 AND BUFFER_LENGTH > 0 BEGIN
        DELETE_BYTES (BUFFER_LENGTH - 1) 1
        READ_ASCII (BUFFER_LENGTH - 1) last (1)
      END
    END
  END
  PATCH_IF "%SOURCE_FILE%" STRING_MATCHES_REGEXP "^.+\.menu$" = 0 BEGIN // sanity check
    SET index_begin = INDEX_BUFFER ("%find_block%" 0)
    PATCH_IF index_begin >= 0 BEGIN // If found a result
      SET index_end = index_begin
      SET index_depth = 1 // Set depth to 1
      WHILE index_depth > 0 AND index_end >= 0 BEGIN // while depth does not return to 0 and not eof
        SET index_end = INDEX_BUFFER ("%find_bound%" (index_end + 1))
        READ_ASCII index_end index_match (4)
        PATCH_IF "%index_match%" STRING_CONTAINS_REGEXP "\(if\|for\)" = 0 BEGIN // if beginning of something i.e. if/for loop
          SET index_depth += 1 // increment depth by 1
        END ELSE
        PATCH_IF "%index_match%" STRING_CONTAINS_REGEXP "end" = 0 BEGIN // if end of something
          SET index_depth -= 1 // decrement depth by 1
        END
      END
      // If block can be properly read
      PATCH_IF index_depth = 0 AND index_begin < index_end BEGIN
        SET index_end += 4 // include the " end" string that completes the block
        SET read_length = index_end - index_begin
        SET write_length = STRING_LENGTH "%new_block%"
        PATCH_IF (write_length > read_length) BEGIN
          INSERT_BYTES index_begin (write_length - read_length)
        END ELSE
        PATCH_IF (write_length < read_length) BEGIN
          DELETE_BYTES index_begin (read_length - write_length)
        END
        WRITE_ASCIIE index_begin "%new_block%" (write_length)
        SET replaced_ui_block = 1
      END
    END
  END
END

<<<<<<<< .../%MOD_FOLDER%-Inlined/MyFunction.txt
function MyFunction()
do stuff here
end
>>>>>>>>
COPY - ".../%MOD_FOLDER%-Inlined/MyFunction.txt" ""
READ_ASCII 0 my_block (BUFFER_LENGTH)

// Replace "function updateAttrTable()" in UI.menu with inlined function
COPY_EXISTING "UI.menu" "override"
LPF "REPLACE_UI_BLOCK" STR_VAR new_block = EVAL "%my_block%" find_block = "function updateAttrTable()" END
BUT_ONLY

Argent77
« Reply #6 on: April 18, 2017, 07:43:10 AM »
Great work!

That will surely help to properly 'WeiDU-ize' a great number of UI mods which are getting more and more popular. I will be using it in my "Resizeable Combat Log" mod for PST:EE.

Btw, I've made some minor changes to the script: it now also supports Lua functions in *.lua files, has a few (mainly cosmetic) optimizations, and comes with a short function description:
Code: [Select]
/**
 * Replaces a complete LUA function in *.menu or *.lua files.
 * STR_VAR new_block      Replacement text for the matching lua function.
 * STR_VAR find_block     Start of the lua function to match (e.g. "function myFunctionToReplace(param1)").
 *                        Match includes any whitespace found directly before the search text.
 * RET replaced_ui_block  Returns 1 if the replacement was successful, 0 otherwise.
 *
 * Original author: Galactygon
 */
DEFINE_PATCH_FUNCTION REPLACE_UI_BLOCK
  STR_VAR
    new_block   = ""
    find_block  = ""
  RET
    replaced_ui_block
BEGIN
  SET replaced_ui_block = 0

  // Truncate any blank spaces/lines at the end of input variable
  INNER_PATCH_SAVE new_block "%new_block%" BEGIN
    PATCH_IF BUFFER_LENGTH > 0 BEGIN // truncate at end
      READ_ASCII (BUFFER_LENGTH - 1) last (1)
      WHILE "%last%" STRING_MATCHES_REGEXP "[ %TAB%%WNL%]" = 0 AND BUFFER_LENGTH > 0 BEGIN
        DELETE_BYTES (BUFFER_LENGTH - 1) 1
        READ_ASCII (BUFFER_LENGTH - 1) last (1)
      END
    END
  END
  PATCH_IF ("%SOURCE_EXT%" STRING_EQUAL_CASE "menu" OR "%SOURCE_EXT%" STRING_EQUAL_CASE "lua") BEGIN // sanity check
    TEXT_SPRINT find_bound "\([^a-z0-9]if[^a-z0-9].*[^a-z0-9]then\|[^a-z0-9]for[^a-z0-9].*[^a-z0-9]do\|[^a-z0-9]end[^a-z0-9]\)"
    SET index_begin = INDEX_BUFFER ("[ %TAB%]*%find_block%" 0)
    PATCH_IF index_begin >= 0 BEGIN // If found a result
      SET index_end = index_begin
      SET index_depth = 1 // Set depth to 1
      WHILE index_depth > 0 AND index_end >= 0 BEGIN // while depth does not return to 0 and not eof
        SET index_end = INDEX_BUFFER ("%find_bound%" (index_end + 1))
        READ_ASCII index_end index_match (4)
        PATCH_IF "%index_match%" STRING_CONTAINS_REGEXP "\(if\|for\)" = 0 BEGIN // if beginning of something i.e. if/for loop
          SET index_depth += 1 // increment depth by 1
        END ELSE
        PATCH_IF "%index_match%" STRING_CONTAINS_REGEXP "end" = 0 BEGIN // if end of something
          SET index_depth -= 1 // decrement depth by 1
        END
      END
      // If block can be properly read
      PATCH_IF index_depth = 0 AND index_begin < index_end BEGIN
        SET index_end += 4 // include the " end" string that completes the block
        SET read_length = index_end - index_begin
        SET write_length = STRING_LENGTH "%new_block%"
        PATCH_IF (write_length > read_length) BEGIN
          INSERT_BYTES index_begin (write_length - read_length)
        END ELSE
        PATCH_IF (write_length < read_length) BEGIN
          DELETE_BYTES index_begin (read_length - write_length)
        END
        WRITE_ASCIIE index_begin "%new_block%" (write_length)
        SET replaced_ui_block = 1
      END
    END
  END
END

// Example usage:
<<<<<<<< .../%MOD_FOLDER%-Inlined/MyFunction.txt
function MyFunction()
do stuff here
end
>>>>>>>>
COPY - ".../%MOD_FOLDER%-Inlined/MyFunction.txt" ""
  READ_ASCII 0 my_block (BUFFER_LENGTH)

// Replace "function updateAttrTable()" in UI.menu with inlined function
COPY_EXISTING "UI.menu" "override"
  LPF REPLACE_UI_BLOCK STR_VAR new_block = EVAL "%my_block%" find_block = EVAL "function updateAttrTable()" END
BUT_ONLY
« Last Edit: April 18, 2017, 01:41:01 PM by Argent77 »

Galactygon
« Reply #7 on: April 18, 2017, 01:01:23 PM »
Thanks Argent77! After asking you for so much advice I finally get to contribute something in return! ;D

Your updated version seems good to me; I would certainly suggest that Wisp include it in the next WeiDU release. Does %WNL% suffice for all operating systems? I keep using [%LNL%%MNL%%WNL%] just in case.

I've got a couple of minor suggestions for the function, mostly "convenience" suggestions for beginners:
1. If %find_bound% is changed by the user then it can mess up the if/for "depth" detection so I suggest just placing it above the INNER_PATCH_SAVE new_block "%new_block%" line so it doesn't become changeable by the user.
2. %replaced_ui_block% should be set to 0 at the beginning of the function so it overrides whatever the user sets.
3. I would place an "EVAL" before %find_block% in the example code, even if there are no variables set in the example. I make this mistake *all the time* of forgetting to place "EVAL" where it should be placed.

Argent77
« Reply #8 on: April 18, 2017, 01:41:35 PM »
MNL (\r) and LNL (\n) are redundant inside regex square brackets because WNL (\r\n) includes both already.

I've updated the code in my previous post.
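
The redundancy follows from how bracket expressions work: each character listed inside [ ] is a separate member of the set, so a two-character sequence contributes both of its characters individually. A small Python sketch of that rule, assuming (as the post states) that %WNL% expands to "\r\n" before the pattern is compiled; `is_trimmable` is a hypothetical helper name.

```python
import re

WNL = "\r\n"  # assumed expansion of %WNL%, as described in the post above

# Python counterpart of the class "[ %TAB%%WNL%]" after variable substitution.
TRAILING = re.compile("[ \t" + WNL + "]")


def is_trimmable(char):
    """True if the single character belongs to the trailing-whitespace class."""
    return TRAILING.fullmatch(char) is not None
```

A lone "\n" or a lone "\r" is accepted even though only the CRLF sequence was written inside the brackets, so adding %LNL% or %MNL% to the class changes nothing.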

Galactygon
« Reply #9 on: April 18, 2017, 04:38:46 PM »
Note that not setting %new_block% will result in the found block being deleted.

Here's another, simpler function I wrote, INSERT_UI_BLOCK, in which I set %NEWLINE% to different values depending on the OS:
Code: [Select]
/**
 * Inserts a complete LUA function in *.menu or *.lua files.
 * STR_VAR new_block      Insertion text. Two newlines are automatically added after the text.
 * STR_VAR find_block     Text will be inserted before the following text (e.g. "function myFunctionToReplace(param1)").
 *                        Match includes any whitespace found directly before the search text.
 * RET added_ui_block     Returns 1 if the insertion was successful, 0 otherwise.
 *
 * Original author: Galactygon
 */
DEFINE_PATCH_FUNCTION INSERT_UI_BLOCK
  STR_VAR
    new_block = ""
    find_block = ""
  RET
    added_ui_block // set to 1 if insertion is successful
BEGIN
  SET added_ui_block = 0 // reset to default value
  // Truncate any blank spaces/lines at the end of input variable
  INNER_PATCH_SAVE new_block "%new_block%" BEGIN
    PATCH_IF BUFFER_LENGTH > 0 BEGIN // truncate at end
      READ_ASCII (BUFFER_LENGTH - 1) last (1)
      WHILE "%last%" STRING_MATCHES_REGEXP "[ %TAB%%LNL%%MNL%%WNL%]" = 0 AND BUFFER_LENGTH > 0 BEGIN
        DELETE_BYTES (BUFFER_LENGTH - 1) 1
        READ_ASCII (BUFFER_LENGTH - 1) last (1)
      END
    END
  END
  // Set OS-specific newline variable
  PATCH_IF "%WEIDU_OS%" STRING_EQUAL_CASE "unix" BEGIN
    SPRINT NEWLINE "%LNL%"
  END ELSE BEGIN
    SPRINT NEWLINE "%WNL%"
  END
  PATCH_IF ("%SOURCE_EXT%" STRING_EQUAL_CASE "menu" OR "%SOURCE_EXT%" STRING_EQUAL_CASE "lua") BEGIN // sanity check
    SET index_begin = INDEX_BUFFER ("[ %TAB%]*%find_block%" 0)
    PATCH_IF index_begin >= 0 BEGIN // If found a result
      // NEWLINE is 1 byte (%LNL%) or 2 bytes (%WNL%), so measure the whole insertion
      SET write_length = STRING_LENGTH "%new_block%%NEWLINE%%NEWLINE%"
      INSERT_BYTES index_begin write_length
      WRITE_ASCIIE index_begin "%new_block%%NEWLINE%%NEWLINE%" (write_length)
      SET added_ui_block = 1
    END
  END
END

// Example usage:
<<<<<<<< .../%MOD_FOLDER%-Inlined/MyFunction.txt
function MyFunction()
do stuff here
end
>>>>>>>>
COPY - ".../%MOD_FOLDER%-Inlined/MyFunction.txt" ""
  READ_ASCII 0 my_block (BUFFER_LENGTH)

// Insert inlined function before "function initAbilities()" in UI.menu
COPY_EXISTING "UI.menu" "override"
  LPF INSERT_UI_BLOCK STR_VAR new_block = EVAL "%my_block%" find_block = EVAL "function initAbilities()" END
BUT_ONLY
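
How many bytes the insertion occupies depends on the newline sequence: two LF newlines add 2 bytes, two CRLF newlines add 4. A minimal Python model of the insertion step (not WeiDU code) makes that visible. The name `insert_block` and its parameters are made up for illustration, and the marker is matched literally here rather than as a regular expression.

```python
import re


def insert_block(text, new_block, find_block, newline="\n"):
    """Insert new_block plus two newlines before the first find_block.

    Returns (patched_text, length_added); length_added is 0 if the marker
    was not found. For ASCII text the length equals the byte count.
    """
    # Drop trailing spaces, tabs and line breaks, as the truncation loop does.
    new_block = new_block.rstrip(" \t\r\n")
    # Include any spaces/tabs directly before the marker in the match.
    match = re.search(r"[ \t]*" + re.escape(find_block), text)
    if match is None:
        return text, 0
    insertion = new_block + newline + newline
    patched = text[:match.start()] + insertion + text[match.start():]
    return patched, len(insertion)
```

Measuring the finished insertion string, rather than adding a fixed number to the block length, keeps the count right for either newline style.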

Argent77
« Reply #10 on: April 19, 2017, 07:54:37 AM »
You can remove %MNL% completely. It was only used by ancient pre-OS X Macintoshes. Modern Mac OS X uses Unix-style line breaks.

Galactygon
« Reply #11 on: April 26, 2017, 12:50:12 PM »
I've modified the INSERT_UI_BLOCK patch function above for future reference. I've padded the INDEX_BUFFER pattern with optional blank space, so it's now INDEX_BUFFER ("[ %TAB%]*%find_block%" 0). I've also changed how %NEWLINE% is set and cleaned up the function header a bit. Furthermore, I've included a check for the "lua" extension.

@Argent
I forgot to include while ... do loops in the REPLACE_UI_BLOCK function. Can you change the following lines?
1.
Code: [Select]
TEXT_SPRINT find_bound "[^a-z0-9]\(if[^a-z0-9].*[^a-z0-9]then\|for[^a-z0-9].*[^a-z0-9]do\|while[^a-z0-9].*[^a-z0-9]do\|end\)[^a-z0-9]"
2.
Code: [Select]
PATCH_IF "%index_match%" STRING_CONTAINS_REGEXP "\(if\|for\|whi\)" = 0 BEGIN // if beginning of something i.e. if/for/while loop
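
The depth-counting idea behind REPLACE_UI_BLOCK, with while loops included, can be modelled in a few lines of Python. This is a sketch of the technique under simplifying assumptions, not a port of the WeiDU function: it uses Python's `re` with word boundaries instead of the [^a-z0-9] delimiters, matches the header literally, and the names `find_block_end` and `replace_block` are made up.

```python
import re

# if/for/while each open a block that a later "end" closes.
KEYWORD = re.compile(r"\b(if|for|while|end)\b")


def find_block_end(text, header):
    """Return the index just past the 'end' closing the block opened by header, or -1."""
    start = text.find(header)
    if start < 0:
        return -1
    depth = 1  # the function header itself opens one block
    for match in KEYWORD.finditer(text, start + len(header)):
        if match.group(1) == "end":
            depth -= 1
            if depth == 0:
                return match.end()
        else:
            depth += 1
    return -1  # ran out of text before the block was closed


def replace_block(text, header, new_block):
    """Replace the whole block starting at header; returns (text, 1) or (text, 0)."""
    start = text.find(header)
    stop = find_block_end(text, header)
    if start < 0 or stop < 0:
        return text, 0
    return text[:start] + new_block + text[stop:], 1
```

Like the patterns in this thread, the sketch does not account for nested function definitions, bare do ... end blocks, repeat ... until loops, or keywords inside strings and comments, so it only suits files where those do not occur inside the replaced block.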

 
