Topic: Is STRING_MATCHES_REGEXP working as intended?

DavidW
Is STRING_MATCHES_REGEXP working as intended?
« on: December 12, 2023, 08:03:04 AM »
I find that
Code: [Select]
"frog toad" STRING_MATCHES_REGEXP " *"
returns 0 (no change), when I assumed it should return 1.

In general I can't seem to get STRING_MATCHES_REGEXP to behave any differently from STRING_CONTAINS_REGEXP.

Argent77
Re: Is STRING_MATCHES_REGEXP working as intended?
« Reply #1 on: December 12, 2023, 09:30:04 AM »
The description in the WeiDU Readme appears to be inaccurate.

Looking at the sources, it seems that both WeiDU actions are handled by the same function, which calls different OCaml functions internally depending on the input parameters.

The OCaml function called for STRING_MATCHES_REGEXP is Str.string_match, which works differently from what the WeiDU readme describes. A regular expression returns 0 (match) if it matches a substring at the beginning of the source string.

In the example above, "f", "frog" and "frog toad" will all return 0. The same is true for " *", since it can match an empty substring, and an empty match succeeds at any string position.
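
A minimal tp2 sketch of the prefix behaviour described above, assuming the lines are placed inside a component. The variable names are made up for illustration. Going by the explanation above, all three values should come out as 0:

Code: [Select]
// Each regexp only has to match at the start of "frog toad".
OUTER_SET prefix_letter = ("frog toad" STRING_MATCHES_REGEXP "f")
OUTER_SET prefix_word = ("frog toad" STRING_MATCHES_REGEXP "frog")
// " *" can match zero spaces, so it also succeeds at position 0.
OUTER_SET empty_match = ("frog toad" STRING_MATCHES_REGEXP " *")
PRINT "f: %prefix_letter%, frog: %prefix_word%, space-star: %empty_match%"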

Wisp (Moderator)
Re: Is STRING_MATCHES_REGEXP working as intended?
« Reply #2 on: December 13, 2023, 02:43:59 PM »
I'm not sure what you mean by WeiDU's description being inaccurate. Can you elaborate?

As you say, the example returns 0 because " *" means "match space 0 or more times" and 0 times is a match. The regexp " +" is not a match with STRING_MATCHES_REGEXP, but it is with STRING_CONTAINS_REGEXP, since in the latter case, it can match the interposing space. This seems to me like it's working as described, with allowances for regexps.
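
The " +" contrast can be sketched like this, again assuming the lines sit inside a component and using illustrative variable names. If the behaviour is as described, the first value should be non-zero and the second should be 0:

Code: [Select]
// "frog toad" does not begin with a space, so this should not match.
OUTER_SET starts_with_space = ("frog toad" STRING_MATCHES_REGEXP " +")
// The space between the two words can be found anywhere in the string.
OUTER_SET has_space = ("frog toad" STRING_CONTAINS_REGEXP " +")
PRINT "matches: %starts_with_space%, contains: %has_space%"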

Argent77
Re: Is STRING_MATCHES_REGEXP working as intended?
« Reply #3 on: December 13, 2023, 03:28:47 PM »
From the WeiDU readme:
String STRING_MATCHES_REGEXP String
As STRING_COMPARE_CASE, but the second string is treated as a regexp. Thus "AR1005" STRING_MATCHES_REGEXP "AR[0-9]+" evaluates to 0 (“no difference”). You may use STRING_COMPARE_REGEXP as a synonym.

The description implies that STRING_MATCHES_REGEXP behaves identically to STRING_COMPARE_CASE, which matches only if the second string completely matches the first string. That doesn't appear to be the case for STRING_MATCHES_REGEXP which also matches substrings.

Wisp (Moderator)
Re: Is STRING_MATCHES_REGEXP working as intended?
« Reply #4 on: December 13, 2023, 04:24:51 PM »
I agree the comparison to STRING_COMPARE_CASE does the description no favours, and now we can also attribute this misunderstanding to it. I think it was simply meant to convey that, like STRING_COMPARE_CASE, STRING_MATCHES_REGEXP returns 0 (false) on a match. The regexp " *" is open-ended so we could argue about what a complete match even means. I would argue current behaviour is simply how regexps work (even in PCRE or other environments), and if you don't use an expression like "^ *$" you can't expect a regexp to not match substrings.
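
A small sketch of the anchoring point, assuming it runs inside a component. Since STRING_MATCHES_REGEXP already starts matching at the beginning of the string, a trailing "$" should be enough to require that the regexp covers the whole string:

Code: [Select]
// "frog" alone matches as a prefix, but "frog$" should not,
// because a space (not the end of the string) follows "frog".
ACTION_IF ("frog toad" STRING_MATCHES_REGEXP "frog$") = 0 BEGIN
  PRINT "frog$ matched"
END ELSE BEGIN
  PRINT "frog$ did not match"
END
// Here the regexp spans the entire string, so this should match.
ACTION_IF ("frog toad" STRING_MATCHES_REGEXP "frog toad$") = 0 BEGIN
  PRINT "frog toad$ matched"
END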

I'll try to do better for the descriptions, including the description of STRING_COMPARE_CASE. "legacy syntax for STRING_EQUAL_CASE" indeed.

« Last Edit: December 13, 2023, 04:29:55 PM by Wisp »

DavidW
Re: Is STRING_MATCHES_REGEXP working as intended?
« Reply #5 on: December 13, 2023, 07:04:16 PM »
Is the notion of complete match so unclear? I most often use regexp matches in the ACTION_MATCH/PATCH_MATCH context; there it seems well defined. If I do

Code: [Select]
ACTION_MATCH "frog toad" WITH
" *" BEGIN
PRINT "matched"
END
DEFAULT
PRINT "not matched"
END

then I get back 'not matched' because, although ' *' matches a substring of 'frog toad', it doesn't match the whole string. I don't need "^ *$".

Wisp (Moderator)
Re: Is STRING_MATCHES_REGEXP working as intended?
« Reply #6 on: December 14, 2023, 12:18:24 PM »
ACTION/PATCH_MATCH uses an additional condition: the regexp must match && the match length must equal the string length. Possibly I should document this peculiar behaviour as well.
« Last Edit: December 14, 2023, 12:20:45 PM by Wisp »
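
To illustrate the extra length condition, here is a sketch that reuses the ACTION_MATCH example from earlier in the thread with a regexp that spans the entire string. If the condition works as described (the regexp matches and the match length equals the string length), this should print 'matched':

Code: [Select]
ACTION_MATCH "frog toad" WITH
// Covers "frog", one or more spaces, then "toad": the full string.
"frog +toad" BEGIN
PRINT "matched"
END
DEFAULT
PRINT "not matched"
END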
