Partial regexp matching

Partial regexp matching

Isiah Meadows-2
I've been working on a test framework, and I'd love to implement
support for matching tests via regexp-based selectors. It's basically
impossible without the ability to execute a regular expression and
test if it matched positively, negatively, or incompletely.

- If the regexp does not have an end marker, this, of course, can't
generate a *negative* match, only positive/incomplete ones.
- If the regexp *does* have an end marker, this is where I actually
need native support. (This is especially true if group references get
involved.)

Any chance this could get added?
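
For illustration only, the three outcomes can be sketched in plain JavaScript for a small fragment of regular expressions (literals, concatenation, alternation, and `*`), using Brzozowski derivatives rather than the built-in RegExp engine. All of the names here (`lit`, `seq`, `alt`, `star`, `partialMatch`) are hypothetical helpers, the pattern is assumed to be anchored at both ends, and group references are not handled at all.

```js
// Hypothetical mini regex AST; not the ECMAScript RegExp engine.
const EMPTY = { t: "empty" }; // matches nothing
const EPS = { t: "eps" };     // matches only the empty string
const chr = (c) => ({ t: "chr", c });

// Smart constructors keep "matches nothing" collapsed to the single EMPTY node.
const seq = (a, b) =>
  a === EMPTY || b === EMPTY ? EMPTY : a === EPS ? b : b === EPS ? a : { t: "seq", a, b };
const alt = (a, b) => (a === EMPTY ? b : b === EMPTY ? a : { t: "alt", a, b });
const star = (a) => (a === EMPTY || a === EPS ? EPS : { t: "star", a });
const lit = (s) => [...s].map(chr).reduce(seq, EPS);

// Does the pattern accept the empty string?
function nullable(r) {
  switch (r.t) {
    case "eps":
    case "star":
      return true;
    case "seq":
      return nullable(r.a) && nullable(r.b);
    case "alt":
      return nullable(r.a) || nullable(r.b);
    default:
      return false;
  }
}

// Derivative: the pattern that remains after consuming character c.
function deriv(r, c) {
  switch (r.t) {
    case "chr":
      return r.c === c ? EPS : EMPTY;
    case "seq":
      return alt(seq(deriv(r.a, c), r.b), nullable(r.a) ? deriv(r.b, c) : EMPTY);
    case "alt":
      return alt(deriv(r.a, c), deriv(r.b, c));
    case "star":
      return seq(deriv(r.a, c), r);
    default:
      return EMPTY;
  }
}

// "yes": matches now; "maybe": could match with more input; "no": never will.
function partialMatch(r, input) {
  for (const c of input) {
    r = deriv(r, c);
    if (r === EMPTY) return "no";
  }
  return nullable(r) ? "yes" : "maybe";
}

// Roughly /^foo( bar)*$/ in this toy notation.
const pattern = seq(lit("foo"), star(lit(" bar")));
console.log(partialMatch(pattern, "fo"));      // "maybe"
console.log(partialMatch(pattern, "foo bar")); // "yes"
console.log(partialMatch(pattern, "fox"));     // "no"
```

This only works because the fragment is regular; backreferences fall outside it, so the sketch does not cover that case.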

-----

Isiah Meadows
[hidden email]

Looking for web consulting? Or a new website?
Send me an email and we can get started.
www.isiahmeadows.com
_______________________________________________
es-discuss mailing list
[hidden email]
https://mail.mozilla.org/listinfo/es-discuss

Re: Partial regexp matching

Isiah Meadows-2
Here are a few examples:

- C++ via Boost:
http://www.boost.org/doc/libs/1_66_0/libs/regex/doc/html/boost_regex/partial_matches.html
- Java via `java.util.regex.Matcher::hitEnd`:
https://docs.oracle.com/javase/9/docs/api/java/util/regex/Matcher.html#hitEnd--
- Python via `regex` (enabled via keyword arg):
https://pypi.python.org/pypi/regex

Any of these would suffice to solve my issue.

Also, I'm not the first to request this:
https://esdiscuss.org/topic/partial-matching-a-string-against-a-regex



On Thu, Feb 15, 2018 at 10:04 AM, Peter Jaszkowiak <[hidden email]> wrote:

> Do _any_ languages have support for this? It doesn't sound that useful, and
> I have no idea what you mean by "matching tests via regexp-based selectors".
> Can you provide an example?
>
> On Feb 15, 2018 06:12, "Isiah Meadows" <[hidden email]> wrote:
>
> I've been working on a test framework, and I'd love to implement
> support for matching tests via regexp-based selectors. It's basically
> impossible without the ability to execute a regular expression and
> test if it matched positively, negatively, or incompletely.
>

Re: Partial regexp matching

Mike Samuel
In reply to this post by Isiah Meadows-2
So you want an API that can indicate "might match if there were more input"?

Is it ok if it is conservative?  I.e. it won't incorrectly say "definitely wouldn't match given more input" but could tolerate errors the other way?

For example, /^food(?!)/ would have to say no for "foop" but we might tolerate a maybe for "foo".
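
To make that example concrete: `(?!)` is an empty negative lookahead, which always fails, so `/^food(?!)/` can never match anything. The sketch below first checks that, then shows one possible conservative approximation. `conservativeVerdict` is a hypothetical helper that only compares the input against the literal prefix `food` and knows nothing about the lookahead, so it answers "maybe" in cases where an exact oracle would say "no".

```js
const re = /^food(?!)/;

// The empty negative lookahead always fails, so no input matches.
console.log(re.test("food"));   // false
console.log(re.test("foodie")); // false

// Hypothetical conservative check: only "no" is guaranteed to be correct.
function conservativeVerdict(literalPrefix, input) {
  const n = Math.min(literalPrefix.length, input.length);
  return literalPrefix.slice(0, n) === input.slice(0, n) ? "maybe" : "no";
}

console.log(conservativeVerdict("food", "foop")); // "no"
console.log(conservativeVerdict("food", "foo"));  // "maybe"
```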



Re: Partial regexp matching

Isiah Meadows-2
Yes, and I specifically want the conservative variant - I'd want a
"maybe" for "foo" in that example.

(For context, my test framework determines *while running a test*
whether that test has children, and checks whether to allocate them
when defining them. For me, I only need "yes/maybe" and "no", but
splitting "yes" and "maybe" could be beneficial to others.)
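
A rough sketch of how such a verdict could drive the pruning, under simplifying assumptions: the test tree is static here (in the framework described above, children are only discovered while running), test names are joined with a space, and the selector is a plain string standing in for a regexp because JavaScript has no native partial matching. `partialVerdict`, `collect`, and the sample `suite` are all hypothetical.

```js
// Stand-in for a native partial matcher; the selector is a string, not a regexp.
function partialVerdict(selector, path) {
  if (path.startsWith(selector)) return "yes";
  if (selector.startsWith(path)) return "maybe";
  return "no";
}

// Walk the tree, skipping any subtree whose path can no longer match.
function collect(node, selector, prefix = "", out = []) {
  const path = prefix === "" ? node.name : `${prefix} ${node.name}`;
  const verdict = partialVerdict(selector, path);
  if (verdict === "no") return out; // prune: no descendant can match
  if (node.children.length === 0) {
    if (verdict === "yes") out.push(path);
    return out;
  }
  for (const child of node.children) collect(child, selector, path, out);
  return out;
}

const suite = {
  name: "parser",
  children: [
    {
      name: "numbers",
      children: [
        { name: "integers", children: [] },
        { name: "floats", children: [] },
      ],
    },
    { name: "strings", children: [{ name: "escapes", children: [] }] },
  ],
};

console.log(collect(suite, "parser numbers"));
// [ 'parser numbers integers', 'parser numbers floats' ]
```

Only "no" causes pruning, which is why merging "yes" and "maybe" is enough for this use.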

On Sun, Feb 18, 2018 at 4:19 PM, Mike Samuel <[hidden email]> wrote:

> So you want an API that can indicate "might match if there were more input"?
>
> Is it ok if it is conservative?  I.e. it won't incorrectly say "definitely
> wouldn't match given more input" but could tolerate errors the other way?
>
> For example, /^food(?!)/ would have to say no for "foop" but we might
> tolerate a maybe for "foo".
>

Re: Partial regexp matching

Mike Samuel


On Mon, Feb 19, 2018 at 11:43 AM, Isiah Meadows <[hidden email]> wrote:
> Yes, and I specifically want the conservative variant - I'd want a
> "maybe" for "foo" in that example.
>
> (For context, my test framework determines *while running a test*
> whether that test has children, and checks whether to allocate them
> when defining them. For me, I only need "yes/maybe" and "no", but
> splitting "yes" and "maybe" could be beneficial to others.)

Ok, so you're trying to decide whether to prune a graph search based
on a regexp that classifies paths through the graph.

We can't base a distinction between yes and maybe on whether a
zero-width $ assertion is triggered if there are paths to completion
that do not pass through that assertion.


const re = /^foo($|d).?/

// "foo" matches via the $ assertion
console.log(
  re.exec("foo")[0] === 'foo')
// A definite "yes" would be inappropriate here, but "yes|maybe" would be
// appropriate, because more input can still match:

// "food" matches via the d branch
console.log(
  re.exec("food")[0] === 'food')
// and "yes|maybe" would be appropriate here too, since one more
// character still extends the match:

console.log(
  re.exec("foods")[0] === 'foods')


So IIUC, the yes/maybe distinction could be based on a bit that is set on success of a $ assertion and erased on exit from any of ( ...|... , ...? , ...* , ...{0,...} , (?!...) ).
That only works when we know that the start of the match is stable though because

const re = /foo$/

console.log(
  re.test('foo'))
console.log(
  re.test('foofoo'))
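
Both calls above print true, but the matches start at different offsets, which is what "the start of the match is not stable" means here. A small check using only standard `RegExp.prototype.exec`:

```js
const re = /foo$/;

const first = re.exec("foo");
const second = re.exec("foofoo");

// Same pattern, same matched text, different starting offsets.
console.log(first[0], first.index);   // foo 0
console.log(second[0], second.index); // foo 3
```

So a $ assertion that succeeded for the match starting at offset 0 says nothing about the match that wins once more input arrives.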


It would hold for the common case /^...$/, but in that case you already know the answer, and can test for it at runtime by checking that myRegExp.source matches a meta-pattern like /^\^([^\\]|\\[\s\S])*\$$/.
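
A minimal sketch of that source check, assuming the character class in the meta-pattern is meant to be `[\s\S]` (any character after a backslash). `looksFullyAnchored` is a hypothetical name, and the check is only a heuristic: it looks at the first and last tokens of the source and does not understand top-level alternation.

```js
// Source starts with ^ and ends with an unescaped $.
const FULLY_ANCHORED = /^\^([^\\]|\\[\s\S])*\$$/;

function looksFullyAnchored(re) {
  return FULLY_ANCHORED.test(re.source);
}

console.log(looksFullyAnchored(/^foo$/));  // true
console.log(looksFullyAnchored(/foo$/));   // false: no leading ^
console.log(looksFullyAnchored(/^foo\$/)); // false: the final $ is escaped
console.log(looksFullyAnchored(/^a|b$/));  // true, although each anchor covers only one branch
```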






Re: Partial regexp matching

Isiah Meadows-2
Inline.


On Mon, Feb 19, 2018 at 12:39 PM, Mike Samuel <[hidden email]> wrote:

> Ok, so you're trying to decide whether to prune a graph search based
> on a regexp that classifies paths through the graph.

That would be the correct understanding. (I typically avoid CS jargon
since I never got the formal education and I rarely converse with
people well-educated in it, so apologies if my not using the technical
term made things any harder.)

>
> We can't base a distinction between yes and maybe on whether a
> zero-width $ assertion is triggered if there are paths to completion
> that do not pass through that assertion.
>
>
> const re = /^foo($|d).?/
>
> // This variant uses the $ assertion
> console.log(
>   re.exec("foo")[0] === 'foo')
> // yes would be inappropriate, but yes|maybe would be because
>
> // This variant uses the d
> console.log(
>   re.exec("food")[0] === 'food')
> // and yes|maybe would be appropriate here since
>
> console.log(
>   re.exec("foods")[0] === 'foods')
>

This particular scenario would not matter to me directly because all I
need is a "could this match now or potentially later". The optional
end would be fine, since I'd have the invariant that when I check each
child, I'll be adding a space along with the next test's name anyways
(and thus won't have a `d` to worry about).

As for whether it should consider it "ended", I think that's something
that could probably be spec'd out in a proposal repo, and I doubt
that'd be a blocker for stage 1 (that's typically a stage 2 concern).


Re: Partial regexp matching

Mike Samuel


On Mon, Feb 19, 2018 at 1:15 PM, Isiah Meadows <[hidden email]> wrote:

> This particular scenario would not matter to me directly because all I
> need is a "could this match now or potentially later". The optional
> end would be fine, since I'd have the invariant that when I check each
> child, I'll be adding a space along with the next test's name anyways
> (and thus won't have a `d` to worry about).
>
> As for whether it should consider it "ended", I think that's something
> that could probably be spec'd out in a proposal repo, and I doubt
> that'd be a blocker for stage 1 (that's typically a stage 2 concern).

Fwiw, it sounds like a fine idea to me.


Re: Partial regexp matching

kai zhu
In reply to this post by Isiah Meadows-2
i have something that, although not about partial-regexp-matching, could address the bigger-picture UX problem you have in testing.  i do frequent browser integration-tests with test-coverage, and my standard-operating-procedure for pre-commits is to:

[1] run all tests with no url-based selector

[2] if issues arise, then re-run problematic tests with a url-search-parameter for further debugging.  here’s a comma-separated example to select 2 problematic tests:
http://...?modeTestCase=_testCase_testRunDefault_failure,testCase_ajax_timeout)


[3] and here’s real-world source-code for implementing the ?modeTestCase=... test-selector

```js
// read modeTestCase search-param from browser-url
switch (local.modeJs) {
case 'browser':
    location.search.replace(
        (/\b(NODE_ENV|mode[A-Z]\w+|timeExit|timeoutDefault)=([^&#]+)/g),
        function (match0, key, value) {
            match0 = decodeURIComponent(value);
            local[key] = local.env[key] = match0;
            // try to JSON.parse the string
            local.tryCatchOnError(function () {
                local[key] = JSON.parse(match0);
            }, local.nop);
        }
    );
    break;
...

// filter testCases with modeTestCase
Object.keys(options).forEach(function (key) {
    // add testCase options[key] to testPlatform.testCaseList
    if (typeof options[key] === 'function' && (local.modeTestCase
            ? local.modeTestCase.split(',').indexOf(key) >= 0
            : key.indexOf('testCase_') === 0)) {
        testPlatform.testCaseList.push({
            name: key,
            status: 'pending',
            onTestCase: options[key]
        });
    }
});

```
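
The filtering step in that excerpt can be tried on its own with a small standalone sketch. `selectTestCases` and the sample `options` object are hypothetical and only mirror the condition shown above: with a comma-separated `modeTestCase`, the listed function keys are selected; without one, every function whose key starts with `testCase_` is selected.

```js
// Hypothetical standalone version of the selection condition.
function selectTestCases(options, modeTestCase) {
  const selected = modeTestCase ? modeTestCase.split(",") : null;
  return Object.keys(options).filter(
    (key) =>
      typeof options[key] === "function" &&
      (selected ? selected.includes(key) : key.startsWith("testCase_"))
  );
}

// Sample data; the names are made up for illustration.
const options = {
  testCase_ajax_timeout: function () {},
  testCase_build_app: function () {},
  helperNotATest: function () {},
  timeoutDefault: 30000,
};

// No selector: every testCase_* function runs.
console.log(selectTestCases(options, ""));
// Explicit selector: only the listed keys run.
console.log(selectTestCases(options, "testCase_ajax_timeout"));
```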

