Gen2-CORE Filtering chapter update. #320

Merged
merged 5 commits into from Jan 7, 2020

Contributor

@UlfBj UlfBj commented Nov 22, 2019

Browser version of Gen2-CORE document including PR:
https://raw.githack.com/UlfBj/automotive/gh-pages/spec/Gen2_Core.html

@UlfBj UlfBj added the VISS v2 Generation Two of the spec label Nov 22, 2019
@@ -178,16 +174,17 @@ <h2>Data model</h2>
<section data-dfn-for="address">
<h2>Addressing</h2>
<p>Addressing of elements is done using URIs as defined in [[RFC3986]].</p>
<blockquote><a>scheme</a>:<a>authority</a>/<a>path</a></blockquote>
<blockquote><a>scheme</a>:<a>authority</a>/<a>path</a>?<a>query</a></blockquote>
Contributor

I would like to distinguish clearly between addressing and filtering.
BTW, in the past we found that describing the filtering within the query of a URI is complicated.


This comment seems to match my findings - see details below. For most of the described filters I see no problem, but for the path filter I also feel that it deals with "addressing".

where<br>
- the question mark is the delimiter between the request path and the query.<br>
- reserved-word must have the dollar-sign as the first character ($). The availabe reserved words are described in the chapters below.<br>
- logical-operator is one of either the equal sign (=), the larger than sign (>), or the smaller than sign (<).<br>
Contributor

comparison-operator


+1

@peterMelco
Contributor

I believe that we could still use the equal operator as is, and not invent a new scheme for doing this. Consider adding operators separate from the equal sign and for example use brackets to enclose the operators on the key name.

[EQ],[GT],[LT]
http GET:Vehicle/Cabin/Door?$path=*/*/IsOpen
vehicle/Cabin/Door/Row1/Right/Shade/Position?$range[GT]=15 AND $change[GT]=5

This way we can add operators if needed and be compatible with JSON query string libraries. It would also, in my opinion, be more in line with what others have done and thus simplify client development. The downside would be server-side parsing and grouping of filters... maybe.
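
For illustration only, server-side parsing of the bracketed-operator keys could look roughly like the sketch below. It assumes that filters are joined with "&" rather than "AND", that the only operators are EQ, GT and LT, and that clients percent-encode the brackets; the function name `parse_filters` is hypothetical and not part of the PR.

```python
import re
from urllib.parse import parse_qsl

# Assumed key shape "$name[OP]"; the operator set is taken from the proposal above.
KEY_RE = re.compile(r"^\$(?P<name>\w+)\[(?P<op>EQ|GT|LT)\]$")

def parse_filters(query):
    """Split a query string into (name, operator, value) triples."""
    filters = []
    # parse_qsl splits on "&" and percent-decodes keys and values.
    for key, value in parse_qsl(query, keep_blank_values=True):
        match = KEY_RE.match(key)
        if match is None:
            raise ValueError(f"unsupported filter key: {key!r}")
        filters.append((match.group("name"), match.group("op"), value))
    return filters

# Brackets sent percent-encoded (%5B and %5D) by the client.
print(parse_filters("$range%5BGT%5D=15&$change%5BGT%5D=5"))
```

Grouping the resulting triples per filter name would be a separate step on the server.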

@UlfBj
Contributor Author

UlfBj commented Dec 4, 2019

According to RFC3986, the allowed characters in a query component are as shown below.
Which means that "[" and "]" are not available.
I think one should avoid most sub-delims, I found e.g. that the TCP/IP lib on my computers did not accept "&" (although it did accept "$").
Any of the "-" / "." / "_" / "~" characters would probably be the safest bet.
BR
Ulf

query = *( pchar / "/" / "?" )
pchar = unreserved / pct-encoded / sub-delims / ":" / "@"

unreserved = ALPHA / DIGIT / "-" / "." / "_" / "~"

pct-encoded = "%" HEXDIG HEXDIG

sub-delims = "!" / "$" / "&" / "'" / "(" / ")"
/ "*" / "+" / "," / ";" / "="
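
As a rough check of a candidate filter syntax against the grammar above, the following sketch lists the characters that would have to be percent-encoded in a query component. The helper name `disallowed_in_query` is made up for this example; the character sets are transcribed from the ABNF quoted above.

```python
import re
import string

UNRESERVED = set(string.ascii_letters + string.digits + "-._~")
SUB_DELIMS = set("!$&'()*+,;=")
# query = *( pchar / "/" / "?" ), with pchar adding ":" and "@"
QUERY_LITERALS = UNRESERVED | SUB_DELIMS | set(":@/?")
PCT_ENCODED = re.compile(r"%[0-9A-Fa-f]{2}")

def disallowed_in_query(query):
    """Return the sorted characters of `query` that may not appear unencoded."""
    # Valid percent-encoded triplets are allowed, so drop them before checking.
    stripped = PCT_ENCODED.sub("", query)
    return sorted({ch for ch in stripped if ch not in QUERY_LITERALS})

print(disallowed_in_query("$range[GT]=15"))     # brackets are reported
print(disallowed_in_query("$path=*/*/isOpen"))  # nothing is reported
```

This only checks the grammar; it says nothing about how individual libraries treat the allowed characters.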

@gunnarx

gunnarx commented Dec 5, 2019

I think one should avoid most sub-delims

Just to clarify why you say this. The reason was that some libraries are not following standard, or is there also another reason?

As long as it's an acceptable use we can (should) also weigh in usability and what people might find most intuitive. But as I said, I believe some fundamental group inside W3C could "rule" on this point if needed. Perhaps it's not needed since we seem to have found that it is OK according to specification?

Any of the "-" / "." / "_" / "~" characters would probably be the safest bet.

Did any of the libraries you found not allow * ?

According to RFC3986, the allowed characters in a query component are as shown below.

Just to clarify, my proposal was to use * in the "URL part", not the query component. But if I read your input correctly then the sub-delim * is also OK in the query?

@gunnarx

gunnarx commented Dec 5, 2019

Web protocols and implementations are a bit messy, but I think the most common approach is that everyone simply accepts the work required to keep to the standard. The cost lies in the implementation complexity, which is not visible to the user.

What I mean is, one proposal is the original commit you had using "<" and ">". The standards simply require the client (whether this is a web browser or something else) to encode them correctly in the request.

Most web browsers will do this automatically.
For example, typing something like this in the URL bar:

https://www.google.com/<

will return:

The requested URL /%3C was not found on this server

So %3C was actually requested.

... so I think Gen 2 clients could do this encoding if we specify that they should. As you know, programming libraries often have these functions readily available. Then what is shown to the user tends to be the non-encoded version. If we do that, then I think <>= makes it the most readable and understandable for the user?

This is how most web protocols and applications work, as far as I can tell. So at the moment I would like to propose it rather than the EQ / GT / LT syntax.
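
A minimal sketch of the client-side encoding described here, using Python's standard `urllib.parse`. The helper name `encode_query` and the choice to leave all RFC 3986 sub-delims unencoded are assumptions made for this example, not something the PR specifies.

```python
from urllib.parse import quote, unquote

def encode_query(raw):
    """Percent-encode everything RFC 3986 does not allow literally in a query."""
    # Letters, digits and "-._~" are never encoded by quote(); the `safe`
    # argument additionally keeps the sub-delims and ":@/?" as they are.
    return quote(raw, safe="!$&'()*+,;=:@/?")

encoded = encode_query("$data<15")
print(encoded)           # $data%3C15
print(unquote(encoded))  # $data<15
```

The server would apply the reverse step (`unquote` here) before interpreting the comparison operator.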

By the way, have we missed less-or-equal and greater-or-equal?

@gunnarx gunnarx mentioned this pull request Dec 5, 2019
@UlfBj
Contributor Author

UlfBj commented Dec 5, 2019

The reason was that some libraries are not following standard, or is there also another reason?

No.

Did any of the libraries you found not allow * ?

I have not done any systematic investigation, my comments come from the result I see when using the libraries on my own machines.

"*" works in my system, also "=", but not "&" (which, as you mention, gets recoded into a %hex representation, so it is not impossible to use).

But if I read your input correctly then the sub-delim * is also OK in the query?

Yes.

@UlfBj
Contributor Author

UlfBj commented Dec 5, 2019

Then what is shown to the user tends to be the non-encoded version.

This might be a bit too optimistic, it has not been the case in my logs etc anyway.

If we do that, then I think <>= makes it the most readable and understandable for the user?

Assuming that "all" libraries recode to %hex rather than throwing the character away, it is a possibility.

By the way, have we missed less-or-equal and greater-or-equal?

I deliberately omitted them for simplicity, by tweaking the comparison value you can achieve the same.

<h3>Subscribe filtering</h2>
<p>The available subscribe filtering options are presented in the following chapters, and is only applicable to subscribe requests.
</p>
<section data-dfn-for="interval-filter">

I mentioned in the call that I did not like modifying the subscribe interval with a filter, so I add it as a comment here:

  • The filter function should, as I feel it, primarily be filtering/querying the addressed node paths, and for that purpose it is OK to use it for subscribe and it will be identical for GET and subscribe (as proposed).

I don't recall if you also showed a subscribe request using REST in your demo?
If you did, I think it must be clarified*** in the specification and not only in the implementation demo how that shall behave (is it long polling on a GET request?).
If you did not do it for REST, then I think we should be more free to define the websocket protocol in a way that suits it because it is no longer dealing with equivalence to the REST/HTTP version (and therefore specify interval in the JSON request instead). Finally, in my opinion VISSv1 backward compatibility should be kept if there is no strong reason to change.

***Actually my first preference is against subscription on REST/HTTP.

</section>
<section data-dfn-for="search-filter">
<h3>Search filter</h2>
<p>If the path in a read request does not terminate in a leaf node, then the response will contain values from all leaf nodes in the subtree given by the path. The search filter makes it possible to taylor a subset of this response. The search query has the structure<br>

s/taylor/tailor/

</section>
<section data-dfn-for="search-filter">
<h3>Search filter</h2>
<p>If the path in a read request does not terminate in a leaf node, then the response will contain values from all leaf nodes in the subtree given by the path. The search filter makes it possible to taylor a subset of this response. The search query has the structure<br>

If the path in a read request does not terminate in a leaf node, then the response will contain values from all leaf nodes in the subtree given by the path.

Is this the right behavior? Alternatively such a request, if made by accident or deliberately, could return either information about the branch node only (as appropriate depending on protocol) or perhaps a failure. From a usage perspective I can see it just as easy to make an explicit request for multiple matches (using a wildcard) and from implementation/spec perspective it seems better to require a multi-node match to be explicit. What do you think?

Contributor Author

I think you are right that there could be a risk of making unintentional requests for multiple matches, which would be reduced by requiring wildcard usage.
So a request like
GET /Vehicle/Cabin
would then return an error code, while
GET /Vehicle/Cabin?$pathEQ*
would return data from all leaf nodes on that subtree (please neglect the EQ terminology here).
That would be Ok with me.
For a service discovery request, like
GET /Vehicle/Cabin?$spec=0
I do not see the need to also add a wildcard, as the query part cannot be added unintentionally.

where<br>
- $data is the reserved word for data value filtering.<br>
- logical-operator is one of either the equal sign (=), the larger than sign (>), or the smaller than sign (<).<br>
- value is any of the supported number data types, or the boolean data type.<br><br>

"value is any of the supported number data types", this can be written more strictly. The value is a numerical value? I first read it as a filter on data type i.e. it is the name of any of the supported data types. Then I had to check back that it was a value filter. So it should be specified that it is a value and how such values are encoded in standard URLs (if required). What is the standard encoding of a float for example? If URLs don't define this an easy solution would be to say that it shall follow for example Javascript syntax rules.

Contributor Author

Agree. Updated to Javascript syntax rules.

<dfn>$interval = value</dfn><br>
where<br>
- $interval is the reserved word for interval filtering.<br>
- the equal sign (=) is the only allowed logical operator for search filtering.<br>

s/search filtering/interval filtering/ ?

Contributor Author

Updated.

- $path is the reserved word for search filtering.<br>
- the equal sign (=) is the only allowed logical operator for search filtering.<br>
- search-expression is a path expression that may contain the wildcard character (*) for representation of an unknown node name.<br><br>
The search-expression is relative to the root node given by the path component in the request. An example could be "*/*/isOpen", which, preceeded with a slash, and concatenated with the request root-path "Vehicle/Cabin/Door" would generate the absolute search expression "Vehicle/Cabin/Door/*/*/isOpen", in which case the response would contain all values from the isOpen nodes in that subtree, but not from the other possible leaf nodes in it.<br><br>

You describe the specification as a "filter" and my gut tells me the specification should then apply a "filter" on what was specified in the URL, rather than as it looks here, add missing parts of the URL. However, I can see how this matches your idea that addressing a branch node means getting all of the subtree returned (which I also had a different opinion on).

So another interpretation (or another way to specify) is to provide a path, then to apply the filter on top of that path. This would here address
Vehicle/Cabin/Door but Cabin and Door would be overlaid by the matching filter */* and match Vehicle/*/*/isOpen. Now I'm not proposing that, because it is confusing, but I also think splitting the path definition into the URL part and the HTTP Query String is unfortunate.

So... in summary, when it comes to an advanced query language, yes, it likely must be placed into the HTTP Query String, but for simple wildcards, could we investigate if the URL could handle them (as it was also done for VISS before).

There seem to be different opinions on the internet about whether the asterisk * is allowed in a URL, but the most recent ones I have found say that it is allowed, and that some web sites use it already. W3C is an authority on web standards... so I'd really appreciate it if that question is raised for discussion in other groups (@tguild)

If * is acceptable, I much prefer the more direct approach of:

GET https://<server>/Vehicle/Cabin/*/*/isOpen

Other more advanced queries might still need the query language (and the query string).

Contributor Author

I can agree that IF wildcard is universally acceptable directly in the path component of a URL, then it would be preferable.
Until we have resolved that I advocate for keeping it as is in the PR, as that is a valid solution.
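
To make the behaviour discussed in this thread concrete, here is a small sketch of how the relative search expression could be joined to the request path and matched against leaf nodes. It assumes that "*" stands for exactly one node name; the function names and the list of leaf paths are invented for the example and do not come from the spec.

```python
import re

def absolute_expression(root_path, search_expression):
    """Join the request path and the relative search expression with a slash."""
    return root_path.rstrip("/") + "/" + search_expression.lstrip("/")

def matches(expression, node_path):
    """True if node_path matches expression, with "*" as one whole node name."""
    # A wildcard segment must not cross a "/", so it becomes [^/]+.
    pattern = "/".join(
        "[^/]+" if part == "*" else re.escape(part)
        for part in expression.split("/")
    )
    return re.fullmatch(pattern, node_path) is not None

# Hypothetical leaf nodes below Vehicle/Cabin/Door.
LEAF_PATHS = [
    "Vehicle/Cabin/Door/Row1/Left/isOpen",
    "Vehicle/Cabin/Door/Row1/Right/isOpen",
    "Vehicle/Cabin/Door/Row1/Right/isLocked",
    "Vehicle/Cabin/Door/Row1/Right/Shade/Position",
]

expression = absolute_expression("Vehicle/Cabin/Door", "*/*/isOpen")
hits = [path for path in LEAF_PATHS if matches(expression, path)]
print(expression)
print(hits)
```

The same `matches` helper would work unchanged if the wildcard were allowed directly in the path component, since only the way the expression is assembled differs.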

The data value query has the structure<br>
<dfn>$data logical-operator value</dfn><br>
where<br>
- $data is the reserved word for data value filtering.<br>

All of the keywords seem special because they use $word. Is it not enough to just use word=something ? I believe that is a more typical usage of HTTP Query String in other web protocols?

Contributor Author

The $word usage was inspired by VIWI. We can change that later if that is preferred by the WG.

<h3>Data value filter</h2>
<p>If a request, typically when it is addressing a subtree, is only interested in response data with a specific value, then a data value filter can be used.
The data value query has the structure<br>
<dfn>$data logical-operator value</dfn><br>

My previous comment seemed to be in the wrong place.
I was proposing: What do you think about using the keyword name "value" for the value filter, instead of the name "data"?

Contributor Author

The (minor) problem I see with that is that value is a common placeholder name for what is on the right side of these expressions.
E.g.
$range comparison-operator value
For this it would then be
$value comparison-operator value
If more people back it, I will change it :)

@tguild tguild merged commit 63ebd2f into w3c:gh-pages Jan 7, 2020
@tguild
Member

tguild commented Jan 7, 2020

agreed to adopt before December break and additional aspects to be raised as separate issues
