#2557 WispService fails on unescaped query portion of URI

jhughes Mon 29 Aug 2016

I set up a web server using the WispService hello example as a starting point. I modified the code to pull out the query portion of a URI, typed some random text into it, and was able to decode it and print it out with no problem. Now if I put in the real data, which has a lot of special characters, the onGet method is never called. No error is printed to the console either, so I'm not sure where it's failing in WispService and/or WebMod.

I've been able to prove the problem is the special characters: I took the query portion of the URI, put it through a URI encoder, pasted it back in, and got the exact decoded results printed to the console without changing anything in the code.

So I guess my question boils down to:

  1. Is there a way to pre-process the get request before it's passed to the onGet method?
  2. If 1 is not possible, is there a way to find and handle the error that is being thrown, which would essentially allow me to redirect the URI back into the Wisp service with the fixed query string?

brian Mon 29 Aug 2016

It largely depends on what exactly you are doing. You didn't really specify what the URI is and exactly how you are making the request to the server. If you have a URI that can be decoded via sys::Uri.decode then it should work. Otherwise it's going to assume that you have an invalid HTTP request and will immediately close the socket, since it's a malformed request where it can't even parse the first section.

If you really want to see what is going on, you can add some echo statements into WispActor in the parseReq method.

jhughes Mon 29 Aug 2016

The URI is just the ip:port/?<query string>. Once the query string has special characters, the onGet method is never reached. I am just using a browser to paste in the full URL to resolve.

Could you elaborate on where/how to access the parseReq method? The only modification I've made to the web-hello.fan example at this point is to extract the query portion (req.modRel().query()) of the URI, so if you could use that base code to explain how I would go about accessing WispActor and parseReq, that would be extremely helpful.

SlimerDude Tue 30 Aug 2016

> add some echo statements into WispActor in the parseReq method

> where/how to access the parseReq method?

Brian is essentially talking about re-compiling the web pod from source. It's probably something you don't want to get into right now.

I can't replicate the problem you're having. Here's a sample test program I just knocked up:

using web
using wisp
using concurrent

class Example {    
    Void main(Str[] args) {
        WispService { it.root=ExampleWebMod(); it.httpPort=8069 }.install.start
        Actor.sleep(Duration.maxVal)
    }
}

const class ExampleWebMod : WebMod {
    override Void onGet() {
        echo(req.modRel)
        res.headers["Content-Type"] = "text/plain"
        res.out.writeChars("Query Str: ${req.modRel.queryStr}").flush.close
    }
}

And as you can see from the screenshot, when I enter http://localhost:8069/?<query string> into my browser (Chrome v52) - the onGet() method prints out:

Query Str: <query string>

[Screenshot: query string with special characters]

jhughes Tue 30 Aug 2016

It doesn't seem to fail on every special character; a space seems to work. If you were to add brackets into the query portion, you would be able to see the error.

SlimerDude Tue 30 Aug 2016

> The URI is just the ip:port/?<query string>

As you can see, I tried the example URL you gave... complete with angle brackets, and it works okay.

Using the sample code I provided, can you give a complete URL where you do not see expected results?

jhughes Tue 30 Aug 2016

The example code you provided exhibits the same behavior. Here is a full URL that will recreate this issue.

http://localhost:8069/?json={"objects":[{"object":{"name":"value"}}]}

SlimerDude Tue 30 Aug 2016

Ah, okay. That one concrete example makes everything clear.

Your problem is that JSON isn't a legal query string. Not by Fantom standards, and not by Internet standards - see RFC 3986.

As Brian mentioned, you can see what Fantom does by attempting to decode your string into a URI:

C:\>fansh
Fantom Shell v1.0.69 ('?' for help)

fansh> Uri.decode("""http://localhost:8069/?json={"objects":[{"object":{"name":"value"}}]}""")

sys::ParseErr: Invalid Uri: 'http://localhost:8069/?json={"objects":[{"object":{"name":"value"}}]}'
  : Invalid char in query at index 28
  fan.sys.Uri.decode (Uri.java:55)
  fan.sys.Uri.decode (Uri.java:45)

Arguably Fantom should accept the { & } chars in a query string (as they're not listed in RFC 3986), but the chars [ & ] are illegal general delimiters.

Going forward, the approach to take is to percent encode illegal characters. Uri.encode() will do this for you:

C:\>fansh
Fantom Shell v1.0.69 ('?' for help)

fansh> `http://localhost:8069/?json={"objects":[{"object":{"name":"value"}}]}`.encode

http://localhost:8069/?json=%7B%22objects%22:%5B%7B%22object%22:%7B%22name%22:%22value%22%7D%7D%5D%7D

That last URL, when pasted into a browser, will prompt ExampleWebMod to respond with:

Query Str: json={"objects":[{"object":{"name":"value"}}]}

SlimerDude Tue 30 Aug 2016

.

jhughes Tue 30 Aug 2016

I originally ran the JSON through an online URI encoder, which is what made me realize it wasn't accepting the special characters, and I was able to get results similar to what you show in the last URL.

I've done this type of testing on other web servers by simply pasting an unencoded URI into a browser (generally JSON) and had it accepted and encoded in the browser. I always assumed it was the browser handling this encoding, but it looks like it might have been the server itself fixing the URI?

Is there any way to validate a URI and attempt to encode it from within the WispService before attempting to process it? It would be helpful, since the alternative is to write the query string in a text editor, encode it via Fantom or some third-party encoder, and then send it to the WispService for processing.

SlimerDude Tue 30 Aug 2016

> browser handling this encoding but it looks like it might have been the server

I find it's generally a bit of both. Different browsers seem to escape different sets of characters, but all cover the usual suspects like spaces.

Other languages & frameworks often treat the URL as a plain string, which is why it's so easily passed down the line. But that just leaves it up to every man and his dog to attempt to decode the ugly mess that is RFC 3986 Web Encoding Hell.

Fantom generally does you a favour by having a pretty faithful, and impressive, URI class - which takes care of all the encoding / decoding for you.

> any way to validate a URI and attempt to encode it from within the WispService

No, I've not seen any hooks for this inside Wisp - be happy that WebMod.onGet() only receives clean URLs!

If you really need different behaviour then I think you'll have to re-compile your own wisp pod.

brian Tue 30 Aug 2016

That URI will be percent encoded by the browser and sent in the HTTP request as this:

?json={%22objects%22:[{%22object%22:{%22name%22:%22value%22}}]}

If you try to decode that you will see it fails:

fansh> Uri.decode("?json={%22objects%22:[{%22object%22:{%22name%22:%22value%22}}]}")
sys::ParseErr: Invalid Uri: '?json={%22objects%22:[{%22object%22:{%22name%22:%22value%22}}]}': Invalid char in query at index 6

It's a bug in Uri.decode because that is a legally encoded URI. It could be trivial to fix or quite involved. I will ask Matthew to take a look sometime in the next week or so.

But in general I think it's probably a lot safer to base64-encode complicated data, or better yet POST with a proper MIME type. There is going to be a limit to the number of chars accepted by a GET on the URI request line.

jhughes Tue 30 Aug 2016

Thanks for the info guys. Since I was only using the browser as a tool to do some quick testing with Wisp, it won't be too difficult to write up some test code to run against the WispService that can pre-process my requests. Might give me a good excuse to start messing around with the graphics packages.

jhughes Wed 31 Aug 2016

Ran into another issue which may be related to the RFC 3986 encoding. In my case, since I am passing a JSON string which has its own encoding, if an equals character exists in the JSON, it breaks the query portion's ability to handle key/value pairs.

Example:

echo(req.modRel())
Str:Str query := req.modRel().query()
keys := query.keys()
keys.each |Str k| 
{
	echo("key:"+ k)
	echo("val: " + query.get(k))
}

?json="{"requests":[{"request":"get","type":"user"},{"request":"get","paged":{},"filters":{"filters":[{"filter":"((TYPE=admin'))"}]},"sort":{},"type":"user"}]}"'

key:json="{"requests":[{"request":"get","type":"user"},{"request":"get","paged":{},"filters":{"filters":[{"filter":"((TYPE

val: admin'))"}]},"sort":{},"type":"user"}]}'

You can see that the full query is received, but when you look at the keys, it doesn't recognize json as a key. Instead it groups all the text before the second = as the key and makes the text after that the value.

SlimerDude Wed 31 Aug 2016

I don't think you're gonna find a way around that. Query strings don't recognise the difference between a = in JSON and = in a key / value pair.

If you need to communicate multiple key value pairs of JSON strings, then I think you're gonna have to either encode the JSON, or send the data in the request body. application/x-www-form-urlencoded data is probably the better way to go, ala HTML forms.
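
The "encode the JSON" option can be sketched as below. This is only an illustration, not code from the thread: the class name QueryEncodeExample and the sample JSON are made up, and it assumes the static Uri.encodeQuery and Uri.decodeQuery methods escape = inside a value the way their documentation suggests.

```fantom
class QueryEncodeExample
{
  Void main()
  {
    // Hypothetical JSON value containing an '=' that would otherwise
    // be mistaken for a key/value separator
    json := Str<|{"filter":"((TYPE=admin))"}|>

    // Encode the map into a query string; chars that are special
    // inside a query should come out percent encoded
    encoded := Uri.encodeQuery(["json": json])
    echo(encoded)

    // Decode it back into a Str:Str map and check the round trip
    decoded := Uri.decodeQuery(encoded)
    echo(decoded["json"] == json)
  }
}
```

The encoded string is what would go after the ? in the request URL, so that req.modRel.query on the server side sees a single json key.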

jhughes Wed 31 Aug 2016

I went ahead and moved the request data into the content of a POST to get around this. Just wanted to show results from my additional testing for help with any future updates that might be able to handle this type of situation.
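
The thread doesn't show the resulting code. A minimal sketch of what such a POST handler might look like, modelled on the ExampleWebMod sample earlier in the thread, is given here. The names PostExample and JsonPostMod are hypothetical, and it assumes the client always sends a request body.

```fantom
using util
using web
using wisp
using concurrent

class PostExample
{
  Void main()
  {
    WispService { it.root = JsonPostMod(); it.httpPort = 8069 }.install.start
    Actor.sleep(Duration.maxVal)
  }
}

const class JsonPostMod : WebMod
{
  override Void onPost()
  {
    // The JSON travels in the request body, so the URI escaping
    // rules for query strings never come into play
    body := req.in.readAllStr

    // Parse the body; throws if it isn't valid JSON
    json := JsonInStream(body.in).readJson

    res.headers["Content-Type"] = "text/plain"
    res.out.writeChars("Received: $json").flush.close
  }
}
```

With the service running, a request such as curl -H "Content-Type: application/json" -d '{"name":"value"}' http://localhost:8069/ should exercise the handler.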

brian Fri 16 Sep 2016

Ticket promoted to #2557 and assigned to brian

There is a bug regarding percent encoding being treated as a separator character.

?json={%22objects%22:[{%22object%22:{%22name%22:%22value%22}}]}

fansh> Uri.decode("?json={%22objects%22:[{%22object%22:{%22name%22:%22value%22}}]}")
sys::ParseErr: Invalid Uri: '?json={%22objects%22:[{%22object%22:{%22name%22:%22value%22}}]}': Invalid char in query at index 6

brian Fri 15 Sep 2017

Ticket cancelled

This is actually not a bug, but the correct behavior. The curly braces "{" and "}" are not legal URI chars according to RFC 3986. So they need to be percent encoded using "%7B" and "%7D" respectively.

fansh> Uri(Str<|?json={"foo":"bar"}|>).encode
?json=%7B%22foo%22:%22bar%22%7D
fansh> Uri.decode(Uri(Str<|?json={"foo":"bar"}|>).encode)
?json={"foo":"bar"}

SlimerDude Sat 16 Sep 2017

> The curly braces "{" and "}" are not legal URI chars according to RFC 3986

I can clearly see the square brackets "[" and "]" in the Reserved Characters section of RFC 3986 - but I can't see the curly brackets "{" and "}" mentioned anywhere!?

Can you point out where they're mentioned in the spec, for I feel I may have to re-answer this question at some point in the future.

brian Sat 16 Sep 2017

The grammar where pchar is the legal path chars:

pchar       = unreserved / pct-encoded / sub-delims / ":" / "@"

unreserved  = ALPHA / DIGIT / "-" / "." / "_" / "~"

reserved    = gen-delims / sub-delims

gen-delims  = ":" / "/" / "?" / "#" / "[" / "]" / "@"

sub-delims  = "!" / "$" / "&" / "'" / "(" / ")"
            / "*" / "+" / "," / ";" / "="

So basically anything outside of that list must be percent encoded, such as spaces, curly braces, etc.
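
As a rough illustration of that rule, the sketch below scans a query string for the first character that falls outside the grammar. It is not how sys::Uri is implemented: the class QueryCharCheck and its firstIllegal method are made up for this example, it also allows "/" and "?" (which RFC 3986 permits in a query in addition to pchar), and it doesn't check that a % is followed by two hex digits.

```fantom
class QueryCharCheck
{
  ** Return the index of the first char that may not appear raw
  ** in a query string, or null if every char is acceptable
  static Int? firstIllegal(Str query)
  {
    // unreserved punctuation + sub-delims + ":" "@" + "/" "?"
    allowed := "-._~" + "!\$&'()*+,;=" + ":@" + "/?"
    for (i := 0; i < query.size; ++i)
    {
      ch := query[i]
      if (ch == '%') continue   // simplification: hex digits not validated
      if (ch.isAlphaNum || allowed.containsChar(ch)) continue
      return i
    }
    return null
  }

  Void main()
  {
    echo(firstIllegal(Str<|json={"foo":"bar"}|>))         // raw JSON
    echo(firstIllegal("json=%7B%22foo%22:%22bar%22%7D"))  // percent encoded
  }
}
```

For the raw JSON the first offending character is the opening curly brace, in line with the "Invalid char in query" errors shown above. The percent-encoded form passes the check.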

SlimerDude Sat 16 Sep 2017

Cool - I geddit. Thanks.
