I have implemented a Spring WebFlux controller like this:
@GetMapping(path = "/test", produces = APPLICATION_JSON_VALUE)
public Mono<String> getData(ServerHttpRequest serverHttpRequest) {
String param = serverHttpRequest.getURI().getRawQuery();
return Mono.just("TEST");
}
My goal is to get the whole query string and do something with it. For instance, if the URL is http://localhost:8080/test?identifier=test|code, I want the string identifier=test|code so that I can pass it directly to another server.
However, when I send a URL containing a special character such as a pipe (|), I get 400 Bad Request.
All the answers I could find say that the client should encode special characters (for a pipe, that is %7C). However, I want to know whether there is any option to handle this on the server side.
My app uses Netty instead of Tomcat (not sure if this is helpful). I saw something about Tomcat's "relaxedQueryChars" setting, but I don't think it applies to Netty.
How can I handle this on server side?
Short answer: The validation of URL characters is done by java.net.URI, not by Netty, Spring Boot or Tomcat. This makes it very difficult, if not unrealistic, to bypass, especially as java.net.URI is declared final.
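Since the check is hard to bypass on the server, encoding on the caller's side remains the practical route. The following is only a minimal sketch, assuming you control the calling code; the class and method names (QueryEncodingSketch, build) are hypothetical. It relies on the multi-argument java.net.URI constructor, which percent-encodes illegal characters such as '|' instead of rejecting them.

```java
import java.net.URI;
import java.net.URISyntaxException;

// Hypothetical helper for illustration; not part of the original setup.
class QueryEncodingSketch {

    // Builds a URI from separate components. Unlike URI.create(String),
    // the multi-argument constructor quotes illegal query characters.
    static URI build(String host, int port, String path, String query) {
        try {
            return new URI("http", null, host, port, path, query, null);
        } catch (URISyntaxException e) {
            // Rethrown unchecked to keep the sketch short.
            throw new IllegalArgumentException(e);
        }
    }

    public static void main(String[] args) {
        URI uri = build("localhost", 8080, "/test", "identifier=test|code");
        System.out.println(uri);               // encoded form sent over the wire
        System.out.println(uri.getRawQuery()); // query as the server receives it
        System.out.println(uri.getQuery());    // decoded query
    }
}
```

On the server, getRawQuery() would then return the %7C form, which can be forwarded as-is or decoded before being passed on.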
Long answer:
I set the following packages to TRACE logging:
logging.level.org.netty=TRACE
logging.level.org.springframework=TRACE
On calling the service at http://localhost:8088/api/test?identifier=test|code with Postman's 'Encode URL automatically' option switched off, the following was written to the logs:
DEBUG o.s.h.s.r.ReactorHttpHandlerAdapter : Failed to get request URI: Illegal character in query at index 46: http://localhost:8088/api/test?identifier=test|code
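The same message can be reproduced without Spring or Netty on the classpath, which supports the point that java.net.URI is doing the rejecting. A minimal sketch, with a hypothetical class name (UriPipeRepro) and helper methods that exist only for this illustration:

```java
import java.net.URI;
import java.net.URISyntaxException;

// Hypothetical reproduction class; uses only the JDK.
class UriPipeRepro {

    // Returns the parser's error message, or null if the URL parses.
    static String tryParse(String url) {
        try {
            URI.create(url);
            return null;
        } catch (IllegalArgumentException e) {
            return e.getMessage();
        }
    }

    // Returns the index of the offending character, or -1 if the URL parses.
    static int errorIndex(String url) {
        try {
            new URI(url);
            return -1;
        } catch (URISyntaxException e) {
            return e.getIndex();
        }
    }

    public static void main(String[] args) {
        String raw = "http://localhost:8088/api/test?identifier=test|code";
        String encoded = "http://localhost:8088/api/test?identifier=test%7Ccode";
        System.out.println(tryParse(raw));     // same text as in the log line
        System.out.println(errorIndex(raw));   // position of the '|'
        System.out.println(tryParse(encoded)); // null: the encoded form parses
    }
}
```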
o.s.h.s.r.ReactorHttpHandlerAdapter is org.springframework.http.server.reactive.ReactorHttpHandlerAdapter. It calls reactor.netty.http.HttpOperations<INBOUND extends NettyInbound, OUTBOUND extends NettyOutbound>, which uses java.net.URI.create(String str), which eventually hits java.net.URI.match(char c, long lowMask, long highMask).
match() tests that each character in the URL is included in one of the following character classes:
// Character-class masks, in reverse order from RFC2396 because
// initializers for static fields cannot make forward references.
// digit = "0" | "1" | "2" | "3" | "4" | "5" | "6" | "7" |
// "8" | "9"
private static final long L_DIGIT = 0x3FF000000000000L; // lowMask('0', '9');
private static final long H_DIGIT = 0L;
// upalpha = "A" | "B" | "C" | "D" | "E" | "F" | "G" | "H" | "I" |
// "J" | "K" | "L" | "M" | "N" | "O" | "P" | "Q" | "R" |
// "S" | "T" | "U" | "V" | "W" | "X" | "Y" | "Z"
private static final long L_UPALPHA = 0L;
private static final long H_UPALPHA = 0x7FFFFFEL; // highMask('A', 'Z');
// lowalpha = "a" | "b" | "c" | "d" | "e" | "f" | "g" | "h" | "i" |
// "j" | "k" | "l" | "m" | "n" | "o" | "p" | "q" | "r" |
// "s" | "t" | "u" | "v" | "w" | "x" | "y" | "z"
private static final long L_LOWALPHA = 0L;
private static final long H_LOWALPHA = 0x7FFFFFE00000000L; // highMask('a', 'z');
// alpha = lowalpha | upalpha
private static final long L_ALPHA = L_LOWALPHA | L_UPALPHA;
private static final long H_ALPHA = H_LOWALPHA | H_UPALPHA;
// alphanum = alpha | digit
private static final long L_ALPHANUM = L_DIGIT | L_ALPHA;
private static final long H_ALPHANUM = H_DIGIT | H_ALPHA;
// hex = digit | "A" | "B" | "C" | "D" | "E" | "F" |
// "a" | "b" | "c" | "d" | "e" | "f"
private static final long L_HEX = L_DIGIT;
private static final long H_HEX = 0x7E0000007EL; // highMask('A', 'F') | highMask('a', 'f');
// mark = "-" | "_" | "." | "!" | "~" | "*" | "'" |
// "(" | ")"
private static final long L_MARK = 0x678200000000L; // lowMask("-_.!~*'()");
private static final long H_MARK = 0x4000000080000000L; // highMask("-_.!~*'()");
// unreserved = alphanum | mark
private static final long L_UNRESERVED = L_ALPHANUM | L_MARK;
private static final long H_UNRESERVED = H_ALPHANUM | H_MARK;
// reserved = ";" | "/" | "?" | ":" | "@" | "&" | "=" | "+" |
// "$" | "," | "[" | "]"
// Added per RFC2732: "[", "]"
private static final long L_RESERVED = 0xAC00985000000000L; // lowMask(";/?:@&=+$,[]");
private static final long H_RESERVED = 0x28000001L; // highMask(";/?:@&=+$,[]");
// The zero'th bit is used to indicate that escape pairs and non-US-ASCII
// characters are allowed; this is handled by the scanEscape method below.
private static final long L_ESCAPED = 1L;
private static final long H_ESCAPED = 0L;
// uric = reserved | unreserved | escaped
private static final long L_URIC = L_RESERVED | L_UNRESERVED | L_ESCAPED;
private static final long H_URIC = H_RESERVED | H_UNRESERVED | H_ESCAPED;
// pchar = unreserved | escaped |
// ":" | "@" | "&" | "=" | "+" | "$" | ","
private static final long L_PCHAR
= L_UNRESERVED | L_ESCAPED | 0x2400185000000000L; // lowMask(":@&=+$,");
private static final long H_PCHAR
= H_UNRESERVED | H_ESCAPED | 0x1L; // highMask(":@&=+$,");
// All valid path characters
private static final long L_PATH = L_PCHAR | 0x800800000000000L; // lowMask(";/");
private static final long H_PATH = H_PCHAR; // highMask(";/") == 0x0L;
// Dash, for use in domainlabel and toplabel
private static final long L_DASH = 0x200000000000L; // lowMask("-");
private static final long H_DASH = 0x0L; // highMask("-");
// Dot, for use in hostnames
private static final long L_DOT = 0x400000000000L; // lowMask(".");
private static final long H_DOT = 0x0L; // highMask(".");
// userinfo = *( unreserved | escaped |
// ";" | ":" | "&" | "=" | "+" | "$" | "," )
private static final long L_USERINFO
= L_UNRESERVED | L_ESCAPED | 0x2C00185000000000L; // lowMask(";:&=+$,");
private static final long H_USERINFO
= H_UNRESERVED | H_ESCAPED; // | highMask(";:&=+$,") == 0L;
// reg_name = 1*( unreserved | escaped | "$" | "," |
// ";" | ":" | "@" | "&" | "=" | "+" )
private static final long L_REG_NAME
= L_UNRESERVED | L_ESCAPED | 0x2C00185000000000L; // lowMask("$,;:@&=+");
private static final long H_REG_NAME
= H_UNRESERVED | H_ESCAPED | 0x1L; // highMask("$,;:@&=+");
// All valid characters for server-based authorities
private static final long L_SERVER
= L_USERINFO | L_ALPHANUM | L_DASH | 0x400400000000000L; // lowMask(".:@[]");
private static final long H_SERVER
= H_USERINFO | H_ALPHANUM | H_DASH | 0x28000001L; // highMask(".:@[]");
// Special case of server authority that represents an IPv6 address
// In this case, a % does not signify an escape sequence
private static final long L_SERVER_PERCENT
= L_SERVER | 0x2000000000L; // lowMask("%");
private static final long H_SERVER_PERCENT
= H_SERVER; // | highMask("%") == 0L;
// scheme = alpha *( alpha | digit | "+" | "-" | "." )
private static final long L_SCHEME = L_ALPHA | L_DIGIT | 0x680000000000L; // lowMask("+-.");
private static final long H_SCHEME = H_ALPHA | H_DIGIT; // | highMask("+-.") == 0L
// scope_id = alpha | digit | "_" | "."
private static final long L_SCOPE_ID
= L_ALPHANUM | 0x400000000000L; // lowMask("_.");
private static final long H_SCOPE_ID
= H_ALPHANUM | 0x80000000L; // highMask("_.");
// -- Escaping and encoding --
private static final char[] hexDigits = {
'0', '1', '2', '3', '4', '5', '6', '7',
'8', '9', 'A', 'B', 'C', 'D', 'E', 'F'
};
So to implement this, java.net.URI must be replaced with a variant that includes '|' in the following mask. A fork of Netty that uses this variant of java.net.URI would also need to be created:
private static final long L_MARK = 0x678200000000L; // lowMask("-_.!~*'()");
private static final long H_MARK = 0x4000000080000000L; // highMask("-_.!~*'()");
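To make the mask mechanics concrete, here is a small sketch that re-implements the bit tests outside the JDK. The helpers are modeled on the private lowMask, highMask and match methods of java.net.URI, but they are my own re-implementation under a hypothetical class name (UriMaskSketch), and the "patched" mask is purely illustrative. Characters 0 to 63 map to bits of the low mask and characters 64 to 127 to bits of the high mask, so '|' (code 124) corresponds to bit 60 of the high mask.

```java
// Hypothetical re-implementation for illustration; not the JDK source.
class UriMaskSketch {

    // Sets one bit per character in the range 0..63.
    static long lowMask(String chars) {
        long m = 0L;
        for (char c : chars.toCharArray()) {
            if (c < 64) m |= 1L << c;
        }
        return m;
    }

    // Sets one bit per character in the range 64..127.
    static long highMask(String chars) {
        long m = 0L;
        for (char c : chars.toCharArray()) {
            if (c >= 64 && c < 128) m |= 1L << (c - 64);
        }
        return m;
    }

    // True if the character's bit is set in the matching mask.
    static boolean match(char c, long lowMask, long highMask) {
        if (c == 0) return false;
        if (c < 64) return ((1L << c) & lowMask) != 0;
        if (c < 128) return ((1L << (c - 64)) & highMask) != 0;
        return false;
    }

    public static void main(String[] args) {
        String mark = "-_.!~*'()";
        long lMark = lowMask(mark);
        long hMark = highMask(mark);
        System.out.println(Long.toHexString(lMark)); // low mask for mark
        System.out.println(Long.toHexString(hMark)); // high mask for mark
        System.out.println(match('|', lMark, hMark)); // '|' is not a mark

        // What a patched H_MARK would have to look like (illustration only).
        long hPatched = hMark | highMask("|");
        System.out.println(Long.toHexString(hPatched));
        System.out.println(match('|', lMark, hPatched));
    }
}
```

Because the query check goes through the uric class, which includes unreserved and therefore mark, setting that one extra bit is what a replacement class would need to do. The hard part is getting Netty to use such a class, not the bit arithmetic.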
Stack from Netty's resolvePath() to URI.match(), listed from callee to caller:
URI.match(char, long, long) line: 2637
URI$Parser.scan(int, int, long, long) line: 3120
URI$Parser.checkChars(int, int, long, long, String) line: 3143
URI$Parser.checkChar(int, long, long, String) line: 3155
URI$Parser.parse(boolean) line: 3170
URI.<init>(String) line: 623
URI.create(String) line: 904
HttpOperations<INBOUND,OUTBOUND>.resolvePath(String) line: 429