java, spring-webflux, spring-restcontroller

How to handle special characters (such as pipe) in a Spring WebFlux controller on the server side?


I have implemented a Spring WebFlux controller like this:

@GetMapping(path = "/test", produces = APPLICATION_JSON_VALUE)
public Mono<String> getData(ServerHttpRequest serverHttpRequest) {
    String param = serverHttpRequest.getURI().getRawQuery();
    return Mono.just("TEST");
}

My goal is to get the whole query string and do something with it. For instance, if the URL is http://localhost:8080/test?identifier=test|code, I want the string identifier=test|code so that I can pass it directly to another server.

However, when I send a URL that contains a special character such as a pipe (|), I get 400 Bad Request.

All the answers I could find said that the client should encode the special character (for a pipe, that is %7C). However, I want to see if there is any option to handle this on the server side.

My app uses Netty instead of Tomcat (not sure if this is helpful). I saw something about Tomcat's "relaxedQueryChars" setting, but I don't think it applies to Netty.

How can I handle this on server side?


Solution

  • Short answer: The validation of URL characters is done by java.net.URI, not by Netty, Spring Boot, or Tomcat. This makes it very difficult, if not unrealistic, to bypass, especially as java.net.URI is declared final.

    Long answer: I set the following packages to TRACE logging:

    logging.level.org.netty=TRACE
    logging.level.org.springframework=TRACE
    

    When the service was called with Postman's 'Encode URL automatically' setting switched off, using the URL http://localhost:8088/api/test?identifier=test|code, the following was written to the logs:

    DEBUG o.s.h.s.r.ReactorHttpHandlerAdapter      : Failed to get request URI: Illegal character in query at index 46: http://localhost:8088/api/test?identifier=test|code
    

    o.s.h.s.r.ReactorHttpHandlerAdapter is org.springframework.http.server.reactive.ReactorHttpHandlerAdapter
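The same message can be reproduced without Spring or Netty, which supports the point that the check lives in the JDK. A minimal sketch using only java.net.URI; the class and method names (PipeInQueryDemo, parseFailure) are made up for illustration:

```java
import java.net.URI;
import java.net.URISyntaxException;

final class PipeInQueryDemo {

    // Returns the URISyntaxException behind a URI.create failure,
    // or null if the string parses without error.
    static URISyntaxException parseFailure(String url) {
        try {
            URI.create(url);
            return null;
        } catch (IllegalArgumentException e) {
            // URI.create wraps the checked URISyntaxException
            // in an unchecked IllegalArgumentException.
            return (URISyntaxException) e.getCause();
        }
    }

    public static void main(String[] args) {
        String url = "http://localhost:8088/api/test?identifier=test|code";
        URISyntaxException failure = parseFailure(url);
        System.out.println(failure.getReason());            // Illegal character in query
        System.out.println(failure.getIndex());             // 46
        System.out.println(url.charAt(failure.getIndex())); // |
    }
}
```

Index 46 is the position of the pipe in the URL string, matching the log line above.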

    ReactorHttpHandlerAdapter calls reactor.netty.http.HttpOperations<INBOUND extends NettyInbound, OUTBOUND extends NettyOutbound>.resolvePath(String), which uses java.net.URI.create(String str), which eventually reaches java.net.URI.match(char c, long lowMask, long highMask).

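    match() tests whether a character belongs to a character class. Each class is stored as a pair of bit masks (an L_ mask for characters below 64 and an H_ mask for characters 64 to 127). The masks are defined in java.net.URI as follows: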

    // Character-class masks, in reverse order from RFC2396 because
    // initializers for static fields cannot make forward references.
    
    // digit    = "0" | "1" | "2" | "3" | "4" | "5" | "6" | "7" |
    //            "8" | "9"
    private static final long L_DIGIT = 0x3FF000000000000L; // lowMask('0', '9');
    private static final long H_DIGIT = 0L;
    
    // upalpha  = "A" | "B" | "C" | "D" | "E" | "F" | "G" | "H" | "I" |
    //            "J" | "K" | "L" | "M" | "N" | "O" | "P" | "Q" | "R" |
    //            "S" | "T" | "U" | "V" | "W" | "X" | "Y" | "Z"
    private static final long L_UPALPHA = 0L;
    private static final long H_UPALPHA = 0x7FFFFFEL; // highMask('A', 'Z');
    
    // lowalpha = "a" | "b" | "c" | "d" | "e" | "f" | "g" | "h" | "i" |
    //            "j" | "k" | "l" | "m" | "n" | "o" | "p" | "q" | "r" |
    //            "s" | "t" | "u" | "v" | "w" | "x" | "y" | "z"
    private static final long L_LOWALPHA = 0L;
    private static final long H_LOWALPHA = 0x7FFFFFE00000000L; // highMask('a', 'z');
    
    // alpha         = lowalpha | upalpha
    private static final long L_ALPHA = L_LOWALPHA | L_UPALPHA;
    private static final long H_ALPHA = H_LOWALPHA | H_UPALPHA;
    
    // alphanum      = alpha | digit
    private static final long L_ALPHANUM = L_DIGIT | L_ALPHA;
    private static final long H_ALPHANUM = H_DIGIT | H_ALPHA;
    
    // hex           = digit | "A" | "B" | "C" | "D" | "E" | "F" |
    //                         "a" | "b" | "c" | "d" | "e" | "f"
    private static final long L_HEX = L_DIGIT;
    private static final long H_HEX = 0x7E0000007EL; // highMask('A', 'F') | highMask('a', 'f');
    
    // mark          = "-" | "_" | "." | "!" | "~" | "*" | "'" |
    //                 "(" | ")"
    private static final long L_MARK = 0x678200000000L; // lowMask("-_.!~*'()");
    private static final long H_MARK = 0x4000000080000000L; // highMask("-_.!~*'()");
    
    // unreserved    = alphanum | mark
    private static final long L_UNRESERVED = L_ALPHANUM | L_MARK;
    private static final long H_UNRESERVED = H_ALPHANUM | H_MARK;
    
    // reserved      = ";" | "/" | "?" | ":" | "@" | "&" | "=" | "+" |
    //                 "$" | "," | "[" | "]"
    // Added per RFC2732: "[", "]"
    private static final long L_RESERVED = 0xAC00985000000000L; // lowMask(";/?:@&=+$,[]");
    private static final long H_RESERVED = 0x28000001L; // highMask(";/?:@&=+$,[]");
    
    // The zero'th bit is used to indicate that escape pairs and non-US-ASCII
    // characters are allowed; this is handled by the scanEscape method below.
    private static final long L_ESCAPED = 1L;
    private static final long H_ESCAPED = 0L;
    
    // uric          = reserved | unreserved | escaped
    private static final long L_URIC = L_RESERVED | L_UNRESERVED | L_ESCAPED;
    private static final long H_URIC = H_RESERVED | H_UNRESERVED | H_ESCAPED;
    
    // pchar         = unreserved | escaped |
    //                 ":" | "@" | "&" | "=" | "+" | "$" | ","
    private static final long L_PCHAR
        = L_UNRESERVED | L_ESCAPED | 0x2400185000000000L; // lowMask(":@&=+$,");
    private static final long H_PCHAR
        = H_UNRESERVED | H_ESCAPED | 0x1L; // highMask(":@&=+$,");
    
    // All valid path characters
    private static final long L_PATH = L_PCHAR | 0x800800000000000L; // lowMask(";/");
    private static final long H_PATH = H_PCHAR; // highMask(";/") == 0x0L;
    
    // Dash, for use in domainlabel and toplabel
    private static final long L_DASH = 0x200000000000L; // lowMask("-");
    private static final long H_DASH = 0x0L; // highMask("-");
    
    // Dot, for use in hostnames
    private static final long L_DOT = 0x400000000000L; // lowMask(".");
    private static final long H_DOT = 0x0L; // highMask(".");
    
    // userinfo      = *( unreserved | escaped |
    //                    ";" | ":" | "&" | "=" | "+" | "$" | "," )
    private static final long L_USERINFO
        = L_UNRESERVED | L_ESCAPED | 0x2C00185000000000L; // lowMask(";:&=+$,");
    private static final long H_USERINFO
        = H_UNRESERVED | H_ESCAPED; // | highMask(";:&=+$,") == 0L;
    
    // reg_name      = 1*( unreserved | escaped | "$" | "," |
    //                     ";" | ":" | "@" | "&" | "=" | "+" )
    private static final long L_REG_NAME
        = L_UNRESERVED | L_ESCAPED | 0x2C00185000000000L; // lowMask("$,;:@&=+");
    private static final long H_REG_NAME
        = H_UNRESERVED | H_ESCAPED | 0x1L; // highMask("$,;:@&=+");
    
    // All valid characters for server-based authorities
    private static final long L_SERVER
        = L_USERINFO | L_ALPHANUM | L_DASH | 0x400400000000000L; // lowMask(".:@[]");
    private static final long H_SERVER
        = H_USERINFO | H_ALPHANUM | H_DASH | 0x28000001L; // highMask(".:@[]");
    
    // Special case of server authority that represents an IPv6 address
    // In this case, a % does not signify an escape sequence
    private static final long L_SERVER_PERCENT
        = L_SERVER | 0x2000000000L; // lowMask("%");
    private static final long H_SERVER_PERCENT
        = H_SERVER; // | highMask("%") == 0L;
    
    // scheme        = alpha *( alpha | digit | "+" | "-" | "." )
    private static final long L_SCHEME = L_ALPHA | L_DIGIT | 0x680000000000L; // lowMask("+-.");
    private static final long H_SCHEME = H_ALPHA | H_DIGIT; // | highMask("+-.") == 0L
    
    // scope_id = alpha | digit | "_" | "."
    private static final long L_SCOPE_ID
        = L_ALPHANUM | 0x400000000000L; // lowMask("_.");
    private static final long H_SCOPE_ID
        = H_ALPHANUM | 0x80000000L; // highMask("_.");
    
    // -- Escaping and encoding --
    
    private static final char[] hexDigits = {
        '0', '1', '2', '3', '4', '5', '6', '7',
        '8', '9', 'A', 'B', 'C', 'D', 'E', 'F'
    };
    

    So to allow the pipe, java.net.URI would have to be replaced with a variant that includes '|' in the following masks. A fork of Netty that uses this variant of java.net.URI would also need to be created:

    private static final long L_MARK = 0x678200000000L; // lowMask("-_.!~*'()");
    private static final long H_MARK = 0x4000000080000000L; // highMask("-_.!~*'()");
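To make the mask arithmetic concrete, the sketch below re-implements the lowMask/highMask/match idea outside the JDK. It is an illustrative reconstruction of the logic as I understand it, not a copy of the JDK source, and the class name UriMaskDemo is hypothetical. It recomputes the two MARK constants and shows which bit a '|' (code point 124, so bit 60 of the high mask) would add:

```java
final class UriMaskDemo {

    // Sets bit c for every character c in 'chars' with a code point below 64.
    static long lowMask(String chars) {
        long m = 0;
        for (char c : chars.toCharArray()) {
            if (c < 64) m |= 1L << c;
        }
        return m;
    }

    // Sets bit (c - 64) for every character c in 'chars' with a code point in [64, 128).
    static long highMask(String chars) {
        long m = 0;
        for (char c : chars.toCharArray()) {
            if (c >= 64 && c < 128) m |= 1L << (c - 64);
        }
        return m;
    }

    // True if c is an ASCII character whose bit is set in the matching mask.
    static boolean match(char c, long lowMask, long highMask) {
        if (c == 0) return false;
        if (c < 64) return ((1L << c) & lowMask) != 0;
        if (c < 128) return ((1L << (c - 64)) & highMask) != 0;
        return false;
    }

    public static void main(String[] args) {
        String mark = "-_.!~*'()";
        System.out.println(Long.toHexString(lowMask(mark)));                   // 678200000000
        System.out.println(Long.toHexString(highMask(mark)));                  // 4000000080000000
        System.out.println(match('|', lowMask(mark), highMask(mark)));         // false

        // Hypothetical patched class: '|' added to the mark characters.
        String patched = mark + "|";
        System.out.println(Long.toHexString(highMask(patched)));               // 5000000080000000
        System.out.println(match('|', lowMask(patched), highMask(patched)));   // true
    }
}
```

Only the high mask changes, because '|' lies in the 64 to 127 range; the low mask stays at 0x678200000000L.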
    

    Call stack from Reactor Netty's resolvePath() down to URI.match(), listed from callee to caller:

    URI.match(char, long, long) line: 2637  
    URI$Parser.scan(int, int, long, long) line: 3120    
    URI$Parser.checkChars(int, int, long, long, String) line: 3143  
    URI$Parser.checkChar(int, long, long, String) line: 3155    
    URI$Parser.parse(boolean) line: 3170    
    URI.<init>(String) line: 623    
    URI.create(String) line: 904    
    HttpOperations<INBOUND,OUTBOUND>.resolvePath(String) line: 429
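Since the check sits this deep in the JDK, the client-side encoding mentioned in the question remains the practical route. A minimal sketch of that encoding using only the JDK (Java 10 or later for this URLEncoder overload); the class and method names are hypothetical. Note that URLEncoder applies form encoding, so a space becomes '+' rather than %20:

```java
import java.net.URI;
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

final class EncodeQueryValueDemo {

    // Builds a URI whose single query parameter value is percent-encoded,
    // so that characters such as '|' pass java.net.URI's validation.
    static URI buildUri(String base, String name, String value) {
        String encoded = URLEncoder.encode(value, StandardCharsets.UTF_8);
        return URI.create(base + "?" + name + "=" + encoded);
    }

    public static void main(String[] args) {
        URI uri = buildUri("http://localhost:8080/test", "identifier", "test|code");
        System.out.println(uri.getRawQuery()); // identifier=test%7Ccode
        System.out.println(uri.getQuery());    // identifier=test|code
    }
}
```

With a request built this way, getRawQuery() in the controller would return identifier=test%7Ccode, which can be forwarded to the other server as is.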