Unicode handling in header location #110

@noraj

WEBrick doesn't handle Unicode in the HTTP Location header, e.g. a redirect to a URL like http://dxczjjuegupb.cloudfront.net/wp-content/uploads/2017/10/Оуэн-Мэтьюс.jpg.

[2023-02-17 16:41:33] ERROR URI::InvalidURIError: URI must be ascii only "http://dxczjjuegupb.cloudfront.net/wp-content/uploads/2017/10/\u041E\u0443\u044D\u043D-\u041C\u044D\u0442\u044C\u044E\u0441.jpg"                                                                                                              
        /usr/local/lib/ruby/3.2.0/uri/rfc3986_parser.rb:20:in `split'                                                                                                                                                
        /usr/local/lib/ruby/3.2.0/uri/rfc3986_parser.rb:71:in `parse'                                                                                                                                                
        /usr/local/lib/ruby/3.2.0/uri/rfc3986_parser.rb:111:in `convert_to_uri'                                                                                                                                      
        /usr/local/lib/ruby/3.2.0/uri/generic.rb:1110:in `merge'                                                                                                                                                     
        /usr/local/bundle/gems/webrick-1.8.1/lib/webrick/httpresponse.rb:320:in `setup_header'                                                                                                                       
        /usr/local/bundle/gems/webrick-1.8.1/lib/webrick/httpresponse.rb:240:in `send_response'                                                                                                                      
        /usr/local/bundle/gems/webrick-1.8.1/lib/webrick/httpserver.rb:112:in `run'                                                                                                                                  
        /usr/local/bundle/gems/webrick-1.8.1/lib/webrick/server.rb:310:in `block in start_thread'

The following code is responsible:

@header['location'] = @request_uri.merge(location).to_s

This is because methods such as URI.parse or, here, URI#merge only handle ASCII input.

uri = URI.parse('http://dxczjjuegupb.cloudfront.net')
uri.merge('/wp-content/uploads/2017/10/Оуэн-Мэтьюс.jpg').to_s
/home/noraj/.asdf/installs/ruby/3.2.0/lib/ruby/3.2.0/uri/rfc3986_parser.rb:20:in `split': URI must be ascii only "/wp-content/uploads/2017/10/\u041E\u0443\u044D\u043D-\u041C\u044D\u0442\u044C\u044E\u0441.jpg" (URI::InvalidURIError)                                                                                                                                                                      
        from /home/noraj/.asdf/installs/ruby/3.2.0/lib/ruby/3.2.0/uri/rfc3986_parser.rb:71:in `parse'                                                                                   
        from /home/noraj/.asdf/installs/ruby/3.2.0/lib/ruby/3.2.0/uri/rfc3986_parser.rb:111:in `convert_to_uri'                                                                         
        from /home/noraj/.asdf/installs/ruby/3.2.0/lib/ruby/3.2.0/uri/generic.rb:1110:in `merge'                                                                                        
        from (irb):9:in `<main>'                                                                                                                                                        
        from /home/noraj/.asdf/installs/ruby/3.2.0/lib/ruby/gems/3.2.0/gems/irb-1.6.2/exe/irb:11:in `<top (required)>'                                                                  
        from /home/noraj/.asdf/installs/ruby/3.2.0/bin/irb:25:in `load'                                                                                                                 
        from /home/noraj/.asdf/installs/ruby/3.2.0/bin/irb:25:in `<main>'

So a URL or URL fragment should be escaped first: CGI.escape for a single URL component, and URI::Parser.new.escape for a full URL.
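To make the difference between the two concrete, here is a small sketch. It assumes Ruby 3.2, where URI::Parser refers to URI::RFC2396_Parser; the explicit class name is used here because it should also be available on newer Rubies. The variable names are only for illustration.

```ruby
require 'cgi'
require 'uri'

# CGI.escape targets a single component (one path segment or query value).
# It also encodes reserved characters such as "/" (as %2F).
segment = CGI.escape('Оуэн-Мэтьюс.jpg')
# => "%D0%9E%D1%83%D1%8D%D0%BD-%D0%9C%D1%8D%D1%82%D1%8C%D1%8E%D1%81.jpg"
slash = CGI.escape('/')
# => "%2F"

# The RFC 2396 parser's escape keeps URL delimiters (":", "/", "?", ...)
# and only percent-encodes what is unsafe, so it suits a full URL.
parser = URI::RFC2396_Parser.new
full = parser.escape('http://dxczjjuegupb.cloudfront.net/wp-content/uploads/2017/10/Оуэн-Мэтьюс.jpg')

puts segment
puts full
```
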

Examples in https://github.com/noraj/ctf-party/blob/master/lib/ctf_party/cgi.rb.

cf. https://stackoverflow.com/questions/46849219/ruby-uriinvalidurierror-uri-must-be-ascii-only/75487328

Patched code:

uri.merge(CGI.escape('/wp-content/uploads/2017/10/Оуэн-Мэтьюс.jpg')).to_s
# => "http://dxczjjuegupb.cloudfront.net/%2Fwp-content%2Fuploads%2F2017%2F10%2F%D0%9E%D1%83%D1%8D%D0%BD-%D0%9C%D1%8D%D1%82%D1%8C%D1%8E%D1%81.jpg"
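Note that CGI.escape also encodes the path separators as %2F, as the output above shows, so the merged URL no longer has the same path structure. A minimal sketch that keeps the slashes, assuming a hypothetical helper merge_location (not part of WEBrick) built on URI::RFC2396_Parser#escape:

```ruby
require 'uri'

# Hypothetical helper, not WEBrick's API: escape the location only when it
# contains non-ASCII characters, then merge it into the base URI.
# Limitation: a non-ASCII location that is already partly percent-encoded
# would have its "%" signs encoded again.
def merge_location(base, location)
  location = URI::RFC2396_Parser.new.escape(location) unless location.ascii_only?
  base.merge(location).to_s
end

base = URI.parse('http://dxczjjuegupb.cloudfront.net')
merged = merge_location(base, '/wp-content/uploads/2017/10/Оуэн-Мэтьюс.jpg')
puts merged
```

With this approach the path keeps its literal "/" separators and only the file name is percent-encoded. Whether WEBrick should escape for the caller or expect an already-escaped location is a design decision for the maintainers.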
