The ups and downs of the HTTP header

  • A number of errors in this article make me wary:

    1. The "request" line in HTTP is not a header - it is the request, which can have associated headers. The headers are all “about” the request. The request itself is not a header, and does not follow the header syntax. (The historical reason for this is that the request line was defined in HTTP 0.9, which did not have headers.)

    2. ISO-8859-1 is not “a crappy Windows character set”. It is an international standard, and deliberately different from what Microsoft was using at the time (code page 437 was standard for MS-DOS in the US). Later, Windows switched to code page 1252, which is a copy of ISO-8859-1 except for some extra glyphs in the bytes the ISO standard defined as control characters.
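
    Both points are easy to check. A minimal Python sketch, assuming a well-formed HTTP/1.x request with no folded lines: `split_request` is a made-up helper and the sample request is invented, while `latin-1` and `cp1252` are the standard Python codec names for ISO-8859-1 and code page 1252.

```python
def split_request(raw: bytes):
    """Split a raw HTTP/1.x request head into its request line and headers."""
    head, _, _ = raw.partition(b"\r\n\r\n")
    lines = head.split(b"\r\n")
    # The first line is "METHOD SP target SP version"; it has no "name: value" shape.
    method, target, version = lines[0].decode("ascii").split(" ", 2)
    headers = {}
    for line in lines[1:]:
        name, _, value = line.partition(b":")
        headers[name.decode("ascii").lower()] = value.strip().decode("latin-1")
    return (method, target, version), headers


# Invented sample request, for illustration only.
request_line, headers = split_request(
    b"GET /index.html HTTP/1.1\r\nHost: example.com\r\nAccept: */*\r\n\r\n"
)

# Byte 0x80 is a control character in ISO-8859-1 but a printable glyph in cp1252.
byte_80_latin1 = b"\x80".decode("latin-1")
byte_80_cp1252 = b"\x80".decode("cp1252")

# Outside the 0x80-0x9F range the two agree, e.g. 0xE9 is "é" in both.
same_above_9f = b"\xe9".decode("latin-1") == b"\xe9".decode("cp1252")
```

    The request line comes back as a plain three-part tuple, and only the lines after it are parsed as name/value pairs.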

  • Why is the UA header so screwed up, aside from the historical issues with it? Isn't it time that we replace it with something a bit more sane and structured? It seems the idea of detecting the browser vs detecting browser features goes back and forth. Sure, on the client side, where you have access to the DOM and the JavaScript runtime, it's great to know whether you can use the placeholder attribute in a text input, but server-side you need to decide which video file to serve to the client, and this gets tricky.

    Instead, why don't we have something like this?:

        OS: Windows
        OS-Version: 8.1
        Browser: Chrome
        Browser-Version: 18.5
    
    (Not suggesting the format, just the type of data.)

    That way we can ditch the stupid stuff such as "like Gecko", which means nothing, and focus on actually useful things.
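
    To show what the server side could look like with that kind of data, here is a rough Python sketch. The header names are the hypothetical ones above, not real HTTP headers, and `version_tuple`, `Client` and `client_from_headers` are names made up for the example.

```python
from dataclasses import dataclass


def version_tuple(text: str) -> tuple:
    """Turn "18.5" into (18, 5) so versions compare numerically, not as strings."""
    return tuple(int(part) for part in text.split("."))


@dataclass
class Client:
    os: str
    os_version: tuple
    browser: str
    browser_version: tuple


def client_from_headers(headers: dict) -> Client:
    # Hypothetical header names taken from the proposal; no browser sends these.
    return Client(
        os=headers["OS"],
        os_version=version_tuple(headers["OS-Version"]),
        browser=headers["Browser"],
        browser_version=version_tuple(headers["Browser-Version"]),
    )


client = client_from_headers(
    {"OS": "Windows", "OS-Version": "8.1", "Browser": "Chrome", "Browser-Version": "18.5"}
)

# A server-side decision then becomes a plain comparison instead of string sniffing.
is_recent_enough = client.browser == "Chrome" and client.browser_version >= (18, 0)
```

    No regular expressions and no guessing about where the version lives in the string.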

  • Well, as part of a rant, I'll point out two bizarro-world features of HTTP headers: line folding and comments.

    You can add arbitrary CRLFs to any header, so long as you start the next line with whitespace. Proper implementations need to treat every such continuation line as part of the single header. Very annoying to implement (and implementations of other, similar protocols do not all agree!), and no benefit. Unless you're composing HTTP headers to read on an 80-column layout. And that kind of thing has no place in a computer protocol.
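
    For the record, the unfolding step looks roughly like this. A minimal Python sketch, assuming the header block has already been separated from the request line and body; `unfold` and the `X-Long` header are invented for the example.

```python
import re

# A CRLF followed by at least one space or tab continues the previous line.
_FOLD = re.compile(rb"\r\n[ \t]+")


def unfold(header_block: bytes) -> list:
    """Join folded continuation lines, then split into one entry per header."""
    unfolded = _FOLD.sub(b" ", header_block)
    return [line for line in unfolded.split(b"\r\n") if line]


# Invented header block: X-Long is folded across two physical lines.
raw = b"X-Long: first part,\r\n   second part\r\nHost: example.com\r\n"
lines = unfold(raw)
```

    A parser that skips this step sees three lines here instead of two headers, which is exactly the sort of disagreement between implementations I mean.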

    Comments. Seriously, read this from the spec:

      Comments can be included in some HTTP header fields by surrounding
      the comment text with parentheses. Comments are only allowed in
      fields containing "comment" as part of their field value definition.
      In all other fields, parentheses are considered part of the field
      value.
    
    That's even more bizarre. It means the parser needs to know which header it is operating on. It just adds the possibility of mis-implementation and security issues (confused deputy), and it hurts performance. It's only useful if you're writing HTTP headers by hand and feel the need to comment them for ... I can't think of a legit case.
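
    To make the "parser must know the header" point concrete, a rough Python sketch. The `COMMENT_FIELDS` allow-list is my assumption of a few fields whose definition includes "comment", `strip_comments` and `X-Note` are made up, and quoted strings are ignored to keep it short.

```python
# Assumed allow-list: fields whose grammar includes "comment".
COMMENT_FIELDS = {"user-agent", "server", "via"}


def strip_comments(name: str, value: str) -> str:
    """Drop parenthesised comments, but only for fields that allow them."""
    if name.lower() not in COMMENT_FIELDS:
        return value  # elsewhere, parentheses are ordinary characters
    kept = []
    depth = 0
    i = 0
    while i < len(value):
        ch = value[i]
        if ch == "\\" and depth > 0 and i + 1 < len(value):
            i += 2  # escaped character inside a comment: skip both bytes
            continue
        if ch == "(":
            depth += 1  # comments may nest
        elif ch == ")" and depth > 0:
            depth -= 1
        elif depth == 0:
            kept.append(ch)
        i += 1
    # Collapse the whitespace left behind where comments were removed.
    return " ".join("".join(kept).split())


ua = strip_comments("User-Agent", "Mozilla/5.0 (X11; Linux (x86_64)) Gecko/20100101")
note = strip_comments("X-Note", "keep (this) text")
```

    Same bytes, two different meanings, decided purely by the field name.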

    "Human readable" computer protocols are debatable (parsing rules always seem to become more difficult, which is very bad), but "human writable" is just silly.

  • A bit of trivia on why Opera claims to be 9.80: they used 10.00 in the beta of Opera 10 and found out that many sites' sniffers couldn't process 2-digit version numbers. So with the final release (and from then on until the death of the browser) they used Opera/9.80 and put the actual version elsewhere in the string.
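
    A small Python sketch of the failure and the workaround. The two UA strings are illustrative of the general shape, not captured traffic, and `naive_major` is my guess at what a careless sniffer does rather than any particular site's code.

```python
import re


def naive_major(ua: str) -> str:
    # Assumed behaviour of a careless sniffer: the major version is one digit.
    return re.search(r"Opera/(\d)", ua).group(1)


def opera_version(ua: str) -> str:
    # Prefer the trailing Version/x.y token when present, else fall back to Opera/x.y.
    match = re.search(r"Version/(\d+\.\d+)", ua) or re.search(r"Opera/(\d+\.\d+)", ua)
    return match.group(1) if match else ""


# Illustrative strings only.
beta_style = "Opera/10.00 (Windows NT 6.0; U; en) Presto/2.2.0"
final_style = "Opera/9.80 (Windows NT 6.0; U; en) Presto/2.2.15 Version/10.00"

beta_seen_as = naive_major(beta_style)    # the sniffer reads only the "1" of "10"
final_seen_as = naive_major(final_style)  # with Opera/9.80 it reads "9"
real_version = opera_version(final_style)
```

    With the frozen 9.80 prefix the broken sniffers keep seeing a 9, and anyone who cares about the real version has to look at the Version/ token.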

    That being said, people who sniff the UA string to serve different content (or even block the user) should end up in hell. I'd start with Google.

  • Interesting article, but for the part about the User-Agent header, I really liked the history lesson by Aaron Andersen [1] from 2008.

    [1] http://webaim.org/blog/user-agent-string-history/

  • Can't say I like the design of the page, but a good read nonetheless. Though after all those warnings, I expected it to be much longer. Is it really that long an article?

  • > Opera 12 then just gets weird on us. It says "Generic English please, or U.S English, if not then uh... Arabic! If not then perhaps Catalan? If not then Danish, or if not that then Dutch. Ok perhaps Greek? Finnish?... Go home Opera, you're drunk.

    Most amusing part. Seriously, I can't imagine why Opera sends all these languages in its request. Bizarre.
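
    For anyone wondering how a server reads such a list: each entry can carry a q weight, and the server ranks by it. A minimal Python sketch, where `parse_accept_language` is a made-up helper and the header value is invented to mirror the order in the quote, not Opera 12's actual header.

```python
def parse_accept_language(value: str) -> list:
    """Return (language tag, weight) pairs, most preferred first."""
    entries = []
    for item in value.split(","):
        tag, _, params = item.strip().partition(";")
        if not tag:
            continue
        quality = 1.0  # a missing q parameter means full weight
        for param in params.split(";"):
            key, _, raw = param.strip().partition("=")
            if key == "q":
                try:
                    quality = float(raw)
                except ValueError:
                    quality = 0.0  # treat an unparsable weight as "not wanted"
        entries.append((tag.strip(), quality))
    # sorted() is stable, so equal weights keep their order from the header.
    return sorted(entries, key=lambda entry: entry[1], reverse=True)


# Invented header value, shaped like the list described in the quote.
header = "en,en-US;q=0.9,ar;q=0.8,ca;q=0.7,da;q=0.6"
prefs = parse_accept_language(header)
```

    So a long tail of languages is harmless to a server that ranks by weight; it is just a very long way of saying "English, please".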

  • Slightly off topic, but this is the first post I've read on a Ghost-powered blog – I think it looks great.