No currently open issues.
For CGI/1.2, let's get rid of the entire NPH (non-parsed-header) concept. It becomes truly broken in an HTTP/1.1 environment.
Historically this metavariable has been set to the same value the server uses for the Server: response header field. However, that field can contain more than just the base server information (such as "Apache/1.3.0"); it may also include other components such as comments ("(comment text)") or plug-in versions (e.g., "Apache/1.3.0 MyMod/1.0"). Should this metavariable be documented as tracking the Server: field value, or containing just the base server software version information?
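[Illustration: the two readings give different values for the same Server: field. This is a minimal Python sketch, not proposed specification text. The helper name base_server_product is hypothetical, and the sketch assumes that comments are parenthesised and not nested.]

```python
import re


def base_server_product(server_field):
    """Return only the first product token of a Server: field value."""
    # Drop parenthesised comments such as "(Unix)"; nested comments
    # are not handled in this sketch.
    without_comments = re.sub(r"\([^()]*\)", " ", server_field)
    tokens = without_comments.split()
    # The first remaining token is taken to be the base server product.
    return tokens[0] if tokens else ""


full_value = "Apache/1.3.0 (Unix) MyMod/1.0"
print(full_value)                       # reading 1: track the Server: field
print(base_server_product(full_value))  # reading 2: base product only
```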
This is fairly self-explanatory. It seems reasonable that a script should know when it is servicing a request made through SSL; it's functionally an extension of the server, after all. Ben Laurie has mumbled about supplying the necessary details of the additional SSL information. See also issue #5 for this document, which is related.
Due to the significant differences in process communication models between the canonical Unix script environment and the Win32 one, and the increasing number of Win32-based Web servers, it has been requested that a section on Win32 CGI/1.2 implementation be added. (A good idea --Ken) There is already some sort of "WinCGI" specification document; it would be most excellent if that could be incorporated into this one, so that a single CGI/1.2 RFC is applicable to all the main architectures extant at the time of publication.
The CGI/1.1 specification allows a script to construct a working URL for itself - mostly. One of the shortcomings is due to the distinction between the "scheme" and the "protocol." The scheme is the actual URI component, whilst the protocol is the transport mechanism. In most cases these are named identically, but one place in which this falls down is the "https" scheme. Who knows what might appear in the future? So, in the general case, there is currently no way for a script to construct a complete URL for itself; we should probably include a REQUEST_SCHEME or other meta-variable to augment and disambiguate SERVER_PROTOCOL.
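[Illustration: a minimal Python sketch of how a script might rebuild its own URL, and of where the scheme/protocol ambiguity bites. REQUEST_SCHEME is the proposed meta-variable, not an existing one. The function name self_url and the default-port table are assumptions made for the example, and percent-encoding of the path is left out for brevity.]

```python
def self_url(env):
    """Rebuild the URL of the running script from CGI meta-variables."""
    # REQUEST_SCHEME is the meta-variable proposed in this issue.  Without
    # it, the best available guess is the protocol name taken from
    # SERVER_PROTOCOL ("HTTP/1.1" -> "http"), which is wrong for https.
    scheme = env.get("REQUEST_SCHEME")
    if not scheme:
        scheme = env.get("SERVER_PROTOCOL", "HTTP/1.0").split("/", 1)[0]
    scheme = scheme.lower()

    host = env["SERVER_NAME"]
    port = env.get("SERVER_PORT", "")
    default_ports = {"http": "80", "https": "443"}  # assumed defaults
    if port in ("", default_ports.get(scheme)):
        netloc = host
    else:
        netloc = f"{host}:{port}"

    path = env.get("SCRIPT_NAME", "") + env.get("PATH_INFO", "")
    url = f"{scheme}://{netloc}{path}"
    query = env.get("QUERY_STRING", "")
    return f"{url}?{query}" if query else url


env = {
    "SERVER_PROTOCOL": "HTTP/1.1",
    "SERVER_NAME": "www.example.org",
    "SERVER_PORT": "443",
    "SCRIPT_NAME": "/cgi-bin/test",
    "QUERY_STRING": "a=1",
}
print(self_url(env))                                  # guessed from the protocol
print(self_url({**env, "REQUEST_SCHEME": "https"}))   # with the proposed variable
```

With only the CGI/1.1 variables, the sketch yields an "http" URL aimed at port 443, which is not the URL the client used. Supplying the scheme separately removes the guesswork.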
From §11.2:
Character set
The US-ASCII character set is used for the definition of
environment variables and header fields; the newline (NL)
sequence is LF; servers should also accept CR LF as a newline.
For environment variable NAMES and header field NAMES, this is fine. For the VALUES of those, I'm not happy. RFC 2068 seems to say that header values can be ISO 8859-1, or they can be encoded:
The TEXT rule is only used for descriptive field contents and values
that are not intended to be interpreted by the message parser. Words
of *TEXT may contain characters from character sets other than ISO
8859-1 [22] only when encoded according to the rules of RFC 1522 [14].
TEXT = <any OCTET except CTLs,
but including LWS>
If the CGI environment either decodes 1522 or munges 8859-1 into something else, that's important for the CGI script writer to know....
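[Illustration: the same text can reach a script in at least two forms, which is why the script writer needs to know what the server does. This is a minimal Python sketch using the standard email.header module. The helper names decode_1522 and decode_latin1 are hypothetical, and nothing here says which form a server should pass on.]

```python
from email.header import decode_header, make_header


def decode_1522(value):
    """Decode RFC 1522 encoded-words in a header value to text."""
    return str(make_header(decode_header(value)))


def decode_latin1(raw):
    """Interpret raw header octets as ISO 8859-1."""
    return raw.decode("iso-8859-1")


# The same word delivered two different ways:
encoded_form = "=?iso-8859-1?q?caf=E9?="   # RFC 1522 encoded-word
raw_form = b"caf\xe9"                      # bare ISO 8859-1 octets

print(decode_1522(encoded_form))
print(decode_latin1(raw_form))
```

If the server has already decoded the value, applying decode_1522 again is harmless for plain text. If the server has re-encoded the octets into some other character set, decode_latin1 gives the wrong answer, and the script has no means of detecting that.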
In §4, concerning the description of CONTENT_TYPE: In several places I was uncomfortable with the tension between citing another source and then repeating what that source said. This section is an example. There is a strong chance that this definition might over time get out of sync with HTTP. The point of this field is to provide access to an HTTP header value, not to require a server to rewrite the value in a particular way.
[Editor's note: this would be simply handled if HTML were the base document technology. As long as it's plain-text, though..]
§8.2: I would like this whole section better if it referenced the HTTP header folding and continuation rules, and stated that the following paragraphs are intended to restate the HTTP rules that allow multiple occurrences and continuation lines to be combined into a single header on a single line.
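[Illustration: a minimal Python sketch of the two HTTP rules in question, unfolding continuation lines and joining repeated fields with a comma. The function name unfold_and_combine is hypothetical. The sketch assumes every repeated field is one whose value is defined as a comma-separated list, which is the only case in which HTTP permits the joining.]

```python
def unfold_and_combine(raw_lines):
    """Collapse folded and repeated header lines into one line per field.

    raw_lines is a list of header lines with line terminators removed.
    """
    fields = []   # [name, value] entries in first-seen order
    index = {}    # lower-cased field name -> entry in fields
    last = None
    for line in raw_lines:
        if line[:1] in (" ", "\t") and last is not None:
            # Continuation line: append to the previous field's value,
            # replacing the fold with a single space.
            last[1] = f"{last[1]} {line.strip()}"
            continue
        name, _, value = line.partition(":")
        key = name.strip().lower()   # field names are case-insensitive
        value = value.strip()
        if key in index:
            # Repeated field: join the values with a comma.
            entry = index[key]
            entry[1] = f"{entry[1]}, {value}"
        else:
            entry = [name.strip(), value]
            index[key] = entry
            fields.append(entry)
        last = entry
    return [(name, value) for name, value in fields]


lines = [
    "Accept: text/html,",
    "  text/plain",
    "Accept: image/gif",
    "Host: example.org",
]
for name, value in unfold_and_combine(lines):
    print(f"{name}: {value}")
```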
§8.1: "The server should attempt to ensure that the script output is sent directly to the client, with minimal buffering." This 'should' is in conflict with HTTP/1.1 performance objectives in cases where the server might need to add the content-length header and/or apply chunked encoding.
[a reply:] Erm, if the script is entirely responsible for *all* of the HTTP processing, as is the intent and as you note in your previous point, the server *mustn't* fiddle with it. N'est-ce pas? I have changed the "minimal buffering" to "minimal internal and no transport-visible buffering," because if the script is doing all the work (like chunking) itself, the server should be essentially transparent. Shouldn't it?
The Location response header field description doesn't conform to current practice as relative URLs seem to be acceptable for redirects. (hmmm... I wonder if this is an HTTP/1.1 issue also). This description limits the Location to use with GET/HEAD which is in conflict with HTTP/1.1.
[a reply:] Here's a grey area. The specification in this draft does not match that in the NCSA documentation, which provides for limited local redirects internal to the server (they don't get sent back to the client as a 'real' redirect). So that's a break with current practice that needs to be fixed. On the other hand, the grey area starts forming because this is overloading an HTTP header field (Location, which takes only an absoluteURI) with a CGI field that has different semantics (Location, which can take an absoluteURI or a server-local URI [with some restrictions]). I'm not sure how to handle this; perhaps define the CGI semantics but make a note that this is also the name of an HTTP field with differing ones? Suggestions? I've discussed the overloading issue briefly with Roy Fielding, and he remarked that it should be safe to avoid this sort of collision by naming CGI header fields with a "CGI-" prefix.
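[Illustration: a minimal Python sketch of the distinction a server would have to draw if the CGI Location field keeps both meanings. The function name classify_location and the three result labels are hypothetical. The sketch treats anything that is neither an absolute URI nor a path starting with "/" as invalid, which is an assumption, not a statement of current practice.]

```python
from urllib.parse import urlsplit


def classify_location(value):
    """Decide how a server might treat a CGI Location field value."""
    parts = urlsplit(value)
    if parts.scheme:
        # Absolute URI: pass it to the client as a real redirect.
        return "client-redirect"
    if value.startswith("/") and not parts.netloc:
        # Server-local path: handle it internally, never shown to the client.
        return "local-redirect"
    # Anything else (relative reference, "//host/path") is rejected here.
    return "invalid"


for candidate in ("http://www.example.org/moved", "/cgi-bin/other", "other.html"):
    print(candidate, "->", classify_location(candidate))
```

A "CGI-" prefixed field name, as suggested above, would make this classification unnecessary for the HTTP field itself, since only the CGI field would accept the server-local form.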
§8.2: The "Status" CGI header field makes no provision for the CGI script downgrading the response to HTTP/1.0.
[a reply:] I don't think it would be a good idea to overload this onto the Status field. There is nothing in CGI/1.1 current practice (AFAIK) that provides for specifying the response protocol, so doing this through a CGI/1.2-only field seems appropriate. Perhaps a token for the Script-Control field..
§10: This recommendation is off base (no pun intended), as it asserts that the HTML returned by a CGI script should not contain relative URLs unless a <BASE> tag is included. There are several alternative warnings, but the statement is too strong as is.
[a reply:] I disagree here. I'm not sure what you mean by "alternative warnings," but I would like to see this made neither stronger nor weaker. But convince me; what is your reasoning? I'm concerned about possible exposures if a script is accessed through an unexpected URI -- such as a link on a UNIX system.
§8.2, §11.1, §11.2: States that the newline sequence for header data is LF but CR LF should be accepted. Elsewhere it tells the server it must re-write LF to CRLF. It would be more sensible to have the newline sequence be CR LF but tell servers to fix it up ...
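[Illustration: a minimal Python sketch of the server-side fix-up described here, accepting either LF or CR LF from the script and emitting CR LF. The function name normalise_header_newlines is hypothetical, and the sketch assumes it is applied to the header block only, never to the message body.]

```python
import re


def normalise_header_newlines(header_block):
    """Rewrite every LF or CR LF line ending in a header block to CR LF."""
    # "\r?\n" matches a bare LF as well as an existing CR LF, so lines
    # that are already correct are left unchanged.
    return re.sub(rb"\r?\n", b"\r\n", header_block)


script_output = b"Content-Type: text/plain\nStatus: 200 OK\r\n\n"
print(normalise_header_newlines(script_output))
```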
The fact that the canonical query-string parameter separator is the ampersand (&) causes some problems, since it has special meaning in HTML as the character-entity introducer. draft-fielding-uri-syntax-03 lists several reserved characters that may be used as URI-component delimiters:
reserved = ";" | "/" | "?" | ":" | "@" | "&" | "=" | "+" |
"$" | ","
It seems reasonable that the CGI specification recommend the use of one of the other delimiters rather than the ampersand.
Unfortunately, this has implications where HTML form data are concerned.
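[Illustration: one way a script could cope during a transition is to accept both separators. This is a minimal Python sketch. The function name parse_query is hypothetical, and the choice of ";" as the alternative delimiter is an assumption made for the example, not a recommendation from the draft. Browsers submitting HTML forms would still send "&".]

```python
import re
from urllib.parse import unquote_plus


def parse_query(query):
    """Split a query string on either '&' or ';' into decoded pairs."""
    pairs = []
    for item in re.split(r"[&;]", query):
        if not item:
            continue  # ignore empty segments such as a trailing separator
        name, _, value = item.partition("=")
        # Decode only after splitting, so that an escaped ";" or "&"
        # inside a value is not mistaken for a delimiter.
        pairs.append((unquote_plus(name), unquote_plus(value)))
    return pairs


print(parse_query("a=1;b=two+words&c=%3B"))
```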