This draft document defines the format and handling of mailto URIs. It is a standalone specification and does not rely on other specifications.
Mailto URIs are used to specify email message header data. Mail clients use this data to generate default header values and default compose form text field values.
"mailto:?" + "&-separated list of hname=hvalue pairs"
All hname=hvalue pairs are optional.
"mailto:" and hnames are handled case-insensitively. Prefer lowercase where possible.
In HTML and XML markup, "&" must be represented as "&amp;".
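As a brief illustration of the markup rule above, the following Python sketch escapes a mailto URI before placing it in an HTML attribute. The sample URI and link text are made up for the example; `html.escape` is from the Python standard library.

```python
from html import escape

# Hypothetical sample URI with two hname=hvalue pairs.
uri = "mailto:?subject=Hello&body=World"

# Escape "&" (and quotes) so the URI is safe inside an HTML attribute.
href = escape(uri, quote=True)
link = '<a href="' + href + '">Send mail</a>'

print(link)  # <a href="mailto:?subject=Hello&amp;body=World">Send mail</a>
```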
TO, SUBJECT, BODY, CC and BCC are the basic hnames that should be supported.
Everything after "mailto:" up to (but not including) "?" (or up to the end of the URI if no "?" is present) also represents a TO hvalue. In that case, "mailto:" acts as the TO hname for the value.
"Comma-separated" refers to separating by a comma and a space. For example: "1, 2, 3, 4, 5" and NOT "1,2,3,4,5". In the case of an hvalue, each comma is encoded as %2C and each space is encoded as %20.
"\r\n-separated" refers to separating by a carriage return (0x0D) and a line feed (0x0A). In the case of an hvalue, each carriage return is encoded as %0D and each line feed is encoded as %0A. For example, "line1%0D%0Aline2".
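The two separator conventions above can be illustrated with a short Python sketch. The helper names `join_addresses` and `join_lines` are hypothetical, and `urllib.parse.quote` with `safe=""` is used only as a convenient stand-in encoder; it is not the encoding procedure defined by this document.

```python
from urllib.parse import quote

def join_addresses(addresses):
    """Join values with a comma and a space, then percent-encode the result."""
    return quote(", ".join(addresses), safe="")

def join_lines(lines):
    """Join lines with CR LF, then percent-encode the result."""
    return quote("\r\n".join(lines), safe="")

print(join_addresses(["1", "2", "3"]))   # 1%2C%202%2C%203
print(join_lines(["line1", "line2"]))    # line1%0D%0Aline2
```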
(Need to do more than examples)
"string" in the following encoding, decoding and parsing steps refers to a sequence of 8-bit unsigned characters. If working with a string of characters of a different type and/or width, adjust the steps to conform.
Hvalues and hnames need to be percent-encoded.
While I is less than the length of S:
If C equals a Line Feed (0x0A):
Else if C is found in NOENCODE:
Else:
Every character in an hvalue or hname that needs encoding should be encoded. An "@" used as a separator (as in email@example.com) still needs to be encoded as "%40"; being a separator does not exempt it. The same goes for "+": even though it represents itself (rather than a space) in a mailto URI, it still needs to be encoded as "%2B". (Mail clients must nevertheless handle "@" and "+" the same way when they are not encoded.)
Just like hvalues, hnames can contain full UTF-8 sequences. This allows clients to support Unicode hnames if needed. For example, mailto:?%E2%88%9A=%E2%88%9A is valid in a mailto URI.
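The encoding steps and notes above could be sketched in Python roughly as follows. This is only an illustration under stated assumptions: the contents of NOENCODE are not shown here, so the sketch assumes the unreserved characters (letters, digits, "-", ".", "_", "~"); it also assumes the input uses "\n" for newlines and that a Line Feed is emitted as "%0D%0A". The function name `encode_component` is hypothetical.

```python
# Assumption: NOENCODE is the set of unreserved characters.
NOENCODE = set(
    b"ABCDEFGHIJKLMNOPQRSTUVWXYZ"
    b"abcdefghijklmnopqrstuvwxyz"
    b"0123456789-._~"
)

def encode_component(text):
    """Percent-encode an hname or hvalue, byte by byte."""
    s = text.encode("utf-8")  # work on 8-bit units
    ret = []
    i = 0
    while i < len(s):
        c = s[i]
        if c == 0x0A:
            # Assumption: a Line Feed becomes an encoded CR LF pair.
            ret.append("%0D%0A")
        elif c in NOENCODE:
            ret.append(chr(c))
        else:
            ret.append("%%%02X" % c)
        i += 1
    return "".join(ret)

print(encode_component("email@example.com"))  # email%40example.com
print(encode_component("a+b"))                # a%2Bb
```

With these assumptions, "@" and "+" are always encoded, matching the note that being a separator does not exempt a character.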
While I is less than the length of S:
If C equals "%" and I + 2 is less than the length of S:
If F1 or F2 was not found in HEXITS:
Else:
Else:
RET will be a string of raw UTF-8 sequences with all newline pairs and single newlines normalized to \n.
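The decoding steps above could be sketched in Python as follows. The function name `decode_component` is hypothetical, and two details are assumptions rather than statements from this document: a "%" that is not followed by two hex digits is kept literally, and newline normalization is done in one pass after decoding.

```python
HEXITS = "0123456789ABCDEFabcdef"

def decode_component(s):
    """Percent-decode an hname or hvalue into raw bytes with \\n newlines."""
    data = s.encode("utf-8")
    ret = bytearray()
    i = 0
    while i < len(data):
        c = data[i]
        if c == 0x25 and i + 2 < len(data):  # 0x25 is "%"
            f1 = chr(data[i + 1])
            f2 = chr(data[i + 2])
            if f1 not in HEXITS or f2 not in HEXITS:
                # Assumption: an invalid escape is kept as a literal "%".
                ret.append(c)
            else:
                ret.append(int(f1 + f2, 16))
                i += 2  # skip the two hex digits just consumed
        else:
            ret.append(c)
        i += 1
    # Normalize CR LF pairs and lone CRs to LF.
    return bytes(ret).replace(b"\r\n", b"\n").replace(b"\r", b"\n")

print(decode_component("line1%0D%0Aline2"))  # b'line1\nline2'
```

Note that "+" is left untouched, since it represents itself in a mailto URI.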
In a mailto URI, there can be more than one hname with the same name. When a client parses a mailto URI to generate a TO, CC, BCC, SUBJECT or BODY value, the following rules should be followed.
The decoded version of the generated TO value is what browsers should use for the mailto link "Copy email address" feature. "Copy subject", "Copy body", "Copy CC addresses" and "Copy BCC addresses" features should use the decoded version of their corresponding generated value.
If the client supports other address-based hnames, the TO, CC and BCC rules can be used if desired.
Duplicate hnames were generally allowed for the benefit of clients that had hvalue length restrictions but still wanted to accept large amounts of data for an hname. Even clients without hvalue length restrictions should support duplicate hnames for compatibility.
For TO, CC and BCC (and generally other address types), there is no need to join empty hvalues. You'll end up getting ", , , , ," if you do.
For SUBJECT, since there can be only one subject in a message, make each new SUBJECT hvalue override the previous one. Also, there are existing implementations that already do this.
For BODY, do not join empty BODY hvalues until there is a BODY hvalue with significant content (non-empty). That way, empty body hvalues do not cause a bunch of empty lines at the top of the BODY field before there's significant content.
If you want your mailto URI to be handled correctly in clients that don't support duplicate hnames, don't use duplicate hnames in your URI. Currently, there's not much need to anyway.
If S starts with a case-insensitive "mailto:":
If one or more '?' are present in SUB:
Else:
While I is less than the length of HLIST:
If a '=' is found in S:
If HNAME equals "to":
If HVALUE is not empty:
If TO is not empty:
Else if HNAME equals "cc":
If HVALUE is not empty:
If CC is not empty:
Else if HNAME equals "bcc":
If HVALUE is not empty:
If BCC is not empty:
Else if HNAME equals "subject":
Else if HNAME equals "body":
If (HVALUE is empty and BODY is empty) equals false:
If BODY is not empty:
Else if HNAME equals "another hname you want to support":
Parse it as desired. However, if it's an address type, it is recommended that you parse it like TO, CC and BCC.
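The parsing steps and duplicate-hname rules above could be sketched in Python as follows. This is an illustration, not a reference implementation: the function name `parse_mailto` and the result dictionary are hypothetical, `urllib.parse.unquote` stands in for this document's decoding steps, pairs without "=" are simply skipped (an assumption), and BODY values are joined with "\n" in decoded form (also an assumption).

```python
from urllib.parse import unquote

def _decode(s):
    # Stand-in decoder: percent-decode, then normalize newlines to \n.
    return unquote(s).replace("\r\n", "\n").replace("\r", "\n")

def parse_mailto(uri):
    fields = {"to": "", "cc": "", "bcc": "", "subject": "", "body": ""}
    if uri[:7].lower() != "mailto:":
        return fields
    sub = uri[7:]
    # Everything before "?" is a TO hvalue; the rest is the hname list.
    path, sep, query = sub.partition("?")
    fields["to"] = _decode(path)
    hlist = query.split("&") if sep else []
    for pair in hlist:
        if "=" not in pair:
            continue  # assumption: ignore pairs without "="
        name, _, value = pair.partition("=")
        hname = _decode(name).lower()
        hvalue = _decode(value)
        if hname in ("to", "cc", "bcc"):
            if hvalue:  # never join empty address values
                if fields[hname]:
                    fields[hname] += ", "
                fields[hname] += hvalue
        elif hname == "subject":
            fields["subject"] = hvalue  # the latest SUBJECT wins
        elif hname == "body":
            # Skip empty values until the body has some content.
            if hvalue or fields["body"]:
                if fields["body"]:
                    fields["body"] += "\n"
                fields["body"] += hvalue
    return fields

result = parse_mailto(
    "mailto:a%40example.com?to=b%40example.com&to=&SUBJECT=one"
    "&subject=two&body=&body=line1&body=&body=line2"
)
print(result["to"])       # a@example.com, b@example.com
print(result["subject"])  # two
```

In this example the leading empty BODY value is dropped, while the empty value between "line1" and "line2" produces a blank line.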
The decoded values are what mail clients use (after converting them to the needed encoding) to fill in a compose form's text fields.
If your app and/or your parser does not support null bytes, convert "%00" (and raw null bytes) to "%2500" before parsing the mailto URI, so that after decoding they show up in the compose field as "%00". If working with a string of characters of a different type and/or width, adjust this normalization accordingly. Use the same type of normalization for other bytes that are not supported.
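The null-byte normalization above could look like this in Python. The function name `protect_null_bytes` is hypothetical, and `urllib.parse.unquote` is used only to show the effect of a later decoding pass.

```python
from urllib.parse import unquote

def protect_null_bytes(uri):
    """Rewrite encoded and raw null bytes so they decode to the text "%00"."""
    return uri.replace("%00", "%2500").replace("\x00", "%2500")

safe_uri = protect_null_bytes("mailto:?body=a%00b")
print(safe_uri)             # mailto:?body=a%2500b
print(unquote("a%2500b"))   # a%00b
```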
Click on the link. In your mail client's compose form, you should get the following results.
To:
Cc:
Bcc:
Subject:
Note that there may be extra lines in the body in your mail client if it has signature support.
Also note that for the TO, CC and BCC fields, user input in most clients requires properly escaped addresses, where certain characters are escaped with "\" and other parts are quoted (see Address Syntax). However, some clients allow unescaped user input and will, after decoding an hvalue, deescape characters escaped with "\" when filling in the fields. As long as the intended headers are produced in the outgoing message, the client can use whatever user-input method is desired.
MailtoURIParserPack.zip contains C, C++, D, Java, JavaScript, Perl, Python, Ruby, Pike, Lua, Tcl, PHP5 and Python3000 MailtoURIParser classes that use the rules in this document. The test above uses the JavaScript version to parse the mailto link and generate the form data.
Mozilla Thunderbird 3.0 generally parses mailto URIs according to the rules in this document (keeping the deescaping note for the TO, CC and BCC fields in mind). Thunderbird 3.0 passes the test above (if clicking on the mailto link in Firefox with Thunderbird 3.0 as the default mail client).
Opera 9.5 generally parses mailto URIs for its built-in mail client according to the rules in this document (except for a current empty body value quirk).