Revision 5 - September 26, 2010
This draft document defines the syntax, generation and handling of mailto URIs (and IRIs where applicable).
Mailto URIs are used to specify email message compose data in a portable format. Mail clients parse and decode this data to generate default header field values and message bodies for compose forms and message sources.
For consumers of mailto URIs
Authoring mailto URIs is quite simple; handling them is more involved. You need to convert the URI to a dataset, parse it, and percent-decode it to get the raw data, and you need to do this consistently for all mailto URIs, including invalid ones.
You have a mailto URI if the string in question starts with a case-insensitive "mailto:". "MAILTO:", "mAiLtO:", "MaIlTo:" and "mailto:" are all mailto URIs.
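The check described above can be sketched as a small helper. The function name isMailtoURI is only an illustration and is not defined by this document.

```javascript
// Returns true when the string starts with "mailto:" in any letter case.
// isMailtoURI is a hypothetical helper name used only for this sketch.
function isMailtoURI(s) {
    return /^mailto:/i.test(s);
}

console.log(isMailtoURI("MAILTO:someone@example.com")); // true
console.log(isMailtoURI("mAiLtO:"));                    // true
console.log(isMailtoURI("http://example.com/"));        // false
```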
There are a few things you need to do with the URI before you can process it.
Here's a Javascript example:
// Percent-encode raw control characters other than TAB, LF and CR.
function escapeUnsafeRaw(s) {
    return s.replace(/[\x00-\x08\x0B\x0C\x0E-\x1F]/g, function(match) {
        try {
            return encodeURIComponent(match);
        } catch (e) {
            return "";
        }
    });
}
// Escape any "%" that is not followed by two hex digits.
function escapeInvalidHH(s) {
    return s.replace(/%(?![0-9A-F]{2})/gi, function() {
        return "%25";
    });
}
// Turn raw and percent-encoded newlines of any kind into "%0D%0A".
function normalizeNewlines(s) {
    return s.replace(/\r\n|\r|\n/g, "%0D%0A").replace(/%0D%0A|%0D|%0A/gi, "%0D%0A");
}
// Escape the "%" of %HH sequences that would decode to unsafe control characters
// (%00-%08, %0B, %0C, %0E-%1F).
function escapeUnsafeHH(s) {
    return s.replace(/%(0[0-8BCEF]|1[0-9A-F])/gi, function(match, hh) {
        return "%25" + hh;
    });
}
// Escape "+" so that it can't be mistaken for a space.
function escapePlus(s) {
    return s.replace(/\+/g, "%2B");
}
function makeURISafe(s) {
    return normalizeNewlines(escapePlus(escapeUnsafeHH(escapeInvalidHH(escapeUnsafeRaw(s)))));
}
var uri = "mailto:\0%00\n\r\n\r%3y%5e%0A%0D%0A%0D+";
alert(makeURISafe(uri)); // "mailto:%2500%2500%0D%0A%0D%0A%0D%0A%253y%5e%0D%0A%0D%0A%0D%0A%2B"
Before you can process the mailto URI, you need to convert it into an &-separated list of hfname=hfvalue pairs.
This is done in a few steps:
Here's a Javascript example:
var uri = "mailto:&&&foo?x=1&y=2?#x#y#z";
// Drop the fragment, drop "mailto:", escape "&" in the address part,
// then turn the first "?" into "&".
var dataset = "to=" + uri.replace(/#.*$/, "").substring(7).replace(/^[^?]+/, function(match) {
    return match.replace(/&/g, "%26");
}).replace(/\?/, "&");
alert(dataset); // "to=%26%26%26foo&x=1&y=2?"
Once you have a dataset to work with, you need to split it by "&" into a bunch of tokens. Each token will then represent an hfield.
Here's a Javascript example:
var dataset = "to=%26%26%26foo&x=1&y=2?";
var hfields = dataset.split("&");
for (var i = 0; i < hfields.length; ++i) {
    // Only tokens that contain "=" are hfields.
    if (hfields[i].indexOf("=") !== -1) {
        alert(hfields[i]);
        // "to=%26%26%26foo"
        // "x=1"
        // "y=2?"
    }
}
As you can see, you skip tokens that don't have any "=" in them. If there's no "=" in the string, there's no hfname or hfvalue.
To get the hfname and hfvalue from an hfield, you need to split it by the first "=". You split it by the first "=" so that any extra "=" will be treated as part of the hfvalue.
Here's a Javascript example:
var hfield = "x==1";
var eq = hfield.indexOf("=");
var hfname = hfield.substring(0, eq);
var hfvalue = hfield.substring(eq + 1);
alert(hfname); // "x"
alert(hfvalue); // "=1"
Percent-decoding just involves converting each %HH sequence to its raw byte and interpreting the resulting bytes as UTF-8 to get the raw codepoints.
Here's a Javascript example:
function decode(s) {
    try {
        return decodeURIComponent(s);
    } catch (e) {
        return "";
    }
}
alert(decode("%5E%E2%88%9A")); // "^√" or "^\u221A"
Note that '+' is NOT decoded to a space. It is left alone.
Also note that if you didn't make the URI safe for processing and fix its percent-encoding earlier, you'll have to do both here.
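To illustrate the note about '+', here is a short sketch that reuses the decode function from the example above. The sample strings are made up for illustration.

```javascript
// Same decode() as in the example above: UTF-8 percent-decoding that
// falls back to "" when the input is not valid.
function decode(s) {
    try {
        return decodeURIComponent(s);
    } catch (e) {
        return "";
    }
}

console.log(decode("a+b%20c")); // "a+b c"  ('+' stays a plus, %20 becomes a space)
console.log(decode("1%2B1"));   // "1+1"    (an escaped plus decodes to a plus)
console.log(decode("%E2%88"));  // ""       (truncated UTF-8 sequence is rejected)
```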
An hfname is a UTF-8-percent-encoded representation of a raw name value. To convert an hfname to a name value, you UTF-8-percent-decode it and then convert it to lowercase.
An hfvalue is a UTF-8-percent-encoded representation of a raw value. To convert an hfvalue to a value, you just UTF-8-percent-decode it.
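The two conversions above can be sketched together as follows. The helper name toNameValue and the shape of its return value are assumptions made for this example; only the decode-then-lowercase rule for names and the decode-only rule for values come from the text.

```javascript
// UTF-8 percent-decoding, as in the earlier decode example.
function decode(s) {
    try {
        return decodeURIComponent(s);
    } catch (e) {
        return "";
    }
}

// toNameValue is a hypothetical helper: it splits an hfield at the first "="
// and returns the decoded, lowercased name and the decoded value.
// It returns null when there is no "=" (such tokens are skipped).
function toNameValue(hfield) {
    var eq = hfield.indexOf("=");
    if (eq === -1) {
        return null;
    }
    return {
        name: decode(hfield.substring(0, eq)).toLowerCase(),
        value: decode(hfield.substring(eq + 1))
    };
}

var pair = toNameValue("Sub%6Aect=Hello%20World");
console.log(pair.name);  // "subject"
console.log(pair.value); // "Hello World"
```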
Values for the names "to", "cc", "bcc", and "subject" are considered single-line fields. \r and \n should be stripped from them.
For example, if you had "mailto:line1%0D%0Aline2", the value for "to" would be "line1line2".
This SHOULD apply to all other values that are known to be single-line values.
Here's a Javascript example:
alert("line1\r\nline2".replace(/\r|\n/g, "")); // "line1line2"
You might find that the author of a mailto URI specified the same hfname more than once. For example, as a consumer of a mailto URI, you might want to handle "mailto:?cc=1&cc=2" as the author intended.
There are 4 different types of accumulation:
Values for "to", "cc" and "bcc" and other values that are known to carry addresses are of the address type.
Values for "body" and other values that are known to contain multiple lines are multi-line.
Values for "subject" and other values that should only contain a single line are single-line.
Values of unknown type should follow the standard rule.
Now, if you do not want to do any accumulation, use the standard rule for all values, but still honor the special processing for certain value types when it applies.
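One way to pick the accumulation type for a name is a simple lookup, sketched below. The array names and the accumulationType function are hypothetical, and the lists contain only the names mentioned above; a real client would extend them with other names it knows about. The sketch only classifies a name. It does not implement the accumulation rules themselves.

```javascript
// Hypothetical lookup tables holding only the names this document mentions.
var ADDRESS_NAMES = ["to", "cc", "bcc"];
var MULTI_LINE_NAMES = ["body"];
var SINGLE_LINE_NAMES = ["subject"];

// Expects a name that has already been percent-decoded and lowercased.
// Returns "address", "multi-line", "single-line" or "standard".
function accumulationType(name) {
    if (ADDRESS_NAMES.indexOf(name) !== -1) {
        return "address";
    }
    if (MULTI_LINE_NAMES.indexOf(name) !== -1) {
        return "multi-line";
    }
    if (SINGLE_LINE_NAMES.indexOf(name) !== -1) {
        return "single-line";
    }
    // Values of unknown type follow the standard rule.
    return "standard";
}

console.log(accumulationType("cc"));      // "address"
console.log(accumulationType("body"));    // "multi-line"
console.log(accumulationType("subject")); // "single-line"
console.log(accumulationType("x-foo"));   // "standard"
```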
Here is a full parsing example in Javascript:
// Example
See mailto_uri_parser.js for now.
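In the meantime, the steps described in this section can be strung together roughly as follows. This is only a sketch and is not the contents of mailto_uri_parser.js. It assumes the URI has already been made safe for processing, it does no accumulation (every value for a name is simply collected into an array), and the name parseMailtoURI and the shape of the result are choices made for this example.

```javascript
// Names whose values are treated as single-line (CR and LF are stripped).
var SINGLE_LINE = ["to", "cc", "bcc", "subject"];

// UTF-8 percent-decoding; invalid input yields "".
function decode(s) {
    try {
        return decodeURIComponent(s);
    } catch (e) {
        return "";
    }
}

// parseMailtoURI is a hypothetical name. It returns null for non-mailto
// input, otherwise an object that maps each lowercased name to an array
// of its decoded values, in the order they appeared.
function parseMailtoURI(safeURI) {
    if (!/^mailto:/i.test(safeURI)) {
        return null;
    }
    // Build the dataset: drop the fragment and the scheme, escape "&" in the
    // address part, then turn the first "?" into "&".
    var dataset = "to=" + safeURI.replace(/#[\s\S]*$/, "").substring(7)
        .replace(/^[^?]+/, function(match) {
            return match.replace(/&/g, "%26");
        })
        .replace(/\?/, "&");

    // A prototype-less object, so that names such as "__proto__" are harmless.
    var fields = Object.create(null);
    var hfields = dataset.split("&");
    for (var i = 0; i < hfields.length; ++i) {
        var eq = hfields[i].indexOf("=");
        if (eq === -1) {
            continue; // no "=" means no hfname or hfvalue
        }
        var name = decode(hfields[i].substring(0, eq)).toLowerCase();
        var value = decode(hfields[i].substring(eq + 1));
        if (SINGLE_LINE.indexOf(name) !== -1) {
            value = value.replace(/\r|\n/g, "");
        }
        if (!(name in fields)) {
            fields[name] = [];
        }
        fields[name].push(value);
    }
    return fields;
}

var parsed = parseMailtoURI("mailto:a@example.com?Subject=Hi%0D%0Athere&cc=b@example.com&cc=c@example.com#frag");
console.log(parsed.to);      // [ "a@example.com" ]
console.log(parsed.subject); // [ "Hithere" ]
console.log(parsed.cc);      // [ "b@example.com", "c@example.com" ]
```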
Yet to be defined. See this for now.
When a browser passes a mailto URI (from the address field or an HTML link, for example) to another UA such as a mail client, it MUST NOT strip a fragment identifier if the URI contains one. It is up to the mail client, not the browser, to properly discard the fragment identifier. One reason is that the browser is not the consumer; it just passes the URI along to the client (after normalizing it, escaping it and quoting it so that it's safe to pass on the command line). Another reason is that if mailto URIs are defined to make use of fragment identifiers in the future and mail clients start supporting them, browsers won't have to update their passing code, and the full URI, including the fragment identifier, will make it to the client.
When a browser resolves a mailto IRI (in an HTML href attribute for example), when it passes a mailto IRI to another client, or when a user copies a mailto IRI to the clipboard, the IRI MUST be converted to a mailto URI using UTF-8, regardless of the page's encoding. This is important because most clients (including webmails) expect only UTF-8-based mailto URIs.
For example, even in a Shift-JIS page, an href attribute value of "mailto:?subject=√" must be resolved to "mailto:?subject=%E2%88%9A". Further, an href attribute value of "mailto:?subject=%E2%88%9A" in a Shift-JIS page must still resolve to "mailto:?subject=%E2%88%9A".
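A minimal sketch of that conversion, assuming the IRI is already available as a well-formed JavaScript string: every run of non-ASCII characters is percent-encoded as UTF-8 and everything else, including existing %HH sequences, is left untouched. The name iriToURI is hypothetical, and the sketch does not cover the other normalization a browser performs when resolving a link.

```javascript
// iriToURI is a hypothetical helper. encodeURIComponent always encodes as
// UTF-8, so the page's encoding plays no part in the result.
// Assumption: the input contains no lone surrogates (encodeURIComponent
// would throw a URIError for those).
function iriToURI(iri) {
    return iri.replace(/[^\x00-\x7F]+/g, function(run) {
        return encodeURIComponent(run);
    });
}

console.log(iriToURI("mailto:?subject=\u221A"));    // "mailto:?subject=%E2%88%9A"
console.log(iriToURI("mailto:?subject=%E2%88%9A")); // "mailto:?subject=%E2%88%9A"
```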
Note however that some browsers will show IRIs in the status field and address bar instead of URIs. The href property for links and the location property for documents should still return URI values though.
HTML form submission (when the action attribute starts with 'mailto:') is specified in HTML5 under the Form Submission Algorithm (see the table under step 15) and in the Mail with Headers and Mail as Body sections.
The following is a Javascript example of how the outgoing mailto URI is generated for POST when enctype is "application/x-www-form-urlencoded", and for all other methods regardless of enctype. (Handling of POST with other enctype values is not shown in the example.)
HTMLFormElement.prototype.createMailtoURIFromFormData = function() {
    var destination = "mailto:";
    if (this.action.search(/mailto:/i) === 0) {
        var headers = this.createDatasetFromActiveFormElements();
        destination = this.action;
        // Avoid ambiguous use of +. (Not in HTML5).
        destination = destination.replace(/\+/g, "%2B");
        var qm = destination.indexOf('?');
        if (this.method === "post") {
            var body = encodeURIComponent(headers);
            if (qm === -1) {
                destination += '?';
            } else {
                destination += '&';
            }
            destination += "body=";
            destination += body;
        } else {
            if (qm !== -1) {
                destination = destination.substring(0, qm);
            }
            destination += '?';
            destination += headers.replace(/\+/g, "%20");
        }
    }
    return destination;
};
Note that 'createDatasetFromActiveFormElements' above is an example representing the dataset you get by running the HTML5 form submission algorithm.
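To make the two branches easier to try outside a browser, the same logic can be sketched as a plain function that takes the action, the method and an already-built dataset string as arguments. The name buildMailtoDestination and the sample inputs are assumptions made for this illustration; the dataset strings stand in for whatever the HTML5 form submission algorithm would produce.

```javascript
// buildMailtoDestination is a hypothetical, DOM-free restatement of the
// example above. `dataset` is an application/x-www-form-urlencoded string.
function buildMailtoDestination(action, method, dataset) {
    var destination = "mailto:";
    if (action.search(/mailto:/i) !== 0) {
        return destination;
    }
    // Avoid ambiguous use of "+" in the action (not in HTML5).
    destination = action.replace(/\+/g, "%2B");
    var qm = destination.indexOf("?");
    if (method === "post") {
        // POST: keep the action's query and append the whole dataset as the body.
        destination += (qm === -1 ? "?" : "&") + "body=" + encodeURIComponent(dataset);
    } else {
        // Other methods: replace the action's query with the dataset,
        // turning form-style "+" into "%20".
        if (qm !== -1) {
            destination = destination.substring(0, qm);
        }
        destination += "?" + dataset.replace(/\+/g, "%20");
    }
    return destination;
}

console.log(buildMailtoDestination("mailto:a@example.com?subject=x", "post", "name=John+Doe&msg=hi"));
// "mailto:a@example.com?subject=x&body=name%3DJohn%2BDoe%26msg%3Dhi"
console.log(buildMailtoDestination("mailto:a@example.com?subject=x", "get", "subject=Hello+World&body=hi"));
// "mailto:a@example.com?subject=Hello%20World&body=hi"
```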
Also, before submitting a mailto form, the UA SHOULD present the user with a confirmation dialog as the submission will usually launch an external program.