This draft document defines the format and handling of mailto URIs.
RFC2368 does not apply to this document.
Mailto URIs are used to specify email message compose data in a portable format. Mail clients parse and decode this data to generate default text field values for compose forms, which users can review before sending.
"mailto:?" + "&-separated list of hname=hvalue pairs"
All hname=hvalue pairs are optional.
"mailto:" and the TO, CC, BCC, SUBJECT and BODY hnames are handled in a case-insensitive manner. Other hnames are generally handled in a case-sensitive manner, but it's up to the handler of the URI.
In HTML and XML markup, "&" must be represented as "&amp;".
In short, mailto URIs just carry a bunch of percent-encoded field values that mail clients percent-decode. That's all mailto URIs are.
TO, SUBJECT, BODY, CC and BCC are the basic hnames that should be supported.
Everything after "mailto:" on up to (but not including) "?" (or the end of the URI if a "?" is not present) also represents a TO hvalue (where "mailto:" represents the TO hname for the value).
"Comma-separated" refers to separating by a comma and a space. For example: "1, 2, 3, 4, 5" and NOT "1,2,3,4,5". In the case of an hvalue, each comma is encoded as %2C and each space is encoded as %20.
"\r\n-separated" refers to separating by a carriage return (0x0D) and a line feed (0x0A). In the case of an hvalue, each carriage return is encoded as %0D and each line feed is encoded as %0A. For example, "line1%0D%0Aline2".
BCC addresses are meant to be somewhat private. Having a BCC address (in a mailto URI or in raw format) in public content reduces that privacy. Despite the privacy concern with public content, mail clients should still accept BCC hvalues from a mailto URI.
(See RFC2822 - 3.4. Address Specification for specifics.)
Here are some example addresses. They are shown in unencoded form. If these values are put in a mailto URI, they need to be encoded first.
Again, addresses don't go in mailto URIs. Only percent-encoded values representing addresses do. The mail client won't see any addresses until it parses and decodes the mailto URI.
"string" in the following encoding, decoding and parsing steps refers to a sequence of 8-bit unsigned characters. If working with a string of characters with a different type and/or width, adjust the steps to conform.
Hvalues and hnames need to be utf-8-percent-encoded.
While I is less than the length of S:
If C equals a Line Feed (0x0A):
Else if C is found in NOENCODE:
Else:
All characters in an hvalue or hname that are not in the NOENCODE group need to be encoded to their corresponding %HH. "@", for example, must still be encoded to %40 like it is in HTTP query string values. It is not a separator in a mailto URI and is only a separator in a raw address value, which mailto URIs don't contain. The same goes for "+". It is not exempt from being encoded to "%2B". (Mail clients still need to treat "@" as "@" and "+" as "+" should they occur in raw form in the URI.)
Just like hvalues, hnames can contain utf-8 percent-encoded sequences. This allows clients to support unicode hnames if needed. For example, mailto:?%E2%88%9A=%E2%88%9A is valid in a mailto URI.
In short, hnames and hvalues are just encoded versions of some unencoded values. You can think of it (in Javascript terms) as: encodeURIComponent(some_value) + '=' + encodeURIComponent(some_value) with the addition that all \r\n, stray \r and \n are represented as %0D%0A.
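A minimal Javascript sketch of the encoding summarized above. The function names encodeMailtoValue and encodeMailtoPair are hypothetical, and the sketch assumes that the set of characters encodeURIComponent leaves alone is an acceptable stand-in for NOENCODE.

```javascript
// Hypothetical helper: percent-encode one hname or hvalue.
// Assumption: encodeURIComponent's unreserved set stands in for NOENCODE.
function encodeMailtoValue(value) {
    // Normalize \r\n, stray \r and stray \n to \r\n so that every
    // newline comes out as %0D%0A.
    var normalized = String(value)
        .replace(/\r\n/g, "\n")
        .replace(/\r/g, "\n")
        .replace(/\n/g, "\r\n");
    // Note: encodeURIComponent throws a URIError on lone surrogates.
    // Whether to catch that is left to the caller.
    return encodeURIComponent(normalized);
}

// Hypothetical helper: build one encoded hname=hvalue pair.
function encodeMailtoPair(hname, hvalue) {
    return encodeMailtoValue(hname) + "=" + encodeMailtoValue(hvalue);
}
```

For example, encodeMailtoPair("to", "a+b@example.com, c@example.com") encodes "@", "+", the comma and the space, since none of them are separators inside an hvalue.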
If your encoding (or decoding) methods are strict and throw exceptions on invalid utf-8 sequences for example, whether you catch those exceptions and return an empty string, the original string or something else, is up to you. Returning an empty string on an error is often desirable, but you may want to be more relaxed or more strict.
Subject fields in many mail clients don't support newlines. Clients may strip the newlines out, but they might not and might break something. If unsure, you can strip all %0D%0A from the subject hvalue. Or, you can convert each %0D%0A to a %20. Or, you can convert each %0D to %20 and each %0A to %20 so that you can have *some* clue as to where the newlines were at by looking for 2 consecutive spaces in the subject field.
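The three subject options above could be sketched in Javascript like this. All three function names are hypothetical, and each one works on an already-encoded SUBJECT hvalue.

```javascript
// Option 1: strip every encoded \r\n pair from the subject hvalue.
function stripSubjectNewlinePairs(encoded) {
    return encoded.replace(/%0D%0A/gi, "");
}

// Option 2: turn every encoded \r\n pair into a single encoded space.
function subjectNewlinePairsToSpace(encoded) {
    return encoded.replace(/%0D%0A/gi, "%20");
}

// Option 3: turn each %0D and each %0A into its own encoded space, so
// that two consecutive spaces hint at where a newline pair used to be.
function subjectNewlineBytesToSpaces(encoded) {
    return encoded.replace(/%0D|%0A/gi, "%20");
}
```

The "i" flag is there because %0d and %0a are equally valid spellings of the same bytes.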
Note that <input type="url"> in Web Forms 2.0 requires an IRI. In this case, in addition to the normal reserved characters that need to be encoded, instances of "(", ")", "!", "*", and "'" may also need to be encoded to their corresponding %HH so that the input will consider the mailto URI an IRI. Otherwise, the input will consider the value invalid (unless the UA allows the invalid input and fixes it for you). See RFC3987 for details.
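As a rough Javascript sketch of that extra encoding step (encodeForUrlInput is a hypothetical name, and encodeURIComponent is assumed to handle the normal reserved characters):

```javascript
// Hypothetical helper: encode a value with encodeURIComponent, then also
// encode the five characters that encodeURIComponent leaves raw:
// "!", "'", "(", ")" and "*".
function encodeForUrlInput(value) {
    return encodeURIComponent(value).replace(/[!'()*]/g, function (c) {
        return "%" + c.charCodeAt(0).toString(16).toUpperCase();
    });
}
```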
A raw "#" should never appear in a mailto URI. It should always be represented in a mailto URI as %23. If a raw "#" happens to be in a mailto URI, some mail clients (or browsers passing a mailto URI to a mail client) might treat it as a fragment identifier, where the "#" and everything after it is ignored. Other clients may treat it as just another character. Making sure "#" is always encoded as %23 avoids this issue. However, if a raw "#" is present in a mailto URI, the suggestion is to do what most clients do, and that's to treat the "#" as just another char. For example, mailto:#test would result in "#test" being in the TO field. mailto:?subject=#test would result in "#test" being in the subject field. In short, there's no such thing as a fragment identifier in a mailto URI, according to most clients.
If, for whatever reason, you want to compose a URI that puts just numbers in the TO field, you have to be careful. For example, if you wanted 44 to show up in the TO field, you might use "mailto:44". However, this creates a problem if you enter the URI in a browser's address field. The browser's address field might think that you mean to go to some mailto site on port 44. To avoid this, use "mailto:?to=44" instead. Or, percent-encode the first 4 to %34. This should not be a problem for links in a web page.
While I is less than the length of S:
If C equals "%" and I + 2 is less than the length of S:
If F1 or F2 was not found in HEXITS:
Else:
Else:
RET will be a string of raw utf-8 sequences with all newline pairs and single newlines normalized to \n.
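One way the decoding steps might look in Javascript. This is a sketch under assumptions, not the reference parser: decodeMailtoValue is a hypothetical name, an invalid %HH is assumed to be kept as a literal "%", and the result is returned as a Javascript string (decoded from the utf-8 bytes with TextDecoder) rather than as a raw byte string.

```javascript
// Hypothetical helper: percent-decode an hname or hvalue and normalize
// \r\n pairs and stray \r to \n.
function decodeMailtoValue(s) {
    var HEXITS = "0123456789ABCDEFabcdef";
    var encoder = new TextEncoder();
    var bytes = [];
    var i = 0;
    while (i < s.length) {
        var c = s.charAt(i);
        if (c === "%" && i + 2 < s.length &&
                HEXITS.indexOf(s.charAt(i + 1)) !== -1 &&
                HEXITS.indexOf(s.charAt(i + 2)) !== -1) {
            // Valid %HH: append the byte it represents.
            bytes.push(parseInt(s.substr(i + 1, 2), 16));
            i += 3;
        } else {
            // Raw character (including a "%" that does not start a valid
            // %HH): append its utf-8 bytes. "+" stays "+".
            var ch = String.fromCodePoint(s.codePointAt(i));
            encoder.encode(ch).forEach(function (b) { bytes.push(b); });
            i += ch.length;
        }
    }
    // Assumption: invalid utf-8 sequences become U+FFFD (TextDecoder's
    // default, non-fatal behavior) instead of raising an error.
    var text = new TextDecoder("utf-8").decode(new Uint8Array(bytes));
    return text.replace(/\r\n/g, "\n").replace(/\r/g, "\n");
}
```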
In a mailto URI, there can be more than one hname with the same name. When a client parses a mailto URI to generate a TO, CC, BCC, SUBJECT or BODY value, the following rules should be followed.
The decoded version of the generated TO value is what browsers should use for the mailto link "Copy email address" feature. "Copy subject", "Copy body", "Copy CC addresses" and "Copy BCC addresses" features should use the decoded version of their corresponding generated value.
If the client supports other address-based hnames, the TO, CC and BCC rules can be used if desired. If the client supports other multiline-based hnames, the body rule can be used. If the client supports other single-line hnames, the subject rule can be used.
Note that array-based hnames like ?body[]=line1&body[]=line2 are not supported in mailto URIs.
For TO, CC and BCC (and generally other address types), there is no need to join empty hvalues. You'll end up getting ", , , , ," if you do.
For SUBJECT, since there can be only one subject in a message, make each new SUBJECT hvalue override the previous one. Also, there are existing implementations that already do this.
For BODY, do not join empty BODY hvalues until there is a BODY hvalue with significant content (non-empty). That way, empty body hvalues do not cause a bunch of empty lines at the top of the BODY field before there's significant content.
If you want your mailto URI to be handled correctly in clients that don't support duplicate hnames, don't use duplicate hnames in your URI. Currently, there's not much need to anyway.
Before parsing, the URI needs to be stripped of "mailto:" and converted into a dataset of &-separated hname=hvalue pairs so that they can be split by "&".
IF S starts with a case-insensitive "mailto:":
IF one or more '?' are present in SUB:
Else:
Since "mailto:" is considered a TO hname, step #3 will ensure that there's always a TO hvalue in the returned dataset even if the hvalue is empty. If you do not want this (for example, because you're going to split the dataset and store the values in a multimap of some sort and want to skip the first TO entry if it's empty), you can use the following steps for #3 instead.
IF S equals a case-insensitive "mailto:":
Else IF S starts with a case-insensitive "mailto:?":
Else:
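The conversion to a dataset might be sketched in Javascript as follows. The function name mailtoToDataset and its skipEmptyTo flag are hypothetical. The sketch assumes that the part before the first "?" becomes a leading "to=" pair, and that a string that does not start with "mailto:" yields null.

```javascript
// Hypothetical helper: strip "mailto:" and return a string of
// &-separated hname=hvalue pairs.
// skipEmptyTo selects the alternative steps that avoid an empty
// leading TO entry.
function mailtoToDataset(uri, skipEmptyTo) {
    if (uri.slice(0, 7).toLowerCase() !== "mailto:") {
        return null; // Assumption: not a mailto URI, nothing to parse.
    }
    var sub = uri.slice(7);
    if (skipEmptyTo) {
        if (sub === "") {
            return "";
        }
        if (sub.charAt(0) === "?") {
            return sub.slice(1);
        }
    }
    var q = sub.indexOf("?");
    if (q === -1) {
        return "to=" + sub;
    }
    // Only the first "?" is treated as the separator.
    return "to=" + sub.slice(0, q) + "&" + sub.slice(q + 1);
}
```

With the default steps, "mailto:?to=44" becomes "to=&to=44". With skipEmptyTo set, it becomes "to=44".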
While I is less than the length of HLIST:
If a '=' is found in S:
IF HNAME is empty:
IF LCCHECK equals "to" or "cc" or "bcc" or "subject" or "body":
If HNAME equals "to":
If HVALUE is not empty:
If TO is not empty:
Else if HNAME equals "cc":
If HVALUE is not empty:
If CC is not empty:
Else if HNAME equals "bcc":
If HVALUE is not empty:
If BCC is not empty:
Else if HNAME equals "subject":
Else if HNAME equals "body":
If (HVALUE is empty and BODY is empty) equals false:
If BODY is not empty:
Else if HNAME equals "another hname you want to support":
Handle multiple instances of HNAME as desired. However, if it's an address type, it is recommended that you parse it like TO, CC and BCC and handle HNAME in a case-insensitive manner.
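A compact Javascript sketch of the parsing steps and the duplicate-hname rules. It is a simplification under stated assumptions: parseMailtoDataset and decodePart are hypothetical names, hvalues are decoded before they are joined, addresses are joined with ", ", body values are joined with "\n", and decodeURIComponent (falling back to the raw text on error) stands in for a full decoder.

```javascript
// Hypothetical, simplified decoder. Assumption: on an invalid sequence,
// return the input unchanged.
function decodePart(s) {
    try {
        return decodeURIComponent(s).replace(/\r\n/g, "\n").replace(/\r/g, "\n");
    } catch (e) {
        return s;
    }
}

// Hypothetical parser: takes "hname=hvalue&hname=hvalue..." and returns
// the generated TO, CC, BCC, SUBJECT and BODY values in decoded form.
function parseMailtoDataset(dataset) {
    var result = { to: "", cc: "", bcc: "", subject: "", body: "" };
    var hlist = dataset.split("&");
    for (var i = 0; i < hlist.length; i++) {
        var s = hlist[i];
        var eq = s.indexOf("=");
        var hname = decodePart(eq === -1 ? s : s.slice(0, eq));
        var hvalue = decodePart(eq === -1 ? "" : s.slice(eq + 1));
        if (hname === "") {
            continue; // Nothing to match against.
        }
        // Decode first, then lowercase, so "%74%6F" and "TO" match "to".
        var lc = hname.toLowerCase();
        if (lc === "to" || lc === "cc" || lc === "bcc") {
            // Address types: skip empty hvalues, join the rest.
            if (hvalue !== "") {
                result[lc] += (result[lc] !== "" ? ", " : "") + hvalue;
            }
        } else if (lc === "subject") {
            // The last SUBJECT overrides earlier ones.
            result.subject = hvalue;
        } else if (lc === "body") {
            // Skip empty BODY hvalues until there is significant content.
            if (!(hvalue === "" && result.body === "")) {
                result.body += (result.body !== "" ? "\n" : "") + hvalue;
            }
        }
        // Other hnames are ignored in this sketch.
    }
    return result;
}
```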
The decoded values are what mail clients use (after converting them to the needed encoding) to fill in a compose form's text fields.
If your app and/or your parser does not support null bytes, before parsing the mailto URI, convert "%00" (and raw null bytes) to "%2500" so that when decoding, they show up in the compose field as "%00". If working with a string of characters with a different type and/or width, adjust this normalization accordingly. Use this same type of normalization for other bytes that are not supported and for invalid %HH. For example, with "mailto:?subject=%YY", %YY is not a valid %HH. It's really just a % (that should have been encoded to %25) and 2 Y's. %YY should be treated as %25YY so that when it's decoded, it comes out as %YY in the subject field.
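This normalization could be sketched in Javascript as below. The function name normalizeUnsupported is hypothetical, and only null bytes and invalid %HH are covered. Other unsupported bytes would need their own replacements.

```javascript
// Hypothetical pre-parse normalization.
function normalizeUnsupported(uri) {
    return uri
        // A "%" that is not followed by two hex digits is really a
        // literal "%", so encode it to %25 ("%YY" becomes "%25YY").
        .replace(/%(?![0-9A-Fa-f]{2})/g, "%25")
        // Encoded and raw null bytes become "%2500", which decodes to
        // the visible text "%00".
        .replace(/%00/g, "%2500")
        .replace(/\0/g, "%2500");
}
```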
Click on the link. In your mail client's compose form, you should get the following results.
To:
Cc:
Bcc:
Subject:
Note that there may be extra lines in the body in your mail client if it has signature support.
Also note that for the TO, CC and BCC fields, user-input in most clients requires properly-escaped addresses where certain characters are escaped with "\" and other parts are quoted (see Address Syntax). However, some clients allow unescaped user-input and will, after decoding an hvalue, deescape characters escaped with "\" when filling in the fields. As long as the intended headers are produced in the outgoing message, the client can use whatever user-input method is desired.
Note that this test currently does not test the matching of an hname if it's encoded. For example, "mailto:?%74%6F=email%40site.com" is the same as "mailto:?to=email%40site.com". This is covered in the parsing section though where the hname is decoded first before lowercasing and checking for a match.
In MailtoURIParserPack.zip are C, C++, D, Java, Javascript, Perl, Python, Ruby, Pike, Lua, Tcl, PHP5 and Python3000 MailtoURIParser classes that use the rules in this document to show examples of parsing. The test above uses a Javascript version to parse the mailto link to generate the form data.
Mozilla Thunderbird 3.0 generally parses mailto URIs according to the rules in this document (keeping the deescaping note for the TO, CC and BCC fields in mind). Thunderbird 3.0 passes the test above.
Opera 9.5 generally parses mailto URIs for its built-in mail client according to the rules in this document. It passes the test above.
An Opera UserJS mailto link handler that opens mailto links in various webmails.
The action attribute value of an HTML form can be any valid mailto URI including just "mailto:". When the form is submitted, how the URI is generated before being passed to the mail client depends on the form submission method.
Mailto URIs are NOT of type application/x-www-form-urlencoded and mail clients don't decode + to a space. Spaces in the generated mailto URI must be represented as %20 and NOT +.
When submitting to the mail client with method="get", "mailto:?" plus the encoded data set for the form should be generated. Existing hvalues in the mailto URI will be ignored.
For example, the following form would generate and submit "mailto:?to=to1%40example.com%2C%20to2%40example.com&to=to3%40example.com%2C%20to4%40example.com&cc=cc1%40example.com%2C%20cc2%40example.com&cc=cc3%40example.com%2C%20cc4%40example.com&bcc=bcc1%40example.com%2C%20bcc2%40example.com&bcc=bcc3%40example.com%2C%20bcc4%40example.com&subject=subject%201&subject=subject%202&body=Line%201%0D%0ALine%202&body=Line%203%0D%0ALine%204" to the mail client.
<form action="mailto:?body=This%20hvalue%20will%20be%20removed." method="get">
<p>
<input type="text" name="to" value="to1@example.com, to2@example.com">
<input type="text" name="to" value="to3@example.com, to4@example.com">
<input type="text" name="cc" value="cc1@example.com, cc2@example.com">
<input type="text" name="cc" value="cc3@example.com, cc4@example.com">
<input type="text" name="bcc" value="bcc1@example.com, bcc2@example.com">
<input type="text" name="bcc" value="bcc3@example.com, bcc4@example.com">
<input type="text" name="subject" value="subject 1">
<input type="text" name="subject" value="subject 2">
<textarea name="body">Line 1
Line 2</textarea>
<textarea name="body">Line 3
Line 4</textarea>
<input type="submit" value="Compose">
</p>
</form>
So, for method="get", action="mailto:" is what you should use.
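A hedged Javascript sketch of the method="get" behavior described above. The function name mailtoFromFormGet is hypothetical, and the form's data set is assumed to arrive as an array of [name, value] pairs in document order.

```javascript
// Hypothetical helper: generate the mailto URI for a method="get" form.
// Any hvalues already present in the action URI are ignored.
function mailtoFromFormGet(pairs) {
    function enc(v) {
        // Spaces must come out as %20 (never +) and newlines as %0D%0A.
        var normalized = String(v)
            .replace(/\r\n/g, "\n")
            .replace(/\r/g, "\n")
            .replace(/\n/g, "\r\n");
        return encodeURIComponent(normalized);
    }
    var parts = pairs.map(function (pair) {
        return enc(pair[0]) + "=" + enc(pair[1]);
    });
    return "mailto:?" + parts.join("&");
}
```

Duplicate names are kept as separate pairs. Joining them is left to the mail client's parser.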
When submitting to the mail client with method="post", the encoded data set is itself encoded and used as the BODY hvalue of the generated mailto URI. A Subject hvalue is also generated that contains the User Agent string, which may be overridden by the last Subject hvalue in the URI if one is present. Also, since this is POST, if there are any existing hvalues in the mailto URI, they are kept.
For example, in the following form, "mailto:?body=This%20hvalue%20will%20NOT%20be%20removed.&subject=Form%20Post%20from%20Opera%2F9.50%20(Windows%20NT%205.1%3B%20U%3B%20en)&body=body%3DJust%2520a%2520test." will be generated.
<form action="mailto:?body=This%20hvalue%20will%20NOT%20be%20removed." method="post">
<p>
<input type="text" name="body" value="Just a test.">
<input type="submit" value="Compose">
</p>
</form>
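The method="post" behavior might be sketched in Javascript like this. The function name mailtoFromFormPost is hypothetical, and the "Form Post from " subject prefix is an assumption taken from the Opera example above rather than a stated rule.

```javascript
// Hypothetical helper: generate the mailto URI for a method="post" form.
// action is the form's mailto URI, pairs is an array of [name, value]
// pairs, and userAgent is the browser's User Agent string.
function mailtoFromFormPost(action, pairs, userAgent) {
    function enc(v) {
        var normalized = String(v)
            .replace(/\r\n/g, "\n")
            .replace(/\r/g, "\n")
            .replace(/\n/g, "\r\n");
        return encodeURIComponent(normalized);
    }
    // Encode the data set once to get name=value pairs...
    var dataset = pairs.map(function (pair) {
        return enc(pair[0]) + "=" + enc(pair[1]);
    }).join("&");
    // ...then append to the action URI, keeping its existing hvalues.
    var sep = "&";
    if (action.indexOf("?") === -1) {
        sep = "?";
    } else if (action.charAt(action.length - 1) === "?") {
        sep = "";
    }
    // The whole data set is encoded again and used as a BODY hvalue.
    return action + sep +
        "subject=" + enc("Form Post from " + userAgent) +
        "&body=" + enc(dataset);
}
```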
Also see JS parsing examples and createMailtoURIFromActiveDataSet() in BeforeMailtoURL.js for details.
action="mailto:" is partially broken (at different levels) in Opera, Safari, Firefox and IE. Firefox has the best support though with its only problem being that it encodes spaces as +.
When browsers resolve mailto URIs during markup parsing or input submission, raw white-space and non-ascii characters (even if the URI is represented with a non-ascii string) should be UTF8-Percent-encoded to %HH. For example, after entering "mailto:√?subject=1 2√" into a browser's address field and pressing enter, it should be resolved to "mailto:%E2%88%9A?subject=1%202%E2%88%9A". It is very important that this happens before passing a mailto URI on the command line. The presence of raw quotes, backslashes and wide characters can split up commands and execute arbitrary programs (even if the URI is quoted before passing).
Also, when browsers resolve, if a mailto URI contains a %HH sequence representing a character that is not reserved, the %HH sequence should be left alone (at least in places like the address field where a user might be manually editing the URI, or for links when the user might "Copy link address"). It should not be decoded just because it doesn't need to be encoded. For example, "mailto:%2E?subject=%2E" should not be resolved to "mailto:.?subject=.". In this case, it should be as the author of the link intended, if possible.
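A small Javascript sketch of this kind of resolving. The function name resolveMailtoURI is hypothetical. The sketch assumes that control characters are encoded along with white-space and non-ascii characters, and that a lone surrogate is replaced by an encoded U+FFFD.

```javascript
// Hypothetical helper: percent-encode raw white-space, control and
// non-ascii characters, and leave everything else (including existing
// %HH sequences) exactly as the author wrote it.
function resolveMailtoURI(uri) {
    // With the "u" flag, the match is per code point, so characters
    // outside the BMP are passed to the callback whole.
    return uri.replace(/[^\x21-\x7E]/gu, function (ch) {
        try {
            return encodeURIComponent(ch);
        } catch (e) {
            // Assumption: a lone surrogate becomes an encoded U+FFFD.
            return "%EF%BF%BD";
        }
    });
}
```

Because "%" itself is never touched, "mailto:%2E?subject=%2E" comes back unchanged.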
Also, a browser's "Copy link address" option should copy the mailto link to the clipboard in its fully-encoded, resolved state.
Also, by default, browsers should display mailto links in status bars and tooltips in their fully-encoded, resolved state. Mailto URIs can lose their meaning when you show them in decoded form, and it's no help to the user. The user will get to see everything in decoded format during the compose preview.
A browser's "Send link by mail" feature should generally work like this Javascript example:
function UTF8PercentEncodeWithNormalizedNewlines(s) {
try {
// Normalize raw newlines first so that *if* there are any newlines
// in s, \r\n, stray \r and \n all come out as %0D%0A.
return encodeURIComponent(s.replace(/\r\n/g, "\n").replace(/\r/g, "\n").replace(/\n/g, "\r\n"));
} catch (e) {
return "Error%20encoding%20data.";
}
}
function UTF8PercentEncodeWithNewlinesStripped(s) {
try {
return encodeURIComponent(s.replace(/(\r|\n)/g, ""));
} catch (e) {
return "Error%20encoding%20data.";
}
}
function sendLinkByMail() {
var subject = UTF8PercentEncodeWithNewlinesStripped(document.title);
var body = UTF8PercentEncodeWithNormalizedNewlines("<" + document.location + ">");
var uri = "mailto:?subject=" + subject + "&body=" + body;
window.open(uri);
}
// For Thunderbird 2 (not 3) with HTML composition turned on for example
function sendLinkByMailBodyIsHTML() {
var subject = UTF8PercentEncodeWithNewlinesStripped(document.title);
// The body is treated as HTML, so "<" and ">" go in as entities.
var body = UTF8PercentEncodeWithNormalizedNewlines("&lt;" + document.location + "&gt;");
var uri = "mailto:?subject=" + subject + "&body=" + body;
window.open(uri);
}
// For Thunderbird 2 (not 3) with HTML composition turned on for example
function sendLinkByMailBodyIsHTMLActualLink() {
var subject = UTF8PercentEncodeWithNewlinesStripped(document.title);
var link = '<a href="' + document.location + '">' + document.location + '</a>';
var body = UTF8PercentEncodeWithNormalizedNewlines(link);
var uri = "mailto:?subject=" + subject + "&body=" + body;
window.open(uri);
}