HTML / XHTML / WML / XML Validator
Identification of the charset encoding of an HTML document is based on the HTML 4.01 W3C specification. However, that specification leaves some questions open. The following points are still unexplained or inconsistent:
HTTP-Header: Content-Type: text/html; charset=ISO-8859-1
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN"
"http://www.w3.org/TR/html4/loose.dtd">

<html>
<head>
<title>no error</title>
<meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1">
</head>
<body></body>
</html>

This document contains a valid charset declaration in both the meta tag and the HTTP header. The charset from the HTTP header has to be used.
HTTP-Header: Content-Type: text/html
EF BB BF<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN"
"http://www.w3.org/TR/html4/loose.dtd">

<html>
<head>
<title>Byte order Mark != Meta-Charset</title>
<meta http-equiv="Content-Type" content="text/html; charset=ISO-8859-1">
</head>
<body>äöüÄÖÜß</body>
</html>

The document is encoded in UTF-8, as its byte order mark (EF BB BF) shows, but the meta tag declares ISO-8859-1. This conflict should be pointed out to the user.
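Detecting the byte order mark used in this example could look roughly like the following. This is a minimal sketch, not the validator's actual code: the function name `detect_bom` is hypothetical, and only the UTF-8 and UTF-16 marks that appear in these examples are covered.

```python
import codecs

# Hypothetical lookup table: leading bytes -> encoding name.
# Only the marks discussed in the examples are listed here.
_BOMS = [
    (codecs.BOM_UTF8, "utf-8"),          # EF BB BF
    (codecs.BOM_UTF16_LE, "utf-16-le"),  # FF FE
    (codecs.BOM_UTF16_BE, "utf-16-be"),  # FE FF
]


def detect_bom(data: bytes):
    """Return the encoding implied by a leading BOM, or None if there is none."""
    for mark, name in _BOMS:
        if data.startswith(mark):
            return name
    return None
```

For the document above, `detect_bom` would report `utf-8`, which can then be compared with the ISO-8859-1 declared in the meta tag.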
HTTP-Header: Content-Type: text/html; charset=UTF-8
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN"
"http://www.w3.org/TR/html4/loose.dtd">

<html>
<head>
<title>HTTP-Header != Meta-Charset</title>
<meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1">
</head>
<body>äöüÄÖÜß</body>
</html>

The charset encoding in the meta tag differs from the one in the HTTP header. The charset from the HTTP header has to be used.
HTTP-Header: Content-Type: text/html
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN"
"http://www.w3.org/TR/html4/loose.dtd">

<html>
<head>
<title>missing http-charset</title>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8">
</head>
<body>äöüÄÖÜß?</body>
</html>

The HTTP header contains no charset encoding. The charset from the meta tag has to be used.
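To apply this rule, a validator first has to read the charset parameter from both places. The sketch below shows one way to do that with the Python standard library. The names `charset_from_content_type` and `meta_charset` are hypothetical, and the ISO-8859-1 pre-scan is a simplifying assumption: it only finds the meta tag in ASCII-compatible documents, not in UTF-16 ones.

```python
from email.message import Message
from html.parser import HTMLParser


def charset_from_content_type(value: str):
    """Return the lowercased charset parameter of a Content-Type value, or None."""
    msg = Message()
    msg["Content-Type"] = value
    return msg.get_content_charset()


class _MetaCharsetFinder(HTMLParser):
    """Remembers the charset of the first <meta http-equiv="Content-Type">."""

    def __init__(self):
        super().__init__()
        self.charset = None

    def handle_starttag(self, tag, attrs):
        if tag != "meta" or self.charset is not None:
            return
        attr = dict(attrs)
        content = attr.get("content")
        if (attr.get("http-equiv") or "").lower() == "content-type" and content:
            self.charset = charset_from_content_type(content)


def meta_charset(data: bytes):
    """Pre-scan the raw bytes for a meta charset declaration."""
    finder = _MetaCharsetFinder()
    # Assumption: ISO-8859-1 maps every byte to a character, so the
    # pre-scan never fails and ASCII markup stays readable.
    finder.feed(data.decode("iso-8859-1"))
    return finder.charset
```

With the example above, `charset_from_content_type("text/html")` yields `None`, while `meta_charset` yields `utf-8`, so the meta tag value is the one to use.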
HTTP-Header: Content-Type: text/html; charset=UTF-8
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN"
"http://www.w3.org/TR/html4/loose.dtd">

<html>
<head>
<title>missing meta-charset</title>
</head>
<body>äöüÄÖÜß</body>
</html>

The document itself contains no charset declaration; the charset encoding is communicated only in the HTTP header. Because the charset encoding cannot be identified when the document is not delivered via HTTP (e.g. from a local copy), a notice is shown to the user.
HTTP-Header: Content-Type: text/html
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN"
"http://www.w3.org/TR/html4/loose.dtd">

<html>
<head>
<title>missing all-charset</title>
</head>
<body>äöüÄÖÜß</body>
</html>

No charset declaration was found in the meta tag, in a byte order mark (BOM), or in the HTTP header. The W3C specification recommends ignoring the RFC 2616 default and therefore not falling back to ISO-8859-1, but it does not say which charset encoding should be used instead. For this reason, we decided to abort validation and report an error.
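The precedence described in the cases above can be sketched as a small function. The names `choose_charset` and `NoCharsetError` are hypothetical. The text states that the HTTP header wins over the meta tag and that validation is aborted when nothing is declared; consulting the BOM only after the meta tag is an assumption of this sketch, since the text does not fix that order.

```python
class NoCharsetError(Exception):
    """Raised when no source declares a charset, so validation is aborted."""


def choose_charset(http_charset, meta_charset, bom_charset):
    """Return (charset, source) following the precedence sketched in the text.

    Each argument is a charset name or None if that source declares nothing.
    """
    if http_charset:
        return http_charset.lower(), "http"
    if meta_charset:
        return meta_charset.lower(), "meta"
    if bom_charset:
        # Assumption: the BOM is only consulted when HTTP header and meta
        # tag are both silent; the original text does not state this order.
        return bom_charset.lower(), "bom"
    # No fallback to ISO-8859-1: abort instead of guessing.
    raise NoCharsetError("no charset declared in HTTP header, meta tag, or BOM")
```

For the "missing all-charset" document, all three arguments are `None`, so the function raises `NoCharsetError` rather than guessing an encoding.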
The following examples are UTF-16 encoded HTML documents.
HTTP-Header: Content-Type: text/html; charset=UTF-16
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN"
"http://www.w3.org/TR/html4/loose.dtd">

<html>
<head>
<title>no error UTF-16 without byte order mark</title>
<meta http-equiv="Content-Type" content="text/html; charset=utf-16">
</head>
<body>äöüÖÄÜß</body>
</html>

This HTML document is encoded in UTF-16 without a byte order mark and contains no errors.
HTTP-Header: Content-Type: text/html; charset=UTF-16
FF FE<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN"
"http://www.w3.org/TR/html4/loose.dtd">

<html>
<head>
<title>no error UTF-16 with byte order mark</title>
<meta http-equiv="Content-Type" content="text/html; charset=utf-16">
</head>
<body>äöüÖÄÜß</body>
</html>

This HTML document is encoded in UTF-16 with a byte order mark (BOM) and contains no errors.
The following examples are UTF-16 encoded and start with a byte order mark (BOM).
HTTP-Header: Content-Type: text/html; charset=ISO-8859-1
FF FE<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN"
"http://www.w3.org/TR/html4/loose.dtd">

<html>
<head>
<title>UTF-16; HTTP-Header != BOM</title>
<meta http-equiv="Content-Type" content="text/html; charset=utf-16">
</head>
<body></body>
</html>

In this example the charset declared in the HTTP header differs from the one indicated by the BOM. In such cases the charset from the HTTP header is used. Because the document is encoded in UTF-16 but has to be processed as ISO-8859-1, parsing errors should be reported.
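Why parsing errors are to be expected here can be shown with a few lines of Python. This is only an illustration with a hypothetical helper name, assuming little-endian UTF-16 to match the FF FE mark in the example: when UTF-16 bytes are decoded as ISO-8859-1, the BOM turns into two stray characters and every ASCII character is followed by a NUL character.

```python
import codecs


def decode_utf16_as_latin1(text: str) -> str:
    """Encode text as UTF-16 LE with BOM, then decode it as ISO-8859-1."""
    raw = codecs.BOM_UTF16_LE + text.encode("utf-16-le")
    # ISO-8859-1 maps every byte to a character, so decoding never fails,
    # but the result is not the markup the author wrote.
    return raw.decode("iso-8859-1")


garbled = decode_utf16_as_latin1("<html>")
print(repr(garbled))  # the tag name is interleaved with NUL characters
```

A parser fed with this text no longer sees `<html>` as a tag, which is the kind of error the validator is expected to report.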
HTTP-Header: Content-Type: text/html; charset=UTF-16
FF FE<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN"
"http://www.w3.org/TR/html4/loose.dtd">

<html>
<head>
<title>UTF-16; HTTP-Header != META</title>
<meta http-equiv="Content-Type" content="text/html; charset=ISO-8859-1">
</head>
<body>äöüÄÖÜß</body>
</html>

The charset encoding in the meta tag differs from the one in the HTTP header. The charset from the HTTP header has to be used.
HTTP-Header: Content-Type: text/html
FF FE<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN"
"http://www.w3.org/TR/html4/loose.dtd">

<html>
<head>
<title>UTF-16; BOM != META</title>
<meta http-equiv="Content-Type" content="text/html; charset=ISO-8859-1">
</head>
<body>äöüÄÖÜß</body>
</html>

This HTML document is UTF-16 encoded, but the meta tag tells us to process it as ISO-8859-1. The BOM indicates the correct charset. This conflict should be pointed out to the user.
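Conflicts like this one could be collected with a helper along the following lines. This is a sketch under assumptions: `find_conflicts` and the source labels are hypothetical, charset names are normalized with Python's `codecs.lookup` (which raises `LookupError` for names Python does not know), and a BOM-specific name such as `utf-16-le` is treated as compatible with a declared `utf-16`.

```python
import codecs


def _family(name: str) -> str:
    """Normalize a charset name and drop a trailing byte-order suffix."""
    canonical = codecs.lookup(name).name  # e.g. "ISO-8859-1" -> "iso8859-1"
    for suffix in ("-le", "-be"):
        if canonical.endswith(suffix):
            return canonical[: -len(suffix)]
    return canonical


def find_conflicts(declared: dict) -> list:
    """Return one message per pair of sources that disagree.

    `declared` maps a source label to a charset name, or to None if that
    source declares nothing.
    """
    present = [(src, cs) for src, cs in declared.items() if cs]
    conflicts = []
    for i, (src_a, cs_a) in enumerate(present):
        for src_b, cs_b in present[i + 1:]:
            if _family(cs_a) != _family(cs_b):
                conflicts.append(f"{src_a} declares {cs_a}, but {src_b} declares {cs_b}")
    return conflicts
```

For the last example, `find_conflicts({"HTTP header": None, "meta tag": "ISO-8859-1", "BOM": "utf-16-le"})` returns a single message naming the meta tag and the BOM, which is the notice to show to the user.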