HTML charset Attribute
Description
The HTML charset attribute specifies the character encoding for the HTML document. This attribute is essential for ensuring that browsers display the text in the document correctly, particularly when it includes characters outside the ASCII range, such as characters from non-Latin alphabets, special characters, and emojis.
The charset attribute is typically used within a <meta> tag in the document's <head> section. The most commonly recommended character encoding in modern web development is UTF-8, an encoding of the Unicode character set, which includes virtually all characters used in human languages. UTF-8 can encode all 1,112,064 valid Unicode code points using one to four one-byte (8-bit) code units, making it a versatile and widely accepted encoding standard.
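As a rough illustration of the one-to-four-byte behaviour described above, the following Python sketch encodes a few sample characters as UTF-8 and counts the resulting bytes. The sample characters and the helper name utf8_length are chosen here for illustration only.

```python
# Sample characters (an assumption for illustration): ASCII letter,
# accented Latin letter, CJK ideograph, and an emoji.
samples = ["A", "é", "中", "😀"]

# Unicode has 0x110000 code points; the 2,048 surrogates (U+D800..U+DFFF)
# are not valid characters, which leaves 1,112,064.
VALID_CODE_POINTS = 0x110000 - (0xDFFF - 0xD800 + 1)


def utf8_length(ch: str) -> int:
    """Return the number of bytes UTF-8 uses to encode a single character."""
    return len(ch.encode("utf-8"))


for ch in samples:
    # Print the code point rather than the glyph so the output is plain ASCII.
    print(f"U+{ord(ch):04X} -> {utf8_length(ch)} byte(s)")

print(f"Valid code points: {VALID_CODE_POINTS:,}")
```

Characters in the ASCII range stay at one byte each, which is one reason UTF-8 is compact for HTML markup.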
Here is an example of how the charset attribute is used in HTML to specify UTF-8 encoding:
<!DOCTYPE html>
<html>
<head>
<meta charset="UTF-8">
<title>Example Document</title>
</head>
<body>
<!-- Your content goes here -->
</body>
</html>
By declaring the character encoding of the document using the charset attribute, web developers help prevent issues related to incorrect character rendering, ensuring that the content is accessible and correctly displayed to users worldwide.
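To make the rendering problem concrete, here is a minimal Python sketch of what can happen when bytes written in one encoding are read as another. It only simulates the mismatch and is not how a browser is implemented. The helper name decode_as and the sample word are assumptions for illustration, and cp1252 is Python's codec name for Windows-1252.

```python
def decode_as(text: str, actual: str, assumed: str) -> str:
    """Encode text with the `actual` encoding, then decode the bytes as `assumed`."""
    return text.encode(actual).decode(assumed, errors="replace")


original = "café"

# Declared encoding matches the bytes: the text survives intact.
print(decode_as(original, "utf-8", "utf-8"))

# Bytes are UTF-8 but are read as Windows-1252: the two-byte "é"
# is shown as two unrelated characters.
print(decode_as(original, "utf-8", "cp1252"))
```

Garbled output of this kind is typical when a page is served as UTF-8 but declared, or guessed, to be a legacy single-byte encoding.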
Some of the charsets that can be used are:
UTF-8: This is the most widely used character encoding and supports virtually every character from all scripts around the world. It's good practice to use UTF-8 for new websites. Example:
<meta charset="UTF-8">
ISO-8859-1: Also known as Latin-1, this character set supports most Western European languages. Example:
<meta charset="ISO-8859-1">
Windows-1252: This is a character encoding for the Latin script. It's very similar to ISO-8859-1 but includes some additional characters. It's commonly used in Windows-based systems. Example:
<meta charset="Windows-1252">
UTF-16: Like UTF-8, this encoding covers all of Unicode, but it uses two or four bytes for each character. It's less commonly used for the web because it's not as efficient for text that UTF-8 can encode with fewer bytes. Browsers treat a UTF-16 declaration in a <meta> tag as UTF-8, so UTF-16 documents are normally identified by a byte order mark instead. Example:
<meta charset="UTF-16">
ISO-8859-2: This encoding supports the Latin script used by Central and Eastern European languages. Example:
<meta charset="ISO-8859-2">
GBK: This encoding is used for Simplified Chinese characters. It extends its predecessor, GB2312, and can represent more characters. Example:
<meta charset="GBK">
Shift_JIS: This character set is used for Japanese text. It combines two character sets: JIS X 0201 and JIS X 0208. Example:
<meta charset="Shift_JIS">
EUC-KR: This is a character encoding for the Korean language. It's widely used in South Korea and supports both Hangul and Hanja characters. Example:
<meta charset="EUC-KR">
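The differences between the charsets listed above can be sketched in Python by counting how many bytes each one needs for a sample character, or noting that it cannot represent the character at all. The mapping from charset labels to Python codec names, the helper name encoded_size, and the sample characters are assumptions made for this illustration.

```python
from typing import Optional

# Python codec names assumed to correspond to the charset labels listed above.
CODECS = {
    "UTF-8": "utf-8",
    "ISO-8859-1": "iso-8859-1",
    "Windows-1252": "cp1252",
    "UTF-16": "utf-16-le",  # little-endian, without a byte order mark
    "ISO-8859-2": "iso-8859-2",
    "GBK": "gbk",
    "Shift_JIS": "shift_jis",
    "EUC-KR": "euc_kr",
}


def encoded_size(text: str, charset: str) -> Optional[int]:
    """Return the byte length of text in the charset, or None if it cannot be encoded."""
    try:
        return len(text.encode(CODECS[charset]))
    except UnicodeEncodeError:
        return None


# Sample characters: ASCII, euro sign, Polish l-stroke, Chinese, Japanese, Korean.
for sample in ["A", "€", "ł", "中", "あ", "한"]:
    sizes = {name: encoded_size(sample, name) for name in CODECS}
    # Print the code point rather than the glyph so the output is plain ASCII.
    print(f"U+{ord(sample):04X}", sizes)
```

In this sketch only UTF-8 and UTF-16 report a size for every sample, while each legacy charset returns None for characters outside its repertoire. That is the practical argument for preferring UTF-8 on new pages.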
Syntax
<meta charset="character-set">
Values
- character-set: The character set to use. Common values are UTF-8 and ISO-8859-1.
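For readers who want to check which value a page declares, the following Python sketch reads the charset attribute from the first <meta> tag that has one, using the standard library's html.parser module. The class name CharsetFinder and the function declared_charset are hypothetical names for this example. The value is lowercased because charset labels are matched case-insensitively, so UTF-8 and utf-8 are equivalent.

```python
from html.parser import HTMLParser
from typing import Optional


class CharsetFinder(HTMLParser):
    """Record the charset attribute of the first <meta> tag that declares one."""

    def __init__(self) -> None:
        super().__init__()
        self.charset: Optional[str] = None

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs; names arrive lowercased.
        if tag == "meta" and self.charset is None:
            value = dict(attrs).get("charset")
            if value:
                self.charset = value.strip().lower()


def declared_charset(html: str) -> Optional[str]:
    """Return the declared charset in lowercase, or None if none is declared."""
    finder = CharsetFinder()
    finder.feed(html)
    return finder.charset


page = '<!DOCTYPE html><html><head><meta charset="UTF-8"></head><body></body></html>'
print(declared_charset(page))
```

This looks only at the markup. It does not account for an encoding sent in HTTP headers or signalled by a byte order mark.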
Applies To
The charset attribute can be used on the following HTML elements:
<meta>
Example
<meta charset="utf-8">
Browser Support
The following table shows the current browser support for the HTML charset attribute.
Desktop
12 | 1 | 1 | 15 | 3
Tablets / Mobile
18 | 4 | 14 | 2 | 1 | 4.4
Last updated by CSSPortal on: 28th March 2024