C++ has a lot of types that vaguely describe the same thing. Assuming that we are compiling for an architecture where a byte is 8-bit, all of the following types are vaguely similar:
- std::byte
- std::uint8_t
- std::bitset<8>
- unsigned char (8-bit)
- char (8-bit)
If a byte is 8-bit, are all these types more or less interchangeable? If not, when would one need to be used instead of another?
I often see questions like Converting a hex string to a byte array on Stack Overflow where someone uses std::uint8_t, char, unsigned char and other types to represent a "byte". Is this just a matter of stylistic preference?
Note: This Q&A is intended to be a community FAQ, and edits are encouraged. The question of when to use what type for a "byte" and why comes up all the time, despite C++17 having introduced std::byte which seemingly makes the choice obvious. Having an FAQ that addresses all the misconceptions about std::bitset, std::uint8_t, etc. being a "byte" is useful.
For 8-bit architectures, all the listed types are vaguely similar in the sense that they model something that has 8 bits. However, the use cases are fundamentally different, and only some of these types are guaranteed special properties that make them usable as a byte type.
Overview
| Type | Definition | Special properties |
| --- | --- | --- |
| std::byte | enum class byte : unsigned char {}; | ✔️ all special properties |
| unsigned char | | ✔️ all special properties |
| signed char | | ❌ no special properties |
| char | signed char or unsigned char | ⚠️ only some special properties |
| char8_t | underlying type unsigned char | ❌ no special properties |
| std::uint8_t | typedef unsigned char uint8_t; (This is not guaranteed, just the most common implementation.) | ⚠️ special properties not guaranteed |
| std::bitset<8> | template <std::size_t N> class bitset; | ❌ no special properties |
See the appendix at the end of this answer for a list of all these special properties, type by type.
std::byte (C++17)
This is the canonical byte type in C++. Whenever you have to ask yourself the question "Which type should I use to represent these bytes?", std::byte is the answer.
Note that std::byte is very special because there are many relaxations that allow you to use the type in ways that would otherwise be undefined. For example, the strict aliasing rule is relaxed for std::byte ([basic.lval] p11), meaning that you can examine any object as an array of std::byte. Most other types don't have these special powers, and attempting to use them as a byte would be undefined behavior.
As appropriate as std::byte is for raw memory operations, many older APIs such as the <iostream> library predate it and aren't designed around it. The type is also somewhat clunky (e.g. my_byte == 0 is not possible). Don't attempt to forcefully use it with libraries that weren't designed for std::byte.
See also: Is there 'byte' data type in C++?, What is the purpose of std::byte?, P0298 - A byte type definition
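As a rough illustration of the two points above (examining an object as bytes, and the lack of integer comparison), here is a minimal C++17 sketch. The helper names object_bytes, is_zero, and to_number are made up for this example and are not part of any library.

```cpp
#include <cstddef>      // std::byte, std::to_integer
#include <type_traits>  // std::is_trivially_copyable_v
#include <vector>

// Hypothetical helper: returns a copy of the object representation of
// `value`. Reading an object through a std::byte pointer relies on the
// aliasing relaxation for std::byte mentioned above.
template <typename T>
std::vector<std::byte> object_bytes(const T& value) {
    static_assert(std::is_trivially_copyable_v<T>,
                  "only meaningful for trivially copyable types");
    const auto* first = reinterpret_cast<const std::byte*>(&value);
    return std::vector<std::byte>(first, first + sizeof(T));
}

// std::byte is not an arithmetic type, so `b == 0` does not compile.
// Compare against another std::byte instead.
inline bool is_zero(std::byte b) {
    return b == std::byte{0};
}

// Or convert explicitly when a number is actually needed.
inline unsigned to_number(std::byte b) {
    return std::to_integer<unsigned>(b);
}
```

Something like object_bytes(some_int) then yields sizeof(int) bytes that can be inspected one by one, with to_number used only at the point where a numeric value is needed.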
unsigned char
This is the closest thing to a "byte" there is prior to C++17. unsigned char has all the special properties that std::byte has.
However, the name is very confusing and it's also treated as a character in some contexts. For example, operator<< for std::ostream prints it as an ASCII character instead of printing its numeric value. Also, doing arithmetic with unsigned char promotes it to int before any operation, which seems inappropriate for a "byte".
All in all, it's a wishy-washy type that is simultaneously a byte, a character, and an arithmetic type. Prefer std::byte, char, std::uint8_t, or std::uint_least8_t instead.
See also: How to use new std::byte type in places where old-style unsigned char is needed?
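The character-versus-number confusion can be seen in a small sketch like the following. The helper names stream_as_is and stream_as_number are hypothetical, and the static_assert assumes a platform where int can represent every unsigned char value (which holds on mainstream platforms).

```cpp
#include <sstream>
#include <string>
#include <type_traits>

// Hypothetical helper: streams the value unchanged. The unsigned char
// overload of operator<< inserts it as a character.
inline std::string stream_as_is(unsigned char b) {
    std::ostringstream out;
    out << b;
    return out.str();
}

// Hypothetical helper: streams the numeric value by converting to a
// wider integer type first.
inline std::string stream_as_number(unsigned char b) {
    std::ostringstream out;
    out << static_cast<unsigned>(b);
    return out.str();
}

// Assumption: int can represent every unsigned char value, so both
// operands are promoted to int before the addition.
static_assert(std::is_same_v<
    decltype(static_cast<unsigned char>(1) + static_cast<unsigned char>(1)),
    int>);
```

With these, streaming the value of 'A' as-is produces the character, while the converted version produces its numeric code.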
signed char
The signed counterpart to unsigned char is similarly confused. It has almost none of the special properties that std::byte and unsigned char have, and is a strange mix of arithmetic and character type. It should also be avoided.
A better alternative is std::int_least8_t, which is also signed and also guaranteed to be at least 8 bits wide, but which doesn't have the weird connotation of also being a character.
See also: Difference between signed / unsigned char
char
This is a distinct type which has the same underlying type as signed char or unsigned char. It has most (but not all) of the special properties of unsigned char and std::byte. For example, unlike unsigned char, it does not provide storage ([intro.object] p3) for objects created in a char[].
char should be used for what the name says: a character.
See also: char!=(signed char), char!=(unsigned char)
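That char is a third, distinct type can be checked at compile time. The following is a small sketch; the function names plain_char_is_signed and which_char are invented for illustration.

```cpp
#include <type_traits>

// char, signed char, and unsigned char are three distinct types,
// even though char behaves like one of the other two.
static_assert(!std::is_same_v<char, signed char>);
static_assert(!std::is_same_v<char, unsigned char>);
static_assert(sizeof(char) == sizeof(signed char) &&
              sizeof(char) == sizeof(unsigned char));

// Whether plain char is signed is implementation-defined.
constexpr bool plain_char_is_signed() {
    return std::is_signed_v<char>;
}

// Because the types are distinct, overload resolution can tell them
// apart (hypothetical overload set).
constexpr int which_char(char)          { return 0; }
constexpr int which_char(signed char)   { return 1; }
constexpr int which_char(unsigned char) { return 2; }
```

A character literal such as 'a' selects the char overload regardless of whether char is signed on the target platform.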
char8_t (C++20)
There was originally some discussion about this type having special properties akin to char, but it ended up having none. Its underlying type is unsigned char, but unlike with std::byte, this doesn't mean that it inherits any properties from it.
It should be used as a UTF-8 character, possibly within a UTF-8 encoded string.
std::uint8_t (C++11)
This type is a design mistake that started in C. While this isn't guaranteed, it is usually implemented as a type alias like typedef unsigned char uint8_t;
This means that it has the special properties that unsigned char has in practice (since all compilers implement it like this), but none of this is guaranteed by the standard. The fact that it can alias every other type can also make it detrimental to performance, compared to if it were an alias for a unique type.
One thing to note is that a byte isn't guaranteed to be 8 bits in C++. Many people use std::uint8_t because it offers a perceived safety of really being 8 bits. However, std::uint8_t is optional and doesn't exist on platforms where a byte is wider than 8 bits, so it is no more portable than code that simply requires 8-bit bytes at compile time (for example, static_assert(CHAR_BIT == 8)).
For a more portable 8-bit arithmetic type, there are std::uint_least8_t and std::uint_fast8_t, which are guaranteed to exist but may be wider than 8 bits.
Note that std::uint8_t, std::uint_least8_t, and std::uint_fast8_t may all be promoted to int, just like unsigned char.
See also: uint8_t vs unsigned char, What platforms have something other than 8-bit char?
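As a sketch of the more portable alternative, the following uses std::uint_least8_t and applies the 8-bit wrap-around by hand, since the type may be wider than 8 bits and arithmetic happens after promotion. The alias octet and the function add_mod_256 are hypothetical names chosen for this example.

```cpp
#include <cstdint>  // std::uint_least8_t

// Hypothetical alias: always available, but possibly wider than 8 bits.
using octet = std::uint_least8_t;

// a + b is computed after integer promotion (typically in int), so the
// result can exceed 255. The mask applies the 8-bit wrap-around
// explicitly instead of relying on the width of the type.
constexpr octet add_mod_256(octet a, octet b) {
    return static_cast<octet>((a + b) & 0xFF);
}
```

For instance, add_mod_256(200, 100) evaluates to 44 because 300 modulo 256 is 44, independently of how wide octet actually is.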
std::bitset<8>
This is the furthest from a "byte" type. It models a sequence of bits, or a set of numbers, depending on perspective.
A std::bitset<8> is at least as large as int in most implementations, so it isn't even 8 bits large. Only use this type for what the name says: a set of bits. It is not a byte.
Conclusion
std::byte is the only type which models a byte, nothing more, nothing less. It should be preferred as a byte type whenever possible. All other types are either missing crucial properties or have a fundamentally different purpose than being a byte.
Appendix
Special properties of std::byte and ordinary character types
unsigned char[], std::byte[]
unsigned char[], std::byte[]
char*, cv unsigned char*, and cv std::byte*
static_cast of pointers to objects outside lifetime is allowed
std::byte
char[], unsigned char[], std::byte[]
char, unsigned char, std::byte
char[], unsigned char[], std::byte[]
std::byte
std::bit_cast
Note: it's unclear what unsigned ordinary character type actually means. See Editorial Issue 5070.