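C++ supports several character and string types and provides a literal syntax for each of them. In source code, characters and strings are written using a character set. Universal character names and escape sequences make it possible to express any string using only the basic source character set. Raw string literals remove the need to escape special characters and are available for every string type.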

1. Character and string literals

int main()
{
    // Character literals
    auto c0 =   'A'; // char
    auto c1 = u8'A'; // char before C++20, char8_t in C++20
    auto c2 =  L'A'; // wchar_t
    auto c3 =  u'A'; // char16_t
    auto c4 =  U'A'; // char32_t

    // Multicharacter literals
    auto m0 = 'abcd'; // int, implementation-defined value (typically 0x61626364)

    // String literals
    auto s0 =   "hello"; // const char*
    auto s1 = u8"hello"; // const char* before C++20, encoded as UTF-8,
                         // const char8_t* in C++20
    auto s2 =  L"hello"; // const wchar_t*
    auto s3 =  u"hello"; // const char16_t*, encoded as UTF-16
    auto s4 =  U"hello"; // const char32_t*, encoded as UTF-32
}

2. Raw string literals

int main()
{
    // Raw string literals containing unescaped \ and "
    auto R0 =   R"("Hello \ world")"; // const char*
    auto R1 = u8R"("Hello \ world")"; // const char* before C++20, encoded as UTF-8,
                                      // const char8_t* in C++20
    auto R2 =  LR"("Hello \ world")"; // const wchar_t*
    auto R3 =  uR"("Hello \ world")"; // const char16_t*, encoded as UTF-16
    auto R4 =  UR"("Hello \ world")"; // const char32_t*, encoded as UTF-32
}

3. String suffixes

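A string suffix is the letter s appended to a string literal. It requires using namespace std::string_literals (or std::literals) to be in scope. For example: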

auto S0 =   "hello"s; // std::string
auto S1 = u8"hello"s; // std::string before C++20, std::u8string in C++20
auto S2 = L"hello"s; // std::wstring
auto S3 = u"hello"s; // std::u16string
auto S4 = U"hello"s; // std::u32string
std::string_view sv = "abc\0\0def"sv; // size 8; the sv suffix needs std::string_view_literals

// Combined with raw string literals
auto S5 = R"("Hello \ world")"s; // std::string from a raw const char*
auto S6 = u8R"("Hello \ world")"s; // std::string from a raw const char* before C++20, encoded as UTF-8,
// std::u8string in C++20
auto S7 = LR"("Hello \ world")"s; // std::wstring from a raw const wchar_t*
auto S8 = uR"("Hello \ world")"s; // std::u16string from a raw const char16_t*, encoded as UTF-16
auto S9 = UR"("Hello \ world")"s; // std::u32string from a raw const char32_t*, encoded as UTF-32

The difference from a literal without the suffix is shown below:

#include <iostream>
#include <string>

// The auto parameter (abbreviated function template) requires C++20.
void print_with_zeros(auto const note, std::string const& s)
{
    std::cout << note;
    for (const char c : s)
        c ? std::cout << c : std::cout << "₀";
    std::cout << " (size = " << s.size() << ")\n";
}

int main()
{
    using namespace std::string_literals;

    std::string s1 = "abc\0\0def";  // constructed from const char*: stops at the first '\0'
    std::string s2 = "abc\0\0def"s; // operator""s receives the length: keeps embedded '\0'
    print_with_zeros("s1: ", s1);
    print_with_zeros("s2: ", s2);

    std::cout << "abcdef"s.substr(1,4) << '\n';
}

Output:

s1: abc (size = 3)
s2: abc₀₀def (size = 8)
bcde

4. Custom string suffixes

The s suffix is implemented as a literal operator, that is, an overload of operator"":

std::string operator""s( const char *str, std::size_t len );

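We can define our own literal operators to implement other suffixes. A user-defined suffix should begin with an underscore: suffixes without one are reserved for the standard library, and the compiler issues a warning such as this one from MSVC: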

warning C4455: 'operator ""mm': literal suffix identifiers that do not start with an underscore are reserved

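The following example defines the suffixes _mm, _m, and _km for millimeters, meters, and kilometers respectively; each one converts its value to meters: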

#include <iostream>

// Each operator converts its argument to meters.
// Written without a space after "" because that form is deprecated in C++23.
long double operator""_mm(long double x) {
    return x / 1000;
}

long double operator""_m(long double x) {
    return x;
}

long double operator""_km(long double x) {
    return x * 1000;
}

int main()
{
    std::cout << 1.0_mm << std::endl; // 0.001
    std::cout << 1.0_m << std::endl;  // 1
    std::cout << 1.0_km << std::endl; // 1000

    return 0;
}