Character Lower-casing: Locale-aware vs Lookup Table
Locale Lower
std::string lowerCase(const std::string& s) {
    std::locale loc;
    std::string out(s.size(), '\0');
    for (std::size_t i = 0; i < s.size(); ++i)
        out[i] = std::tolower(s[i], loc);
    return out;
}
std::string result = lowerCase(SAMPLE);
Table Lower
std::string lowerCase(const std::string& s) {
    std::string out(s.size(), '\0');
    for (std::size_t i = 0; i < s.size(); ++i)
        out[i] = static_cast<char>(LOWER_TABLE[static_cast<unsigned char>(s[i])]);
    return out;
}
std::string result = lowerCase(SAMPLE);
Shared test data (shared-setup)
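#include <array>
#include <cctype>
#include <locale>
#include <string>

// 4096 uppercase letters cycling through 'A'..'Z'.
static const std::string SAMPLE = []() {
    std::string s;
    s.reserve(4096);
    for (int i = 0; i < 4096; ++i)
        s += static_cast<char>('A' + (i % 26));
    return s;
}();

// Lower-case mapping for every byte value, built once during static initialization.
static const std::array<unsigned char, 256> LOWER_TABLE = []() {
    std::array<unsigned char, 256> t;
    for (int i = 0; i < 256; ++i)
        t[i] = static_cast<unsigned char>(std::tolower(i));
    return t;
}();
Which snippet is faster?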
Table Lower (the second snippet) is faster. The two-argument form std::tolower(c, loc) is specified as std::use_facet<std::ctype<char>>(loc).tolower(c), so every character pays for a facet lookup and, typically, a virtual dispatch into the facet. On some standard library implementations the lookup also acquires an internal lock. A 256-entry lookup table built once at startup converts any byte with a single array read: no lock, no virtual call, and no locale state. In the benchmark below that is about 7.5× faster on the 4096-character sample, and the gap adds up when the function is called millions of times in a parsing or text-processing hot path.
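To make the per-character cost concrete, here is a minimal sketch of a middle ground between the two snippets: it still uses the locale's ctype facet, but looks the facet up once per string and then calls the facet's range overload. The names lowerCaseHoisted and checkHoistedMatchesPerChar are hypothetical helpers, not part of the quiz, and nothing here was measured in the benchmark, so treat it as an illustration of where the locale work happens rather than a timing claim.

```cpp
#include <cassert>
#include <cstddef>
#include <locale>
#include <string>

// Hypothetical helper (not from the quiz): produces the same result as the
// per-character std::tolower(c, loc) loop, but the ctype facet is looked up
// once per string instead of once per character.
std::string lowerCaseHoisted(const std::string& s,
                             const std::locale& loc = std::locale()) {
    const std::ctype<char>& facet = std::use_facet<std::ctype<char>>(loc);
    std::string out = s;
    if (!out.empty())
        facet.tolower(&out[0], &out[0] + out.size());  // range overload, converts in place
    return out;
}

// Hypothetical check: the hoisted form agrees with the per-character form
// for the given input under the current global locale.
void checkHoistedMatchesPerChar(const std::string& s) {
    const std::locale loc;
    const std::string hoisted = lowerCaseHoisted(s, loc);
    assert(hoisted.size() == s.size());
    for (std::size_t i = 0; i < s.size(); ++i)
        assert(hoisted[i] == std::tolower(s[i], loc));
}
```

Calling checkHoistedMatchesPerChar(SAMPLE) would confirm that hoisting changes only where the facet lookup happens, not the output. Whether it closes the gap to the table version depends on the standard library and would need to be measured.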
Benchmark results
| Snippet | CPU time / iteration | Speedup |
|---|---|---|
| Locale Lower | 13 us | 1.0× |
| Table Lower | 1.72 us | 7.5× |
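The quiz does not show how these numbers were collected. As a rough way to reproduce the comparison locally, the sketch below averages wall-clock time per call with std::chrono::steady_clock. The names microsPerCall and speedup are hypothetical, and because the table reports CPU time from an unspecified harness, absolute values from this sketch will differ.

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>
#include <string>

// Hypothetical harness (an assumption, not the quiz's benchmark): returns the
// average wall-clock microseconds per call of fn(), where fn returns a std::string.
template <typename Fn>
double microsPerCall(Fn&& fn, int iterations) {
    assert(iterations > 0);
    // Part of each result is written to a volatile, which discourages the
    // optimizer from discarding the calls.
    volatile std::size_t sink = 0;
    const auto start = std::chrono::steady_clock::now();
    for (int i = 0; i < iterations; ++i) {
        const std::string r = fn();
        sink = sink + r.size() + (r.empty() ? 0u : static_cast<unsigned char>(r[0]));
    }
    const auto stop = std::chrono::steady_clock::now();
    const std::chrono::duration<double, std::micro> elapsed = stop - start;
    return elapsed.count() / iterations;
}

// Ratio used in the Speedup column: baseline time divided by candidate time.
double speedup(double baselineMicros, double candidateMicros) {
    assert(candidateMicros > 0.0);
    return baselineMicros / candidateMicros;
}
```

With the snippets given distinct names, a call such as microsPerCall([] { return lowerCase(SAMPLE); }, 100000) would time one of them. Build with optimizations enabled and repeat the run several times before comparing.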