Commit 77ad77c

[libc++][format] Improve string formatters
This changes the implementation of the formatter. Instead of inheriting from a specialized parser, all formatters will use the same generic parser. This reduces the binary size.

The new parser contains some additional fields that are only used in chrono formatting. Since these fields don't change the size of the parser, they are part of the generic parser. The parser is designed to fit in 128 bits, making it cheap to pass by value.

The new format function is a const member function. This isn't required by the Standard yet, but it will be after LWG-3636 is accepted. Additionally, P2286 adds a formattable concept which requires the member function to be const-qualified in C++23. This paper is likely to be accepted at the July 2022 plenary.

Depends on D121530

NOTE: parts of the code now contain duplicates for the current and the new parser. The intention is to remove the duplication in follow-up patches. A general overview of the final code is available in D124620. That review, however, lacks a bit of polish.

Most of the new code is based on the same algorithms used in the current code.
The final version of this code reduces the binary size by 17 KB for this example code:

```
#include <format>
#include <string_view>

int main() {
  {
    std::string_view sv{"hello world"};
    std::format("{}{}|{}{}{}{}{}{}|{}{}{}{}{}{}|{}{}{}|{}{}|{}",
                true, '*',
                (signed char)(42), (short)(42), (int)(42), (long)(42), (long long)(42), (__int128_t)(42),
                (unsigned char)(42), (unsigned short)(42), (unsigned int)(42), (unsigned long)(42),
                (unsigned long long)(42), (__uint128_t)(42),
                (float)(42), (double)(42), (long double)(42),
                "hello world", sv,
                nullptr);
  }
  {
    std::wstring_view sv{L"hello world"};
    std::format(L"{}{}|{}{}{}{}{}{}|{}{}{}{}{}{}|{}{}{}|{}{}|{}",
                true, L'*',
                (signed char)(42), (short)(42), (int)(42), (long)(42), (long long)(42), (__int128_t)(42),
                (unsigned char)(42), (unsigned short)(42), (unsigned int)(42), (unsigned long)(42),
                (unsigned long long)(42), (__uint128_t)(42),
                (float)(42), (double)(42), (long double)(42),
                L"hello world", sv,
                nullptr);
  }
}
```

Reviewed By: #libc, ldionne

Differential Revision: https://reviews.llvm.org/D125606
1 parent 7dbb366 commit 77ad77c
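
The commit message makes two design points: the parser state is small enough (128 bits) to pass by value, and `format` only reads that state, so it can be const-qualified. The sketch below illustrates both ideas with a toy type. `toy_parser`, `toy_string_formatter`, and their fields are hypothetical names invented for this illustration; they do not reflect libc++'s actual parser layout or API.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>

// Hypothetical stand-in for a generic format-spec parser. The field names and
// sizes are illustrative only; they are not libc++'s actual layout.
struct toy_parser {
  std::uint8_t alignment = 0;  // 0 = left, 1 = right
  std::uint8_t type = 0;       // display type, unused in this sketch
  std::uint32_t width = 0;     // minimum field width, 0 = none
  std::int32_t precision = -1; // maximum size, -1 = none, unused in this sketch
  char32_t fill = U' ';        // fill character
};

// 128 bits == 16 bytes: small enough to copy around cheaply.
static_assert(sizeof(toy_parser) <= 16, "parser should fit in 128 bits");

struct toy_string_formatter {
  // parse() is the only member function that mutates the stored state.
  void parse(std::uint32_t width, bool right_align) {
    parser_.width = width;
    parser_.alignment = right_align ? 1 : 0;
  }

  // format() only reads the parsed state, so it can be const-qualified and
  // called through a const reference.
  std::string format(const std::string& text) const {
    const toy_parser specs = parser_; // cheap copy of the parsed state
    if (text.size() >= specs.width)
      return text;
    const std::string padding(specs.width - text.size(), static_cast<char>(specs.fill));
    return specs.alignment == 1 ? padding + text : text + padding;
  }

  toy_parser parser_;
};
```

Calling `format` through a `const toy_string_formatter&` compiles only because the member function is const-qualified, which is the property the commit message refers to.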

8 files changed: +715 -430 lines changed

libcxx/include/CMakeLists.txt

Lines changed: 1 addition & 0 deletions
@@ -248,6 +248,7 @@ set(files
   __format/formatter_floating_point.h
   __format/formatter_integer.h
   __format/formatter_integral.h
+  __format/formatter_output.h
   __format/formatter_pointer.h
   __format/formatter_string.h
   __format/parser_std_format_spec.h

Lines changed: 188 additions & 0 deletions
@@ -0,0 +1,188 @@
+// -*- C++ -*-
+//===----------------------------------------------------------------------===//
+//
+// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
+// See https://llvm.org/LICENSE.txt for license information.
+// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+//
+//===----------------------------------------------------------------------===//
+
+#ifndef _LIBCPP___FORMAT_FORMATTER_OUTPUT_H
+#define _LIBCPP___FORMAT_FORMATTER_OUTPUT_H
+
+#include <__algorithm/copy.h>
+#include <__algorithm/fill_n.h>
+#include <__config>
+#include <__format/parser_std_format_spec.h>
+#include <__utility/move.h>
+#include <__utility/unreachable.h>
+#include <cstddef>
+#include <string_view>
+
+#if !defined(_LIBCPP_HAS_NO_PRAGMA_SYSTEM_HEADER)
+#  pragma GCC system_header
+#endif
+
+_LIBCPP_BEGIN_NAMESPACE_STD
+
+#if _LIBCPP_STD_VER > 17
+
+namespace __formatter {
+
+// TODO FMT remove _v2 suffix.
+struct _LIBCPP_TYPE_VIS __padding_size_result_v2 {
+  size_t __before_;
+  size_t __after_;
+};
+
+// TODO FMT remove _v2 suffix.
+_LIBCPP_HIDE_FROM_ABI constexpr __padding_size_result_v2 __padding_size_v2(size_t __size, size_t __width,
+                                                                           __format_spec::__alignment __align) {
+  _LIBCPP_ASSERT(__width > __size, "don't call this function when no padding is required");
+  _LIBCPP_ASSERT(__align != __format_spec::__alignment::__default,
+                 "the caller should adjust the default to the value required by the type");
+  _LIBCPP_ASSERT(__align != __format_spec::__alignment::__zero_padding,
+                 "the caller should have handled the zero-padding");
+
+  size_t __fill = __width - __size;
+  switch (__align) {
+  case __format_spec::__alignment::__default:
+  case __format_spec::__alignment::__zero_padding:
+    __libcpp_unreachable();
+
+  case __format_spec::__alignment::__left:
+    return {0, __fill};
+
+  case __format_spec::__alignment::__center: {
+    // The extra padding is divided per [format.string.std]/3
+    // __before = floor(__fill, 2);
+    // __after = ceil(__fill, 2);
+    size_t __before = __fill / 2;
+    size_t __after = __fill - __before;
+    return {__before, __after};
+  }
+  case __format_spec::__alignment::__right:
+    return {__fill, 0};
+  }
+  __libcpp_unreachable();
+}
+
+/// Writes the input to the output with the required padding.
+///
+/// Since the output column width is specified the function can be used for
+/// ASCII and Unicode output.
+///
+/// \pre [\a __first, \a __last) is a valid range.
+/// \pre \a __size <= \a __width. Using this function when this pre-condition
+/// doesn't hold incurs an unwanted overhead.
+///
+/// \param __first Pointer to the first element to write.
+/// \param __last Pointer beyond the last element to write.
+/// \param __out_it The output iterator to write to.
+/// \param __specs The parsed formatting specifications.
+/// \param __size The (estimated) output column width. When the elements
+/// to be written are ASCII the following condition holds
+/// \a __size == \a __last - \a __first.
+///
+/// \returns An iterator pointing beyond the last element written.
+///
+/// \note The type of the elements in range [\a __first, \a __last) can differ
+/// from the type of \a __specs. Integer output uses \c std::to_chars for its
+/// conversion, which means the [\a __first, \a __last) always contains elements
+/// of the type \c char.
+template <class _CharT, class _ParserCharT>
+_LIBCPP_HIDE_FROM_ABI auto __write(const _CharT* __first, const _CharT* __last,
+                                   output_iterator<const _CharT&> auto __out_it,
+                                   __format_spec::__parsed_specifications<_ParserCharT> __specs, ptrdiff_t __size)
+    -> decltype(__out_it) {
+  _LIBCPP_ASSERT(__first <= __last, "Not a valid range");
+
+  if (__size >= __specs.__width_)
+    return _VSTD::copy(__first, __last, _VSTD::move(__out_it));
+
+  __padding_size_result_v2 __padding =
+      __formatter::__padding_size_v2(__size, __specs.__width_, __specs.__std_.__alignment_);
+  __out_it = _VSTD::fill_n(_VSTD::move(__out_it), __padding.__before_, __specs.__fill_);
+  __out_it = _VSTD::copy(__first, __last, _VSTD::move(__out_it));
+  return _VSTD::fill_n(_VSTD::move(__out_it), __padding.__after_, __specs.__fill_);
+}
+
+#  ifndef _LIBCPP_HAS_NO_UNICODE
+template <class _CharT>
+_LIBCPP_HIDE_FROM_ABI auto __write_unicode_no_precision(basic_string_view<_CharT> __str,
+                                                        output_iterator<const _CharT&> auto __out_it,
+                                                        __format_spec::__parsed_specifications<_CharT> __specs)
+    -> decltype(__out_it) {
+  _LIBCPP_ASSERT(!__specs.__has_precision(), "use __write_unicode");
+  // No padding -> copy the string
+  if (!__specs.__has_width())
+    return _VSTD::copy(__str.begin(), __str.end(), _VSTD::move(__out_it));
+
+  // Non Unicode part larger than width -> copy the string
+  auto __last = __format_spec::__detail::__estimate_column_width_fast(__str.begin(), __str.end());
+  ptrdiff_t __size = __last - __str.begin();
+  if (__size >= __specs.__width_)
+    return _VSTD::copy(__str.begin(), __str.end(), _VSTD::move(__out_it));
+
+  // Is there a non Unicode part?
+  if (__last != __str.end()) {
+    // Non Unicode and Unicode part larger than width -> copy the string
+    __format_spec::__detail::__column_width_result __column_width =
+        __format_spec::__detail::__estimate_column_width(__last, __str.end(), __specs.__width_);
+    __size += __column_width.__width; // Note this new size is used when __size < __specs.__width_
+    if (__size >= __specs.__width_)
+      return _VSTD::copy(__str.begin(), __str.end(), _VSTD::move(__out_it));
+  }
+
+  return __formatter::__write(__str.begin(), __str.end(), _VSTD::move(__out_it), __specs, __size);
+}
+#  endif
+
+template <class _CharT>
+_LIBCPP_HIDE_FROM_ABI auto __write_unicode(basic_string_view<_CharT> __str,
+                                           output_iterator<const _CharT&> auto __out_it,
+                                           __format_spec::__parsed_specifications<_CharT> __specs)
+    -> decltype(__out_it) {
+#  ifndef _LIBCPP_HAS_NO_UNICODE
+  if (!__specs.__has_precision())
+    return __formatter::__write_unicode_no_precision(__str, _VSTD::move(__out_it), __specs);
+
+  // Non unicode part larger than precision -> truncate the output and use the normal write operation.
+  auto __last = __format_spec::__detail::__estimate_column_width_fast(__str.begin(), __str.end());
+  ptrdiff_t __size = __last - __str.begin();
+  if (__size >= __specs.__precision_)
+    return __formatter::__write(__str.begin(), __str.begin() + __specs.__precision_, _VSTD::move(__out_it), __specs,
+                                __specs.__precision_);
+
+  // No non Unicode part, implies __size < __specs.__precision_ -> use normal write operation
+  if (__last == __str.end())
+    return __formatter::__write(__str.begin(), __str.end(), _VSTD::move(__out_it), __specs, __str.size());
+
+  __format_spec::__detail::__column_width_result __column_width =
+      __format_spec::__detail::__estimate_column_width(__last, __str.end(), __specs.__precision_ - __size);
+  __size += __column_width.__width;
+  // Truncate the output
+  if (__column_width.__ptr != __str.end())
+    __str.remove_suffix(__str.end() - __column_width.__ptr);
+
+  return __formatter::__write(__str.begin(), __str.end(), _VSTD::move(__out_it), __specs, __size);
+
+#  else
+  if (__specs.__has_precision()) {
+    ptrdiff_t __size = __str.size();
+    if (__size > __specs.__precision_)
+      return __formatter::__write(__str.begin(), __str.begin() + __specs.__precision_, _VSTD::move(__out_it), __specs,
+                                  __specs.__precision_);
+  }
+  return __formatter::__write(__str.begin(), __str.end(), _VSTD::move(__out_it), __specs, __str.size());
+
+#  endif
+}
+
+} // namespace __formatter
+
+#endif //_LIBCPP_STD_VER > 17
+
+_LIBCPP_END_NAMESPACE_STD
+
+#endif // _LIBCPP___FORMAT_FORMATTER_OUTPUT_H
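
The padding logic in this header can be tried outside the library. The following is a minimal standalone sketch of the same idea for ASCII text only: split the fill between both sides according to the alignment, then write fill, text, fill. It also mirrors the non-Unicode branch, which truncates to the precision before padding. The names `alignment`, `padding_size_result`, `padding_size`, and `write_ascii` are hypothetical and chosen for this illustration; the sketch does no column-width estimation and is not the libc++ implementation.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <iterator>
#include <string>

// Illustrative names; these are not libc++ identifiers.
enum class alignment { left, center, right };

struct padding_size_result {
  std::size_t before;
  std::size_t after;
};

// Splits the fill between both sides of the text. For center alignment an odd
// fill count puts the extra character after the text:
// before = floor(fill / 2), after = ceil(fill / 2).
padding_size_result padding_size(std::size_t size, std::size_t width, alignment align) {
  assert(width > size && "don't call this function when no padding is required");
  const std::size_t fill = width - size;
  switch (align) {
  case alignment::left:
    return {0, fill};
  case alignment::center: {
    const std::size_t before = fill / 2;
    return {before, fill - before};
  }
  case alignment::right:
    return {fill, 0};
  }
  return {0, 0}; // not reached for valid enumerators
}

// ASCII-only write: truncate to `precision` (a negative value means "no
// precision"), then pad the result to `width` with `fill`.
std::string write_ascii(std::string text, std::size_t width, int precision, alignment align, char fill) {
  if (precision >= 0 && text.size() > static_cast<std::size_t>(precision))
    text.resize(static_cast<std::size_t>(precision));

  // The text already fills the field: copy it without padding.
  if (text.size() >= width)
    return text;

  const padding_size_result padding = padding_size(text.size(), width, align);
  std::string out;
  out.reserve(width);
  std::fill_n(std::back_inserter(out), padding.before, fill);
  std::copy(text.begin(), text.end(), std::back_inserter(out));
  std::fill_n(std::back_inserter(out), padding.after, fill);
  return out;
}
```

For example, `write_ascii("abc", 8, -1, alignment::center, '*')` has five fill characters to distribute, two before and three after, giving `**abc***`.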

libcxx/include/__format/formatter_string.h

Lines changed: 42 additions & 59 deletions
@@ -10,13 +10,15 @@
 #ifndef _LIBCPP___FORMAT_FORMATTER_STRING_H
 #define _LIBCPP___FORMAT_FORMATTER_STRING_H
 
-#include <__assert>
+#include <__availability>
 #include <__config>
-#include <__format/format_error.h>
 #include <__format/format_fwd.h>
-#include <__format/format_string.h>
+#include <__format/format_parse_context.h>
 #include <__format/formatter.h>
+#include <__format/formatter_output.h>
 #include <__format/parser_std_format_spec.h>
+#include <__utility/move.h>
+#include <string>
 #include <string_view>
 
 #if !defined(_LIBCPP_HAS_NO_PRAGMA_SYSTEM_HEADER)
@@ -27,43 +29,30 @@ _LIBCPP_BEGIN_NAMESPACE_STD
 
 #if _LIBCPP_STD_VER > 17
 
-namespace __format_spec {
-
 template <__formatter::__char_type _CharT>
-class _LIBCPP_TEMPLATE_VIS __formatter_string : public __parser_string<_CharT> {
+struct _LIBCPP_TEMPLATE_VIS __formatter_string {
 public:
-  _LIBCPP_HIDE_FROM_ABI auto format(basic_string_view<_CharT> __str,
-                                    auto& __ctx) -> decltype(__ctx.out()) {
-
-    _LIBCPP_ASSERT(this->__alignment != _Flags::_Alignment::__default,
-                   "The parser should not use these defaults");
-
-    if (this->__width_needs_substitution())
-      this->__substitute_width_arg_id(__ctx.arg(this->__width));
-
-    if (this->__precision_needs_substitution())
-      this->__substitute_precision_arg_id(__ctx.arg(this->__precision));
-
-    return __formatter::__write_unicode(
-        __ctx.out(), __str, this->__width,
-        this->__has_precision_field() ? this->__precision : -1, this->__fill,
-        this->__alignment);
+  _LIBCPP_HIDE_FROM_ABI constexpr auto parse(basic_format_parse_context<_CharT>& __parse_ctx)
+      -> decltype(__parse_ctx.begin()) {
+    auto __result = __parser_.__parse(__parse_ctx, __format_spec::__fields_string);
+    __format_spec::__process_display_type_string(__parser_.__type_);
+    return __result;
   }
-};
 
-} //namespace __format_spec
+  _LIBCPP_HIDE_FROM_ABI auto format(basic_string_view<_CharT> __str, auto& __ctx) const -> decltype(__ctx.out()) {
+    return __formatter::__write_unicode(__str, __ctx.out(), __parser_.__get_parsed_std_specifications(__ctx));
+  }
 
-// [format.formatter.spec]/2.2 For each charT, the string type specializations
+  __format_spec::__parser<_CharT> __parser_;
+};
 
 // Formatter const char*.
 template <__formatter::__char_type _CharT>
-struct _LIBCPP_TEMPLATE_VIS _LIBCPP_AVAILABILITY_FORMAT
-    formatter<const _CharT*, _CharT>
-    : public __format_spec::__formatter_string<_CharT> {
-  using _Base = __format_spec::__formatter_string<_CharT>;
+struct _LIBCPP_TEMPLATE_VIS _LIBCPP_AVAILABILITY_FORMAT formatter<const _CharT*, _CharT>
+    : public __formatter_string<_CharT> {
+  using _Base = __formatter_string<_CharT>;
 
-  _LIBCPP_HIDE_FROM_ABI auto format(const _CharT* __str, auto& __ctx)
-      -> decltype(__ctx.out()) {
+  _LIBCPP_HIDE_FROM_ABI auto format(const _CharT* __str, auto& __ctx) const -> decltype(__ctx.out()) {
     _LIBCPP_ASSERT(__str, "The basic_format_arg constructor should have "
                           "prevented an invalid pointer.");
 
@@ -78,8 +67,9 @@ struct _LIBCPP_TEMPLATE_VIS _LIBCPP_AVAILABILITY_FORMAT
     // now these optimizations aren't implemented. Instead the base class
     // handles these options.
     // TODO FMT Implement these improvements.
-    if (this->__has_width_field() || this->__has_precision_field())
-      return _Base::format(__str, __ctx);
+    __format_spec::__parsed_specifications<_CharT> __specs = _Base::__parser_.__get_parsed_std_specifications(__ctx);
+    if (__specs.__has_width() || __specs.__has_precision())
+      return __formatter::__write_unicode(basic_string_view<_CharT>{__str}, __ctx.out(), __specs);
 
     // No formatting required, copy the string to the output.
     auto __out_it = __ctx.out();
@@ -91,66 +81,59 @@ struct _LIBCPP_TEMPLATE_VIS _LIBCPP_AVAILABILITY_FORMAT
 
 // Formatter char*.
 template <__formatter::__char_type _CharT>
-struct _LIBCPP_TEMPLATE_VIS _LIBCPP_AVAILABILITY_FORMAT
-    formatter<_CharT*, _CharT> : public formatter<const _CharT*, _CharT> {
+struct _LIBCPP_TEMPLATE_VIS _LIBCPP_AVAILABILITY_FORMAT formatter<_CharT*, _CharT>
+    : public formatter<const _CharT*, _CharT> {
   using _Base = formatter<const _CharT*, _CharT>;
 
-  _LIBCPP_HIDE_FROM_ABI auto format(_CharT* __str, auto& __ctx)
-      -> decltype(__ctx.out()) {
+  _LIBCPP_HIDE_FROM_ABI auto format(_CharT* __str, auto& __ctx) const -> decltype(__ctx.out()) {
     return _Base::format(__str, __ctx);
   }
 };
 
 // Formatter char[].
 template <__formatter::__char_type _CharT, size_t _Size>
 struct _LIBCPP_TEMPLATE_VIS _LIBCPP_AVAILABILITY_FORMAT formatter<_CharT[_Size], _CharT>
-    : public __format_spec::__formatter_string<_CharT> {
-  static_assert(!is_const_v<_CharT>);
-  using _Base = __format_spec::__formatter_string<_CharT>;
+    : public __formatter_string<_CharT> {
+  using _Base = __formatter_string<_CharT>;
 
-  _LIBCPP_HIDE_FROM_ABI auto format(_CharT __str[_Size], auto& __ctx) -> decltype(__ctx.out()) {
+  _LIBCPP_HIDE_FROM_ABI auto format(_CharT __str[_Size], auto& __ctx) const -> decltype(__ctx.out()) {
     return _Base::format(basic_string_view<_CharT>(__str, _Size), __ctx);
   }
 };
 
 // Formatter const char[].
 template <__formatter::__char_type _CharT, size_t _Size>
-struct _LIBCPP_TEMPLATE_VIS _LIBCPP_AVAILABILITY_FORMAT
-    formatter<const _CharT[_Size], _CharT>
-    : public __format_spec::__formatter_string<_CharT> {
-  using _Base = __format_spec::__formatter_string<_CharT>;
+struct _LIBCPP_TEMPLATE_VIS _LIBCPP_AVAILABILITY_FORMAT formatter<const _CharT[_Size], _CharT>
+    : public __formatter_string<_CharT> {
+  using _Base = __formatter_string<_CharT>;
 
-  _LIBCPP_HIDE_FROM_ABI auto format(const _CharT __str[_Size], auto& __ctx)
-      -> decltype(__ctx.out()) {
+  _LIBCPP_HIDE_FROM_ABI auto format(const _CharT __str[_Size], auto& __ctx) const -> decltype(__ctx.out()) {
     return _Base::format(basic_string_view<_CharT>(__str, _Size), __ctx);
   }
 };
 
 // Formatter std::string.
 template <__formatter::__char_type _CharT, class _Traits, class _Allocator>
-struct _LIBCPP_TEMPLATE_VIS _LIBCPP_AVAILABILITY_FORMAT
-    formatter<basic_string<_CharT, _Traits, _Allocator>, _CharT>
-    : public __format_spec::__formatter_string<_CharT> {
-  using _Base = __format_spec::__formatter_string<_CharT>;
+struct _LIBCPP_TEMPLATE_VIS _LIBCPP_AVAILABILITY_FORMAT formatter<basic_string<_CharT, _Traits, _Allocator>, _CharT>
+    : public __formatter_string<_CharT> {
+  using _Base = __formatter_string<_CharT>;
 
-  _LIBCPP_HIDE_FROM_ABI auto
-  format(const basic_string<_CharT, _Traits, _Allocator>& __str, auto& __ctx)
+  _LIBCPP_HIDE_FROM_ABI auto format(const basic_string<_CharT, _Traits, _Allocator>& __str, auto& __ctx) const
       -> decltype(__ctx.out()) {
-    // drop _Traits and _Allocator
+    // Drop _Traits and _Allocator to have one std::basic_string formatter.
     return _Base::format(basic_string_view<_CharT>(__str.data(), __str.size()), __ctx);
   }
 };
 
 // Formatter std::string_view.
 template <__formatter::__char_type _CharT, class _Traits>
 struct _LIBCPP_TEMPLATE_VIS _LIBCPP_AVAILABILITY_FORMAT formatter<basic_string_view<_CharT, _Traits>, _CharT>
-    : public __format_spec::__formatter_string<_CharT> {
-  using _Base = __format_spec::__formatter_string<_CharT>;
+    : public __formatter_string<_CharT> {
+  using _Base = __formatter_string<_CharT>;
 
-  _LIBCPP_HIDE_FROM_ABI auto
-  format(basic_string_view<_CharT, _Traits> __str, auto& __ctx)
+  _LIBCPP_HIDE_FROM_ABI auto format(basic_string_view<_CharT, _Traits> __str, auto& __ctx) const
       -> decltype(__ctx.out()) {
-    // drop _Traits
+    // Drop _Traits to have one std::basic_string_view formatter.
     return _Base::format(basic_string_view<_CharT>(__str.data(), __str.size()), __ctx);
   }
 };
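
The `std::string` and `std::string_view` formatters above drop `_Traits` and `_Allocator` before forwarding, so a single implementation per character type serves every string flavor. A minimal sketch of that technique outside the library, assuming C++17 for `std::basic_string_view`, could look like this. `quote`, `quote_impl`, and `other_traits` are hypothetical names introduced only for this illustration.

```cpp
#include <cstddef>
#include <string>
#include <string_view>

// The one real implementation, instantiated once per character type. It only
// sees basic_string_view<CharT> with the default traits.
template <class CharT>
std::basic_string<CharT> quote_impl(std::basic_string_view<CharT> text) {
  std::basic_string<CharT> out;
  out.reserve(text.size() + 2);
  out.push_back(CharT('"'));
  out.append(text.data(), text.size());
  out.push_back(CharT('"'));
  return out;
}

// Thin forwarding wrapper: drops Traits and Allocator by rebuilding a view
// from data() and size(), so each extra string flavor costs only this small
// function instead of a full copy of the implementation.
template <class CharT, class Traits, class Allocator>
std::basic_string<CharT> quote(const std::basic_string<CharT, Traits, Allocator>& text) {
  return quote_impl(std::basic_string_view<CharT>(text.data(), text.size()));
}

// A second traits type, used only to show that both string flavors reach the
// same quote_impl<char> instantiation.
struct other_traits : std::char_traits<char> {};
```

Both `quote(std::string("hi"))` and `quote(std::basic_string<char, other_traits>("hi"))` end up in `quote_impl<char>`, which is the effect the comments in the diff describe.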
