Saturday, September 21, 2019

c++ - Why are the fast integer types faster than the other integer types?



In ISO/IEC 9899:2018 (C18), 7.20.1.3 states:




7.20.1.3 Fastest minimum-width integer types



1 Each of the following types designates an integer type that is usually fastest268) to operate with among all integer types that have at least the specified width.




2 The typedef name int_fastN_t designates the fastest signed integer type with a width of at least N. The typedef name uint_fastN_t designates the fastest unsigned integer type with a width of at least N.



3 The following types are required:



int_fast8_t, int_fast16_t, int_fast32_t, int_fast64_t,
uint_fast8_t, uint_fast16_t, uint_fast32_t, uint_fast64_t



All other types of this form are optional.







268) The designated type is not guaranteed to be fastest for all purposes; if the implementation has no clear grounds for choosing one type over another, it will simply pick some integer type satisfying the signedness and width requirements.







But it is not stated why these "fast" integer types are faster.





  • Why are these fast integer types faster than the other integer types?





EDIT: I tagged the question with C++ because the fast integer types are also available in C++17, in the header <cstdint>. Unfortunately, ISO/IEC 14882:2017 (C++17) contains no comparable section explaining them; otherwise I would have quoted that section in the question's body as well.






Info: In C, they are declared in the header <stdint.h>.



Answer



Imagine a CPU that performs only 64-bit arithmetic operations. Now imagine how you would implement an unsigned 8-bit addition on such a CPU: it would necessarily involve more than one operation to get the right result. On such a CPU, 64-bit operations are faster than operations on narrower integer widths, so all of the Xint_fastY_t types might well be aliases of the 64-bit type.
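
As a rough illustration (a minimal C sketch; the function names are made up, and the 0xFFu mask stands in for the wrap-around an 8-bit type would give):

    #include <stdint.h>

    /* Hypothetical 64-bit-only machine: an 8-bit unsigned add must be
       emulated with a full-width add followed by a mask, so that the
       result wraps around modulo 256 as an 8-bit type would. */
    uint64_t add_u8_emulated(uint64_t a, uint64_t b)
    {
        return (a + b) & 0xFFu;   /* extra masking step */
    }

    /* The native width needs only the single add. */
    uint64_t add_u64(uint64_t a, uint64_t b)
    {
        return a + b;
    }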



If a CPU supports fast operations for narrow integer types (so that a wider type is not faster than a narrower one), then Xint_fastY_t will not be an alias of a type wider than is necessary to represent all Y bits.



Out of curiosity, I checked the sizes of these types on a particular implementation (GNU, Linux) for some architectures. They are not the same across all implementations on the same architecture:



┌────╥───────────────────────────────────────────────────────────┐
│ Y  ║               sizeof(Xint_fastY_t) * CHAR_BIT              │
│    ╟────────┬─────┬───────┬─────┬────────┬──────┬────────┬─────┤
│    ║ x86-64 │ x86 │ ARM64 │ ARM │ MIPS64 │ MIPS │ MSP430 │ AVR │
╞════╬════════╪═════╪═══════╪═════╪════════╪══════╪════════╪═════╡
│  8 ║      8 │   8 │     8 │  32 │      8 │    8 │     16 │   8 │
│ 16 ║     64 │  32 │    64 │  32 │     64 │   32 │     16 │  16 │
│ 32 ║     64 │  32 │    64 │  32 │     64 │   32 │     32 │  32 │
│ 64 ║     64 │  64 │    64 │  64 │     64 │   64 │     64 │  64 │
└────╨────────┴─────┴───────┴─────┴────────┴──────┴────────┴─────┘
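
The figures above could be reproduced with a small test program along these lines (a minimal sketch; build and run it on the implementation in question):

    #include <limits.h>
    #include <stdint.h>
    #include <stdio.h>

    int main(void)
    {
        /* Print the width, in bits, that this implementation chose for
           each of the required unsigned "fast" types. */
        printf("uint_fast8_t : %zu bits\n", sizeof(uint_fast8_t)  * CHAR_BIT);
        printf("uint_fast16_t: %zu bits\n", sizeof(uint_fast16_t) * CHAR_BIT);
        printf("uint_fast32_t: %zu bits\n", sizeof(uint_fast32_t) * CHAR_BIT);
        printf("uint_fast64_t: %zu bits\n", sizeof(uint_fast64_t) * CHAR_BIT);
        return 0;
    }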






Note that although operations on the larger types may be faster, such types also take more space in cache, and thus using them doesn't necessarily yield better performance. Furthermore, one cannot always trust that the implementation has made the right choice in the first place. As always, measuring is required for optimal results.
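
For instance (a sketch assuming the glibc/x86-64 sizes from the table above), storing a large array in the "fast" type rather than the exact-width type multiplies its footprint:

    #include <stdint.h>
    #include <stdio.h>

    int main(void)
    {
        enum { N = 1000000 };

        /* Storage needed for one million values, exact-width vs. "fast". */
        printf("uint16_t array:      %zu bytes\n", (size_t)N * sizeof(uint16_t));
        printf("uint_fast16_t array: %zu bytes\n", (size_t)N * sizeof(uint_fast16_t));
        /* With the glibc/x86-64 sizes from the table above, the second
           figure is 4x the first, and traversing the larger array touches
           correspondingly more cache lines. */
        return 0;
    }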








