ATL AM64/128A Manual de usuario

Busca en linea o descarga Manual de usuario para Redes ATL AM64/128A. Optimizing software in C++ Manual de usuario

  • Descarga
  • Añadir a mis manuales
  • Imprimir
  • Pagina
    / 144
  • Tabla de contenidos
  • MARCADORES
  • Valorado. / 5. Basado en revisión del cliente
Vista de pagina 0
1.
Optimizing software in C++
An optimization guide for Windows, Linux and Mac
platforms
By Agner Fog. Copenhagen University College of Engineering.
Copyright © 2008. Last updated 2009-01-21.
Contents
1 Introduction ....................................................................................................................... 3
1.1 The costs of optimizing ............................................................................................... 3
2 Choosing the optimal platform........................................................................................... 4
2.1 Choice of hardware platform ....................................................................................... 4
2.2 Choice of microprocessor ........................................................................................... 5
2.3 Choice of operating system......................................................................................... 6
2.4 Choice of programming language ............................................................................... 7
2.5 Choice of compiler ...................................................................................................... 9
2.6 Choice of function libraries........................................................................................ 11
2.7 Choice of user interface framework........................................................................... 13
2.8 Overcoming the drawbacks of the C++ language...................................................... 14
3 Finding the biggest time consumers ................................................................................ 15
3.1 How much is a clock cycle? ...................................................................................... 15
3.2 Use a profiler to find hot spots .................................................................................. 16
3.3 Program installation .................................................................................................. 18
3.4 Program loading ....................................................................................................... 18
3.5 Dynamic linking and position-independent code ....................................................... 18
3.6 File access................................................................................................................ 20
3.7 System database ...................................................................................................... 21
3.8 Graphics and other system resources....................................................................... 21
3.9 Network access ........................................................................................................ 21
3.10 Memory access....................................................................................................... 21
3.11 Context switches..................................................................................................... 22
3.12 Dependency chains ................................................................................................ 22
3.13 Execution unit throughput ....................................................................................... 22
4 Performance and usability ............................................................................................... 23
5 Choosing the optimal algorithm ....................................................................................... 24
6 The efficiency of different C++ constructs........................................................................ 25
6.1 Different kinds of variable storage............................................................................. 25
6.2 Integers variables and operators............................................................................... 28
6.3 Floating point variables and operators ...................................................................... 30
6.4 Enums ......................................................................................................................32
6.5 Booleans................................................................................................................... 32
6.6 Pointers and references............................................................................................ 34
6.7 Function pointers ...................................................................................................... 35
6.8 Member pointers....................................................................................................... 36
6.9 Arrays ....................................................................................................................... 36
6.10 Type conversions.................................................................................................... 38
6.11 Branches and switch statements............................................................................. 41
6.12 Loops......................................................................................................................42
6.13 Functions ................................................................................................................ 45
6.14 Structures and classes............................................................................................ 47
6.15 Class data members (properties) ............................................................................ 48
6.16 Class member functions (methods)......................................................................... 49
6.17 Virtual member functions ........................................................................................ 50
Vista de pagina 0
1 2 3 4 5 6 ... 143 144

Indice de contenidos

Pagina 1 - Optimizing software in C++

1. Optimizing software in C++ An optimization guide for Windows, Linux and Mac platforms By Agner Fog. Copenhagen University College of Engineering.

Pagina 2

10The Intel compiler has a number of important optimization features: Very good support for vector operations using the single-instruction-multiple-

Pagina 3 - 1 Introduction

100Use a compiler that supports automatic vectorization, such as Gnu, Intel or PathScale. Use appropriate compiler options to enable the desired ins

Pagina 4

101load the structure y into an XMM register, add the constant vector (1.,2.,3.,4.), and store the result in x. The #pragma vector always tells the c

Pagina 5 - 2.2 Choice of microprocessor

102Intrinsic functions are primitive operations in the sense that each intrinsic function call is translated to just one or a few machine instruction

Pagina 6

103 a = _mm_or_si128(a, mask); // Store the result vector in eight consecutive elements in aa: _mm_store_si128(ToVectorAddress(aa[i]

Pagina 7

104_mm_empty() after the 64-bit vector operations and before any floating point code. The 128-bit vectors do not have this problem. Rewriting exampl

Pagina 8

105 // Redefine class for vector of eight 16-bit integers: class Int16Vector8 : public Is16vec8 { public: // Default constructor: Int16Vecto

Pagina 9 - Intel C++ compiler

106 d.load(dd + i); // Load eight elements from dd[i] a = select(b > 0, c + 2, d + g); // Select c+2 or d+g a.store(aa

Pagina 10 - Open Watcom

107Intel compiler does an excellent job of optimizing this code, and so does the Gnu compiler. The functions, operators and constructors are all inli

Pagina 11 - Comments

108 for (int i = 0; i < size; i++) { aa[i] = (bb[i] + cc[i]) / ff[i]; } } Example 11.6a is not vectorized automatically by the c

Pagina 12

109 // Roll out loop by four to fit the four-element vectors: for (int i = 0; i < size; i += 4) { b.load(bb + i); // Load fo

Pagina 13

11Codeplay VectorC A commercial compiler for 32-bit Windows. Integrates into the Microsoft Visual Studio IDE. Has not been updated since 2004. Can do

Pagina 14 - Security

110} void AddAndDivide (short int *aa, short int *bb, short int *cc, float *ff) { // Define temporary vectors: Int16Vector8 a, b, c;

Pagina 15

111 // Example 11.7 #include <emmintrin.h> // Define class complexd for vector of two doubles. // The low element represents the real part an

Pagina 16

112static inline complexd operator - (complexd const & a, complexd const & b) { return _mm_sub_pd(a, b);} // complex * complex: (a.re*b.

Pagina 17

113static inline complexd operator - (complexd const & a) { static const union { // (signbit,signbit) int i[4]; __m128d v

Pagina 18 - 3.4 Program loading

11411.3 Mathematical functions for vectors There are various function libraries for computing mathematical functions such as logarithms, exponential

Pagina 19

115The necessary function prototypes and description of the different math libraries can be found in: Mike Stoner: "Integrating Fast Math Librar

Pagina 20 - 3.6 File access

116Code example Compiler used Time per element AMD64 Time per element Pentium 4 Time per element Pentium M Time per element Core 2 11.8a Microsoft 5

Pagina 21 - 3.10 Memory access

117 11.6 Conclusion Vectorized code often contains a lot of extra instructions for converting the data to the right format and getting them into the

Pagina 22 - 3.12 Dependency chains

118want to make one version that takes advantage of the SSE2 instruction set and another version that is compatible with old microprocessors. Using t

Pagina 23 - 4 Performance and usability

119 ... InnerFunctionSSE2(); } } else { // Use legacy version to support old systems for (i = 0; i < 1000

Pagina 24

12time in library functions then it may not be necessary to optimize anything else than finding the most efficient library and economize the library

Pagina 25 - Global or static storage

120You may want to compile an application for a 64-bit operating system in order to get the advantages mentioned on page 6. Unfortunately, it is not

Pagina 26 - Register storage

121• Make two or more versions of the most critical part of the code and compile them separately with the appropriate instruction set specified. Ins

Pagina 27 - Dynamic memory allocation

122 __intel_cpu_indicator = 0x800; break; case 6: case 7: // Supplementary-SSE3 supported __intel_cpu_indicator = 0x1000;

Pagina 28 - Integer sizes

123 13 Specific optimization tips 13.1 Use lookup tables Reading a value from a table of constants is very fast if the table is cached. Usually it t

Pagina 29 - Integer operators

124Storing something in static memory can cause caching problems because static data are likely to be scattered around at different memory addresses.

Pagina 30

125reason is, I guess, that compiler makers assume that floating point comparisons are more predictable than integer comparisons. The solution a = 1.

Pagina 31

126A possible negative value of i will appear as a large positive number when i is interpreted as an unsigned integer and this will trigger the error

Pagina 32 - 6.5 Booleans

127 Sunday = 1, Monday = 2, Tuesday = 4, Wednesday = 8, Thursday = 0x10, Friday = 0x20, Saturday = 0x40 }; Weekdays Day; if (Day & (Tuesday

Pagina 33

128Here, the address of matrix[j][0] is calculated internally as (int)&matrix[0][0] + j * (columns * sizeof(float)). Now, the factor to multiply

Pagina 34 - Pointers versus references

129• Integer division by a constant is faster than division by a variable • Integer division by a constant is faster if the constant is a power of

Pagina 35 - 6.7 Function pointers

13memcpy 16kB aligned operands Intel Core 2 0.12 0.18 0.12 0.11 0.18 0.18 0.18 0.11 memcpy 16kB unaligned op. Intel Core 2 0.63 0.75 0.18 0.11 1.21

Pagina 36 - 6.9 Arrays

130 // Example 13.13b int list[300]; int i; for (i = 0; i < 300; i += 3) { list[i] = 0; list[i+1] = 1; list[i+2] = 2; } The loop unr

Pagina 37

131// Example 13.16a double y, a1, a2, b1, b2; y = a1/b1 + a2/b2; Here we can eliminate one division by making a common denominator: // Example 1

Pagina 38 - Integer size conversion

13213.8 Conversions between floating point numbers and integers Conversion from floating point to integer According to the standards for the C++ lang

Pagina 39 - Float to integer conversion

133Conversion from integer to floating point Conversion of integers to floating point is faster than from floating point to integer. The conversion t

Pagina 40 - Reinterpret cast

134()631638322)1(−−⋅+⋅⋅−= fractiononevaluelongdoubleexponentsign. The value is zero if all bits except the sign bit are zero. Zero can be represente

Pagina 41 - Converting class objects

135float a, b; if (*(unsigned int*)&a * 2 > *(unsigned int*)&b * 2) { // abs(a) > abs(b) } The multiplication by 2 in example 13.28

Pagina 42 - 6.12 Loops

136There is a penalty for accessing part of a variable when the full variable is accessed shortly afterwards because the CPU becomes confused about t

Pagina 43 - Loop unrolling

137Alternatively, you can get ReadTSC as a library function available in www.agner.org/optimize/asmlib.zip. // Example 14.1 #include <intrin.h>

Pagina 44 - Copying or clearing arrays

138the clock counts should be multiplied by the clock period and by the number of times CriticalFunction is called in a typical application to calcul

Pagina 46 - Use fastcall functions

14specified on a command line or an input file. The output goes to the console or to an output file. A console mode program is fast, compact, and sim

Pagina 47 - 6.14 Structures and classes

14015 Overview of compiler options Table 15.1. Command line options relevant to optimization MS compiler Windows Gnu compilerLinux Intel compiler

Pagina 48

141 Table 15.2. Compiler directives and keywords relevant to optimization MS compiler Windows Gnu compiler Linux Intel compiler Windows Intel comp

Pagina 49

142platform Linux platform n.a. __unix__ __linux__ __unix__ __linux__ x86 platform _M_IX86 _M_IX86 x86-64 platform _M_IX86 and _WIN64 _M_X64 _M_

Pagina 50 - 6.19 Inheritance

14316 Literature Other manuals by Agner Fog The present manual is number one in a series of five manuals. See page 3 for a list of titles. Literatur

Pagina 51 - 6.22 Bitfields

144Theoretical textbook about input/output devices, ergonomics, cognition, psychology. W. M. Newman & M. G. Lamming: "Interactive System

Pagina 52 - 6.25 Templates

15A good way to prevent such errors is to replace arrays by well-tested container classes. The standard template library (STL) is a useful source of

Pagina 53

163.2 Use a profiler to find hot spots Before you start to optimize anything, you have to identify the critical parts of the program. In some program

Pagina 54

17Function addresses are obscured in optimized programs. The profiler identifies any hot spots in the program by their address and attempts to transl

Pagina 55 - 6.26 Threads

18arithmetic operations. The most common time-consumers are discussed in the following sections. 3.3 Program installation The time it takes to insta

Pagina 56 - 6.27 Exception handling

19The advantages of using static linking rather than dynamic linking are: Static linking includes only the part of the library that is actually need

Pagina 57

26.18 Runtime type identification (RTTI)... 50 6.19 Inheritance...

Pagina 58

20this leads to a considerable extra overhead. There are two reasons for this overhead. The first reason is that the 32-bit instruction set has no su

Pagina 59 - 6.30 Namespaces

21thread if there is other work that the processor can do while waiting for disk operations to finish. 3.7 System database It can take several secon

Pagina 60 - Function inlining

22cycles if it is not cached. See page 25 about data storage and page 81 about memory caching. 3.11 Context switches A context switch is a switch be

Pagina 61 - Register variables

23the loop counter with its limit, etc. In most cases, you can assume that these integer operations do not add to the total computation time. 4 Per

Pagina 62 - Join identical branches

24Copy protection. Some copy protection schemes are based on hacks that violate or circumvent operating system standards. Such schemes are frequent s

Pagina 63 - Eliminate jumps

25Performance Primitives" library contains many functions for audio and video processing, signal processing, data compression and cryptography (

Pagina 64 - Loop invariant code motion

26The static data area is usually divided into three parts: one for constants that are never modified by the program, one for initialized variables t

Pagina 65 - Induction variables

27Floating point variables use a different kind of registers. There are eight floating point registers available in 32-bit operating systems and sixt

Pagina 66 - Algebraic reductions

28 A class member variable with the static modifier will be stored in static memory and will have one and only one instance. Non-static members of th

Pagina 67 - Devirtualization

29 It is recommended to use the default integer size in cases where the size doesn't matter, such as simple variables, loop counters, etc. In la

Pagina 68

316 Literature... 143 1 Introducti

Pagina 69

30will delay the availability of x for approximately two clock cycles. Obviously, the initial value of i must be adjusted if you change pre-increment

Pagina 70

31The calculation of expressions where operands have mixed precision require precision conversion instructions which can be quite time-consuming (see

Pagina 71 - Pointer aliasing

326.4 Enums An enum is simply an integer in disguise. Enums are exactly as efficient as integers. 6.5 Booleans The order of Boolean operands The ope

Pagina 72 - Pure functions

33This is typically implemented by the compiler in the following way: bool a, b, c, d; if (a != 0) { if (b != 0) { c = 1; } else {

Pagina 73 - Algebraic reduction

34You cannot replace a && b with a & b if b is an expression that should not be evaluated if a is false. Likewise, you cannot replace a |

Pagina 74

35References are useful for copy constructors and overloaded operators. Function parameters that are declared as constant references accept expressi

Pagina 75

36Pentium M processor may be able to predict the target if the changes of the function pointer follows a simple regular pattern, while Pentium 4 and

Pagina 76 - 7.6 Optimization directives

37 More examples of container classes are given in www.agner.org/optimize/cppexamples.zip. An array using the above template class is declared by sp

Pagina 77 - Compiler-specific keywords

38this is not an issue because an optimizing compiler can see that the rows are accessed consecutively and can calculate the address of each row by a

Pagina 78

39Floating point precision conversion Conversions between float, double and long double take no extra time when the floating point register stack is

Pagina 79

4development process. These requirements are often conflicting with the requirements of optimizing the software for speed or size. Today, it is not

Pagina 80

40Pointer type conversion A pointer can be converted to a pointer of a different type. Likewise, a pointer can be converted to an integer, or an inte

Pagina 81 - 8 Optimizing memory access

41Dynamic cast The dynamic_cast operator is used for converting a pointer to one class to a pointer to another class. It makes a runtime check that t

Pagina 82

42 A switch statements is a kind of branch that can go more than two ways. Switch statements are most efficient if the case labels follow a sequence

Pagina 83

43thousand times then the loop control branch is mispredicted only one time in thousand so the misprediction penalty is only a negligible contributio

Pagina 84 - 8.5 Alignment of data

44The loop control condition The most efficient loop control condition is a simple integer counter. A microprocessor with out-of-order capabilities (

Pagina 85

45// Example 6.32a const int size = 1000; int i; float a[size], b[size]; // set a to zero for (i = 0; i < size; i++) a[i] = 0.0; // copy a to b f

Pagina 86

46Transfer large objects by reference If an object (class or structure) is transferred to a function as a parameter, then the entire object is copied

Pagina 87 - 8.7 Container classes

47Floating point parameters are not affected by __fastcall. The implicit 'this' pointer in member functions is also treated like a paramete

Pagina 88

48and member functions is not expensive. You may use an object oriented programming style if it is good for the logical structure and clarity of the

Pagina 89 - 8.9 Access data sequentially

49Structure and class objects can often be made smaller by reordering the data members. If the class has at least one virtual member functions then t

Pagina 90

5 This manual is based on the standard PC platform with an Intel or AMD processor and a Windows, Linux or BSD operating system running in 32-bit or 6

Pagina 91

506.17 Virtual member functions Virtual functions are used for implementing polymorphic classes. Each instance of a polymorphic class has a pointer t

Pagina 92 - 8.11 Explicit cache control

516.20 Constructors and destructors A constructor is implemented as a member function which returns a reference to the object. The allocation of memo

Pagina 93 - Uncached memory store

52 struct { int a:4; int b:2; int c:2; }; char abc; }; Bitfield x; int A, B, C; x.abc = A | (B << 4) | (C << 6)

Pagina 94 - 9 Using multiple CPU cores

53 template <int m> int MultiplyBy (int x) { return x * m;} int a, b; a = Multiply(10,8); b = MultiplyBy<8>(10); a and b will both g

Pagina 95

54 public: virtual void Disp() { cout << 1; } }; class C2 : public CHello { public: virtual void Disp() { cout <&

Pagina 96 - 10 Out of order execution

55 void Disp() { cout << 2; } }; void test () { CChild1 Object1; CChild2 Object2; CChild1 * p1; p1 = &Object1; p1-

Pagina 97

56threads with different priorities then the user might experience unacceptably long response times to keyboard and mouse inputs when the program is

Pagina 98 - 11 Using vector operations

57 } catch (...) { ... } } The function F1 is supposed to call the destructor for the object x when it returns. But what if an exception

Pagina 99 - 11.1 Automatic vectorization

58 __try { // Main loop for calculations: for ( ; i < arraysize; i++) { // Overflow may occur in multiplicatio

Pagina 100

59Exception handling is not necessary when no attempt is made to recover from errors. If you just want the program to issue an error message and stop

Pagina 101 - Using intrinsic functions

6Some systems have a graphics processing unit, usually on a graphics card. Such units can be used as coprocessors to take care of some of the heavy g

Pagina 102

607 Optimizations in the compiler 7.1 How compilers optimize Modern compilers can do a lot of modifications to the code in order to improve performan

Pagina 103 - Using vector classes

61 float a, b; a = parabola (2.0f); b = a + 1.0f; The compiler may replace this by // Example 7.3b a = 5.0f; b = 6.0f; Constant folding and consta

Pagina 104 - Redefining a vector class

62 The maximum number of floating point register variables is eight in 32-bit systems and sixteen in 64-bit systems. Some compilers have difficulties

Pagina 105

63else { y = cos(x); } z = y + 1.; Eliminate jumps Jumps can be avoided by copying the code that it jumps to. Example: // Example 7.9a int SomeF

Pagina 106

64 } else { return a - 1; } } The compiler may reduce this to: // Example 7.11b int SomeFunction (int a, bool b) { if (b) {

Pagina 107 - Defining missing operators

65Induction variables An expression that is a linear function of a loop counter can be calculated by adding a constant to the previous value. Example

Pagina 108

66The compilers I have studied do not make induction variables for floating point expressions or more complex integer expressions. See page 73 for an

Pagina 109

67This may be useful if the subexpression c+b can be reused elsewhere. In this example, we are using 8-bit integers which range from -128 to +127. An

Pagina 110

68Without optimization, the compiler needs to look up in a virtual table to see whether the call p->f() goes to C0::f or C1::f. But an optimizing

Pagina 111

69((a*x+b)*x+c)*x+d x*x*x*x*x*x*x*x = ((x2) 2) 2 - - x - - - - - - a+a+a+a = a*4 x - x x - - - - x -(-a) = a x - x x x x x x - a-(-b) = a+b x - x

Pagina 112

7The SSE2 instruction set is supported on all 64-bit CPU's and operating systems. The 64 bit instruction set supports self-relative addressing

Pagina 113 - Vectorized table lookup

70(a&&b) || (!a&&c) || (b&&c) = a ? b : c x - - x - - - - - (a&&b) || (a&&b&&c) = a&&

Pagina 114

71Boolean XMM (vector) reductions: ~(~a) = a - n.a. - - - - n.a. n.a. - (a&b)|(a&c) = a&(b|c) - n.a. - - - - n.a. n.a. -

Pagina 115

72 Most compilers have an option for assuming no pointer aliasing (/Oa). The easiest way to overcome the obstacle of possible pointer aliasing is to

Pagina 116

73#define pure_function __attribute__((const)) #else #define pure_function #endif double Func1(double) pure_function ; double Func2(double x) {

Pagina 117 - CPU's

74The calculation of this polynomial can be done with just two additions by the use of two induction variables: // Example 7.23b. Calculate polynomi

Pagina 118

75 Avoid long dependency chains, especially loop-carried dependency chains with long latencies. 7.5 Compiler optimization options All C++ compilers

Pagina 119

76 You may choose a newer instruction set when compatibility with old CPU's is not needed. Even better, you may make multiple versions of the mo

Pagina 120

77int List[ArraySize]; ... for (int i = 0; i < ArraySize; i++) List[i]++; Here, the compiler can replace all occurrences of ArraySize by the valu

Pagina 121

78Assume no pointer aliasing. __declspec(noalias) or __restrict or #pragma optimize("a",on). Specifies that pointer aliasing does not occur

Pagina 122 - Disclaimer

79 mov ecx, DWORD PTR [esp+8] ; ecx = a xor eax, eax ; eax = i = 0 mov edx, DWORD PTR [esp

Pagina 123 - 13.1 Use lookup tables

8Several modern programming languages use an intermediate code (byte code). The source code is compiled into an intermediate code, which is the code

Pagina 124

80compiler has not noticed that i can never be negative so that we don't have to care about the sign bit. We can tell it this by making i an uns

Pagina 125 - 13.2 Bounds checking

81Furthermore, this solution is using one register less so that it doesn't have to push and pop ebx. 8 Optimizing memory access 8.1 Caching of

Pagina 126

82this can cause contentions in the code cache. The subsequent sections describe various ways to avoid these problems. More details about how caches

Pagina 127 - 13.4 Integer multiplication

83in which they are used. Such variables and objects will be stored on the stack, which is very likely to be in the level-1 cache. The different kind

Pagina 128 - 13.5 Integer division

84 // Example 8.2b void F3(bool y) { union { int a[1000]; float b[1000]; }; if (y) { F1(a); } else { F2(b

Pagina 129

85Arrays that are too large for the stack can be allocated dynamically. The advantages of dynamic memory allocation are: Gives a more clear program

Pagina 130 - 13.6 Floating point division

86step by step. In most systems, you cannot increase the size of a memory block that has already been allocated. If the final size cannot be predicte

Pagina 131

87 DynamicArray[i] = WhateverFunction(i); // ... } } } Obviously, a function should never return any pointer or reference

Pagina 132

88themselves. But implementing a matrix in STL as a vector of vectors, as is often seen, is certainly not the most efficient solution. Fortunately,

Pagina 133

89• Is searching needed before all objects have been added? If search facilities are needed, and new objects can be added at any time, then the solu

Pagina 134

9An important disadvantage of C++ relates to security. There are no checks for array bounds violation, integer overflow, and invalid pointers. The ab

Pagina 135

90const int NUMROWS = 100, NUMCOLUMNS = 100; int matrix[NUMROWS][NUMCOLUMNS]; int row, column; for (row = 0; row < NUMROWS; row++) for (column

Pagina 136 - 14 Testing speed

91the rows, not the columns. Every fourth of these cache lines belong to the same set in the cache. When we reach element number 16 in column 28, the

Pagina 137

92 // Define size of squares: const int TILESIZE = 8; // SIZE must be divisible by TILESIZE // Loop r1 and c1 for all squares:

Pagina 138

93Table 8.3. Cache control instructions. There are other cache control instructions than the ones mentioned in table 8.3, such as flush and fence in

Pagina 139

94const int SIZE = 512; // number of rows and columns in matrix // function to transpose and copy matrix void TransposeCopy(double a[SIZE][SIZE], dou

Pagina 140

95It is important to distinguish between coarse-grained parallelism and fine-grained parallelism when deciding whether it is advantageous to do thing

Pagina 141 - __GNUC__ and not

96A CPU with multiple cores or a system with multiple CPU's is very useful for running multiple tasks in parallel. A CPU-intensive program with

Pagina 142

97float list[size], sum1 = 0, sum2 = 0; int i; for (i = 0; i < size; i += 2) { sum1 += list[i]; sum2 += list[i+1];} sum1 += sum2; If the m

Pagina 143 - 16 Literature

98The loop branch should be predicted. This is no problem if the repeat count is large or constant. If the loop count is small and changing then the

Pagina 144 - Internet forums

99The use of vector operations is more advantageous the smaller the data elements are. For example, it takes the same time to add two vectors of four

Comentarios a estos manuales

Sin comentarios