In contrast with some older languages, such as FORTRAN 77 or COBOL, C++ is freeform in the sense that whitespace (spaces, tabs, returns) has no meaning. The code can be laid out as you wish, provided that each individual command is separated by a semicolon (;) and blocks of commands are delineated by braces ({ ... }). Explanatory remarks, called comments, can be inserted by signalling to the compiler that certain parts of the program file are to be ignored. Single line comments begin with a double slash (//) and extend to the end of the current line. Blocks of comments are enclosed by matching slash-star pairs (/* ... */). Both the following code listings are interpreted identically by the compiler.
int main()
{
return 0; // zero is the standard return value
// when there are no errors to report
}
// C++ code is freeform and all whitespace is treated equally
int main( ){
; ;
; ;
return 0 ;}
/* Related blocks of code are enclosed in matching pairs
of braces. Each individual command is followed by a
semicolon. In the function above, there are four "do
nothing" commands and one return statement. */
These two examples are variations on the null program, int main(){return 0;}, which is the simplest possible. (It begins, does nothing, and ends.) Every valid C++ program must contain exactly one function called main that returns an integer value to the operating system. Program flow begins with the call to main and terminates when all the statements in main have been executed.
Roughly speaking, everything in the C++ language is either an object or a function. An object occupies memory (to store its data) and has a definite type associated with it. A function acts on zero or more objects and returns at most one object. (C++ functions encompass both the mathematical notion of a function and what is more commonly called a procedure.) The function syntax is illustrated below.
int mult(int x, int y) { return x*y; }
#include <cmath> // read in the C standard math library
using std::pow; // make the function pow available
double geo_ave(double a, double b, double c)
{
double prod = a*b*c;
return pow(prod,1.0/3.0);
}
void rescale(float &x, int n) { x *= n; }
The function mult takes two ints (i.e., integer numbers), named x and y, and returns their product. The function geo_ave takes three doubles (i.e., double precision floating point numbers) and returns their geometric average. Note that geo_ave itself calls another function, pow, which is used to raise a*b*c to the one third power. (In some languages, exponentiation is provided via the ** or ^ operators; this is not true of C and C++.) The function rescale takes a float (i.e., a single precision floating point number) and an int and multiplies the former by the latter. void indicates that the function has no return value.
The ampersand (&) before x tells the compiler that that argument is passed by reference rather than by value, which means that changes to x within rescale are propagated outside the function.
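To make the distinction concrete, here is a minimal sketch that contrasts the two calling conventions. The names rescale_copy and rescale_ref are illustrative only and are not used elsewhere in this text.

```cpp
// Pass by value: the function works on a private copy of x,
// so the caller's variable is left untouched.
void rescale_copy(float x, int n) { x *= n; }

// Pass by reference: x is an alias for the caller's variable,
// so the multiplication is still visible after the call returns.
void rescale_ref(float &x, int n) { x *= n; }
```

Starting from float a = 2.0F, the call rescale_copy(a,3) leaves a equal to 2, whereas rescale_ref(a,3) leaves a equal to 6.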
int, double, and float are examples of atomic objects—what are commonly called plain old datatypes (PODs). As we will see later, composite objects can be created using C arrays (to group many objects of the same type) or using the struct and class keywords (to group objects of arbitrary type). All objects are either constant or mutable and must be declared before they are used. Constants (constant objects) must be defined at the moment of declaration and their value can never change. Variables (mutable objects) can be defined at the moment of declaration or later. Their value can always be changed.
Example
$ cat > add.cpp
#include <iostream>
using std::cout; // stream object directed to stdout
using std::endl; // end-of-line marker
int add(int a, int b) { return a+b; }
int add(int a, int b, int c) { return a+b+c; }
void add2(int a, int b, int &c) { c = a+b; }
int main()
{
cout << "1+2+3 = " << add(1,2,3) << endl;
const int x = 4; // x declared as an integer constant
// and assigned the value 4
int y,z; // y and z declared as integer variables
y = 5; // y assigned the value 5; z remains uninitialized.
cout << "4+5 = " << add(x,y) << endl;
add2(x,y,z); // z is assigned the sum x+y
cout << "4+5 = " << z << endl;
return 0;
}
[ctrl-d]
$ g++ -o add add.cpp
$ ./add
1+2+3 = 6
4+5 = 9
4+5 = 9
In the example above, the iostream library is invoked to handle output to the terminal. Note that the function add is overloaded, i.e., it has multiple definitions with different sets of arguments. This is legal in C++ provided that the compiler can unambiguously determine which version of the function is being called. The function add2 requires a different name since overloaded functions cannot be distinguished by return type (and another add already exists that takes three integer arguments).
Objects exist within a certain scope—at the code block level in which they are declared and in all blocks nested inside, unless preempted by another variable declared with the same name. In some cases, preempted names can be uncovered using ::, the scope operator.
Example
$ cat > scope.cpp
#include <iostream>
using std::cout;
using std::endl;
int i = 1; // global variable
class foo
{
public:
static int i; // static class variable
};
int foo::i = 2;
int main()
{
int i = 3;
int j,k,l,m;
{
int i = 4;
j = i;
l = foo::i;
m = ::i;
}
k = i;
cout << j << k << l << m << endl;
}
[ctrl-d]
$ g++ -o scope scope.cpp
$ ./scope
4321
The names that are given to objects and functions are known as identifiers. Identifiers can be made up of any combination of letters, numbers, and underscore characters (_), except that they must begin with a letter or underscore and they cannot be identical to any of the keywords that make up the C++ language. Note that C++ is case-sensitive; hence, foo, Foo, and FOO are three unique names.
Keyword | Description |
---|---|
asm | insert an assembly instruction |
auto | declare a local variable |
bool | declare a boolean variable |
break | break out of a loop |
case | a block of code in a switch statement |
catch | handles exceptions from throw |
char | declare a character variable |
class | declare a class |
const | declare immutable data or functions that do not change data |
const_cast | cast from const variables |
continue | bypass iterations of a loop |
default | default handler in a case statement |
delete | make memory available |
do | looping construct |
double | declare a double precision floating-point variable |
dynamic_cast | perform runtime casts |
else | alternate case for an if statement |
enum | create enumeration types |
explicit | only use constructors when they exactly match |
export | allows template definitions to be separated from their declarations |
extern | tell the compiler about variables defined elsewhere |
false | the boolean value of false |
float | declare a floating-point variable |
for | looping construct |
friend | grant non-member function access to private data |
goto | jump to a different part of the program |
if | execute code based on the result of a test |
inline | optimize calls to short functions |
int | declare an integer variable |
long | declare a long integer variable |
mutable | override a const variable |
namespace | partition the global namespace by defining a scope |
new | allocate dynamic memory for a new variable |
operator | create overloaded operator functions |
private | declare private members of a class |
protected | declare protected members of a class |
public | declare public members of a class |
register | request that a variable be optimized for speed |
reinterpret_cast | change the type of a variable |
return | return from a function |
short | declare a short integer variable |
signed | modify variable type declarations |
sizeof | return the size of a variable or type |
static | create permanent storage for a variable |
static_cast | perform a nonpolymorphic cast |
struct | define a new structure |
switch | execute code based on different possible values for a variable |
template | create generic functions |
this | a pointer to the current object |
throw | throws an exception |
true | the boolean value of true |
try | execute code that can throw an exception |
typedef | create a new type name from an existing type |
typeid | describes an object |
typename | declare a class or undefined type |
union | a structure that assigns multiple variables to the same memory location |
unsigned | declare an unsigned integer variable |
using | import complete or partial namespaces into the current scope |
virtual | create a function that can be overridden by a derived class |
void | declare functions or data with no associated data type |
volatile | warn the compiler about variables that can be modified unexpectedly |
wchar_t | declare a wide-character variable |
while | looping construct |
C++ offers a variety of integer types based on int and modified by the keywords short, long, signed, and unsigned. It also offers three floating point (FP) types, float, double, and long double. On almost all modern computer architectures, the first two types correspond to the single precision (binary32) and double precision (binary64) encodings specified in the IEEE Standard for Floating-Point Arithmetic (IEEE 754). long double refers to a floating point type that may, and usually does, have greater than double precision. On the x86 architecture, most compilers implement long double as the 80-bit extended precision type supported by that hardware. On some other architectures, long double is a 128-bit quadruple precision type.
Type | Description | Example |
---|---|---|
bool | boolean value | bool t = true; bool f = false; |
char | single character | char c = 'a'; |
signed char | single character | signed char c = 'a'; |
unsigned char | single character | unsigned char c = 'a'; |
wchar_t | single wide character | wchar_t wc = 'a'; |
short int | short integer | short i = 5; |
int | integer | int i = 5; |
long int | long integer | long i = 5L; |
unsigned short int | unsigned short integer | unsigned short i = 5U; |
unsigned int | unsigned integer | unsigned i = 5U; |
unsigned long int | unsigned long integer | unsigned long i = 5UL; |
float | single precision FP | float x = 10.2F; |
double | double precision FP | double x = 10.2; |
long double | high precision FP | long double x = 10.2L; |
Numerical constants such as 5 and 10.2 are called literals. All literals have a type. By convention, obvious integers such as 5 are of type int rather than of type short or long. An unsigned or long version of an integer constant can be created by appending the suffix U or L or both. (In this one situation, C++ is case-insensitive. These suffixes can equally be written lower case.) Thus, if the function foo is overloaded as follows:
void foo(int);
void foo(unsigned);
void foo(unsigned long);
then foo(23), foo(23u), and foo(23ul) call three different functions.
Example
$ cat > overload.cpp
#include <iostream>
using std::cout;
using std::endl;
void report(int i)
{
cout << "The integer " << i << " is signed" << endl;
}
void report(unsigned int i)
{
cout << "The integer " << i << " is unsigned" << endl;
}
int main()
{
report(5);
report(-5);
report(5u);
return 0;
}
[ctrl-d]
$ g++ -o overload overload.cpp
$ ./overload
The integer 5 is signed
The integer -5 is signed
The integer 5 is unsigned
Similarly, obvious decimal numbers such as 10.2 are of type double. A single precision version can be specified by appending F. A long double version can be specified by appending L. Floating point literals can also be specified in a variant of scientific notation, where \(m\)E\(e\) stands in for \(m \times 10^e\). The mantissa (\(m\)) is an arbitrary decimal number, whereas the exponent (\(e\)) must be an integer.
Type | Encoding/Base | Example |
---|---|---|
char[] | ASCII | "hello" |
char | ASCII | 'a' |
wchar_t | wide character | L'a' |
int | octal | 01 |
int | decimal | 1 |
int | hexadecimal | 0x1 |
int | ASCII | 'ABC' (two to four characters) |
unsigned int | octal | 01U |
unsigned int | decimal | 1U |
unsigned int | hexadecimal | 0x1U |
long int | octal | 01L |
long int | decimal | 1L |
long int | hexadecimal | 0x1L |
unsigned long int | octal | 01UL |
unsigned long int | decimal | 1UL |
unsigned long int | hexadecimal | 0x1UL |
float | decimal | 12.3F |
float | scientific | 1.23E1F, 123E-1F |
double | decimal | 12.3 |
double | scientific | 1.23E1, 123E-1 |
long double | decimal | 12.3L |
long double | scientific | 1.23E1L, 123E-1L |
Literals of the lexical types are enclosed in quotation marks: single quotes for characters (char) and double quotes for C strings (char[]). Since ' and " are used as delimiters, special escape sequences \' and \" are used to produce quotes within a literal. For example, char a = '\'' assigns a single quote to the char variable a. There are many other two-character escape sequences beginning with a backslash.
Code | Character | Description |
---|---|---|
\\ | \ | backslash |
\' | ' | single quote |
\" | " | double quote |
\? | ? | question mark |
\0 | <NUL> | binary 0 |
\a | <BEL> | bell (audible alert) |
\b | <BS> | back space |
\f | <FF> | form feed |
\n | <NL> | new line |
\r | <CR> | carriage return |
\t | <HT> | horizontal tab |
\v | <VT> | vertical tab |
We have not yet said anything about the internal representation of the various types. In C++ this is left up to the compiler (with some restrictions on the relative sizes) and varies from platform to platform. The sizeof operator can be used to query the compiler as to the number of bytes that are needed to store each of the PODs.
Example
$ cat > sizes.cpp
#include <iostream>
using std::cout;
using std::endl;
int main()
{
const char message[] = "Hello";
cout << "char = " << sizeof(char) << endl;
cout << "\"Hello\" = " << sizeof(message) << endl;
cout << "unsigned short = " << sizeof(unsigned short) << endl;
cout << "short = " << sizeof(short) << endl;
cout << "unsigned int = " << sizeof(unsigned int) << endl;
cout << "int = " << sizeof(int) << endl;
cout << "unsigned long = " << sizeof(unsigned long) << endl;
cout << "long = " << sizeof(long) << endl;
cout << "float = " << sizeof(float) << endl;
cout << "double = " << sizeof(double) << endl;
return 0;
}
[ctrl-d]
$ g++ -o sizes sizes.cpp
$ ./sizes
char = 1
"Hello" = 6
unsigned short = 2
short = 2
unsigned int = 4
int = 4
unsigned long = 4
long = 4
float = 4
double = 8
There are a few things to take note of in the example above. The square brackets [] indicate that message is an array of chars. The size of "Hello" is 6 rather than 5 since the string is actually stored in memory as a list of characters terminated by the binary byte 0.
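The byte counts printed above come from one particular platform; many 64-bit Unix systems, for instance, report 8 for long. What the language does promise is that sizeof(char) is 1 and that the integer types never shrink as you move from char to short to int to long. The sketch below checks those promises; the names greeting, greeting_bytes, and integer_sizes_are_ordered are hypothetical helpers introduced only for illustration.

```cpp
// A C string literal carries a hidden terminating '\0' byte.
const char greeting[] = "Hello";

// Number of bytes occupied by the array, terminator included.
int greeting_bytes() { return static_cast<int>(sizeof(greeting)); }

// sizeof(char) is fixed at 1 and the integer types are ordered by size;
// the actual byte counts are left to the compiler.
bool integer_sizes_are_ordered()
{
    return sizeof(char) == 1
        && sizeof(char) <= sizeof(short)
        && sizeof(short) <= sizeof(int)
        && sizeof(int) <= sizeof(long);
}
```

On any conforming compiler greeting_bytes() should return 6 and integer_sizes_are_ordered() should return true, whatever the individual sizes happen to be.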
Various operations can be performed on PODs. A unary operation acts on a single object. Binary and ternary operations act on pairs and triplets of objects. The available operators are listed in order of precedence.
Precedence | Operator | Description | Example |
---|---|---|---|
1 | :: | scope operator | Class::age = 2; |
2 | () | grouping operator | (a+b)/4; |
2 | [] | array access | array[4] = 2; |
2 | -> | member access from a pointer | ptr->age = 34; |
2 | . | member access from an object | obj.age = 34; |
2 | ++ | post-increment | for (i = 0; i < 10; i++) ... |
2 | -- | post-decrement | for (i = 10; i > 0; i--) ... |
3 | ! | (unary) logical negation | if (!done) ... |
3 | ~ | (unary) bitwise complement | flags = ~flags; |
3 | ++ | pre-increment | for (i = 0; i < 10; ++i) ... |
3 | -- | pre-decrement | for (i = 10; i > 0; --i) ... |
3 | + | (unary) plus | int i = +1; |
3 | - | (unary) minus | int i = -1; |
3 | * | pointer dereference | data = *ptr; |
3 | & | address of | address = &obj; |
3 | (type) | cast to a given type | int i = (int) floatNum; |
3 | sizeof | return size in bytes | int size = sizeof(float); |
4 | ->* | member pointer selector | ptr->*var = 24; |
4 | .* | member object selector | obj.*var = 24; |
5 | * | multiplication | int i = 2*4; |
5 | / | division | float f = 10.0/3; |
5 | % | modulus | int rem = 4%3; |
6 | + | addition | int i = 2+3; |
6 | - | subtraction | int i = 5-1; |
7 | << | bitwise shift left | int flags = 33 << 1; |
7 | >> | bitwise shift right | int flags = 33 >> 1; |
8 | < | comparison less-than | if (i < 42) ... |
8 | <= | comparison less-than-or-equal-to | if (i <= 42) ... |
8 | > | comparison greater-than | if (i > 42) ... |
8 | >= | comparison greater-than-or-equal-to | if (i >= 42) ... |
9 | == | comparison equal-to | if (i == 42) ... |
9 | != | comparison not-equal-to | if (i != 42) ... |
10 | & | bitwise AND | flags = flags & 42; |
11 | ^ | bitwise exclusive OR | flags = flags ^ 42; |
12 | \| | bitwise inclusive OR | flags = flags \| 42; |
13 | &&, and | logical AND | if (a && b) ... |
14 | \|\|, or | logical OR | if (a or b) ... |
15 | ? : | (ternary) conditional if-then-else | int i = (a > b) ? a : b; |
16 | = | assignment operator | int a = b; |
16 | += | add and assign | a += 3; |
16 | -= | subtract and assign | a -= 4; |
16 | *= | multiply and assign | a *= 5; |
16 | /= | divide and assign | a /= 2; |
16 | %= | modulo and assign | a %= 3; |
16 | &= | bitwise AND and assign | flags1 &= flags2; |
16 | ^= | bitwise exclusive OR and assign | flags1 ^= flags2; |
16 | \|= | bitwise inclusive OR and assign | flags1 \|= flags2; |
16 | <<= | bitwise shift left and assign | flags <<= 2; |
16 | >>= | bitwise shift right and assign | flags >>= 2; |
17 | , | sequential evaluation operator | for (i=0, j=0; i < 10; ++i, ++j) ... |
The precedence of an operation determines the order in which it is performed. For example, a+b*-c is interpreted as a+(b*(-c)), since unary minus has higher precedence than multiplication which in turn has higher precedence than addition. Similarly, a/b-+c is read as (a/b)-(+c) and a%b*c as (a%b)*c. Parentheses override the order of operations.
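The grouping rules just described can be checked directly. The following sketch evaluates the same expression with and without explicit parentheses; the three function names are made up for this illustration.

```cpp
// Written without parentheses: relies entirely on operator precedence.
int implicit_grouping(int a, int b, int c) { return a + b * -c; }

// The same expression with the grouping spelled out.
int explicit_grouping(int a, int b, int c) { return a + (b * (-c)); }

// Parentheses override precedence and force the addition to happen first.
int addition_first(int a, int b, int c) { return (a + b) * -c; }
```

With a = 2, b = 3, and c = 4, the first two functions both return -10, whereas the third returns -20.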
Example
$ cat > ops.cpp
#include <iostream>
using std::cout;
using std::endl;
const double hbar = 6.57E-16; // eV.s
const double omega0 = 5.1E14; // 1/s
double energy(int n)
{
return hbar*omega0*(n+0.5);
}
void write(int n)
{
cout << "level " << n << ": " << energy(n) << " eV" << endl;
}
int main()
{
cout << "Harmonic oscillator energies: \n"
"----------------------------- \n"
"level 0: " << energy(0) << " eV" << endl
<< "level 1: " << energy(1) << " eV" << endl
<< "level 2: " << energy(2) << " eV" << endl;
int n = 2;
write(++n);
write(++n);
write(n+1);
write(n+2);
write(2*n-1);
write(1 << 3);
write(75 % 11);
n *= 3;
n -= 2;
write(n);
double hbar = 0;
cout << endl << "hbar = " << ::hbar << " eV.s" << endl;
return 0;
}
[ctrl-d]
$ g++ -o ops ops.cpp
$ ./ops
Harmonic oscillator energies:
-----------------------------
level 0: 0.167535 eV
level 1: 0.502605 eV
level 2: 0.837675 eV
level 3: 1.17275 eV
level 4: 1.50782 eV
level 5: 1.84289 eV
level 6: 2.17796 eV
level 7: 2.51303 eV
level 8: 2.8481 eV
level 9: 3.18317 eV
level 10: 3.51824 eV
hbar = 6.57e-16 eV.s
For some operations, the order in which actions are performed is not always unambiguous. In particular, one should beware of the increment (++) and decrement (--) operators, which rely on side effects (the term of art for operations that modify their operands). For example, the unary operations +x and -x have no side effects; they return the value of x and its negative, but leave the variable itself unchanged. On the other hand, ++x increments x by one and then returns the new value. x++ returns the current value of x and then increments the variable by one. Exactly when the increment happens is sometimes hard to predict.
int x = 1;
int y = ++x; // y == 2, x == 2
int z = x++; // z == 2, x == 3
x = x / ++x; // undefined behaviour
In the last line above, x is both read and modified, and no rule fixes which happens first. The C++ standard classifies this as undefined behaviour, so the result is not guaranteed to be consistent across compilers. When the order of evaluation is in doubt, it is best to break up operations into several steps.
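As a sketch of what "several steps" might look like, the two functions below each commit to one reading of the ambiguous line. Both names are hypothetical, and both assume that x is not -1 on entry, since the divisor would then be zero.

```cpp
// Reading 1: divide the original value of x by the incremented value.
int old_over_new(int x)
{
    const int old_value = x;  // remember x before the increment
    ++x;                      // the side effect happens here, and only here
    return old_value / x;     // assumes x != 0 at this point
}

// Reading 2: increment first, then divide the new value by itself.
int new_over_new(int x)
{
    ++x;
    return x / x;             // assumes x != 0 at this point
}
```

For x = 3 the first reading gives 3/4, which is 0 in integer arithmetic, and the second gives 4/4, which is 1. Writing the steps out makes the intended reading explicit.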
There are a few subtleties to the logical operations. First, it is very important to distinguish the equality comparison operator (==) from the assignment operator (=). The statement x = 0 assigns x the value zero, whereas x == 0 checks whether x has the value zero and returns true or false. Second, comparison operations should not be chained together, since x < y < z (incorrect) and x < y and y < z (correct) are interpreted quite differently by the compiler. Finally, C++ employs lazy evaluation (also known as short-circuit evaluation) for the logical operators and and or. That is, it takes advantage of the fact that the outcome of some binary logical operations can be predicted from the first operand alone. Consider the truth tables for and and or.
or | true | false |
---|---|---|
true | true | true |
false | true | false |
and | true | false |
---|---|---|
true | true | false |
false | false | false |
It is clear that true or x is always true and that false and y is always false, regardless of the boolean values x and y. Since x and y do not need to be evaluated in this instance, they will not be. This matters primarily when the values x and y are returned from a function. If we attempt to evaluate the expression (one() or two()) and one() evaluates to true, then the function two() is not called. Similarly, if we attempt to evaluate the expression (one() and two()) and one() evaluates to false, then again two() is not called. This may be important if two() has side effects. Lazy evaluation always occurs from left to right.
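One way to observe this behaviour is to give two() a visible side effect, such as incrementing a counter. In the sketch below, the counter calls_to_two and all of the function names are assumptions made for the sake of the demonstration.

```cpp
int calls_to_two = 0;   // counts how often two() actually runs

bool one_true()  { return true; }
bool one_false() { return false; }
bool two()       { ++calls_to_two; return true; }

// true || two(): the result is already known, so two() is skipped.
int calls_in_true_or()
{
    calls_to_two = 0;
    const bool result = one_true() || two();
    (void)result;       // the value itself is not needed here
    return calls_to_two;
}

// false && two(): again the result is already known, so two() is skipped.
int calls_in_false_and()
{
    calls_to_two = 0;
    const bool result = one_false() && two();
    (void)result;
    return calls_to_two;
}

// false || two(): the first operand does not decide the outcome, so two() runs.
int calls_in_false_or()
{
    calls_to_two = 0;
    const bool result = one_false() || two();
    (void)result;
    return calls_to_two;
}
```

The first two functions return 0 and the third returns 1, confirming that two() is invoked only when its value is needed.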
The time required for the computer to execute each operation is not constant and depends on the particular machine architecture. Of the arithmetic operations, addition and subtraction are generally a little faster than multiplication, and division is usually the slowest. Nonetheless the execution times are roughly comparable, and it is meaningful to think of performance in terms of how many operations are necessary to complete a particular calculation.
For example, what is an efficient algorithm for evaluating polynomials of the form \(P(x) = a_nx^n + \cdots + a_1x + a_0\) ? Naively, we might write something like
double polynom(double x, double a1, double a0)
{
return a1*x + a0;
}
double polynom(double x, double a2, double a1, double a0)
{
return a2*x*x + a1*x + a0;
}
double polynom(double x, double a3, double a2, double a1, double a0)
{
return a3*x*x*x + a2*x*x + a1*x + a0;
}
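in which case the total number of operations scales as
\[\frac{n(n+1)}{2} + n = \frac{n(n+3)}{2}.\]
That is, \(n(n+1)/2\) multiplications (the term \(a_kx^k\) costs \(k\) of them) and \(n\) additions give \(n(n+3)/2\) operations in total. On the other hand, if we regroup the terms in the polynomial as follows
\[P(x) = (\cdots((a_nx + a_{n-1})x + a_{n-2})x + \cdots + a_1)x + a_0\]
then evaluation requires only \(2n\) operations: \(n\) multiplications and \(n\) additions.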
double polynom(double x, double a1, double a0)
{
return a1*x + a0;
}
double polynom(double x, double a2, double a1, double a0)
{
return (a2*x + a1)*x + a0;
}
double polynom(double x, double a3, double a2, double a1, double a0)
{
return ((a3*x + a2)*x + a1)*x + a0;
}
In so-called “big-O” notation, we say that the first scheme is \(O(n^2)\) whereas the second, called Horner’s scheme, is \(O(n)\). For polynomials of very high order, this difference becomes significant.
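The overloaded functions above handle one fixed order each. As a rough sketch of how Horner's scheme extends to arbitrary order, the function below takes the coefficients in an array and applies the regrouping in a loop. It relies on C arrays and the for construction, both of which are covered later, and the name horner and its argument layout are assumptions rather than part of the original text.

```cpp
// Evaluate a[n]*x^n + ... + a[1]*x + a[0] by Horner's scheme.
// a must hold n+1 coefficients, with a[k] multiplying x^k.
double horner(const double a[], int n, double x)
{
    double result = a[n];
    for (int k = n - 1; k >= 0; --k)
        result = result*x + a[k];   // one multiplication and one addition per step
    return result;
}
```

For example, with the coefficients {1.0, 3.0, 2.0} and n = 2, the call horner(a, 2, 2.0) evaluates \(2x^2 + 3x + 1\) at \(x = 2\) and returns 15. The loop body runs \(n\) times, so the operation count grows linearly with the order.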
Exercise
Write a function that evaluates \(5x^7 -8 x^6 + x^4 - x\) using the fewest operations.
Not all the C++ operators can act on all the PODs. For instance, the modulus operation only makes sense for integer values. Hence, 4%3 is valid code, whereas 5.75%2.25 triggers a compiler error. Another important restriction is that binary operations can only operate on two operands of identical type. If they are made to act between different types, and where it is sensible to do so, the compiler will quietly convert to the most expressive type of the two. This behind-the-scenes type conversion is called implicit casting.
double x = 2.0;
float y = 3.0;
x+y;
The addition operation above is actually carried out as x+(double)y or x+double(y). That is, the float is first cast to a double before the addition is carried out. The result of the addition is thus a double.
In some cases, you will want to explicitly cast one type to another. A good example is the division operation, which has subtly different behaviour depending on the types involved.
5/2; // == 2
5.0/2.0; // == 2.5
double(5)/2; // == double(5)/double(2) == 2.5
5/( (double)2 ); // == double(5)/double(2) == 2.5
static_cast<double>(5)/2; // == double(5)/double(2) == 2.5
C++ has inherited from C the cast notation (type)object and type(object). It also has its own specialized casting operators static_cast, const_cast, dynamic_cast, and reinterpret_cast, the last two of which are rarely used. When acting on PODs, the static_cast is equivalent to a C cast.
It is important to note that floating point types are cast to integer type by truncating the fractional part: e.g., (int)3.14 == 3, int(3.999) == 3, and int(-3.999) == -3. This behaviour implies that positive numbers are always rounded down and negative numbers always rounded up. To achieve conventional rounding, you might define a function like this.
inline int round(double x)
{
return x >= 0 ? (int)(x+0.5) : (int)(x-0.5);
}
Exercise
Write a function that rounds to the nearest even integer.
In order to maintain backward compatibility with C (which does not have a built-in boolean type), the values true and false are cast to the integers 1 and 0, respectively. This is the reason why the chained logical comparison we encountered in the last section does not actually produce an error. (It is valid code. It just doesn’t behave as expected!) The programmer’s intent is clearly to check that the three numbers have increasing value. In practice that is not how things work out. The less than (<) operator is binary and groups from left to right, so the statement x < y < z is read as (x < y) < z, i.e., as a nested pair of comparisons. The bracketed term then evaluates to either true or false, neither of which is a numerical type that can be compared to z. Hence, an implicit cast transforms the second comparison into either 0 < z or 1 < z.
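The following sketch puts the two readings side by side. The function names chained and increasing are invented for this comparison; the parentheses in chained only make explicit the grouping the compiler would apply anyway.

```cpp
// What the compiler actually does with x < y < z.
bool chained(int x, int y, int z) { return (x < y) < z; }

// What the programmer almost certainly meant.
bool increasing(int x, int y, int z) { return x < y && y < z; }
```

For the decreasing triple (3, 2, 1), chained returns true, because 3 < 2 is false and 0 < 1 holds, whereas increasing correctly returns false.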
You will often want the computer to conditionally execute code based on the current value of some variable. The way to do this is with the if keyword or with the ?: operator. For example, to take the absolute value of a number, you want to check if it is negative and if so negate it.
double x = -5.0;
double abs_x;
if (x > 0.0)
abs_x = x;
else
abs_x = -x;
Only one of these code branches is executed, after which abs_x holds the absolute value of the variable x. The syntax for if is as follows: if (logical expression) action1 else action2, where the actions are either a single statement or a code block enclosed in braces. The fail condition marked by else is optional. We could have written this instead:
double x = -5.0;
double abs_x = x;
if (abs_x < 0.0) abs_x = -abs_x;
An alternative formulation, based on the ?: operator, has the advantage that abs_x can be made const, since it is defined at the same time it is declared.
double x = -5.0;
const double abs_x = ( x > 0.0 ? x : -x );
Testing for several logical conditions can be carried out using if and else chained in series.
if (x == 1) { /* code for x == 1 */ }
else if (x == 2) { /* code for x == 2 */ }
else if (x == 3) { /* code for x == 3 */ }
else { /* code for x != 1 and x != 2 and x != 3 */ }
The same thing can also be done with a switch.
switch (x)
{
case 1:
// code for x == 1
break;
case 2:
// code for x == 2
break;
case 3:
// code for x == 3
break;
default:
// code for x != 1 and x != 2 and x != 3
break;
}
This construction is most useful when there are a large number of discrete conditions to check for. It is important not to forget the break statements. Otherwise, control falls through to the next case.
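To see what falling through means in practice, consider this small sketch, in which the break statements after the first two cases have been left out on purpose. The function name statements_run is hypothetical.

```cpp
// Returns how many case bodies are executed for a given x.
int statements_run(int x)
{
    int visited = 0;
    switch (x)
    {
    case 1:
        ++visited;   // no break: control continues into case 2
    case 2:
        ++visited;   // no break: control continues into case 3
    case 3:
        ++visited;
        break;       // stops here
    default:
        break;
    }
    return visited;
}
```

Entering at case 1 executes all three bodies, so statements_run(1) returns 3, while statements_run(2) returns 2, statements_run(3) returns 1, and any other argument returns 0.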
Conditional tests can be nested.
bool even;
if (x%2 == 1)
{
even = false;
if (x == 1) { /* code for x == 1 */ }
else { /* code for x odd and x != 1*/ }
}
else
{
even = true;
if (x == 2) { /* code for x == 2 */ }
else { /* code for x even and x != 2 */ }
}
Logical expressions can be combined using the logical operations and (&&), or (||), and not (!). (Note that !(x==1) is equivalent to x!=1.) Subexpressions can be grouped using parentheses to override the natural precedence.
bool updated = true;
int x = 5;
if ( !( x > 1 and updated ) or x == 0 ) { /* does not execute */ }
if ( x < 2 or x > 4 and updated ) { /* does execute */ }
if ( x > 1 and x < 5 and !updated ) { /* does not execute */ }
Remember that x and y is true only if both x and y are true. Because of C++’s left-to-right lazy evaluation,
if ( a == 1 and expensive_function() )
may be much more efficient than
if ( expensive_function() and a == 1 )
especially if this test is performed multiple times and a often has a value other than 1.
Keep in mind that the indenting is just a formatting convention for the benefit of the programmer. Whitespace, although it has no meaning for the compiler, can occasionally be misleading to a human reader. At the end of the following code snippet, j == 2 and not j == -1.
double t = 99.0;
int j = -1;
if (t < 100.0)
if (j == 0)
j = 1;
else
j = 2;
Absent braces, else is always attached to the most recent if. To produce the logical flow suggested by the indentation, you would have to enclose the second if in braces:
double t = 99.0;
int j = -1;
if (t < 100.0)
{
if (j == 0)
j = 1;
}
else
j = 2;
To eliminate possible confusion, it is often a good idea to include braces even when they’re not strictly necessary.
The for construction is typically used to execute a block of code multiple times. A variable is introduced to serve as a counter. In the following, i ranges over the values 0, 1, ..., 9 in sequence. (Recall that i++ is shorthand for i += 1 or i = i + 1.)
for (int i = 0; i < 10; i++)
{
// code
}
The general syntax is for (initialization; logical expression; action2) action1. The initialization step is performed once at the start. At the beginning of each loop the logical condition is checked; if true, action1 and action2 are performed (in that order). The loop ends the first time the logical condition evaluates to false.
The for loop is very flexible. The range of the counter is arbitrary, and it does not need to step in unit increments. Here, i takes the even values -6, -4, -2, 0, 2, 4, 6.
for (int i = -6; i < 7; i += 2)
{
// code
}
The scope of variables defined in a for loop is restricted. This is incorrect:
for (int i = -6; i < 7; i += 2)
{
// code
}
int j = i; // error: the variable i doesn't exist outside the for loop
This is correct:
int i;
for (i = -6; i < 7; i += 2)
{
// code
}
int j = i; // valid: j is assigned 8, the first value of i to fail the test i < 7
The sequence operator (,) used in this context has a different behaviour than it does in function arguments. The two-part statement a,b; tells the compiler to compute a and throw away the result, then compute b and return the result. Similarly, a,b,c,d; computes each of a, b, c, and d, but evaluates to d. What this means is that
// sum numbers from 1 to 100
int sum = 0;
for (int i = 1; i < 101; ++i)
sum += i;
can be compressed to
int sum, i;
for (sum = 0, i = 1; i < 101; sum += i, ++i);
Such a rewriting is not always advisable, especially if it makes the code harder to decipher.
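When compressing a loop in this way, it is easy to shift the accumulation by one iteration, so it is worth checking that both forms give the same answer. The sketch below wraps each form in a function (the names sum_plain, sum_compact, and last_operand are illustrative) and shows one way of writing the compact version so that the two agree.

```cpp
// Conventional form: the loop body does the accumulation.
int sum_plain()
{
    int sum = 0;
    for (int i = 1; i < 101; ++i)
        sum += i;
    return sum;
}

// Compact form: the accumulation moves into the update clause,
// joined to ++i by the sequence operator; the loop body is empty.
int sum_compact()
{
    int sum, i;
    for (sum = 0, i = 1; i < 101; sum += i, ++i)
        ;
    return sum;
}

// A comma expression evaluates every operand in turn
// but takes the value of the last one.
int last_operand()
{
    int a = 1, b = 2;
    return (a += 1, b += a);   // a becomes 2, then b becomes 4
}
```

Both sum_plain() and sum_compact() return 5050, and last_operand() returns 4.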
An alternative to for is while, which comes in two flavours. The first
int n = 0;
while (n < 10)
{
++n;
cout << n << (n != 10 ? ", " : " ");
}
and the second
int n = 0;
do
{
++n;
cout << n << (n != 10 ? ", " : " ");
} while (n < 10);
differ only in when the exit condition is checked. The behaviour of the two code snippets shown above is identical: both output 1, 2, 3, 4, 5, 6, 7, 8, 9, 10 to the terminal. Sometimes, however, testing to exit in one position or another does matter. For example, in the following recompute may never be executed at all.
bool is_converged(void);
void recompute(void);
while (!is_converged())
recompute();
If the loop is written this way, however, recompute will be executed at least once:
do
{
recompute();
} while (!is_converged());
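The difference between testing first and testing last can be made measurable by counting how often the loop body runs. This is only a sketch: the functions body_runs_while and body_runs_do_while are hypothetical stand-ins, with the test n < limit playing the role of !is_converged().

```cpp
// Test made first: the body may run zero times.
int body_runs_while(int n, int limit)
{
    int runs = 0;
    while (n < limit)
    {
        ++n;
        ++runs;
    }
    return runs;
}

// Test made last: the body always runs at least once.
int body_runs_do_while(int n, int limit)
{
    int runs = 0;
    do
    {
        ++n;
        ++runs;
    } while (n < limit);
    return runs;
}
```

Starting from n = 0 with limit = 10, both functions return 10. Starting from n = 10, where the condition already fails, body_runs_while returns 0 but body_runs_do_while returns 1.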
The loop can also be exited at any point by issuing a break statement.
while (true)
{
recompute();
if (is_converged()) break;
}
while (true)
{
if (is_converged()) break;
recompute();
}
Consider the problem of evaluating the truncated series,
\[S_N = \sum_{n=1}^{N} \frac{1}{n^2}.\]
Convince yourself that the following program computes \(S_{10}\).
#include <iostream>
using std::cout;
using std::endl;
int main()
{
double sum = 1.0;
const int N = 10;
for (int n = 2; n <= N; ++n)
sum += 1.0/(n*n);
cout << "The series truncated at N = " << N
<< " evaluates to " << sum << endl;
return 0;
}
Exercise
What if sum += 1.0/(n*n) is switched out for sum += (1.0/n)/n or sum += (1/double(n))/n? Would this make any difference? Is any one of these statements more likely to overflow than the others?
Exercise
Consider the finite sequence of numbers
How many of the numbers in \(S\) are divisible by 12? (Hint: beware of overflow.)
Exercise
How many of the numbers 1, 2, ..., 1000 are perfect squares? How many are perfect cubes? (Hint: you can solve this exercise without using either sqrt or pow.)
Exercise
Consider an unbounded square grid of points spaced by \(\Delta x = 0.1\) and \(\Delta y = 0.1\). How many points lie inside the diamond \(|x|+|y| = 2\)? How many points lie inside the circle \(x^2+y^2 = 4\)? (Hints: (i) be sure not to count the points on the boundary; (ii) you may be better off reformulating the problem so that you can use integer rather than floating-point types in your code.)
It’s valid to nest loops with an inner loop depending in some way on an outer loop’s counter variable. In this way, we can generate all unique ordered pairs
for (int i = 1; i < N; ++i)
for (int j = 0; j < i; ++j)
cout << "(" << j << "," << i << ")" << endl;
and all unique ordered triples
for (int i = 2; i < N; ++i)
for (int j = 1; j < i; ++j)
for (int k = 0; k < j; ++k)
cout << "(" << k << "," << j << "," << i << ")" << endl;
Loops come in three flavours: for, while, and do while. Each can be understood as a nested sequence of actions and logical tests.
The for loop
for (initialization; logical_expression; statement2) statement1;
is equivalent to
{
initialization;
if (logical_expression)
{
statement1;
statement2;
if (logical_expression)
{
statement1;
statement2;
if (logical_expression)
{
statement1;
statement2;
.
.
.
}
}
}
}
The while loop
while (logical_expression) statement;
is equivalent to
if (logical_expression)
{
statement;
if (logical_expression)
{
statement;
if (logical_expression)
{
statement;
.
.
.
}
}
}
The do while loop
do statement; while (logical_expression);
is equivalent to
{
statement;
if (logical_expression)
{
statement;
if (logical_expression)
{
statement;
if (logical_expression)
{
.
.
.
}
}
}
}
The for structure is particularly well-suited for enumerated loops. The initialization and statement2 slots are typically used to define and increment a counter variable (whose scope is restricted to the loop).
while and do while check the exit condition before and after the statement, respectively. Hence, do while always performs the statement at least once.
By default, arguments to functions are passed by value. This means that the variables declared in a function’s argument list are copies. They are temporary variables occupying their own memory locations but assigned the external values gleaned from the function call. For example, consider the following function definition.
int odd(int i) { return 2*i+1; }
The call odd(3) transfers program control to odd, where a temporary int named i is allocated and assigned the value 3. The function returns 2*i+1. The statements int i = 3; odd(i); also lead to the creation of a temporary variable—an internal i that is assigned the value of the external i.
Changes to the internal temporaries do not propagate outward.
int odd(int i)
{
i = i+1;
return 2*i-1;
}
int main()
{
int j = 5;
int k = odd(j);
assert(j == 5);
assert(k == 11);
return 0;
}
Alternatively, an argument may be flagged with an ampersand (&) to indicate that it should be passed by reference. In that case, the function receives the address (i.e., the actual memory location) of the variable, rather than its value, and no temporary is created.
int odd(int &i)
{
i = i+1;
return 2*i-1;
}
int main()
{
int j = 5;
int k = odd(j); // j is altered during the function call
assert(j == 6);
assert(k == 11);
return 0;
}
A common idiom is to pass a variable by reference but to declare it const. This ensures that the function has no side effects in that argument slot.
int odd(const int &i) { return 2*i+1; } // compiler approves
int odd(const int &i)
{
++i; // compiler reports an error here
return 2*i-1;
}
For PODs, there is no good reason to pass arguments this way: it has exactly the same effect as passing by value. On the other hand, for large class objects, which are very expensive to copy, this is the preferred method. (Whereas the object may be large, its address is always just one machine word in size.)
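A minimal sketch of the idiom applied to a large object, assuming a std::vector<double> as the expensive-to-copy type; the function name mean is hypothetical.

```cpp
#include <cassert>
#include <vector>

// The vector is passed by const reference: no copy is made, and the
// compiler rejects any attempt to modify data inside the function.
double mean(const std::vector<double> &data)
{
    assert(!data.empty());
    double sum = 0.0;
    for (std::vector<double>::size_type i = 0; i < data.size(); ++i)
        sum += data[i];
    return sum/data.size();
}
```

Had the argument been declared as std::vector<double> data, every call would copy the whole vector before the first addition took place.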
Non-const passing by reference is most commonly used to implement procedure-like functions, especially in situations where it’s necessary to bypass the restriction that functions return at most one object as return value. For example, imagine a procedure that solves for the eigenvectors and eigenvalues of a matrix. It is implemented as a function that takes a matrix, a list_of_vectors, a list_of_doubles (all hypothetical types), and a bool.
void EigenSolver(const matrix &M, list_of_vectors &V,
list_of_doubles &E, bool &is_singular);
We might call it as follows.
matrix M;
list_of_vectors V;
list_of_doubles E;
bool failed;
EigenSolver(M,V,E,failed);
if (!failed)
{
// perform operations on the eigenvectors and
// eigenvalues that have been assigned to V and E
}
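Since matrix, list_of_vectors, and list_of_doubles are hypothetical types, the same calling pattern can be tried with built-in types. The following sketch (the name divide is an assumption for illustration) reports two results through reference arguments and signals failure through its return value.

```cpp
#include <cassert>

// Integer division that delivers quotient and remainder through
// reference arguments. Returns false, leaving q and r untouched,
// when the divisor is zero.
bool divide(int a, int b, int &q, int &r)
{
    if (b == 0) return false; // nothing sensible to report
    q = a/b;
    r = a%b;
    return true;
}
```

A caller would write int q, r; if (divide(17,5,q,r)) { ... } and then use q and r exactly as V and E are used above.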
It is true of both objects and functions that they must be declared before they are used. This is a requirement of C++’s strong type system. It is valid, however, if they are defined later. A function’s declaration is called a prototype. It consists of the function’s return type, name, and argument list, followed by a semicolon.
void inc_by(int&, int); // function declaration
int main()
{
int i, j, k; // declaration of integers i, j, and k
k = j; // valid but dangerous assignment: j has not been initialized
i = 5;
j = 7;
inc_by(i,j);
return 0;
}
void inc_by(int &x, int dx) { x += dx; } // function definition
Prototypes are required for functions that have mutual dependencies. For example, foo needs to be declared before bar is defined and vice versa:
double foo(double);
double bar(double);
double foo(double x)
{
if (x < 0.0) return x;
return x + bar(x);
}
double bar(double x)
{
if (x > 0.0) return x;
return x - foo(x);
}
It is common practice to organize libraries of functions in a file filename.cpp and to store the corresponding prototypes in a separate header file named filename.h.
$ cat > B4.cpp
#include "my_math_functions.h"
// my_math_functions.h contains the prototype double Bessel0(double);
// Bessel0 is defined in my_math_functions.cpp
int main()
{
double x = Bessel0(4.0);
return 0;
}
[ctrl-d]
$ g++ -c my_math_functions.cpp
$ g++ -c B4.cpp
$ g++ -o B4 B4.o my_math_functions.o
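The address of a function can be passed as an argument to another function. This is useful when you want to perform some computation on a generic function that is to be specified later. For example, a simple numerical integration of a function \(f(x)\) might be given by
\[ \int_a^b f(x)\,dx \approx h\left[ \tfrac{1}{2}f(a) + \sum_{i=1}^{N-1} f(a+ih) + \tfrac{1}{2}f(b) \right], \qquad h = \frac{b-a}{N}. \]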
The corresponding C++ function might look as follows:
double trapezoidIntegrator( double (&f) (double),
double a, double b, unsigned int N)
{
assert(N != 0);
const double width = b-a;
const double h = width/N;
double sum = 0.5*( f(a) + f(b) );
for (unsigned int i = 1; i < N; ++i)
{
const double x_i = a + i*h;
sum += f(x_i);
}
return sum*h;
}
The 100-slice trapezoid approximation to \(\int_0^{2\pi} \cos x \, dx\) would be called as follows:
#include <cmath>
using std::cos;
const double I = trapezoidIntegrator(cos,0.0,2*M_PI,100);
Later, we’ll encounter a more sophisticated way to pass functions (and function objects) using templates.
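The same mechanism works for user-defined functions. The sketch below is a variation on the integrator above that takes a pointer to a function rather than a reference; the names trapezoid and square are assumptions made for this example.

```cpp
#include <cassert>

// A user-defined integrand.
double square(double x) { return x*x; }

// Trapezoid rule with N slices; f points to any function that takes
// a double and returns a double.
double trapezoid(double (*f)(double), double a, double b, unsigned int N)
{
    assert(N != 0);
    const double h = (b-a)/N;
    double sum = 0.5*( f(a) + f(b) );
    for (unsigned int i = 1; i < N; ++i)
        sum += f(a + i*h);
    return sum*h;
}
```

The call trapezoid(square,0.0,1.0,1000) approximates \(\int_0^1 x^2\,dx = 1/3\); the function name on its own decays to the function's address, so no explicit & is needed.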
Recursive functions are ones that call themselves. The classic example is the factorial function, which can be defined either explicitly, in terms of the product \(n! = n \cdot (n-1) \cdots 3 \cdot 2 \cdot 1\), or implicitly, by specifying a recursion condition \(n! = n \cdot (n-1)!\) and a terminating condition \(0! = 1! = 1\).
The conventional definition might look like one of the following.
unsigned int factorial(unsigned int n)
{
unsigned int prod = 1;
for (unsigned int m = 2; m <= n; ++m) prod *= m;
return prod;
}
unsigned int factorial(unsigned int n)
{
unsigned int prod = 1;
while (n > 1) prod *= n--;
return prod;
}
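Keep in mind that C++ function calls are stored on a finite application stack. Each level of recursion consumes stack space, and a sufficiently deep recursion exhausts it and crashes the program; the permitted depth depends on the platform's stack size and on how much storage each call requires. Recursive functions also tend to be slower than their non-recursive alternatives (because of the overhead from repeated function calls). The recursive definition is shown in the following session.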
$ cat > factorial.cpp
#include <iostream>
using std::cout;
using std::endl;
unsigned int factorial(unsigned int n)
{
if (n == 0 or n == 1) return 1;
// else
return n*factorial(n-1);
}
int main()
{
for (unsigned int n = 0; n < 8; ++n)
cout << char('0' + n) << "! = " << factorial(n) << endl;
return 0;
}
$ g++ -o factorial factorial.cpp
$ ./factorial
0! = 1
1! = 1
2! = 2
3! = 6
4! = 24
5! = 120
6! = 720
7! = 5040
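The trade-off between the recursive and loop-based forms can be seen in a second example. The sketch below uses Euclid's algorithm for the greatest common divisor, which is not part of the text above; the names gcd_recursive and gcd_iterative are hypothetical.

```cpp
#include <cassert>

// Recursive form: gcd(a,b) = gcd(b, a mod b), terminating at gcd(a,0) = a.
unsigned int gcd_recursive(unsigned int a, unsigned int b)
{
    if (b == 0) return a;
    return gcd_recursive(b, a%b);
}

// Equivalent loop: the same arithmetic, but the call stack does not grow.
unsigned int gcd_iterative(unsigned int a, unsigned int b)
{
    while (b != 0)
    {
        const unsigned int r = a%b;
        a = b;
        b = r;
    }
    return a;
}
```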
Exercise
Explain why the function factorial works the same regardless of whether the else is commented out.
Exercise
Write a fibonacci function that uses recursion to calculate the series 1, 2, 3, 5, 8, 13, 21, ...
C++ provides a template mechanism for pattern matching of variables and type names. This allows the programmer to implement generic functions that act sensibly on objects of various types using a single definition.
$ cat > middle.cpp
#include <cassert>
template <typename T>
T middle(T x, T y, T z)
{
if (y < x and x < z or z < x and x < y)
return x;
else if (x < y and y < z or z < y and y < x)
return y;
else
return z;
}
int main()
{
assert( middle(1,2,3) == middle(2,1,3) );
const double x = middle(-5.0,99.3,26.0);
assert( x > 25.0 and x < 27.0);
const float y = middle(1.5F,1.0F,4.0F);
const unsigned long int i = middle( (unsigned long int) y, 0UL, 2UL);
assert( i == 1UL);
assert( middle('d','o','g') < 'f' );
}
[ctrl-d]
$ g++ -o middle middle.cpp
$ ./middle
Assertion failed: (middle('d','o','g') < 'f'), function main
Abort trap
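One detail worth trying out is how the compiler settles on T. The sketch below uses a hypothetical template named smallest, written for this illustration, to show calls where the type is deduced from the arguments and one where it is stated explicitly.

```cpp
#include <cassert>
#include <string>

// Generic minimum of three values; works for any type that supports <.
template <typename T>
T smallest(T x, T y, T z)
{
    T m = x;
    if (y < m) m = y;
    if (z < m) m = z;
    return m;
}
```

With smallest(3,1,2) the compiler deduces T = int. A mixed call such as smallest(1,0.5,2) does not compile, because T cannot be both int and double; writing smallest<double>(1,0.5,2) names the type explicitly and converts all three arguments to double.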
Here is an example of a bubble sort function that works with both conventional C arrays and C++ vectors:
$ cat > bubble_sort.cpp
#include <vector>
using std::vector;
#include <iostream>
using std::cout;
using std::endl;
#include <iterator>
using std::ostream_iterator;
#include <algorithm>
using std::swap;
using std::copy;
template <typename Iter>
void bubble_sort(Iter p1, Iter p2)
{
bool mismatch;
do
{
mismatch = false;
for (Iter q = p1; q < p2-1; ++q)
if ( *(q+1) < *q )
{
swap(*q,*(q+1));
mismatch = true;
}
} while (mismatch);
}
int main()
{
int a[5] = { 3, 9, 0, 1, 7 };
vector<int> v(a,a+5);
v.push_back(2);
v.push_back(13);
bubble_sort(a,a+5);
bubble_sort(v.begin(),v.end());
cout << "a = ";
copy(a, a+5, ostream_iterator<int>(cout, " "));
cout << endl;
cout << "v = ";
copy(v.begin(), v.end(), ostream_iterator<int>(cout, " "));
cout << endl;
return 0;
}
[ctrl-d]
$ g++ -o bubble_sort bubble_sort.cpp
$ ./bubble_sort
a = 0 1 3 7 9
v = 0 1 2 3 7 9 13
Unlike some languages, C++ has very few built-in functions. Instead, they are provided as a large external library, broken into broad subcategories. Each small grouping of functions is loaded by including the appropriate header file. For example, most of the important math functions are accessed by #include <cmath>.
Function | Description |
---|---|
cos | cosine |
sin | sine |
acos | arc cosine |
asin | arc sine |
atan | arc tangent |
atan2 | arc tangent (2 parameters) |
cosh | hyperbolic cosine |
sinh | hyperbolic sine |
tanh | hyperbolic tangent |
exp | exponential function |
frexp | get significand and exponent |
ldexp | generate number from significand and exponent |
log | natural logarithm |
log10 | logarithm base-10 |
modf | break into fractional and integral parts |
pow | raise to power |
sqrt | square root |
ceil | round up value |
fabs | compute absolute value |
floor | round down value |
fmod | compute remainder of division |
A macro for the decimal representation of \(\pi\), called M_PI, is also provided on most UNIX systems (it is a POSIX extension rather than part of the C++ standard). Under some versions of UNIX, compilation of code that uses math functions may require the GCC option -lm. Header files that are part of the C language are available in C++ with a change in the naming convention: e.g., #include <math.h> becomes #include <cmath>, #include <stdlib.h> becomes #include <cstdlib>, etc. There is not always a complete correspondence between the two languages, since updates to the language specifications are not in sync. For instance, the rounding functions round(), trunc(), and rint() entered math.h with C99 but became part of C++ only with the C++11 standard, so older compilers may not provide them in <cmath>.
Here’s another way to implement rounding:
inline int round_to_int(double x) // named to avoid a clash with the library's round
{
const double abs_x = fabs(x);
const int i = int(floor(abs_x+0.5));
return ( x > 0 ? i : -i );
}
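Let’s consider a common mathematical procedure that requires both the square root (sqrt) and absolute value (fabs) functions. The quadratic equation \(ax^2 + bx + c = 0\) has two roots,
\[ x_{\pm} = \frac{-b \pm \sqrt{b^2-4ac}}{2a}. \]
This expression is mathematically exact, but when \(b^2 \gg 4ac\), we may run into the problem that the numerator of one of the roots,
\[ -b + \mathrm{sgn}(b)\sqrt{b^2-4ac}, \]
will very nearly vanish to leading order in \(b\). For floating point numbers (with finite precision), this near cancellation can result in a catastrophic loss of significance.
A convenient workaround is to rewrite the troublesome root as follows.
\[ \frac{-b + \mathrm{sgn}(b)\sqrt{b^2-4ac}}{2a} = \frac{2c}{-b - \mathrm{sgn}(b)\sqrt{b^2-4ac}} \]
The two roots can be expressed as
\[ x_1 = \frac{2c}{Y}, \qquad x_2 = \frac{Y}{2a}, \qquad Y = -b - \mathrm{sgn}(b)\sqrt{b^2-4ac}. \]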
#include <cassert>
#include <cmath>
using std::sqrt; // square root
using std::fabs; // absolute value of a floating point number
void quadratic_roots(double a, double b, double c,
double &x1, double &x2)
{
const double X2 = b*b-4*a*c;
assert(X2 >= 0.0);
const double X = sqrt(X2);
const double Ym = -b-X;
const double Yp = -b+X;
const double Y = (fabs(Ym) > fabs(Yp) ? Ym : Yp);
x1 = 2*c/Y;
x2 = Y/(2*a);
}
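To see the loss of significance concretely, the sketch below compares the textbook formula with the rewritten one for the small-magnitude root when \(b > 0\). The function names small_root_naive and small_root_stable are hypothetical, and the comments assume ordinary IEEE double arithmetic.

```cpp
#include <cassert>
#include <cmath>

// Textbook formula for the root (-b + sqrt(b^2 - 4ac))/(2a).
// For b > 0 and b^2 >> 4ac the numerator subtracts two nearly equal numbers.
double small_root_naive(double a, double b, double c)
{
    return (-b + std::sqrt(b*b - 4*a*c))/(2*a);
}

// The same root rewritten as 2c/(-b - sqrt(b^2 - 4ac)): for b > 0 the two
// terms in the denominator have the same sign, so nothing cancels.
double small_root_stable(double a, double b, double c)
{
    return 2*c/(-b - std::sqrt(b*b - 4*a*c));
}
```

For a = 1, b = 1.0e8, c = 1 the small root is very close to -1.0e-8. The stable form reproduces it essentially to full precision, whereas the naive form typically retains at most a digit or so of accuracy.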
Unix provides several ways for you to communicate with your program. One is to pass information to it from the command line when the program is first run. At that time, the command line invocation is parsed and deposited into an array of C strings called argv; the number of elements is stored in an integer argc. These two variables can be included in the argument list to main.
To be precise, argc is an int whose value is set equal to the number of individual terms entered on the command line, including the program name. Each term is assigned consecutively to argv[0], argv[1], ..., argv[argc-1]. Consider the following example.
$ ./myprog reinit -n 100 -J5.0 --input=datafile.txt
In this case, argc is equal to 6. The array values, "./myprog", "reinit", "-n", "100", "-J5.0", "--input=datafile.txt", are C strings. (The whitespace determines the partition of the terms.) Your program can be made to interpret this text data in any way you please. Note that to extract numerical values, the text must first be converted to a numerical type. The cstdlib library provides the functions atoi and atof to perform this task.
#include <iostream>
using std::cerr;
using std::cout;
using std::endl;
#include <cstdlib>
using std::atoi; // function that converts text to an integer value
using std::atof; // function that converts text to a floating point value
int main(int argc, char* argv[])
{
int N;
double T;
if (argc != 3) // program requires exactly two arguments
{
cerr << "Error: two arguments are required." << endl
<< "Usage: myprog number_particles temperature" << endl;
return 1; // exit program
}
else
{
N = atoi(argv[1]);
T = atof(argv[2]);
cout << "Beginning simulation with " << N << " particles at temperature " << T << endl;
}
// code that makes use of the user-provided values in N and T
return 0; // exit program
}
A user interacting with this program in the BASH terminal might have the following exchange:
$ ./myprog 7.5
Error: two arguments are required.
Usage: myprog number_particles temperature
$ ./myprog 100 7.5
Beginning simulation with 100 particles at temperature 7.5
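One limitation of atoi and atof is that they return 0 when the text is not a number, so ./myprog abc 7.5 would quietly start a simulation with zero particles. A possible remedy, sketched here with the hypothetical helper names parse_int and parse_double, is to use strtol and strtod (also in cstdlib), which report where parsing stopped. The sketch does not check for values too large for the target type.

```cpp
#include <cassert>
#include <cstdlib>

// Convert text to an int. Returns false, leaving value untouched,
// if the text is empty or contains anything other than a number.
bool parse_int(const char *text, int &value)
{
    char *end = 0;
    const long v = std::strtol(text, &end, 10);
    if (end == text or *end != '\0') return false; // no digits, or trailing junk
    value = int(v);
    return true;
}

// The same idea for floating-point values, using strtod.
bool parse_double(const char *text, double &value)
{
    char *end = 0;
    const double v = std::strtod(text, &end);
    if (end == text or *end != '\0') return false;
    value = v;
    return true;
}
```

In the program above, N = atoi(argv[1]); could then become if (!parse_int(argv[1],N)) followed by an error message and return 1.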
Let’s return to the series \(S_N\) that we looked at earlier. How would we go about computing the infinite series? One approach would be to extrapolate from the sequence \(S_{10}, S_{20}, S_{40}, \ldots\) to \(S_{\infty}\).
$ cat > series.cpp
#include <cstdlib>
using std::atoi;
#include <cassert>
#include <iostream>
using std::cout;
using std::endl;
#include <iomanip>
using std::setw;
int main(int argc, char *argv[])
{
assert(argc == 2);
double sum = 0.0;
int N = atoi(argv[1]);
assert(N > 1);
cout.precision(12);
for (int n = 1; n <= N; ++n)
sum += 1.0/(n*n);
cout << setw(10) << N << setw(20) << sum << endl;
return 0;
}
[ctrl-d]
$ g++ -o series series.cpp
$ cat > batch.bash
#!/bin/bash
N=10
./series $N > converge.dat
while (( $N < 2000 ))
do
let N=N*2
./series $N >> converge.dat
done
exit
[ctrl-d]
$ chmod +x batch.bash
$ ./batch.bash
$ more converge.dat
10 1.53976773117
20 1.59366324391
40 1.61961896301
80 1.63235561634
160 1.63866449491
320 1.64180417895
640 1.64337034551
1280 1.64415251159
2560 1.64454336554
$ gnuplot
gnuplot> plot "converge.dat" using 1:2 with points
gnuplot> f(x) = f0 + f1*x + f2*x**2
gnuplot> set fit errorvariables
gnuplot> fit f(x) "converge.dat" using (1.0/$1):2 via f0,f1,f2
gnuplot> plot "converge.dat" using (1.0/$1):2 with points, f(x)
gnuplot> print f0, f0_err
1.64493188042833 1.42971432364597e-06
We find that the numerical estimate for \(\lim_{N\to\infty}S_N\) is 1.644932(1).
Exercise
Modify the program so that ./series m computes the series to m decimal digits of accuracy.
$ emacs rev3.cpp
#include <iostream>
using std::cout;
using std::cerr;
using std::endl;
int main(int argc, char* argv[])
{
if (argc == 1)
cerr << "Too few arguments!" << endl;
else if (argc == 2)
cout << argv[1] << endl;
else if (argc == 3)
cout << argv[2] << " " << argv[1] << endl;
else if (argc == 4)
cout << argv[3] << " " << argv[2] << " " << argv[1] << endl;
else
cerr << "Too many arguments!" << endl;
return 0;
}
[ctrl-x][ctrl-s][ctrl-x][ctrl-c]
$ ls
rev3.cpp
$ g++ -o rev3 rev3.cpp
$ ls -F
rev3* rev3.cpp
$ ./rev3 a
a
$ ./rev3 a b
b a
$ ./rev3 a b c
c b a
$ echo I do not like them, $(./rev3 am I Sam).
I do not like them, Sam I am.
Exercise
Modify the program so that it can reverse as many as five arguments:
$ cp rev3.cpp rev5.cpp
$ emacs rev5.cpp
[Your changes]
$ g++ -o rev5 rev5.cpp
$ ls -F
rev3* rev3.cpp rev5* rev5.cpp
$ echo $(./rev5 believe do I) $(./rev5 correctly this did I that)
I do believe that I did this correctly
A UNIX stream is an ordered sequence of bytes whose end is signalled by an end-of-file (EOF) condition. At the terminal, EOF can be produced using the [ctrl-d] key combination ([ctrl-z] for MS-DOS and Windows). For example, the following BASH session redirects a stream of user-supplied character input to a file.
$ cat > my_file.txt
This is a sequence of characters redirected from stdin to this file.
[ctrl-d]
Internally, the stream is encoded using ASCII. The corresponding hex values (ending with the newline, 0A; the [ctrl-d] keystroke itself is not stored) are
54 68 69 73 20 69 73 20 61 20 73 65 71 75 65 6E 63 65 . . . 69 6C 65 2E 0A
An important property of streams is that they are unidirectional. A stream establishes a connection to a device for the purpose of sending or receiving data (not both). Three predefined streams are provided in the UNIX environment (connecting your program to the terminal), and these have special handlers in C++.
UNIX stream | C++ stream object | Operator |
---|---|---|
standard input (stdin) | cin | >> |
standard output (stdout) | cout | << |
standard error (stderr) | cerr | << |
A subtle but important point: these handles aren’t keywords (i.e., they aren’t part of the C++ language); rather, they are identifiers (the names of objects). cin is of type istream, whereas cout and cerr are of type ostream. All three are class objects, as opposed to PODs. Class objects have special functions associated with them, sometimes called methods. These are accessed using a dot (.) notation.
For example, cin has a method good that checks on the status of the stream. If a >> operation fails for some reason or if the EOF marker has been reached, then cin.good() evaluates to false. This code will count the number of integers it can read in from the standard input stream:
int i;
unsigned long int count = 0;
cin >> i;
while (cin.good())
{
++count;
cin >> i;
}
cout << count << " integers read from stdin." << endl;
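The same counting logic can be exercised without typing at the terminal by writing it against a generic istream and feeding it a string. In this sketch the function name count_integers is hypothetical, and the extraction itself is used as the loop test (a stream converts to true only if the last read succeeded). One subtlety of this form is that the final integer is still counted when the input ends immediately after its last digit, a case in which cin.good() already reports false.

```cpp
#include <cassert>
#include <istream>
#include <sstream>

// Count the integers that can be read from any input stream.
unsigned long int count_integers(std::istream &in)
{
    int i;
    unsigned long int count = 0;
    while (in >> i) // true only if an integer was actually extracted
        ++count;
    return count;
}
```

Calling count_integers(cin) reads from standard input, while an istringstream built from a string lets the function be tested in isolation.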
The primary methods for controlling input/output (I/O) connections are width, precision, setf, and unsetf. The first two are self-explanatory. The final two are used to turn on and off various flags (all defined within std::ios), the most common being std::ios::scientific and std::ios::fixed. Used in combination, these methods allow the user to adjust the output format.
cout.setf(std::ios::fixed);
cout.precision(2);
const double money1 = 9.99;
const double money2 = 9.9987;
cout << "$" << money1 << endl; // $9.99
cout << "$" << money2 << endl; // $10.00
cout.precision(8);
const double pi = 3.14159265358979323846;
cout << pi << endl; // 3.14159265
// 01234567
cout.unsetf(std::ios::fixed);
cout.setf(std::ios::scientific);
cout.width(16);
cout << pi;
cout.width(16);
cout << 2*pi;
cout.width(16);
cout << 3*pi;
cout.width(16);
cout << 4*pi << endl;
//  3.14159265e+00  6.28318531e+00  9.42477796e+00  1.25663706e+01
//0123456789abcdef0123456789abcdef0123456789abcdef0123456789abcdef
An alternative to invoking the stream object methods is to embed so-called manipulators in the stream itself. The manipulators setw, setprecision, setiosflags, and resetiosflags perform nearly equivalent tasks to the methods discussed above. For example, to arrange the multiples of pi in four columns, we could also write the following:
cout << setiosflags(std::ios::scientific) << setprecision(8)
<< setw(16) << pi << setw(16) << 2*pi
<< setw(16) << 3*pi << setw(16) << 4*pi << endl;
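Manipulators work on any output stream, which makes it possible to capture formatted text in a string. The sketch below assumes the standard ostringstream class from <sstream>; the helper names format_fixed and format_scientific are hypothetical. It uses the flag manipulators std::fixed and std::scientific in place of setiosflags.

```cpp
#include <cassert>
#include <iomanip>
#include <sstream>
#include <string>

// Format x in fixed-point notation with the given number of decimals.
std::string format_fixed(double x, int decimals)
{
    assert(decimals >= 0);
    std::ostringstream os;
    os << std::fixed << std::setprecision(decimals) << x;
    return os.str();
}

// Format x in scientific notation, right-justified in a field of the
// given width. setw applies only to the next item written.
std::string format_scientific(double x, int decimals, int width)
{
    assert(decimals >= 0 and width >= 0);
    std::ostringstream os;
    os << std::scientific << std::setprecision(decimals)
       << std::setw(width) << x;
    return os.str();
}
```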
In the following example, a sequence of numbers is reformatted in three columns.
$ cat > columns3.cpp
#include <iostream>
using std::cin;
using std::cout;
using std::endl;
int main()
{
int i;
cin >> i;
unsigned int count = 0;
while (cin.good())
{
++count;
cout << i;
if (count%3 == 0)
cout << endl;
else
cout << "\t";
cin >> i;
}
if (count%3 != 0)
cout << endl;
return 0;
}
[ctrl-d]
$ g++ -o columns3 columns3.cpp
$ cat > numbers.dat
1 2 3 4 5
6 7
8 9 10 11 12 13
14
15
16 17
[ctrl-d]
$ ./columns3 < numbers.dat
1 2 3
4 5 6
7 8 9
10 11 12
13 14 15
16 17
Some formatting options can be adjusted by setting flags via the setf and unsetf member functions of the stream object. Other member functions control the character width of the output (width), the number of digits of precision (precision), and the choice of padding character when the number of required digits is smaller than the specified width (fill).
Flag | Description |
---|---|
std::ios::skipws | Skip leading whitespace on input |
std::ios::left | Left justify output |
std::ios::right | Right justify output |
std::ios::internal | Pad numeric output by inserting fill characters between the sign (or base indicator) and the digits |
std::ios::boolalpha | Use true and false for boolean true and false |
std::ios::dec | Output numbers in base 10, decimal format |
std::ios::oct | Output numbers in base 8, octal format |
std::ios::hex | Output numbers in base 16, hexadecimal format |
std::ios::showbase | Print out a base indicator at the beginning of each number |
std::ios::showpoint | Show a decimal point for all floating-point numbers |
std::ios::uppercase | When converting hexadecimal numbers, show the digits A–F as uppercase |
std::ios::showpos | Put a plus sign before all positive numbers |
std::ios::scientific | Convert all floating-point numbers to scientific notation on output |
std::ios::fixed | Convert all floating-point numbers to fixed point on output |
std::ios::unitbuf | Flush the output buffer after every output operation |
Much of the same functionality can be obtained with manipulators, which are chained into the stream itself.
Manipulator | Description |
---|---|
std::dec | Output numbers in decimal format |
std::hex | Output numbers in hexadecimal format |
std::oct | Output numbers in octal format |
std::ws | Skip whitespace on input |
std::endl | Output end-of-line |
std::ends | Output end-of-string (\0) |
std::flush | Force any buffered output out |
std::setiosflags(long) | Set selected conversion flags |
std::resetiosflags(long) | Reset selected flags |
std::setbase(int) | Set conversion base to 8, 10, or 16 |
std::setw(int) | Set the width of the output |
std::setprecision(int) | Set the precision of floating-point output |
std::setfill(char) | Set the fill character |
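A short sketch combining a few of these manipulators. The helper names to_hex and zero_padded are hypothetical, and an ostringstream is used so that the formatted text can be inspected as a string.

```cpp
#include <cassert>
#include <iomanip>
#include <sstream>
#include <string>

// Render n in hexadecimal with a leading base indicator.
std::string to_hex(unsigned int n)
{
    std::ostringstream os;
    os << std::hex << std::showbase << n;
    return os.str();
}

// Render n in decimal, padded on the left with zeros up to the given
// width. Numbers wider than the field are printed in full.
std::string zero_padded(unsigned int n, int width)
{
    assert(width >= 0);
    std::ostringstream os;
    os << std::setfill('0') << std::setw(width) << n;
    return os.str();
}
```

Unlike setw, which affects only the next item, hex, showbase, and setfill remain in force on the stream until they are changed.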
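Let’s attempt to approximate \(\pi\) from the identity
\[ \pi = 16\arctan\frac{1}{5} - 4\arctan\frac{1}{239} \]
by means of series expansion. The conventional Taylor series for \(\arctan\) can be computed efficiently with nested operations as follows.
\[ \arctan x = x - \frac{x^3}{3} + \frac{x^5}{5} - \frac{x^7}{7} + \cdots = x\left(1 - x^2\left(\frac{1}{3} - x^2\left(\frac{1}{5} - x^2\left(\frac{1}{7} - \cdots\right)\right)\right)\right) \]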
Here’s a program that computes the approximation term-by-term and outputs the results with 16 digits of precision to stdout.
Example
$ cat > pi.cpp
#include <cmath>
using std::atan; // the library's arc tangent is named atan, not arctan
#include <iostream>
using std::cout;
using std::endl;
double atan_series(double x, unsigned int N)
{
const double x2 = x*x;
double val = 0.0;
for (int n = N, m = 2*N+1; n >= 0; --n, m -= 2)
val = val*x2 + (n%2 == 0 ? 1.0 : -1.0)/m;
return val*x;
}
int main()
{
cout.precision(16);
cout << "via 16*arctan(1/5) - 4*arctan(1/239):" << endl;
for (unsigned int terms = 1; terms < 10; ++terms)
cout << "pi (" << terms << "-term approx) = "
<< 16*atan_series(0.2,terms) - 4*atan_series(1.0/239,terms)
<< endl;
cout << "pi (exact) = " << M_PI << endl;
cout << endl << "via 4*arctan(1):" << endl;
for (unsigned int terms = 1; terms < 10; ++terms)
cout << "pi (" << terms << "-term approx) = "
<< 4*atan_series(1.0,terms) << endl;
cout << "pi (exact) = " << M_PI << endl;
return 0;
}
[ctrl-d]
$ g++ -o pi pi.cpp -lm
$ ./pi
via 16*arctan(1/5) - 4*arctan(1/239):
pi (1-term approx) = 3.140597029326061
pi (2-term approx) = 3.141621029325035
pi (3-term approx) = 3.141591772182178
pi (4-term approx) = 3.1415926824044
pi (5-term approx) = 3.141592652615309
pi (6-term approx) = 3.141592653623555
pi (7-term approx) = 3.141592653588603
pi (8-term approx) = 3.141592653589836
pi (9-term approx) = 3.141592653589792
pi (exact) = 3.141592653589793
via 4*arctan(1):
pi (1-term approx) = 2.666666666666667
pi (2-term approx) = 3.466666666666667
pi (3-term approx) = 2.895238095238096
pi (4-term approx) = 3.33968253968254
pi (5-term approx) = 2.976046176046176
pi (6-term approx) = 3.283738483738484
pi (7-term approx) = 3.017071817071817
pi (8-term approx) = 3.252365934718876
pi (9-term approx) = 3.041839618929402
pi (exact) = 3.141592653589793
This program produces columnar data suitable for gnuplot.
Example
$ cat > circle.cpp
#include <iostream>
using std::cout;
using std::endl;
#include <iomanip>
using std::setw;
#include <cmath>
using std::cos;
using std::sin;
int main()
{
const int steps = 100;
for (int n = 0; n <= steps; ++n)
{
const double theta = 2.0*M_PI*n/steps;
cout << setw(15) << cos(theta) << setw(15) << sin(theta) << endl;
}
return 0;
}
[ctrl-d]
$ g++ -o circle circle.cpp -lm
$ ./circle > circle.dat
$ gnuplot
gnuplot> plot "circle.dat" using 1:2 with lines
gnuplot> unset key
gnuplot> set size square
gnuplot> replot
gnuplot> quit
Exercise
Using circle.cpp as a template, write a program that outputs the coordinate pair
over one cycle.
Most files are human-readable and stored as a sequence of characters.
Example
$ cat > io.cpp
#include <cassert>
#include <iostream>
using std::cerr;
using std::endl;
#include <fstream>
using std::ofstream;
using std::ifstream;
ifstream fin;
int main()
{
ofstream fout("test.txt"); // open an empty file test.txt
// overwrite if file already exists
fout << "0 1 2 3 4 ... are the natural numbers" << endl;
fout.close();
int a,b,c,d;
fin.open("test.txt");
fin >> a >> b >> c >> d; // read in four integers
assert(a == 0 and b == 1 and c == 2 and d == 3);
fin.close();
fout.open("test.txt",std::ios::app); // open an existing file and
// append all output
if (fout.is_open())
fout << "Another additional line" << endl;
else
{
cerr << "Could not open file `test.txt`" << endl;
return 1;
}
fout.close();
return 0;
}
[ctrl-d]
$ g++ -o io io.cpp
$ ./io
$ cat test.txt
0 1 2 3 4 ... are the natural numbers
Another additional line
Remember that :: is the scope operator. app is a member of the ios class, which is inside the std namespace.
Binary files are treated as a stream of bits rather than characters. To manipulate them, we do not use the familiar text chaining operators << and >>. Instead, we use methods provided by the ofstream and ifstream classes to access the underlying bit patterns. The workhorse methods are read and write.
Example
#include <iostream>
using std::cerr;
using std::endl;
#include <fstream>
using std::ofstream;
using std::ifstream;
int main()
{
ifstream fin;
fin.open("infile.dat", std::ios::in | std::ios::binary);
ofstream fout;
fout.open("outfile.dat", std::ios::out | std::ios::binary);
char buffer[100];
fin.read(buffer,100);
if (!fin)
{
cerr << "Looking for 100 bytes. Only "
<< fin.gcount() << " bytes read." << endl;
fin.clear();
}
fout.write(buffer,fin.gcount()); // write only the bytes actually read
fin.close();
fout.close();
return 0;
}
Note that a binary file is marked with the std::ios::binary flag.
The binary data for an arbitrary type can be written to a file, but usually requires a cast to char*.
struct S
{
char name[20];
double x;
int i;
};
S datum;
S data[20];
fout.write((char*)(&datum),sizeof(S));
fout.write((char*)(data),20*sizeof(S));
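Reading the data back mirrors the write. The sketch below is an illustration under stated assumptions: the struct is renamed Record, the helper names write_record, read_record, and round_trip are hypothetical, and an in-memory stringstream stands in for the file so that the round trip can be tried without touching the disk. Because write and read are methods of the general ostream and istream classes, an ofstream or ifstream opened with std::ios::binary can be passed to the same functions.

```cpp
#include <cassert>
#include <cstring>
#include <istream>
#include <ostream>
#include <sstream>

struct Record
{
    char name[20];
    double x;
    int i;
};

// Write the raw bytes of one Record to any output stream.
void write_record(std::ostream &out, const Record &r)
{
    out.write(reinterpret_cast<const char*>(&r), sizeof(Record));
}

// Read one Record back. Returns false if the stream could not supply
// all sizeof(Record) bytes.
bool read_record(std::istream &in, Record &r)
{
    in.read(reinterpret_cast<char*>(&r), sizeof(Record));
    return in.good();
}

// Round trip through an in-memory stream that stands in for a binary file.
Record round_trip(const Record &original)
{
    std::stringstream buffer(std::ios::in | std::ios::out | std::ios::binary);
    write_record(buffer, original);
    Record copy;
    std::memset(&copy, 0, sizeof(Record)); // start from a known state
    const bool ok = read_record(buffer, copy);
    assert(ok);
    return copy;
}
```

Raw bytes written this way reflect the memory layout of the machine and compiler that produced them (byte order, padding), so such files are not in general portable between platforms.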
The methods tellg() and tellp() query the current get (input) and put (output) stream positions. The seekg(offset,direction) method moves the get position with respect to ios::beg, ios::cur, or ios::end.
Example
ifstream fin("data.binary");
ifstream::pos_type begin = fin.tellg();
fin.seekg(0,ios::end);
ifstream::pos_type end = fin.tellg();
fin.close();
cout << "File size is " << end-begin << " bytes." << endl;
Example
ifstream fin;
fin.open("data.binary", ios::in | ios::binary | ios::ate);
if (fin.is_open())
{
ifstream::pos_type bytes = fin.tellg();
vector<char> buffer(bytes);
fin.seekg(0,ios::beg);
fin.read(&buffer[0],bytes); // read expects a char*, not an iterator
fin.close();
}
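The seek-measure-rewind-read sequence can be packaged as a function that works on any seekable input stream. This is a sketch only: the name read_all is hypothetical, and the test below substitutes an istringstream for a file.

```cpp
#include <cassert>
#include <istream>
#include <sstream>
#include <vector>

// Read every byte of a seekable input stream into a vector.
std::vector<char> read_all(std::istream &in)
{
    in.seekg(0, std::ios::end);               // jump to the end ...
    const std::streamoff bytes = in.tellg();  // ... to learn the size
    in.seekg(0, std::ios::beg);               // rewind before reading
    std::vector<char> buffer;
    if (bytes > 0) // tellg reports -1 on failure; an empty stream gives 0
    {
        buffer.resize(static_cast<std::vector<char>::size_type>(bytes));
        in.read(&buffer[0], bytes);
    }
    return buffer;
}
```

For a file, the stream would be an ifstream opened with ios::in | ios::binary, as in the example above.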