Data Types#

All variables in Java must be declared before they may be used. This is done by specifying the variable name along with its data type. The data type allows the compiler to determine how much memory needs to be allocated for that variable. When operations are done in the code, the data types must also match appropriately. This is a critical help for programmers, as the compiler will catch mismatches. These tend to be spelling mistakes or logical errors, so the programmer can fix them before run-time, which helps significantly with the correctness of the program.

Data types in Java fall into a variety of categories:

Primitive versus Reference - a data type is primitive if its value is stored directly in memory and accessed by the name of the variable; a data type is a reference type if the variable stores the address of the value rather than the value itself, so the name of the variable accesses the address and the value is reached by then checking that location in memory. Java only has eight primitive data types: byte, short, int, long, char, float, double, and boolean.
Built-in versus Not Built-in - Built-in data types are those for which no import statement is required. All the primitive data types are built-in to Java. In addition, Java contains built-in reference data types such as arrays, Strings, and StringBuilders. Other data types must be imported, for example, ArrayList, HashMap, LinkedList, and Stack. All the data types that are not built-in are reference data types and follow the object-oriented paradigm.
Pre-defined vs User-defined - Java comes with a large library of pre-defined data types but the programmer (user of Java) can also easily define new data types using Java’s class system.

In addition, Java has a wrapper data type for each of the primitive data types. This wrapper type extends the primitive data into the object-oriented paradigm, that is, converts a primitive into a reference. The variables that are of a reference data type are called objects.

Integer Data Types#

Java has four built-in, primitive integer data types, each allowing a representation of a different range of numbers and each taking up a different amount of space in memory:

byte requires 8 bits in memory and may contain integers from \(-128\) to \(+127\)
short requires 16 bits in memory and may contain integers from \(-32768\) to \(+32767\)
int requires 32 bits in memory and may contain integers from \(-2,147,483,648\) to \(+2,147,483,647\). This is the most common declaration of an integer.
long requires 64 bits in memory and may contain integers from \(-9,223,372,036,854,775,808\) to \(+9,223,372,036,854,775,807\). These magnitudes are just bigger than nine quintillion[1].

Here are two examples of integers, x and banana:

int x;
int banana = 10;

In the first case, x is only declared so memory is set aside to hold x but no value is explicitly set. In the second line, banana is both declared and initialized so memory is set aside to hold banana and the value of 10 is put into that memory location.

Note that you must be careful when using primitive integers: because these numbers occupy a finite space in RAM, arithmetic that goes past either end of the range wraps around and gives strange results. For example, \(-2,147,483,648 - 1\) is \(2,147,483,647\) in Java. That is, subtracting \(1\) from the smallest int, i.e. the biggest magnitude negative integer, results in a positive number. This is impossible in real life, but not in Java. In fact, the Java run-time environment will not even give an error message; thus, the programmer must beware.
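The wrap-around described above can be checked directly. The following is a minimal sketch; the class and method names are made up for illustration, and it uses the constants Integer.MIN_VALUE and Integer.MAX_VALUE, which hold the smallest and largest int values.

```java
class OverflowDemo {
    // Subtracting 1 from the smallest int wraps around to the largest int.
    static int wrapBelowMin() {
        return Integer.MIN_VALUE - 1;
    }

    // Adding 1 to the largest int wraps around to the smallest int.
    static int wrapAboveMax() {
        return Integer.MAX_VALUE + 1;
    }

    public static void main(String[] args) {
        System.out.println(wrapBelowMin()); // 2147483647
        System.out.println(wrapAboveMax()); // -2147483648
    }
}
```

Neither line causes a compile-time error or a run-time exception, which is exactly why this kind of mistake is easy to miss.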

The primitive integer types have a corresponding wrapper class or reference type: Byte, Short, Integer, and Long. Yes, int is fully spelled out in the reference type as Integer. Even in these reference types, the same finite length issues are present. Here is an example of making a reference integer:

Integer banana = 10;

Notice how reference integers require more memory and a level of indirection to access their value. We will often skip the reference cell in the memory diagram and just draw the variable as if it were a primitive, as in the previous diagram, because these models are not completely accurate and their purpose is only to help us understand our code.

These reference types give us extra fields that we may access. Two common ones are:

Integer.MAX_VALUE instead of typing \(2147483647\), which is \(2,147,483,647\).
Integer.MIN_VALUE instead of typing \(-2147483648\), which is \(-2,147,483,648\).

If you want an integer similar to a Python integer, which is allowed to grow to any size (within the limits of your computer’s memory), then Java requires an import of that data type: BigInteger [2].
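As a small sketch of what working with BigInteger can look like, the example below goes one past the largest long without wrapping around. The class and method names are only illustrative. Notice that arithmetic is done with methods such as add rather than with the + operator.

```java
import java.math.BigInteger;

class BigIntegerDemo {
    // Adds 1 to the largest long value without wrapping around.
    static BigInteger onePastLongMax() {
        BigInteger big = BigInteger.valueOf(Long.MAX_VALUE);
        return big.add(BigInteger.ONE);
    }

    public static void main(String[] args) {
        System.out.println(onePastLongMax()); // 9223372036854775808
    }
}
```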

Understand the Limitations and Power of Your Data Type

In Java, you have control over how much memory your computer will use when your program is run. Remember, though, that you may cause errors that your code should explicitly detect.
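One possible way to detect such an error explicitly is sketched below. It uses Math.subtractExact from the standard library, which throws an ArithmeticException instead of wrapping around. The class name, the method name, and the choice to report the problem as the text "overflow" are assumptions made only for this illustration.

```java
class ExactMathDemo {
    // Returns a - b as text, or reports the overflow instead of silently wrapping.
    static String safeSubtract(int a, int b) {
        try {
            return String.valueOf(Math.subtractExact(a, b));
        } catch (ArithmeticException e) {
            return "overflow";
        }
    }

    public static void main(String[] args) {
        System.out.println(safeSubtract(10, 3));                // 7
        System.out.println(safeSubtract(Integer.MIN_VALUE, 1)); // overflow
    }
}
```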

Floating Point Data Types#

Floating point values in Java can be declared by different reserved words, depending on the accuracy required:

float provides 7 significant digits to the number giving a magnitude from \(0\) to about \(10^{38}\). A float needs 32 bits in RAM.
double is the most common way to declare a floating point number. A double provides 15 significant digits to the number and gives a magnitude from \(0\) to about \(10^{308}\). A double needs 64 bits in RAM.

Here is an example of setting variables to contain the value of one-half:

double try1 = 1/2;
double try2 = 1.0 / 2;
double try3 = 1 / 2.0;
double try4 = 0.5;
System.out.println("try1: " +  try1 + "  try2: " + try2 + "  try3: " + try3 + " try4: " + try4);

try1: 0.0  try2: 0.5  try3: 0.5 try4: 0.5

Notice that our first attempt might not give the expected result. Both 1 and 2 are ints, so in the int data type 0 is the result of dividing 1 by 2. Of course, there is a remainder of 1, but this is lost in this assignment statement. The other three attempts all work, showing that as long as one of the operands is a double then the result will also contain the fractional component.

Never assume that a floating point number is precise. For example, a double uses 64 bits, so at most \(2^{64}=18,446,744,073,709,551,616\) unique numbers can exist in the range, including both positive and negative numbers. The typical mathematical picture of the real numbers being a superset of the integers is not correct in computing. The floating point numbers in computing are not a continuous set, but rather a discrete set, as are the integers in computing. The set of floating point numbers represented on any computer does not include all of the integers that a computer may represent. This is shown in the diagram below:

banana in memory with value of address to cell with 10
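Both claims, that a double is not precise and that some integers have no double representation, can be observed with a short experiment. This is only a sketch with made-up names; the value \(2^{53} + 1\) is chosen because it is a long that falls in a gap between neighbouring doubles.

```java
class FloatPrecisionDemo {
    // 0.1, 0.2 and 0.3 are all stored approximately, so the sum is slightly off.
    static boolean sumIsExact() {
        return 0.1 + 0.2 == 0.3;
    }

    // Converts a long to a double and back, and checks whether it survived.
    static boolean survivesDouble(long value) {
        return (long) (double) value == value;
    }

    public static void main(String[] args) {
        System.out.println(0.1 + 0.2);                         // 0.30000000000000004
        System.out.println(sumIsExact());                      // false
        System.out.println(survivesDouble(9007199254740993L)); // false: 2^53 + 1 has no exact double
    }
}
```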

The corresponding wrapper classes are Float for float and Double for double. If the data type starts with a capital letter, we assume it is a reference data type. Unfortunately, this is only a convention and not a requirement within the language. Some programmers create reference data types that start with a lowercase letter, but we should try not to do this.

Boolean Data Types#

The Boolean data type, declared as boolean in the primitive form and as Boolean in the reference form, holds either true or false. Only one bit of information is required to represent a primitive boolean, although the amount of memory actually used depends on the Java implementation.

A Boolean expression is an expression that evaluates to either true or false. It is often made using the boolean operators:

== is equality in Java. For example, it is true that 2 == 2 and it is false that 2 == 3.
&& is used as the logical and operator. Two expressions combined with && are true only when both expressions are true. If either expression is false then the combined expression is also false.
|| is used as the logical or operator. Two expressions combined with || are true when at least one of the expressions is true. If both expressions are false then the combined expression is also false.
! is used as the logical not operator. If an expression is true, it changes to false, and vice-versa, if an expression is false, it changes to true.

Note that the bit-wise operators (single & and |) are not the same as the Boolean operators and even though they may appear to work the same they should not be used in place of the Boolean operators. When you study bit-wise operators (not in this textbook), the difference will become more clear.
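One visible difference can be sketched without studying bit-wise operators in detail: && stops as soon as the answer is known, while the single & always evaluates both sides. The counter and the method names below are made up purely to make that difference observable.

```java
class ShortCircuitDemo {
    static int calls = 0;

    // Counts each time it is evaluated, then returns true.
    static boolean counted() {
        calls++;
        return true;
    }

    // Number of times counted() runs when it is combined using &&.
    static int callsWithDoubleAnd(boolean left) {
        calls = 0;
        boolean result = left && counted(); // right side skipped when left is false
        return calls;
    }

    // Number of times counted() runs when it is combined using a single &.
    static int callsWithSingleAnd(boolean left) {
        calls = 0;
        boolean result = left & counted(); // right side always evaluated
        return calls;
    }

    public static void main(String[] args) {
        System.out.println(callsWithDoubleAnd(false)); // 0
        System.out.println(callsWithSingleAnd(false)); // 1
    }
}
```

The combined value is false in both cases, which is why the two operators may appear to work the same.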

Character Data Types#

Characters, designated with the char data type for primitives along with the Character wrapper, are 16-bit Unicode values. Characters must be in single quotes, for example, 'a' or 'A'. Capital letters have a different encoding than lowercase letters. A Unicode character chart can be used to see the encoding[3], which handles the letters in the languages of the world. Unicode codes can need up to 21 bits, so a single Java char only holds those characters whose code fits in 16 bits; characters beyond that range need a pair of char values. The 16-bit range covers most of the letters and symbols from the most common languages in the world. For English letters, a simpler ASCII chart may be used[4], but Java pads the leftmost bits with 0s to form 16 bits in total per character. Here is some sample code:

char x = 'R';
char y = '\u263A';  //Unicode for a smiley
char z = '☺';       //Pasting symbol in directly
System.out.println("x: " + x + "  y: " + y + "  z: " + z);

x: R  y: ☺  z: ☺

String Data Type#

Strings in Java are not primitive, but they are built in as an abstract data type, with special syntax particular to Strings. Strings are used for a sequence of characters, for example, an English word or sentence. Strings have many methods associated with them (you can search for the String library documentation using any internet search engine and your current version of Java)[5]. Strings are set up with constructors, and the associated methods are accessed using standard object-oriented dot notation. For example,

String likesR = new String("Rosanna loves banana");
System.out.println("My string is: \"" + likesR + "\" with length: " + likesR.length());

My string is: "Rosanna loves banana" with length: 20

There is also a shortcut that allows Strings to be made with assignment directly, without using the new reserved word that is normally required for making objects. For example,

String likesZ = "Zeus loves shrimp";
System.out.println("My string is: \"" + likesZ + "\" with length: " + likesZ.length());

My string is: "Zeus loves shrimp" with length: 17

Notice how the escape sequence \" is used as a character inside my message, which is a String delimited by ".

You can even make an empty String, an assignment to a String with nothing in it:

String emptyString = "";
System.out.println("empty string is: " + emptyString +
                   " with length: " + emptyString.length());

empty string is:  with length: 0

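or a null String, where the variable does not refer to any String at all. The example below was run in JShell, where a variable declared without a value defaults to null; inside a method, a local variable needs the explicit assignment String nullString = null; before the compiler will allow it to be used: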

String nullString;
System.out.print("null string is: " + nullString);
System.out.println("with length: " + nullString.length());

null string is: null

---------------------------------------------------------------------------
java.lang.NullPointerException: Cannot invoke "String.length()" because "REPL.$JShell$27.nullString" is null
	at .(#29:1)

Methods are invoked or sent to Strings using the standard dot . notation of all object-oriented programming. In the previous examples, we saw the use of the length method sent to the name of the object and invoked with .length(). There are many other String methods available[5], with charAt, equals, equalsIgnoreCase, format, indexOf, matches, split, substring, toLowerCase and toUpperCase perhaps being the most useful. You should hazard a good guess as to what each of these methods does, before looking them up. Null objects, including null strings, will raise exceptions if any messages are attempted.
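If you have already made your guesses, here is a brief sketch that exercises a few of these methods on the likesZ String from earlier. The class and helper method names are made up for illustration, and the word count assumes that words are separated by single spaces.

```java
class StringMethodsDemo {
    // Returns the characters from index 'from' up to (not including) 'to', in capitals.
    static String shoutedSlice(String sentence, int from, int to) {
        return sentence.substring(from, to).toUpperCase();
    }

    // Counts the words in a sentence whose words are separated by single spaces.
    static int wordCount(String sentence) {
        return sentence.split(" ").length;
    }

    public static void main(String[] args) {
        String likesZ = "Zeus loves shrimp";
        System.out.println(likesZ.charAt(0));             // Z
        System.out.println(likesZ.indexOf("loves"));      // 5
        System.out.println(shoutedSlice(likesZ, 5, 10));  // LOVES
        System.out.println(wordCount(likesZ));            // 3
        System.out.println(likesZ.equalsIgnoreCase("ZEUS LOVES SHRIMP")); // true
    }
}
```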

It is important to note that String values are immutable: while you may access a particular item in a String, you cannot change its value. For example, since Strings have 0-based indexing, likesZ.charAt(1) will get the character ‘e’, but there is no way to change that ‘e’ into another letter. This is not to be confused with re-assigning the string with likesZ = "ZEus loves shrimp";, which moves the name likesZ to point to a different string altogether.

StringBuilder Data Type#

A mutable version of strings is built-in to Java as the data type StringBuilder[6]. It has some similar methods to String, but some different methods to allow mutation. If a string is being built by concatenation, it is often preferable to build it using StringBuilder’s append and then convert it back to a String using toString().
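The following sketch shows a StringBuilder being changed in place and then converted back to a String. The class and method names are made up for illustration; setCharAt, append, and toString are methods of StringBuilder.

```java
class StringBuilderDemo {
    // Changes one character in place, adds text to the end, and converts back to a String.
    static String mutate(String start) {
        StringBuilder sb = new StringBuilder(start);
        sb.setCharAt(1, 'E');  // allowed here, unlike with a String
        sb.append("!");        // grows the same object rather than making a new one
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(mutate("Zeus loves shrimp")); // ZEus loves shrimp!
    }
}
```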

Array Data Type#

Often we want to place multiple items into memory but only use one name to refer to them. For example, a basket of fruit might contain kiwi, apple, orange, and watermelon. Built-in to Java is the array data type, which is the most basic structure or container[7]. Arrays are a reference data type with special notation, which applies only to arrays. Here is the Java code for the example:

String[] fruitBasket = {"kiwi", "apple", "orange", "watermelon"};

Here is a visualization of this code:

shown in memory an array with four items kiwi, apple, orange, watermelon

An array is a list of items or elements, stored contiguously in memory. All elements must be of the same data type. An array is declared using that same data type and putting square brackets onto the end of the data type. For example, int[] is an array of integers, and double[] is an array of double precision floating point numbers. There are two ways to make arrays, but once an array is made its size cannot be changed during run-time:

If you know the items, then place them with an assignment statement on the same line as the declaration. For example,
```
int[] yourArray = {1, 2, 3, 4, 5};
```
If you do not know the items, then simply ask for the correct amount of space, using the new operator in an array-specific way with square brackets instead of rounded brackets. For example,
```
int[] myArray;
myArray = new int[5];
```
Java initializes all items in an array, depending on their data type:
- numeric types: 0 is placed in each cell
- Boolean type: false is placed in each cell
- reference types: null is placed in each cell.
Later you can (re-)assign individual items, as an array’s items are mutable. Indexing starts at 0, so the first item is set to 18 with
```
myArray[0] = 18;
```

The length of an array can be accessed by using dot notation on the name of the array; for example, myArray.length will have the value of 5, which is the length of myArray. Notice how length is a data field that is accessed directly: there are no brackets () as there would be when a method is called. The largest valid index for myArray is \(4 =\) myArray.length \(- 1\). Negative indices cannot be used, and all indices outside of the bounds of the array will cause a run-time exception when access is attempted. For example,

int[] x = new int[10];
x[10] = 18; //out of bounds

---------------------------------------------------------------------------
java.lang.ArrayIndexOutOfBoundsException: Index 10 out of bounds for length 10
	at .(#30:1)
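A common way to stay inside the bounds is to let the length field control a loop. The sketch below, with made-up class and method names, also relies on Java having placed 0 in every cell of a new int array.

```java
class ArrayBasicsDemo {
    // Adds up every element, using length so the index never leaves the array.
    static int sum(int[] values) {
        int total = 0;
        for (int i = 0; i < values.length; i++) {
            total += values[i];
        }
        return total;
    }

    public static void main(String[] args) {
        int[] myArray = new int[5];          // every cell starts at 0
        myArray[0] = 18;                     // first item
        myArray[myArray.length - 1] = 2;     // last item, at index 4
        System.out.println(sum(myArray));    // 20
    }
}
```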

Arrays may be multi-dimensional. A 2D array is made using [][] and a 3D array with [][][]. Suppose we want to make a tic-tac-toe board, which is 3 X 3:

char[][] board = new char[3][3];
board[1][1] = 'X'; //Place X in middle
board[2][2] = 'O'; //Place O in bottom right
board[2][0] = 'X'; //Place X in bottom left
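To see the board, each row can be visited in turn and each cell within that row. This is a sketch only: the class and method names are made up, and drawing an unset cell as '.' is a choice made here for readability (an unset char cell holds the character with code 0).

```java
class BoardDemo {
    // Builds a text picture of the board, one line per row.
    static String render(char[][] board) {
        StringBuilder sb = new StringBuilder();
        for (int row = 0; row < board.length; row++) {
            for (int col = 0; col < board[row].length; col++) {
                char cell = board[row][col];
                sb.append(cell == '\u0000' ? '.' : cell); // show unset cells as '.'
            }
            sb.append('\n');
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        char[][] board = new char[3][3];
        board[1][1] = 'X'; // middle
        board[2][2] = 'O'; // bottom right
        board[2][0] = 'X'; // bottom left
        System.out.print(render(board));
        // ...
        // .X.
        // X.O
    }
}
```

The first index selects the row and the second selects the column within that row.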

Casting#

Sometimes Java can change the values of one data type into values of another data type. For example, it should be clear that the number 5, though an integer, could be assigned to a variable that is a double. Indeed, it can:

double x = 5;
System.out.println(x);

5.0

[8]

Java converts that 5 for the programmer implicitly, that is, without the programmer needing to do anything. However, going in the opposite direction: should it be allowed that a floating point number could be assigned to an integer? Check it out:

int y = 5.5;
System.out.println(y);

|   int y = 5.5;
incompatible types: possible lossy conversion from double to int

This rightly raises an error to warn the programmer that some accuracy may be lost. There are times when the programmer is okay with the loss in accuracy. Java allows the programmer to convert the type explicitly with a cast. A cast appears in rounded brackets () in front of the item to be converted. Here is the previous example done properly, assuming the loss of the \(0.5\) is acceptable:

int y = (int) 5.5;
System.out.println(y);

Can you identify the cast?
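It is worth seeing exactly what such a cast does to the value. In the sketch below (the class and method names are made up), the cast drops the fractional part rather than rounding; Math.round is shown only for contrast.

```java
class CastDemo {
    // Casting a double to an int drops the fractional part (truncates toward zero).
    static int truncate(double value) {
        return (int) value;
    }

    public static void main(String[] args) {
        System.out.println(truncate(5.5));   // 5
        System.out.println(truncate(-5.5));  // -5
        System.out.println(Math.round(5.5)); // 6, rounding instead of truncating
    }
}
```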

Converting lowercase ASCII characters to uppercase characters can be accomplished with casting. The ASCII chart shows that the encoding for a lowercase character is exactly 32 bigger than the encoding of the corresponding uppercase character[4].

char ourChar = 'r';
int encodedChar = (int) ourChar; //Convert to integer explicitly
encodedChar = encodedChar - 32;  //Find corresponding upper case
ourChar = (char) encodedChar;    //Convert to char explicitly
System.out.println(ourChar);

Practice Questions#

What is the purpose of a data type?
Declare a variable and set it equal to your name.
Declare a variable and set it equal to the number of days in February during a leap year.
Declare a variable pi and set it equal to the mathematical definition of \(\pi\). Will your value be exact?
Make a floating point number equal to Integer.MAX_VALUE - 1 (that is, \(2,147,483,646\)). Is your floating point value exact?
Given the following Java code:
```
int banana = 7;
float jasper = banana / 2;
```
Without a computer, determine the value of each of the following once the code is executed:
a. banana b. jasper
Make an array containing the odd numbers from 1 to 20 inclusive.
Make an array that can hold 20 floating point numbers of double precision. What is the largest valid index for this array?
Make two Strings, one containing the first name and one containing the last name of the original inventor of Java. Use the concatenation operator to make a new String containing both names, and display the name.
Make a StringBuilder initialized to the first name of the original inventor of Java. Use the append method to add the last name on. Display the resulting name.
Write the Java expression that is true exactly when a char called theChar is an uppercase character.
Write the Java expression that is true exactly when a char called theChar is a letter in your name.
