Data Types
We will be using a variety of objects to work effectively with R. Here we’ll discuss briefly the important types of objects and how they should be used.
Objects and OOP
R supports Object-Oriented Programming (OOP), which means it works by storing information and data in “objects”. An object is a data structure that has some attributes, together with a set of methods that act on those attributes.
R is also a “high-level” language, which tends to make it more intuitive to work with: whenever you assign some value(s) to a variable, that variable becomes an object, and R keeps it in the environment for future use.
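As a minimal sketch of the idea above (the names height_cm and height_m are hypothetical, chosen only for illustration), assigning a value creates an object that R keeps in the environment and that can be reused later:

```r
# Assign a value to a name; R stores the resulting object in the
# current environment. (height_cm is a hypothetical example name.)
height_cm <- 172

# The object can be looked up again later by name.
exists("height_cm")
## [1] TRUE

# Reuse the stored value in a later calculation.
height_m <- height_cm / 100
height_m
## [1] 1.72

# Every object has a class, which R uses to choose a matching method.
class(height_m)
## [1] "numeric"
```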
Data Types
There are six main data types in R, but we will discuss only the first four.
- Characters
- Numerics (real or decimal)
- Integers
- Logicals
- Complex
- Raw
When we assign variables, it is important to know what type we are using. We can check the type of an object using typeof(), and a number of other commands give further useful information about it:
x <- "abc"
typeof(x)
## [1] "character"
If we want a more detailed answer we can ask what the structure (str()) of an object is:
str(x)
## chr "abc"
We see the output is slightly different here, with the function telling us the type of the object (chr) and also the content of the object ("abc").
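Besides typeof() and str(), a few other base R functions are handy for inspecting an object. The selection below is only a sketch of commonly used ones, not a complete list:

```r
x <- "abc"

# class() reports the class R uses when choosing a method for the object.
class(x)
## [1] "character"

# is.character() tests for one specific type and returns TRUE or FALSE.
is.character(x)
## [1] TRUE

# length() counts elements: x holds a single string, so its length is 1.
length(x)
## [1] 1

# nchar() counts the characters inside that string.
nchar(x)
## [1] 3
```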
Characters
A character string (or simply ‘character’ type) is used to represent text. Character strings are typically enclosed in double quotes (" ") or single quotes (' '). They can contain letters, numbers, symbols, and spaces.
x <- "abc"
y <- "abc123"
typeof(x)
## [1] "character"
typeof(y)
## [1] "character"
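A short illustration of the quoting rules, using hypothetical variable names: both quote styles produce the same string, and digits inside quotes are still text until they are converted.

```r
# Double and single quotes create identical character strings.
a <- "abc"
b <- 'abc'
identical(a, b)
## [1] TRUE

# Digits inside quotes are text, not a number.
n <- "123"
typeof(n)
## [1] "character"

# as.numeric() converts the text to a number so arithmetic works.
as.numeric(n) + 1
## [1] 124

# One quote style can enclose the other.
s <- "it's here"
nchar(s)
## [1] 9
```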
Numerics
The ‘numeric’ type is R’s general-purpose way of storing numbers, especially those that might have decimal places. When you create a number like 3.14 or even 10 (without an L suffix), R typically stores it in a way that can handle decimals.
z <- 3.14
typeof(z)
## [1] "double"
R stores these numbers in double-precision floating-point format, often called simply ‘double’, which is why typeof() returns "double" here. A double can represent a very wide range of values, from very small to very large, with roughly 15 to 16 significant decimal digits of precision. Results are therefore accurate to many decimal places, but not always exact.
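One practical consequence of double precision is that some decimal values cannot be stored exactly. The sketch below shows the classic case and a tolerance-based comparison with all.equal() as one common way to handle it:

```r
# 0.1 and 0.2 have no exact binary representation, so the sum
# differs very slightly from 0.3.
z <- 0.1 + 0.2
z == 0.3
## [1] FALSE

# all.equal() compares with a small tolerance instead of exactly.
isTRUE(all.equal(z, 0.3))
## [1] TRUE

# A whole number typed without an L suffix is still a double.
typeof(10)
## [1] "double"

# Number of binary digits a double uses for its significand.
.Machine$double.digits
## [1] 53
```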
Integers
An integer is a whole number with no decimal part. Note: by default, R stores numbers as numeric (double) unless explicitly told otherwise. If we create a new variable with a whole-number value, R still stores it as a double.
x1 <- 2
typeof(x1)
## [1] "double"
To force R to store it as an integer we can simply add an L after the value.
x2 <- 2L
typeof(x2)
## [1] "integer"
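It can help to see when a value stays an integer and when R switches back to double. The following is a small sketch using base R only:

```r
i <- 2L

# is.integer() checks the storage type directly.
is.integer(i)
## [1] TRUE
is.integer(2)
## [1] FALSE

# Integer plus integer stays integer.
typeof(i + 3L)
## [1] "integer"

# Mixing in a double converts the result to double.
typeof(i + 0.5)
## [1] "double"

# Ordinary division always returns a double, even for two integers.
typeof(5L / 2L)
## [1] "double"

# Integer division (%/%) of two integers keeps the integer type.
typeof(5L %/% 2L)
## [1] "integer"

# as.integer() converts a double by dropping the decimal part.
as.integer(3.9)
## [1] 3
```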
Logicals
Logical types are simply TRUE or FALSE.
You can also use T as a shorthand for TRUE and F for FALSE. However, it’s generally considered best practice to use the full words (TRUE and FALSE), because T and F are ordinary variables that can be reassigned to other values (though this is rare and discouraged), whereas TRUE and FALSE are reserved words that cannot.
x3 <- TRUE
typeof(x3)
## [1] "logical"
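Logical values most often arise from comparisons. The sketch below also demonstrates why the T shorthand is fragile; the reassignment is deliberately bad practice, shown only to illustrate the risk, and is undone immediately afterwards:

```r
# Comparisons return logical values.
5 > 3
## [1] TRUE
typeof(5 > 3)
## [1] "logical"

# In arithmetic, TRUE counts as 1 and FALSE as 0.
sum(c(TRUE, FALSE, TRUE))
## [1] 2

# T is an ordinary variable, so it can be overwritten.
T <- 0
isTRUE(T)
## [1] FALSE

# Remove our copy so that T refers to the built-in TRUE again.
rm(T)
isTRUE(T)
## [1] TRUE
```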