Solutions to this workshop can be found here
Defining variables, or giving a name to values, such as the numbers you added together, allows you to store information.
You define variables using the <- operator, which means “store the value on the right in the variable on the left”. You can then call the value back by typing the variable name, like so.
variable1 <- 3
variable2 <- 4
variable1
You can use variables in the place of the numbers they hold, and perform mathematical operations on them.
variable1 + variable2
As their name implies, variables can take on many values, and can change, so be careful with naming and assignment.
variable1 <- variable1 + variable2
variable1
variable1 <- variable1 + variable2
variable1
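The reassignment pattern above can be sketched with fresh variables. This is a minimal illustration; the names counter and snapshot are hypothetical and not part of the workshop.

```r
# Hypothetical names, used only for this illustration
counter <- 10            # counter holds 10
counter <- counter + 5   # the right side (10 + 5) is evaluated first, then stored
counter                  # 15

snapshot <- counter      # copy the current value into a second variable
counter <- counter * 2   # changing counter does not change snapshot
snapshot                 # still 15
counter                  # 30
```

Each assignment overwrites the old value, so running the same line twice gives a different result the second time.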
These types of variables are numeric. However, R can also store text, in the form of characters. Surround text in quotation marks so that R treats it as a character value and not as a variable name.
char <- "a"
char
# In your console, try typing 'char <- a' (no quotes) and seeing what happens
Character variables can store strings, such as words and sentences (in fact, in R, string and character are basically interchangeable). In R, one string (everything between the quotation marks) is stored as a single unit.
char_sentence <- "My favorite food is ice cream"
char_sentence
char_variable1 <- "variable1"
char_variable1
char_variable2 <- variable1
char_variable2
You can identify the type of a variable (numeric, character) using is().
# Using is()
is(5)
# Try identifying the type of variable for char_variable1
# How about char_variable2?
# What is going on?
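As a possible cross-check on what is() reports, base R also has class(), which this workshop does not otherwise use. The sketch below uses literal values rather than the workshop's variables.

```r
# class() reports the main type of a value
class(5)        # "numeric"
class("five")   # "character"

# is() lists the main type first, followed by more general categories,
# so its first element matches what class() reports for these values
is(5)[1]        # "numeric"
is("five")[1]   # "character"
```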
You can also make numbers into strings: just put quotes around them!
num_string <- '4'
actual_num <- 4
# Try identifying the types of num_string and actual_num
You need to be very careful about the types of your variables. It may look like actual_num and num_string are holding the same thing, but because we’ve implicitly told R that they are of different types, they behave very differently. In the Console, try adding num_string to actual_num. What happens, and why? What is R’s cryptic message trying to tell you?
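Once you have tried the exercise, it may help to know that base R can convert between the two types explicitly. The sketch below assumes the functions as.numeric() and as.character(), which the workshop has not introduced, and uses hypothetical variable names.

```r
# Hypothetical names, used only for this illustration
text_number <- "10"   # a character value that merely looks like a number
real_number <- 5      # an actual numeric value

# Convert the string to a number before doing arithmetic with it
converted <- as.numeric(text_number)
converted + real_number     # 15

# The conversion also works in the other direction
as.character(real_number)   # "5"
```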
We saw above that you can do basic arithmetic in R by just using mathematical symbols, e.g. +. But for more complicated things, programming languages use things called functions. A function wraps a whole set of operations so that you can invoke those operations in a single command. You have already used one function earlier in this class, is(). Let’s take a look at another example: there’s a function, sum(), that can add numbers, just like + can. Let’s try it:
variable1
variable2
sum(variable1, variable2)
sum() can also take more than two inputs!
# Try using the sum function to compute the sum of 3, 4, and 6
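For comparison, here is a small sketch with different numbers than the exercise, showing that sum() with several inputs gives the same result as chaining +. The variable name total is hypothetical.

```r
# sum() accepts any number of inputs, separated by commas
total <- sum(1, 2, 3, 4, 5)
total                        # 15

# The same result, written with the + operator
total == 1 + 2 + 3 + 4 + 5   # TRUE
```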
Functions are their own type! You can try using is() to check this:
is(sum)
is(is)
Functions are always followed by () when you call them. When you use a function, you are often telling it to perform an operation on something else (e.g. a variable); in the examples above, we’re performing the sum() function on variable1 and variable2, and we’re performing the is() function on sum and on itself. The things a function operates on are called arguments, and are passed to the function inside the parentheses.
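To make the idea of arguments more concrete, here is a minimal sketch using round(), a base R function that this workshop does not otherwise cover. It takes a number and, optionally, a named argument digits; the variable names are hypothetical.

```r
# round() receives two arguments here: the number to round,
# and how many decimal places to keep (passed by name as digits)
rounded <- round(3.14159, digits = 2)
rounded                          # 3.14

# Arguments can also be variables
pi_estimate <- 3.14159           # hypothetical variable name
round(pi_estimate, digits = 3)   # 3.142
```

Naming an argument, as with digits above, makes it clear which input plays which role.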
Some functions don’t need to take any arguments. For example, you can find out what variables you have loaded in your environment at any time using ls(). This will list all your variables, any custom functions you may have written, etc.
ls()
Check out the Environment window on the top right! You can also see all your loaded variables in there, with a short summary of what’s in them. This is a super convenient feature of programming in RStudio.
Another really useful function is the paste() function; this function allows you to combine strings.
favorite_subject <- 'biology'
statement_about_subject <- 'is #'
subject_rank <- '1'
biology_quality <- paste(favorite_subject, statement_about_subject, subject_rank)
Notice that the code above didn’t print anything out. That’s because we directed the output of the paste() function into a variable, biology_quality. To get R to print out the biology_quality variable, you could just type it into your console, or you could use a function! Try googling to find an R function that prints out a variable.
# Find an R function to print out a variable, and use it to print whatever
# the biology_quality variable is storing
You can learn about a function by reading its documentation. You can access the documentation with ?; for example, type ?paste into your console. You will see the help text pop up in the bottom right Help box in RStudio.
? is one of the more important keys for you to know. Seriously, many issues are averted by first checking the documentation. Documentation usually has sections such as Description, Usage, Arguments, Details, Value, and Examples.
In practice, this information can sometimes be a bit dense and technical. I usually find it most useful to read the ‘description’ on top, and then scroll all the way down to the ‘examples’ section at the bottom of the help page.
Notice that when we used paste() above, the function automatically put a space between our strings. It’s often really useful to combine strings with a different character, for example, a dash. Try to read the documentation for paste() to figure out how to do this.
# Combine only the favorite_subject, statement_about_subject, and subject_rank
# variables, using a dash between them instead of a space
sum(), paste(), is()