Introduction to R and RStudio

Questions

  • What are R and RStudio?
  • How do you get started working in R and RStudio?

Objectives

  • Understand the difference between R and RStudio.
  • Describe the purpose of the different RStudio panes.
  • Organize files and directories into R Projects.

What are R and RStudio?

R is a programming language that is popular for data analysis.

RStudio is a software application that can make it easier to write and run R code.

Running R and RStudio

Let’s start R and RStudio in GitHub Codespaces.

  1. Go to data-explorers-feb-2025 on GitHub.
  2. Click on your repo.
  3. Click the “Open in GitHub Codespaces” button.
  4. Click “PORTS” after the Codespace has finished building.
  5. Hover over “RStudio”.
  6. Click on the middle globe icon to start RStudio.

Overview of workshop files

Let’s examine the project files using the Files pane.

It is a good practice to organize your projects into self-contained folders. Your project should start with a top-level folder that contains everything necessary for the project, including data, scripts, and results, all organized into subfolders.

project_folder
├── .devcontainer
├── .gitignore
├── data
│   ├── cleaned
│   └── raw
├── project.Rproj
├── readme.md
├── results
└── scripts
    ├── exercises
    └── lesson-scripts

We have a data folder containing cleaned and raw subfolders. In general, you want to keep your raw data completely untouched, so once you put data into the raw folder, you do not modify it. Instead, you read it into R, and if you make any modifications, you write the modified file into the cleaned folder. We also have a results folder for any other documents you might produce, and a scripts folder to hold any scripts we write. The scripts folder has a lesson-scripts subfolder with all the code from the lesson and an exercises subfolder for the lesson exercises.
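The raw-to-cleaned workflow described above can be sketched as follows. This is only an illustration under stated assumptions: the file names prices.csv and prices_cleaned.csv, the columns item and price, and the use of a temporary folder (so that no real workshop data is touched) are all hypothetical.

```r
# Build a throwaway project layout inside R's temporary folder.
project_dir <- file.path(tempdir(), "project_folder")
dir.create(file.path(project_dir, "data", "raw"), recursive = TRUE, showWarnings = FALSE)
dir.create(file.path(project_dir, "data", "cleaned"), recursive = TRUE, showWarnings = FALSE)

# Create a small hypothetical raw file, standing in for data you were given.
raw_path <- file.path(project_dir, "data", "raw", "prices.csv")
write.csv(
  data.frame(item = c("apple", "pear"), price = c(2, NA)),
  raw_path,
  row.names = FALSE
)

# Read the raw data into R. The raw file itself is never modified.
raw_prices <- read.csv(raw_path)

# Make a change in R: drop rows with a missing price.
cleaned_prices <- raw_prices[!is.na(raw_prices$price), ]

# Write the modified data to the cleaned folder, under a new file name.
cleaned_path <- file.path(project_dir, "data", "cleaned", "prices_cleaned.csv")
write.csv(cleaned_prices, cleaned_path, row.names = FALSE)
```

After this runs, the raw file still holds both rows, and the cleaned folder holds the modified copy.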

RStudio provides a “Projects” feature that can make it easier to work on individual projects in R. Each RStudio project has a .Rproj file, such as project.Rproj in this workshop.

One of the benefits to using RStudio Projects is that they automatically set the working directory to the top-level folder for the project. The working directory is the folder where R is working, so it views the location of all files (including data and scripts) as being relative to the working directory.
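To see what "relative to the working directory" means in practice, here is a minimal sketch. The file name prices.csv is hypothetical, and the path printed by getwd() depends on where your project lives.

```r
# Print the current working directory (the exact path depends on your setup).
getwd()

# Build a path relative to the working directory.
# "prices.csv" is a hypothetical file name used only for illustration.
raw_file <- file.path("data", "raw", "prices.csv")
raw_file

# Check whether the file exists before trying to read it.
file.exists(raw_file)
```

Because the path does not start at the root of the file system, R looks for the data folder inside the working directory, which an RStudio Project sets to the top-level project folder.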

Since the workshop is hosted on GitHub, we also have some additional files. .devcontainer has instructions on how to create GitHub Codespaces. .gitignore lists files and folders that Git should not track. readme.md gives a basic description of this project. GitHub displays the readme on the repository webpage. If you run R and RStudio on your own computer, then you don’t need these additional files.

Working in R and RStudio

Console vs. script

Programming is writing instructions for a computer. We refer to those instructions as code. You can run those instructions directly in the R console, or you can write them into an R script.

Console

  • The R console is where code is run (executed).
  • The prompt, which is the > symbol, is where you can type commands.
  • When you press Return (Mac) or Enter (Windows), R executes those commands and prints the result.
  • Code that you type in the R console is not saved in a file, so it is hard to find and reuse later.

Let’s try running some code in the console.

First, click in the Console pane and type:

1 + 2
[1] 3

Hit Return or Enter to run the code. You should see your code echoed, and then the value of 3 returned.

Script

  • We can also save code in plain text files called scripts. For the R programming language, script files have a .R extension.
  • You type out lines of R code in a script, then send them to the R console to be evaluated. There are a few ways to run the code:
    1. Press Cmd+Return (Mac) or Ctrl+Enter (Windows) to run the line of code that your cursor is on. If you highlight multiple lines of code, the same shortcut runs all of them.
    2. Click on the Run button above the editor panel.
  • By saving the code in a script, you can edit and rerun it quickly, save it for later, and share it with others.
  • An additional benefit of scripts is that you can leave comments for yourself or others to read. Lines that start with # are considered comments and will not be interpreted as R code.
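As a small illustration of comments, the sketch below mixes comments with one line of code. The numbers are arbitrary and chosen only for the example.

```r
# This whole line is a comment, so R ignores it.
1 + 2  # A comment can also follow code on the same line.

# The next line is commented out, so it is not run:
# 100 + 200
```

Only the 1 + 2 line is evaluated when this script is run.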

First script

Let’s create our first script.

You can make a new R script by clicking File → New File → R Script, clicking the green + button in the top left corner of RStudio, or pressing Shift+Cmd+N (Mac) or Shift+Ctrl+N (Windows). The new script will be unsaved and called “Untitled1”.

R as Calculator

Now click into your blank script, and type:

1 + 2
[1] 3

With your cursor on that line, press Cmd+Return (Mac) or Ctrl+Enter (Windows) to run the code. You will see that your code was sent from the script to the console, where it returned a value of 3.
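Addition is only one of the arithmetic operators. The lines below are a short sketch of a few others. The numbers are arbitrary, and the comments give the value each line returns.

```r
7 - 4        # subtraction: 3
3 * 5        # multiplication: 15
10 / 4       # division: 2.5
2 ^ 3        # exponent: 8
(1 + 2) * 3  # parentheses control the order of operations: 9
```

You can run these one at a time, or highlight all of them and run them together.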

Saving scripts

Save script files by using Cmd+S (Mac) or Ctrl+S (Windows). Give the script a descriptive name such as ‘first_script.R’. Save it in the scripts folder.

Do not rely on RStudio to save your scripts automatically. You need to save the files manually, and you should save often.

When you change a file, the name of the file will appear red in the tab. That means the file has unsaved changes. When you save the file, the name turns black.

Objects and Assignment

Sometimes we want to store values in memory so we can use them later. We can save values using the assignment operator <- (a less-than sign < followed by a hyphen -).

object_name <- value

When R runs this line, it creates an object, names it object_name, and assigns value to it. We use object_name to access the value. Objects are shown in the Environment pane.

Let’s create price1 and price2 objects and assign them some values.

price1 <- 2
price2 <- 3

Now that we have the two objects saved in memory, we can do things with those objects. We can add the two prices.

price1 + price2
[1] 5

We can assign a new value to an existing object:

price1 <- 10

price1 + price2
[1] 13
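One detail worth checking for yourself is that a stored result is not updated when the objects it was computed from change. The sketch below repeats the price example and adds a hypothetical object named total to show this.

```r
price1 <- 2
price2 <- 3

# Store the sum in a new object (the name `total` is illustrative).
total <- price1 + price2

# Reassign price1. This does not change `total`.
price1 <- 10

total            # still 5
price1 + price2  # 13, computed from the current values
```

To refresh total after changing price1, you would run the total <- price1 + price2 line again.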

Comparing values

We can compare price1 and price2. Each comparison returns TRUE or FALSE.

price1 == price2
[1] FALSE
price1 > price2
[1] TRUE

R comparison operators:

== equal

!= not equal

> greater than

>= greater than or equal to

< less than

<= less than or equal to
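The remaining operators can be tried in the same way. This sketch assumes price1 is 10 and price2 is 3, as in the example above. The object name is_more_expensive is illustrative.

```r
price1 <- 10
price2 <- 3

price1 != price2  # TRUE: the values differ
price1 >= 10      # TRUE: 10 is greater than or equal to 10
price1 < price2   # FALSE: 10 is not less than 3
price2 <= 3       # TRUE: 3 is less than or equal to 3

# A comparison result can be stored like any other value.
# The name `is_more_expensive` is illustrative.
is_more_expensive <- price1 > price2
is_more_expensive  # TRUE
```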

Naming objects

You should try to use a descriptive name when naming objects to help you remember what information is contained in the object.

Object names can contain letters, numbers, underscores, and periods. They cannot start with a number or an underscore, and they cannot contain spaces. If you want object names with multiple words, use underscores, capital letters, or periods.

first_name <- 'Jane'
firstName <- 'Jane'
first.name <- 'Jane'
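One way to check whether a candidate name follows these rules, without causing an error, is the base R function make.names(), which rewrites a string into a syntactically valid name. The sketch below treats a candidate as valid if make.names() returns it unchanged. The candidate names and the object names candidates and is_valid are only examples.

```r
candidates <- c("first_name", "firstName", "first.name", "1st_name", "first name")

# make.names() (base R) rewrites a string into a syntactically valid name.
# A candidate that comes back unchanged was already valid.
is_valid <- candidates == make.names(candidates)

# Show each candidate next to the result of the check.
data.frame(candidates, is_valid)
```

The first three candidates come back unchanged. The last two are altered, because one starts with a number and the other contains a space.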

Save changes to your repository

We are using GitHub and GitHub Codespaces to host and run our code.

When you create or edit a file, the changes are saved to Codespaces. You also need to save the changes to your repository. By saving the changes to your repository, you will have access to your files after the workshop ends.

The terminal is a program that lets you type commands for the computer. To access the terminal, first go to the browser tab for Codespaces. Click “bash” in the bottom right, then click the “TERMINAL” tab.

Click next to the $ prompt, type the following command, and press Enter.

bash scripts/save_files.sh

This will run a script called “save_files.sh” to save any changes you made to the repository.

When you are done coding for the day, you should run the save_files.sh script.

Note

This script uses Git to update your repository. The script creates a Git commit and pushes the commit to your repository. If you want to learn more about Git, watch this 14-minute Git tutorial.