SCA 00: Getting Started with R

Purpose

The purpose of this activity is three-fold: to ensure that you properly install R on your laptop (if possible), that you have the recommended file structure created so you do not lose your files, and that you check your R installation to ensure that you did it correctly. This last purpose will have you do a little programming in R.

Functions

In this SCA, we will be using the following functions in R. It is useful to keep track of where you were introduced to the functions. By the end of the SCA, you should be able to explain what these functions do.

 


The SCA Procedure

Doing real statistics requires actually doing statistics on real data. That means using a computer and a statistical program. After trying many different programs, I have found that R matches my needs as an analyst most closely. It also allows me to easily check your understanding of statistical techniques because it requires you to provide the script (that is, you show your work).

Part I: The Install

Linux, Mac, and Windows Instructions

The R Statistical Environment installs like the typical piece of computer software: download it, click on the installer, and follow the installation directions.

  1. Go to http://cran.r-project.org/.
  2. Click on the link for your operating system (Linux, MacOS, Windows).
  3. Click on “install R for the first time” if you are using Windows, and “R-4.3.2.pkg” if you are using High Sierra (10.13) or newer. If you have an older version of Mac OS, you will need to use the instructions for Chromebook (below). Note that the “4.3.2” represents the current version number, which changes from term to term. Version “4.3.2” was released in October 2023.
    Note that the version I have on my Windows office desktop is 4.0.2. The version I have on my Mac laptop is 4.0.2. The version I have at home is 3.3.2. In other words, version updates are frequent, but not important for the work we do.
  4. Install as you would any piece of software. The defaults are fine.

That is the end of the first part. By this point, you have R installed on your computer. In the third part, we will ensure that you did this correctly.
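
Once R is running (Part III covers starting it), a quick way to see which version you installed is to ask R itself. This is only a sketch and not part of the activity proper; the variable name `installed` is my own, and the comparison against 3.3.2 simply uses the oldest version mentioned above as an illustrative threshold.

```r
# Full version description of the running R, as a single character string
R.version.string

# The version number alone, stored under an illustrative name
installed = getRversion()
installed

# TRUE if the running R is at least as new as 3.3.2 (illustrative threshold)
installed >= "3.3.2"
```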

Chromebook and iPad Instructions

Because Chromebooks and iPads do not (easily) allow you to download software, you will not be able to use R without being connected to the Internet. You will use R at this URL:

https://euclid.knoxds.org/rstudio/

Note that security settings require that you be on campus to use this link.

To help ensure that your experience with R is akin to what your classmates will experience, please make the following changes in Global Options... under “Tools.” Use the Basic tab under the General section.

  • Uncheck: Restore most recently opened project at startup
  • Uncheck: Restore .RData into workspace at startup
  • Save workspace to .RData on exit → Never

[global options]

In other words, make sure your Options page looks like this.

Part II: The Folders

My experience is that 80% of the problems in doing statistics come from not knowing where files are located and not understanding the importance of the working directory. Creating this directory structure will help avoid those issues, but only if you actually follow through and use it.

  1. Create a folder for this course, STAT200. This is where you will save all of your work for this course. This is a good habit to get into for all courses (and projects) you are a part of.
  2. In that folder, create the following subfolders:
    • sca
    • labs
    • practicums

These subfolders have these names because these are the main “aspects” of this course. Other courses will have different subfolders because they have different structures.
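
If you would rather let R build the folders, the same structure can be sketched with base R. The function name `make_course_folders` and the use of a temporary folder for the demonstration are assumptions of mine, not part of the activity; replace the argument with the location where you actually keep your coursework.

```r
# Hypothetical helper: create the course folder and its three subfolders
make_course_folders = function(base) {
  course = file.path(base, "STAT200")
  subfolders = c("sca", "labs", "practicums")
  for (s in subfolders) {
    # recursive = TRUE also creates STAT200 itself if it is missing
    dir.create(file.path(course, s), recursive = TRUE, showWarnings = FALSE)
  }
  course
}

# Demonstration in a throwaway location so that none of your files are touched
course = make_course_folders(tempdir())

# List the subfolders that now exist inside STAT200
list.dirs(course, full.names = FALSE, recursive = FALSE)
```

Creating the folders by hand in your file browser works just as well; the point is only that the three subfolders end up inside STAT200.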

Part III: Testing the Install

Now, let us double-check that you did everything correctly.

  1. Start R. Note that the window that opens is called the “Console” window. You know this because the window title is “R Console.”
  2. Open a new script window (it will be titled “untitled” until it is saved).
  3. Type the following into that script window:
        ### Test Script
        # Set prng seed
        set.seed(3)
        # Define variables
        x = c(1, 6, 7, 2)
        y = runif(4)
        z = x+y
        # Sample statistics
        mean(x)
        median(y)
        sd(z)
  4. Save this into your sca folder as “0-testScript.R.”
  5. Quit R. It is important to quit because doing so actually removes the R session from your computer’s memory. In Windows, just click on the red X (top-right of the window). In Mac, just press Cmd + q. Online, click on the x next to the script name, then close the browser window.
  6. Restart R.
  7. Click on “Open Script” (Windows) or “Open Document” (Mac) or “Open File” (online). This can be found in the menu under “File.”
  8. Open the script called “0-testScript.R.”
  9. Highlight all of the lines.
  10. Press Ctrl + r (Windows) or Cmd + Return (Mac) or Ctrl + Enter (online).
  11. Look at the blue numbers (Windows) or black numbers (Mac).
  12. They should be (in order and ignoring the echo from what you sent to the Console window):
        [1] 4
        [1] 0.3563383
        [1] 3.132832
      Note that the “[1]” at the start of each output line gives the position of the first value shown on that line. Each of these results is a single value, so every line starts with “[1].”
By the way, these are the three answers to R Assignment 4, which is due on Monday. I encourage you to go there and submit your work now.
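
The comparison in steps 11 and 12 can also be left to R. The sketch below reruns the test script, collects the three statistics, and compares them with the values listed in step 12. The names `results` and `expected` and the tolerance are my own choices; a tolerance is needed because the listed values are rounded to seven significant digits.

```r
# Rerun the test script
set.seed(3)
x = c(1, 6, 7, 2)
y = runif(4)
z = x + y

# Collect the three sample statistics in the order they are printed
results = c(mean(x), median(y), sd(z))

# Values listed in step 12 (rounded as R prints them)
expected = c(4, 0.3563383, 3.132832)

# TRUE when the two sets of numbers agree to within the tolerance
isTRUE(all.equal(results, expected, tolerance = 1e-6))
```

If the last line returns FALSE, the most likely cause is a typo in the script, such as a different seed or a mistyped value in x.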

Part IV: The Working Directory

So, you now have the R Statistical Environment installed on your computer. You have a folder structure designed to help you understand the importance of keeping projects separate. The last thing we have to cover is the “working directory.”

The working directory is the default folder from which R will load data and/or save files. The working directory is (by default) the folder from which you start R. In Windows, opening R from the Start menu will open it in your Documents folder. In Mac, opening R from the Launchpad will open it in your home folder.

The key is that you want R to open in the folder that holds your project, so that this folder becomes the working directory. In Mac OS (and online), this is rather easy. In Windows, it is less so.

  • Windows:
    Save the R environment (.RData) to your working directory. Double-click on this to open R in this directory. [blue R icon]
    Note that copy-pasting will be helpful here. Copy-paste the big-blue-R-icon file from one folder to another to make this procedure easier. The big-blue-R icon looks like the image to the right. There must not be a small arrow in the bottom-left of the icon (this would be a shortcut, which will open R in its default folder, which you do not want).
  • Mac OS and Online:
    Save a script (extension of .R or .r) in your working directory. Open R by double-clicking on this file to open in this directory.

In all cases, R will open in your working directory. This allows you to keep all important information for the project in a single folder. Following this structure and procedure will help you structure your analyses.
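
To confirm that R really did open where you intended, you can ask it for the working directory. This is a small sketch; it assumes you opened R from the sca folder created in Part II and saved the test script there, and the name `wd` is my own.

```r
# Folder that R currently treats as the working directory
wd = getwd()
wd

# Last part of the path; this is "sca" if R was opened from that folder
basename(wd)

# TRUE if the test script from Part III sits in the working directory
file.exists("0-testScript.R")
```

If the path shown is not the folder you expected, close R and reopen it using the procedure above for your operating system.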

Part V: You are Done

This is the end of the first Statistical Computing Activity (SCA). There is a lot of information available to you on the R Statistical Environment. A broad Internet search will usually land you in the right place.

That is where I start for a better understanding of any procedure in R. For instance, if I need to test a single population median, then I will tend to go directly to an Internet search. With that said, here are a few sources I prefer when learning about new R capabilities:

In other words, the truth is out there.

[The Truth is Out There]

You just need to look for it.

This page was last modified on 4 January 2024.
All rights reserved by Ole J. Forsberg, PhD, ©2008–2024. No reproduction of any of this material is allowed without explicit written permission of the copyright holder.