Move Seamlessly from Excel to R - 1
Over the years, I’ve encountered many people who rely heavily on Excel and similar spreadsheet programs for their data analysis. While most are comfortable with Excel, they often face challenges that could be easily addressed with R. To help bridge this gap, I’m creating a series of introductory videos on using R alongside Excel.
I’m not expecting you to completely switch from Excel to R. If you’re comfortable using Excel for most of your tasks, that’s perfectly fine. However, for more complex aspects of data analysis, I encourage you to start integrating R into your workflow. You can perform the heavy-lifting tasks in R, save the results in CSV format, and then return to Excel for the rest of your analysis.
This video series is designed to help you ease into that process, starting with a few basic concepts in R. This won't be an exhaustive overview of the language, but I’ll focus specifically on what you need to know to analyze Excel-type spreadsheets.
What is R?
R is an open-source software environment for statistical computing and data analysis. It is widely used in fields such as biology, economics, and the social sciences. One of R’s standout features is its data visualization capabilities, which let you create clear, insightful plots with little code. R also offers robust tools for machine learning.
For now, though, my goal is to introduce you to R with a focus on working with Excel-like data sets.
Installing R
To get started with R, follow these steps:
Google "Install R": This will take you to the official R website where you can find download links for both Mac OS and Windows.
For Mac OS: Download the .pkg file and follow the installation instructions.
For Windows: Click the Windows link, download the installer, and follow the prompts.
For Linux: Type "Install R Linux" into Google. For Ubuntu, there are specific commands you can run to install R.
Once R is installed, you’ll see the R icon on your desktop (Windows) or in your Applications folder (Mac). Double-click the icon to open R.
Getting Started with R
When you open R, you’ll see the R console, which allows you to run commands interactively. For example, you can type simple arithmetic like:
12 + 23
Pressing Enter will show the result in the console. You can also use the up and down arrows to scroll through previous commands, making it easy to edit and rerun them.
To clear your screen, use Ctrl + L (Windows) or Cmd + Option + L (Mac).
Using R as a Calculator
You can use R like a regular calculator, but with more power and flexibility. Here are a few simple examples:
For addition:
12 + 23
For subtraction:
34 - 12
For multiplication:
2 * 3
For division:
20 / 4
R also allows you to work with decimal numbers:
12.34 + 5.66
What makes R more convenient than traditional calculators is the ability to go back and edit your commands. For instance, if you realize there’s a mistake, you can press the up arrow, modify the number, and hit enter again.
More Complex Calculations
R is ideal for more complex expressions. For example, you can calculate:
0.9235 * 100 + 45 - 15
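R follows the usual order of operations in expressions like this one, and parentheses let you change it. Here is a small sketch with made-up numbers (the variable names are only for illustration) that also shows a few operators not covered above:

```r
# Multiplication and division are evaluated before addition and subtraction.
a <- 2 + 3 * 4      # 3 * 4 first, then add 2: gives 14
b <- (2 + 3) * 4    # parentheses force the addition first: gives 20

# A few more operators that are handy in calculator-style work.
p <- 2^3            # exponentiation: gives 8
r <- 7 %% 3         # remainder after division: gives 1
q <- 7 %/% 3        # integer division: gives 2

print(c(a, b, p, r, q))
```

When in doubt about the order, add parentheses. They cost nothing and make the intent obvious.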
You can even store your results in variables. For example:
x <- 12 + 23
y <- x * 2
z <- y / 3
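Assigning a value with <- does not print anything. To see what a variable holds, type its name or pass it to print(). A minimal sketch using the same three variables:

```r
x <- 12 + 23   # x holds 35
y <- x * 2     # y holds 70
z <- y / 3     # z holds 70 / 3, roughly 23.33

print(x)
print(y)
print(round(z, 2))  # round() trims the display to two decimal places
```

If you change x and want y and z to reflect it, rerun the later lines as well. Unlike spreadsheet cells, R variables do not update automatically.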
R also supports advanced scientific calculations, such as:
Square root:
sqrt(10)
Exponential:
exp(20)
Trigonometry (in radians):
sin(pi / 3)
R has many built-in functions for scientific calculations, and you can explore them in the documentation.
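As a rough sketch of how these functions combine, the lines below round a square root, undo an exponential with a logarithm, and convert degrees to radians before calling sin(). The variable names are my own and only for illustration:

```r
root <- sqrt(10)               # roughly 3.162278
rounded <- round(root, 2)      # keep two decimal places

natural <- log(exp(1))         # log() is the natural logarithm, so it undoes exp()

# sin() expects radians, so convert 60 degrees first (60 degrees = pi / 3 radians).
deg60 <- sin(60 * pi / 180)

print(rounded)
print(natural)
print(deg60)

# Typing ?sqrt in the console opens the help page for a function.
```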
Working with Vectors
In R, a vector is an ordered collection of values of the same type, such as numbers. It works much like a column in a spreadsheet. For example, if you define:
x <- 1:100
You create a vector of numbers from 1 to 100. R allows you to perform operations on an entire vector in a single step:
sqrt(x)
This computes the square root of each number in the vector, all in one go.
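The same idea applies to arithmetic and to summary functions, which is close to what you would do with a formula filled down a column or a SUM cell in Excel. A short sketch, with illustrative variable names:

```r
x <- 1:100

total <- sum(x)        # adds up all 100 numbers: 5050
average <- mean(x)     # arithmetic mean: 50.5
doubled <- x * 2       # multiplies every element by 2

# head() returns only the first few elements, handy for long vectors.
first_roots <- head(sqrt(x), 3)

print(total)
print(average)
print(first_roots)
```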
Installing Packages
Just like a smartphone can be enhanced with apps, R’s functionality can be extended with packages. One highly useful package is tidyverse, which is a collection of packages designed for data manipulation, visualization, and more.
Here’s how to install and use tidyverse. Note that R is case-sensitive, so the package name must be typed in lowercase:
To install the package, type:
install.packages("tidyverse")
After installation, you need to load the package in each new R session by typing:
library(tidyverse)
Once tidyverse is loaded, you gain access to a variety of functions. For example, the pipe operator (%>%) is a powerful tool for chaining operations together in a readable way. It takes the value on its left and passes it as the first argument to the function on its right.
Here’s an example of a calculation with the pipe:
100 %>% sqrt() %>% log10()
This computes the square root of 100 and then takes the base-10 logarithm of the result, all in one clean, readable line. This is much easier to manage than nested parentheses, especially for more complex operations.
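To make the comparison with nested parentheses concrete, here is a sketch that writes the same calculation both ways. So that it runs without any packages, it uses the built-in pipe |>, which is my substitution for %>% and assumes R 4.1 or later. For simple chains like this one the two pipes behave the same way.

```r
# Nested form: you have to read it from the inside out.
nested <- log10(sqrt(100))

# Piped form: reads left to right, in the order the steps happen.
# |> is the built-in pipe (R 4.1+); with tidyverse loaded, %>% works the same here.
piped <- 100 |> sqrt() |> log10()

print(nested)
print(piped)
```

Both lines give the same result: the square root of 100 is 10, and the base-10 logarithm of 10 is 1.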
Reading Data from CSV Files
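Once R is set up, you can easily import data from CSV files. R looks for files in its current working directory, which you can check with getwd() and change with setwd(). For example, if you have a file on your desktop called data.csv and your desktop is the working directory, you can load it into R with: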
df <- read.csv("data.csv")
This creates a data frame, which is essentially R's version of a spreadsheet. You can view the contents by typing df in the console.
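The full round trip (read a CSV, do some work in R, save a CSV for Excel) might look like the sketch below. The column names and values are made up for illustration, and the example writes its own small file to a temporary location so it does not depend on anything on your computer. With a real file you would pass its path to read.csv() instead.

```r
# Create a small, hypothetical CSV file in a temporary location.
csv_path <- tempfile(fileext = ".csv")
writeLines(c("name,score", "Ana,90", "Ben,75", "Cy,82"), csv_path)

# Read it into a data frame and take a first look.
df <- read.csv(csv_path)
print(head(df))   # first rows of the table
str(df)           # column names and types

# Columns are accessed with $, and each column is a vector.
avg <- mean(df$score)
df$above_avg <- df$score > avg   # add a new TRUE/FALSE column

# Save the result as CSV so it can be opened in Excel.
out_path <- tempfile(fileext = ".csv")
write.csv(df, out_path, row.names = FALSE)
```

Setting row.names = FALSE keeps R from adding an extra column of row numbers, which is usually what you want when the file is going back to Excel.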
Conclusion
This video has covered some of the basic features of R that will help you work with Excel-like data. While R has a steep learning curve, integrating it with Excel will make your data analysis more powerful and flexible.
In future videos, we’ll dive deeper into specific R features, such as working with data frames, advanced data manipulation, and visualization. Stay tuned!