Instructor Notes
Dataset
The data used for this lesson are a slightly cleaned-up version of the SAFI Survey Results available on GitHub. The original data are on figshare.
This lesson uses SAFI_clean.csv. The direct download
link for the data file is: https://raw.githubusercontent.com/datacarpentry/r-socialsci/main/episodes/data/SAFI_clean.csv.
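If learners need to fetch the file from within R, a minimal sketch along these lines may help. It assumes an internet connection and a `data` subfolder inside the current working directory; the object names `safi_url` and `dest_file` are illustrative only.

```r
# URL taken from the notes above; the object names are illustrative.
safi_url <- "https://raw.githubusercontent.com/datacarpentry/r-socialsci/main/episodes/data/SAFI_clean.csv"
dest_file <- file.path("data", "SAFI_clean.csv")

# Create the data/ subfolder if it does not exist yet.
dir.create("data", showWarnings = FALSE)

# Download only if the file is missing; report a failure (for example, no
# network) as a message instead of stopping the script.
if (!file.exists(dest_file)) {
  tryCatch(
    download.file(safi_url, destfile = dest_file, mode = "wb"),
    error = function(e) message("Download failed: ", conditionMessage(e))
  )
}
```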
Lesson Plans
The lesson contains much more material than can be taught in a day. Instructors will need to pick an appropriate subset of episodes to use in a standard one-day course.
Suggested path for half-day course:
- Before we Start
- Introduction to R
- Starting with Data
Suggested path for full-day course:
- Before we Start
- Introduction to R
- Starting with Data
- Data Wrangling with dplyr
- (OPTIONAL) Data Wrangling with tidyr
- Data Visualisation with ggplot2
For a two-day workshop, it may be possible to cover all of the episodes. Feedback from the community on successful lesson plans is always appreciated!
Technical Tips and Tricks
Show how to use the ‘zoom’ button to blow up graphs without constantly resizing windows.
Sometimes a package will not install. You can try a different CRAN mirror:
- Tools > Global Options > Packages > CRAN Mirror
Alternatively, you can go to CRAN, download the package, and install it from the ZIP file:
- Tools > Install Packages > set to ‘from Zip/TAR’
It’s often easier to make sure learners have all the needed packages installed at one time, rather than deal with these issues over and over. See the “Setup instructions” section on the homepage of the course website for package installation instructions.
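One way to check for missing packages in a single step is sketched below. The package list (`tidyverse`, `here`) is an assumption made for illustration, as are the helper name `missing_packages` and the mirror URL; the Setup instructions remain the authoritative list.

```r
# Assumed package list for illustration; the Setup instructions are authoritative.
needed <- c("tidyverse", "here")

# Return the packages from `pkgs` that are not yet installed.
missing_packages <- function(pkgs) {
  setdiff(pkgs, rownames(installed.packages()))
}

to_install <- missing_packages(needed)

# Install only in an interactive session, naming a CRAN mirror explicitly.
if (interactive() && length(to_install) > 0) {
  install.packages(to_install, repos = "https://cloud.r-project.org")
}
```

Passing `repos` explicitly is the scripted counterpart of choosing a different CRAN mirror in the Global Options dialog.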
| character on Spanish keyboards: The
Spanish Mac keyboard does not have a | key. This character
can be created using:
`alt` + `1`
Other Resources
If you encounter a problem during a workshop, feel free to contact the maintainers by email or open an issue.
For a more in-depth coverage of topics of the workshops, you may want to read “R for Data Science” by Hadley Wickham and Garrett Grolemund.
Before we Start
Slides notes
It’s important to let people know they should have the orientation document open at this point, as well as the etherpad.
The etherpad will be used rather than asking people questions. It is a collaborative document that is semi-public, so please don’t share confidential information. If you have it open, please add an answer to the question in day 1.
Mention two sources for the course material, with optional lessons to be found on the official carpentry repository.
Mention the use of sticky notes as well.
Instructor Note
- The main goal here is to help the learners be comfortable with the RStudio interface.
- Go very slowly in the “Getting set up” section. Make sure that
learners are in the correct working directory, and that they create a
`data` (all lowercase) subfolder.
Instructor Note
- At this point you may want to show in the file explorer where the project directory is and where the script.R file is. You can also show how to open the project again by double-clicking on the .Rproj file. Make it extra clear that R interacts with locations on your computer (the same ones shown in the file explorer).
- Highlight the importance of saving your script and project often, and that you should always save your script before closing RStudio. If you don’t, you will lose all the work you have done since the last time you saved.
Instructor Note
- Show the file pane and the console pane and how they interact with
the working directory. You can also show how to check the working
directory with `getwd()` and how to change it with `setwd()`. Emphasize that you should avoid using `setwd()` in your scripts and instead use RStudio projects to manage your working directory. You can also show how to set the working directory in RStudio by going to Session -> Set Working Directory -> To Source File Location. This sets the working directory to the location of your script file, which is useful if you have a consistent folder structure across your projects.
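A short console demonstration of the working directory might look like the sketch below. It assumes the `data` subfolder from the “Getting set up” section; the object names `wd` and `data_dir` are illustrative.

```r
# Ask R where it is currently working.
wd <- getwd()
print(wd)

# Build a path to the data subfolder relative to that location.
data_dir <- file.path(wd, "data")

# TRUE if the subfolder exists; FALSE suggests the wrong working directory
# or a missing folder.
dir.exists(data_dir)

# List what R can see in the working directory, to compare with the Files pane.
list.files(wd)
```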
Introduction to R
Instructor Note
- The main goal is to introduce users to the various objects in R, from atomic types to creating your own objects.
- While this episode is foundational, be careful not to get caught in the weeds as the variety of types and operations can be overwhelming for new users, especially before they understand how this fits into their own “workflow.”
Note on
Learners sometimes type `x<-5` intending a logical test
“is x less than -5?”.
In R, `x<-5` is parsed as an assignment
because `<-` is a single token.
To resolve this, encourage either spacing around operators or parentheses to avoid ambiguity:
- Logical test: `x < -5` (note the space between “<” and the “-” negative)
- Assignment: `x <- 5`
- Alternative for clarity: `x < (-5)` (explicit negative value)
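The difference can be shown live with a sketch like the following; the starting value of 10 and the object names are arbitrary choices for illustration.

```r
x <- 10

# With a space, this is a comparison: "is x less than -5?"
is_less <- x < -5
print(is_less)  # FALSE
print(x)        # 10, x is unchanged

# Without spaces, `<-` is read as assignment: x now holds 5.
x<-5
print(x)        # 5

# Parentheses make the comparison explicit.
is_less_paren <- x < (-5)
print(is_less_paren)  # FALSE
```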
Choose how to teach this section
The section on generative AI is intended to be concise but Instructors may choose to devote more time to the topic in a workshop. Depending on your own level of experience and comfort with talking about and using these tools, you could choose to do any of the following:
- Explain how large language models work and are trained, and/or the difference between generative AI, other forms of AI that currently exist, and the limits of what LLMs can do (e.g., they can’t “reason”).
- Demonstrate how you recommend that learners use generative AI.
- Discuss the ethical concerns listed below, as well as others that you are aware of, to help learners make an informed choice about whether or not to use generative AI tools.
This is a fast-moving technology. If you are preparing to teach this section and you feel it has become outdated, please open an issue on the lesson repository to let the Maintainers know and/or a pull request to suggest updates and improvements.
Intro to Quarto (Optional)
Instructor Note
At this point you may want to explain the different chunk options used in the above code chunk, and what they do. You can also explain that there are many more options available, and that you can find them in the Quarto documentation. Since we have not yet covered how to import data, it would be good to move into the next episode before explaining the code in the chunk, and then come back to it to explain the chunk options.
Instructor Note
From now on you can advise learners to use a Quarto document for the subsequent episodes. You can briefly discuss what Quarto makes possible (see next sections), but not in detail.
Starting with Data
Instructor Note
The main goal for this lesson is:
- To make sure that learners are comfortable with working with data frames, and can use the bracket notation to select slices/columns.
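A compact bracket-notation demonstration could use a toy data frame like the one below. The values are made up for illustration; only the column names `village`, `no_membrs`, and `rooms` echo the SAFI data.

```r
# Toy data with made-up values; not the real SAFI records.
toy <- data.frame(
  village   = c("God", "Chirodzo", "Ruaca"),
  no_membrs = c(3, 7, 10),
  rooms     = c(1, 2, 3),
  stringsAsFactors = FALSE
)

first_cell   <- toy[1, 2]         # row 1, column 2: a single value
first_rows   <- toy[1:2, ]        # rows 1 to 2, all columns: a data frame
village_df   <- toy["village"]    # one column, still a data frame
village_vec  <- toy[["village"]]  # one column, as a vector
village_vec2 <- toy$village       # the same vector, using $
```

Contrasting `toy["village"]` with `toy[["village"]]` tends to be the point where learners need the most time.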
Instructor Note
Demonstrate how to import data by pointing and clicking in RStudio, and show the code that is generated in the console.
Data Wrangling with dplyr
Instructor Note
- This lesson works better if you have graphics demonstrating dplyr commands. You can modify this Google Slides deck and use it for your workshop.
- For this lesson make sure that learners are comfortable using pipes.
- There is also sometimes some confusion on what the arguments of
`group_by` should be, and when to use `filter()` and `select()`.
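To address that confusion, a side-by-side sketch such as the following may be useful. It assumes the dplyr package is installed and uses a toy data frame with made-up values rather than the SAFI file.

```r
library(dplyr)

# Toy data with made-up values; not the real SAFI records.
toy <- data.frame(
  village   = c("God", "God", "Chirodzo", "Ruaca"),
  no_membrs = c(3, 7, 10, 4)
)

# filter() keeps ROWS that meet a condition.
rows_kept <- toy %>% filter(no_membrs > 3)

# select() keeps COLUMNS by name.
cols_kept <- toy %>% select(village)

# group_by() takes the column(s) that define the groups; summarize() then
# returns one row per group.
by_village <- toy %>%
  group_by(village) %>%
  summarize(mean_no_membrs = mean(no_membrs))
```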
Data Visualisation with ggplot2
Instructor Note
- This episode is a broad overview of ggplot2 and focuses on (1)
getting familiar with the layering system of ggplot2, (2) using the
argument `group` in the `aes()` function, (3) basic customization of the plots.
- The episode depends on data created in the Data Wrangling with tidyr episode. If you did not get to or through all of the tidyr episode, you can have the learners access the data by either downloading it or quickly creating it using the tidyr code below. You will probably want to copy the code into the Etherpad.
- If you did skip the tidyr episode, you might want to go over the exporting data section in that episode.
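The three focus points of the episode can be previewed with a small sketch like this one. It assumes the ggplot2 package is installed; the toy data frame and its `visit` column are hypothetical and are not the data set produced in the tidyr episode.

```r
library(ggplot2)

# Toy data with made-up values; `visit` is a hypothetical column.
toy <- data.frame(
  village   = rep(c("God", "Chirodzo"), each = 3),
  visit     = rep(1:3, times = 2),
  no_membrs = c(3, 4, 5, 7, 7, 8)
)

# Step 1: ggplot() sets the data and the aesthetic mappings.
p <- ggplot(data = toy, aes(x = visit, y = no_membrs, group = village))

# Step 2: adding a geom draws the data; `group` gives one line per village.
p_lines <- p + geom_line()

# Step 3: basic customization with axis labels and a theme.
p_final <- p_lines +
  labs(x = "Visit", y = "Household members") +
  theme_bw()

print(p_final)
```

Removing `group = village` from `aes()` and re-running the code shows learners why the argument matters: the points are then joined as a single line.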