Next steps

We have covered various ways to do basic analyze iNaturalist data. Next step is for you to come up with some questions, and analyze the CNC iNaturalist data.

Developing your analysis

Before you start writing your own code

  1. Come up with one or more questions that you are interested in. Pick questions that mean something to you. It is your curiosity that will motivate you to learn how to use R to look for answers to your questions.

  2. Learning a programming language is like learning spoken/written languages - it will take practice to get better. Since this is the first programming experience for many of you, it is perfectly normal to copy others while gradually forming a mental model of what is going on.

  3. Review the online lessons and use Codespace to run the examples in ‘scripts/lesson-scripts’.

  4. We did one exercise in class. Optionally, do the rest of the exercises in the “Working with data”, “Creating maps”, and “Creating charts” sections to get more practice with coding. Click the “Show solution” button on the online lesson to see the solution.

  5. Once you formed some questions, figure out how to go from 191K City Nature Challenge observations to the observation you want. I provided a lot of coding examples, but I don’t expect anyone to use all the examples for their analysis. Think of each code example as one building block. It’s up to you to pick the blocks you need to build your analysis.

    If you want to study one or more species, review the examples that filter by species by common_name and scientific_name. If you want to study a group of species, review the higher taxa examples. If you want to limit the observations to a particular location, review the examples that show filtering observations using geographic boundaries and buffer. If you want to compare observations in different areas, review the examples about counting the number of observations per neighborhood. If you want to study your own observations, review the examples about filtering by iNaturalist user_login. If you want to mimic the data standards used by scientists, review the examples about filter by research grade and licenses. If you want to limit the observations by year, review the examples about filtering by years. If you want to remove some of the biases in community science data, review normalizing iNaturalist data examples. Review the datasets listed on the “Using to other datasets” page to see if any of the datasets are useful for you.

Writing your own code

Once you have a rough idea about what you want to analyze, you can start writing code for your analysis.

  1. Read the “Restart Codespace” section to restart Codespace. Read the “Stop Codespace” section to learn how to stop Codespace at the end of your work session.

  2. Create script files to save your work. Remember to end the script files with .R.

  3. Copy and paste examples provided in the lessons, and adjust them to fit your needs. When you have something that kinda works, save the script file and the results. Save your scripts often since RStudio does not have auto-save functionality. Save the scripts to the scripts folder. Save any CSVs and images you generate to the results folder.

    You can add comments in your scripts to help remind you of what the code does.

# Comments are lines that start with #. Comments are not processed by R. 
# Comments are lines that we add to help explain what is going on.

# read the CNC CSV

# read file that has Expostion park boundaries

# get my research grade observations in Exposition Park.
  1. Data exploration is an iterative process. Chances are, you won’t get your final results on your first day. Try different way to look at the data. Use filter(), select(), count(), and mutate() to change the data. Then create summary tables, charts, and/or maps to find patterns. Repeat the process. If you had an idea for an analysis, but the idea didn’t reveal any interesting patterns, you can modify your analysis. If you find something interesting, modify your query to focus on that pattern. You can create multiple script files as your analysis evolves.

    As you get more practice reading, copying, writing, and editing R code, your mental model what the R code is doing will evolve. Some things that didn’t make sense at first, will begin to click.

  2. As you are analyzing the data, think about how you want to present your results. Since animals, plants, fungi, etc can’t speak for themselves, we need to use our voice to advocate for them. Your presentation should represent your analysis, your voice, and your interests.

  3. Attend the office hours or email me if you have questions.