This is a detailed 8-week learning plan for high school students (sophomores and above). Students are expected to have basic knowledge of programming and statistics. The workload is 2-3 hours per day on weekdays.

### Learning objective

To develop essential skills for data analysis, including visualization, summarization, basic statistical inference, and report writing.

### Content

#### Basic statistics knowledge

• Understand data formats: vector and matrix.
• Understand the concept of a data distribution.
• Summary statistics: mean, median, variance, standard deviation, proportions.
• Understand common data visualizations and how they relate to data distributions and summary statistics: boxplot, histogram, scatter plot, bar plot.
• Concepts of correlation and contingency tables.
• Basic statistical inference: two-group t- and z-tests, simple linear regression.
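
As a rough illustration of how these ideas look in R, here is a minimal sketch. It uses `mtcars`, a dataset that ships with base R; that choice is only an assumption for the example and is not part of the course materials. The variable names (`stats`, `prop_manual`, `tab`, `tt`, `fit`) are likewise made up for illustration.

```r
# Illustrative only: mtcars is a built-in R dataset, not a course dataset.
data(mtcars)

mpg <- mtcars$mpg                       # a numeric vector

# Summary statistics for one continuous variable
stats <- c(mean = mean(mpg), median = median(mpg),
           variance = var(mpg), sd = sd(mpg))
print(round(stats, 2))

# A proportion: share of cars with a manual transmission (am == 1)
prop_manual <- mean(mtcars$am == 1)

# Contingency table of two categorical variables
tab <- table(cylinders = mtcars$cyl, manual = mtcars$am)
print(tab)

# Correlation between two continuous variables
r <- cor(mtcars$wt, mtcars$mpg)

# Two-group t-test: does mpg differ between automatic and manual cars?
tt <- t.test(mpg ~ am, data = mtcars)
print(tt$p.value)

# Simple linear regression of mpg on weight
fit <- lm(mpg ~ wt, data = mtcars)
print(coef(fit))
```

Running `summary(fit)` afterwards prints the full regression table, and `hist(mpg)` or `boxplot(mpg ~ am, data = mtcars)` shows how the plots relate to the numbers computed above.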

#### Programming language

• R: for data analysis and visualization
• Basic usage of RStudio
• Using the R console for basic numeric computation.
• Understand R data types, including vectors, matrices, and data frames.
• File input and output: how to read in files and write results out.
• Basic R graphics, and how to output R figures to a file.
• Basic data analysis: computing summary statistics, simple statistical tests, linear regression.
• Markdown and R Markdown: for writing reports.
• Create Markdown files and convert them to PDF and/or HTML.
• Basic syntax for different types of formatting.
• How to include figures in the document.
• How to include LaTeX-style equations.
• A little bit of LaTeX.
• I don’t expect students to write standalone LaTeX documents, but they need to know how to use LaTeX to write equations and insert them into R Markdown documents.
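
To make the file input/output and figure-export items more concrete, here is a minimal sketch. The `scores` data frame is hypothetical, and all file names come from `tempfile()`, so nothing on disk is overwritten; in practice you would use your own file paths.

```r
# Hypothetical example data, made up for illustration
scores <- data.frame(student = c("A", "B", "C", "D"),
                     score   = c(82, 91, 77, 88))

# Write a csv file, then read it back in
csv_file <- tempfile(fileext = ".csv")
write.csv(scores, csv_file, row.names = FALSE)
scores_in <- read.csv(csv_file)

# Write a tab-delimited text file and read it back
txt_file <- tempfile(fileext = ".txt")
write.table(scores, txt_file, sep = "\t", quote = FALSE, row.names = FALSE)
scores_tab <- read.delim(txt_file)

# Save a figure to a file: open a device, draw the plot, close the device
pdf_file <- tempfile(fileext = ".pdf")
pdf(pdf_file, width = 6, height = 4)
barplot(scores$score, names.arg = scores$student,
        xlab = "Student", ylab = "Score", main = "Scores")
dev.off()
```

The same open-plot-close pattern applies to `png()` and `jpeg()`, which take the width and height in pixels by default rather than inches.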

#### Data analysis

• Analyze a COVID-19 dataset.
• Write a final report in the standard format of a scientific paper.

### Materials

Below are some selected materials, websites, and videos. I’ll also use other materials, detailed in the weekly schedules.

• R
• Markdown
• LaTeX
• R Markdown

### Weekly schedules

Click the link for each week for the detailed learning schedule and homework. All homework needs to be written in R, Markdown, or R Markdown.

• Week 1
• Install R and RStudio.
• Understand R and RStudio basics.
• Understand basic R data types and operators: scalar, vector, matrix. Understand how to combine and subset vectors and matrices.
• Install a Markdown editor. Write a simple markdown document.
• Week 2
• Understand the concept of R packages. Learn to install packages.
• R data frames and lists.
• R file I/O (input/output): Understand tab-delimited and csv files, how to read those files into R, and how to write results out to a text file.
• How to save and load R objects.
• Statistics:
• Understand mean, median, variance, standard deviation, etc. Learn to use R to compute these values.
• Understand the concepts of continuous and categorical data and their distributions. Understand the meaning of histograms, boxplots, and barplots.
• Basic R graphics. Understand the meaning of different types of plots. Use R to generate the plots.
• Learn to save figures from an R session to pdf/jpg/png/etc.
• Markdown: experiment with different font styles, ordered and unordered lists, and including figures. Write a complete Markdown document.
• Week 3
• R programming control statements: loops, if-else.
• R graphics: plotting with colors, line types, point types, legends, etc.
• Basic LaTeX.
• R Markdown.
• Week 4
• Random number generators in R.
• Overlay lines on a scatterplot.
• Multiple panels in one figure.
• Colors in R.
• Figure margins.
• Basic statistics:
• Concepts of random variables and probability distributions.
• Learn LaTeX math symbols. Insert LaTeX equations in R Markdown.
• Week 5
• Concepts of correlation and contingency tables. Relationships among two or more variables (continuous and/or categorical).
• Use scatterplots and boxplots to explore relationships among two or more variables.
• Understand the concept of simple linear regression.
• Use R to do linear regression.
• Write R functions.
• Week 6
• Linux system and commands.
• R graphics, including base and ggplot2.
• Week 7
• R animation
• Review some earlier content.
• Week 8
• Write a review of the whole course.
• COVID-19 data analysis. Write a report.
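
Several of the programming topics in the schedule above (random number generation, writing a function, loops with if-else, overlaying a line on a scatterplot, multiple panels, colors, and figure margins) can be combined in one short sketch. Everything here is simulated, so the data, the helper `std_error`, and the parameter values are assumptions made up for illustration rather than course material.

```r
set.seed(1)                              # make the random numbers reproducible

# A small user-written function: standard error of the mean
std_error <- function(x) sd(x) / sqrt(length(x))

# Simulated data (hypothetical): y depends linearly on x plus random noise
n <- 100
x <- runif(n, min = 0, max = 10)
y <- 2 + 0.5 * x + rnorm(n, sd = 1)
fit <- lm(y ~ x)

# Loop with if-else: label each sample size and report a standard error
sizes <- c(10, 50, 200)
for (m in sizes) {
  label <- if (m < 30) "small" else "large"
  cat(m, "is a", label, "sample; SE of mean:",
      round(std_error(rnorm(m)), 3), "\n")
}

# Two panels in one figure, written to a temporary PDF file
pdf_file <- tempfile(fileext = ".pdf")
pdf(pdf_file, width = 8, height = 4)
old_par <- par(mfrow = c(1, 2),          # 1 row, 2 columns of panels
               mar = c(4, 4, 2, 1))      # margins: bottom, left, top, right
hist(y, col = "lightblue", main = "Histogram of y")
plot(x, y, pch = 19, col = "grey40", main = "Scatterplot with fitted line")
abline(fit, col = "red", lwd = 2, lty = 2)   # overlay the regression line
legend("topleft", legend = "least-squares fit",
       col = "red", lty = 2, lwd = 2)
par(old_par)                             # restore the previous settings
dev.off()
```

Because the data are simulated with a true slope of 0.5, `coef(fit)` gives a quick check that the regression recovers roughly that value.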