Work with R so that it flows

Writing scripts

Include basic information at the top of every script: give it a name, say when you created it, note where it last ran, etc. This will help you trace different versions, cope with problems that arise from software updates, and get credit for your work.

###################################################################################################
# File-Name:   advR_workFlow.r      
# Date:        20/03/27                             
# Author:      DD                                                               
# Machine:     DD's MacBook Air
###################################################################################################
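The "where it ran last" part of such a header can also be recorded by R itself. The following is a minimal sketch using only base R; the name run_info is illustrative and not part of the original script.

```r
# Collect basic facts about the current run (run_info is an illustrative name).
run_info <- list(
  date      = format(Sys.Date(), "%Y-%m-%d"),  # date of this run
  machine   = Sys.info()[["nodename"]],        # name of the computer
  r_version = R.version.string                 # version of R in use
)
str(run_info)

# sessionInfo() additionally lists the attached packages and their versions.
print(sessionInfo())
```

Pasting this output into a comment at the end of the script, or into your notes, makes it easier to trace problems back to a software update later.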

Many commands in R require you to install and load additional packages, which provide more tools than base R offers (and more than the packages RStudio loads automatically). Some packages you will need almost every time, some only occasionally. Keeping a running list at the beginning of your script keeps them handy and also reminds you when you extended your toolkit with a new package. It also makes your analysis shareable and replicable. When you share a file, avoid commands that change settings on other people's computers (e.g. install.packages() or setwd()).

First, define a running list of potentially required packages, then install any that are missing with for (pkg in pkgs) install.packages(pkg), and load the packages. If you do not want to load a package because you only need it once, prefix the function with the package name, e.g. dplyr::mutate(data, newVariable = 1). This calls the function from the installed package, here dplyr, for just that one command without attaching the package.
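One way to reconcile "install if necessary" with the advice not to change other people's computers is to check for missing packages and report them rather than installing them silently. This is only a sketch under that assumption: the packages stats and utils ship with R and stand in for your own list, and the names is_installed and missing_pkgs are illustrative.

```r
# Packages the script needs (stats and utils ship with R, so this runs anywhere;
# replace them with your own running list).
pkgs <- c("stats", "utils")

# requireNamespace() returns TRUE/FALSE without attaching the package.
is_installed <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)
missing_pkgs <- pkgs[!is_installed]

if (length(missing_pkgs) > 0) {
  # Tell the user what to install instead of installing on their behalf.
  stop("Please install: ", paste(missing_pkgs, collapse = ", "))
}

# Attach everything that is needed throughout the script.
for (pkg in pkgs) library(pkg, character.only = TRUE)

# Call a single function without attaching its package, using ::
stats::median(c(1, 5, 9))
```

Stopping with a message leaves the decision to install with the person running the script.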

Occasionally you will get error messages caused by outdated package versions; it helps to run remove.packages('name_package') and re-install the package. Occasionally you will also load packages that use the same name for functions that do different things. With the conflicted package, you can tell R which package's function to prefer.
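Before removing and re-installing a package, it can help to check which version is actually installed. A minimal sketch, assuming the package of interest is installed: stats stands in for any package here, and the "3.0.0" threshold is purely illustrative.

```r
# Version of an installed package ("stats" ships with R and is a stand-in).
installed_version <- utils::packageVersion("stats")
print(installed_version)

# Compare against a minimum version (the threshold is purely illustrative).
if (installed_version < "3.0.0") {
  message("stats is older than 3.0.0; consider re-installing it.")
}
```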

pkgs <- c("tidyverse","dplyr","ggplot2",'conflicted','rmarkdown')
for (pkg in pkgs) library(pkg, character.only = TRUE)

## ── Attaching packages ─────────────────────────────────────────────────────────────────────── tidyverse 1.2.1 ──

## ✔ ggplot2 3.2.1     ✔ purrr   0.3.2
## ✔ tibble  2.1.3     ✔ dplyr   0.8.3
## ✔ tidyr   1.0.0     ✔ stringr 1.4.0
## ✔ readr   1.3.1     ✔ forcats 0.4.0

## ── Conflicts ────────────────────────────────────────────────────────────────────────── tidyverse_conflicts() ──
## ✖ dplyr::filter() masks stats::filter()
## ✖ dplyr::lag()    masks stats::lag()

conflict_prefer("filter", "dplyr")

## [conflicted] Will prefer dplyr::filter over any other package

conflict_prefer("select", "dplyr")

## [conflicted] Will prefer dplyr::select over any other package

conflict_prefer("geom_errorbarh", "ggplot2")

## [conflicted] Will prefer ggplot2::geom_errorbarh over any other package
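To see for yourself which attached packages define a given name, base R offers find() in the utils package. This is a sketch rather than part of the original workflow; what it returns depends on which packages are attached in your session, and the name hits is illustrative.

```r
# List every attached package (or environment) that defines something called "filter".
hits <- utils::find("filter")
print(hits)

# More than one entry means the name is defined in several places,
# which is the situation conflict_prefer() is meant to resolve.
if (length(hits) > 1) {
  message("'filter' is defined in: ", paste(hits, collapse = ", "))
}
```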

data <- read.csv('fakeData.csv')
#ggplot(dota=data) + geom_point(mapping=aes(x=varInd,y=var))
ggplot(data=data) + geom_point(mapping=aes(x=varInd,y=var))

#fliter(data,var>5)
filter(data,var>3)

##    var cat   varCorr varWeakCorr    varInd varNonMon  varBiRaw varOutlier
## 1    5   0  6.802379    8.679899  5.445352     12.75 0.6922094  -1.666667
## 2   15   0 12.806620   18.433080  9.987334      2.75 0.1257472  50.000000
## 3   12   1 11.591310    8.500076  7.071755     11.00 0.6499108  -4.000000
## 4    6   1  1.205441    1.811344  3.001570     14.00 0.6948789  -2.000000
## 5   14   1 18.475360   17.343190 11.915390      6.00 0.3620667  -4.666667
## 6    9   0  9.927052   18.557400  5.372198     14.75 0.1283483  -3.000000
## 7   11   0 14.020340   14.084650  8.714008     12.75 0.4842256  -3.666667
## 8   15   1 15.447200   14.447050 11.565670      2.75 0.1523629  50.000000
## 9   13   1 14.138830   15.123470 10.162380      8.75 0.1706715  -4.333333
## 10  13   0  8.874312   13.891430 12.218340      8.75 0.6864638  -4.333333
##    varOutlierNoise varBi    varExp
## 1        15.591280     1  2.718282
## 2        68.350900     0 20.085540
## 3        -0.429595     1 11.023180
## 4        13.922400     1  3.320117
## 5        -4.481500     0 16.444650
## 6         3.763536     0  6.049647
## 7         9.527660     0  9.025014
## 8        61.923960     0 20.085540
## 9         9.865591     0 13.463740
## 10        3.943735     1 13.463740