Mandy Vogel
23.05.2016
?lm
apropos("linear")
Besides all LIFE publications, the Zotero group also has a statistics section: https://www.zotero.org/groups/unimedleipzig/items
[…] Rstudio, which I believe is the best development environment for most R users. The only real competitor is Emacs Speaks Statistics (ESS).
PS: I love ESS, at least most of the time…
version
_
platform x86_64-pc-linux-gnu
arch x86_64
os linux-gnu
system x86_64, linux-gnu
status
major 3
minor 3.0
year 2016
month 05
day 03
svn rev 70573
language R
version.string R version 3.3.0 (2016-05-03)
nickname Supposedly Educational
RStudio.Version()
2 + 2
[1] 4
sqrt(4)
[1] 2
x <- 2
y <- 2
x
[1] 2
y
[1] 2
x + y
[1] 4
a <- "b"
b <- "Wort"
sentence <- "Hello world"
a
[1] "b"
b
[1] "Wort"
sentence
[1] "Hello world"
The logical values are TRUE and FALSE.
x > 3
[1] FALSE
x == 3
[1] FALSE
x < 3
[1] TRUE
a == "a"
[1] FALSE
a == "b"
[1] TRUE
v <- c(1,4,7,2)
v
[1] 1 4 7 2
length(v)
[1] 4
R has some built-in vectors that contain frequently used information. Here are some of them:
Get the length of these vectors!
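The list of vectors did not survive in this copy of the slides, so which ones were meant is an assumption. A minimal sketch using four constant vectors that ship with base R (`letters`, `LETTERS`, `month.name`, `month.abb`); the names `n_letters` and `n_months` are only illustrative:

```r
# Built-in constant vectors in base R
# (assumed to be the kind of vectors meant above)
letters      # lower-case letters of the Roman alphabet
LETTERS      # upper-case letters of the Roman alphabet
month.name   # English month names
month.abb    # three-letter month abbreviations

# length() returns the number of elements in a vector
n_letters <- length(letters)
n_months  <- length(month.name)
n_letters
n_months
```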
The function colors() produces a vector containing the names of all predefined colors available in R.
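A short sketch of how this vector might be inspected. The variable names `cols` and `has_red` are only illustrative and not part of the original slides:

```r
# colors() returns a character vector of color names
cols <- colors()

class(cols)    # type of the vector
head(cols)     # first six color names
length(cols)   # number of predefined color names

# Check whether a name is available before using it in a plot
has_red <- "red" %in% cols
has_red
```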
data(iris)
summary(iris)
Sepal.Length Sepal.Width Petal.Length Petal.Width
Min. :4.300 Min. :2.000 Min. :1.000 Min. :0.100
1st Qu.:5.100 1st Qu.:2.800 1st Qu.:1.600 1st Qu.:0.300
Median :5.800 Median :3.000 Median :4.350 Median :1.300
Mean :5.843 Mean :3.057 Mean :3.758 Mean :1.199
3rd Qu.:6.400 3rd Qu.:3.300 3rd Qu.:5.100 3rd Qu.:1.800
Max. :7.900 Max. :4.400 Max. :6.900 Max. :2.500
Species
setosa :50
versicolor:50
virginica :50
The data viewer provides:
Caution: the number of rows and the number of columns shown in the data viewer are limited!
Type the following commands. What do they do?
head(iris)
names(iris)
?iris
nrow(iris)
ncol(iris)
plot(iris)
The first way to access a column of a data frame is to type the name of the data frame, followed by a dollar sign, followed by the name of the column. HINT: RStudio provides very comprehensive autocompletion; press the TAB key to use it.
iris$Petal.Length
[1] 1.4 1.4 1.3 1.5 1.4 1.7 1.4 1.5 1.4 1.5 1.5 1.6 1.4 1.1 1.2 1.5 1.3
[18] 1.4 1.7 1.5 1.7 1.5 1.0 1.7 1.9 1.6 1.6 1.5 1.4 1.6 1.6 1.5 1.5 1.4
[35] 1.5 1.2 1.3 1.4 1.3 1.5 1.3 1.3 1.3 1.6 1.9 1.4 1.6 1.4 1.5 1.4 4.7
[52] 4.5 4.9 4.0 4.6 4.5 4.7 3.3 4.6 3.9 3.5 4.2 4.0 4.7 3.6 4.4 4.5 4.1
[69] 4.5 3.9 4.8 4.0 4.9 4.7 4.3 4.4 4.8 5.0 4.5 3.5 3.8 3.7 3.9 5.1 4.5
[86] 4.5 4.7 4.4 4.1 4.0 4.4 4.6 4.0 3.3 4.2 4.2 4.2 4.3 3.0 4.1 6.0 5.1
[103] 5.9 5.6 5.8 6.6 4.5 6.3 5.8 6.1 5.1 5.3 5.5 5.0 5.1 5.3 5.5 6.7 6.9
[120] 5.0 5.7 4.9 6.7 4.9 5.7 6.0 4.8 4.9 5.6 5.8 6.1 6.4 5.6 5.1 5.6 6.1
[137] 5.6 5.5 4.8 5.4 5.6 5.1 5.1 5.9 5.7 5.2 5.0 5.2 5.4 5.1
plot(iris$Sepal.Length)
plot(iris$Sepal.Length,iris$Petal.Length)
plot(iris$Species,iris$Petal.Length)
plot(iris$Species)
plot(iris$Petal.Length,iris$Species)
plot(iris$Sepal.Length,iris$Petal.Length,col=iris$Species)
letters[20]
[1] "t"
letters[c(8,5,12,12,15)]
[1] "h" "e" "l" "l" "o"
iris[1,1]
[1] 5.1
iris[c(1,3),1]
[1] 5.1 4.7
iris[1:3,1:3]
Sepal.Length Sepal.Width Petal.Length
1 5.1 3.5 1.4
2 4.9 3.0 1.4
3 4.7 3.2 1.3
iris[1,]
Sepal.Length Sepal.Width Petal.Length Petal.Width Species
1 5.1 3.5 1.4 0.2 setosa
iris[,1]
[1] 5.1 4.9 4.7 4.6 5.0 5.4 4.6 5.0 4.4 4.9 5.4 4.8 4.8 4.3 5.8 5.7 5.4
[18] 5.1 5.7 5.1 5.4 5.1 4.6 5.1 4.8 5.0 5.0 5.2 5.2 4.7 4.8 5.4 5.2 5.5
[35] 4.9 5.0 5.5 4.9 4.4 5.1 5.0 4.5 4.4 5.0 5.1 4.8 5.1 4.6 5.3 5.0 7.0
[52] 6.4 6.9 5.5 6.5 5.7 6.3 4.9 6.6 5.2 5.0 5.9 6.0 6.1 5.6 6.7 5.6 5.8
[69] 6.2 5.6 5.9 6.1 6.3 6.1 6.4 6.6 6.8 6.7 6.0 5.7 5.5 5.5 5.8 6.0 5.4
[86] 6.0 6.7 6.3 5.6 5.5 5.5 6.1 5.8 5.0 5.6 5.7 5.7 6.2 5.1 5.7 6.3 5.8
[103] 7.1 6.3 6.5 7.6 4.9 7.3 6.7 7.2 6.5 6.4 6.8 5.7 5.8 6.4 6.5 7.7 7.7
[120] 6.0 6.9 5.6 7.7 6.3 6.7 7.2 6.2 6.1 6.4 7.2 7.4 7.9 6.4 6.3 6.1 7.7
[137] 6.3 6.4 6.0 6.9 6.7 6.9 5.8 6.8 6.7 6.7 6.3 6.5 6.2 5.9
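The dollar-sign access and the bracket indexing shown above reach the same data. A minimal sketch comparing them; selecting a column by its name inside the brackets is an addition not shown in the original slides, and the variable names are illustrative:

```r
data(iris)

by_name     <- iris$Sepal.Length        # dollar sign and column name
by_position <- iris[, 1]                # all rows, first column
by_label    <- iris[, "Sepal.Length"]   # all rows, column selected by name

# identical() tests whether two objects are exactly equal
identical(by_name, by_position)
identical(by_name, by_label)
```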
R can import data in many forms and from many file types. It is important to remember that almost every file type needs its own function, e.g.
File Type | function to load the data | package |
---|---|---|
.rdata (R's own format) | load() | base |
.csv (English) | read.csv() | base |
.csv (German) | read.csv2() | base |
.txt | read.table() | base |
.xlsx | read_excel() | readxl |
.sav (SPSS) | spss.get() | Hmisc |
.dta (Stata) | stata.get() | Hmisc |
.sas7bdat (SAS) | sas.get() | Hmisc |
.sasxport (SAS Transport Files) | sasxport.get() | Hmisc |
All of this functionality is also available via the Import Dataset dialog (menu: Tools -> Import Dataset, or via the Environment tab).
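As a sketch of the difference between the two .csv rows in the table above: German-style CSV files typically separate fields with semicolons and use a decimal comma, which is what `read.csv2()` expects by default. The file content and the names `tmp`, `german` and `same` below are made up for illustration:

```r
# Write a small German-style CSV to a temporary file:
# semicolon as field separator, comma as decimal mark
tmp <- tempfile(fileext = ".csv")
writeLines(c("name;value", "a;1,5", "b;2,25"), tmp)

# read.csv2() uses sep = ";" and dec = "," by default
german <- read.csv2(tmp)
german
str(german)

# The same result with read.csv() and explicit arguments
same <- read.csv(tmp, sep = ";", dec = ",")
identical(german, same)

unlink(tmp)  # remove the temporary file
```

Checking the column types with `str()` after an import is a quick way to spot a wrong separator or decimal mark.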
For now, we only want to use one of these functions to import a data set. Run the following line of code or use the data import menu…
education <- read.csv("http://www.barrolee.com/data/BL_v2.1/BL2013_MF1599_v2.1.csv")
Run summary() on the data frame.