Function

Three components of a function

body : the code inside the function
formals : the list of arguments passed to the function
environment : the environment in which the function was defined
f <- function(x) x^2
f
#> function(x) x^2

formals(f)
#> $x
body(f)
#> x^2
environment(f)
#> <environment: R_GlobalEnv>
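
The environment component matters most when a function is created inside another function. The sketch below uses a hypothetical factory, make_power, that is not part of the original notes; it shows that the returned function remembers the environment in which it was defined.

```r
# make_power is a hypothetical helper used only for illustration.
make_power <- function(p) {
  function(x) x^p
}

square <- make_power(2)

names(formals(square))                 # the single formal argument, "x"
body(square)                           # x^p
environment(square)                    # an execution environment, not R_GlobalEnv
get("p", envir = environment(square))  # the value of p captured at creation time
square(4)
```

Because p lives in the function's environment rather than in its body, body(square) still prints x^p.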

Three base types of functions

Closure (user-defined functions and most other functions, such as print and mean)
Built-in (sum, max, +, -, etc.)
Special (if, for, function, log, etc.)

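Built-in & Special functions

  • They call C code directly through .Primitive()
  • Other base functions reach C code through .Internal() or .External(), but their R-level wrappers are ordinary closures
  • Primitive functions exist only in the base package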
formals(sum)
#> NULL
body(sum)
#> NULL
environment(sum)
#> NULL
typeof(sum)
#> [1] "builtin"

formals(`log`)
#> NULL
body(`log`)
#> NULL
environment(`log`)
#> NULL
typeof(`log`)
#> [1] "special"
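
A quick way to tell the three types apart is to combine typeof() with is.primitive(). The sketch below uses only base R functions.

```r
typeof(mean)        # "closure"
typeof(sum)         # "builtin"
typeof(`if`)        # "special"

is.primitive(sum)   # TRUE
is.primitive(`if`)  # TRUE
is.primitive(mean)  # FALSE

# paste reaches C code through .Internal(), yet its R-level wrapper is a closure.
typeof(paste)       # "closure"
is.primitive(paste) # FALSE
```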

Every operation is a function call

To understand computations in R, two slogans are helpful:

  1. Everything that exists is an object.
  2. Everything that happens is a function call.
x <- 10; y <- 5
x + y
#> [1] 15
`+`(x, y)
#> [1] 15
for (i in 1:2) print(i)
#> [1] 1
#> [1] 2
`for`(i, 1:2, print(i))
#> [1] 1
#> [1] 2
if (i == 1) print("yes!") else print("no.")
#> [1] "no."
`if`(i == 1, print("yes!"), print("no."))
#> [1] "no."
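
Because operators are ordinary functions, they can be passed to other functions by their backticked names. A small sketch, assuming nothing beyond base R:

```r
x <- list(1:3, 4:6)

# `[` is the subsetting function: take the second element of each vector.
second_elements <- sapply(x, `[`, 2)
second_elements

# `+` can be applied like any other two-argument function.
shifted <- sapply(1:3, `+`, 10)
shifted

# Even assignment is a function call.
`<-`(z, 5)
z
```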

Passing function arguments as a list

args <- list(1:10, na.rm = TRUE)
do.call(mean, args)
#> [1] 5.5
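
do.call() is most useful when the argument list is built at run time. The following sketch shows named and unnamed elements being matched just as in a normal call; the variable names are illustrative only.

```r
# Unnamed elements are matched by position, named elements by name.
args <- list("a", "b", sep = "-")
joined <- do.call(paste, args)
joined

# The function can also be given as a character string.
total <- do.call("sum", list(1, 2, 3))
total

# A common use: row-bind a list of data frames of unknown length.
pieces <- list(data.frame(id = 1:2), data.frame(id = 3:4))
combined <- do.call(rbind, pieces)
nrow(combined)
```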

Default and missing values

f <- function(a = 1, b = 2) {
  c(a, b)
}
f()
#> [1] 1 2
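
The example above covers default values only. As a minimal sketch of the "missing" side, the hypothetical function h below uses missing() to detect whether an argument was supplied, and a default that refers to another argument (defaults are evaluated lazily, inside the function).

```r
# h is a hypothetical function used only for illustration.
h <- function(a, b = a * 2) {
  b_was_missing <- missing(b)  # TRUE when the caller did not supply b
  list(a = a, b = b, b_was_missing = b_was_missing)
}

res_default <- h(1)      # b falls back to a * 2
res_explicit <- h(1, 5)  # b supplied by the caller

str(res_default)
str(res_explicit)
```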

Special types of function

  1. Infix function
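    • The function name sits between its arguments, as with + and *
    • A user-defined infix function must have a name that starts and ends with %
    • Built-in infix operators need no % (+, -, &, etc.)
# Define a function that concatenates strings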
`%+%` <- function(a, b) paste0(a, b)
"new" %+% " string"
#> [1] "new string"
`%+%`("new", "string")
#> [1] "newstring"
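
User-defined infix functions can be chained, because they are evaluated from left to right. The sketch below redefines %+% so that it is self-contained and adds a hypothetical %or% operator that supplies a fallback for NULL.

```r
`%+%` <- function(a, b) paste0(a, b)

# Evaluated as ("a" %+% "b") %+% "c"
chained <- "a" %+% "b" %+% "c"
chained

# %or% is a hypothetical operator: return b only when a is NULL.
`%or%` <- function(a, b) if (is.null(a)) b else a

fallback <- NULL %or% "fallback"
kept <- "value" %or% "fallback"
fallback
kept
```
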
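  2. Replacement function
    • A special function that appears to modify its argument in place
    • The function name must have the form xxx<-
    • It must take the form below: the last argument is named value, and the modified object is returned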
x <- 1:10
`modify<-` <- function(x, position, value) {
  x[position] <- value
  x
}
# Changes the first element of x without an explicit reassignment.
modify(x, 1) <- 10
# Equivalent to x <- `modify<-`(x, 1, 10).

x
#>  [1] 10  2  3  4  5  6  7  8  9 10
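
The simplest replacement function takes only x and value. The sketch below defines a hypothetical `second<-` and checks that the assignment form and the explicit call form give the same result.

```r
# `second<-` is a hypothetical replacement function used only for illustration.
# The last argument must be named value, and the modified object must be returned.
`second<-` <- function(x, value) {
  x[2] <- value
  x
}

v <- c(1, 2, 3)
second(v) <- 20          # assignment form
v

w <- c(1, 2, 3)
w <- `second<-`(w, 20)   # explicit call form
identical(v, w)
```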

Example of a replacement function: names<-

x <- c(a = 1, b = 2, c = 3)
names(x)
#> [1] "a" "b" "c"
names(x)[2] <- "two"
names(x)
#> [1] "a"   "two" "c"

# Internally, this is evaluated roughly as follows
# `*tmp*` <- names(x)
# `*tmp*`[2] <- "two"
# names(x) <- `*tmp*`
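
The same expansion can be written as nested function calls. The sketch below compares the nested assignment with an explicit combination of `names<-` and `[<-`; it is meant as an illustration of the equivalence, not as the exact internal implementation.

```r
x <- c(a = 1, b = 2, c = 3)
y <- x

# Nested replacement syntax
names(x)[2] <- "two"

# The same operation spelled out as function calls
y <- `names<-`(y, value = `[<-`(names(y), 2, value = "two"))

names(y)
identical(x, y)
```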

Renaming a data frame column

df <- data.frame(a = c(1:3), b = c(2:4), c = c(3:5))
colnames(df)[2] <- 'd'
df
#>   a d c
#> 1 1 2 3
#> 2 2 3 4
#> 3 3 4 5
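
Renaming by position breaks if the column order changes. A slightly more robust sketch selects the column by its current name instead:

```r
df <- data.frame(a = 1:3, b = 2:4, c = 3:5)

# Replace the name of whichever column is currently called "b".
names(df)[names(df) == "b"] <- "d"

names(df)
```

If no column is named "b", the logical index selects nothing and the names are left unchanged.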

Exercise

  1. Implement bootstrap + random feature selection
    • Generate 10 samples (using a user-defined function and replicate from the base package)
make_bootstraps <- function(data, n, row_perc, col_perc, label) {
  # Draw one bootstrap sample: rows with replacement plus a random feature subset
  make_one_bootstrap <- function(data, row_perc, col_perc, label) {
    num_row <- nrow(data)
    num_col <- length(data) # counts every column, including the label columns

    # Sample row indices with replacement (the size is truncated to an integer)
    rows_to_keep <- sample(num_row, floor(row_perc * num_row), replace = TRUE)

    # Feature columns are all columns of data that are not labels
    cols_names <- colnames(data)
    features <- cols_names[!(cols_names %in% label)]
    cols_to_keep <- c(label, sample(features, floor(col_perc * num_col)))

    data[rows_to_keep, cols_to_keep]
  }
  # replicate() re-evaluates the expression n times; keep the results as a list
  replicate(n,
    make_one_bootstrap(data, row_perc, col_perc, label = label),
    simplify = FALSE
  )
}

library(mlbench)
data("BreastCancer")
bootstraps <- make_bootstraps(BreastCancer,
  n = 10,
  row_perc = 0.5,
  col_perc = 0.7,
  label = c("Id", "Class")
)
str(bootstraps)
#> List of 10
#>  $ :'data.frame':    349 obs. of  9 variables:
#>   ..$ Id             : chr [1:349] "1276091" "1212422" "566509" "1240603" ...
#>   ..$ Class          : Factor w/ 2 levels "benign","malignant": 1 1 1 1 1 1 1 1 1 1 ...
#>   ..$ Cl.thickness   : Ord.factor w/ 10 levels "1"<"2"<"3"<"4"<..: 3 3 5 3 3 5 1 1 3 5 ...
#>   ..$ Bare.nuclei    : Factor w/ 10 levels "1","2","3","4",..: 1 1 1 1 1 1 1 1 5 1 ...
#>   ..$ Mitoses        : Factor w/ 9 levels "1","2","3","4",..: 1 1 1 1 1 1 1 1 1 2 ...
#>   ..$ Normal.nucleoli: Factor w/ 10 levels "1","2","3","4",..: 1 1 1 1 1 2 1 1 1 1 ...
#>   ..$ Cell.shape     : Ord.factor w/ 10 levels "1"<"2"<"3"<"4"<..: 1 1 1 1 1 1 2 1 1 1 ...
#>   ..$ Cell.size      : Ord.factor w/ 10 levels "1"<"2"<"3"<"4"<..: 1 1 1 1 1 1 2 1 1 1 ...
#>   ..$ Marg.adhesion  : Ord.factor w/ 10 levels "1"<"2"<"3"<"4"<..: 3 1 1 1 2 3 1 2 1 1 ...
#>  $ :'data.frame':    349 obs. of  9 variables:
#>   ..$ Id             : chr [1:349] "1295508" "1126417" "1182404" "1299596" ...
#>   ..$ Class          : Factor w/ 2 levels "benign","malignant": 1 2 1 1 1 1 2 1 1 2 ...
#>   ..$ Marg.adhesion  : Ord.factor w/ 10 levels "1"<"2"<"3"<"4"<..: 1 1 1 1 1 1 3 1 1 4 ...
#>   ..$ Epith.c.size   : Ord.factor w/ 10 levels "1"<"2"<"3"<"4"<..: 2 3 2 2 2 3 4 2 2 6 ...
#>   ..$ Mitoses        : Factor w/ 9 levels "1","2","3","4",..: 1 3 1 1 1 1 8 1 1 1 ...
#>   ..$ Bare.nuclei    : Factor w/ 10 levels "1","2","3","4",..: 4 4 1 1 1 NA 9 1 1 1 ...
#>   ..$ Cell.size      : Ord.factor w/ 10 levels "1"<"2"<"3"<"4"<..: 1 6 1 1 1 1 3 1 1 4 ...
#>   ..$ Bl.cromatin    : Factor w/ 10 levels "1","2","3","4",..: 1 3 3 1 3 1 8 2 2 4 ...
#>   ..$ Normal.nucleoli: Factor w/ 10 levels "1","2","3","4",..: 1 2 2 1 1 1 9 1 1 3 ...
#>  $ :'data.frame':    349 obs. of  9 variables:
#>   ..$ Id           : chr [1:349] "466906" "640712" "1258549" "1211202" ...
#>   ..$ Class        : Factor w/ 2 levels "benign","malignant": 1 1 2 2 1 2 1 1 2 1 ...
#>   ..$ Epith.c.size : Ord.factor w/ 10 levels "1"<"2"<"3"<"4"<..: 2 2 10 10 2 10 2 2 4 2 ...
#>   ..$ Cl.thickness : Ord.factor w/ 10 levels "1"<"2"<"3"<"4"<..: 1 1 9 7 1 10 4 5 5 1 ...
#>   ..$ Marg.adhesion: Ord.factor w/ 10 levels "1"<"2"<"3"<"4"<..: 1 1 10 10 1 2 1 1 10 1 ...
#>   ..$ Bl.cromatin  : Factor w/ 10 levels "1","2","3","4",..: 1 2 10 4 2 5 3 2 5 1 ...
#>   ..$ Cell.shape   : Ord.factor w/ 10 levels "1"<"2"<"3"<"4"<..: 1 1 10 10 3 10 1 1 10 1 ...
#>   ..$ Cell.size    : Ord.factor w/ 10 levels "1"<"2"<"3"<"4"<..: 1 1 10 5 1 10 1 1 10 1 ...
#>   ..$ Bare.nuclei  : Factor w/ 10 levels "1","2","3","4",..: 1 1 10 10 1 10 1 1 10 1 ...
#>  $ :'data.frame':    349 obs. of  9 variables:
#>   ..$ Id             : chr [1:349] "646904" "1115293" "646904" "1321264" ...
#>   ..$ Class          : Factor w/ 2 levels "benign","malignant": 1 1 1 1 1 2 1 1 1 2 ...
#>   ..$ Bare.nuclei    : Factor w/ 10 levels "1","2","3","4",..: 1 2 1 1 4 2 1 1 1 4 ...
#>   ..$ Bl.cromatin    : Factor w/ 10 levels "1","2","3","4",..: 3 2 3 2 3 4 2 3 1 8 ...
#>   ..$ Normal.nucleoli: Factor w/ 10 levels "1","2","3","4",..: 1 1 1 1 7 10 1 1 1 10 ...
#>   ..$ Epith.c.size   : Ord.factor w/ 10 levels "1"<"2"<"3"<"4"<..: 2 2 2 1 3 6 1 2 2 6 ...
#>   ..$ Cell.size      : Ord.factor w/ 10 levels "1"<"2"<"3"<"4"<..: 1 1 1 2 8 8 1 1 1 10 ...
#>   ..$ Cell.shape     : Ord.factor w/ 10 levels "1"<"2"<"3"<"4"<..: 1 1 1 2 8 8 3 1 1 7 ...
#>   ..$ Mitoses        : Factor w/ 9 levels "1","2","3","4",..: 1 1 1 1 1 4 1 1 1 2 ...
#>  $ :'data.frame':    349 obs. of  9 variables:
#>   ..$ Id             : chr [1:349] "1324572" "1026122" "1272166" "1201870" ...
#>   ..$ Class          : Factor w/ 2 levels "benign","malignant": 1 1 1 1 1 2 1 1 1 1 ...
#>   ..$ Cell.size      : Ord.factor w/ 10 levels "1"<"2"<"3"<"4"<..: 1 1 1 1 2 10 1 2 1 1 ...
#>   ..$ Epith.c.size   : Ord.factor w/ 10 levels "1"<"2"<"3"<"4"<..: 2 2 2 1 2 4 2 2 2 2 ...
#>   ..$ Cl.thickness   : Ord.factor w/ 10 levels "1"<"2"<"3"<"4"<..: 5 2 5 4 4 10 2 3 6 4 ...
#>   ..$ Bare.nuclei    : Factor w/ 10 levels "1","2","3","4",..: 1 1 1 1 2 5 1 1 1 1 ...
#>   ..$ Bl.cromatin    : Factor w/ 10 levels "1","2","3","4",..: 2 1 1 2 3 8 3 1 1 1 ...
#>   ..$ Normal.nucleoli: Factor w/ 10 levels "1","2","3","4",..: 2 1 1 1 1 10 1 1 1 1 ...
#>   ..$ Marg.adhesion  : Ord.factor w/ 10 levels "1"<"2"<"3"<"4"<..: 1 1 1 3 1 6 1 3 3 3 ...
#>  $ :'data.frame':    349 obs. of  9 variables:
#>   ..$ Id             : chr [1:349] "1241232" "733823" "873549" "543558" ...
#>   ..$ Class          : Factor w/ 2 levels "benign","malignant": 1 2 2 2 2 2 1 2 2 1 ...
#>   ..$ Mitoses        : Factor w/ 9 levels "1","2","3","4",..: 1 1 3 1 8 1 1 2 1 1 ...
#>   ..$ Normal.nucleoli: Factor w/ 10 levels "1","2","3","4",..: 1 1 5 10 9 8 7 10 10 1 ...
#>   ..$ Epith.c.size   : Ord.factor w/ 10 levels "1"<"2"<"3"<"4"<..: 2 2 3 4 4 6 4 3 4 2 ...
#>   ..$ Cell.shape     : Ord.factor w/ 10 levels "1"<"2"<"3"<"4"<..: 4 6 5 3 8 5 4 7 10 1 ...
#>   ..$ Cl.thickness   : Ord.factor w/ 10 levels "1"<"2"<"3"<"4"<..: 3 5 10 6 8 9 5 6 8 3 ...
#>   ..$ Bl.cromatin    : Factor w/ 10 levels "1","2","3","4",..: 3 4 3 5 8 4 4 8 7 3 ...
#>   ..$ Bare.nuclei    : Factor w/ 10 levels "1","2","3","4",..: NA 10 7 5 9 10 5 10 4 1 ...
#>  $ :'data.frame':    349 obs. of  9 variables:
#>   ..$ Id             : chr [1:349] "1218105" "1212232" "673637" "1136142" ...
#>   ..$ Class          : Factor w/ 2 levels "benign","malignant": 2 1 1 1 2 1 2 1 1 1 ...
#>   ..$ Mitoses        : Factor w/ 9 levels "1","2","3","4",..: 5 1 1 1 1 1 2 1 1 1 ...
#>   ..$ Bare.nuclei    : Factor w/ 10 levels "1","2","3","4",..: 10 1 5 1 4 1 10 NA 2 1 ...
#>   ..$ Bl.cromatin    : Factor w/ 10 levels "1","2","3","4",..: 7 2 5 2 8 3 1 2 7 3 ...
#>   ..$ Cell.size      : Ord.factor w/ 10 levels "1"<"2"<"3"<"4"<..: 10 1 1 1 10 1 3 1 1 1 ...
#>   ..$ Cell.shape     : Ord.factor w/ 10 levels "1"<"2"<"3"<"4"<..: 10 1 1 1 10 2 5 3 1 1 ...
#>   ..$ Marg.adhesion  : Ord.factor w/ 10 levels "1"<"2"<"3"<"4"<..: 9 1 1 1 6 1 4 1 1 3 ...
#>   ..$ Normal.nucleoli: Factor w/ 10 levels "1","2","3","4",..: 10 1 1 1 5 1 6 1 1 1 ...
#>  $ :'data.frame':    349 obs. of  9 variables:
#>   ..$ Id             : chr [1:349] "1182404" "1343068" "1200847" "1253955" ...
#>   ..$ Class          : Factor w/ 2 levels "benign","malignant": 1 2 2 2 1 1 1 1 1 1 ...
#>   ..$ Cl.thickness   : Ord.factor w/ 10 levels "1"<"2"<"3"<"4"<..: 3 8 6 8 4 3 3 3 3 4 ...
#>   ..$ Normal.nucleoli: Factor w/ 10 levels "1","2","3","4",..: 1 5 10 10 1 1 1 1 8 1 ...
#>   ..$ Epith.c.size   : Ord.factor w/ 10 levels "1"<"2"<"3"<"4"<..: 2 6 8 5 2 2 2 1 8 2 ...
#>   ..$ Cell.size      : Ord.factor w/ 10 levels "1"<"2"<"3"<"4"<..: 1 4 10 7 1 1 1 2 1 1 ...
#>   ..$ Mitoses        : Factor w/ 9 levels "1","2","3","4",..: 1 2 7 1 1 1 1 1 1 1 ...
#>   ..$ Marg.adhesion  : Ord.factor w/ 10 levels "1"<"2"<"3"<"4"<..: 1 1 10 4 1 1 1 1 3 1 ...
#>   ..$ Bl.cromatin    : Factor w/ 10 levels "1","2","3","4",..: 2 2 10 5 2 2 1 2 5 2 ...
#>  $ :'data.frame':    349 obs. of  9 variables:
#>   ..$ Id             : chr [1:349] "1287775" "1043068" "1303489" "846423" ...
#>   ..$ Class          : Factor w/ 2 levels "benign","malignant": 1 1 1 2 1 2 1 1 2 1 ...
#>   ..$ Epith.c.size   : Ord.factor w/ 10 levels "1"<"2"<"3"<"4"<..: 2 2 2 4 3 2 2 2 5 2 ...
#>   ..$ Bare.nuclei    : Factor w/ 10 levels "1","2","3","4",..: 2 1 1 10 10 3 1 1 10 1 ...
#>   ..$ Cell.shape     : Ord.factor w/ 10 levels "1"<"2"<"3"<"4"<..: 1 1 1 3 3 8 1 1 6 1 ...
#>   ..$ Cell.size      : Ord.factor w/ 10 levels "1"<"2"<"3"<"4"<..: 1 1 1 6 3 5 1 1 6 1 ...
#>   ..$ Mitoses        : Factor w/ 9 levels "1","2","3","4",..: 1 1 1 4 3 5 1 1 3 1 ...
#>   ..$ Bl.cromatin    : Factor w/ 10 levels "1","2","3","4",..: 3 2 2 7 3 2 2 3 6 2 ...
#>   ..$ Normal.nucleoli: Factor w/ 10 levels "1","2","3","4",..: 1 1 1 8 5 1 1 1 8 1 ...
#>  $ :'data.frame':    349 obs. of  9 variables:
#>   ..$ Id             : chr [1:349] "1031608" "867392" "1198641" "803531" ...
#>   ..$ Class          : Factor w/ 2 levels "benign","malignant": 1 1 2 2 1 1 2 1 1 1 ...
#>   ..$ Cell.shape     : Ord.factor w/ 10 levels "1"<"2"<"3"<"4"<..: 1 2 6 10 3 1 5 2 1 1 ...
#>   ..$ Mitoses        : Factor w/ 9 levels "1","2","3","4",..: 1 1 2 1 1 1 1 1 1 1 ...
#>   ..$ Marg.adhesion  : Ord.factor w/ 10 levels "1"<"2"<"3"<"4"<..: 1 1 3 10 1 1 8 1 3 6 ...
#>   ..$ Bl.cromatin    : Factor w/ 10 levels "1","2","3","4",..: 2 2 4 8 2 3 8 2 3 1 ...
#>   ..$ Normal.nucleoli: Factor w/ 10 levels "1","2","3","4",..: 1 1 3 5 3 2 6 1 1 1 ...
#>   ..$ Bare.nuclei    : Factor w/ 10 levels "1","2","3","4",..: 1 1 10 2 NA 1 10 1 1 1 ...
#>   ..$ Cell.size      : Ord.factor w/ 10 levels "1"<"2"<"3"<"4"<..: 1 2 10 10 4 1 6 1 1 1 ...
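
The output above changes on every run because the sampling is random. The sketch below is a reduced, reproducible variant under stated assumptions: it uses the built-in iris data instead of BreastCancer, a hypothetical helper named one_bootstrap, a fixed seed, and it takes col_perc as a fraction of the feature columns only (unlike the exercise code, which uses the total column count).

```r
set.seed(42)  # fix the random number stream so the result is reproducible

# one_bootstrap is a hypothetical helper used only for illustration.
one_bootstrap <- function(data, row_perc, col_perc, label) {
  rows <- sample(nrow(data), floor(row_perc * nrow(data)), replace = TRUE)
  features <- setdiff(colnames(data), label)
  cols <- c(label, sample(features, floor(col_perc * length(features))))
  data[rows, cols, drop = FALSE]
}

samples <- replicate(3,
  one_bootstrap(iris, row_perc = 0.5, col_perc = 0.5, label = "Species"),
  simplify = FALSE
)

# Each sample keeps 75 rows and 3 columns (the label plus 2 of the 4 features).
sapply(samples, dim)
```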