
### Intermediate R - Function Basics (Functions)

2019. 4. 9. 10:23

Function

### Three components of a function

body : the code inside the function
formals : the arguments passed to the function
environment : the environment in which the function was defined
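The environment component is easiest to see on a function that is created inside another function. The sketch below is only an illustration: `make_power`, `exponent`, and `square` are hypothetical names, not part of the original notes.

```r
# A function factory: the returned closure remembers `exponent`
# because its environment is the execution environment of make_power().
make_power <- function(exponent) {
  function(x) x^exponent
}

square <- make_power(2)

names(formals(square))  # the single formal argument, "x"
body(square)            # the unevaluated expression x^exponent
environment(square)     # not the global environment

# The captured value lives in the function's environment.
get("exponent", envir = environment(square))
square(4)
```

Here the body and formals are the same for every function `make_power()` returns; only the environment differs.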
    f <- function(x) x^2
    f
    #> function(x) x^2

    formals(f)
    #> $x
    body(f)
    #> x^2
    environment(f)
    #> <environment: R_GlobalEnv>

### Three base types of function

Closure : user-defined functions and most other functions, such as print and mean
Built-in : sum, max, +, -, etc.
Special : if, for, function, log, etc.

### Built-in & Special function

• They run C code directly through .Primitive()
• Or they call R internals through .Internal or .External
• They exist only in the base package

    formals(sum)
    #> NULL
    body(sum)
    #> NULL
    environment(sum)
    #> NULL
    typeof(sum)
    #> [1] "builtin"

    formals(log)
    #> NULL
    body(log)
    #> NULL
    environment(log)
    #> NULL
    typeof(log)
    #> [1] "special"

### Every operation is a function call

To understand computations in R, two slogans are helpful:

1. Everything that exists is an object.
2. Everything that happens is a function call.

Operators and control-flow keywords can be called like ordinary functions when the name is wrapped in backticks.

    x <- 10; y <- 5
    x + y
    #> [1] 15
    `+`(x, y)
    #> [1] 15

    for (i in 1:2) print(i)
    #> [1] 1
    #> [1] 2
    `for`(i, 1:2, print(i))
    #> [1] 1
    #> [1] 2

    # i is still 2 after the loop, so the else branch runs.
    if (i == 1) print("yes!") else print("no.")
    #> [1] "no."
    `if`(i == 1, print("yes!"), print("no."))
    #> [1] "no."

### Passing function arguments as a list

    args <- list(1:10, na.rm = TRUE)
    do.call(mean, args)
    #> [1] 5.5

### Default and missing value

    f <- function(a = 1, b = 2) {
      c(a, b)
    }
    f()
    #> [1] 1 2

## Special types of function

1. Infix function

• Functions such as + and * whose name sits between the arguments
• A user-defined infix function must have a name that starts and ends with %
• Built-in operators do not need the % (+, !, &, etc.)

    # Define a function that concatenates strings
    `%+%` <- function(a, b) paste0(a, b)
    "new" %+% " string"
    #> [1] "new string"
    `%+%`("new", "string")
    #> [1] "newstring"

2. Replacement function

• A special function that modifies the element at a given position
• The function name must have the form xxx<-
• The definition must follow the pattern below, with value as the last argument and the modified object returned

    x <- 1:10
    `modify<-` <- function(x, position, value) {
      x[position] <- value
      x
    }
    # Changes the first element of x without an explicit reassignment.
    modify(x, 1) <- 10  # same as x <- `modify<-`(x, 1, 10)
    x
    #>  [1] 10  2  3  4  5  6  7  8  9 10

Example of a replacement function: names<-

    x <- c(a = 1, b = 2, c = 3)
    names(x)
    #> [1] "a" "b" "c"
    names(x)[2] <- "two"
    names(x)
    #> [1] "a"   "two" "c"
    # Internally this runs roughly like the code below
    # `*tmp*` <- names(x)
    # `*tmp*`[2] <- "two"
    # names(x) <- `*tmp*`

Renaming a data frame column

    df <- data.frame(a = c(1:3), b = c(2:4), c = c(3:5))
    colnames(df)[2] <- 'd'
    df
    #>   a d c
    #> 1 1 2 3
    #> 2 2 3 4
    #> 3 3 4 5

## Exercise

1. Implement bootstrap + random feature selection

• Generate 10 samples (using a user-defined function and replicate from the base package)

    make_bootstraps <- function(data, n, row_perc, col_perc, label) {
      make_one_bootstrap <- function(data, row_perc, col_perc, label) {
        num_row <- nrow(data)
        num_col <- length(data)
        # Rows are drawn with replacement (the bootstrap part).
        rows_to_keep <- sample(num_row, row_perc * num_row, replace = TRUE)
        # Use the columns of the data passed in, not the global BreastCancer.
        cols_names <- colnames(data)
        features <- cols_names[!(cols_names %in% label)]
        # Features are drawn without replacement; label columns are always kept.
        cols_to_keep <- c(label, sample(features, col_perc * num_col))
        return(data[rows_to_keep, cols_to_keep])
      }
      return(replicate(n,
        make_one_bootstrap(data, row_perc, col_perc, label = label),
        simplify = FALSE
      ))
    }

    library(mlbench)
    data("BreastCancer")
    bootstraps <- make_bootstraps(BreastCancer,
      n = 10,
      row_perc = 0.5,
      col_perc = 0.7,
      label = c("Id", "Class")
    )
    str(bootstraps)
    #> List of 10
    #>  $ :'data.frame':    349 obs. of  9 variables:
    #>   ..$ Id             : chr [1:349] "1276091" "1212422" "566509" "1240603" ...
    #>   ..$ Class          : Factor w/ 2 levels "benign","malignant": 1 1 1 1 1 1 1 1 1 1 ...
    #>   ..$ Cl.thickness   : Ord.factor w/ 10 levels "1"<"2"<"3"<"4"<..: 3 3 5 3 3 5 1 1 3 5 ...
    #>   ..$ Bare.nuclei    : Factor w/ 10 levels "1","2","3","4",..: 1 1 1 1 1 1 1 1 5 1 ...
    #>   ..$ Mitoses        : Factor w/ 9 levels "1","2","3","4",..: 1 1 1 1 1 1 1 1 1 2 ...
    #>   ..$ Normal.nucleoli: Factor w/ 10 levels "1","2","3","4",..: 1 1 1 1 1 2 1 1 1 1 ...
    #>   ..$ Cell.shape     : Ord.factor w/ 10 levels "1"<"2"<"3"<"4"<..: 1 1 1 1 1 1 2 1 1 1 ...
    #>   ..$ Cell.size      : Ord.factor w/ 10 levels "1"<"2"<"3"<"4"<..: 1 1 1 1 1 1 2 1 1 1 ...
    #>   ..$ Marg.adhesion  : Ord.factor w/ 10 levels "1"<"2"<"3"<"4"<..: 3 1 1 1 2 3 1 2 1 1 ...
    #>  $ :'data.frame':    349 obs. of  9 variables:
    #>   ..$ Id             : chr [1:349] "1295508" "1126417" "1182404" "1299596" ...
    #>   ..$ Class          : Factor w/ 2 levels "benign","malignant": 1 2 1 1 1 1 2 1 1 2 ...
    #>   ..$ Marg.adhesion  : Ord.factor w/ 10 levels "1"<"2"<"3"<"4"<..: 1 1 1 1 1 1 3 1 1 4 ...
    #>   ..$ Epith.c.size   : Ord.factor w/ 10 levels "1"<"2"<"3"<"4"<..: 2 3 2 2 2 3 4 2 2 6 ...
    #>   ..$ Mitoses        : Factor w/ 9 levels "1","2","3","4",..: 1 3 1 1 1 1 8 1 1 1 ...
    #>   ..$ Bare.nuclei    : Factor w/ 10 levels "1","2","3","4",..: 4 4 1 1 1 NA 9 1 1 1 ...
    #>   ..$ Cell.size      : Ord.factor w/ 10 levels "1"<"2"<"3"<"4"<..: 1 6 1 1 1 1 3 1 1 4 ...
    #>   ..$ Bl.cromatin    : Factor w/ 10 levels "1","2","3","4",..: 1 3 3 1 3 1 8 2 2 4 ...
    #>   ..$ Normal.nucleoli: Factor w/ 10 levels "1","2","3","4",..: 1 2 2 1 1 1 9 1 1 3 ...
    #>  $ :'data.frame':    349 obs. of  9 variables:
    #>   ..$ Id           : chr [1:349] "466906" "640712" "1258549" "1211202" ...
    #>   ..$ Class        : Factor w/ 2 levels "benign","malignant": 1 1 2 2 1 2 1 1 2 1 ...
    #>   ..$ Epith.c.size : Ord.factor w/ 10 levels "1"<"2"<"3"<"4"<..: 2 2 10 10 2 10 2 2 4 2 ...
    #>   ..$ Cl.thickness : Ord.factor w/ 10 levels "1"<"2"<"3"<"4"<..: 1 1 9 7 1 10 4 5 5 1 ...
    #>   ..$ Marg.adhesion: Ord.factor w/ 10 levels "1"<"2"<"3"<"4"<..: 1 1 10 10 1 2 1 1 10 1 ...
    #>   ..$ Bl.cromatin  : Factor w/ 10 levels "1","2","3","4",..: 1 2 10 4 2 5 3 2 5 1 ...
    #>   ..$ Cell.shape   : Ord.factor w/ 10 levels "1"<"2"<"3"<"4"<..: 1 1 10 10 3 10 1 1 10 1 ...
    #>   ..$ Cell.size    : Ord.factor w/ 10 levels "1"<"2"<"3"<"4"<..: 1 1 10 5 1 10 1 1 10 1 ...
    #>   ..$ Bare.nuclei  : Factor w/ 10 levels "1","2","3","4",..: 1 1 10 10 1 10 1 1 10 1 ...
    #>  $ :'data.frame':    349 obs. of  9 variables:
    #>   ..$ Id             : chr [1:349] "646904" "1115293" "646904" "1321264" ...
    #>   ..$ Class          : Factor w/ 2 levels "benign","malignant": 1 1 1 1 1 2 1 1 1 2 ...
    #>   ..$ Bare.nuclei    : Factor w/ 10 levels "1","2","3","4",..: 1 2 1 1 4 2 1 1 1 4 ...
    #>   ..$ Bl.cromatin    : Factor w/ 10 levels "1","2","3","4",..: 3 2 3 2 3 4 2 3 1 8 ...
    #>   ..$ Normal.nucleoli: Factor w/ 10 levels "1","2","3","4",..: 1 1 1 1 7 10 1 1 1 10 ...
    #>   ..$ Epith.c.size   : Ord.factor w/ 10 levels "1"<"2"<"3"<"4"<..: 2 2 2 1 3 6 1 2 2 6 ...
    #>   ..$ Cell.size      : Ord.factor w/ 10 levels "1"<"2"<"3"<"4"<..: 1 1 1 2 8 8 1 1 1 10 ...
    #>   ..$ Cell.shape     : Ord.factor w/ 10 levels "1"<"2"<"3"<"4"<..: 1 1 1 2 8 8 3 1 1 7 ...
    #>   ..$ Mitoses        : Factor w/ 9 levels "1","2","3","4",..: 1 1 1 1 1 4 1 1 1 2 ...
    #>  $ :'data.frame':    349 obs. of  9 variables:
    #>   ..$ Id             : chr [1:349] "1324572" "1026122" "1272166" "1201870" ...
    #>   ..$ Class          : Factor w/ 2 levels "benign","malignant": 1 1 1 1 1 2 1 1 1 1 ...
    #>   ..$ Cell.size      : Ord.factor w/ 10 levels "1"<"2"<"3"<"4"<..: 1 1 1 1 2 10 1 2 1 1 ...
    #>   ..$ Epith.c.size   : Ord.factor w/ 10 levels "1"<"2"<"3"<"4"<..: 2 2 2 1 2 4 2 2 2 2 ...
    #>   ..$ Cl.thickness   : Ord.factor w/ 10 levels "1"<"2"<"3"<"4"<..: 5 2 5 4 4 10 2 3 6 4 ...
    #>   ..$ Bare.nuclei    : Factor w/ 10 levels "1","2","3","4",..: 1 1 1 1 2 5 1 1 1 1 ...
    #>   ..$ Bl.cromatin    : Factor w/ 10 levels "1","2","3","4",..: 2 1 1 2 3 8 3 1 1 1 ...
    #>   ..$ Normal.nucleoli: Factor w/ 10 levels "1","2","3","4",..: 2 1 1 1 1 10 1 1 1 1 ...
    #>   ..$ Marg.adhesion  : Ord.factor w/ 10 levels "1"<"2"<"3"<"4"<..: 1 1 1 3 1 6 1 3 3 3 ...
    #>  $ :'data.frame':    349 obs. of  9 variables:
    #>   ..$ Id             : chr [1:349] "1241232" "733823" "873549" "543558" ...
    #>   ..$ Class          : Factor w/ 2 levels "benign","malignant": 1 2 2 2 2 2 1 2 2 1 ...
    #>   ..$ Mitoses        : Factor w/ 9 levels "1","2","3","4",..: 1 1 3 1 8 1 1 2 1 1 ...
    #>   ..$ Normal.nucleoli: Factor w/ 10 levels "1","2","3","4",..: 1 1 5 10 9 8 7 10 10 1 ...
    #>   ..$ Epith.c.size   : Ord.factor w/ 10 levels "1"<"2"<"3"<"4"<..: 2 2 3 4 4 6 4 3 4 2 ...
    #>   ..$ Cell.shape     : Ord.factor w/ 10 levels "1"<"2"<"3"<"4"<..: 4 6 5 3 8 5 4 7 10 1 ...
    #>   ..$ Cl.thickness   : Ord.factor w/ 10 levels "1"<"2"<"3"<"4"<..: 3 5 10 6 8 9 5 6 8 3 ...
    #>   ..$ Bl.cromatin    : Factor w/ 10 levels "1","2","3","4",..: 3 4 3 5 8 4 4 8 7 3 ...
    #>   ..$ Bare.nuclei    : Factor w/ 10 levels "1","2","3","4",..: NA 10 7 5 9 10 5 10 4 1 ...
    #>  $ :'data.frame':    349 obs. of  9 variables:
    #>   ..$ Id             : chr [1:349] "1218105" "1212232" "673637" "1136142" ...
    #>   ..$ Class          : Factor w/ 2 levels "benign","malignant": 2 1 1 1 2 1 2 1 1 1 ...
    #>   ..$ Mitoses        : Factor w/ 9 levels "1","2","3","4",..: 5 1 1 1 1 1 2 1 1 1 ...
    #>   ..$ Bare.nuclei    : Factor w/ 10 levels "1","2","3","4",..: 10 1 5 1 4 1 10 NA 2 1 ...
    #>   ..$ Bl.cromatin    : Factor w/ 10 levels "1","2","3","4",..: 7 2 5 2 8 3 1 2 7 3 ...
    #>   ..$ Cell.size      : Ord.factor w/ 10 levels "1"<"2"<"3"<"4"<..: 10 1 1 1 10 1 3 1 1 1 ...
    #>   ..$ Cell.shape     : Ord.factor w/ 10 levels "1"<"2"<"3"<"4"<..: 10 1 1 1 10 2 5 3 1 1 ...
    #>   ..$ Marg.adhesion  : Ord.factor w/ 10 levels "1"<"2"<"3"<"4"<..: 9 1 1 1 6 1 4 1 1 3 ...
    #>   ..$ Normal.nucleoli: Factor w/ 10 levels "1","2","3","4",..: 10 1 1 1 5 1 6 1 1 1 ...
    #>  $ :'data.frame':    349 obs. of  9 variables:
    #>   ..$ Id             : chr [1:349] "1182404" "1343068" "1200847" "1253955" ...
    #>   ..$ Class          : Factor w/ 2 levels "benign","malignant": 1 2 2 2 1 1 1 1 1 1 ...
    #>   ..$ Cl.thickness   : Ord.factor w/ 10 levels "1"<"2"<"3"<"4"<..: 3 8 6 8 4 3 3 3 3 4 ...
    #>   ..$ Normal.nucleoli: Factor w/ 10 levels "1","2","3","4",..: 1 5 10 10 1 1 1 1 8 1 ...
    #>   ..$ Epith.c.size   : Ord.factor w/ 10 levels "1"<"2"<"3"<"4"<..: 2 6 8 5 2 2 2 1 8 2 ...
    #>   ..$ Cell.size      : Ord.factor w/ 10 levels "1"<"2"<"3"<"4"<..: 1 4 10 7 1 1 1 2 1 1 ...
    #>   ..$ Mitoses        : Factor w/ 9 levels "1","2","3","4",..: 1 2 7 1 1 1 1 1 1 1 ...
    #>   ..$ Marg.adhesion  : Ord.factor w/ 10 levels "1"<"2"<"3"<"4"<..: 1 1 10 4 1 1 1 1 3 1 ...
    #>   ..$ Bl.cromatin    : Factor w/ 10 levels "1","2","3","4",..: 2 2 10 5 2 2 1 2 5 2 ...
    #>  $ :'data.frame':    349 obs. of  9 variables:
    #>   ..$ Id             : chr [1:349] "1287775" "1043068" "1303489" "846423" ...
    #>   ..$ Class          : Factor w/ 2 levels "benign","malignant": 1 1 1 2 1 2 1 1 2 1 ...
    #>   ..$ Epith.c.size   : Ord.factor w/ 10 levels "1"<"2"<"3"<"4"<..: 2 2 2 4 3 2 2 2 5 2 ...
    #>   ..$ Bare.nuclei    : Factor w/ 10 levels "1","2","3","4",..: 2 1 1 10 10 3 1 1 10 1 ...
    #>   ..$ Cell.shape     : Ord.factor w/ 10 levels "1"<"2"<"3"<"4"<..: 1 1 1 3 3 8 1 1 6 1 ...
    #>   ..$ Cell.size      : Ord.factor w/ 10 levels "1"<"2"<"3"<"4"<..: 1 1 1 6 3 5 1 1 6 1 ...
    #>   ..$ Mitoses        : Factor w/ 9 levels "1","2","3","4",..: 1 1 1 4 3 5 1 1 3 1 ...
    #>   ..$ Bl.cromatin    : Factor w/ 10 levels "1","2","3","4",..: 3 2 2 7 3 2 2 3 6 2 ...
    #>   ..$ Normal.nucleoli: Factor w/ 10 levels "1","2","3","4",..: 1 1 1 8 5 1 1 1 8 1 ...
    #>  $ :'data.frame':    349 obs. of  9 variables:
    #>   ..$ Id             : chr [1:349] "1031608" "867392" "1198641" "803531" ...
    #>   ..$ Class          : Factor w/ 2 levels "benign","malignant": 1 1 2 2 1 1 2 1 1 1 ...
    #>   ..$ Cell.shape     : Ord.factor w/ 10 levels "1"<"2"<"3"<"4"<..: 1 2 6 10 3 1 5 2 1 1 ...
    #>   ..$ Mitoses        : Factor w/ 9 levels "1","2","3","4",..: 1 1 2 1 1 1 1 1 1 1 ...
    #>   ..$ Marg.adhesion  : Ord.factor w/ 10 levels "1"<"2"<"3"<"4"<..: 1 1 3 10 1 1 8 1 3 6 ...
    #>   ..$ Bl.cromatin    : Factor w/ 10 levels "1","2","3","4",..: 2 2 4 8 2 3 8 2 3 1 ...
    #>   ..$ Normal.nucleoli: Factor w/ 10 levels "1","2","3","4",..: 1 1 3 5 3 2 6 1 1 1 ...
    #>   ..$ Bare.nuclei    : Factor w/ 10 levels "1","2","3","4",..: 1 1 10 2 NA 1 10 1 1 1 ...
    #>   ..$ Cell.size      : Ord.factor w/ 10 levels "1"<"2"<"3"<"4"<..: 1 2 10 10 4 1 6 1 1 1 ...
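
The same exercise can be tried without installing mlbench by using a dataset that ships with R. The sketch below is a hypothetical variant, not the post's solution: the name `bootstrap_with_features` and the use of `iris` are assumptions made for illustration. Unlike the code above, it sizes the feature sample from the number of feature columns rather than from all columns, and it calls `set.seed()` so the draw can be repeated.

```r
# Hypothetical variant of the exercise, for illustration only.
bootstrap_with_features <- function(data, n, row_perc, col_perc, label) {
  one_sample <- function() {
    features <- setdiff(colnames(data), label)
    n_rows   <- floor(row_perc * nrow(data))
    n_feats  <- floor(col_perc * length(features))
    rows <- sample(nrow(data), n_rows, replace = TRUE)  # bootstrap: with replacement
    cols <- c(label, sample(features, n_feats))         # features: without replacement
    data[rows, cols, drop = FALSE]
  }
  # simplify = FALSE keeps the result as a list of data frames.
  replicate(n, one_sample(), simplify = FALSE)
}

set.seed(42)  # fix the random number generator for repeatable samples
samples <- bootstrap_with_features(iris,
  n = 10, row_perc = 0.5, col_perc = 0.5, label = "Species"
)

length(samples)                # 10 samples
unique(sapply(samples, nrow))  # 75 rows each (half of 150)
unique(sapply(samples, ncol))  # 3 columns each (1 label + 2 of 4 features)
```

Sizing the feature sample from `length(features)` keeps `col_perc` meaningful even when several label columns are excluded.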
