R Language
ian
Why?
Background of R
What is R?
GNU project; an implementation of the S language, which John Chambers and colleagues developed at Bell Labs
Free software environment for statistical computing and graphics
Functional programming language written primarily in C and Fortran
R is a functional programming language
R is an interpreted language
R is an object-oriented language
R Language
Statistical analysis on the fly
Mathematical functions and graphics modules built in
FREE! & Open Source! http://cran.r-project.org/src/base/
Why Use R
What is your programming language of choice, R, Python or something else?
“I use R, and occasionally matlab, for data analysis. There is a large, active and extremely knowledgeable R community at Google.”
http://simplystatistics.org/2013/02/15/interview-with-nick-chamandy-statistician-at-google/
Data Scientists at These Companies Use R
“Expert knowledge of SAS (With Enterprise Guide/Miner) required and candidates with strong knowledge of R will be preferred”
http://www.kdnuggets.com/jobs/13/03-29-apple-sr-data-scientist.html?utm_source=twitterfeed&utm_medium=facebook&utm_campaign=tfb&utm_content=FaceBook&utm_term=analytics#.UVXibgXOpfc.facebook
In 2007, Revolution Analytics provided commercial support for Revolution R
http://www.revolutionanalytics.com/products/revolution-r.php
http://www.revolutionanalytics.com/why-revolution-r/which-r-is-right-for-me.php
Big Data Appliance, which integrates R, Apache Hadoop, Oracle Enterprise Linux, and a NoSQL database with the Exadata hardware
http://www.oracle.com/us/products/database/big-data-appliance/overview/index.html
Commercial support for R
Free for Community Version http://www.revolutionanalytics.com/downloads/
http://www.revolutionanalytics.com/why-revolution-r/benchmarks.php
Revolution R
                     Base R 2.14.2 64   Revolution R (1-core)   Revolution R (4-core)   Speedup (4 core)
Matrix Calculation   17.4 sec           2.9 sec                 2.0 sec                 7.9x
Matrix Functions     10.3 sec           2.0 sec                 1.2 sec                 7.8x
Program Control      2.7 sec            2.7 sec                 2.7 sec                 Not Appreciable
RStudio http://www.rstudio.com/
IDE
RGUI http://www.r-project.org/
Shiny makes it super simple for R users like you to turn analyses into interactive web applications that anyone can use
http://www.rstudio.com/shiny/
Web App Development
CRAN (Comprehensive R Archive Network)
Package Management
Repository     URL
CRAN           http://cran.r-project.org/web/packages/
Bioconductor   http://www.bioconductor.org/packages/release/Software.html
R-Forge        http://r-forge.r-project.org/
R Basic
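help()            # open the help system
help(demo)        # show help for one function
demo()            # list the available demos
demo(is.things)   # run one demo
ls()              # list objects in the workspace
rm(x)             # remove object x from the workspace
q()               # quit R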
Basic Command
Vector List Factor Array Matrix Data Frame
Basic Object
The main object types are vector, matrix, array, factor, list, data frame, and function.
The basic mode of an object's elements is one of:
1. "numeric": real numbers, including "integer" (which sometimes has to be requested explicitly) and "double" (double precision).
2. "logical": true or false, written as TRUE (T) or FALSE (F); these behave as 1 (T) and 0 (F) in arithmetic.
3. "complex": complex numbers.
4. "character": text (strings), usually entered with double quotes (") on both sides.
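The modes listed above can be checked interactively. The following is a minimal sketch using only base R; the variable names are made up for illustration.

```r
# Inspect the basic mode and storage type of a few values.
x_double  <- 3         # a plain numeric literal is stored as double
x_integer <- 3L        # the L suffix explicitly requests an integer
x_logical <- TRUE
x_complex <- 2 + 1i
x_char    <- "hello"

modes <- c(mode(x_double), mode(x_integer), mode(x_logical),
           mode(x_complex), mode(x_char))
types <- c(typeof(x_double), typeof(x_integer))
print(modes)   # "numeric" "numeric" "logical" "complex" "character"
print(types)   # "double" "integer"

# Logical values count as 1 and 0 in arithmetic.
n_true <- sum(c(TRUE, FALSE, TRUE))
print(n_true)  # 2
```

Both the integer and the double report mode "numeric"; typeof() is what distinguishes them.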
Scalar
x = 3; y <- 5; x + y
Vectors
x = c(1,2,3,7); y = c(2,3,5,1); x+y; x*y; x-y; x/y
x = seq(1,10); y = 2:11; x+y
x = seq(1,10,by=2); y = seq(1,10,length=2)
rep(c(5,8), 3)
x = c(1,2,3); length(x)
Objects & Arithmetic
Summary
x = c(1,2,3,4,5,6,7,8,9,10)
mean(x); min(x); median(x); max(x); var(x)
summary(x)
Subscripting
x = c(1,2,3,4,5,6,7,8,9,10)
x[1:3]; x[c(1,3,5)]
x[c(1,3,5)] * 2 + x[c(2,2,2)]
x[-(1:6)]
Summaries and Subscripting
Contain a heterogeneous selection of objects
e <- list(thing="hat", size="8.25"); e
l <- list(a=1,b=2,c=3,d=4,e=5,f=6,g=7,h=8,i=9,j=10)
l$j
man = list(name="Qoo", height=183)
man$name
Lists
Collection of items used to represent categorical values
The different values that the factor can take are called levels
Factors
phone = factor(c('iphone', 'htc', 'iphone', 'samsung', 'iphone', 'samsung'))
levels(phone)
Factor
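To make the idea of levels more concrete, here is a small sketch that reuses the phone vector from the slide and adds an ordered factor. The size variable and its level order are assumptions made for illustration.

```r
# Factor from the slide: levels are the distinct values, sorted alphabetically.
phone <- factor(c("iphone", "htc", "iphone", "samsung", "iphone", "samsung"))
phone_levels <- levels(phone)
print(phone_levels)          # "htc" "iphone" "samsung"

# Count how often each level occurs.
counts <- table(phone)
print(counts)

# Internally each element is stored as an integer index into the levels.
codes <- as.integer(phone)
print(codes)                 # 2 1 2 3 2 3

# An ordered factor (hypothetical example) supports comparisons between levels.
size <- factor(c("small", "large", "medium"),
               levels = c("small", "medium", "large"), ordered = TRUE)
small_lt_large <- size[1] < size[2]
print(small_lt_large)        # TRUE
```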
Array
An extension of a vector to multiple dimensions
a <- array(c(1,2,3,4,5,6,7,8,9,10,11,12),dim=c(3,4))
Matrices
A vector arranged in two dimensions (a 2-d array)
x = c(1,2,3); y = c(4,5,6); rbind(x,y); cbind(x,y)
x = rbind(c(1,2,3),c(4,5,6)); dim(x)
x <- matrix(c(1,2,3,4,5,6),nrow=3)
x <- matrix(c(1,2,3,4,5,6),nrow=3,byrow=TRUE)
x <- matrix(c(1,2,3,4),nrow=2); y <- matrix(c(5,6),nrow=2); x%*%y
t(matrix(c(1,2,3,4),nrow=2))
solve(matrix(c(1,2,3,4),nrow=2))
Matrices & Array
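The array on the slide has only two dimensions, so a short sketch of a true three-dimensional array may help. The names a3 and m are made up for this example.

```r
# A 2 x 3 x 4 array: values fill the first dimension fastest (column-major order).
a3 <- array(1:24, dim = c(2, 3, 4))
a3_dim <- dim(a3)
print(a3_dim)            # 2 3 4

first_slice <- a3[, , 1] # a 2 x 3 matrix holding the values 1..6
last_value  <- a3[2, 3, 4]
print(last_value)        # 24

# Sum each 2 x 3 slice along the third dimension.
slice_sums <- apply(a3, 3, sum)
print(slice_sums)        # 21 57 93 129

# byrow = TRUE fills a matrix row by row instead of column by column.
m <- matrix(1:6, nrow = 3, byrow = TRUE)
print(m[2, 1])           # 3
```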
Useful way to represent tabular data; essentially a matrix with named columns
May also include non-numerical variables
Example
df = data.frame(a=c(1,2,3,4,5),b=c(2,3,4,5,6)); df
Data Frame
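The slide's example has only numeric columns. As a sketch of the "non-numerical variables" point, the data frame below mixes a character column with a numeric one; the brands and prices are invented for illustration.

```r
# A data frame with one character column and one numeric column (made-up data).
phones <- data.frame(
  brand = c("iphone", "htc", "samsung"),
  price = c(699, 499, 599),
  stringsAsFactors = FALSE   # keep brand as character in every R version
)
str(phones)

# Columns are accessed by name; rows can be filtered with a logical condition.
brands    <- phones$brand
expensive <- phones[phones$price > 500, ]
avg_price <- mean(phones$price)

print(expensive)
print(avg_price)   # 599
```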
Function
`%myop%` <- function(a, b) {2*a + 2*b}; 1 %myop% 1
f <- function(x) {return(x^2 + 3)}
create.vector.of.ones <- function(n) {
  return.vector <- NA
  for (i in 1:n) {
    return.vector[i] <- 1
  }
  return.vector
}
create.vector.of.ones(3)
Control Structures
if ... else ...
repeat, for, while
Catch errors: tryCatch
Function
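The control structures and tryCatch are only named on the slide, so here is a minimal sketch of each in base R. The helper names classify and safe_log are hypothetical.

```r
# if ... else chooses one branch; the value of the branch is returned.
classify <- function(x) {
  if (x < 0) {
    "negative"
  } else if (x == 0) {
    "zero"
  } else {
    "positive"
  }
}

# for iterates over a vector.
total <- 0
for (i in 1:5) total <- total + i      # total ends at 15

# while repeats as long as its condition holds.
n <- 1
while (n < 100) n <- n * 2             # n ends at 128

# repeat loops until break is called.
count <- 0
repeat {
  count <- count + 1
  if (count >= 3) break
}

# tryCatch turns a warning or an error into an ordinary return value.
safe_log <- function(x) {
  tryCatch(
    log(x),
    warning = function(w) NA,   # e.g. log(-1) warns that NaNs were produced
    error   = function(e) NA    # e.g. log("a") raises an error
  )
}

print(classify(-2))
print(c(total, n, count))
print(safe_log(-1))
```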
Functional language characteristic
apply.to.three <- function(f) {f(3)}
apply.to.three(function(x) {x * 7})
Anonymous Function
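A few more base R idioms show the same functional style. This is a sketch only; make_multiplier and times7 are made-up names.

```r
# Pass an anonymous function to sapply to transform each element.
squares <- sapply(1:5, function(x) x^2)
print(squares)      # 1 4 9 16 25

# A function can return another function (a closure that remembers k).
make_multiplier <- function(k) {
  function(x) x * k
}
times7 <- make_multiplier(7)
print(times7(3))    # 21

# Filter and Reduce are base R higher-order functions.
evens <- Filter(function(x) x %% 2 == 0, 1:10)
total <- Reduce(`+`, 1:5)
print(evens)        # 2 4 6 8 10
print(total)        # 15
```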
All R code manipulates objects.
Every object in R has a type.
In assignment statements, R will copy the object, not just the reference to the object.
Attributes
Objects and Classes
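The copy behaviour and attributes can be observed directly. The sketch below uses made-up variable names; note that R may delay the physical copy until one of the objects is modified, but the visible behaviour is that of an independent copy.

```r
# Assignment gives b its own value: changing b leaves a untouched.
a <- c(1, 2, 3)
b <- a
b[1] <- 100
print(a)   # 1 2 3
print(b)   # 100 2 3

# Attributes are named pieces of metadata attached to an object.
lengths_cm <- c(10, 20)
attr(lengths_cm, "units") <- "cm"
print(attributes(lengths_cm))

# Setting the dim attribute is what turns a plain vector into a matrix.
m <- 1:6
dim(m) <- c(2, 3)
print(is.matrix(m))   # TRUE
```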
Many R functions are implemented using S3 methods
In S version 4 (hence S4), formal classes and methods were introduced that allow:
Multiple arguments
Abstract types
Inheritance
S3 & S4 Object
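For comparison with the S4 example, here is a minimal S3 sketch. The class name student and the generic describe are hypothetical; only structure, UseMethod, and the generic.class naming convention come from base R.

```r
# An S3 object is an ordinary value with a class attribute.
student <- structure(list(name = "david", score = 80), class = "student")

# An S3 generic dispatches on the class of its first argument.
describe <- function(obj, ...) UseMethod("describe")

# Methods are plain functions named generic.class.
describe.default <- function(obj, ...) "generic object"
describe.student <- function(obj, ...) paste(obj$name, "scored", obj$score)

# print is itself an S3 generic, so a print.student method customises display.
print.student <- function(x, ...) {
  cat("Student:", x$name, "\n")
  invisible(x)
}

print(describe(student))   # "david scored 80"
print(describe(42))        # "generic object"
print(student)
```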
S4 OOP Example
setClass("Student", representation(name = "character", score = "numeric"))
studenta = new("Student", name="david", score=80)
studentb = new("Student", name="andy", score=90)
setMethod("show", signature("Student"), function(object) { cat(object@score + 100) })
setGeneric("getscore", function(object) standardGeneric("getscore"))
studenta
OOP of S4
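The slide declares the getscore generic but does not attach a method to it. One way this could be completed, together with a small inheritance example, is sketched below; the GradStudent class and its thesis slot are assumptions added for illustration.

```r
library(methods)

# Base class, as on the slide.
setClass("Student", representation(name = "character", score = "numeric"))

# Hypothetical subclass: inherits the slots of Student and adds one more.
setClass("GradStudent", contains = "Student",
         representation(thesis = "character"))

# Declare the generic, then attach a method for the Student class.
setGeneric("getscore", function(object) standardGeneric("getscore"))
setMethod("getscore", signature("Student"), function(object) object@score)

studenta <- new("Student", name = "david", score = 80)
gradb    <- new("GradStudent", name = "andy", score = 90, thesis = "R internals")

print(getscore(studenta))     # 80
print(getscore(gradb))        # 90, via the method inherited from Student
print(is(gradb, "Student"))   # TRUE
```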
A package is a related set of functions, help files, and data files that have been bundled together.
Basic Command
library(rpart)
CRAN Install
(.packages())
Packages
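A typical package workflow might look like the sketch below. It assumes rpart is already installed (it usually ships with R as a recommended package); the install step is left as a comment because it needs network access.

```r
# One-time installation from CRAN (requires network access):
# install.packages("rpart")

# Attach the package for this session if it is available.
if (requireNamespace("rpart", quietly = TRUE)) {
  library(rpart)
}

# (.packages()) returns the names of the packages currently attached.
attached <- (.packages())
print(attached)

# installed.packages() lists everything installed in the library paths.
installed <- rownames(installed.packages())
print(head(installed))
```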
Package used in Machine Learning for Hackers
Apply
Returns a vector, array, or list of values obtained by applying a function to the margins of an array or matrix.
data <- cbind(c(1,2),c(3,4))
data.rowsum <- apply(data,1,sum)
data.colsum <- apply(data,2,sum)
data
Apply
Save and Load
x = USPersonalExpenditure
save(x, file="~/test.RData")
rm(x)
load("~/test.RData")
x
File IO
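Besides save and load, base R can also write and read plain-text tables and single objects. The sketch below uses temporary files so nothing is written to the home directory; the data frame contents are made up.

```r
# A small data frame to write out (made-up data).
df <- data.frame(a = c(1, 2, 3), b = c("x", "y", "z"),
                 stringsAsFactors = FALSE)

# CSV round trip: portable text format, but column types are re-guessed on read.
csv_path <- tempfile(fileext = ".csv")
write.csv(df, csv_path, row.names = FALSE)
df_csv <- read.csv(csv_path, stringsAsFactors = FALSE)

# RDS round trip: stores one R object exactly, including its types.
rds_path <- tempfile(fileext = ".rds")
saveRDS(df, rds_path)
df_rds <- readRDS(rds_path)

print(df_csv)
print(identical(df, df_rds))   # TRUE

# Clean up the temporary files.
unlink(c(csv_path, rds_path))
```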
Charts and Graphics
xrange = range(as.numeric(colnames(USPersonalExpenditure)));
yrange= range(USPersonalExpenditure); plot(xrange, yrange, type="n", xlab="Year",ylab="Category" )
for(i in 1:5) {
lines(as.numeric(colnames(USPersonalExpenditure)),USPersonalExpenditure[i,], type="b", lwd=1.5)
}
Plotting Example
Reference & Resource
R in a nutshell
Study Material
Online Reference
Community Resources for R help
Websites: Stack Overflow, Cross Validated, R-help, R-devel, R-sig-*, package-specific mailing lists
Blog: R-bloggers
Twitter: https://twitter.com/#rstats
Quora: http://www.quora.com/R-software
Resource
Conference: useR!, R in Finance, R in Insurance
Others: Joint Statistical Meetings, Royal Statistical Society Conference
Local User Group http://blog.revolutionanalytics.com/local-r-groups.html
Taiwan R User Group http://www.facebook.com/Tw.R.User http://www.meetup.com/Taiwan-R/
Resource (Cont'd)
Confidential | Copyright 2012 Trend Micro Inc.
Thank You!