Functional Programming

Functional programming is a style of writing code in which ‘side effects’ are avoided, making the outcome of the code easier to predict. It is ‘declarative’ - functions are given as expressions, as opposed to ‘imperative’ code, where commands are issued to change the state of the program (e.g. changing the value of variables by assignment). Most mainstream programming has historically been imperative, but functional programming has been gaining ground, especially because of its connection to parallelization and distributed computing.

We won’t discuss the methodology in detail, but we will cover the basic R tools for functional programming - tools that allow us to apply functions across data structures. These tools let us avoid writing explicit loops by extending R’s vectorized style to whole data structures.

R apply

apply applies a function to each row or each column of a matrix. The second argument selects the margin: 1 for rows, 2 for columns.

m<-matrix(rnorm(30),ncol=3,nrow=10)
print(m)
             [,1]        [,2]        [,3]
 [1,] -1.88415809 -0.09135793 -0.04933136
 [2,]  0.27581447  0.59928155 -0.01019772
 [3,]  0.23509325 -1.38182122 -1.49150530
 [4,] -1.29592234 -0.10180487 -1.62481563
 [5,]  0.31163838 -1.16070666  0.60327878
 [6,]  0.23685780  0.80313813  1.49028075
 [7,] -0.51883441  0.30667497  0.81876505
 [8,] -0.01050835 -1.49125738  1.67045716
 [9,] -0.17823674  0.50897338 -1.11499439
[10,]  1.00128144  0.70455616 -0.23839229
# Get the mean of the rows, returned as a vector
apply(m,1,mean)
  1. -0.674949128760326
  2. 0.288299434064387
  3. -0.87941108856336
  4. -1.00751428266863
  5. -0.0819298347992709
  6. 0.84342555958658
  7. 0.202201872370568
  8. 0.0562304744778094
  9. -0.261419251214308
  10. 0.489148438901616
# Get the mean of the columns, returned as a vector
apply(m,2,mean)
  1. -0.18269745842911
  2. -0.130432387900508
  3. 0.00535450434813664
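
To tie this back to the point about avoiding loops, here is a minimal sketch comparing an explicit for loop with apply. The matrix m_demo and the variable names are made up for this illustration, and set.seed is only there to make the run repeatable.

```r
# Hypothetical demo matrix (not the m printed above)
set.seed(42)
m_demo <- matrix(rnorm(30), ncol = 3, nrow = 10)

# Imperative version: preallocate a result, then fill it row by row
loop_means <- numeric(nrow(m_demo))
for (i in seq_len(nrow(m_demo))) {
  loop_means[i] <- mean(m_demo[i, ])
}

# Functional version: one call, no index bookkeeping
apply_means <- apply(m_demo, 1, mean)

# Both approaches should agree
print(all.equal(loop_means, apply_means))
```

For row and column means specifically, base R also provides rowMeans and colMeans, which are usually faster than apply.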

Work!

  • Find the sum of each of the columns of the matrix m.
  • Find the product of each of the rows of the matrix m.

Custom Functions and apply

Any function that acts on a vector can be applied to the rows or columns of a matrix using apply. For example:

apply(m,1,function(x) x*x)
3.5500517085 0.0760736240 0.0552688372 1.6794147170 0.0971184776 0.0561016187 0.2691891415 0.0001104255 0.0317683363 1.0025645300
0.008346272  0.359138375  1.909429885  0.010364232  1.347239959  0.645030852  0.094049538  2.223848573  0.259053905  0.496399383
0.0024335833 0.0001039935 2.2245880524 2.6400258400 0.3639452898 2.2209367091 0.6703762101 2.7904271078 1.2432125008 0.0568308829
apply(m,1,function(x) x%*%x)
  1. 3.56083156395646
  2. 0.435315992058913
  3. 4.18928677483859
  4. 4.32980478918986
  5. 1.80830372628942
  6. 2.92206917996323
  7. 1.03361488996985
  8. 5.01438610625781
  9. 1.5340347423788
  10. 1.55579479637673

Note that apply returns an object of the appropriate dimension automagically. In the first example, * is the element-by-element product, so apply gives us a 3 x 10 matrix in which each column holds the squares of one of the 10 rows of m (the result is transposed relative to m). In the second example, %*% is the dot product, so apply returns a vector of length 10 in which each entry is the squared norm of the corresponding 3-dimensional row vector.
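
The shapes described here can be checked directly. The following is a small sketch on a hypothetical matrix m_demo (an assumption for illustration, built the same way as m):

```r
# Hypothetical demo matrix with 10 rows and 3 columns
set.seed(1)
m_demo <- matrix(rnorm(30), ncol = 3, nrow = 10)

# Element-wise product: each row yields 3 values, stored as one column
sq <- apply(m_demo, 1, function(x) x * x)
print(dim(sq))                       # 3 rows, 10 columns

# Transposing recovers the layout of the original matrix
print(all.equal(t(sq), m_demo^2))

# Dot product: each row yields a single value, so we get a plain vector
norms <- apply(m_demo, 1, function(x) x %*% x)
print(length(norms))                 # one entry per row of m_demo
print(all.equal(norms, rowSums(m_demo^2)))
```

If the original orientation matters, wrapping the call in t() is a simple way to get it back.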

Work!

Use apply to count the number of negative values in each column of m.

sapply and lapply

R has several other versions of apply, for specific types of input and output data structures. Two of the most common are lapply and sapply. Both can operate on either a list or a vector. lapply always returns a list. sapply tries to simplify its result to a vector or matrix, and returns a list only when it cannot.

For example:

v<-1:10
sapply(v,function(x) x*2)
  1. 2
  2. 4
  3. 6
  4. 8
  5. 10
  6. 12
  7. 14
  8. 16
  9. 18
  10. 20
v<-1:10
lapply(v,function(x) x*2)
  1. 2
  2. 4
  3. 6
  4. 8
  5. 10
  6. 12
  7. 14
  8. 16
  9. 18
  10. 20
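
The two outputs above are rendered the same way, but the objects are of different kinds. A quick check, sketched with hypothetical variable names s and l:

```r
v <- 1:10
s <- sapply(v, function(x) x * 2)
l <- lapply(v, function(x) x * 2)

# sapply simplified its result to a numeric vector
print(is.list(s))      # FALSE
print(is.numeric(s))   # TRUE

# lapply kept one list element per input value
print(is.list(l))      # TRUE
print(length(l))       # 10

# Flattening the list gives the same values as sapply
print(identical(unlist(l), s))
```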

Aside on Lists

You may be wondering what the difference is between lists and vectors. An ordinary (atomic) vector holds elements of a single type: all numbers, all character strings, or all logical values. A list is a more general kind of vector whose elements can be objects of different types and sizes. Consider the following:

my.list<-list(v,m)
my.list
    1. 1
    2. 2
    3. 3
    4. 4
    5. 5
    6. 6
    7. 7
    8. 8
    9. 9
    10. 10
  1. -1.88415809 -0.09135793 -0.04933136
      0.27581447  0.59928155 -0.01019772
      0.2350933  -1.3818212  -1.4915053
     -1.2959223  -0.1018049  -1.6248156
      0.3116384  -1.1607067   0.6032788
      0.2368578   0.8031381   1.4902807
     -0.5188344   0.3066750   0.8187651
     -0.01050835 -1.49125738  1.67045716
     -0.1782367   0.5089734  -1.1149944
      1.0012814   0.7045562  -0.2383923

As you can see, my.list contains a vector AND a matrix. Now, let’s use lapply:

lapply(my.list,function(x) x*2)
    1. 2
    2. 4
    3. 6
    4. 8
    5. 10
    6. 12
    7. 14
    8. 16
    9. 18
    10. 20
  1. -3.76831618 -0.18271587 -0.09866272
      0.55162895  1.19856310 -0.02039544
      0.4701865  -2.7636424  -2.9830106
     -2.5918447  -0.2036097  -3.2496313
      0.6232768  -2.3214133   1.2065576
      0.4737156   1.6062763   2.9805615
     -1.0376688   0.6133499   1.6375301
     -0.0210167  -2.9825148   3.3409143
     -0.3564735   1.0179468  -2.2299888
      2.0025629   1.4091123  -0.4767846

What does sapply do?

sapply(my.list,function(x) x*2)
    1. 2
    2. 4
    3. 6
    4. 8
    5. 10
    6. 12
    7. 14
    8. 16
    9. 18
    10. 20
  1. -3.76831618 -0.18271587 -0.09866272
      0.55162895  1.19856310 -0.02039544
      0.4701865  -2.7636424  -2.9830106
     -2.5918447  -0.2036097  -3.2496313
      0.6232768  -2.3214133   1.2065576
      0.4737156   1.6062763   2.9805615
     -1.0376688   0.6133499   1.6375301
     -0.0210167  -2.9825148   3.3409143
     -0.3564735   1.0179468  -2.2299888
      2.0025629   1.4091123  -0.4767846

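Same thing! sapply tries to simplify its result to a vector or matrix, but here the two results have different lengths (10 values and 30 values), so they cannot be combined and sapply falls back to returning a list, just as lapply does. Let’s try something else: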

another.list<-list(1:5)
mult.list<-sapply(another.list,function(x) x*2)
print(mult.list)
     [,1]
[1,]    2
[2,]    4
[3,]    6
[4,]    8
[5,]   10

Now what? sapply gave us a matrix.

dim(mult.list)
  1. 5
  2. 1

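The full simplification rules are a bit beyond our scope, but the short version is that sapply treats each list element as one result. Here the list has a single element, giving one result of length 5, which sapply turns into a 5 x 1 matrix with one column per list element. The take-away is to be careful with data types, and to try to give R the right one. If you want a vector returned, give R a vector, not a list!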
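
As a rough sketch of how sapply shapes its result, the following compares a few cases and shows two ways to get a plain vector or list back. All variable names here are hypothetical, chosen for this illustration.

```r
another_list <- list(1:5)

# One list element of length 5 becomes a matrix with a single column
as_matrix <- sapply(another_list, function(x) x * 2)
print(dim(as_matrix))

# Two elements of equal length become two columns
two_cols <- sapply(list(1:5, 6:10), function(x) x * 2)
print(dim(two_cols))

# Elements of unequal length cannot be combined, so a list comes back
ragged <- sapply(list(1:5, 1:3), function(x) x * 2)
print(is.list(ragged))

# To get a plain vector from a list input, flatten the lapply result
as_vector <- unlist(lapply(another_list, function(x) x * 2))
print(is.null(dim(as_vector)))

# To always get a list from sapply, switch simplification off
as_list <- sapply(another_list, function(x) x * 2, simplify = FALSE)
print(is.list(as_list))
```

When the shape of the result matters, vapply is another base R option: it requires the expected type and length of each result to be declared up front and raises an error if they do not match.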

One more thing about lists

If a list mixes logical or character elements in with numbers, you can get unintended results:

my.list<-list(1:5,TRUE,TRUE)
mult.list<-sapply(my.list,function(x) x*2)
print(mult.list)
[[1]]
[1]  2  4  6  8 10

[[2]]
[1] 2

[[3]]
[1] 2

R interpreted TRUE as a numeric 1.
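
This kind of silent conversion is called coercion. A short sketch of how it plays out, using hypothetical names mixed_vec and mixed_list:

```r
# Logical values are coerced to numbers in arithmetic: TRUE -> 1, FALSE -> 0
print(TRUE * 2)
print(sum(c(TRUE, FALSE, TRUE)))

# An atomic vector holds one type, so mixing types forces coercion
mixed_vec <- c(1, TRUE, "a")
print(class(mixed_vec))   # everything has become character

# A list keeps each element's own type
mixed_list <- list(1, TRUE, "a")
print(sapply(mixed_list, class))
```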

my.list<-list(1:5,"a","b")
mult.list<-sapply(my.list,function(x) x*2)
print(mult.list)
Error in x * 2: non-numeric argument to binary operator
     [,1]
[1,]    2
[2,]    4
[3,]    6
[4,]    8
[5,]   10
my.list<-list(1:5,"1","2")
mult.list<-sapply(my.list,function(x) x*2)
print(mult.list)
Error in x * 2: non-numeric argument to binary operator
     [,1]
[1,]    2
[2,]    4
[3,]    6
[4,]    8
[5,]   10

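In these last two examples, R returns an error because we have tried to multiply a character value by a number; even “1” and “2” are character strings, not numbers. Because the call fails, the assignment never happens, so mult.list keeps the value from an earlier successful call, and that is what print shows after the error.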
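
One way to avoid this kind of error is to check or filter the element types before doing arithmetic. The following is a minimal sketch, not the only approach; the names mixed, numeric_only, doubled and safe are hypothetical.

```r
mixed <- list(1:5, "a", "2")

# Option 1: keep only the numeric elements, then apply the function
numeric_only <- Filter(is.numeric, mixed)
doubled <- lapply(numeric_only, function(x) x * 2)
print(doubled)

# Option 2: attempt every element and record NA where the operation fails
safe <- lapply(mixed, function(x) tryCatch(x * 2, error = function(e) NA))
print(safe)

# Character digits must be converted explicitly before arithmetic
print(as.numeric("2") * 2)
```

Which option fits best depends on whether the non-numeric elements should be dropped or flagged.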