Functional programming is a style of writing code that avoids ‘side effects’, which makes the outcome of the code easier to predict. It is ‘declarative’: computations are written as expressions built from functions, as opposed to ‘imperative’ code, where commands are issued to change the state of the program (e.g. changing the value of a variable by assignment). Historically, most programming has been imperative, but functional programming has been gaining ground, especially because of its connection to parallelization and distributed computing.
We won’t discuss the methodology in detail, but we will cover the basic R tools for functional programming - tools that allow us to apply functions across data structures. These tools allow us to avoid looping by extending the ability to vectorize operations.
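To make the idea of "avoiding looping" concrete, here is a minimal sketch that computes the same result with an explicit loop and with a functional call. The names x, squares_loop and squares_fun are made up for this illustration, and sapply is only introduced properly later in this section.

```r
x <- c(1.5, 2, 3.5)

# Imperative version: allocate a result and fill it in, one assignment at a time
squares_loop <- numeric(length(x))
for (i in seq_along(x)) {
  squares_loop[i] <- x[i]^2
}

# Functional version: describe the operation and map it over the data
squares_fun <- sapply(x, function(v) v^2)

identical(squares_loop, squares_fun)
```

The functional version has no loop counter and no result variable that is modified step by step, which is the kind of state change functional code tries to avoid.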
apply maps a function to a matrix:
m<-matrix(rnorm(30),ncol=3,nrow=10)
print(m)
[,1] [,2] [,3]
[1,] -1.88415809 -0.09135793 -0.04933136
[2,] 0.27581447 0.59928155 -0.01019772
[3,] 0.23509325 -1.38182122 -1.49150530
[4,] -1.29592234 -0.10180487 -1.62481563
[5,] 0.31163838 -1.16070666 0.60327878
[6,] 0.23685780 0.80313813 1.49028075
[7,] -0.51883441 0.30667497 0.81876505
[8,] -0.01050835 -1.49125738 1.67045716
[9,] -0.17823674 0.50897338 -1.11499439
[10,] 1.00128144 0.70455616 -0.23839229
# Get the mean of the rows, returned as a vector
apply(m,1,mean)
# Get the mean of the columns, returned as a vector
apply(m,2,mean)
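As a quick sanity check, the results of apply with mean should agree with R's built-in rowMeans and colMeans. This is only a sketch: the seed and the names m2, row_means and col_means are chosen here for illustration, so the numbers will differ from the matrix printed above.

```r
set.seed(1)  # arbitrary seed, used only to make the sketch reproducible
m2 <- matrix(rnorm(30), ncol = 3, nrow = 10)

row_means <- apply(m2, 1, mean)  # MARGIN = 1: one value per row
col_means <- apply(m2, 2, mean)  # MARGIN = 2: one value per column

length(row_means)  # one entry for each of the 10 rows
length(col_means)  # one entry for each of the 3 columns

# Compare with the dedicated built-ins
all.equal(row_means, rowMeans(m2))
all.equal(col_means, colMeans(m2))
```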
apply
Any function that acts on a vector can be applied to the rows or columns of a matrix using apply. For example:
apply(m,1,function(x) x*x)
3.5500517085 | 0.0760736240 | 0.0552688372 | 1.6794147170 | 0.0971184776 | 0.0561016187 | 0.2691891415 | 0.0001104255 | 0.0317683363 | 1.0025645300 |
0.008346272 | 0.359138375 | 1.909429885 | 0.010364232 | 1.347239959 | 0.645030852 | 0.094049538 | 2.223848573 | 0.259053905 | 0.496399383 |
0.0024335833 | 0.0001039935 | 2.2245880524 | 2.6400258400 | 0.3639452898 | 2.2209367091 | 0.6703762101 | 2.7904271078 | 1.2432125008 | 0.0568308829 |
apply(m,1,function(x) x%*%x)
Note that apply returns an object of the appropriate dimension automagically. In the first example, the multiplication operator is the element-by-element product, so apply gives us a 3 x 10 matrix: each of the 10 rows of m is squared element-wise, and each result is stored as a column of the output. In the second example, the operator is the matrix product, which for a vector with itself is a dot product, so apply returns a vector of length 10 whose entries are the squared norms of the 3-dimensional row vectors.
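The shapes described above can be checked directly. The sketch below uses a freshly generated matrix (m2, with an arbitrary seed) rather than the m printed earlier; the names sq and norms are illustrative.

```r
set.seed(1)  # arbitrary seed for reproducibility
m2 <- matrix(rnorm(30), ncol = 3, nrow = 10)

# Element-wise product: each row gives a length-3 result, stored as a column
sq <- apply(m2, 1, function(x) x * x)
dim(sq)                  # 3 rows, 10 columns
all.equal(sq, t(m2^2))   # same values as squaring m2 and transposing

# Dot product: each row gives a single number, so the result is a plain vector
norms <- apply(m2, 1, function(x) x %*% x)
length(norms)                     # 10
all.equal(norms, rowSums(m2^2))   # squared norm of each row
```

Because apply stores each result as a column, a transpose (t) is needed if we want the output to line up with the rows of the input.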
Use apply to count the number of negative values in each column of m.
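One possible approach to this exercise is sketched below; it is not the only solution. It relies on the fact that x < 0 gives a logical vector and that sum counts the TRUE values. The matrix m2 and the name neg_counts are made up for this sketch.

```r
set.seed(1)  # arbitrary seed for reproducibility
m2 <- matrix(rnorm(30), ncol = 3, nrow = 10)

# For each column (MARGIN = 2), count how many entries are below zero
neg_counts <- apply(m2, 2, function(x) sum(x < 0))
neg_counts

# Cross-check against a vectorized alternative
all(neg_counts == colSums(m2 < 0))
```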
sapply and lapply
R has several other versions of apply, for specific types of input and output data structures. Two of the most common are lapply and sapply. Both can operate on either a list or a vector. lapply always returns a list, while sapply tries to simplify its result to a vector or matrix and returns a list only when it cannot.
For example:
v<-1:10
sapply(v,function(x) x*2)
v<-1:10
lapply(v,function(x) x*2)
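To see the difference in return type without reading the printed output, we can inspect the two results directly. In this sketch, v2, s and l are illustrative names.

```r
v2 <- 1:10

s <- sapply(v2, function(x) x * 2)  # simplified to a numeric vector
l <- lapply(v2, function(x) x * 2)  # always a list, one element per input

is.list(s)   # FALSE
is.list(l)   # TRUE
length(l)    # 10

# Flattening the list recovers the same values that sapply returned
identical(unlist(l), s)
```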
You may be wondering what the difference is between lists and vectors. An atomic vector holds elements of a single type (all numeric, all character, or all logical, for example), whereas a list can hold objects of different types and sizes. Consider the following:
my.list<-list(v,m)
my.list
-1.88415809 | -0.09135793 | -0.04933136 |
0.27581447 | 0.59928155 | -0.01019772 |
0.2350933 | -1.3818212 | -1.4915053 |
-1.2959223 | -0.1018049 | -1.6248156 |
0.3116384 | -1.1607067 | 0.6032788 |
0.2368578 | 0.8031381 | 1.4902807 |
-0.5188344 | 0.3066750 | 0.8187651 |
-0.01050835 | -1.49125738 | 1.67045716 |
-0.1782367 | 0.5089734 | -1.1149944 |
1.0012814 | 0.7045562 | -0.2383923 |
As you can see, my.list contains a vector AND a matrix. Now, let’s use lapply:
lapply(my.list,function(x) x*2)
-3.76831618 | -0.18271587 | -0.09866272 |
0.55162895 | 1.19856310 | -0.02039544 |
0.4701865 | -2.7636424 | -2.9830106 |
-2.5918447 | -0.2036097 | -3.2496313 |
0.6232768 | -2.3214133 | 1.2065576 |
0.4737156 | 1.6062763 | 2.9805615 |
-1.0376688 | 0.6133499 | 1.6375301 |
-0.0210167 | -2.9825148 | 3.3409143 |
-0.3564735 | 1.0179468 | -2.2299888 |
2.0025629 | 1.4091123 | -0.4767846 |
What does sapply do?
sapply(my.list,function(x) x*2)
-3.76831618 | -0.18271587 | -0.09866272 |
0.55162895 | 1.19856310 | -0.02039544 |
0.4701865 | -2.7636424 | -2.9830106 |
-2.5918447 | -0.2036097 | -3.2496313 |
0.6232768 | -2.3214133 | 1.2065576 |
0.4737156 | 1.6062763 | 2.9805615 |
-1.0376688 | 0.6133499 | 1.6375301 |
-0.0210167 | -2.9825148 | 3.3409143 |
-0.3564735 | 1.0179468 | -2.2299888 |
2.0025629 | 1.4091123 | -0.4767846 |
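Same thing! Well, what we see here is that R is sometimes smarter than us. sapply returns a list here because the two results have different lengths (10 and 30 elements), so they cannot be simplified into a single vector or matrix. Let’s try something else: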
another.list<-list(1:5)
mult.list<-sapply(another.list,function(x) x*2)
print(mult.list)
[,1]
[1,] 2
[2,] 4
[3,] 6
[4,] 8
[5,] 10
Now what? sapply gave us a matrix.
dim(mult.list)
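sapply simplifies results of equal length by binding them together as the columns of a matrix; here there is a single result of length 5, so we get a 5 x 1 matrix. The take-away here is to be careful with data types, and try to give R the right one. If you want a vector returned, give sapply a vector, not a list!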
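The different shapes sapply can return are easier to compare side by side. The sketch below uses a helper we call double_it (a made-up name) and also shows two standard ways to get a predictable result: the simplify argument of sapply, and vapply, which takes a template for the expected return value.

```r
double_it <- function(x) x * 2  # hypothetical helper for this sketch

res_vec  <- sapply(1:5, double_it)             # five length-1 results: a vector
res_mat  <- sapply(list(1:5), double_it)       # one length-5 result: a 5 x 1 matrix
res_list <- sapply(list(1:5, 1:3), double_it)  # unequal lengths: stays a list

is.vector(res_vec)  # TRUE
dim(res_mat)        # 5 1
is.list(res_list)   # TRUE

# Turn simplification off to always get a list, like lapply
res_nosimp <- sapply(list(1:5), double_it, simplify = FALSE)
is.list(res_nosimp)

# vapply checks that every result is a single numeric value
res_checked <- vapply(1:5, double_it, numeric(1))
identical(res_checked, res_vec)
```

When the shape of the output matters for later code, lapply or vapply are the safer choices, because their return type does not depend on the data.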
If you mix character or logical types, you can get unintended results:
my.list<-list(1:5,TRUE,TRUE)
mult.list<-sapply(my.list,function(x) x*2)
print(mult.list)
[[1]]
[1] 2 4 6 8 10
[[2]]
[1] 2
[[3]]
[1] 2
R interpreted TRUE as a numeric 1.
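This coercion of logical values to numbers is general R behaviour, not something specific to sapply. A short sketch of the rules that matter here:

```r
# Logical values act as 1 (TRUE) and 0 (FALSE) in arithmetic
TRUE * 2                    # 2
sum(c(TRUE, FALSE, TRUE))   # 2, i.e. the number of TRUE values

# Atomic vectors hold one type, so c() coerces to the most general one
class(c(1, TRUE))   # "numeric": TRUE became 1
class(c(1, "a"))    # "character": 1 became "1"

# A list keeps each element's own type
mixed <- list(1, "a", TRUE)
sapply(mixed, class)
```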
my.list<-list(1:5,"a","b")
mult.list<-sapply(my.list,function(x) x*2)
print(mult.list)
Error in x * 2: non-numeric argument to binary operator
[,1]
[1,] 2
[2,] 4
[3,] 6
[4,] 8
[5,] 10
my.list<-list(1:5,"1","2")
mult.list<-sapply(my.list,function(x) x*2)
print(mult.list)
Error in x * 2: non-numeric argument to binary operator
[,1]
[1,] 2
[2,] 4
[3,] 6
[4,] 8
[5,] 10
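In these last two examples, R raises an error because we have tried to multiply a character value by 2. Note that "1" and "2" are character strings, not numbers, even though they look numeric. Because the assignment failed, mult.list keeps the value it had from an earlier cell, which is what print displays after the error.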
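There are a couple of ways to avoid this error, depending on what we want. The sketch below shows two options under our own assumptions: either apply the function only to the numeric elements, or convert each element with as.numeric first. The names mixed, is_num, doubled_numeric and converted are made up for this illustration.

```r
mixed <- list(1:5, "1", "2")

# Option 1: find the numeric elements and only operate on those
is_num <- sapply(mixed, is.numeric)   # TRUE FALSE FALSE
doubled_numeric <- lapply(mixed[is_num], function(x) x * 2)
length(doubled_numeric)               # 1: only the first element was kept

# Option 2: convert to numeric before multiplying.
# This works for strings like "1", but a string like "a" would
# become NA (with a warning) rather than raising an error.
converted <- lapply(mixed, function(x) as.numeric(x) * 2)
converted[[2]]   # 2
converted[[3]]   # 4
```

Both options use lapply, so the result is always a list regardless of the lengths of the individual results.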