Apply Functions
tapply
The documentation for tapply is a bit more specific than for the other apply functions. Its arguments are (X, INDEX, FUN), where X is an object to which the split function can be applied (typically a vector), INDEX is a factor (or a list of factors) by which X is grouped, and FUN is the function to apply, as before.
Put simply, tapply splits X into groups according to INDEX and applies FUN to each group.
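As a minimal sketch of this grouping behavior, the snippet below uses small made-up vectors (`scores` and `groups` are hypothetical names, not part of the beer data) and computes the mean of each group:

```r
# Hypothetical toy data: six scores, each belonging to one of three groups
scores <- c(4, 5, 3, 4, 2, 5)
groups <- factor(c("a", "b", "a", "c", "b", "c"))

# X = scores, INDEX = groups, FUN = mean
# a -> (4 + 3) / 2, b -> (5 + 2) / 2, c -> (4 + 5) / 2
group_means <- tapply(scores, groups, mean)
print(group_means)
```

The result has one entry per level of `groups`, and each entry is named after its level.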
Examples
Using the reviews_sample csv file, show the three dates on which the mean score is a 5.
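library(data.table)
# Read the sample of beer reviews
myDF <- fread("/anvil/projects/tdm/data/beer/reviews_sample.csv")
# Mean score per date, sorted in increasing order; keep the three largest
tail(sort(tapply(myDF$score, myDF$date, mean, na.rm=TRUE)), n=3)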
2001-04-26 2001-06-18 2002-01-26 
         5          5          5 
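Sorting and keeping the last three values works when we already know how many dates reach a mean of 5. As an alternative sketch, the result of tapply can be filtered directly. The vectors `toy_score` and `toy_date` below are made-up stand-ins for `myDF$score` and `myDF$date`, not real data:

```r
# Hypothetical toy data standing in for myDF$score and myDF$date
toy_score <- c(5, 5, 4, 3, 5, 4.5)
toy_date  <- c("2001-04-26", "2001-04-26", "2001-05-01",
               "2001-05-01", "2001-06-18", "2001-06-18")

# Mean score for each date
toy_means <- tapply(toy_score, toy_date, mean, na.rm = TRUE)

# Keep only the dates whose mean score is exactly 5
perfect_dates <- toy_means[toy_means == 5]
print(names(perfect_dates))
```

With this approach, the number of matching dates does not need to be known in advance.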
Using the reviews_sample csv file, show a table displaying the mean score values for each month and year pair.
library(data.table)
myDF <- fread("/anvil/projects/tdm/data/beer/reviews_sample.csv")
# Make sure the date column is stored as a Date
myDF$date <- as.Date(myDF$date)
# Extract the year and the month of each review
years <- format(myDF$date, "%Y")
months <- format(myDF$date, "%m")
# Group the scores by (year, month) and take the mean of each group
mean_scores <- tapply(myDF$score, list(years, months), mean)
print(mean_scores)
           01       02       03       04       05       06       07       08
1998 3.770000 3.396667 4.092000 3.840000 3.702000 4.700000 3.100000 3.823333
1999       NA 3.613333       NA       NA 3.820000 3.850000 3.880000       NA
2000       NA 4.300000       NA 3.880000       NA 4.470000 3.995000       NA
2001 4.220000 4.488000 4.403333 3.053333       NA 4.012000 4.080000 3.905455
2002 4.246667 3.706000 3.933846 3.831224 3.887788 3.782655 3.950776 3.628201
2003 3.842596 3.921875 3.840573 3.929500 3.895977 3.768022 3.742609 3.710635
2004 3.892104 3.822910 3.757987 3.825360 3.826656 3.798576 3.816569 3.861793
2005 3.872065 3.805870 3.884944 3.806607 3.743355 3.859615 3.769045 3.784184
2006 3.821626 3.789613 3.803201 3.833529 3.816436 3.847766 3.799106 3.795228
2007 3.796619 3.820563 3.785231 3.820230 3.768441 3.721336 3.809563 3.710408
2008 3.897296 3.879322 3.825841 3.866337 3.819464 3.824667 3.833681 3.845346
2009 3.868856 3.839302 3.847518 3.846370 3.892921 3.872649 3.851616 3.850718
2010 3.810428 3.886246 3.884490 3.869777 3.838745 3.838772 3.806898 3.842232
2011 3.861355 3.839600 3.839057 3.841564 3.844314 3.840459 3.855617 3.809778
2012 3.827531 3.813721 3.842391 3.856536 3.843407 3.827998 3.843218 3.818722
2013 3.930060 3.922282 3.945560 3.949689 3.945230 3.883678 3.849965 3.847642
2014 3.894819 3.919469 3.923504 3.890891 3.886415 3.909815 3.872173 3.872730
2015 3.997097 3.996901 4.005002 3.991280 3.984110 3.979467 3.967579 3.963439
2016 3.986488 4.001558 3.987044 3.950565 3.970315 3.982854 3.987852 3.993576
2017 4.011244 4.036964 4.025383 4.010692 3.986720 3.978366 3.978893 3.998201
2018 4.025227 4.030995 4.013674 4.007635 3.999648 4.001002 3.948450 3.980969
           09       10       11       12
1998 3.355000 3.910000       NA 3.930000
1999       NA 3.500000 3.880000 4.000000
2000 3.885000 3.880000 4.670000 3.400000
2001 4.010556 3.948000 4.112069 3.851053
2002 3.798758 3.784247 3.885028 3.832537
2003 3.761452 3.771104 3.790879 3.802826
2004 3.802122 3.784444 3.741100 3.843094
2005 3.795644 3.782152 3.855852 3.860837
2006 3.826782 3.764831 3.802075 3.804746
2007 3.769330 3.826076 3.779580 3.834992
2008 3.824287 3.817620 3.841760 3.816298
2009 3.809730 3.862528 3.851910 3.860305
2010 3.844956 3.807355 3.844931 3.876926
2011 3.799865 3.854808 4.132859 4.013434
2012 3.826335 3.831577 3.869356 3.853065
2013 3.822392 3.860475 3.875164 3.885565
2014 3.885820 3.921727 3.933716 3.967668
2015 3.967281 3.967198 3.985564 3.993312
2016 4.005607 3.990623 4.007644 4.010456
2017 4.005115 4.002342 4.008391 4.045614
2018 3.992782       NA       NA       NA
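When INDEX is a list of two factors, tapply returns a matrix with one row per level of the first factor and one column per level of the second. Year and month pairs with no reviews show up as NA, as in the table above. The following is a small sketch of that behavior on made-up data (`toy_dates` and `toy_scores` are hypothetical, not taken from the file):

```r
# Hypothetical toy data: four reviews spread over two years
toy_dates  <- as.Date(c("2001-01-15", "2001-01-20", "2001-02-03", "2002-01-09"))
toy_scores <- c(4.0, 5.0, 3.0, 4.5)

# Extract year and month as character strings
toy_years  <- format(toy_dates, "%Y")
toy_months <- format(toy_dates, "%m")

# Rows are years, columns are months
toy_table <- tapply(toy_scores, list(toy_years, toy_months), mean)
print(toy_table)

# No review falls in February 2002, so that cell is NA
print(is.na(toy_table["2002", "02"]))
```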