Important first note: All this material has a copyright holder (myself). Unless otherwise stated, all the material is under one of the Creative Commons licenses. These licenses give you a lot of freedom to use the material, modify it, etc. (read the details of the license for what exactly you can, and cannot, do). So feel free to use this material, but, please, give credit where credit is due. (To give you an example, most of the recent material is under an Attribution-ShareAlike 4.0 International (CC BY-SA 4.0) license that allows you to freely mix, modify, adapt, and even commercially use this material, but you should give proper attribution and share the modified material under the same license.)

This is a partial list of some classes I teach at UAM, with links to the PDFs or the original LaTeX (if the LaTeX files are not available here, you can ask me for them).

The PDFs (slides and/or class notes) are always available to our students from the institutional Moodle. But you can't access those unless you are from the UAM. Here I provide the PDFs or, even better, the original LaTeX or Rnw sources. (Eventually, I'll have public repos for all of this).

A master's level course that touches on a variety of topics, from HMMs to statistics to phylogenetic inference. The repository for the original LaTeX files. Unfortunately, this course was last taught in 2014-15, and is unlikely to be taught again in the near future. It is being replaced by two courses (which I'll be co-teaching, and which will get their entries here eventually).

An intro course to statistics for master's students of the Molecular Biosciences Master. We use R to teach some basic stats (two-sample comparisons and a little bit of linear models). This is the repository for the Rnw files (LaTeX with R) and all data and scripts to produce the PDF.

A course for biochemistry and biology bachelor's degree students. We cover Python (taught by Luis del Peso) and R, and we used to cover a few things about the shell and essential utilities such as sed, grep, etc. This is the GitHub repo with the material (data sets and complete Rnw and scripts to generate the PDFs) for the R part.

A course for biochemistry bachelor's degree students. This is a course taught by several teachers, and I cover phylogenetic inference and statistics for omics. (The PDFs provided are a bundle of three or more files; use your PDF viewer index toolbar or similar to see the full index.) The material for Statistics for omics is very similar to that in the repo for Stats-bioinfo-intro (full LaTeX/Rnw, etc, sources).

(This is mainly for nostalgic and historical reference) I started using S-PLUS in 1996. Through the email list of the Stats department at the University of Wisconsin I got an email about a great offer to obtain a user license (and the floppies) for S-PLUS. I was planning a long field trip and wanted a reliable piece of stats software to analyze my lizard data. I went to the bookstore, looked through "Modern Applied Statistics with S-Plus", the classic by Venables and Ripley, and after 30 minutes I was hooked (the example of the linear regression with male and female cats, and how easy the subsetting and using different symbols was, led to love after that 30-minute skimming session). I purchased the book, the software, and spent a year and a half in Brazil enjoying S-Plus and reading the manuals (and rejoicing in thoughts such as "how hard would this have been in SAS or SPSS?"), in addition to chasing lizards.

In the fall of 1998 I took a linear models class (Stats 849?) from Douglas Bates, an R-core member. He introduced us to R, then in version 0.62.3 (and also to ESS and Emacs). R being free software, and having been bitten by the free software virus shortly before that, I started using R as much as I could instead of S-Plus (there were things that were not available --- like trellis--- others that did not quite work well yet, lots for which I could easily rewrite my former S-Plus scripts, and a bunch for which I tried S-Plus, R, and sometimes SAS, such as mixed-effects models ---I was spending a lot of time with those by that time). By mid-2000 I was using R almost exclusively, and I submitted my first package to CRAN, PHYLOGR, around that time.

I kept using R in the first jobs I had in Spain, often to the surprise of my colleagues (some wondered why I insisted on using that unknown system instead of things such as SPSS or SAS ---both of which I had used quite heavily in the past and did not particularly want to use much again). When I got to CNIO in 2001, and I returned to doing science, I was pleasantly surprised to see that the Bioconductor project, which had started shortly before, was something that some of my colleagues in bioinformatics were positively curious about.

Since then, I've authored/co-authored several R and Bioconductor packages, some of which even got their academic papers (e.g., ADaCGH2, RJaCGH, varSelRF, etc.) and have used R as the basic computational engine behind several web-based bioinfo applications (some of which we also managed to publish, such as Pomelo II, now available from Pomelo's IIB site, or SignS, now available from its IIB home, etc.).

I continue using R as the language in which I do a large chunk of my overall programming and all of my statistical analysis/programming. I will use other things too (mainly C++ when I need the speed), and have been tempted by other languages (right now, the inevitable Julia and LuaJIT), but I think that R is here to stay for quite some time and I think that the combination of R and C++ (via Rcpp) is a relatively good one.

I think I taught my first R course around 2003 at CNIO and not long after that I taught another one at UAM (organized by the "Red temática del CSIC de Bioinformática"). I've covered everything from basic programming and basic stats to specific issues of parallelization or "omics" data analysis.

Most of the R-related courses I've taught recently fall into two categories:

- General intro R programming courses. For example, the material I use for the "Herramientas de programación" bachelor's degree course, or the annual one-day course I teach for my friends at CNIO. This material is available from the R-bioinfo-intro GitHub repo.
- Intro statistics with R. For example, the material for the "Applied statistics course" (available from the github BM-1 repo), or the very similar (except it is all command-line oriented, so no Rcommander or other GUIs) material from the R-basic-stats github repo.

I've taught other courses. This is a partial list with corresponding material. Note that some of the material is in Spanish.

- Together with Juan Ramon Gonzalez, at the CREAL, Barcelona, I have taught a couple of courses on advanced R programming during 2010 (one in May and one in December). Here is the pdf for the part I taught, Debugging and parallelization.
- Class notes for a former (2009) crash R course, part of BM-1. These notes are based on the original ones by S. Ellner and B. Bolker. The Rnw file (Sweave) (plus the bib, png, sty, etc, etc) that generate the pdf, R code, etc, are available here; this should work in any reasonable LaTeX installation, at least under Linux/Unix (use the script "make.course.sh").
- I taught a couple of long R courses at Azti-Tecnalia, in July and October 2008. Here are the slides. They cover everything from R programming to linear and generalized linear models.
- I had used similar material already in an R course in Santander, at the Oceanographical Institute, in 2006. Here are the slides.
- Older stuff is probably too old to be of even nostalgic interest.

This is an incomplete list (in more or less reverse chronological order) of some other talks, etc.

- Seminars, at CNIO in May 2014, and a few months later at IIB, about inferring restrictions in the order of accumulation of mutations during tumor progression.
- Using R for the parallelized analysis of big genomic data, a talk at UseR 2013, Albacete, Spain, July 2013.
- Using parallel computing and data structures on disk to analyze genomic data with R, a talk at the Computing & Statistics, ERCIM 2012, meeting, Oviedo.
- PDF slides for Interpretación crítica de estudios de expresión diferencial y clasificación con datos ómicos taught at IdiPaz, November 2012. (Most of this material is now also incorporated, and improved, in the sources available from Stats-bioinfo-intro).
- The PDF of a talk about parallelization and web based applications with R in bioinformatics at the Facultad De Informática at Universidad Complutense de Madrid, May 2012 (this is in Spanish).
- Analysis of copy number variation in genomic DNA, a conference given at the Escuela de Ingenieros Industriales de Albacete, UCLM, June, 2011.
- Copy number variation in genomic DNA: recurrent regions and integration of expression and phenotypic data, a PhD course taught at the Politecnico di Milano, May 2009.
- Presentacion (pdf) del Master Biofisica, UAM, Madrid, 6-11-2008 to 13-11-2008.
- Curso de doctorado Genomica Funcional: tecnicas de analisis masivos de datos, Madrid, 17-01-2008.
- Developing web-based and parallelized biostatistics/bioinformatics applications: ADaCGH as a case example. I gave this talk at the Statistical Computing 2007, Schloss (Germany) workshop, and a very similar one at the Statistics for Biomolecular Data Integration and Modeling, 2007, Ascona (Switzerland) and the ERCIM WG on Computing and Statistics kick-off meeting, 2007, Geneva (Switzerland).
- Analysis of aCGH data: statistical models and computational challenges, a talk at the Computational Biology Group Meeting at CNIO.
- Introduction to biostatistics: differentially expressed genes and classification/prediction of disease status/prognosis from microarray data, a class given at the Molecular Oncology Master, 2007, CNIO, Madrid, Spain.
- Asterias: an example of using R in a web-based bioinformatics suite of tools, a mini-talk given at useR 2006.
- An applied statistician in microarray and genomics worlds: Review and reflections, invited talk at the X Conferencia Española de Biometría, Oviedo, Spain, 26-05-2005.
- Variable selection from random forests: application to gene expression data, talk at the 5th Annual Spanish Bioinformatics Conference, Barcelona, Spain, December 2004.
- Plaid models et al. (beware, a 7.7 MB download!), a talk about biclustering given in January 2004 at the System's Biology seminar at CNIO. (Includes discussion of Plaid, COSA ---Clustering on subsets of attributes---, xMotif, SAMBA, Cheng and Church, PRM, and a few others).
- Searching for prognostic factors: review and admonitions, given in November 2003 at CNIO as part of the II Ciclo Curso Patologia Molecular. An updated version with similar focus is the talk I gave in July 2004 as part of the Master en Oncologia Molecular, at CNIO, and the Bioinformatics Master, and I presented a similar talk in Barcelona, at the Institute of Biomedical Research (Parque Cientifico de Barcelona).
- Molecular signatures talk at the 2003 CNIO retreat. A similar talk was given at the October 2003 ESF Workshop at CNIO. A similar (but much shorter) talk was given at the Wye workshop on statistical analysis of gene expression data (if that link does not work, try this older one).
- Use of GO Terms to Understand the Biological Significance of Microarray Differential Gene Expression Data, with J. Dopazo, a presentation given in November 2002 at CAMDA 02.

Ramon Diaz-Uriarte

Last modified: March 2015