In a recent blog post, I wrote about the rather catastrophic (and totally avoidable) batch problems encountered by Lin et al. (2015) in their paper investigating whether the source organ has a greater influence on gene expression than the species.
Rafael Irizarry has written a nice piece on this specific topic (also focussing on batch problems in Lin et al.) as part of his very helpful simplystats blog. It makes many of the same points as my write-up, but is more statistical in nature, and includes a nice pointer to an online introductory linear modeling course from Harvard. Well worth reading, especially if you need a refresher on the math.
Also, do check out Rafael’s book, Bioinformatics and Computational Biology Solutions Using R and Bioconductor. This was a favorite at Stanford (I know this in several different ways!). A bit old (2005), but still very useful.