Courses from Stanford University – Astrostatistics and Astroinformatics Portal

Elementary courses: Introduction to Statistical Inference; Statistical Methods in Engineering and the Physical Sciences; Data Mining and Analysis. Advanced courses: Modern Applied Statistics – Learning; Modern Applied Statistics – Data Mining; Paradigms for Computing with Data

StatLearning: This is an introductory-level course in supervised learning, with a focus on regression and classification methods. The syllabus includes: linear and polynomial regression, logistic regression and linear discriminant analysis; cross-validation and the bootstrap, model selection and regularization methods (ridge and lasso); nonlinear models, splines and generalized additive models; tree-based methods, random forests and boosting; support-vector machines. Some unsupervised learning methods are discussed: principal components and clustering (k-means and hierarchical). This is not a math-heavy class, and computing is done in R. The lectures cover all the material in An Introduction to Statistical Learning, with Applications in R by James, Witten, Hastie and Tibshirani (Springer, 2013).
Modern Applied Statistics: Learning (Stanford): Graduate-level course taught by Stanford Professor Trevor Hastie. New techniques have emerged for both predictive and descriptive learning that were not possible ten years ago, using ideas that bridge the gap between statistics, computer science and artificial intelligence. In this two-part series we cover many of these new methods, with emphasis on the statistical aspects of their application and their integration with more standard statistical methodology. Predictive learning refers to estimating models from data with the specific goal of predicting future outcomes, in particular regression and classification models. Regression topics include linear regression with recent advances to deal with large numbers of variables, smoothing techniques, additive models, and local regression. Classification topics include discriminant analysis, logistic regression, support vector machines, generalized additive models, naive Bayes, mixture models and nearest neighbor methods.
Modern Applied Statistics: Data Mining (Stanford): Graduate-level course from Stanford University. This is the second course in a two-part sequence. This course covers new techniques for predictive and descriptive learning using ideas that bridge gaps among statistics, computer science, and artificial intelligence. Predictive learning refers to estimating models from data with the goal of predicting future outcomes, in particular, regression and classification models. Descriptive learning is used to discover general patterns and relationships in data without a predictive goal, viewed from a statistical perspective as computer-automated exploratory analysis of large, complex data sets.
Paradigms for Computing with Data (Stanford): Instructors: John Chambers and B. Narasimhan (Stanford). This course provides a practical introduction to modern techniques for computing with data, teaching advanced use of the R system and exploring connections to other environments such as C, Python, Java, and databases. Students learn and practice the use of R for serious applications. The course includes hands-on practice with all the paradigms, including parallel, cluster, and map/reduce-style computation. The final project will be to create an R package on a topic you propose and we approve.
Introduction to Statistical Inference (Stanford): In this course, students will learn about modern statistical concepts and procedures derived from a mathematical framework. We will discuss statistical inference, decision theory, point and interval estimation, tests of hypotheses, and Neyman-Pearson theory. Topics include: Bayesian analysis, maximum likelihood, and large sample theory.
Statistical Methods in Engineering and the Physical Sciences (Stanford): Elementary course. Statistics is the science that deals with gathering, classifying, analyzing, and interpreting data. Statistics helps us turn data into information to see the relationship between variables, or the “big picture.” This course is an introduction to statistics with an emphasis on modern engineering applications. Students explore concepts of probability theory, discrete and continuous random variables, bivariate probability distributions, categorical data analysis, and model building. Topics include: descriptive statistics, probability, interval estimation, tests of hypotheses, nonparametric methods, linear regression, analysis of variance, and experimental design.
Data Mining and Analysis: Elementary course. Topics include: decision trees, neural networks, association rules, clustering, case-based methods, and data visualization.