STAT 580: Statistical Computing - I.
Class meets: TTh 11:00 am -- 12:30 pm in: Mol Biol 1428
Instructor: Ranjan Maitra (Ron-joan Moi-tro)
- Office: 123 Snedecor Hall
- Phone: (515)-294-7757
- E-mail: maitra
- Office Hours: TTh 12:40-1:30 pm, or by appointment
Course Prerequisites:
- Stat 579
- Stat 447 or Stat 542
Grading Scheme:
Course Syllabus: Introduction to scientific computing for
statistics using tools and concepts in R: programming tools, modern
programming methodologies, modularization, design of
statistical algorithms. Introduction to C programming for efficiency;
interfacing R with C. Building statistical libraries. Use of
algorithms in modern subroutine packages, such as LAPACK, QUADPACK,
MINPACK for computational linear algebra, optimization and
integration. Implementation of simulation methods: inversion of the
probability integral transform, rejection sampling, importance
sampling. Monte Carlo integration.
Course Description: This course is designed for Ph.D.-level and
advanced Master's-level students. In this course, we will study the
tools needed for scientific computing in statistics. We shall do this
by developing the theory and methodology of simulation and estimation
methods. Since an important part of developing numerical solutions is
mastering how computer hardware and software work, a major emphasis of
the class will be on programming concepts and methods.
Textbook: Because all the material is spread out over
three to four books, there are no required textbooks for this
class. However, the following books are highly recommended reading
material:
- Stochastic Simulation by Brian D. Ripley, John
Wiley and Sons, 1987.
- Numerical Issues in Statistical Computing for the Social
Scientist by Micah Altman, Jeff Gill and Michael
P. McDonald, John Wiley, 2004.
- The C Programming Language (2nd Edition) by Brian
W. Kernighan and Dennis M. Ritchie, Prentice-Hall, 1988.
- Modern Applied Statistics with S by W. N. Venables and
B. D. Ripley, Springer-Verlag, 2002.
Statistical Software:
The statistical software used throughout this class will be R, a
comprehensive statistical software package freely available from
http://www.R-project.org/.
R is developed by an international team of researchers and is
distributed free of charge under the GNU General Public License. It is
very similar to, though not the same software as, the commercially
available Splus, and most Splus commands work in R. All lab machines
running Windows and Linux have R installed. Since the software is
freely available, you may also download it from the above web site and
use it on your home computer, in either the Windows or the Unix/Linux
version. Please note that you install R at your own risk, though the
department systems administrators may be able to help. You may not use
Splus in lieu of R in this class.
Computer Programming:
The low-level programming language taught and used in this class will
be C. C is a programming language developed by Dennis M. Ritchie in
the early 1970s at AT&T's Bell Laboratories. We will be using the
language as officially standardized by the American National Standards
Institute in 1989; hence we will be using ANSI C (also called C89,
for short). Note that in 1990, the International Organization for
Standardization (ISO) adopted ANSI's definition of C as the
international standard, whereupon the language became known as
ANSI/ISO C. Further changes and extensions to the C language
continued to be made, in response to rapid changes in computer
hardware and software. A new standard for the C
language was announced in 1999, known as ISO/IEC 9899:1999, or C99 for
short. Compilers for C99 are beginning to become available, and the
GNU compilers support C99 from release 3.0 onwards. A history of the
development of the C Language is provided by Dennis M. Ritchie at
http://cm.bell-labs.com/cm/cs/who/dmr/chist.html.
Computer Operating System:
The operating system used in this class will be Linux.
In layman's terms, an operating system is a program that
supervises the working of a computer. (Other popular operating systems
are Microsoft Windows, Mac OS and Unix.) Like R, Linux is free: it is
a Unix-type operating system originally created by
Linus Torvalds with the assistance of developers around the
world. Developed under the GNU General Public License (GPL), the
source code for Linux is freely available to everyone. More
information on the Linux operating system is available at
www.linux.org/info/index.html.
Like Unix, Linux provides flexibility and is very useful for
scientific computing. This is because one can optimize the system's
resources and gear them towards the computing that is of interest to us.
Because the source code for Linux is freely available, it can be (and
is) packaged in several forms. Each of these is called a distribution:
examples are Fedora Core (sponsored by Red Hat), SuSE (now owned by
Novell), Mandrake, etc. Iowa State University has site licenses for
the Red Hat Enterprise Linux operating system. You may use this, or
any other Linux distribution. The statistics department labs currently
dual-boot into Microsoft Windows and Fedora Core 1. There is also a
Linux lab at the University's Academic Information Technologies
building in Durham Hall. Feel free to install Linux at home. An added
advantage of using Linux is that you can log in from off-campus and
run your programs on a department machine, using X terminal emulators.
You will be taught how to use Linux in the department lab.
Homeworks: Homeworks will be handed out every two weeks. These
will mostly consist of applying and exploring the concepts learnt in class.
Parts of the homeworks will involve theoretical derivations. A
considerable part of the homework will involve computer programming.
Please note that your programs should be e-mailed to me, and
should be annotated. Your grade on a program will depend on
the output of the program.
Course Homepage: The course homepage will be located on
the WWW at
http://www.public.iastate.edu/~maitra/stat580/spring2005.html.
I will try to keep this homepage as up to date as possible. However,
you are still responsible for any announcements made in class.