12 classes on Tuesdays, April 30 to July 16, 2019
LMU Biozentrum, Room C00.013
13:00-15:30, 3 ECTS total
Taught by Martin Spacek
Class notes and files: https://github.com/SciPyCourse2019/notes
Description
Introduction to the Python programming language, with a focus on practical tools and techniques for scientific data analysis. Previous programming experience in a language such as Matlab or R is an asset, but not required. The course introduces several key Python libraries and provides example problems. Students are encouraged to bring their own data analysis problems to class, for immediate applicability to their work, culminating in a course project. Basic command line operations and code version control with Git will also be covered. Students are expected to bring their own laptop. A minimum level of attendance (9 of 12 classes) and participation is required, and a small amount of homework will be assigned.
This is course no. 19322 in the official course listing.
Class outline
- Python basics
- Python basics 2
- collections
- numpy 1D arrays
- numpy data types
- numpy file operations, plotting with matplotlib
- more matplotlib, matrices
- image analysis
- data analysis with Pandas
- statistics
- organizing code, data, results; version control with Git; work on project
- options:
  - review
  - dimension reduction & clustering
  - hierarchical indexing in pandas
  - work on project
Class project
Here are the class project guidelines.
Tutorials
These are all free, and require no signup or login:
Basic Python
- http://introtopython.org - is excellent!
- http://learnpython.org - has online editable and executable code
- http://interactivepython.org/runestone/static/thinkcspy/index.html - also has online editable and executable code, plus quizzes and videos
- The official Python tutorial is also quite good, but doesn’t really get started until section 3
IPython and Jupyter
- IPython is an improved Python interactive terminal. This is what we will mostly spend our time in: http://ipython.readthedocs.io/en/stable/interactive/tutorial.html
- Jupyter is IPython running in a web browser. It allows you to create “notebooks” that combine code, formatted text, and results: http://jupyter-notebook.readthedocs.io/en/latest/examples/Notebook/examples_index.html
Specific libraries
- Numpy is probably the most important data analysis library for Python: https://docs.scipy.org/doc/numpy/user/quickstart.html
- SciPy is another important library: https://docs.scipy.org/doc/scipy/reference/tutorial
- Matplotlib is the most popular plotting library: http://matplotlib.org/users/pyplot_tutorial.html
- Pandas is another useful Python library for data analysis, built on top of numpy, that we’ll introduce in the course. It has several tutorials: http://pandas.pydata.org/pandas-docs/stable/getting_started/tutorials.html