IMSE 8410 Advanced Computational Systems and Data Engineering
Material Copyright 2017-2019 by Timothy Middelkoop. Source code licensed under the Apache License, Version 2.0. Documentation licensed under CC BY-SA 3.0.
The objective of this course is to enable students to utilize advanced computational and data capabilities for research and industrial practice through 1) proper project, code, and data management techniques; 2) a wide range of research workflows to solve complex problems; 3) integration of optimization or other domain-specific software tools; and 4) parallel computing on High Performance Computing clusters.
This course will give students the skills, tools, and hands-on experience required to effectively utilize advanced computational and data capabilities for their research. Topics will include command line usage, batch submission on HPC systems, source code revision management systems, relational and non-relational databases, message and data structures, web application programming interfaces, computational engineering software, software development, test-driven development (TDD), scientific and engineering workflows, data management, experimental design, and the life-cycle of research projects. Tools include Git, Python, R, Julia, and SQLite, along with domain-specific tools such as CPLEX, Gurobi, and others.
- Welcome and personal introduction
- Motivation for the class (scientific and engineering workflows)
- Class structure and outline of topics
- Careers in Research Computing
- Expectations and syllabus overview
- Accessing and using course resources (Hands On)
- Discussion
This course is about
- reproducible science at scale, and
- shareable and sustainable development.
We will utilize the "One, Two, Three, Go!" method.
Remember that
- "research is hard";
- to "do it right" and to "show your work";
- and that "Training is a process, not happening" - Earl Farrell.
At the end of this course you should be able to
- reproduce and share your research,
- collaborate with others on your research,
- scale your research on high performance computing systems,
- develop in a secure and sustainable way, and
- produce correct results.
- Canvas at https://missouri.instructure.com
- Syllabus
- Announcements
- Assignment submission
- Grades
- Discussion
- Slides and Video
- Zoom
- MU GitLab at https://gitlab.missouri.edu (requires on-campus access or VPN)
- Course material (https://gitlab.missouri.edu/middelkoopt/RC-2020-Spring)
- Student work
- MU Research Computing Support Services teaching cluster (Clark)
- HPC Cluster for computation and data storage
- Scientific Notebooks (Jupyter)
- Secure Shell (ssh) access at clark.rnet.missouri.edu
- Web access at https://ondemand.rnet.missouri.edu (requires on-campus access or VPN)
- Campus VPN using https://anyconnect.missouri.edu
- Required for off-campus access to some resources
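As a minimal sketch of the Secure Shell access listed above, the following Python snippet builds the command line for logging in to the teaching cluster. The host name comes from the resource list; the function name `ssh_command` and the placeholder `your_username` are hypothetical, and which account name the cluster expects is an assumption to confirm with the course staff.

```python
import shlex

# Host name taken from the course resource list.
CLUSTER_HOST = "clark.rnet.missouri.edu"


def ssh_command(username, host=CLUSTER_HOST):
    """Return the argument list for an interactive ssh login.

    `username` is a placeholder; the actual account name is an assumption.
    """
    return ["ssh", f"{username}@{host}"]


if __name__ == "__main__":
    # Print the command so it can be pasted into a terminal.
    print(shlex.join(ssh_command("your_username")))
```

Running the printed command in a terminal starts the login.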
- What does High Performance Computing mean to you?
- What motivates you to take this course?
- What about computational and data research would you like to learn?
- "High Performance Computing: Modern Systems and Practices" (Chapter 1, Introduction)
- National Strategic Computing Initiative Update: Pioneering the Future of Computing: https://www.nitrd.gov/pubs/National-Strategic-Computing-Initiative-Update-2019.pdf
- Software and HPC Carpentry: https://software-carpentry.org/ and https://hpc-carpentry.github.io/
- Introduction to Networking: http://intronetworks.cs.luc.edu/current/ComputerNetworks.pdf (Chapter 1)
- Pacific Research Platform (PRP): http://pacificresearchplatform.org/userdocs/start/get-access/