- EE 488 Parallel Computer Architecture (Fall'19)
- EE 817 Advanced Flash Storage and File Systems (Spring'19)
- IIT 6044 Advanced Operating Systems (Fall'18)
- IIT 3002 Operating Systems (Fall'18)
- IIT 1001 Advanced Programming (Spring'18)
- IIT 6047 Computer Architecture (Spring'18)
- IIT 8015 Memory Architecture and Storage Systems (Spring'17)
- IIT 1001 Advanced Programming (Spring'17)
- IIT 1014 Gateway to Integrated Technology (Spring'17)
- IIT 6036 Computer Organization and Design (Fall'16)
- IIT 3002 Operating Systems (Fall'16)
- IIT 7024 Advanced System Architecture (Spring'16)
- IIT 1001 Advanced Programming (Spring'16)
- IIT 6036 Computer Organization and Design (Fall'15)
- IIT 3002 Operating Systems (Fall'15)
- EE/CE 4304 Computer Architecture (Spring'15)
- EEDG 6302 Microprocessor Systems (Spring'15)
- EE/CE/CS 6304 Computer Architecture (Fall'14)
- EEDG 6302 Microprocessor Systems (Spring'14)
- EE/CE/CS 6304 Computer Architecture (Fall'13)
SYLLABUS AND COURSE INFORMATION
This course will mainly introduce computer organization and design, including the following topics: i) instruction-level parallelism, including parallel processing, superscalar, VLIW, static and dynamic instruction scheduling, and precise exception handling, ii) memory-level parallelism, iii) data-level parallelism, including multi-core architecture and GPUs, iv) thread-level parallelism, and v) NVM-level parallelism. This course is project-centric; we will have five gem5 lab projects. Most projects are step-by-step tutorials that teach you how to perform simulation-based architectural explorations and studies. They will include CPU design analysis, exploring different branch predictors, multi-threading on full-system-mode evaluations, and SSD internal parallelism analysis on gem5. For undergraduate students, this course will also include quick review lectures covering instruction set architecture, MIPS/RISC architecture, pipelining, hazards, and cache architecture.
|#1||Overviews [ pdf ] [ ppsx ]||Logistics|
|#2||Quick Review: Instruction Set Architecture [ pdf ]||ISA, MIPS, RISC and RISC-V|
|#3||Quick Review: MIPS Architecture [ pdf ]||RTL, Combination Logic, Single Cycle Datapath, ALU Control Unit|
|#4||Quick Review: Multi-cycle Machine and Pipelining [ pdf ]||Multi-cycle machine, FSM, Pipeline Design and Pipeline Paradox|
|#5||Quick Review: Hazards [ pdf ]||Dependencies and Hazards, Data Forwarding, Pipeline Integration|
|#6||Instruction-Level Parallel Processing [ pdf ]||Superscalar, Deep pipelining, VLIW, Tracing, Loop Unrolling|
|#7||Static Scheduling [ pdf ]||Compiler Optimization, Register Renaming, Software Pipelining|
|#8||Dynamic Scheduling (Scoreboard) [ pdf ]||Out-of-Order Execution, Scoreboard Algorithm|
|#9||Tomasulo Scheduling [ pdf ]||Reservation Stations, Decentralized Buffers, Tomasulo Algorithm|
|#10||Reorder Buffer [ pdf ]||Register Renaming, Alias Table, Advanced RoB Processors|
|#11||Precise Exception [ pdf ]||Exception Handling, Speculative Executions, LSQ, Spectre/Meltdown|
|#12||Branch Prediction [ pdf ]||Branch History Tables, Correlated Prediction, Tournament, RAS|
|#13||Cache [ pdf ]||$ Basics, Practical/Approximate LRUs, Uncacheable Speculative WC|
|#14||Virtual Memory [ pdf ]||Virtual Memory, Page Tables, Multi-level Paging, MMU, TLB|
|#15||Volatile Memory (DRAM) [ pdf Δ ]||MAT, Bank/Rank, Channel, EDO, Burst EDO, DDR, LRDIMM, FBDIMM|
|#16||Volatile Memory (Advanced) [ pdf Δ ]||Xn Prefetching, Clock Skew, Single-ended, Reflection|
|#17||Volatile Memory (Advanced) [ pdf Δ ]||Scheduling, Multi-channel, Bank-level Parallelism, LPDDR, GDDR|
|#18||Non-Volatile Memory (Practical Approach) [ pdf Δ ]||PMEM, Persistence Support, DAX, FSDAX, DEVDAX, New Instructions|
|#19||Multicore with Coherence/Consistency [ pdf Δ ]||MESI, MSI, MOESI, MESID, Directory, Snooping, Multi-level $ issues|
The purpose of this course is to teach the general concepts and principles behind operating systems. The topics we will cover in this class include i) kernel and process abstractions and programming, ii) scheduling and synchronization, iii) memory management and address translation, iv) caching and virtual memory, v) file systems, storage devices, files and reliability, and vi) full and para-virtualization. In addition to these lectures, we will also have term projects, which use an operating system simulator/emulator built for educational purposes. In these projects, we expect that you will not only learn Linux practices but also make great strides in the study of operating systems design and implementation. The projects are in C/C++, rather than Java or Python; we believe this will give students a more realistic experience with operating systems. In this class, all homework is an individual assignment, whereas the projects are group assignments. Since it is typically difficult to figure out the contribution each member committed, submission for these projects will be done through a git repository per group (e.g., Bitbucket), and the TA will check all push and pull transactions to grade a team. Note that, in this sense, intermediate pushes will help the TA see how students are progressing. Some project teams may be two persons, some may be three; the number of students per team will be decided based on the final registration information and will be announced in class.
|#3||Hardware management||OS history, architecture and hardware supports, I/O ports, memory-mapped I/O, DMA, typical memory layout and a bootup sequence example|
|#4||Processes||Multiprogramming, execution stack, address spaces, context switch, process creation, inter-process communication|
|#5||Threads||Light-weight processes, thread state, lifecycle of a thread, dispatch loop, events, interrupts, thread execution|
|#6||Concurrency management||UNIX process management, process tree, fork(), exec(), pthread, join(), OpenMP|
|#7||Synchronization||A high level view of parallelism and synchronization|
|#8||Atomicity||Race conditions, critical sections, mutexes, instruction-level atomicity, spin lock|
|#9||Deadlock||Bounded buffer problem, read/write lock, semaphore, condition variable, monitor, circular waiting, and deadlock avoidance|
|#10||CPU scheduler||Processor behavior analysis, scheduling architecture, FCFS, SJF, STCF, RR and CPU burst prediction|
|#11||Advanced CPU scheduler||Priority scheduling basics, priority boost, EDF, MLQ, MLFQ, fair scheduling, lottery scheduling, stride scheduling, multiprocessor-aware scheduling, MQMS, process migration, CFS and red-black process tree|
|#12||Virtual memory||Segmentation, multi-segment model, segment translation, translation table, swapping, paging, sharing, multi-level translation, two-level page table and inverted page table|
|#13||Cache and TLB||Cache basics, direct mapped cache, set associative cache, fully-associative cache, address translation on caching, TLB, demand paging, page table entry and software-driven TLB|
|#14||Page replacement||Page faults, FIFO, MIN, LRU, Belady's anomaly, clock algorithm, n-chance approximated LRU and free-list|
|#15||Disk scheduling||Disk architecture and organization, interfaces, transferring data, caching, FCFS, SSTF, SCAN, C-SCAN, C-LOOK and device-level command-queueing|
|#16||Beyond disks||Block addressing, chunk sizing, RAID performance analysis, RAID-0, RAID-1, hybrid-RAID, RAID-4, RAID-5, RAID-6, flash, SSD, garbage collection, TRIM and wear-leveling|
|#17||File system basics||File system overview, MBR, partition, root file system, mount, virtual file system, file allocation table and file metadata, FAT analysis|
|#18||inode, block, and block group||inode, inode block pointers, link, ext optimizations for many small files, fast file system, ext and ext2|
|#19||Journaling||Consistency and reliability, file system checker, write-ahead log, commits and checkpoints, crash recovery, meta-data journaling|
|#20||Log-structure||Extents and B-Trees, log-structured file system, buffering writes, garbage collection and copy-on-write|
|#21||Full virtualization||VMM organization and functions, guest, virtual machine hardware, protected mode, privileged instructions, binary translation, caching translated code and shadow page tables|
|#22||Paravirtualization||Hardware support for VM, virtualization performance analysis, AMD-V and VT-x, second level address translation, Xen, hypercalls, virtual devices|
Operating System Concepts, 9th Edition, Silberschatz, Galvin, Gagne
Operating Systems: Three Easy Pieces, Remzi and Andrea Arpaci-Dusseau.
Free, PDFs available online
The Design and Implementation of the FreeBSD OS
The Practice of Programming
The Mythical Man-Month
- Project-0: Install Pintos and implement print_name event [ description ]
- Project-1: Threads, timer and priority scheduler [ description is available at YSCEC ] [ DESIGNDOC sample ]
- Project-2: System call implementation [ DESIGNDOC sample ]
- Project-3: Virtual memory implementation [ DESIGNDOC sample ]
Modern flash-based solid state disks (SSDs) can be plagued by enormous performance variations depending on whether the underlying architectural complexities and flash management overheads can be hidden or not. Designing a smart flash controller and storage system is key to hiding these architectural complexities and reducing the internal firmware overheads. In this course, we will first study the core components of SSD architecture and the key concepts behind flash firmware. The course then presents a set of novel storage optimizations, including various concurrency methods, I/O scheduling algorithms, and garbage collection avoidance mechanisms.
The topics we will cover in this class are as follows:
Lectures (in progress)
- Flash introduction -- NAND Flash Basics, Basics of floating gates and Reliability issues of NAND cells
- Fundamental address mapping (FTL) -- Simple mapping algorithm and physical block management
- Advanced address mapping (FTL) -- Associativity, garbage collection and wear leveling basics
- Garbage collection basic -- Foreground and background garbage collection techniques
- Flash-level controller [ pdf ] [ ppsx ] -- Flash memory transactions and advanced command control
- SSD architecture and system-level controller [ pdf ] [ ppsx ] -- SSD Architecture and I/O parallelism-centric design techniques
- Wear-leveling algorithms -- Reliability management functions and hot/cold data management
- Internal buffer management -- DRAM caching, flash aware replacement and buffering mechanisms
The lectures will also feature invited talks that cover a series of industrial topics, including non-volatile memory express (NVMe) architecture, device management, and distributed flash controllers. We will also provide a simulation framework for your project that can accelerate your understanding of storage systems.
This course is mainly designed to introduce the design and implementation of programming languages. From the design perspective, we will study language features for expressing algorithms. On the other hand, we will also study the basic concepts of the tools that map such language features onto modern computing hardware (such as compilers and interpreters). In this course, rather than harping on the features of a particular language, we will focus on fundamental concepts, the differences among programming languages, the reasons for those differences, and the implications of those differences for language implementation. The topics we will cover in this class are as follows:
- formal aspects of syntax and semantics
- naming, scoping, and binding
- scanning, parsing, semantic analysis, and code generation
- control flow, subroutines, exception handling, and concurrency
- type systems, composite types, data abstraction, and storage management
- imperative, functional, logic-based, and object-oriented programming paradigms
Students are advised to prepare by learning the basic concepts of programming and practicing in the language of their preference. We will assume that you either know the material that is supposed to be covered in these fundamentals, or that you are willing to learn the material as necessary.
Programming Language Pragmatics / Scott, Michael Lee / Elsevier/Morgan Kaufmann Publishers
Structure and Interpretation of Computer Programs / Abelson, Harold / MIT Press; McGraw-Hill
This course will mainly introduce computer organization and design, including the following topics:
- Instruction set design, illustrated by the MIPS instruction set architecture.
- Design of the datapath and control for a simple processor.
- CPU performance analysis and systems-level view of computer arithmetic.
- Parallelism, pipelining, hazard and dependency
- Cache and Memory design.
- Hierarchical memory.
- I/O subsystems, storage systems, I/O performance analysis.
Even though the class topics are related to architecture fundamentals, students are expected to have some hardware and computer science background. This course will include two or three simple projects, each leveraging a different style of simulation model built for educational purposes. One of the goals behind these projects is that students learn i) how to use full-system simulation software and ii) how to perform simulation-based architectural studies, which in turn can be a good steppingstone for your future research. The simulation frameworks build on most 32-bit and 64-bit flavors of UNIX and Windows NT-based operating systems, but we recommend modifying them on a UNIX-like system. The projects will be relatively simple (compared to what an advanced computer architecture course usually deals with), but students should be capable of freely analyzing/modifying software models written in C/C++.
We expect that you know C/C++ and data structures, have done some assembly language programming, and that you know about series and products, logarithms, advanced algebra, some calculus, and basic probability (means, standard deviations, etc.). We will assume that you either know the material that is supposed to be covered in those fundamental topics, or that you are willing to learn the material as necessary.