RISC-V

  • Algorithm: affects IC, possibly CPI
  • Programming language: affects IC, CPI
  • Compiler: affects IC, CPI
  • Instruction set architecture: affects IC, CPI, Tc
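For reference, these factors combine in the classic performance equation (standard result; IC = instruction count, CPI = cycles per instruction, Tc = clock cycle time):

```latex
\text{CPU time} = \text{IC} \times \text{CPI} \times T_c
```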
Read More

Search

  • Information retrieval is a field concerned with the structure, analysis, organization, storage, searching, and retrieval of information
Read More

Math

  • machine epsilon
• the relative residual is always small
    PageRank - solved
    
  • Newton’s Method
    • Newton’s method usually converges faster than the bisection method.
      # Assumes numpy is imported as np, and that r_init (starting point),
      # df (gradient), H (Hessian), and the tolerance `stop` are defined.
      r_newton = r_init
      r_arr = [r_newton]
      iteration_count_newton = 0
      while True:
          # Newton step: solve H(r) s = -grad f(r); np.linalg.solve is
          # cheaper and more stable than forming the explicit inverse.
          s = np.linalg.solve(H(r_newton), -df(r_newton))
          r_newton = r_newton + s
          r_arr.append(r_newton)
          iteration_count_newton += 1
          if np.linalg.norm(s) <= stop:
              break
      print(iteration_count_newton)
      
Read More

Ruby

  • Ruby is an interpreted, high-level, general-purpose programming language.
  • Ruby is dynamically typed and uses garbage collection.
  • It supports multiple programming paradigms, including procedural, object-oriented, and functional programming.
Read More

Architecture

  • RAM is an acronym for Random Access Memory. Random access means that memory contents can be accessed directly if you know their location.
  • Cache is a type of temporary memory that can be accessed faster than RAM.
  • SATA: how the hard disk interfaces with other system components.
    • Serial ports send data as a series of pulses along one or two data lines.
    • Parallel ports send data as a single pulse along at least eight data lines.
  • Booth’s algorithm
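For Booth's algorithm, a minimal Python sketch of signed multiplication (illustrative; the function name and word width are my choices, not from the notes). Each cycle inspects the low two bits of the product register to decide add/subtract, then performs an arithmetic right shift:

```python
def booth_multiply(m, r, bits=8):
    """Multiply two signed `bits`-wide integers with Booth's algorithm."""
    mask = (1 << (2 * bits + 1)) - 1           # register is 2*bits + 1 wide
    A = (m << (bits + 1)) & mask               # multiplicand, left-aligned
    S = ((-m) << (bits + 1)) & mask            # negated multiplicand
    P = ((r & ((1 << bits) - 1)) << 1) & mask  # multiplier with extra low bit
    for _ in range(bits):
        if P & 0b11 == 0b01:                   # 01: add the multiplicand
            P = (P + A) & mask
        elif P & 0b11 == 0b10:                 # 10: subtract the multiplicand
            P = (P + S) & mask
        sign = P & (1 << (2 * bits))           # arithmetic shift right by one,
        P = (P >> 1) | sign                    # preserving the sign bit
    P >>= 1                                    # drop the extra low bit
    if P >= 1 << (2 * bits - 1):               # reinterpret as signed
        P -= 1 << (2 * bits)
    return P

print(booth_multiply(3, -4, 4))   # -12
```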
Read More

Network

  • internetwork refers to a set of interconnected networks
  • router
    • interconnect networks
    • protocol conversion
    • routing (determine the output for each incoming data unit)
    • switching (transfer packets from inputs to outputs)
    • other functions
Read More

Java Algorithms

  • Master theorem for T(n) = aT(n/b) + f(n):
    1. If f(n) grows polynomially slower than n^(log_b(a)), then T(n) = Θ(n^(log_b(a))).
    2. If f(n) = Θ(n^(log_b(a)) * log(n)^d), then T(n) = Θ(n^(log_b(a)) * log(n)^(d+1)).
    3. If f(n) grows polynomially faster than n^(log_b(a)) (and satisfies the regularity condition), then T(n) = Θ(f(n)).
  • E.g., merge sort: T(n) = 2T(n/2) + n is case 2 with d = 0, giving Θ(n log(n)).
Read More

Function and Compiler

  • Programming Language Syntax (Ch. 2.1-2.3.3)
  • Logic Programming and Prolog (Ch. 12)
  • Scoping (Ch. 3.1-3.3)
  • Programming Language Semantics (Sc. Ch. 4.1-4.3)
  • Functional programming (Ch. 11)
  • Scheme and Haskell, map/fold questions
  • Lambda calculus (Ch. 11 Companion)
  • Data abstraction: Types (Ch. 7, 8)
  • Control abstraction: Parameter Passing (Ch. 9.1-9.3)
  • Object-oriented languages (10.1-10.2)
  • Concurrency (13.1). “What can go wrong?” questions
  • Dynamic languages (Ch. 14 optional)
Read More

Programming

Scope and Scoping

  • Run-time:
    • refers to the time when an application actually executes.
  • Compile-time:
    • everything before run-time, that is, compilation, linking, and loading.
  • A program's static structure is the structure of the source program: how it is organized.
  • The dynamic structure is the structure that evolves during run-time.
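A minimal Python sketch (illustrative, not from the notes) of the distinction: name resolution follows the static structure of the source, not the run-time call history.

```python
x = "global"

def outer():
    x = "outer"          # new binding in outer's static scope

    def inner():
        return x         # resolved lexically: refers to outer's x

    return inner()

print(outer())  # prints "outer": the binding comes from the source structure
```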
Read More

Image

  1. Introduction to Digital Imaging
    • Image Sensing and Acquisition
    • Describe the operational principles of CCD and CMOS sensors.
    • List the major differences between CCD and CMOS image sensors.
      • CCD: Charge-Coupled Device
        • Charge accumulates during exposure.
        • Collected charges are shifted out by horizontal and then vertical shift registers.
        • Each pixel's charge is converted to a voltage.
        • The voltage is amplified by an amplifier.
        • Blooming: the charge collected by a pixel leaks into other pixels. Electrons move more easily in the vertical direction, resulting in a vertical streak.
      • CMOS: Complementary Metal-Oxide-Semiconductor
        • Each pixel has its own charge-to-voltage conversion, amplifier, and digitization circuits -> no blooming, faster operation.
        • Digital values are read out in a line-by-line fashion -> rolling-shutter artifact.
        • CMOS circuitry dissipates less power.

      | CCD | CMOS |
      |-----|------|
      | Broadcast cameras (still industry standard for TV) | Phone, Web, and digital SLR cameras |
      | Better image quality | Cheaper and faster (data transfer) |
      | Susceptible to the blooming effect | Susceptible to the rolling-shutter effect |
    • Know how colour images are captured by CCD and CMOS image sensors.
      • CCD: a prism splits the incoming light onto 3 CCDs capturing red, green, and blue.
      • CMOS: colour filters arranged on a square grid of photosensors (a Bayer pattern): 50% green, 25% red, 25% blue.
        • Separate the channels -> interpolate each channel -> combine the three channels (see the demosaicing sketch below).
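As a rough sketch of that interpolate-and-combine step, here is a minimal bilinear demosaic for an RGGB Bayer mosaic (illustrative only; the kernels are the standard bilinear ones, but the function itself is my sketch and assumes NumPy and SciPy):

```python
import numpy as np
from scipy.ndimage import convolve

def demosaic_bilinear(raw):
    """Bilinear demosaic of an RGGB Bayer mosaic (toy sketch)."""
    r = np.zeros_like(raw, dtype=float)
    g = np.zeros_like(raw, dtype=float)
    b = np.zeros_like(raw, dtype=float)
    # RGGB layout: R at (even, even), G at (even, odd) and (odd, even), B at (odd, odd)
    r[0::2, 0::2] = raw[0::2, 0::2]
    g[0::2, 1::2] = raw[0::2, 1::2]
    g[1::2, 0::2] = raw[1::2, 0::2]
    b[1::2, 1::2] = raw[1::2, 1::2]
    # Bilinear kernels: every missing sample becomes an average of its neighbours
    k_g  = np.array([[0, 1, 0], [1, 4, 1], [0, 1, 0]]) / 4.0
    k_rb = np.array([[1, 2, 1], [2, 4, 2], [1, 2, 1]]) / 4.0
    # Interpolate each channel, then stack the three channels into an RGB image
    return np.dstack([convolve(r, k_rb), convolve(g, k_g), convolve(b, k_rb)])
```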
Read More

OS

  • Virtual memory breaks down if the working set is too large (thrashing), or if there are too many large processes.
  • DMA allows I/O devices to bypass the CPU. The CPU is usually the master, initiating an operation; the device interrupts when the operation has completed.
Read More

AI

AI

  • Acting humanly: Turing test
  • Acting rationally: a rational agent does the right thing -> acts so as to maximise expected goal achievement.
  • Interpreters vs. Compilers
    • Interpreters
      • Translate the program one statement at a time.
      • Take less time to analyse the source code; however, the overall execution time is slower than with compilers.
      • Generate no intermediate object code, and hence are memory-efficient.
    • Compilers
      • Scan the entire program and translate it as a whole into machine code.
      • Take more time to analyse the source code; however, the overall execution time is faster than with interpreters.
      • Generate intermediate object code which further requires linking, and hence require more memory.
Read More

Python Data

Data Science activities

  • Activities done in data science work, and how they typically occur
    • Problem formulation
    • Data acquisition
    • Data exploration
    • Analysis
    • Communication of results
    • Take action automatically
  • Data and metadata
    • Data meaning, Data structure (schema), Provenance
  • Ethics in data science work
Read More

Network

Exam

  • end-system
    • Protocol design, P2P, socket programming
    • Error detection, reliable data transfer, flow control, congestion control
    • TCP and UDP
  • network core: routing, hooking nets together
    • Network addressing, scalability, hierarchical addressing
    • Fragmentation as an example to deal with heterogeneous link layer technologies
    • Routing protocols and algorithms: link state, distance vector
  • link-level protocols, e.g., Ethernet
    • Addressing, ARP
    • Medium access control, especially random access
    • Interaction between link and network layers
  • other stuff: security, wireless networks
    • Symmetric key and public key cryptography
    • Confidentiality, message integrity, authentication
    • The role of encryption in these
Read More

Parallel


  • Review the typical performance models covered in this lecture
  • Pay attention to in-class examples as well as your homework. One question in Section 2 is from this lecture.
  • Karp-Flatt metric (*): the experimentally determined serial fraction e = (1/ψ − 1/p) / (1 − 1/p) for measured speedup ψ on p processors; speedup formula; isoefficiency formulas and examples
  • Memory hierarchies
  • Spatial and temporal locality
  • False sharing/true sharing: causes and how to fix them
  • Concept and usage of fork():
  • who is the child process? How is fork() used?
  • What are zombie processes? What situation causes them? What are the solutions, e.g., waitpid(), wait(), or terminating the parent process? Pay attention to the exercises on this in the lecture slides (see the fork()/waitpid() sketch after this list).
  • Signal communication between processes: signals are not queued
  • What are the differences between processes and threads?
  • Threads vs Processes: Mechanisms and Sharing
  • Pthreads basics
  • Variable space (sharing? Private? Etc.)
  • Unintended sharing
  • Thread detached mode
  • Understanding race condition and data race, and what may cause them to happen; what are the potential solutions (e.g., data race example in slides).
  • Enforcing mutual exclusion solutions: semaphores, mutex locks, condition variables, atomic operations (e.g., spinlocks)
  • Semaphores, mutexes, condition variables, and atomic operations: coding, mechanisms, and when to use which.
  • Questions on how to use semaphores: P (sem_wait) and V (sem_post) problems, binary semaphores, producer-consumer questions using semaphores (see the sketch after this list), deadlocks caused by semaphores.
  • Questions on deadlocks: what causes deadlocks? Examples that cause deadlocks. Four necessary conditions for triggering deadlocks.
  • Questions on condition variables: always used together with a mutex; condition-variable usage examples and the mechanism (how threads wake up and then how they grab the lock).
  • Questions on atomic operations: what are they? Why are they fast? When to use atomic operations vs. mutexes/semaphores/condition variables?
  • Understanding barrier
  • Understanding spinlocks and the conditions under which to use them (similar to atomic operations); how to use CAS (compare-and-swap) to implement spinlocks.
  • the cnt program
  • Quiz 2 review and other synchronization questions
  • Basic OpenMP directives, runtime APIs and environmental variables
  • OpenMP example with false sharing
  • MPI blocking send and receive
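For the fork()/zombie items above, a minimal Unix-only Python sketch (the names and prints are my own): the child is the process in which fork() returns 0, and the parent must reap it, e.g. with waitpid(), or the exited child lingers as a zombie.

```python
import os
import sys

pid = os.fork()                 # one process becomes two
if pid == 0:                    # fork() returned 0: this is the child
    print(f"child {os.getpid()} running")
    sys.exit(0)                 # child exits; it is a zombie until reaped
else:                           # parent: fork() returned the child's pid
    reaped, status = os.waitpid(pid, 0)   # reap the child -> no zombie
    print(f"parent reaped child {reaped}, status {status}")
```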
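For the semaphore questions, a minimal producer-consumer sketch with Python's threading.Semaphore, where P is acquire() (sem_wait) and V is release() (sem_post); the buffer size and item count are arbitrary choices:

```python
import threading
from collections import deque

BUF_SIZE = 4
buf = deque()
slots = threading.Semaphore(BUF_SIZE)   # counts empty slots
items = threading.Semaphore(0)          # counts filled slots
mutex = threading.Lock()                # protects the shared buffer

def producer():
    for i in range(10):
        slots.acquire()                 # P: wait for a free slot
        with mutex:
            buf.append(i)
        items.release()                 # V: signal one item available

def consumer():
    for _ in range(10):
        items.acquire()                 # P: wait for an item
        with mutex:
            v = buf.popleft()
        slots.release()                 # V: free one slot
        print("consumed", v)

t1 = threading.Thread(target=producer)
t2 = threading.Thread(target=consumer)
t1.start(); t2.start(); t1.join(); t2.join()
```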
Read More

Python Data

Value

  • Collection: getting the data
  • Engineering: storage and computational resources
  • Governance: overall management of data
  • Wrangling: data preprocessing, cleaning
  • Analysis: discovery (learning, visualisation, etc.)
  • Presentation: arguing that results are significant and useful
  • Operationalisation: putting the results to work
Read More

Data structure

Data Structures

  • The Array
  • The List
  • The Stack
  • The Queue
  • The Record

Read More

Distributed OS

  1. Organisation
  2. Distributed algorithms
    • In distributed algorithms, the problem is given by the network itself and solved by coordination protocols
  3. Graphs and spanning trees revisited
    • Tree edge: an edge that belongs to the spanning tree
    • Frond edge: an edge that does not belong to the spanning tree
  4. Basics
    • Synchronous models
      • nodes work totally synchronised – in lock-step
      • easiest to develop, but often unrealistic and less efficient
      • time model: every node processing step costs 1, all messages cost 0
      • or: node processing costs 0, every message costs 1
    • Asynchronous models
      • messages can take an arbitrary unbounded time to arrive
      • often unrealistic and sometimes impossible
      • time model: node processing costs 0, each message takes an arbitrary delay in [0, 1]
      • FIFO guarantee
      • async time complexity ≥ sync time complexity
    • Partially synchronous models
      • some time bounds or guarantees
      • more realistic, but most difficult
  5. Echo algorithm
  6. Echo algorithm revisited
    • sync, BFS spanning tree -> SyncBFS
      • Time Units = 3 ≤ 2D + 1
      • Messages = 10 = 2E
    • async, not a BFS spanning tree
      • Time Units on Broadcast = 1 ≤ D
      • Messages = 10
      • Time Units on Convergecast = 3 = V − 1
      • Total time is 4

```
let parent = null
let rec = 0
```
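As a rough illustration of the Echo algorithm's broadcast wave and convergecast, a toy Python simulation (the 4-node graph and FIFO delivery order are my assumptions, not from the notes); on this 4-edge graph it uses 2E = 8 messages:

```python
from collections import deque

graph = {1: [2, 3], 2: [1, 3, 4], 3: [1, 2], 4: [2]}   # assumed example graph
parent = {n: None for n in graph}
rec = {n: 0 for n in graph}                            # messages received so far
initiator = 1
msgs = deque((initiator, v) for v in graph[initiator]) # initiator starts the wave

while msgs:
    u, v = msgs.popleft()              # deliver message u -> v (FIFO)
    rec[v] += 1
    if v != initiator and parent[v] is None:
        parent[v] = u                  # first message defines the tree edge
        for w in graph[v]:
            if w != u:
                msgs.append((v, w))    # forward the wave to all other neighbours
        if len(graph[v]) == 1:         # leaf: echo back immediately
            msgs.append((v, u))
    elif rec[v] == len(graph[v]):      # heard from every neighbour
        if v == initiator:
            print("decide: echo complete")
        else:
            msgs.append((v, parent[v]))  # convergecast towards the initiator

print("spanning-tree parents:", parent)
```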
Read More

PySpark

Overview

  1. Speed-up or scale-up; throughput or response time
  2. parallel time
  3. skewness degree
  4. Shared-Something Architecture: A mixture of shared-memory and shared-nothing architectures
  5. Round-robin or random equal data partitioning, Hash data partitioning, Range data partitioning, Random-unequal data partitioning
    • Round-robin (random-equal) data partitioning: also called "equal partitioning"; data are evenly distributed
    • Hash data partitioning (not balanced): uses a hash function
    • Range data partitioning (not balanced): spreads the records based on a given range
  6. Complex Data Partitioning: Complex data partitioning is based on multiple attributes or is based on a single attribute but with multiple partitioning methods. Hybrid-Range Partitioning Strategy (HRPS), Multiattribute Grid Declustering (MAGIC), Bubba’s Extended Range Declustering (BERD)
  7. Serial search algorithms: Linear search and Binary search
  8. Parallel search algorithms: Processor activation or involvement, Local searching method and Key comparison
    • Depends on the data partitioning method used, Also depends on what type of selection query is performed (P15)
    • Depends on the data ordering, regarding the type of the search
    • Depends on whether the data in the table is unique or not
  9. Parallel Inner Join components
    • Data Partitioning
    • Divide and Broadcast
    • Disjoint Partitioning
  10. Local Join
    • Nested-Loop Join
    • Sort-Merge Join
    • Hash Join: the records of files R and S are both hashed to the same hash file, using the same hash function on the join attributes A of R and B of S as hash keys
  11. (P23/27,28, Important) Example of a Parallel Inner Join Algorithm
    • Divide and Broadcast, plus Hash Join
    • data partitioning using the divide and broadcast method, and a local join
    • Divide one table into multiple disjoint partitions, where each partition is allocated a processor, and broadcast the other table to all available processors
    • Dividing one table can simply use equal division
    • Broadcast means replicate the table to all processors
    • Hence, choose the smaller table to broadcast and the larger table to divide (see the join sketch after this list)
  12. (*P30) Parallel Join Query Processing
    • ROJA (Redistribution Outer Join Algorithm)
    • DOJA (Duplication Outer Join Algorithm)
    • DER (Duplication & Efficient Redistribution)
    • OJSO (Outer Join Skew Optimization)
  13. Serial Sorting – INTERNAL: the data to be sorted fits entirely into main memory (Bubble Sort, Insertion Sort, Quick Sort). Serial Sorting – EXTERNAL: the data to be sorted does NOT fit entirely into main memory (Sort-Merge).
  14. Parallel External Sort
    • Parallel Merge-All Sort, local sort and final merge, Heavy load on one processor
    • Parallel Binary-Merge Sort, Merging in pairs only, merging is still heavy
    • Parallel Redistribution Binary-Merge Sort, add Redistribution, height of the tree.
    • Parallel Redistribution Merge-All Sort, Reduce the height of the tree, and still maintain parallelism, Skew problem in the merging
    • Parallel Partitioned Sort, Partitioning stage (Round Robin) and Independent local work, Skew produced by the partitioning.
  15. Parallel Group By, (Merge-All and Hierarchical Merging), Two-phase method, and Redistribution method
    • Two-phase method: local aggregation of records according to the group-by attribute, then global aggregation where the partial results from each processor are redistributed
    • Redistribution method: redistribute raw records to all processors, then each processor performs a local aggregation
  16. Featurization-Extraction
    • Count Vectorizer
    • TF-IDF * (important; see the PySpark sketch after this list)
    • Word2Vec
  17. Featurization-Transformation
    • Tokenization
    • Stop Words Remover
    • String Indexing
    • One Hot Encoding
    • Vector Assembler
  18. Featurization-Selection
    • Vector Slicer
  19. Supervised Machine Learning: Classification and Regression.
    • Decision Trees (* P55): entropy = Σ_x p(x) log(1/p(x)) (see the information-gain sketch after this list)
    • Compute the entropy for data set
    • For every attribute/feature:
      2.1. Calculate the entropy for all categorical values
      2.2. Take the average information entropy for the current attribute
      2.3. Calculate the gain for the current attribute
    • Pick the highest gain attribute
    • Repeat until the tree is complete
  20. Unsupervised Machine Learning: Clustering
    • K-Means clustering (P76)
    • data parallelism and result parallelism
  21. Collaborative Filtering (P87)
    • Data collection: Collecting user behaviour and associated data items
    • Data processing: Processing the collected data
    • Recommendation calculation: the calculation method used to compute the recommendations
    • Derive the result: extract the similarities, sort them, and take the top N
  22. Joins In Data Streams
    • Symmetric Hash Join: similar to hash join, but within the window
    • M Join: M-way Hash Join
    • AM Join: adds a BiHT on top of M Join
    • Handshake Join
  23. Granularity Reduction
    • Granularity is the level of detail at which data are stored in a database.
    • Temporal-based mixed levels of granularity: time-based
    • Spatial-based mixed levels of granularity: space- or location-based
  24. Adding a new dimension to the observation helps to estimate more parameters
    • Multiple sensors measuring the same thing:
      • Reduce and then Merge
      • Merge and then Reduce
    • Multiple sensors measuring different things, but grouped together:
      • Reduce, Normalize, and then Merge
      • Normalize, Merge, and then Reduce
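For item 11, a minimal Python sketch of divide-and-broadcast plus a local hash join (the toy tables, key names, and the sequential loop standing in for parallel processors are my assumptions):

```python
from collections import defaultdict

def divide_and_broadcast_join(small, large, key, n_procs=3):
    """Broadcast the smaller table, divide the larger one, hash-join locally."""
    # Broadcast: every processor gets the whole small table as a hash table
    hash_table = defaultdict(list)
    for rec in small:
        hash_table[rec[key]].append(rec)
    # Divide: equal (round-robin) partitioning of the larger table
    partitions = [large[i::n_procs] for i in range(n_procs)]
    # Local hash join on each partition (run sequentially here for clarity)
    result = []
    for part in partitions:
        for rec in part:
            for match in hash_table[rec[key]]:
                result.append({**match, **rec})
    return result

depts = [{"dept": "cs", "building": "A"}, {"dept": "ee", "building": "B"}]
staff = [{"name": "amy", "dept": "cs"}, {"name": "bob", "dept": "ee"},
         {"name": "eve", "dept": "cs"}]
print(divide_and_broadcast_join(depts, staff, "dept"))
```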
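For the TF-IDF item (16), a minimal PySpark sketch with pyspark.ml.feature (the toy documents and numFeatures value are arbitrary; assumes a local Spark installation):

```python
from pyspark.sql import SparkSession
from pyspark.ml.feature import Tokenizer, HashingTF, IDF

spark = SparkSession.builder.appName("tfidf-sketch").getOrCreate()
docs = spark.createDataFrame(
    [(0, "spark makes big data simple"), (1, "big data needs big clusters")],
    ["id", "text"])

# Tokenize, hash term frequencies, then rescale by inverse document frequency
words = Tokenizer(inputCol="text", outputCol="words").transform(docs)
tf = HashingTF(inputCol="words", outputCol="tf", numFeatures=32).transform(words)
tfidf = IDF(inputCol="tf", outputCol="tfidf").fit(tf).transform(tf)
tfidf.select("id", "tfidf").show(truncate=False)
spark.stop()
```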
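For the decision-tree item (19), a minimal Python sketch of entropy and information gain (the toy table is made up; attribute "windy" perfectly splits the labels, "sunny" is uninformative):

```python
import math
from collections import Counter

def entropy(labels):
    """H = sum over x of p(x) * log2(1 / p(x))."""
    n = len(labels)
    return sum((c / n) * math.log2(n / c) for c in Counter(labels).values())

def information_gain(rows, attr, labels):
    """Gain = H(labels) - average entropy of the subsets split by attr."""
    n = len(labels)
    subsets = {}
    for row, y in zip(rows, labels):
        subsets.setdefault(row[attr], []).append(y)
    remainder = sum(len(ys) / n * entropy(ys) for ys in subsets.values())
    return entropy(labels) - remainder

rows = [{"windy": 1, "sunny": 0}, {"windy": 1, "sunny": 1},
        {"windy": 0, "sunny": 1}, {"windy": 0, "sunny": 0}]
play = ["no", "no", "yes", "yes"]
print(information_gain(rows, "windy", play))  # 1.0 -> pick this attribute
print(information_gain(rows, "sunny", play))  # 0.0 -> no information
```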
Read More

Keras Survey

Overview

Keras is a high-level neural networks API, written in Python and capable of running on top of TensorFlow.
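A minimal sketch of using it (assumes TensorFlow 2.x, where Keras is bundled as tf.keras; the layer sizes here are arbitrary):

```python
import tensorflow as tf

# A small dense classifier built with the Keras Sequential API
model = tf.keras.Sequential([
    tf.keras.layers.Dense(64, activation="relu", input_shape=(16,)),
    tf.keras.layers.Dense(10, activation="softmax"),
])
model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
model.summary()
```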

Read More

K8s Survey

Installation

Normal Cluster and Deployment Launching - (private repo: xyao/k8s)
Using RDMA - (https://github.com/hustcat/k8s-rdma-device-plugin)
- Environment="KUBELET_EXTRA_ARGS=--feature-gates=DevicePlugins=true XXX"
Read More

You're up and running!

Next you can update your site name, avatar and other options using the _config.yml file in the root of your repository (shown below).

Read More