HPC on Linux clusters: Nodes and Networks hardware
Overview
Typical Node Architecture
Computer families/1
Computer families/2
Micro architecture
Superscalar
Pipelining/1
Pipelining/2
Pipelining/3
P3/P4 superpipelining
Branch prediction / Speculative execution
x86 uarchitectures
Intel x86 family
AMD Athlon family
Pentium 4 uarchitecture
Athlon uarchitecture
x86 architecture extensions/1
x86 architecture extensions/2
x86 architecture extensions/3
x86 architecture extensions/4
SIMD technology
typical SIMD operation
MMX
SSE
SSE2
Cache memory
Cache memory/1
Cache memory/2
Cache memory/3
Cache memory/4
Cache memory/5
Cache memory/6
Cache memory/7
Cache memory/8
Cache memory/9
Cache memory/10
Cache memory/11
Real Caches
Memory performance
MTRR/1
MTRR/2
MTRR/3
MTRR/4
MTRR/5
MTRR/6
Explicit cache control/1
Explicit cache control/2
Performance and timestamp counters/1
Performance and timestamp counters/2
Performance and timestamp counters/3
Performance and timestamp counters/4
Performance and timestamp counters/5
Performance and timestamp counters/6
Performance and timestamp counters/7
Processor bus/1
Processor bus/2
Intel IA32 node
Intel PIII/P4 processor bus
Alpha node
Alpha/Athlon EV6 bus
Pentium 4 (Willamette)
SMPs
Intel MP Processor bus arbitration
Cache coherency
Cache consistency
Snooping
Intel MP snooping
MESI protocol
MESI states
L2/L1 coherence
Atomic Read Modify Write
Intel MP interrupts
PCI Bus
PCI efficiency
PCI 2.2/X timing diagram
common chipsets PCI performance
Memory buses
Interconnects /1
LogP metrics (Culler)
LogP diagram
Interconnects /2
Interconnects /3
Interconnects /4
Interconnects /5
Interconnects /6
Interconnects /7
Interconnects /8
Interconnects /9
Interconnects /10
Interconnects /11
Bisection /1
Bisection /2
Bisection /3
NIC Interconnection point (from D. Culler)
Ethernet history
Ethernet
Ethernet Frames
VLAN
Hubs
Ethernet flow control
Auto Negotiation
GigE
Network (Physical Layer)
LVDS/1
LVDS/2
LVDS/3
LVDS/4
VCSEL/1
VCSEL/2 - EEL (Edge Emitting)
VCSEL/3 - Surface Emission
VCSEL/4
VCSEL/5
MINI (Memory Integrated Network Interface)
MINI/2
Infiniband/1
Infiniband/2
Infiniband/3
Infiniband/4
Infiniband/5
Myrinet/1
Myrinet/2
Myrinet/3
Clos networks
Clos networks/2
Software
Software overhead
Zero Copy Research
OS bypass / User level networking
Active Messages (AM)
FastMessages (FM)
VIA/1
VIA/2
VIA/3
Software layering
Network layering considered harmful ?
Linux Socket buffers (sk_buff)
Memory Management
Linux 2.4 kiobuf
Bibliography
Author: Roberto Innocente
E-mail: rinnocente@eurohpc.org
Compressed PostScript: ictp-school.ps.gz
PDF: ictp-school2.pdf