@@ -15,6 +15,154 @@ To find out how to join the Hardware SIG `get in touch. <https://www.oneapi.io/c
15 15 Meeting Notes
16 16 =============
17 17
18+ 2023-06-22
19+ ==========
20+
21+ Agenda
22+ ------
23+
24+ * Consistent timestamp reporting in Level Zero, Matias Cabral
25+ * Level Zero port for TornadoVM + the SPIR-V code gen, Juan Fumero
26+ * Tiles-as-devices model in Level Zero, Jaime Arteaga
27+ * Level Zero and Unified Runtime general updates,
28+ Jaime Arteaga Molina, Alastair Murray
29+
30+ Attendees
31+ ---------
32+
33+ .. list-table ::
34+
35+
36+ * - Alison L Richards, Intel
37+ - Jaime Arteaga Molina, Intel
38+ - Michal Mrozek, Intel
39+ * - Alastair Murray, Codeplay
40+ - Brice Videau, Intel
41+ - Matias Cabral, Intel
42+ * - Juan Fumero, The University of Manchester
43+ - Kenneth Benzie, Codeplay
44+ - Ben Ashbaugh, Intel
45+ * - Brice Goglin
46+ - John Daniel Holmes, Intel
47+ - Ruyman Reyes, Codeplay
48+ * - Kevin Harms, ANL
49+ - Xinmin Tian, Intel
50+ - Rod Burns, Codeplay
51+ * - Brandon Yates, Intel
52+ - Dorothee Marie Clotilde Balas, Intel
53+ -
54+
55+ Consistent timestamp reporting in Level Zero
56+ --------------------------------------------
57+
58+ `Slides <presentations/L0_timestamps_units.pdf >`__
59+
60+ * Timestamp units are inconsistent across the
61+ Level Zero specification.
62+ * Proposal: explicitly define timestamps in ns
63+ and resolutions as frequencies.
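
The relationship behind the proposal can be sketched as follows. This is a minimal illustration, not Level Zero API: ``cycles_to_ns``, ``elapsed_ns``, ``frequency_hz`` and ``valid_bits`` are hypothetical names, and the sketch assumes the resolution is reported as a frequency in cycles per second.

```python
def cycles_to_ns(cycles: int, frequency_hz: int) -> int:
    """Convert a raw tick count to nanoseconds, given a resolution expressed as a frequency."""
    if frequency_hz <= 0:
        raise ValueError("frequency_hz must be positive")
    # Integer arithmetic avoids floating-point rounding for large tick counts.
    return cycles * 1_000_000_000 // frequency_hz


def elapsed_ns(start_cycles: int, end_cycles: int, frequency_hz: int, valid_bits: int) -> int:
    """Elapsed time between two raw timestamps, tolerating one counter wrap-around."""
    # Assumption: the counter is valid_bits wide and wraps back to zero.
    mask = (1 << valid_bits) - 1
    delta = (end_cycles - start_cycles) & mask
    return cycles_to_ns(delta, frequency_hz)


# A hypothetical 19.2 MHz timer: 19,200,000 ticks correspond to one second.
print(cycles_to_ns(19_200_000, 19_200_000))
```

With a single unit (ns) and a single form of resolution (frequency), an application needs only one conversion of this kind, regardless of which query produced the timestamp.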
64+
65+ Level Zero port for TornadoVM + the SPIR-V code gen
66+ ---------------------------------------------------
67+
68+ `Slides <presentations/TornadoVM-oneAPIHardwareSIG-June23.pdf >`__
69+
70+ * Work on extending TornadoVM to run on the oneAPI stack,
71+ focusing on Level Zero.
72+ * Current access to heterogeneous systems is fragmented,
73+ with different stacks for CPU and GPU.
74+ * What if this hardware could be accessed from
75+ existing high-level programming languages?
76+ * TornadoVM translates, for instance, Java bytecode
77+ into CUDA, OpenCL, or SPIR-V.
78+ * Translation is done by using ACC annotations.
79+ * TaskGraphs methods are used to identify functions
80+ to accelerate.
81+ * For some workloads, the suggested group size
82+ returned by the L0 driver is less optimal than the one
83+ found heuristically by the application.
84+ * The garbage collector needs to be controlled to avoid
85+ failures when dealing with memory moved to an accelerator.
86+ * Some suggestions for L0:
87+
88+ * Migration counters.
89+ * Async exception support.
90+ * Device aggregation.
91+ * Improvements to suggested group size returned by L0.
92+ * Questions:
93+
94+ * Would Unified Runtime be more useful to this
95+ project as a standard?
96+
97+ A standard is more appealing when convincing the Java
98+ community to use this software stack. A standard is easier
99+ to justify because it is not controlled by one party.
100+
101+ Tiles-as-devices model in Level Zero
102+ ------------------------------------
103+
104+ `Slides <presentations/tiles-as-devices-l0-sig.pdf >`__
105+
106+ * L0 adding support for new device models
107+
108+ * Mainly aimed at multi-tiled architectures
109+ * New environment variable: ZE_FLAT_DEVICE_HIERARCHY
110+
111+ * Mode 0: Cards-as-devices model
112+
113+ * Mode 1: Tiles-as-devices model (no access to root device)
114+
115+ * Mode 2: Tiles-as-devices model (with access to root device)
116+
117+ * Objective is to provide applications with ability to
118+ select best configuration for a given workload, especially
119+ when using architectures where access between tiles is
120+ more costly than local access.
121+ * Changes and new APIs to be added as part of L0 v1.7.
122+ * Questions:
123+
124+ * How is memory allocated between the different modes?
125+
126+ In mode 0, get a device handle and allocations are
127+ split between two sub-devices (tiles), or get a
128+ sub-device and memory is allocated on that tile.
129+ In mode 1, get a handle to a tile as a device (no sub-devices)
130+ and memory is allocated on that tile. No magic.
131+
132+ * Is ZE_FLAT_DEVICE_HIERARCHY an environment variable?
133+
134+ Yes.
135+
136+ * Will the specification say that mode 1 is the default?
137+
138+ Up for debate, specification may say that default
139+ is implementation defined.
140+
141+ * In mode 2, can you do an allocation on the root device?
142+
143+ Yes, you can. zeDeviceGet will return all tiles.
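
The three modes discussed above can be modelled with a small sketch. It is a toy model only and does not call Level Zero: ``DeviceHandle`` and ``enumerate_devices`` are hypothetical names, the mode numbers follow these notes, and the values actually accepted by ``ZE_FLAT_DEVICE_HIERARCHY`` are not stated here.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class DeviceHandle:
    """Hypothetical stand-in for a handle returned by device enumeration."""
    name: str
    sub_devices: List[str] = field(default_factory=list)
    root_device: Optional[str] = None


def enumerate_devices(cards: Dict[str, List[str]], mode: int) -> List[DeviceHandle]:
    """Model which handles an application sees in each mode from the notes."""
    if mode == 0:
        # Cards-as-devices: one handle per card, tiles exposed as sub-devices.
        return [DeviceHandle(card, sub_devices=list(tiles)) for card, tiles in cards.items()]
    if mode in (1, 2):
        # Tiles-as-devices: one handle per tile, no sub-devices.
        # Only mode 2 keeps a path back to the root device (the card).
        return [
            DeviceHandle(tile, root_device=card if mode == 2 else None)
            for card, tiles in cards.items()
            for tile in tiles
        ]
    raise ValueError(f"unknown mode: {mode}")


# Two hypothetical cards with two tiles each.
cards = {
    "card0": ["card0.tile0", "card0.tile1"],
    "card1": ["card1.tile0", "card1.tile1"],
}
for mode in (0, 1, 2):
    print(mode, [d.name for d in enumerate_devices(cards, mode)])
```

In this model an allocation made against a mode 1 or mode 2 handle always targets a single tile, which matches the "no magic" answer above.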
144+
145+
146+ Level Zero and Unified Runtime general updates
147+ ----------------------------------------------
148+
149+ `Slides <presentations/Unified-Runtime-for-oneAPI-HW-SIG-062223.pdf >`__
150+
151+ * UR Adapters have been merged to the SYCL runtime.
152+ * The license was previously MIT and is now being changed to Apache
153+ to make it friendlier to LLVM.
154+ * The current UR version is v0.6, and the next version is close
155+ to being tagged.
156+ * All development is happening in the public UR repo.
157+ * Experimental features are being added as well, such as
158+ Command Buffers for graphs and USM import/export.
159+ * Developers looking to move to UR can build the loader
160+ from the UR repo and the adapters from intel/llvm
161+ and use the UR interfaces.
162+ * Application links to UR loader, which would load the
163+ UR adapters, which then load the corresponding drivers.
164+
165+
18 166 2022-11-3
19 167 =========
20 168
@@ -48,8 +196,8 @@ Attendees
48 196 - Zack Waters, Intel
49 197 -
50 198
51- oneAPI Coummunity Forum
52- -----------------------
199+ oneAPI Community Forum
200+ ----------------------
53 201
54 202 * TABs changing to SIGs
55 203