`episodes/optimisation-conclusion.md` (2 additions, 3 deletions)

```diff
@@ -54,10 +54,9 @@ Your feedback enables us to improve the course for future attendees!
 - Where feasible, the latest version of Python and packages should be used as they can include significant free improvements to the performance of your code.
 - There is a risk that updating Python or packages will not be possible due to version incompatibilities or will require breaking changes to your code.
 - Changes to packages may impact results output by your code; ensure you have a method of validation ready prior to attempting upgrades.
-- How the Computer Hardware Affects Performance
-- Sequential accesses to memory (RAM or disk) will be faster than random or scattered accesses.
-- This is not always natively possible in Python without the use of packages such as NumPy and Pandas
+- How Latency Affects Performance
 - One large file is preferable to many small files.
+- Network requests can be parallelised to reduce the impact of fixed overheads.
 - Memory allocation is not free; avoiding destroying and recreating objects can improve performance.
```
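The "network requests can be parallelised" keypoint can be sketched in Python with the standard library's `concurrent.futures`; here `time.sleep` stands in for a request's fixed round-trip latency (an assumption for illustration — no real network call or endpoint is used):

```python
# A minimal sketch of amortising fixed per-request overheads.
# time.sleep(0.05) stands in for ~50 ms of network round-trip latency.
import time
from concurrent.futures import ThreadPoolExecutor

def fetch(i):
    time.sleep(0.05)  # stand-in for a request's fixed waiting time
    return i * 2

start = time.perf_counter()
serial = [fetch(i) for i in range(8)]
serial_s = time.perf_counter() - start  # ~8 x 50 ms: the overheads add up

start = time.perf_counter()
with ThreadPoolExecutor(max_workers=8) as pool:
    parallel = list(pool.map(fetch, range(8)))
parallel_s = time.perf_counter() - start  # ~50 ms: the waits overlap

assert serial == parallel     # same results
assert parallel_s < serial_s  # but the fixed overheads are amortised
```

Because the time is spent waiting rather than computing, threads are enough here; the eight 50 ms waits overlap instead of accumulating.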
`episodes/optimisation-latency.md` (6 additions, 6 deletions)

```diff
@@ -1,5 +1,5 @@
 ---
-title: "Understanding Memory"
+title: "Understanding Latency"
 teaching: 30
 exercises: 0
 ---
```
```diff
@@ -24,12 +24,10 @@ exercises: 0
 ## Accessing Disk
 
 <!-- Read data from a file it goes disk->disk cache->ram->cpu cache/s->cpu -->
-When accessing data on disk (or network), a very similar process is performed to that between CPU and RAM when accessing variables.
+When reading data from a file, it is first transferred from the disk, to the disk cache, to the RAM (the computer's main memory, where variables are stored).
+The latency to access files on disk is another order of magnitude higher than accessing normal variables.
 
-When reading data from a file, it transferred from the disk, to the disk cache, to the RAM.
-The latency to access files on disk is another order of magnitude higher than accessing RAM.
-
-As such, disk accesses similarly benefit from sequential accesses and reading larger blocks together rather than single variables.
+As such, disk accesses benefit from sequential accesses and reading larger blocks together rather than single variables.
 Python's `io` package is already buffered, so automatically handles this for you in the background.
 
 However, before a file can be read, the file system on the disk must be polled to transform the file path to its address on disk to initiate the transfer (or throw an exception).
```
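The buffering mentioned in the hunk above can be observed directly; a small sketch using a throwaway scratch file (the path and data size are arbitrary):

```python
# A small sketch of Python's default buffering: open() in binary mode
# returns an io.BufferedReader, so tiny read() calls are served from an
# in-memory block rather than a separate disk access each time.
import io
import os
import tempfile

fd, path = tempfile.mkstemp()
with os.fdopen(fd, "wb") as f:
    f.write(b"x" * 100_000)  # arbitrary scratch data

with open(path, "rb") as f:
    assert isinstance(f, io.BufferedReader)  # buffered by default
    first = f.read(1)  # filled from a buffered block, not a 1-byte disk read
    rest = f.read()    # later reads drain and refill the same buffer
    assert first + rest == b"x" * 100_000

os.remove(path)
```

This is why reading a file byte by byte in Python is slow but not catastrophically so: each `read(1)` is a cheap in-memory copy, with the actual disk transfers happening one buffer-sized block at a time.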
```diff
@@ -179,6 +177,8 @@ Latency can have a big impact on the speed that a program executes, the below gr
 
 {alt="A horizontal bar chart displaying the relative latencies for L1/L2/L3 cache, RAM, SSD, HDD and a packet being sent from London to California and back. These latencies range from 1 nanosecond to 140 milliseconds and are displayed with a log scale."}
 
+L1/L2/L3 caches are where your most recently accessed variables are stored inside the CPU, whereas RAM is where most of your variables will be found.
+
 The lower the latency, typically the higher the effective bandwidth (L1 and L2 cache have 1 TB/s, RAM 100 GB/s, SSDs up to 32 GB/s, HDDs up to 150 MB/s), making large memory transactions even slower.
```
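The bandwidth figures quoted in that final context line translate into concrete transfer times; a back-of-envelope sketch using those same numbers:

```python
# Back-of-envelope time to move 1 GB at the bandwidths quoted in the lesson
# (L1/L2 cache ~1 TB/s, RAM ~100 GB/s, SSD up to ~32 GB/s, HDD up to ~150 MB/s).
GB = 1e9
bandwidths_bps = {
    "L1/L2 cache": 1e12,
    "RAM": 100e9,
    "SSD": 32e9,
    "HDD": 150e6,
}
times_ms = {name: GB / bps * 1000 for name, bps in bandwidths_bps.items()}
for name, ms in times_ms.items():
    print(f"{name}: {ms:.1f} ms")
# The HDD needs ~6.7 s for the same 1 GB that RAM moves in ~10 ms.
```

The spread is roughly four orders of magnitude, mirroring the latency chart: the slower the medium's access latency, the longer large transfers take as well.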