Commit 3d73d1d

Author: Anastasiia Shcherbakova
Commit message: added a summary table for all data structure at the end of the data structures section
1 parent a3271f2 commit 3d73d1d

1 file changed: +15 -1 lines changed

episodes/optimisation-data-structures-algorithms.md

Lines changed: 15 additions & 1 deletion
@@ -168,7 +168,6 @@ The above diagrams shows a hash table of 5 elements within a block of 11 slots:
 3. The number of jumps (or steps) it took to find the available slot is represented by i=1 (since we moved from position 4 to 5).
 In this case, the number of jumps i=1 indicates that the algorithm had to probe one slot to find an empty position at index 5.
 
-
 ### Keys
 
 Keys will typically be a core Python type such as a number or string. However, several of these can be combined as a tuple to form a compound key, or a custom class can be used if the methods `__hash__()` and `__eq__()` have been implemented.
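
As a minimal sketch of the two options described above, the example below uses a tuple as a compound key and a small custom class as a key. The `GridCell` class, the `readings` and `labels` dictionaries, and all values are hypothetical and exist only for illustration.

```python
class GridCell:
    """Hypothetical key type: a cell identified by (row, col) coordinates."""

    def __init__(self, row, col):
        self.row = row
        self.col = col

    def __eq__(self, other):
        # Two cells count as the same key when both coordinates match
        if not isinstance(other, GridCell):
            return NotImplemented
        return (self.row, self.col) == (other.row, other.col)

    def __hash__(self):
        # Hash the same fields that __eq__ compares,
        # so that equal objects always produce equal hashes
        return hash((self.row, self.col))


# Compound key: a tuple of core Python types (made-up values)
readings = {("site_a", 2021): 12.5}
readings[("site_a", 2022)] = 13.1

# Custom class as a key: a new but equal object finds the stored value
labels = {GridCell(0, 1): "start"}
found = labels[GridCell(0, 1)]
```

Key objects should not be modified in a way that changes their hash while they are stored in a dictionary, otherwise later lookups may fail to find them.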
@@ -349,6 +348,21 @@ binary_search_list: 5.79ms
 
 These results are subject to change based on the number of items and the proportion of searched items that exist within the list. However, the pattern is likely to remain the same. Linear searches should be avoided!
 
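
To illustrate why a linear search scales poorly, the sketch below contrasts it with a binary search built on the standard library's `bisect_left()`. It is not the lesson's benchmark code; the function names `linear_contains` and `binary_contains` and the sample data are assumptions made for this example, and the binary search only works if the list is already sorted.

```python
from bisect import bisect_left


def linear_contains(items, target):
    """Check every item in turn: O(n) comparisons in the worst case."""
    for item in items:
        if item == target:
            return True
    return False


def binary_contains(sorted_items, target):
    """Halve the search range at each step: O(log n) comparisons.

    Assumes sorted_items is sorted in ascending order.
    """
    i = bisect_left(sorted_items, target)
    return i < len(sorted_items) and sorted_items[i] == target


# Illustrative data: the even numbers below 1000, already sorted
data = list(range(0, 1000, 2))
hit = binary_contains(data, 10)
miss = binary_contains(data, 11)
```

Both functions return the same answers; only the number of comparisons differs, and that gap widens as the list grows.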
+::::::::::::::::::::::::::::::::::::: callout
+
+Dictionaries are designed to handle insertions efficiently, with average-case O(1) time complexity per insertion. However, as a dictionary grows it must occasionally be resized, an O(n) operation, which can become problematic for large dictionaries. In that case, it may be better to use an alternative data structure, for example a list, NumPy array or Pandas DataFrame. The table below summarizes the best uses and performance characteristics of each data structure:
+
+| Data Structure   | O(1) Insertion (Small Size) | Insertion (Large Size)      | O(1) Search | Best For                                                             |
+|------------------|-----------------------------|-----------------------------|-------------|----------------------------------------------------------------------|
+| Dictionary       | Yes                         | Problematic (O(n) resizing) | Yes         | Fast insertions and lookups, key-value storage, small to medium data |
+| List             | Yes (Amortized)             | Efficient (Amortized O(1))  | No (O(n))   | Dynamic appends, ordered data storage, general-purpose use           |
+| Set              | Yes                         | Problematic (O(n) resizing) | Yes         | Membership testing, unique elements, small to medium data            |
+| NumPy Array      | No                          | Efficient (Fixed Size)      | No (O(n))   | Numerical computations, fixed-size data, vectorized operations       |
+| Pandas DataFrame | No                          | Efficient (Column-wise)     | No (O(n))   | Column-wise analytics, tabular data, large datasets                  |
+
+NumPy and Pandas, which we have not yet covered, are powerful libraries designed for handling large arrays and tabular data. Much of their core is implemented in C to optimize performance, making them well suited to numerical computations and data analysis tasks.
+
+:::::::::::::::::::::::::::::::::::::::::::::
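
The resizing cost listed for dictionaries in the table can be observed indirectly. The sketch below assumes CPython, where `sys.getsizeof()` reports the memory held by a dictionary's internal table, and records the item counts at which that reported size changes. The function name `resize_points` is hypothetical, and the exact counts depend on the Python version.

```python
import sys


def resize_points(n_items):
    """Return the item counts at which the dict's reported size changes."""
    d = {}
    last_size = sys.getsizeof(d)
    points = []
    for i in range(n_items):
        d[i] = None
        size = sys.getsizeof(d)
        if size != last_size:
            # The underlying table was reallocated during this insertion
            points.append(len(d))
            last_size = size
    return points


points = resize_points(100_000)
print(f"{len(points)} size changes during 100,000 insertions, at: {points}")
```

Only a small number of insertions trigger a reallocation, which is why the average cost per insertion stays low even though each individual resize has to touch every stored item.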
 
 ::::::::::::::::::::::::::::::::::::: keypoints
 