Commit 28b2d59

Update implementation of deduplication function (#16972)
The old implementation is very slow: it has O(n^2) complexity, so with large input data you can wait hours for the result, while the new implementation finishes in a couple of seconds. The new implementation uses a dict, which makes it O(n) on average. Since we require Python 3.6, dictionaries preserve insertion order (an implementation detail of CPython 3.6 that became a language guarantee in Python 3.7), so we can use 'dict' here but not 'set'.
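To illustrate the claim above, here is a minimal sketch that places the two approaches side by side and checks that they agree. The function names dedup_quadratic and dedup_linear and the sample input are hypothetical, chosen only for this illustration; they do not appear in emcc.py.

```python
def dedup_quadratic(lst):
  # Pre-change approach: 'item not in rtn' scans the result list each time,
  # so n items cost O(n^2) comparisons in the worst case.
  rtn = []
  for item in lst:
    if item not in rtn:
      rtn.append(item)
  return rtn


def dedup_linear(lst):
  # Post-change approach: dict keys are unique and keep first-insertion
  # order, and each key lookup is O(1) on average.
  return list(dict.fromkeys(lst))


# Hypothetical input with repeated entries.
sample = ['-lfoo', '-lbar', '-lfoo', '-lbaz', '-lbar']

# Both versions keep the first occurrence of each item, in original order.
assert dedup_quadratic(sample) == ['-lfoo', '-lbar', '-lbaz']
assert dedup_linear(sample) == ['-lfoo', '-lbar', '-lbaz']
```

A set would also remove the duplicates, but its iteration order is not tied to insertion order, which is why the change uses a dict.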
1 parent 86b7ceb commit 28b2d59

1 file changed (+3, -5 lines)

emcc.py

Lines changed: 3 additions & 5 deletions
@@ -935,11 +935,9 @@ def in_temp(name):
 
 
 def dedup_list(lst):
-  rtn = []
-  for item in lst:
-    if item not in rtn:
-      rtn.append(item)
-  return rtn
+  # Since we require python 3.6, that ordering of dictionaries is guaranteed
+  # to be insertion order so we can use 'dict' here but not 'set'.
+  return list(dict.fromkeys(lst))
 
 
 def move_file(src, dst):
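One behavioral difference between the two versions may be worth keeping in mind: the list-based version only needs items to support equality, while the dict-based version needs them to be hashable. The sketch below demonstrates this under the assumption that callers might pass unhashable items. The commit does not say whether any caller does, and the helper name dedup_by_equality is hypothetical.

```python
def dedup_list(lst):
  # Same body as the new implementation in the diff above.
  return list(dict.fromkeys(lst))


def dedup_by_equality(lst):
  # Hypothetical fallback mirroring the removed code; works for unhashable
  # items such as lists, at O(n^2) cost.
  rtn = []
  for item in lst:
    if item not in rtn:
      rtn.append(item)
  return rtn


# Hashable items (for example strings) work with the dict-based version.
assert dedup_list(['a', 'b', 'a']) == ['a', 'b']

# Unhashable items raise TypeError when used as dict keys.
try:
  dedup_list([['a'], ['a']])
  raised = False
except TypeError:
  raised = True
assert raised

# The equality-based version still handles them.
assert dedup_by_equality([['a'], ['b'], ['a']]) == [['a'], ['b']]
```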
