Commit 1990183

Merge pull request #38 from ChEB-AI/feature-selfies-results-processing-improvements
process more SELFIES successfully, minor improvements to predictions
2 parents bdda7ee + 3eeb8b2 commit 1990183

7 files changed

+847
-20
lines changed

Lines changed: 376 additions & 0 deletions
@@ -0,0 +1,376 @@
+[C]
+[O]
+[N]
+*
+[S]
+[O-]
+[Cl]
+[Br]
+[Sn]
+[P]
+[N+]
+[F]
+[F+]
+[H]
+[Na+]
+[Te-]
+[V+]
+[O+]
+[Co-3]
+[Se]
+[Cl-]
+[I]
+[Au+3]
+[Sb]
+[Ca+2]
+[Si-]
+[Si]
+[Li+]
+[Co+]
+[Mg-2]
+[Re]
+[Zn]
+[H+]
+[Fe-]
+[Co-2]
+[Tc]
+[Tl]
+[N-]
+[Hg]
+[K+]
+[S+]
+[Mn]
+[Pt]
+[C-]
+[Mo-3]
+[Fe]
+[Cu-2]
+[Ru-2]
+[Ru-3]
+[B]
+[Ag+]
+[Pt-2]
+[Mo]
+[Al]
+[S-]
+[Fe-2]
+[W]
+[F-]
+[Ru]
+[Zn+2]
+[Al+3]
+[Sb-]
+[Mg+2]
+[B-]
+[Br-]
+[Mg]
+[B-2]
+[Mo+6]
+[Ar+]
+[U]
+[Al+]
+[As]
+[Ra]
+[Te]
+[Pb]
+[Zr]
+[Ni+2]
+[I+]
+[V]
+[P+]
+[He+2]
+[V-2]
+[Mo-4]
+[Bi-3]
+[Cr+3]
+[Co+3]
+[Se+]
+[Os]
+[As+]
+[Y+3]
+[Ag+2]
+[Hg+2]
+[Ta]
+[Ba+]
+[Cs+]
+[Au-]
+[La]
+[Mn+2]
+[Be+]
+[Ba]
+[Ni-2]
+[Se+2]
+[Fr-]
+[Cr-2]
+[Cd]
+[Pr+3]
+[Sn+]
+[I-]
+[La+2]
+[Fe+3]
+[Se-]
+[Co+2]
+[Li]
+[Bk]
+[Ca]
+[Ru+2]
+[Cr-3]
+[Cd+2]
+[Zn+]
+[Sb+]
+[Fe-3]
+[Ti+4]
+[O-2]
+[Cr-]
+[Sr+2]
+[Ir]
+[Au]
+[Ge-4]
+[Sn+2]
+[Cu+2]
+[P+2]
+[Bi+3]
+[Cr]
+[C+4]
+[Au+]
+[Po]
+[B+]
+[Ce-2]
+[W+5]
+[Gd+3]
+[Co]
+[Ba+2]
+[Cu+]
+[Mo-]
+[Bi]
+[Si+]
+[B+2]
+[Cu]
+[U+4]
+[Fe+2]
+[Ar]
+[Ni-]
+[Ge-]
+[Al-]
+[U+]
+[Rn]
+[Ds]
+[Ni]
+[Yb+3]
+[Ru+4]
+[Cu+3]
+[Lu+3]
+[Fe+]
+[C+]
+[Ga+3]
+[Rh-3]
+[Rh]
+[Ce]
+[Pb+2]
+[Ag-3]
+[Xe]
+[Au+2]
+[Sr]
+[Nb]
+[Mo+4]
+[Na]
+[Al-3]
+[Th]
+[Kr]
+[Pt+4]
+[Hg+]
+[W-]
+[Eu+3]
+[Cs-]
+[Mo-2]
+[Sc]
+[Ce+3]
+[Be]
+[Pb-2]
+[V+2]
+[Ca+]
+[Se-2]
+[In]
+[Cf]
+[V-]
+[Si+4]
+[Dy]
+[Cl+]
+[Sn+4]
+[N-3]
+[V+3]
+[He+]
+[P-]
+[Ge]
+[Pd]
+[Mo+3]
+[In+3]
+[Fe-4]
+[Eu+2]
+[Ho]
+[Hg-]
+[As+5]
+[K]
+[K-]
+[Tl+]
+[Mt]
+[Na-]
+[Br+]
+[Ac]
+[Ga]
+[N-2]
+[Eu]
+[Ir-2]
+[Gd]
+[Hf]
+[Ti]
+[Pb-]
+[Pd-2]
+[N+2]
+[Sn-]
+[S-2]
+[Rb]
+[Si-4]
+[P-3]
+[Hs]
+[As+2]
+[Cd-2]
+[Bi+]
+[Db]
+[Ag-2]
+[Sn-2]
+[H-]
+[Sm]
+[V+4]
+[Os+4]
+[At]
+[Rf]
+[Au-3]
+[Ir+3]
+[As-]
+[He]
+[Be+2]
+[Cr-4]
+[Cn]
+[W-4]
+[U+2]
+[Be-]
+[Pt+2]
+[Pa]
+[C-2]
+[Y]
+[Zn-]
+[C-4]
+[As-3]
+[Kr+]
+[Pb+]
+[Hg-2]
+[Am]
+[Mo+2]
+[Sm+3]
+[Rb+]
+[Ne+]
+[W-3]
+[Ta-]
+[Fm]
+[W-2]
+[Ni+3]
+[Pb+4]
+[U+6]
+[Ru+3]
+[Mn-]
+[Cr+2]
+[Pu]
+[W+6]
+[Ni-3]
+[Md]
+[Mn+3]
+[Al-2]
+[Pm]
+[Pt+3]
+[Si-2]
+[Te+]
+[P-2]
+[V+5]
+[As-2]
+[Lr]
+[Ge+4]
+[Bi-]
+[Bh]
+[Lu]
+[Ru-4]
+[Mn+4]
+[Rh-2]
+[Ag-]
+[Co-4]
+[U+5]
+[Ge+2]
+[Os-2]
+[Gd+2]
+[Gd+4]
+[Zn-2]
+[U+3]
+[W+4]
+[He-]
+[Gd-]
+[Rg]
+[Cm]
+[Mg+]
+[Te-2]
+[Ag]
+[Cs]
+[Ho+3]
+[Fr]
+[La+3]
+[As+3]
+[Xe+]
+[Ne]
+[In+]
+[Ge+]
+[Tm]
+[Rn+]
+[Er]
+[Fr+]
+[Es]
+[Be-2]
+[Ru-]
+[Mo-5]
+[Os+]
+[At-]
+[Ni+4]
+[Cr+5]
+[Ce+4]
+[Tb]
+[Ti-2]
+[Ni+]
+[Np]
+[Pr]
+[Tb+3]
+[Os+2]
+[Se+4]
+[Ni-4]
+[Te+4]
+[Y+2]
+[Sg]
+[V-4]
+[Rh+2]
+[B+3]
+[Li-]
+[B-3]
+[Ir-3]
+[Tl+3]
+[Nd]
+[Re+]
+[Cr+4]
+[Sb+3]
+[Yb]
+[Sr+]
+[No]
+[Mg-]
+[Rb-]
+[O+6]
+[Mo+5]
+[Cr+6]
+[O+2]
+[Os+3]
+[Rh+3]
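
Apart from the bare `*`, every entry in this list has the same shape: an element symbol in square brackets, optionally followed by a sign and a charge magnitude. The sketch below is one way to check a token against that shape and read off its charge. It is only an illustration. The names `BRACKET_ATOM` and `parse_token` are hypothetical and do not come from the repository, and treating `*` as an uncharged wildcard is an assumption.

```python
import re
from typing import Optional, Tuple

# Element symbol, then an optional sign with an optional magnitude, e.g. [Co-3].
# Hypothetical pattern inferred from the token list, not taken from the repository.
BRACKET_ATOM = re.compile(r"^\[([A-Z][a-z]?)(?:([+-])(\d*))?\]$")


def parse_token(token: str) -> Optional[Tuple[str, int]]:
    """Return (symbol, charge) for a bracket-atom token, or None if it does not match."""
    if token == "*":
        return ("*", 0)  # assumption: the bare asterisk is an uncharged wildcard
    match = BRACKET_ATOM.match(token)
    if match is None:
        return None
    symbol, sign, digits = match.groups()
    if sign is None:
        return (symbol, 0)
    magnitude = int(digits) if digits else 1  # a bare sign means a charge of 1
    return (symbol, magnitude if sign == "+" else -magnitude)


for tok in ["[C]", "[O-]", "[Co-3]", "[Mo+6]", "*"]:
    print(tok, parse_token(tok))
```

Tokens that carry an explicit hydrogen count, such as `[CH3-1]`, do not fit this pattern and return `None` here.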

chebai/preprocessing/bin/graph_properties/tokens.txt

Whitespace-only changes.

chebai/preprocessing/bin/selfies/tokens.txt

Lines changed: 2 additions & 0 deletions
@@ -772,3 +772,5 @@
 [Fm]
 [Md]
 [No]
+[HH1]
+[CH3-1]
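
The two new tokens are appended after `[No]` and nothing is inserted mid-file. If token indices are derived from line position (an assumption here, since the diff does not show how the file is consumed), appending leaves every existing index unchanged. A minimal sketch of that property, using a hypothetical `build_vocab` helper:

```python
from typing import Dict, List


def build_vocab(tokens: List[str]) -> Dict[str, int]:
    """Map each token to its 0-based position in the list (hypothetical helper)."""
    return {tok: idx for idx, tok in enumerate(tokens)}


# Tail of the vocabulary before and after the change in this hunk.
old_tail = ["[Fm]", "[Md]", "[No]"]
new_tail = old_tail + ["[HH1]", "[CH3-1]"]

old_vocab = build_vocab(old_tail)
new_vocab = build_vocab(new_tail)

# Appending leaves every pre-existing index unchanged.
assert all(new_vocab[tok] == idx for tok, idx in old_vocab.items())
```

Inserting tokens in the middle of the file would shift the positions of every later token under the same assumption.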
