7 changes: 7 additions & 0 deletions cinnabar/stats.py
@@ -205,6 +205,13 @@ def mle(graph: nx.DiGraph, factor: str = "f_ij", node_factor: Union[str, None] =
        action="symmetrize",
        node_label=node_factor.replace("_", "_d"),
    )
    # if the uncertainty is zero, the MLE solver will fail
    # set all non-diagonal zeros to a small value if exactly zero
    # see <https://github.com/OpenFreeEnergy/cinnabar/issues/97> for details
    for i in range(N):
Member

Should we warn users that this is happening, i.e. that we're fixing their data on the fly? The zero uncertainties being padded to 1e-6 are most likely to come from cases where only a single repeat was used, so maybe we should warn folks that this isn't a great situation (given that MLE does use the errors).

Contributor Author

We could warn, but how many people listen to warnings? Maybe the alternative would be better: raise an error if we detect this case, so the user has to handle it properly. They could of course just pad the values themselves in that case, but at least they would know what is happening.
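
The two options discussed in this thread (pad and warn, or raise and let the caller decide) could be sketched roughly as follows. This is a minimal illustration, not cinnabar's implementation: the helper name `pad_zero_uncertainties`, its `on_zero` parameter, and the default pad value are all hypothetical, and the input is assumed to be a square matrix of edge uncertainties.

```python
import warnings

import numpy as np


def pad_zero_uncertainties(df_ij, pad=1e-6, on_zero="warn"):
    """Hypothetical helper: handle exactly-zero off-diagonal uncertainties.

    on_zero="warn"  -> pad the zeros and emit a UserWarning
    on_zero="raise" -> raise a ValueError so the caller must fix the data
    """
    df_ij = np.array(df_ij, dtype=float)  # work on a copy of the input
    off_diagonal = ~np.eye(df_ij.shape[0], dtype=bool)
    zeros = (df_ij == 0.0) & off_diagonal
    if not zeros.any():
        return df_ij
    message = f"{int(zeros.sum())} off-diagonal uncertainties are exactly zero"
    if on_zero == "raise":
        raise ValueError(message)
    warnings.warn(f"{message}; padding them to {pad}", UserWarning)
    df_ij[zeros] = pad
    return df_ij
```

With `on_zero="raise"` the user has to decide how to treat single-repeat edges; with `on_zero="warn"` the data is still modified, but not silently.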

        for j in range(N):
            if i != j and df_ij[i, j] == 0.0:
                df_ij[i, j] = 1e-6
Member

How about something like df_ij[df_ij == 0.0] = 1e-6?

Contributor Author

Yeah, I wasn't sure about replacing the absolute values on the diagonal, as they are currently zero and it works, but I guess there is no harm.

Member

Ah yeah, that's fair! Better to keep the diagonals at zero.

Contributor Author

Ahh no, this causes the F_matrix to change; see here, where each row is multiplied by some large inverse weight. I'll revert for now.
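
For reference, the masked assignment suggested earlier in this thread can be restricted to off-diagonal entries so that it matches the nested loop in the diff. This is a minimal sketch, assuming `df_ij` is a square NumPy array; the function name `pad_off_diagonal_zeros` is hypothetical and this is not the code that was merged.

```python
import numpy as np


def pad_off_diagonal_zeros(df_ij, pad=1e-6):
    """Return a copy of df_ij with exactly-zero off-diagonal entries set to pad."""
    padded = np.array(df_ij, dtype=float)
    # Boolean mask that is True everywhere except on the diagonal
    off_diagonal = ~np.eye(padded.shape[0], dtype=bool)
    padded[(padded == 0.0) & off_diagonal] = pad
    return padded


# Small illustrative matrix (made-up values, not from the PR)
df_ij = np.array(
    [
        [0.0, 0.0, 0.3],
        [0.0, 0.0, 0.0],
        [0.3, 0.0, 0.0],
    ]
)
padded = pad_off_diagonal_zeros(df_ij)
print(np.diag(padded))  # diagonal entries are left untouched
print(padded[0, 1])     # a padded off-diagonal entry
```

If the solver weights entries by the inverse variance, any entry padded to 1e-6 carries a weight of 1e12, so which entries get padded matters.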


    node_name_to_index = {}
    for i, name in enumerate(graph.nodes()):
11 changes: 11 additions & 0 deletions cinnabar/tests/test_stats.py
@@ -124,6 +124,17 @@ def test_mle_relative():
)


def test_mle_zero_uncertainty():
    """
    Test that the MLE works when some edges have zero uncertainty
    """
    graph = nx.DiGraph()
    edges = [(0, 1), (0, 2), (2, 1)]
    for a, b in edges:
        graph.add_edge(a, b, f_ij=1.0 + np.random.normal(0.0, scale=0.5), f_dij=0.0)
    _, _ = stats.mle(graph, factor="f_ij", node_factor="f_i")


def test_correlation_positive(example_data):
    """
    Test that the absolute DG plots have the correct signs,
23 changes: 23 additions & 0 deletions news/svd_padding.rst
@@ -0,0 +1,23 @@
**Added:**

* <news item>

**Changed:**

* The MLE estimator will now automatically pad uncertainties of exactly zero to a small non-zero value to avoid issues with SVD calculations `PR#177 <https://github.com/OpenFreeEnergy/cinnabar/pull/177>`_.

**Deprecated:**

* <news item>

**Removed:**

* <news item>

**Fixed:**

* <news item>

**Security:**

* <news item>