-
Notifications
You must be signed in to change notification settings - Fork 47
Description
Hey All,
I am having some issues with transitive rules and I was wondering if my results were expected or if there may be a bug lurking here.
I have a transitive rule that I have tried to implement in two different ways, both of which have unfavorable outcomes (consuming all memory and a presumably infinite loop).
All methods were run using the CLI from the ProbLog Python package (from PyPi, version 2.2.2):
python -m problog -v test_file.txt
Any help you can provide in getting either of these methods (or any other method of dealing with collective transitivity) would be greatly appreciated.
General Model
This is a small synthetic model with the end goal of inferring whether two people know each other.
The data and structure for this model comes from the psl-examples repository:
https://github.com/linqs/psl-examples/tree/develop/simple-acquaintances
The specific rules that my initial ProbLog rules are based on come from here:
https://github.com/linqs/psl-examples/blob/develop/simple-acquaintances/cli/simple-acquaintances.psl
These examples will only use three predicates:
knows/2- The final target predicate, indicating two people know each other.knows_l1/2- An intermediate predicate forknows/2.knows/3- Used in the second method to carry a list of seen nodes.
In the examples I will provide, I stripped down the rules and data to the smallest set that still causes these issues (I have not fully tested every possible subset of the data since these can take a while, but I have cut it down considerably).
First Method
The first method uses a very straightforward approach to transitivity (with the hopes that memoization will stop cycles from happening).
knows(P1, P2) :- knows_l1(P1, P2), (P1 =\= P2) .
0.167 :: knows(P1, P2) :- knows_l1(P1, OtherPerson), knows(OtherPerson, P2), (P1 =\= OtherPerson), (P1 =\= P2) .
1.0 :: knows_l1(0, 2) .
% ... More data here.
query(knows(0, 1)) .
The probabilistic facts enumerates through [0, 9] for each argument and excludes self references and the query (0, 1).
The full file is attached:
problog_transitive_method1.txt
I also attached the output of a profile (py-spy top):
problog_transitive_method1_profile_top.txt
When run, it hangs on the output:
[INFO] Output level: INFO
[INFO] Propagating evidence: 0.0000s
[INFO] Grounding: 0.0396s
[INFO] Cycle breaking: 2.8354s
The process will continue to use ~100% of a core and will keep accumulating memory until there is no more or it is killed.
On my machine, the this took about 50GB of RAM and swap combined.
Second Method
The second method uses a commonly recommended pattern for transitivity in ProLog, by keeping a list of previously seen nodes.
:- use_module(library(lists)).
knows(P1, P2) :- knows(P1, P2, []) .
knows(P1, P2, V) :-
knows_l1(P1, P2),
\+ memberchk((P1, P2), V) .
knows(P1, P2, V) :-
knows_l1(P1, OtherPerson),
\+ memberchk((P1, OtherPerson), V),
knows(OtherPerson, P2, [(P1, OtherPerson) | V]) .
1.0 :: knows_l1(0, 2) .
% ... More data here.
query(knows(0, 1)) .
The probabilistic facts enumerates through [0, 4] for each argument and excludes self references and the query (0, 1).
The full file is attached:
problog_transitive_method2.txt
I also attached the output of a profile (py-spy top):
problog_transitive_method2_profile_top.txt
When run, it hangs on the output:
[INFO] Output level: INFO
[INFO] Propagating evidence: 0.0000s
So it looks like it is stuck in grounding.
The process will continue to use ~100% of a core, but will not use more RAM.
I ran this for about 12 hours until I killed it.