mscnn_eval.py results #18

@ktneale

Training the model for 48 hours on the dataset with mscnn_train.py appears to show it converging, although I understand that more training may be needed, i.e. the full 100k steps.

However, when running mscnn_eval.py on a snapshot, the following output is observed:

time: 10.645102999999999 loss_value: inf counting:172.0000040 predict:4326801291354836713957490688.0000000 diff:-4326801291354836713957490688.0000000
./mscnn_eval.py:108: RuntimeWarning: overflow encountered in multiply
  sum_all_mse += sum_ab * sum_ab

As you can see, neither the predicted value nor the loss value makes sense. Is the evaluation routine broken?
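
The RuntimeWarning itself is consistent with single-precision arithmetic: squaring a value near 4.3e27 gives roughly 1.9e55, which exceeds the float32 maximum (about 3.4e38) but fits easily in float64. A minimal NumPy sketch of that, assuming the accumulation at line 108 happens in float32 (the dtype is not shown in the log, and `square_in` is a hypothetical helper, not part of mscnn_eval.py):

```python
import numpy as np

# "diff" value copied from the log line above.
diff = -4326801291354836713957490688.0

def square_in(value, dtype):
    """Square `value` using the given NumPy dtype (hypothetical helper)."""
    x = np.asarray(value, dtype=dtype)
    # Silence the overflow warning so the result can be inspected directly.
    with np.errstate(over="ignore"):
        return x * x

sq32 = square_in(diff, np.float32)  # assumed dtype of sum_ab
sq64 = square_in(diff, np.float64)

print("float32:", sq32, "overflowed:", bool(np.isinf(sq32)))
print("float64:", sq64, "overflowed:", bool(np.isinf(sq64)))
```

Either way, the overflow is only a symptom: the prediction is already implausible before it is squared.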

I then edited mscnn_eval.py to use the batch-norm version of the inference method (the one mscnn_train.py uses):

#predict_op = mscnn.inference(images)
predict_op = mscnn.inference_bn(images)

The results now seem more sensible:

time: 0.2461100000000016 loss_value: 18.351429 counting:370.9999983 predict:327.6858826 diff:43.3141174
time: 0.24157399999999996 loss_value: 13.386545 counting:501.9999339 predict:500.1551819 diff:1.8447571
time: 0.21929399999999788 loss_value: 24.295664 counting:1067.9998121 predict:985.3113403 diff:82.6884155
time: 0.2419879999999992 loss_value: 37.284615 counting:320.9999975 predict:236.7593231 diff:84.2406769

Is this the correct thing to do?
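
One way to put a number on "more sensible" is to summarise the per-image errors. The sketch below computes the mean absolute error and root mean squared error over the four log lines above. Whether mscnn_eval.py reports these metrics is not shown here; they are simply common choices for count estimates, and `count_errors` is a hypothetical helper.

```python
import math

# (counting, predict) pairs copied from the four log lines above.
results = [
    (370.9999983, 327.6858826),
    (501.9999339, 500.1551819),
    (1067.9998121, 985.3113403),
    (320.9999975, 236.7593231),
]

def count_errors(pairs):
    """Return (MAE, RMSE) over (ground-truth count, predicted count) pairs."""
    diffs = [truth - pred for truth, pred in pairs]
    mae = sum(abs(d) for d in diffs) / len(diffs)
    rmse = math.sqrt(sum(d * d for d in diffs) / len(diffs))
    return mae, rmse

mae, rmse = count_errors(results)
print(f"MAE:  {mae:.2f}")
print(f"RMSE: {rmse:.2f}")
```

On these four images the MAE comes to roughly 53 and the RMSE to roughly 63. Four samples are far too few to judge the snapshot, so the same calculation would need to run over the full evaluation set.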
