Hello all, I am trying to predict house prices with the following model, trained on a heterogeneous graph:

```python
import torch
import torch.nn.functional as F
from torch_geometric.nn import HeteroConv, GraphConv, Linear


class HeteroGNN(torch.nn.Module):
    def __init__(self, metadata, hidden_channels, out_channels, num_layers):
        super().__init__()
        self.convs = torch.nn.ModuleList()
        for _ in range(num_layers):
            # metadata[1] holds the edge types of the heterogeneous graph
            conv = HeteroConv({
                edge_type: GraphConv((-1, -1), hidden_channels)
                for edge_type in metadata[1]
            }, aggr='sum')
            self.convs.append(conv)
        self.lin = Linear(hidden_channels, out_channels)

    def forward(self, x_dict, edge_index_dict, edge_weight_dict):
        for conv in self.convs:
            x_dict = conv(x_dict, edge_index_dict, edge_weight_dict=edge_weight_dict)
            # also tried F.leaky_relu and torch.sigmoid here
            x_dict = {key: F.relu(x) for key, x in x_dict.items()}
        return self.lin(x_dict['Property'])
```

I have scaled y between 0 and 1, along with all of my other continuous features. The loss successfully reduces to roughly 0.25 MSE after 100 epochs, but I get a negative R² score on the train, validation and test masks. I understand that the model may still perform poorly on the validation and test sets, but the negative train R² is surprising given the decreasing loss. If the model were awful, wouldn't the loss fail to reduce?

I have also constructed a tabular dataset with the same features to check their predictive value. It is not a direct comparison, I know, but it removes the concern that the features are useless: a cross-validated gradient boosting machine reaches an R² of 0.75.

Looking at the GNN's predictions, I can see that it produces some negative outputs, which surprised me, as I didn't think that was possible with a ReLU activation. Are there any other options for output layers and activation functions when doing regression on targets between 0 and 1? I have looked at sigmoid, but always thought that was for classification.

Thanks in advance for any help!
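Here is roughly how I compute the R² score (a sketch; it assumes the standard PyG `train_mask` on the `'Property'` node type):

```python
import torch
from sklearn.metrics import r2_score

# Note: r2_score is 1 - MSE / Var(y); for y scaled to [0, 1], Var(y) <= 0.25,
# so an MSE of ~0.25 can already imply a non-positive R² even as the loss falls.
model.eval()
with torch.no_grad():
    preds = model(data.x_dict, data.edge_index_dict, data.edge_weight_dict)

mask = data['Property'].train_mask
y_true = data['Property'].y[mask].cpu().numpy()
y_pred = preds[mask].squeeze(-1).cpu().numpy()
print('train R²:', r2_score(y_true, y_pred))
```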
Replies: 1 comment
Note that your final linear transformation does not have any activation or clamping, so you should either try to apply `sigmoid` or clamping (`clamp(0, 1)`) to your final predictions.
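For example, a minimal sketch of the forward pass with a bounded output (sigmoid shown; clamping is the drop-in alternative):

```python
def forward(self, x_dict, edge_index_dict, edge_weight_dict):
    for conv in self.convs:
        x_dict = conv(x_dict, edge_index_dict, edge_weight_dict=edge_weight_dict)
        x_dict = {key: F.relu(x) for key, x in x_dict.items()}
    out = self.lin(x_dict['Property'])
    # sigmoid squashes predictions into (0, 1);
    # alternatively: return out.clamp(0, 1)
    return torch.sigmoid(out)
```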