How should this loss function be written? #19021
-
I don't fully understand the context, but I do have a suggestion. The final loss calculation should only be done using the predictions and the labels: it depends purely on the decoder's output and the labels, and is independent of the inputs you give to the encoders/decoder. So you could try calculating the cross-entropy loss for each prediction-label pair individually, then taking the mean (or sum). Here is a sample:

def loss(predict1, predict2, label1, label2):
    loss1 = cross_entropy(predict1, label1)
    loss2 = cross_entropy(predict2, label2)
    return (loss1 + loss2) / 2
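
For completeness, here is a minimal runnable sketch of the same idea, with the generic cross_entropy stood in by optax.softmax_cross_entropy_with_integer_labels. The shapes and the use of integer class labels are assumptions for illustration, not the poster's actual setup:

```python
import optax

def loss(predict1, predict2, label1, label2):
    # Assumes predict1/predict2 are logits of shape (batch, num_classes)
    # and label1/label2 are integer class ids of shape (batch,).
    loss1 = optax.softmax_cross_entropy_with_integer_labels(predict1, label1).mean()
    loss2 = optax.softmax_cross_entropy_with_integer_labels(predict2, label2).mean()
    # Average the two losses so both branches contribute equally.
    return (loss1 + loss2) / 2
```

A weighted sum such as alpha * loss1 + (1 - alpha) * loss2 works just as well if one branch should carry more weight.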
-
This is the model. It has two encoders and one decoder; the combined call is just for init and is not used in inference:
This is how I create the TrainState; if it has any issue, please tell me:
I don't know how to write loss_fn. It has two inputs and two labels, and it should work like this:
encode1(input1) -> s1 -> decoder(s1) -> predict1 -> (predict1,label1)
encode2(input2) -> s2 -> decoder(s2) -> predict2 -> (predict2,label2)
How should this loss function be written? Please give me an example, thanks.
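
Not the poster's actual code, but a minimal sketch of what such a loss_fn and train step could look like with a Flax TrainState, under these assumptions: the model's apply_fn takes both inputs in one call and returns both predictions, the batch is a dict with keys input1/input2/label1/label2, and the labels are integer class ids:

```python
import jax
import optax
from flax.training import train_state

def train_step(state: train_state.TrainState, batch):
    def loss_fn(params):
        # encode1(input1) -> s1 -> decoder(s1) -> predict1
        # encode2(input2) -> s2 -> decoder(s2) -> predict2
        # Assumed model signature: one call returns both predictions.
        predict1, predict2 = state.apply_fn(
            {"params": params}, batch["input1"], batch["input2"]
        )
        loss1 = optax.softmax_cross_entropy_with_integer_labels(
            predict1, batch["label1"]
        ).mean()
        loss2 = optax.softmax_cross_entropy_with_integer_labels(
            predict2, batch["label2"]
        ).mean()
        # The loss only depends on the predictions and the labels.
        return (loss1 + loss2) / 2

    loss, grads = jax.value_and_grad(loss_fn)(state.params)
    state = state.apply_gradients(grads=grads)
    return state, loss
```

Wrapping train_step in jax.jit is the usual next step; the loss returned alongside the updated state is handy for logging.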