TNeuralFit completes training successfully but reports TrainingAccuracy, ValidationAccuracy, and TestAccuracy as 0.00% even when the network demonstrably learns #173

@CFTechno

Description

TNeuralFit completes training successfully but reports TrainingAccuracy, ValidationAccuracy, and TestAccuracy as 0.00% even when the network demonstrably learns (outputs vary appropriately, loss decreases, predictions are non-trivial).

This makes TNeuralFit's accuracy metrics unreliable for monitoring training progress and model evaluation.

Steps to Reproduce

Test Case: Hypotenuse Function (Regression)

Based on the official examples/Hypotenuse example pattern:

program TestNeuralFitMetrics;
{$mode objfpc}{$H+}

uses
  Classes, SysUtils, Math,
  neuralnetwork, neuralvolume, neuraldatasets, neuralfit;

const
  TRAINING_SAMPLES = 1000;
  VALIDATION_SAMPLES = 100;
  TEST_SAMPLES = 100;

var
  NN: TNNet;
  Trainer: TNeuralFit;
  TrainingPairs, ValidationPairs, TestPairs: TNNetVolumePairList;
  i: integer;
  vOutput: TNNetVolume;

function CreateSimplePairs(Count: integer): TNNetVolumePairList;
var
  i: integer;
  vIn, vOut: TNNetVolume;
  X, Y, Z: single;
begin
  Result := TNNetVolumePairList.Create;
  for i := 0 to Count - 1 do
  begin
    X := Random * 100;
    Y := Random * 100;
    Z := Sqrt(X*X + Y*Y);
    
    vIn := TNNetVolume.Create(2, 1, 1);
    vIn.FData[0] := X / 100;
    vIn.FData[1] := Y / 100;
    
    vOut := TNNetVolume.Create(1, 1, 1);
    vOut.FData[0] := Z / 141.42; // max hypotenuse: Sqrt(2) * 100 ≈ 141.42
    
    Result.Add(TNNetVolumePair.Create(vIn, vOut));
  end;
end;

begin
  TrainingPairs := CreateSimplePairs(TRAINING_SAMPLES);
  ValidationPairs := CreateSimplePairs(VALIDATION_SAMPLES);
  TestPairs := CreateSimplePairs(TEST_SAMPLES);

  NN := TNNet.Create;
  NN.AddLayer(TNNetInput.Create(2, 1, 1));
  NN.AddLayer(TNNetFullConnectReLU.Create(32));
  NN.AddLayer(TNNetFullConnectReLU.Create(32));
  NN.AddLayer(TNNetFullConnect.Create(1));

  Trainer := TNeuralFit.Create;
  Trainer.InitialLearningRate := 0.00001;
  Trainer.LearningRateDecay := 0;
  Trainer.L2Decay := 0;
  Trainer.Verbose := True;

  WriteLn('Training...');
  Trainer.Fit(NN, TrainingPairs, ValidationPairs, TestPairs, 32, 50);

  // Note: FormatFloat's '%' token already multiplies the value by 100,
  // so the accuracy fraction is passed unscaled.
  WriteLn('Final Training Accuracy: ', FormatFloat('0.00%', Trainer.TrainingAccuracy));
  WriteLn('Final Validation Accuracy: ', FormatFloat('0.00%', Trainer.ValidationAccuracy));
  WriteLn('Final Test Accuracy: ', FormatFloat('0.00%', Trainer.TestAccuracy));
  WriteLn('');
  WriteLn('Sample Output vs Expected:');
  vOutput := TNNetVolume.Create(1, 1, 1);
  for i := 0 to 4 do
  begin
    NN.Compute(TestPairs[i].A);
    NN.GetOutput(vOutput);
    WriteLn(Format('  Output: %.4f, Expected: %.4f',
      [vOutput.FData[0], TestPairs[i].B.FData[0]]));
  end;

  vOutput.Free;
  TrainingPairs.Free;
  ValidationPairs.Free;
  TestPairs.Free;
  Trainer.Free;
  NN.Free;
end.

Expected Behavior

After 50 epochs of training on the hypotenuse function:

  • TrainingAccuracy should be >50% (network learned mapping)
  • ValidationAccuracy should be >40% (reasonable generalization)
  • TestAccuracy should be >40% (unseen data performance)

Actual Behavior

Final Training Accuracy: 0.00%
Final Validation Accuracy: 0.00%
Final Test Accuracy: 0.00%

Sample Output vs Expected:
  Output: 0.5585, Expected: 0.8089   (network outputs vary, learning occurred)
  Output: 0.6195, Expected: 0.9582
  Output: 0.3811, Expected: 0.4340
  Output: 0.5848, Expected: 0.8370

Analysis

  • ✅ Training completes successfully (no crashes)
  • ✅ Network demonstrably learns (outputs vary, are non-trivial, not random)
  • ❌ Reported accuracies are always 0.00% across all three sets
  • ❌ Makes TNeuralFit's metric reporting unreliable for monitoring

Possible Root Causes

  1. InferHitFn default implementation incompatible with regression targets
  2. Threading (TNeuralFit runs multiple worker threads by default) interferes with metric aggregation
  3. Accuracy calculation expects different target encoding/format
  4. Silent failure in metric computation while training proceeds
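One way to test root cause #1 would be to supply a regression-aware hit function. The sketch below assumes TNeuralFit exposes an assignable InferHitFn comparing an output volume against a target volume; the exact signature, the @-assignment, and the 5% tolerance are assumptions for illustration, not confirmed API:

```pascal
// Hypothetical regression "hit" test: counts a sample as correct when the
// prediction lands within 5% of the target. The signature and tolerance
// here are assumptions, not confirmed TNeuralFit API.
function RegressionHit(pOutput, pTarget: TNNetVolume): boolean;
begin
  Result := Abs(pOutput.FData[0] - pTarget.FData[0]) < 0.05;
end;

// ... before calling Trainer.Fit:
Trainer.InferHitFn := @RegressionHit;
```

If the reported accuracies become non-zero with such a function, the default hit test (presumably a class-index comparison intended for classification targets) would be the culprit for regression workloads.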

Workaround

Use manual training loops with NN.Compute() + NN.Backpropagate() for reliable accuracy metrics:

for epoch := 1 to MAX_EPOCHS do
begin
  for i := 0 to TrainingPairs.Count - 1 do
  begin
    NN.Compute(TrainingPairs[i].A);
    NN.Backpropagate(TrainingPairs[i].B);
  end;
  // Manual accuracy computation via forward passes
end;
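The manual accuracy step in the comment above could be sketched as follows (the ±5% tolerance and the Hits/vPred names are illustrative choices, not library API):

```pascal
// Illustrative per-epoch accuracy pass: a prediction counts as a hit when
// it falls within a 5% tolerance of the target. Tolerance and variable
// names are assumptions for this sketch.
Hits := 0;
vPred := TNNetVolume.Create(1, 1, 1);
for i := 0 to ValidationPairs.Count - 1 do
begin
  NN.Compute(ValidationPairs[i].A);
  NN.GetOutput(vPred);
  if Abs(vPred.FData[0] - ValidationPairs[i].B.FData[0]) < 0.05 then
    Inc(Hits);
end;
WriteLn(Format('Validation accuracy: %.2f%%',
  [100 * Hits / ValidationPairs.Count]));
vPred.Free;
```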

Environment

  • CAI Neural API: (latest from master branch)
  • FreePascal: 3.3.1+
  • Platform: Windows 10/11
  • Lazarus: 3.x

Impact

This prevents reliable use of TNeuralFit for model evaluation and progress monitoring, forcing users back to manual training loops even when batching would be beneficial.

Metadata

Labels: bug (Something isn't working), documentation (Improvements or additions to documentation)
