Cannot load local dataset with run_image_classification_no_trainer.py #44190

@dyecon

Description

System Info

  • Ubuntu 24.04.4 LTS
  • Python 3.12.3
  • PyTorch 2.10.0

Who can help?

No response

Information

  • The official example scripts
  • My own modified scripts

Tasks

  • An officially supported task in the examples folder (such as GLUE/SQuAD, ...)
  • My own task or dataset (give details below)

Reproduction

  1. Save the official example script: run_image_classification_no_trainer.py
  2. Obtain a dataset with 3 classes, such as AI-Lab-Makerere/beans, unzip it, and save it to the directory that contains the script. Resize all images to 224x224.
  3. Configure accelerate (same steps as the official docs)
    pip install git+https://github.com/huggingface/accelerate
    accelerate config
    accelerate test
  4. Run the script:
    accelerate launch run_image_classification_no_trainer.py --image_column_name img --output_dir ./default_model --train_dir ./train
    Here, ./train is the root directory of the beans dataset.
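To make the expected layout of ./train concrete: the beans dataset unpacks into one subdirectory per class, which is the "imagefolder" convention that --train_dir is expected to follow. The sketch below (stdlib only; the directory layout and helper function are illustrative, not part of the example script) shows how the 3 class names would be inferred from that layout:

```python
import os
import tempfile

def infer_imagefolder_labels(data_dir):
    """Infer class labels from subdirectory names, mirroring the
    imagefolder convention (one subfolder per class) assumed for
    the directory passed via --train_dir."""
    return sorted(
        entry for entry in os.listdir(data_dir)
        if os.path.isdir(os.path.join(data_dir, entry))
    )

# Mock the beans dataset layout: three class subfolders under ./train
with tempfile.TemporaryDirectory() as root:
    for label in ("angular_leaf_spot", "bean_rust", "healthy"):
        os.makedirs(os.path.join(root, label))
    labels = infer_imagefolder_labels(root)
    print(labels)       # the 3 beans classes, not 10 CIFAR-10 classes
    print(len(labels))  # 3
```

A correctly trained model should therefore end up with these 3 labels in its config, not the 10 CIFAR-10 labels shown below.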

Result

The model is incorrectly trained on CIFAR-10 even though a custom dataset was specified with --train_dir.
The config.json file created in the output directory lists the 10 CIFAR-10 classes, confirming the issue:

{
  "architectures": [
    "ViTForImageClassification"
  ],
  "attention_probs_dropout_prob": 0.0,
  "dtype": "float32",
  "encoder_stride": 16,
  "hidden_act": "gelu",
  "hidden_dropout_prob": 0.0,
  "hidden_size": 768,
  "id2label": {
    "0": "airplane",
    "1": "automobile",
    "2": "bird",
    "3": "cat",
    "4": "deer",
    "5": "dog",
    "6": "frog",
    "7": "horse",
    "8": "ship",
    "9": "truck"
  },
  "image_size": 224,
  "initializer_range": 0.02,
  "intermediate_size": 3072,
  "label2id": {
    "airplane": "0",
    "automobile": "1",
    "bird": "2",
    "cat": "3",
    "deer": "4",
    "dog": "5",
    "frog": "6",
    "horse": "7",
    "ship": "8",
    "truck": "9"
  },
  "layer_norm_eps": 1e-12,
  "model_type": "vit",
  "num_attention_heads": 12,
  "num_channels": 3,
  "num_hidden_layers": 12,
  "patch_size": 16,
  "pooler_act": "tanh",
  "pooler_output_size": 768,
  "problem_type": "single_label_classification",
  "qkv_bias": true,
  "transformers_version": "5.2.0"
}
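The mismatch can be checked programmatically by counting the entries in id2label of the saved config. This is a hypothetical sanity check, not part of the example script; the id2label dict below is copied from the config.json shown above:

```python
import json
import os
import tempfile

def num_labels(config_path):
    """Count the classes a saved transformers config was built with,
    using its id2label mapping."""
    with open(config_path) as f:
        return len(json.load(f)["id2label"])

# Reconstruct the id2label block from the config.json shown above.
cifar_config = {"id2label": {str(i): name for i, name in enumerate(
    ["airplane", "automobile", "bird", "cat", "deer",
     "dog", "frog", "horse", "ship", "truck"])}}

with tempfile.TemporaryDirectory() as d:
    path = os.path.join(d, "config.json")
    with open(path, "w") as f:
        json.dump(cifar_config, f)
    n = num_labels(path)
    print(n)  # 10, but the beans dataset has only 3 classes
```

Running the same check against ./default_model/config.json after a correct run should report 3.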

Expected behavior

The model should be fine-tuned on the beans dataset instead of falling back to CIFAR-10.
