Possible missing files in Hugging Face train_dataset/mcut/mcut_ba_small

Hi Jiale,

Thanks for open-sourcing ML4CO-Bench-101. It has been very helpful for reproduction.

I noticed a possible issue with the Hugging Face training data for MCut.

In the repository README, users are asked to download `train_dataset` from Hugging Face. However, under:

`ML4CO-Bench-101-SL / train_dataset / mcut / mcut_ba_small`

I can currently only find one file:

- `mcut_ba-small_64k_1.txt`

This seems incomplete: the ML4CO-Bench-101 paper lists the `MCut BA-SMALL` training set size as 128,000 instances, but the current Hugging Face folder appears to contain only a single 64K shard, i.e. half of that.

For comparison, `mcut_ba_large` seems to be uploaded as multiple shards (`mcut_ba-large_16k_1.txt` to `mcut_ba-large_16k_8.txt`), which looks consistent with a complete 128K training set. So I suspect that `mcut_ba_small` may be missing the remaining shard(s), such as a second 64K file, or another equivalent complete upload.
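
As a quick sanity check on the numbers above, here is a minimal sketch that tallies the instance counts implied by the file names. It assumes that the `_64k_` / `_16k_` part of each name is the number of instances in that shard (in thousands); I have not verified this against the file contents. The helper `total_instances` and the regex are my own, not part of the benchmark code.

```python
import re

# Assumption: shard names end in "_<N>k_<index>.txt", where N is the
# number of instances in that shard, in thousands.
SHARD_RE = re.compile(r"_(\d+)k_(\d+)\.txt$")


def total_instances(filenames):
    """Sum the instance counts implied by a list of shard file names."""
    total = 0
    for name in filenames:
        match = SHARD_RE.search(name)
        if match is None:
            raise ValueError(f"unexpected shard name: {name}")
        total += int(match.group(1)) * 1000
    return total


# File names as currently listed on Hugging Face (per this issue).
small = ["mcut_ba-small_64k_1.txt"]
large = [f"mcut_ba-large_16k_{i}.txt" for i in range(1, 9)]

EXPECTED = 128_000  # training set size listed in the paper

for label, files in (("mcut_ba_small", small), ("mcut_ba_large", large)):
    found = total_instances(files)
    status = "ok" if found == EXPECTED else f"short by {EXPECTED - found}"
    print(f"{label}: {found} instances ({status})")
```

Under that naming assumption, `mcut_ba_large` adds up to the full 128,000 instances, while `mcut_ba_small` comes out 64,000 short.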

Could you please check whether the `mcut_ba_small` training dataset on Hugging Face is incomplete, and if so, upload the full version?

This would be very helpful for reproducing the MCut BA-SMALL experiments in the benchmark.

Thanks again for releasing the benchmark and the code.

Best,
53mins
