Skip to content

Update readme fixes#58

Merged
ca16 merged 4 commits intomainfrom
chloea-add-back-no-sorting-for-readme-keys
Aug 14, 2025
Merged

Update readme fixes#58
ca16 merged 4 commits intomainfrom
chloea-add-back-no-sorting-for-readme-keys

Conversation

@ca16
Copy link
Copy Markdown
Collaborator

@ca16 ca16 commented Aug 13, 2025

snippet from the current README:

...
  features:
  - name: suite_config
    struct:
    - name: name
      dtype: string
    - name: version
      dtype: string
    - name: splits
      list:
      - name: name
        dtype: string
      - name: tasks
        list:
        - name: name
          dtype: string
        - name: path
          dtype: string
        - name: primary_metric
          dtype: string
        - name: tags
          sequence: string
      - name: macro_average_weight_adjustments
        list:
        - name: tag
          dtype: string
        - name: task
          dtype: string
        - name: weight
          dtype: float64
  - name: split
    dtype: string
...

if you do dry-run to update the readme without this change

  features:
  - name: suite_config
    struct:
    - dtype: string
      name: name
    - dtype: string
      name: version
    - list:
      - dtype: string
        name: name
      - list:
        - dtype: string
          name: name
        - dtype: string
          name: path
        - dtype: string
          name: primary_metric
        - list: string
          name: tags
        name: tasks
      - list:
        - dtype: string
          name: tag
        - dtype: string
          name: task
        - dtype: float64
          name: weight
        name: macro_average_weight_adjustments
      name: splits
  - dtype: string
    name: split

if you do dry-run to update the readme with this change

  features:
  - name: suite_config
    struct:
    - name: name
      dtype: string
    - name: version
      dtype: string
    - name: splits
      list:
      - name: name
        dtype: string
      - name: tasks
        list:
        - name: name
          dtype: string
        - name: path
          dtype: string
        - name: primary_metric
          dtype: string
        - name: tags
          list: string
      - name: macro_average_weight_adjustments
        list:
        - name: tag
          dtype: string
        - name: task
          dtype: string
        - name: weight
          dtype: float64
  - name: split
    dtype: string

Now that I know what's happening, it's less confusing, but might be nice to keep the old sorting for minimal diffs?

(note I do see a list vs sequence thing too but I think you said that's just packages being out of date)

@ca16 ca16 requested a review from rodneykinney August 13, 2025 17:00
@ca16
Copy link
Copy Markdown
Collaborator Author

ca16 commented Aug 13, 2025

For reference, this is where it used to be:

updated_yaml = yaml.dump(parsed_yaml, sort_keys=False).strip()

@ca16
Copy link
Copy Markdown
Collaborator Author

ca16 commented Aug 13, 2025

Note: I rebuilt my docker env (using --no-cache) and still see stuff like

        - name: tags
          list: string

vs

        - name: tags
          sequence: string

Maybe the first one is the latest thing?

@ca16
Copy link
Copy Markdown
Collaborator Author

ca16 commented Aug 13, 2025

I think I'm on the latest version of HF's datasets:

pip show datasets
Name: datasets
Version: 4.0.0
image

path_in_repo="README.md",
repo_id=repo_id,
commit_message=comment,
repo_type="dataset",
Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Got a 404 without this.

Copy link
Copy Markdown
Member

@rodneykinney rodneykinney left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmmm. I think for the maximum consistency we should have sorted keys, and run update_readme with the latest version of datasets. It might change the current README, but it should be functionally the same

@ca16
Copy link
Copy Markdown
Collaborator Author

ca16 commented Aug 13, 2025

Sounds good I'll take out the sorting=False! I think there's another small bug with a new line but I think I'm close to getting it...

configs.append(config_def)
yaml_section = yaml.dump({"configs": configs})
return f"---\n{yaml_section.strip()}\n---{self.text_content.lstrip()}"
return f"---\n{yaml_section.strip()}\n---\n{self.text_content.lstrip()}"
Copy link
Copy Markdown
Collaborator Author

@ca16 ca16 Aug 14, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Without this I got

No YAML front matter found in README.md

the next time I ran something that tried to do the readme check, after making my initial change to add a new config.

@ca16
Copy link
Copy Markdown
Collaborator Author

ca16 commented Aug 14, 2025

@ca16 ca16 changed the title Add back no sorting for readme keys Update readme fixes Aug 14, 2025
@ca16 ca16 merged commit e01a903 into main Aug 14, 2025
4 checks passed
@ca16 ca16 deleted the chloea-add-back-no-sorting-for-readme-keys branch August 14, 2025 20:44
@ca16
Copy link
Copy Markdown
Collaborator Author

ca16 commented Aug 14, 2025

Published new version.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants