Skip to content
This repository was archived by the owner on Mar 9, 2025. It is now read-only.

build(deps): bump datasets from 2.19.2 to 3.2.0#227

Open
dependabot[bot] wants to merge 1 commit intomainfrom
dependabot/pip/datasets-3.2.0
Open

build(deps): bump datasets from 2.19.2 to 3.2.0#227
dependabot[bot] wants to merge 1 commit intomainfrom
dependabot/pip/datasets-3.2.0

Conversation

@dependabot
Copy link
Contributor

@dependabot dependabot bot commented on behalf of github Dec 16, 2024

Bumps datasets from 2.19.2 to 3.2.0.

Release notes

Sourced from datasets's releases.

3.2.0

Dataset Features

  • Faster parquet streaming + filters with predicate pushdown by @​lhoestq in huggingface/datasets#7309
    • Up to +100% streaming speed
    • Fast filtering via predicate pushdown (skip files/row groups based on predicate instead of downloading the full data), e.g.
      from datasets import load_dataset
      filters = [('date', '>=', '2023')]
      ds = load_dataset("HuggingFaceFW/fineweb-2", "fra_Latn", streaming=True, filters=filters)

Other improvements and bug fixes

New Contributors

Full Changelog: huggingface/datasets@3.1.0...3.2.0

3.1.0

Dataset Features

  • Video support by @​lhoestq in huggingface/datasets#7230
    >>> from datasets import Dataset, Video, load_dataset
    >>> ds = Dataset.from_dict({"video":["path/to/Screen Recording.mov"]}).cast_column("video", Video())
    >>> # or from the hub
    >>> ds = load_dataset("username/dataset_name", split="train")
    >>> ds[0]["video"]
    <decord.video_reader.VideoReader at 0x105525c70>
  • Add IterableDataset.shard() by @​lhoestq in huggingface/datasets#7252
    >>> from datasets import load_dataset
    >>> full_ds = load_dataset("amphion/Emilia-Dataset", split="train", streaming=True)
    >>> full_ds.num_shards
    2360
    >>> ds = full_ds.shard(num_shards=ds.num_shards, index=0)
    >>> ds.num_shards
    1

... (truncated)

Commits

Dependabot compatibility score

Dependabot will resolve any conflicts with this PR as long as you don't alter it yourself. You can also trigger a rebase manually by commenting @dependabot rebase.


Dependabot commands and options

You can trigger Dependabot actions by commenting on this PR:

  • @dependabot rebase will rebase this PR
  • @dependabot recreate will recreate this PR, overwriting any edits that have been made to it
  • @dependabot merge will merge this PR after your CI passes on it
  • @dependabot squash and merge will squash and merge this PR after your CI passes on it
  • @dependabot cancel merge will cancel a previously requested merge and block automerging
  • @dependabot reopen will reopen this PR if it is closed
  • @dependabot close will close this PR and stop Dependabot recreating it. You can achieve the same result by closing it manually
  • @dependabot show <dependency name> ignore conditions will show all of the ignore conditions of the specified dependency
  • @dependabot ignore this major version will close this PR and stop Dependabot creating any more for this major version (unless you reopen the PR or upgrade to it yourself)
  • @dependabot ignore this minor version will close this PR and stop Dependabot creating any more for this minor version (unless you reopen the PR or upgrade to it yourself)
  • @dependabot ignore this dependency will close this PR and stop Dependabot creating any more for this dependency (unless you reopen the PR or upgrade to it yourself)

Bumps [datasets](https://github.com/huggingface/datasets) from 2.19.2 to 3.2.0.
- [Release notes](https://github.com/huggingface/datasets/releases)
- [Commits](huggingface/datasets@2.19.2...3.2.0)

---
updated-dependencies:
- dependency-name: datasets
  dependency-type: direct:production
  update-type: version-update:semver-major
...

Signed-off-by: dependabot[bot] <support@github.com>
@dependabot dependabot bot added the dependencies Pull requests that update a dependency file label Dec 16, 2024
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.

Labels

dependencies Pull requests that update a dependency file

Projects

None yet

Development

Successfully merging this pull request may close these issues.

0 participants