docs: Added starter dev notes on push to hugging face hub #355

Open
nabinchha wants to merge 9 commits into main from nmulepati/docs/dev-notes-push-to-huggingface-hub
@nabinchha nabinchha commented Feb 26, 2026

Adds a dev note post covering the push_to_hub feature of Data Designer.

@nabinchha nabinchha requested a review from a team as a code owner February 26, 2026 18:20
greptile-apps bot commented Feb 26, 2026

Greptile Summary

This PR adds a new developer notes post covering the push_to_hub / push_to_hub_from_folder feature of Data Designer, along with four supporting PNG images, two new author entries in .authors.yml, and an updated mkdocs.yml nav entry.

The post is well-structured and covers the full feature surface — auth resolution, what gets uploaded, processor first-class treatment, auto-generated dataset cards, and round-trip reproducibility via builder_config.json. The previously flagged <!-- more --> multiplicity issue has been resolved (single marker placed correctly after the intro), and the dataset card template path has been updated to the full, verified path packages/data-designer/src/data_designer/integrations/huggingface/dataset_card_template.md.
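For readers who want a concrete picture of the round trip mentioned above, here is a minimal sketch of locating and downloading builder_config.json from a pushed dataset. It assumes the file sits at the repository root and uses the Hub's standard `resolve/<revision>/<path>` raw-file URL. The helper names `builder_config_url` and `fetch_builder_config` are hypothetical and are not part of Data Designer.

```python
import json
from urllib.parse import quote
from urllib.request import urlopen


def builder_config_url(repo_id: str, revision: str = "main") -> str:
    """Build the raw-file URL for builder_config.json in a Hub dataset repo.

    Hypothetical helper: assumes the file is stored at the repository root.
    """
    # repo_id looks like "<namespace>/<name>"; keep the slash, escape the rest.
    safe_repo = quote(repo_id, safe="/")
    safe_revision = quote(revision, safe="")
    return (
        "https://huggingface.co/datasets/"
        f"{safe_repo}/resolve/{safe_revision}/builder_config.json"
    )


def fetch_builder_config(repo_id: str, revision: str = "main") -> dict:
    """Download and parse the config (needs network access and a public repo)."""
    with urlopen(builder_config_url(repo_id, revision)) as response:
        return json.load(response)
```

The parsed dictionary (or the URL itself) would then be handed to from_config() to recreate the pipeline. The exact from_config() signature is not shown in this PR, so it is left out of the sketch.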

  • The only new finding is plans/479/skip-when-conditional-generation.md — an empty, unrelated file that crept into the branch via a Merge branch 'main' commit. It should be dropped before merging to keep the diff clean.

Confidence Score: 5/5

Safe to merge — all changes are documentation only, with one trivial empty-file cleanup suggestion

No P0 or P1 issues found. The single remaining finding (empty unrelated plans file) is P2 and has no functional impact on the published docs site. Prior review concerns around the excerpt marker have been addressed.

plans/479/skip-when-conditional-generation.md — empty unrelated file that should be removed from the PR

Important Files Changed

| Filename | Overview |
| --- | --- |
| docs/devnotes/posts/push-datasets-to-hugging-face-hub.md | New blog post documenting the push_to_hub feature; single `<!-- more -->` marker placed correctly after the intro, template path verified as the full path, content is accurate and well-structured |
| docs/devnotes/.authors.yml | Added two new blog authors (nmulepati, davanstrien) with correct names, descriptions, and GitHub avatar URLs |
| mkdocs.yml | Added nav entry for the new push-to-hub post in the correct position (most recent first) |
| plans/479/skip-when-conditional-generation.md | Empty file unrelated to this PR, pulled in from main via a merge commit; should be excluded from this changeset |

Sequence Diagram

sequenceDiagram
    participant User
    participant DataDesigner
    participant Results
    participant HFHubClient
    participant HuggingFaceHub

    User->>DataDesigner: create(config_builder, num_records)
    DataDesigner-->>Results: results object (parquet + processor files)

    alt Happy path
        User->>Results: push_to_hub(repo_id, description, tags)
        Results->>HFHubClient: push_to_hub_from_folder(dataset_path, repo_id, ...)
    else Saved artifacts path
        User->>HFHubClient: push_to_hub_from_folder(dataset_path, repo_id, ...)
    end

    HFHubClient->>HFHubClient: Resolve token (explicit → HF_TOKEN → cached creds)
    HFHubClient->>HuggingFaceHub: Upload README.md (dataset card)
    HFHubClient->>HuggingFaceHub: Upload data/*.parquet
    HFHubClient->>HuggingFaceHub: Upload images/* (if present)
    HFHubClient->>HuggingFaceHub: Upload {processor}/* per processor
    HFHubClient->>HuggingFaceHub: Upload builder_config.json
    HFHubClient->>HuggingFaceHub: Upload metadata.json (paths rewritten)
    HuggingFaceHub-->>User: dataset URL

    Note over User,HuggingFaceHub: Round-trip: load builder_config.json URL → from_config() → recreate pipeline
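
The token resolution step in the diagram (explicit → HF_TOKEN → cached creds) can be sketched as a simple fallback chain. This is an illustration under stated assumptions, not the client's actual implementation: the function name `resolve_token` and the injectable `read_cached_token` hook are hypothetical, and the hook stands in for whatever reads the locally cached login.

```python
import os
from typing import Callable, Mapping, Optional


def resolve_token(
    explicit_token: Optional[str] = None,
    env: Mapping[str, str] = os.environ,
    read_cached_token: Callable[[], Optional[str]] = lambda: None,
) -> Optional[str]:
    """Return the first available token: explicit -> HF_TOKEN -> cached creds.

    Hypothetical sketch; the real client may differ in names and details.
    """
    # 1. A token passed directly by the caller always wins.
    if explicit_token:
        return explicit_token
    # 2. Otherwise fall back to the HF_TOKEN environment variable.
    env_token = env.get("HF_TOKEN")
    if env_token:
        return env_token
    # 3. Finally, try the cached credentials (assumed hook, returns None if absent).
    return read_cached_token()
```

Passing `env` and `read_cached_token` as parameters keeps the precedence order easy to test without touching the real environment or the credential cache.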
Prompt To Fix All With AI
This is a comment left during a code review.
Path: plans/479/skip-when-conditional-generation.md
Line: 1

Comment:
**Unrelated empty file included via merge**

This file is completely empty and unrelated to the push-to-hub docs PR. It appears to have been pulled in through one of the `Merge branch 'main' into ...` commits while syncing the branch. It belongs to a separate planning effort (`plans/479/`) and should not be part of this changeset.

Consider removing it before merging to keep the PR diff focused.

How can I resolve this? If you propose a fix, please make it concise.

Reviews (9): Last reviewed commit: "Merge branch 'main' into nmulepati/docs/..."

dhruvnathawani previously approved these changes Feb 26, 2026

@dhruvnathawani dhruvnathawani left a comment

Did you use AI for the images?
LGTM

Move the single <!-- more --> to after the intro paragraph for a shorter
blog teaser and remove the 6 redundant markers throughout the post.
@nabinchha
Contributor Author

Did you use AI for the images? LGTM

@dhruvnathawani, yes!

nabinchha and others added 2 commits March 9, 2026 09:45
Co-authored-by: greptile-apps[bot] <165735046+greptile-apps[bot]@users.noreply.github.com>
davanstrien and others added 3 commits March 30, 2026 09:13
* docs: add HF ecosystem context to push-to-hub dev notes

Add section on what datasets get on the Hub (Dataset Viewer, streaming,
Viewer API), link to Hub search for DataDesigner datasets, and note that
private datasets can be flipped to public.

* Update docs/devnotes/posts/push-datasets-to-hugging-face-hub.md

Co-authored-by: greptile-apps[bot] <165735046+greptile-apps[bot]@users.noreply.github.com>

* fix: remove doubled library: prefix in Hub search URL

---------

Co-authored-by: greptile-apps[bot] <165735046+greptile-apps[bot]@users.noreply.github.com>

3 participants