
Commit 274783b

severo and mishig25 authored
Fix links (#1377)
* no ../ links to other doc sites
* fix link
* fix link
* remove broken links
* fix link
* fix format
* fix anchor
* fix anchors
* Update docs/hub/datasets-adding.md

Co-authored-by: Mishig <[email protected]>

* https://.../docs -> /docs

---------

Co-authored-by: Mishig <[email protected]>
1 parent 5aebdba commit 274783b

Showing 29 changed files with 48 additions and 54 deletions.

docs/hub/datasets-adding.md

Lines changed: 4 additions & 4 deletions
@@ -63,11 +63,11 @@ Adding a Dataset card is super valuable for helping users find your dataset and

 ## Using the `huggingface_hub` client library

-The rich feature set in the `huggingface_hub` library allows you to manage repositories, including creating repos and uploading datasets to the Hub. Visit [the client library's documentation](https://huggingface.co/docs/huggingface_hub/index) to learn more.
+The rich feature set in the `huggingface_hub` library allows you to manage repositories, including creating repos and uploading datasets to the Hub. Visit [the client library's documentation](/docs/huggingface_hub/index) to learn more.

 ## Using other libraries

-Some libraries like [🤗 Datasets](../datasets/index), [Pandas](https://pandas.pydata.org/), [Polars](https://pola.rs), [Dask](https://www.dask.org/) or [DuckDB](https://duckdb.org/) can upload files to the Hub.
+Some libraries like [🤗 Datasets](/docs/datasets/index), [Pandas](https://pandas.pydata.org/), [Polars](https://pola.rs), [Dask](https://www.dask.org/) or [DuckDB](https://duckdb.org/) can upload files to the Hub.
 See the list of [Libraries supported by the Datasets Hub](./datasets-libraries) for more information.

 ## Using Git
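To make the client-library workflow in this hunk concrete, here is a minimal sketch using `huggingface_hub` (the repo id and local folder are hypothetical):

```python
from huggingface_hub import HfApi

api = HfApi()

# Create the dataset repo (no-op if it already exists)
api.create_repo(repo_id="username/my_dataset", repo_type="dataset", exist_ok=True)

# Upload a local folder of data files to the repo
api.upload_folder(
    folder_path="./data",           # hypothetical local folder
    repo_id="username/my_dataset",  # hypothetical repo
    repo_type="dataset",
)
```
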
@@ -107,8 +107,8 @@ After uploading your dataset, make sure the Dataset Viewer correctly shows your

 ## Large scale datasets

-The Hugging Face Hub supports large scale datasets, usually uploaded in Parquet (e.g. via `push_to_hub()` using [🤗 Datasets](https://huggingface.co/docs/datasets/main/en/package_reference/main_classes#datasets.Dataset.push_to_hub)) or [WebDataset](https://github.com/webdataset/webdataset) format.
+The Hugging Face Hub supports large scale datasets, usually uploaded in Parquet (e.g. via `push_to_hub()` using [🤗 Datasets](/docs/datasets/main/en/package_reference/main_classes#datasets.Dataset.push_to_hub)) or [WebDataset](https://github.com/webdataset/webdataset) format.

 You can upload large scale datasets at high speed using the `huggingface-hub` library.

-See [how to upload a folder by chunks](https://huggingface.co/docs/huggingface_hub/guides/upload#upload-a-folder-by-chunks), the [tips and tricks for large uploads](https://huggingface.co/docs/huggingface_hub/guides/upload#tips-and-tricks-for-large-uploads) and the [repository limitations and recommendations](./repositories-recommendations).
+See [how to upload a folder by chunks](/docs/huggingface_hub/guides/upload#upload-a-folder-by-chunks), the [tips and tricks for large uploads](/docs/huggingface_hub/guides/upload#tips-and-tricks-for-large-uploads) and the [repository limitations and recommendations](./repositories-recommendations).
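The `push_to_hub()` route mentioned in this hunk, sketched briefly (the CSV path and repo id are hypothetical):

```python
from datasets import load_dataset

# Build a dataset from local files, then push it to the Hub,
# where it is stored as Parquet under the hood.
ds = load_dataset("csv", data_files="data/train.csv")
ds.push_to_hub("username/my_dataset")
```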

docs/hub/datasets-argilla.md

Lines changed: 1 addition & 1 deletion
@@ -27,7 +27,7 @@ AI teams from companies like [the Red Cross](https://510.global/), [Loris.ai](ht

 ## Prerequisites

-First [login with your Hugging Face account](../huggingface_hub/quick-start#login):
+First [login with your Hugging Face account](/docs/huggingface_hub/quick-start#login):

 ```bash
 huggingface-cli login

docs/hub/datasets-dask.md

Lines changed: 5 additions & 5 deletions
@@ -1,23 +1,23 @@
 # Dask

 [Dask](https://github.com/dask/dask) is a parallel and distributed computing library that scales the existing Python and PyData ecosystem.
-Since it uses [fsspec](https://filesystem-spec.readthedocs.io) to read and write remote data, you can use the Hugging Face paths ([`hf://`](https://huggingface.co/docs/huggingface_hub/guides/hf_file_system#integrations)) to read and write data on the Hub:
+Since it uses [fsspec](https://filesystem-spec.readthedocs.io) to read and write remote data, you can use the Hugging Face paths ([`hf://`](/docs/huggingface_hub/guides/hf_file_system#integrations)) to read and write data on the Hub:

-First you need to [Login with your Hugging Face account](../huggingface_hub/quick-start#login), for example using:
+First you need to [Login with your Hugging Face account](/docs/huggingface_hub/quick-start#login), for example using:

 ```
 huggingface-cli login
 ```

-Then you can [Create a dataset repository](../huggingface_hub/quick-start#create-a-repository), for example using:
+Then you can [Create a dataset repository](/docs/huggingface_hub/quick-start#create-a-repository), for example using:

 ```python
 from huggingface_hub import HfApi

 HfApi().create_repo(repo_id="username/my_dataset", repo_type="dataset")
 ```

-Finally, you can use [Hugging Face paths](https://huggingface.co/docs/huggingface_hub/guides/hf_file_system#integrations) in Dask:
+Finally, you can use [Hugging Face paths](/docs/huggingface_hub/guides/hf_file_system#integrations) in Dask:

 ```python
 import dask.dataframe as dd
@@ -44,4 +44,4 @@ df_valid = dd.read_parquet("hf://datasets/username/my_dataset/validation")
 df_test = dd.read_parquet("hf://datasets/username/my_dataset/test")
 ```

-For more information on the Hugging Face paths and how they are implemented, please refer to [the client library's documentation on the HfFileSystem](https://huggingface.co/docs/huggingface_hub/guides/hf_file_system).
+For more information on the Hugging Face paths and how they are implemented, please refer to [the client library's documentation on the HfFileSystem](/docs/huggingface_hub/guides/hf_file_system).
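For reference, the `HfFileSystem` behind these `hf://` paths can also be used directly; a small sketch reusing the hypothetical repo above (the exact file path is an assumption):

```python
from huggingface_hub import HfFileSystem

fs = HfFileSystem()

# List the Parquet files in the (hypothetical) dataset repo
print(fs.glob("datasets/username/my_dataset/**/*.parquet"))

# Open a file like any fsspec filesystem
with fs.open("datasets/username/my_dataset/train/0.parquet", "rb") as f:
    print(f.read(4))  # Parquet files start with the magic bytes b"PAR1"
```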

docs/hub/datasets-data-files-configuration.md

Lines changed: 1 addition & 1 deletion
@@ -7,7 +7,7 @@ Often it is as simple as naming your data files according to their split names,

 ## What are splits and subsets?

-Machine learning datasets typically have splits and may also have subsets. A dataset is generally made of _splits_ (e.g. `train` and `test`) that are used during different stages of training and evaluating a model. A _subset_ (also called _configuration_) is a sub-dataset contained within a larger dataset. Subsets are especially common in multilingual speech datasets where there may be a different subset for each language. If you're interested in learning more about splits and subsets, check out the [Splits and subsets](https://huggingface.co/docs/datasets-server/configs_and_splits) guide!
+Machine learning datasets typically have splits and may also have subsets. A dataset is generally made of _splits_ (e.g. `train` and `test`) that are used during different stages of training and evaluating a model. A _subset_ (also called _configuration_) is a sub-dataset contained within a larger dataset. Subsets are especially common in multilingual speech datasets where there may be a different subset for each language. If you're interested in learning more about splits and subsets, check out the [Splits and subsets](/docs/datasets-server/configs_and_splits) guide!

 ![split-configs-server](https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/split-configs-server.gif)
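As a concrete illustration of the file-naming convention this page describes, a toy example (repo id and file names are hypothetical):

```python
from datasets import load_dataset

# Assuming a dataset repo laid out as:
#   data/train.csv
#   data/test.csv
# the split names are inferred from the file names.
ds = load_dataset("username/my_dataset")
print(ds)  # DatasetDict with "train" and "test" splits
```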

docs/hub/datasets-distilabel.md

Lines changed: 1 addition & 1 deletion
@@ -14,7 +14,7 @@ The Argilla community uses distilabel to create amazing [datasets](https://huggi

 ## Prerequisites

-First [login with your Hugging Face account](../huggingface_hub/quick-start#login):
+First [login with your Hugging Face account](/docs/huggingface_hub/quick-start#login):

 ```bash
 huggingface-cli login

docs/hub/datasets-download-stats.md

Lines changed: 1 addition & 1 deletion
@@ -4,5 +4,5 @@

 The Hub provides download stats for all datasets loadable via the `datasets` library. To determine the number of downloads, the Hub counts every time `load_dataset` is called in Python, excluding Hugging Face's CI tooling on GitHub. No information is sent from the user, and no additional calls are made for this. The count is done server-side as we serve files for downloads. This means that:

-* The download count is the same regardless of whether the data is directly stored on the Hub repo or if the repository has a [script](https://huggingface.co/docs/datasets/dataset_script) to load the data from an external source.
+* The download count is the same regardless of whether the data is directly stored on the Hub repo or if the repository has a [script](/docs/datasets/dataset_script) to load the data from an external source.
 * If a user manually downloads the data using tools like `wget` or the Hub's user interface (UI), those downloads will not be included in the download count.

docs/hub/datasets-downloading.md

Lines changed: 1 addition & 1 deletion
@@ -16,7 +16,7 @@ If a dataset on the Hub is tied to a [supported library](./datasets-libraries),

 ## Using the Hugging Face Client Library

-You can use the [`huggingface_hub`](https://huggingface.co/docs/huggingface_hub) library to create, delete, update and retrieve information from repos. You can also download files from repos or integrate them into your library! For example, you can quickly load a CSV dataset with a few lines using Pandas.
+You can use the [`huggingface_hub`](/docs/huggingface_hub) library to create, delete, update and retrieve information from repos. You can also download files from repos or integrate them into your library! For example, you can quickly load a CSV dataset with a few lines using Pandas.

 ```py
 from huggingface_hub import hf_hub_download
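The snippet is truncated by the diff; the pattern it introduces looks roughly like this (repo and file names are hypothetical):

```python
import pandas as pd
from huggingface_hub import hf_hub_download

# Download one file from a dataset repo to the local cache,
# then read it like any local CSV.
path = hf_hub_download(
    repo_id="username/my_dataset",  # hypothetical repo
    filename="data.csv",            # hypothetical file
    repo_type="dataset",
)
df = pd.read_csv(path)
```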

docs/hub/datasets-duckdb-auth.md

Lines changed: 2 additions & 2 deletions
@@ -31,7 +31,7 @@ CREATE SECRET hf_token (TYPE HUGGINGFACE, PROVIDER credential_chain);

 This command automatically retrieves the stored token from `~/.cache/huggingface/token`.

-First you need to [Login with your Hugging Face account](../huggingface_hub/quick-start#login), for example using:
+First you need to [Login with your Hugging Face account](/docs/huggingface_hub/quick-start#login), for example using:

 ```bash
 huggingface-cli login
@@ -43,4 +43,4 @@ Alternatively, you can set your Hugging Face token as an environment variable:
 export HF_TOKEN="hf_xxxxxxxxxxxxx"
 ```

-For more information on authentication, see the [Hugging Face authentication](https://huggingface.co/docs/huggingface_hub/main/en/quick-start#authentication) documentation.
+For more information on authentication, see the [Hugging Face authentication](/docs/huggingface_hub/main/en/quick-start#authentication) documentation.
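The same authentication flow works from the DuckDB Python client; a minimal sketch (the queried path is hypothetical, and assumes a DuckDB release with `hf://` support):

```python
import duckdb

con = duckdb.connect()

# Reuse the token stored by `huggingface-cli login`
con.sql("CREATE SECRET hf_token (TYPE HUGGINGFACE, PROVIDER credential_chain);")

# Query a (hypothetical) private or gated dataset over hf://
df = con.sql(
    "SELECT * FROM 'hf://datasets/username/my_dataset/data.parquet' LIMIT 10"
).df()
print(df)
```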

docs/hub/datasets-duckdb-select.md

Lines changed: 1 addition & 1 deletion
@@ -49,7 +49,7 @@ SELECT COUNT(*) FROM 'hf://datasets/jamescalam/world-cities-geo/*.jsonl';

 ```

-You can also query Parquet files using the `read_parquet` function (or its alias `parquet_scan`). This function, along with other [parameters]((https://duckdb.org/docs/data/parquet/overview.html#parameters)), provides flexibility in handling Parquet files, especially if they don't have a `.parquet` extension. Let's explore these functions using the auto-converted Parquet files from the same dataset.
+You can also query Parquet files using the `read_parquet` function (or its alias `parquet_scan`). This function, along with other [parameters](https://duckdb.org/docs/data/parquet/overview.html#parameters), provides flexibility in handling Parquet files, especially if they don't have a `.parquet` extension. Let's explore these functions using the auto-converted Parquet files from the same dataset.

 Select using [read_parquet](https://duckdb.org/docs/guides/file_formats/query_parquet.html) function:

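To make the `read_parquet` point concrete, a sketch via the DuckDB Python client (the exact path to the auto-converted Parquet files is an assumption):

```python
import duckdb

# read_parquet (alias: parquet_scan) treats files as Parquet even
# without a .parquet extension; the path below is a guess at the
# dataset's auto-converted Parquet files.
duckdb.sql("""
    SELECT * FROM read_parquet(
        'hf://datasets/jamescalam/world-cities-geo@~parquet/default/train/*.parquet'
    ) LIMIT 5
""").show()
```
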
docs/hub/datasets-fiftyone.md

Lines changed: 1 addition & 1 deletion
@@ -15,7 +15,7 @@ FiftyOne datasets directly from the Hub.

 ## Prerequisites

-First [login with your Hugging Face account](../huggingface_hub/quick-start#login):
+First [login with your Hugging Face account](/docs/huggingface_hub/quick-start#login):

 ```bash
 huggingface-cli login
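After logging in, loading a Hub dataset into FiftyOne looks roughly like the following sketch (the repo id is hypothetical, and the exact entry point should be checked against the FiftyOne docs):

```python
import fiftyone as fo
import fiftyone.utils.huggingface as fouh

# Load a (hypothetical) dataset repo from the Hub into FiftyOne
dataset = fouh.load_from_hub("username/my_dataset")

# Explore it in the FiftyOne App
session = fo.launch_app(dataset)
```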
