
Commit 02413d1

FarukhS52 and sdiazlor authored
[Docs] : fix typos in docs (#5612)
# Description

Closes #<issue_number>

**Type of change**

- Bug fix (non-breaking change which fixes an issue)
- New feature (non-breaking change which adds functionality)
- Breaking change (fix or feature that would cause existing functionality to not work as expected)
- Refactor (change restructuring the codebase without changing functionality)
- Improvement (change adding some improvement to an existing functionality)
- Documentation update

**How Has This Been Tested**

**Checklist**

- I added relevant documentation
- I followed the style guidelines of this project
- I did a self-review of my code
- I made corresponding changes to the documentation
- I confirm My changes generate no new warnings
- I have added tests that prove my fix is effective or that my feature works
- I have added relevant notes to the CHANGELOG.md file (See https://keepachangelog.com/)

---------

Co-authored-by: Sara Han <[email protected]>
1 parent 038172c commit 02413d1

File tree

8 files changed (+15, -15 lines changed)

docs/_source/conceptual_guides/data_model.md

Lines changed: 2 additions & 2 deletions
@@ -133,7 +133,7 @@ record = rg.TextClassificationRecord(
 
 ##### Token classification
 
-Tasks of the kind of token classification are NLP tasks aimed at dividing the input text into words, or syllables, and assigning certain values to them. Think about giving each word in a sentence its grammatical category or highlight which parts of a medical report belong to a certain specialty. There are some popular ones like NER or POS-tagging.
+Tasks of the kind of token classification are NLP tasks aimed at dividing the input text into words, or syllables, and assigning certain values to them. Think about giving each word in a sentence its grammatical category or highlight which parts of a medical report belong to a certain speciality. There are some popular ones like NER or POS-tagging.
 
 ```python
 record = rg.TokenClassificationRecord(
@@ -190,4 +190,4 @@ You can see our supported tasks at {ref}`tasks`.
 
 ### Settings
 
-For now, only a set of predefined labels (labels schema) is configurable. Still, other settings like annotators, and metadata schema, are planned to be supported as part of dataset settings.
+For now, only a set of predefined labels (labels schema) is configurable. Still, other settings like annotators, and metadata schema, are planned to be supported as part of dataset settings.
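
The first hunk cuts off right after the opening line of the `TokenClassificationRecord` example in the source file. For orientation, a minimal sketch of such a record in the Argilla Python client; the text, tokens, and span offsets below are illustrative values, not content from this commit:

```python
import argilla as rg

# Illustrative NER-style record: prediction spans are (label, char_start, char_end), end-exclusive.
record = rg.TokenClassificationRecord(
    text="Argilla is hosted in Madrid",
    tokens=["Argilla", "is", "hosted", "in", "Madrid"],
    prediction=[("ORG", 0, 7), ("LOC", 21, 27)],
)
```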

docs/_source/getting_started/argilla.md

Lines changed: 1 addition & 1 deletion
@@ -138,7 +138,7 @@ Finally, platforms like Snorkel, Prodigy or Scale, while more comprehensive, oft
 <summary>What is Argilla currently working on?</summary>
 <p>
 
-We are continuously working on improving Argilla's features and usability, focusing now concentrating on a three-pronged vision: the development of Argilla Core (open-source), Distilabel, and Argilla JS/TS. You can find a list of our current projects <a href="https://github.com/orgs/argilla-io/projects/10/views/1">here</a>.
+We are continuously working on improving Argilla's features and usability, focusing now on a three-pronged vision: the development of Argilla Core (open-source), Distilabel, and Argilla JS/TS. You can find a list of our current projects <a href="https://github.com/orgs/argilla-io/projects/10/views/1">here</a>.
 
 </p>
 </details>

docs/_source/getting_started/installation/deployments/cloud_providers.md

Lines changed: 1 addition & 1 deletion
@@ -157,7 +157,7 @@ gcloud auth login
 
 ### 2. Build and deploy the container
 
-We will use the `gcloud run deploy` command to deploy the Argilla container directly from the Docker Hub. We can point the cloud run url to the container's default port (6900) and define relevant compute resouces.
+We will use the `gcloud run deploy` command to deploy the Argilla container directly from the Docker Hub. We can point the cloud run url to the container's default port (6900) and define relevant compute resources.
 
 ```bash
 gcloud run deploy <deployment-name> \
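
The hunk stops at the first line of the `gcloud run deploy` invocation. As a hedged sketch of how such a command can be completed (the deployment name, image, region and resource sizes are placeholders, not values taken from this commit):

```bash
# Placeholder values throughout; adjust the name, image tag, region and sizing to your own project.
gcloud run deploy argilla-server \
  --image argilla/argilla-quickstart:latest \
  --region europe-west1 \
  --port 6900 \
  --cpu 2 \
  --memory 4Gi \
  --allow-unauthenticated
```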

docs/_source/practical_guides/annotate_dataset.md

Lines changed: 4 additions & 4 deletions
@@ -90,7 +90,7 @@ You can track your progress and the number of `Pending`, `Draft`, `Submitted` an
 
 In Argilla's Feedback Task datasets, you can annotate and process records in two ways:
 
-- **Focus view**: you can only see, respond and perfom actions on one record at a time. This is better for records that need to be examined closely and individually before responding.
+- **Focus view**: you can only see, respond and perform actions on one record at a time. This is better for records that need to be examined closely and individually before responding.
 - **Bulk view**: you can see multiple records in a list so you can respond and perform actions on more than one record at a time. This is useful for actions that can be taken on many records that have similar characteristics e.g., apply the same label to the results of a similarity search, discard all records in a specific language or save/submit records with a suggestion score over a safe threshold.
 
 ```{hint}
@@ -105,7 +105,7 @@ If you have a Span question in your dataset, you can always answer other questio
 
 In the queue of **Pending** records, you can change from _Focus_ to _Bulk_ view. Once in the _Bulk view_, you can expand or collapse records --i.e. see the full length of all records in the page or set a fixed height-- and select the number of records you want to see per page.
 
-To select or unselect all records in the page, click on the checkbox above the record list. To select or unselect specific records, click on the checkbox inside the individual record card. When you use filters inside the bulk view and the results are higher than the records visible in the page but lower than 1000, you will also have the option to select all of the results after you click on the checkbox. You can cancel this selection clicking on the _Cancel_ button.
+To select or unselect all records in the page, click on the checkbox above the record list. To select or unselect specific records, click on the checkbox inside the individual record card. When you use filters inside the bulk view and the results are higher than the records visible in the page but lower than 1000, you will also have the option to select all of the results after you click on the checkbox. You can cancel this selection by clicking on the _Cancel_ button.
 
 Once records are selected, choose the responses that apply to all selected records (if any) and do the desired action: _Discard_, _Save as draft_ or even _Submit_. Note that you can only submit the records if all required questions have been answered.
 
@@ -169,7 +169,7 @@ Not all filters listed below are available for all tasks.
 
 ##### Predictions filter
 
-This filter allows you to filter records with respect of their predictions:
+This filter allows you to filter records with respect to their predictions:
 
 - **Predicted as**: filter records by their predicted labels.
 - **Predicted ok**: filter records whose predictions do, or do not, match the annotations.
@@ -291,4 +291,4 @@ If you struggle to increase the overall coverage, try to filter for the records
 #### Manage rules
 
 Here you will see a list of your saved rules.
-You can edit a rule by clicking on its name, or delete it by clicking on the trash icon.
+You can edit a rule by clicking on its name, or delete it by clicking on the trash icon.

docs/_source/practical_guides/collect_responses.md

Lines changed: 2 additions & 2 deletions
@@ -183,7 +183,7 @@ We plan on adding more support for other metrics so feel free to reach out on ou
 
 #### Model Metrics
 
-In contrast to agreement metrics, where we compare the responses of annotators with each other, it is a good practice to evaluate the suggestions of models against the annotators as ground truths. As `FeedbackDataset` already offers the possibility to add `suggestions` to the responses, we can compare these initial predictions against the verified reponses. This will give us two important insights: how reliable the responses of a given annotator are, and how good the suggestions we are giving to the annotators are. This way, we can take action to improve the quality of the responses by making changes to the guidelines or the structure, and the suggestions given to the annotators by changing or updating the model we use. Note that each question type has a different set of metrics available.
+In contrast to agreement metrics, where we compare the responses of annotators with each other, it is a good practice to evaluate the suggestions of models against the annotators as ground truths. As `FeedbackDataset` already offers the possibility to add `suggestions` to the responses, we can compare these initial predictions against the verified responses. This will give us two important insights: how reliable the responses of a given annotator are, and how good the suggestions we are giving to the annotators are. This way, we can take action to improve the quality of the responses by making changes to the guidelines or the structure, and the suggestions given to the annotators by changing or updating the model we use. Note that each question type has a different set of metrics available.
 
 Here is an example use of the `compute` function to calculate the metrics for a `FeedbackDataset`:
 
@@ -495,4 +495,4 @@ f1(name="sst2").visualize()
 # now compute metrics for negation ( -> negative precision and positive recall go down)
 f1(name="sst2", query="n't OR not").visualize()
 ```
-![F1 metrics from query](/_static/images/guides/metrics/negation_f1.png)
+![F1 metrics from query](/_static/images/guides/metrics/negation_f1.png)
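
The changed paragraph leads into a `compute` example that falls outside this hunk. A rough sketch of what computing suggestion-vs-response metrics can look like; the import path, class name, and argument names here are assumptions and should be checked against the Argilla metrics docs:

```python
import argilla as rg
from argilla.client.feedback.metrics import ModelMetric  # assumed import path

dataset = rg.FeedbackDataset.from_argilla("my-dataset", workspace="my-workspace")

# Compare model suggestions against annotator responses for one question
# (class name, constructor arguments and metric name are assumptions).
metric = ModelMetric(dataset=dataset, question_name="label")
report = metric.compute("accuracy")
print(report)
```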

docs/_source/practical_guides/export_dataset.md

Lines changed: 2 additions & 2 deletions
@@ -20,7 +20,7 @@ remote_dataset = rg.FeedbackDataset.from_argilla("my-dataset", workspace="my-wor
 local_dataset = remote_dataset.pull(max_records=100) # get first 100 records
 ```
 
-If your dataset includes vectors, by default these will **not** get pulled with the rest of the dataset in order to improve performace. If you would like to pull the vectors in your records, you will need to specify it like so:
+If your dataset includes vectors, by default these will **not** get pulled with the rest of the dataset in order to improve performance. If you would like to pull the vectors in your records, you will need to specify it like so:
 
 ::::{tab-set}
 
@@ -204,4 +204,4 @@ df = dataset_rg.to_pandas()
 df.to_csv("my_dataset.csv") # Save as CSV
 df.to_json("my_dataset.json") # Save as JSON
 df.to_parquet("my_dataset.parquet") # Save as Parquet
-```
+```
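
The changed sentence ends in "specify it like so:", with the example itself outside the hunk. A sketch of what pulling vectors may look like; the `with_vectors` argument name and its accepted values are assumptions, not confirmed by this diff:

```python
import argilla as rg

remote_dataset = rg.FeedbackDataset.from_argilla("my-dataset", workspace="my-workspace")

# Pull the first 100 records together with their vectors
# ("all" or a list of vector names; keyword name assumed).
local_dataset = remote_dataset.pull(max_records=100, with_vectors="all")
```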

docs/_source/practical_guides/fine_tune.md

Lines changed: 2 additions & 2 deletions
@@ -533,7 +533,7 @@ task = TrainingTask.for_sentence_similarity(
 )
 ```
 
-For datasets that where annotated with numerical values we could also pass the label strategy we want to use (let's assume we have another question in the dataset named "other-question" that contains values that come from rated answers):
+For datasets that were annotated with numerical values we could also pass the label strategy we want to use (let's assume we have another question in the dataset named "other-question" that contains values that come from rated answers):
 
 ```python
 task = TrainingTask.for_sentence_similarity(
@@ -1547,4 +1547,4 @@ Options:
 --update-config-kwargs TEXT update_config() kwargs to be passed as a dictionary. [default: {}]
 --help Show this message and exit.
 
-```
+```
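
The corrected sentence refers to passing a label strategy for rating-style answers, but the example block is truncated in the hunk. A rough sketch under the assumption that the strategy is passed through a `label_strategy` keyword ("other-question" comes from the surrounding text; the keyword name, its value, and the field names are assumptions):

```python
# Field names ("premise", "hypothesis") are placeholders; label_strategy is assumed.
task = TrainingTask.for_sentence_similarity(
    texts=[dataset.field_by_name("premise"), dataset.field_by_name("hypothesis")],
    label=dataset.question_by_name("other-question"),
    label_strategy="majority",
)
```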

docs/_source/tutorials_and_integrations/integrations/add_sentence_transformers_embeddings_as_vectors.ipynb

Lines changed: 1 addition & 1 deletion
@@ -23,7 +23,7 @@
 "\n",
 "The basic idea is to use a pre-trained model to generate a vector representation for each relevant `TextFields` within the records. These vectors are then indexed within our databse and can then used to search based the similarity between texts. This should be useful for searching similar records based on the semantic meaning of the text.\n",
 "\n",
-"To get the these vectors and config, we will use the `SentenceTransformersExtractor` based on the [sentence-transformers](https://www.sbert.net/index.html) library. The default model we use for this is the [TaylorAI/bge-micro-v2](https://huggingface.co/TaylorAI/bge-micro-v2), which offers a nice trade-off between speed and accuracy, but you can use any model from the [sentence-transformers](https://www.sbert.net/index.html) library or from the [Hugging Face Hub](https://huggingface.co/models?library=sentence-transformers)."
+"To get these vectors and config, we will use the `SentenceTransformersExtractor` based on the [sentence-transformers](https://www.sbert.net/index.html) library. The default model we use for this is the [TaylorAI/bge-micro-v2](https://huggingface.co/TaylorAI/bge-micro-v2), which offers a nice trade-off between speed and accuracy, but you can use any model from the [sentence-transformers](https://www.sbert.net/index.html) library or from the [Hugging Face Hub](https://huggingface.co/models?library=sentence-transformers)."
 ]
 },
 {
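
As a companion to the changed cell, a sketch of how the `SentenceTransformersExtractor` is typically wired up; the import path and the `update_dataset` call are assumptions to double-check against the integration docs:

```python
from argilla.client.feedback.integrations.sentencetransformers import (
    SentenceTransformersExtractor,  # assumed import path
)

# Default model mentioned in the cell; any sentence-transformers model should work.
extractor = SentenceTransformersExtractor(model="TaylorAI/bge-micro-v2")

# Add vectors for the given fields to every record (method and argument names assumed).
dataset = extractor.update_dataset(dataset, fields=["text"])
```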
