Commit 9958e7f

feat: aau i can add asset metadata as a keyvalue list that will be (#1880)
Co-authored-by: paulruelle <[email protected]>
1 parent b72e190 commit 9958e7f

File tree

12 files changed

+547
-83
lines changed

docs/sdk/tutorials/importing_assets_and_metadata.md

Lines changed: 53 additions & 25 deletions
Large diffs are not rendered by default.

recipes/img/json_metadata.png

763 KB

recipes/importing_assets_and_metadata.ipynb

Lines changed: 57 additions & 35 deletions
Original file line number | Diff line number | Diff line change
@@ -51,7 +51,7 @@
5151
"cell_type": "markdown",
5252
"metadata": {},
5353
"source": [
54-
"First, let's install and import the required modules."
54+
"First, let's install and import the required modules.\n"
5555
]
5656
},
5757
{
@@ -273,48 +273,58 @@
273273
"\n",
274274
"- `imageUrl`\n",
275275
"- `text`\n",
276-
"- `url`"
276+
"- `url`\n",
277+
"\n",
278+
"\n",
279+
"## Setting metadata properties"
277280
]
278281
},
279282
{
280283
"attachments": {},
281284
"cell_type": "markdown",
282285
"metadata": {},
283286
"source": [
284-
"As an optional step, you can set data types for each type of your metadata.\n",
285-
"The default data type is `string`, but setting some of your metadata as `number` can really help apply filters on your assets later on.\n",
287+
"As an optional step, you can define properties for each of your metadata keys.\n",
288+
"These properties allow you to control:\n",
286289
"\n",
287-
"Note that we don't need to set data types for `imageUrl`, `text`, and `url`."
290+
"- The data type (`string` or `number`)\n",
291+
"- Whether the metadata is filterable in project queue\n",
292+
"- Visibility of each metadata to labelers and reviewers\n"
288293
]
289294
},
290295
{
291296
"cell_type": "code",
292297
"execution_count": null,
293298
"metadata": {},
294-
"outputs": [
295-
{
296-
"data": {
297-
"text/plain": [
298-
"{'id': 'cllamrwgl00670j393poh2t4j',\n",
299-
" 'metadataTypes': {'sensitiveData': 'string',\n",
300-
" 'customConsensus': 'number',\n",
301-
" 'uploadedFromCloud': 'string',\n",
302-
" 'modelLabelErrorScore': 'number'}}"
303-
]
304-
},
305-
"execution_count": null,
306-
"metadata": {},
307-
"output_type": "execute_result"
308-
}
309-
],
299+
"outputs": [],
310300
"source": [
311301
"kili.update_properties_in_project(\n",
312302
" project_id=project_id,\n",
313-
" metadata_types={\n",
314-
" \"customConsensus\": \"number\",\n",
315-
" \"sensitiveData\": \"string\",\n",
316-
" \"uploadedFromCloud\": \"string\",\n",
317-
" \"modelLabelErrorScore\": \"number\",\n",
303+
" metadata_properties={\n",
304+
" \"customConsensus\": {\n",
305+
" \"type\": \"number\",\n",
306+
" \"filterable\": True,\n",
307+
" \"visibleByLabeler\": True,\n",
308+
" \"visibleByReviewer\": True,\n",
309+
" },\n",
310+
" \"sensitiveData\": {\n",
311+
" \"type\": \"string\",\n",
312+
" \"filterable\": True,\n",
313+
" \"visibleByLabeler\": False, # Hide this from labelers\n",
314+
" \"visibleByReviewer\": True,\n",
315+
" },\n",
316+
" \"uploadedFromCloud\": {\n",
317+
" \"type\": \"string\",\n",
318+
" \"filterable\": True,\n",
319+
" \"visibleByLabeler\": True,\n",
320+
" \"visibleByReviewer\": True,\n",
321+
" },\n",
322+
" \"modelLabelErrorScore\": {\n",
323+
" \"type\": \"number\",\n",
324+
" \"filterable\": True,\n",
325+
" \"visibleByLabeler\": True,\n",
326+
" \"visibleByReviewer\": True,\n",
327+
" },\n",
318328
" },\n",
319329
")"
320330
]
@@ -324,6 +334,17 @@
324334
"cell_type": "markdown",
325335
"metadata": {},
326336
"source": [
337+
"> **Note**: The previous `metadata_types` parameter is deprecated; use `metadata_properties` instead. `metadata_types` still works, but it is converted internally to `metadata_properties` with default visibility and filterability settings.\n",
338+
"\n",
339+
"If you don't specify all properties, default values will be used:\n",
340+
"\n",
341+
"```\n",
342+
"filterable: true\n",
343+
"type: 'string'\n",
344+
"visibleByLabeler: true\n",
345+
"visibleByReviewer: true\n",
346+
"```\n",
347+
"\n",
327348
"Now we can add metadata to our assets:"
328349
]
329350
},
@@ -355,19 +376,17 @@
355376
" \"sensitiveData\": \"yes\",\n",
356377
" \"uploadedFromCloud\": \"no\",\n",
357378
" \"modelLabelErrorScore\": 50,\n",
358-
" # Add metadata that will be visible to labelers in the labeling interface:\n",
359-
" \"imageUrl\": \"www.example.com/image_1.png\",\n",
360-
" \"text\": \"some text for asset 1\",\n",
379+
" \"imageUrl\": \"https://placehold.co/600x400/EEE/31343C\",\n",
380+
" \"text\": \"Some text for asset 1\",\n",
361381
" \"url\": \"www.example-website.com\",\n",
362382
" },\n",
363383
" {\n",
364384
" \"customConsensus\": 40,\n",
365385
" \"sensitiveData\": \"no\",\n",
366386
" \"uploadedFromCloud\": \"yes\",\n",
367387
" \"modelLabelErrorScore\": 30,\n",
368-
" # Add metadata that will be visible to labelers in the labeling interface:\n",
369-
" \"imageUrl\": \"www.example.com/image_2.png\",\n",
370-
" \"text\": \"some text for asset 2\",\n",
388+
" \"imageUrl\": \"https://placehold.co/600x400/EEE/31343C\",\n",
389+
" \"text\": \"Some text for asset 2\",\n",
371390
" \"url\": \"www.example-website.com\",\n",
372391
" },\n",
373392
" ],\n",
@@ -378,7 +397,10 @@
378397
"cell_type": "markdown",
379398
"metadata": {},
380399
"source": [
381-
"In the labeling interface, we can see that the assets have some metadata:"
400+
"> **Note**: Alternatively, you can use the `kili.set_metadata` or `kili.add_metadata` methods.\n",
401+
"\n",
402+
"\n",
403+
"In the labeling interface, we can see that the assets have some metadata (note that `sensitiveData` will be hidden from labelers based on our settings)."
382404
]
383405
},
384406
{
@@ -390,7 +412,7 @@
390412
"cell_type": "markdown",
391413
"metadata": {},
392414
"source": [
393-
"![image.png](attachment:0bea7811-9a67-461c-b716-319de1343ac8.png)"
415+
"![image.png](./img/json_metadata.png)"
394416
]
395417
},
396418
{
@@ -441,7 +463,7 @@
441463
"cell_type": "markdown",
442464
"metadata": {},
443465
"source": [
444-
"We've successfully set up a Kili project, imported assets to it, and finally added some metadata to our assets. Well done!"
466+
"We've successfully set up a Kili project, imported assets to it, and finally added some metadata to our assets with advanced property settings. Well done!"
445467
]
446468
}
447469
],
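
The notebook text in this diff says that unspecified properties fall back to defaults (`filterable: true`, `type: 'string'`, `visibleByLabeler: true`, `visibleByReviewer: true`). The diff does not show where that fallback is applied, so the following is only a minimal sketch of the resulting values. `with_defaults` and `DEFAULT_PROPERTIES` are hypothetical names for illustration, not part of the SDK.

```python
from typing import Any, Dict

# Default values as listed in the notebook text of this diff.
DEFAULT_PROPERTIES: Dict[str, Any] = {
    "filterable": True,
    "type": "string",
    "visibleByLabeler": True,
    "visibleByReviewer": True,
}


def with_defaults(metadata_properties: Dict[str, Dict[str, Any]]) -> Dict[str, Dict[str, Any]]:
    """Hypothetical helper: fill in every property a key leaves unspecified."""
    return {
        key: {**DEFAULT_PROPERTIES, **properties}
        for key, properties in metadata_properties.items()
    }


# Each key only states what differs from the defaults.
partial = {
    "customConsensus": {"type": "number"},
    "sensitiveData": {"visibleByLabeler": False},
}
resolved = with_defaults(partial)
print(resolved)
```

Explicit values win over defaults because they are unpacked last, so `sensitiveData` stays hidden from labelers while still being a filterable `string`.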

src/kili/adapters/kili_api_gateway/project/mappers.py

Lines changed: 16 additions & 2 deletions
@@ -29,7 +29,7 @@ def project_where_mapper(filters: ProjectFilters) -> Dict:
2929

3030
def project_data_mapper(data: ProjectDataKiliAPIGatewayInput) -> Dict:
3131
"""Build the GraphQL ProjectData variable to be sent in an operation."""
32-
return {
32+
result = {
3333
"archived": data.archived,
3434
"author": data.author,
3535
"complianceTags": data.compliance_tags,
@@ -42,7 +42,6 @@ def project_data_mapper(data: ProjectDataKiliAPIGatewayInput) -> Dict:
4242
"inputType": data.input_type,
4343
"instructions": data.instructions,
4444
"jsonInterface": data.json_interface,
45-
"metadataTypes": data.metadata_types,
4645
"minConsensusSize": data.min_consensus_size,
4746
"numberOfAssets": data.number_of_assets,
4847
"rules": data.rules,
@@ -55,3 +54,18 @@ def project_data_mapper(data: ProjectDataKiliAPIGatewayInput) -> Dict:
5554
"title": data.title,
5655
"useHoneyPot": data.use_honeypot,
5756
}
57+
58+
if data.metadata_properties is not None:
59+
result["metadataProperties"] = data.metadata_properties
60+
elif data.metadata_types is not None:
61+
metadata_properties = {}
62+
for key, type_value in data.metadata_types.items():
63+
metadata_properties[key] = {
64+
"filterable": True,
65+
"type": type_value,
66+
"visibleByLabeler": True,
67+
"visibleByReviewer": True,
68+
}
69+
result["metadataProperties"] = metadata_properties
70+
71+
return result
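
The precedence rule added to `project_data_mapper` can be read in isolation: `metadata_properties` wins when present, otherwise a legacy `metadata_types` mapping is expanded with default flags. The sketch below restates that logic as a standalone function. `to_metadata_properties` is a hypothetical name, and the plain dict arguments stand in for the fields of `ProjectDataKiliAPIGatewayInput`.

```python
from typing import Any, Dict, Optional

# Flags applied to every key converted from the deprecated metadata_types,
# copied from the fallback branch in the diff above.
LEGACY_DEFAULT_FLAGS: Dict[str, Any] = {
    "filterable": True,
    "visibleByLabeler": True,
    "visibleByReviewer": True,
}


def to_metadata_properties(
    metadata_properties: Optional[Dict[str, Dict[str, Any]]],
    metadata_types: Optional[Dict[str, str]],
) -> Optional[Dict[str, Dict[str, Any]]]:
    """Hypothetical standalone version of the precedence rule in the mapper."""
    if metadata_properties is not None:
        # The new parameter takes precedence; metadata_types is ignored.
        return metadata_properties
    if metadata_types is not None:
        return {
            key: {**LEGACY_DEFAULT_FLAGS, "type": type_value}
            for key, type_value in metadata_types.items()
        }
    # Neither was provided: the mapper adds no "metadataProperties" entry.
    return None


legacy = {"customConsensus": "number", "sensitiveData": "string"}
converted = to_metadata_properties(None, legacy)
print(converted)
```

One consequence worth noting: when both arguments are supplied, the legacy mapping is dropped silently rather than merged.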

src/kili/adapters/kili_api_gateway/project/types.py

Lines changed: 1 addition & 0 deletions
@@ -25,6 +25,7 @@ class ProjectDataKiliAPIGatewayInput:
2525
instructions: Optional[str]
2626
json_interface: Optional[str]
2727
metadata_types: Optional[Dict]
28+
metadata_properties: Optional[Dict]
2829
min_consensus_size: Optional[int]
2930
number_of_assets: Optional[int]
3031
rules: Optional[str]

src/kili/entrypoints/mutations/asset/__init__.py

Lines changed: 114 additions & 4 deletions
@@ -1,5 +1,4 @@
11
"""Asset mutations."""
2-
32
import warnings
43
from typing import Any, Dict, List, Literal, Optional, Union, cast
54

@@ -11,7 +10,7 @@
1110
from kili.adapters.kili_api_gateway.helpers.queries import QueryOptions
1211
from kili.core.helpers import is_empty_list_with_warning
1312
from kili.core.utils.pagination import mutate_from_paginated_call
14-
from kili.domain.asset import AssetFilters
13+
from kili.domain.asset import AssetFilters, AssetId
1514
from kili.domain.project import ProjectId
1615
from kili.entrypoints.base import BaseOperationEntrypointMixin
1716
from kili.entrypoints.mutations.asset.helpers import (
@@ -91,8 +90,8 @@ def append_many_to_dataset(
9190
9291
json_metadata_array: The metadata given to each asset should be stored in a JSON-like dict with keys.
9392
94-
- Add metadata visible on the asset with the following keys: `imageUrl`, `text`, `url`.
95-
Example for one asset: `json_metadata_array = [{'imageUrl': '','text': '','url': ''}]`.
93+
- Add metadata visible on the asset
94+
Example for one asset: `json_metadata_array = [{'imageUrl': '','text': '','url': '','key1': 'value1'}]`.
9695
- For VIDEO projects (and not VIDEO_LEGACY), you can specify a value with key 'processingParameters' to specify the sampling rate (default: 30).
9796
Example for one asset: `json_metadata_array = [{'processingParameters': {'framesPlayedPerSecond': 10}}]`.
9897
- In Image projects with geoTIFF assets, you can specify the epsg, the `minZoom` and `maxZoom` values for the `processingParameters` key.
@@ -409,6 +408,117 @@ def generate_variables(batch: Dict) -> Dict:
409408
formated_results = [self.format_result("data", result, None) for result in results]
410409
return [item for batch_list in formated_results for item in batch_list]
411410

411+
@typechecked
412+
def add_metadata(
413+
self,
414+
json_metadata: List[Dict[str, Union[str, int, float]]],
415+
asset_ids: List[str],
416+
project_id: str,
417+
) -> List[Dict[Literal["id"], str]]:
418+
"""Add metadata to assets without overriding existing metadata.
419+
420+
Args:
421+
json_metadata: List of metadata dictionaries to add to each asset.
422+
Each dictionary contains key/value pairs to be added to the asset's metadata.
423+
asset_ids: The asset IDs to modify.
424+
project_id: The project ID.
425+
426+
Returns:
427+
A list of dictionaries with the asset ids.
428+
429+
Examples:
430+
>>> kili.add_metadata(
431+
json_metadata=[
432+
{"key1": "value1", "key2": "value2"},
433+
{"key3": "value3"}
434+
],
435+
asset_ids=["ckg22d81r0jrg0885unmuswj8", "ckg22d81s0jrh0885pdxfd03n"],
436+
project_id="cm92to3cx012u7l0w6kij9qvx"
437+
)
438+
"""
439+
if is_empty_list_with_warning("add_metadata", "json_metadata", json_metadata):
440+
return []
441+
442+
assets = self.kili_api_gateway.list_assets(
443+
AssetFilters(
444+
project_id=ProjectId(project_id), asset_id_in=cast(List[AssetId], asset_ids)
445+
),
446+
["id", "jsonMetadata"],
447+
QueryOptions(disable_tqdm=True),
448+
)
449+
450+
json_metadatas = []
451+
for i, asset in enumerate(assets):
452+
current_metadata = asset.get("jsonMetadata", {}) if asset.get("jsonMetadata") else {}
453+
new_metadata = json_metadata[i] if i < len(json_metadata) else {}
454+
455+
current_metadata.update(new_metadata)
456+
457+
json_metadatas.append(current_metadata)
458+
459+
return self.update_properties_in_assets(
460+
asset_ids=asset_ids,
461+
json_metadatas=json_metadatas,
462+
)
463+
464+
@typechecked
465+
def set_metadata(
466+
self,
467+
json_metadata: List[Dict[str, Union[str, int, float]]],
468+
asset_ids: List[str],
469+
project_id: str,
470+
) -> List[Dict[Literal["id"], str]]:
471+
"""Set metadata on assets, replacing any existing metadata.
472+
473+
Args:
474+
json_metadata: List of metadata dictionaries to set on each asset.
475+
Each dictionary contains key/value pairs to be set as the asset's metadata.
476+
asset_ids: The asset IDs to modify.
477+
project_id: The project ID.
478+
479+
Returns:
480+
A list of dictionaries with the asset ids.
481+
482+
Examples:
483+
>>> kili.set_metadata(
484+
json_metadata=[
485+
{"key1": "value1", "key2": "value2"},
486+
{"key3": "value3"}
487+
],
488+
asset_ids=["ckg22d81r0jrg0885unmuswj8", "ckg22d81s0jrh0885pdxfd03n"],
489+
project_id="cm92to3cx012u7l0w6kij9qvx"
490+
)
491+
"""
492+
if is_empty_list_with_warning("set_metadata", "json_metadata", json_metadata):
493+
return []
494+
495+
assets = self.kili_api_gateway.list_assets(
496+
AssetFilters(
497+
project_id=ProjectId(project_id), asset_id_in=cast(List[AssetId], asset_ids)
498+
),
499+
["id", "jsonMetadata"],
500+
QueryOptions(disable_tqdm=True),
501+
)
502+
503+
json_metadatas = []
504+
for i, asset in enumerate(assets):
505+
current_metadata = asset.get("jsonMetadata", {}) if asset.get("jsonMetadata") else {}
506+
new_metadata = json_metadata[i] if i < len(json_metadata) else {}
507+
508+
special_keys = ["text", "imageUrl", "url", "processingParameters"]
509+
preserved_metadata = {
510+
k: current_metadata[k] for k in special_keys if k in current_metadata
511+
}
512+
513+
preserved_metadata.update(new_metadata)
514+
515+
json_metadatas.append(preserved_metadata)
516+
517+
return self.update_properties_in_assets(
518+
asset_ids=asset_ids,
519+
json_metadatas=json_metadatas,
520+
)
521+
412522
@typechecked
413523
def change_asset_external_ids(
414524
self,
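
The difference between `add_metadata` and `set_metadata` comes down to how the new dictionary is merged with the existing `jsonMetadata`. The sketch below isolates those two merge rules as pure functions. It also pairs metadata with assets by ID rather than by list position, since the diff does not state whether `list_assets` returns assets in the order of `asset_ids`. `merge_add`, `merge_set`, and `merge_by_asset_id` are hypothetical names for illustration, not SDK functions.

```python
from typing import Any, Callable, Dict, List

Metadata = Dict[str, Any]

# Keys that set_metadata preserves, as listed in the diff above.
SPECIAL_KEYS = ("text", "imageUrl", "url", "processingParameters")


def merge_add(current: Metadata, new: Metadata) -> Metadata:
    """add_metadata rule: keep existing keys, overwrite only on collisions."""
    return {**current, **new}


def merge_set(current: Metadata, new: Metadata) -> Metadata:
    """set_metadata rule: drop existing keys except the special ones."""
    preserved = {key: current[key] for key in SPECIAL_KEYS if key in current}
    return {**preserved, **new}


def merge_by_asset_id(
    assets: List[Dict[str, Any]],
    asset_ids: List[str],
    json_metadata: List[Metadata],
    merge: Callable[[Metadata, Metadata], Metadata],
) -> Dict[str, Metadata]:
    """Hypothetical helper: look up each asset's new metadata by its ID."""
    new_by_id = dict(zip(asset_ids, json_metadata))
    return {
        asset["id"]: merge(asset.get("jsonMetadata") or {}, new_by_id.get(asset["id"], {}))
        for asset in assets
    }


# Assets deliberately listed in a different order than asset_ids.
assets = [
    {"id": "asset-b", "jsonMetadata": {"text": "B", "score": 1, "batch": "old"}},
    {"id": "asset-a", "jsonMetadata": None},
]
ids = ["asset-a", "asset-b"]
updates = [{"key1": "value1"}, {"score": 2}]

added = merge_by_asset_id(assets, ids, updates, merge_add)
replaced = merge_by_asset_id(assets, ids, updates, merge_set)
print(added["asset-b"])
print(replaced["asset-b"])
```

With the add rule, `asset-b` keeps its `batch` key; with the set rule, only `text` survives alongside the new `score`.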

src/kili/entrypoints/mutations/project/queries.py

Lines changed: 2 additions & 2 deletions
@@ -24,7 +24,7 @@
2424
$instructions: String
2525
$inputType: InputType
2626
$jsonInterface: String
27-
$metadataTypes: JSON
27+
$metadataProperties: JSON
2828
$minConsensusSize: Int
2929
$numberOfAssets: Int
3030
$numberOfSkippedAssets: Int
@@ -50,7 +50,7 @@
5050
instructions: $instructions
5151
inputType: $inputType
5252
jsonInterface: $jsonInterface
53-
metadataTypes: $metadataTypes
53+
metadataProperties: $metadataProperties
5454
minConsensusSize: $minConsensusSize
5555
numberOfAssets: $numberOfAssets
5656
numberOfSkippedAssets: $numberOfSkippedAssets
