Open
Commits
61 commits
c1b4ab5
Improve automated tests
RohanBhattaraiNP Feb 7, 2025
9a80ce5
Add id return option to caltechdata_edit
tmorrell Feb 8, 2025
7c7e25f
Update test data files and test workflow (#60)
RohanBhattaraiNP Mar 5, 2025
a447acd
Use real data for tests
tmorrell Mar 5, 2025
3499bbf
Run CI on all changes
tmorrell Mar 5, 2025
92ba34f
Use real data for tests
tmorrell Mar 5, 2025
00b7bd1
Clean up tests and make parent optional
tmorrell Mar 8, 2025
8348a9a
Get full test suite working
tmorrell Mar 13, 2025
4f619fd
Bump for new release
tmorrell Mar 13, 2025
428a5b1
Sync setup.cfg with codemeta.json changes
github-actions[bot] Mar 13, 2025
40a16cc
Keep all codemeta actions in one run
tmorrell Mar 13, 2025
17a7f65
Switch out example file
tmorrell Mar 13, 2025
69d1e62
Add updated CITATION.cff and setup.cfg from codemeta.json file
tmorrell Mar 13, 2025
28b733b
Add DOI to codemeta.json file
tmorrell Mar 13, 2025
efecc4c
Add updated CITATION.cff from codemeta.json file
tmorrell Mar 13, 2025
7ddd6be
CLI Documentation for Users (#66)
Kshemaahna Apr 7, 2025
2531f7b
add alt text
tmorrell Apr 7, 2025
50d2e9d
Update README.md
tmorrell Apr 7, 2025
89cb1bf
Reogranize docs
tmorrell Apr 7, 2025
6751dfa
Update cli.py to expand supported file options (#70)
Kshemaahna Jul 22, 2025
cc90ed3
Fix black
tmorrell Jul 22, 2025
f9a55d1
Bump version for release
tmorrell Jul 22, 2025
7150101
Add updated CITATION.cff and setup.cfg from codemeta.json file
tmorrell Jul 22, 2025
06925e5
Update test prompt
tmorrell Jul 22, 2025
b64af55
Update test prompt
tmorrell Jul 22, 2025
2ed2816
Add DOI to codemeta.json file
tmorrell Jul 22, 2025
5ec9e89
Add updated CITATION.cff from codemeta.json file
tmorrell Jul 22, 2025
5fe50df
Add function to deny community requests
tmorrell Jul 30, 2025
dd068d9
Formatting
tmorrell Jul 30, 2025
6d6037a
Update for ROR v2
tmorrell Aug 6, 2025
d432976
Bump for release
tmorrell Aug 6, 2025
f2c6eed
Add updated CITATION.cff and setup.cfg from codemeta.json file
tmorrell Aug 6, 2025
b375c07
Formatting
tmorrell Aug 6, 2025
3dbbf1a
Local authors development
tmorrell Aug 6, 2025
f67193a
Add DOI to codemeta.json file
tmorrell Aug 6, 2025
91b5880
Add updated CITATION.cff from codemeta.json file
tmorrell Aug 6, 2025
c6ec803
Switch to new OSN location
tmorrell Aug 22, 2025
efb9ecd
Improved description handling
tmorrell Aug 22, 2025
f28ce01
Bump for release
tmorrell Aug 22, 2025
7b6ddef
Add updated CITATION.cff and setup.cfg from codemeta.json file
tmorrell Aug 22, 2025
a849676
Update for ROR v2
tmorrell Aug 22, 2025
a69be74
Add DOI to codemeta.json file
tmorrell Aug 22, 2025
44b8790
Add updated CITATION.cff from codemeta.json file
tmorrell Aug 22, 2025
165caed
Bump pypa/gh-action-pypi-publish in /.github/workflows
dependabot[bot] Sep 4, 2025
5c70ddc
Fix for records without description and add some update scripts
tmorrell Sep 9, 2025
41ee143
Version bump
tmorrell Sep 9, 2025
749d4d5
Add updated CITATION.cff and setup.cfg from codemeta.json file
tmorrell Sep 9, 2025
43ff2af
Formatting
tmorrell Sep 9, 2025
26b88f9
Add DOI to codemeta.json file
tmorrell Sep 9, 2025
97b4165
Add updated CITATION.cff from codemeta.json file
tmorrell Sep 9, 2025
0ba257f
Add local option
tmorrell Sep 17, 2025
16a23a2
Add updated CITATION.cff and setup.cfg from codemeta.json file
tmorrell Sep 17, 2025
676669c
Add DOI to codemeta.json file
tmorrell Sep 17, 2025
49aa0b6
Add updated CITATION.cff from codemeta.json file
tmorrell Sep 17, 2025
6dbe4aa
Authors add files script
tmorrell Oct 15, 2025
d6890a4
Make script more flexible
tmorrell Oct 17, 2025
4dbfa42
Bump version and add ROR
tmorrell Oct 18, 2025
434df0d
Add updated CITATION.cff and setup.cfg from codemeta.json file
tmorrell Oct 18, 2025
d19ac0e
Add handling of auto-accept comunities
tmorrell Oct 18, 2025
68f7768
Add DOI to codemeta.json file
tmorrell Oct 18, 2025
f3d5497
Add updated CITATION.cff from codemeta.json file
tmorrell Oct 18, 2025
30 changes: 10 additions & 20 deletions .github/workflows/bot.yaml
@@ -1,20 +1,6 @@
name: Bot validation

on:
push:
paths:
- 'caltechdata_api/cli.py'
- 'caltechdata_api/customize_schema.py'
- 'caltechdata_api/caltechdata_write.py'
- 'caltechdata_api/caltechdata_edit.py'
- 'README.md'
pull_request:
paths:
- 'caltechdata_api/cli.py'
- 'caltechdata_api/customize_schema.py'
- 'caltechdata_api/caltechdata_write.py'
- 'caltechdata_api/caltechdata_edit.py'
- 'README.md'
on: [push, pull_request]

jobs:
validate-metadata:
@@ -36,13 +22,17 @@ jobs:
pip install pytest requests s3fs cryptography
pip install .

- name: Run CaltechDATA Metadata Validation
- name: Run against CaltechData Test system
env:
CALTECHDATA_TOKEN: ${{ secrets.CALTECHDATA_TOKEN }}
run: |
python tests/bot_yaml.py
- name: Run Unit Tests
RDMTOK: ${{ secrets.CALTECHDATA_TOKEN }}
run: |
cd tests
pytest test_unit.py
pytest test_rdm.py
- name: Run Metadata Validation Test and RDM
env:
RDMTOK: ${{ secrets.CALTECHDATA_TOKEN }}
run: |
cd tests
python bot_yaml.py

23 changes: 21 additions & 2 deletions .github/workflows/codemeta2cff.yml
@@ -19,8 +19,27 @@ jobs:
uses: actions/checkout@v4
- name: Convert CFF
uses: caltechlibrary/codemeta2cff@main
- name: Install jq for JSON parsing
run: sudo apt-get install -y jq
- name: Parse and update setup.cfg
run: |
# Extract values from codemeta.json
NAME=$(jq -r '.name' codemeta.json)
VERSION=$(jq -r '.version' codemeta.json)
AUTHORS=$(jq -r '[.author[] | .givenName + " " + .familyName] | join(", ")' codemeta.json)
AUTHOR_EMAILS=$(jq -r '[.author[] | .email // empty] | join(", ")' codemeta.json)
DESCRIPTION=$(jq -r '.description' codemeta.json)
URL=$(jq -r '.codeRepository // .url' codemeta.json)

# Update setup.cfg fields
sed -i "s/^name = .*/name = $NAME/" setup.cfg
sed -i "s/^version = .*/version = $VERSION/" setup.cfg
sed -i "s/^author = .*/author = $AUTHORS/" setup.cfg
sed -i "s/^author_email = .*/author_email = $AUTHOR_EMAILS/" setup.cfg
sed -i "s/^description = .*/description = $DESCRIPTION/" setup.cfg
sed -i "s|^url = .*|url = $URL|" setup.cfg
- name: Commit CFF
uses: EndBug/add-and-commit@v9
with:
message: 'Add updated CITATION.cff from codemeta.json file'
add: 'CITATION.cff'
message: 'Add updated CITATION.cff and setup.cfg from codemeta.json file'
add: '["setup.cfg", "CITATION.cff"]'
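The jq and sed step above copies six fields from codemeta.json into setup.cfg. A minimal Python sketch of the same field mapping follows; the helper name `extract_setup_fields` and the sample record are hypothetical and not part of this workflow. One caveat with the sed form as written: a value containing `/` or `&` (for example in the description) is interpreted by sed rather than copied literally, which a dictionary-based approach avoids.

```python
import json


def extract_setup_fields(codemeta: dict) -> dict:
    """Mirror the jq expressions from the workflow step (hypothetical helper)."""
    authors = codemeta.get("author", [])
    # jq: [.author[] | .givenName + " " + .familyName] | join(", ")
    names = ", ".join(
        f'{a.get("givenName", "")} {a.get("familyName", "")}' for a in authors
    )
    # jq: [.author[] | .email // empty] | join(", ")  (authors without email are skipped)
    emails = ", ".join(a["email"] for a in authors if a.get("email"))
    return {
        "name": codemeta.get("name"),
        "version": codemeta.get("version"),
        "author": names,
        "author_email": emails,
        "description": codemeta.get("description"),
        # jq: .codeRepository // .url
        "url": codemeta.get("codeRepository") or codemeta.get("url"),
    }


# Made-up record for illustration only; not the project's real codemeta.json.
sample = json.loads(
    """
    {
      "name": "example_package",
      "version": "0.0.0",
      "description": "Example description",
      "url": "https://example.org/repo",
      "author": [
        {"givenName": "Example", "familyName": "Author", "email": "author@example.org"},
        {"givenName": "Second", "familyName": "Person"}
      ]
    }
    """
)
fields = extract_setup_fields(sample)
print(fields)
```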
2 changes: 1 addition & 1 deletion .github/workflows/pypi-publish.yaml
@@ -22,7 +22,7 @@ jobs:
run: |
python setup.py sdist bdist_wheel
- name: Publish
uses: pypa/gh-action-pypi-publish@v1.3.1
uses: pypa/gh-action-pypi-publish@v1.13.0
with:
user: __token__
password: ${{ secrets.pypi_token }}
50 changes: 0 additions & 50 deletions .github/workflows/update_setupcfg.yaml

This file was deleted.

9 changes: 6 additions & 3 deletions CITATION.cff
@@ -14,15 +14,18 @@ authors:
- family-names: Abakah
given-names: Alexander A
orcid: https://orcid.org/0009-0003-5640-6691
- family-names: Nagi
given-names: Kshemaahna
orcid: https://orcid.org/0009-0002-8113-3763
abstract: Python wrapper for CaltechDATA API.
repository-code: "https://github.com/caltechlibrary/caltechdata_api"
type: software
doi: 10.22002/bv2pv-2b295
version: 1.9.1
doi: 10.22002/2g4c7-zva46
version: 1.10.6
license-url: "https://data.caltech.edu/license"
keywords:
- GitHub
- metadata
- software
- InvenioRDM
date-released: 2025-02-06
date-released: 2025-10-18
7 changes: 5 additions & 2 deletions README.md
@@ -29,7 +29,7 @@ pip install caltechdata_api

There are some example python scripts in the GitHub repository.

###Create a record:
### Create a record:

```shell
python write.py example.json -fnames logo.gif
@@ -39,7 +39,7 @@ python write.py example.json -fnames logo.gif
the end of a url to visit the record (e.g.
https://data.caltechlibrary.dev/records/pbkn6-m9y63)

###Edit a record
### Edit a record
(Make changes to the example.json file to see a change)
```
python edit.py example.json -id pbkn6-m9y63
@@ -77,3 +77,6 @@ This returns the custom DOI of the record if it is successful.
Only test your application on the test repository (`data.caltechlibrary.dev`). Testing the API on the public
repository will generate junk records that are annoying to delete.

## Using the Command Line Interface

If you would like to interact with the CaltechDATA API using the Command Line Interface (CLI), please [see the detailed documentation](https://caltechlibrary.github.io/caltechdata_api/caltechdata_api/cli-documentation-for-users).
50 changes: 50 additions & 0 deletions add_files_authors.py
@@ -0,0 +1,50 @@
import requests, os, argparse
from caltechdata_api import write_files_rdm

parser = argparse.ArgumentParser(
    description="Add files to an existing CaltechAUTHORS record."
)
parser.add_argument(
    "idv",
    type=str,
    help="The CaltechAUTHORS record idv to edit.",
)
parser.add_argument(
    "files",
    type=str,
    nargs="+",
    help="The files to upload to the record.",
)
args = parser.parse_args()
idv = args.idv
files = args.files
token = os.environ["RDMTOK"]
url = "https://authors.library.caltech.edu"

headers = {
    "Authorization": "Bearer %s" % token,
    "Content-type": "application/json",
}
f_headers = {
    "Authorization": "Bearer %s" % token,
    "Content-type": "application/octet-stream",
}

existing = requests.get(
    url + "/api/records/" + idv + "/draft",
    headers=headers,
)
if existing.status_code != 200:
    raise Exception(f"Record {idv} does not exist, cannot edit")
data = existing.json()
data["files"] = {"enabled": True}
# Update metadata
result = requests.put(
    url + "/api/records/" + idv + "/draft",
    headers=headers,
    json=data,
)
if result.status_code != 200:
    raise Exception(result.text)
file_link = result.json()["links"]["files"]
write_files_rdm(files, file_link, headers, f_headers)
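The script takes one positional record id followed by one or more file paths, and reads the token from the `RDMTOK` environment variable. As a small sketch of how those arguments are parsed, the snippet below rebuilds the same parser and feeds it a sample argument list instead of the real command line. The wrapper name `build_parser`, the record id, and the file names are hypothetical and chosen only for illustration; no network request is made.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Rebuild the parser used by add_files_authors.py (hypothetical wrapper)."""
    parser = argparse.ArgumentParser(
        description="Add files to an existing CaltechAUTHORS record."
    )
    parser.add_argument(
        "idv", type=str, help="The CaltechAUTHORS record idv to edit."
    )
    # nargs="+" requires at least one file and collects them into a list
    parser.add_argument(
        "files", type=str, nargs="+", help="The files to upload to the record."
    )
    return parser


# Made-up record id and file names, passed explicitly rather than via sys.argv.
args = build_parser().parse_args(["abcde-12345", "paper.pdf", "supplement.zip"])
print(args.idv, args.files)
```

Because `files` uses `nargs="+"`, calling the script with only a record id exits with a usage error rather than uploading nothing.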
1 change: 1 addition & 0 deletions caltechdata_api/__init__.py
@@ -8,6 +8,7 @@
caltechdata_edit,
caltechdata_unembargo,
caltechdata_accept,
caltechdata_reject,
)
from .customize_schema import customize_schema, validate_metadata
from .get_metadata import get_metadata
69 changes: 64 additions & 5 deletions caltechdata_api/caltechdata_edit.py
@@ -49,6 +49,48 @@ def caltechdata_accept(ids, token=None, production=False):
raise Exception(result.text)


def caltechdata_reject(ids, token=None, production=False, authors=False):
    # Reject a record from a community

    # If no token is provided, get from RDMTOK environment variable
    if not token:
        token = os.environ["RDMTOK"]

    if production == True:
        if authors:
            url = "https://authors.library.caltech.edu"
        else:
            url = "https://data.caltech.edu"
    else:
        if authors:
            url = "https://authors.caltechlibrary.dev"
        else:
            url = "https://data.caltechlibrary.dev"

    headers = {
        "Authorization": "Bearer %s" % token,
        "Content-type": "application/json",
    }

    for idv in ids:
        review_url = url + "/api/records/" + idv + "/draft/review"
        result = requests.get(review_url, headers=headers)
        print(review_url)
        if result.status_code != 200:
            raise Exception(result.text)
        # The review response exposes the link used to decline the request
        decline_link = result.json()["links"]["actions"]["decline"]
        data = {
            "payload": {
                "content": "This record was declined automatically with the CaltechDATA API",
                "format": "html",
            }
        }
        result = requests.post(decline_link, json=data, headers=headers)
        if result.status_code != 200:
            raise Exception(result.text)
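Both `caltechdata_reject` and `caltechdata_edit` pick a base URL from the same flags. The following is a sketch of how that choice could be expressed once, assuming the precedence shown in `caltechdata_edit` (production first, then local, then the test system). The helper name `select_base_url` is hypothetical and not part of this PR. It returns URLs without a trailing slash, as `caltechdata_reject` does, whereas `caltechdata_edit` uses a trailing slash, and `caltechdata_reject` itself has no `local` option.

```python
def select_base_url(production=False, authors=False, local=False):
    """Choose the repository base URL from the flags (hypothetical helper)."""
    if production:
        if authors:
            return "https://authors.library.caltech.edu"
        return "https://data.caltech.edu"
    if local:
        # Local development instance, same address for both repositories
        return "https://127.0.0.1:5000"
    if authors:
        return "https://authors.caltechlibrary.dev"
    return "https://data.caltechlibrary.dev"


for flags in (
    {},
    {"authors": True},
    {"production": True},
    {"production": True, "authors": True},
    {"local": True},
):
    print(flags, select_base_url(**flags))
```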


def caltechdata_edit(
idv,
metadata={},
@@ -66,6 +108,8 @@ def caltechdata_edit(
default_preview=None,
authors=False,
keepfiles=False,
return_id=False,
local=False,
):
# Make a copy of the metadata to make sure our local changes don't leak
metadata = copy.deepcopy(metadata)
@@ -81,14 +125,22 @@ def caltechdata_edit(
# Check if file links were provided in the metadata
descriptions = []
ex_file_links = []
ex_file_descriptions = []
if "descriptions" in metadata:
for d in metadata["descriptions"]:
if d["description"].startswith("Files available via S3"):
file_text = d["description"]
file_list = file_text.split('href="')
# Check if we have file_descriptions
split_comma = file_list[0].split(", ")
if len(split_comma) == 3:
ex_file_descriptions.append(split_comma[1])
# Loop over links in description, skip header text
for file in file_list[1:]:
ex_file_links.append(file.split('"\n')[0])
split_comma = file.split(", ")
if len(split_comma) == 3:
ex_file_descriptions.append(split_comma[1])
else:
descriptions.append(d)
# We remove file link descriptions, and re-add below
@@ -102,17 +154,21 @@ def caltechdata_edit(
# Otherwise we add file links found in the metadata file
elif ex_file_links:
metadata = add_file_links(
metadata, ex_file_links, file_descriptions, s3_link=s3_link
metadata, ex_file_links, ex_file_descriptions, s3_link=s3_link
)

if authors == False:
if production == True:
url = "https://data.caltech.edu/"
elif local == True:
url = "https://127.0.0.1:5000/"
else:
url = "https://data.caltechlibrary.dev/"
else:
if production == True:
url = "https://authors.library.caltech.edu/"
elif local == True:
url = "https://127.0.0.1:5000/"
else:
url = "https://authors.caltechlibrary.dev/"

@@ -299,10 +355,13 @@ def caltechdata_edit(
result = requests.post(publish_link, headers=headers)
if result.status_code != 202:
raise Exception(result.text)
pids = result.json()["pids"]
if "doi" in pids:
return pids["doi"]["identifier"]
if return_id:
return result.json()["id"]
else:
return pids["oai"]["identifier"]
pids = result.json()["pids"]
if "doi" in pids:
return pids["doi"]["identifier"]
else:
return pids["oai"]["identifier"]
else:
return idv
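With the new `return_id` option, the value returned after publishing depends on both the flag and which persistent identifiers the record has. A minimal sketch of that selection, pulled out of the request flow so it can be read on its own, is given below. The function name `pick_identifier` and the sample responses are hypothetical; only the key names (`id`, `pids`, `doi`, `oai`, `identifier`) come from the diff.

```python
def pick_identifier(published: dict, return_id: bool = False) -> str:
    """Select the value to return after publishing (hypothetical helper)."""
    if return_id:
        # Caller asked for the record id rather than a persistent identifier
        return published["id"]
    pids = published["pids"]
    if "doi" in pids:
        return pids["doi"]["identifier"]
    # Fall back to the OAI identifier when the record has no DOI
    return pids["oai"]["identifier"]


# Made-up publish responses for illustration only.
with_doi = {
    "id": "abcde-12345",
    "pids": {
        "doi": {"identifier": "10.1234/example"},
        "oai": {"identifier": "oai:example.org:abcde-12345"},
    },
}
without_doi = {
    "id": "fghij-67890",
    "pids": {"oai": {"identifier": "oai:example.org:fghij-67890"}},
}

print(pick_identifier(with_doi))
print(pick_identifier(without_doi))
print(pick_identifier(with_doi, return_id=True))
```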