Add ingest.sh script. #164
Merged
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,34 @@ | ||
| # Data management | ||
|
|
||
| eoAPI-k8s provides a basic data ingestion process that consists of manual operations on the components of the stack. | ||
|
|
||
| # Load data | ||
|
|
||
| You will need STAC records for the collection and items you wish to load (e.g., `collections.json` and `items.json`). | ||
| [This repo](https://github.com/vincentsarago/MAXAR_opendata_to_pgstac) contains a few scripts that may help you generate sample input data. | ||
|
|
||
| ## Preshipped bash script | ||
|
|
||
| Execute `make ingest` to load data into the eoAPI service - it expects `collections.json` and `items.json` in the current directory. | ||
|
|
||
| ## Manual steps | ||
|
|
||
| In order to add raster data to eoAPI you can load STAC collections and items into the PostgreSQL database using pgSTAC and the tool `pypgstac`. | ||
|
|
||
| First, ensure your Kubernetes cluster is running and `kubectl` is configured to access and modify it. | ||
|
|
||
| In a second step, you'll have to upload the data into the pod running the raster eoAPI service. You can use the following commands to copy the data: | ||
|
|
||
| ```bash | ||
| kubectl cp collections.json "$NAMESPACE/$EOAPI_POD_RASTER":/tmp/collections.json | ||
| kubectl cp items.json "$NAMESPACE/$EOAPI_POD_RASTER":/tmp/items.json | ||
| ``` | ||
| Then, open a shell in the pod or server running the raster eoAPI service and run the following commands to load the data: | ||
|
|
||
| ```bash | ||
| #!/bin/bash | ||
| apt update -y && apt install python3 python3-pip -y && pip install 'pypgstac[psycopg]' | ||
| pypgstac pgready --dsn "$PGADMIN_URI" | ||
| pypgstac load collections /tmp/collections.json --dsn "$PGADMIN_URI" --method insert_ignore | ||
| pypgstac load items /tmp/items.json --dsn "$PGADMIN_URI" --method insert_ignore | ||
| ``` |
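The manual steps above use `$NAMESPACE` and `$EOAPI_POD_RASTER` without showing how to set them. A minimal sketch follows, assuming the raster pod carries the label `app=raster-eoapi` (the label the ingest script in this PR queries). The function name `find_raster_pod` is hypothetical and not part of the PR.

```bash
#!/bin/bash
# Hypothetical helper (not part of the PR): print the name of the first pod
# labelled app=raster-eoapi in the given namespace, or return non-zero if
# no such pod is found.
find_raster_pod() {
  local ns="$1" pod
  pod=$(kubectl get pods -n "$ns" -l app=raster-eoapi \
    -o jsonpath='{.items[0].metadata.name}' 2>/dev/null)
  [ -n "$pod" ] || return 1
  printf '%s\n' "$pod"
}

# Example use (the namespace value is an assumption; adjust to your cluster):
#   NAMESPACE="eoapi"
#   EOAPI_POD_RASTER=$(find_raster_pod "$NAMESPACE") || echo "no raster pod found" >&2
```

With both variables set, the `kubectl cp` commands above can be run as written.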
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,79 @@ | ||
| #!/bin/bash | ||
|
|
||
| # Default files | ||
| DEFAULT_COLLECTIONS_FILE="./collections.json" | ||
| DEFAULT_ITEMS_FILE="./items.json" | ||
|
|
||
| # Check for provided parameters or use defaults | ||
| if [ "$#" -eq 2 ]; then | ||
| EOAPI_COLLECTIONS_FILE="$1" | ||
| EOAPI_ITEMS_FILE="$2" | ||
| else | ||
| EOAPI_COLLECTIONS_FILE="$DEFAULT_COLLECTIONS_FILE" | ||
| EOAPI_ITEMS_FILE="$DEFAULT_ITEMS_FILE" | ||
| echo "No specific files provided. Using defaults:" | ||
| echo " Collections file: $EOAPI_COLLECTIONS_FILE" | ||
| echo " Items file: $EOAPI_ITEMS_FILE" | ||
| fi | ||
|
|
||
| # Define namespaces | ||
| NAMESPACES=("default" "eoapi" "data-access") | ||
| EOAPI_POD_RASTER="" | ||
| FOUND_NAMESPACE="" | ||
|
|
||
| # Discover the pod name by searching each candidate namespace | ||
| for NS in "${NAMESPACES[@]}"; do | ||
| EOAPI_POD_RASTER=$(kubectl get pods -n "$NS" -l app=raster-eoapi -o jsonpath="{.items[0].metadata.name}" 2>/dev/null) | ||
| if [ -n "$EOAPI_POD_RASTER" ]; then | ||
| FOUND_NAMESPACE="$NS" | ||
| echo "Found raster-eoapi pod: $EOAPI_POD_RASTER in namespace: $FOUND_NAMESPACE" | ||
| break | ||
| fi | ||
| done | ||
|
|
||
| # Check if the pod was found | ||
| if [ -z "$EOAPI_POD_RASTER" ]; then | ||
| echo "Could not determine raster-eoapi pod." | ||
| exit 1 | ||
| fi | ||
|
|
||
| # Check if input files exist | ||
| for FILE in "$EOAPI_COLLECTIONS_FILE" "$EOAPI_ITEMS_FILE"; do | ||
| if [ ! -f "$FILE" ]; then | ||
| echo "File not found: $FILE. Pass the collections and items files as the first and second arguments." | ||
| exit 1 | ||
| fi | ||
| done | ||
|
|
||
| # Install required packages | ||
| echo "Installing required packages in pod $EOAPI_POD_RASTER in namespace $FOUND_NAMESPACE..." | ||
| if ! kubectl exec -n "$FOUND_NAMESPACE" "$EOAPI_POD_RASTER" -- bash -c 'apt update -y && apt install python3 python3-pip -y && pip install pypgstac[psycopg]'; then | ||
| echo "Failed to install packages." | ||
| exit 1 | ||
| fi | ||
|
|
||
| # Copy files to pod | ||
| echo "Copying files to pod..." | ||
| echo "Using collections file: $EOAPI_COLLECTIONS_FILE" | ||
| echo "Using items file: $EOAPI_ITEMS_FILE" | ||
| kubectl cp "$EOAPI_COLLECTIONS_FILE" "$FOUND_NAMESPACE/$EOAPI_POD_RASTER":/tmp/collections.json | ||
| kubectl cp "$EOAPI_ITEMS_FILE" "$FOUND_NAMESPACE/$EOAPI_POD_RASTER":/tmp/items.json | ||
|
|
||
| # Load collections and items | ||
| echo "Loading collections..." | ||
| if ! kubectl exec -n "$FOUND_NAMESPACE" "$EOAPI_POD_RASTER" -- bash -c 'pypgstac load collections /tmp/collections.json --dsn "$PGADMIN_URI" --method insert_ignore'; then | ||
| echo "Failed to load collections." | ||
| exit 1 | ||
| fi | ||
|
|
||
| echo "Loading items..." | ||
| if ! kubectl exec -n "$FOUND_NAMESPACE" "$EOAPI_POD_RASTER" -- bash -c 'pypgstac load items /tmp/items.json --dsn "$PGADMIN_URI" --method insert_ignore'; then | ||
| echo "Failed to load items." | ||
| exit 1 | ||
| fi | ||
|
|
||
| # Clean temporary files | ||
| echo "Cleaning temporary files..." | ||
| kubectl exec -n "$FOUND_NAMESPACE" "$EOAPI_POD_RASTER" -- bash -c 'rm -f /tmp/collections.json /tmp/items.json' | ||
|
|
||
| echo "Ingestion complete." | ||
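The script falls back to the default files whenever the argument count is not exactly two, so a call with a single argument silently ignores that argument. One possible hardening is sketched below; it is an assumption about desired behavior rather than something this PR implements, and the function name `resolve_inputs` is hypothetical.

```bash
#!/bin/bash
# Hypothetical helper (not part of the PR): resolve the two input paths.
# Prints the collections path on line 1 and the items path on line 2.
# Any argument count other than 0 or 2 is rejected instead of being ignored.
resolve_inputs() {
  case "$#" in
    0) printf '%s\n' "./collections.json" "./items.json" ;;
    2) printf '%s\n' "$1" "$2" ;;
    *) echo "Usage: ingest.sh [COLLECTIONS_FILE ITEMS_FILE]" >&2
       return 2 ;;
  esac
}
```

Called with no arguments it prints the two default paths, with two arguments it echoes them back, and with any other count it prints a usage line to stderr and returns a non-zero status.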