
sarahlunette/GeoAIHack_team_18


GeoAIHack_team_18

This repository is our entry for the GeoAI Hack challenge at ADP on locust breeding ground segmentation: https://geoaihack.com/. The dataset is available on Kaggle: https://www.kaggle.com/datasets/chaimaboukthir/geo-ai-hack.

Notebooks

Sarah wrote this notebook with help from ChatGPT; it runs on either Colab or Kaggle. It trains a DeepLabV3 model on the 6-band remote sensing image (blue, green, red, NIR, SWIR1, SWIR2), with EVI and NDVI added as extra features. It runs in about 200 minutes on Kaggle with 1 GPU (the second GPU is available but not used).
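
As a rough illustration of the EVI and NDVI features mentioned above, the sketch below appends both indices to a 6-band array. It is not the notebook's code: the function name `add_vegetation_indices`, the band order, and the assumption that reflectance is scaled to roughly 0..1 are ours; the formulas are the standard NDVI and EVI definitions.

```python
import numpy as np


def add_vegetation_indices(image, eps=1e-6):
    """Append NDVI and EVI to a (6, H, W) reflectance array.

    Assumed band order: blue, green, red, nir, swir1, swir2.
    Assumed value range: surface reflectance scaled to roughly 0..1.
    """
    image = np.asarray(image, dtype=np.float32)
    blue, red, nir = image[0], image[2], image[3]
    # NDVI = (NIR - Red) / (NIR + Red); eps avoids division by zero.
    ndvi = (nir - red) / (nir + red + eps)
    # EVI = 2.5 * (NIR - Red) / (NIR + 6*Red - 7.5*Blue + 1)
    evi = 2.5 * (nir - red) / (nir + 6.0 * red - 7.5 * blue + 1.0 + eps)
    # Result has 8 channels: the 6 original bands, then NDVI, then EVI.
    return np.concatenate([image, ndvi[None], evi[None]], axis=0)
```

A model fed with this output needs its first convolution to accept 8 input channels instead of 6.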

Competition submission

Kenzo used a Prithvi model (the base model from the instageo package, with optional cloud masking) and produced a range of results by varying batch size, number of epochs, and learning_rate, and by adding NDWI and NDVI bands. The model is a ViT that takes 3 timestamps per area, 30 days apart. Stan added new bands intended to highlight locust-coloured areas as a cue for possible breeding grounds. The metric used is AUC.
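
For readers unfamiliar with the metric, here is a minimal sketch of ROC AUC computed from ranks (the Mann-Whitney U formulation). It is an illustration only, not the competition's scoring code; the function name `roc_auc` is ours, and in practice a library routine such as scikit-learn's `roc_auc_score` would normally be used.

```python
import numpy as np


def roc_auc(labels, scores):
    """ROC AUC for binary labels (0/1) and real-valued scores."""
    labels = np.asarray(labels).ravel().astype(bool)
    scores = np.asarray(scores, dtype=float).ravel()
    n_pos = int(labels.sum())
    n_neg = labels.size - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("AUC needs at least one positive and one negative sample")

    # Average 1-based ranks, so that tied scores share the same rank.
    order = np.argsort(scores, kind="mergesort")
    _, first, counts = np.unique(
        scores[order], return_index=True, return_counts=True
    )
    avg_rank = first + (counts + 1) / 2.0
    ranks = np.empty(scores.size, dtype=float)
    ranks[order] = np.repeat(avg_rank, counts)

    # Mann-Whitney U statistic of the positive class, normalised to [0, 1].
    u = ranks[labels].sum() - n_pos * (n_pos + 1) / 2.0
    return float(u / (n_pos * n_neg))
```

The value is the probability that a randomly chosen positive pixel receives a higher score than a randomly chosen negative one, so 0.5 corresponds to random ranking.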

Bibliography and interesting links

  • https://custom-scripts.sentinel-hub.com/custom-scripts/
  • https://european-flood.emergency.copernicus.eu/en
  • https://medium.com/sentinel-hub/environmental-monitoring-of-conflicts-using-sentinel-2-61f07d76e27b
  • https://medium.com/sentinel-hub/mapping-deforestation-from-sentinel-hub-de6aae67f817
  • https://towardsdatascience.com/vision-transformers-explained-a9d07147e4c8/
  • https://medium.com/sentinel-hub/the-use-of-satellite-imagery-in-crisis-management-after-flooding-382be517224f
Data

    The data are tif chips of 256 by 256 pixels, each with a binary mask (0 or 1). Every chip has 3 frames at 30-day lags and carries the blue, green, red, NIR, SWIR1, and SWIR2 bands; the dataset spans several years. Chip sizes can be changed by aggregation.
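
To make the "3 frames at 30-day lags" concrete, the sketch below lists the target date of each frame for one observation. It assumes the frames step backwards in time from the observation date, which the text above does not state; the function name `frame_dates` and its parameters are ours.

```python
from datetime import date, timedelta


def frame_dates(observation_date, num_frames=3, lag_days=30):
    """Return the target date of each frame, most recent first.

    Assumption: frame k is taken k * lag_days BEFORE the observation.
    """
    return [
        observation_date - timedelta(days=k * lag_days)
        for k in range(num_frames)
    ]


for d in frame_dates(date(2021, 3, 31)):
    print(d.isoformat())
```

With a 6-band image per frame, one chip then holds 3 x 6 = 18 band layers.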

    Streamlit App

    Julian built the Streamlit app from the template provided for the InstaDeep competition and modified it to add filter drop-downs.

    InstaGeo is a geospatial deep learning Python package designed to facilitate geospatial machine learning using satellite imagery from the Harmonized Landsat and Sentinel-2 (HLS) data product and the Prithvi geospatial foundational model. It consists of three core components: Data, Model, and Apps, each tailored to support various aspects of geospatial data retrieval, manipulation, preprocessing, model training, and inference serving.

    Components

    1. Data: Focuses on retrieving, manipulating, and processing Harmonized Landsat Sentinel-2 (HLS) data for classification and segmentation tasks such as disaster mapping, crop classification, and breeding ground prediction.
    2. Model: Centers around data loading, training, and evaluating models, particularly leveraging the Prithvi model for various modeling tasks. It includes a sliding-window feature that allows inference to be run on large inputs.
    3. Apps: Aims to operationalize models developed in the Model component for practical applications.

    Installation

    To get started with InstaGeo, ensure you have Python installed on your system. Then, execute the following command in your terminal or command prompt to install InstaGeo:

    pip install instageo

    This command will download and install the latest version of InstaGeo along with its required dependencies.

    Running Tests

    After installation, you may want to verify that InstaGeo has been correctly installed and is functioning as expected. To do this, run the included test suite with the following commands:

    export PYTHONPATH=$PYTHONPATH:$(pwd)
    pytest --verbose .

    Usage

    Data Component

    • HLS Data Retrieval: InstaGeo efficiently searches for and downloads Sentinel-2 and Landsat 8/9 multi-spectral earth observation images from the HLS data product.

    • Create Chips and Segmentation Maps: InstaGeo breaks down large satellite image tiles into smaller, manageable patches (referred to as "chips") suitable for deep learning model training. It also generates segmentation maps, which serve as targets for training, by categorizing each pixel in the chips.
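
The idea of a segmentation map can be sketched as follows: start from a chip-sized array filled with a no-data value and write the class of each observation into the pixel it falls on. This is a simplified illustration, not InstaGeo's implementation; `make_seg_map` and its arguments are hypothetical, and the defaults merely mirror the `--chip_size=224` and `--no_data_value=-1` flags used later in this README.

```python
import numpy as np


def make_seg_map(rows, cols, labels, chip_size=224, no_data_value=-1):
    """Build a (chip_size, chip_size) target from labelled pixel positions.

    rows/cols are pixel coordinates inside the chip (hypothetical inputs);
    pixels without an observation keep no_data_value.
    """
    rows = np.asarray(rows, dtype=int)
    cols = np.asarray(cols, dtype=int)
    labels = np.asarray(labels, dtype=np.int16)
    seg_map = np.full((chip_size, chip_size), no_data_value, dtype=np.int16)
    # Ignore observations that fall outside this chip.
    inside = (rows >= 0) & (rows < chip_size) & (cols >= 0) & (cols < chip_size)
    seg_map[rows[inside], cols[inside]] = labels[inside]
    return seg_map
```

Pixels left at the no-data value would typically be excluded from the loss and from the evaluation metrics.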

    Model Component

    • Training Custom Models: Utilize the Prithvi geospatial foundational model as a backbone to develop custom models tailored for precise geospatial applications. These applications include, but are not limited to, flood mapping for emergency response planning, crop classification for agricultural management, and locust breeding ground prediction to address food security.

    • Inference on Large-scale Geospatial Data: Perform inference using the models that have been trained on 'chips' (typically measuring 224 x 224 pixels) on expansive HLS tiles, which measure 3660 x 3660 pixels.
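
A sliding-window pass over a large tile can be sketched as below. This is a generic illustration under our own assumptions, not InstaGeo's code: `window_starts` and `predict_tile` are hypothetical names, `predict_chip` stands in for the trained model, and overlapping predictions are simply averaged.

```python
import numpy as np


def window_starts(length, window, stride=None):
    """Start offsets that cover [0, length) with windows of fixed size."""
    stride = stride or window
    if window > length:
        raise ValueError("window is larger than the tile")
    starts = list(range(0, length - window + 1, stride))
    # Shift one extra window back so the last pixels are covered too.
    if starts[-1] != length - window:
        starts.append(length - window)
    return starts


def predict_tile(tile, predict_chip, window=224):
    """Run predict_chip on every window of a (bands, H, W) tile."""
    _, height, width = tile.shape
    total = np.zeros((height, width), dtype=np.float64)
    count = np.zeros((height, width), dtype=np.float64)
    for r in window_starts(height, window):
        for c in window_starts(width, window):
            chip = tile[:, r:r + window, c:c + window]
            total[r:r + window, c:c + window] += predict_chip(chip)
            count[r:r + window, c:c + window] += 1
    # Average where windows overlap.
    return total / count
```

For a 3660 x 3660 tile and 224 x 224 chips this gives 17 window positions per axis: 16 at a stride of 224, plus one shifted back to start at 3436.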

    Apps Component

    • Operationalize Models: Once the data has been created and the model trained, deploy the model using the Apps component. HLS tile predictions can be overlaid and visualized on interactive maps.

    Putting It All Together - Locust Breeding Ground Prediction

    See InstGeo_Demo notebook for an end-to-end demo.

    Download locust breeding ground observation records.

    mkdir locust_breeding
    gsutil -m cp -r gs://instageo/data/locust_breeding/records locust_breeding

    Download HLS tiles, create chips and segmentation maps

    Note: This step can require up to 1.5 TB of free disk space.

    Create output directory for each split

    mkdir locust_breeding/train locust_breeding/val locust_breeding/test
    • Train Split
    python -m "instageo.data.chip_creator" \
        --dataframe_path="locust_breeding/records/train.csv" \
        --output_directory="locust_breeding/train" \
        --min_count=1 \
        --chip_size=224 \
        --no_data_value=-1 \
        --temporal_tolerance=3 \
        --temporal_step=30 \
        --mask_cloud=False \
        --num_steps=3
    • Validation Split
    python -m "instageo.data.chip_creator" \
        --dataframe_path="locust_breeding/records/val.csv" \
        --output_directory="locust_breeding/val" \
        --min_count=1 \
        --chip_size=224 \
        --no_data_value=-1 \
        --temporal_tolerance=3 \
        --temporal_step=30 \
        --mask_cloud=False \
        --num_steps=3
    • Test Split
    python -m "instageo.data.chip_creator" \
        --dataframe_path="locust_breeding/records/test.csv" \
        --output_directory="locust_breeding/test" \
        --min_count=1 \
        --chip_size=224 \
        --no_data_value=-1 \
        --temporal_tolerance=3 \
        --temporal_step=30 \
        --mask_cloud=False \
        --num_steps=3
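
The three commands above differ only in the split name, so they can also be generated from a loop. The sketch below only builds and prints the commands, using the same flags as above; the helper name `chip_creator_command` is ours.

```python
import shlex

SPLITS = ("train", "val", "test")

# Flags copied from the commands above; identical for every split.
COMMON_FLAGS = {
    "min_count": 1,
    "chip_size": 224,
    "no_data_value": -1,
    "temporal_tolerance": 3,
    "temporal_step": 30,
    "mask_cloud": False,
    "num_steps": 3,
}


def chip_creator_command(split, root="locust_breeding"):
    """Return the chip_creator invocation for one split as an argv list."""
    cmd = [
        "python", "-m", "instageo.data.chip_creator",
        f"--dataframe_path={root}/records/{split}.csv",
        f"--output_directory={root}/{split}",
    ]
    cmd += [f"--{name}={value}" for name, value in COMMON_FLAGS.items()]
    return cmd


for split in SPLITS:
    print(shlex.join(chip_creator_command(split)))
```

To execute a command rather than print it, the argv list can be passed to `subprocess.run(cmd, check=True)`.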

    Launch Training

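    Before launching training, update the paths to the chips and segmentation maps in each split:

    import os

    import pandas as pd

    root_dir = "locust_breeding"
    for split in ["train", "val", "test"]:
        # File names of the chips created for this split
        chips_orig = os.listdir(os.path.join(root_dir, f"{split}/chips"))
        # Prefix each chip with its path relative to root_dir
        chips = [chip.replace("chip", f"{split}/chips/chip") for chip in chips_orig]
        # Each segmentation map shares its chip's name, with "seg_map" replacing "chip"
        seg_maps = [
            chip.replace("chip", f"{split}/seg_maps/seg_map") for chip in chips_orig
        ]

        df = pd.DataFrame({"Input": chips, "Label": seg_maps})
        df.to_csv(os.path.join(root_dir, f"{split}.csv"))
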
    python -m instageo.model.run --config-name=locust \
        root_dir='locust_breeding' \
        train_filepath="locust_breeding/train.csv" \
        valid_filepath="locust_breeding/val.csv"

    Run Evaluation

    python -m instageo.model.run --config-name=locust \
        root_dir='locust_breeding' \
        test_filepath="locust_breeding/test.csv" \
        train.batch_size=16 \
        checkpoint_path='instageo-data/outputs/2024-03-01/09-16-30/instageo_epoch-10-val_iou-0.70.ckpt' \
        mode=eval
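
The checkpoint name above includes a `val_iou` value, which suggests that intersection over union is tracked during validation. As a reminder of what that number means, here is a minimal sketch of binary IoU that skips no-data pixels. It is our own illustration, not InstaGeo's metric code; `binary_iou` is a hypothetical name and the `-1` default mirrors the `--no_data_value=-1` flag used when creating chips.

```python
import numpy as np


def binary_iou(pred, target, no_data_value=-1):
    """IoU of the positive class, ignoring pixels labelled no_data_value."""
    pred = np.asarray(pred)
    target = np.asarray(target)
    valid = target != no_data_value
    p = pred[valid] == 1
    t = target[valid] == 1
    union = np.logical_or(p, t).sum()
    if union == 0:
        # No positive pixel in either map: IoU is undefined.
        return float("nan")
    return float(np.logical_and(p, t).sum() / union)
```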

    Run Inference on the African Continent

    • Download HLS tiles
    python -m "instageo.data.chip_creator" \
        --dataframe_path="gs://instageo/utils/africa_prediction_template.csv" \
        --output_directory="inference/20223-01" \
        --min_count=1 \
        --no_data_value=-1 \
        --temporal_tolerance=3 \
        --temporal_step=30 \
        --num_steps=3 \
        --download_only
    • Inference
    python -m instageo.model.run --config-name=locust \
        root_dir='inference/20223-01' \
        test_filepath='hls_dataset.json' \
        train.batch_size=16 \
        checkpoint_path='instageo-data/outputs/2024-03-01/09-16-30/instageo_epoch-10-val_iou-0.70.ckpt' \
        mode=predict

    Visualize Predictions

    • Run InstaGeo Serve
    cd instageo/apps
    streamlit run app.py
    • Specify the directory containing the predictions.

    InstaGeo Serve

    Contributing

    We welcome contributions to InstaGeo. Please follow the contribution guidelines for submitting pull requests and reporting issues to help us improve the package.

    License

    This project is licensed under CC BY-NC-SA 4.0.

    About

    Satellite Imagery for classification of locust breeding grounds
