17 changes: 17 additions & 0 deletions utilities/bulk-delete-projects/README.md
@@ -0,0 +1,17 @@
# Bulk Delete Projects Script

## What is it
This script deletes projects in bulk from the deployment specified by `ORGSLUG` by looping over a CSV of project names and calling the `DELETE - Delete project` endpoint for each one. When it finishes, it writes a log of what was deleted and any errors that occurred. It also prints each response to the terminal as it runs.

## How to run
To run the script, first create an `input.csv` file and populate it with the names of the projects you want to delete. See the included `input.csv.example` file for an example.
Contributor

Why does this require a CSV if it's just taking a list of names? Also, it would be nice for this filename to be customizable (and the output filename too). We usually use argparse to do CLI arguments in our scripts - there are some good examples of this in the repo if you want to use it.

Contributor

Could also take the list on stdin, then the file can be named whatever and we don't have to implement args about it at all.
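
One way the two suggestions above could be combined is sketched below: an optional positional argument names the input file, and `-` (the default) means stdin. This is only an illustration. The argument names `input` and `--output`, and the helpers `build_parser`, `open_input`, and `read_project_names`, are hypothetical and not taken from the PR or from other scripts in the repo.

```python
import argparse
import sys


def build_parser():
    """Build a CLI parser. The argument names here are hypothetical."""
    parser = argparse.ArgumentParser(description="Bulk delete projects")
    parser.add_argument(
        "input",
        nargs="?",
        default="-",
        help="file with one project name per line, or '-' for stdin (default)",
    )
    parser.add_argument(
        "--output",
        default=None,
        help="path for the results log (hypothetical flag)",
    )
    return parser


def open_input(path):
    """Return stdin for '-', otherwise open the named file for reading."""
    return sys.stdin if path == "-" else open(path, "r", encoding="utf-8")


def read_project_names(lines):
    """Yield stripped, non-empty project names from any iterable of lines."""
    for line in lines:
        name = line.strip()
        if name:
            yield name
```

Note that if names arrive on stdin, an interactive confirmation prompt would read from the same stream, so the prompt would likely need to be replaced with a flag in that mode.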


You can use the `GET - List all projects` endpoint on the API to get these, but this will only return **scanned** projects, if you want to delete unscanned projects in bulk, you'll need to contact Semgrep Support to do this for you.
Contributor

Suggested change
You can use the `GET - List all projects` endpoint on the API to get these, but this will only return **scanned** projects, if you want to delete unscanned projects in bulk, you'll need to contact Semgrep Support to do this for you.
You can use the `GET - List all projects` endpoint on the API to get these, but this will only return **scanned** projects, if you want to delete unscanned projects in bulk, you'll need to contact Semgrep Support to do this for you.

It would be nice for the script to provide the option to both get and delete projects or to take an input file and delete. Not a requirement for approval, but I think it would improve the UX.


Once you have the data, set up the config at the top of the script: add your Organization Slug to `ORGSLUG` and your token for the deployment to `BEARER_TOKEN` (the token must be authorised for the API).
Contributor

Per comment on this code, these instructions would also need an update.


Then, once that's done you're good to go!
Contributor

Suggested change
Then, once that's done you're good to go!


CD to the scripts directory (`bulk-delete-projects`) and run it with the below command:
Contributor

Suggested change
CD to the scripts directory (`bulk-delete-projects`) and run it with the below command:
In the directory where the script is saved, run it with the following command:


`python3 index.py` (may vary depending on which Python version you have installed).
109 changes: 109 additions & 0 deletions utilities/bulk-delete-projects/index.py
@@ -0,0 +1,109 @@
import csv
import requests
from datetime import datetime
import sys
import time
import json

# ------------------------------------------------------------
# Configuration ⚙️⚙️⚙️
# ------------------------------------------------------------

ORGSLUG = "yourorgsluggoeshere" # Replace with your organization slug (found in Settings > Identifiers)
BEARER_TOKEN = "yourkeygoeshere" # Replace with your bearer token (generate one in Settings > Tokens)
Contributor

This needs to be an env var or extracted from the local settings.yml - we should not publish scripts that encourage hardcoding tokens.

If you look in the other API scripts, there are usage patterns you can follow for both retrieving the token and getting the deployment information using the token (instead of requiring a hardcode there too).
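
A minimal sketch of the environment-variable approach follows. It assumes the variable is named `SEMGREP_APP_TOKEN`. That name is an assumption here, as are the helper names `get_token` and `auth_headers`. The existing API scripts in the repo may use a different pattern, and this sketch does not cover reading `settings.yml` or looking up the deployment from the token.

```python
import os


def get_token(environ=os.environ, var_name="SEMGREP_APP_TOKEN"):
    """Read the API token from the environment instead of a hardcoded constant.

    The variable name is an assumption; pass a different var_name if needed.
    """
    token = environ.get(var_name, "").strip()
    if not token:
        raise SystemExit(f"Set {var_name} before running this script")
    return token


def auth_headers(token):
    """Build the same Authorization header the script already sends."""
    return {"Authorization": f"Bearer {token}"}
```

Accepting `environ` as a parameter keeps the function easy to exercise without touching the real environment.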


# ------------------------------------------------------------
# NO EDITING BELOW THIS LINE
# ------------------------------------------------------------
Comment on lines +15 to +17
Contributor

Suggested change
# ------------------------------------------------------------
# NO EDITING BELOW THIS LINE
# ------------------------------------------------------------


API_ENDPOINT = "https://semgrep.dev/api/v1/deployments/{deployment_slug}/projects/{project_name}"

def extract_error_message(response_text):
    """Extract error message from JSON response"""
    try:
        error_json = json.loads(response_text)
        if "error" in error_json:
            return error_json["error"]
        return response_text
    except json.JSONDecodeError:
        return response_text

def delete_project(project_name):
    """
    Delete a project using the Semgrep API
    Returns a tuple of (success, message)
    """
    url = API_ENDPOINT.format(
        deployment_slug=ORGSLUG,
        project_name=project_name
Contributor

Will we need to url encode the project name here? They almost always have slashes and occasionally have spaces.
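
If encoding does turn out to be needed, a sketch along these lines could work. It percent-encodes every reserved character, including `/`, by passing `safe=""` to `urllib.parse.quote`. Whether the endpoint expects slashes in the project name to be encoded is not confirmed here, so this should be checked against the API before adopting it. The helper name `build_delete_url` is hypothetical.

```python
from urllib.parse import quote

API_ENDPOINT = "https://semgrep.dev/api/v1/deployments/{deployment_slug}/projects/{project_name}"


def build_delete_url(deployment_slug, project_name):
    """Build the delete URL with both path segments percent-encoded.

    safe="" makes quote() encode "/" as %2F; spaces become %20.
    """
    return API_ENDPOINT.format(
        deployment_slug=quote(deployment_slug, safe=""),
        project_name=quote(project_name, safe=""),
    )
```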

    )

    headers = {
        "Authorization": f"Bearer {BEARER_TOKEN}"
    }

    try:
        response = requests.delete(url, headers=headers)
        if response.status_code == 200:
            print(f"Successfully deleted project: {project_name}")
            return True, "deleted"
Contributor

I'm wondering if it's really useful to produce both an output file and printed logs. I'd generally expect one or the other, or something like a file and then a summary output like "Successfully deleted X projects, failed to delete Y projects".
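
The summary line described above could be derived from the `results` rows the script already collects. The sketch below assumes each row is `[project_name, status]` and that a status of `"deleted"` marks success, as in the current code. The function name `summarize` is hypothetical.

```python
def summarize(results):
    """Return a one-line summary for rows of [project_name, status].

    A status of "deleted" counts as success; anything else is a failure.
    """
    deleted = sum(1 for _name, status in results if status == "deleted")
    failed = len(results) - deleted
    return f"Successfully deleted {deleted} projects, failed to delete {failed} projects"
```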

        else:
            error_msg = extract_error_message(response.text)
            print(f"Failed to delete project: {project_name}")
            print(f"Error: {error_msg}")
            return False, error_msg
    except Exception as e:
        error_msg = f"Error: {str(e)}"
        print(f"Error deleting project {project_name}: {str(e)}")
        return False, error_msg

def count_projects_in_csv():
    try:
        with open('input.csv', 'r') as file:
            return sum(1 for row in csv.reader(file) if row) - 1
Contributor

I'd consider using enumerate(file) here - I'm not sure at what point readlines becomes unwieldy, but we do know some folks have a lot of projects.
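
A sketch of an `enumerate`-based count is shown below. It streams rows one at a time and skips the header explicitly, so an empty file yields 0, whereas the current `sum(...) - 1` expression yields -1 for an empty file. The function name `count_data_rows` is hypothetical, and it takes an open file (or any iterable of lines) rather than a fixed filename.

```python
import csv


def count_data_rows(lines):
    """Count non-empty CSV rows after the header without loading the file."""
    reader = csv.reader(lines)
    next(reader, None)  # skip the header row if there is one
    count = 0
    # enumerate() leaves `count` at the index of the last non-empty row
    for count, _row in enumerate((row for row in reader if row), start=1):
        pass
    return count
```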

    except FileNotFoundError:
        print("Error: input.csv file not found")
        sys.exit(1)
    except Exception as e:
        print(f"An error occurred while reading the CSV: {str(e)}")
        sys.exit(1)

def main():
    project_count = count_projects_in_csv()
    print(f"\nThe script will attempt to delete {project_count} projects, would you like to continue?")
    confirmation = input("Enter Y/N >>> ")

    if confirmation.lower() != 'y':
        print("\nOperation cancelled by user\n")
        return
Comment on lines +74 to +78
Contributor

Suggested change
    confirmation = input("Enter Y/N >>> ")
    if confirmation.lower() != 'y':
        print("\nOperation cancelled by user\n")
        return
    confirmation = input("Enter y to proceed >>> ")
    if confirmation.lower() != 'y':
        print("\nOperation cancelled by user\n")
        return

I'm a little dubious about doing bespoke prompting in a script like this generally, but definitely if it's used, the prompt needs to provide accurate instructions.

Contributor

it is, no? lower() will ensure Y matches y

Contributor

Yep, but the prompt should just say enter y if that's the desired behavior, since anything else will result in not proceeding - it doesn't need to specify y/n, and there's no reason to use a different case than is targeted. There's a fairly common Unix-y convention where capitalization indicates it's the default, so using lowercase is generally clearer if there is no default.
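
For comparison, the capitalization convention mentioned above could look like the sketch below, where `[y/N]` signals that "no" is the default and only an explicit `y` proceeds. This is an alternative illustration, not the wording proposed in the suggested change. The helper name `confirm` and its injectable `read` parameter are hypothetical.

```python
def confirm(prompt, read=input):
    """Ask a yes/no question where anything other than 'y' means no.

    The capital N in [y/N] marks "no" as the default answer.
    """
    answer = read(f"{prompt} [y/N] >>> ").strip().lower()
    return answer == "y"
```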


    timestamp = datetime.now().strftime("%Y%m%d_%H%M%S")
    output_filename = f"bulkDeleteProjectsRun-{timestamp}.csv"
    results = []

    try:
        with open('input.csv', 'r') as file:
            csv_reader = csv.reader(file)
            next(csv_reader, None)

            for row in csv_reader:
                if row:
                    project_name = row[0].strip()
                    success, status = delete_project(project_name)
                    results.append([project_name, status])
                    time.sleep(0.25)

        with open(output_filename, 'w', newline='') as output_file:
            csv_writer = csv.writer(output_file)
            csv_writer.writerow(['Project Name', 'Status'])
            csv_writer.writerows(results)

        print(f"\nResults have been saved to {output_filename}")

    except FileNotFoundError:
        print("Error: input.csv file not found")
    except Exception as e:
        print(f"An error occurred: {str(e)}")

if __name__ == "__main__":
    main()
3 changes: 3 additions & 0 deletions utilities/bulk-delete-projects/input.csv.example
@@ -0,0 +1,3 @@
projectName
Semgrep/SC.Observability.Queues
Semgrep/Code.Security.ElasticSearch.Rules