dbdemos llm-fine-tuning not working - TypeError: string indices must be integers, not 'str' #230

@rvvittal

Description

After successfully installing the demo resources with dbdemos.install('llm-fine-tuning'), I ran the first notebook, 01-classification-fine-tuning-customer-support. The statement %run ./_resources/00-setup fails with the following error:

USE CATALOG main
using catalog.database main.dbdemos_llm_fine_tuning

TypeError: string indices must be integers, not 'str'
File , line 5
1 if not spark.catalog.tableExists('training_dataset_question') or
2 not spark.catalog.tableExists('training_dataset_answer') or
3 not spark.catalog.tableExists('databricks_documentation')or
4 not spark.catalog.tableExists('customer_tickets'):
----> 5 DBDemos.download_file_from_git(volume_folder+"/training_dataset", "databricks-demos", "dbdemos-dataset", "llm/databricks-documentation")
7 #spark.read.format('parquet').load(f"{volume_folder}/training_dataset/raw_documentation.parquet").write.saveAsTable("raw_documentation")
8 spark.read.format('parquet').load(f"{volume_folder}/training_dataset/training_dataset_question.parquet").write.mode('overwrite').saveAsTable("training_dataset_question")
File , line 108, in (.0)
106 from concurrent.futures import ThreadPoolExecutor
107 files = requests.get(f'https://api.github.com/repos/{owner}/{repo}/contents{path}').json()
--> 108 files = [f['download_url'] for f in files if 'NOTICE' not in f['name']]
109 def download_to_dest(url):
110 try:
111 #Temporary fix to avoid hitting github limits - Swap github to our S3 bucket to download files
Command skipped
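
A possible explanation, offered as an assumption rather than a confirmed diagnosis: the list comprehension at line 108 expects the GitHub contents API to return a JSON list of file entries. If the API returns a JSON object instead (for example an error body with a "message" field, as can happen when a rate limit is hit or the path is not found), iterating over that dict yields its string keys, and f['name'] on a string raises this TypeError. The sketch below reproduces the failure with a made-up payload and shows one way to fail with a clearer message. The helper name extract_download_urls and the sample payloads are hypothetical and not part of dbdemos.

```python
def extract_download_urls(listing):
    """Return download URLs from a parsed GitHub contents API response.

    Hypothetical helper (not part of dbdemos): it checks that the response
    is a list before indexing into its entries.
    """
    if not isinstance(listing, list):
        # Error responses are JSON objects, typically with a "message" field.
        message = listing.get("message", listing) if isinstance(listing, dict) else listing
        raise RuntimeError(f"Unexpected GitHub API response: {message}")
    return [f["download_url"] for f in listing if "NOTICE" not in f["name"]]


# Made-up error payload, standing in for what the API might have returned.
error_payload = {"message": "API rate limit exceeded"}

# Reproduce the original failure: iterating a dict yields its keys (strings),
# so f["name"] indexes a string with a string and raises TypeError.
try:
    [f["download_url"] for f in error_payload if "NOTICE" not in f["name"]]
except TypeError as exc:
    print(f"TypeError: {exc}")

# With the type check, the same payload produces a descriptive error instead.
try:
    extract_download_urls(error_payload)
except RuntimeError as exc:
    print(exc)
```

If this is the cause, printing the raw response of the requests.get call at line 107 (status code and body) should show the actual error returned by GitHub.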
