Replies: 1 comment
🤖 Hi,

Based on the information available in the LangChain repository, the `load` method of `AzureBlobStorageContainerLoader` lists the blobs in the container and loads each one with `AzureBlobStorageFileLoader`:

```python
def load(self) -> List[Document]:
    """Load documents."""
    try:
        from azure.storage.blob import ContainerClient
    except ImportError as exc:
        raise ImportError(
            "Could not import azure storage blob python package. "
            "Please install it with `pip install azure-storage-blob`."
        ) from exc

    container = ContainerClient.from_connection_string(
        conn_str=self.conn_str, container_name=self.container
    )

    docs = []
    # List every blob under the configured prefix.
    blob_list = container.list_blobs(name_starts_with=self.prefix)
    for blob in blob_list:
        # Delegate each blob to the single-file loader.
        loader = AzureBlobStorageFileLoader(
            self.conn_str, self.container, blob.name  # type: ignore
        )
        docs.extend(loader.load())

    return docs
```

(Source: libs/langchain/langchain/document_loaders/azure_blob_storage_container.py)

As for preserving the markdown format during the loading process, the markdown loader relies on `partition_md` from the `unstructured` package:

```python
from unstructured.partition.md import partition_md

return partition_md(filename=self.file_path, **self.unstructured_kwargs)
```

(Source: libs/langchain/langchain/document_loaders/markdown.py)

However, I wasn't able to find information on whether the [...]

I hope this information helps! If you have any more questions or need further clarification, feel free to ask.

Best,
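Since the goal is to keep the headers for `MarkdownHeaderTextSplitter`, one option is to skip the loader and read the raw markdown text straight from the container. The following is only a minimal sketch under some assumptions: `read_markdown_blobs` and `split_markdown_blobs` are hypothetical helper names (not part of LangChain), blobs are assumed to be UTF-8 encoded, the `langchain.text_splitter` import path may differ between LangChain versions, and the header levels are just examples.

```python
from typing import Iterator, List, Optional, Tuple


def read_markdown_blobs(container, prefix: Optional[str] = None) -> Iterator[Tuple[str, str]]:
    """Yield (blob_name, raw_markdown) for each .md blob under the prefix.

    `container` is expected to behave like azure.storage.blob.ContainerClient.
    """
    for blob in container.list_blobs(name_starts_with=prefix):
        if blob.name.lower().endswith(".md"):
            # readall() returns bytes; UTF-8 is an assumption about the files.
            raw = container.download_blob(blob.name).readall()
            yield blob.name, raw.decode("utf-8")


def split_markdown_blobs(conn_str: str, container_name: str, prefix: Optional[str] = None) -> List:
    """Hypothetical helper: download markdown blobs and split them by header."""
    # Third-party imports stay local so the helper above works without them.
    from azure.storage.blob import ContainerClient
    from langchain.text_splitter import MarkdownHeaderTextSplitter

    container = ContainerClient.from_connection_string(
        conn_str=conn_str, container_name=container_name
    )
    splitter = MarkdownHeaderTextSplitter(
        headers_to_split_on=[("#", "Header 1"), ("##", "Header 2")]
    )

    chunks = []
    for name, text in read_markdown_blobs(container, prefix):
        for doc in splitter.split_text(text):
            # Record which blob each chunk came from.
            doc.metadata["source"] = name
            chunks.append(doc)
    return chunks
```

Because the text is never passed through `partition_md`, the `#` markers reach the splitter unchanged. The trade-off is that you lose whatever cleanup the loader would otherwise apply.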
This response is meant to be useful and save you time. It is not meant to be a precise solution, but rather a starting point for your own research. Help me be more useful! Please leave a 👍 if this is helpful and 👎 if it is irrelevant. If you want to continue the conversation, start your reply with @dosu-bot.
Hi, everyone.
I would like to check whether we can preserve the markdown format when using AzureBlobStorageContainerLoader to load markdown files from Azure Blob Storage.
We want to process the result with MarkdownHeaderTextSplitter so that paragraph integrity is preserved.
If you have any ideas, please share them with me.
I would appreciate it, thank you.