added Japanese LLama elyza #1310

Open
YToleubay wants to merge 10 commits into master from elyza
Conversation

@YToleubay
Contributor

@YToleubay YToleubay commented Nov 14, 2023

@YToleubay YToleubay marked this pull request as ready for review November 14, 2023 07:40
@kyakuno
Collaborator

kyakuno commented Nov 18, 2023

I have uploaded the model:
https://storage.googleapis.com/ailia-models/elyza-japanese-llama-2-7b/decoder_model.onnx

@kyakuno
Collaborator

kyakuno commented Nov 18, 2023

On macOS, processing does not finish in a practical amount of time.

@kyakuno
Collaborator

kyakuno commented Nov 18, 2023

@YToleubay How much time does inference take with ONNX Runtime and with ailia?

@YToleubay
Contributor Author

> @YToleubay How much time does inference take with ONNX Runtime and with ailia?

I ran the following benchmark on an NVIDIA GeForce RTX 3090 with 32 GB of RAM.
With ONNX Runtime I get the following output:

processing time 36854 ms
processing time 32836 ms
processing time 31787 ms
processing time 31776 ms
processing time 31774 ms
**Average onnx time is =  33005.4 ms**

With ailia I get the following numbers:

 ailia processing time 1060661 ms
 ailia processing time 1061135 ms
**Average ailia time is = 1060898 ms**

So inference appears to be about 32 times slower on ailia than on ONNX Runtime.
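For reference, the per-run timings and averages above can be produced with a simple timing harness. This is a minimal sketch, not the script used in this PR; the `session.run(...)` call in the comment assumes an `onnxruntime.InferenceSession` loaded from the `decoder_model.onnx` linked earlier, and is replaced here by a placeholder callable so the sketch is self-contained:

```python
import time

def benchmark(run, warmup=1, repeats=5):
    """Call `run()` several times and return (per-run times in ms, average in ms)."""
    for _ in range(warmup):          # warm-up runs are excluded from the average
        run()
    times_ms = []
    for _ in range(repeats):
        start = time.perf_counter()
        run()
        times_ms.append((time.perf_counter() - start) * 1000.0)
    return times_ms, sum(times_ms) / len(times_ms)

# Placeholder workload; for the real measurement one would pass e.g.
# `lambda: session.run(None, inputs)` for an onnxruntime InferenceSession
# (assumption, not part of this PR's code).
times, avg = benchmark(lambda: sum(range(10000)))
for t in times:
    print(f"processing time {t:.0f} ms")
print(f"Average time is = {avg:.1f} ms")
```

Warm-up runs matter here: the first inference typically includes graph optimization and memory allocation, which would otherwise skew the average.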

@kyakuno
Collaborator

kyakuno commented Nov 19, 2023

Thanks. I will investigate it.

@YToleubay
Contributor Author

> Thanks. I will investigate it.

Can I help you somehow?

@kyakuno
Collaborator

kyakuno commented Nov 19, 2023

Thank you. We will verify it with the ailia SDK team, since this concerns the core implementation of the ailia SDK.

# Conflicts:
#	scripts/download_all_models.sh

2 participants