Description
Steps to reproduce:
pip install span-marker
>>> from span_marker import SpanMarkerModel
>>> m_cuda = SpanMarkerModel.from_pretrained("tomaarsen/span-marker-roberta-large-fewnerd-fine-super").cuda()
config.json: 100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████| 6.75k/6.75k [00:00<00:00, 50.7MB/s]
model.safetensors: 100%|████████████████████████████████████████████████████████████████████████████████████████████████████████| 1.42G/1.42G [00:21<00:00, 65.9MB/s]
>>> m_cuda.device
device(type='cuda', index=0)
>>> m_cuda.predict("John Smith works at Amazon.")
[]
>>> m_cpu = m_cuda.to("cpu")
>>> m_cpu.predict("John Smith works at Amazon.")
SpanMarker model predictions are being computed on the CPU while CUDA is available. Moving the model to CUDA using `model.cuda()` before performing predictions is heavily recommended to significantly boost prediction speeds.
[{'span': 'John Smith', 'label': 'person-other', 'score': 0.9197737574577332, 'char_start_index': 0, 'char_end_index': 10}, {'span': 'Amazon', 'label': 'organization-company', 'score': 0.9607704877853394, 'char_start_index': 20, 'char_end_index': 26}]
CPU inference yields the expected results, but CUDA returns an empty list for this short text. Longer texts predict fine on CUDA. Why is this?
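One way to narrow this down is to check whether a plain forward pass already diverges between CPU and CUDA on this machine. The sketch below is a hypothetical diagnostic using plain PyTorch (a tiny `torch.nn.Linear`, not SpanMarker internals): if even this mismatches, the problem is environmental; if it matches, the issue more likely sits in SpanMarker's device handling during `predict`.

```python
import torch

# Hypothetical sanity check: run the same tiny model on CPU and (if present)
# CUDA, then compare outputs. Not SpanMarker-specific.
torch.manual_seed(0)
model = torch.nn.Linear(4, 2)
x = torch.randn(1, 4)

out_cpu = model(x)
print("cpu output:", out_cpu.tolist())

if torch.cuda.is_available():
    out_gpu = model.to("cuda")(x.to("cuda")).to("cpu")
    model.to("cpu")  # move back, mirroring the m_cuda.to("cpu") step above
    print("cpu/gpu match:", torch.allclose(out_cpu, out_gpu, atol=1e-5))
```

If the outputs match here but SpanMarker still returns `[]` on CUDA only, that points at the library rather than the driver/toolkit stack.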
Output from nvcc --version:
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2023 NVIDIA Corporation
Built on Mon_Apr__3_17:16:06_PDT_2023
Cuda compilation tools, release 12.1, V12.1.105
Build cuda_12.1.r12.1/compiler.32688072_0
This is running on an A10 but we observe the same results on a T4... Is SpanMarker not compatible with cu12x?
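Note that `nvcc --version` reports the locally installed toolkit, not the CUDA runtime PyTorch was actually built against, which is what matters for compatibility. A small diagnostic sketch (plain PyTorch, nothing SpanMarker-specific) to print the versions PyTorch itself sees:

```python
import torch

# Report the CUDA runtime PyTorch was compiled with; this, not the
# system-wide nvcc version, determines GPU compatibility.
print("torch:", torch.__version__)
print("built with CUDA:", torch.version.cuda)       # None for CPU-only builds
print("CUDA available:", torch.cuda.is_available())
if torch.cuda.is_available():
    print("device:", torch.cuda.get_device_name(0))
```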