Error: AttributeError: 'LlamaTokenizerFast' object has no attribute '_bos_token' #13

@kaiwenKevinn

I hit the following error when running the scripts/run.sh script:
AttributeError: 'LlamaTokenizerFast' object has no attribute '_bos_token'

However, my model's tokenizer_config.json file does define bos_token.

Is there a known fix?

The command and full log are below. Thank you!

bash scripts/run.sh --method best_of_n --LM $POLICY_MODEL_PATH --RM $VALUE_MODEL_PATH --width 1 --num_seq 1 --num_q 256 --bs 16000
LM: /research/d1/gds/ytyang/kwchen/hf_models/deepseek/DeepSeek-R1-Distill-Qwen-1.5B/models--deepseek-ai--DeepSeek-R1-Distill-Qwen-1.5B/snapshots/ad9f0ae0864d7fbcd1cd905e3c6c5b069cc8b562, RM: /research/d1/gds/ytyang/kwchen/hf_models/Qwen/Qwen2.5-Math-PRM-7B/models--Qwen--Qwen2.5-Math-PRM-7B/snapshots/0610740060112df12585d00a1c5f4624d2f59051, task: MATH, tree_max_width: 1, num_sequence: 1, question_parallel_num: 256
batch_size: 16000, max_time: 3, n_gpus: 1, double_line_break: 1
CUDA_VISIBLE_DEVICES: 0, n_gpus: 1
GPU_LIST:
2 3
Running best_of_n evaluation ...
2025-03-10 17:41:30,525 INFO worker.py:1654 -- Connecting to existing Ray cluster at address: 192.168.50.187:6379...
2025-03-10 17:41:30,544 INFO worker.py:1832 -- Connected to Ray cluster. View the dashboard at 127.0.0.1:8265
Auto set dir as /research/d1/gds/ytyang/kwchen/compute-optimal-tts/src/output/MATH_best_of_n/ad9f0ae0864d7fbcd1cd905e3c6c5b069cc8b562/0610740060112df12585d00a1c5f4624d2f59051/1_1_1
Loading MATH dataset
Batch size: 16000, Max exist time: 3
Assign start at: 2025-03-10 17:43:15
Len: 16000, Chosen dict: {0: [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31], 1: [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31], 2: [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31], 3: [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31], 4: [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31], 5: [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31], 6: [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31]......(ignored due to words limits)

23, 24, 25, 26, 27, 28, 29, 30, 31]}
Assign end at: 2025-03-10 17:44:41, Time cost: 86.3 seconds
Traceback (most recent call last):
File "/research/d1/gds/ytyang/kwchen/compute-optimal-tts/src/reason/evaluation/evaluate.py", line 352, in <module>
returned_temp = parallel_evaluate_test_dataset(
File "/research/d1/gds/ytyang/kwchen/compute-optimal-tts/src/reason/evaluation/evaluate.py", line 199, in parallel_evaluate_test_dataset
for i, (problem_inst, result, output) in enumerate(res_q):
File "/research/d1/gds/ytyang/anaconda3/envs/tts/lib/python3.10/site-packages/ray/util/actor_pool.py", line 170, in get_generator
yield self.get_next_unordered()
File "/research/d1/gds/ytyang/anaconda3/envs/tts/lib/python3.10/site-packages/ray/util/actor_pool.py", line 370, in get_next_unordered
return ray.get(future)
File "/research/d1/gds/ytyang/anaconda3/envs/tts/lib/python3.10/site-packages/ray/_private/auto_init_hook.py", line 21, in auto_init_wrapper
return fn(*args, **kwargs)
File "/research/d1/gds/ytyang/anaconda3/envs/tts/lib/python3.10/site-packages/ray/_private/client_mode_hook.py", line 103, in wrapper
return func(*args, **kwargs)
File "/research/d1/gds/ytyang/anaconda3/envs/tts/lib/python3.10/site-packages/ray/_private/worker.py", line 2772, in get
values, debugger_breakpoint = worker.get_objects(object_refs, timeout=timeout)
File "/research/d1/gds/ytyang/anaconda3/envs/tts/lib/python3.10/site-packages/ray/_private/worker.py", line 919, in get_objects
raise value.as_instanceof_cause()
ray.exceptions.RayTaskError(AttributeError): ray::RemoteMathEvaluator.evaluate_problem() (pid=4127907, ip=192.168.50.187, actor_id=359deb1dd8330ba5e9d21d2f87000000, repr=<reason.evaluation.evaluator.RemoteMathEvaluator object at 0x7f82c0db0700>)
File "/research/d1/gds/ytyang/kwchen/compute-optimal-tts/src/envs/base_env.py", line 256, in update_legal_actions
result: ConcatedLMGenResult = self.llm_gen_fns[0](
File "/research/d1/gds/ytyang/kwchen/compute-optimal-tts/src/reason/inference/lm_call.py", line 62, in __call__
return _generate_fastchat(
File "/research/d1/gds/ytyang/kwchen/compute-optimal-tts/src/reason/inference/text_generation.py", line 79, in _generate_fastchat
bos_token=tokenizer.bos_token or "",
File "/research/d1/gds/ytyang/anaconda3/envs/open_reasoner/lib/python3.10/site-packages/transformers/tokenization_utils_base.py", line 1061, in bos_token
if self._bos_token is None:
AttributeError: 'LlamaTokenizerFast' object has no attribute '_bos_token'

During handling of the above exception, another exception occurred:

ray::RemoteMathEvaluator.evaluate_problem() (pid=4127907, ip=192.168.50.187, actor_id=359deb1dd8330ba5e9d21d2f87000000, repr=<reason.evaluation.evaluator.RemoteMathEvaluator object at 0x7f82c0db0700>)
File "/research/d1/gds/ytyang/kwchen/compute-optimal-tts/src/reason/evaluation/evaluator.py", line 127, in evaluate_problem
solution: SolutionOutput = solver_fn(problem_inst, self.lm_calls, self.rm_call)
File "/research/d1/gds/ytyang/kwchen/compute-optimal-tts/src/reason/evaluation/methods.py", line 89, in beam_search
traj_list = search_tree.beam_search(env, config.beam_size, config.tree_max_depth, rm_call)
File "/research/d1/gds/ytyang/kwchen/compute-optimal-tts/src/reason/guided_search/tree.py", line 273, in beam_search
_, info = simulate_env.reset(update_legal_action=True)
File "/research/d1/gds/ytyang/kwchen/compute-optimal-tts/src/envs/base_env.py", line 176, in reset
self._legal_actions, api_completion_token = self.update_legal_actions(initial=True, force_update=True)
File "/research/d1/gds/ytyang/kwchen/compute-optimal-tts/src/envs/base_env.py", line 256, in update_legal_actions
result: ConcatedLMGenResult = self.llm_gen_fns[0](
File "/research/d1/gds/ytyang/kwchen/compute-optimal-tts/src/reason/inference/lm_call.py", line 62, in __call__
return _generate_fastchat(
File "/research/d1/gds/ytyang/kwchen/compute-optimal-tts/src/reason/inference/text_generation.py", line 79, in _generate_fastchat
bos_token=tokenizer.bos_token or "",
File "/research/d1/gds/ytyang/anaconda3/envs/open_reasoner/lib/python3.10/site-packages/transformers/tokenization_utils_base.py", line 1061, in bos_token
if self._bos_token is None:
AttributeError: 'LlamaTokenizerFast' object has no attribute '_bos_token'
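
One detail in the traceback above may be worth checking: the Ray frames come from envs/tts, while the transformers frame that raises comes from envs/open_reasoner. A possible explanation (an assumption, not confirmed anywhere in this issue) is that the tokenizer object is created under one transformers version and then used in a worker that runs a different one, so the private `_bos_token` attribute the older code reads was never set. The sketch below is only a diagnostic aid. `envs_in_traceback` and `installed_version` are hypothetical helper names, not part of compute-optimal-tts; the script lists the conda env names that appear in a pasted traceback and prints the package versions seen by the current interpreter.

```python
import re
import sys
from importlib import metadata


def envs_in_traceback(traceback_text: str) -> list[str]:
    """Return the sorted conda env names found in '.../envs/<name>/lib/...' paths."""
    return sorted(set(re.findall(r"/envs/([^/]+)/lib/", traceback_text)))


def installed_version(package: str) -> str:
    """Return the installed version of a package, or 'not installed'."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return "not installed"


if __name__ == "__main__":
    # Run this once in each env (here: tts and open_reasoner) and compare.
    print("interpreter:", sys.executable)
    for pkg in ("transformers", "tokenizers", "ray"):
        print(f"{pkg}: {installed_version(pkg)}")
```

If the two envs report different transformers versions, starting the Ray cluster and the evaluation script from the same env would be a reasonable first thing to try. This is a guess based on the paths in the log, not a verified fix.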
