Commit 011ba5d

tjohnson31415 authored and njhill committed
test: add tests cases for a llama model including prefixes and truncation
Signed-off-by: Travis Johnson <[email protected]>
1 parent b54dc37 commit 011ba5d

3 files changed: +316 -0 lines changed

Binary file not shown.
Lines changed: 308 additions & 0 deletions
@@ -0,0 +1,308 @@
+# Test empty requests
+- name: Empty 1
+  request: {}
+  response: {}
+- name: Empty 2
+  request:
+    params: {}
+    requests: []
+  response: {}
+
+# Basic Greedy (implicit)
+- name: Basic Greedy, max new tokens (implicit)
+  request:
+    requests:
+      - {"text": "Once upon a time,"}
+  response:
+    responses:
+      - generatedTokenCount: 20
+        inputTokenCount: 6
+        stopReason: MAX_TOKENS
+        text: ' there was a little girl named Lily. She loved to play with her toy car.
+          One day,'
+
+# Basic Greedy (explicit)
+- name: Basic Greedy, max new tokens (explicit)
+  request:
+    params:
+      method: GREEDY
+      stopping: {"maxNewTokens": 17}
+    requests:
+      - {"text": "Once upon a time,"}
+  response:
+    responses:
+      - generatedTokenCount: 17
+        inputTokenCount: 6
+        stopReason: MAX_TOKENS
+        text: ' there was a little girl named Lily. She loved to play with her toy car.'
+
+- name: Long input with tokens truncated
+  request:
+    params:
+      truncateInputTokens: 25
+      stopping:
+        maxNewTokens: 14
+      response:
+        inputText: true
+    requests:
+      - text: >
+          The hallway smelt of boiled cabbage and old rag mats. At one end of it a coloured poster, too large for
+          indoor display, had been tacked to the wall. It depicted simply an enormous face, more than a metre wide:
+          the face of a man of about forty-five, with a heavy black moustache and ruggedly handsome features.
+          Winston made for the stairs. It was no use trying the lift. Even at the best of times it was seldom working,
+          and at present the electric current was cut off during daylight hours. It was part of the economy drive in
+          preparation for Hate Week. The flat was seven flights up, and Winston, who was thirty-nine and had a
+          varicose ulcer above his right ankle, went slowly, resting several times on the way. On each landing,
+          opposite the lift-shaft, the poster with the enormous face gazed from the wall.
+          The hallway smelt of boiled cabbage and old rag mats. At one end of it a coloured poster, too large for
+          indoor display, had been tacked to the wall. It depicted simply an enormous face, more than a metre wide:
+          the face of a man of about forty-five, with a heavy black moustache and ruggedly handsome features.
+          Winston made for the stairs. It was no use trying the lift. Even at the best of times it was seldom working,
+          and at present the electric current was cut off during daylight hours. It was part of the economy drive in
+          preparation for Hate Week. The flat was seven flights up, and Winston, who was thirty-nine and had a
+          varicose ulcer above his right ankle, went slowly, resting several times on the way. On each landing,
+          opposite the lift-shaft, the poster with the enormous face gazed from the wall.
+  response:
+    responses:
+      - generatedTokenCount: 14
+        inputTokenCount: 25
+        stopReason: MAX_TOKENS
+        text: 'The hallway smelt of boiled cabbage and old rag mats. At one end of it a
+          coloured poster, too large for indoor display, had been tacked to the wall. It
+          depicted simply an enormous face, more than a metre wide: the face of a man of
+          about forty-five, with a heavy black moustache and ruggedly handsome features.
+          Winston made for the stairs. It was no use trying the lift. Even at the best of
+          times it was seldom working, and at present the electric current was cut off during
+          daylight hours. It was part of the economy drive in preparation for Hate Week.
+          The flat was seven flights up, and Winston, who was thirty-nine and had a varicose
+          ulcer above his right ankle, went slowly, resting several times on the way. On
+          each landing, opposite the lift-shaft, the poster with the enormous face gazed
+          from the wall. The hallway smelt of boiled cabbage and old rag mats. At one end
+          of it a coloured poster, too large for indoor display, had been tacked to the
+          wall. It depicted simply an enormous face, more than a metre wide: the face of
+          a man of about forty-five, with a heavy black moustache and ruggedly handsome
+          features. Winston made for the stairs. It was no use trying the lift. Even at
+          the best of times it was seldom working, and at present the electric current was
+          cut off during daylight hours. It was part of the economy drive in preparation
+          for Hate Week. The flat was seven flights up, and Winston, who was thirty-nine
+          and had a varicose ulcer above his right ankle, went slowly, resting several times
+          on the way. On each landing, opposite the lift-shaft, the poster with the enormous
+          face gazed from the wall.
+
+          The mailman was so excited to see what was inside the door.'
+
+- name: Sampling with multiple requests
+  request:
+    params:
+      method: SAMPLE
+      sampling:
+        seed: 99
+        top_k: 3
+        top_p: 0.7
+        typical_p: 0.6
+        temperature: 0.99
+      stopping: {"maxNewTokens": 2}
+      response:
+        generatedTokens: true
+        tokenLogprobs: true
+        topNTokens: 2
+    requests:
+      - {"text": ""}
+      - {"text": "A boy"}
+      - {"text": "A girl"}
+  response:
+    responses:
+      - generatedTokenCount: 2
+        inputTokenCount: 1
+        seed: '99'
+        stopReason: MAX_TOKENS
+        text: ' Once upon'
+        tokens:
+          - text: "\u2581Once"
+            topTokens:
+              - text: "\u2581Once"
+          - text: "\u2581upon"
+            topTokens:
+              - text: "\u2581upon"
+      - generatedTokenCount: 2
+        inputTokenCount: 3
+        seed: '99'
+        stopReason: MAX_TOKENS
+        text: ' named Tim'
+        tokens:
+          - logprob: -0.5503204
+            text: "\u2581named"
+            topTokens:
+              - logprob: -0.5503204
+                text: "\u2581named"
+              - logprob: -0.85982776
+                text: "\u2581was"
+          - text: "\u2581Tim"
+            topTokens:
+              - text: "\u2581Tim"
+      - generatedTokenCount: 2
+        inputTokenCount: 3
+        seed: '99'
+        stopReason: MAX_TOKENS
+        text: ' was walking'
+        tokens:
+          - text: "\u2581was"
+            topTokens:
+              - text: "\u2581was"
+          - logprob: -0.5918741
+            text: "\u2581walking"
+            topTokens:
+              - logprob: -0.5918741
+                text: "\u2581walking"
+              - logprob: -0.8058443
+                text: "\u2581a"
+
+
+# Prompt prefix
+- name: Greedy with tuned prompt prefix
+  # Prompt prefixes with multi-shard not yet supported
+  singleShardOnly: true
+  request:
+    prefixId: tinyllama
+    params:
+      method: GREEDY
+      stopping:
+        maxNewTokens: 13
+    requests:
+      - {"text": ""}
+  response:
+    responses:
+      - generatedTokenCount: 13
+        inputTokenCount: 1
+        stopReason: MAX_TOKENS
+        text: ' Once he can go to the park and play with his friends.'
+
+# Prompt prefix with truncation
+- name: Greedy with tuned prompt prefix with truncation
+  # Prompt prefixes with multi-shard not yet supported
+  singleShardOnly: true
+  request:
+    prefixId: tinyllama
+    params:
+      method: GREEDY
+      # this truncate will only leave the BOS token
+      truncateInputTokens: 1
+      stopping:
+        maxNewTokens: 13
+    requests:
+      - {"text": "this will all be truncated"}
+  response:
+    responses:
+      - generatedTokenCount: 13
+        inputTokenCount: 1
+        stopReason: MAX_TOKENS
+        text: ' Once he can go to the park and play with his friends.'
+
+
+# Prompt prefix returning input and generated tokens
+- name: Greedy with tuned prompt prefix and returned tokens
+  # Prompt prefixes with multi-shard not yet supported
+  singleShardOnly: true
+  request:
+    prefixId: tinyllama
+    params:
+      method: GREEDY
+      stopping: {"maxNewTokens": 2}
+      response:
+        inputTokens: true
+        generatedTokens: true
+        tokenLogprobs: true
+        tokenRanks: true
+        topNTokens: 2
+    requests:
+      - {"text": "Luke"}
+  response:
+    responses:
+      - generatedTokenCount: 2
+        inputTokenCount: 2
+        inputTokens:
+          - logprob: NaN
+            text: <unk>
+          - logprob: -15.049328
+            rank: 22217
+            text: <unk>
+            topTokens:
+              - logprob: -1.1141717
+                text: ''''
+              - logprob: -4.02287
+                text: "\u2581the"
+          - logprob: -13.876278
+            rank: 21532
+            text: <unk>
+            topTokens:
+              - logprob: -3.9175758
+                text: "\u2581beautiful"
+              - logprob: -4.1629815
+                text: "\u2581big"
+          - logprob: -14.279059
+            rank: 20730
+            text: <unk>
+            topTokens:
+              - logprob: -1.2582091
+                text: '!'
+              - logprob: -3.2666798
+                text: ','
+          - logprob: -14.407384
+            rank: 21752
+            text: <unk>
+            topTokens:
+              - logprob: -0.74738955
+                text: "\u2581away"
+              - logprob: -3.7206948
+                text: "\u2581home"
+          - logprob: -14.740826
+            rank: 21652
+            text: <unk>
+            topTokens:
+              - logprob: -2.3787293
+                text: "\u2581away"
+              - logprob: -2.5638995
+                text: "\u2581and"
+          - logprob: -15.371115
+            rank: 20720
+            text: <unk>
+            topTokens:
+              - logprob: -1.3795971
+                text: .
+              - logprob: -2.144157
+                text: "\u2581in"
+          - logprob: -12.507558
+            rank: 6857
+            text: <s>
+            topTokens:
+              - logprob: -0.8159742
+                text: "\u2581"
+              - logprob: -1.5868378
+                text: <0x0A>
+          - logprob: -9.704017
+            rank: 194
+            text: "\u2581Luke"
+            topTokens:
+              - logprob: -0.48339355
+                text: "\u2581Once"
+              - logprob: -1.4185036
+                text: "\u2581One"
+        stopReason: MAX_TOKENS
+        text: ' was so'
+        tokens:
+          - logprob: -0.815401
+            rank: 1
+            text: "\u2581was"
+            topTokens:
+              - logprob: -0.815401
+                text: "\u2581was"
+              - logprob: -1.8202467
+                text: "\u2581had"
+          - logprob: -1.3788807
+            rank: 1
+            text: "\u2581so"
+            topTokens:
+              - logprob: -1.3788807
+                text: "\u2581so"
+              - logprob: -1.7917235
+                text: "\u2581very"
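
The "Greedy with tuned prompt prefix with truncation" case notes that `truncateInputTokens: 1` leaves only the BOS token, and it expects `inputTokenCount: 1`. One way to picture that outcome is left-truncation that keeps a leading BOS token and counts it toward the limit. The sketch below is an illustration under that assumption only: the function name `truncate_keep_bos`, the token ids, and the exact truncation rule are hypothetical and are not taken from the server's code.

```python
from typing import List


def truncate_keep_bos(token_ids: List[int], max_tokens: int, bos_id: int) -> List[int]:
    """Left-truncate token_ids to at most max_tokens, keeping a leading BOS.

    Hypothetical helper: assumes the BOS token counts toward max_tokens.
    """
    if max_tokens <= 0 or len(token_ids) <= max_tokens:
        # Treat a non-positive limit as "no truncation" (an assumption).
        return list(token_ids)
    if token_ids and token_ids[0] == bos_id:
        keep = max_tokens - 1  # room left after the BOS token
        tail = token_ids[len(token_ids) - keep:] if keep > 0 else []
        return [bos_id] + tail
    # No BOS present: keep only the last max_tokens tokens.
    return token_ids[len(token_ids) - max_tokens:]


# Made-up ids: 1 stands in for BOS, the rest for the prompt text.
ids = [1, 11, 12, 13, 14, 15]
print(truncate_keep_bos(ids, 1, bos_id=1))  # [1]
print(truncate_keep_bos(ids, 3, bos_id=1))  # [1, 14, 15]
```

Under this assumption a limit of 1 drops all prompt text, which would be consistent with the truncation case producing the same generated text as the empty-prompt prefix case before it.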

integration_tests/text_generation_tests/test_server.py

Lines changed: 8 additions & 0 deletions
@@ -353,6 +353,14 @@ async def test_mt0(server_fixture, test_cases):
 async def test_gptbigcode(server_fixture, test_cases):
     await run_test_cases_async(test_cases)

+# test with Llama model which has tokenizer.add_bos_token == true
+@pytest.mark.model("Maykeye/TinyLLama-v0")
+@pytest.mark.extensions(".bin,.json,.model")
+@pytest.mark.shards(1)
+@pytest.mark.test_case_file("test_cases_tinyllama.yaml")
+@pytest.mark.asyncio
+async def test_llama(server_fixture, test_cases):
+    await run_test_cases_async(test_cases)

 # Test distributed inference - two shards
 @pytest.mark.model("bigscience/bloom-560m")
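
The expected responses in the new test case file include floating-point `logprob` values and, for the first input token, `logprob: NaN`. If that value is loaded as a float NaN, a plain equality check fails, because NaN never compares equal to itself. The sketch below shows one way a comparison could handle this. It is not the repository's `run_test_cases_async`; the helper name `values_match` and the tolerance are assumptions made for illustration.

```python
import math
from typing import Any


def values_match(expected: Any, actual: Any, rel_tol: float = 1e-4) -> bool:
    """Recursively compare parsed test-case values.

    Hypothetical helper: floats compare with a tolerance, and NaN matches NaN.
    """
    if isinstance(expected, bool) or isinstance(actual, bool):
        # bool is a subclass of int, so compare booleans exactly.
        return type(expected) is type(actual) and expected == actual
    if isinstance(expected, (int, float)) and isinstance(actual, (int, float)):
        e, a = float(expected), float(actual)
        if math.isnan(e) or math.isnan(a):
            return math.isnan(e) and math.isnan(a)
        return math.isclose(e, a, rel_tol=rel_tol)
    if isinstance(expected, dict) and isinstance(actual, dict):
        return expected.keys() == actual.keys() and all(
            values_match(expected[k], actual[k], rel_tol) for k in expected
        )
    if isinstance(expected, list) and isinstance(actual, list):
        return len(expected) == len(actual) and all(
            values_match(e, a, rel_tol) for e, a in zip(expected, actual)
        )
    return expected == actual


expected = {"logprob": float("nan"), "text": "<unk>"}
actual = {"logprob": float("nan"), "text": "<unk>"}
print(values_match(expected, actual))  # True
```

A relative tolerance is used here because logprobs can differ slightly across hardware or library versions; whether the real test harness tolerates such differences is not shown in this commit.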
