Commit 28b900e

Merge pull request #11 from HumanCompatibleAI/generated-docs-schema
Improvements to fake data generator, using pydantic models to generate docs examples
2 parents 5d3c912 + f28b495 commit 28b900e

File tree

3 files changed

+133
-72
lines changed

docs/api_reference.md

Lines changed: 83 additions & 57 deletions
Original file line numberDiff line numberDiff line change
@@ -6,80 +6,106 @@ Your ranker should be implemented as a service that accepts an HTTP POST request
66

77
## Request/response format
88

9-
_NOTE: This is provisional, and will almost certainly change._
10-
119
Your ranker should accept a list of social media posts and comments, each with a corresponding ID, in JSON format:
1210

11+
### Request
12+
13+
(this example is a single post with two threaded comments)
14+
1315
```jsonc
1416
{
15-
"session": {
16-
"user_id": "193a9e01-8849-4e1f-a42a-a859fa7f2ad3",
17-
"user_name_hash": "6511c5688bbb87798128695a283411a26da532df06e6e931a53416e379ddda0e",
18-
"platform": "reddit",
19-
"current_time": "2024-01-20 18:41:20",
17+
"session": {
18+
"user_id": "1cfe49e5-02b6-4e58-a376-4b254a62650e",
19+
"user_name_hash": "0af8c7486e97a23b4631283970f55a3c51338cbf7a7748ca39449a895822be84",
20+
"platform": "reddit",
21+
"current_time": "2024-04-09T19:29:38.072017Z"
22+
},
23+
"items": [
24+
{
25+
"id": "fde9c535-2d98-45db-b2d9-c3f8c4de0330",
26+
"post_id": null,
27+
"parent_id": null,
28+
"title": null,
29+
"text": "Sed error repellat minima ex. Numquam recusandae unde perspiciatis quasi suscipit. Natus repellat voluptate nostrum vel.",
30+
"author_name_hash": "2e7a2066f0d892ecfd656fa64c1081aa9c6778fb0d22217240a62377435c9ace",
31+
"type": "post",
32+
"created_at": "2024-04-09T19:29:38.071245Z",
33+
"engagements": {
34+
"upvote": 16,
35+
"downvote": 38,
36+
"comment": 46,
37+
"award": 4
38+
}
2039
},
21-
"items": [
22-
{
23-
"id": "de83fc78-d648-444e-b20d-853bf05e4f0e",
24-
"title": "this is the post title, available only on reddit",
25-
"text": "this is a social media post",
26-
"author_name_hash": "60b46b7370f80735a06b7aa8c4eb6bd588440816b086d5ef7355cf202a118305",
27-
"type": "post",
28-
"created_at": "2023-12-06 17:02:11",
29-
"enagements": {
30-
"upvote": 34,
31-
"downvote": 27
32-
}
33-
},
34-
{
35-
"id": "a4c08177-8db2-4507-acc1-1298220be98d",
36-
"parent_id": "", // this is a top-level comment
37-
"post_id": "de83fc78-d648-444e-b20d-853bf05e4f0e",
38-
"text": "this is a comment, by the author of the post",
39-
"author_name_hash": "60b46b7370f80735a06b7aa8c4eb6bd588440816b086d5ef7355cf202a118305",
40-
"type": "comment",
41-
"created_at": "2023-12-08 11:32:12",
42-
"enagements": {
43-
"upvote": 3,
44-
"downvote": 5
45-
}
46-
},
47-
{
48-
"id": "06fb0b62-2501-40f1-a152-db019d03d2e6",
49-
"parent_id": "a4c08177-8db2-4507-acc1-1298220be98d",
50-
"post_id": "de83fc78-d648-444e-b20d-853bf05e4f0e",
51-
"text": "this is a reply to the first comment",
52-
"author_name_hash": "60b46b7370f80735a06b7aa8c4eb6bd588440816b086d5ef7355cf202a118305",
53-
"type": "comment",
54-
"created_at": "2023-12-08 11:32:12",
55-
"enagements": {
56-
"upvote": 3,
57-
"downvote": 5
58-
}
59-
}
60-
]
40+
{
41+
"id": "1d4d65c1-32bc-486b-bb44-761f33820f12",
42+
"post_id": "fde9c535-2d98-45db-b2d9-c3f8c4de0330",
43+
"parent_id": null,
44+
"title": null,
45+
"text": "Incidunt temporibus at maiores ratione eveniet facere. Eligendi nulla ipsa. Temporibus ex magnam voluptate enim laborum quod.",
46+
"author_name_hash": "e601eae141746a9677174503e03ee41298f8b1e89ba63565edf4ed0553fdd40a",
47+
"type": "comment",
48+
"created_at": "2024-04-09T19:29:38.071843Z",
49+
"engagements": {
50+
"upvote": 38,
51+
"downvote": 2,
52+
"comment": 9,
53+
"award": 11
54+
}
55+
},
56+
{
57+
"id": "ceb75c43-a4f6-4426-a7af-5b178a6fc19a",
58+
"post_id": "fde9c535-2d98-45db-b2d9-c3f8c4de0330",
59+
"parent_id": "1d4d65c1-32bc-486b-bb44-761f33820f12",
60+
"title": null,
61+
"text": "Nemo suscipit consequuntur officia blanditiis repellendus dolor neque. Dolore reiciendis adipisci reprehenderit blanditiis ad iste hic.",
62+
"author_name_hash": "911fb438baa1eb6bbb28b4af3419150fbc44409f5129c400ef4ab58c02102a6b",
63+
"type": "comment",
64+
"created_at": "2024-04-09T19:29:38.071940Z",
65+
"engagements": {
66+
"upvote": 18,
67+
"downvote": 0,
68+
"comment": 29,
69+
"award": 36
70+
}
71+
}
72+
]
6173
}
6274
```
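The `post_id` and `parent_id` fields in the request above are enough to rebuild the comment thread. The following is a minimal sketch, not part of the repository: `build_threads` is a hypothetical helper, the short IDs stand in for real UUIDs, and it assumes a top-level comment carries `parent_id: null` and hangs directly off its post, as in the example.

```python
from collections import defaultdict


def build_threads(items):
    """Map each parent ID to the IDs of its direct children.

    Hypothetical helper for illustration. A comment whose parent_id is
    None is treated as a top-level comment and attached to its post.
    """
    children = defaultdict(list)
    for item in items:
        if item["type"] != "comment":
            continue  # posts have no parent
        parent = item["parent_id"] or item["post_id"]
        children[parent].append(item["id"])
    return dict(children)


# Same shape as the request example: one post with two threaded comments.
items = [
    {"id": "post-1", "type": "post", "post_id": None, "parent_id": None},
    {"id": "c-1", "type": "comment", "post_id": "post-1", "parent_id": None},
    {"id": "c-2", "type": "comment", "post_id": "post-1", "parent_id": "c-1"},
]
threads = build_threads(items)
print(threads)
```

A ranker that wants to keep replies next to their parents could walk this mapping depth-first when producing its output order.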
6375

76+
### Response
77+
6478
Your ranker should return an ordered list of IDs. You can remove an item by omitting its ID, or add an item by inserting a new ID that you generate. Only posts can be inserted; for each new post, also provide its URL in `new_items`.
6579

6680
```jsonc
6781
{
68-
"ranked_ids": [
69-
"de83fc78-d648-444e-b20d-853bf05e4f0e",
70-
"571775f3-2564-4cf5-b01c-f4cb6bab461b"
71-
],
72-
"new_items": [
73-
{
74-
"id": "571775f3-2564-4cf5-b01c-f4cb6bab461b",
75-
"url": "https://reddit.com/r/PRCExample/comments/1f33ead/example_to_insert",
76-
}
77-
]
82+
"ranked_ids": [
83+
"fde9c535-2d98-45db-b2d9-c3f8c4de0330",
84+
"1d4d65c1-32bc-486b-bb44-761f33820f12",
85+
"c9c0ea77-7501-4b34-b1a3-f56e41a14f44",
86+
"10f32cf7-4566-41f9-b07b-6655f4f7fe46"
87+
],
88+
"new_items": [
89+
{
90+
"id": "c9c0ea77-7501-4b34-b1a3-f56e41a14f44",
91+
"url": "https://reddit.com/r/PRCExample/comments/1f33ead/example_to_insert"
92+
},
93+
{
94+
"id": "10f32cf7-4566-41f9-b07b-6655f4f7fe46",
95+
"url": "https://reddit.com/r/PRCExample/comments/1f33ead/another_example"
96+
}
97+
]
7898
}
7999
```
80100

81101
You do not need to return the same number of content items as you received. However, keep in mind that making a significant change in the number of items could have a negative impact on the user experience.
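As an illustration of the request/response contract, here is a minimal sketch of a ranker that works on plain dictionaries. The function name `rank_by_net_upvotes` and the scoring rule (upvotes minus downvotes) are assumptions made for this example, as is the choice to return an empty `new_items` list when nothing is inserted.

```python
def rank_by_net_upvotes(request):
    """Order items by upvotes minus downvotes, highest first.

    Illustrative only: the scoring rule is an assumption, and missing
    engagement counts are treated as zero.
    """
    def score(item):
        engagements = item.get("engagements", {})
        return engagements.get("upvote", 0) - engagements.get("downvote", 0)

    ranked = sorted(request["items"], key=score, reverse=True)
    return {
        "ranked_ids": [item["id"] for item in ranked],
        "new_items": [],  # assumption: nothing is inserted in this sketch
    }


# Engagement counts taken from the request example; short IDs are placeholders.
request = {
    "items": [
        {"id": "post-1", "engagements": {"upvote": 16, "downvote": 38}},
        {"id": "c-1", "engagements": {"upvote": 38, "downvote": 2}},
        {"id": "c-2", "engagements": {"upvote": 18, "downvote": 0}},
    ]
}
response = rank_by_net_upvotes(request)
print(response["ranked_ids"])
```

This sketch returns every item it received. Dropping an item would simply mean leaving its ID out of `ranked_ids`.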
82102

103+
## Pydantic models
104+
105+
We have a set of pydantic models, which are the source of truth for the API format. Using them, you can encode, parse, and validate the request and response JSON. You can also use them natively in FastAPI. The examples above were generated from these models.
106+
107+
You can always find the most current version in [examples/models](https://github.com/HumanCompatibleAI/ranking-challenge/tree/main/examples/models).
108+
83109
## Request fields
84110

85111
### Session fields

examples/models/fake.py

Lines changed: 44 additions & 13 deletions
Original file line numberDiff line numberDiff line change
@@ -18,31 +18,62 @@
1818
from models.request import ContentItem, RankingRequest, Session
1919
from models.response import RankingResponse
2020

21-
def fake_request(n_items=1):
21+
def fake_request(n_posts=1, n_comments=0, platform="reddit"):
22+
posts = [fake_item(platform=platform, type="post") for _ in range(n_posts)]
23+
comments = []
24+
for post in posts:
25+
last_comment_id = None
26+
for _ in range(n_comments):
27+
comments.append(fake_item(platform=platform, type="comment", post_id=post.id, parent_id=last_comment_id))
28+
last_comment_id = comments[-1].id
29+
2230
return RankingRequest(
2331
session=Session(
2432
user_id=str(uuid4()),
2533
user_name_hash=hashlib.sha256(fake.name().encode()).hexdigest(),
26-
platform="reddit",
34+
platform=platform,
2735
current_time=time.time(),
2836
),
29-
items=[fake_item() for _ in range(n_items)]
30-
37+
items=posts + comments,
3138
)
3239

33-
def fake_item():
40+
def fake_item(platform="reddit", type="post", post_id=None, parent_id=None):
41+
if platform == "reddit":
42+
engagements = {
43+
"upvote": randint(0, 50),
44+
"downvote": randint(0, 50),
45+
"comment": randint(0, 50),
46+
"award": randint(0, 50)}
47+
elif platform == "twitter":
48+
engagements = {
49+
"like": randint(0, 50),
50+
"retweet": randint(0, 50),
51+
"comment": randint(0, 50),
52+
"share": randint(0, 50)}
53+
elif platform == "facebook":
54+
engagements = {
55+
"like": randint(0, 50),
56+
"love": randint(0, 50),
57+
"care": randint(0, 50),
58+
"haha": randint(0, 50),
59+
"wow": randint(0, 50),
60+
"sad": randint(0, 50),
61+
"angry": randint(0, 50),
62+
"comment": randint(0, 50),
63+
"share": randint(0, 50)
64+
}
65+
else:
66+
raise ValueError(f"Unknown platform: {platform}")
67+
3468
return ContentItem(
3569
id=str(uuid4()),
3670
text=fake.text(),
71+
post_id=post_id,
72+
parent_id=parent_id,
3773
author_name_hash=hashlib.sha256(fake.name().encode()).hexdigest(),
38-
type="post",
74+
type=type,
3975
created_at=time.time(),
40-
engagements={
41-
"upvote": randint(0, 50),
42-
"downvote": randint(0, 50),
43-
"comment": randint(0, 50),
44-
"award": randint(0, 50)
45-
},
76+
engagements=engagements,
4677
)
4778

4879
def fake_response(ids, n_new_items=1):
@@ -63,7 +94,7 @@ def fake_new_item():
6394

6495
# if run from command line
6596
if __name__ == "__main__":
66-
request = fake_request(3)
97+
request = fake_request(n_posts=1, n_comments=2)
6798
print("Request:")
6899
print(request.model_dump_json(indent=2))
69100

examples/models/fake_test.py

Lines changed: 6 additions & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -13,13 +13,17 @@
1313

1414
def test_fake_request():
1515
# this test's purpose is mostly to run the code to make sure it doesn't
16-
# have any validation errors
17-
request = fake.fake_request(5)
16+
# have any validation errors. pydantic will make sure it has the right fields.
17+
request = fake.fake_request(n_posts=5)
1818
assert len(request.items) == 5
1919

2020
# all ids are unique
2121
assert len(set(item.id for item in request.items)) == 5
2222

23+
request = fake.fake_request(n_posts=5, n_comments=2, platform="twitter")
24+
assert len(request.items) == 15
25+
assert request.session.platform == "twitter"
26+
2327

2428
def test_fake_response():
2529
ids = [str(i) for i in range(5)]
