Commit 6d79082

paulz, tkersey, and Copilot authored
Update documentation and Gemfile.lock for consistency and clarity (#7)
* Update documentation and Gemfile.lock for consistency and clarity

* Apply suggestions from code review

Co-authored-by: Copilot <[email protected]>

* Fix some typos

Co-authored-by: Copilot <[email protected]>

---------

Co-authored-by: Paul Zabelin <[email protected]>
Co-authored-by: Tim Kersey <[email protected]>
Co-authored-by: Copilot <[email protected]>
1 parent aecd5b1 commit 6d79082

6 files changed: +53 -56 lines changed

docs/Gemfile.lock

Lines changed: 18 additions & 18 deletions
@@ -13,15 +13,15 @@ GEM
 http_parser.rb (~> 0)
 eventmachine (1.2.7)
 ffi (1.17.1)
-ffi (1.17.1-aarch64-linux-gnu)
+ffi (1.17.1-aarch64-linux)
 ffi (1.17.1-aarch64-linux-musl)
-ffi (1.17.1-arm-linux-gnu)
+ffi (1.17.1-arm-linux)
 ffi (1.17.1-arm-linux-musl)
 ffi (1.17.1-arm64-darwin)
-ffi (1.17.1-x86-linux-gnu)
+ffi (1.17.1-x86-linux)
 ffi (1.17.1-x86-linux-musl)
 ffi (1.17.1-x86_64-darwin)
-ffi (1.17.1-x86_64-linux-gnu)
+ffi (1.17.1-x86_64-linux)
 ffi (1.17.1-x86_64-linux-musl)
 forwardable-extended (2.6.0)
 google-protobuf (4.29.3)
@@ -99,35 +99,35 @@ GEM
 sass-embedded (1.83.4)
 google-protobuf (~> 4.29)
 rake (>= 13)
-sass-embedded (1.83.4-aarch64-linux-android)
+sass-embedded (1.83.4-aarch64-linux)
 google-protobuf (~> 4.29)
-sass-embedded (1.83.4-aarch64-linux-gnu)
+sass-embedded (1.83.4-aarch64-linux-android)
 google-protobuf (~> 4.29)
 sass-embedded (1.83.4-aarch64-linux-musl)
 google-protobuf (~> 4.29)
 sass-embedded (1.83.4-aarch64-mingw-ucrt)
 google-protobuf (~> 4.29)
-sass-embedded (1.83.4-arm-linux-androideabi)
+sass-embedded (1.83.4-arm-linux)
 google-protobuf (~> 4.29)
-sass-embedded (1.83.4-arm-linux-gnueabihf)
+sass-embedded (1.83.4-arm-linux-androideabi)
 google-protobuf (~> 4.29)
 sass-embedded (1.83.4-arm-linux-musleabihf)
 google-protobuf (~> 4.29)
 sass-embedded (1.83.4-arm64-darwin)
 google-protobuf (~> 4.29)
-sass-embedded (1.83.4-riscv64-linux-android)
+sass-embedded (1.83.4-riscv64-linux)
 google-protobuf (~> 4.29)
-sass-embedded (1.83.4-riscv64-linux-gnu)
+sass-embedded (1.83.4-riscv64-linux-android)
 google-protobuf (~> 4.29)
 sass-embedded (1.83.4-riscv64-linux-musl)
 google-protobuf (~> 4.29)
 sass-embedded (1.83.4-x86_64-cygwin)
 google-protobuf (~> 4.29)
 sass-embedded (1.83.4-x86_64-darwin)
 google-protobuf (~> 4.29)
-sass-embedded (1.83.4-x86_64-linux-android)
+sass-embedded (1.83.4-x86_64-linux)
 google-protobuf (~> 4.29)
-sass-embedded (1.83.4-x86_64-linux-gnu)
+sass-embedded (1.83.4-x86_64-linux-android)
 google-protobuf (~> 4.29)
 sass-embedded (1.83.4-x86_64-linux-musl)
 google-protobuf (~> 4.29)
@@ -137,29 +137,29 @@ GEM
 webrick (1.9.1)
 
 PLATFORMS
+aarch64-linux
 aarch64-linux
 aarch64-linux-android
-aarch64-linux-gnu
 aarch64-linux-musl
 aarch64-mingw-ucrt
+arm-linux
+arm-linux
 arm-linux-androideabi
-arm-linux-gnu
-arm-linux-gnueabihf
 arm-linux-musl
 arm-linux-musleabihf
 arm64-darwin
+riscv64-linux
 riscv64-linux-android
-riscv64-linux-gnu
 riscv64-linux-musl
 ruby
 x86-linux
-x86-linux-gnu
+x86-linux
 x86-linux-musl
 x86_64-cygwin
 x86_64-darwin
 x86_64-linux
+x86_64-linux
 x86_64-linux-android
-x86_64-linux-gnu
 x86_64-linux-musl
 
 DEPENDENCIES

docs/_config.yml

Lines changed: 1 addition & 1 deletion
@@ -25,7 +25,7 @@ description: >- # this means to ignore newlines until "baseurl:"
 baseurl: "/continuous-alignment-testing" # the subpath of your site, e.g. /blog
 url: "https://thisisartium.github.io/continuous-alignment-testing" # the base hostname & protocol for your site, e.g. http://example.com
 # twitter_username: jekyllrb
-# github_username: jekyll
+github_username: thisisartium
 show_excerpts: false
 # Build settings
 theme: minima

docs/about.markdown

Lines changed: 8 additions & 10 deletions
@@ -1,18 +1,16 @@
 ---
 layout: page
 title: About
-permalink: /about/
 ---
 
-This is the base Jekyll theme. You can find out more info about customizing your Jekyll theme, as well as basic Jekyll usage documentation at [jekyllrb.com](https://jekyllrb.com/)
+## Overview
 
-You can find the source code for Minima at GitHub:
-[jekyll][jekyll-organization] /
-[minima](https://github.com/jekyll/minima)
+CAT Harness provides the infrastructure needed to:
 
-You can find the source code for Jekyll at GitHub:
-[jekyll][jekyll-organization] /
-[jekyll](https://github.com/jekyll/jekyll)
+- Run and track CAT tests against LLM outputs
+- Store and analyze test results over time
+- Monitor changes in LLM behavior as prompts/models/data evolve
+- Integrate validation into CI/CD pipelines
 
-
-[jekyll-organization]: https://github.com/jekyll
+[Getting Started](getting-started.html)
+[Reference](api/index.html)

docs/getting-started.md

Lines changed: 4 additions & 3 deletions
@@ -7,13 +7,14 @@ title: Getting Started
 
 ## Poetry
 ```sh
-poetry install cat-ai
-
+poetry add cat-ai
+```
 ## UV
 
+```sh
 uv add cat-ai
 ```
 
 # Driving out non-deterministic projects with CAT
 
-Let's do a step by step journey through the lifecycle of a project to show how and why to use CAT. We will use an example of a project using an LLM and prompt to give recommendations of software teams for a project. The first step will be working with the prompt and LLM in [local development](local-development.md)
+Let's do a step by step journey through the lifecycle of a project to show how and why to use CAT. We will use an example of a project using an LLM and prompt to give recommendations of software teams for a project. The first step will be working with the prompt and LLM in [local development](local-development.html)

docs/index.markdown

Lines changed: 3 additions & 10 deletions
@@ -5,14 +5,7 @@
 layout: home
 ---
 
-## Overview
+## Index
 
-CAT Harness provides the infrastructure needed to:
-
-- Run and track CAT tests against LLM outputs
-- Store and analyze test results over time
-- Monitor changes in LLM behavior as prompts/models/data evolve
-- Integrate validation into CI/CD pipelines
-
-[Getting Started](getting-started.html)
-[Refernece](api/index.html)
+- [Getting Started](getting-started.html)
+- [API Reference](api/index.html)

docs/local-development.md

Lines changed: 19 additions & 14 deletions
@@ -8,20 +8,25 @@ The first step will be just to be able to run the first version of your prompt a
 Imagine we have a python project called `team_recommender` where we recommend teams of developers to be used on a given project. The basic structure looks like this:
 
 ```
-team_recommender/
-├── README.md
-├── requirements.txt
-├── src/
-│ ├── __init__.py
-│ ├── main.py
-│ └── utils.py
-└── tests/
-├── fixtures/
-| ├── example_output.json
-| └── skills.json
-├── __init__.py
-├── test_allocations.py
+examples/team_recommender
+├── conftest.py
+├── readme.md
+└── tests
+├── example_0_text_output
+├── example_1_unit
+│   └── test_allocations_unit.py
+├── example_2_loop
+│   └── test_allocations_loop.py
+├── example_3_loop_no_hallucinating
+│   └── test_allocations_hallucinating.py
+├── example_4_gate_on_success_threshold
+│   └── test_allocations_threshold.py
+├── fixtures
+│   ├── example_output.json
+│   ├── output_schema.json
+│   └── skills.json
 └── settings.py
+
 ```
 
 ## Single Test
@@ -457,4 +462,4 @@ O.k! Great! Lets look at our second failure:
 }
 }
 ```
-WOW! We didn't get any developers at all. Great! We can work with this! From here we can update our prompt to be more reslient. Once we make our updates, we will want to make sure these promblems are decreasing and not not regressing over time. Obviously, that isn't something you would try to control on your local machine, and the amount of test runs to get statisticle confidence about the rates of failure/hallucination are staying low. The best surface to gate and monitor this is going to be in your [Continous Integration](running-in-ci.md).
+WOW! We didn't get any developers at all. Great! We can work with this! From here we can update our prompt to be more resilient. Once we make our updates, we will want to make sure these problems are decreasing and not not regressing over time. Obviously, that isn't something you would try to control on your local machine, and the amount of test runs to get statisticle confidence about the rates of failure/hallucination are staying low. The best surface to gate and monitor this is going to be in your [Continous Integration](running-in-ci.html).
