`weco` is a command-line interface for interacting with Weco AI's code optimizer, powered by [AI-Driven Exploration](https://arxiv.org/abs/2502.13138). It helps you automate the improvement of your code for tasks like GPU kernel optimization, feature engineering, model development, and prompt engineering.
Weco systematically optimizes your code, guided directly by your evaluation metrics.

Example applications include:

- **GPU Kernel Optimization**: Reimplement PyTorch functions using CUDA, Triton, or Metal, optimizing for `latency`, `throughput`, or `memory_bandwidth`.
- **Model Development**: Tune feature transformations or architectures, optimizing for `validation_accuracy`, `AUC`, or `Sharpe Ratio`.
- **Prompt Engineering**: Refine prompts for LLMs, optimizing for `win_rate`, `relevance`, or `format_adherence`.
The `weco` CLI leverages a tree search approach guided by Large Language Models.

---

## Example Use Cases

Here's how `weco` can be applied to common ML engineering tasks:
*   **GPU Kernel Optimization:**
    *   **Goal:** Improve the speed or efficiency of low-level GPU code.
    *   **How:** `weco` iteratively refines CUDA, Triton, Metal, or other kernel code specified in your `--source` file.
    *   **`--eval-command`:** Typically runs a script that compiles the kernel, executes it, and benchmarks performance (e.g., latency, throughput).
    *   **`--metric`:** Examples include `latency`, `throughput`, `TFLOPS`, `memory_bandwidth`. Optimize to `minimize` latency or `maximize` throughput.
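As a rough sketch of what such an `--eval-command` script might look like, the snippet below times a candidate function and prints the metric to stdout (the kernel here is a trivial CPU stand-in so the example runs anywhere; the metric name and output format are illustrative assumptions, not prescribed by `weco`):

```python
import time

def candidate_kernel(x):
    # Stand-in for the GPU kernel under optimization; a trivial
    # CPU computation so this sketch is runnable without a GPU.
    return [v * 2.0 for v in x]

def benchmark(fn, data, iters=100):
    # Warm up once, then average wall-clock time over repeated runs.
    fn(data)
    start = time.perf_counter()
    for _ in range(iters):
        fn(data)
    return (time.perf_counter() - start) / iters

if __name__ == "__main__":
    data = [float(i) for i in range(1024)]
    latency_s = benchmark(candidate_kernel, data)
    # The evaluation script reports the score on stdout.
    print(f"latency: {latency_s * 1e3:.4f} ms")
```

For a real GPU kernel you would synchronize the device before stopping the timer, since kernel launches are asynchronous.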
*   **Feature Engineering:**
    *   **Goal:** Discover better data transformations or feature combinations for your machine learning models.
    *   **How:** `weco` explores different processing steps or parameters within your feature transformation code (`--source`).
    *   **`--eval-command`:** Executes a script that applies the features, trains/validates a model using those features, and prints a performance score.
    *   **`--metric`:** Examples include `accuracy`, `AUC`, `F1-score`, `validation_loss`. Usually optimized to `maximize` accuracy/AUC/F1 or `minimize` loss.
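A minimal sketch of this pattern, using a toy feature and a toy threshold "model" so the script is self-contained (the function names, data, and score format are hypothetical; in practice `transform` would hold the feature code `weco` mutates):

```python
def transform(row):
    # Feature code the optimizer would edit; a toy ratio feature here.
    return row[0] / (row[1] + 1.0)

def validate(rows, labels):
    # Toy "model": threshold the single feature at 1.0 and
    # measure classification accuracy on held-out data.
    preds = [1 if transform(r) > 1.0 else 0 for r in rows]
    correct = sum(p == y for p, y in zip(preds, labels))
    return correct / len(labels)

if __name__ == "__main__":
    rows = [(4.0, 1.0), (1.0, 3.0), (6.0, 2.0), (0.5, 2.0)]
    labels = [1, 0, 1, 0]
    print(f"accuracy: {validate(rows, labels):.4f}")
```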
*   **Model Development:**
    *   **Goal:** Tune hyperparameters or experiment with small architectural changes directly within your model's code.
    *   **How:** `weco` modifies hyperparameter values (like learning rate, layer sizes if defined in the code) or structural elements in your model definition (`--source`).
    *   **`--eval-command`:** Runs your model training and evaluation script, printing the key performance indicator.
    *   **`--metric`:** Examples include `validation_accuracy`, `test_loss`, `inference_time`, `perplexity`. Optimize according to the metric's nature (e.g., `maximize` accuracy, `minimize` loss).
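To make "hyperparameters defined in the code" concrete, here is a toy training script whose module-level constants are the values an optimizer could edit in place (the quadratic objective and constant names are illustrative assumptions):

```python
# Hyperparameters defined directly in the source file, where an
# optimizer could rewrite them between evaluation runs.
LEARNING_RATE = 0.1
STEPS = 50

def train_and_eval():
    # Toy model: fit w to minimize (w - 3)^2 by gradient descent.
    w = 0.0
    for _ in range(STEPS):
        grad = 2.0 * (w - 3.0)
        w -= LEARNING_RATE * grad
    return (w - 3.0) ** 2  # stand-in for a validation loss

if __name__ == "__main__":
    print(f"validation_loss: {train_and_eval():.6f}")
```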
*   **Prompt Engineering:**
    *   **Goal:** Refine prompts used within larger systems (e.g., for LLM interactions) to achieve better or more consistent outputs.
    *   **How:** `weco` modifies prompt templates, examples, or instructions stored in the `--source` file.
    *   **`--eval-command`:** Executes a script that uses the prompt, generates an output, evaluates that output against desired criteria (e.g., using another LLM, checking for keywords, format validation), and prints a score.
    *   **`--metric`:** Examples include `quality_score`, `relevance`, `task_success_rate`, `format_adherence`. Usually optimized to `maximize`.
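A sketch of the format-validation flavor of such a script: the LLM call below is a placeholder (no real model is invoked), and the prompt text and scoring rule are illustrative assumptions; the optimizer would edit `PROMPT` and read the printed score.

```python
PROMPT = "List three fruits, one per line."

def fake_llm(prompt):
    # Placeholder for a real model call (e.g., an API request).
    return "apple\nbanana\ncherry"

def score_format(output):
    # Criterion: exactly three non-empty lines.
    lines = [l for l in output.splitlines() if l.strip()]
    return 1.0 if len(lines) == 3 else 0.0

if __name__ == "__main__":
    print(f"format_adherence: {score_format(fake_llm(PROMPT)):.2f}")
```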