feat: Inference mock #29 (Merged)

Changes from all commits (12 commits, all by iamemilio):
- af81a64 feat: mock inference server for notebook testing
- 4772ea3 inference mock composite action
- e74fa45 feat: contribute to readme
- be4ec36 fix: make the linters happy with inference-mock
- c9b5111 fix: remove config.yml from inference-mock
- 9f4b729 fix: additional linting fixes for inference-mock
- f01cc72 fix: inference mock not running as background
- 42b46f5 feat: inference mock unit tests
- 8576f26 fix: inference-mock code review changes
- 5e81a59 fix: ruff is pretty rough man
- 6e20376 isort'ed it out :')
- 02790f9 fix: add pyyaml to inference-mock requirements
.github/workflows/inference-mock.yml (new file, 35 lines)
```yaml
name: Inference Mock Tests

on:
  pull_request:
    branches:
      - "main"
      - "release-**"
    paths:
      - 'actions/inference-mock/**'
      - '.github/workflows/inference-mock.yml' # This workflow

jobs:
  inference-mock-unit-tests:
    runs-on: ubuntu-latest
    strategy:
      matrix:
        python-version: ["3.11"]
    steps:
      - uses: actions/checkout@v4
        with:
          sparse-checkout: |
            actions/inference-mock
      - name: Set up Python
        uses: actions/setup-python@v5
        with:
          python-version: ${{ matrix.python-version }}
      - name: Install dependencies
        working-directory: actions/inference-mock
        run: |
          python -m pip install --upgrade pip
          pip install -r requirements.txt
      - name: Run Unit Tests
        working-directory: actions/inference-mock
        run: |
          python -m unittest test/test.py
```
actions/inference-mock/README.md (new file, 62 lines)
# Inference Mock

## Overview

Inference Mock is a tool that creates a Flask server running as a background process. OpenAI-compatible calls can be made to its completions API. Based on how the server is configured, it sends back a set of programmed responses.
## When to Use It?

Notebooks are difficult to test because they rarely contain functions or unit tests. If you instead want to mock an LLM call and response, this is an easy way to rig that up in your testing environment. It works best for integration, unit, and smoke tests. It is obviously not a real inference service, so it is best suited to testing code that makes occasional calls to an LLM to do a task.
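For example, a test or notebook can call the mock exactly as it would a real completions endpoint. A minimal sketch using `requests` (hypothetical prompt and model; the route and default port come from `app.py` below):

```python
import requests

# POST an OpenAI-style completion request to the mock server.
resp = requests.post(
    "http://localhost:11434/v1/completions",
    json={"model": "gpt-3.5", "prompt": "hello"},
    timeout=10,
)
resp.raise_for_status()

# The configured matching strategies decide which canned response comes back.
print(resp.json()["choices"][0]["text"])
```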
## Usage

This is a composite action that can be referenced from any GitHub Actions workflow. First, you will need to create a config file. You can set the following fields:

```yaml
# debug: enable debug logging and debug mode in flask
# optional: defaults to False
debug: True

# port: the port the server will listen on
# optional: defaults to 11434
port: 11434

# matches: a list of matching strategies for expected sets of prompt/response pairs. The following strategies are available:
# - contains: accepts a list of substrings. An incoming prompt must contain all of the listed substrings for this match to be positive.
# - response: passing only a response is an `Always` match strategy. If no other strategy has matched yet, this will always be a positive match.
#
# note: the strategies are executed in the order listed, and the first successful match is accepted. If you start with an `Always`
# strategy, its response will be the only response ever returned.
matches:

  # this is an example of a `contains` strategy. If the prompt contains all of the substrings, it returns the response.
  - contains:
      - 'I need you to generate three questions that must be answered only with information contained in this passage, and nothing else.'
    response: '{"fact_single": "What are some common ways to assign rewards to partial answers?", "fact_single_answer": "There are three: prod, which takes the product of rewards across all steps; min, which selects the minimum reward over all steps; and last, which uses the reward from the final step.", "reasoning": "What is the best method for rewarding models?", "reasoning_answer": "That depends on whether the training data is prepared using MC rollout, human annotation, or model annotation.", "summary": "How does QWEN implement model reward?", "summary_answer": "Qwen computes the aggregate reward based on the entire partial reward trajectory. I also uses a method that feeds the performance reference model with partial answers, then only considering the final reward token."}'

  # this is an example of an `Always` strategy. It will always match and return this response.
  - response: "hi I am the default response"
```
This config must be passed to the action as an input. Here is an example of a workflow that invokes this action to create a mock server:

```yaml
jobs:
  example-job:
    runs-on: ubuntu-latest
    steps:
      - name: Checkout "inference-mock" in-house CI action
        uses: actions/checkout@11bd71901bbe5b1630ceea73d27597364c9af683 # v4.2.2
        with:
          repository: instructlab/ci-actions
          path: ci-actions
          sparse-checkout: |
            actions/inference-mock
      - name: Inference Mock
        uses: ./ci-actions/actions/inference-mock
        with:
          config: "example-config.yml"
```
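Once the action has started the server, later steps in the same job can call it directly. A minimal sketch of a follow-up step (assuming the default port of 11434; with the example config above, this prompt falls through to the `Always` response):

```yaml
      - name: Query the mock server
        shell: bash
        run: |
          # POST an OpenAI-style completion request; the default response comes back
          curl -sf http://localhost:11434/v1/completions \
            -H 'Content-Type: application/json' \
            -d '{"model": "gpt-3.5", "prompt": "hello"}'
```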
actions/inference-mock/action.yml (new file, 21 lines)
```yaml
name: 'Inference Mock'
description: 'Creates and runs a server that returns mock Open AI completions as a background process'
author: "InstructLab"

inputs:
  config:
    description: the path to a config.yml file for the inference mock server
    required: true
    type: string

runs:
  using: "composite"
  steps:
    - name: Install Dependencies
      shell: bash
      run: pip install -r ${{ github.action_path }}/requirements.txt
    - name: Run Inference Mock Server
      shell: bash
      run: |
        nohup python ${{ github.action_path }}/app.py --config ${{ inputs.config }} &
        sleep 2
```
actions/inference-mock/app.py (new file, 85 lines)
```python
# Standard
from dataclasses import dataclass
import logging
import pprint

# Third Party
from completions.completion import create_chat_completion
from flask import Flask, request  # type: ignore[import-not-found]
from matching.matching import Matcher
from werkzeug import exceptions  # type: ignore[import-not-found]
import click  # type: ignore[import-not-found]
import yaml

# Globals
app = Flask(__name__)
strategies: Matcher  # a read only list of matching strategies


# Routes
@app.route("/v1/completions", methods=["POST"])
def completions():
    data = request.get_json()
    if not data or "prompt" not in data:
        raise exceptions.BadRequest("prompt is empty or None")

    prompt = data.get("prompt")
    prompt_debug_str = prompt
    if len(prompt) > 90:
        prompt_debug_str = data["prompt"][:90] + "..."

    app.logger.debug(
        f"{request.method} {request.url} {data['model']} {prompt_debug_str}"
    )

    chat_response = strategies.find_match(
        prompt
    )  # handle prompt and generate correct response

    response = create_chat_completion(chat_response, model=data.get("model"))
    app.logger.debug(f"response: {pprint.pformat(response, compact=True)}")
    return response


# config
@dataclass
class Config:
    matches: list[dict]
    port: int = 11434
    debug: bool = False


@click.command()
@click.option(
    "-c",
    "--config",
    "config",
    type=click.File(mode="r", encoding="utf-8"),
    required=True,
    help="yaml config file",
)
def start_server(config):
    # get config
    yaml_data = yaml.safe_load(config)
    if not isinstance(yaml_data, dict):
        raise ValueError(f"config file {config} must be a set of key-value pairs")

    conf = Config(**yaml_data)

    # configure logger
    if conf.debug:
        app.logger.setLevel(logging.DEBUG)
        app.logger.debug("debug mode enabled")
    else:
        app.logger.setLevel(logging.INFO)

    # create match strategy object
    global strategies  # pylint: disable=global-statement
    strategies = Matcher(conf.matches)

    # init server
    app.run(debug=conf.debug, port=conf.port)


if __name__ == "__main__":
    start_server()  # pylint: disable=no-value-for-parameter
```
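The unit tests added in this PR are not shown in this diff view, but Flask's built-in test client gives a feel for exercising the route without a running server. A hypothetical sketch (it sets the module-level `strategies` global first, as `start_server` does):

```python
import app as inference_mock
from matching.matching import Matcher

# Configure the global match strategies, as start_server() would.
inference_mock.strategies = Matcher([{"response": "canned reply"}])

client = inference_mock.app.test_client()
resp = client.post(
    "/v1/completions",
    json={"model": "test-model", "prompt": "hello"},
)
assert resp.get_json()["choices"][0]["text"] == "canned reply"
```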
Empty file (likely completions/__init__.py, so the package import in app.py resolves).
actions/inference-mock/completions/completion.py (new file, 33 lines)
```python
# mock openAI completion responses
# credit: https://github.com/openai/openai-python/issues/715#issuecomment-1809203346
# License: MIT

# Standard
import random


# TODO: use a library to return and validate completions so this doesn't need to be maintained
def create_chat_completion(content: str, model: str = "gpt-3.5") -> dict:
    response = {
        "id": "chatcmpl-2nYZXNHxx1PeK1u8xXcE1Fqr1U6Ve",
        "object": "chat.completion",
        "created": "12345678",
        "model": model,
        "system_fingerprint": "fp_44709d6fcb",
        "choices": [
            {
                "text": content,
                "content": content,
                "index": 0,
                "logprobs": None,
                "finish_reason": "length",
            },
        ],
        "usage": {
            "prompt_tokens": random.randint(10, 500),
            "completion_tokens": random.randint(10, 500),
            "total_tokens": random.randint(10, 500),
        },
    }

    return response
```
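A quick sanity check of the returned shape (a sketch; the field values follow the hard-coded template above, and the token counts are randomized):

```python
resp = create_chat_completion("hello world", model="my-model")

assert resp["object"] == "chat.completion"
assert resp["model"] == "my-model"
assert resp["choices"][0]["text"] == "hello world"  # echoed mock content
assert 10 <= resp["usage"]["total_tokens"] <= 500   # random placeholder count
```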
Empty file (likely matching/__init__.py).
actions/inference-mock/matching/matching.py (new file, 99 lines)
```python
# Standard
from abc import abstractmethod
from typing import Protocol
import pprint


class Match(Protocol):
    """
    Match represents a single prompt matching
    strategy. When a match is successful,
    the response is what should be returned.
    """

    response: str

    @abstractmethod
    def match(self, prompt: str) -> str | None:
        raise NotImplementedError


class Always:
    """
    Always is a matching strategy that always
    is a positive match on a given prompt.

    This is best used when only one prompt response
    is expected.
    """

    def __init__(self, response: str):
        self.response = response

    def match(self, prompt: str) -> str | None:
        if prompt:
            return self.response
        return None


class Contains:
    """
    Contains is a matching strategy that checks
    if the prompt string contains all of
    the substrings in the `contains` attribute.
    """

    contains: list[str]

    def __init__(self, contains: list[str], response: str):
        if not contains or len(contains) == 0:
            raise ValueError("contains must not be empty")
        self.response = response
        self.contains = contains

    def match(self, prompt: str) -> str | None:
        if not prompt:
            return None
        for context in self.contains:
            if context not in prompt:
                return None

        return self.response


# helper function pulled out for easier testing
def to_match(pattern: dict) -> Match:
    response = pattern.get("response")
    if not response:
        raise ValueError(
            f"matching strategy must have a response: {pprint.pformat(pattern, compact=True)}"
        )
    if "contains" in pattern:
        return Contains(**pattern)
    return Always(**pattern)


class Matcher:
    """
    Matcher matches prompt context and then
    selects a user provided reply.
    """

    strategies: list[Match]

    def __init__(self, matching_patterns: list[dict]):
        if not matching_patterns:
            raise ValueError(
                "matching strategies must contain at least one Match strategy"
            )

        self.strategies: list[Match] = []
        for matching_pattern in matching_patterns:
            self.strategies.append(to_match(matching_pattern))

    def find_match(self, prompt: str) -> str:
        for strategy in self.strategies:
            response = strategy.match(prompt)
            if response:
                return response
        return ""
```
actions/inference-mock/requirements.txt (new file, 4 lines)
```
flask
werkzeug
click
pyyaml
```
Empty file (likely a test package placeholder).
Review comment:
Nice. Can we add unit tests for these new scripts in case others want to contribute to them in the future?