A modular NLP-to-motion pipeline that interprets natural language commands and translates them into robotic arm control actions, with environmental awareness and reasoning via a local LLM.
This middleware connects natural language input to a robotic control system. It performs:
- Intent Detection – Determines whether input requires robotic action.
- Object Matching – Parses environment data and finds relevant objects.
- LLM Reasoning – Uses a local LLM to generate a chain-of-thought plan.
- Feedback Loop – Handles uncertainty via clarification or rejection.
- Output Generation – Produces structured action steps (`output.json`).
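The stages above can be sketched end-to-end in a few lines. This is a minimal illustration, not the middleware's actual API: the function names, the keyword-based intent heuristic, and the plan format are all assumptions.

```python
# Illustrative sketch of the pipeline stages; names and heuristics are
# assumptions, not the middleware's actual API.
import json

def detect_intent(command: str) -> bool:
    """Crude keyword-based intent filter (placeholder heuristic)."""
    verbs = ("move", "pick", "place", "grab", "rotate")
    return any(command.lower().startswith(v) for v in verbs)

def match_objects(command: str, environment: list) -> list:
    """Return environment objects mentioned in the command."""
    return [obj for obj in environment if obj["name"] in command.lower()]

def plan_steps(command: str, objects: list) -> dict:
    """Stand-in for the LLM reasoning step; emits a structured plan."""
    return {
        "command": command,
        "objects": [o["name"] for o in objects],
        "steps": [f"approach {o['name']}" for o in objects],
    }

env = [{"name": "handy", "pos": [0.1, 0.2]}, {"name": "fruit", "pos": [0.4, 0.0]}]
cmd = "move the handy next to the fruit"
if detect_intent(cmd):
    plan = plan_steps(cmd, match_objects(cmd, env))
    print(json.dumps(plan, indent=2))
```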
```
robotic-arm-middleware/
│
├── app.py                  # Entry point of the middleware
│
├── input_layer/
│   └── intent_filter.py    # Filters whether command targets robotic arm
│
├── middleware/
│   └── pipeline.py         # Core logic: match, reasoning, output
│
├── utils/
│   └── env_loader.py       # environment.json loader
│
├── data/
│   ├── environment.json    # Current scene objects (input)
│   └── output.json         # Generated step-by-step plan (output)
│
├── llm/
│   └── dummy_llm.py        # Placeholder LLM call logic (replaceable)
│
└── README.md               # You're reading it
```
- Python 3.8+
- `openai` or your own LLM client SDK
- Your local model endpoint (e.g. LM Studio, Ollama)
Install dependencies:
```bash
pip install openai
```
- Place your `environment.json` in `data/`.
- Run the application:

  ```bash
  python app.py
  ```

- Enter a command:

  ```
  move the handy next to the fruit
  ```

- The generated plan will appear in `data/output.json`.
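An illustrative `environment.json` and loader might look like the following. The field names (`objects`, `name`, `position`) are assumptions about the scene format, not the actual schema used by `utils/env_loader.py`; adapt them to your data.

```python
# Hypothetical environment.json shape and a loader sketch; the real
# utils/env_loader.py may use different field names.
import json
import os
import tempfile

sample_env = {
    "objects": [
        {"name": "handy", "position": [0.10, 0.25, 0.05]},
        {"name": "fruit", "position": [0.40, 0.10, 0.05]},
    ]
}

def load_environment(path: str) -> list:
    """Read the scene file and return its list of objects."""
    with open(path, "r", encoding="utf-8") as f:
        return json.load(f).get("objects", [])

# Round-trip check with a temporary file standing in for data/environment.json
with tempfile.TemporaryDirectory() as d:
    path = os.path.join(d, "environment.json")
    with open(path, "w", encoding="utf-8") as f:
        json.dump(sample_env, f)
    objects = load_environment(path)
    print([o["name"] for o in objects])  # ['handy', 'fruit']
```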
Update `llm/dummy_llm.py` to point to your local or remote LLM API:

```python
from openai import OpenAI

client = OpenAI(base_url="http://localhost:1234/v1", api_key="your-key")
# For LM Studio:
# client = OpenAI(base_url="http://localhost:1234/v1", api_key="lm-studio")

response = client.chat.completions.create(...)
```
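A fuller sketch of a drop-in replacement for `dummy_llm.py` follows. This is one possible shape, not the project's actual implementation: the system prompt, endpoint URL, API key, and model name are all placeholders for your local server.

```python
# Sketch of a replaceable llm/dummy_llm.py; the prompt, endpoint, api_key,
# and model name are placeholders for your local server (LM Studio, Ollama, etc.).

SYSTEM_PROMPT = (
    "You are a robotic arm planner. Given a command and the scene objects, "
    "think step by step and return a numbered action plan."
)

def build_messages(command: str, environment_json: str) -> list:
    """Assemble the chat messages sent to the model."""
    return [
        {"role": "system", "content": SYSTEM_PROMPT},
        {"role": "user", "content": f"Command: {command}\nScene: {environment_json}"},
    ]

def query_llm(command: str, environment_json: str) -> str:
    """Call an OpenAI-compatible endpoint and return the plan text."""
    from openai import OpenAI  # deferred so the module imports without the SDK
    client = OpenAI(base_url="http://localhost:1234/v1", api_key="lm-studio")
    response = client.chat.completions.create(
        model="local-model",  # placeholder; set to your served model name
        messages=build_messages(command, environment_json),
    )
    return response.choices[0].message.content
```

Keeping message assembly in a separate `build_messages` function makes the prompt testable without a running model endpoint.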
- Add speech-to-text preprocessing layer
- Collision detection refinement
- Neurapy arm API integration
MIT License © 2025
