Add R-to-Python syscall workflow for sentiment analysis training and prediction in a Kaiaulu Notebook

This proposal outlines a workflow where users never leave the Kaiaulu R Notebook. Instead, a standalone Python script (for both training and prediction) is exposed to Kaiaulu via tools.yml and invoked from R using system calls.

This shifts the entire analysis workflow (data download, parsing, training, prediction, and results) into a single Kaiaulu Notebook.

<img width="1483" height="618" alt="Image" src="https://github.com/user-attachments/assets/80b76f6b-fff4-4bdf-95eb-13b6c50f5400" />

## Left side: Kaiaulu
R Scripts (R/sentiment.R)
**pysenti_train_model**(pysenti_path, reply_dt, model_save_path, model): Calls the Python script (train_or_predict.py), passing in a data table to train a sentiment model.
**pysenti_predict**(pysenti_path, reply_dt, model_save_path, model): Calls the same Python script, passing in a data table to predict sentiment on new data.
**get_pysenti_path**("pysenti", "../conf/tools.yml"): Fetches the path to the Python script from the configuration file (tools.yml).
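
To make the call contract concrete, here is a minimal sketch of the command line these wrappers might assemble before handing it to a system call. It is written in Python only to match the pysenti side; the R wrappers would build the equivalent argument vector. The interpreter name `python3`, the `train`/`predict` subcommands, the `--input`, `--model-path`, and `--model` flags, and the idea of passing `reply_dt` as a CSV file are all assumptions, since the proposal does not yet say how the data table crosses the process boundary.

```python
import os
import shlex


def build_pysenti_command(pysenti_path, mode, input_csv, model_save_path, model=0):
    """Assemble the argument vector for one call to train_or_predict.py.

    All flag names here are hypothetical; they only illustrate the contract.
    """
    if mode not in ("train", "predict"):
        raise ValueError(f"mode must be 'train' or 'predict', got {mode!r}")
    return [
        "python3",                          # assumed interpreter name
        os.path.expanduser(pysenti_path),   # tools.yml may store a '~' path
        mode,
        "--input", input_csv,               # reply_dt written to disk first (assumption)
        "--model-path", model_save_path,
        "--model", str(model),
    ]


cmd = build_pysenti_command(
    "~/path/to/train_or_predict.py", "train", "reply_dt.csv", "model.pkl"
)
print(shlex.join(cmd))  # the shell-quoted command a system call would run
```

Keeping the arguments as a list rather than one pasted string avoids quoting problems when paths contain spaces.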

Configuration (tools.yml)
Defines the path to the Python script for training/prediction:

`pysenti: ~/path/to/train_or_predict.py`

Vignette (vignettes/sentiment_analysis.Rmd)
Download data
Parse
Train or predict using pysenti_train_model() or pysenti_predict()
Load table with results

## Right side: pysenti
Script (exec/train_or_predict.py)
Handles train and predict commands.
Receives the model path and the parsed data table from R, then calls the corresponding Python function from the API functions below.
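
Before those functions are described, here is a minimal sketch of how the script's dispatch step might look. The subcommand and flag names are hypothetical, the table is assumed to arrive as a CSV file, and `train_model` and `predict_sentiment` are replaced by stand-ins so the block runs on its own; the real functions would be imported from api/model.py.

```python
import argparse
import csv


def train_model(parsed_dt, model_saved_path, model_select=0):
    """Stand-in for api/model.py's train_model: returns the save path."""
    return model_saved_path


def predict_sentiment(parsed_dt, model_saved_path, model_select=0):
    """Stand-in for the proposed predict_sentiment: returns the table."""
    return parsed_dt


def build_parser():
    """Two subcommands, train and predict, sharing the same (hypothetical) flags."""
    parser = argparse.ArgumentParser(prog="train_or_predict.py")
    sub = parser.add_subparsers(dest="command", required=True)
    for name in ("train", "predict"):
        cmd = sub.add_parser(name)
        cmd.add_argument("--input", required=True, help="CSV with Text and Polarity columns")
        cmd.add_argument("--model-path", required=True)
        cmd.add_argument("--model", type=int, default=0)
    return parser


def load_table(path):
    """Read the table written by R into a list of row dictionaries."""
    with open(path, newline="") as f:
        return list(csv.DictReader(f))


def dispatch(args, parsed_dt):
    """Route the parsed command to the matching API function."""
    if args.command == "train":
        return train_model(parsed_dt, args.model_path, args.model)
    return predict_sentiment(parsed_dt, args.model_path, args.model)


# Demo with an in-memory table instead of a file on disk.
table = [{"Text": "works great", "Polarity": ""}]
args = build_parser().parse_args(
    ["predict", "--input", "reply_dt.csv", "--model-path", "model.pkl"]
)
result = dispatch(args, table)
```

In the actual script, `load_table(args.input)` would supply `parsed_dt`, and the result would be written back to a location the notebook can read.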

API functions (api/model.py)
This function already exists:
**train_model**(parsed_dt, model_saved_path, model_select=0): Takes in a data table with columns "Text" and "Polarity" (Polarity is assumed to already hold the correct labels for training), trains a model, saves it to "model_saved_path", and returns that path.

We will add this function:
**predict_sentiment**(parsed_dt, model_saved_path, model_select=0): Takes in a data table with columns "Text" and "Polarity", fills in the predicted sentiment values, and returns that table.
This is similar to the existing test_model(), which predicts sentiment, compares it to the flat true labels, then combines both sets of labels into a data table and returns it. The new predict_sentiment() only predicts sentiment and returns a data table with the updated sentiment values.
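
The intended behaviour of predict_sentiment() can be sketched as follows. This is not pysenti's implementation: the "model" here is a toy JSON word list standing in for whatever train_model() actually saves, the table is a list of dictionaries rather than a real data table, and the "positive"/"negative" label values are assumptions. The point is only the shape of the function: load the saved model, predict from "Text", overwrite "Polarity", and never consult true labels.

```python
import json
import os
import tempfile


def load_model(model_saved_path):
    """Hypothetical loader: the toy 'model' is a JSON list of positive words."""
    with open(model_saved_path) as f:
        return set(json.load(f)["positive_words"])


def predict_sentiment(parsed_dt, model_saved_path, model_select=0):
    """Return a copy of the table with 'Polarity' replaced by predictions.

    model_select is accepted for signature parity but ignored by this toy model.
    """
    positive_words = load_model(model_saved_path)
    result = []
    for row in parsed_dt:
        tokens = row["Text"].lower().split()
        label = "positive" if any(t in positive_words for t in tokens) else "negative"
        # Copy each row so the caller's table is left untouched.
        result.append({**row, "Polarity": label})
    return result


with tempfile.TemporaryDirectory() as tmp:
    model_path = os.path.join(tmp, "model.json")
    with open(model_path, "w") as f:
        json.dump({"positive_words": ["great", "thanks"]}, f)

    replies = [
        {"Text": "Thanks, works great", "Polarity": ""},
        {"Text": "This patch breaks the build", "Polarity": ""},
    ]
    predicted = predict_sentiment(replies, model_path)

print([row["Polarity"] for row in predicted])  # ['positive', 'negative']
```

Unlike test_model(), nothing here compares predictions against existing labels, so the input "Polarity" column can be empty for new data.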
