AutoGluon Cloud Fails to Handle Multi-Partition Batch Transform Job Correctly #136

#### **Description:**

When running batch transform with `autogluon.cloud` and the `MultiRecord` strategy, the job appears to fail whenever the input CSV file is split into multiple partitions. The header is not handled properly across partitions, which leads to misaligned columns and prediction failures during inference.

#### **Steps to Reproduce:**
The following script can be used to reproduce the issue:

```python
from autogluon.cloud import TabularCloudPredictor
import pandas as pd

# Load datasets
train_data = pd.read_csv("https://autogluon.s3.amazonaws.com/datasets/Inc/train.csv")
test_data = pd.read_csv("https://autogluon.s3.amazonaws.com/datasets/Inc/test.csv")
test_data.drop(columns=['class'], inplace=True)

# Cloud Predictor Arguments
predictor_init_args = {"label": "class"}  
predictor_fit_args = {"train_data": train_data, "time_limit": 60}  

# Initialize Cloud Predictor and Fit
cloud_predictor = TabularCloudPredictor(cloud_output_path='tonyhu-autogluon')
cloud_predictor.fit(predictor_init_args=predictor_init_args, predictor_fit_args=predictor_fit_args)

# Batch Inference with small max_payload to force multiple partitions
result = cloud_predictor.predict(test_data, backend_kwargs={"transformer_kwargs": {"max_payload": 1}})
```

#### **Expected Behavior:**
The batch transform job should handle multiple partitions correctly: columns should be aligned across all partitions, whether or not an individual partition contains a header row.

#### **Observed Behavior:**
The job fails with the following error logs:
```
Bad HTTP status received from algorithm: 500
invalid literal for int() with base 10: '0.1': Error while type casting for column 'capital-loss'
```
Logs show that the columns are misaligned for certain partitions:
```
test_columns: [' 11th', ' Machine-op-inspct', ' Male', ' Never-married', ' Other-relative', ' Private', ' United-States', ' White', '0', '0.1', '207443', '50', '62', '7']
2024-09-13T21:56:19,062 [INFO ] W-9000-model_1.0-stdout MODEL_LOG - train_columns: ['age', 'capital-gain', 'capital-loss', 'education', 'education-num', 'fnlwgt', 'hours-per-week', 'marital-status', 'native-country', 'occupation', 'race', 'relationship', 'sex', 'workclass']
```
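This suggests that the CSV header is not being duplicated across partitions. In a partition without a header, the first data row appears to be parsed as the column names (note that `test_columns` contains data values rather than feature names), leading to column misalignment.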
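The suspected failure mode can be reproduced locally without SageMaker. The sketch below is only an illustration: the three-column CSV is a made-up miniature of the test data, and it assumes the inference code parses each payload with `pandas.read_csv` using default settings, which the logs do not confirm. It splits the text by lines so that only the first payload keeps the header, then parses each payload separately.

```python
import io

import pandas as pd

# Hypothetical miniature of the test CSV: a header plus three data rows.
csv_text = (
    "age,capital-gain,capital-loss\n"
    "62,0,0\n"
    "35,0,0\n"
    "41,0,0\n"
)
lines = csv_text.splitlines(keepends=True)

# Simulate line-based splitting: only the first payload keeps the header.
first_payload = "".join(lines[:2])
second_payload = "".join(lines[2:])

first = pd.read_csv(io.StringIO(first_payload))
second = pd.read_csv(io.StringIO(second_payload))

print(list(first.columns))   # ['age', 'capital-gain', 'capital-loss']
print(list(second.columns))  # ['35', '0', '0.1']
print(len(second))           # 1 (the first data row was consumed as the header)
```

In the second payload, pandas renames the duplicated column name `0` to `0.1`. That matches the `'0'` and `'0.1'` entries in `test_columns` above and the `'0.1'` in the type-casting error, which is consistent with a data row being read as the header.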

#### **Environment:**
- `autogluon==1.1.0`
- Running batch transform in SageMaker with `MultiRecord` strategy.
- `MaxPayloadInMB=1` is set to ensure multiple partitions.

#### **Additional Information:**
The issue seems to be that AutoGluon Cloud does not handle headers properly when batch transform splits the input into partitioned records. In a multi-partition job, not every batch includes the header row, which causes the column misalignment.
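One possible direction for a fix is to make payload parsing tolerant of a missing header. The following is a minimal sketch, not the actual AutoGluon Cloud implementation: `parse_payload` and `TRAIN_COLUMNS` are hypothetical names, and it assumes the serving code knows the expected column order and that every payload uses that same order. It uses only the standard library to stay independent of how the container really parses its input.

```python
import csv
import io

# Assumption: the expected column order is available at inference time.
TRAIN_COLUMNS = ["age", "capital-gain", "capital-loss"]


def parse_payload(payload: str, columns: list[str]) -> list[dict[str, str]]:
    """Parse one CSV payload that may or may not start with a header row."""
    rows = list(csv.reader(io.StringIO(payload)))
    # Drop the first row only if it is exactly the expected header.
    if rows and [cell.strip() for cell in rows[0]] == columns:
        rows = rows[1:]
    # Every remaining row is data; label it with the expected column names.
    return [dict(zip(columns, row)) for row in rows]


with_header = "age,capital-gain,capital-loss\n62,0,0\n"
without_header = "35,0,0\n41,0,0\n"

print(parse_payload(with_header, TRAIN_COLUMNS))
print(parse_payload(without_header, TRAIN_COLUMNS))
```

Comparing the first row against the known columns avoids guessing from the data whether a header is present. The alternative is to strip the header before upload so that no partition contains one, which would need the same knowledge of column order on the serving side.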




