- 01.28 Released our paper.
- 01.28 Released the code of Q-Hawkeye.
In this paper, we propose Q-Hawkeye, a GRPO-based framework for reliable visual policy optimization in image quality assessment. Built on Qwen2.5-VL-7B, Q-Hawkeye reshapes the RL learning signal from two complementary perspectives: an Uncertainty-Aware Dynamic Optimization strategy that adaptively reweights per-image updates based on score variance across rollouts, and a Perception-Aware Optimization module that enforces consistent distributional differences between original and degraded images via an implicit perception loss with double entropy regularization. Extensive experiments on eight IQA benchmarks demonstrate the effectiveness of the proposed modules and show that Q-Hawkeye consistently outperforms existing state-of-the-art methods in both single- and multi-dataset settings, with clear gains in average PLCC/SRCC and improved robustness on challenging out-of-distribution distortions.
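The two modules are specified fully in the paper; as rough intuition for the uncertainty-aware reweighting, the sketch below (our own simplified illustration, not the released implementation; the temperature `tau` is a hypothetical parameter) downweights updates for images whose rollout scores disagree:

```python
import numpy as np

def uncertainty_weights(rollout_scores: np.ndarray, tau: float = 1.0) -> np.ndarray:
    """Illustrative per-image update weights from rollout-score variance.

    rollout_scores: (B, K) array of predicted quality scores, K rollouts per image.
    High variance across rollouts signals an unreliable reward, so that image
    contributes less to the policy update.
    """
    var = rollout_scores.var(axis=1)   # (B,) per-image score variance
    w = np.exp(-var / tau)             # consistent rollouts -> weight near 1
    return w / w.mean()                # renormalize so the mean weight stays 1

# Example: 3 images, 4 rollouts each.
scores = np.array([[3.1, 3.2, 3.0, 3.1],   # consistent -> upweighted
                   [1.0, 4.5, 2.2, 3.9],   # inconsistent -> downweighted
                   [2.5, 2.6, 2.4, 2.5]])
print(uncertainty_weights(scores))
```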
Install dependencies, then run inference:

```bash
pip install -r requirements.txt
python inference.py
```
Download meta files from Data-DeQA-Score and the source images from the KONIQ dataset.
Your JSON data should follow this format:

```json
[
  {
    "id": "sample_001",
    "images": ["/path/to/image.jpg"],
    "gt_score": 3.75
  }
]
```
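Before training, a few lines like the following can sanity-check a meta file against this format (a minimal sketch of our own; `check_meta` is not a script in this repository):

```python
import json

def check_meta(path: str) -> None:
    """Validate the meta-file format shown above (illustrative only)."""
    with open(path) as f:
        samples = json.load(f)
    for s in samples:
        assert isinstance(s["id"], str), "id must be a string"
        assert isinstance(s["images"], list) and s["images"], "images must be a non-empty list"
        assert isinstance(s["gt_score"], (int, float)), "gt_score must be numeric"
    print(f"{len(samples)} samples OK")

check_meta("/path/to/original_data.json")
```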
Generate degraded images for Perception Loss training through a four-stage pipeline:

Stage 1: Initial Degradation
```bash
cd src/Dataset/Degradation_Dataset/
python Degradation.py \
    --input_json /path/to/original_data.json \
    --output_json /path/to/degraded_data.json \
    --degraded_images_dir /path/to/degraded_images/
```

Applies one degradation type (noise, blur, jpeg, darken) to each image. Each original image generates one degraded variant, paired with its source by group_id.
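For intuition, the four degradation types could look roughly like this PIL/NumPy sketch (our own illustration; Degradation.py may use different parameters and implementations):

```python
import io
import numpy as np
from PIL import Image, ImageEnhance, ImageFilter

def degrade(img: Image.Image, kind: str) -> Image.Image:
    """Apply one degradation type; all parameter values here are illustrative."""
    if kind == "noise":    # additive Gaussian noise
        arr = np.asarray(img, dtype=np.float32)
        arr += np.random.normal(0.0, 25.0, arr.shape)
        return Image.fromarray(np.clip(arr, 0, 255).astype(np.uint8))
    if kind == "blur":     # Gaussian blur
        return img.filter(ImageFilter.GaussianBlur(radius=2.0))
    if kind == "jpeg":     # JPEG re-compression artifacts
        buf = io.BytesIO()
        img.save(buf, format="JPEG", quality=10)
        buf.seek(0)
        return Image.open(buf).convert("RGB")
    if kind == "darken":   # reduce brightness
        return ImageEnhance.Brightness(img).enhance(0.5)
    raise ValueError(f"unknown degradation type: {kind}")
```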
Stage 2: VLM-based Filtering
```bash
python VLM_filter.py \
    --input_json /path/to/degraded_data.json \
    --output_json /path/to/vlm_filtered_data.json \
    --model_path /path/to/qwen2-vl-7b
```

Uses Qwen2-VL to automatically filter samples by comparing quality scores between original and degraded images.
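The exact filtering criterion is internal to VLM_filter.py; one plausible rule (purely an assumption on our part) keeps a pair only when the VLM rates the degraded image clearly lower than the original:

```python
def keep_pair(score_orig: float, score_degraded: float, margin: float = 0.5) -> bool:
    """Hypothetical filter: keep a pair only if the VLM agrees the degraded
    image is clearly worse. The margin value is an assumption, not a value
    taken from the repository."""
    return score_orig - score_degraded >= margin

# Hypothetical VLM scores for (original, degraded) pairs.
samples = [
    {"id": "sample_001", "score_orig": 3.8, "score_degraded": 2.1},  # kept
    {"id": "sample_002", "score_orig": 3.2, "score_degraded": 3.1},  # dropped
]
filtered = [s for s in samples if keep_pair(s["score_orig"], s["score_degraded"])]
print([s["id"] for s in filtered])  # ['sample_001']
```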
Stage 3: Human Verification
```bash
python Human_filter.py \
    --input_json /path/to/vlm_filtered_data.json \
    --output_json /path/to/human_verified_data.json
```

Provides a GUI for manual review and verification of degraded samples.
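The GUI itself is not described beyond this; a bare-bones stand-in (our own Tkinter sketch, not Human_filter.py) could be a simple keep/discard loop over the degraded images:

```python
import json
import tkinter as tk
from PIL import Image, ImageTk

def review(samples: list, out_path: str) -> None:
    """Show each degraded image; 'Keep' or 'Discard' advances to the next one."""
    kept, idx = [], 0
    root = tk.Tk()
    label = tk.Label(root)
    label.pack()

    def show():
        img = Image.open(samples[idx]["images"][0])
        img.thumbnail((512, 512))
        photo = ImageTk.PhotoImage(img)
        label.configure(image=photo)
        label.image = photo  # keep a reference so Tk doesn't drop the image

    def decide(keep: bool):
        nonlocal idx
        if keep:
            kept.append(samples[idx])
        idx += 1
        if idx == len(samples):
            with open(out_path, "w") as f:
                json.dump(kept, f, indent=2)
            root.destroy()
        else:
            show()

    tk.Button(root, text="Keep", command=lambda: decide(True)).pack(side="left")
    tk.Button(root, text="Discard", command=lambda: decide(False)).pack(side="right")
    show()
    root.mainloop()
```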
Stage 4: Second Degradation
```bash
python Second_Degradation.py \
    --input_json /path/to/human_verified_data.json \
    --output_json /path/to/final_degraded_data.json \
    --noise_std 65 --blur_radius 4.0 --jpeg_quality 5 --darken_factor 0.5
```

Applies a second, different degradation type to increase difficulty. The degradation_type field is updated to the combined form (e.g., noise+blur).
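How the second type is chosen is not documented here; one natural scheme (our assumption) samples any type other than the first and records the combination, reusing the `degrade` helper sketched under Stage 1:

```python
import random

TYPES = ["noise", "blur", "jpeg", "darken"]

def second_degradation(sample: dict, img):
    """Pick a degradation type different from the first pass and update
    degradation_type to the combined form, e.g. 'noise+blur'."""
    first = sample["degradation_type"]
    second = random.choice([t for t in TYPES if t != first])
    sample["degradation_type"] = f"{first}+{second}"
    return degrade(img, second)  # `degrade` as sketched in Stage 1
```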
Convert JSON to HuggingFace Dataset format:
```bash
cd src/Dataset/
python RL_Construction.py \
    --input /path/to/data.json \
    --output /path/to/hf_dataset \
    --keep_fields gt_score group_id degradation_type \
    --verify
```

This converts the data into the format required by the training scripts, with fields image, problem, and solution, plus any additional fields specified in --keep_fields.
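Conceptually, the conversion amounts to something like the following `datasets` sketch (our simplification; the prompt and solution strings here are placeholders, not the script's actual values):

```python
import json
from datasets import Dataset

KEEP_FIELDS = ["gt_score", "group_id", "degradation_type"]

def to_hf(meta_path: str, out_dir: str) -> None:
    with open(meta_path) as f:
        samples = json.load(f)
    rows = []
    for s in samples:
        row = {
            "image": s["images"][0],
            "problem": "Rate the quality of this image.",  # placeholder prompt
            "solution": str(s["gt_score"]),                # placeholder target
        }
        # Carry over any extra fields requested via --keep_fields.
        row.update({k: s[k] for k in KEEP_FIELDS if k in s})
        rows.append(row)
    Dataset.from_list(rows).save_to_disk(out_dir)

to_hf("/path/to/data.json", "/path/to/hf_dataset")
```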
Launch training with the provided script:

```bash
cd src/scripts/
bash qw7b_local.sh
```

If you find Q-Hawkeye useful, please cite:

```bibtex
@article{xie2026qhawkeye,
title={Q-Hawkeye: Reliable Visual Policy Optimization for Image Quality Assessment},
author={Xie, Wulin and Dai, Rui and Ding, Ruidong and Liu, Kaikui and Chu, Xiangxiang and Hou, Xinwen and Wen, Jie},
journal={arXiv preprint arXiv:2601.22920},
year={2026}
}
```

- Release inference code
- Release training code
- Release the paper
We appreciate the released code and data of Visual-RFT, Q-Insight, and DeQA-Score.
