Strange inference score #4

@Iriya99


[Raw Model Output]

multi-dimensional analysis: Let me analyze this video step by step according to the three dimensions:

Visual Quality Analysis:
Overall, Visual Quality is mid-range. There is decent sharpness and color reproduction, but occasional softness in distant areas and minor compression artifacts prevent a high-end look. The car fire and smoke are recognizable and convey the intended drama, though some frames show slight inconsistencies in lighting or edge detail. The fire extinguisher in its stand reads clearly, and the overlay text is readable, but overall texture and polish do not reach premium levels. No major distortions or artifacts dominate the sequence, keeping the visuals acceptable for the scenario.

Text-to-Video Alignment Analysis:
The video captures the core elements described in the prompt: flames bursting from the front of a burning car, thick smoke billowing upward, and a red fire extinguisher in a yellow stand that remains stationary. The scene composition and object placements align well with the prompt, and the textual overlays reinforce the message without introducing extraneous content. While there may be minor non-critical deviations in fine-grained motion details, the overall alignment between prompt and visuals is strong.

Physical Consistency (Common-sense):
Physical realism is adequate. Fire dynamics and smoke behavior are plausible, and the extinguisher’s placement relative to the car is sensible within the scene. However, a few micro-aspects (such as exact proportions, wind interaction with smoke, or the timing of smoke rise) show small deviations that prevent a higher score on physical realism. Overall, the scene maintains believable physical relationships among objects and events.

(1) visual quality – clarity, smoothness, artifacts: 3
(2) text-to-video alignment – fidelity to the prompt: 4
(3) physical/common-sense consistency – naturalness and physics plausibility: 3

====== Inference Result ======
Video Path: /home/work/wentao/wan22_ph/00b15ce63fd2a2925c0fd5a3d3a0044e.mp4
Visual Quality: None
Text-to-Video Alignment: None
Physical Consistency: None

Why are all three scores in the Inference Result showing as None, even though the raw model output above ends with explicit scores of 3, 4, and 3?
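
One possible cause, offered as an assumption and not confirmed against the repository code, is that the score-extraction step does not match the format of the three numbered score lines, so each value falls back to None. The sketch below shows a tolerant way to pull the scores out of raw output shaped like the text above. The names `parse_scores`, `LABELS`, and `SCORE_LINE` are hypothetical and are not taken from the project.

```python
import re

# Assumed mapping from the numbered score lines to the labels printed in
# the "Inference Result" block. These names are illustrative only.
LABELS = {
    1: "Visual Quality",
    2: "Text-to-Video Alignment",
    3: "Physical Consistency",
}

# Matches lines such as:
#   "(1) visual quality – clarity, smoothness, artifacts: 3"
# Group 1 is the dimension index, group 2 is the integer score at line end.
SCORE_LINE = re.compile(r"^\((\d)\)\s.*:\s*(\d+)\s*$")


def parse_scores(raw_output):
    """Return a dict of label -> int score, or None where no line matched."""
    scores = {label: None for label in LABELS.values()}
    for line in raw_output.splitlines():
        match = SCORE_LINE.match(line.strip())
        if match is None:
            continue
        index = int(match.group(1))
        if index in LABELS:
            scores[LABELS[index]] = int(match.group(2))
    return scores


if __name__ == "__main__":
    sample = (
        "(1) visual quality – clarity, smoothness, artifacts: 3\n"
        "(2) text-to-video alignment – fidelity to the prompt: 4\n"
        "(3) physical/common-sense consistency – naturalness and physics plausibility: 3\n"
    )
    print(parse_scores(sample))
```

Running a check like this on the saved raw output would show whether the scores are recoverable from the text, which would point to the parsing step and not the model as the source of the None values.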
