Abstract: The vision-language model CLIP has profoundly transformed the field of zero-shot anomaly detection. Recent studies acquire anomaly maps by aligning images with normal and abnormal prompts.
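As a rough illustration of this prompt-alignment idea, the sketch below scores a single image with CLIP (via the open_clip library) by comparing its embedding against hand-written normal and abnormal text prompts. The model choice, prompt wording, and file path are illustrative assumptions rather than the method of any specific paper; producing a spatial anomaly map would additionally require patch-level image features rather than the single global embedding used here.

```python
import torch
import open_clip
from PIL import Image

# Load a pretrained CLIP model and its preprocessing pipeline.
# The ViT-B-32 / "openai" combination is an assumed example choice.
model, _, preprocess = open_clip.create_model_and_transforms("ViT-B-32", pretrained="openai")
tokenizer = open_clip.get_tokenizer("ViT-B-32")
model.eval()

# Hand-written normal and abnormal prompts (illustrative wording).
prompts = ["a photo of a flawless object", "a photo of a damaged object"]
text_tokens = tokenizer(prompts)

# "sample.png" is a placeholder path for the image to be scored.
image = preprocess(Image.open("sample.png")).unsqueeze(0)

with torch.no_grad():
    # Encode image and prompts, then L2-normalize so dot products are cosine similarities.
    img_feat = model.encode_image(image)
    txt_feat = model.encode_text(text_tokens)
    img_feat = img_feat / img_feat.norm(dim=-1, keepdim=True)
    txt_feat = txt_feat / txt_feat.norm(dim=-1, keepdim=True)

    # Softmax over the normal/abnormal similarities; the probability assigned
    # to the abnormal prompt serves as an image-level anomaly score.
    logits = 100.0 * img_feat @ txt_feat.T
    probs = logits.softmax(dim=-1)

anomaly_score = probs[0, 1].item()
print(f"anomaly score: {anomaly_score:.3f}")
```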