Complete retrain with:
full (JOY2) captions (v1 had only a few tag words)
attention masking (no backgrounds)
1024x1024 (v1 was 768x768)
640+ images (v1 was 450+)
Results:
follows prompts much better
should work better together with other LoRAs
is better at close-up details (skin pores, lips, etc.)
Images look very different from v1.x
In general, if the output looks overdone or overtrained, you can simply lower the strength of the LoRA (down to 0.6)
Please see example images for prompts
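
To make the strength adjustment above concrete, here is a minimal sketch that builds prompt tags at decreasing strengths, from 1.0 down to 0.6. It assumes a UI that uses the AUTOMATIC1111-style `<lora:name:strength>` tag. The file name `my_lora_v2` and the prompt text are hypothetical placeholders, not the real file name or one of the example prompts.

```python
# Sketch only: assumes the AUTOMATIC1111-style <lora:name:strength> prompt tag.
# "my_lora_v2" and the prompt below are hypothetical placeholders.

def lora_tag(name: str, strength: float = 1.0) -> str:
    """Build a prompt tag such as <lora:my_lora_v2:0.6>."""
    if not name:
        raise ValueError("name must not be empty")
    # :g drops trailing zeros, so 1.0 is written as 1 and 0.6 stays 0.6
    return f"<lora:{name}:{strength:g}>"


def strength_sweep(start: float = 1.0, stop: float = 0.6, step: float = 0.1) -> list:
    """Return strengths from start down to stop (inclusive)."""
    count = int(round((start - stop) / step)) + 1
    # Round each value to two decimals to hide floating-point drift
    return [round(start - i * step, 2) for i in range(count)]


if __name__ == "__main__":
    prompt = "close-up portrait photo, detailed skin"  # placeholder prompt
    for strength in strength_sweep():
        print(f"{prompt} {lora_tag('my_lora_v2', strength)}")
```

Generating one image per line with a fixed seed makes it easier to see at which strength the overtrained look disappears.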