First version. The model was trained for 5k steps using ai-toolkit with the default learning rate of 1e-4. Roughly 36 images were used, all generated with Pony. The captions were generated by Florence2 and then lazily edited.
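To give a rough sense of how many times each image was seen during training, here is a minimal sketch of the arithmetic. It assumes a batch size of 1 with no gradient accumulation, neither of which is stated above, and it treats the "roughly 36" image count as exact. The function name `passes_per_image` is purely illustrative and is not part of ai-toolkit.

```python
def passes_per_image(steps: int, num_images: int, batch_size: int = 1) -> float:
    """Approximate number of times each training image is seen.

    batch_size defaults to 1, which is an assumption; the actual
    value used for this model is not stated.
    """
    return steps * batch_size / num_images


STEPS = 5_000      # "5k steps" from the description
NUM_IMAGES = 36    # "roughly 36 images", treated as exact here

print(round(passes_per_image(STEPS, NUM_IMAGES), 1))  # 138.9
```

Under those assumptions each image is seen about 139 times. A larger batch size would scale that figure up proportionally.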