@wronglang @devsimsek Yes, sure. I mean, I can imagine it still improving somewhat, like when you augment your training set for image recognition by adding noise to a smaller set, but only up to a point before it goes downhill from feedback. No, my gut feeling is rather that there have to be much more effective ways to train a model than to brute-force funnel billions of pages of text into a transformer that blindly fits relations between words and structures without understanding them. That seems like doing it the hard way, even if I'm not enough of an expert to tell you what an alternative would look like.