linerflo.blogg.se

Dirty jenga tumblr





I think people are, on a gut level, still understanding these models as “collage machines.” they’re not. they are not borg-assimilating all your best ideas from your fics to frankenstein them back together. they are compressing gargantuan amounts of data down into smaller (still huge, but much smaller) models of that data by looking at trends and likelihoods and repetitions. they are also not re-training their model with the little prompts you put in, and even if they did, it’s like… a drop of water in the ocean. it’s not gonna have an effect on how the model behaves. for a sense of scale, look at the size of the training sets used to train gpt-3: the common crawl portion alone is around 410 billion tokens, and they don’t even manage to use all of it. as a rough rule of thumb in natural language processing, an english word averages a bit more than one token (about four tokens for every three words), so that’s on the order of 300 billion words.
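to put a number on the “drop of water in the ocean” bit, here is a minimal sketch. the dataset sizes are the token counts reported in the GPT-3 paper; the 2,000-token prompt is an assumption picked purely for illustration, and `share_of_training_data` is a made-up helper name, not anything from a real library.

```python
# Token counts for the GPT-3 training mix, as reported in the GPT-3 paper
# (in billions of tokens, before any sampling weights are applied).
GPT3_DATASETS_BILLION_TOKENS = {
    "Common Crawl (filtered)": 410,
    "WebText2": 19,
    "Books1": 12,
    "Books2": 55,
    "Wikipedia": 3,
}


def share_of_training_data(prompt_tokens: int) -> float:
    """Fraction of the full dataset that a prompt of this size would represent."""
    total_tokens = sum(GPT3_DATASETS_BILLION_TOKENS.values()) * 1_000_000_000
    return prompt_tokens / total_tokens


# Assumption for illustration: a 2,000-token chunk of fic pasted into a chat box.
share = share_of_training_data(2_000)
print(f"a 2,000-token prompt is {share:.1e} of the dataset")
```

that works out to roughly four parts per billion, and that is before remembering that the prompt is not being added to the training set in the first place.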


Not to be cynical, but the genie is already far more out of the bottle than most anti-AI people realize, i think. there is nothing you can do to stop these models from being made and getting more powerful; only the organizing power of labor has a shot at mitigating some of the effects we’re all worried about. if you’re an artist, you can try to protect your work against style mimicry with Glaze, but for the written word there’s nothing you can do. if it can be scraped, just assume it will be. and if it isn’t in now, it will be in future: the increases in performance from GPT-2 to 3 to 4 came mostly not from novel machine-learning architectures but from ramping up the amount of training data by orders of magnitude.


All the frothing-at-the-mouth posts about how “don’t you dare put a fic writer’s work into chatGPT or an artist’s work into stable diffusion” are off the mark. they are not dynamically crunching up anything you put into a web interface. it takes an absurd amount of compute power and coordination between many GPUs to re-train a model with billions of parameters. chances are, if you have something published on a fanfic site, or your art is on deviantart or any publicly available repository, it’s already in the enormous datasets that they are using to train.
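for a feel of what “an absurd amount of compute” means, here is a back-of-the-envelope sketch. it uses the common approximation that training costs about 6 floating-point operations per parameter per training token, plus GPT-3’s published figures (175 billion parameters, about 300 billion training tokens). the sustained 100 teraFLOP/s per GPU is an assumption for illustration, not a measured number, and the function names are made up for this sketch.

```python
SECONDS_PER_YEAR = 365 * 24 * 3600


def training_flops(n_params: float, n_tokens: float) -> float:
    """Rough training cost: about 6 FLOPs per parameter per training token."""
    return 6 * n_params * n_tokens


def gpu_years(total_flops: float, sustained_flops_per_gpu: float) -> float:
    """How many years one GPU would need at the given sustained throughput."""
    return total_flops / sustained_flops_per_gpu / SECONDS_PER_YEAR


# GPT-3 scale: 175 billion parameters, roughly 300 billion training tokens.
flops = training_flops(175e9, 300e9)

# Assumption for illustration: each GPU sustains 100 teraFLOP/s.
years = gpu_years(flops, 100e12)

print(f"about {flops:.2e} FLOPs, or roughly {years:.0f} GPU-years")
```

under those assumptions a single full training run is on the order of a hundred GPU-years, which is why nobody is kicking one off because you pasted a chapter into a chat box.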






