Scaling laws in AI generally relate the performance of a model to its inputs: training data, model parameters, and compute. The performance of a model describes its accuracy in choosing the "right" answer on known data. A large language model is trained to predict text completions; the more often it correctly predicts how to complete a text, the better its performance. A close antonym of performance is "loss," which is a measure of how far off a model's predictions were from reality; lower loss means better performance.

Training data is the size of the dataset a model is trained on. The number of parameters in the model is a measure of its complexity - equivalent to the number of nodes in the neural network. Finally, the amount of "compute" used for a model, measured in floating point operations, or FLOPs, is simply the number of computer operations (typically matrix multiplications) that must be performed throughout the model's training. Compute is therefore influenced by both the amount of data and the number of parameters.

The scaling relationship between loss and compute found by OpenAI in 2020 is a power law: if a model has 10 times the compute, its loss will be about 11% lower.
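The "10x compute, ~11% lower loss" figure follows directly from the power-law form. A minimal sketch below, assuming a compute exponent of alpha ≈ 0.050 (the fit reported in OpenAI's 2020 scaling-laws paper); the function name `loss_ratio` is mine, and the absolute constant in the power law cancels out of the ratio, so it never appears.

```python
def loss_ratio(compute_multiplier: float, alpha: float = 0.050) -> float:
    """Relative loss after multiplying compute by `compute_multiplier`.

    For a power law L(C) = (C_c / C) ** alpha, the ratio
    L(k * C) / L(C) = k ** (-alpha) is independent of the constant C_c,
    so only the multiplier and the exponent matter.
    """
    return compute_multiplier ** (-alpha)


ratio = loss_ratio(10.0)
print(f"10x compute -> loss falls to {ratio:.3f} of its old value "
      f"(about {100 * (1 - ratio):.0f}% lower)")
```

Running this prints a ratio of about 0.891, i.e. roughly 11% lower loss, matching the rule of thumb above.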