The output text generated by a language model in response to a prompt. Also refers to the API endpoint type for generating text continuations.
In the context of language models, a completion is the text generated by the model in response to an input prompt. The model "completes" the prompt by predicting and generating the most likely continuation based on its training.
The completion process works by repeatedly predicting the next most likely token, then appending it to the sequence and predicting again. This autoregressive process continues until a stopping condition is met (max tokens, stop sequence, or end token).
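To make that loop concrete, here is a minimal, runnable Python sketch of the autoregressive process. The model call is a hypothetical placeholder (`predict_next_token` simply replays a canned continuation), but the stopping logic mirrors the three conditions described above.

```python
END_TOKEN = "<|endoftext|>"
CANNED_CONTINUATION = ["The", " cat", " sat", " on", " the", " mat", ".", END_TOKEN]


def predict_next_token(prompt: str, completion_tokens: list[str]) -> str:
    """Stand-in for the model's forward pass. A real model conditions on the
    prompt plus everything generated so far; this toy replays a canned reply."""
    index = min(len(completion_tokens), len(CANNED_CONTINUATION) - 1)
    return CANNED_CONTINUATION[index]


def generate_completion(prompt: str, max_tokens: int = 16, stop: str | None = None) -> str:
    """Repeatedly predict the next token and append it, stopping at the
    max_tokens limit, a stop sequence, or the end-of-text token."""
    completion_tokens: list[str] = []
    while len(completion_tokens) < max_tokens:              # stop 1: max tokens reached
        token = predict_next_token(prompt, completion_tokens)
        if token == END_TOKEN:                              # stop 2: end-of-text token
            break
        completion_tokens.append(token)
        text = "".join(completion_tokens)
        if stop is not None and stop in text:               # stop 3: stop sequence seen
            return text.split(stop)[0]
    return "".join(completion_tokens)


print(generate_completion("Complete this sentence:", max_tokens=10, stop="."))
```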
Key aspects of completions:
- Generated autoregressively, one token at a time, conditioned on the prompt and everything generated so far
- Length is bounded by the max_tokens parameter and can be cut short by a stop sequence or an end token
- Output length directly affects API cost and response latency
Understanding completions helps you design effective prompts and predict API costs, which scale with output length, making completion behaviour a key input when budgeting AI projects.
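Because most providers bill output tokens separately from input tokens, completion length drives a large share of spend. The sketch below shows one way to estimate a per-request budget; the per-token rates are placeholder assumptions, not actual pricing, so substitute the current rates for your chosen model and provider.

```python
# Rough worst-case cost estimate for a single completion request.
# The per-1K-token prices below are illustrative placeholders, not real pricing.
PRICE_PER_1K_INPUT_TOKENS = 0.0005   # assumed example rate (USD)
PRICE_PER_1K_OUTPUT_TOKENS = 0.0015  # assumed example rate (USD)


def estimate_request_cost(prompt_tokens: int, max_tokens: int) -> float:
    """Worst-case cost if the completion runs all the way to the max_tokens limit."""
    input_cost = prompt_tokens / 1000 * PRICE_PER_1K_INPUT_TOKENS
    output_cost = max_tokens / 1000 * PRICE_PER_1K_OUTPUT_TOKENS
    return input_cost + output_cost


# e.g. a 400-token prompt with max_tokens=300 caps the spend per request:
print(f"${estimate_request_cost(prompt_tokens=400, max_tokens=300):.6f}")
```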
We help Australian businesses optimise their completion settings, balancing quality, cost, and speed. Proper completion configuration can reduce AI costs by 30-50% while maintaining quality.
"A prompt asking to summarise an article generates a completion containing the summary. The completion length can be controlled via max_tokens parameter."