# gerd.models.model

Model configuration for supported model classes.
Classes:

| Name | Description |
| --- | --- |
| `ChatMessage` | Data structure for chat messages. |
| `ModelConfig` | Configuration for large language models. |
| `ModelEndpoint` | Configuration for model endpoints where models are hosted remotely. |
| `PromptConfig` | Configuration for prompts. |
Attributes:

| Name | Type | Description |
| --- | --- | --- |
| `ChatRole` | | Currently supported chat roles. |
| `EndpointType` | | Endpoint for remote LLM services. |
## ChatRole `module-attribute`

Currently supported chat roles.
## EndpointType `module-attribute`

Endpoint for remote LLM services.
## ChatMessage

Data structure for chat messages.
## ModelConfig

Bases: `BaseModel`

Configuration for large language models.

Most LLM libraries and/or services share common configuration parameters. Explaining each parameter is out of scope for this documentation; the most essential ones are covered, for instance, in the ctransformers documentation. Default values have been chosen according to the ctransformers library.
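The following is a minimal sketch of how such a configuration might be put together; the model name is a hypothetical Hugging Face handle, and every omitted field keeps its default value:

```python
from gerd.models.model import ModelConfig

config = ModelConfig(
    name="some-org/some-llama-model",  # hypothetical handle
    context_length=2048,
    max_new_tokens=256,
    temperature=0.7,
    top_k=40,
    top_p=0.95,
)
```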
Attributes:

| Name | Type | Description |
| --- | --- | --- |
| `batch_size` | `int` | The batch size for the generation. |
| `context_length` | `int` | The context length for the model. Currently only LLaMA, MPT and Falcon models support this parameter. |
| `endpoint` | `Optional[ModelEndpoint]` | The endpoint of the model when hosted remotely. |
| `extra_kwargs` | `Optional[dict[str, Any]]` | Additional keyword arguments for the model library. |
| `file` | `Optional[str]` | The path to the model file. For local models only. |
| `gpu_layers` | `int` | The number of layers to run on the GPU. |
| `last_n_tokens` | `int` | The number of tokens to consider for the repetition penalty. |
| `loras` | `set[Path]` | The set of additional LoRA files to load. |
| `max_new_tokens` | `int` | The maximum number of new tokens to generate. |
| `name` | `str` | The name of the model. Can be a path to a local model or a Hugging Face handle. |
| `prompt_config` | `PromptConfig` | The prompt configuration. |
| `prompt_setup` | `List[Tuple[Literal['system', 'user', 'assistant'], PromptConfig]]` | A list of predefined prompts for the model. |
| `repetition_penalty` | `float` | The repetition penalty. |
| `seed` | `int` | The seed for the random number generator. |
| `stop` | `Optional[List[str]]` | The stop tokens for the generation. |
| `stream` | `bool` | Whether to stream the output. |
| `temperature` | `float` | The temperature for the sampling. |
| `threads` | `Optional[int]` | The number of threads to use for the generation. |
| `top_k` | `int` | The number of tokens to consider for top-k sampling. |
| `top_p` | `float` | The cumulative probability for top-p sampling. |
| `torch_dtype` | `Optional[str]` | The torch data type for the model. |
### batch_size `class-attribute` `instance-attribute`

The batch size for the generation.

### context_length `class-attribute` `instance-attribute`

The context length for the model. Currently only LLaMA, MPT and Falcon models support this parameter.

### endpoint `class-attribute` `instance-attribute`

The endpoint of the model when hosted remotely.

### extra_kwargs `class-attribute` `instance-attribute`

Additional keyword arguments for the model library.

The accepted keys and values depend on the model library used.
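As an illustration, a sketch of forwarding a backend-specific option; `trust_remote_code` is a transformers-style keyword and serves only as an example here, since which keys are accepted depends entirely on the backend:

```python
from gerd.models.model import ModelConfig

config = ModelConfig(
    name="some-org/some-model",  # hypothetical handle
    # Forwarded verbatim to the model library; validity depends on the backend.
    extra_kwargs={"trust_remote_code": True},
)
```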
### file `class-attribute` `instance-attribute`

The path to the model file. For local models only.

### gpu_layers `class-attribute` `instance-attribute`

The number of layers to run on the GPU.

The actual number is only used by llama.cpp. The other model libraries determine whether to run on the GPU just by checking whether this value is larger than 0.
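A short sketch of both cases (model names hypothetical):

```python
from gerd.models.model import ModelConfig

# With llama.cpp the exact count matters: offload 20 layers to the GPU.
cfg_llama = ModelConfig(name="models/some-model.gguf", gpu_layers=20)

# Other backends only check for a value larger than 0 to enable the GPU.
cfg_other = ModelConfig(name="some-org/some-model", gpu_layers=1)
```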
### last_n_tokens `class-attribute` `instance-attribute`

The number of tokens to consider for the repetition penalty.

### loras `class-attribute` `instance-attribute`

The set of additional LoRA files to load.

### max_new_tokens `class-attribute` `instance-attribute`

The maximum number of new tokens to generate.

### name `class-attribute` `instance-attribute`

The name of the model. Can be a path to a local model or a Hugging Face handle.

### prompt_config `class-attribute` `instance-attribute`

The prompt configuration.

This is used to process the input passed to the services.

### prompt_setup `class-attribute` `instance-attribute`

A list of predefined prompts for the model.

When a model context is initialized or reset, this will be used to set up the context.
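For example, a minimal setup might seed the context with a system instruction and one user turn (the wording below is hypothetical):

```python
from gerd.models.model import ModelConfig, PromptConfig

config = ModelConfig(
    name="some-org/some-model",  # hypothetical handle
    prompt_setup=[
        ("system", PromptConfig(text="You are a helpful assistant.")),
        ("user", PromptConfig(text="Answer briefly and in English.")),
    ],
)
```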
### repetition_penalty `class-attribute` `instance-attribute`

The repetition penalty.

### stop `class-attribute` `instance-attribute`

The stop tokens for the generation.

### temperature `class-attribute` `instance-attribute`

The temperature for the sampling.

### threads `class-attribute` `instance-attribute`

The number of threads to use for the generation.

### top_k `class-attribute` `instance-attribute`

The number of tokens to consider for top-k sampling.

### top_p `class-attribute` `instance-attribute`

The cumulative probability for top-p sampling.
## ModelEndpoint

Bases: `BaseModel`

Configuration for model endpoints where models are hosted remotely.
## PromptConfig

Bases: `BaseModel`

Configuration for prompts.

Methods:

| Name | Description |
| --- | --- |
| `format` | Format the prompt with the given parameters. |
| `model_post_init` | Post-initialization hook for pydantic. |
Attributes:

| Name | Type | Description |
| --- | --- | --- |
| `is_template` | `bool` | Whether the config uses Jinja2 templates. |
| `parameters` | `list[str]` | Retrieves and returns the parameters of the prompt. |
| `path` | `Optional[str]` | The path to an external prompt file. |
| `template` | `Optional[Template]` | Optional template of the prompt. This should follow the Jinja2 syntax. |
| `text` | `str` | The text of the prompt. Can contain placeholders. |
### is_template `class-attribute` `instance-attribute`

Whether the config uses Jinja2 templates.
### parameters `property`

Retrieves and returns the parameters of the prompt.

This happens on-the-fly and is not stored in the model.

Returns:

| Type | Description |
| --- | --- |
| `list[str]` | The parameters of the prompt. |
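Assuming str.format-style placeholders in the prompt text (an assumption; the exact placeholder syntax is defined by the implementation), the property can be expected to behave roughly like this:

```python
from gerd.models.model import PromptConfig

p = PromptConfig(text="Translate '{text}' into {language}.")
print(p.parameters)  # expected to contain 'text' and 'language'
```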
### path `class-attribute` `instance-attribute`

The path to an external prompt file.

This will override the values of text and/or template.
### template `class-attribute` `instance-attribute`

Optional template of the prompt. This should follow the Jinja2 syntax.
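A sketch of a template-based prompt; as described under model_post_init below, setting is_template to True makes the text field be compiled as a Jinja2 template:

```python
from gerd.models.model import PromptConfig

p = PromptConfig(
    # Jinja2 syntax; compiled by the post-init hook because is_template is set.
    text="{% for item in items %}- {{ item }}\n{% endfor %}",
    is_template=True,
)
```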
### text `class-attribute` `instance-attribute`

The text of the prompt. Can contain placeholders.
### format

Format the prompt with the given parameters.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| `parameters` | `Mapping[str, str \| list[ChatMessage]] \| None` | The parameters to format the prompt with. | `None` |

Returns:

| Type | Description |
| --- | --- |
| `str` | The formatted prompt. |

Source code in `gerd/models/model.py`
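A usage sketch with a plain-text prompt (the prompt wording and sample data are hypothetical):

```python
from gerd.models.model import PromptConfig

p = PromptConfig(text="Summarize the following text: {text}")
result = p.format({"text": "GERD is a collection of LLM services."})
# result is expected to read:
# "Summarize the following text: GERD is a collection of LLM services."
```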
### model_post_init

Post-initialization hook for pydantic.

When path is set, the text or template is read from the file and the template is created. A path ending with '.jinja2' will be treated as a template. If no path is set, the text parameter is used to initialize the template if is_template is set to True.

Parameters:

| Name | Description |
| --- | --- |
| `__context` | The context of the model (not used). |
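For instance, a prompt can be loaded from an external file (path hypothetical); the '.jinja2' suffix causes the hook to compile the file content as a Jinja2 template:

```python
from gerd.models.model import PromptConfig

# Hypothetical file: read at initialization and, due to the '.jinja2'
# suffix, compiled as a Jinja2 template instead of plain text.
p = PromptConfig(path="prompts/summary.jinja2")
```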