Components

Finetune

lazyllm.components.finetune.AlpacaloraFinetune

Bases: LazyLLMFinetuneBase

A subclass of LazyLLMFinetuneBase that performs LoRA fine-tuning of large language models using the LoRA fine-tuning capabilities of the alpaca-lora project.

Parameters:

  • base_model (str) –

    Path to the base model for fine-tuning.

  • target_path (str) –

    Path to save LoRA weights of the fine-tuned model.

  • merge_path (Optional[str], default: None ) –

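    Path to save the merged model weights, default None. If not provided, "lazyllm_lora" (LoRA weights) and "lazyllm_merge" (merged model) directories are created under target_path, which is itself resolved against the configured train_target_root.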

  • model_name (Optional[str], default: 'LLM' ) –

    Model name used as log prefix, default "LLM".

  • cp_files (Optional[str], default: 'tokeniz*' ) –

    Configuration files copied from base model path to merge_path, default tokeniz*.

  • launcher (launcher, default: remote(ngpus=1) ) –

    Launcher for fine-tuning, default launchers.remote(ngpus=1).

  • kw (dict, default: {} ) –

    Keyword arguments to update default training parameters:

Other Parameters:

  • data_path (Optional[str]) –

    Path to dataset, default None.

  • batch_size (Optional[int]) –

    Batch size, default 64.

  • micro_batch_size (Optional[int]) –

    Micro-batch size, default 4.

  • num_epochs (Optional[int]) –

    Number of training epochs, default 2.

  • learning_rate (Optional[float]) –

    Learning rate, default 5.e-4.

  • cutoff_len (Optional[int]) –

    Cutoff length, default 1030.

  • filter_nums (Optional[int]) –

    Number of filters, default 1024.

  • val_set_size (Optional[int]) –

    Validation set size, default 200.

  • lora_r (Optional[int]) –

    LoRA rank, default 8.

  • lora_alpha (Optional[int]) –

    LoRA scaling factor (alpha), default 32.

  • lora_dropout (Optional[float]) –

    LoRA dropout rate, default 0.05.

  • lora_target_modules (Optional[str]) –

    LoRA target modules, default [wo,wqkv].

  • modules_to_save (Optional[str]) –

    Modules for full fine-tuning, default [tok_embeddings,output].

  • deepspeed (Optional[str]) –

    Path to the DeepSpeed config file; defaults to the ds.json file bundled with the repository.

  • prompt_template_name (Optional[str]) –

    Name of prompt template, default "alpaca".

  • train_on_inputs (Optional[bool]) –

    Whether to train on inputs, default True.

  • show_prompt (Optional[bool]) –

    Whether to show the prompt, default False.

  • nccl_port (Optional[int]) –

    NCCL port; by default a random integer between 19000 and 20500 (inclusive).
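
The keyword arguments listed above override the class defaults one key at a time. A minimal sketch of such a merge follows, using a plain dict as a stand-in for LazyLLM's ArgsDict. The helper name merge_training_kwargs and the rejection of unknown keys are assumptions made for illustration, not confirmed library behaviour.

```python
import copy

# Subset of the documented defaults for AlpacaloraFinetune.
DEFAULT_KW = {
    'batch_size': 64,
    'micro_batch_size': 4,
    'num_epochs': 2,
    'learning_rate': 5.e-4,
    'lora_r': 8,
    'lora_alpha': 32,
}


def merge_training_kwargs(defaults, overrides):
    """Return a copy of `defaults` updated with `overrides`.

    Hypothetical helper: unknown keys raise KeyError. This is an assumption
    about how a strict merge could behave, not a statement about ArgsDict.
    """
    unknown = set(overrides) - set(defaults)
    if unknown:
        raise KeyError(f'unknown training parameters: {sorted(unknown)}')
    merged = copy.deepcopy(defaults)  # leave the shared defaults untouched
    merged.update(overrides)
    return merged


params = merge_training_kwargs(DEFAULT_KW, {'num_epochs': 3, 'lora_r': 16})
print(params['num_epochs'], params['lora_r'], params['batch_size'])
```

With the real class, the corresponding overrides would be passed as keyword arguments, for example finetune.alpacalora('path/to/base/model', 'path/to/target', num_epochs=3, lora_r=16).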

Examples:

>>> from lazyllm import finetune
>>> trainer = finetune.alpacalora('path/to/base/model', 'path/to/target')
Source code in lazyllm/components/finetune/alpacalora.py
class AlpacaloraFinetune(LazyLLMFinetuneBase):
    """This class is a subclass of ``LazyLLMFinetuneBase``, based on the LoRA fine-tuning capabilities provided by the [alpaca-lora](https://github.com/tloen/alpaca-lora) project, used for LoRA fine-tuning of large language models.

Args:
    base_model (str): Path to the base model for fine-tuning.
    target_path (str): Path to save LoRA weights of the fine-tuned model.
    merge_path (Optional[str]): Path to save merged LoRA weights, default ``None``.
        If not provided, "lazyllm_lora" and "lazyllm_merge" directories are created under ``target_path``.
    model_name (Optional[str]): Model name used as log prefix, default "LLM".
    cp_files (Optional[str]): Configuration files copied from base model path to ``merge_path``, default ``tokeniz*``.
    launcher (lazyllm.launcher): Launcher for fine-tuning, default ``launchers.remote(ngpus=1)``.
    kw (dict): Keyword arguments to update default training parameters:

Keyword Args:
    data_path (Optional[str]): Path to dataset, default ``None``.
    batch_size (Optional[int]): Batch size, default 64.
    micro_batch_size (Optional[int]): Micro-batch size, default 4.
    num_epochs (Optional[int]): Number of training epochs, default 2.
    learning_rate (Optional[float]): Learning rate, default 5.e-4.
    cutoff_len (Optional[int]): Cutoff length, default 1030.
    filter_nums (Optional[int]): Number of filters, default 1024.
    val_set_size (Optional[int]): Validation set size, default 200.
    lora_r (Optional[int]): LoRA rank, default 8.
    lora_alpha (Optional[int]): LoRA scaling factor (alpha), default 32.
    lora_dropout (Optional[float]): LoRA dropout rate, default 0.05.
    lora_target_modules (Optional[str]): LoRA target modules, default ``[wo,wqkv]``.
    modules_to_save (Optional[str]): Modules for full fine-tuning, default ``[tok_embeddings,output]``.
    deepspeed (Optional[str]): Path to DeepSpeed config, default uses repository pre-made ds.json.
    prompt_template_name (Optional[str]): Name of prompt template, default "alpaca".
    train_on_inputs (Optional[bool]): Whether to train on inputs, default ``True``.
    show_prompt (Optional[bool]): Whether to show the prompt, default ``False``.
    nccl_port (Optional[int]): NCCL port, default random between 19000-20500.


Examples:
    >>> from lazyllm import finetune
    >>> trainer = finetune.alpacalora('path/to/base/model', 'path/to/target')
    """
    defatult_kw = ArgsDict({
        'data_path': None,
        'batch_size': 64,
        'micro_batch_size': 4,
        'num_epochs': 2,
        'learning_rate': 5.e-4,
        'cutoff_len': 1030,
        'filter_nums': 1024,
        'val_set_size': 200,
        'lora_r': 8,
        'lora_alpha': 32,
        'lora_dropout': 0.05,
        'lora_target_modules': '[wo,wqkv]',
        'modules_to_save': '[tok_embeddings,output]',
        'deepspeed': '',
        'prompt_template_name': 'alpaca',
        'train_on_inputs': True,
        'show_prompt': False,
        'nccl_port': 19081,
    })
    auto_map = {'micro_batch_size': 'micro_batch_size'}

    def __init__(self,
                 base_model,
                 target_path,
                 merge_path=None,
                 model_name='LLM',
                 cp_files='tokeniz*',
                 launcher=launchers.remote(ngpus=1),  # noqa B008
                 **kw
                 ):
        if not merge_path:
            save_path = os.path.join(lazyllm.config['train_target_root'], target_path)
            target_path, merge_path = os.path.join(save_path, 'lazyllm_lora'), os.path.join(save_path, 'lazyllm_merge')
            os.makedirs(target_path, exist_ok=True)
            os.makedirs(merge_path, exist_ok=True)
        super().__init__(
            base_model,
            target_path,
            launcher=launcher,
        )
        self.folder_path = os.path.dirname(os.path.abspath(__file__))
        deepspeed_config_path = os.path.join(self.folder_path, 'alpaca-lora', 'ds.json')
        self.kw = copy.deepcopy(self.defatult_kw)
        self.kw['deepspeed'] = deepspeed_config_path
        self.kw['nccl_port'] = random.randint(19000, 20500)
        self.kw.check_and_update(kw)
        self.merge_path = merge_path
        self.cp_files = cp_files
        self.model_name = model_name

    def cmd(self, trainset, valset=None) -> str:
        """Generate shell command sequence for Alpaca-LoRA fine-tuning and model merging.

Args:
    trainset (str): Training dataset path, supports both relative path (to configured data_path) and absolute path
    valset (str, optional): Validation dataset path, will auto-split from trainset if not specified

**Returns:**

- str or list: Returns a single command string when no merging needed, otherwise returns a list containing:
                 [fine-tune command, merge command, file copy command]


Examples:
    >>> from lazyllm import finetune
    >>> trainer = finetune.alpacalora('path/to/base/model', 'path/to/target')
    >>> cmd = trainer.cmd("my_dataset.json")
    """
        thirdparty.check_packages(['datasets', 'deepspeed', 'fire', 'numpy', 'peft', 'torch', 'transformers'])
        if not os.path.exists(trainset):
            defatult_path = os.path.join(lazyllm.config['data_path'], trainset)
            if os.path.exists(defatult_path):
                trainset = defatult_path
        if not self.kw['data_path']:
            self.kw['data_path'] = trainset

        run_file_path = os.path.join(self.folder_path, 'alpaca-lora', 'finetune.py')
        cmd = (f'python {run_file_path} '
               f'--base_model={self.base_model} '
               f'--output_dir={self.target_path} '
            )
        cmd += self.kw.parse_kwargs()
        cmd += f' 2>&1 | tee {os.path.join(self.target_path, self.model_name)}_$(date +"%Y-%m-%d_%H-%M-%S").log'

        if self.merge_path:
            run_file_path = os.path.join(self.folder_path, 'alpaca-lora', 'utils', 'merge_weights.py')

            cmd = [cmd,
                   f'python {run_file_path} '
                   f'--base={self.base_model} '
                   f'--adapter={self.target_path} '
                   f'--save_path={self.merge_path} ',
                   f' cp {os.path.join(self.base_model, self.cp_files)} {self.merge_path} '
                ]

        # cmd = 'realpath .'
        return cmd

cmd(trainset, valset=None)

Generate shell command sequence for Alpaca-LoRA fine-tuning and model merging.

Parameters:

  • trainset (str) –

    Path to the training dataset. If the path does not exist as given, it is looked up relative to the configured data_path.

  • valset (str, default: None ) –

    Path to the validation dataset. If not specified, a validation split is taken from trainset.
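
The trainset lookup described above can be sketched as a small standalone function. Here data_root stands in for lazyllm.config['data_path'], and the function name resolve_trainset and the demo file name are hypothetical; the fallback order mirrors the source listing for cmd.

```python
import os
import tempfile


def resolve_trainset(trainset, data_root):
    """Keep `trainset` if it exists as given; otherwise fall back to
    data_root/trainset when that path exists."""
    if not os.path.exists(trainset):
        candidate = os.path.join(data_root, trainset)
        if os.path.exists(candidate):
            return candidate
    return trainset


# Demo with a temporary directory playing the role of the data root.
with tempfile.TemporaryDirectory() as data_root:
    expected = os.path.join(data_root, 'lazyllm_demo_dataset.json')
    with open(expected, 'w') as f:
        f.write('[]')
    resolved = resolve_trainset('lazyllm_demo_dataset.json', data_root)
    missing = resolve_trainset('lazyllm_no_such_file.json', data_root)

print(resolved == expected)
print(missing)
```

A name that exists in neither location is returned unchanged, so any error surfaces later, when the generated command runs.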

Returns:

  • str or list: A single command string when no merge is needed; otherwise a list of three commands: [fine-tune command, merge command, file copy command].

Examples:

>>> from lazyllm import finetune
>>> trainer = finetune.alpacalora('path/to/base/model', 'path/to/target')
>>> cmd = trainer.cmd("my_dataset.json")
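
Because cmd may return either a single string or a list of commands, calling code may want to normalize the result before iterating over it. A minimal sketch, assuming only the return shapes documented above; as_command_list is a hypothetical helper and the command strings are placeholders rather than real generated commands.

```python
def as_command_list(cmd):
    """Normalize the value returned by cmd() to a list of shell commands."""
    return [cmd] if isinstance(cmd, str) else list(cmd)


# Placeholder strings standing in for real generated commands.
single = '<fine-tune command>'
sequence = ['<fine-tune command>', '<merge command>', '<file copy command>']

for step, command in enumerate(as_command_list(sequence), start=1):
    print(f'step {step}: {command.strip()}')
```
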

lazyllm.components.finetune.CollieFinetune

Bases: LazyLLMFinetuneBase

A subclass of LazyLLMFinetuneBase that performs LoRA fine-tuning of large language models using the LoRA fine-tuning capabilities of the Collie framework.

Parameters:

  • base_model (str) –

    Path to the base model for fine-tuning.

  • target_path (str) –

    Path to save LoRA weights of the fine-tuned model.

  • merge_path (Optional[str], default: None ) –

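    Path to save the merged model weights, default None. If not provided, "lazyllm_lora" (LoRA weights) and "lazyllm_merge" (merged model) directories are created under target_path, which is itself resolved against the configured train_target_root.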

  • model_name (Optional[str], default: 'LLM' ) –

    Model name used as log prefix, default "LLM".

  • cp_files (Optional[str], default: 'tokeniz*' ) –

    Configuration files copied from base model path to merge_path, default "tokeniz*".

  • launcher (launcher, default: remote(ngpus=1) ) –

    Launcher for fine-tuning, default launchers.remote(ngpus=1).

  • kw (dict, default: {} ) –

    Keyword arguments to update default training parameters:

Other Parameters:

  • data_path (Optional[str]) –

    Path to dataset, default None.

  • batch_size (Optional[int]) –

    Batch size, default 64.

  • micro_batch_size (Optional[int]) –

    Micro-batch size, default 4.

  • num_epochs (Optional[int]) –

    Number of training epochs, default 3.

  • learning_rate (Optional[float]) –

    Learning rate, default 5.e-4.

  • dp_size (Optional[int]) –

    Data parallelism parameter, default 8.

  • pp_size (Optional[int]) –

    Pipeline parallelism parameter, default 1.

  • tp_size (Optional[int]) –

    Tensor parallelism parameter, default 1.

  • lora_r (Optional[int]) –

    LoRA rank, default 8.

  • lora_alpha (Optional[int]) –

    LoRA scaling factor (alpha), default 16.

  • lora_dropout (Optional[float]) –

    LoRA dropout rate, default 0.05.

  • lora_target_modules (Optional[str]) –

    LoRA target modules, default [wo,wqkv].

  • modules_to_save (Optional[str]) –

    Modules for full fine-tuning, default [tok_embeddings,output].

  • prompt_template_name (Optional[str]) –

    Name of prompt template, default "alpaca".

Examples:

>>> from lazyllm import finetune
>>> trainer = finetune.collie('path/to/base/model', 'path/to/target')
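
When merge_path is omitted, the constructor derives both output directories from target_path. The sketch below reproduces that derivation without touching the file system. The function name resolve_output_dirs and the example paths are hypothetical, and train_target_root stands in for lazyllm.config['train_target_root'].

```python
import os


def resolve_output_dirs(train_target_root, target_path, merge_path=None):
    """Return (lora_dir, merge_dir) following the constructor's logic.

    Unlike the real constructor, this sketch does not create directories.
    """
    if not merge_path:
        save_path = os.path.join(train_target_root, target_path)
        return (os.path.join(save_path, 'lazyllm_lora'),
                os.path.join(save_path, 'lazyllm_merge'))
    # An explicit merge_path leaves both paths as the caller gave them.
    return target_path, merge_path


lora_dir, merge_dir = resolve_output_dirs('train_out', 'my_run')
print(lora_dir)
print(merge_dir)

explicit = resolve_output_dirs('train_out', 'my_run/lora', 'my_run/merged')
print(explicit)
```

Note that os.path.join discards earlier components when a later one is absolute, so an absolute target_path is used as-is rather than being placed under the root.
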
Source code in lazyllm/components/finetune/collie.py
class CollieFinetune(LazyLLMFinetuneBase):
    """This class is a subclass of ``LazyLLMFinetuneBase``, based on the LoRA fine-tuning capabilities provided by the [Collie](https://github.com/OpenLMLab/collie) framework, used for LoRA fine-tuning of large language models.

Args:
    base_model (str): Path to the base model for fine-tuning.
    target_path (str): Path to save LoRA weights of the fine-tuned model.
    merge_path (Optional[str]): Path to save merged LoRA weights, default ``None``.
        If not provided, "lazyllm_lora" and "lazyllm_merge" directories are created under ``target_path``.
    model_name (Optional[str]): Model name used as log prefix, default "LLM".
    cp_files (Optional[str]): Configuration files copied from base model path to ``merge_path``, default "tokeniz*".
    launcher (lazyllm.launcher): Launcher for fine-tuning, default ``launchers.remote(ngpus=1)``.
    kw (dict): Keyword arguments to update default training parameters:

Keyword Args:
    data_path (Optional[str]): Path to dataset, default ``None``.
    batch_size (Optional[int]): Batch size, default 64.
    micro_batch_size (Optional[int]): Micro-batch size, default 4.
    num_epochs (Optional[int]): Number of training epochs, default 3.
    learning_rate (Optional[float]): Learning rate, default 5.e-4.
    dp_size (Optional[int]): Data parallelism parameter, default 8.
    pp_size (Optional[int]): Pipeline parallelism parameter, default 1.
    tp_size (Optional[int]): Tensor parallelism parameter, default 1.
    lora_r (Optional[int]): LoRA rank, default 8.
    lora_alpha (Optional[int]): LoRA scaling factor (alpha), default 16.
    lora_dropout (Optional[float]): LoRA dropout rate, default 0.05.
    lora_target_modules (Optional[str]): LoRA target modules, default ``[wo,wqkv]``.
    modules_to_save (Optional[str]): Modules for full fine-tuning, default ``[tok_embeddings,output]``.
    prompt_template_name (Optional[str]): Name of prompt template, default "alpaca".


Examples:
    >>> from lazyllm import finetune
    >>> trainer = finetune.collie('path/to/base/model', 'path/to/target')
    """
    defatult_kw = ArgsDict({
        'data_path': None,
        'batch_size': 64,
        'micro_batch_size': 4,
        'num_epochs': 3,
        'learning_rate': 5.e-4,
        'dp_size': 8,
        'pp_size': 1,
        'tp_size': 1,
        'lora_r': 8,
        'lora_alpha': 16,
        'lora_dropout': 0.05,
        'lora_target_modules': '[wo,wqkv]',
        'modules_to_save': '[tok_embeddings,output]',
        'prompt_template_name': 'alpaca',
    })
    auto_map = {
        'ddp': 'dp_size',
        'micro_batch_size': 'micro_batch_size',
        'tp': 'tp_size',
    }

    def __init__(self,
                 base_model,
                 target_path,
                 merge_path=None,
                 model_name='LLM',
                 cp_files='tokeniz*',
                 launcher=launchers.remote(ngpus=1),  # noqa B008
                 **kw
                 ):
        if not merge_path:
            save_path = os.path.join(lazyllm.config['train_target_root'], target_path)
            target_path, merge_path = os.path.join(save_path, 'lazyllm_lora'), os.path.join(save_path, 'lazyllm_merge')
            os.makedirs(target_path, exist_ok=True)
            os.makedirs(merge_path, exist_ok=True)
        super().__init__(
            base_model,
            target_path,
            launcher=launcher,
        )
        self.folder_path = os.path.dirname(os.path.abspath(__file__))
        self.kw = copy.deepcopy(self.defatult_kw)
        self.kw.check_and_update(kw)
        self.merge_path = merge_path
        self.cp_files = cp_files
        self.model_name = model_name

    def cmd(self, trainset, valset=None) -> str:
        thirdparty.check_packages(['numpy', 'peft', 'torch', 'transformers'])
        if not os.path.exists(trainset):
            defatult_path = os.path.join(lazyllm.config['data_path'], trainset)
            if os.path.exists(defatult_path):
                trainset = defatult_path
        if not self.kw['data_path']:
            self.kw['data_path'] = trainset

        run_file_path = os.path.join(self.folder_path, 'collie', 'finetune.py')
        cmd = (f'python {run_file_path} '
               f'--base_model={self.base_model} '
               f'--output_dir={self.target_path} '
            )
        cmd += self.kw.parse_kwargs()
        cmd += f' 2>&1 | tee {os.path.join(self.target_path, self.model_name)}_$(date +"%Y-%m-%d_%H-%M-%S").log'

        if self.merge_path:
            run_file_path = os.path.join(self.folder_path, 'alpaca-lora', 'utils', 'merge_weights.py')

            cmd = [cmd,
                   f'python {run_file_path} '
                   f'--base={self.base_model} '
                   f'--adapter={self.target_path} '
                   f'--save_path={self.merge_path} ',
                   f' cp {os.path.join(self.base_model, self.cp_files)} {self.merge_path} '
                ]

        return cmd

lazyllm.components.finetune.LlamafactoryFinetune

Bases: LazyLLMFinetuneBase

A subclass of LazyLLMFinetuneBase that trains large language models (or visual language models) using the training capabilities of the LLaMA-Factory framework.

Parameters:

  • base_model

    Path to the base model used for training. Supports local paths; if the path does not exist, it will attempt to locate it from the configured model directory.

  • target_path

    Target directory to save model weights after training is completed.

  • merge_path (str, default: None ) –

    Path to save the model after merging LoRA weights. Defaults to None. If not specified, two directories are created automatically under target_path: "lazyllm_lora" (for the LoRA fine-tuned weights) and "lazyllm_merge" (for the merged model weights).

  • config_path (str, default: None ) –

    Path to the YAML file containing training configuration. Defaults to None. If not specified, the default config file llama_factory/sft.yaml will be used. This file can override default training parameters.

  • export_config_path (str, default: None ) –

    Path to the YAML file for LoRA weight export/merging configuration. Defaults to None. If not specified, the default config file llama_factory/lora_export.yaml will be used.

  • lora_r (int, default: None ) –

    Rank of the LoRA adaptation. If provided, overrides the lora_rank value in the configuration.

  • modules_to_save (str, default: None ) –

    List of additional module names to be saved. Should be provided as a string in Python list format, e.g., "[module1, module2]".

  • lora_target_modules (str, default: None ) –

    List of module names to apply LoRA fine-tuning to. Format is the same as above.

  • launcher (launcher, default: remote(ngpus=1, sync=True) ) –

    Launcher for the fine-tuning task. Defaults to a single-GPU, synchronous remote launcher: launchers.remote(ngpus=1, sync=True).

  • **kw

    Additional keyword arguments used to dynamically override default parameters in the training configuration.

Other Parameters:

  • stage (Literal['pt', 'sft', 'rm', 'ppo', 'dpo', 'kto']) –

    Default is: sft. Which stage will be performed in training.

  • do_train (bool) –

    Default is: True. Whether to run training.

  • finetuning_type (Literal['lora', 'freeze', 'full']) –

    Default is: lora. Which fine-tuning method to use.

  • lora_target (str) –

    Default is: all. Name(s) of target modules to apply LoRA. Use commas to separate multiple modules. Use all to specify all the linear modules.

  • template (Optional[str]) –

    Default is: None. Which template to use for constructing prompts in training and inference.

  • cutoff_len (int) –

    Default is: 1024. The cutoff length of the tokenized inputs in the dataset.

  • max_samples (Optional[int]) –

    Default is: 1000. For debugging purposes, truncate the number of examples for each dataset.

  • overwrite_cache (bool) –

    Default is: True. Overwrite the cached training and evaluation sets.

  • preprocessing_num_workers (Optional[int]) –

    Default is: 16. The number of processes to use for the pre-processing.

  • dataset_dir (str) –

    Default is: lazyllm_temp_dir. Path to the folder containing the datasets. If not explicitly specified, LazyLLM will generate a dataset_info.json file in the .temp folder in the current working directory for use by LLaMA-Factory.

  • logging_steps (float) –

    Default is: 10. Log every X update steps. Should be an integer or a float in range [0,1). If smaller than 1, it is interpreted as a ratio of the total training steps.

  • save_steps (float) –

    Default is: 500. Save a checkpoint every X update steps. Should be an integer or a float in range [0,1). If smaller than 1, it is interpreted as a ratio of the total training steps.

  • plot_loss (bool) –

    Default is: True. Whether or not to save the training loss curves.

  • overwrite_output_dir (bool) –

    Default is: True. Overwrite the content of the output directory.

  • per_device_train_batch_size (int) –

    Default is: 1. Batch size per GPU/TPU/MPS/NPU core/CPU for training.

  • gradient_accumulation_steps (int) –

    Default is: 8. Number of update steps to accumulate before performing a backward/update pass.

  • learning_rate (float) –

    Default is: 1e-04. The initial learning rate for AdamW.

  • num_train_epochs (float) –

    Default is: 3.0. Total number of training epochs to perform.

  • lr_scheduler_type (Union[SchedulerType, str]) –

    Default is: cosine. The scheduler type to use.

  • warmup_ratio (float) –

    Default is: 0.1. Linear warmup over warmup_ratio fraction of total steps.

  • fp16 (bool) –

    Default is: True. Whether to use fp16 (mixed) precision instead of 32-bit.

  • ddp_timeout (Optional[int]) –

    Default is: 180000000. Overrides the default timeout for distributed training (value should be given in seconds).

  • report_to (Union[NoneType, str, List[str]]) –

    Default is: tensorboard. The list of integrations to report the results and logs to.

  • val_size (float) –

    Default is: 0.1. Size of the development set, should be an integer or a float in range [0,1).

  • per_device_eval_batch_size (int) –

    Default is: 1. Batch size per GPU/TPU/MPS/NPU core/CPU for evaluation.

  • eval_strategy (Union[IntervalStrategy, str]) –

    Default is: steps. The evaluation strategy to use.

  • eval_steps (Optional[float]) –

    Default is: 500. Run an evaluation every X steps. Should be an integer or a float in range [0,1). If smaller than 1, will be interpreted as ratio of total training steps.
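
Several of the step parameters above (logging_steps, save_steps, eval_steps) accept either an absolute step count or a fraction of the total training steps. A minimal sketch of that interpretation follows; the function name resolve_step_interval is hypothetical, and rounding the fractional case up is an assumption of this sketch.

```python
import math


def resolve_step_interval(value, total_steps):
    """Turn a logging/save/eval step setting into an absolute step count.

    Values below 1 are read as a fraction of `total_steps`; rounding up
    is an assumption made here for illustration.
    """
    if value < 1:
        return math.ceil(total_steps * value)
    return int(value)


print(resolve_step_interval(500, 2000))   # absolute count, used as-is
print(resolve_step_interval(0.25, 2000))  # a quarter of the total steps
```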

Examples:

>>> from lazyllm import finetune
>>> trainer = finetune.llamafactory('internlm2-chat-7b', 'path/to/target')
>>> trainer
<lazyllm.llm.finetune type=LlamafactoryFinetune>
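
The per_device_train_batch_size and gradient_accumulation_steps parameters jointly determine how many samples contribute to each optimizer update. A short arithmetic sketch, assuming plain data parallelism across num_devices; the helper name effective_batch_size and the device counts are illustrative only.

```python
def effective_batch_size(per_device_train_batch_size,
                         gradient_accumulation_steps,
                         num_devices=1):
    """Samples consumed per optimizer update under data parallelism."""
    return (per_device_train_batch_size
            * gradient_accumulation_steps
            * num_devices)


# Documented defaults (1 and 8) on a single GPU.
print(effective_batch_size(1, 8))
# Same settings spread over four devices.
print(effective_batch_size(1, 8, num_devices=4))
```
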
Source code in lazyllm/components/finetune/llamafactory.py
class LlamafactoryFinetune(LazyLLMFinetuneBase):
    """This class is a subclass of ``LazyLLMFinetuneBase``, based on the training capabilities provided by the [LLaMA-Factory](https://github.com/hiyouga/LLaMA-Factory) framework, used for training large language models(or visual language models).

Args:
    base_model: Path to the base model used for training. Supports local paths; if the path does not exist, it will attempt to locate it from the configured model directory.
    target_path: Target directory to save model weights after training is completed.
    merge_path (str, optional): Path to save the model after merging LoRA weights. Defaults to None.
        If not specified, two directories will be automatically created under ``target_path``:
        - "lazyllm_lora" (for storing LoRA fine-tuned weights)
        - "lazyllm_merge" (for storing the merged model weights)
    config_path (str, optional): Path to the YAML file containing training configuration. Defaults to None.
        If not specified, the default config file ``llama_factory/sft.yaml`` will be used.
        This file can override default training parameters.
    export_config_path (str, optional): Path to the YAML file for LoRA weight export/merging configuration. Defaults to None.
        If not specified, the default config file ``llama_factory/lora_export.yaml`` will be used.
    lora_r (int, optional): Rank of the LoRA adaptation. If provided, overrides the ``lora_rank`` value in the configuration.
    modules_to_save (str, optional): List of additional module names to be saved. Should be provided as a string in Python list format, e.g., "[module1, module2]".
    lora_target_modules (str, optional): List of module names to apply LoRA fine-tuning to. Format is the same as above.
    launcher (lazyllm.launcher, optional): Launcher for the fine-tuning task. Defaults to a single-GPU, synchronous remote launcher: ``launchers.remote(ngpus=1, sync=True)``.
    **kw: Additional keyword arguments used to dynamically override default parameters in the training configuration.

Keyword Args:
    stage (typing.Literal['pt', 'sft', 'rm', 'ppo', 'dpo', 'kto']): Default is: ``sft``. Which stage will be performed in training.
    do_train (bool): Default is: ``True``. Whether to run training.
    finetuning_type (typing.Literal['lora', 'freeze', 'full']): Default is: ``lora``. Which fine-tuning method to use.
    lora_target (str): Default is: ``all``. Name(s) of target modules to apply LoRA. Use commas to separate multiple modules. Use `all` to specify all the linear modules.
    template (typing.Optional[str]): Default is: ``None``. Which template to use for constructing prompts in training and inference.
    cutoff_len (int): Default is: ``1024``. The cutoff length of the tokenized inputs in the dataset.
    max_samples (typing.Optional[int]): Default is: ``1000``. For debugging purposes, truncate the number of examples for each dataset.
    overwrite_cache (bool): Default is: ``True``. Overwrite the cached training and evaluation sets.
    preprocessing_num_workers (typing.Optional[int]): Default is: ``16``. The number of processes to use for the pre-processing.
    dataset_dir (str): Default is: ``lazyllm_temp_dir``. Path to the folder containing the datasets. If not explicitly specified, LazyLLM will generate a ``dataset_info.json`` file in the ``.temp`` folder in the current working directory for use by LLaMA-Factory.
    logging_steps (float): Default is: ``10``. Log every X update steps. Should be an integer or a float in range ``[0,1)``. If smaller than 1, it is interpreted as a ratio of the total training steps.
    save_steps (float): Default is: ``500``. Save a checkpoint every X update steps. Should be an integer or a float in range ``[0,1)``. If smaller than 1, it is interpreted as a ratio of the total training steps.
    plot_loss (bool): Default is: ``True``. Whether or not to save the training loss curves.
    overwrite_output_dir (bool): Default is: ``True``. Overwrite the content of the output directory.
    per_device_train_batch_size (int): Default is: ``1``. Batch size per GPU/TPU/MPS/NPU core/CPU for training.
    gradient_accumulation_steps (int): Default is: ``8``. Number of update steps to accumulate before performing a backward/update pass.
    learning_rate (float): Default is: ``1e-04``. The initial learning rate for AdamW.
    num_train_epochs (float): Default is: ``3.0``. Total number of training epochs to perform.
    lr_scheduler_type (typing.Union[transformers.trainer_utils.SchedulerType, str]): Default is: ``cosine``. The scheduler type to use.
    warmup_ratio (float): Default is: ``0.1``. Linear warmup over warmup_ratio fraction of total steps.
    fp16 (bool): Default is: ``True``. Whether to use fp16 (mixed) precision instead of 32-bit.
    ddp_timeout (typing.Optional[int]): Default is: ``180000000``. Overrides the default timeout for distributed training (value should be given in seconds).
    report_to (typing.Union[NoneType, str, typing.List[str]]): Default is: ``tensorboard``. The list of integrations to report the results and logs to.
    val_size (float): Default is: ``0.1``. Size of the development set, should be an integer or a float in range `[0,1)`.
    per_device_eval_batch_size (int): Default is: ``1``. Batch size per GPU/TPU/MPS/NPU core/CPU for evaluation.
    eval_strategy (typing.Union[transformers.trainer_utils.IntervalStrategy, str]): Default is: ``steps``. The evaluation strategy to use.
    eval_steps (typing.Optional[float]): Default is: ``500``. Run an evaluation every X steps. Should be an integer or a float in range `[0,1)`. If smaller than 1, will be interpreted as ratio of total training steps.


Examples:
    >>> from lazyllm import finetune
    >>> trainer = finetune.llamafactory('internlm2-chat-7b', 'path/to/target')
    <lazyllm.llm.finetune type=LlamafactoryFinetune>
    """
    auto_map = {
        'gradient_step': 'gradient_accumulation_steps',
        'micro_batch_size': 'per_device_train_batch_size',
    }

    def __init__(self,
                 base_model,
                 target_path,
                 merge_path=None,
                 config_path=None,
                 export_config_path=None,
                 lora_r=None,
                 modules_to_save=None,
                 lora_target_modules=None,
                 launcher=launchers.remote(ngpus=1, sync=True),  # noqa B008
                 **kw
                 ):
        if not os.path.exists(base_model):
            default_path = os.path.join(lazyllm.config['model_path'], base_model)
            if os.path.exists(default_path):
                base_model = default_path
        if not merge_path:
            normalized_target = os.path.normpath(target_path)
            if normalized_target.endswith('lazyllm_lora'):
                merge_path = normalized_target.replace('lazyllm_lora', 'lazyllm_merge')
            else:
                save_path = os.path.join(lazyllm.config['train_target_root'], target_path)
                target_path = os.path.join(save_path, 'lazyllm_lora')
                merge_path = os.path.join(save_path, 'lazyllm_merge')
            os.makedirs(target_path, exist_ok=True)
            os.makedirs(merge_path, exist_ok=True)
        super().__init__(
            base_model,
            target_path,
            launcher=launcher,
        )
        self.merge_path = merge_path
        self.temp_yaml_file = None
        self.temp_export_yaml_file = None
        self.config_path = config_path
        self.export_config_path = export_config_path
        self.config_folder_path = os.path.dirname(os.path.abspath(__file__))

        default_config_path = os.path.join(self.config_folder_path, 'llama_factory', 'sft.yaml')
        self.template_dict = ArgsDict(self._load_yaml(default_config_path))

        if self.config_path:
            self.template_dict.update(self._load_yaml(self.config_path))

        if lora_r:
            self.template_dict['lora_rank'] = lora_r
        if modules_to_save:
            self.template_dict['additional_target'] = modules_to_save.strip('[]')
        if lora_target_modules:
            self.template_dict['lora_target'] = lora_target_modules.strip('[]')
        self.template_dict['model_name_or_path'] = base_model
        self.template_dict['output_dir'] = target_path
        self.template_dict['template'] = self._get_template_name(base_model)

        # Filter kw to only include keys that exist in template_dict
        # This ensures check_and_update won't fail due to unexpected keys
        # Keys not in template_dict will be silently ignored (they may be used elsewhere)
        ignored_keys = set(kw.keys()) - set(self.template_dict.keys())
        if ignored_keys:
            lazyllm.LOG.info(f'Ignored parameters not in template_dict: {sorted(ignored_keys)}')
        filtered_kw = {k: v for k, v in kw.items() if k in self.template_dict}

        self.template_dict.check_and_update(filtered_kw)

        default_export_config_path = os.path.join(self.config_folder_path, 'llama_factory', 'lora_export.yaml')
        self.export_dict = ArgsDict(self._load_yaml(default_export_config_path))

        if self.export_config_path:
            self.export_dict.update(self._load_yaml(self.export_config_path))

        self.export_dict['model_name_or_path'] = base_model
        self.export_dict['adapter_name_or_path'] = target_path
        self.export_dict['export_dir'] = merge_path
        self.export_dict['template'] = self.template_dict['template']

        self.temp_folder = os.path.join(lazyllm.config['temp_dir'], 'llamafactory_config', str(uuid.uuid4())[:10])
        if not os.path.exists(self.temp_folder):
            os.makedirs(self.temp_folder)
        self.log_file_path = None

    def _get_template_name(self, base_model):
        base_name = os.path.basename(base_model).lower()
        key_value = match_longest_prefix(base_name)
        if key_value:
            return key_value
        else:
            raise RuntimeError(f'Cannot find prefix of base_model({base_model}) '
                               f'in DEFAULT_TEMPLATE of LLaMA_Factory: {llamafactory_mapping_dict}')

    def _load_yaml(self, config_path):
        with open(config_path, 'r') as file:
            config_dict = yaml.safe_load(file)
        return config_dict

    def _build_temp_yaml(self, updated_template_str, prefix='train_'):
        fd, temp_yaml_file = tempfile.mkstemp(prefix=prefix, suffix='.yaml', dir=self.temp_folder)
        with os.fdopen(fd, 'w') as temp_file:
            temp_file.write(updated_template_str)
        return temp_yaml_file

    def _build_temp_dataset_info(self, datapaths, stage=None):  # noqa C901
        """
        Build dataset_info.json based on training stage and dataset format.
        """
        if isinstance(datapaths, str):
            datapaths = [datapaths]
        elif isinstance(datapaths, list) and all(isinstance(item, str) for item in datapaths):
            pass
        else:
            raise TypeError(f'datapaths({datapaths}) should be str or list of str.')

        if stage is None:
            stage = self.template_dict.get('stage', 'sft').lower()

        supported_stages = ['sft', 'pt', 'dpo']
        if stage not in supported_stages:
            raise ValueError(
                f'Unsupported training stage: {stage}. '
                f'Only supported stages are: {", ".join(supported_stages)}'
            )

        temp_dataset_dict = dict()
        for datapath in datapaths:
            datapath = os.path.join(lazyllm.config['data_path'], datapath)
            assert os.path.isfile(datapath)
            file_name, _ = os.path.splitext(os.path.basename(datapath))
            temp_dataset_dict[file_name] = {'file_name': datapath}

            formatting = None
            first_item = None

            if stage == 'pt':
                formatting = None
                try:
                    with open(datapath, 'r', encoding='utf-8') as file:
                        first_bytes = file.read(1024)
                        file.seek(0)

                        if not first_bytes.strip().startswith(('[', '{')):
                            lines = file.readlines()
                            if not lines:
                                raise ValueError(f'PT stage: Dataset file {datapath} is empty')

                            first_item = {'text': lines[0].strip() if lines else ''}
                            lazyllm.LOG.info(
                                f'PT stage: Dataset {file_name} detected as plain text format '
                                f'({len(lines)} lines). LLaMA-Factory will handle conversion.'
                            )
                        else:
                            try:
                                data = json.load(file)
                                if isinstance(data, list):
                                    if not data:
                                        raise ValueError(f'PT stage: Dataset file {datapath} is empty (empty list)')
                                    first_item = data[0]
                                elif isinstance(data, dict):
                                    first_item = data
                                else:
                                    raise ValueError(
                                        f'PT stage: Dataset file {datapath} has invalid JSON structure. '
                                        f'Expected list or dict, got {type(data).__name__}'
                                    )
                                lazyllm.LOG.info(
                                    f'PT stage: Dataset {file_name} detected as JSON format. '
                                    f'Looking for "text" field.'
                                )
                            except json.JSONDecodeError as json_err:
                                raise ValueError(
                                    f'PT stage: Dataset file {datapath} is neither valid plain text nor valid JSON. '
                                    f'JSON parse error: {str(json_err)}'
                                )

                    if not first_item:
                        raise ValueError(f'PT stage: Failed to extract first item from dataset {datapath}')

                    self._build_alpaca_dataset_info(
                        temp_dataset_dict, file_name, first_item, stage
                    )

                except Exception as e:
                    error_msg = (
                        f'PT stage: Failed to process dataset {file_name} from {datapath}. '
                        f'Error: {str(e)}. '
                        f'PT mode requires either: '
                        f'(1) Plain text format (one text per line), or '
                        f'(2) JSON format with "text" field in each object.'
                    )
                    lazyllm.LOG.error(error_msg)
                    raise ValueError(error_msg) from e
            else:
                formatting = 'alpaca'
                try:
                    with open(datapath, 'r', encoding='utf-8') as file:
                        data = json.load(file)
                    if not data:
                        raise ValueError(f'Dataset file {datapath} is empty')

                    first_item = data[0]

                    if 'messages' in first_item:
                        formatting = 'sharegpt'
                        self._build_sharegpt_dataset_info(
                            temp_dataset_dict, file_name, first_item, stage
                        )
                    else:
                        self._build_alpaca_dataset_info(
                            temp_dataset_dict, file_name, first_item, stage
                        )

                except Exception as e:
                    lazyllm.LOG.warning(
                        f'Failed to analyze dataset {datapath} for stage {stage}: {e}. '
                        f'Using default formatting.'
                    )

            if formatting is not None:
                temp_dataset_dict[file_name].update({'formatting': formatting})

        self.temp_dataset_info_path = os.path.join(self.temp_folder, 'dataset_info.json')
        with open(self.temp_dataset_info_path, 'w') as json_file:
            json.dump(temp_dataset_dict, json_file, indent=4)
        return self.temp_dataset_info_path, ','.join(temp_dataset_dict.keys())

    def _build_alpaca_dataset_info(self, dataset_dict, file_name, first_item, stage):  # noqa C901
        """
        Build dataset info for Alpaca format based on training stage.
        """
        columns = {}
        ranking = False

        media_types = []
        for media in ['images', 'videos', 'audios']:
            if media in first_item:
                media_types.append(media)

        if stage == 'pt':
            if 'text' in first_item:
                columns['prompt'] = 'text'
            else:
                if 'instruction' in first_item:
                    columns['prompt'] = 'instruction'
                elif 'output' in first_item:
                    columns['prompt'] = 'output'
                else:
                    lazyllm.LOG.warning(
                        f'PT stage: No "text" field found in dataset {file_name}, '
                        f'using "instruction" or "output" as fallback'
                    )
                    columns['prompt'] = 'instruction' if 'instruction' in first_item else 'output'

        elif stage == 'dpo':
            ranking = True
            if 'chosen' in first_item and 'rejected' in first_item:
                columns['prompt'] = 'instruction' if 'instruction' in first_item else None
                columns['query'] = 'input' if 'input' in first_item else None
                columns['chosen'] = 'chosen'
                columns['rejected'] = 'rejected'
                columns = {k: v for k, v in columns.items() if v is not None}
            else:
                raise ValueError(
                    f'DPO stage requires "chosen" and "rejected" fields in dataset, '
                    f'but found: {list(first_item.keys())}'
                )

        elif stage == 'sft':
            if 'instruction' in first_item:
                columns['prompt'] = 'instruction'
            if 'input' in first_item:
                columns['query'] = 'input'
            if 'output' in first_item:
                columns['response'] = 'output'
            if 'system' in first_item:
                columns['system'] = 'system'
            if 'history' in first_item:
                columns['history'] = 'history'
        else:
            raise ValueError(f'Unsupported stage: {stage}. Only sft, pt, dpo are supported.')

        if media_types:
            multimodal_columns = {item: item for item in media_types}
            multimodal_columns.update(columns)
            columns = multimodal_columns

        update_dict = {'columns': columns}
        if ranking:
            update_dict['ranking'] = True

        dataset_dict[file_name].update(update_dict)

    def _build_sharegpt_dataset_info(self, dataset_dict, file_name, first_item, stage):  # noqa C901
        """
        Build dataset info for ShareGPT format based on training stage.
        """
        columns = {}
        ranking = False

        media_types = []
        for media in ['images', 'videos', 'audios']:
            if media in first_item:
                media_types.append(media)

        if stage == 'dpo':
            ranking = True
            if 'chosen' in first_item and 'rejected' in first_item:
                columns['messages'] = 'conversations' if 'conversations' in first_item else 'messages'
                columns['chosen'] = 'chosen'
                columns['rejected'] = 'rejected'
            else:
                raise ValueError(
                    f'DPO stage requires "chosen" and "rejected" fields in dataset, '
                    f'but found: {list(first_item.keys())}'
                )
        elif stage == 'sft':
            columns['messages'] = 'conversations' if 'conversations' in first_item else 'messages'
            if 'system' in first_item:
                columns['system'] = 'system'
            if 'tools' in first_item:
                columns['tools'] = 'tools'

            if 'messages' in first_item and isinstance(first_item['messages'], list):
                if len(first_item['messages']) > 0:
                    msg = first_item['messages'][0]
                    if 'role' in msg and 'content' in msg:
                        dataset_dict[file_name].update({
                            'tags': {
                                'role_tag': 'role',
                                'content_tag': 'content',
                                'user_tag': 'user',
                                'assistant_tag': 'assistant',
                                'system_tag': 'system'
                            }
                        })
        else:
            raise ValueError(f'Unsupported stage: {stage}. Only sft, pt, dpo are supported.')

        if media_types:
            multimodal_columns = {item: item for item in media_types}
            multimodal_columns.update(columns)
            columns = multimodal_columns

        update_dict = {'columns': columns}
        if ranking:
            update_dict['ranking'] = True

        dataset_dict[file_name].update(update_dict)

    def _rm_temp_yaml(self):
        if self.temp_yaml_file:
            if os.path.exists(self.temp_yaml_file):
                os.remove(self.temp_yaml_file)
            self.temp_yaml_file = None

    def cmd(self, trainset, valset=None) -> str:
        """Generate LLaMA-Factory fine-tuning command sequence, including training and model merge commands.

Args:
    trainset (str): Training dataset path (supports relative path to lazyllm.config['data_path'])
    valset (str, optional): Validation dataset path (not directly used in current implementation)

**Returns:**

- str: Complete shell command string containing:
    - Training command (with auto-configured parameters)
    - Log redirection (saved to target path)
    - Optional model merge command (when LoRA is configured)

Notes:
    - Automatically generates timestamped training log files
    - Temporary files are automatically cleaned up after use
    - Supports multiple data formats (alpaca/sharegpt etc.)
    - Multimodal data (images/videos/audios) is automatically detected and handled
"""
        thirdparty.check_packages(['datasets', 'deepspeed', 'numpy', 'peft', 'torch', 'transformers', 'trl'])
        if 'dataset_dir' in self.template_dict and self.template_dict['dataset_dir'] == 'lazyllm_temp_dir':
            stage = self.template_dict.get('stage', 'sft')
            _, datasets = self._build_temp_dataset_info(trainset, stage=stage)
            self.template_dict['dataset_dir'] = self.temp_folder
        else:
            datasets = trainset
        self.template_dict['dataset'] = datasets

        if self.template_dict['finetuning_type'] == 'lora':
            # For LoRA/QLoRA: use llamafactory-cli export to merge adapter with base model
            # For Full finetuning: model copy handling is done in cmds below
            updated_export_str = yaml.dump(dict(self.export_dict), default_flow_style=False)
            self.temp_export_yaml_file = self._build_temp_yaml(updated_export_str, prefix='merge_')

        updated_template_str = yaml.dump(dict(self.template_dict), default_flow_style=False)
        self.temp_yaml_file = self._build_temp_yaml(updated_template_str)

        formatted_date = datetime.now().strftime('%Y-%m-%d_%H-%M-%S')
        random_value = random.randint(1000, 9999)
        self.log_file_path = f'{self.target_path}/train_log_{formatted_date}_{random_value}.log'

        # Use bash instead of sh to support pipefail
        # This ensures export only runs if training succeeds
        cmds = (f'export DISABLE_VERSION_CHECK=1 && bash -c "set -o pipefail && '
                f'llamafactory-cli train {self.temp_yaml_file} 2>&1 | '
                f'tee {self.log_file_path}"')
        if self.temp_export_yaml_file:
            # For LoRA/QLoRA: merge adapter with base model
            # Only run export if training succeeds (exit code 0)
            # With pipefail, tee will preserve the exit code from llamafactory-cli
            cmds += f' && llamafactory-cli export {self.temp_export_yaml_file}'
        elif self.template_dict['finetuning_type'] == 'full':
            # For Full finetuning: copy model from lazyllm_lora to lazyllm_merge
            # This maintains consistency with LoRA/QLoRA workflow
            # Only copy if training succeeds (exit code 0)
            # Only copy model files, exclude training process information (checkpoints, logs, etc.)
            # to save storage space since lazyllm_merge is only used for exporting to models directory
            exclude_patterns = [
                '--exclude=checkpoint-*',  # Exclude checkpoint directories
                '--exclude=train_log_*.log',  # Exclude training logs
                '--exclude=trainer_state.json',  # Exclude trainer state
                '--exclude=trainer_log.jsonl',  # Exclude trainer log
                '--exclude=train_results.json',  # Exclude training results
                '--exclude=all_results.json',  # Exclude all results
                '--exclude=eval_results.json',  # Exclude evaluation results
                '--exclude=training_loss.png',  # Exclude training loss plot
                '--exclude=runs/',  # Exclude tensorboard logs
                '--exclude=training_args.bin',  # Exclude training arguments
            ]
            exclude_str = ' '.join(exclude_patterns)
            cmds += f' && rsync -a {exclude_str} {self.target_path}/ {self.merge_path}/ 2>/dev/null || true'
        return cmds
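
The constructor's handling of `target_path` and `merge_path` can be hard to follow from the listing alone. The sketch below restates just that rule as a standalone function. The name `derive_paths` is hypothetical, `train_target_root` stands in for `lazyllm.config['train_target_root']`, and directory creation is left out.

```python
import os


def derive_paths(target_path, train_target_root, merge_path=None):
    """Return (target_path, merge_path) following the rule in __init__ (sketch only)."""
    if merge_path:
        # An explicit merge_path leaves both paths untouched.
        return target_path, merge_path
    normalized = os.path.normpath(target_path)
    if normalized.endswith('lazyllm_lora'):
        # The merge directory becomes a sibling named lazyllm_merge.
        return target_path, normalized.replace('lazyllm_lora', 'lazyllm_merge')
    # Otherwise both directories are created under the configured root.
    save_path = os.path.join(train_target_root, target_path)
    return (os.path.join(save_path, 'lazyllm_lora'),
            os.path.join(save_path, 'lazyllm_merge'))


print(derive_paths('exp1', 'root'))
```

Note that in the second branch the listing keeps the caller's `target_path` as given and only normalizes the value used to build `merge_path`.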

cmd(trainset, valset=None)

Generate LLaMA-Factory fine-tuning command sequence, including training and model merge commands.

Parameters:

  • trainset (str) –

    Training dataset path (supports relative path to lazyllm.config['data_path'])

  • valset (str, default: None ) –

    Validation dataset path (not directly used in current implementation)

Returns:

  • str: Complete shell command string containing:
    • Training command (with auto-configured parameters)
    • Log redirection (saved to target path)
    • Optional model merge command (when LoRA is configured)
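
The shape of the returned command string can be pictured with a small sketch. It restates only the training step and the LoRA merge step from the source listing. The function name `build_commands` and the file names are hypothetical, and the real method also generates the YAML files and the log path itself.

```python
def build_commands(train_yaml, log_path, export_yaml=None):
    """Assemble the shell command string (simplified sketch of the LoRA branch)."""
    # bash with pipefail, so a failing `llamafactory-cli train` is not masked by `tee`.
    cmds = ('export DISABLE_VERSION_CHECK=1 && bash -c "set -o pipefail && '
            f'llamafactory-cli train {train_yaml} 2>&1 | tee {log_path}"')
    if export_yaml:
        # The merge step is chained with && and so only runs if training succeeds.
        cmds += f' && llamafactory-cli export {export_yaml}'
    return cmds


print(build_commands('train_x.yaml', 'out/train.log', 'merge_x.yaml'))
```

Chaining with `&&` after a `pipefail` pipeline is what keeps the export step from running against a failed training job.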
Notes
  • Automatically generates timestamped training log files
  • Temporary files are automatically cleaned up after use
  • Supports multiple data formats (alpaca/sharegpt etc.)
  • Multimodal data (images/videos/audios) is automatically detected and handled
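
As a rough illustration of the data-format note above, the sketch below shows how the first record of an SFT dataset could be mapped to a `dataset_info.json`-style entry. It is a simplified restatement of the source listing, not the library's API: the function name `describe_sft_dataset` is hypothetical, and multimodal columns, ShareGPT role tags, and the `pt`/`dpo` stages are omitted.

```python
def describe_sft_dataset(first_item):
    """Return a dataset_info-style entry for one SFT record (simplified sketch)."""
    if 'messages' in first_item:
        # A "messages" key marks the record as ShareGPT-style.
        columns = {'messages': 'messages'}
        for key in ('system', 'tools'):
            if key in first_item:
                columns[key] = key
        return {'formatting': 'sharegpt', 'columns': columns}
    # Otherwise treat it as Alpaca-style and map the fields that are present.
    mapping = {'instruction': 'prompt', 'input': 'query', 'output': 'response',
               'system': 'system', 'history': 'history'}
    columns = {target: source for source, target in mapping.items() if source in first_item}
    return {'formatting': 'alpaca', 'columns': columns}


print(describe_sft_dataset({'instruction': 'Translate', 'input': 'hello', 'output': 'bonjour'}))
```

Only the first record is inspected, so a dataset whose records have inconsistent keys would be described by whatever the first one contains.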

lazyllm.components.deploy.LazyLLMDeployBase

Bases: ComponentBase

This class is a subclass of ComponentBase that provides basic functionality for LazyLLM deployment. It supports encoding conversion for various media types and provides configuration options for result extraction and streaming processing.

Parameters:

  • launcher (LauncherBase, default: remote() ) –

    Launcher instance for deployment, defaults to remote launcher (launchers.remote()).

Notes
  • Need to implement specific deployment logic when inheriting this class
  • Can customize result extraction logic by overriding the extract_result method

Examples:

>>> import lazyllm
>>> from lazyllm.components.deploy.base import LazyLLMDeployBase
>>> class MyDeployer(LazyLLMDeployBase):
...     def __call__(self, inputs):
...         return processed_result
...     @staticmethod
...     def extract_result(output, inputs):
...         return output.json()['result']
>>> deployer = MyDeployer()
>>> result = deployer.extract_result(raw_output, input_data)
Source code in lazyllm/components/deploy/base.py
class LazyLLMDeployBase(ComponentBase):
    """This class is a subclass of ``ComponentBase`` that provides basic functionality for LazyLLM deployment. It supports encoding conversion for various media types and provides configuration options for result extraction and streaming processing.

Args:
    launcher (LauncherBase): Launcher instance for deployment, defaults to remote launcher (``launchers.remote()``).

Notes: 
    - Need to implement specific deployment logic when inheriting this class
    - Can customize result extraction logic by overriding the extract_result method


Examples:
    >>> import lazyllm
    >>> from lazyllm.components.deploy.base import LazyLLMDeployBase
    >>> class MyDeployer(LazyLLMDeployBase):
    ...     def __call__(self, inputs):
    ...         return processed_result
    ...     @staticmethod
    ...     def extract_result(output, inputs):
    ...         return output.json()['result']
    >>> deployer = MyDeployer()
    >>> result = deployer.extract_result(raw_output, input_data)
    """
    keys_name_handle = None
    message_format = None
    default_headers = {'Content-Type': 'application/json'}
    stream_url_suffix = ''
    stream_parse_parameters = {}

    encoder_map = dict(image=_image_to_base64, audio=_audio_to_base64, ocr_files=ocr_to_base64)

    @staticmethod
    def extract_result(output, inputs):
        """Extract final result from model output. The default implementation returns raw output directly, subclasses can override this method to implement custom result extraction logic.

Args:
    output: Raw model output
    inputs: Original input data, can be used for post-processing

**Returns:**

- Processed final result
"""
        return output

    def __init__(self, *, launcher=launchers.remote()):  # noqa B008
        super().__init__(launcher=launcher)

extract_result(output, inputs) staticmethod

Extract final result from model output. The default implementation returns raw output directly, subclasses can override this method to implement custom result extraction logic.

Parameters:

  • output

    Raw model output

  • inputs

    Original input data, can be used for post-processing

Returns:

  • Processed final result
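
The override pattern can be tried without a LazyLLM installation. In the sketch below, `DeployBaseSketch` is a hypothetical stand-in that models only the default `extract_result`; the JSON payload with a `result` field is likewise an assumption made for illustration.

```python
import json


class DeployBaseSketch:
    """Hypothetical stand-in for LazyLLMDeployBase; models extract_result only."""

    @staticmethod
    def extract_result(output, inputs):
        # Default behaviour: hand the raw output back unchanged.
        return output


class JsonDeploySketch(DeployBaseSketch):
    @staticmethod
    def extract_result(output, inputs):
        # Assumes the service replies with a JSON string holding a "result" field.
        return json.loads(output)['result']


raw = '{"result": "hello"}'
print(DeployBaseSketch.extract_result(raw, None))
print(JsonDeploySketch.extract_result(raw, None))
```

Because the method is declared with `@staticmethod`, it can be called on either the class or an instance with the same two arguments.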
Source code in lazyllm/components/deploy/base.py
    @staticmethod
    def extract_result(output, inputs):
        """Extract final result from model output. The default implementation returns raw output directly, subclasses can override this method to implement custom result extraction logic.

Args:
    output: Raw model output
    inputs: Original input data, can be used for post-processing

**Returns:**

- Processed final result
"""
        return output


lazyllm.components.finetune.FlagembeddingFinetune

Bases: LazyLLMFinetuneBase

This class is a subclass of LazyLLMFinetuneBase, based on the training capabilities provided by the FlagEmbedding framework, used for training embedding and reranker models.

Parameters:

  • base_model (str) –

    The base model used for training. It is required to be the path of the base model.

  • target_path (str) –

    The path where the trained model weights are saved.

  • launcher (launcher, default: remote(ngpus=1, sync=True) ) –

    The launcher for fine-tuning, default is launchers.remote(ngpus=1, sync=True).

  • kw

    Keyword arguments used to update the default training parameters.

The keyword arguments and their default values for this class of embedding model are as follows:

Other Parameters:

  • train_group_size (int) –

    Default is: 8. The size of train group. It is used to control the number of negative samples in each training set.

  • query_max_len (int) –

    Default is: 512. The maximum total input sequence length after tokenization for the query. Sequences longer than this will be truncated, sequences shorter will be padded.

  • passage_max_len (int) –

    Default is: 512. The maximum total input sequence length after tokenization for passage. Sequences longer than this will be truncated, sequences shorter will be padded.

  • pad_to_multiple_of (int) –

    Default is: 8. If set will pad the sequence to be a multiple of the provided value.

  • query_instruction_for_retrieval (str) –

    Default is: Represent this sentence for searching relevant passages:. Instruction for query.

  • query_instruction_format (str) –

    Default is: {}{}. Format for query instruction.

  • learning_rate (float) –

    Default is: 1e-5. Learning rate.

  • num_train_epochs (int) –

    Default is: 1. Total number of training epochs to perform.

  • per_device_train_batch_size (int) –

    Default is: 2. Train batch size

  • gradient_accumulation_steps (int) –

    Default is: 1. Number of updates steps to accumulate before performing a backward/update pass.

  • dataloader_drop_last (bool) –

    Default is: True. When True, the last incomplete batch is dropped if the dataset size is not divisible by the batch size, so the DataLoader only returns complete batches.

  • warmup_ratio (float) –

    Default is: 0.1. Warmup ratio for linear scheduler.

  • weight_decay (float) –

    Default is: 0.01. Weight decay in AdamW.

  • deepspeed (str) –

    Default is: '' (empty string). The path of the DeepSpeed configuration file. When left empty, the pre-made configuration file in the LazyLLM code repository, ds_stage0.json, is used.

  • logging_steps (int) –

    Default is: 1. Logging frequency according to logging strategy.

  • save_steps (int) –

    Default is: 1000. Saving frequency.

  • temperature (float) –

    Default is: 0.02. Temperature used for similarity score

  • sentence_pooling_method (str) –

    Default is: cls. The pooling method. Available options: 'cls', 'mean', 'last_token'.

  • normalize_embeddings (bool) –

    Default is: True. Whether to normalize the embeddings.

  • kd_loss_type (str) –

    Default is: kl_div. The loss type for knowledge distillation. Available options: 'kl_div', 'm3_kd_loss'.

  • overwrite_output_dir (bool) –

    Default is: True. It is used to allow the program to overwrite an existing output directory.

  • fp16 (bool) –

    Default is: True. Whether to use fp16 (mixed) precision instead of 32-bit.

  • gradient_checkpointing (bool) –

    Default is: True. Whether to enable gradient checkpointing.

  • negatives_cross_device (bool) –

    Default is: True. Whether to share negatives across devices.

The keyword arguments and their default values for this class of reranker model are as follows:

Other Parameters:

  • train_group_size (int) –

    Default is: 8. The size of train group. It is used to control the number of negative samples in each training set.

  • query_max_len (int) –

    Default is: 256. The maximum total input sequence length after tokenization for the query. Sequences longer than this will be truncated, sequences shorter will be padded.

  • passage_max_len (int) –

    Default is: 256. The maximum total input sequence length after tokenization for passage. Sequences longer than this will be truncated, sequences shorter will be padded.

  • pad_to_multiple_of (int) –

    Default is: 8. If set will pad the sequence to be a multiple of the provided value.

  • learning_rate (float) –

    Default is: 6e-5. Learning rate.

  • num_train_epochs (int) –

    Default is: 1. Total number of training epochs to perform.

  • per_device_train_batch_size (int) –

    Default is: 2. Train batch size

  • gradient_accumulation_steps (int) –

    Default is: 1. Number of updates steps to accumulate before performing a backward/update pass.

  • dataloader_drop_last (bool) –

    Default is: True. When True, the last incomplete batch is dropped if the dataset size is not divisible by the batch size, so the DataLoader only returns complete batches.

  • warmup_ratio (float) –

    Default is: 0.1. Warmup ratio for linear scheduler.

  • weight_decay (float) –

    Default is: 0.01. Weight decay in AdamW.

  • deepspeed (str) –

    Default is: '' (empty string). The path of the DeepSpeed configuration file. When left empty, the pre-made configuration file in the LazyLLM code repository, ds_stage0.json, is used.

  • logging_steps (int) –

    Default is: 1. Logging frequency according to logging strategy.

  • save_steps (int) –

    Default is: 1000. Saving frequency.

  • overwrite_output_dir (bool) –

    Default is: True. It is used to allow the program to overwrite an existing output directory.

  • fp16 (bool) –

    Default is: True. Whether to use fp16 (mixed) precision instead of 32-bit.

  • gradient_checkpointing (bool) –

    Default is: True. Whether to enable gradient checkpointing.
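
The way the two default sets above are selected and overridden can be sketched as follows. This is a simplified illustration, not the library's code: `resolve_training_kwargs` is a hypothetical helper, only a few of the documented keys are copied, and the rejection of unknown keys is an assumption about how overrides are validated.

```python
import copy

# Abbreviated copies of the documented defaults (only a few keys shown).
EMBED_DEFAULTS = {"query_max_len": 512, "passage_max_len": 512,
                  "learning_rate": 1e-5, "deepspeed": ""}
RERANK_DEFAULTS = {"query_max_len": 256, "passage_max_len": 256,
                   "learning_rate": 6e-5, "deepspeed": ""}


def resolve_training_kwargs(model_type, overrides, fallback_deepspeed="ds_stage0.json"):
    """Hypothetical helper: pick defaults by model type, then apply overrides."""
    if model_type not in ("embed", "rerank"):
        raise RuntimeError(f"Not supported {model_type} type to finetune.")
    defaults = RERANK_DEFAULTS if model_type == "rerank" else EMBED_DEFAULTS
    kw = copy.deepcopy(defaults)  # never mutate the shared defaults
    unknown = set(overrides) - set(kw)
    if unknown:
        # Assumption: unknown keys are rejected rather than silently accepted.
        raise KeyError(f"Unexpected keys: {sorted(unknown)}")
    kw.update(overrides)
    if not kw["deepspeed"]:
        # An empty value falls back to the bundled DeepSpeed config file.
        kw["deepspeed"] = fallback_deepspeed
    return kw


embed_kw = resolve_training_kwargs("embed", {"learning_rate": 2e-5})
rerank_kw = resolve_training_kwargs("rerank", {})
```

Copying the defaults before updating them keeps one instance's overrides from leaking into the next instance.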

Examples:

>>> from lazyllm import finetune
>>> finetune.FlagembeddingFinetune('bge-m3', 'path/to/target')
<lazyllm.llm.finetune type=FlagembeddingFinetune>
Source code in lazyllm/components/finetune/flagembedding.py
class FlagembeddingFinetune(LazyLLMFinetuneBase):
    """This class is a subclass of ``LazyLLMFinetuneBase``, based on the training capabilities provided by the [FlagEmbedding](https://github.com/FlagOpen/FlagEmbedding) framework, used for training embedding and reranker models.

Args:
    base_model (str): The base model used for training. It is required to be the path of the base model.
    target_path (str): The path where the trained model weights are saved.
    launcher (lazyllm.launcher): The launcher for fine-tuning, default is ``launchers.remote(ngpus=1, sync=True)``.
    kw: Keyword arguments used to update the default training parameters.

The keyword arguments and their default values for this class of embedding model are as follows:

Keyword Args:
    train_group_size (int): Default is: ``8``. The size of train group. It is used to control the number of negative samples in each training set.
    query_max_len (int): Default is: ``512``. The maximum total input sequence length after tokenization for the query. Sequences longer than this will be truncated, sequences shorter will be padded.
    passage_max_len (int): Default is: ``512``. The maximum total input sequence length after tokenization for passage. Sequences longer than this will be truncated, sequences shorter will be padded.
    pad_to_multiple_of (int): Default is: ``8``. If set will pad the sequence to be a multiple of the provided value.
    query_instruction_for_retrieval (str): Default is: ``Represent this sentence for searching relevant passages: ``. Instruction for query.
    query_instruction_format (str): Default is: ``{}{}``. Format for query instruction.
    learning_rate (float): Default is: ``1e-5``. Learning rate.
    num_train_epochs (int): Default is: ``1``. Total number of training epochs to perform.
    per_device_train_batch_size (int): Default is: ``2``. Train batch size
    gradient_accumulation_steps (int): Default is: ``1``. Number of updates steps to accumulate before performing a backward/update pass.
    dataloader_drop_last (bool): Default is: ``True``. When it='True', the last incomplete batch is dropped if the dataset size is not divisible by the batch size, meaning DataLoader only returns complete batches.
    warmup_ratio (float): Default is: ``0.1``. Warmup ratio for linear scheduler.
    weight_decay (float): Default is: ``0.01``. Weight decay in AdamW.
    deepspeed (str): Default is: ````. The path of the DeepSpeed configuration file, default to use the pre-made configuration file in the LazyLLM code repository: ``ds_stage0.json``.
    logging_steps (int): Default is: ``1``. Logging frequency according to logging strategy.
    save_steps (int): Default is: ``1000``. Saving frequency.
    temperature (float): Default is: ``0.02``. Temperature used for similarity score
    sentence_pooling_method (str): Default is: ``cls``. The pooling method. Available options: 'cls', 'mean', 'last_token'.
    normalize_embeddings (bool): Default is: ``True``. Whether to normalize the embeddings.
    kd_loss_type (str): Default is: ``kl_div``. The loss type for knowledge distillation. Available options:'kl_div', 'm3_kd_loss'.
    overwrite_output_dir (bool): Default is: ``True``. It is used to allow the program to overwrite an existing output directory.
    fp16 (bool): Default is: ``True``.  Whether to use fp16 (mixed) precision instead of 32-bit.
    gradient_checkpointing (bool): Default is: ``True``. Whether enable gradient checkpointing.
    negatives_cross_device (bool): Default is: ``True``. Whether share negatives across devices.

The keyword arguments and their default values for this class of reranker model are as follows:

Keyword Args:
    train_group_size (int): Default is: ``8``. The size of train group. It is used to control the number of negative samples in each training set.
    query_max_len (int): Default is: ``256``. The maximum total input sequence length after tokenization for the query. Sequences longer than this will be truncated, sequences shorter will be padded.
    passage_max_len (int): Default is: ``256``. The maximum total input sequence length after tokenization for passage. Sequences longer than this will be truncated, sequences shorter will be padded.
    pad_to_multiple_of (int): Default is: ``8``. If set will pad the sequence to be a multiple of the provided value.
    learning_rate (float): Default is: ``6e-5``. Learning rate.
    num_train_epochs (int): Default is: ``1``. Total number of training epochs to perform.
    per_device_train_batch_size (int): Default is: ``2``. Train batch size
    gradient_accumulation_steps (int): Default is: ``1``. Number of updates steps to accumulate before performing a backward/update pass.
    dataloader_drop_last (bool): Default is: ``True``. When it='True', the last incomplete batch is dropped if the dataset size is not divisible by the batch size, meaning DataLoader only returns complete batches.
    warmup_ratio (float): Default is: ``0.1``. Warmup ratio for linear scheduler.
    weight_decay (float): Default is: ``0.01``. Weight decay in AdamW.
    deepspeed (str): Default is: ````. The path of the DeepSpeed configuration file, default to use the pre-made configuration file in the LazyLLM code repository: ``ds_stage0.json``.
    logging_steps (int): Default is: ``1``. Logging frequency according to logging strategy.
    save_steps (int): Default is: ``1000``. Saving frequency.
    overwrite_output_dir (bool): Default is: ``True``. It is used to allow the program to overwrite an existing output directory.
    fp16 (bool): Default is: ``True``.  Whether to use fp16 (mixed) precision instead of 32-bit.
    gradient_checkpointing (bool): Default is: ``True``. Whether enable gradient checkpointing.


Examples:
    >>> from lazyllm import finetune
    >>> finetune.FlagembeddingFinetune('bge-m3', 'path/to/target')
    <lazyllm.llm.finetune type=FlagembeddingFinetune>
    """
    defatult_embed_kw = ArgsDict({
        'train_group_size': 8,
        'query_max_len': 512,
        'passage_max_len': 512,
        'pad_to_multiple_of': 8,
        'query_instruction_for_retrieval': 'Represent this sentence for searching relevant passages: ',
        'query_instruction_format': '{}{}',
        'learning_rate': 1e-5,
        'num_train_epochs': 1,
        'per_device_train_batch_size': 2,
        'gradient_accumulation_steps': 1,
        'dataloader_drop_last': True,
        'warmup_ratio': 0.1,
        'weight_decay': 0.01,
        'deepspeed': '',
        'logging_steps': 1,
        'save_steps': 1000,
        'temperature': 0.02,
        'sentence_pooling_method': 'cls',
        'normalize_embeddings': True,
        'kd_loss_type': 'kl_div',
        'overwrite_output_dir': True,
        'fp16': True,
        'gradient_checkpointing': True,
        'negatives_cross_device': True
    })
    defatult_rerank_kw = ArgsDict({
        'train_group_size': 8,
        'query_max_len': 256,
        'passage_max_len': 256,
        'pad_to_multiple_of': 8,
        'learning_rate': 6e-5,
        'num_train_epochs': 1,
        'per_device_train_batch_size': 2,
        'gradient_accumulation_steps': 1,
        'dataloader_drop_last': True,
        'warmup_ratio': 0.1,
        'weight_decay': 0.01,
        'deepspeed': '',
        'logging_steps': 1,
        'save_steps': 1000,
        'overwrite_output_dir': True,
        'fp16': True,
        'gradient_checkpointing': True
    })
    store_true_embed_kw = {'overwrite_output_dir', 'fp16', 'gradient_checkpointing', 'negatives_cross_device'}
    store_true_rerank_kw = {'overwrite_output_dir', 'fp16', 'gradient_checkpointing'}

    def __init__(
        self,
        base_model,
        target_path,
        launcher=launchers.remote(ngpus=1, sync=True),  # noqa B008
        **kw
    ):
        model_type = ModelManager.get_model_type(base_model.split('/')[-1])
        if model_type not in ('embed', 'rerank'):
            raise RuntimeError(f'Not supported {model_type} type to finetune.')
        if not os.path.exists(base_model):
            defatult_path = os.path.join(lazyllm.config['model_path'], base_model)
            if os.path.exists(defatult_path):
                base_model = defatult_path
        save_path = os.path.join(lazyllm.config['train_target_root'], target_path)
        target_path = os.path.join(save_path, model_type)
        os.system(f'mkdir -p {target_path}')
        super().__init__(
            base_model,
            target_path,
            launcher=launcher,
        )
        if model_type == 'rerank':
            self.kw = copy.deepcopy(self.defatult_rerank_kw)
            self.store_true_kw = copy.deepcopy(self.store_true_rerank_kw)
            self.module_run_path = 'FlagEmbedding.finetune.reranker.encoder_only.base'
        else:
            self.kw = copy.deepcopy(self.defatult_embed_kw)
            self.store_true_kw = copy.deepcopy(self.store_true_embed_kw)
            self.module_run_path = 'FlagEmbedding.finetune.embedder.encoder_only.base'
        self.kw.check_and_update(kw)
        if not self.kw['deepspeed']:
            folder_path = os.path.dirname(os.path.abspath(__file__))
            deepspeed_config_path = os.path.join(folder_path, 'flag_embedding', 'ds_stage0.json')
            self.kw['deepspeed'] = deepspeed_config_path
        self.nproc_per_node = launcher.ngpus

    def cmd(self, trainset, valset=None) -> str:
        thirdparty.check_packages(['flagembedding'])
        self.kw['train_data'] = trainset

        formatted_date = datetime.now().strftime('%Y-%m-%d_%H-%M-%S')
        self.log_file_path = f'{self.target_path}/train_log_{formatted_date}_{random.randint(1000, 9999)}.log'
        cache_path = os.path.join(os.path.expanduser(lazyllm.config['home']), 'fintune', 'embeding')
        cache_model_path = os.path.join(cache_path, 'model')
        cache_data_path = os.path.join(cache_path, 'data')
        os.system(f'mkdir -p {cache_model_path} {cache_data_path}')

        cmds = (f'export WANDB_MODE=disabled && torchrun --nproc_per_node {self.nproc_per_node} '
                f'-m {self.module_run_path} '
                f'--model_name_or_path {self.base_model} '
                f'--output_dir {self.target_path} '
                f'--cache_dir {cache_model_path} '
                f'--cache_path {cache_data_path} '
            )
        for key in self.store_true_kw:
            cmds += f'--{key} ' if self.kw.pop(key) else ''
        cmds += self.kw.parse_kwargs()
        cmds += f' 2>&1 | tee {self.log_file_path}'
        return cmds
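
The flag handling in `cmd` above can be sketched in isolation. `build_flag_string` is a hypothetical helper written for this illustration. The `--key=value` rendering of the remaining options is an assumption, since the output format of `ArgsDict.parse_kwargs` is not shown here.

```python
def build_flag_string(kw, store_true_keys):
    """Hypothetical helper mirroring the flag handling in `cmd`."""
    remaining = dict(kw)  # work on a copy so the caller's dict is untouched
    parts = []
    # Boolean switches: emit a bare "--key" when True, nothing when False.
    for key in sorted(store_true_keys):
        if remaining.pop(key):
            parts.append(f"--{key}")
    # Assumption: remaining options are rendered as "--key=value" pairs;
    # the actual format produced by ArgsDict.parse_kwargs may differ.
    for key, value in remaining.items():
        parts.append(f"--{key}={value}")
    return " ".join(parts)


flags = build_flag_string(
    {"fp16": True, "gradient_checkpointing": False, "learning_rate": 6e-5},
    store_true_keys={"fp16", "gradient_checkpointing"},
)
# flags == "--fp16 --learning_rate=6e-05"
```

Unlike the listing above, which pops the switch keys from `self.kw` in place, this sketch copies the dictionary first so it can be called repeatedly with the same input.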

lazyllm.components.auto.AutoFinetune

Bases: LazyLLMFinetuneBase

This class is a subclass of LazyLLMFinetuneBase and can automatically select the appropriate fine-tuning framework and parameters based on the input arguments to fine-tune large language models.

Specifically, based on the input model parameters of base_model, ctx_len, batch_size, lora_r, the type and number of GPUs in launcher, this class can automatically select the appropriate fine-tuning framework (such as: AlpacaloraFinetune or CollieFinetune) and the required parameters.

Parameters:

  • base_model (str) –

    The base model used for fine-tuning. It is required to be the path of the base model.

  • source (config[model_source]) –

    Specifies the model download source. This can be configured by setting the environment variable LAZYLLM_MODEL_SOURCE.

  • target_path (str) –

    The path where the LoRA weights of the fine-tuned model are saved.

  • merge_path (str) –

    The path where the model merges the LoRA weights, default to None. If not specified, "lazyllm_lora" and "lazyllm_merge" directories will be created under target_path as target_path and merge_path respectively.

  • ctx_len (int) –

    The maximum token length for input to the fine-tuned model, default to 1024.

  • batch_size (int) –

    Batch size, default to 32.

  • lora_r (int) –

    LoRA rank, default to 8. This value determines the number of parameters added: the smaller the value, the fewer the parameters.

  • launcher (launcher, default: remote() ) –

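    The launcher for fine-tuning. If not specified, a launchers.remote(ngpus=ngpus, sync=True) launcher is created, where ngpus is estimated from the GPU memory, the model size, batch_size and lora_r.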

  • kw

    Keyword arguments, used to update the default training parameters. Note that additional keyword arguments cannot be arbitrarily specified, as they depend on the framework inferred by LazyLLM, so it is recommended to set them with caution.

Examples:

>>> from lazyllm import finetune
>>> finetune.auto("internlm2-chat-7b", 'path/to/target')
<lazyllm.llm.finetune type=AlpacaloraFinetune>
Source code in lazyllm/components/auto/autofinetune.py
class AutoFinetune(LazyLLMFinetuneBase):
    """This class is a subclass of ``LazyLLMFinetuneBase`` and can automatically select the appropriate fine-tuning framework and parameters based on the input arguments to fine-tune large language models.

Specifically, based on the input model parameters of ``base_model``, ``ctx_len``, ``batch_size``, ``lora_r``, the type and number of GPUs in ``launcher``, this class can automatically select the appropriate fine-tuning framework (such as: ``AlpacaloraFinetune`` or ``CollieFinetune``) and the required parameters.

Args:
    base_model (str): The base model used for fine-tuning. It is required to be the path of the base model.
    source (lazyllm.config['model_source']): Specifies the model download source. This can be configured by setting the environment variable ``LAZYLLM_MODEL_SOURCE``.
    target_path (str): The path where the LoRA weights of the fine-tuned model are saved.
    merge_path (str): The path where the model merges the LoRA weights, default to ``None``. If not specified, "lazyllm_lora" and "lazyllm_merge" directories will be created under ``target_path`` as ``target_path`` and ``merge_path`` respectively.
    ctx_len (int): The maximum token length for input to the fine-tuned model, default to ``1024``.
    batch_size (int): Batch size, default to ``32``.
    lora_r (int): LoRA rank, default to ``8``; this value determines the amount of parameters added, the smaller the value, the fewer the parameters.
    launcher (lazyllm.launcher): The launcher for fine-tuning, default to ``launchers.remote(ngpus=1)``.
    kw: Keyword arguments, used to update the default training parameters. Note that additional keyword arguments cannot be arbitrarily specified, as they depend on the framework inferred by LazyLLM, so it is recommended to set them with caution.


Examples:
    >>> from lazyllm import finetune
    >>> finetune.auto("internlm2-chat-7b", 'path/to/target')
    <lazyllm.llm.finetune type=AlpacaloraFinetune>
    """

    def __new__(cls, base_model: str, target_path: str, source: Optional[str] = None,
                merge_path: Optional[str] = None, batch_size: int = 32, lora_r: int = 8,
                model_type: Optional[str] = None, launcher: Optional[LazyLLMLaunchersBase] = None, **kw):
        base_model = ModelManager(source).download(base_model) or ''
        LOG.info(f'[AutoFinetune] Using base model from: {base_model}')
        model_name = get_model_name(base_model)
        if not model_type:
            model_type = ModelManager.get_model_type(model_name)
            LOG.info(f'[AutoFinetune] Infer type of model {model_name} is {model_type}')
        if model_type in ['tts', 'stt', 'sd', 'ocr', 'cross_modal_embed']:
            raise RuntimeError(f'Fine-tuning of the {model_type} model is not currently supported.')

        if model_type in ['embed', 'rerank']:
            LOG.info(f'[AutoFinetune] Finetune {model_name} with FlagEmbedding.')
            return finetune.flagembedding(base_model, target_path, **kw)

        params = {'gradient_step': 1, 'micro_batch_size': 32}
        if not launcher:
            match = re.search(r'(\d+)[bB]', model_name)
            model_size = int(match.group(1)) if match else 0
            gs, mbs, ngpus = estimate_finetune_plan(
                gpu_mem_gb=config['gpu_memory'], model_size_b=model_size, batch_size=batch_size, lora_r=lora_r)
            params.update({'gradient_step': gs, 'micro_batch_size': mbs})
            LOG.info(f'[AutoFinetune] Infer model_size: {model_size} B, '
                     f'gradient_step: {gs}, micro_batch_size: {mbs}, ngpus: {ngpus}')
            launcher = launchers.remote(ngpus=ngpus, sync=True)

        candidates = ['llamafactory', 'alpacalora']
        candidates = dict(llm=['llamafactory', 'alpacalora'], vlm=['llamafactory'])
        for finetune_cls_name in candidates[model_type]:
            if check_requirements(requirements[finetune_cls_name]):
                finetune_cls = getattr(finetune, finetune_cls_name)
                for key, value in finetune_cls.auto_map.items():
                    if value and value not in kw:
                        kw[value] = params[key]
                LOG.info(f'[AutoFinetune] Use {finetune_cls_name} to finetune.')
                if finetune_cls_name == 'llamafactory':
                    return finetune_cls(base_model, target_path, lora_r=lora_r, launcher=launcher, **kw)
                return finetune_cls(base_model, target_path, merge_path, cp_files='tokeniz*',
                                    batch_size=batch_size, lora_r=lora_r, launcher=launcher, **kw)
        raise RuntimeError('No valid framework found, candidates are '
                           f'{[c.framework.lower() for c in candidates[model_type]]}.')
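
The model-size inference used above when no launcher is given relies on a single regular expression. The sketch below isolates that step; `infer_model_size_b` is a hypothetical name chosen for this illustration, and the model names are only sample inputs.

```python
import re


def infer_model_size_b(model_name):
    """Return the integer directly before the first 'b'/'B', or 0 if none is found."""
    match = re.search(r"(\d+)[bB]", model_name)
    return int(match.group(1)) if match else 0


sizes = {
    name: infer_model_size_b(name)
    for name in (
        "internlm2-chat-7b",      # -> 7
        "Qwen1.5-14B-Chat",       # -> 14
        "custom-model",           # no size suffix -> 0
        "Qwen2.5-0.5B-Instruct",  # fractional sizes are not handled -> 5
    )
}
```

Because the pattern only captures digits, a name with a fractional size such as `0.5B` yields `5`, and a name without a size suffix yields `0`. Passing an explicit `launcher` avoids relying on this estimate.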

lazyllm.components.finetune.base.DummyFinetune

Bases: LazyLLMFinetuneBase

DummyFinetune is a subclass of LazyLLMFinetuneBase that serves as a placeholder implementation for fine-tuning. The class is primarily used for demonstration or testing purposes, as it does not perform any actual fine-tuning logic.

Parameters:

  • base_model

    A string specifying the base model name. Defaults to 'base'.

  • target_path

    A string specifying the target path for fine-tuning outputs. Defaults to 'target'.

  • launcher

    A launcher instance for executing commands. Defaults to [launchers.remote()][lazyllm.launchers.remote].

  • **kw

    Additional keyword arguments that are stored for later use.

Returns:

  • A string representing a dummy command. The string includes the initial arguments passed during initialization.

Examples:

>>> from lazyllm.components import DummyFinetune
>>> from lazyllm import launchers
>>> # Create a DummyFinetune instance
>>> finetuner = DummyFinetune(base_model='example-base', target_path='example-target', launcher=launchers.local(), custom_arg='custom_value')
>>> # Call the cmd method to generate a placeholder command
>>> command = finetuner.cmd('--example-arg', key='value')
>>> print(command)
... echo 'dummy finetune!, and init-args is {'custom_arg': 'custom_value'}'
Source code in lazyllm/components/finetune/base.py
class DummyFinetune(LazyLLMFinetuneBase):
    """DummyFinetune is a subclass of [LazyLLMFinetuneBase][lazyllm.components.LazyLLMFinetuneBase] that serves as a placeholder implementation for fine-tuning.
The class is primarily used for demonstration or testing purposes, as it does not perform any actual fine-tuning logic.

Args:
    base_model: A string specifying the base model name. Defaults to 'base'.
    target_path: A string specifying the target path for fine-tuning outputs. Defaults to 'target'.
    launcher: A launcher instance for executing commands. Defaults to [launchers.remote()][lazyllm.launchers.remote].
    **kw: Additional keyword arguments that are stored for later use.

Returns:
    A string representing a dummy command. The string includes the initial arguments passed during initialization.


Examples:
    >>> from lazyllm.components import DummyFinetune
    >>> from lazyllm import launchers
    >>> # Create a DummyFinetune instance
    >>> finetuner = DummyFinetune(base_model='example-base', target_path='example-target', launcher=launchers.local(), custom_arg='custom_value')
    >>> # Call the cmd method to generate a placeholder command
    >>> command = finetuner.cmd('--example-arg', key='value')
    >>> print(command)
    ... echo 'dummy finetune!, and init-args is {'custom_arg': 'custom_value'}'
    """
    def __init__(self, base_model='base', target_path='target', *, launcher=launchers.remote(), **kw):  # noqa B008
        super().__init__(base_model, target_path, launcher=launchers.empty)
        self.kw = kw

    def cmd(self, *args, **kw) -> str:
        """The `cmd` method generates a dummy command string for fine-tuning. This method is primarily for testing or demonstration purposes.

Args:
    *args: Positional arguments to be included in the command (not used in this implementation).
    **kw: Keyword arguments to be included in the command (not used in this implementation).

Returns:
    A string representing a dummy command. The string includes the initial arguments (`**kw`) passed during the instance initialization, which are stored in `self.kw`.

Example:
    If the class is initialized with `custom_arg='value'`, calling the `cmd` method will return:
    `"echo 'dummy finetune!, and init-args is {'custom_arg': 'value'}'"`


Examples:
    >>> from lazyllm.components import DummyFinetune
    >>> from lazyllm import launchers
    >>> # Create a DummyFinetune instance and pass initialization arguments
    >>> finetuner = DummyFinetune(base_model='example-base', target_path='example-target', launcher=launchers.local(), custom_arg='value')
    >>> # Call the cmd method to generate a placeholder command
    >>> command = finetuner.cmd()
    >>> # Print the generated placeholder command
    >>> print(command)
    ... echo 'dummy finetune!, and init-args is {'custom_arg': 'value'}'
    """
        return f'echo \'dummy finetune!, and init-args is {self.kw}\''

cmd(*args, **kw)

The cmd method generates a dummy command string for fine-tuning. This method is primarily for testing or demonstration purposes.

Parameters:

  • *args

    Positional arguments to be included in the command (not used in this implementation).

  • **kw

    Keyword arguments to be included in the command (not used in this implementation).

Returns:

  • str

    A string representing a dummy command. The string includes the initial arguments (**kw) passed during the instance initialization, which are stored in self.kw.

Example

If the class is initialized with custom_arg='value', calling the cmd method will return: "echo 'dummy finetune!, and init-args is {'custom_arg': 'value'}'"

Examples:

>>> from lazyllm.components import DummyFinetune
>>> from lazyllm import launchers
>>> # Create a DummyFinetune instance and pass initialization arguments
>>> finetuner = DummyFinetune(base_model='example-base', target_path='example-target', launcher=launchers.local(), custom_arg='value')
>>> # Call the cmd method to generate a placeholder command
>>> command = finetuner.cmd()
>>> # Print the generated placeholder command
>>> print(command)
... echo 'dummy finetune!, and init-args is {'custom_arg': 'value'}'
Source code in lazyllm/components/finetune/base.py
    def cmd(self, *args, **kw) -> str:
        """The `cmd` method generates a dummy command string for fine-tuning. This method is primarily for testing or demonstration purposes.

Args:
    *args: Positional arguments to be included in the command (not used in this implementation).
    **kw: Keyword arguments to be included in the command (not used in this implementation).

Returns:
    A string representing a dummy command. The string includes the initial arguments (`**kw`) passed during the instance initialization, which are stored in `self.kw`.

Example:
    If the class is initialized with `custom_arg='value'`, calling the `cmd` method will return:
    `"echo 'dummy finetune!, and init-args is {'custom_arg': 'value'}'"`


Examples:
    >>> from lazyllm.components import DummyFinetune
    >>> from lazyllm import launchers
    >>> # Create a DummyFinetune instance and pass initialization arguments
    >>> finetuner = DummyFinetune(base_model='example-base', target_path='example-target', launcher=launchers.local(), custom_arg='value')
    >>> # Call the cmd method to generate a placeholder command
    >>> command = finetuner.cmd()
    >>> # Print the generated placeholder command
    >>> print(command)
    ... echo 'dummy finetune!, and init-args is {'custom_arg': 'value'}'
    """
        return f'echo \'dummy finetune!, and init-args is {self.kw}\''
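
To see the placeholder behaviour without installing LazyLLM, the following stand-in reproduces only the string formatting of `cmd`. `DummyFinetuneSketch` is a hypothetical class for illustration: it has no launcher and does not inherit from `LazyLLMFinetuneBase`.

```python
class DummyFinetuneSketch:
    """Hypothetical stand-in that mirrors only the cmd() string formatting."""

    def __init__(self, base_model="base", target_path="target", **kw):
        self.base_model = base_model
        self.target_path = target_path
        self.kw = kw  # init-time keyword arguments, echoed back by cmd()

    def cmd(self, *args, **kw):
        # Call-time arguments are accepted but ignored, as documented above.
        return f"echo 'dummy finetune!, and init-args is {self.kw}'"


command = DummyFinetuneSketch(custom_arg="value").cmd("--ignored", key="ignored")
```

Only the arguments captured at construction time appear in the command. Anything passed to `cmd` itself is dropped.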

lazyllm.components.finetune.LazyLLMFinetuneBase

Bases: ComponentBase

LazyLLM fine-tuning base component class, inherits from ComponentBase.

Provides base functionality for large language model fine-tuning, supports remote launcher configuration and model path management.

Parameters:

  • base_model (str) –

    Base model path or identifier

  • target_path (str) –

    Fine-tuned model output path

  • launcher (Launcher, default: remote() ) –

    Task launcher, defaults to remote launcher

Source code in lazyllm/components/finetune/base.py
class LazyLLMFinetuneBase(ComponentBase):
    """LazyLLM fine-tuning base component class, inherits from ComponentBase.

Provides base functionality for large language model fine-tuning, supports remote launcher configuration and model path management.

Args:
    base_model (str): Base model path or identifier
    target_path (str): Fine-tuned model output path
    launcher (Launcher, optional): Task launcher, defaults to remote launcher
"""
    __reg_overwrite__ = 'cmd'

    def __init__(self, base_model, target_path, *, launcher=launchers.remote()):  # noqa B008
        super().__init__(launcher=launcher)
        self.base_model = base_model
        self.target_path = target_path
        self.merge_path = None

    def __call__(self, *args, **kw):
        super().__call__(*args, **kw)
        if self.merge_path:
            return self.merge_path
        else:
            return self.target_path
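
The return-value rule in `__call__` above (prefer `merge_path` when it is set, otherwise `target_path`) can be sketched with stand-in classes. Both class names are hypothetical, and the step that actually runs the fine-tuning job is omitted.

```python
class FinetuneBaseSketch:
    """Hypothetical stand-in showing only the path-selection logic."""

    def __init__(self, base_model, target_path):
        self.base_model = base_model
        self.target_path = target_path
        self.merge_path = None  # subclasses may set this to a merged-weights dir

    def __call__(self, *args, **kw):
        # The real class first runs the job through ComponentBase; omitted here.
        return self.merge_path if self.merge_path else self.target_path


class MergingFinetuneSketch(FinetuneBaseSketch):
    """Hypothetical subclass that also produces merged weights."""

    def __init__(self, base_model, target_path, merge_path):
        super().__init__(base_model, target_path)
        self.merge_path = merge_path


plain = FinetuneBaseSketch("base", "out/lora")()
merged = MergingFinetuneSketch("base", "out/lora", "out/merge")()
```

Returning a single path lets a downstream step, such as deployment, consume the result without knowing whether a merge step took place.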

Deploy

lazyllm.components.deploy.Lightllm

Bases: LazyLLMDeployBase

This class is a subclass of LazyLLMDeployBase, based on the inference capabilities provided by the LightLLM framework, used for inference with large language models.

Parameters:

  • trust_remote_code (bool, default: True ) –

    Whether to trust remote code, defaults to True

  • launcher (Launcher, default: remote(ngpus=1) ) –

    Task launcher, defaults to single GPU remote launcher

  • log_path (str, default: None ) –

    Log file path, defaults to None

  • **kw

    Other LightLLM server configuration parameters

The keyword arguments and their default values for this class are as follows:

Other Parameters:

  • tp (int) –

    Tensor parallelism parameter, default is 1.

  • max_total_token_num (int) –

    Maximum total token number, default is 64000.

  • eos_id (int) –

    End-of-sentence ID, default is 2.

  • port (int) –

    Service port number, default is None, in which case LazyLLM will automatically generate a random port number.

  • host (str) –

    Service IP address, default is 0.0.0.0.

  • nccl_port (int) –

    NCCL port, default is None, in which case LazyLLM will automatically generate a random port number.

  • tokenizer_mode (str) –

    Tokenizer loading mode, default is auto.

  • running_max_req_size (int) –

    Maximum number of parallel requests for the inference engine, default is 256.

  • data_type (str) –

    Data type for model weights, default is float16.

  • max_req_total_len (int) –

    Maximum total length for requests, default is 64000.

  • max_req_input_len (int) –

    Maximum input length, default is 4096.

  • long_truncation_mode (str) –

    Truncation mode for long texts, default is head.

Examples:

>>> from lazyllm import deploy
>>> infer = deploy.lightllm()
Source code in lazyllm/components/deploy/lightllm.py
class Lightllm(LazyLLMDeployBase):
    """This class is a subclass of ``LazyLLMDeployBase``, based on the inference capabilities provided by the [LightLLM](https://github.com/ModelTC/lightllm) framework, used for inference with large language models.

Args:
    trust_remote_code (bool, optional): Whether to trust remote code, defaults to True
    launcher (Launcher, optional): Task launcher, defaults to single GPU remote launcher
    log_path (str, optional): Log file path, defaults to None
    **kw: Other LightLLM server configuration parameters
The keyword arguments and their default values for this class are as follows:

Keyword Args: 
    tp (int): Tensor parallelism parameter, default is ``1``.
    max_total_token_num (int): Maximum total token number, default is ``64000``.
    eos_id (int): End-of-sentence ID, default is ``2``.
    port (int): Service port number, default is ``None``, in which case LazyLLM will automatically generate a random port number.
    host (str): Service IP address, default is ``0.0.0.0``.
    nccl_port (int): NCCL port, default is ``None``, in which case LazyLLM will automatically generate a random port number.
    tokenizer_mode (str): Tokenizer loading mode, default is ``auto``.
    running_max_req_size (int): Maximum number of parallel requests for the inference engine, default is ``256``.
    data_type (str): Data type for model weights, default is ``float16``.
    max_req_total_len (int): Maximum total length for requests, default is ``64000``.
    max_req_input_len (int): Maximum input length, default is ``4096``.
    long_truncation_mode (str): Truncation mode for long texts, default is ``head``.


Examples:
    >>> from lazyllm import deploy
    >>> infer = deploy.lightllm()
    """
    keys_name_handle = {
        'inputs': 'inputs',
        'stop': 'stop_sequences'
    }
    default_headers = {'Content-Type': 'application/json'}
    message_format = {
        'inputs': 'Who are you ?',
        'parameters': {
            'do_sample': False,
            'presence_penalty': 0.0,
            'frequency_penalty': 0.0,
            'repetition_penalty': 1.0,
            'temperature': 1.0,
            'top_p': 1,
            'top_k': -1,  # -1 is for all
            'ignore_eos': False,
            'max_new_tokens': 8192,
            'stop_sequences': None,
        }
    }
    auto_map = {}
    stream_url_suffix = '_stream'
    stream_parse_parameters = {'delimiter': b'\n\n'}

    def __init__(self, trust_remote_code=True, launcher=launchers.remote(ngpus=1), log_path=None, # noqa B008
                 openai_api: Optional[bool] = None, **kw):
        super().__init__(launcher=launcher)
        self.kw = ArgsDict({
            'tp': 1,
            'max_total_token_num': 64000,
            'eos_id': 2,
            'port': None,
            'host': '0.0.0.0',
            'nccl_port': None,
            'tokenizer_mode': 'auto',
            'running_max_req_size': 256,
            'data_type': 'float16',
            'max_req_total_len': 64000,
            'max_req_input_len': 4096,
            'long_truncation_mode': 'head',
        })
        self.options_keys = kw.pop('options_keys', [])
        if trust_remote_code and 'trust_remote_code' not in self.options_keys:
            self.options_keys.append('trust_remote_code')
        self.kw.check_and_update(kw)
        self.random_port = False if 'port' in kw and kw['port'] else True
        self.random_nccl_port = False if 'nccl_port' in kw and kw['nccl_port'] else True
        self.temp_folder = make_log_dir(log_path, 'lightllm') if log_path else None

    def cmd(self, finetuned_model=None, base_model=None):
        """This method generates the command to start the LightLLM service.

Args:
    finetuned_model (str): Path to the fine-tuned model.
    base_model (str): Path to the base model, used when finetuned_model is invalid.

**Returns:**

- LazyLLMCMD: A LazyLLMCMD object containing the startup command.
"""
        if not os.path.exists(finetuned_model) or \
            not any(filename.endswith('.bin') or filename.endswith('.safetensors')
                    for filename in os.listdir(finetuned_model)):
            if not finetuned_model:
                LOG.warning(f'Note! That finetuned_model({finetuned_model}) is an invalid path, '
                            f'base_model({base_model}) will be used')
            finetuned_model = base_model

        def impl():
            if self.random_port:
                self.kw['port'] = random.randint(30000, 40000)
            if self.random_nccl_port:
                self.kw['nccl_port'] = random.randint(20000, 30000)
            cmd = f'python -m lightllm.server.api_server --model_dir {finetuned_model} '
            cmd += self.kw.parse_kwargs()
            cmd += ' ' + parse_options_keys(self.options_keys)
            if self.temp_folder: cmd += f' 2>&1 | tee {get_log_path(self.temp_folder)}'
            return cmd

        return LazyLLMCMD(cmd=impl, return_value=self.geturl, checkf=verify_fastapi_func)

    def geturl(self, job=None):
        """Get the URL address of the LightLLM service.

Args:
    job (optional): Job object, defaults to None, in which case self.job is used.

**Returns:**

- str: The service URL address in the format "http://{ip}:{port}/generate".
"""
        if job is None:
            job = self.job
        if lazyllm.config['mode'] == lazyllm.Mode.Display:
            return 'http://{ip}:{port}/generate'
        else:
            return f'http://{job.get_jobip()}:{self.kw["port"]}/generate'

    @staticmethod
    def extract_result(x, inputs):
        """Extract generated text from the service response.

Args:
    x (str): Response text from the service.
    inputs (str): Input text.

**Returns:**

- str: The extracted generated text.

Raises:
    Exception: When JSON response parsing fails.
"""
        try:
            if x.startswith('data:'): return json.loads(x[len('data:'):])['token']['text']
            else: return json.loads(x)['generated_text'][0]
        except Exception as e:
            LOG.warning(f'JSONDecodeError on load {x}')
            raise e

cmd(finetuned_model=None, base_model=None)

This method generates the command to start the LightLLM service.

Parameters:

  • finetuned_model (str, default: None ) –

    Path to the fine-tuned model.

  • base_model (str, default: None ) –

    Path to the base model, used when finetuned_model does not exist or contains no .bin or .safetensors files.

Returns:

  • LazyLLMCMD: A LazyLLMCMD object containing the startup command.
Source code in lazyllm/components/deploy/lightllm.py
    def cmd(self, finetuned_model=None, base_model=None):
        """This method generates the command to start the LightLLM service.

Args:
    finetuned_model (str): Path to the fine-tuned model.
    base_model (str): Path to the base model, used when finetuned_model is invalid.

**Returns:**

- LazyLLMCMD: A LazyLLMCMD object containing the startup command.
"""
        if not os.path.exists(finetuned_model) or \
            not any(filename.endswith('.bin') or filename.endswith('.safetensors')
                    for filename in os.listdir(finetuned_model)):
            if not finetuned_model:
                LOG.warning(f'Note! That finetuned_model({finetuned_model}) is an invalid path, '
                            f'base_model({base_model}) will be used')
            finetuned_model = base_model

        def impl():
            if self.random_port:
                self.kw['port'] = random.randint(30000, 40000)
            if self.random_nccl_port:
                self.kw['nccl_port'] = random.randint(20000, 30000)
            cmd = f'python -m lightllm.server.api_server --model_dir {finetuned_model} '
            cmd += self.kw.parse_kwargs()
            cmd += ' ' + parse_options_keys(self.options_keys)
            if self.temp_folder: cmd += f' 2>&1 | tee {get_log_path(self.temp_folder)}'
            return cmd

        return LazyLLMCMD(cmd=impl, return_value=self.geturl, checkf=verify_fastapi_func)
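
The fallback rule in `cmd` can be sketched in isolation as follows. This is an illustration under stated assumptions, not the library's implementation: `pick_model_dir` is a hypothetical name, and unlike the listing it checks for an empty `finetuned_model` before touching the filesystem, because `os.path.exists(None)` may raise a `TypeError`.

```python
import os
import tempfile

# Weight file suffixes accepted by the Lightllm listing.
WEIGHT_SUFFIXES = ('.bin', '.safetensors')


def pick_model_dir(finetuned_model, base_model):
    """Return finetuned_model if it is a directory with weight files, else base_model."""
    if finetuned_model and os.path.isdir(finetuned_model):
        names = os.listdir(finetuned_model)
        if any(name.endswith(WEIGHT_SUFFIXES) for name in names):
            return finetuned_model
    return base_model


with tempfile.TemporaryDirectory() as tmp:
    # An empty directory has no weights, so the base model is chosen.
    empty_choice = pick_model_dir(tmp, 'base-model')
    open(os.path.join(tmp, 'model.safetensors'), 'w').close()
    # Once a weight file exists, the fine-tuned directory wins.
    tuned_choice = pick_model_dir(tmp, 'base-model')

none_choice = pick_model_dir(None, 'base-model')
```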

geturl(job=None)

Get the URL address of the LightLLM service.

Parameters:

  • job (optional, default: None ) –

    Job object, defaults to None, in which case self.job is used.

Returns:

  • str: The service URL address in the format "http://{ip}:{port}/generate".
Source code in lazyllm/components/deploy/lightllm.py
    def geturl(self, job=None):
        """Get the URL address of the LightLLM service.

Args:
    job (optional): Job object, defaults to None, in which case self.job is used.

**Returns:**

- str: The service URL address in the format "http://{ip}:{port}/generate".
"""
        if job is None:
            job = self.job
        if lazyllm.config['mode'] == lazyllm.Mode.Display:
            return 'http://{ip}:{port}/generate'
        else:
            return f'http://{job.get_jobip()}:{self.kw["port"]}/generate'

extract_result(x, inputs) staticmethod

Extract generated text from the service response.

Parameters:

  • x (str) –

    Response text from the service.

  • inputs (str) –

    Input text.

Returns:

  • str: The extracted generated text.

Raises:

  • Exception

    When JSON response parsing fails.

Source code in lazyllm/components/deploy/lightllm.py
    @staticmethod
    def extract_result(x, inputs):
        """Extract generated text from the service response.

Args:
    x (str): Response text from the service.
    inputs (str): Input text.

**Returns:**

- str: The extracted generated text.

Raises:
    Exception: When JSON response parsing fails.
"""
        try:
            if x.startswith('data:'): return json.loads(x[len('data:'):])['token']['text']
            else: return json.loads(x)['generated_text'][0]
        except Exception as e:
            LOG.warning(f'JSONDecodeError on load {x}')
            raise e
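
The two response shapes handled by `extract_result` can be exercised without a running server. The sketch below repeats the same parsing logic in a standalone function; `extract_text` and both sample payloads are assumptions whose structure is inferred from the keys the listing reads, not captured from a real LightLLM service.

```python
import json


def extract_text(raw):
    """Parse either a streaming chunk ('data:' prefix) or a full JSON reply."""
    if raw.startswith('data:'):
        # Streaming chunks carry one token per event.
        return json.loads(raw[len('data:'):])['token']['text']
    # Non-streaming replies hold a list of generated strings.
    return json.loads(raw)['generated_text'][0]


stream_chunk = 'data:{"token": {"text": "Hel"}}'
full_reply = '{"generated_text": ["Hello there"]}'

chunk_text = extract_text(stream_chunk)
reply_text = extract_text(full_reply)
```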

lazyllm.components.deploy.Vllm

Bases: LazyLLMDeployBase

This class is a subclass of LazyLLMDeployBase, leveraging the vLLM framework to deploy and run inference on large language models.

Parameters:

  • trust_remote_code (bool, default: True ) –

    Whether to allow loading of model code from remote sources. Default is True.

  • launcher (launcher, default: remote(ngpus=1) ) –

    The launcher used to start the model. Default is launchers.remote(ngpus=1).

  • log_path (str, default: None ) –

    Path to store logs. If None, logs will not be saved.

  • openai_api (bool, default: None ) –

    Whether to start vLLM with its OpenAI-compatible API server. Default is None, in which case the value of lazyllm.config['openai_api'] is used.

  • kw

    Keyword arguments used to override default deployment parameters. No extra arguments beyond the supported ones are allowed.

The supported keyword arguments and their default values are as follows:

Other Parameters:

  • tensor-parallel-size (int) –

    Tensor parallelism size. Default is 1.

  • dtype (str) –

    Data type for model weights and activations. Default is auto. Options include: half, float16, bfloat16, float, float32.

  • kv-cache-dtype (str) –

    Data type for KV cache. Default is auto. Options include: fp8, fp8_e5m2, fp8_e4m3.

  • device (str) –

    Backend device type supported by VLLM. Default is auto. Options include: cuda, neuron, cpu.

  • block-size (int) –

    Token block size. Default is 16.

  • port (int | str) –

    Service port number. Default is auto, in which case a random port between 30000 and 40000 is assigned.

  • host (str) –

    Service binding IP address. Default is 0.0.0.0.

  • seed (int) –

    Random seed. Default is 0.

  • tokenizer_mode (str) –

    Tokenizer loading mode. Default is auto.

  • max-num-seqs (int) –

    Maximum number of concurrent requests supported by the inference engine. Default is 256.

  • pipeline-parallel-size (int) –

    Pipeline parallelism size. Default is 1.

  • max-num-batched-tokens (int) –

    Maximum number of batched tokens. Default is 64000.

Examples:

>>> from lazyllm import deploy
>>> infer = deploy.vllm()
Source code in lazyllm/components/deploy/vllm.py
class Vllm(LazyLLMDeployBase, metaclass=_VllmStreamParseParametersMeta):
    """This class is a subclass of ``LazyLLMDeployBase``, leveraging the [VLLM](https://github.com/vllm-project/vllm) framework to deploy and run inference on large language models.

Args:
    trust_remote_code (bool): Whether to allow loading of model code from remote sources. Default is ``True``.
    launcher (lazyllm.launcher): The launcher used to start the model. Default is ``launchers.remote(ngpus=1)``.
    log_path (str): Path to store logs. If ``None``, logs will not be saved.
    openai_api (bool): Whether to start VLLM with OpenAI-compatible API. Default is ``None``, in which case the value of ``lazyllm.config['openai_api']`` is used.
    kw: Keyword arguments used to override default deployment parameters. No extra arguments beyond the supported ones are allowed.

The supported keyword arguments and their default values are as follows:

Keyword Args: 
    tensor-parallel-size (int): Tensor parallelism size. Default is ``1``.
    dtype (str): Data type for model weights and activations. Default is ``auto``. Options include: ``half``, ``float16``, ``bfloat16``, ``float``, ``float32``.
    kv-cache-dtype (str): Data type for KV cache. Default is ``auto``. Options include: ``fp8``, ``fp8_e5m2``, ``fp8_e4m3``.
    device (str): Backend device type supported by VLLM. Default is ``auto``. Options include: ``cuda``, ``neuron``, ``cpu``.
    block-size (int): Token block size. Default is ``16``.
    port (int | str): Service port number. Default is ``auto`` (random assignment).
    host (str): Service binding IP address. Default is ``0.0.0.0``.
    seed (int): Random seed. Default is ``0``.
    tokenizer_mode (str): Tokenizer loading mode. Default is ``auto``.
    max-num-seqs (int): Maximum number of concurrent requests supported by the inference engine. Default is ``256``.
    pipeline-parallel-size (int): Pipeline parallelism size. Default is ``1``.
    max-num-batched-tokens (int): Maximum number of batched tokens. Default is ``64000``.


Examples:
    >>> from lazyllm import deploy
    >>> infer = deploy.vllm()
    """
    # keys_name_handle/default_headers/message_format will lose efficacy when openai_api is True
    keys_name_handle = {'inputs': 'prompt', 'stop': 'stop'}
    default_headers = {'Content-Type': 'application/json'}
    message_format = {
        'prompt': 'Who are you ?',
        'stream': False,
        'stop': ['<|im_end|>', '<|im_start|>', '</s>', '<|assistant|>', '<|user|>', '<|system|>', '<eos>'],
        'skip_special_tokens': False,
        'temperature': 0.6,
        'top_p': 0.8,
        'max_tokens': 4096
    }
    auto_map = {
        'tp': 'tensor_parallel_size'
    }  # from cli to vllm
    optional_keys = set([
        'max_model_len',
        'gpu_memory_utilization',
        'task',
        'dtype',
        'kv_cache_dtype',
        'tokenizer_mode',
        'block_size',
        'max_num_seqs',
        'pipeline_parallel_size',
        'tensor_parallel_size',
        'seed',
        'port',
        'max_num_batched_tokens',
        'tool_call_parser',
        'swap_space',
        'mm_processor_kwargs',
        'limit_mm_per_prompt',
        'hf_overrides'])

    # TODO(wangzhihong): change default value for `openai_api` argument to True
    def __init__(self, trust_remote_code: bool = True, launcher: LazyLLMLaunchersBase = launchers.remote(ngpus=1),  # noqa B008
                 log_path: str = None, openai_api: Optional[bool] = None, **kw):
        self.launcher_list, launcher = reallocate_launcher(launcher)
        super().__init__(launcher=launcher)
        self.kw = ArgsDict({
            'host': '0.0.0.0',
            'max_model_len': 10240,
        })
        if openai_api is None: openai_api = lazyllm.config['openai_api']
        self._vllm_cmd = 'vllm.entrypoints.openai.api_server' if openai_api else 'vllm.entrypoints.api_server'
        self._openai_api = openai_api
        self.options_keys = kw.pop('options_keys', [])
        if trust_remote_code and 'trust_remote_code' not in self.options_keys:
            self.options_keys.append('trust_remote_code')
        self.kw.update(**{key: kw[key] for key in self.optional_keys if key in kw})
        self.kw.check_and_update(kw)
        self.random_port = False if 'port' in kw and kw['port'] and kw['port'] != 'auto' else True
        self.temp_folder = make_log_dir(log_path, 'vllm') if log_path else None
        if self.launcher_list:
            ray_launcher = [Distributed(launcher=launcher) for launcher in self.launcher_list]
            parall_launcher = [lazyllm.pipeline(sleep_moment, launcher) for launcher in ray_launcher[1:]]
            self._prepare_deploy = lazyllm.pipeline(
                ray_launcher[0], post_action=(lazyllm.parallel(*parall_launcher) if len(parall_launcher) else None))

    def cmd(self, finetuned_model=None, base_model=None, master_ip=None):
        """Build the command to launch the vLLM inference service.

This method validates the model path and constructs an executable command string based on current configuration. In distributed mode, it will also prepend the ray cluster start command.

Args:
    finetuned_model (str): Path to the fine-tuned model.
    base_model (str): Fallback base model path if finetuned_model is invalid.
    master_ip (str): IP address of the master node in a distributed setup.

**Returns:**

- LazyLLMCMD: The command object with shell instruction, return value handler, and health checker.
"""
        if finetuned_model:
            LOG.info(f'Using finetuned model from {finetuned_model} to deploy.')
        if not finetuned_model:
            LOG.info(f'Using model {base_model} to deploy.')
            finetuned_model = base_model
        elif not os.path.exists(finetuned_model):
            LOG.warning(f'Warning! The finetuned_model path does not exist: {finetuned_model}. '
                        f'Using base_model({base_model}) instead.')
            finetuned_model = base_model
        elif not any(filename.endswith(('.bin', '.safetensors', '.pt'))
                     for filename in os.listdir(finetuned_model)):
            LOG.warning(f'Warning! No valid model files (.bin, .safetensors or .pt) found in: {finetuned_model}. '
                        f'Using base_model({base_model}) instead.')
            finetuned_model = base_model

        def impl():
            if self.random_port:
                self.kw['port'] = random.randint(30000, 40000)

            cmd = ''
            if self.launcher_list:
                cmd += f'ray start --address="{master_ip}" && '
            cmd += f'{sys.executable} -m {self._vllm_cmd} --model {finetuned_model} '
            if self._openai_api: cmd += '--served-model-name lazyllm '
            cmd += self.kw.parse_kwargs()
            cmd += ' ' + parse_options_keys(self.options_keys)
            if self.temp_folder: cmd += f' 2>&1 | tee {get_log_path(self.temp_folder)}'
            return cmd

        return LazyLLMCMD(cmd=impl, return_value=self.geturl,
                          checkf=(verify_vllm_openai_func if self._openai_api else verify_fastapi_func))

    def geturl(self, job=None):
        """Get the inference service URL for the vLLM deployment.

Depending on the execution mode (Display or actual deployment), this method returns the appropriate URL for accessing the model's generate endpoint.

Args:
    job (Job, optional): Deployment job object. Defaults to the module's associated job.

**Returns:**

- str: The HTTP URL for inference service.
"""
        if job is None:
            job = self.job
        if lazyllm.config['mode'] == lazyllm.Mode.Display:
            return 'http://{ip}:{port}/generate'
        else:
            return f'http://{job.get_jobip()}:{self.kw["port"]}' + (
                '/v1/' if self._openai_api else '/generate')

    @staticmethod
    def extract_result(x, inputs):
        """Extracts the inference result from the JSON string returned by the VLLM service.

Args:
    x (str): Raw JSON string returned from the VLLM service.
    inputs (Any): Input arguments (not used here, kept for interface consistency).

**Returns:**

- str: The generated text result from the model.
"""
        return json.loads(x)['text'][0]

cmd(finetuned_model=None, base_model=None, master_ip=None)

Build the command to launch the vLLM inference service.

This method validates the model path and constructs an executable command string based on current configuration. In distributed mode, it will also prepend the ray cluster start command.

Parameters:

  • finetuned_model (str, default: None ) –

    Path to the fine-tuned model.

  • base_model (str, default: None ) –

    Fallback base model path, used when finetuned_model is empty, does not exist, or contains no .bin, .safetensors, or .pt files.

  • master_ip (str, default: None ) –

    IP address of the master node in a distributed setup.

Returns:

  • LazyLLMCMD: The command object with shell instruction, return value handler, and health checker.
Source code in lazyllm/components/deploy/vllm.py
    def cmd(self, finetuned_model=None, base_model=None, master_ip=None):
        """Build the command to launch the vLLM inference service.

This method validates the model path and constructs an executable command string based on current configuration. In distributed mode, it will also prepend the ray cluster start command.

Args:
    finetuned_model (str): Path to the fine-tuned model.
    base_model (str): Fallback base model path if finetuned_model is invalid.
    master_ip (str): IP address of the master node in a distributed setup.

**Returns:**

- LazyLLMCMD: The command object with shell instruction, return value handler, and health checker.
"""
        if finetuned_model:
            LOG.info(f'Using finetuned model from {finetuned_model} to deploy.')
        if not finetuned_model:
            LOG.info(f'Using model {base_model} to deploy.')
            finetuned_model = base_model
        elif not os.path.exists(finetuned_model):
            LOG.warning(f'Warning! The finetuned_model path does not exist: {finetuned_model}. '
                        f'Using base_model({base_model}) instead.')
            finetuned_model = base_model
        elif not any(filename.endswith(('.bin', '.safetensors', '.pt'))
                     for filename in os.listdir(finetuned_model)):
            LOG.warning(f'Warning! No valid model files (.bin, .safetensors or .pt) found in: {finetuned_model}. '
                        f'Using base_model({base_model}) instead.')
            finetuned_model = base_model

        def impl():
            if self.random_port:
                self.kw['port'] = random.randint(30000, 40000)

            cmd = ''
            if self.launcher_list:
                cmd += f'ray start --address="{master_ip}" && '
            cmd += f'{sys.executable} -m {self._vllm_cmd} --model {finetuned_model} '
            if self._openai_api: cmd += '--served-model-name lazyllm '
            cmd += self.kw.parse_kwargs()
            cmd += ' ' + parse_options_keys(self.options_keys)
            if self.temp_folder: cmd += f' 2>&1 | tee {get_log_path(self.temp_folder)}'
            return cmd

        return LazyLLMCMD(cmd=impl, return_value=self.geturl,
                          checkf=(verify_vllm_openai_func if self._openai_api else verify_fastapi_func))
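
To make the shape of the generated command easier to see, here is a rough sketch of the assembly steps in `impl`. It is not the library's code: `build_command` is a hypothetical function, and the `--key=value` and `--flag` renderings stand in for `parse_kwargs` and `parse_options_keys`, whose exact output format is an assumption here.

```python
import sys


def build_command(model, options, flags=(), openai_api=False, master_ip=None):
    """Assemble a vLLM launch command string (illustrative only)."""
    module = ('vllm.entrypoints.openai.api_server' if openai_api
              else 'vllm.entrypoints.api_server')
    parts = []
    if master_ip:
        # Distributed case: join the ray cluster before starting the server.
        parts.append(f'ray start --address="{master_ip}" &&')
    parts.append(f'{sys.executable} -m {module} --model {model}')
    if openai_api:
        parts.append('--served-model-name lazyllm')
    # Assumed rendering of keyword options and boolean flags.
    parts.extend(f'--{key}={value}' for key, value in options.items())
    parts.extend(f'--{flag}' for flag in flags)
    return ' '.join(parts)


command = build_command(
    '/models/demo',
    {'host': '0.0.0.0', 'port': 31000},
    flags=['trust_remote_code'],
    openai_api=True,
)
```

Because the command is built inside a closure, the random port is chosen when the command is actually produced rather than when the deployer is constructed.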

geturl(job=None)

Get the inference service URL for the vLLM deployment.

In Display mode this method returns the template http://{ip}:{port}/generate. Otherwise it returns the live service URL, which ends in /v1/ when the OpenAI-compatible API is enabled and in /generate when it is not.

Parameters:

  • job (Job, default: None ) –

    Deployment job object. Defaults to the module's associated job.

Returns:

  • str: The HTTP URL for inference service.
Source code in lazyllm/components/deploy/vllm.py
    def geturl(self, job=None):
        """Get the inference service URL for the vLLM deployment.

Depending on the execution mode (Display or actual deployment), this method returns the appropriate URL for accessing the model's generate endpoint.

Args:
    job (Job, optional): Deployment job object. Defaults to the module's associated job.

**Returns:**

- str: The HTTP URL for inference service.
"""
        if job is None:
            job = self.job
        if lazyllm.config['mode'] == lazyllm.Mode.Display:
            return 'http://{ip}:{port}/generate'
        else:
            return f'http://{job.get_jobip()}:{self.kw["port"]}' + (
                '/v1/' if self._openai_api else '/generate')
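
The endpoint choice made by `geturl` outside Display mode reduces to a single suffix decision, sketched below. `service_url` is a hypothetical standalone function that takes the IP and port as plain arguments instead of reading them from a job object.

```python
def service_url(ip, port, openai_api=False):
    """Build the inference URL for a running service (illustrative only)."""
    # The OpenAI-compatible server is addressed by its /v1/ base path,
    # the plain API server by its /generate endpoint.
    suffix = '/v1/' if openai_api else '/generate'
    return f'http://{ip}:{port}{suffix}'


plain_url = service_url('10.0.0.5', 31000)
openai_url = service_url('10.0.0.5', 31000, openai_api=True)
```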

extract_result(x, inputs) staticmethod

Extracts the inference result from the JSON string returned by the VLLM service.

Parameters:

  • x (str) –

    Raw JSON string returned from the VLLM service.

  • inputs (Any) –

    Input arguments (not used here, kept for interface consistency).

Returns:

  • str: The generated text result from the model.
Source code in lazyllm/components/deploy/vllm.py
    @staticmethod
    def extract_result(x, inputs):
        """Extracts the inference result from the JSON string returned by the VLLM service.

Args:
    x (str): Raw JSON string returned from the VLLM service.
    inputs (Any): Input arguments (not used here, kept for interface consistency).

**Returns:**

- str: The generated text result from the model.
"""
        return json.loads(x)['text'][0]
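
A quick way to see what `extract_result` expects is to run the same expression on a hand-written body. The sample JSON below is an assumption modelled on the keys the method reads, and `first_completion` is a hypothetical name for the standalone copy of the logic.

```python
import json


def first_completion(raw):
    """Return the first entry of the "text" list in a response body."""
    return json.loads(raw)['text'][0]


reply = '{"text": ["Who are you ? I am an assistant."]}'
answer = first_completion(reply)
```

A body without a "text" key, or with an empty list, raises `KeyError` or `IndexError` respectively, so callers that cannot trust the response may want to catch both.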

lazyllm.components.deploy.LMDeploy

Bases: LazyLLMDeployBase

The LMDeploy class, a subclass of LazyLLMDeployBase, leverages LMDeploy to launch and manage large language model inference services.

Parameters:

  • launcher (Optional[launcher], default: remote(ngpus=1) ) –

    The service launcher, defaults to launchers.remote(ngpus=1).

  • trust_remote_code (bool, default: True ) –

    Whether to trust remote code, defaults to True.

  • log_path (Optional[str], default: None ) –

    Path to store logs, defaults to None.

  • **kw

    Keyword arguments used to update the default deployment configuration. No extra arguments beyond those listed below are allowed.

Other Parameters:

  • tp (int) –

    Tensor parallelism factor, defaults to 1.

  • server-name (str) –

    The IP address on which the service listens, defaults to 0.0.0.0.

  • server-port (Optional[int]) –

    Port number for the service. Defaults to None, in which case a random port between 30000 and 40000 is assigned.

  • max-batch-size (int) –

    Maximum batch size, defaults to 128.

  • chat-template (Optional[str]) –

    Path to the chat template file. If the model is not a vision-language model and no template is specified, a default template will be used.

  • eager-mode (bool) –

    Whether to enable eager mode, controlled by the environment variable LMDEPLOY_EAGER_MODE, defaults to False.
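
The rule that only the listed keyword arguments are accepted can be illustrated with a small sketch. It does not reproduce `ArgsDict.check_and_update`: `merge_options` is a hypothetical function, and raising `ValueError` for unknown keys is an assumption about how the rejection could look.

```python
# Defaults copied from the LMDeploy constructor in the source listing.
DEFAULTS = {
    'server-name': '0.0.0.0',
    'server-port': None,
    'tp': 1,
    'max-batch-size': 128,
}


def merge_options(overrides):
    """Merge overrides into the defaults, rejecting unsupported keys."""
    unknown = set(overrides) - set(DEFAULTS)
    if unknown:
        raise ValueError(f'unsupported options: {sorted(unknown)}')
    return {**DEFAULTS, **overrides}


merged = merge_options({'tp': 2, 'server-port': 31000})

try:
    merge_options({'gpu_count': 4})
    rejected = False
except ValueError:
    rejected = True
```

Hyphenated names such as server-port are not valid Python identifiers, so when passed as keyword arguments they have to be unpacked from a dictionary, for example `**{'server-port': 31000}`.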

Examples:

>>> # Basic use:
>>> from lazyllm import deploy
>>> infer = deploy.LMDeploy()
>>>
>>> # MultiModal:
>>> import lazyllm
>>> from lazyllm import deploy, globals
>>> from lazyllm.components.formatter import encode_query_with_filepaths
>>> chat = lazyllm.TrainableModule('InternVL3_5-1B').deploy_method(deploy.LMDeploy)
>>> chat.update_server()
>>> inputs = encode_query_with_filepaths('What is it?', ['path/to/image'])
>>> res = chat(inputs)
Source code in lazyllm/components/deploy/lmdeploy.py
class LMDeploy(LazyLLMDeployBase):
    """The ``LMDeploy`` class, a subclass of ``LazyLLMDeployBase``,  
leverages [LMDeploy](https://github.com/InternLM/lmdeploy) to launch and manage large language model inference services.

Args:
    launcher (Optional[lazyllm.launcher]): The service launcher, defaults to ``launchers.remote(ngpus=1)``.  
    trust_remote_code (bool): Whether to trust remote code, defaults to ``True``.  
    log_path (Optional[str]): Path to store logs, defaults to ``None``.  
    **kw: Keyword arguments used to update the default deployment configuration. No extra arguments beyond those listed below are allowed.  

Keyword Args:
    tp (int): Tensor parallelism factor, defaults to ``1``.  
    server-name (str): The IP address on which the service listens, defaults to ``0.0.0.0``.  
    server-port (Optional[int]): Port number for the service. Defaults to ``None``; in this case, a random port between 30000–40000 will be assigned.  
    max-batch-size (int): Maximum batch size, defaults to ``128``.  
    chat-template (Optional[str]): Path to the chat template file. If the model is not a vision-language model and no template is specified, a default template will be used.  
    eager-mode (bool): Whether to enable eager mode, controlled by the environment variable ``LMDEPLOY_EAGER_MODE``, defaults to ``False``.  


Examples:
    >>> # Basic use:
    >>> from lazyllm import deploy
    >>> infer = deploy.LMDeploy()
    >>>
    >>> # MultiModal:
    >>> import lazyllm
    >>> from lazyllm import deploy, globals
    >>> from lazyllm.components.formatter import encode_query_with_filepaths
    >>> chat = lazyllm.TrainableModule('InternVL3_5-1B').deploy_method(deploy.LMDeploy)
    >>> chat.update_server()
    >>> inputs = encode_query_with_filepaths('What is it?', ['path/to/image'])
    >>> res = chat(inputs)
    """
    keys_name_handle = {
        'inputs': 'prompt',
        'stop': 'stop',
    }
    default_headers = {'Content-Type': 'application/json'}
    message_format = {
        'prompt': 'Who are you ?',
        'stream': False,
        'stop': None,
        'top_p': 0.8,
        'temperature': 0.8,
        'skip_special_tokens': True,
    }
    auto_map = {
        'port': 'server-port',
        'host': 'server-name',
        'max_batch_size': 'max-batch-size',
    }
    stream_parse_parameters = {'delimiter': b'\n'}

    def __init__(self, launcher=launchers.remote(ngpus=1), trust_remote_code=True, log_path=None, **kw):  # noqa B008
        super().__init__(launcher=launcher)
        self.kw = ArgsDict({
            'server-name': '0.0.0.0',
            'server-port': None,
            'tp': 1,
            'max-batch-size': 128,
        })
        self.options_keys = kw.pop('options_keys', [])
        self.kw.check_and_update(kw)
        self._trust_remote_code = trust_remote_code
        self.random_port = False if 'server-port' in kw and kw['server-port'] else True
        self.temp_folder = make_log_dir(log_path, 'lmdeploy') if log_path else None

    def cmd(self, finetuned_model=None, base_model=None):
        """This method generates the command to start the LMDeploy service.

Args:
    finetuned_model (str): Path to the fine-tuned model.
    base_model (str): Path to the base model, used when finetuned_model is invalid.

**Returns:**

- LazyLLMCMD: A LazyLLMCMD object containing the startup command.
"""
        if not os.path.exists(finetuned_model) or \
            not any(filename.endswith('.bin') or filename.endswith('.safetensors')
                    for filename in os.listdir(finetuned_model)):
            if not finetuned_model:
                LOG.warning(f'Note! That finetuned_model({finetuned_model}) is an invalid path, '
                            f'base_model({base_model}) will be used')
            finetuned_model = base_model

        def impl():
            if self.random_port:
                self.kw['server-port'] = random.randint(30000, 40000)
            cmd = f'lmdeploy serve api_server {finetuned_model} --model-name lazyllm '

            if importlib.util.find_spec('torch_npu') is not None: cmd += '--device ascend '
            if config['lmdeploy_eager_mode']: cmd += '--eager-mode '
            cmd += self.kw.parse_kwargs()
            cmd += ' ' + parse_options_keys(self.options_keys)
            if self.temp_folder: cmd += f' 2>&1 | tee {get_log_path(self.temp_folder)}'
            return cmd

        return LazyLLMCMD(cmd=impl, return_value=self.geturl, checkf=verify_fastapi_func)

    def geturl(self, job=None):
        """Get the URL address of the LMDeploy service.

Args:
    job (optional): Job object, defaults to None, in which case self.job is used.

**Returns:**

- str: The service URL address in the format "http://{ip}:{port}/v1/".
"""
        if job is None:
            job = self.job
        if lazyllm.config['mode'] == lazyllm.Mode.Display:
            return 'http://{ip}:{port}/v1/'
        else:
            return f'http://{job.get_jobip()}:{self.kw["server-port"]}/v1/'

    @staticmethod
    def extract_result(x, inputs):
        """Parses the model inference result and extracts the text output from a JSON response string.

Args:
    x (str): JSON-formatted string returned by the model.  
    inputs (dict): The original input data (not directly used, reserved for interface compatibility).  

**Returns:**

- str: The text result extracted from the response.  
"""
        return json.loads(x)['text']

cmd(finetuned_model=None, base_model=None)

This method generates the command to start the LMDeploy service.

Parameters:

  • finetuned_model (str, default: None ) –

    Path to the fine-tuned model.

  • base_model (str, default: None ) –

    Path to the base model, used when finetuned_model is invalid.

Returns:

  • LazyLLMCMD: A LazyLLMCMD object containing the startup command.
Source code in lazyllm/components/deploy/lmdeploy.py
    def cmd(self, finetuned_model=None, base_model=None):
        """This method generates the command to start the LMDeploy service.

Args:
    finetuned_model (str): Path to the fine-tuned model.
    base_model (str): Path to the base model, used when finetuned_model is invalid.

**Returns:**

- LazyLLMCMD: A LazyLLMCMD object containing the startup command.
"""
        if not os.path.exists(finetuned_model) or \
            not any(filename.endswith('.bin') or filename.endswith('.safetensors')
                    for filename in os.listdir(finetuned_model)):
            if not finetuned_model:
                LOG.warning(f'Note! That finetuned_model({finetuned_model}) is an invalid path, '
                            f'base_model({base_model}) will be used')
            finetuned_model = base_model

        def impl():
            if self.random_port:
                self.kw['server-port'] = random.randint(30000, 40000)
            cmd = f'lmdeploy serve api_server {finetuned_model} --model-name lazyllm '

            if importlib.util.find_spec('torch_npu') is not None: cmd += '--device ascend '
            if config['lmdeploy_eager_mode']: cmd += '--eager-mode '
            cmd += self.kw.parse_kwargs()
            cmd += ' ' + parse_options_keys(self.options_keys)
            if self.temp_folder: cmd += f' 2>&1 | tee {get_log_path(self.temp_folder)}'
            return cmd

        return LazyLLMCMD(cmd=impl, return_value=self.geturl, checkf=verify_fastapi_func)
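
The fallback from finetuned_model to base_model described above can be sketched in isolation. This is a minimal illustration, not the library's implementation: `resolve_model_path` and `WEIGHT_SUFFIXES` are hypothetical names, and unlike the listing the sketch also treats `None` as an invalid path instead of passing it to `os.path.exists`.

```python
import os
from typing import Optional

# Weight file suffixes that mark a directory as a usable model directory.
WEIGHT_SUFFIXES = ('.bin', '.safetensors')


def resolve_model_path(finetuned_model: Optional[str], base_model: Optional[str]) -> Optional[str]:
    """Hypothetical helper: prefer finetuned_model when it holds weight files."""
    if finetuned_model and os.path.isdir(finetuned_model):
        # str.endswith accepts a tuple, so one call covers both suffixes.
        if any(name.endswith(WEIGHT_SUFFIXES) for name in os.listdir(finetuned_model)):
            return finetuned_model
    # Missing, empty, or weight-less directory: fall back to the base model.
    return base_model
```

The returned path is what would be placed after `lmdeploy serve api_server` in the generated command.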

geturl(job=None)

Get the URL address of the LMDeploy service.

Parameters:

  • job (optional, default: None ) –

    Job object, defaults to None, in which case self.job is used.

Returns:

  • str: The service URL address in the format "http://{ip}:{port}/v1/".
Source code in lazyllm/components/deploy/lmdeploy.py
    def geturl(self, job=None):
        """Get the URL address of the LMDeploy service.

Args:
    job (optional): Job object, defaults to None, in which case self.job is used.

**Returns:**

- str: The service URL address in the format "http://{ip}:{port}/v1/".
"""
        if job is None:
            job = self.job
        if lazyllm.config['mode'] == lazyllm.Mode.Display:
            return 'http://{ip}:{port}/v1/'
        else:
            return f'http://{job.get_jobip()}:{self.kw["server-port"]}/v1/'

extract_result(x, inputs) staticmethod

Parses the model inference result and extracts the text output from a JSON response string.

Parameters:

  • x (str) –

    JSON-formatted string returned by the model.

  • inputs (dict) –

    The original input data (not directly used, reserved for interface compatibility).

Returns:

  • str: The text result extracted from the response.
Source code in lazyllm/components/deploy/lmdeploy.py
    @staticmethod
    def extract_result(x, inputs):
        """Parses the model inference result and extracts the text output from a JSON response string.

Args:
    x (str): JSON-formatted string returned by the model.  
    inputs (dict): The original input data (not directly used, reserved for interface compatibility).  

**Returns:**

- str: The text result extracted from the response.  
"""
        return json.loads(x)['text']
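
As a small illustration of what extract_result does, the sketch below parses a JSON string and returns its `text` field. The sample payload, including the extra `tokens` field, is an assumption made for the example and not a documented LMDeploy response.

```python
import json


def extract_text(raw: str) -> str:
    """Return the 'text' field of a JSON-formatted response string."""
    return json.loads(raw)['text']


# Assumed sample response; only the 'text' key is relied upon.
sample = json.dumps({'text': 'Hello!', 'tokens': 2})
print(extract_text(sample))
```

A response without a `text` key raises `KeyError`, and a non-JSON string raises `json.JSONDecodeError`, so callers may want to handle both.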

lazyllm.components.deploy.base.DummyDeploy

Bases: LazyLLMDeployBase, Pipeline

DummyDeploy(launcher=launchers.remote(sync=False), *, stream=False, **kw)

A mock deployment class for testing purposes. It extends both LazyLLMDeployBase and flows.Pipeline, simulating a simple pipeline-style deployable service with optional streaming support.

This class is primarily intended for internal testing and demonstration. It receives inputs in the format defined by message_format, and returns a dummy response or a streaming response depending on the stream flag.

Parameters:

  • launcher

    Deployment launcher instance, defaulting to launchers.remote(sync=False).

  • stream (bool, default: False ) –

    Whether to simulate streaming output.

  • kw

    Additional keyword arguments passed to the superclass.

Call Arguments

keys_name_handle (dict): Mapping of input keys for request formatting.

message_format (dict): Default request template including input and generation parameters.

Source code in lazyllm/components/deploy/base.py
class DummyDeploy(LazyLLMDeployBase, flows.Pipeline):
    """DummyDeploy(launcher=launchers.remote(sync=False), *, stream=False, **kw)

A mock deployment class for testing purposes. It extends both `LazyLLMDeployBase` and `flows.Pipeline`,
simulating a simple pipeline-style deployable service with optional streaming support.

This class is primarily intended for internal testing and demonstration. It receives inputs in the format defined
by `message_format`, and returns a dummy response or a streaming response depending on the `stream` flag.

Args:
    launcher: Deployment launcher instance, defaulting to `launchers.remote(sync=False)`.
    stream (bool): Whether to simulate streaming output.
    kw: Additional keyword arguments passed to the superclass.

Call Arguments:
    keys_name_handle (dict): Mapping of input keys for request formatting. 

    message_format (dict): Default request template including input and generation parameters. 

"""
    keys_name_handle = {'inputs': 'inputs'}
    message_format = {
        'inputs': '',
        'parameters': {
            'do_sample': False,
            'temperature': 0.1,
        }
    }

    def __init__(self, launcher=launchers.remote(sync=False), *, stream=False, **kw):  # noqa B008
        super().__init__(launcher=launcher)

        def func():

            def impl(x):
                LOG.info(f'input is {x["inputs"]}, parameters is {x["parameters"]}')
                return f'reply for {x["inputs"]}, and parameters is {x["parameters"]}'

            def impl_stream(x):
                for s in ['reply', ' for', f' {x["inputs"]}', ', and',
                          ' parameters', ' is', f' {x["parameters"]}']:
                    yield s
                    time.sleep(0.2)
            return impl_stream if stream else impl
        flows.Pipeline.__init__(self, func,
                                lazyllm.deploy.RelayServer(port=random.randint(30000, 40000), launcher=launcher))

    def __call__(self, *args):
        url = flows.Pipeline.__call__(self)
        LOG.info(f'dummy deploy url is : {url}')
        return url

    def __repr__(self):
        return flows.Pipeline.__repr__(self)
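
The two reply modes can be tried without starting a service. The sketch below copies the reply logic from the listing into two standalone functions (`reply` and `reply_stream` are names chosen for this example) and drops the `time.sleep` pause between chunks.

```python
from typing import Any, Dict, Iterator


def reply(x: Dict[str, Any]) -> str:
    """Non-streaming mode: build the whole dummy reply at once."""
    return f'reply for {x["inputs"]}, and parameters is {x["parameters"]}'


def reply_stream(x: Dict[str, Any]) -> Iterator[str]:
    """Streaming mode: yield the same reply piece by piece."""
    for chunk in ['reply', ' for', f' {x["inputs"]}', ', and',
                  ' parameters', ' is', f' {x["parameters"]}']:
        yield chunk


# A request shaped like message_format.
request = {'inputs': 'hello', 'parameters': {'do_sample': False, 'temperature': 0.1}}
assert ''.join(reply_stream(request)) == reply(request)
```

Joining the streamed chunks reproduces the non-streaming reply, which makes the class convenient for testing client code in both modes.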

lazyllm.components.auto.AutoDeploy

Bases: LazyLLMDeployBase

This class is a subclass of LazyLLMDeployBase that automatically selects the appropriate inference framework and parameters based on the input arguments for inference with large language models.

Specifically, based on base_model, max_token_num, and the type and number of GPUs in launcher, this class automatically selects an appropriate inference framework (such as Lightllm or Vllm) and the parameters it requires.

Parameters:

  • base_model (str) –

    The base model to deploy, given as a model name or a path to the model. Used to provide base model information.

  • source (config[model_source]) –

    Specifies the model download source. This can be configured by setting the environment variable LAZYLLM_MODEL_SOURCE.

  • trust_remote_code (bool) –

    Whether to allow loading of model code from remote servers, default is True.

  • launcher (launcher, default: None ) –

    The launcher for deployment. If not provided, launchers.remote is used: with ngpus=1 for non-LLM model types, and with a GPU count estimated from the model size for LLMs.

  • stream (bool) –

    Whether the response is streaming, default is False.

  • type (str) –

    Model type, default is None, in which case the type is inferred from the model name. Besides llm, the types embed, cross_modal_embed, rerank, sd, stt, tts, vlm and ocr are handled.

  • max_token_num (int) –

    The maximum token length of the model input, default is 1024.

  • kw

    Keyword arguments used to update the default deployment parameters. Which keyword arguments are accepted depends on the framework that LazyLLM selects, so set them with care.

Examples:

>>> from lazyllm import deploy
>>> deploy.auto('internlm2-chat-7b')
<lazyllm.llm.deploy type=Lightllm>
Source code in lazyllm/components/auto/autodeploy.py
class AutoDeploy(LazyLLMDeployBase):
    """This class is a subclass of ``LazyLLMDeployBase`` that automatically selects the appropriate inference framework and parameters based on the input arguments for inference with large language models.

Specifically, based on the input ``base_model`` parameters, ``max_token_num``, the type and number of GPUs in ``launcher``, this class can automatically select the appropriate inference framework (such as ``Lightllm`` or ``Vllm``) and the required parameters.

Args:
    base_model (str): The base model to deploy, given as a model name or a path to the model. Used to provide base model information.
    source (lazyllm.config['model_source']): Specifies the model download source. This can be configured by setting the environment variable ``LAZYLLM_MODEL_SOURCE``.
    trust_remote_code (bool): Whether to allow loading of model code from remote servers, default is ``True``.
    launcher (lazyllm.launcher): The launcher for deployment, default is ``None``. If not provided, ``launchers.remote`` is used: with ``ngpus=1`` for non-LLM model types, and with a GPU count estimated from the model size for LLMs.
    stream (bool): Whether the response is streaming, default is ``False``.
    type (str): Model type, default is ``None``, in which case the type is inferred from the model name. Besides ``llm``, the types ``embed``, ``cross_modal_embed``, ``rerank``, ``sd``, ``stt``, ``tts``, ``vlm`` and ``ocr`` are handled.
    max_token_num (int): The maximum token length of the model input, default is ``1024``.
    kw: Keyword arguments used to update the default deployment parameters. Which keyword arguments are accepted depends on the framework that LazyLLM selects, so set them with care.


Examples:
    >>> from lazyllm import deploy
    >>> deploy.auto('internlm2-chat-7b')
    <lazyllm.llm.deploy type=Lightllm> 
    """
    @staticmethod
    def _get_embed_deployer(launcher, type, kw):
        launcher = launcher or launchers.remote(ngpus=1)
        kw['model_type'] = type
        if lazyllm.config['default_embedding_engine'].lower() in ('transformers', 'flagembedding') \
            or kw.get('embed_type')=='sparse' or not check_requirements('infinity_emb'):
            return deploy.Rerank if type == 'rerank' else deploy.Embedding, launcher, kw
        else:
            return deploy.InfinityRerank if type == 'rerank' else deploy.Infinity, launcher, kw

    @classmethod
    def get_deployer(cls, base_model: str, source: Optional[str] = None, trust_remote_code: bool = True,
                     launcher: Optional[LazyLLMLaunchersBase] = None, type: Optional[str] = None,
                     log_path: Optional[str] = None, **kw):
        """Get corresponding deployer class based on model type.

Automatically detects model type and returns the most suitable deployer class, 
launcher and configuration parameters.

Args:
    base_model (str): Base model name or path.
    source (Optional[str], optional): Model source.
    trust_remote_code (bool, optional): Whether to trust remote code.
    launcher (Optional[LazyLLMLaunchersBase], optional): Launcher instance.
    type (Optional[str], optional): Model type.
    log_path (Optional[str], optional): Log file path.
    **kw: Other configuration parameters.

**Returns:**

- Tuple: Returns (deployer class, launcher, configuration parameters dict) triple.
"""
        model_name = get_model_name(base_model)
        kw['log_path'], kw['trust_remote_code'] = log_path, trust_remote_code
        if not type:
            type = ModelManager.get_model_type(model_name)

        if type in ('embed', 'cross_modal_embed', 'rerank'):
            return AutoDeploy._get_embed_deployer(launcher, type, kw)
        elif type == 'sd':
            return StableDiffusionDeploy, launcher or launchers.remote(ngpus=1), kw
        elif type == 'stt':
            return SenseVoiceDeploy, launcher or launchers.remote(ngpus=1), kw
        elif type == 'tts':
            return TTSDeploy.get_deploy_cls(model_name), launcher or launchers.remote(ngpus=1), kw
        elif type == 'vlm':
            return deploy.LMDeploy, launcher or launchers.remote(ngpus=1), kw
        elif type == 'ocr':
            return OCRDeploy, launcher or launchers.remote(ngpus=1), kw

        if not launcher:
            match = re.search(r'(\d+)[bB]', model_name)
            size = int(match.group(1)) if match else 0
            size = (size * 2) if 'awq' not in model_name.lower() else (size / 1.5)
            ngpus = (1 << (math.ceil(size * 2 * 0.6 / config['gpu_memory']) - 1).bit_length())
            launcher = launchers.remote(ngpus = ngpus)

        for deploy_cls in ['vllm', 'lightllm', 'lmdeploy', 'mindie']:
            if check_cmd(deploy_cls) or check_requirements(requirements.get(deploy_cls)):
                deploy_cls = getattr(deploy, deploy_cls)
                return deploy_cls, launcher, kw
        return deploy.auto, launcher, kw

    def __new__(cls, base_model, source=lazyllm.config['model_source'], trust_remote_code=True,
                launcher=None, type=None, log_path=None, **kw):
        deploy_cls, launcher, kw = __class__.get_deployer(
            base_model=base_model, source=source, trust_remote_code=trust_remote_code,
            launcher=launcher, type=type, log_path=log_path, **kw)
        return deploy_cls(launcher=launcher, **kw)

get_deployer(base_model, source=None, trust_remote_code=True, launcher=None, type=None, log_path=None, **kw) classmethod

Get the deployer class that corresponds to the model type.

Automatically detects the model type and returns the most suitable deployer class, together with the launcher and the configuration parameters.

Parameters:

  • base_model (str) –

    Base model name or path.

  • source (Optional[str], default: None ) –

    Model source.

  • trust_remote_code (bool, default: True ) –

    Whether to trust remote code.

  • launcher (Optional[LazyLLMLaunchersBase], default: None ) –

    Launcher instance.

  • type (Optional[str], default: None ) –

    Model type.

  • log_path (Optional[str], default: None ) –

    Log file path.

  • **kw

    Other configuration parameters.

Returns:

  • Tuple: A (deployer class, launcher, configuration parameters dict) triple.
Source code in lazyllm/components/auto/autodeploy.py
    @classmethod
    def get_deployer(cls, base_model: str, source: Optional[str] = None, trust_remote_code: bool = True,
                     launcher: Optional[LazyLLMLaunchersBase] = None, type: Optional[str] = None,
                     log_path: Optional[str] = None, **kw):
        """Get corresponding deployer class based on model type.

Automatically detects model type and returns the most suitable deployer class, 
launcher and configuration parameters.

Args:
    base_model (str): Base model name or path.
    source (Optional[str], optional): Model source.
    trust_remote_code (bool, optional): Whether to trust remote code.
    launcher (Optional[LazyLLMLaunchersBase], optional): Launcher instance.
    type (Optional[str], optional): Model type.
    log_path (Optional[str], optional): Log file path.
    **kw: Other configuration parameters.

**Returns:**

- Tuple: Returns (deployer class, launcher, configuration parameters dict) triple.
"""
        model_name = get_model_name(base_model)
        kw['log_path'], kw['trust_remote_code'] = log_path, trust_remote_code
        if not type:
            type = ModelManager.get_model_type(model_name)

        if type in ('embed', 'cross_modal_embed', 'rerank'):
            return AutoDeploy._get_embed_deployer(launcher, type, kw)
        elif type == 'sd':
            return StableDiffusionDeploy, launcher or launchers.remote(ngpus=1), kw
        elif type == 'stt':
            return SenseVoiceDeploy, launcher or launchers.remote(ngpus=1), kw
        elif type == 'tts':
            return TTSDeploy.get_deploy_cls(model_name), launcher or launchers.remote(ngpus=1), kw
        elif type == 'vlm':
            return deploy.LMDeploy, launcher or launchers.remote(ngpus=1), kw
        elif type == 'ocr':
            return OCRDeploy, launcher or launchers.remote(ngpus=1), kw

        if not launcher:
            match = re.search(r'(\d+)[bB]', model_name)
            size = int(match.group(1)) if match else 0
            size = (size * 2) if 'awq' not in model_name.lower() else (size / 1.5)
            ngpus = (1 << (math.ceil(size * 2 * 0.6 / config['gpu_memory']) - 1).bit_length())
            launcher = launchers.remote(ngpus = ngpus)

        for deploy_cls in ['vllm', 'lightllm', 'lmdeploy', 'mindie']:
            if check_cmd(deploy_cls) or check_requirements(requirements.get(deploy_cls)):
                deploy_cls = getattr(deploy, deploy_cls)
                return deploy_cls, launcher, kw
        return deploy.auto, launcher, kw
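
When no launcher is passed for an LLM, the listing estimates the GPU count from the parameter count found in the model name. The arithmetic is reproduced below as a standalone function so it can be checked by hand. `estimate_ngpus` is a name chosen for this example, and the default `gpu_memory=80` is an assumption standing in for `config['gpu_memory']`.

```python
import math
import re


def estimate_ngpus(model_name: str, gpu_memory: float = 80) -> int:
    """Mirror the GPU-count estimate used when no launcher is given."""
    # Parameter count in billions, taken from the first '<digits>b' in the name.
    match = re.search(r'(\d+)[bB]', model_name)
    size = int(match.group(1)) if match else 0
    # Names containing 'awq' are scaled down, all others are doubled.
    size = (size * 2) if 'awq' not in model_name.lower() else (size / 1.5)
    needed = math.ceil(size * 2 * 0.6 / gpu_memory)
    # Round up to the next power of two.
    return 1 << (needed - 1).bit_length()


for name in ['internlm2-chat-7b', 'llama-70b', 'qwen-72b-awq']:
    print(name, estimate_ngpus(name))
```

Under the assumed `gpu_memory=80`, a 70b model gives `ceil(140 * 2 * 0.6 / 80) = 3`, which is rounded up to 4 GPUs. A name without a size gives `needed = 0`, and because `(-1).bit_length()` is 1 in Python the result is 2 GPUs rather than 1.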

lazyllm.components.deploy.embed.AbstractEmbedding

Bases: ABC

Abstract base class that provides a unified interface and basic functionality for all embedding models. It defines the standard interface for embedding models, including model loading, calling, and serialization.

Parameters:

  • base_embed (str) –

    The base path or identifier of the embedding model, used to specify which embedding model to load.

  • source (str, default: None ) –

    Model source, defaults to None. If not specified, the default model source from the LazyLLM configuration is used.

  • init (bool, default: False ) –

    Whether to load the model immediately during initialization, defaults to False. If True, the load_embed() method is called as soon as the object is created.

Source code in lazyllm/components/deploy/embed.py
class AbstractEmbedding(ABC):
    """Abstract embedding base class that provides unified interface and basic functionality for all embedding models. This class defines the standard interface for embedding models, including model loading, calling, and serialization capabilities.

Args:
    base_embed (str): The base path or identifier of the embedding model, used to specify which embedding model to load.
    source (str, optional): Model source, default to ``None``. If not specified, will use the default model source from LazyLLM configuration.
    init (bool): Whether to load the model immediately during initialization, default to ``False``. If ``True``, will call the ``load_embed()`` method immediately when the object is created.
"""
    def __init__(self, base_embed, source=None, init=False):
        from ..utils.downloader import ModelManager
        self._source = source or lazyllm.config['model_source']
        self._base_embed = ModelManager(self._source).download(base_embed) or ''
        self._embed = None
        self._init = lazyllm.once_flag()
        if init:
            lazyllm.call_once(self._init, self.load_embed)

    @abstractmethod
    def load_embed(self) -> None:
        """Abstract method for loading embedding models. This method is implemented by subclasses to perform specific model loading logic.

**Note**: This method is currently under development.
"""
        pass

    @abstractmethod
    def _call(self, data: Dict[str, Union[str, List[str]]]) -> str:
        pass

    def __call__(self, data: Dict[str, Union[str, List[str]]]) -> str:
        lazyllm.call_once(self._init, self.load_embed)
        return self._call(data)

    def __reduce__(self):
        init = bool(os.getenv('LAZYLLM_ON_CLOUDPICKLE', None) == 'ON' or self._init)
        return self.__class__, (self._base_embed, self._source, init)

load_embed() abstractmethod

Abstract method for loading embedding models. This method is implemented by subclasses to perform specific model loading logic.

Note: This method is currently under development.

Source code in lazyllm/components/deploy/embed.py
    @abstractmethod
    def load_embed(self) -> None:
        """Abstract method for loading embedding models. This method is implemented by subclasses to perform specific model loading logic.

**Note**: This method is currently under development.
"""
        pass
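
The load-once behaviour controlled by init can be sketched with the standard library alone. The class below is a hypothetical stand-in, not the LazyLLM implementation: it uses a lock and a boolean flag where the listing uses `lazyllm.once_flag()` and `lazyllm.call_once`, and `LazyResource` and `loader` are names chosen for this example.

```python
import threading
from typing import Any, Callable


class LazyResource:
    """Load an expensive callable once, either eagerly or on first use."""

    def __init__(self, loader: Callable[[], Callable[[Any], Any]], init: bool = False):
        self._loader = loader
        self._model = None
        self._loaded = False
        self._lock = threading.Lock()
        if init:
            # Eager mode: load during construction, like init=True.
            self._ensure_loaded()

    def _ensure_loaded(self) -> None:
        # The lock keeps concurrent first calls from loading twice.
        with self._lock:
            if not self._loaded:
                self._model = self._loader()
                self._loaded = True

    def __call__(self, data: Any) -> Any:
        self._ensure_loaded()
        return self._model(data)
```

With `init=False` construction stays cheap and the first call pays the loading cost. With `init=True` the cost moves to construction, which is useful when the object is rebuilt inside a server process.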

lazyllm.components.deploy.EmbeddingDeploy

Bases: LazyLLMDeployBase

This class is a subclass of LazyLLMDeployBase, designed for deploying text embedding services. It supports both dense and sparse embedding methods, compatible with HuggingFace models and FlagEmbedding models.

Parameters:

  • launcher (Optional[launcher], default: None ) –

    The launcher instance, defaults to None.

  • model_type (Optional[str], default: 'embed' ) –

    Model type, defaults to 'embed'.

  • log_path (Optional[str], default: None ) –

    Path for log file, defaults to None.

  • embed_type (Optional[str], default: 'dense' ) –

    Embedding type, either 'dense' or 'sparse', defaults to 'dense'.

  • trust_remote_code (bool, default: True ) –

    Whether to trust remote code, defaults to True.

  • port (Optional[int], default: None ) –

    Service port number, defaults to None, in which case LazyLLM will generate a random port.

Call Arguments

finetuned_model (Optional[str]): Path or name of the fine-tuned model.

base_model (Optional[str]): Path or name of the base model, used when finetuned_model is invalid.

Message Format

The input is a dictionary containing the text and a list of images.

  • text (str): Text content to be encoded

  • images (Union[str, List[str]]): List of images to be encoded (optional)

Examples:

>>> from lazyllm import deploy
>>> embed_service = deploy.EmbeddingDeploy(embed_type='dense')
>>> embed_service('path/to/model')
Source code in lazyllm/components/deploy/embed.py
class EmbeddingDeploy(LazyLLMDeployBase):
    """This class is a subclass of ``LazyLLMDeployBase``, designed for deploying text embedding services. It supports both dense and sparse embedding methods, compatible with HuggingFace models and FlagEmbedding models.

Args:
    launcher (Optional[lazyllm.launcher]): The launcher instance, defaults to ``None``.
    model_type (Optional[str]): Model type, defaults to ``'embed'``.
    log_path (Optional[str]): Path for log file, defaults to ``None``.
    embed_type (Optional[str]): Embedding type, either ``'dense'`` or ``'sparse'``, defaults to ``'dense'``.
    trust_remote_code (bool): Whether to trust remote code, defaults to ``True``.
    port (Optional[int]): Service port number, defaults to ``None``, in which case LazyLLM will generate a random port.

Call Arguments:
    finetuned_model (Optional[str]): Path or name of the fine-tuned model. 

    base_model (Optional[str]): Path or name of the base model, used when finetuned_model is invalid. 


Message Format:
    Input format is a dictionary containing text and images list.

    - text (str): Text content to be encoded

    - images (Union[str, List[str]]): List of images to be encoded (optional)



Examples:
    >>> from lazyllm import deploy
    >>> embed_service = deploy.EmbeddingDeploy(embed_type='dense')
    >>> embed_service('path/to/model')
    """
    message_format = {
        'text': 'text',  # str,
        'images': []  # Union[str, List[str]]
    }
    keys_name_handle = {
        'inputs': 'text',
        'image': 'images'
    }
    default_headers = {'Content-Type': 'application/json'}

    def __init__(self, launcher: LazyLLMLaunchersBase = None, model_type: str = 'embed', log_path: Optional[str] = None,
                 embed_type: Optional[str] = 'dense', trust_remote_code: bool = True, port: Optional[int] = None, **kw):
        super().__init__(launcher=launcher)
        self._launcher = launcher
        self._port = port
        self._model_type = model_type
        self._log_path = log_path
        self._sparse_embed = True if embed_type == 'sparse' else False
        self._trust_remote_code = trust_remote_code
        self._port = port

    def _get_model_path(self, finetuned_model=None, base_model=None):
        if not os.path.exists(finetuned_model) or \
            not any(filename.endswith('.bin') or filename.endswith('.safetensors')
                    for filename in os.listdir(finetuned_model)):
            if not finetuned_model:
                LOG.warning(f'Note! That finetuned_model({finetuned_model}) is an invalid path, '
                            f'base_model({base_model}) will be used')
            finetuned_model = base_model
        return finetuned_model

    def __call__(self, finetuned_model=None, base_model=None):
        finetuned_model = self._get_model_path(finetuned_model, base_model)
        if self._sparse_embed or lazyllm.config['default_embedding_engine'] == 'flagEmbedding':
            return lazyllm.deploy.RelayServer(port=self._port, func=LazyFlagEmbedding(
                finetuned_model, sparse=self._sparse_embed),
                launcher=self._launcher, log_path=self._log_path, cls='embedding')()
        else:
            return lazyllm.deploy.RelayServer(port=self._port, func=HuggingFaceEmbedding(finetuned_model),
                                              launcher=self._launcher, log_path=self._log_path, cls='embedding')()
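
A request body matching message_format can be assembled as follows. This is only a sketch of the payload shape: `build_embed_request` is a hypothetical helper, and how the body is sent to the deployed service is outside the scope of the example.

```python
import json
from typing import List, Optional


def build_embed_request(text: str, images: Optional[List[str]] = None) -> str:
    """Serialize a payload with the 'text' and 'images' keys of message_format."""
    payload = {'text': text, 'images': images or []}
    return json.dumps(payload)


print(build_embed_request('What is machine learning?'))
```

The JSON body pairs with the `Content-Type: application/json` entry in default_headers.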

lazyllm.components.deploy.embed.RerankDeploy

Bases: EmbeddingDeploy

This class is a subclass of EmbeddingDeploy, designed for deploying reranking services. It supports text reranking using HuggingFace models.

Parameters:

  • launcher (launcher, default: None ) –

    The launcher instance, defaults to None.

  • model_type (str, default: 'embed' ) –

    Model type, defaults to 'embed'.

  • log_path (str, default: None ) –

    Path for log file, defaults to None.

  • trust_remote_code (bool, default: True ) –

    Whether to trust remote code, defaults to True.

  • port (int, default: None ) –

    Service port number, defaults to None, in which case LazyLLM will generate a random port.

Call Arguments

finetuned_model: Path or name of the fine-tuned model.

base_model: Path or name of the base model, used when finetuned_model is invalid.

Message Format

Input format is a dictionary containing query (query text), documents (list of candidate documents), and top_n (number of documents to return).

  • query: Query text

  • documents: List of candidate documents

  • top_n: Number of documents to return, defaults to 1

Examples:

>>> from lazyllm import deploy
>>> rerank_service = deploy.embed.RerankDeploy()
>>> rerank_service('path/to/model')
>>> input_data = {
...     "query": "What is machine learning?",
...     "documents": [
...         "Machine learning is a branch of AI.",
...         "Machine learning uses data to improve.",
...         "Deep learning is a subset of machine learning."
...     ],
...     "top_n": 2
... }
>>> result = rerank_service(input_data)
Source code in lazyllm/components/deploy/embed.py
class RerankDeploy(EmbeddingDeploy):
    """This class is a subclass of ``EmbeddingDeploy``, designed for deploying reranking services. It supports text reranking using HuggingFace models.

Args:
    launcher (lazyllm.launcher): The launcher instance, defaults to ``None``.
    model_type (str): Model type, defaults to ``'embed'``.
    log_path (str): Path for log file, defaults to ``None``.
    trust_remote_code (bool): Whether to trust remote code, defaults to ``True``.
    port (int): Service port number, defaults to ``None``, in which case LazyLLM will generate a random port.

Call Arguments:
    finetuned_model: Path or name of the fine-tuned model. 

    base_model: Path or name of the base model, used when finetuned_model is invalid.


Message Format:
    Input format is a dictionary containing query (query text), documents (list of candidate documents), and top_n (number of documents to return).\n
    - query: Query text 

    - documents: List of candidate documents 

    - top_n: Number of documents to return, defaults to 1 



Examples:
    >>> from lazyllm import deploy
    >>> rerank_service = deploy.embed.RerankDeploy()
    >>> rerank_service('path/to/model')
    >>> input_data = {
    ...     "query": "What is machine learning?",
    ...     "documents": [
    ...         "Machine learning is a branch of AI.",
    ...         "Machine learning uses data to improve.",
    ...         "Deep learning is a subset of machine learning."
    ...     ],
    ...     "top_n": 2
    ... }
    >>> result = rerank_service(input_data)
    """
    message_format = {'query': 'query', 'documents': ['string'], 'top_n': 1}
    keys_name_handle = {'inputs': 'query', 'documents': 'documents', 'top_n': 'top_n'}
    default_headers = {'Content-Type': 'application/json'}

    def __call__(self, finetuned_model=None, base_model=None):
        finetuned_model = self._get_model_path(finetuned_model, base_model)
        return lazyllm.deploy.RelayServer(port=self._port, func=LazyHuggingFaceRerank(
            finetuned_model), launcher=self._launcher, log_path=self._log_path, cls='embedding')()

lazyllm.components.deploy.embed.LazyHuggingFaceRerank

Bases: object

Wrapper class for HuggingFace CrossEncoder-based reranking.
Ranks candidate documents by relevance score with respect to a given query.
Supports downloading and loading a specified rerank model at initialization, with optional lazy loading for faster startup.

Parameters:

  • base_rerank (str) –

    Name or local path of the rerank model. Supports HuggingFace Hub identifiers or local paths.

  • source (Optional[str], default: None ) –

    Source of the model, supports huggingface and modelscope. Defaults to global config model_source.

  • init (bool, default: False ) –

    Whether to load the model immediately upon instantiation. If False, the model will be loaded lazily on first call.

Source code in lazyllm/components/deploy/embed.py
class LazyHuggingFaceRerank(object):
    """Wrapper class for HuggingFace CrossEncoder-based reranking.  
Ranks candidate documents by relevance score with respect to a given query.  
Supports downloading and loading a specified rerank model at initialization, with optional lazy loading for faster startup.

Args:
    base_rerank (str): Name or local path of the rerank model. Supports HuggingFace Hub identifiers or local paths.
    source (Optional[str]): Source of the model, supports `huggingface` and `modelscope`. Defaults to global config `model_source`.
    init (bool): Whether to load the model immediately upon instantiation. If `False`, the model will be loaded lazily on first call.
"""
    def __init__(self, base_rerank, source=None, init=False):
        from ..utils.downloader import ModelManager
        source = lazyllm.config['model_source'] if not source else source
        self.base_rerank = ModelManager(source).download(base_rerank) or ''
        self.reranker = None
        self.init_flag = lazyllm.once_flag()
        if init:
            lazyllm.call_once(self.init_flag, self.load_reranker)

    def load_reranker(self):
        """Load the rerank model.  

This method initializes a `sentence_transformers.CrossEncoder` instance using `self.base_rerank`  
and assigns it to the class attribute `self.reranker` for subsequent reranking tasks.  
"""
        self.reranker = sentence_transformers.CrossEncoder(self.base_rerank)

    def __call__(self, inps):
        lazyllm.call_once(self.init_flag, self.load_reranker)
        query, documents, top_n = inps['query'], inps['documents'], inps['top_n']
        query_pairs = [(query, doc) for doc in documents]
        scores = self.reranker.predict(query_pairs)
        sorted_indices = [(index, scores[index]) for index in np.argsort(scores)[::-1]]
        if top_n > 0:
            sorted_indices = sorted_indices[:top_n]
        return sorted_indices

    @classmethod
    def rebuild(cls, base_rerank, init):
        """Class method to rebuild a `LazyHuggingFaceRerank` instance.  
Used primarily for deserialization during pickle/cloudpickle operations,  
reinstantiating the object with the provided parameters.

Args:
    base_rerank (str): Model name or path.
    init (bool): Whether to load the model immediately upon rebuilding.

**Returns:**

- LazyHuggingFaceRerank: The rebuilt class instance.
"""
        # Pass init by keyword; the second positional parameter of __init__ is source.
        return cls(base_rerank, init=init)

    def __reduce__(self):
        init = bool(os.getenv('LAZYLLM_ON_CLOUDPICKLE', None) == 'ON' or self.init_flag)
        return LazyHuggingFaceRerank.rebuild, (self.base_rerank, init)

load_reranker()

Load the rerank model.

This method initializes a sentence_transformers.CrossEncoder instance using self.base_rerank
and assigns it to the class attribute self.reranker for subsequent reranking tasks.

Source code in lazyllm/components/deploy/embed.py
    def load_reranker(self):
        """Load the rerank model.  

This method initializes a `sentence_transformers.CrossEncoder` instance using `self.base_rerank`  
and assigns it to the class attribute `self.reranker` for subsequent reranking tasks.  
"""
        self.reranker = sentence_transformers.CrossEncoder(self.base_rerank)

rebuild(base_rerank, init) classmethod

Class method to rebuild a LazyHuggingFaceRerank instance.
Used primarily for deserialization during pickle/cloudpickle operations,
reinstantiating the object with the provided parameters.

Parameters:

  • base_rerank (str) –

    Model name or path.

  • init (bool) –

    Whether to load the model immediately upon rebuilding.

Returns:

  • LazyHuggingFaceRerank: The rebuilt class instance.
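The rebuild-on-unpickle pattern can be sketched with the standard pickle module. The class below is hypothetical and not part of LazyLLM; it only assumes that `__reduce__` returns a callable plus its arguments, so the loaded model itself is never serialized and is loaded again on the receiving side when needed.

```python
import pickle


class LazyResource:
    """Hypothetical class that mimics the rebuild pattern."""

    def __init__(self, name, source=None, init=False):
        self.name = name
        self.source = source
        self.loaded = False
        if init:
            self.load()

    def load(self):
        # Placeholder for an expensive model load.
        self.loaded = True

    @classmethod
    def rebuild(cls, name, init):
        # Pass init by keyword so it cannot be mistaken for source.
        return cls(name, init=init)

    def __reduce__(self):
        # Pickle stores the callable and its arguments, not the loaded model.
        return LazyResource.rebuild, (self.name, self.loaded)


original = LazyResource("my-reranker", init=True)
restored = pickle.loads(pickle.dumps(original))
```

Here `restored` is a new object that was constructed through `rebuild`, with the load repeated because the original had already been loaded.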
Source code in lazyllm/components/deploy/embed.py
    @classmethod
    def rebuild(cls, base_rerank, init):
        """Class method to rebuild a `LazyHuggingFaceRerank` instance.  
Used primarily for deserialization during pickle/cloudpickle operations,  
reinstantiating the object with the provided parameters.

Args:
    base_rerank (str): Model name or path.
    init (bool): Whether to load the model immediately upon rebuilding.

**Returns:**

- LazyHuggingFaceRerank: The rebuilt class instance.
"""
        # Pass init by keyword; the second positional parameter of __init__ is source.
        return cls(base_rerank, init=init)

lazyllm.components.deploy.embed.HuggingFaceEmbedding

HuggingFace embedding model management class for managing and registering different embedding model implementations.

Attributes:

  • _model_id_mapping (dict) –

    Mapping dictionary from model IDs to implementation classes.

Parameters:

  • base_embed (str) –

    Path or name of the base embedding model.

  • source (Optional[str], default: None ) –

    Model source, defaults to None.

Source code in lazyllm/components/deploy/embed.py
class HuggingFaceEmbedding:
    """HuggingFace embedding model management class for managing and registering different embedding model implementations.

Attributes:
    _model_id_mapping (dict): Mapping dictionary from model IDs to implementation classes.

Args:
    base_embed (str): Path or name of the base embedding model.
    source (Optional[str]): Model source, defaults to None.
"""
    _model_id_mapping = {}

    @classmethod
    def get_emb_cls(cls, model_name: str):
        """Get the embedding implementation class for a model.

Args:
    model_name (str): Model name or path.

**Returns:**

- type: Returns corresponding embedding model implementation class, defaults to LazyHuggingFaceDefaultEmbedding if not found.
"""
        model_id = model_name.split('/')[-1].lower()
        return cls._model_id_mapping.get(model_id, LazyHuggingFaceDefaultEmbedding)

    @classmethod
    def register(cls, model_ids: List[str]):
        """Decorator for registering model IDs to specific implementation classes.

Args:
    model_ids (List[str]): List of model IDs to register.

**Returns:**

- Callable: Returns decorator function.
"""
        def decorator(target_class):
            for ele in model_ids:
                cls._model_id_mapping[ele.lower()] = target_class
            return target_class
        return decorator

    def __init__(self, base_embed, source=None):
        self._embed = self.__class__.get_emb_cls(base_embed)(base_embed, source)

    def load_embed(self):
        """Load the embedding model.

This method calls the load_embed method of the internal embedding implementation class to load the model.
"""
        self._embed.load_embed()

    def __call__(self, *args, **kwargs):
        try:
            args[0]['images'] = [_base64_to_file(image) if _is_base64_with_mime(image) else image
                                 for image in args[0]['images']]
        except Exception as e:
            LOG.error(f'Error converting base64 to image: {e}')
        return self._embed(*args, **kwargs)

get_emb_cls(model_name) classmethod

Get the embedding implementation class for a model.

Parameters:

  • model_name (str) –

    Model name or path.

Returns:

  • type: Returns corresponding embedding model implementation class, defaults to LazyHuggingFaceDefaultEmbedding if not found.

Source code in lazyllm/components/deploy/embed.py
    @classmethod
    def get_emb_cls(cls, model_name: str):
        """Get the embedding implementation class for a model.

Args:
    model_name (str): Model name or path.

**Returns:**

- type: Returns corresponding embedding model implementation class, defaults to LazyHuggingFaceDefaultEmbedding if not found.
"""
        model_id = model_name.split('/')[-1].lower()
        return cls._model_id_mapping.get(model_id, LazyHuggingFaceDefaultEmbedding)

register(model_ids) classmethod

Decorator for registering model IDs to specific implementation classes.

Parameters:

  • model_ids (List[str]) –

    List of model IDs to register.

Returns:

  • Callable: Returns decorator function.
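The interplay between register and get_emb_cls can be sketched as a small standalone registry. All class names below are hypothetical; the sketch only assumes the behavior visible in the source listing: keys are stored lower-cased, and lookups use the last path component of the model name.

```python
from typing import Callable, Dict, List, Type


class DefaultEmbedding:
    """Hypothetical fallback implementation."""


class EmbeddingRegistry:
    """Hypothetical registry mirroring the register/get_emb_cls pair."""

    _model_id_mapping: Dict[str, Type] = {}

    @classmethod
    def register(cls, model_ids: List[str]) -> Callable[[Type], Type]:
        def decorator(target_class: Type) -> Type:
            for model_id in model_ids:
                # Keys are stored lower-cased so lookups are case-insensitive.
                cls._model_id_mapping[model_id.lower()] = target_class
            return target_class
        return decorator

    @classmethod
    def get_emb_cls(cls, model_name: str) -> Type:
        # Only the last path component is used as the lookup key.
        model_id = model_name.split('/')[-1].lower()
        return cls._model_id_mapping.get(model_id, DefaultEmbedding)


@EmbeddingRegistry.register(['My-Embed-Model'])
class MyEmbedding:
    """Hypothetical specialised implementation."""


found = EmbeddingRegistry.get_emb_cls('/models/org/my-embed-model')
fallback = EmbeddingRegistry.get_emb_cls('org/unknown-model')
```

Because the decorator returns the class unchanged, a registered class can still be imported and used directly.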
Source code in lazyllm/components/deploy/embed.py
    @classmethod
    def register(cls, model_ids: List[str]):
        """Decorator for registering model IDs to specific implementation classes.

Args:
    model_ids (List[str]): List of model IDs to register.

**Returns:**

- Callable: Returns decorator function.
"""
        def decorator(target_class):
            for ele in model_ids:
                cls._model_id_mapping[ele.lower()] = target_class
            return target_class
        return decorator

load_embed()

Load the embedding model.

This method calls the load_embed method of the internal embedding implementation class to load the model.

Source code in lazyllm/components/deploy/embed.py
    def load_embed(self):
        """Load the embedding model.

This method calls the load_embed method of the internal embedding implementation class to load the model.
"""
        self._embed.load_embed()

lazyllm.components.deploy.embed.LazyFlagEmbedding

Bases: object

A lazily loaded wrapper for the FlagEmbedding module.

This class encapsulates loading and usage of FlagEmbedding, with support for both sparse and dense embeddings. It leverages the lazyllm.once_flag() mechanism to initialize only once on demand, and integrates with LazyLLM's model downloading utilities.

Parameters:

  • base_embed (str) –

    The model name or path to be used as the embedding backend.

  • sparse (bool, default: False ) –

    Whether to enable sparse embedding output. Defaults to False.

  • source (str, default: None ) –

    Source URL or identifier for model downloading. Defaults to global config.

  • init (bool, default: False ) –

    Whether to initialize the model immediately upon construction. Defaults to False.
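The load-on-first-call behavior controlled by init can be sketched in plain Python. This is a minimal illustration, not LazyLLM's implementation: `OnceFlag`, `call_once` and `LazyModel` are hypothetical stand-ins for `lazyllm.once_flag()`, `lazyllm.call_once` and the embedding wrapper.

```python
import threading


class OnceFlag:
    """Hypothetical once-flag: records whether the guarded call already ran."""

    def __init__(self):
        self._lock = threading.Lock()
        self._done = False

    def __bool__(self):
        return self._done


def call_once(flag: OnceFlag, func, *args, **kwargs):
    """Run func at most once per flag, even with concurrent callers."""
    with flag._lock:
        if not flag._done:
            func(*args, **kwargs)
            flag._done = True


class LazyModel:
    def __init__(self, init: bool = False):
        self.model = None
        self.load_count = 0
        self.init_flag = OnceFlag()
        if init:
            call_once(self.init_flag, self.load)

    def load(self):
        # Placeholder for an expensive model load.
        self.load_count += 1
        self.model = object()

    def __call__(self, text: str) -> int:
        call_once(self.init_flag, self.load)  # loads on first use only
        return len(text)


lazy = LazyModel()
before = lazy.model
lazy("hello")
lazy("world")
```

With init=False the constructor returns immediately and the first call pays the loading cost; with init=True the cost moves to construction.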

Source code in lazyllm/components/deploy/embed.py
class LazyFlagEmbedding(object):
    """A lazily loaded wrapper for the FlagEmbedding module.

This class encapsulates loading and usage of FlagEmbedding, with support for both sparse and dense embeddings. It leverages the lazyllm.once_flag() mechanism to initialize only once on demand, and integrates with LazyLLM's model downloading utilities.

Args:
    base_embed (str): The model name or path to be used as the embedding backend.
    sparse (bool): Whether to enable sparse embedding output. Defaults to False.
    source (str, optional): Source URL or identifier for model downloading. Defaults to global config.
    init (bool): Whether to initialize the model immediately upon construction. Defaults to False.
"""
    def __init__(self, base_embed, sparse=False, source=None, init=False):
        from ..utils.downloader import ModelManager
        source = lazyllm.config['model_source'] if not source else source
        self.base_embed = ModelManager(source).download(base_embed) or ''
        self.embed = None
        self.device = 'cpu'
        self.sparse = sparse
        self.init_flag = lazyllm.once_flag()
        if init:
            lazyllm.call_once(self.init_flag, self.load_embed)

    def load_embed(self):
        """Load the embedding model onto the appropriate device.

This method selects the available device (GPU or CPU) and initializes the pretrained FlagEmbedding model from the provided path or model hub.
"""
        self.device = 'cuda' if torch.cuda.is_available() else 'cpu'
        self.embed = fe.FlagAutoModel.from_finetuned(self.base_embed, use_fp16=False, devices=[self.device])

    def __call__(self, data: Dict[str, Union[str, List[str]]]):
        lazyllm.call_once(self.init_flag, self.load_embed)
        string, _ = data['text'], data['images']
        with torch.no_grad():
            model_output = self.embed.encode(string, return_sparse=self.sparse)
        if self.sparse:
            embeddings = model_output['lexical_weights']
            if isinstance(string, list):
                res = [dict(embedding) for embedding in embeddings]
            else:
                res = dict(embeddings)
        else:
            res = model_output['dense_vecs'].tolist()

        if type(string) is list and type(res) is dict:
            return json.dumps([res], default=lambda x: float(x))
        else:
            return json.dumps(res, default=lambda x: float(x))

    @classmethod
    def rebuild(cls, base_embed, sparse, init):
        """Rebuild a LazyFlagEmbedding instance.

This class method reconstructs an instance of LazyFlagEmbedding, typically used during deserialization or multiprocessing scenarios.

Args:
    base_embed (str): The path or name of the embedding model.
    sparse (bool): Whether to enable sparse embedding mode.
    init (bool): Whether to load the model immediately during instantiation.

**Returns:**

- LazyFlagEmbedding: A newly constructed LazyFlagEmbedding instance.
"""
        return cls(base_embed, sparse, init=init)

    def __reduce__(self):
        init = bool(os.getenv('LAZYLLM_ON_CLOUDPICKLE', None) == 'ON' or self.init_flag)
        return LazyFlagEmbedding.rebuild, (self.base_embed, self.sparse, init)

load_embed()

Load the embedding model onto the appropriate device.

This method selects the available device (GPU or CPU) and initializes the pretrained FlagEmbedding model from the provided path or model hub.

Source code in lazyllm/components/deploy/embed.py
    def load_embed(self):
        """Load the embedding model onto the appropriate device.

This method selects the available device (GPU or CPU) and initializes the pretrained FlagEmbedding model from the provided path or model hub.
"""
        self.device = 'cuda' if torch.cuda.is_available() else 'cpu'
        self.embed = fe.FlagAutoModel.from_finetuned(self.base_embed, use_fp16=False, devices=[self.device])

rebuild(base_embed, sparse, init) classmethod

Rebuild a LazyFlagEmbedding instance.

This class method reconstructs an instance of LazyFlagEmbedding, typically used during deserialization or multiprocessing scenarios.

Parameters:

  • base_embed (str) –

    The path or name of the embedding model.

  • sparse (bool) –

    Whether to enable sparse embedding mode.

  • init (bool) –

    Whether to load the model immediately during instantiation.

Returns:

  • LazyFlagEmbedding: A newly constructed LazyFlagEmbedding instance.

Source code in lazyllm/components/deploy/embed.py
    @classmethod
    def rebuild(cls, base_embed, sparse, init):
        """Rebuild a LazyFlagEmbedding instance.

This class method reconstructs an instance of LazyFlagEmbedding, typically used during deserialization or multiprocessing scenarios.

Args:
    base_embed (str): The path or name of the embedding model.
    sparse (bool): Whether to enable sparse embedding mode.
    init (bool): Whether to load the model immediately during instantiation.

**Returns:**

- LazyFlagEmbedding: A newly constructed LazyFlagEmbedding instance.
"""
        return cls(base_embed, sparse, init=init)

lazyllm.components.deploy.Mindie

Bases: LazyLLMDeployBase

This class is a subclass of LazyLLMDeployBase, designed for deploying and managing the MindIE large language model inference service. It encapsulates the full workflow including configuration generation, process launching, and API interaction for the MindIE service.

Parameters:

  • trust_remote_code (bool, default: True ) –

    Whether to trust remote code (e.g., from HuggingFace models). Default is True.

  • launcher

    Instance of the task launcher. Default is launchers.remote().

  • log_path (str, default: None ) –

    Path to save logs. If None, logs will not be saved.

  • **kw

    Other configuration parameters.

Other Parameters:

  • npuDeviceIds

    List of NPU device IDs (e.g., [[0,1]] indicates using 2 devices). The constructor derives this list from worldSize, so a value passed here is overwritten.

  • worldSize

    Model parallelism size

  • port

    Service port (set to 'auto' for auto-assignment between 30000–40000)

  • maxSeqLen

    Maximum sequence length

  • maxInputTokenLen

    Maximum number of tokens per input

  • maxPrefillTokens

    Maximum number of prefill tokens

  • config

    Custom configuration file

Notes

You must set the environment variable LAZYLLM_MINDIE_HOME to point to the MindIE installation directory. If finetuned_model is not specified or the path is invalid, it will automatically fall back to base_model.

Examples:

>>> import lazyllm
>>> from lazyllm.components.deploy import Mindie            
>>> deployer = Mindie(
...     port=30000,
...     launcher=lazyllm.launchers.remote(),
...     max_seq_len=32000,
...     log_path="/path/to/logs"
... )
>>> cmd = deployer.cmd(
...     finetuned_model="/path/to/finetuned_model",
...     base_model="/path/to/base_model")
>>> print("Service URL:", cmd.geturl())

Source code in lazyllm/components/deploy/mindie.py
class Mindie(LazyLLMDeployBase):
    """This class is a subclass of ``LazyLLMDeployBase``, designed for deploying and managing the MindIE large language model inference service. It encapsulates the full workflow including configuration generation, process launching, and API interaction for the MindIE service.

Args:
    trust_remote_code (bool): Whether to trust remote code (e.g., from HuggingFace models). Default is ``True``.
    launcher: Instance of the task launcher. Default is ``launchers.remote()``.
    log_path (str): Path to save logs. If ``None``, logs will not be saved.
    **kw: Other configuration parameters.

Keyword Args: 
            npuDeviceIds: List of NPU device IDs (e.g., ``[[0,1]]`` indicates using 2 devices)
            worldSize: Model parallelism size
            port: Service port (set to ``'auto'`` for auto-assignment between 30000–40000)
            maxSeqLen: Maximum sequence length
            maxInputTokenLen: Maximum number of tokens per input
            maxPrefillTokens: Maximum number of prefill tokens
            config: Custom configuration file

Notes:
    You must set the environment variable ``LAZYLLM_MINDIE_HOME`` to point to the MindIE installation directory. 
    If ``finetuned_model`` is not specified or the path is invalid, it will automatically fall back to ``base_model``.


Examples:
    >>> import lazyllm
    >>> from lazyllm.components.deploy import Mindie            
    >>> deployer = Mindie(
    ...     port=30000,
    ...     launcher=lazyllm.launchers.remote(),
    ...     max_seq_len=32000,
    ...     log_path="/path/to/logs"
    ... )
    >>> cmd = deployer.cmd(
    ...     finetuned_model="/path/to/finetuned_model",
    ...     base_model="/path/to/base_model")
    >>> print("Service URL:", cmd.geturl())

    """
    keys_name_handle = {
        'inputs': 'prompt',
    }
    default_headers = {'Content-Type': 'application/json'}
    message_format = {
        'prompt': 'Who are you ?',
        'stream': False,
        'max_tokens': 4096,
        'presence_penalty': 1.03,
        'frequency_penalty': 1.0,
        'temperature': 0.5,
        'top_p': 0.95
    }
    auto_map = {
        'port': int,
        'tp': ('worldSize', int),
        'max_input_token_len': ('maxInputTokenLen', int),
        'max_prefill_tokens': ('maxPrefillTokens', int),
        'max_seq_len': ('maxSeqLen', int)
    }

    def __init__(self, trust_remote_code=True, launcher=launchers.remote(), log_path=None, **kw):  # noqa B008
        super().__init__(launcher=launcher)
        assert lazyllm.config['mindie_home'], 'Ensure you have installed MindIE and \
                                  "export LAZYLLM_MINDIE_HOME=/path/to/mindie/latest"'
        self.mindie_home = lazyllm.config['mindie_home']
        self.mindie_config_path = os.path.join(self.mindie_home, 'mindie-service/conf/config.json')
        self.backup_path = self.mindie_config_path + '.backup'
        self.custom_config = kw.pop('config', None)
        self.kw = ArgsDict({
            'npuDeviceIds': [[0]],
            'worldSize': 1,
            'port': 'auto',
            'host': '0.0.0.0',
            'maxSeqLen': 64000,
            'maxInputTokenLen': 4096,
            'maxPrefillTokens': 8192,
        })
        self.trust_remote_code = trust_remote_code
        self.options_keys = kw.pop('options_keys', [])
        assert len(self.options_keys) == 0, 'options_keys is not supported'
        kw.update({'skip_check': False})
        self.kw.check_and_update(kw)
        self.kw['npuDeviceIds'] = [[i for i in range(self.kw.get('worldSize', 1))]]
        self.random_port = False if 'port' in kw and kw['port'] and kw['port'] != 'auto' else True
        self.temp_folder = make_log_dir(log_path, 'mindie') if log_path else None

        if self.custom_config:
            self.config_dict = (ArgsDict(self.load_config(self.custom_config))
                                if isinstance(self.custom_config, str) else ArgsDict(self.custom_config))
            self.kw['host'] = self.config_dict['ServerConfig']['ipAddress']
            self.kw['port'] = self.config_dict['ServerConfig']['port']
        else:
            default_config_path = os.path.join(os.path.dirname(os.path.abspath(__file__)), 'mindie', 'config.json')
            self.config_dict = ArgsDict(self.load_config(default_config_path))

    def __del__(self):
        if hasattr(self, 'backup_path') and os.path.isfile(self.backup_path):
            shutil.copy2(self.backup_path, self.mindie_config_path)

    def load_config(self, config_path):
        """Loads and parses the MindIE configuration file.

Args:
    config_path (str): Path to the JSON configuration file

**Returns:**

- dict: Parsed configuration dictionary

Notes:
    - Used for both the default and custom configuration files
    - Expects the file to contain JSON
    - Only reads the file; the backup of the installed config is made by ``save_config``
"""
        with open(config_path, 'r') as file:
            config_dict = json.load(file)
        return config_dict

    def save_config(self):
        """Saves the current configuration to file.

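Notes:
    - Copies an existing config file to a ``.backup`` file next to it before overwriting
    - Writes to ``mindie-service/conf/config.json`` under the MindIE home directory
    - Writes compact JSON; ``json.dump`` is called without an ``indent`` argument
    - Called by ``cmd()`` during deployment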
"""
        if os.path.isfile(self.mindie_config_path):
            shutil.copy2(self.mindie_config_path, self.backup_path)

        with open(self.mindie_config_path, 'w') as file:
            json.dump(self.config_dict, file)

    def update_config(self):
        """Updates the configuration dictionary with current settings.

Notes:
    - Handles multiple configuration sections:
        - Model deployment parameters
        - Server settings
        - Scheduling parameters
"""
        backend_config = self.config_dict['BackendConfig']
        backend_config['npuDeviceIds'] = self.kw['npuDeviceIds']
        model_config = {
            'modelName': self.finetuned_model.split('/')[-1],
            'modelWeightPath': self.finetuned_model,
            'worldSize': self.kw['worldSize'],
            'trust_remote_code': self.trust_remote_code
        }
        backend_config['ModelDeployConfig']['ModelConfig'][0].update(model_config)
        backend_config['ModelDeployConfig']['maxSeqLen'] = self.kw['maxSeqLen']
        backend_config['ModelDeployConfig']['maxInputTokenLen'] = self.kw['maxInputTokenLen']
        backend_config['ScheduleConfig']['maxPrefillTokens'] = self.kw['maxPrefillTokens']
        self.config_dict['BackendConfig'] = backend_config
        if self.kw['host'] != '0.0.0.0':
            self.config_dict['ServerConfig']['ipAddress'] = self.kw['host']
        self.config_dict['ServerConfig']['port'] = self.kw['port']

    def cmd(self, finetuned_model=None, base_model=None, master_ip=None):
        """Generates the command to start the MindIE service.

Args:
    finetuned_model (str): Path to the fine-tuned model
    base_model (str): Path to the base model (fallback if finetuned_model is invalid)
    master_ip (str): Master node IP address (currently unused)

**Returns:**

- LazyLLMCMD: Command object for starting the service

Notes:
    - Automatically handles model path validation
    - Updates configuration before service start
    - Supports random port allocation when configured
"""
        if self.custom_config is None:
            self.finetuned_model = finetuned_model
            if finetuned_model or base_model:
                # Guard first: os.path.exists(None) raises TypeError.
                if not finetuned_model or not os.path.exists(finetuned_model) or \
                    not any(filename.endswith('.bin') or filename.endswith('.safetensors')
                            for filename in os.listdir(finetuned_model)):
                    if not finetuned_model:
                        LOG.warning(f'Note! That finetuned_model({finetuned_model}) is an invalid path, '
                                    f'base_model({base_model}) will be used')
                    self.finetuned_model = base_model

            if self.random_port:
                self.kw['port'] = random.randint(30000, 40000)

            self.update_config()

        self.save_config()

        def impl():
            cmd = f'{os.path.join(self.mindie_home, "mindie-service/bin/mindieservice_daemon")}'
            if self.temp_folder: cmd += f' 2>&1 | tee {get_log_path(self.temp_folder)}'
            return cmd

        return LazyLLMCMD(cmd=impl, return_value=self.geturl, checkf=verify_fastapi_func)

    def geturl(self, job=None):
        """Gets the service URL after deployment.

Args:
    job: Job object (optional, defaults to self.job)

**Returns:**

- str: The generate endpoint URL

Notes:
    - Returns the same URL in every mode; outside Display mode it also logs the server address
    - Builds the URL from the job IP and the configured port
"""
        if job is None:
            job = self.job
        if lazyllm.config['mode'] == lazyllm.Mode.Display:
            return f'http://{job.get_jobip()}:{self.kw["port"]}/generate'
        else:
            LOG.info(f'MindIE Server running on http://{job.get_jobip()}:{self.kw["port"]}')
            return f'http://{job.get_jobip()}:{self.kw["port"]}/generate'

    @staticmethod
    def extract_result(x, inputs):
        """Extracts the generated text from the API response.

Args:
    x: Raw API response
    inputs: Original inputs (unused)

**Returns:**

- str: The generated text

Notes:
    - Parses JSON response
    - Returns first text entry from response
"""
        return json.loads(x)['text'][0]

cmd(finetuned_model=None, base_model=None, master_ip=None)

Generates the command to start the MindIE service.

Parameters:

  • finetuned_model (str, default: None ) –

    Path to the fine-tuned model

  • base_model (str, default: None ) –

    Path to the base model (fallback if finetuned_model is invalid)

  • master_ip (str, default: None ) –

    Master node IP address (currently unused)

Returns:

  • LazyLLMCMD: Command object for starting the service

Notes

  • Automatically handles model path validation
  • Updates configuration before service start
  • Supports random port allocation when configured
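The path validation and port selection listed above can be sketched as two small helpers. This is a simplified illustration under stated assumptions, not the method itself: `has_weights`, `choose_model_path` and `resolve_port` are hypothetical names, and the 30000 to 40000 range and the .bin/.safetensors suffixes are taken from the source listing.

```python
import os
import random
import tempfile
from typing import Optional, Union

WEIGHT_SUFFIXES = ('.bin', '.safetensors')


def has_weights(path: Optional[str]) -> bool:
    """True if path is a directory containing at least one weight file."""
    if not path or not os.path.isdir(path):
        return False
    return any(name.endswith(WEIGHT_SUFFIXES) for name in os.listdir(path))


def choose_model_path(finetuned_model: Optional[str], base_model: Optional[str]) -> Optional[str]:
    """Prefer the fine-tuned model; fall back to the base model otherwise."""
    return finetuned_model if has_weights(finetuned_model) else base_model


def resolve_port(port: Union[int, str, None], low: int = 30000, high: int = 40000) -> int:
    """Pick a random port in [low, high] when port is unset or 'auto'."""
    if not port or port == 'auto':
        return random.randint(low, high)
    return int(port)


with tempfile.TemporaryDirectory() as tmp:
    # An empty directory has no weights, so the base model is chosen.
    empty_choice = choose_model_path(tmp, '/path/to/base_model')
    open(os.path.join(tmp, 'model.safetensors'), 'w').close()
    filled_choice_is_finetuned = choose_model_path(tmp, '/path/to/base_model') == tmp

missing_choice = choose_model_path(None, '/path/to/base_model')
auto_port = resolve_port('auto')
fixed_port = resolve_port(30000)
```

A randomly chosen port is not checked for availability in this sketch, so a collision with another service remains possible.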
Source code in lazyllm/components/deploy/mindie.py
    def cmd(self, finetuned_model=None, base_model=None, master_ip=None):
        """Generates the command to start the MindIE service.

Args:
    finetuned_model (str): Path to the fine-tuned model
    base_model (str): Path to the base model (fallback if finetuned_model is invalid)
    master_ip (str): Master node IP address (currently unused)

**Returns:**

- LazyLLMCMD: Command object for starting the service

Notes:
    - Automatically handles model path validation
    - Updates configuration before service start
    - Supports random port allocation when configured
"""
        if self.custom_config is None:
            self.finetuned_model = finetuned_model
            if finetuned_model or base_model:
                # Guard first: os.path.exists(None) raises TypeError.
                if not finetuned_model or not os.path.exists(finetuned_model) or \
                    not any(filename.endswith('.bin') or filename.endswith('.safetensors')
                            for filename in os.listdir(finetuned_model)):
                    if not finetuned_model:
                        LOG.warning(f'Note! That finetuned_model({finetuned_model}) is an invalid path, '
                                    f'base_model({base_model}) will be used')
                    self.finetuned_model = base_model

            if self.random_port:
                self.kw['port'] = random.randint(30000, 40000)

            self.update_config()

        self.save_config()

        def impl():
            cmd = f'{os.path.join(self.mindie_home, "mindie-service/bin/mindieservice_daemon")}'
            if self.temp_folder: cmd += f' 2>&1 | tee {get_log_path(self.temp_folder)}'
            return cmd

        return LazyLLMCMD(cmd=impl, return_value=self.geturl, checkf=verify_fastapi_func)

extract_result(x, inputs) staticmethod

Extracts the generated text from the API response.

Parameters:

  • x

    Raw API response

  • inputs

    Original inputs (unused)

Returns:

  • str: The generated text

Notes

  • Parses JSON response
  • Returns first text entry from response
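A short sketch of the parsing step follows. The response body is a made-up sample; only the presence of a `text` list is taken from the source, and any other fields a real response may carry are ignored here.

```python
import json


def extract_first_text(raw_response: str) -> str:
    """Return the first entry of the 'text' list in a JSON response body."""
    return json.loads(raw_response)['text'][0]


# Hypothetical response body; only the 'text' key is taken from the source.
sample_response = json.dumps({'text': ['Hello from the model.']})
first_text = extract_first_text(sample_response)
```

A body without a `text` key, or with an empty list, raises KeyError or IndexError respectively, so callers that cannot rely on the response shape may want to catch both.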
Source code in lazyllm/components/deploy/mindie.py
    @staticmethod
    def extract_result(x, inputs):
        """Extracts the generated text from the API response.

Args:
    x: Raw API response
    inputs: Original inputs (unused)

**Returns:**

- str: The generated text

Notes:
    - Parses JSON response
    - Returns first text entry from response
"""
        return json.loads(x)['text'][0]

geturl(job=None)

Gets the service URL after deployment.

Parameters:

  • job

    Job object (optional, defaults to self.job)

Returns:

  • str: The generate endpoint URL

Notes

  • Returns the same URL in every mode; outside Display mode it also logs the server address
  • Builds the URL from the job IP and the configured port

Source code in lazyllm/components/deploy/mindie.py
    def geturl(self, job=None):
        """Gets the service URL after deployment.

Args:
    job: Job object (optional, defaults to self.job)

**Returns:**

- str: The generate endpoint URL

Notes:
    - Returns the same URL in every mode; outside Display mode it also logs the server address
    - Builds the URL from the job IP and the configured port
"""
        if job is None:
            job = self.job
        if lazyllm.config['mode'] == lazyllm.Mode.Display:
            return f'http://{job.get_jobip()}:{self.kw["port"]}/generate'
        else:
            LOG.info(f'MindIE Server running on http://{job.get_jobip()}:{self.kw["port"]}')
            return f'http://{job.get_jobip()}:{self.kw["port"]}/generate'

load_config(config_path)

Loads and parses the MindIE configuration file.

Parameters:

  • config_path (str) –

    Path to the JSON configuration file

Returns:

  • dict: Parsed configuration dictionary

Notes

  • Used for both the default and custom configuration files
  • Expects the file to contain JSON
  • Only reads the file; the backup of the installed config is made by save_config

Source code in lazyllm/components/deploy/mindie.py
    def load_config(self, config_path):
        """Loads and parses the MindIE configuration file.

Args:
    config_path (str): Path to the JSON configuration file

**Returns:**

- dict: Parsed configuration dictionary

Notes:
    - Used for both the default and custom configuration files
    - Expects the file to contain JSON
    - Only reads the file; the backup of the installed config is made by ``save_config``
"""
        with open(config_path, 'r') as file:
            config_dict = json.load(file)
        return config_dict

save_config()

Saves the current configuration to file.

Notes

  • Copies an existing config file to a .backup file next to it before overwriting
  • Writes to mindie-service/conf/config.json under the MindIE home directory
  • Writes compact JSON; json.dump is called without an indent argument
  • Called by cmd() during deployment
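The backup-then-overwrite step, together with the restore that the class performs when the object is deleted, can be sketched on a temporary file. The helper names `save_with_backup` and `restore_backup` are hypothetical, and the config contents are made up for the illustration.

```python
import json
import os
import shutil
import tempfile


def save_with_backup(config: dict, config_path: str) -> str:
    """Copy an existing config to '<path>.backup', then overwrite it."""
    backup_path = config_path + '.backup'
    if os.path.isfile(config_path):
        shutil.copy2(config_path, backup_path)
    with open(config_path, 'w') as file:
        json.dump(config, file)
    return backup_path


def restore_backup(config_path: str) -> None:
    """Put the backed-up file back in place, if one exists."""
    backup_path = config_path + '.backup'
    if os.path.isfile(backup_path):
        shutil.copy2(backup_path, config_path)


with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, 'config.json')
    with open(path, 'w') as file:
        json.dump({'port': 1025}, file)  # pretend this is the installed config

    save_with_backup({'port': 30000}, path)
    with open(path) as file:
        written_config = json.load(file)

    restore_backup(path)
    with open(path) as file:
        restored_config = json.load(file)
```

Each save overwrites the previous backup, so only the most recent prior state of the file can be recovered.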
Source code in lazyllm/components/deploy/mindie.py
    def save_config(self):
        """Saves the current configuration to file.

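Notes:
    - Copies an existing config file to a ``.backup`` file next to it before overwriting
    - Writes to ``mindie-service/conf/config.json`` under the MindIE home directory
    - Writes compact JSON; ``json.dump`` is called without an ``indent`` argument
    - Called by ``cmd()`` during deployment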
"""
        if os.path.isfile(self.mindie_config_path):
            shutil.copy2(self.mindie_config_path, self.backup_path)

        with open(self.mindie_config_path, 'w') as file:
            json.dump(self.config_dict, file)

update_config()

Updates the configuration dictionary with current settings.

Notes
  • Handles multiple configuration sections:
    • Model deployment parameters
    • Server settings
    • Scheduling parameters
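The three sections listed above can be illustrated on a skeleton config. This is a sketch under stated assumptions: `apply_settings` is a hypothetical helper, the skeleton contains only the keys touched in the source listing, and the values are placeholders. Unlike the method, which mutates the instance's config in place, the helper returns an updated copy.

```python
import copy


def apply_settings(config: dict, model_path: str, settings: dict,
                   trust_remote_code: bool = True) -> dict:
    """Return a copy of config with model, scheduling and server fields updated."""
    updated = copy.deepcopy(config)
    backend = updated['BackendConfig']
    backend['npuDeviceIds'] = settings['npuDeviceIds']
    deploy = backend['ModelDeployConfig']
    deploy['ModelConfig'][0].update({
        'modelName': model_path.split('/')[-1],
        'modelWeightPath': model_path,
        'worldSize': settings['worldSize'],
        'trust_remote_code': trust_remote_code,
    })
    deploy['maxSeqLen'] = settings['maxSeqLen']
    deploy['maxInputTokenLen'] = settings['maxInputTokenLen']
    backend['ScheduleConfig']['maxPrefillTokens'] = settings['maxPrefillTokens']
    # The bind address is only overwritten when a specific host is requested.
    if settings['host'] != '0.0.0.0':
        updated['ServerConfig']['ipAddress'] = settings['host']
    updated['ServerConfig']['port'] = settings['port']
    return updated


# Skeleton config: only the keys touched above; a real file has many more.
skeleton = {
    'ServerConfig': {'ipAddress': '127.0.0.1', 'port': 1025},
    'BackendConfig': {
        'npuDeviceIds': [[0]],
        'ModelDeployConfig': {'maxSeqLen': 2560, 'maxInputTokenLen': 2048, 'ModelConfig': [{}]},
        'ScheduleConfig': {'maxPrefillTokens': 8192},
    },
}
settings = {'npuDeviceIds': [[0, 1]], 'worldSize': 2, 'port': 30000, 'host': '0.0.0.0',
            'maxSeqLen': 64000, 'maxInputTokenLen': 4096, 'maxPrefillTokens': 8192}
result = apply_settings(skeleton, '/path/to/finetuned_model', settings)
```

With the default host of 0.0.0.0 the ipAddress already present in the config file is kept, which matches the conditional in the source listing.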
Source code in lazyllm/components/deploy/mindie.py
    def update_config(self):
        """Updates the configuration dictionary with current settings.

Notes:
    - Handles multiple configuration sections:
        - Model deployment parameters
        - Server settings
        - Scheduling parameters
"""
        backend_config = self.config_dict['BackendConfig']
        backend_config['npuDeviceIds'] = self.kw['npuDeviceIds']
        model_config = {
            'modelName': self.finetuned_model.split('/')[-1],
            'modelWeightPath': self.finetuned_model,
            'worldSize': self.kw['worldSize'],
            'trust_remote_code': self.trust_remote_code
        }
        backend_config['ModelDeployConfig']['ModelConfig'][0].update(model_config)
        backend_config['ModelDeployConfig']['maxSeqLen'] = self.kw['maxSeqLen']
        backend_config['ModelDeployConfig']['maxInputTokenLen'] = self.kw['maxInputTokenLen']
        backend_config['ScheduleConfig']['maxPrefillTokens'] = self.kw['maxPrefillTokens']
        self.config_dict['BackendConfig'] = backend_config
        if self.kw['host'] != '0.0.0.0':
            self.config_dict['ServerConfig']['ipAddress'] = self.kw['host']
        self.config_dict['ServerConfig']['port'] = self.kw['port']
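
To make the section-by-section update easier to follow, here is a minimal standalone sketch of the same pattern. The template dictionary and its values are hypothetical and far smaller than a real MindIE config, and `apply_settings` is an illustrative name, not part of LazyLLM.

```python
import copy

# Hypothetical, heavily reduced template; values are placeholders.
template = {
    'ServerConfig': {'ipAddress': '127.0.0.1', 'port': 1025},
    'BackendConfig': {
        'npuDeviceIds': [[0]],
        'ModelDeployConfig': {
            'maxSeqLen': 2560,
            'maxInputTokenLen': 2048,
            'ModelConfig': [{'modelName': '', 'modelWeightPath': '', 'worldSize': 1}],
        },
        'ScheduleConfig': {'maxPrefillTokens': 8192},
    },
}


def apply_settings(config, kw, model_path):
    # Work on a copy so the template itself is left untouched.
    config = copy.deepcopy(config)
    backend = config['BackendConfig']
    backend['npuDeviceIds'] = kw['npuDeviceIds']
    backend['ModelDeployConfig']['ModelConfig'][0].update({
        'modelName': model_path.split('/')[-1],
        'modelWeightPath': model_path,
        'worldSize': kw['worldSize'],
    })
    backend['ModelDeployConfig']['maxSeqLen'] = kw['maxSeqLen']
    backend['ModelDeployConfig']['maxInputTokenLen'] = kw['maxInputTokenLen']
    backend['ScheduleConfig']['maxPrefillTokens'] = kw['maxPrefillTokens']
    # '0.0.0.0' means "keep whatever address the template already has".
    if kw['host'] != '0.0.0.0':
        config['ServerConfig']['ipAddress'] = kw['host']
    config['ServerConfig']['port'] = kw['port']
    return config


kw = {
    'npuDeviceIds': [[0, 1]], 'worldSize': 2, 'maxSeqLen': 4096,
    'maxInputTokenLen': 3072, 'maxPrefillTokens': 8192,
    'host': '0.0.0.0', 'port': 8000,
}
updated = apply_settings(template, kw, '/models/my-model')
print(updated['BackendConfig']['ModelDeployConfig']['ModelConfig'][0]['modelName'])
```

Note that the server address is only overwritten when the host differs from `0.0.0.0`, while the port is always replaced.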

lazyllm.components.deploy.Infinity

Bases: LazyLLMDeployBase

This class is a subclass of LazyLLMDeployBase, providing high-performance text-embeddings, reranking, and CLIP capabilities based on the Infinity framework.

Parameters:

  • launcher (launcher, default: remote(ngpus=1) ) –

    The launcher for Infinity, defaulting to launchers.remote(ngpus=1).

  • kw

    Keyword arguments for updating default deployment parameters. Note that no additional keyword arguments can be passed here except those listed below.

The keyword arguments and their default values for this class are as follows:

Other Parameters:

  • keys_name_handle (Dict) –

    Key name mapping dictionary.

  • message_format (Dict) –

    Default message format template.

  • default_headers (Dict) –

    Default HTTP request headers.

  • target_name (str) –

    API target endpoint name.

Examples:

>>> import lazyllm
>>> from lazyllm import deploy
>>> deploy.Infinity()
<lazyllm.llm.deploy type=Infinity>
Source code in lazyllm/components/deploy/infinity.py
class Infinity(LazyLLMDeployBase):
    """This class is a subclass of ``LazyLLMDeployBase``, providing high-performance text-embeddings, reranking, and CLIP capabilities based on the [Infinity](https://github.com/michaelfeil/infinity) framework.

Args:
    launcher (lazyllm.launcher): The launcher for Infinity, defaulting to ``launchers.remote(ngpus=1)``.
    kw: Keyword arguments for updating default deployment parameters. Note that no additional keyword arguments can be passed here except those listed below.

The keyword arguments and their default values for this class are as follows:

Keyword Args: 
    keys_name_handle (Dict): Key name mapping dictionary.
    message_format (Dict): Default message format template.
    default_headers (Dict): Default HTTP request headers.
    target_name (str): API target endpoint name.


Examples:
    >>> import lazyllm
    >>> from lazyllm import deploy
    >>> deploy.Infinity()
    <lazyllm.llm.deploy type=Infinity>
    """
    keys_name_handle = {
        'inputs': 'input',
    }
    message_format = {
        'input': 'who are you ?',
    }
    default_headers = {'Content-Type': 'application/json'}
    target_name = 'embeddings'

    def __init__(self, launcher=launchers.remote(ngpus=1), model_type='embed', log_path=None, **kw):  # noqa B008
        super().__init__(launcher=launcher)
        self.kw = ArgsDict({
            'host': '0.0.0.0',
            'port': None,
            'batch-size': 256,
        })
        self._model_type = model_type
        kw.pop('stream', '')
        # Infinity (embedding model) doesn't support tensor parallel, ignore 'tp' parameter
        kw.pop('tp', None)
        self.options_keys = kw.pop('options_keys', [])
        self.kw.check_and_update(kw)
        self.random_port = False if 'port' in kw and kw['port'] else True
        self.temp_folder = make_log_dir(log_path, 'lmdeploy') if log_path else None

    def cmd(self, finetuned_model=None, base_model=None):
        # Guard against a missing path first, then require at least one weight file.
        if not finetuned_model or not os.path.exists(finetuned_model) or \
            not any(filename.endswith(('.bin', '.safetensors'))
                    for filename in os.listdir(finetuned_model)):
            # Only warn when a path was supplied but turned out to be unusable.
            if finetuned_model:
                LOG.warning(f'Note! That finetuned_model({finetuned_model}) is an invalid path, '
                            f'base_model({base_model}) will be used')
            finetuned_model = base_model

        def impl():
            if self.random_port:
                self.kw['port'] = random.randint(30000, 40000)
            cmd = f'infinity_emb v2 --model-id {finetuned_model} --no-bettertransformer '
            if isinstance(self._launcher, launchers.EmptyLauncher) and self._launcher.ngpus:
                available_gpus = self._launcher._get_idle_gpus()
                required_count = self._launcher.ngpus
                if required_count <= len(available_gpus):
                    try:
                        use_cuda_visible = lazyllm.config['cuda_visible']
                    except (KeyError, AttributeError):
                        use_cuda_visible = False
                    if use_cuda_visible:
                        # Use logical GPU IDs (0, 1, 2...) when CUDA_VISIBLE_DEVICES is set
                        gpu_ids = ','.join(map(str, range(required_count)))
                    else:
                        # Use physical GPU IDs when CUDA_VISIBLE_DEVICES is not set
                        gpu_ids = ','.join(map(str, available_gpus[:required_count]))
                    cmd += f'--device-id={gpu_ids} '
                else:
                    raise RuntimeError(
                        f'Insufficient GPUs available (required: {required_count}, '
                        f'available: {len(available_gpus)})'
                    )
            cmd += self.kw.parse_kwargs()
            cmd += ' ' + parse_options_keys(self.options_keys)
            if self.temp_folder: cmd += f' 2>&1 | tee {get_log_path(self.temp_folder)}'
            return cmd

        return LazyLLMCMD(cmd=impl, return_value=self.geturl, checkf=verify_fastapi_func)

    def geturl(self, job=None):
        """Get the URL address of the Infinity service. Returns the corresponding API access URL address based on deployment mode and job status.

Args:
    job (Optional[Any]): Job object; if None, the current instance's job property is used.
"""
        if job is None:
            job = self.job
        if lazyllm.config['mode'] == lazyllm.Mode.Display:
            return f'http://<ip>:<port>/{self.target_name}'
        else:
            return f'http://{job.get_jobip()}:{self.kw["port"]}/{self.target_name}'

    @staticmethod
    def extract_result(x, inputs):
        """Extract result data from Infinity API response.
Parses JSON response from Infinity service and extracts embedding vectors or reranking results based on the returned object type.

Args:
    x (str): JSON string response returned by API.
    inputs (Dict): Original input data used to determine the return format.
"""
        try:
            res_object = json.loads(x)
        except Exception as e:
            LOG.warning(f'JSONDecodeError on load {x}')
            raise e
        assert 'object' in res_object
        object_type = res_object['object']
        if object_type == 'list':  # for infinity >= 0.0.64
            object_type = res_object['data'][0]['object']
        if object_type == 'embedding':
            res_list = [item['embedding'] for item in res_object['data']]
            if len(res_list) == 1 and type(inputs['input']) is str:
                res_list = res_list[0]
            return json.dumps(res_list)
        elif object_type == 'rerank':
            return [(x['index'], x['relevance_score']) for x in res_object['results']]
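
The GPU selection inside `cmd` distinguishes logical from physical device IDs. The following sketch isolates that decision; `select_device_ids` and its arguments are hypothetical names, and the idle-GPU list is passed in directly instead of being queried from a launcher.

```python
def select_device_ids(idle_gpus, required, use_cuda_visible):
    # Fail early when fewer idle GPUs exist than were requested.
    if required > len(idle_gpus):
        raise RuntimeError(
            f'Insufficient GPUs available (required: {required}, '
            f'available: {len(idle_gpus)})')
    if use_cuda_visible:
        # With CUDA_VISIBLE_DEVICES set, devices are renumbered from 0.
        ids = range(required)
    else:
        # Otherwise the physical IDs of the idle GPUs are used as-is.
        ids = idle_gpus[:required]
    return ','.join(map(str, ids))


physical = select_device_ids([2, 5, 7], 2, use_cuda_visible=False)
logical = select_device_ids([2, 5, 7], 2, use_cuda_visible=True)
print(physical)  # 2,5
print(logical)   # 0,1
```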

extract_result(x, inputs) staticmethod

Extract result data from Infinity API response. Parses JSON response from Infinity service and extracts embedding vectors or reranking results based on the returned object type.

Parameters:

  • x (str) –

    JSON string response returned by API.

  • inputs (Dict) –

    Original input data used to determine the return format.

Source code in lazyllm/components/deploy/infinity.py
    @staticmethod
    def extract_result(x, inputs):
        """Extract result data from Infinity API response.
Parses JSON response from Infinity service and extracts embedding vectors or reranking results based on the returned object type.

Args:
    x (str): JSON string response returned by API.
    inputs (Dict): Original input data used to determine the return format.
"""
        try:
            res_object = json.loads(x)
        except Exception as e:
            LOG.warning(f'JSONDecodeError on load {x}')
            raise e
        assert 'object' in res_object
        object_type = res_object['object']
        if object_type == 'list':  # for infinity >= 0.0.64
            object_type = res_object['data'][0]['object']
        if object_type == 'embedding':
            res_list = [item['embedding'] for item in res_object['data']]
            if len(res_list) == 1 and type(inputs['input']) is str:
                res_list = res_list[0]
            return json.dumps(res_list)
        elif object_type == 'rerank':
            return [(x['index'], x['relevance_score']) for x in res_object['results']]
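
A worked example may help show what the two branches return. The function below re-implements the parsing logic as a standalone sketch; the sample payloads are hypothetical and only mimic the fields the code reads, and raising `ValueError` for an unknown object type is an addition of this sketch (the original simply falls through).

```python
import json


def extract_result_sketch(raw, inputs):
    res = json.loads(raw)
    kind = res['object']
    # Newer responses wrap items in a list; look at the first item's type.
    if kind == 'list':
        kind = res['data'][0]['object']
    if kind == 'embedding':
        vectors = [item['embedding'] for item in res['data']]
        # A single string input yields one flat vector, not a list of vectors.
        if len(vectors) == 1 and isinstance(inputs['input'], str):
            vectors = vectors[0]
        return json.dumps(vectors)
    if kind == 'rerank':
        return [(r['index'], r['relevance_score']) for r in res['results']]
    raise ValueError(f'unsupported object type: {kind}')


embedding_raw = json.dumps({
    'object': 'list',
    'data': [{'object': 'embedding', 'embedding': [0.1, 0.2]}],
})
rerank_raw = json.dumps({
    'object': 'rerank',
    'results': [{'index': 1, 'relevance_score': 0.9},
                {'index': 0, 'relevance_score': 0.2}],
})

single = extract_result_sketch(embedding_raw, {'input': 'hello'})
batch = extract_result_sketch(embedding_raw, {'input': ['hello']})
ranked = extract_result_sketch(rerank_raw, {'input': 'hello'})
print(single)  # [0.1, 0.2]
print(batch)   # [[0.1, 0.2]]
print(ranked)  # [(1, 0.9), (0, 0.2)]
```

Embedding results come back as a JSON string, whereas rerank results come back as a Python list of `(index, relevance_score)` tuples, so callers need to handle the two return types differently.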

geturl(job=None)

Get the URL address of the Infinity service. Returns the corresponding API access URL address based on deployment mode and job status.

Parameters:

  • job (Optional[Any], default: None ) –

    Job object; if None, the current instance's job property is used.

Source code in lazyllm/components/deploy/infinity.py
    def geturl(self, job=None):
        """Get the URL address of the Infinity service. Returns the corresponding API access URL address based on deployment mode and job status.

Args:
    job (Optional[Any]): Job object; if None, the current instance's job property is used.
"""
        if job is None:
            job = self.job
        if lazyllm.config['mode'] == lazyllm.Mode.Display:
            return f'http://<ip>:<port>/{self.target_name}'
        else:
            return f'http://{job.get_jobip()}:{self.kw["port"]}/{self.target_name}'
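
The two URL forms can be illustrated without a running job. In this sketch `build_url` and its `display_mode` flag are hypothetical stand-ins for the check against `lazyllm.config['mode']`, and the IP and port are made-up values.

```python
def build_url(target_name, ip=None, port=None, display_mode=False):
    # In display mode no job exists yet, so placeholders are returned.
    if display_mode:
        return f'http://<ip>:<port>/{target_name}'
    return f'http://{ip}:{port}/{target_name}'


placeholder = build_url('embeddings', display_mode=True)
concrete = build_url('embeddings', ip='10.0.0.5', port=31234)
print(placeholder)
print(concrete)
```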

lazyllm.components.deploy.OCRDeploy

Bases: LazyLLMDeployBase

OCRDeploy is a subclass of LazyLLMDeployBase that provides deployment for OCR (Optical Character Recognition) models. This class is designed to deploy OCR models with additional configurations such as logging, trust for remote code, and port customization.

Attributes:

keys_name_handle: A dictionary mapping input keys to their corresponding handler keys. For example:
    - "inputs": Handles general inputs.
    - "ocr_files": Also mapped to "inputs".
message_format: A dictionary specifying the expected message format. For example:
    - {"inputs": "/path/to/pdf"} indicates that the model expects a PDF file path as input.
default_headers: A dictionary specifying default headers for API requests. Defaults to:
    - {"Content-Type": "application/json"}

Parameters:

  • launcher

    A launcher instance for deploying the model. Defaults to None.

  • log_path

    A string specifying the path where logs should be saved. Defaults to None.

  • trust_remote_code

    A boolean indicating whether to trust remote code execution. Defaults to True.

  • port

    An integer specifying the port for the deployment server. Defaults to None.

Returns:

  • OCRDeploy instance, can be started by calling

Examples:

>>> from lazyllm.components import OCRDeploy
>>> from lazyllm import launchers
>>> # Create an OCRDeploy instance
>>> deployer = OCRDeploy(launcher=launchers.local(), log_path='./logs', port=8080)
>>> # Deploy the server with a fine-tuned OCR model
>>> server = deployer(finetuned_model='ocr-model')
>>> # Print the deployment server information
>>> print(server)
... <RelayServer instance ready to handle OCR requests>
Source code in lazyllm/components/deploy/ocr/pp_ocr.py
class OCRDeploy(LazyLLMDeployBase):
    """OCRDeploy is a subclass of [LazyLLMDeployBase][lazyllm.components.LazyLLMDeployBase] that provides deployment for OCR (Optical Character Recognition) models.
This class is designed to deploy OCR models with additional configurations such as logging, trust for remote code, and port customization.

Attributes:

    keys_name_handle: A dictionary mapping input keys to their corresponding handler keys. For example:
        - "inputs": Handles general inputs.
        - "ocr_files": Also mapped to "inputs".
    message_format: A dictionary specifying the expected message format. For example:
        - {"inputs": "/path/to/pdf"} indicates that the model expects a PDF file path as input.
    default_headers: A dictionary specifying default headers for API requests. Defaults to:
        - {"Content-Type": "application/json"}

Args:
    launcher: A launcher instance for deploying the model. Defaults to `None`.
    log_path: A string specifying the path where logs should be saved. Defaults to `None`.
    trust_remote_code: A boolean indicating whether to trust remote code execution. Defaults to `True`.
    port: An integer specifying the port for the deployment server. Defaults to `None`.

Returns:
    OCRDeploy instance, can be started by calling


Examples:
    >>> from lazyllm.components import OCRDeploy
    >>> from lazyllm import launchers
    >>> # Create an OCRDeploy instance
    >>> deployer = OCRDeploy(launcher=launchers.local(), log_path='./logs', port=8080)
    >>> # Deploy the server with a fine-tuned OCR model
    >>> server = deployer(finetuned_model='ocr-model')
    >>> # Print the deployment server information
    >>> print(server)
    ... <RelayServer instance ready to handle OCR requests>
    """
    keys_name_handle = {
        'inputs': 'inputs',
        'ocr_files': 'inputs',
    }
    message_format = {'inputs': '/path/to/pdf'}
    default_headers = {'Content-Type': 'application/json'}

    def __init__(self, launcher=None, log_path=None, trust_remote_code=True, port=None, **kw):
        super().__init__(launcher=launcher)
        self._log_path = log_path
        self._trust_remote_code = trust_remote_code
        self._port = port

    def __call__(self, finetuned_model=None, base_model=None):
        if not finetuned_model:
            finetuned_model = base_model
        return lazyllm.deploy.RelayServer(
            port=self._port, func=_OCR(finetuned_model), launcher=self._launcher, log_path=self._log_path, cls='ocr')()

lazyllm.components.deploy.BertDeploy

Bases: LazyLLMDeployBase

This class is a subclass of LazyLLMDeployBase, used to deploy HuggingFace-style sequence classification models. The service exposes an HTTP /generate endpoint via RelayServer (no standalone CLI). Usage mirrors StableDiffusionDeploy: calling the instance __call__(finetuned_model, base_model) returns a launchable Relay service.

Parameters:

  • launcher (Optional[LazyLLMLaunchersBase], default: None ) –

    Launcher instance, defaults to None.

  • log_path (Optional[str], default: None ) –

    Log file path, defaults to None.

  • trust_remote_code (bool, default: True ) –

    Whether to pass trust_remote_code when loading tokenizer/model, defaults to True.

  • port (Optional[int], default: None ) –

    Listen port; None lets the framework assign one.

  • max_length (int, default: 512 ) –

    Maximum sequence length for tokenization, default 512.

  • device (Optional[str], default: None ) –

    Device for inference, e.g. "cuda" or "cpu"; None selects automatically.

  • **kw

    Unknown keyword arguments are logged as a warning and ignored.

Message Format

With TrainableModule, the first positional argument maps to text_a and text_b is passed as a keyword. JSON body:

  • text_a (str): First segment (e.g. span text).
  • text_b (str): Second segment (e.g. context); may be empty for single-sequence encoding.

Returns:

  • JSON string from the worker with fields such as logits, probs, and predicted_label.
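
As a rough illustration of the message format, the sketch below builds the JSON body for a two-sequence and a single-sequence request. `build_body` is a hypothetical helper, not part of LazyLLM; it only mirrors the `text_a` / `text_b` fields described above.

```python
import json


def build_body(text_a, text_b=''):
    # text_b stays empty for single-sequence encoding.
    return json.dumps({'text_a': text_a, 'text_b': text_b}, ensure_ascii=False)


pair_body = build_body('candidate_span', 'This is the surrounding context.')
single_body = build_body('This is a single sequence.')
print(pair_body)
print(single_body)
```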

Examples:

>>> import lazyllm
>>> m = lazyllm.TrainableModule('xlm-roberta-base', use_model_map=False).deploy_method(
...     lazyllm.deploy.BertDeploy, port=36001, max_length=128,
... ).start()
>>> # Two-sequence input: candidate span + surrounding context.
>>> pair_out = m('candidate_span', text_b='This is the surrounding context.')
>>> print(pair_out)
>>> # Single-sequence input: either omit ``text_b`` or pass an empty string.
>>> single_out = m('This is a single sequence.')
>>> print(single_out)
Source code in lazyllm/components/deploy/bert.py
class BertDeploy(LazyLLMDeployBase):
    """This class is a subclass of ``LazyLLMDeployBase``, used to deploy HuggingFace-style **sequence classification** models. The service exposes an HTTP ``/generate`` endpoint via ``RelayServer`` (no standalone CLI). Usage mirrors ``StableDiffusionDeploy``: calling the instance ``__call__(finetuned_model, base_model)`` returns a launchable Relay service.

Args:
    launcher (Optional[LazyLLMLaunchersBase]): Launcher instance, defaults to ``None``.
    log_path (Optional[str]): Log file path, defaults to ``None``.
    trust_remote_code (bool): Whether to pass ``trust_remote_code`` when loading tokenizer/model, defaults to ``True``.
    port (Optional[int]): Listen port; ``None`` lets the framework assign one.
    max_length (int): Maximum sequence length for tokenization, default ``512``.
    device (Optional[str]): Device for inference, e.g. ``"cuda"`` or ``"cpu"``; ``None`` selects automatically.
    **kw: Unknown keyword arguments are logged as a warning and ignored.

Message Format:
    With ``TrainableModule``, the first positional argument maps to ``text_a`` and ``text_b`` is passed as a keyword. JSON body:
    - text_a (str): First segment (e.g. span text).
    - text_b (str): Second segment (e.g. context); may be empty for single-sequence encoding.

**Returns:**
- JSON string from the worker with fields such as ``logits``, ``probs``, and ``predicted_label``.


Examples:
    >>> import lazyllm
    >>> m = lazyllm.TrainableModule('xlm-roberta-base', use_model_map=False).deploy_method(
    ...     lazyllm.deploy.BertDeploy, port=36001, max_length=128,
    ... ).start()
    >>> # Two-sequence input: candidate span + surrounding context.
    >>> pair_out = m('candidate_span', text_b='This is the surrounding context.')
    >>> print(pair_out)
    >>> # Single-sequence input: either omit ``text_b`` or pass an empty string.
    >>> single_out = m('This is a single sequence.')
    >>> print(single_out)
    """

    message_format = {
        'text_a': '',
        'text_b': '',
    }
    keys_name_handle = {'inputs': 'text_a'}
    default_headers = {'Content-Type': 'application/json'}
    stream_url_suffix = ''
    stream_parse_parameters: dict = {}

    def __init__(
        self,
        launcher: Optional[LazyLLMLaunchersBase] = None,
        log_path: Optional[str] = None,
        trust_remote_code: bool = True,
        port: Optional[int] = None,
        max_length: int = 512,
        device: Optional[str] = None,
        **kw,
    ):
        super().__init__(launcher=launcher)
        self._log_path = log_path
        self._trust_remote_code = trust_remote_code
        self._port = port
        self._max_length = max_length
        self._device = device
        if kw:
            LOG.warning(f'Bert deploy: ignoring unknown kwargs: {sorted(kw.keys())}')

    def __call__(self, finetuned_model=None, base_model=None):
        finetuned_model = finetuned_model or ''
        base_model = base_model or ''
        if not finetuned_model:
            model_path = base_model
        else:
            valid_local = (
                os.path.isdir(finetuned_model)
                and any(
                    f.endswith(('.bin', '.safetensors', '.pt'))
                    for f in os.listdir(finetuned_model)
                )
            )
            if valid_local:
                model_path = finetuned_model
            elif base_model:
                LOG.warning(
                    f'Note! finetuned_model({finetuned_model}) is not a local checkpoint with weights; '
                    f'using base_model({base_model}).'
                )
                model_path = base_model
            else:
                LOG.info(
                    f'Bert deploy: finetuned_model({finetuned_model}) is not a valid local '
                    f'checkpoint and no base_model provided; treating it as a remote model id.'
                )
                model_path = finetuned_model

        if not model_path:
            raise ValueError('Bert deploy: finetuned_model and base_model are both empty or invalid.')

        func = _BertSequenceClassificationService(
            model_path,
            trust_remote_code=self._trust_remote_code,
            max_length=self._max_length,
            device=self._device,
        )
        return RelayServer(
            port=self._port,
            func=func,
            launcher=self._launcher,
            log_path=self._log_path,
            cls='bert',
        )()

    @staticmethod
    def extract_result(output: str, inputs) -> str:
        return output
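
The model-path decision in `__call__` has four outcomes: a valid local checkpoint, fallback to the base model, a remote model id, or an error. The sketch below condenses that logic into a standalone function so it can be tested with temporary directories. `resolve_model_path` is an illustrative name and the logging calls are omitted.

```python
import os
import tempfile

WEIGHT_SUFFIXES = ('.bin', '.safetensors', '.pt')


def resolve_model_path(finetuned_model=None, base_model=None):
    finetuned_model = finetuned_model or ''
    base_model = base_model or ''
    # A usable local checkpoint is a directory holding at least one weight file.
    has_weights = os.path.isdir(finetuned_model) and any(
        name.endswith(WEIGHT_SUFFIXES) for name in os.listdir(finetuned_model))
    if has_weights:
        return finetuned_model
    # Prefer base_model; otherwise treat finetuned_model as a remote model id.
    path = base_model or finetuned_model
    if not path:
        raise ValueError('finetuned_model and base_model are both empty or invalid.')
    return path


with tempfile.TemporaryDirectory() as ckpt, tempfile.TemporaryDirectory() as empty:
    open(os.path.join(ckpt, 'model.safetensors'), 'w').close()
    local_ok = resolve_model_path(ckpt, 'base-model') == ckpt
    fallback = resolve_model_path(empty, 'base-model')
remote = resolve_model_path('org/remote-model')
print(local_ok, fallback, remote)
```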

lazyllm.components.deploy.relay.base.RelayServer

Bases: LazyLLMDeployBase

Source code in lazyllm/components/deploy/relay/base.py
class RelayServer(LazyLLMDeployBase):
    keys_name_handle = None
    default_headers = {'Content-Type': 'application/json'}
    message_format = None

    def __init__(self, port=None, *, func=None, pre_func=None, post_func=None, pythonpath=None,
                 log_path=None, cls=None, launcher=launchers.remote(sync=False), num_replicas: int = 1,  # noqa B008
                 security_key: Optional[str] = None, defined_pos: Optional[str] = None):
        # func must be dumped in __call__ to wait for dependencies.
        self._func = func
        self._pre = dump_obj(pre_func)
        self._post = dump_obj(post_func)
        self._port, self._real_port = port, None
        self._pythonpath = pythonpath
        self._num_replicas = num_replicas
        self._security_key = security_key
        self._defined_pos = defined_pos
        super().__init__(launcher=launcher)
        self.temp_folder = make_log_dir(log_path, cls or 'relay') if log_path else None

    def cmd(self, func=None):
        FastapiApp.update()
        self._func = dump_obj(func or self._func)
        folder_path = os.path.dirname(os.path.abspath(__file__))
        run_file_path = os.path.join(folder_path, 'server.py')

        def impl():
            self._real_port = self._port if self._port else random.randint(30000, 40000)
            cmd = f'{sys.executable} {run_file_path} --open_port={self._real_port} --function="{self._func}" '
            if self._pre:
                cmd += f'--before_function="{self._pre}" '
            if self._post:
                cmd += f'--after_function="{self._post}" '
            if self._pythonpath:
                cmd += f'--pythonpath="{self._pythonpath}" '
            if self._num_replicas > 1 and config['use_ray']:
                cmd += f'--num_replicas={self._num_replicas} '
            if self._security_key:
                cmd += f'--security_key="{self._security_key}" '
            if self._defined_pos:
                cmd += '--defined_pos="{}" '.format(dump_obj(self._defined_pos.replace('"', r'\"')))
            if self.temp_folder: cmd += f' 2>&1 | tee {get_log_path(self.temp_folder)}'
            return cmd

        return LazyLLMCMD(cmd=impl, return_value=self.geturl,
                          checkf=verify_ray_func if config['use_ray'] else verify_fastapi_func,
                          no_displays=['function', 'before_function', 'after_function', 'security_key', 'defined_pos'])

    def geturl(self, job=None):
        if job is None:
            job = self.job
        return f'http://{job.get_jobip()}:{self._real_port}/generate'

cmd(func=None)

Source code in lazyllm/components/deploy/relay/base.py
def cmd(self, func=None):
    FastapiApp.update()
    self._func = dump_obj(func or self._func)
    folder_path = os.path.dirname(os.path.abspath(__file__))
    run_file_path = os.path.join(folder_path, 'server.py')

    def impl():
        self._real_port = self._port if self._port else random.randint(30000, 40000)
        cmd = f'{sys.executable} {run_file_path} --open_port={self._real_port} --function="{self._func}" '
        if self._pre:
            cmd += f'--before_function="{self._pre}" '
        if self._post:
            cmd += f'--after_function="{self._post}" '
        if self._pythonpath:
            cmd += f'--pythonpath="{self._pythonpath}" '
        if self._num_replicas > 1 and config['use_ray']:
            cmd += f'--num_replicas={self._num_replicas} '
        if self._security_key:
            cmd += f'--security_key="{self._security_key}" '
        if self._defined_pos:
            cmd += '--defined_pos="{}" '.format(dump_obj(self._defined_pos.replace('"', r'\"')))
        if self.temp_folder: cmd += f' 2>&1 | tee {get_log_path(self.temp_folder)}'
        return cmd

    return LazyLLMCMD(cmd=impl, return_value=self.geturl,
                      checkf=verify_ray_func if config['use_ray'] else verify_fastapi_func,
                      no_displays=['function', 'before_function', 'after_function', 'security_key', 'defined_pos'])
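
When a command line is assembled by string concatenation, every fragment has to carry its own separator. One alternative, shown here purely as a sketch, is to collect the flags in a list and join them once. `build_flags` is a hypothetical helper covering only a subset of the options, and it leaves out the `config['use_ray']` check that the real method applies to `--num_replicas`.

```python
def build_flags(port, pythonpath=None, num_replicas=1, security_key=None):
    parts = [f'--open_port={port}']
    if pythonpath:
        parts.append(f'--pythonpath="{pythonpath}"')
    if num_replicas > 1:
        parts.append(f'--num_replicas={num_replicas}')
    if security_key:
        parts.append(f'--security_key="{security_key}"')
    # Joining once guarantees exactly one space between adjacent flags.
    return ' '.join(parts)


flags = build_flags(8080, num_replicas=2, security_key='s3cret')
print(flags)
```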

geturl(job=None)

Source code in lazyllm/components/deploy/relay/base.py
def geturl(self, job=None):
    if job is None:
        job = self.job
    return f'http://{job.get_jobip()}:{self._real_port}/generate'

lazyllm.components.deploy.text_to_speech.utils.TTSBase

Bases: LazyLLMDeployBase

Base class for TTS (Text-to-Speech) services.

Provides the deployment framework for text-to-speech services, supporting model loading and RelayServer deployment.

Parameters:

  • launcher (LazyLLMLaunchersBase, default: None ) –

    Task launcher

  • log_path (str, default: None ) –

    Log file path

  • port (int, default: None ) –

    Service port number

Source code in lazyllm/components/deploy/text_to_speech/utils.py
class TTSBase(LazyLLMDeployBase):
    """Base class for TTS (Text-to-Speech) services.

Provides the deployment framework for text-to-speech services, supporting model loading and RelayServer deployment.

Args:
    launcher (LazyLLMLaunchersBase, optional): Task launcher
    log_path (str, optional): Log file path
    port (int, optional): Service port number
"""
    func = None

    def __init__(self, launcher: LazyLLMLaunchersBase = None,
                 log_path: Optional[str] = None, port: Optional[int] = None, **kw):
        super().__init__(launcher=launcher)
        self._log_path = log_path
        self._port = port

    def __call__(self, finetuned_model=None, base_model=None):
        if not finetuned_model:
            finetuned_model = base_model
        elif not os.path.exists(finetuned_model) or \
            not any(file.endswith(('.bin', '.safetensors'))
                    for _, _, filenames in os.walk(finetuned_model) for file in filenames):
            LOG.warning(f'Note! That finetuned_model({finetuned_model}) is an invalid path, '
                        f'base_model({base_model}) will be used')
            finetuned_model = base_model
        return lazyllm.deploy.RelayServer(port=self._port, func=self.__class__.func(finetuned_model),
                                          launcher=self._launcher, log_path=self._log_path, cls='tts')()
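
Unlike a flat directory listing, the check in `__call__` walks the whole tree, so weight files in nested subfolders also count. A minimal sketch of that check, using a hypothetical helper name and temporary directories:

```python
import os
import tempfile


def has_weight_files(model_dir, suffixes=('.bin', '.safetensors')):
    # os.walk visits every subdirectory, so nested weight files are found too.
    return os.path.exists(model_dir) and any(
        name.endswith(suffixes)
        for _, _, filenames in os.walk(model_dir) for name in filenames)


with tempfile.TemporaryDirectory() as root:
    nested = os.path.join(root, 'sub', 'dir')
    os.makedirs(nested)
    before = has_weight_files(root)
    open(os.path.join(nested, 'model.bin'), 'w').close()
    after = has_weight_files(root)
# The temporary directory has been removed at this point.
missing = has_weight_files(os.path.join(root, 'gone'))
print(before, after, missing)
```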

Prompter

lazyllm.components.prompter.LazyLLMPrompterBase

LazyLLM prompter base class for managing and generating model prompts.

Parameters:

  • show (bool, default: False ) –

    Whether to display generated prompts, defaults to False.

  • tools (Optional[List], default: None ) –

    List of available tools, defaults to None.

  • history (Optional[List], default: None ) –

    Conversation history, defaults to None.

Attributes:

  • ISA (str) –

    Instruction separator start token "<!lazyllm-spliter!>".

  • ISE (str) –

    Instruction separator end token "</!lazyllm-spliter!>".

Configuration Items

system: System role setting

sos/eos: Session start/end markers

soh/eoh: Human input start/end markers

soa/eoa: AI response start/end markers

soe/eoe: Tool execution result start/end markers

tool_start_token/tool_end_token: Tool call start/end markers

tool_args_token: Tool arguments marker
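
To show how the start/end markers frame a conversation, the sketch below renders a short history into a single prompt string, following the `soh`/`eoh`/`soa`/`eoa` pattern. The marker strings are hypothetical placeholders, not the tokens of any particular model, and `render_history` is an illustrative helper.

```python
def render_history(history, soh, eoh, soa, eoa):
    # Each turn is a [human, assistant] pair wrapped in its own markers.
    return ''.join(f'{soh}{h}{eoh}{soa}{a}{eoa}' for h, a in history)


# Hypothetical marker values chosen only for readability.
markers = dict(soh='<|user|>', eoh='\n', soa='<|assistant|>', eoa='\n')
text = render_history([['Hi', 'Hello!'], ['2+2?', '4']], **markers)
print(text)
```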

Source code in lazyllm/components/prompter/builtinPrompt.py
class LazyLLMPrompterBase(metaclass=LazyLLMRegisterMetaClass):
    """LazyLLM prompter base class for managing and generating model prompts.

Args:
    show (bool): Whether to display generated prompts, defaults to False.
    tools (Optional[List]): List of available tools, defaults to None.
    history (Optional[List]): Conversation history, defaults to None.

Attributes:
    ISA (str): Instruction separator start token "<!lazyllm-spliter!>".

    ISE (str): Instruction separator end token "</!lazyllm-spliter!>".


Configuration Items:
    system: System role setting 

    sos/eos: Session start/end markers 

    soh/eoh: Human input start/end markers 

    soa/eoa: AI response start/end markers 

    soe/eoe: Tool execution result start/end markers 

    tool_start_token/tool_end_token: Tool call start/end markers 

    tool_args_token: Tool arguments marker 

"""
    ISA = '<!lazyllm-spliter!>'
    ISE = '</!lazyllm-spliter!>'

    def __init__(self, show=False, tools=None, skills=None, history=None, *, enable_system: bool = True):
        self._set_model_configs(system=_DEFULT_CONFIG if not config['disable_system_prompt'] else '',
                                sos='', soh='', soa='', eos='', eoh='', eoa='')
        self._show = show
        self._tools = _DynamicValue(tools) if callable(tools) else tools
        self._skills = _DynamicValue(skills) if callable(skills) else (skills or '')
        self._pre_hook = None
        self._history = history or []
        self._enable_system = enable_system

    def _init_prompt(self, template: str, instruction_template: str, split: Union[None, str] = None):
        self._template = template
        self._instruction_template = instruction_template
        if split:
            assert not hasattr(self, '_split')
            self._split = split

    @staticmethod
    def _get_extro_key_template(extra_keys, prefix='Here are some extra messages you can referred to:\n\n'):
        if extra_keys:
            if isinstance(extra_keys, str): extra_keys = [extra_keys]
            assert isinstance(extra_keys, (tuple, list)), 'Only str, tuple[str], list[str] are supported'
            return prefix + ''.join([f'### {k}:\n{{{k}}}\n\n' for k in extra_keys])
        return ''

    def _handle_tool_call_instruction(self, instruction, tools):
        tool_dict = {}
        for key in ['tool_start_token', 'tool_args_token', 'tool_end_token']:
            if getattr(self, f'_{key}', None) and key in instruction:
                tool_dict[key] = getattr(self, f'_{key}')
        if 'tool_names' in instruction: tool_dict['tool_names'] = self._get_tools_name(tools)
        return reduce(lambda s, kv: s.replace(f'{{{kv[0]}}}', kv[1]), tool_dict.items(), instruction)

    def _set_model_configs(self, system: str = None, sos: Union[None, str] = None, soh: Union[None, str] = None,
                           soa: Union[None, str] = None, eos: Union[None, str] = None,
                           eoh: Union[None, str] = None, eoa: Union[None, str] = None,
                           soe: Union[None, str] = None, eoe: Union[None, str] = None,
                           separator: Union[None, str] = None, plugin: Union[None, str] = None,
                           interpreter: Union[None, str] = None, stop_words: Union[None, List[str]] = None,
                           tool_start_token: Union[None, str] = None, tool_end_token: Union[None, str] = None,
                           tool_args_token: Union[None, str] = None):

        local = locals()
        for name in ['system', 'sos', 'soh', 'soa', 'eos', 'eoh', 'eoa', 'soe', 'eoe', 'tool_start_token',
                     'tool_end_token', 'tool_args_token']:
            if local[name] is not None: setattr(self, f'_{name}', local[name])

    def _resolve_dynamic(self, value, input, history):
        return value.resolve(input, history) if isinstance(value, _DynamicValue) else value

    def _get_tools(self, tools, *, for_chat_api: bool):
        return tools if for_chat_api else '### Function-call Tools. \n\n' +\
            f'{json.dumps(tools, ensure_ascii=False)}\n\n' if tools else ''

    def _get_tools_name(self, tools):
        return json.dumps([t['function']['name'] for t in tools], ensure_ascii=False) if tools else ''

    def _get_histories(self, history, *, for_chat_api: bool):  # noqa: C901
        if not self._history and not history: return ''
        if for_chat_api:
            content = []
            for item in self._history + (history or []):
                if isinstance(item, list):
                    assert len(item) <= 2, 'history item length cannot be greater than 2'
                    if len(item) > 0: content.append({'role': 'user', 'content': item[0]})
                    if len(item) > 1: content.append({'role': 'assistant', 'content': item[1]})
                elif isinstance(item, dict):
                    content.append(item)
                else:
                    LOG.error(f'history: {history}')
                    raise ValueError('history must be a list of list or dict')
            return content
        else:
            ret = ''.join([f'{self._soh}{h}{self._eoh}{self._soa}{a}{self._eoa}' for h, a in self._history])
            if not history: return ret
            if isinstance(history[0], list):
                return ret + ''.join([f'{self._soh}{h}{self._eoh}{self._soa}{a}{self._eoa}' for h, a in history])
            elif isinstance(history[0], dict):
                for item in history:
                    if item['role'] == 'user':
                        ret += f'{self._soh}{item["content"]}{self._eoh}'
                    elif item['role'] == 'assistant':
                        ret += f'{self._soa}'
                        ret += f'{item.get("content", "")}'
                        for idx in range(len(item.get('tool_calls', []))):
                            tool = item['tool_calls'][idx]['function']
                            if getattr(self, '_tool_args_token', None):
                                tool = tool['name'] + self._tool_args_token + \
                                    json.dumps(tool['arguments'], ensure_ascii=False)
                            ret += (f'{getattr(self, "_tool_start_token", "")}' + '\n'
                                    f'{tool}'
                                    f'{getattr(self, "_tool_end_token", "")}' + '\n')
                        ret += f'{self._eoa}'
                    elif item['role'] == 'tool':
                        try:
                            content = json.loads(item['content'].strip())
                        except Exception:
                            content = item['content']
                        ret += f'{getattr(self, "_soe", "")}{content}{getattr(self, "_eoe", "")}'

                return ret
            else:
                raise NotImplementedError(f'Cannot transform json history to {type(history[0])} now')

    def _get_instruction_and_input(self, input, *, for_chat_api: bool = False, tools=None):
        instruction = self._instruction_template
        fc_prompt = '' if for_chat_api or not tools else FC_PROMPT
        if fc_prompt and FC_PROMPT_PLACEHOLDER not in instruction:
            instruction = f'{instruction}\n\n{fc_prompt}'
        instruction = instruction.replace(FC_PROMPT_PLACEHOLDER, fc_prompt)
        instruction = self._handle_tool_call_instruction(instruction, tools)
        prompt_keys = list(set(re.findall(r'\{(\w+)\}', instruction)))
        if isinstance(input, (str, int)):
            if len(prompt_keys) == 1:
                return instruction.format(**{prompt_keys[0]: input}), ''
            else:
                assert len(prompt_keys) == 0
                return instruction, input
        assert isinstance(input, dict), f'expected types are str, int and dict, but got {type(input)} (`{input}`)'
        kwargs = {k: input.pop(k) for k in prompt_keys}
        assert len(input) <= 1, f'Unexpected keys found in input: {list(input.keys())}'
        return (reduce(lambda s, kv: s.replace(f'{{{kv[0]}}}', kv[1]),
                       kwargs.items(),
                       instruction)
                if len(kwargs) > 0 else instruction,
                list(input.values())[0] if input else '')

    def _check_values(self, instruction, input, history, tools): pass

    # Used for TrainableModule(local deployed)
    def _generate_prompt_impl(self, instruction, input, user, history, tools, label, skills):
        is_tool = False
        if isinstance(input, dict):
            # Read the role first: once `input` is replaced by its content it is no longer a dict.
            is_tool = input.get('role') == 'tool'
            input = input.get('content', '')
        elif isinstance(input, list):
            is_tool = any(item.get('role') == 'tool' for item in input)
            input = '\n'.join([item.get('content', '') for item in input])
        params = dict(system=self._system, instruction=instruction, input=input, user=user, history=history, tools=tools,
                      skills=skills, sos=self._sos, eos=self._eos, soh=self._soh, eoh=self._eoh, soa=self._soa,
                      eoa=self._eoa)
        if is_tool:
            params['soh'] = getattr(self, '_soe', self._soh)
            params['eoh'] = getattr(self, '_eoe', self._eoh)
        return (self._template.format(**params) + (label if label else '')).lstrip('\n')

    # Used for OnlineChatModule
    def _generate_prompt_dict_impl(self, instruction, input, user, history, tools, label, skills):
        if not history: history = []
        if isinstance(input, str):
            history.append({'role': 'user', 'content': input})
        elif isinstance(input, dict):
            history.append(input)
        elif isinstance(input, list) and all(isinstance(ele, dict) for ele in input):
            history.extend(input)
        elif isinstance(input, tuple) and len(input) == 1:
            # Note tuple size 1 with one single string is not expected
            history.append({'role': 'user', 'content': input[0]})
        else:
            raise TypeError('input must be a str, a dict, a list of dicts, or a 1-tuple')

        if user:
            history[-1]['content'] = user + history[-1]['content']

        if self._enable_system:
            system_content = '\n'.join(p for p in (self._system, instruction, skills) if p)
            history.insert(0, {'role': 'system', 'content': system_content})
        return dict(messages=history, tools=tools) if tools else dict(messages=history)

    # Used for OnlineChatModule with Anthropic-format API
    def _generate_prompt_anthropic_impl(self, instruction, input, user, history, tools, label, skills):
        result = self._generate_prompt_dict_impl(instruction, input, user, history, tools, label, skills)
        messages = result.get('messages', [])
        system_text = None
        non_system = []
        for msg in messages:
            if msg.get('role') == 'system':
                system_text = msg['content']
            else:
                non_system.append(msg)
        out = dict(messages=non_system)
        if system_text is not None:
            out['system'] = system_text
        if tools:
            out['tools'] = result['tools']
        return out

    def pre_hook(self, func: Optional[Callable] = None):
        """Sets a pre-processing hook function, allowing external custom processing of input data before prompt generation.

Args:
    func (Optional[Callable]): A callable object to be used as the pre-processing hook function, which receives and processes input data.

**Returns:**

- LazyLLMPrompterBase: Returns the instance itself to support method chaining.
"""
        self._pre_hook = func
        return self

    def _split_instruction(self, instruction: str):
        system_instruction = instruction
        user_instruction = ''
        if LazyLLMPrompterBase.ISA in instruction and LazyLLMPrompterBase.ISE in instruction:
            # The instruction includes system prompts and/or user prompts
            pattern = re.compile(r'%s(.*)%s' % (LazyLLMPrompterBase.ISA, LazyLLMPrompterBase.ISE), re.DOTALL)
            ret = re.split(pattern, instruction)
            system_instruction = ret[0]
            user_instruction = ret[1]

        return system_instruction, user_instruction

    def generate_prompt(self, input: Union[str, List, Dict[str, str], None] = None,
                        history: List[Union[List[str], Dict[str, Any]]] = None,
                        tools: Union[List[Dict[str, Any]], None] = None,
                        label: Union[str, None] = None,
                        *, show: bool = False, return_dict: bool = False,
                        format: Optional[str] = None) -> Union[str, Dict]:
        """
Generate a corresponding Prompt based on user input.

Args:
    input (Option[str | Dict]): The input to the prompter. If it is a dict, its values fill the slots of the instruction; if it is a str, it is used as the input.
    history (Option[List[List | Dict]]): Historical conversation, can be ``[[u, s], [u, s]]`` or in openai's history format, defaults to None.
    tools (Option[List[Dict]]): A collection of tools that can be used, used when the large model performs FunctionCall, defaults to None.
    label (Option[str]): Label, used during fine-tuning or training, defaults to None.
    show (bool): Flag indicating whether to print the generated Prompt, defaults to False.
    return_dict (bool): Deprecated; prefer ``format="openai"``. When ``format`` is ``None`` and this is True, behaves like ``format="openai"`` and emits a one-time deprecation warning. Defaults to False.
    format (Option[str]): Output structure. ``None`` returns a concatenated string for local/finetuning use; ``"openai"`` returns OpenAI-style ``messages`` (and optional ``tools``); ``"anthropic"`` returns Anthropic-style ``system``/``messages``. ``OnlineChatModule`` passes the appropriate value. If both ``return_dict`` and ``format`` are provided, ``format`` takes precedence. Defaults to ``None``.
"""
        if return_dict and format is None:
            LOG.log_once('return_dict is deprecated, use format="openai" instead.', level='warning')
            format = 'openai'
        input = copy.deepcopy(input)
        if self._pre_hook:
            input, history, tools, label = self._pre_hook(input, history, tools, label)
        tools = self._resolve_dynamic(tools or self._tools, input, history)
        skills = self._resolve_dynamic(self._skills, input, history)
        for_chat_api = bool(format)
        instruction, input = self._get_instruction_and_input(input, for_chat_api=for_chat_api, tools=tools)
        history = self._get_histories(history, for_chat_api=for_chat_api)
        tools = self._get_tools(tools, for_chat_api=for_chat_api)
        self._check_values(instruction, input, history, tools)
        instruction, user_instruction = self._split_instruction(instruction)
        if format == 'anthropic':
            func = self._generate_prompt_anthropic_impl
        elif format == 'openai':
            func = self._generate_prompt_dict_impl
        else:
            func = self._generate_prompt_impl
        result = func(instruction, input, user_instruction, history, tools, label, skills)
        if self._show or show: LOG.info(result)
        return result

    def get_response(self, output: str, input: Union[str, None] = None) -> str:
        """Used to truncate the Prompt, keeping only valuable output.

Args:
    output (str): The output of the large model.
    input (Option[str]): The input of the large model. If specified and the output starts with this input, that leading copy is removed from the output. Defaults to None.
"""
        if input and output.startswith(input):
            return output[len(input):]
        return output if getattr(self, '_split', None) is None else output.split(self._split)[-1]

generate_prompt(input=None, history=None, tools=None, label=None, *, show=False, return_dict=False, format=None)

Generate a corresponding Prompt based on user input.

Parameters:

  • input (Option[str | Dict], default: None ) –

    The input to the prompter. If it is a dict, its values fill the slots of the instruction; if it is a str, it is used as the input.

  • history (Option[List[List | Dict]], default: None ) –

    Historical conversation, can be [[u, s], [u, s]] or in openai's history format, defaults to None.

  • tools (Option[List[Dict]], default: None ) –

    A collection of tools that can be used, used when the large model performs FunctionCall, defaults to None.

  • label (Option[str], default: None ) –

    Label, used during fine-tuning or training, defaults to None.

  • show (bool, default: False ) –

    Flag indicating whether to print the generated Prompt, defaults to False.

  • return_dict (bool, default: False ) –

    Deprecated; prefer format="openai". When format is None and this is True, behaves like format="openai" and emits a one-time deprecation warning. Defaults to False.

  • format (Option[str], default: None ) –

    Output structure. None returns a concatenated string for local/finetuning use; "openai" returns OpenAI-style messages (and optional tools); "anthropic" returns Anthropic-style system/messages. OnlineChatModule passes the appropriate value. If both return_dict and format are provided, format takes precedence. Defaults to None.
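
The precedence rule and the difference between the two dict shapes can be illustrated without LazyLLM installed. The sketch below is a standalone approximation, not the library's code: `resolve_format` and `to_anthropic` are hypothetical helper names that mirror the behaviour described above.

```python
from typing import Any, Dict, List, Optional


def resolve_format(return_dict: bool = False, format: Optional[str] = None) -> Optional[str]:
    """Hypothetical helper: apply the documented precedence between the two flags."""
    # An explicit `format` always wins; `return_dict=True` only matters when format is None.
    if return_dict and format is None:
        return 'openai'
    return format


def to_anthropic(openai_style: Dict[str, Any]) -> Dict[str, Any]:
    """Hypothetical helper: lift the system message out of `messages` into a `system` key."""
    system_text = None
    non_system: List[Dict[str, Any]] = []
    for msg in openai_style.get('messages', []):
        if msg.get('role') == 'system':
            system_text = msg['content']
        else:
            non_system.append(msg)
    out: Dict[str, Any] = {'messages': non_system}
    if system_text is not None:
        out['system'] = system_text
    if openai_style.get('tools'):
        out['tools'] = openai_style['tools']
    return out


openai_payload = {
    'messages': [
        {'role': 'system', 'content': 'You are a helpful assistant.'},
        {'role': 'user', 'content': 'Hello'},
    ]
}
anthropic_payload = to_anthropic(openai_payload)
```

Under these assumptions, the OpenAI-style payload keeps the system prompt as the first message, while the Anthropic-style payload carries it in a separate top-level `system` field.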

Source code in lazyllm/components/prompter/builtinPrompt.py
    def generate_prompt(self, input: Union[str, List, Dict[str, str], None] = None,
                        history: List[Union[List[str], Dict[str, Any]]] = None,
                        tools: Union[List[Dict[str, Any]], None] = None,
                        label: Union[str, None] = None,
                        *, show: bool = False, return_dict: bool = False,
                        format: Optional[str] = None) -> Union[str, Dict]:
        """
Generate a corresponding Prompt based on user input.

Args:
    input (Option[str | Dict]): The input to the prompter. If it is a dict, its values fill the slots of the instruction; if it is a str, it is used as the input.
    history (Option[List[List | Dict]]): Historical conversation, can be ``[[u, s], [u, s]]`` or in openai's history format, defaults to None.
    tools (Option[List[Dict]]): A collection of tools that can be used, used when the large model performs FunctionCall, defaults to None.
    label (Option[str]): Label, used during fine-tuning or training, defaults to None.
    show (bool): Flag indicating whether to print the generated Prompt, defaults to False.
    return_dict (bool): Deprecated; prefer ``format="openai"``. When ``format`` is ``None`` and this is True, behaves like ``format="openai"`` and emits a one-time deprecation warning. Defaults to False.
    format (Option[str]): Output structure. ``None`` returns a concatenated string for local/finetuning use; ``"openai"`` returns OpenAI-style ``messages`` (and optional ``tools``); ``"anthropic"`` returns Anthropic-style ``system``/``messages``. ``OnlineChatModule`` passes the appropriate value. If both ``return_dict`` and ``format`` are provided, ``format`` takes precedence. Defaults to ``None``.
"""
        if return_dict and format is None:
            LOG.log_once('return_dict is deprecated, use format="openai" instead.', level='warning')
            format = 'openai'
        input = copy.deepcopy(input)
        if self._pre_hook:
            input, history, tools, label = self._pre_hook(input, history, tools, label)
        tools = self._resolve_dynamic(tools or self._tools, input, history)
        skills = self._resolve_dynamic(self._skills, input, history)
        for_chat_api = bool(format)
        instruction, input = self._get_instruction_and_input(input, for_chat_api=for_chat_api, tools=tools)
        history = self._get_histories(history, for_chat_api=for_chat_api)
        tools = self._get_tools(tools, for_chat_api=for_chat_api)
        self._check_values(instruction, input, history, tools)
        instruction, user_instruction = self._split_instruction(instruction)
        if format == 'anthropic':
            func = self._generate_prompt_anthropic_impl
        elif format == 'openai':
            func = self._generate_prompt_dict_impl
        else:
            func = self._generate_prompt_impl
        result = func(instruction, input, user_instruction, history, tools, label, skills)
        if self._show or show: LOG.info(result)
        return result

get_response(output, input=None)

Used to truncate the Prompt, keeping only valuable output.

Parameters:

  • output (str) –

    The output of the large model.

  • input (Option[str], default: None ) –

    The input of the large model. If specified and the output starts with this input, that leading copy is removed from the output. Defaults to None.
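
The two truncation paths can be sketched as a free function. This is a standalone approximation of the method shown in the source listing; the `split` parameter is an assumption standing in for the prompter's internal `_split` attribute, and the marker string is made up for illustration.

```python
from typing import Optional


def get_response(output: str, input: Optional[str] = None, split: Optional[str] = None) -> str:
    """Standalone approximation; `split` stands in for the prompter's `_split` attribute."""
    # Path 1: the model echoed the prompt, so drop that leading copy.
    if input and output.startswith(input):
        return output[len(input):]
    # Path 2: no echo detected; keep everything after the last split marker, if one is set.
    if split is None:
        return output
    return output.split(split)[-1]


echoed = get_response('Q: hi\nA: hello', input='Q: hi\n')
by_marker = get_response('prompt text<|assistant|>hello', split='<|assistant|>')
untouched = get_response('hello', input='unrelated')
```

Note that the prefix check only fires when the output starts with the input; an input that appears later in the output is left in place.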

Source code in lazyllm/components/prompter/builtinPrompt.py
    def get_response(self, output: str, input: Union[str, None] = None) -> str:
        """Used to truncate the Prompt, keeping only valuable output.

Args:
    output (str): The output of the large model.
    input (Option[str]): The input of the large model. If specified and the output starts with this input, that leading copy is removed from the output. Defaults to None.
"""
        if input and output.startswith(input):
            return output[len(input):]
        return output if getattr(self, '_split', None) is None else output.split(self._split)[-1]

pre_hook(func=None)

Sets a pre-processing hook function, allowing external custom processing of input data before prompt generation.

Parameters:

  • func (Optional[Callable], default: None ) –

    A callable object to be used as the pre-processing hook function, which receives and processes input data.

Returns:

  • LazyLLMPrompterBase: Returns the instance itself to support method chaining.
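
In the `generate_prompt` source, the hook is called as `self._pre_hook(input, history, tools, label)` and its result is unpacked into the same four names, so a hook should accept and return that 4-tuple. The sketch below models only this plumbing; `TinyPrompter` and `strip_input_hook` are hypothetical names, not part of LazyLLM.

```python
from typing import Any, Callable, Optional, Tuple


def strip_input_hook(input: Any, history: Any, tools: Any, label: Any) -> Tuple[Any, Any, Any, Any]:
    """Example hook: trims whitespace from string input and leaves the rest unchanged."""
    if isinstance(input, str):
        input = input.strip()
    return input, history, tools, label


class TinyPrompter:
    """Hypothetical stand-in for a prompter; only the hook plumbing is modelled."""

    def __init__(self) -> None:
        self._pre_hook: Optional[Callable] = None

    def pre_hook(self, func: Optional[Callable] = None) -> 'TinyPrompter':
        self._pre_hook = func
        return self  # returning self is what enables method chaining

    def generate_prompt(self, input, history=None, tools=None, label=None):
        if self._pre_hook:
            input, history, tools, label = self._pre_hook(input, history, tools, label)
        return f'User: {input}'


prompt = TinyPrompter().pre_hook(strip_input_hook).generate_prompt('  hello  ')
```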
Source code in lazyllm/components/prompter/builtinPrompt.py
    def pre_hook(self, func: Optional[Callable] = None):
        """Sets a pre-processing hook function, allowing external custom processing of input data before prompt generation.

Args:
    func (Optional[Callable]): A callable object to be used as the pre-processing hook function, which receives and processes input data.

**Returns:**

- LazyLLMPrompterBase: Returns the instance itself to support method chaining.
"""
        self._pre_hook = func
        return self

lazyllm.components.prompter.EmptyPrompter

Bases: LazyLLMPrompterBase

An empty prompt generator that inherits from LazyLLMPrompterBase, and directly returns the original input.

This class performs no formatting and is useful for debugging, testing, or as a placeholder.

Examples:

>>> from lazyllm.components.prompter import EmptyPrompter
>>> prompter = EmptyPrompter()
>>> prompter.generate_prompt("Hello LazyLLM")
'Hello LazyLLM'
>>> prompter.generate_prompt({"query": "Tell me a joke"})
{'query': 'Tell me a joke'}
>>> # Even with additional parameters, the input is returned unchanged
>>> prompter.generate_prompt("No-op", history=[["Hi", "Hello"]], tools=[{"name": "search"}], label="debug")
'No-op'

Source code in lazyllm/components/prompter/builtinPrompt.py
class EmptyPrompter(LazyLLMPrompterBase):
    """An empty prompt generator that inherits from `LazyLLMPrompterBase`, and directly returns the original input.

This class performs no formatting and is useful for debugging, testing, or as a placeholder.


Examples:
    >>> from lazyllm.components.prompter import EmptyPrompter
    >>> prompter = EmptyPrompter()
    >>> prompter.generate_prompt("Hello LazyLLM")
    'Hello LazyLLM'
    >>> prompter.generate_prompt({"query": "Tell me a joke"})
    {'query': 'Tell me a joke'}
    >>> # Even with additional parameters, the input is returned unchanged
    >>> prompter.generate_prompt("No-op", history=[["Hi", "Hello"]], tools=[{"name": "search"}], label="debug")
    'No-op'
    """

    def generate_prompt(self, input, history=None, tools=None, label=None, *, show=False,
                        return_dict: bool = False, format=None):
        """A prompt passthrough implementation that inherits from `LazyLLMPrompterBase`.

This method directly returns the input without any formatting. Useful for debugging, testing, or placeholder use.

Args:
    input (Any): The input to be returned directly as the prompt.
    history (Option[List[List | Dict]]): Dialogue history, ignored. Defaults to None.
    tools (Option[List[Dict]]): Tool definitions, ignored. Defaults to None.
    label (Option[str]): Label, ignored. Defaults to None.
    show (bool): Whether to print the returned prompt. Defaults to False.
    return_dict (bool): Deprecated; prefer ``format="openai"``. When ``format`` is ``None`` and this is True, behaves like ``format="openai"`` with a one-time deprecation warning.
    format (Option[str]): If set (e.g. ``"openai"``), returns ``{"messages": [{"role": "user", "content": input}]}``; otherwise returns ``input`` unchanged. If both ``return_dict`` and ``format`` are provided, ``format`` takes precedence.
"""
        if return_dict and format is None:
            LOG.log_once('return_dict is deprecated, use format="openai" instead.', level='warning')
            format = 'openai'
        if format:
            return {'messages': [{'role': 'user', 'content': input}]}
        if self._show or show: LOG.info(input)
        return input

generate_prompt(input, history=None, tools=None, label=None, *, show=False, return_dict=False, format=None)

A prompt passthrough implementation that inherits from LazyLLMPrompterBase.

This method directly returns the input without any formatting. Useful for debugging, testing, or placeholder use.

Parameters:

  • input (Any) –

    The input to be returned directly as the prompt.

  • history (Option[List[List | Dict]], default: None ) –

    Dialogue history, ignored. Defaults to None.

  • tools (Option[List[Dict]], default: None ) –

    Tool definitions, ignored. Defaults to None.

  • label (Option[str], default: None ) –

    Label, ignored. Defaults to None.

  • show (bool, default: False ) –

    Whether to print the returned prompt. Defaults to False.

  • return_dict (bool, default: False ) –

    Deprecated; prefer format="openai". When format is None and this is True, behaves like format="openai" with a one-time deprecation warning.

  • format (Option[str], default: None ) –

    If set (e.g. "openai"), returns {"messages": [{"role": "user", "content": input}]}; otherwise returns input unchanged. If both return_dict and format are provided, format takes precedence.
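
The branch on `format` can be sketched in a few lines. `passthrough_prompt` is a hypothetical free-function approximation of the method in the source listing; it omits the deprecation warning and logging.

```python
from typing import Any, Optional


def passthrough_prompt(input: Any, format: Optional[str] = None) -> Any:
    """Hypothetical approximation of the passthrough behaviour."""
    # Any truthy format value wraps the input as a single user message.
    if format:
        return {'messages': [{'role': 'user', 'content': input}]}
    # Otherwise the input is returned as-is.
    return input


plain = passthrough_prompt('Hello LazyLLM')
wrapped = passthrough_prompt('Hello LazyLLM', format='openai')
```

In the source listing the check is simply `if format:`, so under this reading any non-empty format string produces the same single-message dict.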

Source code in lazyllm/components/prompter/builtinPrompt.py
    def generate_prompt(self, input, history=None, tools=None, label=None, *, show=False,
                        return_dict: bool = False, format=None):
        """A prompt passthrough implementation that inherits from `LazyLLMPrompterBase`.

This method directly returns the input without any formatting. Useful for debugging, testing, or placeholder use.

Args:
    input (Any): The input to be returned directly as the prompt.
    history (Option[List[List | Dict]]): Dialogue history, ignored. Defaults to None.
    tools (Option[List[Dict]]): Tool definitions, ignored. Defaults to None.
    label (Option[str]): Label, ignored. Defaults to None.
    show (bool): Whether to print the returned prompt. Defaults to False.
    return_dict (bool): Deprecated; prefer ``format="openai"``. When ``format`` is ``None`` and this is True, behaves like ``format="openai"`` with a one-time deprecation warning.
    format (Option[str]): If set (e.g. ``"openai"``), returns ``{"messages": [{"role": "user", "content": input}]}``; otherwise returns ``input`` unchanged. If both ``return_dict`` and ``format`` are provided, ``format`` takes precedence.
"""
        if return_dict and format is None:
            LOG.log_once('return_dict is deprecated, use format="openai" instead.', level='warning')
            format = 'openai'
        if format:
            return {'messages': [{'role': 'user', 'content': input}]}
        if self._show or show: LOG.info(input)
        return input

lazyllm.components.Prompter

Bases: object

Prompt generator class for LLM input formatting. Supports template-based prompting, history injection, and response extraction.

This class allows prompts to be defined via string templates, loaded from dicts, files, or predefined names. It supports history-aware formatting for multi-turn conversations and adapts to both mapping and string input types.

Parameters:

  • prompt (Optional[str], default: None ) –

    Prompt template string with format placeholders.

  • response_split (Optional[str], default: None ) –

    Optional delimiter to split model response and extract useful output.

  • chat_prompt (Optional[str], default: None ) –

    Chat template string, must contain a history placeholder.

  • history_symbol (str, default: 'llm_chat_history' ) –

    Name of the placeholder for historical messages, default is 'llm_chat_history'.

  • eoa (Optional[str], default: None ) –

    Delimiter placed between consecutive user-assistant turns in the history.

  • eoh (Optional[str], default: None ) –

    Delimiter placed between the user message and the assistant reply within one turn.

  • show (bool, default: False ) –

    Whether to print the final prompt when generating. Default is False.
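
How `eoa` and `eoh` combine into the history string can be sketched independently of the class. In the source listing the history is built as `self._eoa.join([self._eoh.join(h) for h in history])`; `join_history` below is a hypothetical helper that reproduces just that expression.

```python
from typing import List


def join_history(history: List[List[str]], eoa: str, eoh: str) -> str:
    """Hypothetical helper reproducing the join expression from the source listing."""
    # eoh joins the user message and assistant reply inside one turn;
    # eoa joins consecutive turns.
    return eoa.join(eoh.join(turn) for turn in history)


chat_prompt = 'Instruction: {instruction}\nHistory:\n{llm_chat_history}'
rendered = chat_prompt.format(
    instruction='Translate this.',
    llm_chat_history=join_history(
        [['hello', '你好'], ['how are you', '你好吗']], eoa='</s>', eoh='|'
    ),
)
```

Because both delimiters are passed to `str.join`, a history can only be formatted when `eoa` and `eoh` are both strings rather than `None`.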

Examples:

>>> from lazyllm import Prompter
>>> p = Prompter(prompt="Answer the following: {question}")
>>> p.generate_prompt("What is AI?")
'Answer the following: What is AI?'
>>> p.generate_prompt({"question": "Define machine learning"})
'Answer the following: Define machine learning'
>>> p = Prompter(
...     prompt="Instruction: {instruction}",
...     chat_prompt="Instruction: {instruction}\nHistory:\n{llm_chat_history}",
...     history_symbol="llm_chat_history",
...     eoa="</s>",
...     eoh="|"
... )
>>> p.generate_prompt(
...     input={"instruction": "Translate this."},
...     history=[["hello", "你好"], ["how are you", "你好吗"]]
... )
'Instruction: Translate this.\nHistory:\nhello|你好</s>how are you|你好吗'
>>> prompt_conf = {
...     "prompt": "Task: {task}",
...     "response_split": "---"
... }
>>> p = Prompter.from_dict(prompt_conf)
>>> p.generate_prompt("Summarize this article.")
'Task: Summarize this article.'
>>> full_output = "Task: Summarize this article.---This is the summary."
>>> p.get_response(full_output)
'This is the summary.'

Source code in lazyllm/components/prompter/prompter.py
class Prompter(object):
    """Prompt generator class for LLM input formatting. Supports template-based prompting, history injection, and response extraction.

This class allows prompts to be defined via string templates, loaded from dicts, files, or predefined names.
It supports history-aware formatting for multi-turn conversations and adapts to both mapping and string input types.

Args:
    prompt (Optional[str]): Prompt template string with format placeholders.
    response_split (Optional[str]): Optional delimiter to split model response and extract useful output.
    chat_prompt (Optional[str]): Chat template string, must contain a history placeholder.
    history_symbol (str): Name of the placeholder for historical messages, default is 'llm_chat_history'.
    eoa (Optional[str]): Delimiter placed between consecutive user-assistant turns in the history.
    eoh (Optional[str]): Delimiter placed between the user message and the assistant reply within one turn.
    show (bool): Whether to print the final prompt when generating. Default is False.


Examples:
    >>> from lazyllm import Prompter

    >>> p = Prompter(prompt="Answer the following: {question}")
    >>> p.generate_prompt("What is AI?")
    'Answer the following: What is AI?'

    >>> p.generate_prompt({"question": "Define machine learning"})
    'Answer the following: Define machine learning'

    >>> p = Prompter(
    ...     prompt="Instruction: {instruction}",
    ...     chat_prompt="Instruction: {instruction}\\nHistory:\\n{llm_chat_history}",
    ...     history_symbol="llm_chat_history",
    ...     eoa="</s>",
    ...     eoh="|"
    ... )
    >>> p.generate_prompt(
    ...     input={"instruction": "Translate this."},
    ...     history=[["hello", "你好"], ["how are you", "你好吗"]]
    ... )
    'Instruction: Translate this.\\nHistory:\\nhello|你好</s>how are you|你好吗'

    >>> prompt_conf = {
    ...     "prompt": "Task: {task}",
    ...     "response_split": "---"
    ... }
    >>> p = Prompter.from_dict(prompt_conf)
    >>> p.generate_prompt("Summarize this article.")
    'Task: Summarize this article.'

    >>> full_output = "Task: Summarize this article.---This is the summary."
    >>> p.get_response(full_output)
    'This is the summary.'
    """
    def __init__(self, prompt=None, response_split=None, *, chat_prompt=None,
                 history_symbol='llm_chat_history', eoa=None, eoh=None, show=False):
        self._prompt, self._response_split = prompt, response_split
        self._chat_prompt = chat_prompt
        self._history_symbol, self._eoa, self._eoh = history_symbol, eoa, eoh
        self._show = show
        self._prompt_keys = list(set(re.findall(r'\{(\w+)\}', self._prompt))) if prompt else []
        if chat_prompt is not None:
            chat_keys = set(re.findall(r'\{(\w+)\}', self._chat_prompt))
            assert set(self._prompt_keys).issubset(chat_keys)
            assert chat_keys - set(self._prompt_keys) == set([self._history_symbol])
            self.use_history = True
        else:
            self.use_history = history_symbol in self._prompt_keys
            if self.use_history:
                self._prompt_keys.pop(self._prompt_keys.index(history_symbol))
                self._chat_prompt = self._prompt

    @classmethod
    def from_dict(cls, prompt, *, show=False):
        """Initializes a Prompter instance from a prompt configuration dictionary.

Args:
    prompt (Dict): A dictionary containing prompt-related configuration. Must include 'prompt' key.
    show (bool): Whether to display the generated prompt. Defaults to False.

**Returns:**

- Prompter: An initialized Prompter instance.
"""
        assert isinstance(prompt, dict)
        return cls(**prompt, show=show)

    @classmethod
    def from_template(cls, template_name, *, show=False):
        """Loads prompt configuration from a template name and initializes a Prompter instance.

Args:
    template_name (str): Name of the template. Must exist in the `templates` dictionary.
    show (bool): Whether to display the generated prompt. Defaults to False.

**Returns:**

- Prompter: An initialized Prompter instance.
"""
        return cls.from_dict(templates[template_name], show=show)

    @classmethod
    def from_file(cls, fname, *, show=False):
        """Loads prompt configuration from a JSON file and initializes a Prompter instance.

Args:
    fname (str): Path to the JSON configuration file.
    show (bool): Whether to display the generated prompt. Defaults to False.

Returns:
    Prompter: An initialized Prompter instance.
"""
        with open(fname) as fp:
            return cls.from_dict(json.load(fp), show=show)

    @classmethod
    def empty(cls):
        """Creates an empty Prompter instance.

Returns:
    Prompter: A Prompter instance without any prompt configuration.
"""
        return cls()

    def _is_empty(self):
        return self._prompt is None

    def generate_prompt(self, input, history=None, tools=None, label=None, show=False):
        """Generates a formatted prompt string based on input and optional conversation history.

Args:
    input (Union[str, Dict]): User input. Can be a single string or a dictionary with multiple fields.
    history (Optional[List[List[str]]]): Multi-turn dialogue history, e.g., [['u1', 'a1'], ['u2', 'a2']].
    tools (Optional[Any]): Not supported. Must be None.
    label (Optional[str]): Optional label to append to the prompt, commonly used for training.
    show (bool): Whether to print the generated prompt. Defaults to False.

Returns:
    str: The final formatted prompt string.
"""
        if not self._is_empty():
            assert tools is None
            # datasets.formatting.formatting.LazyDict is used in transformers
            if not isinstance(input, collections.abc.Mapping):
                assert len(self._prompt_keys) == 1, (
                    f'invalid prompt `{self._prompt}` for <{type(input)}> input `{input}`')
                input = {self._prompt_keys[0]: input}
            try:
                if self.use_history and isinstance(history, list) and len(history) > 0:
                    assert isinstance(history[0], list), 'history must be list of list'
                    input[self._history_symbol] = self._eoa.join([self._eoh.join(h) for h in history])
                    input = self._chat_prompt.format(**input)
                else:
                    if self.use_history: input[self._history_symbol] = ''
                    input = self._prompt.format(**input)
            except Exception:
                raise RuntimeError(f'Generate prompt failed, and prompt is {self._prompt}; chat-prompt'
                                   f' is {self._chat_prompt}; input is {input}; history is {history}')
            if label: input += label
        if self._show or show: LOG.info(input)
        return input

    def get_response(self, response, input=None):
        """Extracts the actual model answer from the full response returned by an LLM.

Args:
    response (str): The full raw output from the model.
    input (Optional[str]): If the response starts with the input, that part will be removed.

Returns:
    str: The cleaned model response.
"""
        if input and response.startswith(input):
            return response[len(input):]
        return response if self._response_split is None else response.split(self._response_split)[-1]

from_dict(prompt, *, show=False) classmethod

Initializes a Prompter instance from a prompt configuration dictionary.

Parameters:

  • prompt (Dict) –

    A dictionary containing prompt-related configuration. Must include 'prompt' key.

  • show (bool, default: False ) –

    Whether to display the generated prompt. Defaults to False.

Returns:

  • Prompter

    An initialized Prompter instance.

Source code in lazyllm/components/prompter/prompter.py
    @classmethod
    def from_dict(cls, prompt, *, show=False):
        """Initializes a Prompter instance from a prompt configuration dictionary.

Args:
    prompt (Dict): A dictionary containing prompt-related configuration. Must include 'prompt' key.
    show (bool): Whether to display the generated prompt. Defaults to False.

Returns:
    Prompter: An initialized Prompter instance.
"""
        assert isinstance(prompt, dict)
        return cls(**prompt, show=show)

from_template(template_name, *, show=False) classmethod

Loads prompt configuration from a template name and initializes a Prompter instance.

Parameters:

  • template_name (str) –

    Name of the template. Must exist in the templates dictionary.

  • show (bool, default: False ) –

    Whether to display the generated prompt. Defaults to False.

Returns:

  • Prompter

    An initialized Prompter instance.

Source code in lazyllm/components/prompter/prompter.py
    @classmethod
    def from_template(cls, template_name, *, show=False):
        """Loads prompt configuration from a template name and initializes a Prompter instance.

Args:
    template_name (str): Name of the template. Must exist in the `templates` dictionary.
    show (bool): Whether to display the generated prompt. Defaults to False.

Returns:
    Prompter: An initialized Prompter instance.
"""
        return cls.from_dict(templates[template_name], show=show)

from_file(fname, *, show=False) classmethod

Loads prompt configuration from a JSON file and initializes a Prompter instance.

Parameters:

  • fname (str) –

    Path to the JSON configuration file.

  • show (bool, default: False ) –

    Whether to display the generated prompt. Defaults to False.

Returns:

  • Prompter

    An initialized Prompter instance.

Source code in lazyllm/components/prompter/prompter.py
    @classmethod
    def from_file(cls, fname, *, show=False):
        """Loads prompt configuration from a JSON file and initializes a Prompter instance.

Args:
    fname (str): Path to the JSON configuration file.
    show (bool): Whether to display the generated prompt. Defaults to False.

Returns:
    Prompter: An initialized Prompter instance.
"""
        with open(fname) as fp:
            return cls.from_dict(json.load(fp), show=show)

empty() classmethod

Creates an empty Prompter instance.

Returns:

  • Prompter

    A Prompter instance without any prompt configuration.

Source code in lazyllm/components/prompter/prompter.py
    @classmethod
    def empty(cls):
        """Creates an empty Prompter instance.

Returns:
    Prompter: A Prompter instance without any prompt configuration.
"""
        return cls()

generate_prompt(input, history=None, tools=None, label=None, show=False)

Generates a formatted prompt string based on input and optional conversation history.

Parameters:

  • input (Union[str, Dict]) –

    User input. Can be a single string or a dictionary with multiple fields.

  • history (Optional[List[List[str]]], default: None ) –

    Multi-turn dialogue history, e.g., [['u1', 'a1'], ['u2', 'a2']].

  • tools (Optional[Any], default: None ) –

    Not supported. Must be None.

  • label (Optional[str], default: None ) –

    Optional label to append to the prompt, commonly used for training.

  • show (bool, default: False ) –

    Whether to print the generated prompt. Defaults to False.

Returns:

  • str

    The final formatted prompt string.
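
The handling of string versus dictionary input can be sketched without importing LazyLLM. This is a minimal illustration under stated assumptions: `normalize_input` is a hypothetical helper written for this example, and it raises `ValueError` where the library itself uses an assertion.

```python
import collections.abc
import re
from typing import Any, Dict


def normalize_input(template: str, user_input: Any) -> Dict[str, Any]:
    """Hypothetical helper: wrap a bare value under the template's only key."""
    # Mapping inputs are used as-is (copied here to avoid mutating the caller's dict).
    if isinstance(user_input, collections.abc.Mapping):
        return dict(user_input)
    # A bare value is only unambiguous when the template has exactly one slot.
    keys = list(set(re.findall(r"\{(\w+)\}", template)))
    if len(keys) != 1:
        raise ValueError(f"template {template!r} needs a mapping input, got {user_input!r}")
    return {keys[0]: user_input}


template = "Answer the following: {question}"
from_str = template.format(**normalize_input(template, "What is AI?"))
from_mapping = template.format(**normalize_input(template, {"question": "What is AI?"}))
```

Both calls produce the same prompt, which is why a plain string is accepted only for single-slot templates.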

Source code in lazyllm/components/prompter/prompter.py
    def generate_prompt(self, input, history=None, tools=None, label=None, show=False):
        """Generates a formatted prompt string based on input and optional conversation history.

Args:
    input (Union[str, Dict]): User input. Can be a single string or a dictionary with multiple fields.
    history (Optional[List[List[str]]]): Multi-turn dialogue history, e.g., [['u1', 'a1'], ['u2', 'a2']].
    tools (Optional[Any]): Not supported. Must be None.
    label (Optional[str]): Optional label to append to the prompt, commonly used for training.
    show (bool): Whether to print the generated prompt. Defaults to False.

Returns:
    str: The final formatted prompt string.
"""
        if not self._is_empty():
            assert tools is None
            # datasets.formatting.formatting.LazyDict is used in transformers
            if not isinstance(input, collections.abc.Mapping):
                assert len(self._prompt_keys) == 1, (
                    f'invalid prompt `{self._prompt}` for <{type(input)}> input `{input}`')
                input = {self._prompt_keys[0]: input}
            try:
                if self.use_history and isinstance(history, list) and len(history) > 0:
                    assert isinstance(history[0], list), 'history must be list of list'
                    input[self._history_symbol] = self._eoa.join([self._eoh.join(h) for h in history])
                    input = self._chat_prompt.format(**input)
                else:
                    if self.use_history: input[self._history_symbol] = ''
                    input = self._prompt.format(**input)
            except Exception:
                raise RuntimeError(f'Generate prompt failed, and prompt is {self._prompt}; chat-prompt'
                                   f' is {self._chat_prompt}; input is {input}; history is {history}')
            if label: input += label
        if self._show or show: LOG.info(input)
        return input

get_response(response, input=None)

Extracts the actual model answer from the full response returned by an LLM.

Parameters:

  • response (str) –

    The full raw output from the model.

  • input (Optional[str], default: None ) –

    If the response starts with the input, that part will be removed.

Returns:

  • str

    The cleaned model response.
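
The extraction paths described above can be sketched as a standalone function. The name `extract_response` and its `response_split` argument are stand-ins chosen for this example (in the class, the delimiter is stored on the instance), so treat this as an illustration rather than the library's API.

```python
from typing import Optional


def extract_response(response: str, input: Optional[str] = None,
                     response_split: Optional[str] = None) -> str:
    """Standalone stand-in for the extraction logic (hypothetical helper)."""
    # Path 1: the model echoed the prompt, so drop that prefix.
    if input and response.startswith(input):
        return response[len(input):]
    # Path 2: no delimiter configured, return the text untouched.
    if response_split is None:
        return response
    # Path 3: keep only the text after the last delimiter.
    return response.split(response_split)[-1]


echoed = extract_response("Q: hi\nA: hello", input="Q: hi\n")
split = extract_response("Task: X---summary", response_split="---")
```

Note that prefix stripping takes priority: when `input` matches the start of the response, the delimiter is not applied at all.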

Source code in lazyllm/components/prompter/prompter.py
    def get_response(self, response, input=None):
        """Extracts the actual model answer from the full response returned by an LLM.

Args:
    response (str): The full raw output from the model.
    input (Optional[str]): If the response starts with the input, that part will be removed.

Returns:
    str: The cleaned model response.
"""
        if input and response.startswith(input):
            return response[len(input):]
        return response if self._response_split is None else response.split(self._response_split)[-1]


lazyllm.components.prompter.EmptyPrompter

Bases: LazyLLMPrompterBase

An empty prompt generator that inherits from LazyLLMPrompterBase, and directly returns the original input.

This class performs no formatting and is useful for debugging, testing, or as a placeholder.

Examples:

>>> from lazyllm.components.prompter import EmptyPrompter
>>> prompter = EmptyPrompter()
>>> prompter.generate_prompt("Hello LazyLLM")
'Hello LazyLLM'
>>> prompter.generate_prompt({"query": "Tell me a joke"})
{'query': 'Tell me a joke'}
>>> # Even with additional parameters, the input is returned unchanged
>>> prompter.generate_prompt("No-op", history=[["Hi", "Hello"]], tools=[{"name": "search"}], label="debug")
'No-op'
Source code in lazyllm/components/prompter/builtinPrompt.py
class EmptyPrompter(LazyLLMPrompterBase):
    """An empty prompt generator that inherits from `LazyLLMPrompterBase`, and directly returns the original input.

This class performs no formatting and is useful for debugging, testing, or as a placeholder.


Examples:
    >>> from lazyllm.components.prompter import EmptyPrompter
    >>> prompter = EmptyPrompter()
    >>> prompter.generate_prompt("Hello LazyLLM")
    'Hello LazyLLM'
    >>> prompter.generate_prompt({"query": "Tell me a joke"})
    {'query': 'Tell me a joke'}
    >>> # Even with additional parameters, the input is returned unchanged
    >>> prompter.generate_prompt("No-op", history=[["Hi", "Hello"]], tools=[{"name": "search"}], label="debug")
    'No-op'
    """

    def generate_prompt(self, input, history=None, tools=None, label=None, *, show=False,
                        return_dict: bool = False, format=None):
        """A prompt passthrough implementation that inherits from `LazyLLMPrompterBase`.

This method directly returns the input without any formatting. Useful for debugging, testing, or placeholder use.

Args:
    input (Any): The input to be returned directly as the prompt.
    history (Optional[List[List | Dict]]): Dialogue history, ignored. Defaults to None.
    tools (Optional[List[Dict]]): Tool definitions, ignored. Defaults to None.
    label (Optional[str]): Label, ignored. Defaults to None.
    show (bool): Whether to print the returned prompt. Defaults to False.
    return_dict (bool): Deprecated; prefer ``format="openai"``. When ``format`` is ``None`` and this is True, behaves like ``format="openai"`` with a one-time deprecation warning.
    format (Optional[str]): If set (e.g. ``"openai"``), returns ``{"messages": [{"role": "user", "content": input}]}``; otherwise returns ``input`` unchanged. If both ``return_dict`` and ``format`` are provided, ``format`` takes precedence.
"""
        if return_dict and format is None:
            LOG.log_once('return_dict is deprecated, use format="openai" instead.', level='warning')
            format = 'openai'
        if format:
            return {'messages': [{'role': 'user', 'content': input}]}
        if self._show or show: LOG.info(input)
        return input

generate_prompt(input, history=None, tools=None, label=None, *, show=False, return_dict=False, format=None)

A prompt passthrough implementation that inherits from LazyLLMPrompterBase.

This method directly returns the input without any formatting. Useful for debugging, testing, or placeholder use.

Parameters:

  • input (Any) –

    The input to be returned directly as the prompt.

  • history (Optional[List[List | Dict]], default: None ) –

    Dialogue history, ignored. Defaults to None.

  • tools (Optional[List[Dict]], default: None ) –

    Tool definitions, ignored. Defaults to None.

  • label (Optional[str], default: None ) –

    Label, ignored. Defaults to None.

  • show (bool, default: False ) –

    Whether to print the returned prompt. Defaults to False.

  • return_dict (bool, default: False ) –

    Deprecated; prefer format="openai". When format is None and this is True, behaves like format="openai" with a one-time deprecation warning.

  • format (Optional[str], default: None ) –

    If set (e.g. "openai"), returns {"messages": [{"role": "user", "content": input}]}; otherwise returns input unchanged. If both return_dict and format are provided, format takes precedence.
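
The interaction between `return_dict` and `format` can be sketched as follows. `passthrough_prompt` is a hypothetical standalone function, and it uses the standard `warnings` module as a stand-in for the library's own one-time log message.

```python
import warnings
from typing import Any, Optional


def passthrough_prompt(input: Any, *, return_dict: bool = False,
                       format: Optional[str] = None) -> Any:
    """Hypothetical stand-in mirroring the passthrough behaviour."""
    if return_dict and format is None:
        # Deprecated flag: treated as format="openai".
        warnings.warn('return_dict is deprecated, use format="openai" instead.',
                      DeprecationWarning, stacklevel=2)
        format = "openai"
    if format:
        # Any truthy format value wraps the input as a single user message.
        return {"messages": [{"role": "user", "content": input}]}
    return input


plain = passthrough_prompt("Hello LazyLLM")
wrapped = passthrough_prompt("Hello LazyLLM", format="openai")
```

In this sketch, as in the listing below, the check is on truthiness, so any non-empty `format` string yields the message dictionary.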

Source code in lazyllm/components/prompter/builtinPrompt.py
    def generate_prompt(self, input, history=None, tools=None, label=None, *, show=False,
                        return_dict: bool = False, format=None):
        """A prompt passthrough implementation that inherits from `LazyLLMPrompterBase`.

This method directly returns the input without any formatting. Useful for debugging, testing, or placeholder use.

Args:
    input (Any): The input to be returned directly as the prompt.
    history (Optional[List[List | Dict]]): Dialogue history, ignored. Defaults to None.
    tools (Optional[List[Dict]]): Tool definitions, ignored. Defaults to None.
    label (Optional[str]): Label, ignored. Defaults to None.
    show (bool): Whether to print the returned prompt. Defaults to False.
    return_dict (bool): Deprecated; prefer ``format="openai"``. When ``format`` is ``None`` and this is True, behaves like ``format="openai"`` with a one-time deprecation warning.
    format (Optional[str]): If set (e.g. ``"openai"``), returns ``{"messages": [{"role": "user", "content": input}]}``; otherwise returns ``input`` unchanged. If both ``return_dict`` and ``format`` are provided, ``format`` takes precedence.
"""
        if return_dict and format is None:
            LOG.log_once('return_dict is deprecated, use format="openai" instead.', level='warning')
            format = 'openai'
        if format:
            return {'messages': [{'role': 'user', 'content': input}]}
        if self._show or show: LOG.info(input)
        return input

lazyllm.components.Prompter

Bases: object

Prompt generator class for LLM input formatting. Supports template-based prompting, history injection, and response extraction.

This class allows prompts to be defined via string templates, loaded from dicts, files, or predefined names. It supports history-aware formatting for multi-turn conversations and adapts to both mapping and string input types.

Parameters:

  • prompt (Optional[str], default: None ) –

    Prompt template string with format placeholders.

  • response_split (Optional[str], default: None ) –

    Optional delimiter to split model response and extract useful output.

  • chat_prompt (Optional[str], default: None ) –

    Chat template string, must contain a history placeholder.

  • history_symbol (str, default: 'llm_chat_history' ) –

    Name of the placeholder for historical messages, default is 'llm_chat_history'.

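  • eoa (Optional[str], default: None ) –

    Delimiter placed between consecutive history turns, i.e. after each assistant reply. Must be set when history is passed.

  • eoh (Optional[str], default: None ) –

    Delimiter placed between the user message and the assistant reply within one history turn. Must be set when history is passed.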

  • show (bool, default: False ) –

    Whether to print the final prompt when generating. Default is False.

Examples:

>>> from lazyllm import Prompter
>>> p = Prompter(prompt="Answer the following: {question}")
>>> p.generate_prompt("What is AI?")
'Answer the following: What is AI?'
>>> p.generate_prompt({"question": "Define machine learning"})
'Answer the following: Define machine learning'
>>> p = Prompter(
...     prompt="Instruction: {instruction}",
...     chat_prompt="Instruction: {instruction}\nHistory:\n{llm_chat_history}",
...     history_symbol="llm_chat_history",
...     eoa="</s>",
...     eoh="|"
... )
>>> p.generate_prompt(
...     input={"instruction": "Translate this."},
...     history=[["hello", "你好"], ["how are you", "你好吗"]]
... )
'Instruction: Translate this.\nHistory:\nhello|你好</s>how are you|你好吗'
>>> prompt_conf = {
...     "prompt": "Task: {task}",
...     "response_split": "---"
... }
>>> p = Prompter.from_dict(prompt_conf)
>>> p.generate_prompt("Summarize this article.")
'Task: Summarize this article.'
>>> full_output = "Task: Summarize this article.---This is the summary."
>>> p.get_response(full_output)
'This is the summary.'
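
The relationship the constructor expects between `prompt` and `chat_prompt` can be checked in isolation. This is a small sketch using only the standard library; `placeholder_keys` is a hypothetical helper named for this example.

```python
import re


def placeholder_keys(template: str) -> set:
    """Return the set of {name} placeholders found in a template."""
    return set(re.findall(r"\{(\w+)\}", template))


prompt = "Instruction: {instruction}"
chat_prompt = "Instruction: {instruction}\nHistory:\n{llm_chat_history}"
history_symbol = "llm_chat_history"

prompt_keys = placeholder_keys(prompt)
chat_keys = placeholder_keys(chat_prompt)

# chat_prompt must contain every prompt key plus exactly the history placeholder.
is_valid = prompt_keys <= chat_keys and chat_keys - prompt_keys == {history_symbol}
```

A `chat_prompt` that omits the history placeholder, or adds any other extra slot, fails this check.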
Source code in lazyllm/components/prompter/prompter.py
class Prompter(object):
    """Prompt generator class for LLM input formatting. Supports template-based prompting, history injection, and response extraction.

This class allows prompts to be defined via string templates, loaded from dicts, files, or predefined names.
It supports history-aware formatting for multi-turn conversations and adapts to both mapping and string input types.

Args:
    prompt (Optional[str]): Prompt template string with format placeholders.
    response_split (Optional[str]): Optional delimiter to split model response and extract useful output.
    chat_prompt (Optional[str]): Chat template string, must contain a history placeholder.
    history_symbol (str): Name of the placeholder for historical messages, default is 'llm_chat_history'.
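    eoa (Optional[str]): Delimiter placed between consecutive history turns, i.e. after each assistant reply. Must be set when history is passed.
    eoh (Optional[str]): Delimiter placed between the user message and the assistant reply within one history turn. Must be set when history is passed.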
    show (bool): Whether to print the final prompt when generating. Default is False.


Examples:
    >>> from lazyllm import Prompter

    >>> p = Prompter(prompt="Answer the following: {question}")
    >>> p.generate_prompt("What is AI?")
    'Answer the following: What is AI?'

    >>> p.generate_prompt({"question": "Define machine learning"})
    'Answer the following: Define machine learning'

    >>> p = Prompter(
    ...     prompt="Instruction: {instruction}",
    ...     chat_prompt="Instruction: {instruction}\\nHistory:\\n{llm_chat_history}",
    ...     history_symbol="llm_chat_history",
    ...     eoa="</s>",
    ...     eoh="|"
    ... )
    >>> p.generate_prompt(
    ...     input={"instruction": "Translate this."},
    ...     history=[["hello", "你好"], ["how are you", "你好吗"]]
    ... )
    'Instruction: Translate this.\\nHistory:\\nhello|你好</s>how are you|你好吗'

    >>> prompt_conf = {
    ...     "prompt": "Task: {task}",
    ...     "response_split": "---"
    ... }
    >>> p = Prompter.from_dict(prompt_conf)
    >>> p.generate_prompt("Summarize this article.")
    'Task: Summarize this article.'

    >>> full_output = "Task: Summarize this article.---This is the summary."
    >>> p.get_response(full_output)
    'This is the summary.'
    """
    def __init__(self, prompt=None, response_split=None, *, chat_prompt=None,
                 history_symbol='llm_chat_history', eoa=None, eoh=None, show=False):
        self._prompt, self._response_split = prompt, response_split
        self._chat_prompt = chat_prompt
        self._history_symbol, self._eoa, self._eoh = history_symbol, eoa, eoh
        self._show = show
        self._prompt_keys = list(set(re.findall(r'\{(\w+)\}', self._prompt))) if prompt else []
        if chat_prompt is not None:
            chat_keys = set(re.findall(r'\{(\w+)\}', self._chat_prompt))
            assert set(self._prompt_keys).issubset(chat_keys)
            assert chat_keys - set(self._prompt_keys) == set([self._history_symbol])
            self.use_history = True
        else:
            self.use_history = history_symbol in self._prompt_keys
            if self.use_history:
                self._prompt_keys.pop(self._prompt_keys.index(history_symbol))
                self._chat_prompt = self._prompt

    @classmethod
    def from_dict(cls, prompt, *, show=False):
        """Initializes a Prompter instance from a prompt configuration dictionary.

Args:
    prompt (Dict): A dictionary containing prompt-related configuration. Must include 'prompt' key.
    show (bool): Whether to display the generated prompt. Defaults to False.

Returns:
    Prompter: An initialized Prompter instance.
"""
        assert isinstance(prompt, dict)
        return cls(**prompt, show=show)

    @classmethod
    def from_template(cls, template_name, *, show=False):
        """Loads prompt configuration from a template name and initializes a Prompter instance.

Args:
    template_name (str): Name of the template. Must exist in the `templates` dictionary.
    show (bool): Whether to display the generated prompt. Defaults to False.

Returns:
    Prompter: An initialized Prompter instance.
"""
        return cls.from_dict(templates[template_name], show=show)

    @classmethod
    def from_file(cls, fname, *, show=False):
        """Loads prompt configuration from a JSON file and initializes a Prompter instance.

Args:
    fname (str): Path to the JSON configuration file.
    show (bool): Whether to display the generated prompt. Defaults to False.

Returns:
    Prompter: An initialized Prompter instance.
"""
        with open(fname) as fp:
            return cls.from_dict(json.load(fp), show=show)

    @classmethod
    def empty(cls):
        """Creates an empty Prompter instance.

Returns:
    Prompter: A Prompter instance without any prompt configuration.
"""
        return cls()

    def _is_empty(self):
        return self._prompt is None

    def generate_prompt(self, input, history=None, tools=None, label=None, show=False):
        """Generates a formatted prompt string based on input and optional conversation history.

Args:
    input (Union[str, Dict]): User input. Can be a single string or a dictionary with multiple fields.
    history (Optional[List[List[str]]]): Multi-turn dialogue history, e.g., [['u1', 'a1'], ['u2', 'a2']].
    tools (Optional[Any]): Not supported. Must be None.
    label (Optional[str]): Optional label to append to the prompt, commonly used for training.
    show (bool): Whether to print the generated prompt. Defaults to False.

Returns:
    str: The final formatted prompt string.
"""
        if not self._is_empty():
            assert tools is None
            # datasets.formatting.formatting.LazyDict is used in transformers
            if not isinstance(input, collections.abc.Mapping):
                assert len(self._prompt_keys) == 1, (
                    f'invalid prompt `{self._prompt}` for <{type(input)}> input `{input}`')
                input = {self._prompt_keys[0]: input}
            try:
                if self.use_history and isinstance(history, list) and len(history) > 0:
                    assert isinstance(history[0], list), 'history must be list of list'
                    input[self._history_symbol] = self._eoa.join([self._eoh.join(h) for h in history])
                    input = self._chat_prompt.format(**input)
                else:
                    if self.use_history: input[self._history_symbol] = ''
                    input = self._prompt.format(**input)
            except Exception:
                raise RuntimeError(f'Generate prompt failed, and prompt is {self._prompt}; chat-prompt'
                                   f' is {self._chat_prompt}; input is {input}; history is {history}')
            if label: input += label
        if self._show or show: LOG.info(input)
        return input

    def get_response(self, response, input=None):
        """Extracts the actual model answer from the full response returned by an LLM.

Args:
    response (str): The full raw output from the model.
    input (Optional[str]): If the response starts with the input, that part will be removed.

Returns:
    str: The cleaned model response.
"""
        if input and response.startswith(input):
            return response[len(input):]
        return response if self._response_split is None else response.split(self._response_split)[-1]

from_dict(prompt, *, show=False) classmethod

Initializes a Prompter instance from a prompt configuration dictionary.

Parameters:

  • prompt (Dict) –

    A dictionary containing prompt-related configuration. Must include 'prompt' key.

  • show (bool, default: False ) –

    Whether to display the generated prompt. Defaults to False.

Returns:

  • Prompter

    An initialized Prompter instance.

Source code in lazyllm/components/prompter/prompter.py
    @classmethod
    def from_dict(cls, prompt, *, show=False):
        """Initializes a Prompter instance from a prompt configuration dictionary.

Args:
    prompt (Dict): A dictionary containing prompt-related configuration. Must include 'prompt' key.
    show (bool): Whether to display the generated prompt. Defaults to False.

Returns:
    Prompter: An initialized Prompter instance.
"""
        assert isinstance(prompt, dict)
        return cls(**prompt, show=show)
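
Because the dictionary is forwarded as keyword arguments, its keys need to match the constructor's parameter names, and `show` should be passed separately rather than placed in the dictionary. The sketch below illustrates this with `MiniPrompter`, a hypothetical stand-in class that only copies the constructor signature shape; it is not the library class.

```python
class MiniPrompter:
    """Hypothetical stand-in with the same constructor signature shape."""

    def __init__(self, prompt=None, response_split=None, *, chat_prompt=None,
                 history_symbol="llm_chat_history", eoa=None, eoh=None, show=False):
        self.prompt, self.response_split, self.show = prompt, response_split, show

    @classmethod
    def from_dict(cls, prompt, *, show=False):
        assert isinstance(prompt, dict)
        return cls(**prompt, show=show)


p = MiniPrompter.from_dict({"prompt": "Task: {task}", "response_split": "---"}, show=True)

try:
    MiniPrompter.from_dict({"prompt": "Task: {task}", "show": True})
    duplicate_show_rejected = False
except TypeError:
    # `show` arrives twice: once from the dict, once from the keyword.
    duplicate_show_rejected = True
```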

from_template(template_name, *, show=False) classmethod

Loads prompt configuration from a template name and initializes a Prompter instance.

Parameters:

  • template_name (str) –

    Name of the template. Must exist in the templates dictionary.

  • show (bool, default: False ) –

    Whether to display the generated prompt. Defaults to False.

Returns:

  • Prompter

    An initialized Prompter instance.

Source code in lazyllm/components/prompter/prompter.py
    @classmethod
    def from_template(cls, template_name, *, show=False):
        """Loads prompt configuration from a template name and initializes a Prompter instance.

Args:
    template_name (str): Name of the template. Must exist in the `templates` dictionary.
    show (bool): Whether to display the generated prompt. Defaults to False.

Returns:
    Prompter: An initialized Prompter instance.
"""
        return cls.from_dict(templates[template_name], show=show)

from_file(fname, *, show=False) classmethod

Loads prompt configuration from a JSON file and initializes a Prompter instance.

Parameters:

  • fname (str) –

    Path to the JSON configuration file.

  • show (bool, default: False ) –

    Whether to display the generated prompt. Defaults to False.

Returns:

  • Prompter

    An initialized Prompter instance.
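
A configuration file for this method is a JSON object whose keys mirror the constructor's keyword arguments. The following sketch writes such a file to a temporary location and reads it back with the standard library only; the file contents are an assumed example, and the final `format` call stands in for what the loaded template would produce.

```python
import json
import os
import tempfile

# Keys mirror the constructor's keyword arguments; `show` is passed separately.
config = {"prompt": "Task: {task}", "response_split": "---"}

with tempfile.NamedTemporaryFile("w", suffix=".json", delete=False) as fp:
    json.dump(config, fp)
    path = fp.name

try:
    with open(path) as fp:
        loaded = json.load(fp)
finally:
    os.remove(path)

# With LazyLLM installed, `Prompter.from_file(path)` would consume such a file.
rendered = loaded["prompt"].format(task="Summarize this article.")
```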

Source code in lazyllm/components/prompter/prompter.py
    @classmethod
    def from_file(cls, fname, *, show=False):
        """Loads prompt configuration from a JSON file and initializes a Prompter instance.

Args:
    fname (str): Path to the JSON configuration file.
    show (bool): Whether to display the generated prompt. Defaults to False.

Returns:
    Prompter: An initialized Prompter instance.
"""
        with open(fname) as fp:
            return cls.from_dict(json.load(fp), show=show)

empty() classmethod

Creates an empty Prompter instance.

Returns:

  • Prompter

    A Prompter instance without any prompt configuration.

Source code in lazyllm/components/prompter/prompter.py
    @classmethod
    def empty(cls):
        """Creates an empty Prompter instance.

Returns:
    Prompter: A Prompter instance without any prompt configuration.
"""
        return cls()

generate_prompt(input, history=None, tools=None, label=None, show=False)

Generates a formatted prompt string based on input and optional conversation history.

Parameters:

  • input (Union[str, Dict]) –

    User input. Can be a single string or a dictionary with multiple fields.

  • history (Optional[List[List[str]]], default: None ) –

    Multi-turn dialogue history, e.g., [['u1', 'a1'], ['u2', 'a2']].

  • tools (Optional[Any], default: None ) –

    Not supported. Must be None.

  • label (Optional[str], default: None ) –

    Optional label to append to the prompt, commonly used for training.

  • show (bool, default: False ) –

    Whether to print the generated prompt. Defaults to False.

Returns:

  • str

    The final formatted prompt string.
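
How the history list is flattened into the chat template can be sketched as below. `join_history` is a hypothetical helper written for this example; the delimiters and template are taken from the class-level example above.

```python
from typing import List


def join_history(history: List[List[str]], eoa: str, eoh: str) -> str:
    """Hypothetical helper mirroring how history turns are flattened."""
    # eoh separates the user message from the assistant reply inside one turn;
    # eoa separates consecutive turns.
    return eoa.join(eoh.join(turn) for turn in history)


chat_prompt = "Instruction: {instruction}\nHistory:\n{llm_chat_history}"
history = [["hello", "你好"], ["how are you", "你好吗"]]
fields = {"instruction": "Translate this."}
fields["llm_chat_history"] = join_history(history, eoa="</s>", eoh="|")
rendered = chat_prompt.format(**fields)
```

When no history is supplied, the history placeholder is filled with an empty string and the plain `prompt` template is used instead.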

Source code in lazyllm/components/prompter/prompter.py
    def generate_prompt(self, input, history=None, tools=None, label=None, show=False):
        """Generates a formatted prompt string based on input and optional conversation history.

Args:
    input (Union[str, Dict]): User input. Can be a single string or a dictionary with multiple fields.
    history (Optional[List[List[str]]]): Multi-turn dialogue history, e.g., [['u1', 'a1'], ['u2', 'a2']].
    tools (Optional[Any]): Not supported. Must be None.
    label (Optional[str]): Optional label to append to the prompt, commonly used for training.
    show (bool): Whether to print the generated prompt. Defaults to False.

Returns:
    str: The final formatted prompt string.
"""
        if not self._is_empty():
            assert tools is None
            # datasets.formatting.formatting.LazyDict is used in transformers
            if not isinstance(input, collections.abc.Mapping):
                assert len(self._prompt_keys) == 1, (
                    f'invalid prompt `{self._prompt}` for <{type(input)}> input `{input}`')
                input = {self._prompt_keys[0]: input}
            try:
                if self.use_history and isinstance(history, list) and len(history) > 0:
                    assert isinstance(history[0], list), 'history must be list of list'
                    input[self._history_symbol] = self._eoa.join([self._eoh.join(h) for h in history])
                    input = self._chat_prompt.format(**input)
                else:
                    if self.use_history: input[self._history_symbol] = ''
                    input = self._prompt.format(**input)
            except Exception:
                raise RuntimeError(f'Generate prompt failed, and prompt is {self._prompt}; chat-prompt'
                                   f' is {self._chat_prompt}; input is {input}; history is {history}')
            if label: input += label
        if self._show or show: LOG.info(input)
        return input

get_response(response, input=None)

Extracts the actual model answer from the full response returned by an LLM.

Parameters:

  • response (str) –

    The full raw output from the model.

  • input (Optional[str], default: None ) –

    If the response starts with the input, that part will be removed.

Returns:

  • str

    The cleaned model response.

Source code in lazyllm/components/prompter/prompter.py
    def get_response(self, response, input=None):
        """Extracts the actual model answer from the full response returned by an LLM.

Args:
    response (str): The full raw output from the model.
    input (Optional[str]): If the response starts with the input, that part will be removed.

Returns:
    str: The cleaned model response.
"""
        if input and response.startswith(input):
            return response[len(input):]
        return response if self._response_split is None else response.split(self._response_split)[-1]

lazyllm.components.AlpacaPrompter

Bases: LazyLLMPrompterBase

Alpaca-style Prompter, supports tool calls, does not support historical dialogue.

Parameters:

  • instruction (Optional[str], default: None ) –

    Task instruction for the model, containing at least one fillable slot (e.g. {instruction}). Alternatively, a dictionary that specifies the system and user instructions separately.

  • extra_keys (Optional[List], default: None ) –

    Additional fields that will be filled with user input.

  • show (bool, default: False ) –

    Whether to print the generated prompt. Defaults to False.

  • tools (Optional[list], default: None ) –

    Tool set provided to the LLM. Defaults to None.

Examples:

>>> from lazyllm import AlpacaPrompter
>>> p = AlpacaPrompter('hello world {instruction}')
>>> p.generate_prompt('this is my input')
'You are an AI-Agent developed by LazyLLM.\nBelow is an instruction that describes a task, paired with extra messages such as input that provides further context if possible. Write a response that appropriately completes the request.\n\n ### Instruction:\nhello world this is my input\n\n\n### Response:\n'
>>> p.generate_prompt('this is my input', format='openai')
{'messages': [{'role': 'system', 'content': 'You are an AI-Agent developed by LazyLLM.\nBelow is an instruction that describes a task, paired with extra messages such as input that provides further context if possible. Write a response that appropriately completes the request.\n\n ### Instruction:\nhello world this is my input\n\n'}, {'role': 'user', 'content': ''}]}
>>>
>>> p = AlpacaPrompter('hello world {instruction}, {input}', extra_keys=['knowledge'])
>>> p.generate_prompt(dict(instruction='hello world', input='my input', knowledge='lazyllm'))
'You are an AI-Agent developed by LazyLLM.\nBelow is an instruction that describes a task, paired with extra messages such as input that provides further context if possible. Write a response that appropriately completes the request.\n\n ### Instruction:\nhello world hello world, my input\n\nHere are some extra messages you can referred to:\n\n### knowledge:\nlazyllm\n\n\n### Response:\n'
>>> p.generate_prompt(dict(instruction='hello world', input='my input', knowledge='lazyllm'), format='openai')
{'messages': [{'role': 'system', 'content': 'You are an AI-Agent developed by LazyLLM.\nBelow is an instruction that describes a task, paired with extra messages such as input that provides further context if possible. Write a response that appropriately completes the request.\n\n ### Instruction:\nhello world hello world, my input\n\nHere are some extra messages you can referred to:\n\n### knowledge:\nlazyllm\n\n'}, {'role': 'user', 'content': ''}]}
>>>
>>> p = AlpacaPrompter(dict(system="hello world", user="this is user instruction {input}"))
>>> p.generate_prompt(dict(input="my input"))
'You are an AI-Agent developed by LazyLLM.\nBelow is an instruction that describes a task, paired with extra messages such as input that provides further context if possible. Write a response that appropriately completes the request.\n\n ### Instruction:\nhello world\n\n\n\nthis is user instruction my input### Response:\n'
>>> p.generate_prompt(dict(input="my input"), format='openai')
{'messages': [{'role': 'system', 'content': 'You are an AI-Agent developed by LazyLLM.\nBelow is an instruction that describes a task, paired with extra messages such as input that provides further context if possible. Write a response that appropriately completes the request.\n\n ### Instruction:\nhello world'}, {'role': 'user', 'content': 'this is user instruction my input'}]}
Source code in lazyllm/components/prompter/alpacaPrompter.py
class AlpacaPrompter(LazyLLMPrompterBase):
    """Alpaca-style Prompter, supports tool calls, does not support historical dialogue.


Args:
    instruction (Option[str]): Task instructions for the large model, with at least one fillable slot (e.g. ``{instruction}``). Or use a dictionary to specify the ``system`` and ``user`` instructions.
    extra_keys (Option[List]): Additional fields that will be filled with user input.
    show (bool): Flag indicating whether to print the generated Prompt, default is False.
    tools (Option[list]): Tool set provided to the LLM, default is None.


Examples:
    >>> from lazyllm import AlpacaPrompter
    >>> p = AlpacaPrompter('hello world {instruction}')
    >>> p.generate_prompt('this is my input')
    'You are an AI-Agent developed by LazyLLM.\\nBelow is an instruction that describes a task, paired with extra messages such as input that provides further context if possible. Write a response that appropriately completes the request.\\n\\n ### Instruction:\\nhello world this is my input\\n\\n\\n### Response:\\n'
    >>> p.generate_prompt('this is my input', format='openai')
    {'messages': [{'role': 'system', 'content': 'You are an AI-Agent developed by LazyLLM.\\nBelow is an instruction that describes a task, paired with extra messages such as input that provides further context if possible. Write a response that appropriately completes the request.\\n\\n ### Instruction:\\nhello world this is my input\\n\\n'}, {'role': 'user', 'content': ''}]}
    >>>
    >>> p = AlpacaPrompter('hello world {instruction}, {input}', extra_keys=['knowledge'])
    >>> p.generate_prompt(dict(instruction='hello world', input='my input', knowledge='lazyllm'))
    'You are an AI-Agent developed by LazyLLM.\\nBelow is an instruction that describes a task, paired with extra messages such as input that provides further context if possible. Write a response that appropriately completes the request.\\n\\n ### Instruction:\\nhello world hello world, my input\\n\\nHere are some extra messages you can referred to:\\n\\n### knowledge:\\nlazyllm\\n\\n\\n### Response:\\n'
    >>> p.generate_prompt(dict(instruction='hello world', input='my input', knowledge='lazyllm'), format='openai')
    {'messages': [{'role': 'system', 'content': 'You are an AI-Agent developed by LazyLLM.\\nBelow is an instruction that describes a task, paired with extra messages such as input that provides further context if possible. Write a response that appropriately completes the request.\\n\\n ### Instruction:\\nhello world hello world, my input\\n\\nHere are some extra messages you can referred to:\\n\\n### knowledge:\\nlazyllm\\n\\n'}, {'role': 'user', 'content': ''}]}
    >>>
    >>> p = AlpacaPrompter(dict(system="hello world", user="this is user instruction {input}"))
    >>> p.generate_prompt(dict(input="my input"))
    'You are an AI-Agent developed by LazyLLM.\\nBelow is an instruction that describes a task, paired with extra messages such as input that provides further context if possible. Write a response that appropriately completes the request.\\n\\n ### Instruction:\\nhello world\\n\\n\\n\\nthis is user instruction my input### Response:\\n'
    >>> p.generate_prompt(dict(input="my input"), format='openai')
    {'messages': [{'role': 'system', 'content': 'You are an AI-Agent developed by LazyLLM.\\nBelow is an instruction that describes a task, paired with extra messages such as input that provides further context if possible. Write a response that appropriately completes the request.\\n\\n ### Instruction:\\nhello world'}, {'role': 'user', 'content': 'this is user instruction my input'}]}
    """
    def __init__(self, instruction: Union[None, str, Dict[str, str]] = None, extra_keys: Union[None, List[str]] = None,
                 show: bool = False, tools: Optional[List] = None):
        super(__class__, self).__init__(show, tools=tools)
        extra_keys_template = LazyLLMPrompterBase._get_extro_key_template(extra_keys)
        _prefix = ('Below is an instruction that describes a task, paired with extra messages such as '
                   'input that provides further context if possible. Write a response that appropriately '
                   'completes the request.\n\n### Instruction:\n')
        if isinstance(instruction, dict):
            user_suffix = ('\n\n' + extra_keys_template) if extra_keys_template else ''
            splice_struction = instruction.get('system', '') + \
                AlpacaPrompter.ISA + instruction.get('user', '') + user_suffix + AlpacaPrompter.ISE
            instruction = splice_struction
            instruction_template = _prefix + instruction
        else:
            instruction_template = (_prefix + (instruction if instruction else '')
                                    + '\n\n' + extra_keys_template)
        self._init_prompt('{system}\n{instruction}\n{tools}\n{user}### Response:\n',
                          instruction_template,
                          '### Response:\n')

    def _check_values(self, instruction, input, history, tools):
        assert not history, f'Chat history is not supported in {__class__}.'
        assert not input, 'All keys should in instruction or extra-keys'
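
The string-instruction branch of the constructor above can be sketched on its own. This is a simplified illustration, not the library's code: `extra_keys_template` is a hypothetical stand-in for `_get_extro_key_template` (its wording is copied from the example output), and the system line and tool slots are omitted, so whitespace may differ from the real prompts.

```python
from typing import Dict, List, Optional

# Fixed Alpaca preamble, copied from the constructor shown above.
PREFIX = ('Below is an instruction that describes a task, paired with extra messages such as '
          'input that provides further context if possible. Write a response that appropriately '
          'completes the request.\n\n### Instruction:\n')


def extra_keys_template(extra_keys: Optional[List[str]]) -> str:
    """Hypothetical helper: emit one '### key:' slot per extra key."""
    if not extra_keys:
        return ''
    slots = ''.join(f'### {key}:\n{{{key}}}\n\n' for key in extra_keys)
    # Wording taken verbatim from the example output in this section.
    return 'Here are some extra messages you can referred to:\n\n' + slots


def build_alpaca_prompt(instruction: str, values: Dict[str, str],
                        extra_keys: Optional[List[str]] = None) -> str:
    """Fill every slot of a string instruction, then append the response marker."""
    template = PREFIX + instruction + '\n\n' + extra_keys_template(extra_keys)
    return template.format(**values) + '### Response:\n'


prompt = build_alpaca_prompt(
    'hello world {instruction}, {input}',
    dict(instruction='hello world', input='my input', knowledge='lazyllm'),
    extra_keys=['knowledge'],
)
```

Every key must appear either as a slot in the instruction or in `extra_keys`, which matches the assertion in `_check_values`.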

generate_prompt(input=None, history=None, tools=None, label=None, *, show=False, return_dict=False, format=None)

Generate a corresponding Prompt based on user input.

Parameters:

  • input (Option[str | Dict], default: None ) –

    The input to the prompter. If it is a dict, its values fill the slots of the instruction; if it is a str, it is used as the input.

  • history (Option[List[List | Dict]], default: None ) –

    Conversation history, either as [[u, s], [u, s]] pairs or in OpenAI's history format, defaults to None.

  • tools (Option[List[Dict]], default: None ) –

    A collection of tools made available when the large model performs function calling, defaults to None.

  • label (Option[str], default: None ) –

    Label, used during fine-tuning or training, defaults to None.

  • show (bool, default: False ) –

    Flag indicating whether to print the generated Prompt, defaults to False.

  • return_dict (bool, default: False ) –

    Deprecated; prefer format="openai". When format is None and this is True, behaves like format="openai" and emits a one-time deprecation warning. Defaults to False.

  • format (Option[str], default: None ) –

    Output structure. None returns a concatenated string for local/finetuning use; "openai" returns OpenAI-style messages (and optional tools); "anthropic" returns Anthropic-style system/messages. OnlineChatModule passes the appropriate value. If both return_dict and format are provided, format takes precedence. Defaults to None.
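
The interaction between `return_dict` and `format` described above can be sketched as two small functions. This is a minimal sketch that mirrors the branches in the source listing for this method; `resolve_format` and `pick_builder` are illustrative names, and `warnings.warn` stands in for the library's one-time log message.

```python
import warnings
from typing import Optional


def resolve_format(return_dict: bool = False, format: Optional[str] = None) -> Optional[str]:
    """Apply the documented precedence: an explicit format wins over return_dict."""
    if return_dict and format is None:
        # Stand-in for the library's one-time deprecation log.
        warnings.warn('return_dict is deprecated, use format="openai" instead.',
                      DeprecationWarning)
        return 'openai'
    return format


def pick_builder(format: Optional[str]) -> str:
    """Name the output shape selected by each format value."""
    if format == 'anthropic':
        return 'anthropic system/messages'
    if format == 'openai':
        return 'openai messages'
    # None, and any unrecognised value, fall through to the plain string.
    return 'plain string'
```

Note that in this dispatch any value other than `'anthropic'` or `'openai'` ends up in the plain-string branch.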

Source code in lazyllm/components/prompter/builtinPrompt.py
    def generate_prompt(self, input: Union[str, List, Dict[str, str], None] = None,
                        history: List[Union[List[str], Dict[str, Any]]] = None,
                        tools: Union[List[Dict[str, Any]], None] = None,
                        label: Union[str, None] = None,
                        *, show: bool = False, return_dict: bool = False,
                        format: Optional[str] = None) -> Union[str, Dict]:
        """
Generate a corresponding Prompt based on user input.

Args:
    input (Option[str | Dict]): The input to the prompter. If it is a dict, its values fill the slots of the instruction; if it is a str, it is used as the input.
    history (Option[List[List | Dict]]): Conversation history, either as ``[[u, s], [u, s]]`` pairs or in OpenAI's history format, defaults to None.
    tools (Option[List[Dict]]): A collection of tools made available when the large model performs function calling, defaults to None.
    label (Option[str]): Label, used during fine-tuning or training, defaults to None.
    show (bool): Flag indicating whether to print the generated Prompt, defaults to False.
    return_dict (bool): Deprecated; prefer ``format="openai"``. When ``format`` is ``None`` and this is True, behaves like ``format="openai"`` and emits a one-time deprecation warning. Defaults to False.
    format (Option[str]): Output structure. ``None`` returns a concatenated string for local/finetuning use; ``"openai"`` returns OpenAI-style ``messages`` (and optional ``tools``); ``"anthropic"`` returns Anthropic-style ``system``/``messages``. ``OnlineChatModule`` passes the appropriate value. If both ``return_dict`` and ``format`` are provided, ``format`` takes precedence. Defaults to ``None``.
"""
        if return_dict and format is None:
            LOG.log_once('return_dict is deprecated, use format="openai" instead.', level='warning')
            format = 'openai'
        input = copy.deepcopy(input)
        if self._pre_hook:
            input, history, tools, label = self._pre_hook(input, history, tools, label)
        tools = self._resolve_dynamic(tools or self._tools, input, history)
        skills = self._resolve_dynamic(self._skills, input, history)
        for_chat_api = bool(format)
        instruction, input = self._get_instruction_and_input(input, for_chat_api=for_chat_api, tools=tools)
        history = self._get_histories(history, for_chat_api=for_chat_api)
        tools = self._get_tools(tools, for_chat_api=for_chat_api)
        self._check_values(instruction, input, history, tools)
        instruction, user_instruction = self._split_instruction(instruction)
        if format == 'anthropic':
            func = self._generate_prompt_anthropic_impl
        elif format == 'openai':
            func = self._generate_prompt_dict_impl
        else:
            func = self._generate_prompt_impl
        result = func(instruction, input, user_instruction, history, tools, label, skills)
        if self._show or show: LOG.info(result)
        return result

get_response(output, input=None)

Strips the prompt from the model output, keeping only the generated response.

Parameters:

  • output (str) –

    The output of the large model.

  • input (Option[str], default: None ) –

    The input of the large model. If specified and the output begins with it, that leading copy of the input is removed from the output. Defaults to None.
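
The two truncation paths can be sketched as a standalone function. This is a minimal sketch that follows the method body in the source listing; the `split` argument replaces the prompter's `_split` attribute, and the marker `'### Response:\n'` used below is an assumption based on the last argument passed to `_init_prompt`.

```python
from typing import Optional


def get_response(output: str, input: Optional[str] = None,
                 split: Optional[str] = None) -> str:
    """Return only the generated part of a model output."""
    # Path 1: the model echoed the prompt, so drop that leading copy.
    if input and output.startswith(input):
        return output[len(input):]
    # Path 2: keep what follows the last occurrence of the split marker, if any.
    return output if split is None else output.split(split)[-1]


echoed = get_response('my prompt. my answer', input='my prompt. ')
marked = get_response('### Instruction:\nhi\n### Response:\nhello', split='### Response:\n')
```

If `input` is given but the output does not start with it, the function falls through to the marker-based path.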

Source code in lazyllm/components/prompter/builtinPrompt.py
    def get_response(self, output: str, input: Union[str, None] = None) -> str:
        """Strips the prompt from the model output, keeping only the generated response.

Args:
        output (str): The output of the large model.
        input (Option[str]): The input of the large model. If specified and the output begins with it, that leading copy of the input is removed from the output. Defaults to None.
"""
        if input and output.startswith(input):
            return output[len(input):]
        return output if getattr(self, '_split', None) is None else output.split(self._split)[-1]

lazyllm.components.ChatPrompter

Bases: LazyLLMPrompterBase

Prompt constructor for multi-turn dialogue, inherits from LazyLLMPrompterBase.

Supports tool calling, conversation history, and customizable instruction templates. Accepts instructions as either plain string or dict with separate system and user components, automatically merging them into a unified prompt template. Also supports injecting extra user-defined fields.

Parameters:

  • instruction (Option[str | Dict[str, str]], default: None ) –

    The prompt instruction template. Can be a string or a dict with system and user keys. If a dict is given, the components will be merged using special delimiters.

  • extra_keys (Option[List[str]], default: None ) –

    A list of additional keys that will be filled by user input to enrich the prompt context.

  • show (bool, default: False ) –

    Whether to print the generated prompt. Default is False.

  • tools (Option[List], default: None ) –

    A list of tools available to the model for function-calling tasks. Default is None.

  • history (Option[List[List[str]]], default: None ) –

    Dialogue history in the format [[user, assistant], ...]. Used to provide conversational memory. Default is None.

Examples:

>>> from lazyllm import ChatPrompter
  • Simple instruction string
>>> p = ChatPrompter('hello world')
>>> p.generate_prompt('this is my input')
'You are an AI-Agent developed by LazyLLM.hello world\nthis is my input\n'
>>> p.generate_prompt('this is my input', format='openai')
{'messages': [{'role': 'system', 'content': 'You are an AI-Agent developed by LazyLLM.\nhello world'}, {'role': 'user', 'content': 'this is my input'}]}
  • Using extra_keys
>>> p = ChatPrompter('hello world {instruction}', extra_keys=['knowledge'])
>>> p.generate_prompt({
...     'instruction': 'this is my ins',
...     'input': 'this is my inp',
...     'knowledge': 'LazyLLM-Knowledge'
... })
'You are an AI-Agent developed by LazyLLM.hello world this is my ins\nHere are some extra messages you can referred to:\n\n### knowledge:\nLazyLLM-Knowledge\nthis is my inp\n'
  • With conversation history
>>> p.generate_prompt({
...     'instruction': 'this is my ins',
...     'input': 'this is my inp',
...     'knowledge': 'LazyLLM-Knowledge'
... }, history=[['s1', 'e1'], ['s2', 'e2']])
'You are an AI-Agent developed by LazyLLM.hello world this is my ins\nHere are some extra messages you can referred to:\n\n### knowledge:\nLazyLLM-Knowledge\ns1|e1\ns2|e2\nthis is my inp\n'
  • Using dict format for system/user instructions
>>> p = ChatPrompter(dict(system="hello world", user="this is user instruction {input}"))
>>> p.generate_prompt({'input': "my input", 'query': "this is user query"})
'You are an AI-Agent developed by LazyLLM.hello world\nthis is user instruction my input this is user query\n'
>>> p.generate_prompt({'input': "my input", 'query': "this is user query"}, format='openai')
{'messages': [{'role': 'system', 'content': 'You are an AI-Agent developed by LazyLLM.\nhello world'}, {'role': 'user', 'content': 'this is user instruction my input this is user query'}]}
Source code in lazyllm/components/prompter/chatPrompter.py
class ChatPrompter(LazyLLMPrompterBase):
    """Prompt constructor for multi-turn dialogue, inherits from `LazyLLMPrompterBase`.

Supports tool calling, conversation history, and customizable instruction templates. Accepts instructions as either plain string or dict with separate `system` and `user` components, automatically merging them into a unified prompt template. Also supports injecting extra user-defined fields.

Args:
    instruction (Option[str | Dict[str, str]]): The prompt instruction template. Can be a string or a dict with `system` and `user` keys. If a dict is given, the components will be merged using special delimiters.
    extra_keys (Option[List[str]]): A list of additional keys that will be filled by user input to enrich the prompt context.
    show (bool): Whether to print the generated prompt. Default is False.
    tools (Option[List]): A list of tools available to the model for function-calling tasks. Default is None.
    history (Option[List[List[str]]]): Dialogue history in the format [[user, assistant], ...]. Used to provide conversational memory. Default is None.


Examples:
    >>> from lazyllm import ChatPrompter

    - Simple instruction string
    >>> p = ChatPrompter('hello world')
    >>> p.generate_prompt('this is my input')
    'You are an AI-Agent developed by LazyLLM.hello world\\nthis is my input\\n'

    >>> p.generate_prompt('this is my input', format='openai')
    {'messages': [{'role': 'system', 'content': 'You are an AI-Agent developed by LazyLLM.\\nhello world'}, {'role': 'user', 'content': 'this is my input'}]}

    - Using extra_keys
    >>> p = ChatPrompter('hello world {instruction}', extra_keys=['knowledge'])
    >>> p.generate_prompt({
    ...     'instruction': 'this is my ins',
    ...     'input': 'this is my inp',
    ...     'knowledge': 'LazyLLM-Knowledge'
    ... })
    'You are an AI-Agent developed by LazyLLM.hello world this is my ins\\nHere are some extra messages you can referred to:\\n\\n### knowledge:\\nLazyLLM-Knowledge\\nthis is my inp\\n'

    - With conversation history
    >>> p.generate_prompt({
    ...     'instruction': 'this is my ins',
    ...     'input': 'this is my inp',
    ...     'knowledge': 'LazyLLM-Knowledge'
    ... }, history=[['s1', 'e1'], ['s2', 'e2']])
    'You are an AI-Agent developed by LazyLLM.hello world this is my ins\\nHere are some extra messages you can referred to:\\n\\n### knowledge:\\nLazyLLM-Knowledge\\ns1|e1\\ns2|e2\\nthis is my inp\\n'

    - Using dict format for system/user instructions
    >>> p = ChatPrompter(dict(system="hello world", user="this is user instruction {input}"))
    >>> p.generate_prompt({'input': "my input", 'query': "this is user query"})
    'You are an AI-Agent developed by LazyLLM.hello world\\nthis is user instruction my input this is user query\\n'

    >>> p.generate_prompt({'input': "my input", 'query': "this is user query"}, format='openai')
    {'messages': [{'role': 'system', 'content': 'You are an AI-Agent developed by LazyLLM.\\nhello world'}, {'role': 'user', 'content': 'this is user instruction my input this is user query'}]}
    """
    def __init__(self, instruction: Union[None, str, Dict[str, str]] = None, extra_keys: Union[None, List[str]] = None,
                 show: bool = False, tools: Optional[List] = None, skills: Optional[List] = None,
                 history: Optional[List[List[str]]] = None, *, enable_system: bool = True):
        super(__class__, self).__init__(show, tools=tools, skills=skills, history=history, enable_system=enable_system)
        extra_keys_template = LazyLLMPrompterBase._get_extro_key_template(extra_keys)
        if isinstance(instruction, dict):
            splice_instruction = instruction.get('system', '') + \
                ChatPrompter.ISA + instruction.get('user', '') + extra_keys_template + ChatPrompter.ISE
            instruction = splice_instruction
            instruction_template = f'{instruction}\n' if instruction else ''
        else:
            instruction_template = f'{instruction}\n{extra_keys_template}\n' if instruction else ''
        self._init_prompt(
            '{sos}{system}{instruction}{skills}{tools}{eos}\n\n{history}\n{soh}\n{user}{input}\n{eoh}{soa}\n',
            instruction_template)

    @property
    def _split(self): return f'{self._soa}\n' if self._soa else None

generate_prompt(input=None, history=None, tools=None, label=None, *, show=False, return_dict=False, format=None)

Generate a corresponding Prompt based on user input.

Parameters:

  • input (Option[str | Dict], default: None ) –

    The input to the prompter. If it is a dict, its values fill the slots of the instruction; if it is a str, it is used as the input.

  • history (Option[List[List | Dict]], default: None ) –

    Conversation history, either as [[u, s], [u, s]] pairs or in OpenAI's history format, defaults to None.

  • tools (Option[List[Dict]], default: None ) –

    A collection of tools made available when the large model performs function calling, defaults to None.

  • label (Option[str], default: None ) –

    Label, used during fine-tuning or training, defaults to None.

  • show (bool, default: False ) –

    Flag indicating whether to print the generated Prompt, defaults to False.

  • return_dict (bool, default: False ) –

    Deprecated; prefer format="openai". When format is None and this is True, behaves like format="openai" and emits a one-time deprecation warning. Defaults to False.

  • format (Option[str], default: None ) –

    Output structure. None returns a concatenated string for local/finetuning use; "openai" returns OpenAI-style messages (and optional tools); "anthropic" returns Anthropic-style system/messages. OnlineChatModule passes the appropriate value. If both return_dict and format are provided, format takes precedence. Defaults to None.
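
The two accepted shapes of `history` can be illustrated with a small normaliser. This is a sketch under stated assumptions, not the library's `_get_histories`: it assumes each pair is `[user, assistant]`, as in the ChatPrompter constructor documentation, and that dict entries are already OpenAI-style messages; `to_openai_history` is a hypothetical name.

```python
from typing import Any, Dict, List, Union

History = List[Union[List[str], Dict[str, Any]]]


def to_openai_history(history: History) -> List[Dict[str, Any]]:
    """Normalise mixed history entries into a flat list of role/content messages."""
    messages: List[Dict[str, Any]] = []
    for turn in history:
        if isinstance(turn, dict):
            # Already an OpenAI-style message: pass it through unchanged.
            messages.append(turn)
        else:
            # Assumed pair layout: [user_text, assistant_text].
            user, assistant = turn
            messages.append({'role': 'user', 'content': user})
            messages.append({'role': 'assistant', 'content': assistant})
    return messages


messages = to_openai_history([['s1', 'e1'], ['s2', 'e2']])
```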

Source code in lazyllm/components/prompter/builtinPrompt.py
    def generate_prompt(self, input: Union[str, List, Dict[str, str], None] = None,
                        history: List[Union[List[str], Dict[str, Any]]] = None,
                        tools: Union[List[Dict[str, Any]], None] = None,
                        label: Union[str, None] = None,
                        *, show: bool = False, return_dict: bool = False,
                        format: Optional[str] = None) -> Union[str, Dict]:
        """
Generate a corresponding Prompt based on user input.

Args:
    input (Option[str | Dict]): The input to the prompter. If it is a dict, its values fill the slots of the instruction; if it is a str, it is used as the input.
    history (Option[List[List | Dict]]): Conversation history, either as ``[[u, s], [u, s]]`` pairs or in OpenAI's history format, defaults to None.
    tools (Option[List[Dict]]): A collection of tools made available when the large model performs function calling, defaults to None.
    label (Option[str]): Label, used during fine-tuning or training, defaults to None.
    show (bool): Flag indicating whether to print the generated Prompt, defaults to False.
    return_dict (bool): Deprecated; prefer ``format="openai"``. When ``format`` is ``None`` and this is True, behaves like ``format="openai"`` and emits a one-time deprecation warning. Defaults to False.
    format (Option[str]): Output structure. ``None`` returns a concatenated string for local/finetuning use; ``"openai"`` returns OpenAI-style ``messages`` (and optional ``tools``); ``"anthropic"`` returns Anthropic-style ``system``/``messages``. ``OnlineChatModule`` passes the appropriate value. If both ``return_dict`` and ``format`` are provided, ``format`` takes precedence. Defaults to ``None``.
"""
        if return_dict and format is None:
            LOG.log_once('return_dict is deprecated, use format="openai" instead.', level='warning')
            format = 'openai'
        input = copy.deepcopy(input)
        if self._pre_hook:
            input, history, tools, label = self._pre_hook(input, history, tools, label)
        tools = self._resolve_dynamic(tools or self._tools, input, history)
        skills = self._resolve_dynamic(self._skills, input, history)
        for_chat_api = bool(format)
        instruction, input = self._get_instruction_and_input(input, for_chat_api=for_chat_api, tools=tools)
        history = self._get_histories(history, for_chat_api=for_chat_api)
        tools = self._get_tools(tools, for_chat_api=for_chat_api)
        self._check_values(instruction, input, history, tools)
        instruction, user_instruction = self._split_instruction(instruction)
        if format == 'anthropic':
            func = self._generate_prompt_anthropic_impl
        elif format == 'openai':
            func = self._generate_prompt_dict_impl
        else:
            func = self._generate_prompt_impl
        result = func(instruction, input, user_instruction, history, tools, label, skills)
        if self._show or show: LOG.info(result)
        return result

get_response(output, input=None)

Strips the prompt from the model output, keeping only the generated response.

Parameters:

  • output (str) –

    The output of the large model.

  • input (Option[str], default: None ) –

    The input of the large model. If specified and the output begins with it, that leading copy of the input is removed from the output. Defaults to None.

Source code in lazyllm/components/prompter/builtinPrompt.py
    def get_response(self, output: str, input: Union[str, None] = None) -> str:
        """Strips the prompt from the model output, keeping only the generated response.

Args:
        output (str): The output of the large model.
        input (Option[str]): The input of the large model. If specified and the output begins with it, that leading copy of the input is removed from the output. Defaults to None.
"""
        if input and output.startswith(input):
            return output[len(input):]
        return output if getattr(self, '_split', None) is None else output.split(self._split)[-1]

MultiModal

Text to Image

lazyllm.components.StableDiffusionDeploy

Bases: LazyLLMDeployBase

Stable Diffusion Model Deployment Class. This class is used to deploy the stable diffusion model to a specified server for network invocation.

Parameters:

  • launcher (Optional[LazyLLMLaunchersBase], default: None ) –

    Launcher instance. Defaults to None

  • log_path (Optional[str], default: None ) –

    Log file path. Defaults to None

  • trust_remote_code (bool, default: True ) –

    Whether to trust remote code. Defaults to True

  • port (Optional[int], default: None ) –

    Service port number. Defaults to None

Examples:

>>> from lazyllm import launchers, UrlModule
>>> from lazyllm.components import StableDiffusionDeploy
>>> deployer = StableDiffusionDeploy(launchers.remote())
>>> url = deployer(base_model='stable-diffusion-3-medium')
>>> model = UrlModule(url=url)
>>> res = model('a tiny cat.')
>>> print(res)
... <lazyllm-query>{"query": "", "files": ["path/to/sd3/image_xxx.png"]}
Source code in lazyllm/components/deploy/stable_diffusion/stable_diffusion3.py
class StableDiffusionDeploy(LazyLLMDeployBase):
    """Stable Diffusion Model Deployment Class. This class is used to deploy the stable diffusion model to a specified server for network invocation.

Args:
    launcher (Optional[LazyLLMLaunchersBase], optional): Launcher instance. Defaults to ``None``
    log_path (Optional[str], optional): Log file path. Defaults to ``None``
    trust_remote_code (bool, optional): Whether to trust remote code. Defaults to ``True``
    port (Optional[int], optional): Service port number. Defaults to ``None``


Examples:
    >>> from lazyllm import launchers, UrlModule
    >>> from lazyllm.components import StableDiffusionDeploy
    >>> deployer = StableDiffusionDeploy(launchers.remote())
    >>> url = deployer(base_model='stable-diffusion-3-medium')
    >>> model = UrlModule(url=url)
    >>> res = model('a tiny cat.')
    >>> print(res)
    ... <lazyllm-query>{"query": "", "files": ["path/to/sd3/image_xxx.png"]}
    """
    message_format = None
    keys_name_handle = None
    default_headers = {'Content-Type': 'application/json'}

    def __init__(self, launcher: Optional[LazyLLMLaunchersBase] = None,
                 log_path: Optional[str] = None, trust_remote_code: bool = True, port: Optional[int] = None, **kw):
        super().__init__(launcher=launcher)
        self._log_path = log_path
        self._trust_remote_code = trust_remote_code
        self._port = port

    def __call__(self, finetuned_model=None, base_model=None):
        if not finetuned_model:
            finetuned_model = base_model
        elif not os.path.exists(finetuned_model) or \
            not any(file.endswith(('.bin', '.safetensors'))
                    for _, _, filenames in os.walk(finetuned_model) for file in filenames):
            LOG.warning(f'Note! That finetuned_model({finetuned_model}) is an invalid path, '
                        f'base_model({base_model}) will be used')
            finetuned_model = base_model
        return lazyllm.deploy.RelayServer(port=self._port, func=_StableDiffusion3(finetuned_model),
                                          launcher=self._launcher, log_path=self._log_path, cls='stable_diffusion')()
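
The model-selection rule at the start of `__call__` can be isolated into a pure function. This is a minimal sketch of that check only; `has_weight_files` and `resolve_model` are illustrative names, and the warning log and the relay server launch are left out.

```python
import os
from typing import Optional

# File suffixes that the deployer treats as evidence of model weights.
WEIGHT_SUFFIXES = ('.bin', '.safetensors')


def has_weight_files(path: str) -> bool:
    """True if any file under path (searched recursively) looks like a weight file."""
    return any(name.endswith(WEIGHT_SUFFIXES)
               for _, _, filenames in os.walk(path) for name in filenames)


def resolve_model(finetuned_model: Optional[str], base_model: Optional[str]) -> Optional[str]:
    """Prefer the fine-tuned model, falling back to the base model when it is unusable."""
    if not finetuned_model:
        return base_model
    if not os.path.exists(finetuned_model) or not has_weight_files(finetuned_model):
        # The deployer logs a warning at this point before falling back.
        return base_model
    return finetuned_model
```

A directory that exists but holds no `.bin` or `.safetensors` file is therefore treated the same as a missing path.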

Visual Question Answering

See LMDeploy, which supports Visual Question Answering models.

Text to Sound

lazyllm.components.TTSDeploy

Source code in lazyllm/components/deploy/text_to_speech/__init__.py
class TTSDeploy:

    def __new__(cls, name, **kwarg):
        return cls.get_deploy_cls(name)(**kwarg)

    @classmethod
    def get_deploy_cls(cls, name):
        name = name.lower()
        if name == 'bark':
            return BarkDeploy
        elif name in ('chattts', 'chattts-new'):
            raise RuntimeError('ChatTTS is deprecated and no longer supported.')
            return ChatTTSDeploy
        elif name.startswith('musicgen'):
            return MusicGenDeploy
        else:
            raise RuntimeError(f'Not support model: {name}')
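
The name-based dispatch above relies on `__new__` returning an instance of a different class. The pattern can be sketched with stub classes; `BarkStub`, `MusicGenStub` and `TTSFactory` are hypothetical stand-ins, not the real deployers.

```python
class BarkStub:
    """Hypothetical stand-in for a Bark deployer."""
    def __init__(self, **kwargs):
        self.kwargs = kwargs


class MusicGenStub:
    """Hypothetical stand-in for a MusicGen deployer."""
    def __init__(self, **kwargs):
        self.kwargs = kwargs


class TTSFactory:
    def __new__(cls, name, **kwargs):
        # Returning an object that is not an instance of cls skips cls.__init__,
        # so the caller receives the concrete deployer directly.
        return cls.get_deploy_cls(name)(**kwargs)

    @classmethod
    def get_deploy_cls(cls, name):
        name = name.lower()  # matching is case-insensitive
        if name == 'bark':
            return BarkStub
        if name in ('chattts', 'chattts-new'):
            raise RuntimeError('ChatTTS is deprecated and no longer supported.')
        if name.startswith('musicgen'):
            return MusicGenStub
        raise RuntimeError(f'Not support model: {name}')


deployer = TTSFactory('Bark', port=8000)
```

Here `deployer` is a `BarkStub`, not a `TTSFactory`, and the keyword arguments are forwarded unchanged to the selected class.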

lazyllm.components.ChatTTSDeploy

Bases: TTSBase

ChatTTS Model Deployment Class.

Other Parameters:

  • keys_name_handle (dict) –

    A key mapping dictionary used to handle parameter name conversion between internal and external API interfaces. Defaults to {'inputs': 'inputs'}.

  • message_format (dict) –

    The request payload structure containing three main sections:

    • inputs (str): The raw text content to be synthesized into speech.

    • refinetext (dict): Text refinement and stylization parameters controlling speech expression:

      • prompt (str): Voice style control tags, e.g., "[oral_2][laugh_0][break_6]"

      • top_P (float): Nucleus sampling parameter for decoding strategy (default: 0.7)

      • top_K (int): Top-K sampling parameter (default: 20)

      • temperature (float): Sampling temperature controlling randomness (default: 0.7)

      • repetition_penalty (float): Repetition penalty to avoid redundant generation (default: 1.0)

      • max_new_token (int): Maximum number of tokens to generate (default: 384)

      • min_new_token (int): Minimum number of tokens to generate (default: 0)

      • show_tqdm (bool): Whether to display progress bar during generation (default: True)

      • ensure_non_empty (bool): Ensure non-empty generation result (default: True)

    • infercode (dict): Inference and encoding parameters affecting audio quality:

      • prompt (str): Voice speed control tags, e.g., "[speed_5]"

      • spk_emb (Optional): Speaker embedding vector for specifying voice characteristics (default: None)

      • temperature (float): Sampling temperature for audio generation (default: 0.3)

      • repetition_penalty (float): Repetition penalty coefficient (default: 1.05)

      • max_new_token (int): Maximum number of tokens for audio generation (default: 2048)
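
The payload structure listed above can be written out as a plain dictionary. This is a sketch assembled from the documented defaults; the `inputs` text and the two `prompt` values reuse the examples given in the list rather than confirmed defaults, and serialising with `json.dumps` is an assumption based on the JSON content type used by the deployers in this section.

```python
import json

# Request payload with the three documented sections.
message_format = {
    'inputs': 'Hello World!',                    # raw text to synthesize
    'refinetext': {                              # text refinement and style
        'prompt': '[oral_2][laugh_0][break_6]',  # example style tags from the docs
        'top_P': 0.7,
        'top_K': 20,
        'temperature': 0.7,
        'repetition_penalty': 1.0,
        'max_new_token': 384,
        'min_new_token': 0,
        'show_tqdm': True,
        'ensure_non_empty': True,
    },
    'infercode': {                               # audio inference settings
        'prompt': '[speed_5]',                   # example speed tag from the docs
        'spk_emb': None,                         # no speaker embedding by default
        'temperature': 0.3,
        'repetition_penalty': 1.05,
        'max_new_token': 2048,
    },
}

payload = json.dumps(message_format)
```

Note that `temperature`, `repetition_penalty` and `max_new_token` appear in both sections with different defaults, so they must be set per section.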

Examples:

>>> from lazyllm import launchers, UrlModule
>>> from lazyllm.components import ChatTTSDeploy
>>> deployer = ChatTTSDeploy(launchers.remote())
>>> url = deployer(base_model='ChatTTS')
>>> model = UrlModule(url=url)
>>> res = model('Hello World!')
>>> print(res)
... <lazyllm-query>{"query": "", "files": ["path/to/chattts/sound_xxx.wav"]}
Source code in lazyllm/components/deploy/text_to_speech/chattts.py
class ChatTTSDeploy(TTSBase):
    """ChatTTS Model Deployment Class.

Keyword Args: 
    keys_name_handle (dict): A key mapping dictionary used to handle parameter name conversion between 
                            internal and external API interfaces. Defaults to `{'inputs': 'inputs'}`.

    message_format (dict): The request payload structure containing three main sections: 

        - `inputs` (str): The raw text content to be synthesized into speech. 

        - `refinetext` (dict): Text refinement and stylization parameters controlling speech expression: 

            * `prompt` (str): Voice style control tags, e.g., "[oral_2][laugh_0][break_6]" 

            * `top_P` (float): Nucleus sampling parameter for decoding strategy (default: 0.7) 

            * `top_K` (int): Top-K sampling parameter (default: 20) 

            * `temperature` (float): Sampling temperature controlling randomness (default: 0.7) 

            * `repetition_penalty` (float): Repetition penalty to avoid redundant generation (default: 1.0) 

            * `max_new_token` (int): Maximum number of tokens to generate (default: 384) 

            * `min_new_token` (int): Minimum number of tokens to generate (default: 0) 

            * `show_tqdm` (bool): Whether to display progress bar during generation (default: True) 

            * `ensure_non_empty` (bool): Ensure non-empty generation result (default: True) 

        - `infercode` (dict): Inference and encoding parameters affecting audio quality: 

            * `prompt` (str): Voice speed control tags, e.g., "[speed_5]" 

            * `spk_emb` (Optional): Speaker embedding vector for specifying voice characteristics (default: None) 

            * `temperature` (float): Sampling temperature for audio generation (default: 0.3) 

            * `repetition_penalty` (float): Repetition penalty coefficient (default: 1.05) 

            * `max_new_token` (int): Maximum number of tokens for audio generation (default: 2048) 



Examples:
    >>> from lazyllm import launchers, UrlModule
    >>> from lazyllm.components import ChatTTSDeploy
    >>> deployer = ChatTTSDeploy(launchers.remote())
    >>> url = deployer(base_model='ChatTTS')
    >>> model = UrlModule(url=url)
    >>> res = model('Hello World!')
    >>> print(res)
    ... <lazyllm-query>{"query": "", "files": ["path/to/chattts/sound_xxx.wav"]}
    """
    keys_name_handle = {
        'inputs': 'inputs',
    }
    message_format = {
        'inputs': 'Who are you ?',
        'refinetext': {
            'prompt': '[oral_2][laugh_0][break_6]',
            'top_P': 0.7,
            'top_K': 20,
            'temperature': 0.7,
            'repetition_penalty': 1.0,
            'max_new_token': 384,
            'min_new_token': 0,
            'show_tqdm': True,
            'ensure_non_empty': True,
        },
        'infercode': {
            'prompt': '[speed_5]',
            'spk_emb': None,
            'temperature': 0.3,
            'repetition_penalty': 1.05,
            'max_new_token': 2048,
        }

    }
    default_headers = {'Content-Type': 'application/json'}
    func = _ChatTTSModule

lazyllm.components.BarkDeploy

Bases: TTSBase

Bark Model Deployment Class. This class is used to deploy the Bark model to a specified server for network invocation.

__init__(self, launcher=None) Constructor, initializes the deployment class.

Parameters:

  • launcher (launcher, default: None ) –

    An instance of the launcher used to start the remote service.

__call__(self, finetuned_model=None, base_model=None) Deploys the model and returns the remote service address.

Parameters:

  • finetuned_model (str) –

    If provided, this model will be used for deployment; if not provided or the path is invalid, base_model will be used.

  • base_model (str) –

    The default model, which will be used for deployment if finetuned_model is invalid.

  • Return (str) –

    The URL address of the remote service.
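The fallback between finetuned_model and base_model can be sketched as a standalone function. The validity check below mirrors the one in SenseVoiceDeploy.__call__ shown later on this page (the path must exist and contain a .pt, .bin, or .safetensors file); whether BarkDeploy applies exactly the same check is an assumption, and choose_model is a hypothetical name.

```python
import os
import tempfile

WEIGHT_SUFFIXES = ('.pt', '.bin', '.safetensors')


def choose_model(finetuned_model, base_model):
    """Return finetuned_model if its directory tree holds weight files, else base_model."""
    if not finetuned_model:
        return base_model
    has_weights = os.path.exists(finetuned_model) and any(
        name.endswith(WEIGHT_SUFFIXES)
        for _, _, filenames in os.walk(finetuned_model)
        for name in filenames
    )
    return finetuned_model if has_weights else base_model


with tempfile.TemporaryDirectory() as tmp:
    # The directory exists but holds no weight files yet.
    empty_choice = choose_model(tmp, 'bark')
    # After adding a weight file the fine-tuned path is accepted.
    open(os.path.join(tmp, 'model.pt'), 'w').close()
    valid_choice = choose_model(tmp, 'bark') == tmp
missing_choice = choose_model('/no/such/dir', 'bark')
```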

Notes
  • Input for infer: str. The text corresponding to the audio to be generated.
  • Return of infer: A string encoding the generated file paths. It starts with the encoding flag "<lazyllm-query>", followed by a serialized dictionary whose files key holds a list of paths to the generated audio files.
  • Supported models: bark
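A caller usually wants the file paths rather than the encoded string. The sketch below decodes a result of the form shown in this section's example output, assuming the serialized dictionary is JSON (as the example output suggests). decode_query_string is a hypothetical helper, not a LazyLLM function.

```python
import json

# Encoding flag as it appears in the example output of this section.
LAZYLLM_QUERY_PREFIX = '<lazyllm-query>'


def decode_query_string(encoded):
    """Hypothetical helper: split an encoded result into (query, files)."""
    if not encoded.startswith(LAZYLLM_QUERY_PREFIX):
        # Not an encoded result: treat the whole string as a plain query.
        return encoded, []
    data = json.loads(encoded[len(LAZYLLM_QUERY_PREFIX):])
    return data.get('query', ''), list(data.get('files', []))


res = '<lazyllm-query>{"query": "", "files": ["path/to/bark/sound_xxx.wav"]}'
query, files = decode_query_string(res)
```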

Examples:

>>> from lazyllm import launchers, UrlModule
>>> from lazyllm.components import BarkDeploy
>>> deployer = BarkDeploy(launchers.remote())
>>> url = deployer(base_model='bark')
>>> model = UrlModule(url=url)
>>> res = model('Hello World!')
>>> print(res)
... <lazyllm-query>{"query": "", "files": ["path/to/bark/sound_xxx.wav"]}
Source code in lazyllm/components/deploy/text_to_speech/bark.py
class BarkDeploy(TTSBase):
    """Bark Model Deployment Class. This class is used to deploy the Bark model to a specified server for network invocation.

`__init__(self, launcher=None)`
Constructor, initializes the deployment class.

Args:
    launcher (lazyllm.launcher): An instance of the launcher used to start the remote service.

`__call__(self, finetuned_model=None, base_model=None)`
Deploys the model and returns the remote service address.

Args:
    finetuned_model (str): If provided, this model will be used for deployment; if not provided or the path is invalid, `base_model` will be used.
    base_model (str): The default model, which will be used for deployment if `finetuned_model` is invalid.
    Return (str): The URL address of the remote service.

Notes:
    - Input for infer: `str`.  The text corresponding to the audio to be generated.
    - Return of infer: The string encoded from the generated file paths, starting with the encoding flag "<lazyllm-query>", followed by the serialized dictionary. The key `files` in the dictionary stores a list, with elements being the paths of the generated audio files.
    - Supported models: [bark](https://huggingface.co/suno/bark)


Examples:
    >>> from lazyllm import launchers, UrlModule
    >>> from lazyllm.components import BarkDeploy
    >>> deployer = BarkDeploy(launchers.remote())
    >>> url = deployer(base_model='bark')
    >>> model = UrlModule(url=url)
    >>> res = model('Hello World!')
    >>> print(res)
    ... <lazyllm-query>{"query": "", "files": ["path/to/bark/sound_xxx.wav"]}
    """
    keys_name_handle = {
        'inputs': 'inputs',
    }
    message_format = {
        'inputs': 'Who are you ?',
        'voice_preset': None,
    }
    default_headers = {'Content-Type': 'application/json'}

    func = _Bark

lazyllm.components.MusicGenDeploy

Bases: TTSBase

MusicGen Model Deployment Class. This class is used to deploy the MusicGen model to a specified server for network invocation.

__init__(self, launcher=None) Constructor, initializes the deployment class.

Parameters:

  • launcher (launcher, default: None ) –

    An instance of the launcher used to start the remote service.

__call__(self, finetuned_model=None, base_model=None) Deploys the model and returns the remote service address.

Parameters:

  • finetuned_model (str) –

    If provided, this model will be used for deployment; if not provided or the path is invalid, base_model will be used.

  • base_model (str) –

    The default model, which will be used for deployment if finetuned_model is invalid.

  • Return (str) –

    The URL address of the remote service.

Notes
  • Input for infer: str. The text corresponding to the audio to be generated.
  • Return of infer: A string encoding the generated file paths. It starts with the encoding flag "<lazyllm-query>", followed by a serialized dictionary whose files key holds a list of paths to the generated audio files.
  • Supported models: musicgen-small

Examples:

>>> from lazyllm import launchers, UrlModule
>>> from lazyllm.components import MusicGenDeploy
>>> deployer = MusicGenDeploy(launchers.remote())
>>> url = deployer(base_model='musicgen-small')
>>> model = UrlModule(url=url)
>>> model('Symphony with flute as the main melody')
... <lazyllm-query>{"query": "", "files": ["path/to/musicgen/sound_xxx.wav"]}
Source code in lazyllm/components/deploy/text_to_speech/musicgen.py
class MusicGenDeploy(TTSBase):
    """MusicGen Model Deployment Class. This class is used to deploy the MusicGen model to a specified server for network invocation.

`__init__(self, launcher=None)`
Constructor, initializes the deployment class.

Args:
    launcher (lazyllm.launcher): An instance of the launcher used to start the remote service.

`__call__(self, finetuned_model=None, base_model=None)`
Deploys the model and returns the remote service address.

Args:
    finetuned_model (str): If provided, this model will be used for deployment; if not provided or the path is invalid, `base_model` will be used.
    base_model (str): The default model, which will be used for deployment if `finetuned_model` is invalid.
    Return (str): The URL address of the remote service.

Notes:
    - Input for infer: `str`.  The text corresponding to the audio to be generated.
    - Return of infer: The string encoded from the generated file paths, starting with the encoding flag "<lazyllm-query>", followed by the serialized dictionary. The key `files` in the dictionary stores a list, with elements being the paths of the generated audio files.
    - Supported models: [musicgen-small](https://huggingface.co/facebook/musicgen-small)


Examples:
    >>> from lazyllm import launchers, UrlModule
    >>> from lazyllm.components import MusicGenDeploy
    >>> deployer = MusicGenDeploy(launchers.remote())
    >>> url = deployer(base_model='musicgen-small')
    >>> model = UrlModule(url=url)
    >>> model('Symphony with flute as the main melody')
    ... <lazyllm-query>{"query": "", "files": ["path/to/musicgen/sound_xxx.wav"]}
    """
    message_format = None
    keys_name_handle = None
    default_headers = {'Content-Type': 'application/json'}
    func = _MusicGen

Speech to Text

lazyllm.components.SenseVoiceDeploy

Bases: LazyLLMDeployBase

SenseVoice Model Deployment Class. This class is used to deploy the SenseVoice model to a specified server for network invocation.

__init__(self, launcher=None, log_path=None, trust_remote_code=True, port=None, **kw) Constructor, initializes the deployment class.

Parameters:

  • launcher (Optional[LazyLLMLaunchersBase], default: None ) –

    Launcher instance, defaults to None.

  • log_path (Optional[str], default: None ) –

    Log file path, defaults to None.

  • trust_remote_code (bool, default: True ) –

    Whether to trust remote code, defaults to True.

  • port (Optional[int], default: None ) –

    Service port number, defaults to None.

Notes
  • Input for infer: str. The audio path or link.
  • Return of infer: str. The recognized content.
  • Supported models: SenseVoiceSmall

Examples:

>>> import os
>>> import lazyllm
>>> from lazyllm import launchers, UrlModule
>>> from lazyllm.components import SenseVoiceDeploy
>>> deployer = SenseVoiceDeploy(launchers.remote())
>>> url = deployer(base_model='SenseVoiceSmall')
>>> model = UrlModule(url=url)
>>> model('path/to/audio') # support format: .mp3, .wav
... xxxxxxxxxxxxxxxx
Source code in lazyllm/components/deploy/speech_to_text/sense_voice.py
class SenseVoiceDeploy(LazyLLMDeployBase):
    """SenseVoice Model Deployment Class. This class is used to deploy the SenseVoice model to a specified server for network invocation.

`__init__(self, launcher=None)`
Constructor, initializes the deployment class.

Args:
    launcher (Optional[LazyLLMLaunchersBase]): Launcher instance, defaults to None.
    log_path (Optional[str]): Log file path, defaults to None.
    trust_remote_code (bool): Whether to trust remote code, defaults to True.
    port (Optional[int]): Service port number, defaults to None.

Notes:
    - Input for infer: `str`. The audio path or link.
    - Return of infer: `str`. The recognized content.
    - Supported models: [SenseVoiceSmall](https://huggingface.co/FunAudioLLM/SenseVoiceSmall)


Examples:
    >>> import os
    >>> import lazyllm
    >>> from lazyllm import launchers, UrlModule
    >>> from lazyllm.components import SenseVoiceDeploy
    >>> deployer = SenseVoiceDeploy(launchers.remote())
    >>> url = deployer(base_model='SenseVoiceSmall')
    >>> model = UrlModule(url=url)
    >>> model('path/to/audio') # support format: .mp3, .wav
    ... xxxxxxxxxxxxxxxx
    """
    keys_name_handle = {
        'inputs': 'inputs',
        'audio': 'audio',
    }
    message_format = {
        'inputs': 'Who are you ?',
        'audio': None,
    }
    default_headers = {'Content-Type': 'application/json'}

    def __init__(self, launcher: Optional[LazyLLMLaunchersBase] = None,
                 log_path: Optional[str] = None, trust_remote_code: bool = True, port: Optional[int] = None, **kw):
        super().__init__(launcher=launcher)
        self._log_path = log_path
        self._trust_remote_code = trust_remote_code
        self._port = port

    def __call__(self, finetuned_model=None, base_model=None):
        if not finetuned_model:
            finetuned_model = base_model
        elif not os.path.exists(finetuned_model) or \
            not any(file.endswith(('.pt', '.bin', '.safetensors'))
                    for _, _, filenames in os.walk(finetuned_model) for file in filenames):
            LOG.warning(f'Note! That finetuned_model({finetuned_model}) is an invalid path, '
                        f'base_model({base_model}) will be used')
            finetuned_model = base_model
        return lazyllm.deploy.RelayServer(port=self._port, func=SenseVoice(finetuned_model), launcher=self._launcher,
                                          log_path=self._log_path, cls='sensevoice')()

lazyllm.components.deploy.speech_to_text.sense_voice.SenseVoice

Bases: object

The SenseVoice class encapsulates FunASR-based speech-to-text model loading and invocation.
It supports lazy initialization, automatic model downloading, and accepts string paths, URLs, or dicts containing audio.
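The accepted input shapes can be summarized in a short sketch that follows the handling in SenseVoice.__call__ shown in the source listing for this class: a dict with a non-empty audio entry resolves to that entry (the last element if it is a list), otherwise to inputs. pick_audio_source is a hypothetical name, and unlike the original it uses dict.get and raises TypeError instead of asserting.

```python
def pick_audio_source(request):
    """Hypothetical helper mirroring the input normalization of SenseVoice.__call__."""
    if isinstance(request, dict):
        audio = request.get('audio')
        if audio:
            # A list of audio entries resolves to its last element.
            request = audio[-1] if isinstance(audio, list) else audio
        else:
            request = request['inputs']
    if not isinstance(request, str):
        raise TypeError('expected a path, a URL, or a dict containing one')
    return request.strip()


from_list = pick_audio_source({'inputs': 'ignored', 'audio': ['a.wav', 'b.wav']})
from_inputs = pick_audio_source({'inputs': ' https://example.com/c.mp3 ', 'audio': None})
from_string = pick_audio_source('path/to/d.wav\n')
```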

Parameters:

  • base_path (str) –

    Model path or identifier, downloaded locally via ModelManager.

  • source (Optional[str], default: None ) –

    Model source, defaults to lazyllm.config['model_source'] if not specified.

  • init (bool, default: False ) –

    Whether to load the model immediately during initialization. Defaults to False.

Attributes:

  • base_path (str) –

    Resolved local path of the downloaded model.

  • model (Optional[AutoModel]) –

    Instance of the FunASR speech recognition model, available after initialization.

  • init_flag

    A flag used for lazy loading, ensuring the model is loaded only once.
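The load-once behaviour behind init and init_flag can be sketched without LazyLLM. OnceFlag and call_once below are hypothetical stand-ins for lazyllm.once_flag and lazyllm.call_once, whose internals are not shown on this page; the sketch only demonstrates the pattern of deferring an expensive load until first use.

```python
import threading


class OnceFlag:
    """Hypothetical stand-in for lazyllm.once_flag: records whether a call already ran."""

    def __init__(self):
        self.done = False
        self.lock = threading.Lock()


def call_once(flag, func):
    """Run func at most once per flag, even with concurrent callers."""
    with flag.lock:
        if not flag.done:
            func()
            flag.done = True


class LazyModel:
    def __init__(self, init=False):
        self.model = None
        self.load_count = 0
        self.init_flag = OnceFlag()
        if init:
            call_once(self.init_flag, self.load)

    def load(self):
        self.load_count += 1
        self.model = object()  # placeholder for the expensive model load

    def __call__(self, text):
        call_once(self.init_flag, self.load)
        return text.upper()  # placeholder for inference


lazy = LazyModel()
loaded_before_call = lazy.model is not None
lazy('first')
lazy('second')
eager = LazyModel(init=True)
```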

Source code in lazyllm/components/deploy/speech_to_text/sense_voice.py
class SenseVoice(object):
    """The SenseVoice class encapsulates FunASR-based speech-to-text model loading and invocation.  
It supports lazy initialization, automatic model downloading, and accepts string paths, URLs, or dicts containing audio.  

Args:
    base_path (str): Model path or identifier, downloaded locally via ModelManager.  
    source (Optional[str]): Model source, defaults to ``lazyllm.config['model_source']`` if not specified.  
    init (bool): Whether to load the model immediately during initialization. Defaults to ``False``.  

Attributes:
    base_path (str): Resolved local path of the downloaded model.  
    model (Optional[funasr.AutoModel]): Instance of the FunASR speech recognition model, available after initialization.  
    init_flag: A flag used for lazy loading, ensuring the model is loaded only once.  
"""
    def __init__(self, base_path, source=None, init=False):
        source = lazyllm.config['model_source'] if not source else source
        self.base_path = ModelManager(source).download(base_path) or ''
        self.model = None
        self.init_flag = lazyllm.once_flag()
        if init:
            lazyllm.call_once(self.init_flag, self.load_stt)

    def load_stt(self):
        """Initializes and loads the FunASR speech-to-text model. Supports Huawei NPU acceleration if `torch_npu` is available.

Uses `fsmn-vad` for voice activity detection (VAD), supporting long utterances.
Maximum single segment duration is set to 30 seconds.
Default inference device is `cuda:0` (GPU).

The loaded model is assigned to `self.model` for subsequent audio transcription.

Note:
- If the environment has `torch_npu` installed, the method will import it to enable Ascend NPU acceleration.
"""
        if importlib.util.find_spec('torch_npu') is not None:
            import torch_npu  # noqa F401
            from torch_npu.contrib import transfer_to_npu  # noqa F401

        self.model = funasr.AutoModel(
            model=self.base_path,
            trust_remote_code=False,
            vad_model='fsmn-vad',
            vad_kwargs={'max_single_segment_time': 30000},
            device='cuda:0',
        )

    def __call__(self, string):
        lazyllm.call_once(self.init_flag, self.load_stt)
        if isinstance(string, dict):
            if string['audio']:
                string = string['audio'][-1] if isinstance(string['audio'], list) else string['audio']
            else:
                string = string['inputs']
        assert isinstance(string, str)
        string = string.strip()
        try:
            string = _base64_to_file(string) if _is_base64_with_mime(string) else string
        except Exception as e:
            LOG.error(f'Error processing base64 encoding: {e}')
            return f'Error processing base64 encoding {e}'
        if not string.endswith(supported_formats):
            return f'Only {", ".join(supported_formats)} formats in the form of file paths or URLs are supported.'
        if not is_valid_path(string) and not is_valid_url(string):
            return f'This {string} is not a valid URL or file path. Please check.'
        res = self.model.generate(
            input=string,
            cache={},
            language='auto',  # 'zh', 'en', 'yue', 'ja', 'ko', 'nospeech'
            use_itn=True,
            batch_size_s=60,
            merge_vad=True,
            merge_length_s=15,
        )
        text = funasr.utils.postprocess_utils.rich_transcription_postprocess(res[0]['text'])
        return text

    @classmethod
    def rebuild(cls, base_path, init):
        """Class method to reconstruct a `SenseVoice` instance during deserialization (e.g., with `cloudpickle`).  

Args:
    base_path (str): Path to the speech-to-text model.  
    init (bool): Whether to initialize and load the model upon instantiation.

**Returns:**

- SenseVoice: A new `SenseVoice` instance, used for serialization/multiprocessing compatibility.
"""
        return cls(base_path, init=init)

    def __reduce__(self):
        init = bool(os.getenv('LAZYLLM_ON_CLOUDPICKLE', None) == 'ON' or self.init_flag)
        return SenseVoice.rebuild, (self.base_path, init)
load_stt()

Initializes and loads the FunASR speech-to-text model. Supports Huawei NPU acceleration if torch_npu is available.

Uses fsmn-vad for voice activity detection (VAD), supporting long utterances. Maximum single segment duration is set to 30 seconds. Default inference device is cuda:0 (GPU).

The loaded model is assigned to self.model for subsequent audio transcription.

Note: If the environment has torch_npu installed, the method imports it to enable Ascend NPU acceleration.
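The optional-dependency check relies on importlib.util.find_spec, which reports whether a module can be found without importing it. A minimal sketch, where is_installed is a hypothetical helper name:

```python
import importlib.util


def is_installed(module_name):
    """Return True if a top-level module can be located, without importing it."""
    return importlib.util.find_spec(module_name) is not None


# On machines without the Ascend toolchain this is simply False.
has_npu = is_installed('torch_npu')
```

The helper is meant for top-level module names only; for dotted names find_spec imports the parent package first and can raise if that package is missing.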

Source code in lazyllm/components/deploy/speech_to_text/sense_voice.py
    def load_stt(self):
        """Initializes and loads the FunASR speech-to-text model. Supports Huawei NPU acceleration if `torch_npu` is available.

Uses `fsmn-vad` for voice activity detection (VAD), supporting long utterances.
Maximum single segment duration is set to 30 seconds.
Default inference device is `cuda:0` (GPU).

The loaded model is assigned to `self.model` for subsequent audio transcription.

Note:
- If the environment has `torch_npu` installed, the method will import it to enable Ascend NPU acceleration.
"""
        if importlib.util.find_spec('torch_npu') is not None:
            import torch_npu  # noqa F401
            from torch_npu.contrib import transfer_to_npu  # noqa F401

        self.model = funasr.AutoModel(
            model=self.base_path,
            trust_remote_code=False,
            vad_model='fsmn-vad',
            vad_kwargs={'max_single_segment_time': 30000},
            device='cuda:0',
        )
rebuild(base_path, init) classmethod

Class method to reconstruct a SenseVoice instance during deserialization (e.g., with cloudpickle).

Parameters:

  • base_path (str) –

    Path to the speech-to-text model.

  • init (bool) –

    Whether to initialize and load the model upon instantiation.

Returns:

  • SenseVoice: A new SenseVoice instance, used for serialization/multiprocessing compatibility.
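The role of rebuild becomes clearer next to __reduce__: a pickler calls __reduce__ to obtain a callable plus arguments, and the unpickler later calls that callable to recreate the object. The sketch below performs these two steps by hand on a hypothetical Handle class, so the loaded model itself never has to be serialized. It uses a simplified init rule and is not the LazyLLM implementation.

```python
class Handle:
    """Hypothetical stand-in for a class that holds a loaded model."""

    def __init__(self, base_path, init=False):
        self.base_path = base_path
        self.model = None
        if init:
            self.load()

    def load(self):
        self.model = object()  # placeholder for a model that should not be serialized

    @classmethod
    def rebuild(cls, base_path, init):
        return cls(base_path, init=init)

    def __reduce__(self):
        # Serialize only the constructor arguments; the model is reloaded on rebuild.
        return Handle.rebuild, (self.base_path, self.model is not None)


original = Handle('path/to/model', init=True)
func, args = original.__reduce__()  # what a pickler would record
restored = func(*args)              # what an unpickler would execute
```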
Source code in lazyllm/components/deploy/speech_to_text/sense_voice.py
    @classmethod
    def rebuild(cls, base_path, init):
        """Class method to reconstruct a `SenseVoice` instance during deserialization (e.g., with `cloudpickle`).  

Args:
    base_path (str): Path to the speech-to-text model.  
    init (bool): Whether to initialize and load the model upon instantiation.

**Returns:**

- SenseVoice: A new `SenseVoice` instance, used for serialization/multiprocessing compatibility.
"""
        return cls(base_path, init=init)

ModelManager

lazyllm.components.ModelManager

ModelManager is a utility class in LazyLLM for managing and downloading models, supporting local search and Huggingface/Modelscope downloads.

Parameters:

  • model_source (Optional[str]) –

    Model download source, only huggingface or modelscope supported. Defaults to LAZYLLM_MODEL_SOURCE, and modelscope if unset.

  • token (Optional[str], default: config['model_source_token'] ) –

    Access token for private models. Defaults to LAZYLLM_MODEL_SOURCE_TOKEN.

  • model_path (Optional[str], default: config['model_path'] ) –

    Colon-separated list of local absolute paths to search before download. Defaults to LAZYLLM_MODEL_PATH.

  • cache_dir (Optional[str], default: config['model_cache_dir'] ) –

    Directory for downloaded models. Defaults to LAZYLLM_MODEL_CACHE_DIR, or ~/.lazyllm/model.

Static Methods

  • get_model_type(model: str) -> str –

    Returns the model type, e.g., llm or chat; returns llm if unrecognized.

  • get_model_prompt_keys(model: str) -> dict –

    Returns the prompt key mapping dictionary for the model.

  • validate_model_path(model_path: str) -> bool –

    Checks whether the directory contains valid model files (extensions: .pt, .bin, .safetensors).

Instance Methods

  • download(model: Optional[str] = '', call_back: Optional[Callable] = None) -> str | bool –

    Downloads the specified model. Process:

    1. Search the local directories listed in model_path;
    2. If not found, search in cache_dir;
    3. If still not found, download from model_source to cache_dir.

    Parameters:

    • model (Optional[str]): Target model name, either an abbreviated name or the full name used by the source.

    • call_back (Optional[Callable]): Optional callback function for download progress.
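The lookup order described above can be sketched as a standalone function. This is a simplified illustration under stated assumptions, not the ModelManager implementation: resolve_model and is_nonempty_dir are hypothetical names, the download argument is a caller-supplied stand-in for the hub downloader, and the name mapping and symbolic-link handling of the real class are left out.

```python
import os
import tempfile


def is_nonempty_dir(path):
    """A directory counts as a model candidate if it exists and is not empty."""
    return os.path.isdir(path) and any(True for _ in os.scandir(path))


def resolve_model(model, model_path, cache_dir, download):
    """Hypothetical sketch of the lookup order: model_path, then cache_dir, then download."""
    # model_path is a colon-separated list; entries that are not absolute are skipped.
    search_dirs = [p for p in model_path.split(':') if p and os.path.isabs(p)]
    for root in search_dirs + [cache_dir]:
        candidate = os.path.join(root, model)
        if is_nonempty_dir(candidate):
            return candidate
    return download(model, os.path.join(cache_dir, model))


with tempfile.TemporaryDirectory() as local, tempfile.TemporaryDirectory() as cache:
    os.makedirs(os.path.join(local, 'demo-model'))
    open(os.path.join(local, 'demo-model', 'config.json'), 'w').close()
    requested = []

    def fake_download(name, target):
        # Records the request instead of contacting a model hub.
        requested.append(name)
        return target

    found = resolve_model('demo-model', f'relative/dir:{local}', cache, fake_download)
    found_locally = found == os.path.join(local, 'demo-model')
    fetched = resolve_model('other-model', local, cache, fake_download)
    fetched_to_cache = fetched == os.path.join(cache, 'other-model')
```

Passing the downloader as a parameter keeps the sketch runnable offline and makes the fallback step easy to observe.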

Examples:

>>> from lazyllm.components import ModelManager
>>> downloader = ModelManager(model_source='modelscope')
>>> downloader.download('chatglm3-6b')
Source code in lazyllm/components/utils/downloader/model_downloader.py
class ModelManager():
    """ModelManager is a utility class in LazyLLM for managing and downloading models, supporting local search and Huggingface/Modelscope downloads.  

Args:
    model_source (Optional[str]): Model download source, only ``huggingface`` or ``modelscope`` supported.
        Defaults to LAZYLLM_MODEL_SOURCE, and ``modelscope`` if unset.
    token (Optional[str]): Access token for private models. Defaults to LAZYLLM_MODEL_SOURCE_TOKEN.
    model_path (Optional[str]): Colon-separated list of local absolute paths to search before download. Defaults to LAZYLLM_MODEL_PATH.
    cache_dir (Optional[str]): Directory for downloaded models. Defaults to LAZYLLM_MODEL_CACHE_DIR, or ``~/.lazyllm/model``.

Static Methods:
    get_model_type(model: str) -> str
        Returns model type, e.g., ``llm`` or ``chat``; returns ``llm`` if unrecognized.
    get_model_prompt_keys(model: str) -> dict
        Returns the prompt key mapping dictionary for the model.
    validate_model_path(model_path: str) -> bool
        Checks if directory contains valid model files (extensions: ``.pt``, ``.bin``, ``.safetensors``).

Instance Methods:
    download(model: Optional[str] = '', call_back: Optional[Callable] = None) -> str | bool
        Downloads the specified model. Process:
        1. Search in local directories listed in model_path;
        2. If not found, search in cache_dir;
        3. If still not found, download from model_source to cache_dir.

        Args:
            model (Optional[str]): Target model name, can be abbreviated or full name from source.
            call_back (Optional[Callable]): Optional callback function for download progress.


Examples:
    >>> from lazyllm.components import ModelManager
    >>> downloader = ModelManager(model_source='modelscope')
    >>> downloader.download('chatglm3-6b')
    """
    def __init__(self, model_source, token=lazyllm.config['model_source_token'],
                 cache_dir=lazyllm.config['model_cache_dir'], model_path=lazyllm.config['model_path']):
        self.model_source = model_source or lazyllm.config['model_source']
        self.token = token or None
        self.cache_dir = cache_dir
        self.model_paths = model_path.split(':') if len(model_path) > 0 else []
        if self.model_source == 'huggingface':
            self.hub_downloader = _HuggingfaceDownloader(token=self.token)
        else:
            self.hub_downloader = _ModelscopeDownloader(token=self.token)
            if self.model_source != 'modelscope':
                lazyllm.LOG.error('Only support Huggingface and Modelscope currently. '
                                  f'Unsupported model source: {self.model_source}. Forcing use of Modelscope.')

    @staticmethod
    @functools.lru_cache
    def get_model_type(model) -> str:
        """Retrieve the type of a model (e.g., LLM, VLM) based on its name.

Args:
    model (str): Model name or path, must be a non-empty string.

**Returns:**

- str: Model type, returns ``llm`` if no match is found.
"""
        assert isinstance(model, str) and len(model) > 0, f'model name should be a non-empty string, get {model}'
        __class__._try_add_mapping(model)
        for name, info in model_name_mapping.items():
            if 'type' not in info: continue

            model_name_set = {name.casefold()}
            for source in info['source']:
                model_name_set.add(info['source'][source].split('/')[-1].casefold())

            if model.split(os.sep)[-1].casefold() in model_name_set:
                return info['type']
        return infer_model_type(model)

    @staticmethod
    @functools.lru_cache
    def _get_model_name(model) -> str:
        search_string = os.path.basename(model)
        __class__._try_add_mapping(search_string)
        for model_name, sources in model_name_mapping.items():
            if model_name.lower() == search_string.lower() or any(
                    os.path.basename(source_file).lower() == search_string.lower()
                    for source_file in sources['source'].values()):
                return model_name
        return ''

    @staticmethod
    @functools.lru_cache
    def get_model_prompt_keys(model) -> dict:
        """Get the prompt key mapping dictionary for the specified model, used for constructing inputs during inference.  

Args:
    model (str): Model name or path.

**Returns:**

- dict: The prompt key mapping for the model, or an empty dictionary if none exists
"""
        model_name = __class__._get_model_name(model)
        __class__._try_add_mapping(model_name)
        if model_name and 'prompt_keys' in model_name_mapping[model_name.lower()]:
            return model_name_mapping[model_name.lower()]['prompt_keys']
        else:
            return dict()

    @staticmethod
    def validate_model_path(model_path):
        """Check whether the specified path contains valid model files (.pt, .bin, .safetensors).  

Args:
    model_path (str): Path to the model directory.

**Returns:**

- bool: True if model files exist in the directory, False otherwise
"""
        extensions = {'.pt', '.bin', '.safetensors'}
        for _, _, files in os.walk(model_path):
            for file in files:
                if any(file.endswith(ext) for ext in extensions):
                    return True
        return False

    @staticmethod
    def _try_add_mapping(model):
        model_base = os.path.basename(model)
        model = model_base.lower()
        if model in model_name_mapping.keys():
            return
        matched_model_prefix = next((key for key in model_provider if model.startswith(key)), None)
        if matched_model_prefix:
            matching_keys = [key for key in model_groups.keys() if key in model]
            if matching_keys:
                matched_groups = max(matching_keys, key=len)
                model_name_mapping[model] = {
                    'prompt_keys': model_groups[matched_groups]['prompt_keys'],
                    'source': {k: v + '/' + model_base for k, v in model_provider[matched_model_prefix].items()}
                }

    def download(self, model='', call_back=None):
        """Download the specified model by name. If it already exists locally, return the local path.  
Supports automatic download from Huggingface and Modelscope, creating a symbolic link in the cache directory for unified management.  

Args:
    model (str, optional): Model name or path, defaults to empty string which means no download.
    call_back (Optional[Callable], optional): Callback function for download progress, receives current download status.  

**Returns:**

- str | bool: Full local path to the model, or False if the download fails
"""
        assert isinstance(model, str), 'model name should be a string.'
        if len(model) == 0 or model[0] in (os.sep, '.', '~') or os.path.isabs(model): return model
        if (model_at_path := self._model_exists_at_path(model)): return model_at_path
        if self.model_source == '' or self.model_source not in ('huggingface', 'modelscope'):
            lazyllm.LOG.error('model automatic downloads only support Huggingface and Modelscope currently.')
            return model

        self._try_add_mapping(model)
        if model_name_mapping.get(model.lower(), {}).get('download_by_other'): return model

        if model.lower() in model_name_mapping.keys() and \
                self.model_source in model_name_mapping[model.lower()]['source'].keys():
            full_model_dir = os.path.join(self.cache_dir, model)

            mapped_model_name = model_name_mapping[model.lower()]['source'][self.model_source]
            model_save_dir = self._do_download(mapped_model_name, call_back)
            if model_save_dir:
                # The code safely creates a symbolic link by removing any existing target.
                if os.path.exists(full_model_dir):
                    os.remove(full_model_dir)
                if os.path.islink(full_model_dir):
                    os.unlink(full_model_dir)
                os.symlink(model_save_dir, full_model_dir, target_is_directory=True)
                return full_model_dir
            return model_save_dir  # return False
        else:
            model_name_for_download = model

            if '/' not in model_name_for_download:
                # Try to figure out a possible model provider
                matched_model_prefix = next((key for key in model_provider if model.lower().startswith(key)), None)
                if matched_model_prefix and self.model_source in model_provider[matched_model_prefix]:
                    model_name_for_download = model_provider[matched_model_prefix][self.model_source] + '/' + model

            model_save_dir = self._do_download(model_name_for_download, call_back)
            return model_save_dir

    def _validate_token(self):
        return self.hub_downloader.verify_hub_token()

    def _validate_model_id(self, model_id):
        return self.hub_downloader._verify_model_id(model_id)

    def _model_exists_at_path(self, model_name):
        if len(self.model_paths) == 0:
            return None
        model_dirs = []

        # For short model name, get all possible names from the mapping.
        if model_name.lower() in model_name_mapping.keys():
            for source in ('huggingface', 'modelscope'):
                if source in model_name_mapping[model_name.lower()]['source'].keys():
                    model_dirs.append(model_name_mapping[model_name.lower()]['source'][source].replace('/', os.sep))
        model_dirs.append(model_name.replace('/', os.sep))

        for model_path in self.model_paths:
            if len(model_path) == 0: continue
            if model_path[0] != os.sep:
                lazyllm.LOG.warning(f'skipping path {model_path} as only absolute paths are accepted.')
                continue
            for model_dir in model_dirs:
                full_model_dir = os.path.join(model_path, model_dir)
                if self._is_model_valid(full_model_dir):
                    return full_model_dir
        return None

    def _is_model_valid(self, model_dir):
        if not os.path.isdir(model_dir):
            return False
        return any((True for _ in os.scandir(model_dir)))

    def _do_download(self, model='', call_back=None):
        model_dir = model.replace('/', os.sep)
        full_model_dir = os.path.join(self.cache_dir, self.model_source, model_dir)

        try:
            return self.hub_downloader.download(model, full_model_dir, call_back)
        # Use `BaseException` to capture `KeyboardInterrupt` and normal `Exception`.
        except BaseException as e:  # noqa B036
            lazyllm.LOG.warning(f'Download encountered an error: {e}')
            if '401' in str(e) or 'Client Error' in str(e):
                raise RuntimeError('Authentication failed (401 Error). Please check your access token and '
                                   'permissions.  And set the token with the environment variable '
                                   'LAZYLLM_MODEL_SOURCE_TOKEN.')
            if not self.token and 'Permission denied' not in str(e):
                lazyllm.LOG.warning('Token is empty, which may prevent private models from being downloaded, '
                                    'as indicated by "the model does not exist." Please set the token with the '
                                    'environment variable LAZYLLM_MODEL_SOURCE_TOKEN to download private models.')
            raise RuntimeError(f'Model download failed for model: {model}, with error: {e}')
        return False

get_model_type(model) cached staticmethod

Retrieve the type of a model (e.g., LLM, VLM) based on its name.

Parameters:

  • model (str) –

    Model name or path, must be a non-empty string.

Returns:

  • str: Model type, returns llm if no match is found.
Source code in lazyllm/components/utils/downloader/model_downloader.py
    @staticmethod
    @functools.lru_cache
    def get_model_type(model) -> str:
        """Retrieve the type of a model (e.g., LLM, VLM) based on its name.

Args:
    model (str): Model name or path, must be a non-empty string.

**Returns:**

- str: Model type, returns ``llm`` if no match is found.
"""
        assert isinstance(model, str) and len(model) > 0, f'model name should be a non-empty string, get {model}'
        __class__._try_add_mapping(model)
        for name, info in model_name_mapping.items():
            if 'type' not in info: continue

            model_name_set = {name.casefold()}
            for source in info['source']:
                model_name_set.add(info['source'][source].split('/')[-1].casefold())

            if model.split(os.sep)[-1].casefold() in model_name_set:
                return info['type']
        return infer_model_type(model)
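
The matching rule above compares the last path component of the input, case-insensitively, against each mapping key and the final segment of each source repository id. A minimal standalone sketch of that lookup follows. It assumes a hypothetical mapping (`EXAMPLE_MAPPING`) and a plain `default` fallback in place of `infer_model_type`, whose behavior is not shown here; none of these names come from LazyLLM.

```python
import os

# Hypothetical mapping for illustration only; the real model_name_mapping
# lives inside LazyLLM and is not reproduced here.
EXAMPLE_MAPPING = {
    'demo-chat-7b': {'type': 'llm', 'source': {'huggingface': 'demo-org/Demo-Chat-7B'}},
    'demo-tts': {'type': 'tts', 'source': {'modelscope': 'demo-org/demo-tts'}},
    'demo-untyped': {'source': {'huggingface': 'demo-org/demo-untyped'}},
}


def lookup_model_type(model: str, mapping: dict, default: str = 'llm') -> str:
    """Return the mapped type for a model name or path, else `default`."""
    # Only the last path component takes part in the comparison.
    basename = model.split(os.sep)[-1].casefold()
    for name, info in mapping.items():
        if 'type' not in info:
            continue  # entries without a type can never match
        candidates = {name.casefold()}
        # Also accept the repository name from each source, e.g. "org/Name" -> "name".
        candidates.update(repo.split('/')[-1].casefold() for repo in info['source'].values())
        if basename in candidates:
            return info['type']
    # Assumption: a fixed default stands in for infer_model_type here.
    return default
```

With this sketch, `lookup_model_type(os.path.join('models', 'demo-tts'), EXAMPLE_MAPPING)` resolves through the path's last component, while an entry that lacks a `type` key falls through to the default.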

get_model_prompt_keys(model) cached staticmethod

Get the prompt key mapping dictionary for the specified model, used for constructing inputs during inference.

Parameters:

  • model (str) –

    Model name or path.

Returns:

  • dict: The prompt key mapping for the model, or an empty dictionary if none exists
Source code in lazyllm/components/utils/downloader/model_downloader.py
    @staticmethod
    @functools.lru_cache
    def get_model_prompt_keys(model) -> dict:
        """Get the prompt key mapping dictionary for the specified model, used for constructing inputs during inference.  

Args:
    model (str): Model name or path.

**Returns:**

- dict: The prompt key mapping for the model, or an empty dictionary if none exists
"""
        model_name = __class__._get_model_name(model)
        __class__._try_add_mapping(model_name)
        if model_name and 'prompt_keys' in model_name_mapping[model_name.lower()]:
            return model_name_mapping[model_name.lower()]['prompt_keys']
        else:
            return dict()

validate_model_path(model_path) staticmethod

Check whether the specified path contains valid model files (.pt, .bin, .safetensors).

Parameters:

  • model_path (str) –

    Path to the model directory.

Returns:

  • bool: True if model files exist in the directory, False otherwise
Source code in lazyllm/components/utils/downloader/model_downloader.py
    @staticmethod
    def validate_model_path(model_path):
        """Check whether the specified path contains valid model files (.pt, .bin, .safetensors).  

Args:
    model_path (str): Path to the model directory.

**Returns:**

- bool: True if model files exist in the directory, False otherwise
"""
        extensions = {'.pt', '.bin', '.safetensors'}
        for _, _, files in os.walk(model_path):
            for file in files:
                if any(file.endswith(ext) for ext in extensions):
                    return True
        return False
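
To see what the recursive extension check accepts, the following sketch repeats the same walk in a standalone function and runs it against a temporary directory. The names `has_model_files` and `demo` are illustrative and not part of LazyLLM.

```python
import os
import tempfile

MODEL_EXTENSIONS = ('.pt', '.bin', '.safetensors')


def has_model_files(model_path: str) -> bool:
    """Return True if any file under model_path ends with a known weight extension."""
    for _, _, files in os.walk(model_path):
        # str.endswith accepts a tuple of suffixes.
        if any(name.endswith(MODEL_EXTENSIONS) for name in files):
            return True
    return False


def demo():
    """Check an empty directory, then the same directory with a nested weight file."""
    with tempfile.TemporaryDirectory() as root:
        before = has_model_files(root)
        nested = os.path.join(root, 'weights')
        os.makedirs(nested)
        # An empty placeholder file is enough: only the file name is inspected.
        open(os.path.join(nested, 'model.safetensors'), 'w').close()
        after = has_model_files(root)
    return before, after
```

Because only file names are inspected, an empty or corrupt file with a matching extension still counts as valid, and a path that does not exist simply yields `False`.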

download(model='', call_back=None)

Download the specified model by name. If it already exists locally, return the local path.
Supports automatic download from Huggingface and Modelscope, creating a symbolic link in the cache directory for unified management.

Parameters:

  • model (str, default: '' ) –

    Model name or path, defaults to empty string which means no download.

  • call_back (Optional[Callable], default: None ) –

    Callback function for download progress, receives current download status.

Returns:

  • str | bool: Full local path to the model, or False if the download fails
Source code in lazyllm/components/utils/downloader/model_downloader.py
    def download(self, model='', call_back=None):
        """Download the specified model by name. If it already exists locally, return the local path.  
Supports automatic download from Huggingface and Modelscope, creating a symbolic link in the cache directory for unified management.  

Args:
    model (str, optional): Model name or path, defaults to empty string which means no download.
    call_back (Optional[Callable], optional): Callback function for download progress, receives current download status.  

**Returns:**

- str | bool: Full local path to the model, or False if the download fails
"""
        assert isinstance(model, str), 'model name should be a string.'
        if len(model) == 0 or model[0] in (os.sep, '.', '~') or os.path.isabs(model): return model
        if (model_at_path := self._model_exists_at_path(model)): return model_at_path
        if self.model_source == '' or self.model_source not in ('huggingface', 'modelscope'):
            lazyllm.LOG.error('model automatic downloads only support Huggingface and Modelscope currently.')
            return model

        self._try_add_mapping(model)
        if model_name_mapping.get(model.lower(), {}).get('download_by_other'): return model

        if model.lower() in model_name_mapping.keys() and \
                self.model_source in model_name_mapping[model.lower()]['source'].keys():
            full_model_dir = os.path.join(self.cache_dir, model)

            mapped_model_name = model_name_mapping[model.lower()]['source'][self.model_source]
            model_save_dir = self._do_download(mapped_model_name, call_back)
            if model_save_dir:
                # The code safely creates a symbolic link by removing any existing target.
                if os.path.exists(full_model_dir):
                    os.remove(full_model_dir)
                if os.path.islink(full_model_dir):
                    os.unlink(full_model_dir)
                os.symlink(model_save_dir, full_model_dir, target_is_directory=True)
                return full_model_dir
            return model_save_dir  # return False
        else:
            model_name_for_download = model

            if '/' not in model_name_for_download:
                # Try to figure out a possible model provider
                matched_model_prefix = next((key for key in model_provider if model.lower().startswith(key)), None)
                if matched_model_prefix and self.model_source in model_provider[matched_model_prefix]:
                    model_name_for_download = model_provider[matched_model_prefix][self.model_source] + '/' + model

            model_save_dir = self._do_download(model_name_for_download, call_back)
            return model_save_dir
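
Two decisions in the method above are easy to miss: inputs that look like local paths are returned unchanged, and a bare model name without `/` may get a provider prefix before download. The sketch below isolates both rules. `EXAMPLE_PROVIDERS` is a hypothetical stand-in for LazyLLM's `model_provider` table, and the function names are illustrative only.

```python
import os

# Hypothetical provider table; the real model_provider is defined inside LazyLLM.
EXAMPLE_PROVIDERS = {'demo': {'huggingface': 'demo-org'}}


def is_local_path(model: str) -> bool:
    """Mirror the early-return check: empty, relative-looking, home or absolute paths."""
    return len(model) == 0 or model[0] in (os.sep, '.', '~') or os.path.isabs(model)


def resolve_repo_id(model: str, source: str, providers: dict) -> str:
    """Prefix a bare model name with a provider organisation when one matches."""
    if '/' in model:
        return model  # already a full repository id
    prefix = next((key for key in providers if model.lower().startswith(key)), None)
    if prefix and source in providers[prefix]:
        return providers[prefix][source] + '/' + model
    return model  # no known provider: try the name as given
```

Under these assumptions, `resolve_repo_id('Demo-7B', 'huggingface', EXAMPLE_PROVIDERS)` gives `'demo-org/Demo-7B'`, while the same name with a source missing from the table is passed through unchanged.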

Formatter

lazyllm.components.formatter.LazyLLMFormatterBase

Base class for formatters. A formatter post-processes the model's output. Users can subclass it to define a custom formatter or use one of the formatters provided by LazyLLM.

Examples:

>>> from lazyllm.components.formatter import LazyLLMFormatterBase
>>> class MyFormatter(LazyLLMFormatterBase):
...     def __init__(self, formatter: str = None):
...         self._formatter = formatter
...         if self._formatter:
...             self._parse_formatter()
...         else:
...             self._slices = None
...     def _parse_formatter(self):
...         slice_str = self._formatter.strip()[1:-1]
...         slices = []
...         parts = slice_str.split(":")
...         start = int(parts[0]) if parts[0] else None
...         end = int(parts[1]) if len(parts) > 1 and parts[1] else None
...         step = int(parts[2]) if len(parts) > 2 and parts[2] else None
...         slices.append(slice(start, end, step))
...         self._slices = slices
...     def _load(self, data):
...         return [int(x) for x in data.strip('[]').split(',')]
...     def _parse_py_data_by_formatter(self, data):
...         if self._slices is not None:
...             result = []
...             for s in self._slices:
...                 if isinstance(s, slice):
...                     result.extend(data[s])
...                 else:
...                     result.append(data[int(s)])
...             return result
...         else:
...             return data
...
>>> fmt = MyFormatter("[1:3]")
>>> res = fmt.format("[1,2,3,4,5]")
>>> print(res)
[2, 3]
Source code in lazyllm/components/formatter/formatterbase.py
class LazyLLMFormatterBase(metaclass=LazyLLMRegisterMetaClass):
    """Base class for formatters. A formatter post-processes the model's output. Users can subclass it to define a custom formatter or use one of the formatters provided by LazyLLM.


Examples:
    >>> from lazyllm.components.formatter import LazyLLMFormatterBase
    >>> class MyFormatter(LazyLLMFormatterBase):
    ...     def __init__(self, formatter: str = None):
    ...         self._formatter = formatter
    ...         if self._formatter:
    ...             self._parse_formatter()
    ...         else:
    ...             self._slices = None
    ...     def _parse_formatter(self):
    ...         slice_str = self._formatter.strip()[1:-1]
    ...         slices = []
    ...         parts = slice_str.split(":")
    ...         start = int(parts[0]) if parts[0] else None
    ...         end = int(parts[1]) if len(parts) > 1 and parts[1] else None
    ...         step = int(parts[2]) if len(parts) > 2 and parts[2] else None
    ...         slices.append(slice(start, end, step))
    ...         self._slices = slices
    ...     def _load(self, data):
    ...         return [int(x) for x in data.strip('[]').split(',')]
    ...     def _parse_py_data_by_formatter(self, data):
    ...         if self._slices is not None:
    ...             result = []
    ...             for s in self._slices:
    ...                 if isinstance(s, slice):
    ...                     result.extend(data[s])
    ...                 else:
    ...                     result.append(data[int(s)])
    ...             return result
    ...         else:
    ...             return data
    ...
    >>> fmt = MyFormatter("[1:3]")
    >>> res = fmt.format("[1,2,3,4,5]")
    >>> print(res)
    [2, 3]
    """
    def _load(self, msg: str):
        return msg

    def _parse_py_data_by_formatter(self, py_data):
        raise NotImplementedError('This data parse function is not implemented.')

    def format(self, msg):
        """Format input message.

Args:
    msg: Input message, can be string or other format

**Returns:**

- Formatted data, specific type determined by subclass implementation
"""
        if _is_chat_message(msg):
            if reasoning_content := msg.get('reasoning_content'):
                msg = f'<think>{reasoning_content}</think>' + msg['content']
            else:
                msg = msg['content']
        if isinstance(msg, str): msg = self._load(msg)
        return self._parse_py_data_by_formatter(msg)

    def __call__(self, *msg):
        return self.format(msg[0] if len(msg) == 1 else package(msg))

    def __or__(self, other):
        if not isinstance(other, LazyLLMFormatterBase):
            return NotImplemented
        return PipelineFormatter(other.__ror__(self))

    def __ror__(self, f: Callable) -> Pipeline:
        if isinstance(f, Pipeline):
            if not f._capture:
                _ = Finalizer(lambda: setattr(f, '_capture', True), lambda: setattr(f, '_capture', False))
            f._add(str(uuid.uuid4().hex) if len(f._item_names) else None, self)
            return f
        return Pipeline(f, self)

format(msg)

Format input message.

Parameters:

  • msg

    Input message, can be string or other format

Returns:

  • Formatted data, specific type determined by subclass implementation
Source code in lazyllm/components/formatter/formatterbase.py
    def format(self, msg):
        """Format input message.

Args:
    msg: Input message, can be string or other format

**Returns:**

- Formatted data, specific type determined by subclass implementation
"""
        if _is_chat_message(msg):
            if reasoning_content := msg.get('reasoning_content'):
                msg = f'<think>{reasoning_content}</think>' + msg['content']
            else:
                msg = msg['content']
        if isinstance(msg, str): msg = self._load(msg)
        return self._parse_py_data_by_formatter(msg)
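
Before `_load` runs, `format` flattens a chat message into a single string, placing any reasoning text inside `<think>` tags ahead of the content. A standalone sketch of that step follows. It assumes a chat message is a dict with a `content` key, since `_is_chat_message` is not shown here, and `flatten_chat_message` is an illustrative name.

```python
def flatten_chat_message(msg):
    """Return the text a formatter would see for a chat-style message.

    Assumption: any dict with a 'content' key counts as a chat message.
    Other inputs are returned unchanged.
    """
    if isinstance(msg, dict) and 'content' in msg:
        reasoning = msg.get('reasoning_content')
        if reasoning:
            # Reasoning is kept, wrapped in <think> tags, in front of the answer.
            return f'<think>{reasoning}</think>' + msg['content']
        return msg['content']
    return msg
```

A subclass's `_load` therefore has to tolerate a leading `<think>...</think>` segment whenever the upstream model returns reasoning content.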

lazyllm.components.formatter.formatterbase.JsonLikeFormatter

Bases: LazyLLMFormatterBase

This class is used to extract subfields from nested structures (like dicts, lists, tuples) using a JSON-like indexing syntax.

The behavior is driven by a formatter string similar to Python-style slicing and dictionary access:

  • [0] fetches the first item
  • [0][{key}] accesses the key field in the first item
  • [0,1][{a,b}] fetches the a and b fields from the first and second items
  • [::2] does slicing with a step of 2
  • *[0][{x}]: a leading * makes list-style selections return a package instead of a plain list or tuple

Parameters:

  • formatter (str, default: None ) –

    A format string that controls how to slice and extract the structure. If None, the input will be returned directly.

Examples:

>>> from lazyllm.components.formatter.formatterbase import JsonLikeFormatter
>>> formatter = JsonLikeFormatter("[{a,b}]")
Source code in lazyllm/components/formatter/formatterbase.py
class JsonLikeFormatter(LazyLLMFormatterBase):
    """This class is used to extract subfields from nested structures (like dicts, lists, tuples) using a JSON-like indexing syntax.

The behavior is driven by a formatter string similar to Python-style slicing and dictionary access:

- `[0]` fetches the first item
- `[0][{key}]` accesses the `key` field in the first item
- `[0,1][{a,b}]` fetches the `a` and `b` fields from the first and second items
- `[::2]` does slicing with a step of 2
- `*[0][{x}]`: a leading `*` makes list-style selections return a `package` instead of a plain list or tuple

Args:
    formatter (str, optional): A format string that controls how to slice and extract the structure. If None, the input will be returned directly.


Examples:
    >>> from lazyllm.components.formatter.formatterbase import JsonLikeFormatter
    >>> formatter = JsonLikeFormatter("[{a,b}]")
    """
    class _ListIdxes(tuple): pass
    class _DictKeys(tuple): pass

    def __init__(self, formatter: Optional[str] = None):
        if formatter and formatter.startswith('*['):
            self._return_package = True
            self._formatter = formatter.strip('*')
        else:
            self._return_package = False
            self._formatter = formatter

        if self._formatter:
            assert '*' not in self._formatter, '`*` can only be used before `[` in the beginning'
            self._formatter = self._formatter.strip().replace('{', '[{').replace('}', '}]')
            self._parse_formatter()
        else:
            self._slices = None

    def _parse_formatter(self):
        # Remove the surrounding brackets
        assert self._formatter.startswith('[') and self._formatter.endswith(']')
        slice_str = self._formatter.strip()[1:-1]
        dimensions = slice_str.split('][')
        slices = []

        for dim in dimensions:
            if '{' in dim:
                slices.append(__class__._DictKeys(d.strip() for d in dim[1:-1].split(',') if d.strip()))
            elif ':' in dim:
                assert ',' not in dim, '[a, b:c] is not supported'
                parts = dim.split(':')
                start = int(parts[0]) if _is_number(parts[0]) else None
                end = int(parts[1]) if len(parts) > 1 and _is_number(parts[1]) else None
                step = int(parts[2]) if len(parts) > 2 and _is_number(parts[2]) else None
                slices.append(slice(start, end, step))
            elif ',' in dim:
                slices.append(__class__._ListIdxes(d.strip() for d in dim.split(',') if d.strip()))
            else:
                slices.append(dim.strip())
        self._slices = slices

    def _parse_py_data_by_formatter(self, data, *, slices=None):  # noqa C901
        def _impl(data, slice):
            if isinstance(data, (tuple, list)) and isinstance(slice, str):
                return data[int(slice)]
            if isinstance(slice, __class__._ListIdxes):
                if isinstance(data, dict): return [data[k] for k in slice]
                elif isinstance(data, (tuple, list)): return type(data)(data[int(k)] for k in slice)
                else: raise RuntimeError('Only tuple/list/dict is supported for [a,b,c]')
            if isinstance(slice, __class__._DictKeys):
                assert isinstance(data, dict)
                if len(slice) == 1 and slice[0] == ':': return data
                return {k: data[k] for k in slice}
            return data[slice]

        if slices is None: slices = self._slices
        if not slices: return data
        curr_slice = slices[0]
        if isinstance(curr_slice, slice):
            if isinstance(data, dict):
                assert curr_slice.start is None and curr_slice.stop is None and curr_slice.step is None, (
                    'Only {:} and [:] is supported in dict slice')
                curr_slice = __class__._ListIdxes(data.keys())
            elif isinstance(data, (tuple, list)):
                return type(data)(self._parse_py_data_by_formatter(d, slices=slices[1:])
                                  for d in _impl(data, curr_slice))
        if isinstance(curr_slice, __class__._DictKeys):
            return {k: self._parse_py_data_by_formatter(v, slices=slices[1:])
                    for k, v in _impl(data, curr_slice).items()}
        elif isinstance(curr_slice, __class__._ListIdxes):
            tp = package if self._return_package else list if isinstance(data, dict) else type(data)
            return tp(self._parse_py_data_by_formatter(r, slices=slices[1:]) for r in _impl(data, curr_slice))
        else: return self._parse_py_data_by_formatter(_impl(data, curr_slice), slices=slices[1:])
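
The parsing step can be hard to follow from the class alone, so here is a reduced standalone sketch of how a formatter string is split into one selector per dimension. It handles only brace key sets, slices, comma-separated indexes and single indexes, tags each selector with a label instead of using the private tuple subclasses, and uses a simple `int` conversion where the original calls `_is_number`. The names `parse_formatter` and `_to_int` are illustrative.

```python
def _to_int(text):
    """Convert a slice bound to int, or None when it is empty or not a number."""
    try:
        return int(text)
    except ValueError:
        return None


def parse_formatter(formatter: str):
    """Split a formatter string into (kind, value) selectors, one per dimension."""
    # Wrap brace groups in brackets so every dimension is delimited by "][".
    text = formatter.strip().replace('{', '[{').replace('}', '}]')
    assert text.startswith('[') and text.endswith(']')
    selectors = []
    for dim in text[1:-1].split(']['):
        if '{' in dim:
            keys = tuple(k.strip() for k in dim[1:-1].split(',') if k.strip())
            selectors.append(('keys', keys))
        elif ':' in dim:
            bounds = [_to_int(part) for part in dim.split(':')][:3]
            bounds += [None] * (3 - len(bounds))  # pad missing stop/step
            selectors.append(('slice', slice(*bounds)))
        elif ',' in dim:
            selectors.append(('indexes', tuple(i.strip() for i in dim.split(',') if i.strip())))
        else:
            selectors.append(('index', dim.strip()))
    return selectors
```

For instance, `parse_formatter('{name,age}')` yields a single `keys` selector holding `('name', 'age')`, and `parse_formatter('[::2]')` yields a single `slice` selector with a step of 2. Indexes stay as strings here, as in the class above, and are converted only when applied to data.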

lazyllm.components.formatter.formatterbase.PythonFormatter

Bases: JsonLikeFormatter

Reserved formatter class for supporting Python-style data extraction syntax. To be developed.

Currently inherits from JsonLikeFormatter with no additional behavior.

Source code in lazyllm/components/formatter/formatterbase.py
class PythonFormatter(JsonLikeFormatter):
    """Reserved formatter class for supporting Python-style data extraction syntax. To be developed.

Currently inherits from JsonLikeFormatter with no additional behavior.
"""
    pass

lazyllm.components.formatter.FileFormatter

Bases: LazyLLMFormatterBase

A formatter that transforms query strings with document context between structured formats.

Supports three modes:

  • "decode": Decodes structured query strings into dictionaries with query and files.
  • "encode": Encodes a dictionary with query and files into a structured query string.
  • "merge": Merges multiple structured query strings into one.

Parameters:

  • formatter (str, default: 'decode' ) –

    The operation mode. Must be one of "decode", "encode", or "merge". Defaults to "decode".

Examples:

>>> from lazyllm.components.formatter import FileFormatter
>>> # Decode mode
>>> fmt = FileFormatter('decode')
Source code in lazyllm/components/formatter/formatterbase.py
class FileFormatter(LazyLLMFormatterBase):
    """A formatter that transforms query strings with document context between structured formats.

Supports three modes:

- "decode": Decodes structured query strings into dictionaries with `query` and `files`.
- "encode": Encodes a dictionary with `query` and `files` into a structured query string.
- "merge": Merges multiple structured query strings into one.

Args:
    formatter (str): The operation mode. Must be one of "decode", "encode", or "merge". Defaults to "decode".


Examples:
    >>> from lazyllm.components.formatter import FileFormatter

    >>> # Decode mode
    >>> fmt = FileFormatter('decode')
    """

    def __init__(self, formatter: str = 'decode'):
        self._mode = formatter.strip().lower()
        assert self._mode in ('decode', 'encode', 'merge')

    def _parse_py_data_by_formatter(self, py_data):
        if self._mode == 'merge':
            if isinstance(py_data, str):
                return py_data
            assert isinstance(py_data, package)
            return lazyllm_merge_query(*py_data)

        if isinstance(py_data, package):
            res = []
            for i_data in py_data:
                res.append(self._parse_py_data_by_formatter(i_data))
            return package(res)
        elif isinstance(py_data, (str, dict)):
            return self._decode_one_data(py_data)
        else:
            return py_data

    def _decode_one_data(self, py_data):
        if self._mode == 'decode':
            if isinstance(py_data, str):
                return decode_query_with_filepaths(py_data)
            else:
                return py_data
        else:
            if isinstance(py_data, dict) and 'query' in py_data and 'files' in py_data:
                return encode_query_with_filepaths(**py_data)
            else:
                return py_data

lazyllm.components.formatter.YamlFormatter

Bases: JsonLikeFormatter

A formatter for extracting structured information from YAML-formatted strings.

Inherits from JsonLikeFormatter. Overrides _load to parse YAML strings into Python objects with yaml.SafeLoader, and then applies JSON-like formatting rules to extract the desired fields. If parsing fails, the error is logged and an empty string is used instead.

Suitable for handling nested YAML content with formatter-based field selection.

Examples:

>>> from lazyllm.components.formatter import YamlFormatter
>>> formatter = YamlFormatter("{name,age}")
>>> msg = """ 
... name: Alice
... age: 30
... city: London
... """
>>> formatter(msg)
{'name': 'Alice', 'age': 30}
Source code in lazyllm/components/formatter/yamlformatter.py
class YamlFormatter(JsonLikeFormatter):
    """A formatter for extracting structured information from YAML-formatted strings.

Inherits from JsonLikeFormatter. Overrides ``_load`` to parse YAML strings into Python objects with ``yaml.SafeLoader``, and then applies JSON-like formatting rules to extract the desired fields. If parsing fails, the error is logged and an empty string is used instead.

Suitable for handling nested YAML content with formatter-based field selection.


Examples:
    >>> from lazyllm.components.formatter import YamlFormatter
    >>> formatter = YamlFormatter("{name,age}")
    >>> msg = \"\"\" 
    ... name: Alice
    ... age: 30
    ... city: London
    ... \"\"\"
    >>> formatter(msg)
    {'name': 'Alice', 'age': 30}
    """
    def _load(self, msg: str):
        try:
            return yaml.load(msg, Loader=yaml.SafeLoader)
        except Exception as e:
            lazyllm.LOG.info(f'Error: {e}')
            return ''

lazyllm.components.formatter.encode_query_with_filepaths(query=None, files=None)

Encodes a query string together with associated file paths into a structured string format with context.

If file paths are provided, the query and file list are wrapped into a JSON object prefixed with LAZYLLM_QUERY_PREFIX (<lazyllm-query> in the example below). Otherwise, the original query string is returned.

Parameters:

  • query (str, default: None ) –

    The user query string. Defaults to None, which is treated as an empty string.

  • files (str or List[str], default: None ) –

    File path(s) associated with the query. Can be a single string or a list of strings.

Returns:

  • str: A structured encoded query string or the raw query.

Raises:

  • AssertionError

    If files is not a string or list of strings.

Examples:

>>> from lazyllm.components.formatter import encode_query_with_filepaths
>>> # Encode a query along with associated documentation files
>>> encode_query_with_filepaths("Generate questions based on the document", files=["a.md"])
'<lazyllm-query>{"query": "Generate questions based on the document", "files": ["a.md"]}'
Source code in lazyllm/components/formatter/formatterbase.py
def encode_query_with_filepaths(query: str = None, files: Union[str, List[str]] = None) -> str:
    """Encodes a query string together with associated file paths into a structured string format with context.

If file paths are provided, the query and file list are wrapped into a JSON object prefixed with ``LAZYLLM_QUERY_PREFIX`` (``<lazyllm-query>`` in the example below). Otherwise, the original query string is returned.

Args:
    query (str): The user query string. Defaults to None, which is treated as an empty string.
    files (str or List[str]): File path(s) associated with the query. Can be a single string or a list of strings.

**Returns:**

- str: A structured encoded query string or the raw query.

Raises:
    AssertionError: If `files` is not a string or list of strings.


Examples:
    >>> from lazyllm.components.formatter import encode_query_with_filepaths

    >>> # Encode a query along with associated documentation files
    >>> encode_query_with_filepaths("Generate questions based on the document", files=["a.md"])
    '<lazyllm-query>{"query": "Generate questions based on the document", "files": ["a.md"]}'
    """
    query = query if query else ''
    if files:
        if isinstance(files, str): files = [files]
        assert isinstance(files, list), 'files must be a list.'
        assert all(isinstance(item, str) for item in files), 'All items in files must be strings'
        return LAZYLLM_QUERY_PREFIX + json.dumps({'query': query, 'files': files})
    else:
        return query

lazyllm.components.formatter.decode_query_with_filepaths(query_files)

Decodes a structured query string into a dictionary containing the original query and file paths.

If the input string starts with the special prefix LAZYLLM_QUERY_PREFIX (<lazyllm-query> in the examples below), the remainder is parsed as JSON; otherwise, the stripped query string is returned as-is.

Parameters:

  • query_files (str) –

    The encoded query string that may include both query and file paths.

Returns:

  • Union[dict, str]: A dictionary containing 'query' and 'files' if structured, otherwise the original query string.

Raises:

  • AssertionError

    If the input is not a string.

  • ValueError

    If the string is prefixed but JSON decoding fails.

Examples:

>>> from lazyllm.components.formatter import decode_query_with_filepaths
>>> # Decode a structured query with files
>>> decode_query_with_filepaths('<lazyllm-query>{"query": "Summarize the content", "files": ["doc.md"]}')
{'query': 'Summarize the content', 'files': ['doc.md']}
>>> # Decode a plain string without files
>>> decode_query_with_filepaths("This is just a simple question")
'This is just a simple question'
Source code in lazyllm/components/formatter/formatterbase.py
def decode_query_with_filepaths(query_files: str) -> Union[dict, str]:
    """Decodes a structured query string into a dictionary containing the original query and file paths.

If the input string starts with the special prefix ``LAZYLLM_QUERY_PREFIX`` (``<lazyllm-query>`` in the examples below), the remainder is parsed as JSON; otherwise, the stripped query string is returned as-is.

Args:
    query_files (str): The encoded query string that may include both query and file paths.

**Returns:**

- Union[dict, str]: A dictionary containing 'query' and 'files' if structured, otherwise the original query string.

Raises:
    AssertionError: If the input is not a string.
    ValueError: If the string is prefixed but JSON decoding fails.


Examples:
    >>> from lazyllm.components.formatter import decode_query_with_filepaths

    >>> # Decode a structured query with files
    >>> decode_query_with_filepaths('<lazyllm-query>{"query": "Summarize the content", "files": ["doc.md"]}')
    {'query': 'Summarize the content', 'files': ['doc.md']}

    >>> # Decode a plain string without files
    >>> decode_query_with_filepaths("This is just a simple question")
    'This is just a simple question'
    """
    assert isinstance(query_files, str), 'query_files must be a str.'
    query_files = query_files.strip()
    if query_files.startswith(LAZYLLM_QUERY_PREFIX):
        try:
            obj = json.loads(query_files[len(LAZYLLM_QUERY_PREFIX):])
            return obj
        except json.JSONDecodeError as e:
            raise ValueError(f'JSON parsing failed: {e}')
    else:
        return query_files
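
The encoding and decoding functions are inverses whenever files are present, which a small round trip makes concrete. The sketch below reimplements both with only the standard library. It assumes the prefix is `<lazyllm-query>`, the value visible in the documented examples, and the names `encode` and `decode` are illustrative stand-ins for the library functions.

```python
import json

# Assumption: taken from the documented example output, not read from LazyLLM.
QUERY_PREFIX = '<lazyllm-query>'


def encode(query=None, files=None) -> str:
    """Wrap query and files into a prefixed JSON string, or return the bare query."""
    query = query or ''
    if not files:
        return query
    if isinstance(files, str):
        files = [files]  # a single path is promoted to a one-item list
    return QUERY_PREFIX + json.dumps({'query': query, 'files': files})


def decode(text: str):
    """Return a dict for prefixed input, otherwise the stripped plain string."""
    text = text.strip()
    if not text.startswith(QUERY_PREFIX):
        return text
    try:
        return json.loads(text[len(QUERY_PREFIX):])
    except json.JSONDecodeError as e:
        raise ValueError(f'JSON parsing failed: {e}') from e
```

Decoding the result of `encode('hi', 'a.md')` gives back `{'query': 'hi', 'files': ['a.md']}`, whereas a query encoded without files stays a plain string on both sides.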

lazyllm.components.formatter.lazyllm_merge_query(*args)

Merges multiple query strings (potentially with associated file paths) into a single structured query string.

Each argument can be a plain query string or a structured query created by encode_query_with_filepaths. The function decodes each input, concatenates all query texts, and merges the associated file paths. The final result is re-encoded into a single query string with unified context.

Parameters:

  • *args (str, default: () ) –

    Multiple query strings. Each can be either plain text or an encoded structured query with files.

Returns:

  • str: A single structured query string containing the merged query and file paths, or the plain concatenated text when no input carries files (see the second example below).

Examples:

>>> from lazyllm.components.formatter import encode_query_with_filepaths, lazyllm_merge_query
>>> # Merge two structured queries with English content and associated files
>>> q1 = encode_query_with_filepaths("Please summarize document one", files=["doc1.md"])
>>> q2 = encode_query_with_filepaths("Add details from document two", files=["doc2.md"])
>>> lazyllm_merge_query(q1, q2)
'<lazyllm-query>{"query": "Please summarize document oneAdd details from document two", "files": ["doc1.md", "doc2.md"]}'
>>> # Merge plain English text queries without documents
>>> lazyllm_merge_query("What is AI?", "Explain deep learning.")
'What is AI?Explain deep learning.'
Source code in lazyllm/components/formatter/formatterbase.py
def lazyllm_merge_query(*args: str) -> str:
    """Merges multiple query strings (potentially with associated file paths) into a single structured query string.

Each argument can be a plain query string or a structured query created by ``encode_query_with_filepaths``. The function decodes each input, concatenates the query texts in order with no separator, and merges the associated file paths. The result is re-encoded into a single query string. A single argument is returned unchanged.

Args:
    *args (str): Multiple query strings. Each can be either plain text or an encoded structured query with files.

**Returns:**

- str: A single structured query string containing the merged query and file paths, or the plain concatenated text if no input carries file paths.


Examples:
    >>> from lazyllm.components.formatter import encode_query_with_filepaths, lazyllm_merge_query

    >>> # Merge two structured queries with English content and associated files
    >>> q1 = encode_query_with_filepaths("Please summarize document one", files=["doc1.md"])
    >>> q2 = encode_query_with_filepaths("Add details from document two", files=["doc2.md"])
    >>> lazyllm_merge_query(q1, q2)
    '<lazyllm-query>{"query": "Please summarize document oneAdd details from document two", "files": ["doc1.md", "doc2.md"]}'

    >>> # Merge plain English text queries without documents
    >>> lazyllm_merge_query("What is AI?", "Explain deep learning.")
    'What is AI?Explain deep learning.'
    """
    if len(args) == 1:
        return args[0]
    for item in args:
        assert isinstance(item, str), 'Merge object must be str!'
    querys = ''
    files = []
    for item in args:
        decode = decode_query_with_filepaths(item)
        if isinstance(decode, dict):
            querys += decode['query']
            files.extend(decode['files'])
        else:
            querys += decode
    return encode_query_with_filepaths(querys, files)
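
The decode, concatenate, and re-encode flow above can be tried without installing LazyLLM. The following is a standalone approximation, not the library's code: `PREFIX` is assumed to equal `<lazyllm-query>` (the value visible in the examples), and `encode`, `decode`, and `merge` are hypothetical local stand-ins for the three library functions.

```python
import json

# Assumption: LAZYLLM_QUERY_PREFIX has the value seen in the examples.
PREFIX = '<lazyllm-query>'

def encode(query='', files=None):
    """Stand-in for encode_query_with_filepaths: wrap only when files are given."""
    if not files:
        return query
    if isinstance(files, str):
        files = [files]
    return PREFIX + json.dumps({'query': query, 'files': files})

def decode(text):
    """Stand-in for decode_query_with_filepaths: dict if prefixed, else stripped text."""
    text = text.strip()
    if text.startswith(PREFIX):
        return json.loads(text[len(PREFIX):])
    return text

def merge(*args):
    """Stand-in for lazyllm_merge_query: join query texts, collect file paths."""
    if len(args) == 1:
        return args[0]
    query, files = '', []
    for item in args:
        decoded = decode(item)
        if isinstance(decoded, dict):
            query += decoded['query']
            files.extend(decoded['files'])
        else:
            query += decoded
    return encode(query, files)

# A plain question merged with a structured query keeps the file list.
mixed = merge('What is in the report?', encode('Summarize it.', files='report.md'))
print(mixed)
```

Because the texts are joined with no separator and each plain input is stripped while decoding, callers who want a space or newline between the parts would likely need to add it inside the query text of a structured input, or join the strings themselves before merging.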

lazyllm.components.formatter.formatterbase.JsonLikeFormatter

Bases: LazyLLMFormatterBase

This class is used to extract subfields from nested structures (like dicts, lists, tuples) using a JSON-like indexing syntax.

The behavior is driven by a formatter string similar to Python-style slicing and dictionary access:

  • [0] fetches the first item
  • [0][{key}] accesses the key field in the first item
  • [0,1][{a,b}] fetches the a and b fields from the first and second items
  • [::2] does slicing with a step of 2
  • *[0][{x}]: a leading * makes multi-index selections return a package instead of a list or tuple

Parameters:

  • formatter (str, default: None ) –

    A format string that controls how to slice and extract the structure. If None, the input will be returned directly.

Examples:

>>> from lazyllm.components.formatter.formatterbase import JsonLikeFormatter
>>> formatter = JsonLikeFormatter("[{a,b}]")
Source code in lazyllm/components/formatter/formatterbase.py
class JsonLikeFormatter(LazyLLMFormatterBase):
    """This class is used to extract subfields from nested structures (like dicts, lists, tuples) using a JSON-like indexing syntax.

The behavior is driven by a formatter string similar to Python-style slicing and dictionary access:

- `[0]` fetches the first item
- `[0][{key}]` accesses the `key` field in the first item
- `[0,1][{a,b}]` fetches the `a` and `b` fields from the first and second items
- `[::2]` does slicing with a step of 2
- `*[0][{x}]`: a leading `*` makes multi-index selections return a `package` instead of a list or tuple

Args:
    formatter (str, optional): A format string that controls how to slice and extract the structure. If None, the input will be returned directly.


Examples:
    >>> from lazyllm.components.formatter.formatterbase import JsonLikeFormatter
    >>> formatter = JsonLikeFormatter("[{a,b}]")
    """
    class _ListIdxes(tuple): pass
    class _DictKeys(tuple): pass

    def __init__(self, formatter: Optional[str] = None):
        if formatter and formatter.startswith('*['):
            self._return_package = True
            self._formatter = formatter.strip('*')
        else:
            self._return_package = False
            self._formatter = formatter

        if self._formatter:
            assert '*' not in self._formatter, '`*` can only be used before `[` in the beginning'
            self._formatter = self._formatter.strip().replace('{', '[{').replace('}', '}]')
            self._parse_formatter()
        else:
            self._slices = None

    def _parse_formatter(self):
        # Remove the surrounding brackets
        assert self._formatter.startswith('[') and self._formatter.endswith(']')
        slice_str = self._formatter.strip()[1:-1]
        dimensions = slice_str.split('][')
        slices = []

        for dim in dimensions:
            if '{' in dim:
                slices.append(__class__._DictKeys(d.strip() for d in dim[1:-1].split(',') if d.strip()))
            elif ':' in dim:
                assert ',' not in dim, '[a, b:c] is not supported'
                parts = dim.split(':')
                start = int(parts[0]) if _is_number(parts[0]) else None
                end = int(parts[1]) if len(parts) > 1 and _is_number(parts[1]) else None
                step = int(parts[2]) if len(parts) > 2 and _is_number(parts[2]) else None
                slices.append(slice(start, end, step))
            elif ',' in dim:
                slices.append(__class__._ListIdxes(d.strip() for d in dim.split(',') if d.strip()))
            else:
                slices.append(dim.strip())
        self._slices = slices

    def _parse_py_data_by_formatter(self, data, *, slices=None):  # noqa C901
        def _impl(data, slice):
            if isinstance(data, (tuple, list)) and isinstance(slice, str):
                return data[int(slice)]
            if isinstance(slice, __class__._ListIdxes):
                if isinstance(data, dict): return [data[k] for k in slice]
                elif isinstance(data, (tuple, list)): return type(data)(data[int(k)] for k in slice)
                else: raise RuntimeError('Only tuple/list/dict is supported for [a,b,c]')
            if isinstance(slice, __class__._DictKeys):
                assert isinstance(data, dict)
                if len(slice) == 1 and slice[0] == ':': return data
                return {k: data[k] for k in slice}
            return data[slice]

        if slices is None: slices = self._slices
        if not slices: return data
        curr_slice = slices[0]
        if isinstance(curr_slice, slice):
            if isinstance(data, dict):
                assert curr_slice.start is None and curr_slice.stop is None and curr_slice.step is None, (
                    'Only {:} and [:] is supported in dict slice')
                curr_slice = __class__._ListIdxes(data.keys())
            elif isinstance(data, (tuple, list)):
                return type(data)(self._parse_py_data_by_formatter(d, slices=slices[1:])
                                  for d in _impl(data, curr_slice))
        if isinstance(curr_slice, __class__._DictKeys):
            return {k: self._parse_py_data_by_formatter(v, slices=slices[1:])
                    for k, v in _impl(data, curr_slice).items()}
        elif isinstance(curr_slice, __class__._ListIdxes):
            tp = package if self._return_package else list if isinstance(data, dict) else type(data)
            return tp(self._parse_py_data_by_formatter(r, slices=slices[1:]) for r in _impl(data, curr_slice))
        else: return self._parse_py_data_by_formatter(_impl(data, curr_slice), slices=slices[1:])
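
To see how a formatter string turns into a sequence of selection steps, the parsing logic of `_parse_formatter` can be restated as a small standalone function. This is a sketch under stated assumptions: `parse_formatter` is a hypothetical name, tagged tuples replace the private `_DictKeys` and `_ListIdxes` classes, and a `str.isdigit` check stands in for the `_is_number` helper, which is not shown here.

```python
def parse_formatter(formatter: str):
    """Hypothetical helper restating the parsing steps of _parse_formatter."""
    # Brace groups are rewritten so that "{a,b}" becomes its own "[{a,b}]" dimension.
    text = formatter.strip().replace('{', '[{').replace('}', '}]')
    assert text.startswith('[') and text.endswith(']')
    parsed = []
    for dim in text[1:-1].split(']['):
        if '{' in dim:
            # Dictionary keys, e.g. "{name,age}" -> ('keys', ('name', 'age'))
            keys = tuple(k.strip() for k in dim[1:-1].split(',') if k.strip())
            parsed.append(('keys', keys))
        elif ':' in dim:
            # Slice; empty or non-integer parts become None
            # (str.isdigit stands in for the library's _is_number helper).
            parts = [int(p) if p.strip().lstrip('-').isdigit() else None
                     for p in dim.split(':')]
            start, stop, step = (parts + [None, None])[:3]
            parsed.append(slice(start, stop, step))
        elif ',' in dim:
            # Several list indexes, kept as strings until applied to data
            parsed.append(('indexes', tuple(i.strip() for i in dim.split(',') if i.strip())))
        else:
            # A single index or key, also kept as a string
            parsed.append(dim.strip())
    return parsed

print(parse_formatter('{name,age}'))   # [('keys', ('name', 'age'))]
print(parse_formatter('[:][title]'))   # [slice(None, None, None), 'title']
print(parse_formatter('[::2]'))        # [slice(None, None, 2)]
```

Each entry in the returned list corresponds to one nesting level of the data: `_parse_py_data_by_formatter` applies the first entry to the input and recurses into the result with the remaining entries.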

lazyllm.components.formatter.formatterbase.PythonFormatter

Bases: JsonLikeFormatter

Reserved formatter class for supporting Python-style data extraction syntax. To be developed.

Currently inherits from JsonLikeFormatter with no additional behavior.

Source code in lazyllm/components/formatter/formatterbase.py
class PythonFormatter(JsonLikeFormatter):
    """Reserved formatter class for supporting Python-style data extraction syntax. To be developed.

Currently inherits from JsonLikeFormatter with no additional behavior.
"""
    pass

lazyllm.components.formatter.FileFormatter

Bases: LazyLLMFormatterBase

A formatter that transforms query strings with document context between structured formats.

Supports three modes:

  • "decode": Decodes structured query strings into dictionaries with query and files.
  • "encode": Encodes a dictionary with query and files into a structured query string.
  • "merge": Merges multiple structured query strings into one.

Parameters:

  • formatter (str, default: 'decode' ) –

    The operation mode. Must be one of "decode", "encode", or "merge". Defaults to "decode".

Examples:

>>> from lazyllm.components.formatter import FileFormatter
>>> # Decode mode
>>> fmt = FileFormatter('decode')
Source code in lazyllm/components/formatter/formatterbase.py
class FileFormatter(LazyLLMFormatterBase):
    """A formatter that transforms query strings with document context between structured formats.

Supports three modes:

- "decode": Decodes structured query strings into dictionaries with `query` and `files`.
- "encode": Encodes a dictionary with `query` and `files` into a structured query string.
- "merge": Merges multiple structured query strings into one.

Args:
    formatter (str): The operation mode. Must be one of "decode", "encode", or "merge". Defaults to "decode".


Examples:
    >>> from lazyllm.components.formatter import FileFormatter

    >>> # Decode mode
    >>> fmt = FileFormatter('decode')
    """

    def __init__(self, formatter: str = 'decode'):
        self._mode = formatter.strip().lower()
        assert self._mode in ('decode', 'encode', 'merge')

    def _parse_py_data_by_formatter(self, py_data):
        if self._mode == 'merge':
            if isinstance(py_data, str):
                return py_data
            assert isinstance(py_data, package)
            return lazyllm_merge_query(*py_data)

        if isinstance(py_data, package):
            res = []
            for i_data in py_data:
                res.append(self._parse_py_data_by_formatter(i_data))
            return package(res)
        elif isinstance(py_data, (str, dict)):
            return self._decode_one_data(py_data)
        else:
            return py_data

    def _decode_one_data(self, py_data):
        if self._mode == 'decode':
            if isinstance(py_data, str):
                return decode_query_with_filepaths(py_data)
            else:
                return py_data
        else:
            if isinstance(py_data, dict) and 'query' in py_data and 'files' in py_data:
                return encode_query_with_filepaths(**py_data)
            else:
                return py_data
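
The "decode" and "encode" branches can be illustrated with a simplified standalone function. This is only a sketch: `file_format` is a hypothetical name, `PREFIX` is assumed to match `LAZYLLM_QUERY_PREFIX` as seen in the examples, and the "merge" mode (which passes a `package` of strings to `lazyllm_merge_query`) is left out.

```python
import json

# Assumption: LAZYLLM_QUERY_PREFIX has the value seen in the examples.
PREFIX = '<lazyllm-query>'

def file_format(mode, data):
    """Hypothetical stand-in for how FileFormatter treats one str or dict item."""
    assert mode in ('decode', 'encode')
    if mode == 'decode':
        if not isinstance(data, str):
            return data  # dicts pass through unchanged
        text = data.strip()
        # Prefixed strings become dicts; plain strings come back stripped.
        return json.loads(text[len(PREFIX):]) if text.startswith(PREFIX) else text
    # 'encode': only dicts carrying both 'query' and 'files' are converted.
    if isinstance(data, dict) and 'query' in data and 'files' in data:
        query, files = data['query'] or '', data['files']
        if not files:
            return query  # no files: the bare query is returned
        files = [files] if isinstance(files, str) else files
        return PREFIX + json.dumps({'query': query, 'files': files})
    return data

encoded = file_format('encode', {'query': 'Summarize', 'files': ['doc.md']})
print(encoded)
print(file_format('decode', encoded))
```

Inputs that do not match the expected shape for a mode (a plain string in "encode" mode, a dict in "decode" mode) are returned as they are, which matches the pass-through branches in `_decode_one_data`.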

lazyllm.components.formatter.YamlFormatter

Bases: JsonLikeFormatter

A formatter for extracting structured information from YAML-formatted strings.

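Inherits from JsonLikeFormatter. Uses the internal _load method to parse YAML strings into Python objects with yaml.SafeLoader, then applies JSON-like formatting rules to extract the desired fields. If parsing fails, the error is logged and an empty string is used instead.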

Suitable for handling nested YAML content with formatter-based field selection.

Examples:

>>> from lazyllm.components.formatter import YamlFormatter
>>> formatter = YamlFormatter("{name,age}")
>>> msg = """ 
... name: Alice
... age: 30
... city: London
... """
>>> formatter(msg)
{'name': 'Alice', 'age': 30}
Source code in lazyllm/components/formatter/yamlformatter.py
class YamlFormatter(JsonLikeFormatter):
    """A formatter for extracting structured information from YAML-formatted strings.

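Inherits from JsonLikeFormatter. Uses the internal `_load` method to parse YAML strings into Python objects with `yaml.SafeLoader`, then applies JSON-like formatting rules to extract the desired fields. If parsing fails, the error is logged and an empty string is used instead.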

Suitable for handling nested YAML content with formatter-based field selection.


Examples:
    >>> from lazyllm.components.formatter import YamlFormatter
    >>> formatter = YamlFormatter("{name,age}")
    >>> msg = \"\"\" 
    ... name: Alice
    ... age: 30
    ... city: London
    ... \"\"\"
    >>> formatter(msg)
    {'name': 'Alice', 'age': 30}
    """
    def _load(self, msg: str):
        try:
            return yaml.load(msg, Loader=yaml.SafeLoader)
        except Exception as e:
            lazyllm.LOG.info(f'Error: {e}')
            return ''

lazyllm.components.formatter.encode_query_with_filepaths(query=None, files=None)

Encodes a query string together with associated file paths into a structured string format with context.

If file paths are provided, the query and file list are wrapped into a JSON object prefixed with LAZYLLM_QUERY_PREFIX (<lazyllm-query> in the example below). Otherwise, the original query string is returned.

Parameters:

  • query (str, default: None ) –

    The user query string. Defaults to None, which is treated as an empty string.

  • files (str or List[str], default: None ) –

    File path(s) associated with the query. Can be a single string or a list of strings.

Returns:

  • str: A structured encoded query string or the raw query.

Raises:

  • AssertionError

    If files is not a string or list of strings.

Examples:

>>> from lazyllm.components.formatter import encode_query_with_filepaths
>>> # Encode a query along with associated documentation files
>>> encode_query_with_filepaths("Generate questions based on the document", files=["a.md"])
'<lazyllm-query>{"query": "Generate questions based on the document", "files": ["a.md"]}'
Source code in lazyllm/components/formatter/formatterbase.py
def encode_query_with_filepaths(query: str = None, files: Union[str, List[str]] = None) -> str:
    """Encodes a query string together with associated file paths into a structured string format with context.

If file paths are provided, the query and file list are wrapped into a JSON object prefixed with ``LAZYLLM_QUERY_PREFIX`` (``<lazyllm-query>`` in the example below). Otherwise, the original query string is returned.

Args:
    query (str): The user query string. Defaults to None, which is treated as an empty string.
    files (str or List[str]): File path(s) associated with the query. Can be a single string or a list of strings.

**Returns:**

- str: A structured encoded query string or the raw query.

Raises:
    AssertionError: If `files` is not a string or list of strings.


Examples:
    >>> from lazyllm.components.formatter import encode_query_with_filepaths

    >>> # Encode a query along with associated documentation files
    >>> encode_query_with_filepaths("Generate questions based on the document", files=["a.md"])
    '<lazyllm-query>{"query": "Generate questions based on the document", "files": ["a.md"]}'
    """
    query = query if query else ''
    if files:
        if isinstance(files, str): files = [files]
        assert isinstance(files, list), 'files must be a list.'
        assert all(isinstance(item, str) for item in files), 'All items in files must be strings'
        return LAZYLLM_QUERY_PREFIX + json.dumps({'query': query, 'files': files})
    else:
        return query

lazyllm.components.formatter.decode_query_with_filepaths(query_files)

Decodes a structured query string into a dictionary containing the original query and file paths.

If the input string starts with the special prefix LAZYLLM_QUERY_PREFIX (<lazyllm-query> in the examples below), the function parses the JSON content that follows; otherwise, it returns the stripped query string as-is.

Parameters:

  • query_files (str) –

    The encoded query string that may include both query and file paths.

Returns:

  • Union[dict, str]: A dictionary containing 'query' and 'files' if structured, otherwise the original query string.

Raises:

  • AssertionError

    If the input is not a string.

  • ValueError

    If the string is prefixed but JSON decoding fails.

Examples:

>>> from lazyllm.components.formatter import decode_query_with_filepaths
>>> # Decode a structured query with files
>>> decode_query_with_filepaths('<lazyllm-query>{"query": "Summarize the content", "files": ["doc.md"]}')
{'query': 'Summarize the content', 'files': ['doc.md']}
>>> # Decode a plain string without files
>>> decode_query_with_filepaths("This is just a simple question")
'This is just a simple question'
Source code in lazyllm/components/formatter/formatterbase.py
def decode_query_with_filepaths(query_files: str) -> Union[dict, str]:
    """Decodes a structured query string into a dictionary containing the original query and file paths.

If the input string starts with the special prefix ``LAZYLLM_QUERY_PREFIX`` (``<lazyllm-query>`` in the examples below), the function parses the JSON content that follows; otherwise, it returns the stripped query string as-is.

Args:
    query_files (str): The encoded query string that may include both query and file paths.

**Returns:**

- Union[dict, str]: A dictionary containing 'query' and 'files' if structured, otherwise the original query string.

Raises:
    AssertionError: If the input is not a string.
    ValueError: If the string is prefixed but JSON decoding fails.


Examples:
    >>> from lazyllm.components.formatter import decode_query_with_filepaths

    >>> # Decode a structured query with files
    >>> decode_query_with_filepaths('<lazyllm-query>{"query": "Summarize the content", "files": ["doc.md"]}')
    {'query': 'Summarize the content', 'files': ['doc.md']}

    >>> # Decode a plain string without files
    >>> decode_query_with_filepaths("This is just a simple question")
    'This is just a simple question'
    """
    assert isinstance(query_files, str), 'query_files must be a str.'
    query_files = query_files.strip()
    if query_files.startswith(LAZYLLM_QUERY_PREFIX):
        try:
            obj = json.loads(query_files[len(LAZYLLM_QUERY_PREFIX):])
            return obj
        except json.JSONDecodeError as e:
            raise ValueError(f'JSON parsing failed: {e}')
    else:
        return query_files

lazyllm.components.JsonFormatter

Bases: JsonLikeFormatter

This class is a JSON formatter: use it when the model is expected to produce JSON output. It can also select fields from that output by indexing.

Examples:

>>> import lazyllm
>>> from lazyllm.components import JsonFormatter
>>> toc_prompt='''
... You are now an intelligent assistant. Your task is to understand the user's input and convert the outline into a list of nested dictionaries. Each dictionary contains a `title` and a `describe`, where the `title` should clearly indicate the level using Markdown format, and the `describe` is a description and writing guide for that section.
... 
... Please generate the corresponding list of nested dictionaries based on the following user input:
... 
... Example output:
... [
...     {
...         "title": "# Level 1 Title",
...         "describe": "Please provide a detailed description of the content under this title, offering background information and core viewpoints."
...     },
...     {
...         "title": "## Level 2 Title",
...         "describe": "Please provide a detailed description of the content under this title, giving specific details and examples to support the viewpoints of the Level 1 title."
...     },
...     {
...         "title": "### Level 3 Title",
...         "describe": "Please provide a detailed description of the content under this title, deeply analyzing and providing more details and data support."
...     }
... ]
... User input is as follows:
... '''
>>> query = "Please help me write an article about the application of artificial intelligence in the medical field."
>>> m = lazyllm.TrainableModule("internlm2-chat-20b").prompt(toc_prompt).start()
>>> ret = m(query, max_new_tokens=2048)
>>> print(f"ret: {ret!r}")  # the model output without specifying a formatter
'Based on your user input, here is the corresponding list of nested dictionaries:
[
    {
        "title": "# Application of Artificial Intelligence in the Medical Field",
        "describe": "Please provide a detailed description of the application of artificial intelligence in the medical field, including its benefits, challenges, and future prospects."
    },
    {
        "title": "## AI in Medical Diagnosis",
        "describe": "Please provide a detailed description of how artificial intelligence is used in medical diagnosis, including specific examples of AI-based diagnostic tools and their impact on patient outcomes."
    },
    {
        "title": "### AI in Medical Imaging",
        "describe": "Please provide a detailed description of how artificial intelligence is used in medical imaging, including the advantages of AI-based image analysis and its applications in various medical specialties."
    },
    {
        "title": "### AI in Drug Discovery and Development",
        "describe": "Please provide a detailed description of how artificial intelligence is used in drug discovery and development, including the role of AI in identifying potential drug candidates and streamlining the drug development process."
    },
    {
        "title": "## AI in Medical Research",
        "describe": "Please provide a detailed description of how artificial intelligence is used in medical research, including its applications in genomics, epidemiology, and clinical trials."
    },
    {
        "title": "### AI in Genomics and Precision Medicine",
        "describe": "Please provide a detailed description of how artificial intelligence is used in genomics and precision medicine, including the role of AI in analyzing large-scale genomic data and tailoring treatments to individual patients."
    },
    {
        "title": "### AI in Epidemiology and Public Health",
        "describe": "Please provide a detailed description of how artificial intelligence is used in epidemiology and public health, including its applications in disease surveillance, outbreak prediction, and resource allocation."
    },
    {
        "title": "### AI in Clinical Trials",
        "describe": "Please provide a detailed description of how artificial intelligence is used in clinical trials, including its role in patient recruitment, trial design, and data analysis."
    },
    {
        "title": "## AI in Medical Practice",
        "describe": "Please provide a detailed description of how artificial intelligence is used in medical practice, including its applications in patient monitoring, personalized medicine, and telemedicine."
    },
    {
        "title": "### AI in Patient Monitoring",
        "describe": "Please provide a detailed description of how artificial intelligence is used in patient monitoring, including its role in real-time monitoring of vital signs and early detection of health issues."
    },
    {
        "title": "### AI in Personalized Medicine",
        "describe": "Please provide a detailed description of how artificial intelligence is used in personalized medicine, including its role in analyzing patient data to tailor treatments and predict outcomes."
    },
    {
        "title": "### AI in Telemedicine",
        "describe": "Please provide a detailed description of how artificial intelligence is used in telemedicine, including its applications in remote consultations, virtual diagnoses, and digital health records."
    },
    {
        "title": "## AI in Medical Ethics and Policy",
        "describe": "Please provide a detailed description of the ethical and policy considerations surrounding the use of artificial intelligence in the medical field, including issues related to data privacy, bias, and accountability."
    }
]'
>>> m = lazyllm.TrainableModule("internlm2-chat-20b").formatter(JsonFormatter("[:][title]")).prompt(toc_prompt).start()
>>> ret = m(query, max_new_tokens=2048)
>>> print(f"ret: {ret}")  # the model output with the formatter specified
['# Application of Artificial Intelligence in the Medical Field', '## AI in Medical Diagnosis', '### AI in Medical Imaging', '### AI in Drug Discovery and Development', '## AI in Medical Research', '### AI in Genomics and Precision Medicine', '### AI in Epidemiology and Public Health', '### AI in Clinical Trials', '## AI in Medical Practice', '### AI in Patient Monitoring', '### AI in Personalized Medicine', '### AI in Telemedicine', '## AI in Medical Ethics and Policy']
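
The example above needs a running model. The effect of the formatter can be approximated offline with the sketch below. It is not the library's implementation: the body of `_extract_json_from_string` is not shown here, so `extract_first_json` is a hypothetical helper that uses the standard library's `json.JSONDecoder.raw_decode` as a stand-in technique, and the final list comprehension plays the role of the `[:][title]` selection.

```python
import json

def extract_first_json(mixed: str):
    """Return the first JSON array or object embedded in free text, or None."""
    decoder = json.JSONDecoder()
    for pos, ch in enumerate(mixed):
        if ch in '[{':
            try:
                value, _ = decoder.raw_decode(mixed, pos)
                return value
            except json.JSONDecodeError:
                continue  # not valid JSON from here; try the next bracket
    return None

# A short, made-up model reply with explanatory text before the JSON payload.
reply = ('Here is the outline:\n'
         '[{"title": "# Intro", "describe": "Background."}, '
         '{"title": "## Methods", "describe": "Details."}]')

items = extract_first_json(reply)
titles = [item['title'] for item in items]  # same selection as "[:][title]"
print(titles)  # ['# Intro', '## Methods']
```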
Source code in lazyllm/components/formatter/jsonformatter.py
class JsonFormatter(JsonLikeFormatter):
    """This class is a JSON formatter: use it when the model is expected to produce JSON output. It can also select fields from that output by indexing.


Examples:
    >>> import lazyllm
    >>> from lazyllm.components import JsonFormatter
    >>> toc_prompt='''
    ... You are now an intelligent assistant. Your task is to understand the user's input and convert the outline into a list of nested dictionaries. Each dictionary contains a `title` and a `describe`, where the `title` should clearly indicate the level using Markdown format, and the `describe` is a description and writing guide for that section.
    ... 
    ... Please generate the corresponding list of nested dictionaries based on the following user input:
    ... 
    ... Example output:
    ... [
    ...     {
    ...         "title": "# Level 1 Title",
    ...         "describe": "Please provide a detailed description of the content under this title, offering background information and core viewpoints."
    ...     },
    ...     {
    ...         "title": "## Level 2 Title",
    ...         "describe": "Please provide a detailed description of the content under this title, giving specific details and examples to support the viewpoints of the Level 1 title."
    ...     },
    ...     {
    ...         "title": "### Level 3 Title",
    ...         "describe": "Please provide a detailed description of the content under this title, deeply analyzing and providing more details and data support."
    ...     }
    ... ]
    ... User input is as follows:
    ... '''
    >>> query = "Please help me write an article about the application of artificial intelligence in the medical field."
    >>> m = lazyllm.TrainableModule("internlm2-chat-20b").prompt(toc_prompt).start()
    >>> ret = m(query, max_new_tokens=2048)
    >>> print(f"ret: {ret!r}")  # the model output without specifying a formatter
    'Based on your user input, here is the corresponding list of nested dictionaries:
    [
        {
            "title": "# Application of Artificial Intelligence in the Medical Field",
            "describe": "Please provide a detailed description of the application of artificial intelligence in the medical field, including its benefits, challenges, and future prospects."
        },
        {
            "title": "## AI in Medical Diagnosis",
            "describe": "Please provide a detailed description of how artificial intelligence is used in medical diagnosis, including specific examples of AI-based diagnostic tools and their impact on patient outcomes."
        },
        {
            "title": "### AI in Medical Imaging",
            "describe": "Please provide a detailed description of how artificial intelligence is used in medical imaging, including the advantages of AI-based image analysis and its applications in various medical specialties."
        },
        {
            "title": "### AI in Drug Discovery and Development",
            "describe": "Please provide a detailed description of how artificial intelligence is used in drug discovery and development, including the role of AI in identifying potential drug candidates and streamlining the drug development process."
        },
        {
            "title": "## AI in Medical Research",
            "describe": "Please provide a detailed description of how artificial intelligence is used in medical research, including its applications in genomics, epidemiology, and clinical trials."
        },
        {
            "title": "### AI in Genomics and Precision Medicine",
            "describe": "Please provide a detailed description of how artificial intelligence is used in genomics and precision medicine, including the role of AI in analyzing large-scale genomic data and tailoring treatments to individual patients."
        },
        {
            "title": "### AI in Epidemiology and Public Health",
            "describe": "Please provide a detailed description of how artificial intelligence is used in epidemiology and public health, including its applications in disease surveillance, outbreak prediction, and resource allocation."
        },
        {
            "title": "### AI in Clinical Trials",
            "describe": "Please provide a detailed description of how artificial intelligence is used in clinical trials, including its role in patient recruitment, trial design, and data analysis."
        },
        {
            "title": "## AI in Medical Practice",
            "describe": "Please provide a detailed description of how artificial intelligence is used in medical practice, including its applications in patient monitoring, personalized medicine, and telemedicine."
        },
        {
            "title": "### AI in Patient Monitoring",
            "describe": "Please provide a detailed description of how artificial intelligence is used in patient monitoring, including its role in real-time monitoring of vital signs and early detection of health issues."
        },
        {
            "title": "### AI in Personalized Medicine",
            "describe": "Please provide a detailed description of how artificial intelligence is used in personalized medicine, including its role in analyzing patient data to tailor treatments and predict outcomes."
        },
        {
            "title": "### AI in Telemedicine",
            "describe": "Please provide a detailed description of how artificial intelligence is used in telemedicine, including its applications in remote consultations, virtual diagnoses, and digital health records."
        },
        {
            "title": "## AI in Medical Ethics and Policy",
            "describe": "Please provide a detailed description of the ethical and policy considerations surrounding the use of artificial intelligence in the medical field, including issues related to data privacy, bias, and accountability."
        }
    ]'
    >>> m = lazyllm.TrainableModule("internlm2-chat-20b").formatter(JsonFormatter("[:][title]")).prompt(toc_prompt).start()
    >>> ret = m(query, max_new_tokens=2048)
    >>> print(f"ret: {ret}")  # the model output of the specified formatter
    ['# Application of Artificial Intelligence in the Medical Field', '## AI in Medical Diagnosis', '### AI in Medical Imaging', '### AI in Drug Discovery and Development', '## AI in Medical Research', '### AI in Genomics and Precision Medicine', '### AI in Epidemiology and Public Health', '### AI in Clinical Trials', '## AI in Medical Practice', '### AI in Patient Monitoring', '### AI in Personalized Medicine', '### AI in Telemedicine', '## AI in Medical Ethics and Policy']
    """
    def _extract_json_from_string(self, mixed_str: str):  # noqa: C901
        json_objects = []
        brace_level = 0
        current_json = ''
        in_string = False

        for char in mixed_str:
            if char == '"' and (len(current_json) == 0 or current_json[-1] != '\\'):
                in_string = not in_string

            if not in_string:
                if char in '{[':
                    if brace_level == 0:
                        current_json = ''
                    brace_level += 1
                elif char in '}]':
                    brace_level -= 1

            if brace_level > 0 or (brace_level == 0 and char in '}]'):
                current_json += char

            if brace_level == 0 and current_json:
                try:
                    json_objects.append(json.loads(current_json))
                    current_json = ''
                except json.JSONDecodeError:
                    try:
                        repaired_obj = json_repair.loads(current_json)
                        json_objects.append(repaired_obj)
                        current_json = ''
                    except (json.JSONDecodeError, ValueError):
                        continue

        return json_objects

    def _load(self, msg: str):
        # Convert str to json format
        assert msg.count('{') == msg.count('}'), f'{msg} is not a valid json string.'
        try:
            json_objects = self._extract_json_from_string(msg)
            if len(json_objects) == 0:
                raise TypeError(f'{msg} is not a valid json string.')
            return json_objects if len(json_objects) > 1 else json_objects[0]
        except Exception as e:
            lazyllm.LOG.info(f'Error: {e}')
            return ''
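
The scanning idea behind `_extract_json_from_string` can be illustrated with a minimal, standard-library-only sketch. It is not the library's implementation. The function name `extract_json_objects` is hypothetical, quotes are tracked only inside a candidate JSON value, and the `json_repair` fallback used in the listing above is left out.

```python
import json


def extract_json_objects(mixed: str) -> list:
    """Collect top-level JSON values embedded in free-form text (illustrative sketch)."""
    found = []
    depth = 0          # current nesting level of {...} and [...]
    start = None       # index where the current top-level value begins
    in_string = False  # True while scanning inside a JSON string literal
    escaped = False    # True when the previous character was a backslash

    for i, ch in enumerate(mixed):
        if in_string:
            # Inside a string only escapes and the closing quote matter.
            if escaped:
                escaped = False
            elif ch == '\\':
                escaped = True
            elif ch == '"':
                in_string = False
            continue
        if ch == '"' and depth > 0:
            in_string = True
        elif ch in '{[':
            if depth == 0:
                start = i
            depth += 1
        elif ch in '}]' and depth > 0:
            depth -= 1
            if depth == 0:
                try:
                    found.append(json.loads(mixed[start:i + 1]))
                except json.JSONDecodeError:
                    pass  # the listing above tries json_repair at this point
    return found


text = 'Here is the outline: [{"title": "# A {draft}"}, {"title": "## B"}] Done.'
objects = extract_json_objects(text)
titles = [item["title"] for item in objects[0]]
print(titles)
```

Braces that appear inside string values, such as `{draft}` here, do not change the nesting depth, which is why the surrounding prose and the embedded list can be separated reliably.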

lazyllm.components.EmptyFormatter

Bases: LazyLLMFormatterBase

EmptyFormatter is the system default formatter. It is used when no formatter is specified or when the model output should not be formatted, and it returns the model output unchanged.

Examples:

>>> import lazyllm
>>> from lazyllm.components import EmptyFormatter
>>> toc_prompt='''
... You are now an intelligent assistant. Your task is to understand the user's input and convert the outline into a list of nested dictionaries. Each dictionary contains a `title` and a `describe`, where the `title` should clearly indicate the level using Markdown format, and the `describe` is a description and writing guide for that section.
... 
... Please generate the corresponding list of nested dictionaries based on the following user input:
... 
... Example output:
... [
...     {
...         "title": "# Level 1 Title",
...         "describe": "Please provide a detailed description of the content under this title, offering background information and core viewpoints."
...     },
...     {
...         "title": "## Level 2 Title",
...         "describe": "Please provide a detailed description of the content under this title, giving specific details and examples to support the viewpoints of the Level 1 title."
...     },
...     {
...         "title": "### Level 3 Title",
...         "describe": "Please provide a detailed description of the content under this title, deeply analyzing and providing more details and data support."
...     }
... ]
... User input is as follows:
... '''
>>> query = "Please help me write an article about the application of artificial intelligence in the medical field."
>>> m = lazyllm.TrainableModule("internlm2-chat-20b").prompt(toc_prompt).start()  # the model output without specifying a formatter
>>> ret = m(query, max_new_tokens=2048)
>>> print(f"ret: {ret!r}")
'Based on your user input, here is the corresponding list of nested dictionaries:
[
    {
        "title": "# Application of Artificial Intelligence in the Medical Field",
        "describe": "Please provide a detailed description of the application of artificial intelligence in the medical field, including its benefits, challenges, and future prospects."
    },
    {
        "title": "## AI in Medical Diagnosis",
        "describe": "Please provide a detailed description of how artificial intelligence is used in medical diagnosis, including specific examples of AI-based diagnostic tools and their impact on patient outcomes."
    },
    {
        "title": "### AI in Medical Imaging",
        "describe": "Please provide a detailed description of how artificial intelligence is used in medical imaging, including the advantages of AI-based image analysis and its applications in various medical specialties."
    },
    {
        "title": "### AI in Drug Discovery and Development",
        "describe": "Please provide a detailed description of how artificial intelligence is used in drug discovery and development, including the role of AI in identifying potential drug candidates and streamlining the drug development process."
    },
    {
        "title": "## AI in Medical Research",
        "describe": "Please provide a detailed description of how artificial intelligence is used in medical research, including its applications in genomics, epidemiology, and clinical trials."
    },
    {
        "title": "### AI in Genomics and Precision Medicine",
        "describe": "Please provide a detailed description of how artificial intelligence is used in genomics and precision medicine, including the role of AI in analyzing large-scale genomic data and tailoring treatments to individual patients."
    },
    {
        "title": "### AI in Epidemiology and Public Health",
        "describe": "Please provide a detailed description of how artificial intelligence is used in epidemiology and public health, including its applications in disease surveillance, outbreak prediction, and resource allocation."
    },
    {
        "title": "### AI in Clinical Trials",
        "describe": "Please provide a detailed description of how artificial intelligence is used in clinical trials, including its role in patient recruitment, trial design, and data analysis."
    },
    {
        "title": "## AI in Medical Practice",
        "describe": "Please provide a detailed description of how artificial intelligence is used in medical practice, including its applications in patient monitoring, personalized medicine, and telemedicine."
    },
    {
        "title": "### AI in Patient Monitoring",
        "describe": "Please provide a detailed description of how artificial intelligence is used in patient monitoring, including its role in real-time monitoring of vital signs and early detection of health issues."
    },
    {
        "title": "### AI in Personalized Medicine",
        "describe": "Please provide a detailed description of how artificial intelligence is used in personalized medicine, including its role in analyzing patient data to tailor treatments and predict outcomes."
    },
    {
        "title": "### AI in Telemedicine",
        "describe": "Please provide a detailed description of how artificial intelligence is used in telemedicine, including its applications in remote consultations, virtual diagnoses, and digital health records."
    },
    {
        "title": "## AI in Medical Ethics and Policy",
        "describe": "Please provide a detailed description of the ethical and policy considerations surrounding the use of artificial intelligence in the medical field, including issues related to data privacy, bias, and accountability."
    }
]'
>>> m = lazyllm.TrainableModule("internlm2-chat-20b").formatter(EmptyFormatter()).prompt(toc_prompt).start()  # the model output of the specified formatter
>>> ret = m(query, max_new_tokens=2048)
>>> print(f"ret: {ret!r}")
'Based on your user input, here is the corresponding list of nested dictionaries:
[
    {
        "title": "# Application of Artificial Intelligence in the Medical Field",
        "describe": "Please provide a detailed description of the application of artificial intelligence in the medical field, including its benefits, challenges, and future prospects."
    },
    {
        "title": "## AI in Medical Diagnosis",
        "describe": "Please provide a detailed description of how artificial intelligence is used in medical diagnosis, including specific examples of AI-based diagnostic tools and their impact on patient outcomes."
    },
    {
        "title": "### AI in Medical Imaging",
        "describe": "Please provide a detailed description of how artificial intelligence is used in medical imaging, including the advantages of AI-based image analysis and its applications in various medical specialties."
    },
    {
        "title": "### AI in Drug Discovery and Development",
        "describe": "Please provide a detailed description of how artificial intelligence is used in drug discovery and development, including the role of AI in identifying potential drug candidates and streamlining the drug development process."
    },
    {
        "title": "## AI in Medical Research",
        "describe": "Please provide a detailed description of how artificial intelligence is used in medical research, including its applications in genomics, epidemiology, and clinical trials."
    },
    {
        "title": "### AI in Genomics and Precision Medicine",
        "describe": "Please provide a detailed description of how artificial intelligence is used in genomics and precision medicine, including the role of AI in analyzing large-scale genomic data and tailoring treatments to individual patients."
    },
    {
        "title": "### AI in Epidemiology and Public Health",
        "describe": "Please provide a detailed description of how artificial intelligence is used in epidemiology and public health, including its applications in disease surveillance, outbreak prediction, and resource allocation."
    },
    {
        "title": "### AI in Clinical Trials",
        "describe": "Please provide a detailed description of how artificial intelligence is used in clinical trials, including its role in patient recruitment, trial design, and data analysis."
    },
    {
        "title": "## AI in Medical Practice",
        "describe": "Please provide a detailed description of how artificial intelligence is used in medical practice, including its applications in patient monitoring, personalized medicine, and telemedicine."
    },
    {
        "title": "### AI in Patient Monitoring",
        "describe": "Please provide a detailed description of how artificial intelligence is used in patient monitoring, including its role in real-time monitoring of vital signs and early detection of health issues."
    },
    {
        "title": "### AI in Personalized Medicine",
        "describe": "Please provide a detailed description of how artificial intelligence is used in personalized medicine, including its role in analyzing patient data to tailor treatments and predict outcomes."
    },
    {
        "title": "### AI in Telemedicine",
        "describe": "Please provide a detailed description of how artificial intelligence is used in telemedicine, including its applications in remote consultations, virtual diagnoses, and digital health records."
    },
    {
        "title": "## AI in Medical Ethics and Policy",
        "describe": "Please provide a detailed description of the ethical and policy considerations surrounding the use of artificial intelligence in the medical field, including issues related to data privacy, bias, and accountability."
    }
]'
Source code in lazyllm/components/formatter/formatterbase.py
class EmptyFormatter(LazyLLMFormatterBase):
    """EmptyFormatter is the system default formatter. It is used when no formatter is specified or when the model output should not be formatted, and it returns the model output unchanged.


Examples:
    >>> import lazyllm
    >>> from lazyllm.components import EmptyFormatter
    >>> toc_prompt='''
    ... You are now an intelligent assistant. Your task is to understand the user's input and convert the outline into a list of nested dictionaries. Each dictionary contains a `title` and a `describe`, where the `title` should clearly indicate the level using Markdown format, and the `describe` is a description and writing guide for that section.
    ... 
    ... Please generate the corresponding list of nested dictionaries based on the following user input:
    ... 
    ... Example output:
    ... [
    ...     {
    ...         "title": "# Level 1 Title",
    ...         "describe": "Please provide a detailed description of the content under this title, offering background information and core viewpoints."
    ...     },
    ...     {
    ...         "title": "## Level 2 Title",
    ...         "describe": "Please provide a detailed description of the content under this title, giving specific details and examples to support the viewpoints of the Level 1 title."
    ...     },
    ...     {
    ...         "title": "### Level 3 Title",
    ...         "describe": "Please provide a detailed description of the content under this title, deeply analyzing and providing more details and data support."
    ...     }
    ... ]
    ... User input is as follows:
    ... '''
    >>> query = "Please help me write an article about the application of artificial intelligence in the medical field."
    >>> m = lazyllm.TrainableModule("internlm2-chat-20b").prompt(toc_prompt).start()  # the model output without specifying a formatter
    >>> ret = m(query, max_new_tokens=2048)
    >>> print(f"ret: {ret!r}")
    'Based on your user input, here is the corresponding list of nested dictionaries:
    [
        {
            "title": "# Application of Artificial Intelligence in the Medical Field",
            "describe": "Please provide a detailed description of the application of artificial intelligence in the medical field, including its benefits, challenges, and future prospects."
        },
        {
            "title": "## AI in Medical Diagnosis",
            "describe": "Please provide a detailed description of how artificial intelligence is used in medical diagnosis, including specific examples of AI-based diagnostic tools and their impact on patient outcomes."
        },
        {
            "title": "### AI in Medical Imaging",
            "describe": "Please provide a detailed description of how artificial intelligence is used in medical imaging, including the advantages of AI-based image analysis and its applications in various medical specialties."
        },
        {
            "title": "### AI in Drug Discovery and Development",
            "describe": "Please provide a detailed description of how artificial intelligence is used in drug discovery and development, including the role of AI in identifying potential drug candidates and streamlining the drug development process."
        },
        {
            "title": "## AI in Medical Research",
            "describe": "Please provide a detailed description of how artificial intelligence is used in medical research, including its applications in genomics, epidemiology, and clinical trials."
        },
        {
            "title": "### AI in Genomics and Precision Medicine",
            "describe": "Please provide a detailed description of how artificial intelligence is used in genomics and precision medicine, including the role of AI in analyzing large-scale genomic data and tailoring treatments to individual patients."
        },
        {
            "title": "### AI in Epidemiology and Public Health",
            "describe": "Please provide a detailed description of how artificial intelligence is used in epidemiology and public health, including its applications in disease surveillance, outbreak prediction, and resource allocation."
        },
        {
            "title": "### AI in Clinical Trials",
            "describe": "Please provide a detailed description of how artificial intelligence is used in clinical trials, including its role in patient recruitment, trial design, and data analysis."
        },
        {
            "title": "## AI in Medical Practice",
            "describe": "Please provide a detailed description of how artificial intelligence is used in medical practice, including its applications in patient monitoring, personalized medicine, and telemedicine."
        },
        {
            "title": "### AI in Patient Monitoring",
            "describe": "Please provide a detailed description of how artificial intelligence is used in patient monitoring, including its role in real-time monitoring of vital signs and early detection of health issues."
        },
        {
            "title": "### AI in Personalized Medicine",
            "describe": "Please provide a detailed description of how artificial intelligence is used in personalized medicine, including its role in analyzing patient data to tailor treatments and predict outcomes."
        },
        {
            "title": "### AI in Telemedicine",
            "describe": "Please provide a detailed description of how artificial intelligence is used in telemedicine, including its applications in remote consultations, virtual diagnoses, and digital health records."
        },
        {
            "title": "## AI in Medical Ethics and Policy",
            "describe": "Please provide a detailed description of the ethical and policy considerations surrounding the use of artificial intelligence in the medical field, including issues related to data privacy, bias, and accountability."
        }
    ]'
    >>> m = lazyllm.TrainableModule("internlm2-chat-20b").formatter(EmptyFormatter()).prompt(toc_prompt).start()  # the model output of the specified formatter
    >>> ret = m(query, max_new_tokens=2048)
    >>> print(f"ret: {ret!r}")
    'Based on your user input, here is the corresponding list of nested dictionaries:
    [
        {
            "title": "# Application of Artificial Intelligence in the Medical Field",
            "describe": "Please provide a detailed description of the application of artificial intelligence in the medical field, including its benefits, challenges, and future prospects."
        },
        {
            "title": "## AI in Medical Diagnosis",
            "describe": "Please provide a detailed description of how artificial intelligence is used in medical diagnosis, including specific examples of AI-based diagnostic tools and their impact on patient outcomes."
        },
        {
            "title": "### AI in Medical Imaging",
            "describe": "Please provide a detailed description of how artificial intelligence is used in medical imaging, including the advantages of AI-based image analysis and its applications in various medical specialties."
        },
        {
            "title": "### AI in Drug Discovery and Development",
            "describe": "Please provide a detailed description of how artificial intelligence is used in drug discovery and development, including the role of AI in identifying potential drug candidates and streamlining the drug development process."
        },
        {
            "title": "## AI in Medical Research",
            "describe": "Please provide a detailed description of how artificial intelligence is used in medical research, including its applications in genomics, epidemiology, and clinical trials."
        },
        {
            "title": "### AI in Genomics and Precision Medicine",
            "describe": "Please provide a detailed description of how artificial intelligence is used in genomics and precision medicine, including the role of AI in analyzing large-scale genomic data and tailoring treatments to individual patients."
        },
        {
            "title": "### AI in Epidemiology and Public Health",
            "describe": "Please provide a detailed description of how artificial intelligence is used in epidemiology and public health, including its applications in disease surveillance, outbreak prediction, and resource allocation."
        },
        {
            "title": "### AI in Clinical Trials",
            "describe": "Please provide a detailed description of how artificial intelligence is used in clinical trials, including its role in patient recruitment, trial design, and data analysis."
        },
        {
            "title": "## AI in Medical Practice",
            "describe": "Please provide a detailed description of how artificial intelligence is used in medical practice, including its applications in patient monitoring, personalized medicine, and telemedicine."
        },
        {
            "title": "### AI in Patient Monitoring",
            "describe": "Please provide a detailed description of how artificial intelligence is used in patient monitoring, including its role in real-time monitoring of vital signs and early detection of health issues."
        },
        {
            "title": "### AI in Personalized Medicine",
            "describe": "Please provide a detailed description of how artificial intelligence is used in personalized medicine, including its role in analyzing patient data to tailor treatments and predict outcomes."
        },
        {
            "title": "### AI in Telemedicine",
            "describe": "Please provide a detailed description of how artificial intelligence is used in telemedicine, including its applications in remote consultations, virtual diagnoses, and digital health records."
        },
        {
            "title": "## AI in Medical Ethics and Policy",
            "describe": "Please provide a detailed description of the ethical and policy considerations surrounding the use of artificial intelligence in the medical field, including issues related to data privacy, bias, and accountability."
        }
    ]'
    """
    def _parse_py_data_by_formatter(self, msg: str):
        return msg

lazyllm.components.FunctionCallFormatter

Bases: LazyLLMFormatterBase

Function call formatter for processing message dictionaries containing function call information.

This formatter is designed for model outputs in function calling scenarios. It keeps only the 'role', 'content', 'tool_calls', and 'reasoning_content' fields of the input dictionary and drops every other field.

Primarily used in function calling-related modules such as FunctionCall.

Note
  • Input must be a dictionary type, otherwise an assertion error will be raised
  • Only the 'role', 'content', 'tool_calls', and 'reasoning_content' fields are preserved, and only if they exist in the dictionary
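
The filtering rule can be tried without installing lazyllm. The following is a minimal sketch: `filter_message` and `KEPT_KEYS` are hypothetical names, the key tuple follows the source listing for this class, and the sketch raises `TypeError` where the library uses an assertion.

```python
# Keys kept by the formatter, taken from the source listing for this class.
KEPT_KEYS = ('role', 'content', 'tool_calls', 'reasoning_content')


def filter_message(msg):
    """Return a copy of msg restricted to the kept keys (illustrative sketch)."""
    if not isinstance(msg, dict):
        # The library asserts here; a TypeError is used in this sketch instead.
        raise TypeError('only dict input is supported')
    # Keys missing from the input are skipped rather than filled with defaults.
    return {key: msg[key] for key in KEPT_KEYS if key in msg}


full = filter_message({'role': 'assistant', 'content': 'hi', 'other_field': 1})
partial = filter_message({'role': 'assistant'})
print(full)
print(partial)
```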

Examples:

>>> from lazyllm.components.formatter.formatterbase import FunctionCallFormatter
>>> formatter = FunctionCallFormatter()
>>> 
>>> # Process a message that contains a function call
>>> msg = {
...     'role': 'assistant',
...     'content': 'I will call a function to get the weather.',
...     'tool_calls': [
...         {
...             'id': 'call_123',
...             'type': 'function',
...             'function': {
...                 'name': 'get_weather',
...                 'arguments': '{"location": "Beijing"}'
...             }
...         }
...     ],
...     'other_field': 'will be filtered'
... }
>>> result = formatter.format(msg)
>>> print(result)
{'role': 'assistant', 'content': 'I will call a function to get the weather.', 'tool_calls': [{'id': 'call_123', 'type': 'function', 'function': {'name': 'get_weather', 'arguments': '{"location": "Beijing"}'}}]}
>>> 
>>> # Process a message that contains only some of the fields
>>> msg2 = {
...     'role': 'assistant',
...     'content': 'Hello, how can I help you?'
... }
>>> result2 = formatter.format(msg2)
>>> print(result2)
{'role': 'assistant', 'content': 'Hello, how can I help you?'}
Source code in lazyllm/components/formatter/formatterbase.py
class FunctionCallFormatter(LazyLLMFormatterBase):
    """Function call formatter for processing message dictionaries containing function call information.

This formatter is designed for model outputs in function calling scenarios. It keeps only the 'role', 'content', 'tool_calls', and 'reasoning_content' fields of the input dictionary and drops every other field.

Primarily used in function calling-related modules such as FunctionCall.

Args:
    No parameters, instantiate directly.

Note:
    - Input must be a dictionary type, otherwise an assertion error will be raised
    - Only the 'role', 'content', 'tool_calls', and 'reasoning_content' fields are preserved, and only if they exist in the dictionary


Examples:
    >>> from lazyllm.components.formatter.formatterbase import FunctionCallFormatter
    >>> formatter = FunctionCallFormatter()
    >>> 
    >>> # Process a message that contains a function call
    >>> msg = {
    ...     'role': 'assistant',
    ...     'content': 'I will call a function to get the weather.',
    ...     'tool_calls': [
    ...         {
    ...             'id': 'call_123',
    ...             'type': 'function',
    ...             'function': {
    ...                 'name': 'get_weather',
    ...                 'arguments': '{"location": "Beijing"}'
    ...             }
    ...         }
    ...     ],
    ...     'other_field': 'will be filtered'
    ... }
    >>> result = formatter.format(msg)
    >>> print(result)
    {'role': 'assistant', 'content': 'I will call a function to get the weather.', 'tool_calls': [{'id': 'call_123', 'type': 'function', 'function': {'name': 'get_weather', 'arguments': '{"location": "Beijing"}'}}]}
    >>> 
    >>> # Process a message that contains only some of the fields
    >>> msg2 = {
    ...     'role': 'assistant',
    ...     'content': 'Hello, how can I help you?'
    ... }
    >>> result2 = formatter.format(msg2)
    >>> print(result2)
    {'role': 'assistant', 'content': 'Hello, how can I help you?'}
    """
    def format(self, msg):
        assert isinstance(msg, dict), 'FunctionCallFormatter only supports dict input.'
        return {k: msg[k] for k in ('role', 'content', 'tool_calls', 'reasoning_content') if k in msg}

lazyllm.components.formatter.formatterbase.PipelineFormatter

Bases: LazyLLMFormatterBase

Pipeline formatter for encapsulating data processing pipelines as formatters.

This class wraps Pipeline instances as formatters and supports combining multiple formatters through pipe operators.

Parameters:

  • formatter (Pipeline) –

    Pipeline instance to encapsulate
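
How the pipe operator combines two wrapped pipelines can be sketched with stand-in classes. `MiniPipeline` and `MiniPipelineFormatter` are hypothetical and do not use lazyllm's real `Pipeline`. They only mirror the unwrap-and-rewrap logic of the `__or__` method in the source listing for this class.

```python
class MiniPipeline:
    """Hypothetical stand-in for a pipeline: applies callables in order."""

    def __init__(self, *steps):
        self.steps = list(steps)

    def __call__(self, data):
        for step in self.steps:
            data = step(data)
        return data

    def __or__(self, other):
        # Appending another pipeline flattens its steps into a new pipeline.
        extra = other.steps if isinstance(other, MiniPipeline) else [other]
        return MiniPipeline(*self.steps, *extra)


class MiniPipelineFormatter:
    """Wraps a MiniPipeline so that formatters can be chained with `|`."""

    def __init__(self, pipeline):
        self._formatter = pipeline

    def __call__(self, data):
        return self._formatter(data)

    def __or__(self, other):
        if isinstance(other, MiniPipelineFormatter):
            # Unwrap the right-hand side, join the pipelines, wrap the result.
            return MiniPipelineFormatter(self._formatter | other._formatter)
        return NotImplemented


strip = MiniPipelineFormatter(MiniPipeline(str.strip))
upper = MiniPipelineFormatter(MiniPipeline(str.upper))
combined = strip | upper
print(combined("  hello "))
```

Returning `NotImplemented` for unsupported operands lets Python try the right-hand operand's reflected method, or raise `TypeError`, instead of failing silently.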

Source code in lazyllm/components/formatter/formatterbase.py
class PipelineFormatter(LazyLLMFormatterBase):
    """Pipeline formatter for encapsulating data processing pipelines as formatters.

This class wraps Pipeline instances as formatters and supports combining multiple formatters through pipe operators.

Args:
    formatter (Pipeline): Pipeline instance to encapsulate
"""
    def __init__(self, formatter: Pipeline):
        self._formatter = formatter

    def _parse_py_data_by_formatter(self, py_data):
        return self._formatter(py_data)

    def __or__(self, other):
        if isinstance(other, LazyLLMFormatterBase):
            if isinstance(other, PipelineFormatter): other = other._formatter
            return PipelineFormatter(self._formatter | other)
        return NotImplemented

ComponentBase

lazyllm.components.core.ComponentBase

Bases: object

Base class for components, providing a unified interface and basic implementation to facilitate creation of various components.
Components execute tasks via a specified launcher and support custom task execution logic.

Parameters:

  • launcher (LazyLLMLaunchersBase or type, default: empty() ) –

    Launcher instance or launcher class used by the component, defaults to empty launcher.

Examples:

>>> from lazyllm.components.core import ComponentBase
>>> class MyComponent(ComponentBase):
...     def apply(self, x):
...         return x * 2
>>> comp = MyComponent()
>>> comp.name = "ExampleComponent"
>>> print(comp.name)
ExampleComponent
>>> result = comp(10)
>>> print(result)
20
>>> print(comp.apply(5))
10
Source code in lazyllm/components/core.py
class ComponentBase(object, metaclass=LazyLLMRegisterMetaClass):
    """Base class for components, providing a unified interface and basic implementation to facilitate creation of various components.  
Components execute tasks via a specified launcher and support custom task execution logic.

Args:
    launcher (LazyLLMLaunchersBase or type, optional): Launcher instance or launcher class used by the component, defaults to empty launcher.


Examples:
    >>> from lazyllm.components.core import ComponentBase
    >>> class MyComponent(ComponentBase):
    ...     def apply(self, x):
    ...         return x * 2
    >>> comp = MyComponent()
    >>> comp.name = "ExampleComponent"
    >>> print(comp.name)
    ExampleComponent
    >>> result = comp(10)
    >>> print(result)
    20
    >>> print(comp.apply(5))
    10
    """
    def __init__(self, *, launcher=launchers.empty()):  # noqa B008
        self._llm_name = None
        self.job = ReadOnlyWrapper()
        if isinstance(launcher, LazyLLMLaunchersBase):
            self._launcher = launcher
        elif isinstance(launcher, type) and issubclass(launcher, LazyLLMLaunchersBase):
            self._launcher = launcher()
        else:
            raise RuntimeError('Invalid launcher given:', launcher)

    def apply():
        """Core execution method of the component, to be implemented by subclasses.  
Defines the specific business logic or task execution steps of the component.

**Note:**  
If this method is overridden by the subclass, it will be called when the component is invoked.
"""
        raise NotImplementedError('please implement function \'apply\'')

    def cmd(self, *args, **kw) -> Union[str, tuple, list]:
        """Generates the execution command of the component, to be implemented by subclasses.  
The returned command can be a string, tuple, or list, representing the instruction to execute the task.

**Note:**  
If the `apply` method is not overridden, this command will be used to create a job for the launcher to run.
"""
        raise NotImplementedError('please implement function \'cmd\'')

    @property
    def name(self): return self._llm_name
    @name.setter
    def name(self, name): self._llm_name = name

    @property
    def launcher(self): return self._launcher

    def _get_job_with_cmd(self, *args, **kw):
        cmd = self.cmd(*args, **kw)
        cmd = cmd if isinstance(cmd, LazyLLMCMD) else LazyLLMCMD(cmd)
        return self._launcher.makejob(cmd=cmd)

    def _overwrote(self, f):
        return getattr(self.__class__, f) is not getattr(__class__, f) or \
            getattr(self.__class__, '__reg_overwrite__', None) == f

    def __call__(self, *args, **kw):
        if self._overwrote('apply'):
            assert not self._overwrote('cmd'), (
                'Cannot overwrite \'cmd\' and \'apply\' in the same class')
            assert isinstance(self._launcher, launchers.Empty), 'Please use EmptyLauncher instead.'
            return self._launcher.launch(self.apply, *args, **kw)
        else:
            job = self._get_job_with_cmd(*args, **kw)
            self.job.set(job)
            return self._launcher.launch(job)

    def __repr__(self):
        return lazyllm.make_repr('lazyllm.llm.' + self.__class__._lazy_llm_group,
                                 self.__class__.__name__, name=self.name)
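
The `__init__` in the listing above accepts either a launcher instance or a launcher class. A minimal sketch of that normalization follows. It is not the LazyLLM implementation: `LauncherBase`, `EmptyLauncher`, and `resolve_launcher` are hypothetical stand-ins introduced only for illustration.

```python
class LauncherBase:
    """Hypothetical stand-in for LazyLLMLaunchersBase (not the real class)."""


class EmptyLauncher(LauncherBase):
    """Hypothetical stand-in for an empty launcher."""


def resolve_launcher(launcher):
    # An existing launcher instance is used as-is.
    if isinstance(launcher, LauncherBase):
        return launcher
    # A launcher class is instantiated with no arguments.
    if isinstance(launcher, type) and issubclass(launcher, LauncherBase):
        return launcher()
    # Anything else is rejected, mirroring the RuntimeError in the listing.
    raise RuntimeError('Invalid launcher given:', launcher)


shared = EmptyLauncher()
from_instance = resolve_launcher(shared)       # the same object comes back
from_class = resolve_launcher(EmptyLauncher)   # a fresh instance is created

try:
    resolve_launcher('remote')                 # a plain string is not a launcher
    rejected = False
except RuntimeError:
    rejected = True
```

Passing a class is convenient when no configuration is needed; passing an instance lets several components share one configured launcher.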

apply()

Core execution method of the component, to be implemented by subclasses.
Defines the specific business logic or task execution steps of the component.

Note:
If this method is overridden by the subclass, it will be called when the component is invoked.
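
The choice between `apply` and `cmd` at call time can be sketched without LazyLLM installed. The block below is a simplified stand-in, assuming only the override check shown in `_overwrote` and `__call__` in the source listing; `MiniComponent`, `Doubler`, and `Echo` are hypothetical names, and the launcher is replaced by a plain return value.

```python
class MiniComponent:
    """Hypothetical stand-in for ComponentBase; no launcher involved."""

    def apply(self, *args, **kw):
        raise NotImplementedError("please implement function 'apply'")

    def cmd(self, *args, **kw):
        raise NotImplementedError("please implement function 'cmd'")

    def _overwrote(self, name):
        # A method counts as overridden when the subclass attribute is a
        # different function object from the one defined on the base class.
        return getattr(type(self), name) is not getattr(MiniComponent, name)

    def __call__(self, *args, **kw):
        if self._overwrote('apply'):
            # Overriding both methods in one class is rejected.
            assert not self._overwrote('cmd'), "Cannot overwrite 'cmd' and 'apply'"
            return self.apply(*args, **kw)
        # No 'apply' override: build the command; a real launcher would run it.
        return ('would launch', self.cmd(*args, **kw))


class Doubler(MiniComponent):
    def apply(self, x):
        return x * 2


class Echo(MiniComponent):
    def cmd(self, text):
        return f'echo {text}'


doubled = Doubler()(10)       # takes the 'apply' path
launched = Echo()('hello')    # takes the 'cmd' path
```

In the real class the `apply` path additionally requires an empty launcher, and the `cmd` path wraps the command in a job before launching it.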

Source code in lazyllm/components/core.py
    def apply():
        """Core execution method of the component, to be implemented by subclasses.  
Defines the specific business logic or task execution steps of the component.

**Note:**  
If this method is overridden by the subclass, it will be called when the component is invoked.
"""
        raise NotImplementedError('please implement function \'apply\'')

cmd(*args, **kw)

Generates the execution command of the component, to be implemented by subclasses.
The returned command can be a string, tuple, or list, representing the instruction to execute the task.

Note:
If the apply method is not overridden, this command will be used to create a job for the launcher to run.

Source code in lazyllm/components/core.py
    def cmd(self, *args, **kw) -> Union[str, tuple, list]:
        """Generates the execution command of the component, to be implemented by subclasses.  
The returned command can be a string, tuple, or list, representing the instruction to execute the task.

**Note:**  
If the `apply` method is not overridden, this command will be used to create a job for the launcher to run.
"""
        raise NotImplementedError('please implement function \'cmd\'')

lazyllm.components.deploy.ray.Distributed

Bases: LazyLLMDeployBase

Distributed deployment class that inherits from LazyLLMDeployBase.

Provides distributed model deployment based on the Ray framework and supports multi-node cluster deployment.

Parameters:

  • launcher

    Launcher used for deployment. Defaults to launchers.remote(ngpus=1).

  • port (int, default: None ) –

    Service port number. Defaults to a random port between 30000 and 40000 (inclusive).

Attributes:

  • finetuned_model

    Fine-tuned model path

  • base_model

    Base model path

  • master_ip

    Master node IP address

Methods:

  • cmd

    Generate deployment command

  • geturl

    Get deployed service URL address

Source code in lazyllm/components/deploy/ray.py
class Distributed(LazyLLMDeployBase):
    """Distributed deployment class, inherits from LazyLLMDeployBase.

Provides distributed model deployment based on the Ray framework and supports multi-node cluster deployment.

Args:
    launcher: Launcher used for deployment. Defaults to launchers.remote(ngpus=1).
    port (int, optional): Service port number. Defaults to a random port between 30000 and 40000 (inclusive).

Attributes:
    finetuned_model: Fine-tuned model path
    base_model: Base model path
    master_ip: Master node IP address

Methods:
    cmd(finetuned_model, base_model, master_ip): Generate deployment command
    geturl(job): Get deployed service URL address
"""

    def __init__(self, launcher=launchers.remote(ngpus=1), port=None):  # noqa B008
        super().__init__(launcher=launcher)
        self.port = port or random.randint(30000, 40000)
        self.finetuned_model = None
        self.base_model = None
        self.master_ip = None

    def cmd(self, finetuned_model=None, base_model=None, master_ip=None):
        """Generate Ray distributed deployment command.

The command depends on whether master_ip is set: with no master_ip the node starts as the Ray head node; otherwise it starts as a worker that joins the cluster at master_ip.

Args:
    finetuned_model: Fine-tuned model path
    base_model: Base model path
    master_ip: Master node IP address. If empty, the node starts as the head node.

Returns:
    LazyLLMCMD: Object containing deployment command
"""
        self.finetuned_model = finetuned_model
        self.base_model = base_model
        self.master_ip = master_ip
        if not self.master_ip:
            cmd = f'ray start --block --head --port={self.port} && sleep 365d'
        else:
            cmd = f'ray start --address={self.master_ip} && sleep 365d'
        return LazyLLMCMD(cmd=cmd, return_value=self.geturl)

    def geturl(self, job=None):
        """Get URL address of distributed deployment service.

Returns the service address for the current mode. In display mode the address is None; otherwise it is master_ip when set, or the job's IP and port when the node is the head.

Args:
    job: Job object, defaults to current job

Returns:
    Package: Packaged object containing model path and service address
"""
        time.sleep(5)
        if job is None:
            job = self.job
        if lazyllm.config['mode'] == lazyllm.Mode.Display:
            return lazyllm.package(self.finetuned_model, self.base_model, None)
        else:
            if self.master_ip:
                return lazyllm.package(self.finetuned_model, self.base_model, self.master_ip)
            return lazyllm.package(self.finetuned_model, self.base_model, f'{job.get_jobip()}:{self.port}')
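
The address selection in `geturl` above reduces to three cases. The sketch below isolates only that decision, as a reading of the listing rather than the library's code: `resolve_service_address` is a hypothetical helper, `display_mode` stands in for the `lazyllm.config['mode']` check, and `job_ip` stands in for `job.get_jobip()`. The real method also sleeps for 5 seconds and packages the result together with the two model paths.

```python
def resolve_service_address(display_mode, master_ip, job_ip, port):
    # Display mode: nothing is actually deployed, so there is no address.
    if display_mode:
        return None
    # Worker node: report the master address it was told to join.
    if master_ip:
        return master_ip
    # Head node: report its own job IP together with the service port.
    return f'{job_ip}:{port}'


# Example values below are illustrative only.
display = resolve_service_address(True, None, '10.0.0.5', 31000)
worker = resolve_service_address(False, '10.0.0.5:31000', '10.0.0.9', 31000)
head = resolve_service_address(False, None, '10.0.0.5', 31000)
```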

cmd(finetuned_model=None, base_model=None, master_ip=None)

Generate the Ray distributed deployment command.

The command depends on whether master_ip is set: with no master_ip the node starts as the Ray head node; otherwise it starts as a worker that joins the cluster at master_ip.

Parameters:

  • finetuned_model

    Fine-tuned model path

  • base_model

    Base model path

  • master_ip

    Master node IP address. If empty, the node starts as the head node.

Returns:

  • LazyLLMCMD

    Object containing deployment command
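
The two command shapes can be reproduced with plain string formatting. This is a minimal sketch that mirrors the branches in the source listing. `build_ray_cmd` is a hypothetical free function, not part of LazyLLM, and the address `10.0.0.5:31000` is an example value.

```python
import random


def build_ray_cmd(port=None, master_ip=None):
    # Mirror the constructor default: a random port in [30000, 40000].
    port = port or random.randint(30000, 40000)
    if not master_ip:
        # No master address: start this node as the Ray head and block.
        return f'ray start --block --head --port={port} && sleep 365d'
    # Otherwise join the existing cluster at master_ip.
    return f'ray start --address={master_ip} && sleep 365d'


head_cmd = build_ray_cmd(port=31000)
worker_cmd = build_ray_cmd(port=31000, master_ip='10.0.0.5:31000')
```

The worker branch passes master_ip to `--address` unchanged and ignores the local port, so the value is expected to already carry the head node's port. This matches the `ip:port` string that the head node's `geturl` returns.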

Source code in lazyllm/components/deploy/ray.py
    def cmd(self, finetuned_model=None, base_model=None, master_ip=None):
        """Generate Ray distributed deployment command.

The command depends on whether master_ip is set: with no master_ip the node starts as the Ray head node; otherwise it starts as a worker that joins the cluster at master_ip.

Args:
    finetuned_model: Fine-tuned model path
    base_model: Base model path
    master_ip: Master node IP address. If empty, the node starts as the head node.

Returns:
    LazyLLMCMD: Object containing deployment command
"""
        self.finetuned_model = finetuned_model
        self.base_model = base_model
        self.master_ip = master_ip
        if not self.master_ip:
            cmd = f'ray start --block --head --port={self.port} && sleep 365d'
        else:
            cmd = f'ray start --address={self.master_ip} && sleep 365d'
        return LazyLLMCMD(cmd=cmd, return_value=self.geturl)