Module
lazyllm.module.ModuleBase
Bases: SessionConfigableBase
ModuleBase is the core base class in LazyLLM, defining the common interface and fundamental capabilities for all modules.
It abstracts training, deployment, inference, and evaluation logic, while also providing mechanisms for submodule management, hook registration, parameter passing, and recursive updates.
Custom modules should inherit from ModuleBase and implement the forward method to define specific inference logic.
Key Features
- Unified management of submodules, automatically tracking held ModuleBase instances.
- Support for Option type hyperparameters, enabling grid search and automated tuning.
- Hook system that allows executing custom logic before and after calls.
- Encapsulated update pipeline covering training, server deployment, and evaluation.
- Built-in evalset loading and parallel inference evaluation.
Parameters:
- return_trace (bool, default: False) – Whether to write inference results into the trace queue for debugging and tracking.
Use Cases
- When combining some or all of training, deployment, inference, and evaluation capabilities, e.g., an embedding model requiring both training and inference.
- When you want to recursively manage submodules through root-level methods such as start, update, and eval.
- When you want user parameters to be automatically propagated from outer modules to inner implementations (see WebModule).
- When you want the module to support parameter grid search (see TrialModule).
Examples:
>>> import lazyllm
>>> class Module(lazyllm.module.ModuleBase):
...     pass
...
>>> class Module2(lazyllm.module.ModuleBase):
...     def __init__(self):
...         super().__init__()
...         self.m = Module()
...
>>> m = Module2()
>>> m.submodules
[<Module type=Module>]
>>> m.m3 = Module()
>>> m.submodules
[<Module type=Module>, <Module type=Module>]
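The automatic submodule tracking shown in the example above can be approximated in plain Python. The following is a minimal sketch, not LazyLLM's implementation: `MiniModule`, `Leaf`, and `Parent` are hypothetical stand-ins, and the use of `__setattr__` as the tracking mechanism is an assumption made for illustration.

```python
class MiniModule:
    """Hypothetical stand-in for ModuleBase; it only tracks submodules."""

    def __init__(self):
        # Bypass our own __setattr__ while creating the registry itself.
        object.__setattr__(self, "_submodules", [])

    def __setattr__(self, name, value):
        # Any attribute that is itself a MiniModule is recorded as a submodule.
        if isinstance(value, MiniModule):
            self._submodules.append(value)
        object.__setattr__(self, name, value)

    @property
    def submodules(self):
        # Return a copy so callers cannot mutate the registry by accident.
        return list(self._submodules)


class Leaf(MiniModule):
    pass


class Parent(MiniModule):
    def __init__(self):
        super().__init__()
        self.m = Leaf()


p = Parent()
count_before = len(p.submodules)
p.m3 = Leaf()  # assigning after construction is tracked as well
count_after = len(p.submodules)
print(count_before, count_after)
```

Intercepting attribute assignment keeps the bookkeeping out of user code, which is why a subclass only needs to assign `self.m = ...` for the parent to know about it.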
Source code in lazyllm/module/module.py
eval(*, recursive=True)
Evaluate the module (and all its submodules). This function takes effect after the module has been set with an evaluation set using 'evalset'.
Parameters:
- recursive (bool, default: True) – Whether to recursively evaluate all submodules.
Examples:
>>> import lazyllm
>>> class MyModule(lazyllm.module.ModuleBase):
...     def forward(self, input):
...         return 'reply for input'
...
>>> m = MyModule()
>>> m.evalset([1, 2, 3])
>>> m.eval().eval_result
['reply for input', 'reply for input', 'reply for input']
Source code in lazyllm/module/module.py
evalset(evalset, load_f=None, collect_f=lambda x: x)
Set the evaluation set for the module.
During update or eval, the module will perform inference on the evaluation set, and the results will be stored in the eval_result variable.
Parameters:
- evalset (Union[list, str]) – Evaluation data list or path to an evaluation data file.
- load_f (Optional[Callable], default: None) – Function to load and parse the evaluation file into a list if evalset is a file path.
- collect_f (Callable, default: lambda x: x) – Function to post-process evaluation results.
Examples:
>>> import lazyllm
>>> m = lazyllm.module.TrainableModule().deploy_method(lazyllm.deploy.dummy).finetune_method(lazyllm.finetune.dummy).trainset("").mode("finetune").prompt(None)
>>> m.evalset([1, 2, 3])
>>> m.update()
INFO: (lazyllm.launcher) PID: dummy finetune!, and init-args is {}
>>> print(m.eval_result)
["reply for 1, and parameters is {'do_sample': False, 'temperature': 0.1}", "reply for 2, and parameters is {'do_sample': False, 'temperature': 0.1}", "reply for 3, and parameters is {'do_sample': False, 'temperature': 0.1}"]
Source code in lazyllm/module/module.py
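The roles of evalset, load_f, and collect_f described for evalset can be illustrated with a small standalone sketch. This is not LazyLLM code: `run_evalset` and `load_json_list` are hypothetical helpers, and two details are assumptions made here, namely that load_f receives the file path and that collect_f is applied once to the whole result list.

```python
import json
import os
import tempfile


def run_evalset(forward, evalset, load_f=None, collect_f=lambda x: x):
    """Hypothetical helper: load the data, run forward per item, post-process."""
    if isinstance(evalset, str):
        # A string is treated as a file path, so a loader is required.
        if load_f is None:
            raise ValueError("load_f is required when evalset is a file path")
        data = load_f(evalset)
    else:
        data = list(evalset)
    results = [forward(item) for item in data]
    return collect_f(results)


def load_json_list(path):
    with open(path, encoding="utf-8") as f:
        return json.load(f)


def forward(x):
    return f"reply for {x}"


# In-memory evaluation set with the default (identity) collector.
from_list = run_evalset(forward, [1, 2, 3])

# Same data read from a temporary JSON file and reduced to a count.
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "evalset.json")
    with open(path, "w", encoding="utf-8") as f:
        json.dump([1, 2, 3], f)
    from_file = run_evalset(forward, path, load_f=load_json_list, collect_f=len)

print(from_list)
print(from_file)
```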
forward(*args, **kw)
Forward computation interface that must be implemented by subclasses.
This method defines the logic for receiving inputs and returning outputs, and is the core function of the module as a functor.
Parameters:
- *args – Variable positional arguments; subclasses can define the inputs as needed.
- **kw – Variable keyword arguments; subclasses can define the inputs as needed.
Source code in lazyllm/module/module.py
start()
Start the deployment services of the module and all its submodules. This ensures that the server functionality of the module and its submodules is executed, suitable for initialization or restarting services.
Returns:
- ModuleBase: Returns itself to support method chaining
Examples:
>>> import lazyllm
>>> m = lazyllm.TrainableModule().deploy_method(lazyllm.deploy.dummy).prompt(None)
>>> m.start()
<Module type=Trainable mode=None basemodel= target= stream=False return_trace=False>
>>> m(1)
"reply for 1, and parameters is {'do_sample': False, 'temperature': 0.1}"
Source code in lazyllm/module/module.py
restart()
Restart the deployment services of the module and its submodules. Internally calls the start method to reinitialize the services.
Returns:
- ModuleBase: Returns itself to support method chaining
Examples:
>>> import lazyllm
>>> m = lazyllm.TrainableModule().deploy_method(lazyllm.deploy.dummy).prompt(None)
>>> m.restart()
<Module type=Trainable mode=None basemodel= target= stream=False return_trace=False>
>>> m(1)
"reply for 1, and parameters is {'do_sample': False, 'temperature': 0.1}"
Source code in lazyllm/module/module.py
update(*, recursive=True)
Update the module (and all its submodules). The module will be updated when the _get_train_tasks method is overridden.
Parameters:
- recursive (bool, default: True) – Whether to recursively update all submodules.
Examples:
>>> import lazyllm
>>> m = lazyllm.module.TrainableModule().finetune_method(lazyllm.finetune.dummy).trainset("").deploy_method(lazyllm.deploy.dummy).mode('finetune').prompt(None)
>>> m.evalset([1, 2, 3])
>>> m.update()
INFO: (lazyllm.launcher) PID: dummy finetune!, and init-args is {}
>>> print(m.eval_result)
["reply for 1, and parameters is {'do_sample': False, 'temperature': 0.1}", "reply for 2, and parameters is {'do_sample': False, 'temperature': 0.1}", "reply for 3, and parameters is {'do_sample': False, 'temperature': 0.1}"]
Source code in lazyllm/module/module.py
stream_output(stream_output=None)
Context manager for streaming output during inference or execution.
When a dictionary is provided to stream_output, a prefix and suffix can be specified along with optional colors.
Parameters:
- stream_output (Optional[Union[bool, Dict]], default: None) – Configuration for streaming output.
  - If True, enables default streaming output.
  - If a dictionary, may include:
    - 'prefix' (str): Text to output at the beginning.
    - 'prefix_color' (str, optional): Color of the prefix.
    - 'suffix' (str): Text to output at the end.
    - 'suffix_color' (str, optional): Color of the suffix.
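The prefix/suffix behaviour of a streaming context manager can be sketched with `contextlib`. This is a simplified illustration under stated assumptions, not the library's implementation: `stream_section` is a hypothetical name, output goes to an in-memory buffer, and the color keys are ignored.

```python
import contextlib
import io


@contextlib.contextmanager
def stream_section(config=None, out=None):
    """Hypothetical sketch: emit a prefix on entry and a suffix on exit."""
    out = out if out is not None else io.StringIO()
    # A bare True (or None) carries no prefix/suffix settings.
    settings = config if isinstance(config, dict) else {}
    if settings.get("prefix"):
        out.write(settings["prefix"])
    try:
        yield out
    finally:
        # The suffix is written even if the body raises.
        if settings.get("suffix"):
            out.write(settings["suffix"])


buffer = io.StringIO()
with stream_section({"prefix": "[start] ", "suffix": " [end]"}, out=buffer) as sink:
    sink.write("token-1 token-2")

result = buffer.getvalue()
print(result)
```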
Source code in lazyllm/module/module.py
used_by(module_id)
Mark which module is using the current module, indicating the calling relationship.
Supports chaining by returning the module itself.
Parameters:
- module_id (str) – Unique ID of the parent module that uses this module.
Returns:
- ModuleBase: Returns the module itself for method chaining.
Source code in lazyllm/module/module.py
register_hook(hook_type)
Register a hook to execute specific logic during module invocation.
The hook must inherit from LazyLLMHook and can be used to add custom operations before or after the module's forward computation, such as logging or metrics collection.
Parameters:
- hook_type (LazyLLMHook) – Hook object to register.
Source code in lazyllm/module/module.py
unregister_hook(hook_type)
Unregister a previously registered hook.
If the hook exists in the module, it will be removed and no longer executed during module invocation.
Parameters:
- hook_type (LazyLLMHook) – Hook object to unregister.
Source code in lazyllm/module/module.py
clear_hooks()
Clear all hooks registered in the module.
After calling this, the module will no longer execute any hook logic.
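The register/unregister/clear life cycle of hooks can be illustrated without LazyLLM. In this sketch every name is hypothetical: `LoggingHook`, `HookedCallable`, and in particular the method names `pre_hook` and `post_hook` are assumptions chosen for illustration and may not match the LazyLLMHook interface.

```python
class LoggingHook:
    """Hypothetical hook; the method names pre_hook/post_hook are assumptions."""

    def __init__(self, log):
        self.log = log

    def pre_hook(self, *args, **kwargs):
        self.log.append(("before", args))

    def post_hook(self, output):
        self.log.append(("after", output))


class HookedCallable:
    """Hypothetical stand-in for a module that runs hooks around forward."""

    def __init__(self, forward):
        self._forward = forward
        self._hooks = []

    def register_hook(self, hook):
        self._hooks.append(hook)

    def unregister_hook(self, hook):
        # Removing a hook that was never registered is silently ignored.
        if hook in self._hooks:
            self._hooks.remove(hook)

    def clear_hooks(self):
        self._hooks.clear()

    def __call__(self, *args, **kwargs):
        for hook in self._hooks:
            hook.pre_hook(*args, **kwargs)
        output = self._forward(*args, **kwargs)
        for hook in self._hooks:
            hook.post_hook(output)
        return output


log = []
double = HookedCallable(lambda x: x * 2)
hook = LoggingHook(log)
double.register_hook(hook)
first = double(3)             # hook runs before and after this call
double.unregister_hook(hook)
second = double(4)            # no hook logic runs any more
print(first, second, log)
```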
update_server(*, recursive=True)
Update the deployment (server) part of the module and its submodules. When a module or submodule implements deployment functionality, the corresponding services will be started.
Parameters:
- recursive (bool, default: True) – Whether to recursively update deployment tasks of all submodules.
Source code in lazyllm/module/module.py
wait()
Wait for the module or its submodules to finish execution. Currently, this method is a no-op and can be implemented by subclasses according to specific deployment logic.
stop()
Stop the module and all its submodules. This method recursively calls the stop method of each submodule, suitable for releasing resources or shutting down services.
for_each(filter, action)
Execute a specified action on all submodules of the module. Recursively traverses all submodules, and if a submodule satisfies the filter condition, executes the action.
Parameters:
- filter (Callable) – A function that takes a submodule as input and returns a boolean, used to determine whether to perform the action.
- action (Callable) – A function to perform on submodules that meet the condition.
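A recursive filter-then-act traversal like the one for_each describes might look as follows. `Node` is a hypothetical stand-in for a module tree, and the visiting order (children before the submodule itself) is an assumption of this sketch rather than documented behaviour.

```python
class Node:
    """Hypothetical stand-in for a module that holds submodules."""

    def __init__(self, name, deployable=False, children=()):
        self.name = name
        self.deployable = deployable
        self.submodules = list(children)

    def for_each(self, filter, action):
        # Descend into each submodule first, then apply the action to it
        # if the filter accepts it.
        for sub in self.submodules:
            sub.for_each(filter, action)
            if filter(sub):
                action(sub)


tree = Node("root", children=[
    Node("retriever", deployable=True),
    Node("chain", children=[Node("llm", deployable=True), Node("formatter")]),
])

visited = []
tree.for_each(lambda m: m.deployable, lambda m: visited.append(m.name))
print(visited)
```

Note that the root itself is never passed to the filter here, matching the description that the action is executed on submodules.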
Source code in lazyllm/module/module.py
lazyllm.module.servermodule.LLMBase
Bases: object
Base class for large language model modules. Its only base is object; concrete classes such as UrlModule combine it with ModuleBase.
Manages initialization and switching of streaming output, prompts, and formatters; processes file information in inputs; supports instance sharing.
Parameters:
- stream (bool or dict, default: False) – Whether to enable streaming output, or a streaming configuration.
- return_trace (bool) – Whether to return the execution trace, default is False.
- init_prompt (bool, default: True) – Whether to automatically create a default prompt at initialization.
Source code in lazyllm/module/servermodule.py
prompt(prompt=None, history=None)
Set or switch the prompt. Supports None, PrompterBase subclass, or string/dict to create ChatPrompter.
Parameters:
- prompt (str / dict / PrompterBase / None, default: None) – The prompt to set.
- history (list, default: None) – Conversation history, only valid when prompt is str or dict.
Returns:
- self: For chaining calls.
Source code in lazyllm/module/servermodule.py
formatter(format=None)
Set or switch the output formatter. Supports None, FormatterBase subclass or callable.
Parameters:
- format (FormatterBase / Callable / None, default: None) – Formatter object or function.
Returns:
- self: For chaining calls.
Source code in lazyllm/module/servermodule.py
share(prompt=None, format=None, stream=None, history=None, copy_static_params=False)
Creates a shallow copy of the current instance, with optional resetting of prompt, formatter, and stream attributes.
Useful for scenarios where multiple sessions or agents share a base configuration but customize certain parameters.
Parameters:
- prompt (str / dict / PrompterBase / None, default: None) – New prompt, optional.
- format (FormatterBase / None, default: None) – New formatter, optional.
- stream (bool / dict / None, default: None) – New streaming settings, optional.
- history (list / None, default: None) – New conversation history, effective only when setting prompt.
Returns:
- LLMBase: The new shared instance.
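The "shallow copy with selective overrides" idea behind share can be sketched with the standard `copy` module. `SharedLLM` and its attribute names are hypothetical and only illustrate the pattern; the real method accepts more arguments than shown here.

```python
import copy


class SharedLLM:
    """Hypothetical stand-in; attribute names are illustrative only."""

    def __init__(self, backend, prompt=None, stream=False):
        self.backend = backend  # heavy resource shared by all copies
        self.prompt = prompt
        self.stream = stream

    def share(self, prompt=None, stream=None):
        # copy.copy makes a shallow copy: the new object references the
        # same backend, while per-instance settings can be replaced.
        new = copy.copy(self)
        if prompt is not None:
            new.prompt = prompt
        if stream is not None:
            new.stream = stream
        return new


base = SharedLLM(backend={"name": "demo-backend"}, prompt="You are helpful.")
agent = base.share(prompt="You are a translator.", stream=True)
print(agent.backend is base.backend, base.prompt, agent.prompt)
```

Because the copy is shallow, overriding the prompt on the shared instance leaves the original untouched, while both keep pointing at the same underlying service.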
Source code in lazyllm/module/servermodule.py
lazyllm.module.ActionModule
Bases: ModuleBase
Wraps functions, modules, flows, and other callable objects into a Module. Any wrapped Module (including Modules inside a flow) becomes a submodule of this Module.
Parameters:
- action (Callable | list[Callable], default: ()) – The object to be wrapped, which is one or a set of callable objects.
- return_trace (bool, default: False) – Whether to enable trace mode to record the execution stack.
Examples:
>>> import lazyllm
>>> def myfunc(input): return input + 1
...
>>> class MyModule1(lazyllm.module.ModuleBase):
...     def forward(self, input): return input * 2
...
>>> class MyModule2(lazyllm.module.ModuleBase):
...     def _get_deploy_tasks(self): return lazyllm.pipeline(lambda : print('MyModule2 deployed!'))
...     def forward(self, input): return input * 4
...
>>> class MyModule3(lazyllm.module.ModuleBase):
...     def _get_deploy_tasks(self): return lazyllm.pipeline(lambda : print('MyModule3 deployed!'))
...     def forward(self, input): return f'get {input}'
...
>>> m = lazyllm.ActionModule(myfunc, lazyllm.pipeline(MyModule1(), MyModule2), MyModule3())
>>> print(m(1))
get 16
>>>
>>> m.evalset([1, 2, 3])
>>> m.update()
MyModule2 deployed!
MyModule3 deployed!
>>> print(m.eval_result)
['get 16', 'get 24', 'get 32']
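The values printed in the example above follow from feeding each step's output into the next: for input 1, (1 + 1) * 2 * 4 = 16. The sketch below reproduces only that arithmetic in plain Python; `chain` is a hypothetical helper and does not model deployment or submodule registration.

```python
from functools import reduce


def chain(*steps):
    """Hypothetical helper: feed each step's output into the next step."""
    return lambda value: reduce(lambda acc, step: step(acc), steps, value)


def add_one(x):      # plays the role of myfunc
    return x + 1


def times_two(x):    # plays the role of MyModule1.forward
    return x * 2


def times_four(x):   # plays the role of MyModule2.forward
    return x * 4


def describe(x):     # plays the role of MyModule3.forward
    return f"get {x}"


# The nested chain mirrors the inner pipeline in the original example.
composed = chain(add_one, chain(times_two, times_four), describe)
results = [composed(i) for i in [1, 2, 3]]
print(results)
```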
evalset(evalset, load_f=None, collect_f=<function ModuleBase.<lambda>>)
Set the evaluation set for the Module. Modules that have been set with an evaluation set will be evaluated during update or eval, and the evaluation results will be stored in the eval_result variable.
evalset(evalset, collect_f=lambda x: ...)→ None
Parameters:
- evalset (list) – Evaluation set
- collect_f (Callable) – Post-processing method for evaluation results, no post-processing by default.
evalset(evalset, load_f=None, collect_f=lambda x: ...)→ None
Parameters:
- evalset (str) – Path to the evaluation set
- load_f (Callable) – Method for loading the evaluation set, including parsing file formats and converting to a list
- collect_f (Callable) – Post-processing method for evaluation results, no post-processing by default.
Examples:
>>> import lazyllm
>>> m = lazyllm.module.TrainableModule().deploy_method(lazyllm.deploy.dummy)
>>> m.evalset([1, 2, 3])
>>> m.update()
INFO: (lazyllm.launcher) PID: dummy finetune!, and init-args is {}
>>> m.eval_result
["reply for 1, and parameters is {'do_sample': False, 'temperature': 0.1}", "reply for 2, and parameters is {'do_sample': False, 'temperature': 0.1}", "reply for 3, and parameters is {'do_sample': False, 'temperature': 0.1}"]
Source code in lazyllm/module/module.py
submodules
property
Returns all submodules of type ModuleBase contained in the wrapped action. This automatically traverses any nested modules inside a Pipeline.
Returns:
- list[ModuleBase]: List of submodules
forward(*args, **kw)
Executes the wrapped action with the provided input arguments. Equivalent to directly calling the module.
Parameters:
- args (list of callables or single callable, default: ()) – Positional arguments to be passed to the wrapped action.
- kwargs (dict of callables) – Keyword arguments to be passed to the wrapped action.
Returns:
- Any: The result of executing the wrapped action.
Source code in lazyllm/module/module.py
lazyllm.module.TrainableModule
Bases: UrlModule
Trainable module, all models (including LLM, Embedding, etc.) are served through TrainableModule
TrainableModule(base_model='', target_path='', *, stream=False, return_trace=False)
Parameters:
- base_model (str, default: '') – Name or path of the base model.
- target_path (str, default: '') – Path to save the fine-tuning task.
- stream (bool, default: False) – Whether to stream the output.
- return_trace (bool, default: False) – Whether to record the results in trace.
- trust_remote_code (bool, default: True) – Whether to trust remote code.
- type (str / LLMType, default: None) – Model type.
- source (str, default: None) – Model source. If not set, the value is read from the environment variable LAZYLLM_MODEL_SOURCE.
TrainableModule.trainset(v):
Set the training set for TrainableModule
Parameters:
- v (str) – Path to the training/fine-tuning dataset.
Examples:
>>> import lazyllm
>>> m = lazyllm.module.TrainableModule().finetune_method(lazyllm.finetune.dummy).trainset('/file/to/path').deploy_method(None).mode('finetune')
>>> m.update()
INFO: (lazyllm.launcher) PID: dummy finetune!, and init-args is {}
TrainableModule.train_method(v, **kw):
Set the training method for TrainableModule. Continued pre-training is not supported yet, expected to be available in the next version.
Parameters:
- v (LazyLLMTrainBase) – Training method, options include train.auto etc.
- kw (**dict) – Parameters required by the training method, corresponding to v.
TrainableModule.finetune_method(v, **kw):
Set the fine-tuning method and its parameters for TrainableModule.
Parameters:
- v (LazyLLMFinetuneBase) – Fine-tuning method, options include finetune.auto / finetune.alpacalora / finetune.collie etc.
- kw (**dict) – Parameters required by the fine-tuning method, corresponding to v.
Examples:
>>> import lazyllm
>>> m = lazyllm.module.TrainableModule().finetune_method(lazyllm.finetune.dummy).deploy_method(None).mode('finetune')
>>> m.update()
INFO: (lazyllm.launcher) PID: dummy finetune!, and init-args is {}
TrainableModule.deploy_method(v, **kw):
Set the deployment method and its parameters for TrainableModule.
Parameters:
- v (LazyLLMDeployBase) – Deployment method, options include deploy.auto / deploy.lightllm / deploy.vllm etc.
- kw (**dict) – Parameters required by the deployment method, corresponding to v.
Examples:
>>> import lazyllm
>>> m = lazyllm.module.TrainableModule().deploy_method(lazyllm.deploy.dummy).mode('finetune')
>>> m.evalset([1, 2, 3])
>>> m.update()
INFO: (lazyllm.launcher) PID: dummy finetune!, and init-args is {}
>>> m.eval_result
["reply for 1, and parameters is {'do_sample': False, 'temperature': 0.1}", "reply for 2, and parameters is {'do_sample': False, 'temperature': 0.1}", "reply for 3, and parameters is {'do_sample': False, 'temperature': 0.1}"]
TrainableModule.mode(v):
Set whether to execute training or fine-tuning during update for TrainableModule.
Parameters:
- v (str) – Selects whether update executes training or fine-tuning; options are 'finetune' and 'train', default is 'finetune'.
Examples:
>>> import lazyllm
>>> m = lazyllm.module.TrainableModule().finetune_method(lazyllm.finetune.dummy).deploy_method(None).mode('finetune')
>>> m.update()
INFO: (lazyllm.launcher) PID: dummy finetune!, and init-args is {}
eval(*, recursive=True)
Evaluate the module (and all its submodules). This function takes effect after the module has set an evaluation set through evalset.
Parameters:
- recursive (bool) – Whether to recursively evaluate all submodules, default is True.
evalset(evalset, load_f=None, collect_f=<function ModuleBase.<lambda>>)
Set the evaluation set for the Module. Modules that have been set with an evaluation set will be evaluated during update or eval, and the evaluation results will be stored in the eval_result variable.
evalset(evalset, collect_f=lambda x: ...)→ None
Parameters:
- evalset (list) – Evaluation set
- collect_f (Callable) – Post-processing method for evaluation results, no post-processing by default.
evalset(evalset, load_f=None, collect_f=lambda x: ...)→ None
Parameters:
- evalset (str) – Path to the evaluation set
- load_f (Callable) – Method for loading the evaluation set, including parsing file formats and converting to a list
- collect_f (Callable) – Post-processing method for evaluation results, no post-processing by default.
Examples:
>>> import lazyllm
>>> m = lazyllm.module.TrainableModule().deploy_method(lazyllm.deploy.dummy)
>>> m.evalset([1, 2, 3])
>>> m.update()
INFO: (lazyllm.launcher) PID: dummy finetune!, and init-args is {}
>>> m.eval_result
["reply for 1, and parameters is {'do_sample': False, 'temperature': 0.1}", "reply for 2, and parameters is {'do_sample': False, 'temperature': 0.1}", "reply for 3, and parameters is {'do_sample': False, 'temperature': 0.1}"]
restart()
Restart the module and all its submodules.
Examples:
>>> import lazyllm
>>> m = lazyllm.module.TrainableModule().deploy_method(lazyllm.deploy.dummy)
>>> m.restart()
>>> m(1)
"reply for 1, and parameters is {'do_sample': False, 'temperature': 0.1}"
start()
Deploy the module and all its submodules.
Examples:
>>> import lazyllm
>>> m = lazyllm.module.TrainableModule().deploy_method(lazyllm.deploy.dummy)
>>> m.start()
>>> m(1)
"reply for 1, and parameters is {'do_sample': False, 'temperature': 0.1}"
Source code in lazyllm/module/llms/trainablemodule.py
wait()
Wait for the model deployment task to complete. This method blocks the current thread until the deployment is finished.
Examples:
>>> import lazyllm
>>> class Mywait(lazyllm.module.llms.TrainableModule):
...     def forward(self):
...         self.wait()
Source code in lazyllm/module/llms/trainablemodule.py
stop(task_name=None)
Pause a specific task of the model.
Parameters:
- task_name (str, default: None) – The name of the task to pause. Defaults to None, which pauses the 'deploy' task.
Examples:
>>> import lazyllm
>>> class Mystop(lazyllm.module.llms.TrainableModule):
...     def forward(self, task):
...         self.stop(task)
Source code in lazyllm/module/llms/trainablemodule.py
prompt(prompt='', history=None)
Processes the input prompt and generates a format compatible with the model.
Parameters:
- prompt (str, default: '') – The input prompt. Defaults to an empty string.
- history (List, default: None) – Conversation history.
Examples:
>>> import lazyllm
>>> class Myprompt(lazyllm.module.llms.TrainableModule):
...     def forward(self, prompt, history):
...         self.prompt(prompt, history)
Source code in lazyllm/module/llms/trainablemodule.py
log_path(task_name=None)
Get task log path.
Get corresponding log file path based on task name, supports default deployment tasks and manually specified tasks.
Parameters:
- task_name (Optional[str], default: None) – Task name; defaults to None (get the default deployment task log).
Returns:
- str: Log file path
Source code in lazyllm/module/llms/trainablemodule.py
forward_openai(__input=package(), *, llm_chat_history=None, lazyllm_files=None, tools=None, stream_output=False, max_retries=3, **kw)
Perform forward inference using OpenAI compatible interface.
Call deployed model service through OpenAI standard API format, supports chat history, file processing, tool calling and streaming output.
Parameters:
- __input (Union[Tuple[Union[str, Dict], str], str, Dict], default: package()) – Input data; can be text, a dictionary, or packaged data.
- llm_chat_history – Chat history records.
- lazyllm_files – File data.
- tools – Tool calling configuration.
- stream_output (bool, default: False) – Whether to stream output.
- **kw – Other keyword arguments.
Returns:
- Model inference result
Source code in lazyllm/module/llms/trainablemodule.py
forward_standard(__input=package(), *, llm_chat_history=None, lazyllm_files=None, tools=None, stream_output=False, max_retries=3, **kw)
Perform forward inference using standard interface.
Call deployed model service through custom standard API format, supports template messages, file encoding and streaming output.
Parameters:
- __input (Union[Tuple[Union[str, Dict], str], str, Dict], default: package()) – Input data; can be text, a dictionary, or packaged data.
- llm_chat_history – Chat history records.
- lazyllm_files – File data.
- tools – Tool calling configuration.
- stream_output (bool, default: False) – Whether to stream output.
- **kw – Other keyword arguments.
Returns:
- Model inference result
Source code in lazyllm/module/llms/trainablemodule.py
forward(__input=package(), *, llm_chat_history=None, lazyllm_files=None, tools=None, stream_output=False, max_retries=3, **kw)
Supports handling various input formats, automatically builds the input structure required by the model, and adapts to multimodal scenarios.
Examples:
>>> import lazyllm
>>> from lazyllm.module import TrainableModule
>>> class MyModule(TrainableModule):
...     def forward(self, __input, **kw):
...         return f"processed: {__input}"
...
>>> MyModule()("Hello")
'processed: Hello'
Source code in lazyllm/module/llms/trainablemodule.py
lazyllm.module.UrlModule
Bases: ModuleBase, LLMBase, _UrlHelper
The URL obtained from deploying a ServerModule can be wrapped into a Module. Calling the module (via __call__) sends the request to that service.
Parameters:
- url (str, default: '') – The URL of the service to be wrapped, defaults to an empty string.
- stream (bool | Dict[str, str], default: False) – Whether to request and output in streaming mode, default is non-streaming.
- return_trace (bool, default: False) – Whether to record the results in trace.
- init_prompt (bool, default: True) – Whether to initialize the prompt.
Examples:
>>> import lazyllm
>>> def demo(input): return input * 2
...
>>> s = lazyllm.ServerModule(demo, launcher=lazyllm.launchers.empty(sync=False))
>>> s.start()
INFO: Uvicorn running on http://0.0.0.0:35485
>>> u = lazyllm.UrlModule(url=s._url)
>>> print(u(1))
2
Source code in lazyllm/module/servermodule.py
forward(*args, **kw)
Defines the computation steps to be executed each time. All subclasses of ModuleBase need to override this function.
Examples:
>>> import lazyllm
>>> class MyModule(lazyllm.module.ModuleBase):
...     def forward(self, input):
...         return input + 1
...
>>> MyModule()(1)
2
Source code in lazyllm/module/servermodule.py
lazyllm.module.ServerModule
Bases: UrlModule
The ServerModule class inherits from UrlModule and provides functionality to deploy any callable object as an API service.
Built on FastAPI, it supports launching a main service with multiple satellite services, as well as preprocessing, postprocessing, and streaming capabilities.
A local callable can be deployed as a service, or an existing service can be accessed directly via a URL.
Parameters:
- m (Optional[Union[str, ModuleBase]], default: None) – The module or its name to be wrapped as a service. If a string is provided, it is treated as a URL and url must be None. If a ModuleBase is provided, it will be wrapped as a service.
- pre (Optional[Callable], default: None) – Preprocessing function executed in the service process.
- post (Optional[Callable], default: None) – Postprocessing function executed in the service process.
- stream (Union[bool, Dict], default: False) – Whether to enable streaming output. Can be a boolean or a dictionary with streaming configuration.
- return_trace (Optional[bool], default: False) – Whether to return debug trace information.
- port (Optional[int], default: None) – Port to deploy the service. If None, a random port will be assigned.
- pythonpath (Optional[str], default: None) – PYTHONPATH environment variable passed to the subprocess.
- launcher (Optional[LazyLLMLaunchersBase], default: None) – The launcher used to deploy the service. Defaults to asynchronous remote deployment.
- url (Optional[str], default: None) – URL of an already deployed service. If provided, m must be None.
Examples:
>>> import lazyllm
>>> def demo(input): return input * 2
...
>>> s = lazyllm.ServerModule(demo, launcher=lazyllm.launchers.empty(sync=False))
>>> s.start()
INFO: Uvicorn running on http://0.0.0.0:35485
>>> print(s(1))
2
>>> class MyServe(object):
...     def __call__(self, input):
...         return 2 * input
...
...     @lazyllm.FastapiApp.post
...     def server1(self, input):
...         return f'reply for {input}'
...
...     @lazyllm.FastapiApp.get
...     def server2(self):
...         return 'get method'
...
>>> m = lazyllm.ServerModule(MyServe(), launcher=lazyllm.launchers.empty(sync=False))
>>> m.start()
INFO: Uvicorn running on http://0.0.0.0:32028
>>> print(m(1))
2
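To make the roles of pre and post concrete, here is a minimal in-process sketch. It assumes that pre transforms the request before the wrapped callable runs and that post receives the callable's result together with the input the callable saw (the two-argument form follows the post=lambda x, ori: ... usage in the TrialModule example in this document). wrap_with_pre_post is a hypothetical helper, and no HTTP server is started.

```python
from typing import Any, Callable, Optional


def wrap_with_pre_post(func: Callable[[Any], Any],
                       pre: Optional[Callable[[Any], Any]] = None,
                       post: Optional[Callable[[Any, Any], Any]] = None) -> Callable[[Any], Any]:
    """Hypothetical helper: run pre -> func -> post in a single process."""
    def handler(request: Any) -> Any:
        # Assumption: pre rewrites the incoming request before func sees it.
        prepared = pre(request) if pre else request
        result = func(prepared)
        # Assumption: post gets the result plus the input that func received.
        return post(result, prepared) if post else result
    return handler


def demo(value):
    return value * 2


service = wrap_with_pre_post(demo, pre=lambda x: x + 1, post=lambda out, ori: f'{ori} -> {out}')
reply = service(1)
```

Here reply is the string built by post from the doubled value and the preprocessed input; without pre and post the handler simply returns demo's result.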
Source code in lazyllm/module/servermodule.py
wait()
Wait for the current module service to finish starting or executing.
Typically used to block the main thread until the service finishes or is interrupted.
stop()
Stop the current module service and its related subprocesses.
After this call, the module will no longer respond to requests.
lazyllm.module.AutoModel
A factory for quickly creating either an online OnlineModule or a local TrainableModule. It prioritizes user-provided arguments; when config is enabled, settings in auto_model_config_map can override them, and it automatically decides which module to build:
-
For online mode, arguments are passed through to OnlineModule (automatically matching OnlineChatModule / OnlineEmbeddingModule / OnlineMultiModalModule).
-
For local mode, it initializes TrainableModule with model and user parameters, then reads the config map for configuration values.
Parameters:
-
model(str) –Name of the model, e.g., Qwen3-32B. Required.
-
config_id(Optional[str]) –ID from the config file. Defaults to empty.
-
source(Optional[str]) –Provider for online modules (qwen/glm/openai). Set to local to force a local TrainableModule.
-
type(Optional[str]) –Model type. If omitted, it will try to fetch from kwargs or be inferred by the online module.
-
config(Union[str, bool]) –Whether to enable overrides from auto_model_config_map, or a user-specified config file path. Defaults to True.
-
**kwargs–Only the synonyms base_model, embed_model_name and model_name for model are accepted; no other user-supplied fields are allowed. Other model parameters (e.g. stream, type, url) should be specified in the configuration file (auto_model_config_map) and referenced via config_id so they are injected automatically.
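The keyword rule above (only synonyms of model are accepted) can be pictured with a small standalone sketch. resolve_model_name is a hypothetical helper, not part of LazyLLM; the precedence of an explicit model over its synonyms and the exception types are assumptions made for illustration.

```python
from typing import Any, Optional

# Synonyms for `model` listed in the parameter description above.
MODEL_ALIASES = ('base_model', 'embed_model_name', 'model_name')


def resolve_model_name(model: Optional[str] = None, **kwargs: Any) -> str:
    """Hypothetical helper: pick the model name from `model` or one of its synonyms."""
    for alias in MODEL_ALIASES:
        value = kwargs.pop(alias, None)
        # Assumption: an explicit `model` wins over any synonym.
        if model is None and value is not None:
            model = value
    if kwargs:
        # Assumption: any other user-supplied field is rejected with a TypeError.
        raise TypeError(f'unexpected arguments: {sorted(kwargs)}')
    if model is None:
        raise ValueError('a model name is required')
    return model


name = resolve_model_name(model_name='Qwen3-32B')
```

In this sketch, passing stream=True raises TypeError, reflecting the guidance that such settings belong in the configuration file rather than in the call.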
Source code in lazyllm/module/llms/automodel.py
lazyllm.module.TrialModule
Bases: object
A parameter grid search module. It traverses all of its submodules, collects every searchable parameter, and iterates over the parameter combinations, running fine-tuning, deployment, and evaluation for each.
Parameters:
-
m(Callable) –The submodule whose parameters will be grid-searched. Fine-tuning, deployment, and evaluation will be based on this module.
Examples:
>>> import lazyllm
>>> from lazyllm import finetune, deploy
>>> m = lazyllm.TrainableModule('b1', 't').finetune_method(finetune.dummy, **dict(a=lazyllm.Option(['f1', 'f2'])))
>>> m.deploy_method(deploy.dummy).mode('finetune').prompt(None)
>>> s = lazyllm.ServerModule(m, post=lambda x, ori: f'post2({x})')
>>> s.evalset([1, 2, 3])
>>> t = lazyllm.TrialModule(s)
>>> t.update()
>>>
dummy finetune!, and init-args is {a: f1}
dummy finetune!, and init-args is {a: f2}
[["post2(reply for 1, and parameters is {'do_sample': False, 'temperature': 0.1})", "post2(reply for 2, and parameters is {'do_sample': False, 'temperature': 0.1})", "post2(reply for 3, and parameters is {'do_sample': False, 'temperature': 0.1})"], ["post2(reply for 1, and parameters is {'do_sample': False, 'temperature': 0.1})", "post2(reply for 2, and parameters is {'do_sample': False, 'temperature': 0.1})", "post2(reply for 3, and parameters is {'do_sample': False, 'temperature': 0.1})"]]
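The example above runs once per candidate value of the searchable parameter. The following sketch shows the general idea of expanding searchable parameters into concrete configurations with itertools.product. Choice and expand_grid are hypothetical names for this illustration and do not reproduce lazyllm.Option or TrialModule internals.

```python
from itertools import product


class Choice:
    """Hypothetical stand-in for a searchable parameter holding candidate values."""

    def __init__(self, values):
        self.values = list(values)


def expand_grid(params):
    """Yield one concrete dict per combination of Choice values."""
    searchable = {k: v for k, v in params.items() if isinstance(v, Choice)}
    fixed = {k: v for k, v in params.items() if not isinstance(v, Choice)}
    keys = list(searchable)
    # product() enumerates every combination of the candidate values.
    for combo in product(*(searchable[k].values for k in keys)):
        yield {**fixed, **dict(zip(keys, combo))}


configs = list(expand_grid({'a': Choice(['f1', 'f2']), 'lr': Choice([0.1, 0.01]), 'epochs': 1}))
```

Two searchable parameters with two candidates each give four configurations, while the fixed epochs value is copied into every one.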
Source code in lazyllm/module/trialmodule.py
update()
Iterates through all configuration options of the module, updates the module in parallel using multiprocessing, and collects the evaluation results for each configuration.
Source code in lazyllm/module/trialmodule.py
work(m, q)
staticmethod
Static method to deepcopy the module, perform update in a subprocess, and put the evaluation result into a queue.
Parameters:
-
m(Callable) –The module to perform update on.
-
q(Queue) –Queue to store evaluation results.
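The copy, update, and enqueue pattern described for work can be sketched with the standard library alone. To keep the example short and portable, it uses a thread and queue.Queue in place of a subprocess and a multiprocessing queue. ToyModule is hypothetical, and having update() return the evaluation result is an assumption of this sketch.

```python
import copy
import queue
import threading


class ToyModule:
    """Hypothetical module whose update() returns an evaluation result."""

    def __init__(self, scale):
        self.scale = scale
        self.updated = False

    def update(self):
        self.updated = True
        return [self.scale * x for x in (1, 2, 3)]


def work(m, q):
    # Copy first so the caller's module is left untouched.
    m = copy.deepcopy(m)
    q.put(m.update())


original = ToyModule(scale=2)
results = queue.Queue()
worker = threading.Thread(target=work, args=(original, results))
worker.start()
worker.join()
evaluation = results.get()
```

Because the worker operates on a deep copy, original.updated stays False after the run, so each configuration can start from the same unmodified module.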
Source code in lazyllm/module/trialmodule.py
lazyllm.module.OnlineChatModule
Bases: _DynamicSourceRouterMixin, LLMBase
Manages and creates access modules for large-model platforms. It currently supports openai, sensenova, glm, kimi, qwen, doubao, ppio, and deepseek (deepseek access is not supported for now because the platform has temporarily suspended recharges). For how to obtain a platform's API key, see Getting Started.
Parameters:
-
model(str, default:None) –The model to access. The default depends on the source: gpt-3.5-turbo (openai), SenseChat-5 (sensenova), glm-4 (glm), moonshot-v1-8k (kimi), qwen-plus (qwen), mistral-7b-instruct-v0.2 (doubao), deepseek/deepseek-v3.2 (ppio). For Doubao, use the Model ID or Endpoint ID (see Getting the Inference Access Point) and activate the corresponding service on the Doubao platform first. A recognised source name can also be passed here; it will be automatically swapped into source.
-
source(str, default:None) –Specify the type of module to create. Options include openai/sensenova/glm/kimi/qwen/doubao/ppio/deepseek (not yet supported).
-
url(str, default:None) –Specify the base link of the platform to be accessed. The default is the official link. The alias base_url is also accepted.
-
system_prompt(str) –Specify the requested system prompt. The default is the official system prompt.
-
api_key(str, default:None) –You can pass an explicit API key. If set to auto or dynamic, the key is resolved from config at runtime, enabling dynamic key switching.
-
stream(bool, default:True) –Whether to request and output in streaming mode, default is streaming.
-
dynamic_auth(bool, default:False) –Whether to enable dynamic auth. When True, it is equivalent to api_key='dynamic'.
-
return_trace(bool, default:False) –Whether to record the results in trace, default is False.
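Two of the rules above (a source name passed as model is swapped into source, and dynamic_auth=True behaves like api_key='dynamic') can be pictured as a small normalization step. normalize_chat_args is a hypothetical helper, and the exact swap condition is an assumption; the real routing logic may differ.

```python
from typing import Optional, Tuple

# Source names listed in the parameter description above.
KNOWN_SOURCES = {'openai', 'sensenova', 'glm', 'kimi', 'qwen', 'doubao', 'ppio', 'deepseek'}


def normalize_chat_args(model: Optional[str] = None, source: Optional[str] = None,
                        api_key: Optional[str] = None,
                        dynamic_auth: bool = False) -> Tuple[Optional[str], Optional[str], Optional[str]]:
    """Hypothetical helper returning (model, source, api_key) after normalization."""
    # Assumption: swap only when model holds a source name and source does not.
    if model in KNOWN_SOURCES and source not in KNOWN_SOURCES:
        model, source = source, model
    # dynamic_auth=True is treated like api_key='dynamic'.
    if dynamic_auth:
        api_key = 'dynamic'
    return model, source, api_key


swapped = normalize_chat_args('sensenova')
```

With this sketch, passing only 'sensenova' yields a source of 'sensenova' and no explicit model, leaving the model to fall back to that source's default.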
Examples:
>>> import lazyllm
>>> from functools import partial
>>> m = lazyllm.OnlineChatModule(source="sensenova", stream=True)
>>> query = "Hello!"
>>> with lazyllm.ThreadPoolExecutor(1) as executor:
...     future = executor.submit(partial(m, llm_chat_history=[]), query)
...     while True:
...         if value := lazyllm.FileSystemQueue().dequeue():
...             print(f"output: {''.join(value)}")
...         elif future.done():
...             break
...     print(f"ret: {future.result()}")
...
output: Hello
output: ! How can I assist you today?
ret: Hello! How can I assist you today?
>>> from lazyllm.components.formatter import encode_query_with_filepaths
>>> vlm = lazyllm.OnlineChatModule(source="sensenova", model="SenseChat-Vision")
>>> query = "what is it?"
>>> inputs = encode_query_with_filepaths(query, ["/path/to/your/image"])
>>> print(vlm(inputs))
Source code in lazyllm/module/llms/onlinemodule/chat.py
lazyllm.module.llms.onlinemodule.supplier.doubao.DoubaoChat
Bases: OnlineChatModuleBase
Doubao online chat module, inheriting from OnlineChatModuleBase.
Encapsulates the Doubao API (ByteDance) for multi-turn Q&A interactions. Defaults to model doubao-1-5-pro-32k-250115, supporting streaming and optional trace return.
Parameters:
-
model(str, default:None) –The model name to use. Defaults to doubao-1-5-pro-32k-250115.
-
base_url(str, default:None) –Base URL of the API, default is "https://ark.cn-beijing.volces.com/api/v3/".
-
api_key(Optional[str], default:None) –Doubao API key. If not provided, it is read from lazyllm.config['doubao_api_key'].
-
stream(bool, default:True) –Whether to enable streaming output. Defaults to True.
-
return_trace(bool, default:False) –Whether to return trace information. Defaults to False.
-
**kwargs–Additional arguments passed to the base class OnlineChatModuleBase.
Source code in lazyllm/module/llms/onlinemodule/supplier/doubao.py
lazyllm.module.llms.onlinemodule.supplier.ppio.PPIOChat
Bases: OnlineChatModuleBase
PPIO (Paiou Cloud) online chat module, inheriting from OnlineChatModuleBase.
Encapsulates the PPIO API for multi-turn Q&A interactions. Defaults to model deepseek/deepseek-v3.2, supporting streaming and optional trace return. PPIO provides an OpenAI-compatible API.
Parameters:
-
model(str, default:None) –The model name to use. Defaults to deepseek/deepseek-v3.2.
-
base_url(str, default:None) –Base URL of the API, default is "https://api.ppinfra.com/openai".
-
api_key(Optional[str], default:None) –PPIO API key. If not provided, it is read from lazyllm.config['ppio_api_key'].
-
stream(bool, default:True) –Whether to enable streaming output. Defaults to True.
-
return_trace(bool, default:False) –Whether to return trace information. Defaults to False.
-
**kwargs–Additional arguments passed to the base class OnlineChatModuleBase.
Examples:
>>> import lazyllm
>>> # Set environment variable: export LAZYLLM_PPIO_API_KEY=your_api_key
>>> # Or create config file ~/.lazyllm/config.json: {"ppio_api_key": "your_api_key"}
>>> chat = lazyllm.OnlineChatModule(source='ppio', model='deepseek/deepseek-v3.2')
>>> response = chat('Hello, how are you?')
>>> print(response)
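The comments in the example name two places where the key can live: the LAZYLLM_PPIO_API_KEY environment variable and ~/.lazyllm/config.json. The sketch below shows one plausible lookup order using only the standard library. find_ppio_api_key is a hypothetical helper, and checking the environment variable before the file is an assumption, not a statement about how lazyllm.config resolves values.

```python
import json
import os
from pathlib import Path
from typing import Optional


def find_ppio_api_key(config_path: Optional[Path] = None) -> Optional[str]:
    """Hypothetical lookup: environment variable first, then the JSON config file."""
    # Assumption: the environment variable takes precedence over the file.
    key = os.environ.get('LAZYLLM_PPIO_API_KEY')
    if key:
        return key
    path = config_path or Path.home() / '.lazyllm' / 'config.json'
    if path.is_file():
        with open(path, encoding='utf-8') as f:
            return json.load(f).get('ppio_api_key')
    # Neither source provided a key.
    return None
```

Returning None rather than raising lets the caller decide whether a missing key is an error.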
Source code in lazyllm/module/llms/onlinemodule/supplier/ppio.py
lazyllm.module.OnlineEmbeddingModule
Bases: _DynamicSourceRouterMixin
Manages and creates online embedding service modules. It currently supports openai, sensenova, glm, qwen, and doubao.
Parameters:
-
model(str, default:None) –The model to access. The default depends on the source: text-embedding-ada-002 (openai), nova-embedding-stable (sensenova), embedding-2 (glm), text-embedding-v1 (qwen), doubao-embedding-text-240715 (doubao). For Doubao, use the Model ID or Endpoint ID (see Getting the Inference Access Point) and activate the corresponding service on the Doubao platform first. The aliases embed_model_name and model_name are also accepted. A recognised source name can be passed here too; it will be automatically swapped into source.
-
source(str, default:None) –Specify the type of module to create. Options are openai/sensenova/glm/qwen/doubao.
-
url(str, default:None) –Specify the base link of the platform to be accessed. The default is the official link. The aliases embed_url and base_url are also accepted.
-
type(str, default:None) –Service type, either embed or rerank. Inferred from the model name when omitted.
-
api_key(str, default:None) –You can pass an explicit API key. If set to auto or dynamic, the key is resolved from config at runtime, enabling dynamic key switching.
-
dynamic_auth(bool, default:False) –Whether to enable dynamic auth. When True, it is equivalent to api_key='dynamic'.
-
return_trace(bool, default:False) –Whether to record the results in trace. Defaults to False.
-
batch_size(int, default:32) –Batch size for bulk requests. Defaults to 32.
Examples:
>>> import lazyllm
>>> m = lazyllm.OnlineEmbeddingModule(source="sensenova")
>>> emb = m("hello world")
>>> print(f"emb: {emb}")
emb: [0.0010528564, 0.0063285828, 0.0049476624, -0.012008667, ..., -0.009124756, 0.0032043457, -0.051696777]
>>> m2 = lazyllm.OnlineEmbeddingModule("sensenova")
>>> emb2 = m2("hello world")
Source code in lazyllm/module/llms/onlinemodule/embedding.py
lazyllm.module.OnlineMultiModalModule
Bases: _DynamicSourceRouterMixin
Used to manage and create online multimodal service modules. Supported task types are stt / tts / text2image / image_editing.
Parameters:
-
model(str, default:None) –Model name to use.
-
source(str, default:None) –Supplier to use, such as qwen/glm/minimax/siliconflow/doubao.
-
type(str, default:None) –Multimodal task type, one of stt/tts/text2image/image_editing.
-
url(str, default:None) –Base URL of the platform. Defaults to each supplier's official endpoint. The alias base_url is also accepted.
-
api_key(str, default:None) –You can pass an explicit API key. If set to auto or dynamic, the key is resolved from config at runtime, enabling dynamic key switching.
-
dynamic_auth(bool, default:False) –Whether to enable dynamic auth. When True, it is equivalent to api_key='dynamic'.
-
return_trace(bool, default:False) –Whether to record the result in trace. Defaults to False.
Examples:
>>> import lazyllm
>>> stt = lazyllm.OnlineMultiModalModule(source='qwen', type='stt', api_key='dynamic')
>>> tts = lazyllm.OnlineMultiModalModule(source='qwen', type='tts', dynamic_auth=True)
>>> img = lazyllm.OnlineMultiModalModule(source='qwen', type='text2image')
Source code in lazyllm/module/llms/onlinemodule/multimodal.py
lazyllm.module.llms.onlinemodule.supplier.openai.OpenAIEmbed
Bases: LazyLLMOnlineEmbedModuleBase
Online embedding module using OpenAI.
This class wraps the OpenAI Embedding API, defaulting to the text-embedding-ada-002 model, and converts text into vector representations.
Parameters:
-
embed_url(str, default:None) –The URL endpoint of the OpenAI embedding API. Default is "https://api.openai.com/v1/embeddings".
-
embed_model_name(str, default:None) –The name of the embedding model to use. Default is "text-embedding-ada-002".
-
api_key(str, default:None) –The OpenAI API key. If not provided, it will be read from lazyllm.config.
Source code in lazyllm/module/llms/onlinemodule/supplier/openai.py
lazyllm.module.llms.onlinemodule.supplier.qwen.QwenSTT
Bases: LazyLLMOnlineSTTModuleBase
Speech-to-Text (STT) module based on Qwen's multimodal API, with paraformer-v2 as the default model.
Parameters:
-
model(str, default:None) –Model name. Defaults to None, in which case it will use lazyllm.config['qwen_stt_model_name'] or QwenSTT.MODEL_NAME.
-
api_key(str, default:None) –API key for Qwen service. Defaults to None.
-
return_trace(bool, default:False) –Whether to return intermediate trace information during inference. Defaults to False.
-
**kwargs–Additional parameters passed to the parent class LazyLLMOnlineSTTModuleBase.
Source code in lazyllm/module/llms/onlinemodule/supplier/qwen.py
lazyllm.module.OnlineChatModuleBase = LazyLLMOnlineChatModuleBase
module-attribute
lazyllm.module.OnlineEmbeddingModuleBase
Bases: LazyLLMOnlineBase
OnlineEmbeddingModuleBase is the base class for managing embedding model interfaces on open platforms. It sends text to a platform and returns the embedding vectors. Instantiating this class directly is not recommended; platform-specific classes should inherit from it and be instantiated instead.
If you need to support the capabilities of embedding models on a new open platform, please extend your custom class from OnlineEmbeddingModuleBase:
- If the request and response data formats of the new platform's embedding model are the same as OpenAI's, no additional processing is needed; simply pass the URL and model.
- If the request or response data formats of the new platform's embedding model differ from OpenAI's, you need to override the _encapsulated_data or _parse_response methods.
- Configure the api_key supported by the new platform as a global variable by using lazyllm.config.add(variable_name, type, default_value, environment_variable_name).
Parameters:
-
embed_url(str) –Embedding API URL address.
-
api_key(str) –API access key.
-
embed_model_name(str) –Embedding model name.
-
return_trace(bool, default:False) –Whether to return trace information, defaults to False.
Examples:
>>> import lazyllm
>>> from typing import Any, Dict
>>> from lazyllm.module import OnlineEmbeddingModuleBase
>>> class NewPlatformEmbeddingModule(OnlineEmbeddingModuleBase):
...     def __init__(self,
...                  embed_url: str = '<new platform embedding url>',
...                  embed_model_name: str = '<new platform embedding model name>'):
...         super().__init__(embed_url, lazyllm.config['new_platform_api_key'], embed_model_name)
...
>>> class NewPlatformEmbeddingModule1(OnlineEmbeddingModuleBase):
...     def __init__(self,
...                  embed_url: str = '<new platform embedding url>',
...                  embed_model_name: str = '<new platform embedding model name>'):
...         super().__init__(embed_url, lazyllm.config['new_platform_api_key'], embed_model_name)
...
...     def _encapsulated_data(self, text: str, **kwargs):
...         pass  # placeholder: build the request body for the new platform here
...         return json_data
...
...     def _parse_response(self, response: Dict[str, Any]):
...         pass  # placeholder: extract the embedding from the response here
...         return embedding
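The two placeholder methods above are left empty in the example. As a point of reference, the sketch below shows what the request body and response parsing look like for the OpenAI embeddings format, which the text names as the baseline. These are standalone functions with simplified, hypothetical signatures, and sample_response is hand-written for illustration rather than captured from a real service.

```python
from typing import Any, Dict, List


def encapsulate_data(text: str, model: str) -> Dict[str, Any]:
    """Build an OpenAI-style embedding request body."""
    return {'input': text, 'model': model}


def parse_response(response: Dict[str, Any]) -> List[float]:
    """Extract the first embedding vector from an OpenAI-style response."""
    return response['data'][0]['embedding']


payload = encapsulate_data('hello world', 'text-embedding-ada-002')
# Hand-written sample in the OpenAI response layout (not real model output).
sample_response = {
    'object': 'list',
    'data': [{'object': 'embedding', 'index': 0, 'embedding': [0.1, 0.2, 0.3]}],
    'model': 'text-embedding-ada-002',
}
vector = parse_response(sample_response)
```

A platform with a different layout would change only these two functions, which is the reason the base class exposes them as override points.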
Source code in lazyllm/module/llms/onlinemodule/base/onlineEmbeddingModuleBase.py
run_embed_batch(input, data, proxies, url=None, **kwargs)
Internal method for executing batch embedding processing.
This method handles batch text embedding requests, supporting both single-threaded and multi-threaded processing modes. It automatically adjusts batch size and retries on request failures, providing robust error handling mechanisms.
Parameters:
-
input(List) –Original input text list
-
data(List) –Encapsulated batch request data list
-
proxies–Proxy settings, set to None if NO_PROXY is True
-
url(str, default:None) –Full endpoint URL used for this request. Defaults to self._embed_url.
-
**kwargs–Additional keyword arguments
Returns:
- A list of embedding vectors, one per input text, in the same order as the input.
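The description mentions adjusting the batch size and retrying when a request fails. One common way to do that is to halve the batch size after a failure and retry from the same position, as sketched below for the single-threaded case. embed_with_retry and fake_send are hypothetical, and the halving policy is an assumption; the actual method may use a different strategy.

```python
from typing import Callable, List, Sequence


def embed_with_retry(texts: Sequence[str],
                     send: Callable[[List[str]], List[List[float]]],
                     batch_size: int = 32) -> List[List[float]]:
    """Hypothetical helper: embed texts in batches, shrinking the batch on failure."""
    vectors: List[List[float]] = []
    start = 0
    while start < len(texts):
        batch = list(texts[start:start + batch_size])
        try:
            vectors.extend(send(batch))
            start += len(batch)
        except RuntimeError:
            if batch_size == 1:
                raise  # a single item still fails, so give up
            # Assumption: halve the batch size and retry from the same position.
            batch_size = max(1, batch_size // 2)
    return vectors


def fake_send(batch: List[str]) -> List[List[float]]:
    # Stand-in for an HTTP call; rejects batches larger than 2 items.
    if len(batch) > 2:
        raise RuntimeError('batch too large')
    return [[float(len(text))] for text in batch]


vectors = embed_with_retry(['a', 'bb', 'ccc', 'dddd', 'eeeee'], fake_send, batch_size=8)
```

The output order matches the input order because a batch is only appended after it succeeds and the position advances by exactly the number of texts sent.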
Source code in lazyllm/module/llms/onlinemodule/base/onlineEmbeddingModuleBase.py
lazyllm.module.llms.onlinemodule.supplier.doubao.DoubaoEmbed
Bases: LazyLLMOnlineEmbedModuleBase
DoubaoEmbed class inherits from OnlineEmbeddingModuleBase, encapsulating the functionality to call Doubao's online text embedding service.
It supports remote text vector representation retrieval by specifying the service URL, model name, and API key.
Parameters:
-
embed_url(Optional[str], default:None) –URL of the Doubao text embedding service, defaulting to the Beijing region endpoint.
-
embed_model_name(Optional[str], default:None) –Name of the Doubao embedding model used, default is "doubao-embedding-text-240715".
-
api_key(Optional[str], default:None) –API key for accessing the Doubao service. If not provided, it is read from lazyllm config.
Source code in lazyllm/module/llms/onlinemodule/supplier/doubao.py
lazyllm.module.llms.onlinemodule.supplier.doubao.DoubaoMultimodalEmbed
Bases: LazyLLMOnlineMultimodalEmbedModuleBase
DoubaoMultimodalEmbed class inherits from OnlineEmbeddingModuleBase, encapsulating the functionality to call Doubao's online multimodal (text + image) embedding service.
It supports converting text and image inputs into a unified vector representation by specifying the service URL, model name, and API key, enabling remote retrieval of multimodal embeddings.
Parameters:
-
embed_url(Optional[str], default:None) –URL of the Doubao multimodal embedding service, defaulting to the Beijing region endpoint.
-
embed_model_name(Optional[str], default:None) –Name of the Doubao multimodal embedding model used, default is "doubao-embedding-vision-241215".
-
api_key(Optional[str], default:None) –API key for accessing the Doubao service. If not provided, it is read from lazyllm config.
Source code in lazyllm/module/llms/onlinemodule/supplier/doubao.py
lazyllm.module.llms.onlinemodule.supplier.glm.GLMChat
Bases: OnlineChatModuleBase, FileHandlerBase
GLMChat class inherits from OnlineChatModuleBase and FileHandlerBase, encapsulating the functionality of accessing Zhipu's GLM series models online.
It supports chat generation, file handling, and fine-tuning. The default model is GLM-4, but other trainable models (e.g., chatglm3-6b, chatglm_12b) are also supported.
Parameters:
-
base_url(Optional[str], default:None) –API endpoint for Zhipu GLM service, default is "https://open.bigmodel.cn/api/paas/v4/".
-
model(Optional[str], default:None) –Name of the GLM model to use. Defaults to "glm-4", or one from the TRAINABLE_MODEL_LIST.
-
api_key(Optional[str], default:None) –API key for accessing GLM service. If not provided, it is read from lazyllm config.
-
stream(Optional[bool], default:True) –Whether to enable streaming output. Defaults to True.
-
return_trace(Optional[bool], default:False) –Whether to return debug trace information. Defaults to False.
-
**kwargs–Additional optional parameters passed to OnlineChatModuleBase.
Source code in lazyllm/module/llms/onlinemodule/supplier/glm.py
lazyllm.module.llms.onlinemodule.supplier.glm.GLMText2Image
Bases: LazyLLMOnlineText2ImageModuleBase
GLM Text-to-Image module, inheriting from GLMMultiModal, encapsulates the functionality to generate images using the GLM CogView-4 model.
It supports generating a specified number of images with given resolution based on a text prompt and can call the remote service via an API key.
Parameters:
-
model_name(Optional[str], default:None) –Name of the GLM model to use, defaulting to "cogview-4-250304" or the 'glm_text_to_image_model_name' in config.
-
api_key(Optional[str], default:None) –API key to access the GLM image generation service.
-
return_trace(bool, default:False) –Whether to return debug trace information, default is False.
-
**kwargs–Additional parameters passed to GLMMultiModal.
Source code in lazyllm/module/llms/onlinemodule/supplier/glm.py
lazyllm.module.llms.onlinemodule.supplier.qwen.QwenText2Image
Bases: LazyLLMOnlineText2ImageModuleBase
Qwen Text-to-Image module and Image-Edit module, inheriting from LazyLLMOnlineText2ImageModuleBase, encapsulates the functionality to generate images using the Qwen Wanx2.1-t2i-turbo model.
It supports generating a specified number of images with given resolution based on a text prompt, and allows setting negative prompts, random seeds, and prompt extension. The service is called remotely via DashScope API.
Parameters:
-
model(Optional[str], default:None) –Name of the Qwen model to use, default is taken from config 'qwen_text2image_model_name', or "wanx2.1-t2i-turbo" if not set.
-
api_key(Optional[str], default:None) –API key for accessing DashScope service.
-
return_trace(bool, default:False) –Whether to return debug trace information, default is False.
-
**kwargs–Additional parameters passed to LazyLLMOnlineText2ImageModuleBase.
Source code in lazyllm/module/llms/onlinemodule/supplier/qwen.py
lazyllm.module.llms.onlinemodule.supplier.kimi.KimiChat
Bases: OnlineChatModuleBase
KimiChat class, inheriting from OnlineChatModuleBase, encapsulates the functionality to call Kimi chat service provided by Moonshot AI.
By specifying the API key, model name, and service URL, it supports safe and accurate Chinese and English Q&A interactions, as well as image input in base64 format.
Parameters:
-
base_url(str, default:None) –Base URL of the Kimi service, default is "https://api.moonshot.cn/".
-
model(str, default:None) –Kimi model name to use, default is "moonshot-v1-8k".
-
api_key(Optional[str], default:None) –API key for accessing Kimi service. If not provided, it is read from lazyllm config.
-
stream(bool, default:True) –Whether to enable streaming output, default is True.
-
return_trace(bool, default:False) –Whether to return debug trace information, default is False.
-
**kwargs–Additional parameters passed to OnlineChatModuleBase.
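Since the description mentions image input in base64 format, the sketch below shows how raw image bytes can be turned into a base64 data URL with the standard library. The message layout in build_user_message follows the OpenAI-style content-parts convention and is an assumption here; the payload KimiChat actually sends may differ. Both function names are hypothetical.

```python
import base64
from typing import Any, Dict, List


def image_to_data_url(image_bytes: bytes, mime_type: str = 'image/png') -> str:
    """Encode raw image bytes as a base64 data URL."""
    encoded = base64.b64encode(image_bytes).decode('ascii')
    return f'data:{mime_type};base64,{encoded}'


def build_user_message(question: str, image_bytes: bytes) -> Dict[str, Any]:
    # Assumption: OpenAI-style content parts; the module's real payload may differ.
    content: List[Dict[str, Any]] = [
        {'type': 'text', 'text': question},
        {'type': 'image_url', 'image_url': {'url': image_to_data_url(image_bytes)}},
    ]
    return {'role': 'user', 'content': content}


# Placeholder bytes stand in for a real image file's contents.
message = build_user_message('what is it?', b'\x89PNG fake bytes')
```

In practice the bytes would come from reading an image file in binary mode, with mime_type set to match the file.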
Source code in lazyllm/module/llms/onlinemodule/supplier/kimi.py
lazyllm.module.llms.onlinemodule.fileHandler.FileHandlerBase
FileHandlerBase is a base class for handling fine-tuning data files, mainly used for validating and converting fine-tuning data formats.
This class cannot be instantiated directly; it must be inherited by a subclass that implements specific file format conversion logic.
Capabilities include
- Validate that the fine-tuning data file is in standard .jsonl format.
- Check that each data entry contains messages in the correct format (with role and content fields).
- Verify that roles are within the allowed range (system, knowledge, user, assistant).
- Ensure each conversation example contains at least one assistant response.
- Provide temporary file storage for further processing.
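The validation rules listed above can be expressed as a short standalone checker. validate_jsonl_lines is a hypothetical function written for this illustration; it mirrors the listed rules but is not the class's own implementation, and the error messages are invented.

```python
import json
from typing import List

ALLOWED_ROLES = {'system', 'knowledge', 'user', 'assistant'}


def validate_jsonl_lines(lines: List[str]) -> None:
    """Hypothetical validator: raise ValueError on the first rule violation."""
    for number, line in enumerate(lines, start=1):
        try:
            entry = json.loads(line)
        except json.JSONDecodeError as exc:
            raise ValueError(f'line {number}: not valid JSON') from exc
        messages = entry.get('messages') if isinstance(entry, dict) else None
        if not isinstance(messages, list) or not messages:
            raise ValueError(f'line {number}: missing messages list')
        for msg in messages:
            # Every message needs both a role and a content field.
            if not isinstance(msg, dict) or 'role' not in msg or 'content' not in msg:
                raise ValueError(f'line {number}: each message needs role and content')
            if msg['role'] not in ALLOWED_ROLES:
                raise ValueError(f'line {number}: unknown role {msg["role"]!r}')
        # Each conversation must contain at least one assistant reply.
        if not any(msg['role'] == 'assistant' for msg in messages):
            raise ValueError(f'line {number}: no assistant response')
```

The function returns None when every line passes, so a caller can wrap it in try/except and report the first offending line.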
Examples:
>>> import lazyllm
>>> from lazyllm.module.llms.onlinemodule.fileHandler import FileHandlerBase
>>> import tempfile
>>> import json
>>> sample_data = [
...     {"messages": [{"role": "user", "content": "Hello"}, {"role": "assistant", "content": "Hi there!"}]},
...     {"messages": [{"role": "user", "content": "How are you?"}, {"role": "assistant", "content": "I'm doing well, thank you!"}]}
... ]
>>> with tempfile.NamedTemporaryFile(mode='w', suffix='.jsonl', delete=False) as f:
...     for item in sample_data:
...         f.write(json.dumps(item, ensure_ascii=False) + '\n')
...     temp_file_path = f.name
>>> class CustomFileHandler(FileHandlerBase):
...     def _convert_file_format(self, filepath: str) -> str:
...         with open(filepath, 'r', encoding='utf-8') as f:
...             data = [json.loads(line) for line in f]
...         converted_data = []
...         for item in data:
...             messages = item.get('messages', [])
...             conversation = []
...             for msg in messages:
...                 conversation.append(f"{msg['role']}: {msg['content']}")
...             converted_data.append('\n'.join(conversation))
...         return '\n---\n'.join(converted_data)
>>> handler = CustomFileHandler()
>>> try:
...     result = handler.get_finetune_data(temp_file_path)
...     print("Data validation and conversion succeeded")
... except Exception as e:
...     print(f"Error: {e}")
... finally:
...     import os
...     os.unlink(temp_file_path)
Source code in lazyllm/module/llms/onlinemodule/fileHandler.py
get_finetune_data(filepath)
Get and process fine-tuning data files, including validating file format and converting to the format supported by the target platform.
Parameters:
-
filepath(str) –Path to the fine-tuning data file, must be in .jsonl format
Source code in lazyllm/module/llms/onlinemodule/fileHandler.py
lazyllm.module.llms.onlinemodule.supplier.glm.GLMRerank
Bases: LazyLLMOnlineRerankModuleBase
Reranking module for Zhipu AI, inheriting from OnlineEmbeddingModuleBase, used for relevance reranking of documents.
Parameters:
-
embed_url(str, default:None) –Base URL for reranking API, defaults to "https://open.bigmodel.cn/api/paas/v4/rerank".
-
embed_model_name(str, default:'rerank') –Model name to use, defaults to "rerank".
-
api_key(str, default:None) –Zhipu AI API key, if not provided will be read from lazyllm.config['glm_api_key'].
Properties
type: Returns model type, fixed as "ONLINE_RERANK".
Main Features
- Performs relevance reranking for input query and document list
- Supports custom ranking parameters
- Returns relevance scores for each document
Source code in lazyllm/module/llms/onlinemodule/supplier/glm.py
lazyllm.module.OnlineMultiModalModule
Bases: _DynamicSourceRouterMixin
Used to manage and create online multimodal service modules. Supported task types are stt / tts / text2image / image_editing.
Parameters:
-
model(str, default:None) –Model name to use.
-
source(str, default:None) –Supplier to use, such as qwen/glm/minimax/siliconflow/doubao.
-
type(str, default:None) –Multimodal task type, one of stt/tts/text2image/image_editing.
-
url(str, default:None) –Base URL of the platform. Defaults to each supplier's official endpoint. The alias base_url is also accepted.
-
api_key(str, default:None) –You can pass an explicit API key. If set to auto or dynamic, the key is resolved from config at runtime, enabling dynamic key switching.
-
dynamic_auth(bool, default:False) –Whether to enable dynamic auth. When True, it is equivalent to api_key='dynamic'.
-
return_trace(bool, default:False) –Whether to record the result in trace. Defaults to False.
Examples:
>>> import lazyllm
>>> stt = lazyllm.OnlineMultiModalModule(source='qwen', type='stt', api_key='dynamic')
>>> tts = lazyllm.OnlineMultiModalModule(source='qwen', type='tts', dynamic_auth=True)
>>> img = lazyllm.OnlineMultiModalModule(source='qwen', type='text2image')
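The difference between an explicit key and `api_key='dynamic'` can be pictured with a small stand-alone sketch. It does not import LazyLLM: `FAKE_CONFIG`, `resolve_api_key`, and the `<source>_api_key` lookup are hypothetical stand-ins chosen for illustration, and the library's actual resolution logic may differ.

```python
# Hypothetical stand-in for a runtime configuration store (not LazyLLM's config object).
FAKE_CONFIG = {"qwen_api_key": "key-from-config-v1"}

# Values that the parameter description treats as "resolve at runtime".
DYNAMIC_SENTINELS = {"auto", "dynamic"}


def resolve_api_key(source, api_key=None, dynamic_auth=False):
    """Return a zero-argument callable that yields the key to use for a request."""
    if dynamic_auth:
        # Per the parameter description, dynamic_auth=True behaves like api_key='dynamic'.
        api_key = "dynamic"
    if api_key in DYNAMIC_SENTINELS:
        # Look the key up on every call so later config changes are picked up.
        return lambda: FAKE_CONFIG[f"{source}_api_key"]
    # Any other value (including None) is fixed at construction time in this sketch.
    return lambda: api_key


explicit = resolve_api_key("qwen", api_key="sk-explicit")
dynamic = resolve_api_key("qwen", dynamic_auth=True)

first = dynamic()
FAKE_CONFIG["qwen_api_key"] = "key-from-config-v2"  # simulate a key rotation
second = dynamic()
```

Returning a callable rather than a string is only one way to model late binding; it makes the "resolved at runtime" behavior visible, since `second` reflects the rotated key while `explicit()` never changes.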
Source code in lazyllm/module/llms/onlinemodule/multimodal.py
lazyllm.module.llms.onlinemodule.supplier.qwen.QwenRerank
Bases: LazyLLMOnlineRerankModuleBase
Qwen reranking module, inheriting from LazyLLMOnlineRerankModuleBase, used for relevance reranking of documents.
Parameters:
-
embed_url(str, default:None) –Base URL for reranking API, defaults to "https://dashscope.aliyuncs.com/api/v1/services/rerank/text-rerank/text-rerank".
-
embed_model_name(str, default:None) –Model name to use, defaults to "gte-rerank".
-
api_key(str, default:None) –Qwen API key, if not provided will be read from lazyllm.config['qwen_api_key'].
-
**kwargs–Additional arguments passed to the base class.
Properties
type: Returns model type, fixed as "ONLINE_RERANK".
Main Features
- Performs relevance reranking for input query and document list
- Supports custom ranking parameters
- Returns index and relevance score for each document
Source code in lazyllm/module/llms/onlinemodule/supplier/qwen.py
lazyllm.module.llms.onlinemodule.supplier.qwen.QwenTTS
Bases: LazyLLMOnlineTTSModuleBase
Qwen's text-to-speech module, inheriting from LazyLLMOnlineTTSModuleBase, providing support for multiple speech synthesis models.
Parameters:
-
model(str, default:None) –Model name, defaults to "qwen-tts". Available models include: cosyvoice-v2, cosyvoice-v1, sambert, qwen-tts, qwen-tts-latest.
-
api_key(str, default:None) –API key, defaults to None, will be read from lazyllm.config['qwen_api_key'].
-
return_trace(bool, default:False) –Whether to return call trace information, defaults to False.
-
**kwargs–Additional arguments passed to the base class.
Synthesis Parameters:
input (str): Text content to convert.
voice (str): Speaker voice, defaults to model's default voice.
speech_rate (float): Speech rate, defaults to 1.0.
volume (int): Volume, defaults to 50.
pitch (float): Pitch, defaults to 1.0.
Note
- Different models may support different voice options
- Returned audio data is automatically encoded into file format
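As a rough illustration of the synthesis parameters listed above, the following sketch collects them into a single dictionary using the documented defaults. `build_synthesis_params` is a hypothetical helper, not part of QwenTTS, and `"example-voice"` is a placeholder rather than a real voice name.

```python
def build_synthesis_params(input, voice=None, speech_rate=1.0, volume=50, pitch=1.0):
    """Collect the documented synthesis parameters into one dict (hypothetical helper)."""
    if not isinstance(input, str) or not input:
        raise ValueError("input must be a non-empty string")
    params = {
        "input": input,
        "speech_rate": float(speech_rate),
        "volume": int(volume),
        "pitch": float(pitch),
    }
    if voice is not None:
        # Only include a voice when the caller picked one; otherwise the model default applies.
        params["voice"] = voice
    return params


defaults = build_synthesis_params("Hello")
custom = build_synthesis_params("Hello", voice="example-voice", speech_rate=1.2)
```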
Source code in lazyllm/module/llms/onlinemodule/supplier/qwen.py
lazyllm.module.llms.onlinemodule.supplier.sensenova.SenseNovaChat
Bases: OnlineChatModuleBase, FileHandlerBase, _SenseNovaBase
SenseNovaChat is the LLM interface management component for SenseTime's open platform, inheriting from OnlineChatModuleBase and FileHandlerBase, providing both chat and file handling capabilities.
Parameters:
-
base_url(str, default:None) –Base URL for the API, defaults to "https://api.sensenova.cn/compatible-mode/v1/".
-
model(str, default:None) –Name of the model to use, defaults to "SenseChat-5".
-
api_key(str, default:None) –SenseTime API key, if not provided will be read from lazyllm.config['sensenova_api_key'].
-
secret_key(str, default:None) –SenseTime secret key, if not provided will be read from lazyllm.config['sensenova_secret_key'].
-
stream(bool, default:True) –Whether to enable streaming output, defaults to True.
-
return_trace(bool, default:False) –Whether to return trace information, defaults to False.
-
**kwargs–Additional arguments passed to the base class.
Source code in lazyllm/module/llms/onlinemodule/supplier/sensenova.py
set_deploy_parameters(**kw)
Set parameters for model deployment.
Parameters:
-
**kw–Key-value pairs of deployment parameters that will be used when creating deployment.
Source code in lazyllm/module/llms/onlinemodule/supplier/sensenova.py
lazyllm.module.llms.onlinemodule.base.onlineMultiModalBase.OnlineMultiModalBase
Bases: LazyLLMOnlineBase, LLMBase
Base class for online multimodal models, inheriting from LLMBase, providing basic functionality for multimodal models.
Parameters:
-
model_name(str) –Model name, defaults to None. A warning will be generated if not specified.
-
return_trace(bool, default:False) –Whether to return call trace information, defaults to False.
-
**kwargs–Additional arguments passed to the base class.
Properties:
series: Returns the model series name.
type: Returns the model type, fixed as "MultiModal".
Main Methods:
share(): Create a shared instance of the module.
forward(input, lazyllm_files, **kwargs): Main method for handling input and files.
_forward(input, files, **kwargs): Forward method to be implemented by subclasses.
Notes
- Subclasses must implement the _forward method.
- A warning log will be generated if model name (model_name) is not specified.
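The split between `forward` and `_forward` described above follows a template-method pattern. The sketch below imitates it with a local stand-in class, so it is not the real OnlineMultiModalBase; `MultiModalBaseSketch`, `EchoModule`, and the file normalization step are assumptions made for illustration.

```python
import logging

logger = logging.getLogger("multimodal_sketch")


class MultiModalBaseSketch:
    """Stand-in base class for illustration; NOT the real OnlineMultiModalBase."""

    def __init__(self, model_name=None, return_trace=False):
        if model_name is None:
            # Mirrors the documented behavior: a missing model name only triggers a warning.
            logger.warning("model_name is not specified")
        self._model_name = model_name
        self._return_trace = return_trace

    @property
    def type(self):
        return "MultiModal"

    def forward(self, input=None, lazyllm_files=None, **kwargs):
        # Normalize files to a list (an assumption of this sketch), then delegate to the hook.
        if lazyllm_files is None:
            files = []
        elif isinstance(lazyllm_files, str):
            files = [lazyllm_files]
        else:
            files = list(lazyllm_files)
        return self._forward(input, files, **kwargs)

    def _forward(self, input, files, **kwargs):
        raise NotImplementedError("subclasses must implement _forward")


class EchoModule(MultiModalBaseSketch):
    """Hypothetical subclass that just reports what it received."""

    def _forward(self, input, files, **kwargs):
        return {"model": self._model_name, "input": input, "n_files": len(files)}


result = EchoModule(model_name="demo-model").forward("describe", lazyllm_files=["a.png"])
```

Keeping input handling in `forward` and the provider-specific call in `_forward` means each subclass only has to supply the part that differs between suppliers.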
Source code in lazyllm/module/llms/onlinemodule/base/onlineMultiModalBase.py
lazyllm.module.llms.onlinemodule.base.utils.LazyLLMOnlineBase
Bases: ModuleBase
Base class for online modules, inheriting from ModuleBase and powered by LazyLLMRegisterMetaClass, providing unified basic functionality for all online service modules.
This class encapsulates common behaviors of online modules, including caching mechanisms and debug tracing functionality, serving as the foundation for building various online API service modules.
Key Features
- Inherits all basic functionality from ModuleBase, including submodule management, hook registration, etc.
- Supports online module caching mechanism, controllable through configuration.
- Provides debug tracing functionality for troubleshooting and performance analysis.
- Serves as a common base class for all online service modules (chat, embedding, multimodal, etc.).
Parameters:
-
return_trace(bool, default:False) –Whether to write inference results into the trace queue for debugging and tracking. Default is False.
Use Cases
- As a base class for online chat modules (OnlineChatModuleBase).
- As a base class for online embedding modules (OnlineEmbeddingModuleBase).
- As a base class for online multimodal modules (OnlineMultiModalBase).
- Providing unified basic functionality for custom online service modules.
Source code in lazyllm/module/llms/onlinemodule/base/utils.py
lazyllm.module.module.ModuleCache
Bases: object
Module cache manager providing unified cache storage and retrieval functionality.
This class encapsulates multiple cache strategies (memory, file, SQLite, Redis), automatically selecting cache storage methods based on configuration, providing efficient caching mechanisms for module execution results.
Key Features
- Supports multiple cache strategies: memory cache, file cache, SQLite database cache, Redis cache.
- Automatically selects cache strategy based on configuration, defaults to memory cache.
- Supports cache mode control (read-write, read-only, write-only, disabled).
- Provides unified cache interface, hiding underlying storage implementation details.
- Supports parameter hashing to ensure uniqueness of cache keys.
Parameters:
-
strategy(Optional[str], default:None) –Cache strategy, options include 'memory', 'file', 'sqlite', 'redis'. Defaults to None, will use strategy from configuration.
Use Cases
- Provide caching for module execution results to avoid redundant computation.
- Use Redis cache in distributed environments for sharing.
- Use file or database cache for persistent storage.
- Select different cache strategies based on performance requirements.
Source code in lazyllm/module/module.py
close()
Close cache storage strategy.
Releases resources occupied by the cache storage strategy, such as closing database connections, clearing memory cache, etc. After calling this method, the cache will no longer be available.
Note:
- After calling this method, the cache instance will no longer be usable.
- Different cache strategies may have different resource cleanup behaviors.
Source code in lazyllm/module/module.py
get(key, args, kw)
Retrieve data from cache.
Retrieves data from cache based on the provided key and parameters. Raises an exception if the cache mode does not allow reading or the data does not exist.
Parameters:
-
key–Cache key used to identify cached data.
-
args–Positional arguments used to generate cache hash key.
-
kw–Keyword arguments used to generate cache hash key.
Returns:
- Any: Data stored in cache.
Exceptions:
- CacheNotFoundError: Raised when specified data doesn't exist in cache.
- RuntimeError: Raised when cache mode is set to write-only (WO).
Source code in lazyllm/module/module.py
set(key, args, kw, value)
Store data in cache.
Stores data in cache based on the provided key and parameters. If cache mode doesn't allow writing, returns directly without executing storage operation.
Parameters:
-
key–Cache key used to identify cached data.
-
args–Positional arguments used to generate cache hash key.
-
kw–Keyword arguments used to generate cache hash key.
-
value–Data to be stored.
Note:
- If cache mode is set to read-only (RO) or disabled (NONE), this method will return directly without executing storage operation.
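The interplay of `get`, `set`, cache modes, and parameter hashing can be sketched with a dictionary-backed stand-in. This is not the real ModuleCache: `MemoryCacheSketch`, the locally defined `CacheNotFoundError`, the `"RW"` mode name, and the `repr`-based hash are assumptions for illustration only.

```python
import hashlib


class CacheNotFoundError(KeyError):
    """Local stand-in for the exception named in the docs."""


class MemoryCacheSketch:
    """Dict-backed illustration of get/set with cache modes; NOT the real ModuleCache."""

    def __init__(self, mode="RW"):
        if mode not in {"RW", "RO", "WO", "NONE"}:
            raise ValueError(f"unknown cache mode: {mode}")
        self._mode = mode
        self._store = {}

    @staticmethod
    def _hash(args, kw):
        # Sort keyword items so that keyword order does not change the hash.
        # repr() is adequate for simple values; it is an assumption of this sketch.
        payload = repr((tuple(args), sorted(kw.items())))
        return hashlib.sha256(payload.encode("utf-8")).hexdigest()

    def get(self, key, args, kw):
        if self._mode == "WO":
            raise RuntimeError("cache is write-only")
        try:
            return self._store[(key, self._hash(args, kw))]
        except KeyError:
            raise CacheNotFoundError(key) from None

    def set(self, key, args, kw, value):
        if self._mode in {"RO", "NONE"}:
            return  # writing is not allowed in these modes
        self._store[(key, self._hash(args, kw))] = value


cache = MemoryCacheSketch()
cache.set("forward", ("hello",), {"temperature": 0.1, "top_p": 0.9}, "cached answer")
# Same arguments with keywords in a different order map to the same entry.
hit = cache.get("forward", ("hello",), {"top_p": 0.9, "temperature": 0.1})

read_only = MemoryCacheSketch(mode="RO")
read_only.set("forward", ("hello",), {}, "ignored")  # silently skipped
```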
Source code in lazyllm/module/module.py
lazyllm.module.llms.onlinemodule.supplier.qwen.QwenChat
Bases: OnlineChatModuleBase, FileHandlerBase
TODO: The Qianwen model has been finetuned and deployed successfully,
but it is not compatible with the OpenAI interface and can only
be accessed through the Dashscope SDK.
Source code in lazyllm/module/llms/onlinemodule/supplier/qwen.py
set_deploy_parameters(**kw)
Set model deployment parameters.
Configure relevant parameters for deployment tasks, such as capacity specifications, for subsequent model deployment.
Parameters:
-
**kw–Deployment parameter key-value pairs.
Source code in lazyllm/module/llms/onlinemodule/supplier/qwen.py
lazyllm.module.llms.onlinemodule.supplier.qwen.QwenEmbed
Bases: LazyLLMOnlineEmbedModuleBase
Qwen online text embedding module.
This class inherits from LazyLLMOnlineEmbedModuleBase and provides interaction capabilities with the Qwen text embedding API, supporting conversion of text to vector representations.
Parameters:
-
embed_url(str, default:None) –Embedding API URL address. Defaults to Qwen official API address
-
embed_model_name(str, default:None) –Embedding model name. Defaults to 'text-embedding-v1'
-
api_key(str, default:None) –API key. Defaults to 'qwen_api_key' from configuration
Source code in lazyllm/module/llms/onlinemodule/supplier/qwen.py
lazyllm.module.llms.onlinemodule.supplier.glm.GLMEmbed
Bases: LazyLLMOnlineEmbedModuleBase
GLM embedding model interface class for calling Zhipu AI's text embedding services.
Parameters:
-
embed_url(str, default:None) –Embedding service API address, defaults to "https://open.bigmodel.cn/api/paas/v4/embeddings"
-
embed_model_name(str, default:None) –Embedding model name, defaults to "embedding-2"
-
api_key(str, default:None) –API key
Source code in lazyllm/module/llms/onlinemodule/supplier/glm.py
lazyllm.module.llms.onlinemodule.supplier.glm.GLMSTT
Bases: LazyLLMOnlineSTTModuleBase
GLM Speech-to-Text module, inherits from LazyLLMOnlineSTTModuleBase.
Provides speech-to-text (STT) functionality based on Zhipu AI, supports audio file speech recognition.
Parameters:
-
model_name(str, default:None) –Model name, defaults to configured model name or "glm-asr"
-
api_key(str, default:None) –API key, defaults to configured key
-
return_trace(bool, default:False) –Whether to return trace information, defaults to False
-
**kwargs–Other model parameters
Source code in lazyllm/module/llms/onlinemodule/supplier/glm.py
lazyllm.module.llms.onlinemodule.supplier.deepseek.DeepSeekChat
Bases: OnlineChatModuleBase
DeepSeek large language model interface module.
Parameters:
-
base_url(str, default:None) –API base URL, defaults to "https://api.deepseek.com"
-
model(str, default:None) –Model name, defaults to "deepseek-chat"
-
api_key(str, default:None) –API key, if None, gets from configuration
-
stream(bool, default:True) –Whether to enable streaming output, defaults to True
-
return_trace(bool, default:False) –Whether to return trace information, defaults to False
-
**kwargs–Other parameters passed to base class
Source code in lazyllm/module/llms/onlinemodule/supplier/deepseek.py
lazyllm.module.llms.onlinemodule.supplier.doubao.DoubaoText2Image
Bases: LazyLLMOnlineText2ImageModuleBase
ByteDance Doubao Text-to-Image module supporting text to image generation and image editing.
Based on ByteDance Doubao multimodal model's text-to-image functionality, inherits from LazyLLMOnlineText2ImageModuleBase and calls Doubao via the Volcengine Ark SDK for high-quality generation.
Parameters:
-
api_key(str, default:None) –Doubao API key, defaults to None.
-
model_name(str) –Model name, defaults to "doubao-seedream-3-0-t2i-250415".
-
return_trace(bool, default:False) –Whether to return trace information, defaults to False.
-
**kwargs–Other parameters passed to parent class.
Source code in lazyllm/module/llms/onlinemodule/supplier/doubao.py
lazyllm.module.llms.onlinemodule.supplier.openai.OpenAIChat
Bases: OnlineChatModuleBase, FileHandlerBase
OpenAI API integration module for chat completion and fine-tuning operations.
Provides interface to interact with OpenAI's chat models, supporting both inference and fine-tuning capabilities. Inherits from OnlineChatModuleBase and FileHandlerBase.
Parameters:
-
base_url(str, default:None) –OpenAI API base URL, defaults to "https://api.openai.com/v1/".
-
model(str, default:None) –Model name to use for chat completion, defaults to "gpt-3.5-turbo".
-
api_key(str, default:None) –OpenAI API key, defaults to lazyllm.config['openai_api_key'].
-
stream(bool, default:True) –Whether to use streaming response, defaults to True.
-
return_trace(bool, default:False) –Whether to return trace information, defaults to False.
-
**kwargs–Additional arguments passed to OnlineChatModuleBase.
Source code in lazyllm/module/llms/onlinemodule/supplier/openai.py
lazyllm.module.llms.onlinemodule.supplier.openai.OpenAIRerank
Bases: LazyLLMOnlineRerankModuleBase
The OpenAIRerank class provides functionality to call OpenAI's Reranking API for re-ordering a list of text documents.
This class inherits from LazyLLMOnlineRerankModuleBase and mainly provides:
- Setting the embedding model URL and name;
- Encapsulating request data and calling the OpenAI Rerank API;
- Parsing the returned ranking results.
Parameters:
-
embed_url(str, default:None) –Base URL of the OpenAI API, default is 'https://api.openai.com/v1/'.
-
embed_model_name(str, default:None) –Name of the embedding model used for Rerank.
-
api_key(str, default:None) –OpenAI API Key, optional. If not provided, the default from lazyllm config is used.
-
**kw–Additional keyword arguments passed to the parent constructor.
Source code in lazyllm/module/llms/onlinemodule/supplier/openai.py
lazyllm.module.llms.onlinemodule.supplier.sensenova.SenseNovaEmbed
Bases: LazyLLMOnlineEmbedModuleBase, _SenseNovaBase
SenseTime SenseNova Embedding module for text vectorization operations. Provides an interface to interact with SenseTime's SenseNova embedding models, supporting text-to-vector conversion. Inherits from LazyLLMOnlineEmbedModuleBase and _SenseNovaBase.
Parameters:
-
embed_url(str, default:None) –Embedding API URL, defaults to "https://api.sensenova.cn/v1/llm/embeddings".
-
embed_model_name(str, default:None) –Embedding model name, defaults to "nova-embedding-stable".
-
api_key(str, default:None) –API access key, defaults to None.
-
secret_key(str, default:None) –API secret key, defaults to None.
Source code in lazyllm/module/llms/onlinemodule/supplier/sensenova.py
lazyllm.module.llms.onlinemodule.supplier.siliconflow.SiliconFlowTTS
Bases: LazyLLMOnlineTTSModuleBase
SiliconFlow Text-to-Speech module, inherits from LazyLLMOnlineTTSModuleBase.
Provides text-to-speech (TTS) functionality based on SiliconFlow, supports converting text to audio files.
Parameters:
-
api_key(str, default:None) –API key, defaults to configured siliconflow_api_key
-
model_name(str, default:None) –Model name, defaults to "fnlp/MOSS-TTSD-v0.5"
-
base_url(str, default:None) –Base API URL, defaults to "https://api.siliconflow.cn/v1/"
-
return_trace(bool, default:False) –Whether to return trace information, defaults to False
-
**kwargs–Other model parameters
Source code in lazyllm/module/llms/onlinemodule/supplier/siliconflow.py
lazyllm.module.llms.onlinemodule.supplier.siliconflow.SiliconFlowChat
Bases: OnlineChatModuleBase, FileHandlerBase
SiliconFlow module, inherits from OnlineChatModuleBase and FileHandlerBase.
Provides large language model chat capabilities via the SiliconFlow platform, supports multiple models (including vision-language models), and includes file handling functionality.
Parameters:
-
base_url(str, default:None) –Base API URL, defaults to "https://api.siliconflow.cn/v1/"
-
model(str, default:None) –Model name to use, defaults to "Qwen/QwQ-32B"
-
api_key(str, default:None) –API key, defaults to lazyllm.config['siliconflow_api_key']
-
stream(bool, default:True) –Whether to enable streaming output, defaults to True
-
return_trace(bool, default:False) –Whether to return trace information, defaults to False
-
**kwargs–Other model parameters
Source code in lazyllm/module/llms/onlinemodule/supplier/siliconflow.py
lazyllm.module.llms.onlinemodule.supplier.siliconflow.SiliconFlowRerank
Bases: LazyLLMOnlineRerankModuleBase
SiliconFlow reranking module, inherits from LazyLLMOnlineRerankModuleBase.
Provides text reranking functionality via the SiliconFlow platform, reordering a list of documents based on their relevance to a given query.
Parameters:
-
embed_url(str, default:None) –Reranking API URL, defaults to "https://api.siliconflow.cn/v1/rerank"
-
embed_model_name(str, default:None) –Name of the reranking model to use, defaults to "BAAI/bge-reranker-v2-m3"
-
api_key(str, default:None) –API key, defaults to lazyllm.config['siliconflow_api_key']
-
**kw–Additional reranking module parameters
Returns: List[Tuple]: A list of reranking results, each containing 'index' and 'relevance_score'.
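A minimal sketch of consuming such results follows, assuming each entry is an `(index, relevance_score)` tuple; the exact tuple layout is an assumption based on the description above. `apply_rerank` is a hypothetical helper and the scores are made up.

```python
def apply_rerank(documents, results, top_n=None):
    """Reorder documents using (index, relevance_score) pairs, highest score first."""
    ranked = sorted(results, key=lambda pair: pair[1], reverse=True)
    if top_n is not None:
        ranked = ranked[:top_n]
    return [(documents[index], score) for index, score in ranked]


docs = ["LazyLLM is a framework.", "Bananas are yellow.", "LazyLLM supports rerankers."]
# Made-up scores in the assumed shape; a real call would obtain them from the rerank module.
fake_results = [(0, 0.71), (1, 0.02), (2, 0.93)]
top_two = apply_rerank(docs, fake_results, top_n=2)
```

Mapping indices back to the original list keeps the module output small while still letting the caller recover the full document text.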
Source code in lazyllm/module/llms/onlinemodule/supplier/siliconflow.py
lazyllm.module.llms.onlinemodule.supplier.siliconflow.SiliconFlowText2Image
Bases: LazyLLMOnlineText2ImageModuleBase
SiliconFlow Text-to-Image module, inherits from LazyLLMOnlineText2ImageModuleBase.
Provides text-to-image generation functionality based on SiliconFlow, supports generating images from text descriptions and image editing.
Parameters:
-
api_key(str, default:None) –API key, defaults to configured siliconflow_api_key
-
model_name(str) –Model name, defaults to "Qwen/Qwen-Image"
-
base_url(str) –Base API URL, defaults to "https://api.siliconflow.cn/v1/"
-
return_trace(bool, default:False) –Whether to return trace information, defaults to False
-
**kwargs–Other model parameters
Source code in lazyllm/module/llms/onlinemodule/supplier/siliconflow.py
lazyllm.module.llms.onlinemodule.supplier.aiping.AipingChat
Bases: OnlineChatModuleBase, FileHandlerBase
AipingChat is an online chat module for AIPing, inheriting from OnlineChatModuleBase and FileHandlerBase.
Provides an interface to interact with AIPing's large language models, supporting chat generation, file handling, and model fine-tuning. Supports multiple models including Vision-Language Models (VLM) such as Qwen2.5-VL, Qwen3-VL, GLM-4.5V, GLM-4.6V, etc.
Parameters:
-
base_url(str, default:None) –Base URL for the API, defaults to "https://aiping.cn/api/v1/".
-
model(str, default:None) –Name of the model to use, defaults to "DeepSeek-R1".
-
api_key(Optional[str], default:None) –API key for accessing AIPing service. If not provided, it is read from lazyllm config.
-
stream(bool, default:True) –Whether to enable streaming output, defaults to True.
-
return_trace(bool, default:False) –Whether to return debug trace information, defaults to False.
-
**kwargs–Additional parameters passed to OnlineChatModuleBase.
Features
- Supports multiple large language models, including general chat models and vision-language models
- Supports streaming output for better user experience
- Integrated file handling functionality, supporting fine-tuning data format validation and conversion
- Built-in system prompt: "You are an intelligent assistant developed by AIPing. You are a helpful assistant."
- Supports API key validation to ensure service security
Source code in lazyllm/module/llms/onlinemodule/supplier/aiping.py
lazyllm.module.llms.onlinemodule.supplier.aiping.AipingEmbed
Bases: LazyLLMOnlineEmbedModuleBase
Aiping text embedding module, inheriting from LazyLLMOnlineEmbedModuleBase.
Provides an interface to interact with AIPing's text embedding service, supporting conversion of text to vector representations with batch processing support.
Parameters:
-
embed_url(str, default:'https://aiping.cn/api/v1/embeddings') –Embedding API URL, defaults to "https://aiping.cn/api/v1/embeddings".
-
embed_model_name(str, default:'text-embedding-v1') –Name of the embedding model to use, defaults to "text-embedding-v1".
-
api_key(Optional[str], default:None) –API key for accessing AIPing service. If not provided, it is read from lazyllm config.
-
batch_size(int, default:16) –Batch size for processing, defaults to 16.
-
**kw–Additional parameters passed to the base class.
Features
- Converts text to high-dimensional vector representations
- Supports batch text processing for improved efficiency
- Configurable batch size to accommodate different performance requirements
- Seamless integration with AIPing API
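The batch processing mentioned above can be illustrated with a generic batching loop. This is a sketch under assumptions, not AipingEmbed's implementation: `iter_batches`, `embed_all`, and `fake_embed` are hypothetical names, and `fake_embed` stands in for a real API call.

```python
def iter_batches(texts, batch_size=16):
    """Yield consecutive slices of at most batch_size texts."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    for start in range(0, len(texts), batch_size):
        yield texts[start:start + batch_size]


def embed_all(texts, embed_batch, batch_size=16):
    """Embed texts batch by batch; embed_batch is a caller-supplied function."""
    vectors = []
    for batch in iter_batches(texts, batch_size):
        vectors.extend(embed_batch(batch))
    return vectors


def fake_embed(batch):
    # Placeholder "embedding": one 1-dimensional vector per text, not a real model call.
    return [[float(len(text))] for text in batch]


texts = [f"doc {i}" for i in range(40)]
batch_sizes = [len(batch) for batch in iter_batches(texts)]
vectors = embed_all(texts, fake_embed)
```

With the default batch size of 16, forty texts are sent as three requests instead of forty, which is the efficiency gain the feature list refers to.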
Source code in lazyllm/module/llms/onlinemodule/supplier/aiping.py
lazyllm.module.llms.onlinemodule.supplier.aiping.AipingRerank
Bases: LazyLLMOnlineRerankModuleBase
Aiping reranking module, inheriting from LazyLLMOnlineRerankModuleBase.
Provides an interface to interact with AIPing's reranking service, used for reordering a list of documents based on their relevance to a given query. Returns a list of tuples containing document index and relevance score.
Parameters:
-
embed_url(str, default:None) –Reranking API URL, defaults to "https://aiping.cn/api/v1/rerank".
-
embed_model_name(str, default:None) –Name of the reranking model to use, defaults to "Qwen3-Reranker-0.6B".
-
api_key(Optional[str], default:None) –API key for accessing AIPing service. If not provided, it is read from lazyllm config.
-
**kw–Additional parameters passed to the base class.
Properties
type (str): Returns model type, fixed as "RERANK".
Features
- Reranks documents based on query relevance
- Supports custom ranking parameters (e.g., top_n)
- Returns index and relevance score for each document
- Suitable for search result optimization and document recommendation scenarios
Source code in lazyllm/module/llms/onlinemodule/supplier/aiping.py
lazyllm.module.llms.onlinemodule.supplier.aiping.AipingText2Image
Bases: LazyLLMOnlineText2ImageModuleBase
Aiping text-to-image module, inheriting from LazyLLMOnlineText2ImageModuleBase.
Provides an interface to interact with AIPing's image generation service, supporting image generation from text descriptions. Supports parameters such as negative prompts, image count, size, and random seeds.
Parameters:
-
api_key(Optional[str], default:None) –API key for accessing AIPing service. If not provided, it is read from lazyllm config.
-
model_name(str, default:None) –Name of the model to use, defaults to "Qwen-Image".
-
base_url(str, default:None) –Base URL for the API, defaults to "https://aiping.cn/api/v1/".
-
return_trace(bool, default:False) –Whether to return debug trace information, defaults to False.
-
**kwargs–Additional parameters passed to the base class.
Features
- Generates high-quality images from text prompts
- Supports negative prompts to filter unwanted image features
- Configurable number of images to generate (n parameter)
- Supports multiple image size specifications
- Supports random seed control for reproducible results
- Automatically downloads generated images and encodes them as files
- Default negative prompt: "模糊,低质量" ("blurry, low quality")
Note
- This module automatically downloads generated images to local files
- The returned result contains file path information for easy subsequent processing
Source code in lazyllm/module/llms/onlinemodule/supplier/aiping.py