InstructLab

0.25

Features

- Update vLLM to version 0.7.3.

0.24

Features

- Update NVIDIA CUDA to version 12.8, in order to support NVIDIA Blackwell GPUs.
- Update vLLM to version 0.7.2. As a requirement for this new vLLM version, PyTorch is updated to 2.5.1.
- Use the vLLM wheel from PyPI for NVIDIA CUDA to speed up installation of InstructLab.
- `ilab model chat` now has a `--no-decoration` option to display chat responses without decoration.
- A new command, `ilab model remove`, has been introduced so users can remove models via the `ilab` CLI.
- `ilab process list` now has a `--state` option to filter processes by state (see the examples after this list).
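
A few illustrative invocations of the new 0.24 options; the model name and state value below are placeholders, and the exact argument forms should be confirmed with each command's `--help`:

```shell
# Display chat responses without decoration
ilab model chat --no-decoration

# Remove a downloaded model via the CLI
# ("granite-7b-lab" is a placeholder; check `ilab model remove --help`
# for the exact argument form)
ilab model remove --model granite-7b-lab

# List only processes in a particular state
# ("running" is an assumed state value)
ilab process list --state running
```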

0.23

Breaking Changes

- llama-cpp-python has been bumped to 0.3.2. This allows serving Granite 3.0 GGUF models. With this change, some previous handling of the context window size has been modified to work with the 0.3.z releases of llama-cpp-python.
- `ilab train --pipeline=simple` no longer supports Intel Gaudi (`hpu`) devices. Simple training on Gaudi was experimental and limited to a single device.
- The results of MT-Bench and MT-Bench-Branch are now stored in `$XDG_DATA_HOME/eval/{mt_bench,mt_bench_branch}` respectively. Previously the results for both benchmarks were stored in `$XDG_DATA_HOME/eval`. `ilab config init` must be run to initialize the new evaluation results directories.
- A `system_prompt` variable and a `dk_bench` section have been added to the `evaluate` section of the configuration file. `ilab config init` should be run to initialize these new sections in the config file (see the sketch after this list).
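
A minimal sketch of adopting the new layout after upgrading to 0.23; the `${XDG_DATA_HOME:-$HOME/.local/share}` expansion assumes the usual default:

```shell
# Re-run config init to create the new evaluate config sections
# (system_prompt, dk_bench) and the new evaluation results directories
ilab config init

# MT-Bench and MT-Bench-Branch results now land in separate directories
ls "${XDG_DATA_HOME:-$HOME/.local/share}/eval/mt_bench"
ls "${XDG_DATA_HOME:-$HOME/.local/share}/eval/mt_bench_branch"
```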

Features

- An experimental/preview implementation of Retrieval-Augmented Generation (RAG) is added. It is enabled only when an `ILAB_FEATURE_SCOPE` environment variable is set to `DevPreviewNoUpgrade`. For details see the [instructions in README.md](https://github.com/instructlab/instructlab/?tab=readme-ov-file#-configure-retrieval-augmented-generation-developer-preview).
- A new command group, `ilab rag`, has been introduced. It includes two new commands: `ilab rag convert` and `ilab rag ingest`. The former converts documents (e.g., PDF) into a structured form and the latter ingests them into a vector index file.
- A new `--rag` argument has been added to the `ilab model chat` command to use that index during chat to augment generation. When the flag is passed, the chat responds to each input by first retrieving text from the vector index and then providing that text to the model for use in answering (see the workflow sketch after this list).
- A new command `ilab model upload` has been introduced so users can now upload their trained models to [Hugging Face](https://huggingface.co/), OCI registry endpoints, and [AWS S3](https://aws.amazon.com/s3/) buckets via the `ilab` CLI.
- `ilab model serve` now has separate `--host` and `--port` options, replacing the `host_port` configuration. The default values are `127.0.0.1` for `--host` and `8000` for `--port`, allowing users to configure the server's binding address and port independently through the configuration file or command-line flags.
- Update vLLM to version 0.6.4.post1. As a requirement for this new vLLM version, PyTorch is updated to 2.5.1.
- `--disable-accelerate-full-state-at-epoch` has been added for accelerated training. With this option, only HuggingFace checkpoints are saved; these are the checkpoints required for multi-phase training. However, if set, this switch also disables resumable training, because "full resumability" requires full-state checkpoints. Use this option if storage is limited and/or resumability isn't required.
- `ilab data generate` now stores generated data from each run in individually dated directories.
- `ilab data list` now organizes tables per dated run, and outputs a more detailed table that describes which model generated which dataset.
- `ilab model train` and `ilab model test` now search for training data in per-run directories in addition to the top-level directory, maintaining backwards compatibility with old datasets.
- `ilab model evaluate` now supports DK-Bench (Domain Knowledge Bench). DK-Bench takes a set of questions and reference answers provided by the user, collects a model's responses to those questions, and then uses a judge model to grade each response on a 1-5 scale against the reference answer. The highest possible score for each question is a 5 (fully accurate, and completely aligned with the reference) and the lowest is a 1 (entirely incorrect, and irrelevant). To run the benchmark, the environment variable `OPENAI_API_KEY` must be set. The judge model for DK-Bench is `gpt-4o`, and any judge model provided for DK-Bench must be the name of an OpenAI model.
- `ilab model evaluate` now has a `--skip-server` option to skip launching the server and evaluate directly with the HuggingFace model. This option supports the mmlu and mmlu_branch benchmarks.
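
A hedged end-to-end sketch of the 0.23 additions; file names, model names, and the API key are placeholders, and exact flags should be checked against each command's `--help`:

```shell
# Enable the developer-preview RAG feature for this shell session
export ILAB_FEATURE_SCOPE=DevPreviewNoUpgrade

# Convert documents (e.g. PDFs) into a structured form, then ingest
# them into a vector index file ("docs/" and the argument form are
# assumptions; see each command's --help)
ilab rag convert docs/
ilab rag ingest

# Chat with retrieval augmentation: each prompt first retrieves text
# from the index, which is handed to the model
ilab model chat --rag

# Serve on an explicit address and port (defaults: 127.0.0.1:8000)
ilab model serve --host 0.0.0.0 --port 8080

# DK-Bench uses an OpenAI judge model (gpt-4o), so a key is required
export OPENAI_API_KEY=sk-placeholder
ilab model evaluate --benchmark dk_bench

# Evaluate MMLU directly with the HuggingFace model, without a server
ilab model evaluate --benchmark mmlu --skip-server
```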

0.22

Features

- `ilab train --pipeline=accelerated --strategy=lab-skills-only` supports training with only the skills phase (leaving out knowledge).
- Previously, System Profile auto-detection worked by matching the names of the YAML files to your hardware. Auto-detection now relies on the `Metadata` class stored in the configuration file itself. Please select `y` when prompted to overwrite your existing system profiles to utilize the new auto-detection system.
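
For example, a skills-only accelerated run might look like this (a sketch; any other flags your setup requires are omitted):

```shell
# Train only the skills phase, skipping the knowledge phase
ilab train --pipeline=accelerated --strategy=lab-skills-only
```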

0.21

Breaking Changes

- train-profiles have been deprecated and replaced with system-profiles. These profiles follow the format of the config file and apply to all commands. They live in `~/.local/share/instructlab/internal/system_profiles`.
- The default model has been changed from Merlinite to Granite; see <https://github.com/instructlab/instructlab/issues/2238> for more details.
- Removed the `--greedy-mode` flag from `ilab model chat`. Please update any scripts or workflows relying on `--greedy-mode` to ensure compatibility.

Features

- `ilab` now supports system profiles. These profiles apply entire configuration files tailored to specific hardware configurations. We support a set of auto-detected profiles for CPU-enabled Linux machines, M-series Apple Silicon chips, NVIDIA GPUs, and Intel Gaudi 3. When you run `ilab config init`, one of these profiles should be selected for you; if there is no direct match, a menu is displayed allowing you to choose one.
- Add support for inferencing with IBM granite architecture models.
- `ilab model chat` now includes a temperature setting, allowing users to adjust response generation behavior. A default temperature of 1.0 has been added to the configuration file and can be customized per chat session using the `--temperature` or `-t` flag. Lower values produce more deterministic and focused responses, while higher values increase variability (see the examples after this list).
- The `full` training pipeline now fits on devices with 16 and 32 GB of RAM! If you are on a Mac, these optimizations are applied for you. If you are on Linux, try using `--optimize-memory`; results vary by CPU vendor.
- `ilab data generate` now has a `--max-num-tokens` option, which defaults to 4096. This flag can be used to generate less data per SDG run: specifying a value like `512` results in a faster run that generates less data. This works well on consumer hardware and reduces training time.

- `ilab model download` uses the `hf_transfer` library for faster model downloads reducing the average download time by 60%. This only applies to models that are hosted on Hugging Face Hub. This can be disabled by setting the environment variable `HF_HUB_ENABLE_HF_TRANSFER` to `0`.
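
A few illustrative invocations of the 0.21 additions; the values shown are examples only, and the command that `--optimize-memory` attaches to is an assumption based on the notes above:

```shell
# Lower the sampling temperature for more deterministic chat responses
ilab model chat --temperature 0.7

# Cap tokens per generated sample for a faster, smaller SDG run
ilab data generate --max-num-tokens 512

# On Linux, try the memory optimizations for the full pipeline
# (flag placement is an assumption; check `ilab model train --help`)
ilab model train --pipeline=full --optimize-memory

# Fall back to the regular download path by disabling hf_transfer
HF_HUB_ENABLE_HF_TRANSFER=0 ilab model download
```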

0.20

Breaking Changes

- vLLM has been upgraded to [v0.6.2](https://github.com/opendatahub-io/vllm/releases/tag/v0.6.2) and will need to be reinstalled if you are upgrading `ilab` from an older version.
- Intel Gaudi software has been updated to 1.18.0 with Python 3.11 and Torch 2.4.0.
