NIM#
Serve optimized model containers with NIM in a Flyte task.
NVIDIA NIM, part of NVIDIA AI Enterprise, provides a streamlined path for developing AI-powered enterprise applications and deploying AI models in production. It includes an out-of-the-box optimization suite, enabling AI model deployment across any cloud, data center, or workstation. Since NIM can be self-hosted, there is greater control over cost, data privacy, and more visibility into behind-the-scenes operations.
With NIM, you can invoke the model’s endpoint as if it is hosted locally, minimizing network overhead.
Installation#
To use the NIM plugin, run the following command:
pip install flytekitplugins-inference
Example usage#
For a usage example, see NIM example usage.
Note
NIM can only be run in a Flyte cluster as it must be deployed as a sidecar service in a Kubernetes pod.