OPEA Release Notes v1.4¶
We are excited to announce the release of OPEA version 1.4, which includes significant contributions from the open-source community. This release incorporates changes from over 330 pull requests.
More information about how to get started with OPEA v1.4 can be found on the Getting Started page. All project source code is maintained in the opea-project organization. To pull Docker images, please visit Docker Hub. For instructions on deploying Helm Charts, please refer to the Helm Charts deployment guide.
What’s New in OPEA v1.4¶
This release includes new features, optimizations, and user-focused updates.
Advanced Agent Capabilities¶
MCP (Model Context Protocol) Support: The OPEA agent now supports MCP, enabling standardized and more efficient integration with external data and services. (GenAIComps#1678, GenAIComps#)
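For illustration, the snippet below shows what talking to an MCP server looks like with the reference MCP Python SDK; the server script and tool name are hypothetical, and the OPEA agent wires this up internally rather than exposing this exact API.

```python
# Minimal MCP client sketch using the reference `mcp` Python SDK.
# The server command and tool name below are hypothetical placeholders.
import asyncio

from mcp import ClientSession, StdioServerParameters
from mcp.client.stdio import stdio_client

async def main():
    # Launch a hypothetical MCP server as a subprocess over stdio.
    params = StdioServerParameters(command="python", args=["my_mcp_server.py"])
    async with stdio_client(params) as (read, write):
        async with ClientSession(read, write) as session:
            await session.initialize()
            tools = await session.list_tools()  # discover the server's tools
            print([t.name for t in tools.tools])
            result = await session.call_tool("search", {"query": "OPEA"})
            print(result.content)

asyncio.run(main())
```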
Deep Research Agent: The example is designed to handle complex, multi-step research. It leverages langchain-ai/open_deep_research and supports Intel Gaudi accelerators. (GenAIExamples#)
Components as MCP Servers¶
OPEA components can now serve as Model Context Protocol (MCP) servers, allowing external MCP-compatible frameworks and applications to integrate with OPEA seamlessly. (GenAIComps#1652)
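As a sketch of the pattern (using the MCP SDK's FastMCP helper; the retrieval tool below is a hypothetical stand-in for a real OPEA component):

```python
# Sketch of exposing a component as an MCP server via the MCP SDK's
# FastMCP helper. The tool is a hypothetical stand-in for an OPEA
# microservice endpoint, not OPEA's actual implementation.
from mcp.server.fastmcp import FastMCP

mcp = FastMCP("opea-retriever")  # server name shown to MCP clients

@mcp.tool()
def retrieve(query: str, top_k: int = 4) -> list[str]:
    """Return the top-k documents for a query (placeholder logic)."""
    return [f"doc-{i} for {query!r}" for i in range(top_k)]

if __name__ == "__main__":
    mcp.run()  # serves over stdio by default
```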
KubeAI Operator for OPEA¶
The KubeAI Operator now features an improved autoscaler, monitoring support, optimized resource placement via NRI plugins, and expanded support for new models on Gaudi. (GenAIInfra#, GenAIInfra#, GenAIInfra#, GenAIInfra#, GenAIInfra#, GenAIInfra#, GenAIInfra#)
New GenAI Capabilities¶
Fine-Tuning of Reasoning Models: This feature is compatible with the dataset format used in FreedomIntelligence/medical-o1-reasoning-SFT, enabling you to customize models with your own data. (GenAIComps#)
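To illustrate the expected data layout, the sketch below converts a record with the medical-o1-reasoning-SFT fields (Question, Complex_CoT, Response) into plain SFT text; the section markers are an illustrative choice, not a format OPEA mandates.

```python
# Sketch: turn a record in the medical-o1-reasoning-SFT layout
# (Question / Complex_CoT / Response fields) into SFT training text.
# The "###" markers below are illustrative, not a required format.
def to_sft_text(record: dict) -> str:
    return (
        f"### Question:\n{record['Question']}\n\n"
        f"### Reasoning:\n{record['Complex_CoT']}\n\n"
        f"### Response:\n{record['Response']}"
    )

sample = {
    "Question": "A patient presents with ...",
    "Complex_CoT": "First consider ...",
    "Response": "The most likely diagnosis is ...",
}
print(to_sft_text(sample))
```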
HybridRAG: Combines GraphRAG (knowledge-graph-based retrieval) with VectorRAG (vector-database retrieval) for enhanced accuracy and contextual relevance. (GenAIExamples#1968)
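Conceptually, HybridRAG merges the contexts returned by both retrieval paths before generation. A minimal sketch with hypothetical retriever stubs:

```python
# Conceptual HybridRAG sketch: merge knowledge-graph and vector-store
# retrieval results into one context. Both retrievers are hypothetical stubs.
def graph_retrieve(query: str) -> list[str]:
    return ["(Paris)-[capital_of]->(France)"]  # stand-in for GraphRAG hits

def vector_retrieve(query: str) -> list[str]:
    return ["Paris is the capital and largest city of France."]  # VectorRAG hits

def hybrid_context(query: str) -> str:
    # Deduplicate while preserving order: graph facts first, then passages.
    seen, merged = set(), []
    for chunk in graph_retrieve(query) + vector_retrieve(query):
        if chunk not in seen:
            seen.add(chunk)
            merged.append(chunk)
    return "\n".join(merged)

print(hybrid_context("What is the capital of France?"))
```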
LLM Router: Routes each incoming prompt to the downstream LLM serving endpoint best suited to handle it. (GenAIComps#1716)
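A toy sketch of the routing idea follows; the keyword heuristic and endpoint URLs are hypothetical placeholders (a production router can instead use a learned classifier):

```python
# Toy routing sketch: pick a serving endpoint per prompt. The rules and
# endpoint URLs are hypothetical; the real component may use a learned router.
ENDPOINTS = {
    "code": "http://vllm-code:8000/v1",       # e.g. a code-tuned model
    "general": "http://vllm-general:8000/v1",  # e.g. a general chat model
}

def route(prompt: str) -> str:
    keywords = ("def ", "class ", "compile", "stack trace")
    kind = "code" if any(k in prompt.lower() for k in keywords) else "general"
    return ENDPOINTS[kind]

print(route("Why does this stack trace mention a segfault?"))
```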
OPEA Store: Redis and MongoDB have been integrated into OPEA Store. (GenAIComps#1816, GenAIComps#)
Guardrails: Added Input/Output Guardrails to enforce content safety and prevent the generation of inappropriate outputs. (GenAIComps#1798)
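The underlying pattern is a pair of checks around the model call, as in the sketch below; the keyword classifiers are placeholders for a dedicated safety model.

```python
# Minimal guardrail pattern: screen the prompt before the model call and
# the answer after it. The keyword checks are placeholders; real deployments
# typically call a dedicated safety model instead.
BLOCKLIST = {"how to build a weapon"}

def input_guard(prompt: str) -> bool:
    return not any(bad in prompt.lower() for bad in BLOCKLIST)

def output_guard(answer: str) -> bool:
    return "unsafe" not in answer.lower()  # placeholder check

def guarded_generate(prompt: str, llm) -> str:
    if not input_guard(prompt):
        return "Request refused by input guardrail."
    answer = llm(prompt)
    return answer if output_guard(answer) else "Response withheld by output guardrail."

print(guarded_generate("Hello!", llm=lambda p: f"Echo: {p}"))
```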
Language Detection: This microservice ensures the pipeline's response matches the language of the query. (GenAIComps#1774)
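For example, detecting the query language lets the pipeline translate the final answer back into it. A sketch using the langdetect package, which is one possible detector rather than necessarily the one this microservice uses:

```python
# Sketch: detect the query language so the pipeline can answer in kind.
# Uses the `langdetect` package as one possible detector.
from langdetect import detect

query = "Quelle est la capitale de la France ?"
lang = detect(query)  # e.g. "fr"
print(f"Answer should be returned in language: {lang}")
```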
Prompt Template: This microservice dynamically generates system and user prompts based on structured inputs and document context. (GenAIComps#1826)
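In essence, the service fills templates from structured inputs and retrieved context; a minimal sketch (the template wording is illustrative):

```python
# Sketch of dynamic prompt construction from structured inputs and
# retrieved documents. The template wording is illustrative only.
SYSTEM_TEMPLATE = "You are a helpful assistant. Answer using only the context below.\n{context}"
USER_TEMPLATE = "Question: {question}"

def build_prompts(question: str, docs: list[str]) -> tuple[str, str]:
    context = "\n".join(f"- {d}" for d in docs)
    return (SYSTEM_TEMPLATE.format(context=context),
            USER_TEMPLATE.format(question=question))

system, user = build_prompts("What is OPEA?", ["OPEA is an open platform for enterprise AI."])
print(system)
print(user)
```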
Air-gapped Environment Support: Some OPEA microservices can now be deployed in an air-gapped Docker environment. (GenAIComps#1480)
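The usual air-gapped pattern is to fetch all artifacts on a connected machine and then run fully offline. A sketch using huggingface_hub, with an example model ID:

```python
# Sketch of the air-gapped pattern: pre-fetch model files on a connected
# machine, copy them across, then force offline mode at run time.
# The model ID is only an example.
import os
from huggingface_hub import snapshot_download

# Step 1 (connected machine): download everything into a local directory.
local_dir = snapshot_download("BAAI/bge-base-en-v1.5")
print("copy this directory into the air-gapped environment:", local_dir)

# Step 2 (air-gapped machine): forbid any network lookups.
os.environ["HF_HUB_OFFLINE"] = "1"
```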
Remote Inference Endpoints Support: Added support for remote inference endpoints for OPEA examples. (GenAIExamples#1973)
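Because most OPEA serving backends expose an OpenAI-compatible API, pointing an example at a remote endpoint typically amounts to changing the base URL and API key. A sketch with placeholder values:

```python
# Sketch: call a remote OpenAI-compatible inference endpoint. The URL,
# key, and model name are placeholders for your provider's values.
from openai import OpenAI

client = OpenAI(
    base_url="https://remote-inference.example.com/v1",  # placeholder
    api_key="YOUR_API_KEY",                              # placeholder
)
resp = client.chat.completions.create(
    model="meta-llama/Llama-4-Scout-17B-16E-Instruct",
    messages=[{"role": "user", "content": "Hello from OPEA"}],
)
print(resp.choices[0].message.content)
```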
Better User Experience¶
One-click Deployment: You can now deploy 8 OPEA examples with one click, and ChatQnA can additionally be deployed in an air-gapped Docker environment. (GenAIExamples#1727)
GenAIStudio: Added support for drag-and-drop creation of documentation summarization and code generation applications. (GenAIStudio#61)
Documentation Refinement: Refined READMEs for key examples and components to help readers easily locate documentation tailored to deployment, customization, and hardware. (GenAIExamples#1673, GenAIComps#)
Newly Supported Models¶
OPEA introduces support for the following models in this release.
| Model | TGI-Gaudi | vLLM-CPU | vLLM-Gaudi | vLLM-ROCm | OVMS | Optimum-Habana | PredictionGuard | SGLANG-CPU |
|---|---|---|---|---|---|---|---|---|
| meta-llama/Llama-4-Scout-17B-16E-Instruct | - | - | - | - | - | - | - | ✓ |
| meta-llama/Llama-4-Maverick-17B-128E-Instruct | - | - | - | - | - | - | - | ✓ |
(✓: supported; -: not validated; x: unsupported)
Newly Supported Hardware¶
Support for AMD® EPYC™ has been added for 11 OPEA examples. (GenAIExamples#2083)
Newly Supported OS¶
Support for openEuler has been added. (GenAIExamples#2088, GenAIComps#)
Updated Dependencies¶
| Dependency | Hardware | Scope | Version | Version in OPEA v1.3 | Comments |
|---|---|---|---|---|---|
| huggingface/text-embeddings-inference | all | all supported examples | cpu-1.7 | cpu-1.6 | |
| vllm | Xeon | all supported examples except EdgeCraftRAG | v0.10.0 | v0.8.3 | |
Changes to Default Behavior¶
CodeTrans: The default model changed from mistralai/Mistral-7B-Instruct-v0.3 to Qwen/Qwen2.5-Coder-7B-Instruct on Xeon and Gaudi.
Validated Hardware¶
Intel® Gaudi® AI Accelerators (2nd gen)
Intel® Xeon® Scalable processors (3rd gen)
Intel® Arc™ Graphics GPU (A770)
AMD® EPYC™ processors (4th and 5th gen)
Validated Software¶
Docker version 28.3.3
Docker Compose version v2.39.1
Intel® Gaudi® software and drivers v1.21
Kubernetes v1.32.7
TEI v1.7
TGI v2.4.0 (Xeon, EPYC), v2.3.1 (Gaudi), v2.4.1 (ROCm)
Torch v2.5.1
Ubuntu 22.04
vLLM v0.10.0 (Xeon, EPYC), v0.6.6.post1+Gaudi-1.20.0 (Gaudi)
Known Issues¶
AvatarChatbot cannot run in a K8s environment due to a functional gap in the wav2lip service. (GenAIExamples#)
Full Changelogs¶
Contributors¶
This release would not have been possible without the contributions of the following organizations and individuals.
Contributing Organizations¶
AMD: AMD EPYC support.
Bud: Components as MCP Servers.
Intel: Development and improvements to GenAI examples, components, infrastructure, evaluation, and studio.
MariaDB: Added a ChatQnA docker-compose example on Intel Xeon using MariaDB Vector.
openEuler: openEuler OS support.
Individual Contributors¶
For a comprehensive list of individual contributors, please refer to the Full Changelogs section.