OPEA Release Notes v1.3¶
We are excited to announce the release of OPEA version 1.3, which includes significant contributions from the open-source community. This release incorporates over 520 pull requests.
More information about how to get started with OPEA v1.3 can be found on the Getting Started page. All project source code is maintained in the opea-project organization. To pull Docker images, please access the Docker Hub. For instructions on deploying Helm Charts, please refer to the guide.
What’s New in OPEA v1.3¶
This release introduces exciting capabilities, optimizations, and user-centric enhancements:
Advanced Agent Capabilities¶
Multi-Turn Conversation: Enhanced the OPEA agent framework for dynamic, context-aware dialogues. (GenAIComps#1248)
Finance Agent Example: A financial agent example for automating financial data aggregation and leveraging LLMs to generate insights, forecasts, and strategic recommendations. (GenAIExamples#)
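The multi-turn capability above depends on the dialogue history being carried from turn to turn so the agent can resolve context-dependent follow-ups. A minimal local sketch of that pattern follows; the class and payload shape are invented for illustration and are not the actual OPEA agent API:

```python
# Minimal sketch of the multi-turn pattern: accumulate the message
# history and include it with every request so the agent can resolve
# context-dependent follow-ups. Names are illustrative only.

class ConversationMemory:
    """Keeps an ordered message history, trimmed to a turn budget."""

    def __init__(self, max_turns: int = 8):
        self.max_turns = max_turns
        self.messages: list[dict] = []

    def add(self, role: str, content: str) -> None:
        self.messages.append({"role": role, "content": content})
        # Keep at most max_turns user/assistant pairs.
        if len(self.messages) > 2 * self.max_turns:
            self.messages = self.messages[-2 * self.max_turns:]

    def request_payload(self, user_input: str) -> dict:
        """Build the payload for the next agent call, including history."""
        return {"messages": self.messages + [{"role": "user", "content": user_input}]}


memory = ConversationMemory(max_turns=2)
memory.add("user", "What is OPEA?")
memory.add("assistant", "An open platform for enterprise AI.")
payload = memory.request_payload("Which release added multi-turn support?")
print(len(payload["messages"]))  # history plus the new user turn
```

The trimming step bounds the context the serving backend must process while keeping the most recent turns available.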
Performance and Scalability¶
vLLM Enhancement: Integrated vLLM as the default LLM serving backend for key GenAI examples across Intel® Xeon® processors, Intel® Gaudi® accelerators, and AMD® GPUs. (GenAIExamples#)
KubeAI Operator for OPEA (Alpha release): Simplified OPEA inference operations in cloud environments and enabled optimal out-of-the-box performance for specific models and hardware using profiles. (GenAIInfra#945)
Ecosystem Integrations¶
Haystack Integration: Enabled OPEA as a backend of Haystack. (Haystack-OPEA#)
Cloud Readiness: Expanded automated Terraform deployment for ChatQnA to include support for Azure, and enabled CodeGen deployments on AWS and GCP. (GenAIExamples#1731)
New GenAI Capabilities¶
OPEA Store: Delivered a unified data store access API and a robust integration layer that streamlines adding new data stores. ArangoDB is integrated in this release. (GenAIComps#1493)
CodeGen using RAG and Agent: Leveraged RAG and a code agent to provide an additional layer of intelligence and adaptability for the CodeGen example. (GenAIExamples#1757)
Enhanced Multimodality: Added support for additional audio file types (.mp3) and supported spoken audio captions with image ingestion. (GenAIExamples#1549)
Struct to Graph: Supported transforming structured data into graphs using the Neo4j graph database. (GenAIComps#1502)
Text to Graph: Supported creating graphs from text by extracting graph triplets. (GenAIComps#1357, GenAIComps#)
Text to Cypher: Supported generating and executing Cypher queries from natural language for graph database retrieval. (GenAIComps#1319)
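The graph capabilities above share one data flow: derive (subject, relation, object) triplets and load them into a graph database. The sketch below illustrates only that flow; the real components use LLMs for extraction, whereas this naive rule-based extractor and hand-rolled Cypher emitter are invented for illustration:

```python
import re

# Illustrative only: text2graph/text2cypher use LLMs. This naive version
# shows the shape of the output -- extract (subject, relation, object)
# triplets from text, then emit the Cypher that would load them into a
# graph database such as Neo4j.

def extract_triplets(text: str) -> list[tuple[str, str, str]]:
    """Pull (subject, relation, object) from simple 'X <verb> Y.' sentences."""
    pattern = re.compile(r"(\w[\w ]*?)\s+(acquired|founded)\s+(\w[\w ]*?)\.")
    return [(s.strip(), r.upper(), o.strip()) for s, r, o in pattern.findall(text)]


def to_cypher(triplet: tuple[str, str, str]) -> str:
    """Render one triplet as a Cypher MERGE statement."""
    subj, rel, obj = triplet
    return (f'MERGE (a:Entity {{name: "{subj}"}}) '
            f'MERGE (b:Entity {{name: "{obj}"}}) '
            f'MERGE (a)-[:{rel}]->(b)')


text = "Acme acquired Widgets Inc. Jane founded Acme."
triplets = extract_triplets(text)
print(triplets[0])
print(to_cypher(triplets[1]))
```

Using MERGE rather than CREATE keeps repeated ingestion idempotent: re-loading the same triplets does not duplicate nodes or relationships.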
Enhanced Evaluation¶
Enhanced Long-Context Model Evaluation: Supported evaluating long-context models on Intel® Gaudi® with vLLM. (HELMET#20)
TAG-Bench for SQL Agents: Integrated TAG-Bench to evaluate complex SQL query generation. (GenAIEval#)
DocSum Support: GenAIEval now supports evaluating the performance of DocSum. (GenAIEval#252)
Toxicity Detection Evaluation: Introduced a workflow to evaluate the capability of detecting toxic language based on LLMs. (GenAIEval#241)
Model Card: Added a model card generator for generating reports containing model performance and fairness metrics. (GenAIEval#236)
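The toxicity-detection evaluation above boils down to scoring model verdicts against labeled examples. A hedged sketch of that metric computation, with a made-up dataset and verdicts (not the actual GenAIEval workflow):

```python
# Sketch of the kind of metric computation a toxicity-detection
# evaluation performs: compare binary model verdicts against labels.
# The labels and predictions below are invented for illustration.

def score(predictions: list[bool], labels: list[bool]) -> dict:
    """Accuracy, precision, and recall for binary toxicity verdicts."""
    tp = sum(p and l for p, l in zip(predictions, labels))
    fp = sum(p and not l for p, l in zip(predictions, labels))
    fn = sum(not p and l for p, l in zip(predictions, labels))
    correct = sum(p == l for p, l in zip(predictions, labels))
    return {
        "accuracy": correct / len(labels),
        "precision": tp / (tp + fp) if tp + fp else 0.0,
        "recall": tp / (tp + fn) if tp + fn else 0.0,
    }


labels = [True, True, False, False]        # ground truth: is the text toxic?
predictions = [True, False, False, False]  # model verdicts
print(score(predictions, labels))
```

Reporting precision and recall separately matters here: a guardrail that misses toxic content (low recall) fails differently from one that over-blocks benign text (low precision).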
Observability¶
OpenTelemetry Tracing: Leveraged OpenTelemetry to enable tracing for ChatQnA and AgentQnA along with TGI and TEI. (GenAIExamples#1542)
Application dashboards: Added Helm-installed application E2E performance dashboards. (GenAIInfra#800)
E2E (end-to-end) metric improvements: E2E metrics are now summed across the multiple megaservice instances an application uses, with added tests and fixes for the E2E metrics. (GenAIComps#1301, GenAIComps#)
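The metric summing described above can be pictured as follows: each megaservice instance exports its own end-to-end counters, and the application-level view adds them per metric name. The metric names below are illustrative, not actual OPEA metric names:

```python
from collections import Counter

# Sketch of per-instance E2E counter aggregation: each megaservice
# instance exports its own counters, and the application-level totals
# are the per-name sums across instances.

def sum_metrics(per_instance: list[dict[str, float]]) -> dict[str, float]:
    total: Counter = Counter()
    for metrics in per_instance:
        total.update(metrics)  # Counter.update adds values per key
    return dict(total)


instance_a = {"e2e_request_count": 120, "e2e_first_token_latency_sum": 8.4}
instance_b = {"e2e_request_count": 80, "e2e_first_token_latency_sum": 5.6}
print(sum_metrics([instance_a, instance_b]))
```

Summing counters (rather than averaging them) is the right reduction for monotonic metrics like request counts and latency sums; averages can then be derived from the summed numerator and denominator.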
Better User Experience¶
GenAIStudio: Supported drag-and-drop creation of agentic applications. (GenAIStudio#50)
Documentation Refinement: Refined READMEs for key examples to help readers easily locate documentation tailored to deployment, customization, and hardware. (GenAIExamples#1741)
Optimized Dockerfiles: Simplified application Dockerfiles for faster image builds. (GenAIExamples#1585)
Exploration¶
SQFT: Supported low-precision sparse parameter-efficient fine-tuning on LLMs. (GenAIResearch#1)
Newly Supported Models¶
OPEA introduces support for the following models in this release.
| Model | TGI-Gaudi | vLLM-CPU | vLLM-Gaudi | vLLM-ROCm | OVMS | Optimum-Habana | PredictionGuard |
|---|---|---|---|---|---|---|---|
| deepseek-ai/DeepSeek-R1-Distill-Llama-8B | ✓ | ✓ | ✓ | ✓ | - | ✓ | - |
| deepseek-ai/DeepSeek-R1-Distill-Llama-70B | ✓ | ✓ | ✓ | ✓ | - | ✓ | - |
| deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B | ✓ | ✓ | ✓ | ✓ | - | ✓ | - |
| deepseek-ai/DeepSeek-R1-Distill-Qwen-7B | ✓ | ✓ | ✓ | ✓ | - | ✓ | - |
| deepseek-ai/DeepSeek-R1-Distill-Qwen-14B | ✓ | ✓ | ✓ | ✓ | - | ✓ | - |
| deepseek-ai/DeepSeek-R1-Distill-Qwen-32B | ✓ | ✓ | ✓ | ✓ | - | ✓ | - |
| deepseek-ai/Deepseek-v3 | ✓ | - | ✓ | ✓ | - | ✓ | - |
| Hermes-3-Llama-3.1-8B | - | - | - | ✓ | - | - | ✓ |
| ibm-granite/granite-3.2-8b-instruct | - | - | ✓ | ✓ | - | - | - |
| Phi-4-mini | x | x | x | ✓ | x | ✓ | - |
| Phi-4-multimodal-instruct | x | x | x | ✓ | x | ✓ | - |
| mistralai/Mistral-Small-24B-Instruct-2501 | ✓ | - | ✓ | ✓ | - | ✓ | - |
| mistralai/Mistral-Large-Instruct-2411 | x | - | ✓ | ✓ | - | ✓ | - |
(✓: supported; -: not validated; x: unsupported)
Newly Supported Hardware¶
AMD® GPU using AMD® ROCm™ for 9 examples. (GenAIExamples#1613 and 8 more.)
Other Notable Changes¶
Expand the following lists to read:
GenAIExamples
Functionalities
[AgentQnA] Added web search tool support and simplified the run instructions. (#1656) (e8f2313)
[ChatQnA] Added support for the latest DeepSeek models on Gaudi (#1491) (9adf7a6)
[EdgeCraftRAG] A sleek new UI based on Vue and Ant Design for enhanced user experience, supporting concurrent multi-requests on vLLM, JSON pipeline configuration, and API-based prompt modification. (#1665) (5a50ae0)
[EdgeCraftRAG] Supported multi-card Intel Arc GPU deployment for vLLM inference (#1729) (1a0c5f0)
[FaqGen] Merged FaqGen into ChatQnA for a unified chatbot experience. (#1654) (6d24c1c)
Benchmark
[ChatQnA] Provided unified scalable deployment and benchmarking support for examples (#1315) (ed16308)
Deployment
Sync values yaml file for 1.3 release (#1748) (46ebb78)
Bug Fixes
Documentation
Updated README.md for OPEA OTLP tracing (#1406) (4c41a5d)
Updated README.md for Agent UI (#1495) (88a8235)
Refactored AudioQnA README (#1508) (9f36e84)
Added a new section on changing the LLM model (e.g., DeepSeek) based on the validated model table in the LLM microservice (#1501) (970b869)
Updated README.md of AIPC quick start (#1578) (852bc70)
Added short descriptions to the images OPEA publishes on Docker Hub (#1637) (68747a9)
CI/CD/UT
GenAIComps
Functionalities
[agent] Enabled custom prompt for react_llama and react_langgraph (#1391) (558a2f6)
[dataprep] Added Multimodal support for Milvus for dataprep component (#1380) (006bd91)
[dataprep]: New Arango integration (#1558)
[dataprep]: Added the ability to customize Dataprep-specific input parameters by subclassing the DataprepRequest pydantic model, avoiding the need to expose parameters unique to a few Dataprep integrations across all Dataprep providers (#1525)
[retrieval]: New Arango integration (#1558)
[cores/mega] Added remote endpoint support (#1399) (1871dec)
[docsum] Enlarged DocSum prompt buffer (#1471) (772ef6e)
[embeddings] Refined CLIP embedding microservice by leveraging the third-party CLIP (#1298) (7727235)
[finetuning] Added xtune to finetuning for Intel ARC GPU (#1432) (80ef317)
[guardrails] Added native support for toxicity detection guardrail microservice (#1258) (625aec9)
[llm/text-generation] Added support for string message in Bedrock textgen (#1291) (364ccad)
[ipex] Added native LLM microservice using IPEX (#1337) (d51a136)
[lvm] Integrated vLLM to lvm as a backend (#1362) (831c5a3)
[lvm] Integrated UI-TARS vLLM in lvm component (#1458) (4a15795)
[nebula] Added Docker deployment support for the Nebula graph database (#1396) (342c1ed)
[OVMS] Added text generation, embeddings, and reranking microservices based on the OVMS component (#) (78b94fc)
[retriever/milvus] Added Multimodal support for Milvus for retriever component (#1381) (40d431a)
[text2image & image2image] Enriched input parameters of text2image and image2image. (#1339) (42f323f)
Refined synchronized I/O in asynchronous functions (#1300) (b08571f)
Bug Fixes
Fixed DocSum error caused by HuggingFaceEndpoint (#1246) (30e3dea)
Fixed tei embedding and tei reranking bug (#1256) (fa01f46)
Fixed web-retrievers hub client and tei endpoint issue (#1270) (ecb7f7b)
Fixed Dataprep Ingest Data Issue. (#1271) (b777db7)
Fixed metric id issue when init multiple Orchestrator instance (#1280) (f8e6216)
Fixed neo4j dataprep ingest error handling and skip_ingestion argument passing (#1288) (4a90692)
Fixed the retriever issue of Milvus (#1286) (47f68a4)
Fixed Qdrant retriever RAG issue. (#1289) (c3c8497)
Fixed agent message format. (#1297) (022d052)
Fixed milvus dataprep ingest files failure (#1299) (a033c05)
Fixed docker image security issues (#1321) (589587a)
Added megaservice/orchestrator metric testing and fixes (#1348) (1064b2b)
Fixed finetuning python regex syntax error (#1446) (380f95c)
Upgraded Optimum Habana version to fix security check issue (#1571) (83350aa)
Made llamaguard compatible with both TGI and vLLM (#1581) (4024302)
Documentation
CI/CD/UT
Refine dataprep test scripts (#1305) (a4f6af1)
GenAIEval
Auto Tuner
RAG Pilot - A RAG pipeline tuning tool allowing fine-grained control over key stages of parsing, chunking, postprocessing, and generation, enabling better retrieval and response generation. (#243) (97da8f2)
Monitoring
Metrics
Collect vllm latency metric for e2e test (#244) (1b6a91d)
Bug Fixes
Documentation
Add recommendations to platform optimization documentation (ea086a6)
GenAIInfra
HelmChart
[TDX] Added Intel TDX support to helm charts (#799) (040860e)
Add helm starter chart for developing new charts (#776) (6154b6c)
HPA enabling usability improvement (#770) (3016f5f)
Helm chart for Ollama (#774) (7d66afb)
Helm: Added Qdrant support (#796) (99ccf0c)
Chatqna: Added Qdrant DB support (#813) (5576cfd)
Helm installed application metric Grafana dashboards (#800) (f46e8c1)
LLM TextGen Bedrock Support (#811) (da37b9f)
codegen: Add rag pipeline and change default UI (#985) (46b1b6b)
dataprep/retriever: Support airgap offline environment (#980) (b9b10e9)
CSP
Added automated provisioning of CosmosDB and App Insights for OPEA applications (#657) (d29bd2d)
Bug Fixes
Fixed the helm chart release dependency update (#842) (f121edd)
CI/CD/UT
CI: Enabled milvus related test (#767) (5b2cca9)
GenAIStudio
Updated studio fe table UI and updated studio be according to the dataprep refactor (#32) (1168507)
[Feat] Added GenAI Studio UI improvement (#48) (ad64f7c)
Enabled LLM Traces for sandbox (#51) (df6b73e)
Migrated to internal k8s MySQL and enabled deployment package generation for AgentQnA (#52) (0cddbe0)
Deprecations¶
Deprecated Examples¶
The following GenAI examples are deprecated and have been removed as of OPEA v1.3:
| Example | Migration Solution | Reasons for Deprecation |
|---|---|---|
| | Use the example ChatQnA instead. | Provide users with a unified chatbot experience and reduce redundancy. |
Deprecated Docker Images¶
The following Docker images are deprecated and are not updated or tagged for the OPEA v1.3 release:
| Deprecated Docker Image | Migration Solution | Reasons for Deprecation |
|---|---|---|
| | Use opea/agent-openwebui instead. | Open WebUI based UI for better user experience. |
| | Use opea/chathistory-mongo instead. | Follow the OPEA naming rules. |
| | Use opea/chatqna or opea/chatqna-without-rerank instead. | FaqGen is deprecated. |
| | Use opea/chatqna-ui instead. | FaqGen is deprecated. |
| | Use opea/chatqna-ui instead. | FaqGen is deprecated. |
| | Use opea/feedbackmanagement-mongo instead. | Follow the OPEA naming rules. |
| | Use opea/promptregistry-mongo instead. | Follow the OPEA naming rules. |
The following Docker images are deprecated and will not be updated or tagged starting with the OPEA v1.4 release:
| Deprecated Docker Image | Migration Solution | Reasons for Deprecation |
|---|---|---|
| | Use opea/chathistory instead. The Docker image will be released with the | OPEA introduced OPEA Store to decouple the chathistory component from MongoDB. |
| | Use opea/feedbackmanagement instead. The Docker image will be released with the | OPEA introduced OPEA Store to decouple the feedback management component from MongoDB. |
| | Use opea/promptregistry instead. The Docker image will be released with the | OPEA introduced OPEA Store to decouple the prompt registry component from MongoDB. |
Deprecated GenAIExample Variables¶
| Example | Type | Variable | Migration Solution |
|---|---|---|---|
| | environment variable | | Removed from Intel AIPC deployment. Use the environment variable |
| | environment variable | | Removed from Intel AIPC deployment. Instead, users can customize |
| | environment variable | | Removed because it was unused. |
| | environment variable | | Removed because it was unused. |
| | environment variable | | Removed because it was unused. |
| | environment variable | | Instead, it has been split into two new environment variables: |
Deprecated GenAIComps Parameters¶
| Component | Parameter | Migration Solution |
|---|---|---|
| | | Its functionality is now fully covered by the new |
Updated Dependencies¶
| Dependency | Hardware | Scope | Version | Version in OPEA v1.2 | Comments |
|---|---|---|---|---|---|
| gradio | - | all examples | 5.11.0 | 5.5.0 | |
| huggingface/text-generation-inference | AMD GPU | all examples | 2.4.1-rocm | 2.3.1-rocm | |
| huggingface/text-embeddings-inference | all | all examples | cpu-1.6 | cpu-1.5 | |
| langchain | - | llms/doc-summarization | 0.3.14 | 0.3.15 | Avoid bugs in FaqGen and DocSum. |
| optimum-habana | Gaudi | lvms/llama-vision | 1.17.0 | - | |
| pytorch | Gaudi | all components | 2.5.1 | 2.4.0 | |
| transformers | - | lvms/llama-vision | 4.48.0 | 4.45.1 | |
| vllm | Xeon | all supported examples except EdgeCraftRAG | v0.8.3 | - | |
| vllm | Gaudi | all supported examples except EdgeCraftRAG | v0.6.6.post1+Gaudi-1.20.0 | v0.6.4.post2+Gaudi-1.19.0 | |
| vllm | AMD GPU | all supported examples | rocm6.3.1_instinct_vllm0.8.3_20250410 | - | |
Changes to Default Behavior¶
[agent] The default model changed from meta-llama/Meta-Llama-3-8B-Instruct to meta-llama/Llama-3.3-70B-Instruct.
Validated Hardware¶
Intel® Arc™ Graphics GPU (A770)
Intel® Gaudi® AI Accelerators (2nd, 3rd Gen)
Intel® Xeon® Scalable processors (4th, 5th, 6th Gen)
AMD® Instinct™ MI300X Accelerators (CDNA3)
Validated Software¶
Known Issues¶
AvatarChatbot cannot work in a K8s environment because of a functional gap in the wav2clip service. (GenAIExamples#)
Full Changelogs¶
Contributors¶
This release would not have been possible without the contributions of the following organizations and individuals.
Contributing Organizations¶
Amazon: Ollama deployment, Bedrock integration, OVMS integration, and bug fixes.
AMD: vLLM enablement on AMD GPUs for key examples, AMD GPU enabling on more examples, AMD OPEA blogs.
ArangoDB: OPEA Store and ArangoDB integration.
Intel: Development and improvements to GenAI examples, components, infrastructure, and evaluation.
Infosys: Azure support and documentation updates.
National Chiao Tung University: Documentation updates.
Prediction Guard: Maintenance of Prediction Guard components.
Individual Contributors¶
For a comprehensive list of individual contributors, please refer to the Full Changelogs section.