OPEA Release Notes v0.9
What’s New in OPEA v0.9
Broaden functionality
Provide telemetry functionality for metrics and tracing using Prometheus, Grafana, and Jaeger (see the metrics-scraping sketch after this list)
Initialize two Agent examples: AgentQnA and DocIndexRetriever
Support authentication and authorization
Add Nginx Component to strengthen backend security
Provide Toxicity Detection Microservice
Support the experimental Fine-tuning microservice
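As a quick illustration of the new telemetry support, the minimal sketch below polls a microservice's Prometheus-format metrics endpoint and prints a few samples. The host, port, and /metrics path are assumptions made for this example and may differ per service; consult the telemetry documentation for the actual endpoints.

```python
# Minimal sketch: read Prometheus-format metrics exposed by an OPEA microservice.
# The URL below is a placeholder for illustration; real services may differ.
import requests

METRICS_URL = "http://localhost:9000/metrics"  # hypothetical metrics endpoint


def fetch_metrics(url: str = METRICS_URL) -> dict:
    """Parse a Prometheus text exposition into {metric: value} (simplified reader)."""
    resp = requests.get(url, timeout=5)
    resp.raise_for_status()
    metrics = {}
    for line in resp.text.splitlines():
        if not line or line.startswith("#"):
            continue  # skip blank lines and HELP/TYPE comments
        parts = line.rsplit(maxsplit=1)
        if len(parts) != 2:
            continue  # ignore lines this simplified reader cannot split
        name, value = parts
        try:
            metrics[name] = float(value)
        except ValueError:
            pass  # skip non-numeric samples
    return metrics


if __name__ == "__main__":
    for name, value in list(fetch_metrics().items())[:10]:
        print(f"{name} = {value}")
```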
Enhancement
Align the microservice APIs with OpenAI standards (Chat Completions, Fine-tuning, etc.); see the request sketch after this list
Enhance performance benchmarking and evaluation for GenAIExamples (e.g., TGI, resource allocation)
Enable support for launching container images as a non-root user
Use Llama-Guard-2-8B as the default Guardrails model, bge-large-zh-v1.5 as the default embedding model, and mistral-7b-grok as the default CodeTrans model
Add ProductivitySuite to provide access management and maintain user context
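To make the OpenAI alignment above concrete, here is a minimal sketch of a Chat Completions-style request sent to an OPEA LLM endpoint. The URL and model name are placeholders (assumptions for illustration only); check the relevant example's README for the service's actual address and default model. Because the payload follows the Chat Completions schema, existing OpenAI-style client code can in principle be pointed at the OPEA endpoint with little more than a base-URL change.

```python
# Minimal sketch of an OpenAI Chat Completions-style request to an OPEA service.
# The endpoint URL and model name are placeholders; consult the service README.
import json

import requests

ENDPOINT = "http://localhost:8888/v1/chat/completions"  # hypothetical endpoint

payload = {
    "model": "Intel/neural-chat-7b-v3-3",  # placeholder model identifier
    "messages": [
        {"role": "user", "content": "Summarize what OPEA provides in one sentence."}
    ],
    "max_tokens": 128,
    "stream": False,
}

resp = requests.post(ENDPOINT, json=payload, timeout=60)
resp.raise_for_status()
print(json.dumps(resp.json(), indent=2))
```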
Deployment
Support Red Hat OpenShift Container Platform (RHOCP)
GenAI Microservices Connector (GMC) successfully tested on Nvidia GPUs
Add Kubernetes support for AudioQnA and VisualQnA examples
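For the Kubernetes deployments listed above, the following sketch uses the official kubernetes Python client to report pod readiness after an example's manifests have been applied. The namespace name is an assumption for illustration; substitute whichever namespace you deployed into.

```python
# Minimal sketch: check that pods for a deployed example (e.g., AudioQnA) are Ready.
# Requires the `kubernetes` package and a working kubeconfig; the namespace below
# is an assumption for illustration.
from kubernetes import client, config

NAMESPACE = "audioqna"  # hypothetical namespace holding the example's resources


def pods_ready(namespace: str = NAMESPACE) -> bool:
    config.load_kube_config()  # or config.load_incluster_config() inside a pod
    v1 = client.CoreV1Api()
    all_ready = True
    for pod in v1.list_namespaced_pod(namespace).items:
        ready = any(
            c.type == "Ready" and c.status == "True"
            for c in (pod.status.conditions or [])
        )
        print(f"{pod.metadata.name}: {'Ready' if ready else pod.status.phase}")
        all_ready = all_ready and ready
    return all_ready


if __name__ == "__main__":
    print("All pods ready:", pods_ready())
```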
OPEA Docker Hub: https://hub.docker.com/u/opea
GitHub IO: https://opea-project.github.io/latest/index.html
Thanks to the external contributors Sharan Shirodkar, Aishwarya Ramasethu, Michal Nicpon, and Jacob Mansdorfer
Details
GenAIExamples
ChatQnA
Update port in set_env.sh(040d2b7)
Fix minor issue in ChatQnA Gaudi docker README(a5ed223)
update chatqna dataprep-redis port(02a1536)
Add support for .md file in file upload in the chatqna-ui(7a67298)
Added the ChatQnA delete feature, and updated the corresponding README(09a3196)
fixed ISSUE-528(45cf553)
Fix vLLM and vLLM-on-Ray UT bug(cfcac3f)
set OLLAMA_MODEL env to docker container(c297155)
Update guardrail docker file path(06c4484)
remove ray serve(c71bc68)
Refine docker_compose for dataprep param settings(3913c7b)
fix chatqna guardrails(db2d2bd)
Support ChatQnA pipeline without rerank microservice(a54ffd2)
Update the number of microservice replicas for OPEA v0.9(e6b4fff)
Update set_env.sh(9657f7b)
add env for chatqna vllm(f78aa9e)
Deployment
update manifests for v0.9(ba78b4c)
Update K8S manifest for ChatQnA/CodeGen/CodeTrans/DocSum(01c1b75)
Update benchmark manifest to fix errors(4fd3517)
Update env for manifest(4fa37e7)
update manifests for v0.9(08f57fa)
Add AudioQnA example via GMC(c86cf85)
add k8s support for audioqna(0a6bad0)
Update mainifest for FaqGen(80e3e2a)
Add kubernetes support for VisualQnA(4f7fc39)
Add dataprep microservice to chatQnA example and the e2e test(1c23d87)
Documentation
[doc] Update README.md(c73e4e0)
doc fix: Update README.md to remove specific dicscription of paragraph-1(5a9c109)
doc: fix markdown in docker_image_list.md(9277fe6)
doc: fix markdown in Translation/README.md(d645305)
doc: fix markdown in SearchQnA/README.md(c461b60)
doc: fix FaqGen/README.md markdown(704ec92)
doc: fix markdown in DocSum/README.md(83712b9)
doc: fix markdown in CodeTrans/README.md(076bca3)
doc: fix CodeGen/README.md markdown(33f8329)
doc: fix markdown in ChatQnA/README.md(015a2b1)
doc: fix headings in markdown files(21fab71)
doc: missed an H1 in the middle of a doc(4259240)
doc: remove use of HTML for table in README(e81e0e5)
Update ChatQnA readme with OpenShift instructions(ed48371)
Convert HTML to markdown format.(14621f8)
Fix typo {your_ip} to {host_ip}(ad8ca88)
README fix typo(abc02e1)
fix script issues in MD file(acdd712)
Minor documentation improvements in the CodeGen README(17b9676)
Refine Main README(08eb269)
[Doc]Add a micro/mega service WorkFlow for DocSum(343d614)
Update README for k8s deployment(fbb81b6)
Other examples
Clean deprecated VisualQnA code(87617e7)
Using TGI official release docker image for intel cpu(b2771ad)
Add VisualQnA UI(923cf69)
fix container name(5ac77f7)
Add VisualQnA docker for both Gaudi and Xeon using TGI serving(2390920)
Remove LangSmith from Examples(88eeb0d)
Modify the language variable to match language highlight.(f08d411)
Remove deprecated folder.(7dd9952)
AgentQnA example(67df280)
fix tgi xeon tag(6674832)
Add new DocIndexRetriever example(566cf93)
Add env params for chatqna xeon test(5d3950)
ProductivitySuite Combo Application with REACT UI and Keycloak Authen(947cbe3)
change codegen tgi model(06cb308)
change searchqna prompt(acbaaf8)
minor fix mismatched hf token(ac324a9)
fix translation gaudi env(4f3be23)
Minor fixes for CodeGen Xeon and Gaudi Kubernetes codegen.yaml (c25063f)
CI/CD/UT
update deploy_gmc logical in cd workflow(c016d82)
fix ghcr.io/huggingface/text-generation-inference tag(503a1a9)
Add GMC e2e in CD workflow(f45e4c6)
Fix CI test changed file detect issue(5dcadf3)
update cd workflow name(3363a37)
Change microservice tags in CD workflow(71363a6)
Fix manual freeze images workflow(c327972)
open chatqna guardrails test(db2d2bd)
Add gmc build, scan and deploy workflow(a39f23a)
Enhance CI/CD infrastructure(c26d0f6)
Fix typo in CI workflow(e12baca)
Fix ChatQnA Qdrant CI issues(e71aba0)
remove continue-on-error: true to stop the test when image build failed(6296e9f)
Fix CD workflow typos(039014f)
Freeze base images(c9f9aca)
support multiple test cases for ChatQnA(939502d)
set action back to pull_request_target(1c07a38)
Add BoM collect workflow and image publish workflow(e93146b)
Fix left issues in CI/CD structure refactor(a6385bc)
Add composable manifest e2e test for cd workflow(d68be05)
Add secrets for CI test(3c9e2aa)
Build up docker images CD workflow(8c384e0)
fix corner issue in CI test(64bfea9)
Rename github workflow files(ebc165a)
Improve manifest chaqna test(a072441)
Refactor build image workflows with common action.yml(e22d413)
Automatic create issue to GenAIInfra when docker compose files changed(8bdb598)
Add components owner(ab98795)
Fix code scan warning(ac89855)
Check url of docker image list.(cf021ee)
change namespace surfix to random string (46af6f3)
chatqna k8s manifest: Fixed retriever-redis v0.9 image issue(7719755)
Adding Trivy and SBOM actions(f3ffcd5)
optimize CI log format(dfaf479)
GenAIComps
Cores
Refine parameter in api_protocol.py(0584b45)
Revert the default value of max_new_tokens to 1024(f2497c5)
Fixed Orchestrator schedule method(76877c1)
fix wrong indent(9b0edf2)
Allow downstream of streaming nodes(90e367e)
Add Retrieval gateway in core to support IndexRetrivel Megaservice(56daf95)
add telemetry doc(2a2a93)
LLM/embedding/reranking/retrieval
Using habana docker 1.16.1 everywhere(5deb383)
adding entrypoint.sh to faq-generation comp (4a7b8f4)
Fix image in docker compose yaml to use the built docker image tag from the README(72a2553)
Refine LLM Native Microservice(b16b14a)
Fix Retriever qdrant issue(7aee7e4)
Change /root/ to /home/user/.(4a67d42)
Fix embeddings_langchain-mosec issue.(87905ad)
fix HuggingFaceEmbedding deprecated in favor of HuggingFaceInferenceAPIEmbedding(2891cc6)
align vllm-ray response format to tgi response format(ac4a777)
build new images for llms(ed99d47)
LLM micro service input data does not have input model name(761f7e0)
Fix OpenVINO vLLM build scripts and update unit test case(91d825c)
Refine the instructions to run the retriever example with qdrant(eb51018)
Add cmds to restart ollama service and add proxy settings while launching docker(8eb8b6a)
Vllm and vllm-ray bug fix (add opea for vllm, update setuptools version)(0614fc2)
remove deprecated langchain imports and switch to langchain-huggingface(055404a)
[Enhence] Increase mosec_embedding forward timeout to support high concurrency cases(b61f61b)
Fix issues in updating embedding & reranking model to bge-large-zh-v1.5(da19c5d)
refact embedding/ranking/llm request/response by referring to openai format(7287caa)
align VLLM micro-service output format with UI(c1887ed)
fix vllm docker command(c1a5883)
Update Embedding Mosec Dockerfile to use BAAI/bge-large-zh-v1.5(bbdc1f0)
remove length limitation of embedding(edcd1e8)
Support SearchedDoc input type in LLM for No Rerank Pipeline (3c29fb4)
Add local_embedding return 768 length to align with chatqna example(a234db)
Refine LLM for No Rerank(fe8ef3)
Remove redundant dependency from ‘vllm-ray’ comps(068527d)
LVM/TTS/ASR
Revise TTS, SpeechT5Model to end the last audio chunk at the correct punctuation mark location(20fc8ca)
Support llava-next using TGI(e156101)
whisper: Fix container build failure(d5b8cdf)
support whisper long-form generation (daec680)
Support multiple image sources for LVM microservice(ed776ac)
fix ffmpeg build on hpu(ac3909d)
Support streaming output for LVM microservice(c5a0344)
Add video-llama LVM microservice under lvms(db8c893)
add torchvision into requirements(1566047)
Use Gaudi base images from Dockerhub(33db504)
update the requirements.txt for tts and asr(5ba2561)
DataPrep
Fix Dataprep qdrant issues and add Test Script(a851abf)
Refine robustness of Dataprep Redis(04986c1)
Address testcase failure(075e84f)
Added support for Unified Port, GET/DELETE endpoints in pgvector Dataprep(8a62bac)
Update dataprep default mosec embedding model in config.py(8f0f2b0)
unify port in one microservice.(f8d45e5)
Pinecone update to OPEA(7c9f77b)
Refine Dataprep Code & UT(867e9d7)
Support delete for Milvus vector db in Dataprep(767a14c)
Redis-dataprep: Make Redis connection consistent(cfaf5f0)
Update Dataprep with Parameter Settings(55b457b)
Fix Dataprep Potential Error in get_file(04ff8bf)
Add dependency for pdf2image and OCR processing(9397522)
Fix the data load issue for structured files (40f1463)
Fix deps #568(c541d1d)
Other Components
Remove ‘langsmith’ per code review(dcf68a0)
Refine Nginx Component(69f9895)
Add logging for unified debug(fab1fbd)
Add Nginx Component for Service Forwarding(60cc0b0)
Fix line endings to LF(fecf4ac)
Add Assistant API for agent(f3a8935)
doc: remove use of unknown highlight language(5bd8bda)
Update README.md(b271739)
doc: fix multiple H1 headings(77e0e7b)
Add RagAgentDocGrader to agent comp(368c833)
Update Milvus docker-compose.yaml(d3eefea)
prompt_registry: Unifying API endpoint port(27a01ee)
Minor SPDX header update(4712545)
Modification to toxicity plugin PR (63650d0)
Optional container build instructions(be4833f)
Add Uvicorn dependency(b2e2b1a)
Support launch as Non-Root user in all published container images.(1eaf6b7)
Update readme and remove empty readme(a61e434)
Refine Guardrails README and update model(7749ce3)
Add codeowner(fb0ea3d)
Remove unnecessary langsmith dependency(cc8cd70)
doc: add .gitignore(d39fee9)
Add output evaluation for guardrails(62ca5bc)
Add ML detection strategy to PII detection guardrail(de27e6b)
Add finetuning list job, cancel job, retrieve finetuning job feature(7bbbdaf)
update finetuning api with openai format.(1ff81da)
Add finetuning component (ad0bb7c)
Add toxicity detection microservice(97fdf54)
fix searchqna readme(66cbbf3)
Fix typos and add definitions for toxicity detection microservice(9b8798a)
CI/CD/UT
Fix tts image build error(8b9dcdd)
Add CD workflow.(5dedd04)
Fix CI test changed file detect issue(cd83854)
add sudo in wf remove(1043336)
adapt GenAIExample test structure refine(7ffaf24)
Freeze base images(61dba72)
Fix image build check waring.(2b14c63)
Modify validate result check.(8a6079d)
Fix requirement actions(2207503)
Add validate result detection.(cf15b91)
Check build fail and change port 8008 to 5025/5026.(5159aac)
Freeze requirements(5d9a855)
Fix vllm-ray issue(0bd8215)
Standardize image build.(a56a847)
clean local images before test(f36629a)
update test files(ab8ebc4)
Fix validation failure without exit.(f46f1f3)
Update Microservice CI trigger path(3ffcff4)
Add E2E example test(ec4143e)
Added unified ports for Chat History Microservice.(2098b91)
add secrets for test(cafcf1b)
[tests] normalize embedding and reranking endpoint docker image name(e3f29c3)
fix asr ut on hpu(9580298)
update image build list(7185d6b)
Add path check for dockerfiles in compose.yaml and change workflow name.(c45f8f0)
enhance docker image build(75d6bc9)
refactor build image with common action.yml(ee5b0f6)
Fix ‘=’ miss issues.(eb5cc8a)
fix freeze workflow(945b9e4)
GenAIEvals
remove useless code.(1004d5b)
Unify benchmark tool based on stresscli library(71637c0)
Fixed query list id out-of-range issue(7b719de)
Add GMC chatqna benchmark script(6a390da)
Add test example prompts for codegen(ebee50c)
doc: fix language on codeblock in README(85aef83)
Fix metrics issue of CRUD(82c1654)
Add benchmark stresscli scripts(9998cd7)
enhance multihop dataset accuracy(dfc2c1e)
doc: add Kubernetes platform-optimization README(7600db4)
doc: fix platform optimization README based on PR#73 feedback(8c7eb1b)
update for faq benchmark(d754a84)
Support e2e and first token P90 statistics(b07cd12)
GenAIInfra
GMC
update GMC e2e and Doc(8a85364)
Fixed some bugs for GMC yaml files(112295a)
Set up CD workflow for GMC(3d94844)
GMC: Add GPU support for GMC.(119941e)
authN-authZ: add oauth2-proxy support for authentication and authorization together with GMC(488a1ca)
Output streaming support for the whole pipeline in GMC router(c412aa3)
re-org k8s manifests files for GMC and examples(d39b315)
GMC: resource management(81060ab)
Enable GMC helm installation test in CI(497ff61)
Add helm chart for deploying GMC itself(a76c90f)
Add multiple endpoints for GMC pipeline via gmcrouter(da4f091)
GMC: fix unsafe quoting(aa2730a)
fix: update doc for authN-authZ with oauth(54cd66f)
Troubleshooting guide for the validating webhook.(b47ec0c)
Fix router bugs on max_new_tokens and dataprep gaudi yaml file(5735dd3)
Add dataprep microservice to chatQnA example(d9a0271)
Add HPA support to ChatQnA(cab7a88)
HelmChart
Add manual helm e2e test flow(3b5f62e)
Add script to generate manifests from helm charts(273cb1d)
ui: update chatqna helm chart readme and env name(a1d6d70)
Update helm chart readme(656dcc6)
helm: fix tei/tgi/docsum(a270726)
helm: update data-prep to latest changes(625899b)
helm: Update helm manifest to address user raised issues(4319660)
helm: Support local embedding(73b5b65)
ui: add helm chart/manifests for conversational UI(9dbe550)
helm: Add K8S probes to retriever-usvc(af47b3c)
Enable google secrets in helm chart e2e workflow(7079049)
Helm/Manifest: Add K8S probe(d3fc939)
Enable helm/common tests in CI(fa8ef35)
Helm: Add Nvidia GPU support for ChatQnA(868103b)
misc changes(b1182c4)
tgi: Update tgi version on xeon to latest-intel-cpu(c06bcea)
Fix typos in README(faa976b)
Support HF_ENDPOINT(cf28da4)
Set model-volume default to tmp volume(b5c14cd)
Enable using PV as model cache directory(c0d2ba6)
helm/manifest: Update to release v0.9(182183e)
Others
Rename workflows to get better readable(cb31d05)
Add manual job to freeze image tags and versions after code freeze(c0f5e2f)
tgi: revert xeon version to 2.2.0(076e81e)
Initial commit for Intel Gaudi Base Operator(c2a13d1)
Add AudioQnA example and e2e test(1b50b73)
Reorg and rename CI workflows to follow the rules(2bf648c)
Fix errors in ci workflow(779e526)
Add e2e test for chatqna with switch mode enable(7b20273)
Validating webhook implementation(df5f6f3)
Enhance manually run image build workflow(e983c32)
Add image build process on manual event(833dcec)
CI: change chart e2e to support tag replacing(739788a)
Add e2e test for chatQnA with dataprep microservice(c1fd27f)
Fix a bug of chart e2e workflow(86dd739)
Improve chart e2e test workflow and scripts(70205e5)
Correct TGI image tag for NV platform(629033b)
authN-authZ: change folder and split support(0c39b7b)
fix errors of manual helm workflow(bd46dfd)
update freeze tag manual workflow(c565909)
Update README(9480afc)
improve cd workflows and add release document (a4398b0)
Add some NVIDIA platform support docs and scripts(cad2fc3)