OPEA Release Notes v0.9

What’s New in OPEA v0.9

  • Broaden functionality

    • Provide telemetry functionalities for metrics and tracing using Prometheus, Grafana, and Jaeger (see the metrics sketch below)

    • Initialize two Agent examples: AgentQnA and DocIndexRetriever

    • Support for authentication and authorization

    • Add Nginx Component to strengthen backend security

    • Provide Toxicity Detection Microservice

    • Support the experimental Fine-tuning microservice
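
For context on the telemetry item above: Prometheus collects metrics by scraping a /metrics endpoint that each service exposes, Grafana visualizes them, and Jaeger collects traces. The sketch below is a generic illustration of the metrics side using the prometheus_client library; it is not OPEA's actual instrumentation, and the port and metric names are invented for this example.

```python
# Minimal, illustrative Prometheus metrics endpoint (not OPEA's actual code).
# The port (8000) and metric names are assumptions made for this sketch.
from prometheus_client import Counter, Histogram, start_http_server
import random
import time

REQUESTS = Counter("demo_requests_total", "Total requests handled")
LATENCY = Histogram("demo_request_latency_seconds", "Request latency in seconds")

def handle_request() -> None:
    REQUESTS.inc()                    # count every request
    with LATENCY.time():              # record how long the work took
        time.sleep(random.uniform(0.01, 0.1))  # stand-in for real work

if __name__ == "__main__":
    start_http_server(8000)           # Prometheus scrapes http://<host>:8000/metrics
    while True:
        handle_request()
```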

  • Enhancement

    • Align the microservice APIs with the OpenAI API standards (Chat Completions, Fine-tuning, etc.); see the request sketch below

    • Enhance performance benchmarking and evaluation for GenAIExamples (e.g., TGI configuration, resource allocation)

    • Enable support for launching container images as a non-root user

    • Use Llama-Guard-2-8B as the default Guardrails model, bge-large-zh-v1.5 as the default embedding model, and mistral-7b-grok as the default CodeTrans model

    • Add ProductivitySuite to provide access management and maintain user context
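
To make the OpenAI alignment item above concrete, the sketch below sends an OpenAI-style Chat Completions request to an OPEA LLM microservice. The host, port, and model name are placeholders; substitute the values exposed by your own deployment.

```python
# A minimal sketch of an OpenAI-style Chat Completions request to an OPEA
# LLM microservice. The URL and model id below are assumptions; adjust them
# to match your deployment.
import requests

url = "http://localhost:9000/v1/chat/completions"     # assumed host and port
payload = {
    "model": "Intel/neural-chat-7b-v3-3",              # assumed model id
    "messages": [{"role": "user", "content": "What is OPEA?"}],
    "max_tokens": 128,
    "stream": False,
}

resp = requests.post(url, json=payload, timeout=60)
resp.raise_for_status()
print(resp.json()["choices"][0]["message"]["content"])
```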

  • Deployment

    • Support Red Hat OpenShift Container Platform (RHOCP)

    • GenAI Microservices Connector (GMC) successfully tested on Nvidia GPUs

    • Add Kubernetes support for AudioQnA and VisualQnA examples

  • OPEA Docker Hub: https://hub.docker.com/u/opea (see the pull sketch below)

  • GitHub IO: https://opea-project.github.io/latest/index.html

  • Thanks to Sharan Shirodkar, Aishwarya Ramasethu, Michal Nicpon, and Jacob Mansdorfer for their external contributions
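
The images referenced above can be pulled with any container tooling; as a small illustration, the sketch below uses the Docker SDK for Python. The repository name and tag are examples only; check the Docker Hub page for the images and tags actually published for this release.

```python
# Illustrative pull of an OPEA image via the Docker SDK for Python.
# The repository and tag are examples; see https://hub.docker.com/u/opea
# for what is actually published.
import docker

client = docker.from_env()                               # talks to the local Docker daemon
image = client.images.pull("opea/chatqna", tag="latest")  # example repository and tag
print(image.tags)
```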

Details

GenAIExamples
  • ChatQnA

    • Update port in set_env.sh(040d2b7)

    • Fix minor issue in ChatQnA Gaudi docker README(a5ed223)

    • update chatqna dataprep-redis port(02a1536)

    • Add support for .md file in file upload in the chatqna-ui(7a67298)

    • Added the ChatQnA delete feature, and updated the corresponding README(09a3196)

    • fixed ISSUE-528(45cf553)

    • Fix vLLM and vLLM-on-Ray UT bug(cfcac3f)

    • set OLLAMA_MODEL env to docker container(c297155)

    • Update guardrail docker file path(06c4484)

    • remove ray serve(c71bc68)

    • Refine docker_compose for dataprep param settings(3913c7b)

    • fix chatqna guardrails(db2d2bd)

    • Support ChatQnA pipeline without rerank microservice(a54ffd2)

    • Update the number of microservice replicas for OPEA v0.9(e6b4fff)

    • Update set_env.sh(9657f7b)

    • add env for chatqna vllm(f78aa9e)

  • Deployment

    • update manifests for v0.9(ba78b4c)

    • Update K8S manifest for ChatQnA/CodeGen/CodeTrans/DocSum(01c1b75)

    • Update benchmark manifest to fix errors(4fd3517)

    • Update env for manifest(4fa37e7)

    • update manifests for v0.9(08f57fa)

    • Add AudioQnA example via GMC(c86cf85)

    • add k8s support for audioqna(0a6bad0)

    • Update manifest for FaqGen(80e3e2a)

    • Add kubernetes support for VisualQnA(4f7fc39)

    • Add dataprep microservice to chatQnA example and the e2e test(1c23d87)

  • Documentation

    • [doc] Update README.md(c73e4e0)

    • doc fix: Update README.md to remove specific description of paragraph-1(5a9c109)

    • doc: fix markdown in docker_image_list.md(9277fe6)

    • doc: fix markdown in Translation/README.md(d645305)

    • doc: fix markdown in SearchQnA/README.md(c461b60)

    • doc: fix FaqGen/README.md markdown(704ec92)

    • doc: fix markdown in DocSum/README.md(83712b9)

    • doc: fix markdown in CodeTrans/README.md(076bca3)

    • doc: fix CodeGen/README.md markdown(33f8329)

    • doc: fix markdown in ChatQnA/README.md(015a2b1)

    • doc: fix headings in markdown files(21fab71)

    • doc: missed an H1 in the middle of a doc(4259240)

    • doc: remove use of HTML for table in README(e81e0e5)

    • Update ChatQnA readme with OpenShift instructions(ed48371)

    • Convert HTML to markdown format.(14621f8)

    • Fix typo {your_ip} to {host_ip}(ad8ca88)

    • README fix typo(abc02e1)

    • fix script issues in MD file(acdd712)

    • Minor documentation improvements in the CodeGen README(17b9676)

    • Refine Main README(08eb269)

    • [Doc]Add a micro/mega service WorkFlow for DocSum(343d614)

    • Update README for k8s deployment(fbb81b6)

  • Other examples

    • Clean deprecated VisualQnA code(87617e7)

    • Using TGI official release docker image for intel cpu(b2771ad)

    • Add VisualQnA UI(923cf69)

    • fix container name(5ac77f7)

    • Add VisualQnA docker for both Gaudi and Xeon using TGI serving(2390920)

    • Remove LangSmith from Examples(88eeb0d)

    • Modify the language variable to match language highlight.(f08d411)

    • Remove deprecated folder.(7dd9952)

    • AgentQnA example(67df280)

    • fix tgi xeon tag(6674832)

    • Add new DocIndexRetriever example(566cf93)

    • Add env params for chatqna xeon test(5d3950)

    • ProductivitySuite Combo Application with React UI and Keycloak authentication(947cbe3)

    • change codegen tgi model(06cb308)

    • change searchqna prompt(acbaaf8)

    • minor fix mismatched hf token(ac324a9)

    • fix translation gaudi env(4f3be23)

    • Minor fixes for CodeGen Xeon and Gaudi Kubernetes codegen.yaml (c25063f)

  • CI/CD/UT

    • update deploy_gmc logic in cd workflow(c016d82)

    • fix ghcr.io/huggingface/text-generation-inference tag(503a1a9)

    • Add GMC e2e in CD workflow(f45e4c6)

    • Fix CI test changed file detect issue(5dcadf3)

    • update cd workflow name(3363a37)

    • Change microservice tags in CD workflow(71363a6)

    • Fix manual freeze images workflow(c327972)

    • open chatqna guardrails test(db2d2bd)

    • Add gmc build, scan and deploy workflow(a39f23a)

    • Enhance CI/CD infrastructure(c26d0f6)

    • Fix typo in CI workflow(e12baca)

    • Fix ChatQnA Qdrant CI issues(e71aba0)

    • remove continue-on-error: true to stop the test when image build failed(6296e9f)

    • Fix CD workflow typos(039014f)

    • Freeze base images(c9f9aca)

    • support multiple test cases for ChatQnA(939502d)

    • set action back to pull_request_target(1c07a38)

    • Add BoM collect workflow and image publish workflow(e93146b)

    • Fix left issues in CI/CD structure refactor(a6385bc)

    • Add composable manifest e2e test for cd workflow(d68be05)

    • Add secrets for CI test(3c9e2aa)

    • Build up docker images CD workflow(8c384e0)

    • fix corner issue in CI test(64bfea9)

    • Rename github workflow files(ebc165a)

    • Improve manifest chatqna test(a072441)

    • Refactor build image workflows with common action.yml(e22d413)

    • Automatically create an issue in GenAIInfra when docker compose files change(8bdb598)

    • Add components owner(ab98795)

    • Fix code scan warning(ac89855)

    • Check url of docker image list.(cf021ee)

    • change namespace suffix to random string (46af6f3)

    • chatqna k8s manifest: Fixed retriever-redis v0.9 image issue(7719755)

    • Adding Trivy and SBOM actions(f3ffcd5)

    • optimize CI log format(dfaf479)

GenAIComps
  • Cores

    • Refine parameter in api_protocol.py(0584b45)

    • Revert the default value of max_new_tokens to 1024(f2497c5)

    • Fixed Orchestrator schedule method(76877c1)

    • fix wrong indent(9b0edf2)

    • Allow downstream of streaming nodes(90e367e)

    • Add Retrieval gateway in core to support the DocIndexRetriever Megaservice(56daf95)

    • add telemetry doc(2a2a93)

  • LLM/embedding/reranking/retrieval

    • Using habana docker 1.16.1 everywhere(5deb383)

    • adding entrypoint.sh to faq-generation comp (4a7b8f4)

    • Fix image in docker compose yaml to use the built docker image tag from the README(72a2553)

    • Refine LLM Native Microservice(b16b14a)

    • Fix Retriever qdrant issue(7aee7e4)

    • Change /root/ to /home/user/.(4a67d42)

    • Fix embeddings_langchain-mosec issue.(87905ad)

    • fix HuggingFaceEmbedding deprecated in favor of HuggingFaceInferenceAPIEmbedding(2891cc6)

    • align vllm-ray response format to tgi response format(ac4a777)

    • build new images for llms(ed99d47)

    • LLM micro service input data does not have input model name(761f7e0)

    • Fix OpenVINO vLLM build scripts and update unit test case(91d825c)

    • Refine the instructions to run the retriever example with qdrant(eb51018)

    • Add cmds to restart ollama service and add proxy settings while launching docker(8eb8b6a)

    • Vllm and vllm-ray bug fix (add opea for vllm, update setuptools version)(0614fc2)

    • remove deprecated langchain imports and switch to langchain-huggingface(055404a)

    • [Enhance] Increase mosec_embedding forward timeout to support high concurrency cases(b61f61b)

    • Fix issues in updating embedding & reranking model to bge-large-zh-v1.5(da19c5d)

    • refactor embedding/ranking/llm request/response by referring to openai format(7287caa)

    • align VLLM micro-service output format with UI(c1887ed)

    • fix vllm docker command(c1a5883)

    • Update Embedding Mosec Dockerfile to use BAAI/bge-large-zh-v1.5(bbdc1f0)

    • remove length limitation of embedding(edcd1e8)

    • Support SearchedDoc input type in LLM for No Rerank Pipeline (3c29fb4)

    • Make local_embedding return 768-length vectors to align with the chatqna example(a234db)

    • Refine LLM for No Rerank(fe8ef3)

    • Remove redundant dependency from ‘vllm-ray’ comps(068527d)

  • LVM/TTS/ASR

    • Revise TTS, SpeechT5Model to end the last audio chunk at the correct punctuation mark location(20fc8ca)

    • Support llava-next using TGI(e156101)

    • whisper: Fix container build failure(d5b8cdf)

    • support whisper long-form generation (daec680)

    • Support multiple image sources for LVM microservice(ed776ac)

    • fix ffmpeg build on hpu(ac3909d)

    • Support streaming output for LVM microservice(c5a0344)

    • Add video-llama LVM microservice under lvms(db8c893)

    • add torchvision into requirements(1566047)

    • Use Gaudi base images from Dockerhub(33db504)

    • update the requirements.txt for tts and asr(5ba2561)

  • DataPrep

    • Fix Dataprep qdrant issues and add Test Script(a851abf)

    • Refine robustness of Dataprep Redis(04986c1)

    • Address testcase failure(075e84f)

    • Added support for Unified Port, GET/DELETE endpoints in pgvector Dataprep(8a62bac)

    • Update dataprep default mosec embedding model in config.py(8f0f2b0)

    • unify port in one microservice.(f8d45e5)

    • Pinecone update to OPEA(7c9f77b)

    • Refine Dataprep Code & UT(867e9d7)

    • Support delete for Milvus vector db in Dataprep(767a14c)

    • Redis-dataprep: Make Redis connection consistent(cfaf5f0)

    • Update Dataprep with Parameter Settings(55b457b)

    • Fix Dataprep Potential Error in get_file(04ff8bf)

    • Add dependency for pdf2image and OCR processing(9397522)

    • Fix the data load issue for structured files (40f1463)

    • Fix deps #568(c541d1d)

  • Other Components

    • Remove ‘langsmith’ per code review(dcf68a0)

    • Refine Nginx Component(69f9895)

    • Add logging for unified debug(fab1fbd)

    • Add Nginx Component for Service Forwarding(60cc0b0)

    • Fix line endings to LF(fecf4ac)

    • Add Assistant API for agent(f3a8935)

    • doc: remove use of unknown highlight language(5bd8bda)

    • Update README.md(b271739)

    • doc: fix multiple H1 headings(77e0e7b)

    • Add RagAgentDocGrader to agent comp(368c833)

    • Update Milvus docker-compose.yaml(d3eefea)

    • prompt_registry: Unifying API endpoint port(27a01ee)

    • Minor SPDX header update(4712545)

    • Modification to toxicity plugin PR (63650d0)

    • Optional container build instructions(be4833f)

    • Add Uvicorn dependency(b2e2b1a)

    • Support launch as Non-Root user in all published container images.(1eaf6b7)

    • Update readme and remove empty readme(a61e434)

    • Refine Guardrails README and update model(7749ce3)

    • Add codeowner(fb0ea3d)

    • Remove unnecessary langsmith dependency(cc8cd70)

    • doc: add .gitignore(d39fee9)

    • Add output evaluation for guardrails(62ca5bc)

    • Add ML detection strategy to PII detection guardrail(de27e6b)

    • Add finetuning list job, cancel job, retrieve finetuning job feature(7bbbdaf)

    • update finetuning api with openai format.(1ff81da)

    • Add finetuning component (ad0bb7c)

    • Add toxicity detection microservice(97fdf54)

    • fix searchqna readme(66cbbf3)

    • Fix typos and add definitions for toxicity detection microservice(9b8798a)

  • CI/CD/UT

    • Fix tts image build error(8b9dcdd)

    • Add CD workflow.(5dedd04)

    • Fix CI test changed file detect issue(cd83854)

    • add sudo in wf remove(1043336)

    • adapt GenAIExample test structure refine(7ffaf24)

    • Freeze base images(61dba72)

    • Fix image build check warning.(2b14c63)

    • Modify validate result check.(8a6079d)

    • Fix requirement actions(2207503)

    • Add validate result detection.(cf15b91)

    • Check build fail and change port 8008 to 5025/5026.(5159aac)

    • Freeze requirements(5d9a855)

    • Fix vllm-ray issue(0bd8215)

    • Standardize image build.(a56a847)

    • clean local images before test(f36629a)

    • update test files(ab8ebc4)

    • Fix validation failure without exit.(f46f1f3)

    • Update Microservice CI trigger path(3ffcff4)

    • Add E2E example test(ec4143e)

    • Added unified ports for Chat History Microservice.(2098b91)

    • add secrets for test(cafcf1b)

    • [tests] normalize embedding and reranking endpoint docker image name(e3f29c3)

    • fix asr ut on hpu(9580298)

    • update image build list(7185d6b)

    • Add path check for dockerfiles in compose.yaml and change workflow name.(c45f8f0)

    • enhance docker image build(75d6bc9)

    • refactor build image with common action.yml(ee5b0f6)

    • Fix missing ‘=’ issues.(eb5cc8a)

    • fix freeze workflow(945b9e4)

GenAIEvals
  • remove useless code.(1004d5b)

  • Unify benchmark tool based on stresscli library(71637c0)

  • Fixed query list id out-of-range issue(7b719de)

  • Add GMC chatqna benchmark script(6a390da)

  • Add test example prompts for codegen(ebee50c)

  • doc: fix language on codeblock in README(85aef83)

  • Fix metrics issue of CRUD(82c1654)

  • Add benchmark stresscli scripts(9998cd7)

  • enhance multihop dataset accuracy(dfc2c1e)

  • doc: add Kubernetes platform-optimization README(7600db4)

  • doc: fix platform optimization README based on PR#73 feedback(8c7eb1b)

  • update for faq benchmark(d754a84)

  • Support e2e and first token P90 statistics(b07cd12)
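
Regarding the P90 statistics item above: a P90 value is the 90th percentile of the measured latencies. The sketch below shows that computation on made-up sample data; the real measurements are collected by the stresscli benchmarking scripts listed above.

```python
# Minimal sketch of computing P90 latency statistics from collected samples.
# The sample values below are invented for illustration.
import numpy as np

first_token_latencies = [0.21, 0.25, 0.19, 0.30, 0.27, 0.22, 0.35, 0.24]  # seconds
e2e_latencies = [1.8, 2.1, 1.9, 2.6, 2.3, 2.0, 2.9, 2.2]                  # seconds

print("first-token P90:", np.percentile(first_token_latencies, 90))
print("end-to-end  P90:", np.percentile(e2e_latencies, 90))
```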

GenAIInfra
  • GMC

    • update GMC e2e and Doc(8a85364)

    • Fixed some bugs for GMC yaml files(112295a)

    • Set up CD workflow for GMC(3d94844)

    • GMC: Add GPU support for GMC.(119941e)

    • authN-authZ: add oauth2-proxy support for authentication and authorization together with GMC(488a1ca)

    • Output streaming support for the whole pipeline in GMC router(c412aa3)

    • re-org k8s manifest files for GMC and examples(d39b315)

    • GMC: resource management(81060ab)

    • Enable GMC helm installation test in CI(497ff61)

    • Add helm chart for deploying GMC itself(a76c90f)

    • Add multiple endpoints for GMC pipeline via gmcrouter(da4f091)

    • GMC: fix unsafe quoting(aa2730a)

    • fix: update doc for authN-authZ with oauth(54cd66f)

    • Troubleshooting guide for the validating webhook.(b47ec0c)

    • Fix router bugs on max_new_tokens and dataprep gaudi yaml file(5735dd3)

    • Add dataprep microservice to chatQnA example(d9a0271)

    • Add HPA support to ChatQnA(cab7a88)

  • HelmChart

    • Add manual helm e2e test flow(3b5f62e)

    • Add script to generate manifests from helm charts(273cb1d)

    • ui: update chatqna helm chart readme and env name(a1d6d70)

    • Update helm chart readme(656dcc6)

    • helm: fix tei/tgi/docsum(a270726)

    • helm: update data-prep to latest changes(625899b)

    • helm: Update helm manifest to address user raised issues(4319660)

    • helm: Support local embedding(73b5b65)

    • ui: add helm chart/manifests for conversational UI(9dbe550)

    • helm: Add K8S probes to retriever-usvc(af47b3c)

    • Enable google secrets in helm chart e2e workflow(7079049)

    • Helm/Manifest: Add K8S probe(d3fc939)

    • Enable helm/common tests in CI(fa8ef35)

    • Helm: Add Nvidia GPU support for ChatQnA(868103b)

    • misc changes(b1182c4)

    • tgi: Update tgi version on xeon to latest-intel-cpu(c06bcea)

    • Fix typos in README(faa976b)

    • Support HF_ENDPOINT(cf28da4)

    • Set model-volume default to tmp volume(b5c14cd)

    • Enable using PV as model cache directory(c0d2ba6)

    • helm/manifest: Update to release v0.9(182183e)

  • Others

    • Rename workflows for better readability(cb31d05)

    • Add manual job to freeze image tags and versions after code freeze(c0f5e2f)

    • tgi: revert xeon version to 2.2.0(076e81e)

    • Initial commit for Intel Gaudi Base Operator(c2a13d1)

    • Add AudioQnA example and e2e test(1b50b73)

    • Reorg and rename CI workflows to follow the rules(2bf648c)

    • Fix errors in ci workflow(779e526)

    • Add e2e test for chatqna with switch mode enable(7b20273)

    • Validating webhook implementation(df5f6f3)

    • Enhance manually run image build workflow(e983c32)

    • Add image build process on manual event(833dcec)

    • CI: change chart e2e to support tag replacing(739788a)

    • Add e2e test for chatQnA with dataprep microservice(c1fd27f)

    • Fix a bug of chart e2e workflow(86dd739)

    • Improve chart e2e test workflow and scripts(70205e5)

    • Correct TGI image tag for NV platform(629033b)

    • authN-authZ: change folder and split support(0c39b7b)

    • fix errors of manual helm workflow(bd46dfd)

    • update freeze tag manual workflow(c565909)

    • Update README(9480afc)

    • improve cd workflows and add release document (a4398b0)

    • Add some NVIDIA platform support docs and scripts(cad2fc3)