OPEA Release Notes v1.0

What’s New in OPEA v1.0

Highlights
- Improve RAG performance through microservice optimizations (e.g., Hugging Face TGI, vLLM) and megaservice tuning
- Provide experimental LLM training support, including full fine-tuning and parameter-efficient fine-tuning (PEFT)
- Improve RAG with Knowledge Graph based on Neo4j
- Improve VisualQnA and provide multi-modality RAG support
- Faster microservice launch through removal of some dispatch overhead
- Enable Gateway with guardrail, and integrate nginx with CORS protection and data preparation
- Enable HorizontalPodAutoscaler (HPA) for better resource management
- Define the metrics of RAG performance and enable accuracy evaluation for more GenAI examples
- Further improve documentation and developer experience
Other features
- Enable OpenAI compatible format on applicable microservices
- Support microservice launch from ModelScope to address the needs of the China ecosystem
- Support Red Hat OpenShift Container Platform (RHOCP)
- Refactor the code and CI/CD pipeline to provide better support for contributors
- Improve Docker versioning to avoid potential conflicts
- Enhance GenAI Microservice Connector (GMC), including router performance optimizations and other updates
- Introduce Memory Bandwidth Exporter that integrates with Kubernetes Node Resource Interface
Learn more about OPEA at
- Getting Started: https://opea-project.github.io/latest/index.html
- GitHub: https://github.com/opea-project
- Docker Hub: https://hub.docker.com/u/opea
Release Documentation:
- Landing Page: https://opea.dev/
- Release Notes: https://github.com/opea-project/docs/tree/main/release_notes

Details

GenAIExamples

Deployment
- Add ui/nginx support in K8S manifest for ChatQnA/CodeGen/CodeTrans/Docsum(ba94e01)
- K8S manifest: Update ChatQnA/CodeGen/CodeTrans/DocSum(0629696)
- Update mount path in xeon k8s(2a6af64)
- Add Nginx - k8s manifest in CodeTrans(6a679ba)
- Add Nginx - docker in CodeTrans(cc84847)
- watch more docker compose files changes(4b0bc26)
- Add chatQnA UI manifest(758d236)
- Revert the LLM model for kubernetes GMS(f5f1e32)
- [ChatQnA] Update retrieval & dataprep manifests(6730b24)
- [ChatQnA]Update manifests(3563f5d)
- [ChatQnA] Update benchmarking manifests(36fb9a9)
- [ChatQnA] update OOB & Tuned manifests(ac34860)
- Add nginx and UI to the ChatQnA manifest(05f9828)
- [ChatQnA] Update OOB with wrapper manifests.(933c3d3)
- [Translation] Support manifests and nginx(1e13031)
- update V1.0 benchmark manifest (e5affb9)
- update image name(e2a74f7)
- Change megaservice path in line with new file structure(5ab27b6)
- Yaml: add comments to specify gaudi device ids.(63406dc)
- add tgi bf16 setup on CPU k8s.(ba17031)
Documentation
- [ChatQnA] Update README for ModelScope(aebc23f)
- Update README.md(4bd7841)
- [ChatQnA] Update README for without Rerank Pipeline(6b617d6)
- [ChatQnA] Update Benchmark README for w/o rerank(4a51874)
- Fix readme for nv gpu(43b2ae5)
- [ChatQnA] Update Benchmark README to Fix Input Length(55d287d)
- Refine ChatQnA README for TGI(afc3341)
- Add default model for VisualQnA README(07baa8f)
- Update readme for manifests of some examples(adb157f)
- doc: use markdown table in supported_examples(9cf1d88)
- doc: remove invalid code block language(c6d811a)
- add AudioQnA readme with supported model(f4f4da2)
- add more code owners(7f89797)
- doc: fix headings(7a0fca7)
- [Codegen] Refine readme to prompt users on how to change the model.(814164d)
- Update README.md and remove some open-source details(2ef83fc)
- Add issue template(84a781a)
- doc: fix headings and indenting(67394b8)
- Add default model in readme for FaqGen and DocSum(d487093)
- Change docs of kubernetes for curl commands in README(4133757)
- Update v0.9 RAG release data(947936e)
- Explain Default Model in ChatQnA and CodeTrans READMEs(2a2ff45)
- Update docker images list.(a8244c4)
- refactor the network port setting for AWS(bc81770)
- Add validate microservice details link(bd811bd)
- [ChatQnA] Add Nginx in Docker Compose and README(6c36448)
- [Doc] Update CodeGen and Translation READMEs(a09395e)
- [Doc] Refine READMEs(372d78c)
- Remove marketing materials(d85ec09)
- doc PR to main instead of v1.0r(dc94026)
- Update README.md for Multiplatforms(b205dc7)
- Refine the quick start of ChatQnA(3b70fb0)
- Update supported_examples(96d5cd9)
- [Doc] doc improvement(e0b3b57)
- Fix README issues(bceacdc)
- doc: fix broken image reference and markdown(d422929)
- doc: give document meaningful title(a3fa0d6)
- doc: fix incorrect path to png image files (d97882e)
- update doc according to comments(f990f79)
- refine readme for reorg(d2bab99)
- Update README with new examples(2d28beb)
- README: fix broken links(ff6f841)
- Update README.md of pdf file(87e51d5)
- Add table to list port, endpoint, framework, model, serving, and hardware for each microservice in ChatQnA(1a934af)
- Update SearchQnA document and compose.yaml(5c67204)
- Update invalid link(7b2194f)
- AgentQnA: Fix erroneous link in the README(1144fae)
- Fix Xeon reference per its trademark(e1b8ce0)
- Provide the method to get nke-10k-2023.pdf(a2745b2)
- adopted tech writing style(558ea3b)
- Improve ChatQnA flowchart according to feedback(375ea7a)
- Fix BACKEND_SERVICE_ENDPOINT variable value in the VideoQnA instructions(79e947e)
- [Doc] Refine ChatQnA README(7eaab93)
Functionalities and Bug Fix
- Fix refactor bug(7c13f2c)
- Provide the method to get nke-10k-2023.pdf(a2745b2)
- Integrate visualQnA backend(fa12083)
- Enable nginx for VisualQnA(def19b4)
- Add Settings and Update system Prompt option(1d1e1f9)
- Refactor folder to support different vendors(d73129c)
- Add rerank finetuning example(71857f5)
- remove logs for benchmark(e0bc5f2)
- update image build for 2 new examples(0869029)
- fix comps/nginx image build content(22d066a)
- react-ui: Add support to display Chinese(8c40204)
- [VisualQnA] Update compose.yaml to fix the endpoint url issue in UI(fbaa024)
- Add megaservice definition without microservice wrappers(ebe6b47)
- Add instruction tuning example(4c78f8c)
- fix token name(1e47444)
- Modify the handling of detected warnings to only prompt.(e6f5d13)
- Always upload scan artifacts(6f3e54a)
- Update ChatQnA env (32afb65)
- Yinghu5 patch 1(beda609)
- Update ollama run command(10c81f1)
- weekly update images tag(035f39f)
- Fix port conflict in llava-tgi-service in VisualQnA(993688a)
- Remove ‘vim’ from all Dockerfiles(1874dfd)
- enhance image publish action(5fde666)
- Update port in set_env.sh for TGI endpoint(e5ec38c)
- move evaluation scripts(f04f061)
- Handle uncontrolled data path for MultimodalQnA v1.0 release(872e93e)
- Align parameters for “max_token, repetition_penalty, presence_penalty, frequency_penalty”(2f03a3a)
- Remove useless folder.(88829c9)
- fix path bug for reorg(264759d)
- fix reorg bug(504228e)
- Add hyperlinks picture paths validation.(0611707)
- Added gaudi example for rerank model finetuning(edcc50f)
- Add VideoRAGQnA as MMRAG usecase in Example(2dd69dc)
- Agent example for v1.0 release(262a6f6)
- Fix issues with the VisualQnA instructions (bc4bbfa)
- Made CodeGen react ui use runtime environment variables(b84c989)
- add image build for new examples(3f2e7b7)
- fix image build issue on push(88fde62)
- [ChatQnA] Add no_wrapper benchmarking and update legacy manifests(06696c8)
- Add imagePrompt to display default image hint(e48532e)
- BUGFIX: rename videoragqna to videoqna to align with other examples(e102291)
- Fix megaservice ulimit issue under high concurrency(4112fd0)
CI/CD/UT
- Add new test cases for VisualQnA(995a62c)
- docker image cd workflow enhance (675ea4a)
- optimize image scan cd workflow(dba908a)
- Refine code scan output and remove opea_release_data.md.(21e215c)
- Fix other repo issue.(412a0b0)
- [DocIndexRetriever] Add xeon test and fix gaudi test (62dbb6d)
- watch more docker compose files’ changes(4b0bc26)
- fix typo in test script in AgentQnA(10fe3c6)
- Fix InstructionTuning and RerankFinetuning tests(be8e283)
- Fix issue(0bb0abb)
- print image build test commit(3ce3955)
- Fix SearchQnA tests bug(daf2a4f)
- [ProductivitySuite] Fix CD Issue(d55a33d)

GenAIComps

Cores
- Optimize mega flow by removing microservice wrapper(0bb69ac)
- Fix guardrails output handling logic for space, linebreak, and quote(e38ed6d)
- fix mismatched response format w/wo streaming guardrails(b6c0785)
Fine-tuning/Pre-training
- Added finetuned model deployment tutorial in readme(2931147)
- Add LLM pretraining support(58e9972)
- updates to containers for finetuning composite(f4d123c)
- enable embedding finetuning(7e1a2e5)
- update finetuning doc(7d2cd6b)
- Support rerank model finetuning(7d9265f)
- finetuning models limitation.(a924579)
- Update checkpoint format(8369fbf)
- update upload_training_files format(3367b76)
- refine logging code.(5b3053f)
LVM/Video RAG
- Fix lvms video-llama code issue(38abaab)
- Fix LVM streaming issue(fb4b8d2)
- Add schema to Redis initialization & Improve LVM-TGI For Multimodal Retriever Microservice(23cc3ea)
- Retriever and lvm update for multimodal rag on videos(1513998)
- BUG FIX: LVM security fix(3e548f3)
- Add Megaservice support for MMRAG VideoRAGQnA usecase(2c48bc8)
- Add local Rerank microservice for VideoRAGQnA(5fb4a38)
- Add Megaservice support for MMRAG - MultimodalRAGQnAWithVideos usecase(99be1bd)
- Bugfix for PR 496 to add format_video_name function(54aa943)
- Prediction Guard LVM component(1249c4f)
- Fix vLLM components images building(161c338)
LLM/Rerank/Retrieval
- fix vllm llamaindex stream bug(ca94c60)
- Support Llama index for llms native(2e41dcf)
- Prediction Guard LLM component(391c4a5)
- update vllm to latest version for hpu(599a58f)
- Align parameters for “max_token, repetition_penalty, presence_penalty, frequency_penalty”(3a31295)
- optimize rerank with backend ref(d76751a)
- add VDMS retriever microservice for v0.9 Milestone(445c9b1)
- Fix the Retriever README error(1d761fa)
- unify default reranking model with BAAI/bge-reranker-base(48d4e53)
- Fix Ollama langchain upgrade issue(8adbcce)
- vllm langchain: Add Document Retriever Support(0f2c2b1)
- Support Llama index for vLLM(8e3f553)
- Changes to comps/llms/text-generation/README(18092f3)
- Fix security problem(a672569)
DataPrep/vector stores
- Fix the loading error of jsonl file(2fbce3e)
- To avoid port conflicts change port to others.(89197e5)
- Dataprep fetch page fix(01886fe)
- Multimodal dataprep(6d4b668)
- Refine Dataprep Milvus MS(7686cfa)
- dataprep: Fix issue in uploading docx with embedding image(b873cf8)
- add: Pathway vector store and retriever as LangChain component(2c2322e)
- adding lancedb to langchain vectorstores(2360e5a)
- adding dataprep support for CLIP based models for VideoRAGQnA example for v1.0(f84d91a)
Other Components
- Fix intent detection code issue(4c0f527)
- clear some unnecessary scripts and Dockerfile commands.(824a7e2)
- Update CODEOWNERS(5537b7f)
- doc: fix heading levels in markdown content(a8a46bc)
- [Reorg] Reorg Folder to Support Different Vendors(bea9bb0)
- unify default reranking model with BAAI/bge-reranker-base(48d4e53)
- feedback_management: Remove ‘vim’ from Dockerfile(b2e64d2)
- switch to using upstream ‘tgi-gaudi’ on HuggingFace(90cc44f)
- Using Pip ‘--no-cache-dir’ within all Dockerfiles(f1f866f)
- Change image tag.(2093558)
- add code owners(0379aeb)
- Remove revision for TEI Embedding(d609071)
- BUGFIX: fix SearchedMultimodalDoc in docarray(ed44b44)
- Feedback management microservice component(72123b2)
- bump version into v1.0(9a1af76)
- Add Scan Container.(0d49244)
- Remove ‘vim’ from all Dockerfiles(25174c0)
- update image build yaml(b541fd8)
- ollama: Update curl proxy.(f510b69)
- Embedding Runtime on NeuralSpeed(0292355)
- add microservice for intent detection(84a7e57)
- Update README.md for Multiplatforms(ef90fbb)
- doc: fix heading levels(f8f8854)
- Prediction Guard embeddings component(191061b)
- [ChatQnA] Support K8S Python Client to export ChatQnA E2E manifests(af4e0f8)
- Add Megaservice support for MMRAG VideoRAGQnA usecase(2c48bc8)
- replace langchain/langchain:latest with python:3.11-slim(6ce6551)
- Support for UI of MultimodalRAGWithVideos in GenAIExamples(7664578)
- Remove fixed version in requirements.txt(f416f84)
- Update README.md for broken/missing readme(00227b8)
- adding embedding support for CLIP based models for VideoRAGQnA example for v0.9(2a53e25)
- same PR as #694 but on main branch(4b5d85b)
- doc: Fix headings(f6ae4fa)
- Fix all the microservices affected by the langchain version upgrade(04385c9)
- update version freeze for requirements-runtime.txt(1e4c382)
- add contributing section to main readme(2ba3516)
- Update embedding svc test port number(574fecf)
- Enable GraphRAG with Neo4J(29fe569)
- Refine READMEs after reorg(7e40475)
- Support export megaservice yaml to docker compose file(cff0a4d)
- Rename videoragqna to videoqna to align with other examples(2b68323)
- Update example name into MultimodalQnA and update image names(2ca56f3)
- Fix Reorg Issues(a3da7c1)
- Move neuralspeed embedding rerank and vllm-xft to catalog(98c62a0)
- fix ragagent text generator bug(42cde68)
- Add Bias Detection Microservice(812c85c)
- Update README.md of Table in markdown(849cac9)
- update dependency version(4eee716)
CI/CD/UT
- add PREDICTIONGUARD_API_KEY for CI(94eb60f)
- update CI test log archive(960f66c)
- expand CI timeout(6c24078)
- image scan and publish cd enhance(341f97a)
- add resume finetuning checkpoint ut.(c718602)
- Bug_fix.(2a91903)
- Optimize the content of the alerts.(8a11413)
- Add compose file.(7a21d09)
- Remove duplicate code(8325d5d)
- Fix image build fail issue.(3ce387a)
- Bug fix(12fd97a)
- enhance image publish job(9007212)
- Dockerfile check(2705e93)
- Make the scanning method optional.(ae71eee)
- Modify output messages.(3e87c3b)
- minor fix for CI detect(1785149)
- Add OpenAI client access OPEA microservice UT cases(1b69897)
- optimize ci test scope(4165c7d)
- Fixed CI yaml(3ac391a)
- Move finetuning test script path(267fb02)
- Add E2E test for bias detection of guardrails(e29865e)
- Add hyperlinks and paths validation.(ccdd2d0)
- Update manual test.(2794abd)
- Opt filecheck(61b8fa9)
- update ci action(b4a7f26)
- update image build compose(3d00a33)
- Adding Bias Detection Container to CI(6617e22)
- update cd workflow(3c5fc80)
- update torch cpu installation(0458443)
- Fix error.(887ca75)
- temp remove dockerfile check(2d5130f)

GenAIEvals

Accuracy
- add audioqna asr wer eval scripts(cf8bd83)
- update llm-as-judge doc.(102fcdd)
- [v1.0] Add docker metric support(cff0a36)
- fix issue because of ragas changes(6abbe40)
- Add README for codegen acc test.(77bb66c)
- Update chatqna input to fix input length(4f46a12)
- Support bigcode eval for codegen v0.1(02b60b5)
- Add FaqGen Accuracy scripts & Refine Ragas(4df6438)
- update rag_eval readme(425b423)
- fix bigcode version when python>=3.11(1d3a502)
- add acc tuning script.(a6fd418)
Performance
- [ChatQnA] Support the replica tuning for ChatQnA(484b69a)
- Fix rerank benchmark script(8edda1c)
- Support service-list for metrics collection in benchmark.py(58502c5)
- Support benchmark file for w/o rerank pipeline(17d35e3)
- Update configuration in benchmark README(514a6d6)
- Support P50, P90, P99 for next token latency(6ac555c)
- Support microservice level benchmark(626d269)
- Support stresscli for codegen(907dc19)
- Align llm microservice parameters with end to end test(476a327)
- Fix microservice level benchmark issue(211b560)
- Add benchmark part into top README(ac52f79)
- Add CRAG benchmark(a9b087f)
- add file for w/o rerank(17d35e3)
- add bench-target as the prefix of output folder(3f0ceaf)
Others
- doc: fix headings and indents(65a0a5b)
- doc: add title to new FaqGen README(52a540d)
- add code owners(047c479)
- doc: fix heading level(d5dbbf0)
- doc: fix JSON example(7318fb8)
- Update CODEOWNERS(4db9fb3)
- doc: update platform optimization document(d982681)
- remove examples.(340f507)
- Add hyperlinks and paths validation(df58fe5)
- Remove useless file(0af532a)

GenAIInfra

GMC
- GMC: Add a CR for switch mode on one NV GPU card(02412e7)
- Update the GMC README based on current changes.(6f7a24e)
- fix GMC crashes in e2e (5a2b306)
- Add unit test for new function in GMC router(0343a2f)
- GMC: add UT for reconcile filters(6442127)
- Enable gmc build workflow on push(19fe1a2)
- Doc: Fix some typos to run GMC more smoothly(59000c5)
- Improve the performance of GMC router(68a2011)
- GMC: enhance log(a18404e)
HelmChart
- e2e helm chart: Add ui for codegen/codetrans/docsum(267d828)
- helm: Add guardrails llama_guard support(8206a8c)
- Enable guardrail case in helm e2e tests(491c2e2)
- helm chart: add nginx to avoid CORS issue(353f3a5)
- helm-chart/common: Add logging config for service components(b80ae50)
- helm-chart/data-prep: Add the missing config for dataprep-redis(b70b914)
- helm: use latest image tag on main branch(65b04dc)
- helm/manifest: Update to release v0.9(182183e)
- Add topologySpreadConstraints support(af9e1b6)
- Add TGI additional options(bf10bdd)
- Add vLLM inference engine support(0094f52)
- Remove unused values and change GenAIExamples default(26f9b16)
- ‘ghcr.io/huggingface/text-generation-inference:sha-e4201f4-intel-cpu’ is intel cpu(c84ac4c)
Documentation
- add code owner(59ce505)
- doc: fix headings and indenting(c10bca1)
- doc: fix headings, spelling, inter-doc references(22d012e)
- doc: fix image references(0a3e006)
- Add docs for all 3 use cases of ChatQnA examples and change models for switch case(987870f)
- doc: restructure authN-authZ directory(b9bc034)
- Update README(9480afc)
- doc: fix markdown issues(a339a87)
- Doc: Fix broken links(032ddbc)
- Enhance helm chart repo usage in README(0de5535)
- Create troubleshooting.md(d55ded4)
Others
- Fix CI bug #417(56d7d5d)
- disable hpa-values test in chart e2e in CI(9b38302)
- Add unit test for memory bandwidth exporter.(43adcc6)
- Enable unit test for memory-bandwidth-exporter in CI(923c1f3)
- add Observability for OPEA(8d304ac)
- fix a bad commit in #383(406bbc2)
- Add dataprep CR for NV platform(fa9788d)
- Add memory bandwidth exporter for AI workload.(9107af9)
- authN-authZ: update configs(0f5cef1)
- E2E: exclude terminating pods when wait_util_all_pod_ready(39fb55e)
- Add gateway guardrails(b22fc52)
- fix #314(f9204f0)
- v0.9 charts release(b2328b8)
- Restructure the directory of config sample and update the e2e test(326a637)
- Enhance ut(96cd929)
- improve cd workflows and add release document(a4398b0)
- Add HPA support to ChatQnA(cab7a88)
- Add some NVIDIA platform support docs and scripts(cad2fc3)
- Expose options of memory bandwidth exporter in k8s manifests and docker for user configuration(2517e79)
- Update the image version for ChatQnA examples(593458c)
- Update top level README(b224b65)
- Enable OIDC based Authentication with apisix(ee907d6)
- HPA improvements(8d86fff)
- authn-authz: fix CORS issue and refine doc(994250c)
- Add hyperlinks and paths validation(d8cd3a1)