OPEA Release Notes v1.0

What’s New in OPEA v1.0

  • Highlights

    • Improve RAG performance through microservice optimizations (e.g., Hugging Face TGI, vLLM) and megaservice tuning

    • Provide experimental LLM training support, including full fine-tuning and parameter-efficient fine-tuning (PEFT)

    • Improve RAG with Knowledge Graph based on Neo4j

    • Improve VisualQnA and provide multi-modality RAG support

    • Launch microservices faster by removing some dispatch overhead

    • Enable Gateway with guardrail, and integrate nginx with CORS protection and data preparation

    • Enable HorizontalPodAutoscaler (HPA) for better resource management

    • Define the metrics of RAG performance and enable accuracy evaluation for more GenAI examples

    • Further improve documentation and developer experience

  • Other features

    • Enable OpenAI compatible format on applicable microservices

    • Support microservice launch from ModelScope to address the needs of the China ecosystem

    • Support Red Hat OpenShift Container Platform (RHOCP)

    • Refactor the code and CI/CD pipeline to provide better support for contributors

    • Improve Docker versioning to avoid potential conflicts

    • Enhance GenAI Microservice Connector (GMC) with router performance optimizations and other updates that improve scaling

    • Introduce Memory Bandwidth Exporter that integrates with Kubernetes Node Resource Interface

  • Learn more about OPEA at

    • Getting Started: https://opea-project.github.io/latest/index.html

    • GitHub: https://github.com/opea-project

    • Docker Hub: https://hub.docker.com/u/opea

  • Release Documentation:

    • Landing Page: https://opea.dev/

    • Release Notes: https://github.com/opea-project/docs/tree/main/release_notes

Details

GenAIExamples
  • Deployment

    • Add ui/nginx support in K8S manifest for ChatQnA/CodeGen/CodeTrans/Docsum(ba94e01)

    • K8S manifest: Update ChatQnA/CodeGen/CodeTrans/DocSum(0629696)

    • Update mount path in xeon k8s(2a6af64)

    • Add Nginx - k8s manifest in CodeTrans(6a679ba)

    • Add Nginx - docker in CodeTrans(cc84847)

    • watch more docker compose files changes(4b0bc26)

    • Add chatQnA UI manifest(758d236)

    • Revert the LLM model for kubernetes GMS(f5f1e32)

    • [ChatQnA] Update retrieval & dataprep manifests(6730b24)

    • [ChatQnA]Update manifests(3563f5d)

    • [ChatQnA] Update benchmarking manifests(36fb9a9)

    • [ChatQnA] Update OOB & Tuned manifests(ac34860)

    • Add nginx and UI to the ChatQnA manifest(05f9828)

    • [ChatQnA] Update OOB with wrapper manifests.(933c3d3)

    • [Translation] Support manifests and nginx(1e13031)

    • update V1.0 benchmark manifest (e5affb9)

    • update image name(e2a74f7)

    • Change megaservice path in line with new file structure(5ab27b6)

    • Yaml: add comments to specify gaudi device ids.(63406dc)

    • add tgi bf16 setup on CPU k8s.(ba17031)

  • Documentation

    • [ChatQnA] Update README for ModelScope(aebc23f)

    • Update README.md(4bd7841)

    • [ChatQnA] Update README for without Rerank Pipeline(6b617d6)

    • [ChatQnA] Update Benchmark README for w/o rerank(4a51874)

    • Fix readme for nv gpu(43b2ae5)

    • [ChatQnA] Update Benchmark README to Fix Input Length(55d287d)

    • Refine ChatQnA README for TGI(afc3341)

    • Add default model for VisualQnA README(07baa8f)

    • Update readme for manifests of some examples(adb157f)

    • doc: use markdown table in supported_examples(9cf1d88)

    • doc: remove invalid code block language(c6d811a)

    • add AudioQnA readme with supported model(f4f4da2)

    • add more code owners(7f89797)

    • doc: fix headings(7a0fca7)

    • [Codegen] Refine readme to prompt users on how to change the model.(814164d)

    • Update README.md and remove some open-source details(2ef83fc)

    • Add issue template(84a781a)

    • doc: fix headings and indenting(67394b8)

    • Add default model in readme for FaqGen and DocSum(d487093)

    • Change docs of kubernetes for curl commands in README(4133757)

    • Update v0.9 RAG release data(947936e)

    • Explain Default Model in ChatQnA and CodeTrans READMEs(2a2ff45)

    • Update docker images list.(a8244c4)

    • refactor the network port setting for AWS(bc81770)

    • Add validate microservice details link(bd811bd)

    • [ChatQnA] Add Nginx in Docker Compose and README(6c36448)

    • [Doc] Update CodeGen and Translation READMEs(a09395e)

    • [Doc] Refine READMEs(372d78c)

    • Remove marketing materials(d85ec09)

    • doc PR to main instead of v1.0r(dc94026)

    • Update README.md for Multiplatforms(b205dc7)

    • Refine the quick start of ChatQnA(3b70fb0)

    • Update supported_examples(96d5cd9)

    • [Doc] doc improvement(e0b3b57)

    • Fix README issues(bceacdc)

    • doc: fix broken image reference and markdown(d422929)

    • doc: give document meaningful title(a3fa0d6)

    • doc: fix incorrect path to png image files (d97882e)

    • update doc according to comments(f990f79)

    • refine readme for reorg(d2bab99)

    • Update README with new examples(2d28beb)

    • README: fix broken links(ff6f841)

    • Update README.md of pdf file(87e51d5)

    • Add table to list port, endpoint, framework, model, serving, and hardware for each microservice in ChatQnA(1a934af)

    • Update SearchQnA document and compose.yaml(5c67204)

    • Update invalid link(7b2194f)

    • AgentQnA: Fix erroneous link in the README(1144fae)

    • Fix Xeon reference per its trademark(e1b8ce0)

    • Provide the method to get nke-10k-2023.pdf(a2745b2)

    • adopted tech writing style(558ea3b)

    • Improve ChatQnA flowchart according to feedback(375ea7a)

    • Fix BACKEND_SERVICE_ENDPOINT variable value in the VideoQnA instructions(79e947e)

    • [Doc] Refine ChatQnA README(7eaab93)

  • Functionalities and Bug Fix

    • Fix refactor bug(7c13f2c)

    • Provide the method to get nke-10k-2023.pdf(a2745b2)

    • Integrate visualQnA backend(fa12083)

    • Enable nginx for VisualQnA(def19b4)

    • Add Settings and Update system Prompt option(1d1e1f9)

    • Refactor folder to support different vendors(d73129c)

    • Add rerank finetuning example(71857f5)

    • remove logs for benchmark(e0bc5f2)

    • update image build for 2 new examples(0869029)

    • fix comps/nginx image build content(22d066a)

    • react-ui: Add support to display Chinese(8c40204)

    • [VisualQnA] Update compose.yaml to fix the endpoint url issue in UI(fbaa024)

    • Add megaservice definition without microservice wrappers(ebe6b47)

    • Add instruction tuning example(4c78f8c)

    • fix token name(1e47444)

    • Modify the handling of detected warnings to only prompt.(e6f5d13)

    • Always upload scan artifacts(6f3e54a)

    • Update ChatQnA env (32afb65)

    • Yinghu5 patch 1(beda609)

    • Update ollama run command(10c81f1)

    • weekly update images tag(035f39f)

    • Fix port conflict in llava-tgi-service in VisualQnA(993688a)

    • Remove ‘vim’ from all Dockerfiles(1874dfd)

    • enhance image publish action(5fde666)

    • Update port in set_env.sh for TGI endpoint(e5ec38c)

    • move evaluation scripts(f04f061)

    • Handle uncontrolled data path for MultimodalQnA v1.0 release(872e93e)

    • Align parameters for “max_token, repetition_penalty,presence_penalty,frequency_penalty”(2f03a3a)

    • Remove useless folder.(88829c9)

    • fix path bug for reorg(264759d)

    • fix reorg bug(504228e)

    • Add hyperlinks picture paths validation.(0611707)

    • Added gaudi example for rerank model finetuning(edcc50f)

    • Add VideoRAGQnA as MMRAG usecase in Example(2dd69dc)

    • Agent example for v1.0 release(262a6f6)

    • Fix issues with the VisualQnA instructions (bc4bbfa)

    • Made codegen react ui use runtime environment variables(b84c989)

    • add image build for new examples(3f2e7b7)

    • fix image build issue on push(88fde62)

    • [ChatQnA] Add no_wrapper benchmarking and update legacy manifests(06696c8)

    • Add imagePrompt to display default image hint(e48532e)

    • BUGFIX: rename videoragqna to videoqna to align with other examples(e102291)

    • Fix megaservice ulimit issue under high concurrency(4112fd0)

  • CI/CD/UT

    • Add new test cases for VisualQnA(995a62c)

    • docker image cd workflow enhance (675ea4a)

    • optimize image scan cd workflow(dba908a)

    • Refine code scan output and remove opea_release_data.md.(21e215c)

    • Fix other repo issue.(412a0b0)

    • [DocIndexRetriever] Add xeon test and fix gaudi test (62dbb6d)

    • watch more docker compose files’ changes(4b0bc26)

    • fix typo in test script in AgentQnA(10fe3c6)

    • Fix InstructionTuning and RerankFinetuning tests(be8e283)

    • Fix issue(0bb0abb)

    • print image build test commit(3ce3955)

    • Fix SearchQnA tests bug(daf2a4f)

    • [ProductivitySuite] Fix CD Issue(d55a33d)

GenAIComps
  • Cores

    • Optimize mega flow by removing microservice wrapper(0bb69ac)

    • Fix guardrails output handling logic for spaces, line breaks, and quotes(e38ed6d)

    • fix mismatched response format w/wo streaming guardrails(b6c0785)

  • Fine-tuning/Pre-training

    • Added finetuned model deployment tutorial in readme(2931147)

    • Add LLM pretraining support(58e9972)

    • updates to containers for finetuning composite(f4d123c)

    • enable embedding finetuning(7e1a2e5)

    • update finetuning doc(7d2cd6b)

    • Support rerank model finetuning(7d9265f)

    • finetuning models limitation.(a924579)

    • Update checkpoint format(8369fbf)

    • update upload_training_files format(3367b76)

    • refine logging code.(5b3053f)

  • LVM/Video RAG

    • Fix lvms video-llama code issue(38abaab)

    • Fix LVM streaming issue(fb4b8d2)

    • Add schema to Redis initialization & Improve LVM-TGI For Multimodal Retriever Microservice(23cc3ea)

    • Retriever and lvm update for multimodal rag on videos(1513998)

    • BUG FIX: LVM security fix(3e548f3)

    • Add Megaservice support for MMRAG VideoRAGQnA usecase(2c48bc8)

    • Add local Rerank microservice for VideoRAGQnA(5fb4a38)

    • Add Megaservice support for MMRAG - MultimodalRAGQnAWithVideos usecase(99be1bd)

    • Bugfix for PR 496 to add format_video_name function(54aa943)

    • Prediction Guard LVM component(1249c4f)

    • Fix vLLM components images building(161c338)

  • LLM/Rerank/Retrieval

    • fix vllm llamaindex stream bug(ca94c60)

    • Support Llama index for llms native(2e41dcf)

    • Prediction Guard LLM component(391c4a5)

    • update vllm to latest version for hpu(599a58f)

    • Align parameters for “max_token, repetition_penalty,presence_penalty,frequency_penalty”(3a31295)

    • optimize rerank with backend ref(d76751a)

    • add VDMS retriever microservice for v0.9 Milestone(445c9b1)

    • Fix the Retriever README error(1d761fa)

    • unify default reranking model with BAAI/bge-reranker-base(48d4e53)

    • Fix Ollama langchain upgrade issue(8adbcce)

    • vllm langchain: Add Document Retriever Support(0f2c2b1)

    • Support Llama index for vLLM(8e3f553)

    • Changes to comps/llms/text-generation/README(18092f3)

    • Fix security problem(a672569)

  • DataPrep/vector stores

    • Fix the loading error of jsonl file(2fbce3e)

    • Change ports to avoid port conflicts.(89197e5)

    • Dataprep fetch page fix(01886fe)

    • Multimodal dataprep(6d4b668)

    • Refine Dataprep Milvus MS(7686cfa)

    • dataprep: Fix issue in uploading docx with embedding image(b873cf8)

    • add: Pathway vector store and retriever as LangChain component(2c2322e)

    • adding lancedb to langchain vectorstores(2360e5a)

    • adding dataprep support for CLIP based models for VideoRAGQnA example for v1.0(f84d91a)

  • Other Components

    • Fix intent detection code issue(4c0f527)

    • clear some unnecessary scripts and Dockerfile commands.(824a7e2)

    • Update CODEOWNERS(5537b7f)

    • doc: fix heading levels in markdown content(a8a46bc)

    • [Reorg] Reorg Folder to Support Different Vendors(bea9bb0)

    • unify default reranking model with BAAI/bge-reranker-base(48d4e53)

    • feedback_management: Remove ‘vim’ from Dockerfile(b2e64d2)

    • switch to using upstream ‘tgi-gaudi’ on HuggingFace(90cc44f)

    • Using Pip ‘--no-cache-dir’ within all Dockerfiles(f1f866f)

    • Change image tag.(2093558)

    • add code owners(0379aeb)

    • Remove revision for TEI Embedding(d609071)

    • BUGFIX: fix SearchedMultimodalDoc in docarray(ed44b44)

    • Feedback management microservice component(72123b2)

    • bump version into v1.0(9a1af76)

    • Add Scan Container.(0d49244)

    • Remove ‘vim’ from all Dockerfiles(25174c0)

    • update image build yaml(b541fd8)

    • ollama: Update curl proxy.(f510b69)

    • Embedding Runtime on NeuralSpeed(0292355)

    • add microservice for intent detection(84a7e57)

    • Update README.md for Multiplatforms(ef90fbb)

    • doc: fix heading levels(f8f8854)

    • Prediction Guard embeddings component(191061b)

    • [ChatQnA] Support K8S Python Client to export ChatQnA E2E manifests(af4e0f8)

    • Add Megaservice support for MMRAG VideoRAGQnA usecase(2c48bc8)

    • replace langchain/langchain:latest with python:3.11-slim(6ce6551)

    • Support for UI of MultimodalRAGWithVideos in GenAIExamples(7664578)

    • Remove fixed version in requirements.txt(f416f84)

    • Update README.md for broken/missing readme(00227b8)

    • adding embedding support for CLIP based models for VideoRAGQnA example for v0.9(2a53e25)

    • same PR as #694 but on main branch(4b5d85b)

    • doc: Fix headings(f6ae4fa)

    • Fix all the microservices affected by the langchain version upgrade(04385c9)

    • update version freeze for requirements-runtime.txt(1e4c382)

    • add contributing section to main readme(2ba3516)

    • Update embedding svc test port number(574fecf)

    • Enable GraphRAG with Neo4J(29fe569)

    • Refine READMEs after reorg(7e40475)

    • Support export megaservice yaml to docker compose file(cff0a4d)

    • Rename videoragqna to videoqna to align with other examples(2b68323)

    • Update example name into MultimodalQnA and update image names(2ca56f3)

    • Fix Reorg Issues(a3da7c1)

    • Move neuralspeed embedding rerank and vllm-xft to catalog(98c62a0)

    • fix ragagent text generator bug(42cde68)

    • Add Bias Detection Microservice(812c85c)

    • Update README.md of Table in markdown(849cac9)

    • update dependency version(4eee716)

  • CI/CD/UT

    • add PREDICTIONGUARD_API_KEY for CI(94eb60f)

    • update CI test log archive(960f66c)

    • expand CI timeout(6c24078)

    • image scan and publish cd enhance(341f97a)

    • add resume finetuning checkpoint ut.(c718602)

    • Bug_fix.(2a91903)

    • Optimize the content of the alerts.(8a11413)

    • Add compose file.(7a21d09)

    • Remove duplicate code(8325d5d)

    • Fix image build fail issue.(3ce387a)

    • Bug fix(12fd97a)

    • enhance image publish job(9007212)

    • Dockerfile check(2705e93)

    • Make the scanning method optional.(ae71eee)

    • Modify output messages.(3e87c3b)

    • minor fix for CI detect(1785149)

    • Add OpenAI client access OPEA microservice UT cases(1b69897)

    • optimize ci test scope(4165c7d)

    • Fixed CI yaml(3ac391a)

    • Move finetuning test script path(267fb02)

    • Add E2E test for bias detection of guardrails(e29865e)

    • Add hyperlinks and paths validation.(ccdd2d0)

    • Update manual test.(2794abd)

    • Opt filecheck(61b8fa9)

    • update ci action(b4a7f26)

    • update image build compose(3d00a33)

    • Adding Bias Detection Container to CI(6617e22)

    • update cd workflow(3c5fc80)

    • update torch cpu installation(0458443)

    • Fix error.(887ca75)

    • temp remove dockerfile check(2d5130f)

GenAIEvals
  • Accuracy

    • add audioqna asr wer eval scripts(cf8bd83)

    • update llm-as-judge doc.(102fcdd)

    • [v1.0] Add docker metric support(cff0a36)

    • fix issue because of ragas changes(6abbe40)

    • Add README for codegen acc test.(77bb66c)

    • Update chatqna input to fix input length(4f46a12)

    • Support bigcode eval for codegen v0.1(02b60b5)

    • Add FaqGen Accuracy scripts & Refine Ragas(4df6438)

    • update rag_eval readme(425b423)

    • fix bigcode version when python>=3.11(1d3a502)

    • add acc tuning script.(a6fd418)

  • Performance

    • [ChatQnA] Support the replica tuning for ChatQnA(484b69a)

    • Fix rerank benchmark script(8edda1c)

    • Support service-list for metrics collection in benchmark.py(58502c5)

    • Support benchmark file for w/o rerank pipeline(17d35e3)

    • Update configuration in benchmark README(514a6d6)

    • Support P50, P90, P99 for next token latency(6ac555c)

    • Support microservice level benchmark(626d269)

    • Support stresscli for codegen(907dc19)

    • Align llm microservice parameters with end to end test(476a327)

    • Fix microservice level benchmark issue(211b560)

    • Add benchmark part into top README(ac52f79)

    • Add CRAG benchmark(a9b087f)

    • add bench-target as the prefix of output folder(3f0ceaf)

  • Others

    • doc: fix headings and indents(65a0a5b)

    • doc: add title to new FaqGen README(52a540d)

    • add code owners(047c479)

    • doc: fix heading level(d5dbbf0)

    • doc: fix JSON example(7318fb8)

    • Update CODEOWNERS(4db9fb3)

    • doc: update platform optimization document(d982681)

    • remove examples.(340f507)

    • Add hyperlinks and paths validation(df58fe5)

    • Remove useless file(0af532a)

GenAIInfra
  • GMC

    • GMC: Add a CR for switch mode on one NV GPU card(02412e7)

    • Update the GMC README based on current changes.(6f7a24e)

    • fix GMC crashes in e2e (5a2b306)

    • Add unit test for new function in GMC router(0343a2f)

    • GMC: add UT for reconcile filters(6442127)

    • Enable gmc build workflow on push(19fe1a2)

    • Doc: Fix some typos to run GMC more smoothly(59000c5)

    • Improve the performance of GMC router(68a2011)

    • GMC: enhance log(a18404e)

  • HelmChart

    • e2e helm chart: Add ui for codegen/codetrans/docsum(267d828)

    • helm: Add guardrails llama_guard support(8206a8c)

    • Enable guardrail case in helm e2e tests(491c2e2)

    • helm chart: add nginx to avoid CORS issue(353f3a5)

    • helm-chart/common: Add logging config for service components(b80ae50)

    • helm-chart/data-prep: Add the missing config for dataprep-redis(b70b914)

    • helm: use latest image tag on main branch(65b04dc)

    • helm/manifest: Update to release v0.9(182183e)

    • Add topologySpreadConstraints support(af9e1b6)

    • Add TGI additional options(bf10bdd)

    • Add vLLM inference engine support(0094f52)

    • Remove unused values and change GenAIExamples default(26f9b16)

    • ‘ghcr.io/huggingface/text-generation-inference:sha-e4201f4-intel-cpu’ is intel cpu(c84ac4c)

  • Documentation

    • add code owner(59ce505)

    • doc: fix headings and indenting(c10bca1)

    • doc: fix headings, spelling, inter-doc references(22d012e)

    • doc: fix image references(0a3e006)

    • Add docs for all 3 use cases of ChatQnA examples and change models for switch case(987870f)

    • doc: restructure authN-authZ directory(b9bc034)

    • Update README(9480afc)

    • doc: fix markdown issues(a339a87)

    • Doc: Fix broken links(032ddbc)

    • Enhance helm chart repo usage in README(0de5535)

    • Create troubleshooting.md(d55ded4)

  • Others

    • Fix CI bug #417(56d7d5d)

    • disable hpa-values test in chart e2e in CI(9b38302)

    • Add unit test for memory bandwidth exporter.(43adcc6)

    • Enable unit test for memory-bandwidth-exporter in CI(923c1f3)

    • add Observability for OPEA(8d304ac)

    • fix a bad commit in #383(406bbc2)

    • Add dataprep CR for NV platform(fa9788d)

    • Add memory bandwidth exporter for AI workload.(9107af9)

    • authN-authZ: update configs(0f5cef1)

    • E2E: exclude terminating pods when wait_util_all_pod_ready(39fb55e)

    • Add gateway guardrails(b22fc52)

    • fix #314(f9204f0)

    • v0.9 charts release(b2328b8)

    • Restructure the directory of config sample and update the e2e test(326a637)

    • Enhance ut(96cd929)

    • improve cd workflows and add release document(a4398b0)

    • Add HPA support to ChatQnA(cab7a88)

    • Add some NVIDIA platform support docs and scripts(cad2fc3)

    • Expose options of memory bandwidth exporter in k8s manifests and docker for user configuration(2517e79)

    • Update the image version for ChatQnA examples(593458c)

    • Update top level README(b224b65)

    • Enable OIDC based Authentication with apisix(ee907d6)

    • HPA improvements(8d86fff)

    • authn-authz: fix CORS issue and refine doc(994250c)

    • Add hyperlinks and paths validation(d8cd3a1)