OPEA Release Notes v0.8

What’s New in OPEA v0.8

Broaden functionality
- Add a GenAI example for frequently asked questions (FAQ) generation
- Expand LLM support to models such as Llama3.1 and Qwen2, and support LVMs such as LLaVA
- Enable end-to-end performance and accuracy benchmarking
- Support the experimental Agent microservice
- Support LLM serving on Ray
Multi-platform support
- Release Docker images of GenAI components on the OPEA Docker Hub and support deployment with Docker
- Support cloud-native deployment through Kubernetes manifests and GenAI Microservices Connector (GMC)
- Enable experimental authentication and authorization support using JWT tokens
- Validate ChatQnA on multiple platforms such as Xeon, Gaudi, AIPC, Nvidia, and AWS
OPEA Docker Hub: https://hub.docker.com/u/opea

Details

GenAIExamples

ChatQnA
- Add ChatQnA instructions for AIPC(26d4ff)
- Adapt Vllm response format (034541)
- Update tgi version(5f52a1)
- Update README.md(f9312b)
- Update ChatQnA docker compose for Dataprep Update(335362)
- [Doc] Add valid micro-service details(e878dc)
- Updates for running ChatQnA + Conversational UI on Gaudi(89ddec)
- Fix win PC issues(ba6541)
- [Doc]Add ChatQnA Flow Chart(97da49)
- Add guardrails in the ChatQnA pipeline(955159)
- Fix a minor bug for chatqna in docker-compose(b46ae8)
- Support vLLM/vLLM-on-Ray/Ray Serve for ChatQnA(631d84)
- Added ChatQnA example using Qdrant retriever(c74564)
- Update TEI version v1.5 for better performance(f4b4ac)
- Update ChatQnA upload feature(598484)
- Add auto truncate for embedding and rerank(8b6094)
Deployment
- Add Kubernetes manifest files for deploying DocSum(831463)
- Update Kubernetes manifest files for CodeGen(2f9397)
- Add Kubernetes manifest files for deploying CodeTrans(c9548d)
- Updated READMEs for Kubernetes example pipelines(c37d9c)
- Update all examples yaml files of GMC in GenAIExample(290a74)
- Doc: fix minor issue in GMC doc(d99461)
- README for installing 4 workloads using helm chart(6e797f)
- Update Kubernetes manifest files for deploying ChatQnA(665c46)
- Add new example of SearchQnA for GenAIExample(21b7d1)
- Add new example of Translation for GenAIExample(d0b028)
Other examples
- Update reranking microservice dockerfile path (d7a5b7)
- Update tgi-gaudi version(3505bd)
- Refine README of Examples(f73267)
- Update READMEs(8ad7f3)
- [CodeGen] Add codegen flowchart(377dd2)
- Update audioqna image name(615f0d)
- Add auto-truncate to gaudi tei (8d4209)
- Update visualQnA Chinese version(497895)
- Fix Typo for Translation Example(95c13d)
- FAQGen Megaservice(8c4a25)
- Code-gen-react-ui(1b48e5)
- Added doc sum react-ui(edf0d1)
CI/UT
- Frontend failed with unknown timeout issue (7ebe78)
- Adding Chatqna Benchmark Test(11a56e)
- Expand tgi connect timeout(ee0dcb)
- Optimize gmc manifest e2e tests(15fc6f)
- Add docker compose yaml print for test(bb4230)
- Refactor translation ci test (b7975e)
- Refactor searchqna ci test(ecf333)
- Translate UT for UI(284d85)
- Enhance the codetrans e2e test(450efc)
- Allow gmc e2e workflow to get secrets(f45f50)
- Add checkout ref in gmc e2e workflow(62ae64)
- SearchQnA UT(268d58)

GenAIComps

Cores
- Support https for microservice(2d6772)
- Enlarge megaservice request timeout for supporting high concurrency(876ca5)
- Add dynamic DAG(f2995a)
LLM
- Optional vllm microservice container build(963755)
- Refine vllm instruction(6e2c28)
- Introduce ‘entrypoint.sh’ for some Containers(9ecc5c)
- Support llamaindex for retrieval microservice and remove langchain(61795f)
- Update tgi with text-generation-inference:2.1.0(f23694)
- Fix requirements(f4b029)
- Add vLLM on Ray microservice(ec3b2e)
- Update code/readme/UT for Ray Serve and VLLM(dd939c)
- Allow the Ollama microservice to be configurable with different models(2458e2)
- LLM performance optimization and code refine(6e31df)
DataPrep
- Support get/delete file in Dataprep Microservice(5d0842)
- Dataprep | PGVector : Added support for new changes in utils.py(54eb7a)
- Enhance the dataprep microservice by adding separators(ef97c2)
- Freeze python-bidi==0.4.2 for dataprep/redis(b4012f)
- Support delete data for Redis vector db(967fdd)
Other Components
- Remove ingest in Retriever MS(d25d2c)
- Qdrant retriever microservice(9b658f)
- Update milvus service for dataprep and retriever(d7cdab)
- Architecture specific args for a few containers(1dd7d4)
- Update driver compatible image(1d4664)
- Fix Llama-Guard-2 issue(6b091c)
- Embeddings: adaptive detect embedding model arguments in mosec(f164f0)
- Architecture specific args for langchain guardrails(5e232a)
- Fix requirements install issue for reranks/fastrag(94e807)
- Update to remove warnings when building Dockerfiles(3e5dd0)
- Initiate Agent component(c3f6b2)
- Add FAQGen gateway in core to support FAQGen Example(9c90eb)
- Prompt registry(f5a548)
- Chat History microservice for chat data persistence(30d95b)
- Align asr output and llm input without using orchestrator(64e042)
- Doc: add missing in README.md codeblock(2792e2)
CI/UT
- Fix duplicate ci test(33f37c)
- Build and push new docker images into registry(80da5a)
- Update image build for gaudi(fe3d22)
- Add guardrails ut(556030)

GenAIEvals

- Update lm-eval to 0.4.3(89c825)
- Add toxicity/bias/hallucination metrics(48015a)
- Support stress benchmark test(59cb27)
- Add rag related metrics(83ad9c)
- Added CRUD Chinese benchmark example(9cc6ca)
- Add MultiHop English benchmark accuracy(8aa1e6)

GenAIInfra

GMC
- Enable image build on push for gmc(f8a295)
- Revise workflow to support gmc running in kind(a2dc96)
- Enable GMC system installation on push(af2d0f)
- Enhance the switch mode for GMC router service required(f96b0e)
- Optimize GMC e2e scripts(27a062)
- Optimize app namespaces and fix some typos in gmc e2e test(9c97fa)
- Add GMC into README(b25c0b)
- Gmc: add authN & authZ support on fake JWT token(3756cf)
- GMC: adopt new common/manifests(b18531)
- Add new example of searchQnA on both xeon and gaudi(883c8d)
- Support switch mode in GMC for MI6 team(d11aeb)
- Add translation example into GMC(6235a9)
- Gmc: add authN & authZ support on keycloak(3d139b)
- GMC: Support new component(4c5a51)
- GMC: update README(d57b94)
HelmChart
- Helm chart: change default global.modelUseHostPath value(8ffc3b)
- Helm chart: Add readOnlyRootFilesystem to securityContext(9367a9)
- Update chatqna with additional dependencies(009c96)
- Update codegen with additional dependencies(d41dd2)
- Make endpoints configurable by user(486023)
- Add data prep component(384931)
- The microservice port number is not configurable(fbaa6a)
- Add MAX_INPUT_TOKENS to tgi(2fcbb0)
- Add script to generate yaml files from helm-charts(6bfe31)
- Helm: support adding extra env from external configmap(7dabdf)
- Helm: expose dataprep configurable items into value file(83fc1a)
- Helm: upgrade version to 0.8.0(b3cbde)
- Add whisper and asr components(9def61)
- Add tts and speecht5 components helm chart(9d1465)
- Update the script to generate comp manifest(ab53e9)
- Helm: remove unused Probes(c1cff5)
- Helm: Add tei-gaudi support(a456bf)
- Helm redis-vector-db: Add missings in value file(9e15ef)
- Helm: Use empty string instead of null in value files(6151ac)
- Add component k8s manifest files(68483c)
- Add helm test for chart redis-vector-db(236381)
- Add helm test for chart tgi(9b5def)
- Add helm test for chart tei(f5c7fa)
- Add helm test for chart teirerank(00532a)
- Helm test: Make curl fail if http_status > 400 returned(92c4b5)
- Add helm test for chart embedding-usvc(a98561)
- Add helm test for chart llm-uservice(f4f3ea)
- Add helm test for chart reranking-usvc(397208)
- Add helm test for chart retriever-usvc(6db408)
- Helm: Support automatic installation of dependency charts(dc90a5)
- Helm: support removing helm dependency(fbdb1d)
- Helm: upgrade tgi chart(c3a1c1)
- Helm/manifest: update tei config for tei-gaudi(88b3c1)
- Add CodeTrans helm chart(5b05f9)
- Helm: Update chatqna to latest(7ff03b)
- Add DocSum helm chart(b56116)
- Add docsum support for helm test(f6354b)
- Helm: Update codegen to latest(419e5b)
- Fix codegen helm chart readme(b4b28e)
- Disable runAsRoot for speecht5 and whisper(aeef78)
- Use upstream tei-gaudi image(e4d3ff)
Others
- Enhance the e2e test for GenAIInfra to fix some bugs(602af5)
- Fix bugs for router on handling response from pipeline microservices(ef47f9)
- Improve the examples of codegen and codetrans e2e test(07494c)
- Remove the dependencies of common microservices(f6dd87)
- Add scripts for KubeRay and Ray Cluster(7d3d13)
- Enable CI for common components(9e27a0)
- Disable common component test(e1cd50)
- CI for common: avoid false error in helm test result(876b7a)
- Add the init input for pipeline to keep the parameter information(e25a1f)
- Adjust CI gaudi version(d75d8f)
- Fix CHART_MOUNT and HFTOKEN for CI(10b908)
- Change tgi tag because gaudi driver is upgraded to 1.16.1 (6796ef)
- Update README for new manifests(ec32bf)
- Support multiple router service in one namespace(0ac732)
- Improve workflow trigger conditions to be more precise(ab5c8d)
- Remove unnecessary component DocSumGaudi which would cause error(9b973a)
- Remove chart_test scripts and add script to dump pod status(88caf0)

Thanks to these contributors

We would like to thank everyone who contributed to the OPEA project. Here are the contributors: