OPEA Release Data¶

This page shows the benchmark data of GenAIExamples. More data for different examples will be submitted in the future release.

ChatQnA¶

Docker Images for Test
opea/embedding-tei:v0.9
ghcr.io/huggingface/text-embeddings-inference:cpu-1.5
opea/llm-tgi:v0.9
ghcr.io/huggingface/tgi-gaudi:2.0.1
opea/dataprep-redis:v0.9
redis/redis-stack:7.2.0-v9
opea/reranking-tei:v0.9
opea/tei-gaudi:v0.9
opea/retriever-redis:v0.9
opea/chatqna:v0.9

System Summary:
1-node, 2x Intel(R) Xeon(R) Platinum 8380 CPU @ 2.30GHz, 40 cores, 270W TDP, HT On, Turbo On, NUMA 2, Integrated Accelerators Available [used]: DLB 0 [0], DSA 0 [0], IAA 0 [0], QAT 0 [0], Total Memory 1024GB (32x32GB DDR4 3200 MT/s [3200 MT/s]), BIOS ETM02, microcode 0xd0003b9, 8x Habana Labs Ltd., 4x MT28800 Family [ConnectX-5 Ex], 4x 7T INTEL SSDPF2KX076TZ, 2x 894.3G SAMSUNG MZ1L2960HCJR-00A07, Ubuntu 22.04.3 LTS, 5.15.0-92-generic. Software: WORKLOAD+VERSION, COMPILER, LIBRARIES, OTHER_SW. Test by Intel as of 08/20/24.

Performance Data¶

1Node E2E Performance (Sec)	Gaudi nodes	Concurrency	Input	Output	Average Latency	P90 Total latency
OOB w/o Reranking	1	128	128	128	5.597	7.59
OOB w/ Reranking	1	128	128	128	6.003	8.123

2Nodes E2E Performance (Sec)	Gaudi nodes	Concurrency	Input	Output	Average Latency	P90 Total latency
OOB w/o Reranking	2	256	128	128	7.05	9.122
OOB w/ Reranking	2	256	128	128	7.26	9.239

4Nodes E2E Performance (Sec)	Gaudi nodes	Concurrency	Input	Output	Average Latency	P90 Total latency
OOB w/o Reranking	4	512	128	128	16.293	21.169
OOB w/ Reranking	4	512	128	128	17.22	21.942

Go to Benchmark README for reproduce steps, tuned performance data will be released soon.

Accuracy Data¶

Test Case	Hits@10	Hits@4	MAP@10	MRR@10
Retrieval w/o Reranking	66.16%	49.80%	17.62%	39.75%
Retrieval w/ Reranking	72.28%	63.24%	24.97%	56.79%

Go to Accuracy README for reproduce steps.