# LM-Eval Microservice

This microservice is built on [lm-eval](https://github.com/EleutherAI/lm-evaluation-harness) and hosts a separate LLM server for evaluating `lm-eval` tasks.

## CPU service

### build the cpu docker image

```
docker build -f Dockerfile.cpu -t opea/lm-eval:latest .
```

### start the server

- set the environment variables `MODEL`, `MODEL_ARGS`, and `DEVICE`, then start the server

```
docker run -p 9006:9006 --ipc=host -e MODEL="hf" -e MODEL_ARGS="pretrained=Intel/neural-chat-7b-v3-3" -e DEVICE="cpu" opea/lm-eval:latest
```

### evaluate the model

- set `base_url` to the address of the running server and `tokenizer` to the tokenizer of the served model (a reachability check is sketched at the end of this README)

```
git clone https://github.com/opea-project/GenAIEval
cd GenAIEval
pip install -e .
cd GenAIEval/evaluation/lm_evaluation_harness/examples

python main.py \
    --model genai-hf \
    --model_args "base_url=http://{your_ip}:9006,tokenizer=Intel/neural-chat-7b-v3-3" \
    --tasks "lambada_openai" \
    --batch_size 2
```
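
### check that the service is reachable

Before running the evaluation, it helps to confirm that the container is up and that port 9006 is reachable from the machine that will run `main.py`. The snippet below is a minimal sanity check: it only verifies that something is listening on the published port (the service's internal endpoint paths are not documented here, so it does not validate the API itself).

```
# list running containers started from the image built above
docker ps --filter "ancestor=opea/lm-eval:latest"

# probe the published port; any HTTP response (even an error status)
# means the server process is listening
curl -v http://{your_ip}:9006/ -o /dev/null
```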
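A note on `{your_ip}` in the evaluation and curl commands above: because the client talks to the server over HTTP, it must be an address that is routable from wherever `main.py` runs; `localhost` only works when the client and the container share the same host. On Linux, one common way to pick up the host's primary address (an assumption about your network setup, not part of the service) is:

```
# take the first address reported by the host;
# adjust if the machine has multiple interfaces
export host_ip=$(hostname -I | awk '{print $1}')
echo "base_url=http://${host_ip}:9006"
```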