Code Generation Example (CodeGen)

Table of Contents

Overview
Problem Motivation
Architecture
- High-Level Diagram
- OPEA Microservices Diagram
Deployment Options
Benchmarking
Automated Deployment using Terraform
Validated Configurations
Contribution

Overview

The Code Generation (CodeGen) example demonstrates an AI application that assists developers by generating code from natural language prompts or existing code context. It uses Large Language Models (LLMs) trained on large datasets of source code repositories and programming documentation.

This example showcases how developers can quickly deploy and utilize a CodeGen service, potentially integrating it into their IDEs or development workflows to accelerate tasks like code completion, translation, summarization, refactoring, and error detection.

Problem Motivation

Writing, understanding, and maintaining code can be time-consuming and complex. Developers often perform repetitive coding tasks, struggle with translating between languages, or need assistance understanding large codebases. CodeGen LLMs address this by automating code generation, providing intelligent suggestions, and assisting with various code-related tasks, thereby boosting productivity and reducing development friction. This OPEA example provides a blueprint for deploying such capabilities using optimized components.

Architecture

High-Level Diagram

The CodeGen application follows a microservice-based architecture enabling scalability and flexibility. User requests are processed through a gateway, which orchestrates interactions between various backend services, including the core LLM for code generation and potentially retrieval-augmented generation (RAG) components for context-aware responses.

[Figure: High-level architecture]
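As a rough illustration of how a client might reach the gateway, the sketch below builds (but does not send) an HTTP POST request using only the Python standard library. The host, the port `7778`, the path `/v1/codegen`, and the `messages` payload field are assumptions made for this example; check them against the deployment guide for your platform before relying on them.

```python
import json
import urllib.request


def build_codegen_request(
    prompt: str, host: str = "localhost", port: int = 7778
) -> urllib.request.Request:
    """Build a POST request for a CodeGen gateway.

    Assumptions (not confirmed by this document): the gateway listens on
    `port`, exposes `/v1/codegen`, and accepts a JSON body whose
    `messages` field holds the prompt.
    """
    url = f"http://{host}:{port}/v1/codegen"
    body = json.dumps({"messages": prompt}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


if __name__ == "__main__":
    req = build_codegen_request("Write a Python function that reverses a string.")
    print(req.get_method(), req.full_url)
    # To send it against a running deployment, uncomment:
    # with urllib.request.urlopen(req, timeout=60) as resp:
    #     print(resp.read().decode("utf-8"))
```

Keeping request construction separate from sending makes the payload easy to inspect or adapt if your deployment uses a different port or schema.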

OPEA Microservices Diagram

This example utilizes several microservices from the OPEA GenAIComps repository. The diagram below illustrates the interaction between these components for a typical CodeGen request, potentially involving RAG using a vector database.

flowchart LR
    %% Colors %%
    classDef blue fill:#ADD8E6,stroke:#ADD8E6,stroke-width:2px,fill-opacity:0.5
    classDef orange fill:#FBAA60,stroke:#ADD8E6,stroke-width:2px,fill-opacity:0.5
    classDef orchid fill:#C26DBC,stroke:#ADD8E6,stroke-width:2px,fill-opacity:0.5
    classDef invisible fill:transparent,stroke:transparent;
    style CodeGen-MegaService stroke:#000000

    %% Subgraphs %%
    subgraph CodeGen-MegaService["CodeGen-MegaService"]
        direction LR
        EM([Embedding MicroService]):::blue
        RET([Retrieval MicroService]):::blue
        RER([Agents]):::blue
        LLM([LLM MicroService]):::blue
    end
    subgraph User Interface
        direction LR
        a([Submit Query Tab]):::orchid
        UI([UI server]):::orchid
        Ingest([Manage Resources]):::orchid
    end

    CLIP_EM{{Embedding service}}
    VDB{{Vector DB}}
    V_RET{{Retriever service}}
    Ingest{{Ingest data}}
    DP([Data Preparation]):::blue
    LLM_gen{{LLM Serving}}
    GW([CodeGen GateWay]):::orange

    %% Data Preparation flow
    direction LR
    Ingest[Ingest data] --> UI
    UI --> DP
    DP <-.-> CLIP_EM

    %% Questions interaction
    direction LR
    a[User Input Query] --> UI
    UI --> GW
    GW <==> CodeGen-MegaService
    EM ==> RET
    RET ==> RER
    RER ==> LLM

    %% Embedding service flow
    direction LR
    EM <-.-> CLIP_EM
    RET <-.-> V_RET
    LLM <-.-> LLM_gen

    direction TB
    %% Vector DB interaction
    V_RET <-.->VDB
    DP <-.->VDB
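The request path in the diagram (Embedding, then Retrieval, then Agents, then LLM) can be pictured as a chain of stages that each enrich a shared request state. The following is a minimal sketch with in-process stub functions, not the GenAIComps API: `RequestState`, the stage functions, and `run_pipeline` are all hypothetical names, and the behavior of the Agents stage is a placeholder because this document does not describe it.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class RequestState:
    """Data carried through the pipeline for one query (hypothetical)."""
    query: str
    embedding: List[float] = field(default_factory=list)
    context: List[str] = field(default_factory=list)
    answer: str = ""


Stage = Callable[[RequestState], RequestState]


def embed(state: RequestState) -> RequestState:
    # Stand-in for the embedding service: one toy number per word.
    state.embedding = [float(len(word)) for word in state.query.split()]
    return state


def retrieve(state: RequestState) -> RequestState:
    # Stand-in for the retriever and vector DB: keyword match on a tiny corpus.
    docs = {
        "sort": "Use sorted(items) to get a new sorted list.",
        "reverse": "Use items[::-1] to get a reversed copy.",
    }
    query = state.query.lower()
    state.context = [text for key, text in docs.items() if key in query]
    return state


def agent(state: RequestState) -> RequestState:
    # Placeholder for the Agents stage: keep at most one context snippet.
    state.context = state.context[:1]
    return state


def generate(state: RequestState) -> RequestState:
    # Stand-in for the LLM serving backend: echo what a prompt would contain.
    state.answer = f"# Prompt: {state.query}\n# Context snippets: {len(state.context)}"
    return state


def run_pipeline(query: str, stages: List[Stage]) -> RequestState:
    """Pass a fresh request state through each stage in order."""
    state = RequestState(query=query)
    for stage in stages:
        state = stage(state)
    return state


if __name__ == "__main__":
    result = run_pipeline("How do I sort a list?", [embed, retrieve, agent, generate])
    print(result.answer)
```

In a real deployment each stage would be a network call to a separate microservice; the list of stages mirrors the solid arrows inside the megaservice box.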

Deployment Options

This CodeGen example can be deployed manually on various hardware platforms using Docker Compose or Kubernetes. Select the appropriate guide based on your target environment:

Hardware	Deployment Mode	Guide Link
Intel Xeon CPU	Single Node (Docker)	Xeon Docker Compose Guide
Intel Gaudi HPU	Single Node (Docker)	Gaudi Docker Compose Guide
AMD EPYC CPU	Single Node (Docker)	EPYC Docker Compose Guide
AMD ROCm GPU	Single Node (Docker)	ROCm Docker Compose Guide
Intel Xeon CPU	Kubernetes (Helm)	Kubernetes Helm Guide
Intel Gaudi HPU	Kubernetes (Helm)	Kubernetes Helm Guide
Intel Xeon CPU	Kubernetes (GMC)	Kubernetes GMC Guide
Intel Gaudi HPU	Kubernetes (GMC)	Kubernetes GMC Guide

Note: To build custom microservice images, use the resources in GenAIComps.

Benchmarking

Guides for evaluating the performance and accuracy of this CodeGen deployment are available:

Benchmark Type	Guide Link
Accuracy	Accuracy Benchmark Guide
Performance	Performance Benchmark Guide
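To give a sense of what a performance measurement involves, here is a minimal sketch that times repeated calls and reports summary latencies. It is not the tooling used by the benchmark guides: `measure_latency` and `summarize` are hypothetical helpers, the 90th percentile uses the simple nearest-rank method, and `call` stands for whatever function sends one request to your deployment.

```python
import statistics
import time
from typing import Callable, Dict, List


def summarize(samples: List[float]) -> Dict[str, float]:
    """Summarize latency samples (seconds) with mean, median, p90, and max."""
    if not samples:
        raise ValueError("at least one sample is required")
    ordered = sorted(samples)
    # Nearest-rank 90th percentile: ceil(0.9 * n), computed with integers.
    rank = max(1, -(-9 * len(ordered) // 10))
    return {
        "mean": statistics.mean(ordered),
        "median": statistics.median(ordered),
        "p90": ordered[rank - 1],
        "max": ordered[-1],
    }


def measure_latency(call: Callable[[], object], runs: int = 20) -> Dict[str, float]:
    """Time `runs` sequential invocations of `call` and summarize them."""
    samples: List[float] = []
    for _ in range(runs):
        start = time.perf_counter()
        call()
        samples.append(time.perf_counter() - start)
    return summarize(samples)


if __name__ == "__main__":
    # Replace the lambda with a function that sends one request to the service.
    stats = measure_latency(lambda: time.sleep(0.01), runs=5)
    print({name: round(value, 4) for name, value in stats.items()})
```

Sequential timing like this measures single-request latency only; throughput under concurrent load needs a dedicated load generator.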

Automated Deployment using Terraform

Intel® Optimized Cloud Modules for Terraform provide an automated way to deploy this CodeGen example on various Cloud Service Providers (CSPs).

Cloud Provider	Intel Architecture	Intel Optimized Cloud Module for Terraform	Comments
AWS	4th Gen Intel Xeon with Intel AMX	AWS Deployment	Available
GCP	4th/5th Gen Intel Xeon	GCP Deployment	Available
Azure	4th/5th Gen Intel Xeon	Work-in-progress	Coming Soon
Intel Tiber AI Cloud	5th Gen Intel Xeon with Intel AMX	Work-in-progress	Coming Soon

Validated Configurations

Deploy Method	LLM Engine	LLM Model	Hardware
Docker Compose	vLLM, TGI	Qwen/Qwen2.5-Coder-7B-Instruct	Intel Gaudi
Docker Compose	vLLM, TGI	Qwen/Qwen2.5-Coder-7B-Instruct	Intel Xeon
Docker Compose	vLLM, TGI	Qwen/Qwen2.5-Coder-7B-Instruct	AMD EPYC
Docker Compose	vLLM, TGI	Qwen/Qwen2.5-Coder-7B-Instruct	AMD ROCm
Helm Charts	vLLM, TGI	Qwen/Qwen2.5-Coder-7B-Instruct	Intel Gaudi
Helm Charts	vLLM, TGI	Qwen/Qwen2.5-Coder-7B-Instruct	Intel Xeon
Helm Charts	vLLM, TGI	Qwen/Qwen2.5-Coder-7B-Instruct	AMD ROCm
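The table above can also be expressed as data, which is handy for a pre-deployment check in a script. The sketch below is one possible encoding: `is_validated` is a hypothetical helper, and it assumes that "vLLM, TGI" in the engine column means each engine was validated on its own for that row.

```python
# Rows of the validated configurations table as (deploy method, hardware) pairs.
VALIDATED_TARGETS = {
    ("Docker Compose", "Intel Gaudi"),
    ("Docker Compose", "Intel Xeon"),
    ("Docker Compose", "AMD EPYC"),
    ("Docker Compose", "AMD ROCm"),
    ("Helm Charts", "Intel Gaudi"),
    ("Helm Charts", "Intel Xeon"),
    ("Helm Charts", "AMD ROCm"),
}

# Every row lists the same engines and the same model.
VALIDATED_ENGINES = {"vLLM", "TGI"}
VALIDATED_MODEL = "Qwen/Qwen2.5-Coder-7B-Instruct"


def is_validated(deploy_method: str, engine: str, model: str, hardware: str) -> bool:
    """Return True if the combination appears in the validated table."""
    return (
        (deploy_method, hardware) in VALIDATED_TARGETS
        and engine in VALIDATED_ENGINES
        and model == VALIDATED_MODEL
    )


if __name__ == "__main__":
    print(is_validated("Docker Compose", "vLLM", VALIDATED_MODEL, "AMD EPYC"))
    print(is_validated("Helm Charts", "TGI", VALIDATED_MODEL, "AMD EPYC"))
```

A combination that is absent from the table is not necessarily unsupported; it only means this document does not list it as validated.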

Contribution

We welcome contributions to the OPEA project. Please refer to the contribution guidelines for more information.