Set up your self-hosted model infrastructure
DETAILS:
Tier: For a limited time, Premium and Ultimate. In the future, GitLab Duo Enterprise.
Offering: Self-managed
Status: Beta
- Introduced in GitLab 17.1 with a flag named ai_custom_model. Disabled by default.
FLAG: The availability of this feature is controlled by a feature flag. For more information, see the history.
When you self-host the model, the AI Gateway, and the GitLab instance, no calls are made to external infrastructure, so requests and data stay inside your own network.
To set up your self-hosted model infrastructure:
- Install the large language model (LLM) serving infrastructure.
- Configure your GitLab instance.
- Install the GitLab AI Gateway.
For an installation video guide, see Self-Hosted Models Deployment.
For an installation video guide in French, see Self-Hosted Models Deployment (French Language version).
Install large language model serving infrastructure
Install one of the following GitLab-approved LLMs:
| Model | Code completion | Code generation | GitLab Duo Chat |
|---|---|---|---|
| CodeGemma 2b | Yes | No | No |
| CodeGemma 7b-it (Instruction) | No | Yes | No |
| CodeGemma 7b-code (Code) | Yes | No | No |
| Code-Llama 13b-code | Yes | No | No |
| Code-Llama 13b | No | Yes | No |
| Codestral 22B (see setup instructions) | Yes | Yes | No |
| Mistral 7B | No | Yes | Yes |
| Mixtral 8x22B | No | Yes | Yes |
| Mixtral 8x7B | No | Yes | Yes |
Use a serving architecture
You should use one of the following serving architectures with your installed LLM:
LiteLLM configuration examples for getting started quickly with Ollama
model_list:
  - model_name: mistral
    litellm_params:
      model: ollama/mistral:latest
      api_base: YOUR_HOSTING_SERVER
  - model_name: mixtral
    litellm_params:
      model: ollama/mixtral:latest
      api_base: YOUR_HOSTING_SERVER
  - model_name: codegemma
    litellm_params:
      model: ollama/codegemma
      api_base: YOUR_HOSTING_SERVER
  - model_name: codestral
    litellm_params:
      model: ollama/codestral
      api_base: YOUR_HOSTING_SERVER
  - model_name: codellama
    litellm_params:
      model: ollama/codellama:13b
      api_base: YOUR_HOSTING_SERVER
  - model_name: codellama_13b_code
    litellm_params:
      model: ollama/codellama:code
      api_base: YOUR_HOSTING_SERVER
  - model_name: deepseekcoder
    litellm_params:
      model: ollama/deepseekcoder
      api_base: YOUR_HOSTING_SERVER
  - model_name: mixtral_8x22b
    litellm_params:
      model: ollama/mixtral:8x22b
      api_base: YOUR_HOSTING_SERVER
  - model_name: codegemma_2b
    litellm_params:
      model: ollama/codegemma:2b
      api_base: YOUR_HOSTING_SERVER
  - model_name: codegemma_7b
    litellm_params:
      model: ollama/codegemma:code
      api_base: YOUR_HOSTING_SERVER
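Every entry in the configuration above has the same shape, so the list can be generated instead of typed by hand. The following is a minimal Python sketch under stated assumptions: the function name build_model_list and the example host are hypothetical and not part of GitLab or LiteLLM, and the sketch only builds the data structure and prints it as JSON.

```python
import json


def build_model_list(api_base, models):
    """Return a LiteLLM-style config dict for models served by Ollama.

    `models` maps the alias exposed through LiteLLM (model_name) to the
    Ollama model tag. The function name and arguments are illustrative.
    """
    if not api_base:
        raise ValueError("api_base must point at your Ollama host")
    return {
        "model_list": [
            {
                "model_name": alias,
                "litellm_params": {
                    "model": f"ollama/{tag}",
                    "api_base": api_base,
                },
            }
            for alias, tag in models.items()
        ]
    }


if __name__ == "__main__":
    config = build_model_list(
        "http://ollama.example.com:11434",  # hypothetical host, replace with yours
        {"mistral": "mistral:latest", "codegemma_2b": "codegemma:2b"},
    )
    # Printed as JSON; convert to YAML with the YAML library of your choice.
    print(json.dumps(config, indent=2))
```

Generating the list keeps api_base consistent across entries, which is the value most likely to be mistyped when the file is edited manually.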
Configure your GitLab instance
- So that the GitLab instance can locate and access the AI Gateway, set the AI_GATEWAY_URL environment variable in your GitLab instance environment:
  AI_GATEWAY_URL=https://<your_ai_gitlab_domain>
- Where your GitLab instance is installed, run the following Rake task to activate GitLab Duo features:
  sudo gitlab-rake gitlab:duo:enable_feature_flags
- Start a Rails console:
  sudo gitlab-rails console
  In the console, enable the ai_custom_model feature flag:
  Feature.enable(:ai_custom_model)
  Exit the Rails console.
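A malformed AI_GATEWAY_URL is an easy mistake to make in this procedure. The sketch below is one way to sanity-check the value before restarting GitLab. It is an illustration only: check_gateway_url is a hypothetical helper, and the rule against localhost is borrowed from the Docker prerequisites on this page, not from a GitLab validation routine.

```python
import os
from urllib.parse import urlsplit


def check_gateway_url(url):
    """Return a list of problems found in an AI_GATEWAY_URL value."""
    if "<" in url or ">" in url:
        return ["placeholder text such as <your_ai_gitlab_domain> was not replaced"]
    problems = []
    parts = urlsplit(url)
    if parts.scheme not in ("http", "https"):
        problems.append("URL must start with http:// or https://")
    if not parts.hostname:
        problems.append("URL has no hostname")
    elif parts.hostname in ("localhost", "127.0.0.1", "::1"):
        problems.append("localhost is only reachable from the same host")
    return problems


if __name__ == "__main__":
    value = os.environ.get("AI_GATEWAY_URL", "")
    issues = check_gateway_url(value)
    if issues:
        for issue in issues:
            print(f"AI_GATEWAY_URL problem: {issue}")
    else:
        print(f"AI_GATEWAY_URL looks well-formed: {value}")
```

The check is purely syntactic. It does not confirm that the AI Gateway is running or reachable at that address.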
Install the GitLab AI Gateway
Install by using Docker
Prerequisites:
- Install a Docker container engine, such as Docker.
- Use a valid hostname accessible within your network. Do not use localhost.
The GitLab AI Gateway Docker image contains all necessary code and dependencies in a single container.
Find the AI Gateway release
Find the GitLab official Docker image at:
- AI Gateway Docker image on Container Registry.
- AI Gateway Docker image on DockerHub.
- Release process for self-hosted AI Gateway.
Use the image tag that corresponds to your GitLab version. For example, if the
GitLab version is v17.4.0, use the self-hosted-v17.4.0-ee tag.
For version v17.3.0-ee, use the image tag gitlab-v17.3.0.
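The version-to-tag rule can be expressed as a small function. This is a sketch that assumes the two examples above generalize: versions 17.4 and later map to self-hosted-v<version>-ee, and earlier versions map to gitlab-v<version>. The helper name image_tag_for is hypothetical, and because this page spells the older tags in more than one way, confirm the result against the registry listing before you pull.

```python
import re


def image_tag_for(gitlab_version):
    """Map a GitLab version such as 'v17.4.0' or '17.3.0-ee' to an image tag."""
    match = re.fullmatch(r"v?(\d+)\.(\d+)\.(\d+)(?:-ee)?", gitlab_version.strip())
    if match is None:
        raise ValueError(f"unrecognized GitLab version: {gitlab_version!r}")
    major, minor, patch = (int(part) for part in match.groups())
    version = f"{major}.{minor}.{patch}"
    if (major, minor) >= (17, 4):
        return f"self-hosted-v{version}-ee"
    # Assumption: older releases follow the 'gitlab-v17.3.0' example on this page.
    return f"gitlab-v{version}"


if __name__ == "__main__":
    for version in ("v17.4.0", "v17.3.0-ee"):
        print(version, "->", image_tag_for(version))
```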
WARNING: Docker for Windows is not officially supported. There are known issues with volume permissions, and potentially other unknown issues. If you are trying to run on Docker for Windows, see the getting help page for links to community resources (such as IRC or forums) to seek help from other users.
Set up the volumes location
Create a directory where the logs will reside on the Docker host. It can be under
your user's home directory (for example ~/gitlab-agw), or in a directory like
/srv/gitlab-agw. To create that directory, run:
sudo mkdir -p /srv/gitlab-agw
If you're running Docker with a user other than root, ensure appropriate
permissions have been granted to that directory.
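If you script the host preparation, the directory step can be combined with a permission check. The following is a minimal sketch, with prepare_log_dir as a hypothetical helper name. It checks access for the user running the script, which is not necessarily the user that runs Docker.

```python
import os
from pathlib import Path


def prepare_log_dir(path):
    """Create the log directory if needed and confirm it is writable."""
    log_dir = Path(path).expanduser()
    log_dir.mkdir(parents=True, exist_ok=True)  # same effect as `mkdir -p`
    if not os.access(log_dir, os.W_OK | os.X_OK):
        raise PermissionError(f"{log_dir} is not writable by the current user")
    return log_dir
```

For example, prepare_log_dir("~/gitlab-agw") creates the directory under the current user's home directory and returns its expanded path.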
Optional: Download documentation index
NOTE:
This only applies to AI Gateway image tag gitlab-17.3.0-ee and earlier. For images with tag self-hosted-v17.4.0-ee and later, documentation search is embedded into the Docker image.
To improve results when asking GitLab Duo Chat questions about GitLab, you can index GitLab documentation and provide it as a file to the AI Gateway.
To index the documentation in your local installation, run:
pip install requests langchain langchain_text_splitters
python3 scripts/custom_models/create_index.py -o <path_to_created_index/docs.db>
This creates a file docs.db at the specified path.
You can also create an index for a specific GitLab version:
python3 scripts/custom_models/create_index.py --version_tag="{gitlab-version}"
Start a container from the image
For Docker images with version self-hosted-v17.4.0-ee and later, run the following:
docker run -e AIGW_GITLAB_URL=<your_gitlab_instance> <image>
For Docker images with version gitlab-17.3.0-ee and gitlab-17.2.0, run:
docker run -e AIGW_CUSTOM_MODELS__ENABLED=true \
  -v path/to/created/index/docs.db:/app/tmp/docs.db \
  -e AIGW_FASTAPI__OPENAPI_URL="/openapi.json" \
  -e AIGW_AUTH__BYPASS_EXTERNAL=true \
  -e AIGW_FASTAPI__DOCS_URL="/docs" \
  -e AIGW_FASTAPI__API_PORT=5052 \
  <image>
The AIGW_FASTAPI__OPENAPI_URL and AIGW_FASTAPI__DOCS_URL environment variables are not
mandatory, but are useful for debugging. If the container's port 5052 is published to the host (for example, with -p 5052:5052), opening http://localhost:5052/docs
on the host should show the AI Gateway API documentation.
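The two docker run variants above differ only in their environment variables and the optional index mount, so a deployment script can build either from one function. The sketch below only assembles the argument list and does not start a container. The function name docker_run_command and its parameters are hypothetical.

```python
import shlex


def docker_run_command(image, gitlab_url=None, legacy_index=None):
    """Build the `docker run` argument list for an AI Gateway image.

    Pass `gitlab_url` for self-hosted-v17.4.0-ee and later images, or
    `legacy_index` (path to docs.db on the host) for the older images.
    """
    args = ["docker", "run"]
    if legacy_index is not None:
        env = {
            "AIGW_CUSTOM_MODELS__ENABLED": "true",
            "AIGW_FASTAPI__OPENAPI_URL": "/openapi.json",
            "AIGW_AUTH__BYPASS_EXTERNAL": "true",
            "AIGW_FASTAPI__DOCS_URL": "/docs",
            "AIGW_FASTAPI__API_PORT": "5052",
        }
        args += ["-v", f"{legacy_index}:/app/tmp/docs.db"]
    else:
        if not gitlab_url:
            raise ValueError("gitlab_url is required for self-hosted images")
        env = {"AIGW_GITLAB_URL": gitlab_url}
    for name, value in env.items():
        args += ["-e", f"{name}={value}"]
    args.append(image)
    return args


if __name__ == "__main__":
    # "<image>" is the same placeholder used in the commands above.
    command = docker_run_command("<image>", gitlab_url="https://gitlab.example.com")
    print(shlex.join(command))
```

Returning a list instead of a string lets you pass the result directly to subprocess.run without shell quoting concerns.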
Install by using Docker Engine
- For the AI Gateway to access the API, it must know where the GitLab instance is located. To do this, set the environment variables AIGW_GITLAB_URL and AIGW_GITLAB_API_URL:
  AIGW_GITLAB_URL=https://<your_gitlab_domain>
  AIGW_GITLAB_API_URL=https://<your_gitlab_domain>/api/v4/
- After you've set up the environment variables, run the image.
- Track the initialization process:
  sudo docker logs -f gitlab-aigw
After starting the container, visit gitlab-aigw.example.com. It might take
a while before the Docker container starts to respond to queries.
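Because the container can take a while to respond, automation usually needs to poll before it continues. The sketch below shows one way to do that with the Python standard library. The names http_probe and wait_until_ready are hypothetical, and the retry count and delay are arbitrary defaults, not values recommended by GitLab.

```python
import time
import urllib.error
import urllib.request


def http_probe(url, timeout=5):
    """Return True if the URL answers with any HTTP response."""
    try:
        with urllib.request.urlopen(url, timeout=timeout):
            return True
    except urllib.error.HTTPError:
        return True  # the server responded, even if with an error status
    except (urllib.error.URLError, OSError):
        return False  # connection refused, DNS failure, or timeout


def wait_until_ready(probe, attempts=30, delay=2.0, sleep=time.sleep):
    """Call probe() until it returns True; return the attempt number or None."""
    for attempt in range(1, attempts + 1):
        if probe():
            return attempt
        if attempt < attempts:
            sleep(delay)
    return None
```

A call such as wait_until_ready(lambda: http_probe("http://gitlab-aigw.example.com")) returns the attempt on which the gateway first answered, or None if it never did. The http:// scheme in that call is an assumption; use whichever scheme your deployment serves.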
Install by using the AI Gateway Helm chart
Prerequisites:
- You must have a:
  - Domain you own, that you can add a DNS record to.
  - Kubernetes cluster.
  - Working installation of kubectl.
  - Working installation of Helm, version v3.11.0 or later.
For more information, see Test the GitLab chart on GKE or EKS.
Add the AI Gateway Helm repository
Add the AI Gateway Helm repository to Helm’s configuration:
helm repo add ai-gateway \
https://gitlab.com/api/v4/projects/gitlab-org%2fcharts%2fai-gateway-helm-chart/packages/helm/devel
Install the AI Gateway
- Create the ai-gateway namespace:
  kubectl create namespace ai-gateway
- Generate the certificate for the domain where you plan to expose the AI Gateway.
- Create the TLS secret in the previously created namespace:
  kubectl -n ai-gateway create secret tls ai-gateway-tls --cert="<path_to_cert>" --key="<path_to_cert_key>"
- For the AI Gateway to access the API, it must know where the GitLab instance is located. To do this, set the gitlab.url and gitlab.apiUrl together with the ingress.hosts and ingress.tls values as follows:
  helm repo add ai-gateway \
    https://gitlab.com/api/v4/projects/gitlab-org%2fcharts%2fai-gateway-helm-chart/packages/helm/devel
  helm repo update
  helm upgrade --install ai-gateway \
    ai-gateway/ai-gateway \
    --version 0.1.1 \
    --namespace=ai-gateway \
    --set="gitlab.url=https://<your_gitlab_domain>" \
    --set="gitlab.apiUrl=https://<your_gitlab_domain>/api/v4/" \
    --set "ingress.enabled=true" \
    --set "ingress.hosts[0].host=<your_gateway_domain>" \
    --set "ingress.hosts[0].paths[0].path=/" \
    --set "ingress.hosts[0].paths[0].pathType=ImplementationSpecific" \
    --set "ingress.tls[0].secretName=ai-gateway-tls" \
    --set "ingress.tls[0].hosts[0]=<your_gateway_domain>" \
    --set="ingress.className=nginx" \
    --timeout=300s --wait --wait-for-jobs
This step can take a few seconds while all resources are allocated and the AI Gateway starts.
Wait for your pods to get up and running:
kubectl wait pod \
--all \
--for=condition=Ready \
--namespace=ai-gateway \
--timeout=300s
When your pods are up and running, you can set up your IP ingresses and DNS records.
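The --set flags in the helm upgrade command are a flattened form of a nested values structure. If you keep those values in a script, a small helper can produce the flags for you. This is a sketch with hypothetical names (flatten_values, helm_upgrade_command) and example.com domains as placeholders. It does not escape commas or other characters that have special meaning to --set, so it suits simple values like the ones in this procedure.

```python
import shlex


def flatten_values(values, prefix=""):
    """Flatten nested dicts and lists into Helm `--set` key=value strings."""
    pairs = []
    if isinstance(values, dict):
        for key, value in values.items():
            child = f"{prefix}.{key}" if prefix else key
            pairs += flatten_values(value, child)
    elif isinstance(values, list):
        for index, value in enumerate(values):
            pairs += flatten_values(value, f"{prefix}[{index}]")
    else:
        # Helm expects lowercase booleans.
        rendered = str(values).lower() if isinstance(values, bool) else str(values)
        pairs.append(f"{prefix}={rendered}")
    return pairs


def helm_upgrade_command(release, chart, namespace, values):
    """Build a `helm upgrade --install` argument list from nested values."""
    command = ["helm", "upgrade", "--install", release, chart,
               f"--namespace={namespace}"]
    for pair in flatten_values(values):
        command += ["--set", pair]
    return command


if __name__ == "__main__":
    values = {
        "gitlab": {"url": "https://gitlab.example.com"},
        "ingress": {
            "enabled": True,
            "hosts": [{"host": "aigw.example.com", "paths": [{"path": "/"}]}],
        },
    }
    print(shlex.join(helm_upgrade_command(
        "ai-gateway", "ai-gateway/ai-gateway", "ai-gateway", values)))
```

For larger configurations, a values file passed with -f is usually easier to review than a long series of --set flags.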
Configure the GitLab instance
Configure the GitLab instance as described in Configure your GitLab instance.
After you complete these steps, the Helm chart installation is finished.
Upgrade the AI Gateway Docker image
To upgrade the AI Gateway, download the newest Docker image tag.
- Stop the running container:
  sudo docker stop gitlab-aigw
- Remove the existing container:
  sudo docker rm gitlab-aigw
- Pull and run the new image.
- Ensure that the environment variables are all set correctly.
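The upgrade steps can be captured as an ordered command list so that they run the same way each time. The sketch below is an illustration, not an official upgrade script: upgrade_commands and run_all are hypothetical names, and passing --name to docker run is an assumption made so that the new container keeps the gitlab-aigw name used in the commands above.

```python
import subprocess


def upgrade_commands(image, run_args, container="gitlab-aigw"):
    """Return the docker commands for an upgrade, in execution order."""
    return [
        ["docker", "stop", container],
        ["docker", "rm", container],
        ["docker", "pull", image],
        # Assumption: reuse the container name so later commands still match.
        ["docker", "run", "--name", container, *run_args, image],
    ]


def run_all(commands, runner=subprocess.run):
    """Run each command in order, stopping at the first failure."""
    for command in commands:
        runner(command, check=True)
```

Pass the same -e options you used for the original container in run_args, so that the environment variables carry over to the new image. As written, docker run stays in the foreground.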
Alternative installation methods
For information on alternative ways to install the AI Gateway, see issue 463773.
Troubleshooting
First, run the debugging scripts to verify your self-hosted model setup.
For more information on other actions to take, see the troubleshooting documentation.