
HTTP Server Mode

By default, GitLab MCP Server runs in stdio mode — each AI client spawns its own server process. HTTP mode is an alternative where a single server process serves multiple clients over the network, each authenticating with their own GitLab token.

| Scenario | Recommended Mode |
|---|---|
| Single developer, local AI client | stdio |
| Team sharing one server instance | HTTP |
| Remote/headless server deployment | HTTP |
| CI/CD integration with MCP | HTTP |
| Testing with curl or HTTP clients | HTTP |

# Single GitLab.com instance (fixed URL for all clients; replace for self-managed GitLab)
gitlab-mcp-server --http --gitlab-url=https://gitlab.com
# Multi-instance (each client specifies their GitLab URL via GITLAB-URL header)
gitlab-mcp-server --http --http-addr=:8080

The server starts listening on port 8080 by default. The MCP endpoint is available at /mcp.

| Flag | Default | Description |
|---|---|---|
| --http | (off) | Enable HTTP transport mode |
| --http-addr | :8080 | HTTP listen address (host:port) |
| --gitlab-url | (optional) | Fixed GitLab instance URL. Omit to require GITLAB-URL per request |
| --skip-tls-verify | false | Skip TLS certificate verification for self-signed certs |
| --meta-tools | true | Enable domain meta-tools. Set false for individual tools |
| --tool-surface | (empty) | Explicit tool catalog selector; see Tool and capability surface options. Overrides --meta-tools |
| --capability-surface | full | Resource and prompt selector; see Tool and capability surface options |
| --meta-param-schema | opaque | Meta-tool input schema mode: opaque, compact, or full |
| --enterprise | false | Force Enterprise/Premium tools; omit to auto-detect CE/EE per token+URL entry |
| --read-only | false | Read-only mode: disable all mutating tools |
| --safe-mode | false | Intercept mutating tools and return a JSON preview instead of executing them |
| --embedded-resources | true | Embed canonical MCP resource URIs in get_* tool results |
| --max-http-clients | 100 | Maximum unique token+URL entries in the server pool |
| --session-timeout | 30m | Idle MCP session timeout |
| --auto-update | true | Auto-update mode: true, check, or false |
| --auto-update-repo | jmrplens/gitlab-mcp-server | GitHub repository for release assets |
| --auto-update-interval | 1h | Periodic update check interval |
| --auth-mode | legacy | Authentication mode: legacy or oauth (RFC 9728) |
| --oauth-cache-ttl | 15m | OAuth token identity cache TTL (range: 1m–2h) |
| --revalidate-interval | 15m | Token re-validation interval; 0 to disable (upper bound: 24h) |
| --rate-limit-rps | 0 | Per-server tools/call rate limit in req/s (0 = disabled) |
| --rate-limit-burst | 40 | Token-bucket burst size when --rate-limit-rps > 0 |
| --trusted-proxy-header | | HTTP header with real client IP for rate limiting behind proxies (e.g. Fly-Client-IP, X-Forwarded-For) |

--tool-surface selects the visible MCP tool catalog for every HTTP server-pool entry:

  • meta: domain-level meta-tools, the default consolidated catalog.
  • individual: every GitLab operation is exposed as its own tool.
  • dynamic: the current low-token three-tool surface with gitlab_search_tools, gitlab_describe_tools, and gitlab_execute_tool.
  • dynamic-3: explicit selector for the same three-tool dynamic surface, useful for pinned configurations.
  • dynamic-2: experimental two-tool surface with gitlab_find_action and gitlab_execute_tool.

--capability-surface controls resources and prompts independently of tools: full registers all resources and prompts, while minimal keeps only the workspace roots resource for low-token dynamic deployments.

HTTP clients control only their GitLab token and, in multi-instance mode, the GITLAB-URL selector. Server policy options such as --tool-surface, --capability-surface, --meta-param-schema, --rate-limit-rps, --read-only, --safe-mode, --auth-mode, and --trusted-proxy-header are fixed by the MCP server process and cannot be changed per user, session, or JSON-RPC request.

If a client sends config-like headers such as TOOL-SURFACE, CAPABILITY-SURFACE, META-PARAM-SCHEMA, RATE-LIMIT-RPS, or GITLAB-SAFE-MODE, the server ignores them and logs their option names in ignored_options without logging their values.
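The header handling described above could be sketched roughly as follows. This is an illustration only, not the server's code: the function name `ignored_options` is hypothetical, and the header list contains only the examples named in the text, which may not be the complete set.

```python
from typing import Iterable, List

# Only the config-like headers named in the text; the real list may be longer.
IGNORED_OPTION_HEADERS = (
    "TOOL-SURFACE",
    "CAPABILITY-SURFACE",
    "META-PARAM-SCHEMA",
    "RATE-LIMIT-RPS",
    "GITLAB-SAFE-MODE",
)


def ignored_options(header_names: Iterable[str]) -> List[str]:
    """Return the names (never the values) of config-like headers in a request."""
    # HTTP header names are case-insensitive, so compare in upper case.
    present = {name.upper() for name in header_names}
    return [name for name in IGNORED_OPTION_HEADERS if name in present]
```

The returned list is what a log field such as ignored_options would carry; the header values are never read.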

Clients must provide their GitLab Personal Access Token on every HTTP request using one of two headers.

When the server starts without --gitlab-url, clients must specify which GitLab instance to target using the GITLAB-URL header:

GITLAB-URL: https://gitlab.example.com

If --gitlab-url was set at startup, this header is ignored and logged. If --gitlab-url was not set and the header is omitted, the request is rejected.

PRIVATE-TOKEN: glpat-xxxxxxxxxxxxxxxxxxxx
Authorization: Bearer glpat-xxxxxxxxxxxxxxxxxxxx

If both headers are present, PRIVATE-TOKEN takes precedence. Requests without a valid token are rejected.
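The precedence rules for the token and GITLAB-URL headers can be summarised in a short sketch. It assumes a hypothetical helper `resolve_credentials` and only checks that a token is present; the real server also validates the token against GitLab, which is not shown here.

```python
from typing import Mapping, Optional, Tuple


def resolve_credentials(
    headers: Mapping[str, str], fixed_gitlab_url: Optional[str] = None
) -> Tuple[str, str]:
    """Return (token, gitlab_url), or raise ValueError if the request must be rejected."""
    # HTTP header names are case-insensitive, so normalise them first.
    h = {name.lower(): value.strip() for name, value in headers.items()}

    # PRIVATE-TOKEN takes precedence over Authorization: Bearer.
    token = h.get("private-token", "")
    if not token:
        scheme, _, credential = h.get("authorization", "").partition(" ")
        if scheme.lower() == "bearer":
            token = credential.strip()
    if not token:
        raise ValueError("missing GitLab token")

    # A URL fixed at startup (--gitlab-url) wins; the GITLAB-URL header is then ignored.
    gitlab_url = fixed_gitlab_url or h.get("gitlab-url", "")
    if not gitlab_url:
        raise ValueError("GITLAB-URL header required in multi-instance mode")
    return token, gitlab_url
```

For example, a request carrying both token headers resolves to the PRIVATE-TOKEN value, and a request without GITLAB-URL is only accepted when the server was started with a fixed URL.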

OAuth mode (--auth-mode=oauth) enables RFC 9728–compliant OAuth 2.1 authentication. Instead of managing tokens manually, MCP clients discover the authorization server automatically and handle the OAuth flow:

gitlab-mcp-server --http --gitlab-url=https://gitlab.com --auth-mode=oauth

How it works:

  1. The server exposes /.well-known/oauth-protected-resource with metadata pointing to your GitLab instance as the authorization server
  2. MCP clients (VS Code, Claude Code) discover this endpoint and initiate the OAuth 2.1 PKCE flow
  3. Users authorize in the browser — no token copying required
  4. The server validates Bearer tokens against the GitLab API and caches the identity for --oauth-cache-ttl (default: 15 minutes)
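Step 4 amounts to a time-bounded identity cache. The sketch below shows one way such a cache might behave; the class name `IdentityCache`, the injected clock, and the clamping of out-of-range TTLs to the documented 1m–2h range are all assumptions (the server may reject such values instead).

```python
import time
from typing import Callable, Dict, Optional, Tuple

MIN_TTL = 60.0        # 1m, lower bound of the documented range
MAX_TTL = 2 * 3600.0  # 2h, upper bound of the documented range


class IdentityCache:
    """Caches a validated identity per token hash for a bounded time."""

    def __init__(self, ttl_seconds: float = 15 * 60,
                 clock: Callable[[], float] = time.monotonic) -> None:
        # Assumption: out-of-range values are clamped rather than rejected.
        self.ttl = min(max(ttl_seconds, MIN_TTL), MAX_TTL)
        self._clock = clock
        self._entries: Dict[str, Tuple[float, str]] = {}

    def put(self, token_hash: str, identity: str) -> None:
        self._entries[token_hash] = (self._clock() + self.ttl, identity)

    def get(self, token_hash: str) -> Optional[str]:
        entry = self._entries.get(token_hash)
        if entry is None:
            return None
        expires_at, identity = entry
        if self._clock() >= expires_at:
            # Expired: drop the entry so the caller re-validates against GitLab.
            del self._entries[token_hash]
            return None
        return identity
```

A cache miss (None) is the signal to validate the Bearer token against the GitLab API again and store the fresh result.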

Client configuration in OAuth mode:

{
  "servers": {
    "gitlab": {
      "type": "http",
      "url": "http://your-server:8080/mcp",
      "oauth": {
        "clientId": "YOUR_GITLAB_APPLICATION_ID",
        "scopes": ["api"]
      }
    }
  }
}

  • clientId: The Application ID from your GitLab OAuth Application (see docs/oauth-app-setup.md)
  • scopes: Must include api for full tool functionality

VS Code handles OAuth discovery and authorization automatically.

The core of HTTP mode is a bounded LRU pool of MCP server instances, keyed by the SHA-256 hash of each client’s token and GitLab URL.

graph TD
    subgraph "HTTP Mode Architecture"
        REQ1["Client A<br/>Token: glpat-aaa<br/>URL: gitlab.com"] --> HANDLER[StreamableHTTPHandler]
        REQ2["Client B<br/>Token: glpat-bbb<br/>URL: gitlab.com"] --> HANDLER
        REQ3["Client C<br/>Token: glpat-aaa<br/>URL: self-hosted.example.com"] --> HANDLER

        HANDLER --> POOL[Server Pool]

        POOL --> ENTRY1["hash(glpat-aaa + gitlab.com)<br/>MCP Server + GitLab Client"]
        POOL --> ENTRY2["hash(glpat-bbb + gitlab.com)<br/>MCP Server + GitLab Client"]
        POOL --> ENTRY3["hash(glpat-aaa + self-hosted)<br/>MCP Server + GitLab Client"]
    end

    ENTRY1 --> GL1["GitLab API<br/>as user A @ gitlab.com"]
    ENTRY2 --> GL2["GitLab API<br/>as user B @ gitlab.com"]
    ENTRY3 --> GL3["GitLab API<br/>as user A @ self-hosted"]

Key properties:

  • Clients with the same token and same GitLab URL share the same MCP server instance
  • Clients with different tokens or different GitLab URLs get completely isolated instances
  • Raw tokens are never stored — only SHA-256 hashes of token+URL are kept in memory
  • When the pool reaches --max-http-clients, the least recently used entry is evicted

Lifecycle of a pool entry:

  1. First request: Token and GitLab URL are extracted, combined and hashed, and a new MCP server + GitLab client is created
  2. Subsequent requests: The existing entry is found and promoted in the LRU list
  3. Idle timeout: After --session-timeout of inactivity, the MCP session is closed (but the pool entry remains)
  4. Pool eviction: When capacity is reached, the least recently used entry is removed entirely
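The pool behaviour described above (hash key, promotion on reuse, eviction at capacity) can be modelled in a few lines. This is a sketch, not the server's implementation: `ServerPool`, `pool_key`, and the `factory` callback are hypothetical names, and joining token and URL with a NUL separator before hashing is an assumption, since the text does not say how the two values are combined.

```python
import hashlib
from collections import OrderedDict
from typing import Callable, Generic, TypeVar

T = TypeVar("T")


def pool_key(token: str, gitlab_url: str) -> str:
    # Assumption: token and URL are joined with a NUL separator before hashing.
    return hashlib.sha256(f"{token}\x00{gitlab_url}".encode("utf-8")).hexdigest()


class ServerPool(Generic[T]):
    """Bounded LRU pool; only hashes of token+URL are kept as keys."""

    def __init__(self, max_clients: int, factory: Callable[[str, str], T]) -> None:
        self._max = max_clients
        self._factory = factory
        self._entries: "OrderedDict[str, T]" = OrderedDict()

    def get(self, token: str, gitlab_url: str) -> T:
        key = pool_key(token, gitlab_url)
        if key in self._entries:
            # Subsequent request: promote the entry to most recently used.
            self._entries.move_to_end(key)
            return self._entries[key]
        if len(self._entries) >= self._max:
            # Pool is full: evict the least recently used entry.
            self._entries.popitem(last=False)
        # First request for this token+URL: create a fresh instance.
        entry = self._factory(token, gitlab_url)
        self._entries[key] = entry
        return entry

    def __len__(self) -> int:
        return len(self._entries)
```

In this model the `factory` stands in for whatever builds the MCP server and GitLab client pair. The idle session timeout is a separate mechanism and is deliberately left out.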

HTTP mode includes an optional per-server token-bucket rate limiter that throttles tools/call requests. The limiter is disabled by default (--rate-limit-rps=0) and applies to each pool entry independently — that is, the scope is the same (token + GitLab URL) key used by the server pool.

| Flag | Default | Meaning |
|---|---|---|
| --rate-limit-rps | 0 | Sustained refill rate, in requests per second. 0 disables the limiter |
| --rate-limit-burst | 40 | Maximum bucket capacity (peak burst over 1s) |

When --rate-limit-rps > 0, each pool entry gets its own token bucket sized at --rate-limit-burst tokens, refilled at --rate-limit-rps per second. Only tools/call requests consume tokens; tools/list, resources/*, prompts/*, initialize, and other low-cost RPCs are not rate-limited.
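As a rough model of the limiter described above, the following sketch implements a standard token bucket with a refill rate and a burst capacity. The class and function names are hypothetical and the injected clock exists only to make the behaviour easy to test; the server's actual limiter may differ in detail.

```python
import time
from typing import Callable

# Only this method consumes tokens, per the description above.
RATE_LIMITED_METHODS = frozenset({"tools/call"})


class TokenBucket:
    """Token bucket holding up to `burst` tokens, refilled at `rps` tokens per second."""

    def __init__(self, rps: float, burst: int,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.rps = rps
        self.burst = burst
        self._clock = clock
        self._tokens = float(burst)  # start with a full bucket
        self._last = clock()

    def allow(self) -> bool:
        if self.rps <= 0:
            return True  # limiter disabled, mirroring --rate-limit-rps=0
        now = self._clock()
        # Refill for the elapsed time, capped at the burst capacity.
        self._tokens = min(float(self.burst), self._tokens + (now - self._last) * self.rps)
        self._last = now
        if self._tokens >= 1.0:
            self._tokens -= 1.0
            return True
        return False


def should_process(bucket: TokenBucket, method: str) -> bool:
    """Low-cost RPCs bypass the bucket; only tools/call spends a token."""
    return method not in RATE_LIMITED_METHODS or bucket.allow()
```

With rps=10 and burst=40, a client can issue 40 calls back to back and is then held to roughly 10 calls per second until the bucket refills.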

When a tools/call request arrives and the bucket has no token left, the server returns a CallToolResult with IsError: true and a text message such as rate limit exceeded for <tool>; retry after a short backoff. Clients should detect that message and retry with exponential backoff. The limiter does not return HTTP 429 because the limit is enforced after JSON-RPC routing, inside the MCP layer.
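On the client side, detecting the rate-limit result and retrying might look like the sketch below. It assumes the tool result arrives as a JSON object with an `isError` flag and a `content` list of text items, as in the MCP tool result format; the helper names and the delay values are illustrative choices, not part of the server.

```python
import time
from typing import Any, Callable, Dict

ToolResult = Dict[str, Any]


def is_rate_limited(result: ToolResult) -> bool:
    """True if a tool result looks like the limiter's error message."""
    if not result.get("isError"):
        return False
    return any(
        item.get("type") == "text" and "rate limit exceeded" in item.get("text", "")
        for item in result.get("content", [])
    )


def call_with_backoff(
    call: Callable[[], ToolResult],
    max_attempts: int = 5,
    base_delay: float = 0.5,
    sleep: Callable[[float], None] = time.sleep,
) -> ToolResult:
    """Invoke `call`, retrying with doubling delays while it is rate limited."""
    result = call()
    for attempt in range(1, max_attempts):
        if not is_rate_limited(result):
            break
        sleep(base_delay * 2 ** (attempt - 1))  # 0.5s, 1s, 2s, ...
        result = call()
    return result
```

Because the HTTP status stays successful in this case, the check has to inspect the tool result body rather than the status code.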

  • Single-user deployment (typical local dev): leave disabled (--rate-limit-rps=0)
  • Shared instance behind a proxy (Fly.io, Kubernetes): start with --rate-limit-rps=10 --rate-limit-burst=40. Each token+URL pair gets its own quota, so this protects against a single noisy client without affecting others
  • Large multi-tenant deployment: combine with infra-level rate limiting (Cloudflare, Caddy, nginx). The MCP-level limiter is a safety net, not a replacement for edge enforcement

Add to .vscode/mcp.json:

{
  "servers": {
    "gitlab": {
      "type": "http",
      "url": "http://your-server:8080/mcp",
      "headers": {
        "PRIVATE-TOKEN": "glpat-your-token"
      }
    }
  }
}

The project publishes a multi-arch Docker image at ghcr.io/jmrplens/gitlab-mcp-server for linux/amd64 and linux/arm64. The image runs as a non-root user (UID 10001), exposes port 8080, ships a built-in /health endpoint for orchestrators, and starts in HTTP mode by default.

docker run -d \
  --name gitlab-mcp \
  --read-only \
  --tmpfs /tmp:rw,size=64m \
  --cap-drop=ALL \
  --security-opt=no-new-privileges:true \
  -p 8080:8080 \
  ghcr.io/jmrplens/gitlab-mcp-server:latest \
  --http \
  --http-addr=0.0.0.0:8080 \
  --gitlab-url=https://gitlab.com

Or with Docker Compose:

services:
  gitlab-mcp:
    image: ghcr.io/jmrplens/gitlab-mcp-server:latest
    ports:
      - "8080:8080"
    command:
      # Single instance mode (fixed GitLab.com URL for all clients; replace for self-managed GitLab):
      - "--http"
      - "--gitlab-url=https://gitlab.com"
      - "--http-addr=:8080"
      - "--max-http-clients=200"
      - "--session-timeout=1h"
      # Or multi-instance mode (remove --gitlab-url, clients send GITLAB-URL header)
    # Security hardening (least privilege, OWASP Docker security)
    read_only: true
    tmpfs:
      - /tmp:rw,size=64m,mode=1777
    cap_drop:
      - ALL
    security_opt:
      - no-new-privileges:true
    healthcheck:
      test: ["CMD", "wget", "-q", "--spider", "http://localhost:8080/health"]
      interval: 30s
      timeout: 5s
      retries: 3
      start_period: 10s
    restart: unless-stopped

Start the service:

docker compose up -d

The image follows OWASP Docker Top 10 guidance:

| Property | Value |
|---|---|
| Base image | alpine:3.23 (minimal, regularly patched) |
| User | appuser (UID 10001, non-root) |
| Filesystem | Read-only with writable tmpfs for /tmp |
| Capabilities | All dropped (--cap-drop=ALL) |
| Privilege escalation | Disabled (no-new-privileges:true) |
| Build flags | -trimpath -buildmode=pie (PIE binary, no source paths in stack traces) |
| OCI labels | org.opencontainers.image.* populated with version, commit, source URL |

Auto-update is disabled by default when using the reference docker-compose.yml (which sets --auto-update=false). Container immutability is the recommended pattern: pull a newer image tag and restart the container. If you need in-place updates (e.g. on a single-host deployment without an image registry mirror), set --auto-update=true and mount the binary path as a writable volume.

Fly.io is a managed platform that runs Docker images globally with built-in TLS, anycast routing, and per-region machine scaling. The repository ships a reference fly.toml configured for multi-instance mode — each client supplies its own GitLab token and GITLAB-URL header per request, so a single Fly app can serve users connecting to different GitLab instances.

  • A Fly.io account and the flyctl CLI installed
  • A clone of the repository (or a fork with your own fly.toml)

# 1. Sign in
flyctl auth login
# 2. Create the app (use a unique name; the default in fly.toml is gitlab-mcp-server)
flyctl launch --no-deploy --copy-config --name your-mcp-app-name
# 3. Deploy
flyctl deploy

The shipped fly.toml uses the multi-stage Dockerfile at the repository root and overrides the container CMD with HTTP-mode flags:

[experimental]
  cmd = [
    "--http",
    "--http-addr", "0.0.0.0:8080",
    "--meta-tools",
    "--auto-update=false",
    "--trusted-proxy-header", "Fly-Client-IP"
  ]

flyctl status            # Machine state and recent deploys
flyctl logs              # Live tail of structured logs
flyctl scale count 2     # Run two machines (e.g. for HA)
flyctl scale memory 512  # Bump memory to 512 MB (default in fly.toml: 256 MB)

  • The Fly proxy probes GET /health every 30s with a 5s timeout (configured under [[http_service.checks]])
  • TLS is terminated at the Fly edge with force_https = true — internal traffic to the machine on port 8080 is HTTP
  • auto_stop_machines = "stop" and min_machines_running = 0 let idle deployments scale to zero between requests

Auto-update is disabled in the shipped configuration (--auto-update=false). On Fly.io, the recommended upgrade flow is to redeploy with a new image:

flyctl deploy --image ghcr.io/jmrplens/gitlab-mcp-server:<version>

This replaces the running machines with the new version atomically and preserves your secrets and configuration.

The shipped fly.toml runs in legacy auth mode (PAT per request). OAuth mode is also supported but requires a stable public URL known at startup so the OAuth discovery endpoints (/.well-known/oauth-protected-resource) advertise the right metadata. To enable OAuth, set --auth-mode=oauth and --gitlab-url=<your-default> in the cmd array, then redeploy. See docs/oauth-app-setup.md for the GitLab OAuth Application setup.

You can verify the server is running by sending a tools/list request:

curl -s -X POST http://localhost:8080/mcp \
  -H "Content-Type: application/json" \
  -H "PRIVATE-TOKEN: glpat-your-token" \
  -d '{"jsonrpc":"2.0","method":"tools/list","id":1}' | head -c 200

A successful response returns a JSON-RPC result with the list of available tools.
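The same check can be scripted. The sketch below rebuilds the curl request with Python's standard library; the helper names are hypothetical, the URL and token are the placeholders from the curl example, and, like the curl command, it assumes the server answers a bare tools/list request without any prior handshake.

```python
import json
import urllib.request


def build_tools_list_request(base_url: str, token: str,
                             request_id: int = 1) -> urllib.request.Request:
    """Build the same POST request as the curl example, without sending it."""
    payload = {"jsonrpc": "2.0", "method": "tools/list", "id": request_id}
    return urllib.request.Request(
        f"{base_url}/mcp",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json", "PRIVATE-TOKEN": token},
        method="POST",
    )


def send(request: urllib.request.Request, limit: int = 200) -> str:
    """Send the request and return the first `limit` bytes, like `head -c 200`."""
    with urllib.request.urlopen(request, timeout=10) as response:
        return response.read(limit).decode("utf-8", errors="replace")
```

Calling `send(build_tools_list_request("http://localhost:8080", "glpat-your-token"))` against a running server should print the start of the JSON-RPC response.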