Add latest changes from gitlab-org/gitlab@master

This commit is contained in:
GitLab Bot 2025-01-28 21:40:53 +00:00
parent ce7068dc28
commit 09b26cfc53
29 changed files with 761 additions and 98 deletions

View File

@@ -19,12 +19,12 @@ the noise (due to constantly failing tests, flaky tests, and so on) so that new
- [ ] [Code review guidelines](https://docs.gitlab.com/ee/development/code_review.html)
- [ ] [Style guides](https://docs.gitlab.com/ee/development/contributing/style_guides.html)
- [ ] Quarantine test check-list
- [ ] Follow the [Quarantining Tests guide](https://handbook.gitlab.com/handbook/engineering/infrastructure/test-platform/debugging-qa-test-failures/#quarantining-tests).
- [ ] Confirm the test has a [`quarantine:` tag with the specified quarantine type](https://handbook.gitlab.com/handbook/engineering/infrastructure/test-platform/debugging-qa-test-failures/#quarantined-test-types).
- [ ] Follow the [Quarantining Tests guide](https://handbook.gitlab.com/handbook/engineering/infrastructure/test-platform/pipeline-triage/#quarantining-tests).
- [ ] Confirm the test has a [`quarantine:` tag with the specified quarantine type](https://handbook.gitlab.com/handbook/engineering/infrastructure/test-platform/pipeline-triage/#quarantined-test-types).
- [ ] Note if the test should be [quarantined for a specific environment](https://docs.gitlab.com/ee/development/testing_guide/end_to_end/execution_context_selection.html#quarantine-a-test-for-a-specific-environment).
- [ ] (Optionally) In case of an emergency (e.g. blocked deployments), consider adding labels to pick into auto-deploy (~"Pick into auto-deploy" ~"priority::1" ~"severity::1").
- [ ] Dequarantine test check-list
- [ ] Follow the [Dequarantining Tests guide](https://handbook.gitlab.com/handbook/engineering/infrastructure/test-platform/debugging-qa-test-failures/#dequarantining-tests).
- [ ] Follow the [Dequarantining Tests guide](https://handbook.gitlab.com/handbook/engineering/infrastructure/test-platform/pipeline-triage/#dequarantining-tests).
- [ ] Confirm the test consistently passes on the target GitLab environment(s).
- [ ] To ensure a faster turnaround, ask in the `#quality_maintainers` Slack channel for someone to review and merge the merge request, rather than assigning it directly.

View File

@@ -11,10 +11,10 @@
- else
.form-group
= f.label :login, _('Username or primary email')
= f.text_field :login, value: @invite_email, class: 'form-control gl-form-input js-username-field', autocomplete: 'username', autofocus: 'autofocus', autocapitalize: 'off', autocorrect: 'off', required: true, title: _('This field is required.'), data: { testid: 'username-field' }
= f.text_field :login, value: @invite_email, class: 'form-control gl-form-input js-username-field', autocomplete: 'username', autofocus: 'autofocus', autocapitalize: 'off', autocorrect: 'off', required: true, title: _('Username or primary email is required.'), data: { testid: 'username-field' }
.form-group
= f.label :password, _('Password')
= f.password_field :password, class: 'form-control gl-form-input js-password', data: { id: 'user_password', name: 'user[password]', testid: 'password-field' }
= f.password_field :password, class: 'form-control gl-form-input js-password', data: { id: 'user_password', required: true, title: _('Password is required.'), name: 'user[password]', testid: 'password-field' }
.form-text.gl-text-right
- if unconfirmed_email?
= link_to _('Resend confirmation email'), new_user_confirmation_path

View File

@@ -0,0 +1,67 @@
---
stage: SaaS Platforms
group: GitLab Dedicated
description: Get to know the GitLab Dedicated architecture through a series of diagrams.
info: To determine the technical writer assigned to the Stage/Group associated with this page, see https://handbook.gitlab.com/handbook/product/ux/technical-writing/#assignments
---
# GitLab Dedicated architecture
This page provides a set of architectural documents and diagrams for GitLab Dedicated.
## High-level overview
The following diagram shows a high-level overview of the architecture for GitLab Dedicated,
where various AWS accounts managed by GitLab and customers are controlled by the Switchboard application.
![Diagram of a high-level overview of the GitLab Dedicated architecture.](img/high_level_architecture_diagram_v18_0.png)
When managing GitLab Dedicated tenant instances:
- Switchboard is responsible for managing global configuration shared between the AWS cloud providers, accessible by tenants.
- Amp is responsible for the interaction with the customer tenant accounts, such as configuring expected roles and policies, enabling the required services, and provisioning environments.
GitLab team members with edit access can update the [source](https://lucid.app/lucidchart/e0b6661c-6c10-43d9-8afa-1fe0677e060c/edit?page=0_0#) files for the diagram in Lucidchart.
## Tenant network
The customer tenant account is a single AWS cloud provider account. The single account provides full tenancy isolation, in its own VPC, and with its own resource quotas.
The cloud provider account is where a highly resilient GitLab installation resides, in its own isolated VPC. On provisioning, the customer tenant gets access to a High Availability (HA) GitLab primary site and a GitLab Geo secondary site.
![Diagram of GitLab-managed AWS accounts in an isolated VPC containing a highly resilient GitLab installation.](img/tenant_network_diagram_v18_0.png)
GitLab team members with edit access can update the [source](https://lucid.app/lucidchart/0815dd58-b926-454e-8354-c33fe3e7bff0/edit?invitationId=inv_a6b618ff-6c18-4571-806a-bfb3fe97cb12) files for the diagram in Lucidchart.
### Gitaly setup
GitLab Dedicated deploys Gitaly [in a sharded setup](../../administration/gitaly/index.md#before-deploying-gitaly-cluster), not a Gitaly Cluster. In this setup:
- Customer repositories are spread across multiple virtual machines.
- GitLab manages [storage weights](../../administration/repository_storage_paths.md#configure-where-new-repositories-are-stored) on behalf of the customer.
### Geo setup
GitLab Dedicated leverages GitLab Geo for [disaster recovery](../../subscriptions/gitlab_dedicated/data_residency_and_high_availability.md#disaster-recovery).
Geo does not use an active-active failover configuration. For more information, see [Geo](../../administration/geo/index.md).
### AWS PrivateLink connection (optional)
Optionally, private connectivity is available for your GitLab Dedicated instance, using [AWS PrivateLink](https://aws.amazon.com/privatelink/) as a connection gateway.
Both [inbound](../../administration/dedicated/configure_instance/network_security.md#inbound-private-link) and [outbound](../../administration/dedicated/configure_instance/network_security.md#outbound-private-link) private links are supported.
![Diagram of a GitLab-managed AWS VPC using AWS PrivateLink to connect with a customer-managed AWS VPC.](img/privatelink_diagram_v17_1.png)
GitLab team members with edit access can update the [source](https://lucid.app/lucidchart/933b958b-bfad-4898-a8ae-182815f159ca/edit?invitationId=inv_38b9a265-dff2-4db6-abdb-369ea1e92f5f) files for the diagram in Lucidchart.
## Hosted runners for GitLab Dedicated
The following diagram illustrates a GitLab-managed AWS account that contains GitLab runners, which are interconnected to a GitLab Dedicated instance, the public internet, and optionally a customer AWS account that uses AWS PrivateLink.
![Diagram of hosted Runners architecture for GitLab Dedicated.](img/hosted-runners-architecture_v17_3.png)
For more information on how runners authenticate and execute the job payload, see [runner execution flow](https://docs.gitlab.com/runner#runner-execution-flow).
GitLab team members with edit access can update the [source](https://lucid.app/lucidchart/0fb12de8-5236-4d80-9a9c-61c08b714e6f/edit?invitationId=inv_4a12e347-49e8-438e-a28f-3930f936defd) files for the diagram in Lucidchart.

View File

@@ -13,12 +13,11 @@ DETAILS:
The instructions on this page guide you through configuring your GitLab Dedicated instance, including enabling and updating the settings for [available functionality](../../../subscriptions/gitlab_dedicated/index.md#available-features).
Any functionality in the GitLab application that is not controlled by the SaaS environment can be configured by using the [**Admin** area](../../../administration/admin_area.md).
Administrators can configure additional settings in their GitLab application by using the [**Admin** area](../../../administration/admin_area.md).
Examples of SaaS environment settings include `gitlab.rb` configurations and access to shell, Rails console, and PostgreSQL console.
These environment settings cannot be changed by tenants.
As a GitLab-managed solution, you cannot change any GitLab functionality controlled by SaaS environment settings. Examples of such SaaS environment settings include `gitlab.rb` configurations and access to shell, Rails console, and PostgreSQL console.
GitLab Dedicated Engineers also don't have direct access to tenant environments, except for [break glass situations](../../../subscriptions/gitlab_dedicated/index.md#access-controls).
GitLab Dedicated engineers do not have direct access to your environment, except for [break glass situations](../../../subscriptions/gitlab_dedicated/index.md#access-controls).
NOTE:
An instance refers to a GitLab Dedicated deployment, whereas a tenant refers to a customer.
@@ -46,7 +45,7 @@ To make a configuration change:
1. Follow the instructions in the relevant sections below.
For all other instance configurations, submit a support ticket according to the
[configuration change request policy](../configure_instance/index.md#configuration-change-request-policy).
[configuration change request policy](../configure_instance/index.md#request-configuration-changes-with-a-support-ticket).
### Apply configuration changes in Switchboard
@@ -98,9 +97,9 @@ To view the configuration change log:
Each configuration change appears as an entry in the table. Select **View details** to see more information about each change.
## Configuration change request policy
## Request configuration changes with a support ticket
This policy does not apply to configuration changes made by a GitLab Dedicated instance admin using Switchboard.
Certain configuration changes require that you submit a support ticket to request the changes. For more information on how to create a support ticket, see [creating a ticket](https://about.gitlab.com/support/portal/#creating-a-ticket).
Configuration changes requested with a [support ticket](https://support.gitlab.com/hc/en-us/requests/new?ticket_form_id=4414917877650) adhere to the following policies:

View File

@@ -164,7 +164,7 @@ Hosted runners for GitLab Dedicated have the following configurations:
You can also [enable private connections](#outbound-private-link) from hosted runners to your AWS account.
For more information, see the architecture diagram for [Hosted runners for GitLab Dedicated](index.md#hosted-runners-for-gitlab-dedicated).
For more information, see the architecture diagram for [hosted runners for GitLab Dedicated](architecture.md#hosted-runners-for-gitlab-dedicated).
### Outbound private link

View File

@@ -11,75 +11,72 @@ DETAILS:
**Tier:** Ultimate
**Offering:** GitLab Dedicated
GitLab Dedicated is a single-tenant SaaS solution, fully managed and hosted by GitLab.
GitLab Dedicated operators and tenant administrators can use Switchboard to provision, configure, and maintain their tenant environments.
Use GitLab Dedicated to run GitLab on a fully-managed, single-tenant instance hosted on AWS. You maintain control over your instance configuration through Switchboard, the GitLab Dedicated management portal, while GitLab manages the underlying infrastructure.
For more information about this offering, see the [subscription page](../../subscriptions/gitlab_dedicated/index.md).
## Architecture
## Architecture overview
This page collects a set of architectural documents and diagrams for GitLab Dedicated.
GitLab Dedicated runs on a secure infrastructure that provides:
### High-level overview
- A fully isolated tenant environment in AWS
- High availability with automated failover
- Geo-based disaster recovery
- Regular updates and maintenance
- Enterprise-grade security controls
The following diagram shows a high-level overview of the architecture for GitLab Dedicated,
where various AWS accounts managed by GitLab and customers are controlled by a Switchboard application.
To learn more, see [GitLab Dedicated Architecture](architecture.md).
![Diagram of a high-level overview of the GitLab Dedicated architecture.](img/high_level_architecture_diagram_v18_0.png)
## Configure infrastructure
When managing GitLab Dedicated tenant instances:
| Feature | How it works | Set up with |
|------------|-------------|---------------------|
| [Instance sizing](../../subscriptions/gitlab_dedicated/data_residency_and_high_availability.md#availability-and-scalability) | You select an instance size based on your user count. GitLab provisions and maintains the infrastructure. | Onboarding |
| [AWS data regions](../../subscriptions/gitlab_dedicated/data_residency_and_high_availability.md#available-aws-regions) | You choose regions for primary operations, disaster recovery, and backup. GitLab replicates your data across these regions. | Onboarding |
| [Maintenance windows](maintenance.md#maintenance-windows) | You select a weekly 4-hour maintenance window. GitLab performs updates, configuration changes, and security patches during this time. | Onboarding |
| [Release management](maintenance.md#release-rollout-schedule) | GitLab updates your instance monthly with new features and security patches. | Available by <br>default |
| [Geo disaster recovery](create_instance.md#step-2-create-your-gitlab-dedicated-instance) | You choose the secondary region during onboarding. GitLab maintains a replicated secondary site in your chosen region using Geo. | Onboarding |
| [Backup and recovery](../../subscriptions/gitlab_dedicated/data_residency_and_high_availability.md#disaster-recovery) | GitLab backs up your data to your chosen AWS region. | Available by <br>default |
- Switchboard is responsible for managing global configuration shared between the AWS cloud providers, accessible by tenants.
- Amp is responsible for the interaction with the customer tenant accounts, such as configuring expected roles and policies, enabling the required services, and provisioning environments.
## Secure your instance
GitLab team members with edit access can update the [source](https://lucid.app/lucidchart/e0b6661c-6c10-43d9-8afa-1fe0677e060c/edit?page=0_0#) files for the diagram in Lucidchart.
| Feature | How it works | Set up with |
|------------|-------------|-----------------|
| [Encryption (BYOK)](create_instance.md#encrypted-data-at-rest-byok) | You provide AWS KMS keys for data encryption. GitLab integrates these keys with your instance. | Onboarding |
| [SAML SSO](configure_instance/saml.md) | You configure the connection to your identity provider. GitLab handles the authentication flow. | Switchboard |
| [IP allowlists](configure_instance/network_security.md#ip-allowlist) | You specify approved IP addresses. GitLab blocks unauthorized access attempts. | Switchboard |
| [Custom certificates](configure_instance/network_security.md#custom-certificates) | You import your SSL certificates. GitLab maintains secure connections to your private services. | Switchboard |
| [Compliance frameworks](../../subscriptions/gitlab_dedicated/index.md#monitoring) | GitLab maintains compliance with SOC 2, ISO 27001, and other frameworks. You can access reports through the [Trust Center](https://trust.gitlab.com/?product=gitlab-dedicated). | Available by <br>default |
| [Emergency access protocols](../../subscriptions/gitlab_dedicated/index.md#access-controls) | GitLab provides controlled break-glass procedures for urgent situations. | Available by <br>default |
### Tenant network
## Set up networking
The customer tenant account is a single AWS cloud provider account. The single account provides full tenancy isolation, in its own VPC, and with its own resource quotas.
| Feature | How it works | Set up with |
|------------|-------------|-----------------|
| [Custom hostname (BYOD)](configure_instance/network_security.md#bring-your-own-domain-byod) | You provide a domain name and configure DNS records. GitLab manages SSL certificates through Let's Encrypt. | Support ticket |
| [Inbound Private Link](configure_instance/network_security.md#inbound-private-link) | You request secure AWS VPC connections. GitLab configures PrivateLink endpoints in your VPC. | Support ticket |
| [Outbound Private Link](configure_instance/network_security.md#outbound-private-link) | You create the endpoint service in your AWS account. GitLab establishes connections using your service endpoints. | Switchboard |
| [Private hosted zones](configure_instance/network_security.md#private-hosted-zones) | You define internal DNS requirements. GitLab configures DNS resolution in your instance network. | Switchboard |
The cloud provider account is where a highly resilient GitLab installation resides, in its own isolated VPC. On provisioning, the customer tenant gets access to a High Availability (HA) GitLab primary site and a GitLab Geo secondary site.
## Use platform tools
![Diagram of GitLab-managed AWS accounts in an isolated VPC containing a highly resilient GitLab installation.](img/tenant_network_diagram_v18_0.png)
| Feature | How it works | Set up with |
|------------|-------------|-----------------|
| [GitLab Pages](../../subscriptions/gitlab_dedicated/index.md#gitlab-pages) | GitLab hosts your static websites on a dedicated domain. You can publish sites from your repositories. | Available by <br>default |
| [Advanced search](../../integration/advanced_search/elasticsearch.md) | GitLab maintains the search infrastructure. You can search across your code, issues, and merge requests. | Available by <br>default |
| [Hosted runners (beta)](hosted_runners.md) | You purchase a subscription and configure your hosted runners. GitLab manages the auto-scaling CI/CD infrastructure. | Switchboard |
GitLab team members with edit access can update the [source](https://lucid.app/lucidchart/0815dd58-b926-454e-8354-c33fe3e7bff0/edit?invitationId=inv_a6b618ff-6c18-4571-806a-bfb3fe97cb12) files for the diagram in Lucidchart.
## Manage daily operations
#### Gitaly setup
GitLab Dedicated deploys Gitaly [in a sharded setup](../../administration/gitaly/index.md#before-deploying-gitaly-cluster), not a Gitaly Cluster. In this setup:
- Customer repositories are spread across multiple virtual machines.
- GitLab manages [storage weights](../../administration/repository_storage_paths.md#configure-where-new-repositories-are-stored) on behalf of the customer.
#### Geo setup
GitLab Dedicated leverages GitLab Geo for [disaster recovery](../../subscriptions/gitlab_dedicated/data_residency_and_high_availability.md#disaster-recovery).
Geo does not use an active-active failover configuration. For more information, see [Geo](../../administration/geo/index.md).
#### AWS PrivateLink connection (optional)
Optionally, private connectivity is available for your GitLab Dedicated instance, using [AWS PrivateLink](https://aws.amazon.com/privatelink/) as a connection gateway.
Both [inbound](../../administration/dedicated/configure_instance/network_security.md#inbound-private-link) and [outbound](../../administration/dedicated/configure_instance/network_security.md#outbound-private-link) private links are supported.
![Diagram of a GitLab-managed AWS VPC using AWS PrivateLink to connect with a customer-managed AWS VPC.](img/privatelink_diagram_v17_1.png)
GitLab team members with edit access can update the [source](https://lucid.app/lucidchart/933b958b-bfad-4898-a8ae-182815f159ca/edit?invitationId=inv_38b9a265-dff2-4db6-abdb-369ea1e92f5f) files for the diagram in Lucidchart.
### Hosted runners for GitLab Dedicated
The following diagram illustrates a GitLab-managed AWS account that contains GitLab runners, which are interconnected to a GitLab Dedicated instance, the public internet, and optionally a customer AWS account that uses AWS PrivateLink.
![Diagram of hosted Runners architecture for GitLab Dedicated.](img/hosted-runners-architecture_v17_3.png)
For more information on how runners authenticate and execute the job payload, see [Runner execution flow](https://docs.gitlab.com/runner#runner-execution-flow).
GitLab team members with edit access can update the [source](https://lucid.app/lucidchart/0fb12de8-5236-4d80-9a9c-61c08b714e6f/edit?invitationId=inv_4a12e347-49e8-438e-a28f-3930f936defd) files for the diagram in Lucidchart.
| Feature | How it works | Set up with |
|------------|-------------|-----------------|
| [Application logs](monitor.md) | GitLab delivers logs to your AWS S3 bucket. You can request access to monitor instance activity through these logs. | Support ticket |
| [Email service](configure_instance/users_notifications.md#smtp-email-service) | GitLab provides AWS SES by default to send emails from your GitLab Dedicated instance. You can also configure your own SMTP email service. | Support ticket for <br/>custom service |
| [Switchboard access and <br>notifications](configure_instance/users_notifications.md) | You manage Switchboard permissions and notification settings. GitLab maintains the Switchboard infrastructure. | Switchboard |
## Get started
To get started with GitLab Dedicated, use Switchboard to:
To get started with GitLab Dedicated:
1. [Create your GitLab Dedicated instance](../../administration/dedicated/create_instance.md).
1. [Configure your GitLab Dedicated instance](../../administration/dedicated/configure_instance/index.md).

View File

@@ -249,7 +249,8 @@ NOTE:
Tracing is available in the Development and Testing environments only.
It's not available in the Production environment.
1. Access to [LangSmith](https://smith.langchain.com/) site and create an account (You can also be added to GitLab organization).
1. Access [LangSmith](https://smith.langchain.com/) and create an account.
1. Optional: [Create an Access Request](https://gitlab.com/gitlab-com/team-member-epics/access-requests/-/issues/new?issuable_template=Individual_Bulk_Access_Request) to be added to the GitLab organization in LangSmith.
1. Create [an API key](https://docs.smith.langchain.com/#create-an-api-key). Be careful where you create the API key: keys can be created in your personal namespace or in the GitLab namespace.
1. Set the following environment variables in GDK. You can define them in `env.runit` or export them directly in the terminal.

View File

@@ -6,3 +6,6 @@ Gemfile/MissingFeatureCategory:
Search/NamespacedClass:
Enabled: false
RSpec/MultipleMemoizedHelpers:
Max: 25

View File

@@ -53,7 +53,7 @@ ActiveContext.configure do |config|
config.databases = {
es1: {
adapter: 'elasticsearch',
adapter: 'ActiveContext::Databases::Elasticsearch::Adapter',
prefix: 'gitlab_active_context',
options: ::Gitlab::CurrentSettings.elasticsearch_config
}
@@ -70,6 +70,36 @@ end
| `client_request_timeout` | The timeout for client requests in seconds | No | N/A | `60` |
| `retry_on_failure` | The number of times to retry a failed request | No | `0` (no retries) | `3` |
| `debug` | Enable or disable debug logging | No | `false` | `true` |
| `max_bulk_size_bytes` | Maximum size in bytes before forcing a bulk operation | No | `10.megabytes` | `5242880` |
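A configuration sketch combining these options (the `url` and `prefix` values are placeholders, and passing the options as a literal hash assumes the adapter forwards them to its client and indexer):

```ruby
ActiveContext.configure do |config|
  config.databases = {
    es1: {
      adapter: 'ActiveContext::Databases::Elasticsearch::Adapter',
      prefix: 'gitlab_active_context',
      options: {
        url: 'http://localhost:9200',
        retry_on_failure: 3,          # retry failed requests up to 3 times
        debug: true,                  # enable debug logging
        max_bulk_size_bytes: 5_242_880 # force a bulk flush at 5 MB
      }
    }
  }
end
```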
### Scheduling a cron worker for async processing
Create a file which includes the `BulkAsyncProcess` concern and other worker-specific concerns:
```ruby
# frozen_string_literal: true
module Ai
module Context
class BulkProcessWorker
include ActiveContext::Concerns::BulkAsyncProcess
include ::ApplicationWorker
include ::CronjobQueue
include Search::Worker
include Gitlab::ExclusiveLeaseHelpers
prepend ::Geo::SkipSecondary
idempotent!
worker_resource_boundary :cpu
urgency :low
data_consistency :sticky
loggable_arguments 0, 1
end
end
end
```
Schedule the worker on a cron schedule in `config/initializers/1_settings.rb`.
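A hedged sketch of such an entry, following the existing `Settings.cron_jobs` pattern in `config/initializers/1_settings.rb` (the job key and schedule below are illustrative):

```ruby
# Run the bulk async processor every 5 minutes (placeholder schedule).
Settings.cron_jobs['ai_context_bulk_process_worker'] ||= {}
Settings.cron_jobs['ai_context_bulk_process_worker']['cron'] ||= '*/5 * * * *'
Settings.cron_jobs['ai_context_bulk_process_worker']['job_class'] = 'Ai::Context::BulkProcessWorker'
```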
### Registering a queue

View File

@@ -0,0 +1,52 @@
# frozen_string_literal: true
module ActiveContext
class BulkProcessor
attr_reader :failures, :adapter
def initialize
@failures = []
@adapter = ActiveContext.adapter
end
def process(ref)
send_bulk if @adapter.add_ref(ref)
end
def flush
send_bulk.failures
end
private
def send_bulk
return self if adapter.empty?
failed_refs = try_send_bulk
logger.info(
'message' => 'bulk_submitted',
'meta.indexing.bulk_count' => adapter.all_refs.size,
'meta.indexing.errors_count' => failed_refs.count
)
failures.push(*failed_refs)
adapter.reset
self
end
def try_send_bulk
result = adapter.bulk
adapter.process_bulk_errors(result)
rescue StandardError => e
logger.error(message: 'bulk_exception', error_class: e.class.to_s, error_message: e.message)
adapter.all_refs
end
def logger
@logger ||= ActiveContext::Config.logger
end
end
end

View File

@@ -4,17 +4,24 @@ module ActiveContext
module Databases
module Concerns
module Adapter
attr_reader :client
attr_reader :options, :client, :indexer
delegate :search, to: :client
delegate :all_refs, :add_ref, :empty?, :bulk, :process_bulk_errors, :reset, to: :indexer
def initialize(options)
@options = options
@client = client_klass.new(options)
@indexer = indexer_klass.new(options, client)
end
def client_klass
raise NotImplementedError
end
def indexer_klass
raise NotImplementedError
end
end
end
end

View File

@@ -0,0 +1,61 @@
# frozen_string_literal: true
# rubocop: disable Gitlab/ModuleWithInstanceVariables -- this is a concern
module ActiveContext
module Databases
module Concerns
module Indexer
attr_reader :options, :client, :refs
def initialize(options, client)
@options = options
@client = client
@refs = []
end
def all_refs
refs
end
# Adds a reference to the refs array
#
# @param ref [Object] The reference to add
# @return [Boolean] True if bulk processing should be forced, e.g., when a size threshold is reached
def add_ref(ref)
raise NotImplementedError
end
# Checks if nothing should be processed
#
# @return [Boolean] True if bulk processing should be skipped
def empty?
raise NotImplementedError
end
# Performs bulk processing on the refs array
#
# @return [Object] The result of bulk processing
def bulk
raise NotImplementedError
end
# Processes errors from bulk operation
#
# @param result [Object] The result from the bulk operation
# @return [Array] Any failures that occurred during bulk processing
def process_bulk_errors(_result)
raise NotImplementedError
end
# Resets the adapter to a clean state
def reset
@refs = []
# also reset anything that builds up from the refs array
end
end
end
end
end
# rubocop: enable Gitlab/ModuleWithInstanceVariables
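A minimal, self-contained sketch of an indexer honoring this contract. The `IndexerConcern` module below is a trimmed stand-in for the concern above, and `CountingIndexer` with its batch size is hypothetical; it flushes on a count threshold instead of a byte threshold:

```ruby
# Trimmed copy of the concern's shared state and defaults.
module IndexerConcern
  attr_reader :options, :client, :refs

  def initialize(options, client)
    @options = options
    @client = client
    @refs = []
  end

  def all_refs
    refs
  end

  def reset
    @refs = []
  end
end

# Toy indexer: signals a bulk flush once three refs accumulate.
class CountingIndexer
  include IndexerConcern

  BATCH_SIZE = 3

  # Returns true when the caller should send a bulk now.
  def add_ref(ref)
    @refs << ref
    @refs.size >= BATCH_SIZE
  end

  def empty?
    refs.empty?
  end
end

indexer = CountingIndexer.new({}, nil)
indexer.add_ref(:a) # => false
indexer.add_ref(:b) # => false
indexer.add_ref(:c) # => true, threshold reached
indexer.reset
indexer.empty?      # => true
```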

View File

@@ -9,6 +9,10 @@ module ActiveContext
def client_klass
ActiveContext::Databases::Elasticsearch::Client
end
def indexer_klass
ActiveContext::Databases::Elasticsearch::Indexer
end
end
end
end

View File

@@ -6,6 +6,8 @@ module ActiveContext
class Client
include ActiveContext::Databases::Concerns::Client
delegate :bulk, to: :client
OPEN_TIMEOUT = 5
NO_RETRY = 0

View File

@@ -0,0 +1,99 @@
# frozen_string_literal: true
module ActiveContext
module Databases
module Elasticsearch
class Indexer
include ActiveContext::Databases::Concerns::Indexer
DEFAULT_MAX_BULK_SIZE = 10.megabytes
attr_reader :operations, :bulk_size
def initialize(...)
super
@operations = []
@bulk_size = 0
end
def add_ref(ref)
operation = build_operation(ref)
@refs << ref
@operations << operation
@bulk_size += calculate_operation_size(operation)
bulk_size >= bulk_threshold
end
def empty?
operations.empty?
end
def bulk
client.bulk(body: operations.flatten)
end
def process_bulk_errors(result)
return [] unless result['errors']
failed_refs = []
result['items'].each_with_index do |item, index|
op = item['index'] || item['update'] || item['delete']
next unless op.nil? || op['error']
ref = refs[index]
logger.warn(
'message' => 'indexing_failed',
'meta.indexing.error' => op&.dig('error') || 'Operation was nil',
'meta.indexing.status' => op&.dig('status'),
'meta.indexing.operation_type' => item.each_key.first,
'meta.indexing.ref' => ref.serialize,
'meta.indexing.identifier' => ref.identifier
)
failed_refs << ref
end
failed_refs
end
def reset
super
@operations = []
@bulk_size = 0
end
private
def build_operation(ref)
case ref.operation.to_sym
when :index, :upsert
[
{ update: { _index: ref.index_name, _id: ref.identifier, routing: ref.routing }.compact },
{ doc: ref.as_indexed_json, doc_as_upsert: true }
]
when :delete
[{ delete: { _index: ref.index_name, _id: ref.identifier, routing: ref.routing }.compact }]
else
raise StandardError, "Operation #{ref.operation} is not supported"
end
end
def calculate_operation_size(operation)
operation.to_json.bytesize + 2 # Account for newlines
end
def bulk_threshold
@bulk_threshold ||= options[:max_bulk_size_bytes] || DEFAULT_MAX_BULK_SIZE
end
def logger
@logger ||= ActiveContext::Config.logger
end
end
end
end
end
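The operation pairs that `build_operation` emits have the shape below (field values are illustrative, mirroring what the indexer reads from a ref). Note how `compact` drops a `nil` routing, and how the size accounting adds two bytes per operation for the NDJSON newlines:

```ruby
require 'json'

# Pair emitted for an :index/:upsert ref: action metadata, then the document.
upsert = [
  { update: { _index: 'issues', _id: '1', routing: 'group_1' }.compact },
  { doc: { title: 'Test Issue' }, doc_as_upsert: true }
]

# Single entry emitted for a :delete ref; compact removes the nil routing.
delete = [{ delete: { _index: 'issues', _id: '1', routing: nil }.compact }]

# Mirrors calculate_operation_size: serialized bytes plus two newlines.
size = upsert.to_json.bytesize + 2
```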

View File

@@ -9,6 +9,10 @@ module ActiveContext
def client_klass
ActiveContext::Databases::Opensearch::Client
end
def indexer_klass
ActiveContext::Databases::Opensearch::Indexer
end
end
end
end

View File

@@ -0,0 +1,11 @@
# frozen_string_literal: true
module ActiveContext
module Databases
module Opensearch
class Indexer
include ActiveContext::Databases::Concerns::Indexer
end
end
end
end

View File

@@ -9,6 +9,10 @@ module ActiveContext
def client_klass
ActiveContext::Databases::Postgresql::Client
end
def indexer_klass
ActiveContext::Databases::Postgresql::Indexer
end
end
end
end

View File

@@ -0,0 +1,11 @@
# frozen_string_literal: true
module ActiveContext
module Databases
module Postgresql
class Indexer
include ActiveContext::Databases::Concerns::Indexer
end
end
end
end

View File

@@ -0,0 +1,114 @@
# frozen_string_literal: true
RSpec.describe ActiveContext::BulkProcessor do
let(:adapter) { ActiveContext::Databases::Elasticsearch::Adapter.new(url: 'http://localhost:9200') }
let(:logger) { instance_double(Logger) }
let(:ref) { double }
before do
allow(ActiveContext).to receive(:adapter).and_return(adapter)
allow(ActiveContext::Config).to receive(:logger).and_return(logger)
allow(logger).to receive(:info)
allow(logger).to receive(:error)
allow(ref).to receive_messages(
operation: :index,
id: 1,
as_indexed_json: { title: 'Test Issue' },
index_name: 'issues',
identifier: '1',
routing: 'group_1'
)
end
describe '#initialize' do
it 'initializes with empty failures and the correct adapter' do
processor = described_class.new
expect(processor.failures).to be_empty
expect(processor.adapter).to be_a(ActiveContext::Databases::Elasticsearch::Adapter)
end
end
describe '#process' do
let(:processor) { described_class.new }
it 'adds ref to adapter and calls send_bulk if it returns true' do
allow(adapter).to receive(:add_ref).and_return(true)
expect(processor).to receive(:send_bulk).once
processor.process(ref)
end
it 'adds ref to adapter and does not call send_bulk if it returns false' do
allow(adapter).to receive(:add_ref).and_return(false)
expect(processor).not_to receive(:send_bulk)
processor.process(ref)
end
end
describe '#flush' do
let(:processor) { described_class.new }
it 'calls send_bulk and returns failures' do
allow(processor).to receive(:send_bulk).and_return(processor)
expect(processor.flush).to eq([])
end
end
describe '#send_bulk' do
let(:processor) { described_class.new }
before do
processor.process(ref)
end
it 'processes bulk and logs info' do
allow(adapter).to receive(:bulk).and_return({ 'items' => [] })
expect(logger).to receive(:info).with(
'message' => 'bulk_submitted',
'meta.indexing.bulk_count' => 1,
'meta.indexing.errors_count' => 0
)
processor.send(:send_bulk)
end
it 'resets the adapter after processing' do
allow(adapter).to receive(:bulk).and_return({ 'items' => [] })
expect(adapter).to receive(:reset)
processor.send(:send_bulk)
end
end
describe '#try_send_bulk' do
let(:processor) { described_class.new }
before do
processor.process(ref)
end
context 'when bulk processing succeeds' do
it 'returns empty array' do
allow(adapter).to receive(:bulk).and_return({ 'items' => [] })
expect(processor.send(:try_send_bulk)).to eq([])
end
end
context 'when bulk processing fails' do
it 'logs error and returns all refs' do
allow(adapter).to receive(:bulk).and_raise(StandardError.new('Bulk processing failed'))
expect(logger).to receive(:error).with(
message: 'bulk_exception',
error_class: 'StandardError',
error_message: 'Bulk processing failed'
)
expect(processor.send(:try_send_bulk)).to eq([ref])
end
end
end
end
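The processor behavior specced above follows a buffer-and-flush pattern: refs accumulate in the adapter, and a bulk send fires either when the adapter signals its threshold was hit or when `#flush` forces one. A minimal sketch of that pattern, with hypothetical names (none of these are ActiveContext's real classes or methods):

```ruby
# Minimal sketch of the buffer-and-flush pattern the processor spec above
# exercises. All names here are illustrative stand-ins, not ActiveContext's
# real API: refs accumulate, and a bulk send fires only once a threshold is hit.
class TinyBulkProcessor
  attr_reader :failures

  def initialize(threshold: 2)
    @threshold = threshold
    @buffer = []
    @failures = []
  end

  # Mirrors #process: returns true only when the buffered refs were flushed.
  def process(ref)
    @buffer << ref
    return false if @buffer.size < @threshold

    send_bulk
    true
  end

  # Mirrors #flush: force a send and report anything that failed.
  def flush
    send_bulk
    failures
  end

  private

  def send_bulk
    # A real implementation would submit @buffer to the datastore and collect
    # per-item errors into @failures; here we simply drain the buffer.
    @buffer.clear
  end
end
```

The spec's `send_bulk`/`try_send_bulk` split layers error capture on top of this skeleton: on a raised exception it logs `bulk_exception` and treats every buffered ref as failed.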


@@ -0,0 +1,121 @@
# frozen_string_literal: true
RSpec.describe ActiveContext::Databases::Elasticsearch::Indexer do
let(:es_client) { instance_double(Elasticsearch::Client) }
let(:logger) { instance_double(Logger, warn: nil) }
let(:options) { {} }
let(:indexer) { described_class.new(options, es_client) }
let(:ref) { double }
before do
allow(ActiveContext::Config).to receive(:logger).and_return(logger)
allow(ref).to receive_messages(
operation: :index,
id: 1,
as_indexed_json: { title: 'Test Issue' },
index_name: 'issues',
identifier: '1',
routing: 'group_1',
serialize: 'issue 1 group_1'
)
end
describe '#initialize' do
it 'initializes with empty operations and zero bulk size' do
expect(indexer.operations).to be_empty
expect(indexer.bulk_size).to eq(0)
end
end
describe '#add_ref' do
it 'adds the ref and returns true when bulk threshold is reached' do
allow(indexer).to receive(:bulk_threshold).and_return(1)
expect(indexer.add_ref(ref)).to be true
expect(indexer.operations).not_to be_empty
end
it 'adds the ref and returns false when bulk threshold is not reached' do
allow(indexer).to receive(:bulk_threshold).and_return(1000000)
expect(indexer.add_ref(ref)).to be false
expect(indexer.operations).not_to be_empty
end
it 'raises an error for unsupported operations' do
allow(ref).to receive(:operation).and_return(:unsupported)
expect { indexer.add_ref(ref) }.to raise_error(StandardError, /Operation unsupported is not supported/)
end
end
describe '#empty?' do
it 'returns true when there are no operations' do
expect(indexer).to be_empty
end
it 'returns false when there are operations' do
indexer.instance_variable_set(:@operations, [{}])
expect(indexer).not_to be_empty
end
end
describe '#bulk' do
before do
indexer.instance_variable_set(:@operations, [{ index: {} }])
end
it 'calls bulk on the client with flattened operations' do
expect(es_client).to receive(:bulk).with(body: [{ index: {} }])
indexer.bulk
end
end
describe '#process_bulk_errors' do
before do
indexer.instance_variable_set(:@refs, [ref])
end
context 'when there are no errors' do
it 'returns an empty array' do
result = { 'errors' => false }
expect(indexer.process_bulk_errors(result)).to be_empty
end
end
context 'when there are errors' do
let(:result) do
{
'errors' => true,
'items' => [
{ 'index' => { 'error' => 'Error message', 'status' => 400 } }
]
}
end
it 'logs warnings and returns failed refs' do
expect(logger).to receive(:warn).with(
'message' => 'indexing_failed',
'meta.indexing.error' => 'Error message',
'meta.indexing.status' => 400,
'meta.indexing.operation_type' => 'index',
'meta.indexing.ref' => 'issue 1 group_1',
'meta.indexing.identifier' => '1'
)
failed_refs = indexer.process_bulk_errors(result)
expect(failed_refs).to eq([ref])
end
end
end
describe '#reset' do
before do
indexer.instance_variable_set(:@operations, [{}])
indexer.instance_variable_set(:@bulk_size, 100)
end
it 'resets operations and bulk size' do
indexer.reset
expect(indexer.operations).to be_empty
expect(indexer.bulk_size).to eq(0)
end
end
end
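The `#process_bulk_errors` examples above check that failed refs are recovered from an Elasticsearch-style bulk response. A hedged sketch of that extraction, assuming one response item per submitted ref (the helper name and the positional pairing are illustrative simplifications, not the gem's API):

```ruby
# Hypothetical helper mirroring the behavior asserted in #process_bulk_errors:
# when the bulk response flags errors, each item in 'items' is paired
# positionally with the ref that produced it, and refs whose item carries an
# 'error' key are returned as failures.
def failed_refs(refs, result)
  return [] unless result['errors']

  result['items'].each_with_index.filter_map do |item, index|
    action = item.values.first # the hash under 'index', 'delete', etc.
    refs[index] if action['error']
  end
end
```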


@@ -1,16 +1,13 @@
# frozen_string_literal: true
-require "active_context"
+require 'active_context'
require 'active_support/all'
require 'logger'
require 'elasticsearch'
require 'opensearch'
require 'aws-sdk-core'
require 'active_support/concern'
require 'redis'
require 'byebug'
require 'active_support'
require 'active_support/core_ext/numeric/time'
require 'active_context/concerns/bulk_async_process'
Dir[File.join(__dir__, 'support/**/*.rb')].each { |f| require f }


@@ -277,16 +277,16 @@ module Gitlab
# `remote_storage: 'gitaly-2'`. And then the metadata would say
# "gitaly-2 is at network address tcp://10.0.1.2:8075".
#
-def self.call(storage, service, rpc, request, remote_storage: nil, timeout: default_timeout, &block)
-Gitlab::GitalyClient::Call.new(storage, service, rpc, request, remote_storage, timeout).call(&block)
+def self.call(storage, service, rpc, request, remote_storage: nil, timeout: default_timeout, gitaly_context: {}, &block)
+Gitlab::GitalyClient::Call.new(storage, service, rpc, request, remote_storage, timeout, gitaly_context: gitaly_context).call(&block)
end
-def self.execute(storage, service, rpc, request, remote_storage:, timeout:)
+def self.execute(storage, service, rpc, request, remote_storage:, timeout:, gitaly_context: {})
enforce_gitaly_request_limits(:call)
Gitlab::RequestContext.instance.ensure_deadline_not_exceeded!
raise_if_concurrent_ruby!
-kwargs = request_kwargs(storage, timeout: timeout.to_f, remote_storage: remote_storage)
+kwargs = request_kwargs(storage, timeout: timeout.to_f, remote_storage: remote_storage, gitaly_context: gitaly_context)
kwargs = yield(kwargs) if block_given?
stub(service, storage).__send__(rpc, request, kwargs) # rubocop:disable GitlabSecurity/PublicSend
@@ -324,12 +324,12 @@ module Gitlab
end
private_class_method :authorization_token
-def self.request_kwargs(storage, timeout:, remote_storage: nil)
+def self.request_kwargs(storage, timeout:, remote_storage: nil, gitaly_context: {})
metadata = {
'authorization' => "Bearer #{authorization_token(storage)}",
'client_name' => CLIENT_NAME
}
-gitaly_context = {}
relative_path = fetch_relative_path
::Gitlab::Auth::Identity.currently_linked do |identity|


@@ -3,7 +3,7 @@
module Gitlab
module GitalyClient
class Call
def initialize(storage, service, rpc, request, remote_storage, timeout)
def initialize(storage, service, rpc, request, remote_storage, timeout, gitaly_context: {})
@storage = storage
@service = service
@rpc = rpc
@@ -11,11 +11,12 @@ module Gitlab
@remote_storage = remote_storage
@timeout = timeout
@duration = 0
@gitaly_context = gitaly_context
end
def call(&block)
response = recording_request do
-GitalyClient.execute(@storage, @service, @rpc, @request, remote_storage: @remote_storage, timeout: @timeout, &block)
+GitalyClient.execute(@storage, @service, @rpc, @request, remote_storage: @remote_storage, timeout: @timeout, gitaly_context: @gitaly_context, &block)
end
if response.is_a?(Enumerator)


@@ -41085,6 +41085,9 @@ msgstr ""
msgid "Password confirmation"
msgstr ""
msgid "Password is required."
msgstr ""
msgid "Password of the Jenkins server."
msgstr ""
@@ -62308,6 +62311,9 @@ msgstr ""
msgid "Username or primary email"
msgstr ""
msgid "Username or primary email is required."
msgstr ""
msgid "Username:"
msgstr ""


@@ -11,9 +11,10 @@ RSpec.describe Gitlab::GitalyClient::Call, feature_category: :gitaly do
let(:rpc) { :find_local_branches }
let(:service) { :ref_service }
let(:timeout) { client.long_timeout }
let(:gitaly_context) { { key: :value } }
subject do
-described_class.new(storage, service, rpc, request, remote_storage, timeout).call
+described_class.new(storage, service, rpc, request, remote_storage, timeout, gitaly_context: gitaly_context).call
end
before do
@@ -34,6 +35,17 @@ RSpec.describe Gitlab::GitalyClient::Call, feature_category: :gitaly do
)
end
it 'proxies the provided arguments to GitalyClient.execute' do
response = 'response'
expect(client).to receive(:execute).with(
storage, service, rpc, request,
remote_storage: remote_storage, timeout: timeout, gitaly_context: gitaly_context
).and_return(response)
expect(subject).to eq(response)
end
context 'when the response is not an enumerator' do
let(:response) do
Gitaly::FindLocalBranchesResponse.new


@@ -394,6 +394,26 @@ RSpec.describe Gitlab::GitalyClient, feature_category: :gitaly do
expect(results[:metadata]).to include('gitaly-session-id')
end
context 'with gitaly_context' do
let(:gitaly_context) { { key: :value } }
it 'passes context as "gitaly-client-context-bin"' do
kwargs = described_class.request_kwargs('default', timeout: 1, gitaly_context: gitaly_context)
expect(kwargs[:metadata]['gitaly-client-context-bin']).to eq(gitaly_context.to_json)
end
context 'with an empty context' do
let(:gitaly_context) { {} }
it 'does not provide "gitaly-client-context-bin"' do
kwargs = described_class.request_kwargs('default', timeout: 1, gitaly_context: gitaly_context)
expect(kwargs[:metadata]).not_to have_key('gitaly-client-context-bin')
end
end
end
context 'when RequestStore is not enabled' do
it 'sets a different gitaly-session-id per request' do
gitaly_session_id = described_class.request_kwargs('default', timeout: 1)[:metadata]['gitaly-session-id']
@@ -959,12 +979,36 @@ RSpec.describe Gitlab::GitalyClient, feature_category: :gitaly do
end
end
describe '.call' do
subject(:call) do
described_class.call(storage, service, rpc, request, remote_storage: remote_storage, timeout: timeout, gitaly_context: gitaly_context)
end
let(:storage) { 'default' }
let(:service) { :ref_service }
let(:rpc) { :find_local_branches }
let(:request) { Gitaly::FindLocalBranchesRequest.new }
let(:remote_storage) { nil }
let(:timeout) { 10.seconds }
let(:gitaly_context) { { key: :value } }
it 'initializes a Gitlab::GitalyClient::Call instance with the provided arguments' do
expect(Gitlab::GitalyClient::Call).to receive(:new).with(
storage, service, rpc, request, remote_storage, timeout, gitaly_context: gitaly_context
).and_call_original
call
end
end
describe '.execute' do
subject(:execute) do
described_class.execute('default', :ref_service, :find_local_branches, Gitaly::FindLocalBranchesRequest.new,
-remote_storage: nil, timeout: 10.seconds)
+remote_storage: nil, timeout: 10.seconds, gitaly_context: gitaly_context)
end
let(:gitaly_context) { {} }
it 'raises an exception when running within a concurrent Ruby thread' do
Thread.current[:restrict_within_concurrent_ruby] = true
@@ -973,5 +1017,17 @@ RSpec.describe Gitlab::GitalyClient, feature_category: :gitaly do
Thread.current[:restrict_within_concurrent_ruby] = nil
end
context 'with gitaly_context' do
let(:gitaly_context) { { key: :value } }
it 'passes the gitaly_context to .request_kwargs' do
expect(described_class).to receive(:request_kwargs).with(
'default', timeout: 10.seconds, remote_storage: nil, gitaly_context: gitaly_context
).and_call_original
execute
end
end
end
end
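Taken together, the `gitaly_context` specs above pin down a simple encoding rule: a non-empty context is JSON-serialized under the `gitaly-client-context-bin` metadata key, while an empty context adds no key at all. A minimal sketch of that rule (the helper name is ours; the key name and JSON encoding come from the spec):

```ruby
require 'json'

# Sketch of the metadata behavior the request_kwargs specs assert: serialize a
# non-empty gitaly_context to JSON under 'gitaly-client-context-bin', and add
# nothing for an empty context. Hypothetical helper, not the real client code.
def context_metadata(gitaly_context)
  metadata = {}
  metadata['gitaly-client-context-bin'] = gitaly_context.to_json unless gitaly_context.empty?
  metadata
end
```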


@@ -4,8 +4,8 @@ require 'spec_helper'
RSpec.describe Gitlab::ImportExport::Project::RelationFactory, :use_clean_rails_memory_store_caching, feature_category: :importers do
let(:group) { create(:group, maintainers: importer_user) }
let(:project) { create(:project, :repository, group: group) }
let(:members_mapper) { double('members_mapper').as_null_object }
let(:admin) { create(:admin) }
let(:importer_user) { admin }
let(:excluded_keys) { [] }
@@ -387,18 +387,19 @@ RSpec.describe Gitlab::ImportExport::Project::RelationFactory, :use_clean_rails_
# `project_id`, `described_class.USER_REFERENCES`, noteable_id, target_id, and some project IDs are already
# re-assigned by described_class.
context 'Potentially hazardous foreign keys' do
let(:dummy_int) { project.id + 1 } # to avoid setting an integer that equals the current project.id
let(:relation_sym) { :hazardous_foo_model }
let(:relation_hash) do
{
-'integration_id' => 99,
-'moved_to_id' => 99,
-'namespace_id' => 99,
-'ci_id' => 99,
-'random_project_id' => 99,
-'random_id' => 99,
-'milestone_id' => 99,
-'project_id' => 99,
-'user_id' => 99
+'integration_id' => dummy_int,
+'moved_to_id' => dummy_int,
+'namespace_id' => dummy_int,
+'ci_id' => dummy_int,
+'random_project_id' => dummy_int,
+'random_id' => dummy_int,
+'milestone_id' => dummy_int,
+'project_id' => dummy_int,
+'user_id' => dummy_int
}
end
@@ -412,19 +413,20 @@ RSpec.describe Gitlab::ImportExport::Project::RelationFactory, :use_clean_rails_
end
it 'does not preserve any foreign key IDs' do
-expect(created_object.values).not_to include(99)
+expect(created_object.values).to match_array([created_object.project_id])
end
end
context 'overridden model with pluralized name' do
let(:dummy_int) { project.id + 1 } # to avoid setting an integer that equals the current project.id
let(:relation_sym) { :metrics }
let(:relation_hash) do
{
-'id' => 99,
-'merge_request_id' => 99,
+'id' => dummy_int,
+'merge_request_id' => dummy_int,
'merged_at' => Time.now,
-'merged_by_id' => 99,
+'merged_by_id' => dummy_int,
'latest_closed_at' => nil,
'latest_closed_by_id' => nil
}
@@ -436,9 +438,10 @@ RSpec.describe Gitlab::ImportExport::Project::RelationFactory, :use_clean_rails_
end
context 'Project references' do
let(:dummy_int) { project.id + 1 } # to avoid setting an integer that equals the current project.id
let(:relation_sym) { :project_foo_model }
let(:relation_hash) do
-Gitlab::ImportExport::Project::RelationFactory::PROJECT_REFERENCES.map { |ref| { ref => 99 } }.inject(:merge)
+Gitlab::ImportExport::Project::RelationFactory::PROJECT_REFERENCES.map { |ref| { ref => dummy_int } }.inject(:merge)
end
before do
@@ -451,7 +454,7 @@ RSpec.describe Gitlab::ImportExport::Project::RelationFactory, :use_clean_rails_
end
it 'does not preserve any project foreign key IDs' do
-expect(created_object.values).not_to include(99)
+expect(created_object.values).not_to include(dummy_int)
end
end
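The `dummy_int` refactor in the chunk above replaces the literal `99` because a fixed dummy ID can collide with the factory-assigned `project.id`, turning an assertion like `not_to include(99)` into a flaky one. The guard reduces to a single line (sketch with a made-up helper name):

```ruby
# Why the spec derives its dummy from the live id: a fixed literal fails
# whenever a factory-created record happens to receive that exact id, while
# project.id + 1 can never equal project.id. Hypothetical helper for
# illustration only.
def dummy_id_for(project_id)
  project_id + 1
end
```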


@@ -73,7 +73,8 @@ RSpec.describe Ci::JobToken::AuthorizationsCompactor, feature_category: :secrets
expect { compactor.compact(5) }.to raise_error(Gitlab::Utils::TraversalIdCompactor::UnexpectedCompactionEntry)
end
-it 'raises when a redundant compaction entry is found' do
+it 'raises when a redundant compaction entry is found',
+quarantine: 'https://gitlab.com/gitlab-org/gitlab/-/issues/508889' do
allow(Gitlab::Utils::TraversalIdCompactor).to receive(:compact).and_wrap_original do |original_method, *args|
original_response = original_method.call(*args)
original_response << ns6.traversal_ids