mirror of
https://github.com/sourcegraph/sourcegraph.git
synced 2026-02-06 20:51:43 +00:00
Adds a new `postgreSQL.logicalReplication` configuration to allow MSP to generate prerequisite setup for integration with Datastream: https://cloud.google.com/datastream/docs/sources-postgresql. Integration with Datastream allows the Data Analytics team to self-serve data enrichment needs for the Telemetry V2 pipeline. Enabling this feature entails downtime (Cloud SQL instance restart), so enabling the logical replication feature at the Cloud SQL level (`cloudsql.logical_decoding`) is gated behind `postgreSQL.logicalReplication: {}`. Setting up the required stuff in Postgres is a bit complicated, requiring 3 Postgres provider instances: 1. The default admin one, authenticated with our admin user 2. New: a workload identity provider, using https://github.com/cyrilgdn/terraform-provider-postgresql/pull/448 / https://github.com/sourcegraph/managed-services-platform-cdktf/pull/11. This is required for creating a publication on selected tables, which requires being owner of said table. Because tables are created by application using e.g. auto-migrate, the workload identity is always the table owner, so we need to impersonate the IAM user 3. New: a "replication user" which is created with the replication permission. Replication seems to not be a propagated permission so we need a role/user that has replication enabled. A bit more context scattered here and there in the docstrings. Beyond the Postgres configuration we also introduce some additional resources to enable easy Datastream configuration: 1. Datastream Private Connection, which peers to the service private network 2. Cloud SQL Proxy VM, which only allows connections to `:5432` from the range specified in 1, allowing a connection to the Cloud SQL instance 2. Datastream Connection Profile attached to 1 From there, data team can click-ops or manage the Datastream Stream and BigQuery destination on their own. Closes CORE-165 Closes CORE-212 Sample config: ```yaml resources: postgreSQL: databases: - "primary" logicalReplication: publications: - name: testing database: primary tables: - users ``` ## Test plan https://github.com/sourcegraph/managed-services/pull/1569 ## Changelog - MSP services can now configure `postgreSQL.logicalReplication` to enable Data Analytics team to replicate selected database tables into BigQuery. |
||
|---|---|---|
| .. | ||
| imageupdater | ||
| resource | ||
| resourceid | ||
| stack | ||
| terraform | ||