table column standards

drethereum 2024-10-18 15:52:25 -06:00
parent 3353b1538c
commit 203507d83c
16 changed files with 223 additions and 172 deletions

README.md
View File

@ -4,91 +4,107 @@
----
```yml
<chain>: # replace <chain>/<CHAIN> with the chain profile name; remove this comment in your yml
  target: dev
  outputs:
    dev:
      type: snowflake
      account: <ACCOUNT>
      role: INTERNAL_DEV
      user: <USERNAME>
      password: <PASSWORD>
      authenticator: externalbrowser
      region: us-east-1
      database: <CHAIN>_DEV
      warehouse: DBT
      schema: silver
      threads: 4
      client_session_keep_alive: False
      query_tag: dbt_<USERNAME>_dev
    prod:
      type: snowflake
      account: <ACCOUNT>
      role: DBT_CLOUD_<CHAIN>
      user: <USERNAME>
      password: <PASSWORD>
      authenticator: externalbrowser
      region: us-east-1
      database: <CHAIN>
      warehouse: DBT_CLOUD_<CHAIN>
      schema: silver
      threads: 4
      client_session_keep_alive: False
      query_tag: dbt_<USERNAME>_dev
```
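Once the profile is in place, a quick way to confirm that it resolves and the connection works (standard dbt CLI, nothing repo-specific) is:
```
dbt debug --target dev
```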
## Variables
### Common DBT Run Variables
The following variables can be used to control various aspects of the dbt run. Use them with the `--vars` flag when running dbt commands, e.g. `dbt run --vars '{"UPDATE_UDFS_AND_SPS":True}' -m ...`. Note that `UPDATE_UDFS_AND_SPS` gates the on-run-start hooks in `dbt_project.yml`: when `True`, all macros included in those hooks execute on model run as normal; when `False` (the default), none of them are executed.
| Variable | Description | Example Usage |
|----------|-------------|---------------|
| `UPDATE_UDFS_AND_SPS` | Update User Defined Functions and Stored Procedures. By default, this is set to False | `--vars '{"UPDATE_UDFS_AND_SPS":true}'` |
| `STREAMLINE_INVOKE_STREAMS` | Invoke Streamline processes. By default, this is set to False | `--vars '{"STREAMLINE_INVOKE_STREAMS":true}'` |
| `STREAMLINE_USE_DEV_FOR_EXTERNAL_TABLES` | Use development environment for external tables. By default, this is set to False | `--vars '{"STREAMLINE_USE_DEV_FOR_EXTERNAL_TABLES":true}'` |
| `HEAL_CURATED_MODEL` | Heal specific curated models. By default, this is set to an empty array []. See more below. | `--vars '{"HEAL_CURATED_MODEL":["axelar","across","celer_cbridge"]}'` |
| `UPDATE_SNOWFLAKE_TAGS` | Control updating of Snowflake tags. By default, tags are pushed on each load; set this to false to skip | `--vars '{"UPDATE_SNOWFLAKE_TAGS":false}'` |
| `START_GHA_TASKS` | Start GitHub Actions tasks. By default, this is set to False | `--vars '{"START_GHA_TASKS":true}'` |
#### Healing Models
Several variables control healing and incremental behavior (see the sketch after this list):
* `HEAL_MODEL`
  * Boolean, default is `FALSE`
  * When `FALSE`, heal logic is negated; when `TRUE`, heal logic applies
  * Include `heal` in a model's tags within the config block for inclusion in the `dbt_run_heal_models` workflow, e.g. `tags = 'heal'`
  * Usage: `dbt run --vars '{"HEAL_MODEL":True}' -m ...`
* `HEAL_MODELS`
  * Array, default is empty `[]`
  * Use to negate incremental logic, e.g. to reload records in a curated complete table without a full-refresh, such as `silver_bridge.complete_bridge_activity`
  * When an item is included in the var array, incremental logic is skipped for that CTE / code block; when an item is not included (or does not match the item specified in the model), incremental logic applies
  * Example setup: `{% if is_incremental() and 'axelar' not in var('HEAL_MODELS') %}`
  * Usage (single CTE): `dbt run --vars '{"HEAL_MODELS":"axelar"}' -m ...`
  * Usage (multiple CTEs): `dbt run --vars '{"HEAL_MODELS":["axelar","across","celer_cbridge"]}' -m ...`
* `LOOKBACK`
  * String representing a time interval; the in-model default is e.g. `'4 hours'`
  * Use to extend the incremental lookback period, e.g. `'12 hours'`, `'7 days'`
  * Example setup: `SELECT MAX(_inserted_timestamp) - INTERVAL '{{ var("LOOKBACK", "4 hours") }}'`
  * Usage: `dbt run --vars '{"LOOKBACK":"36 hours"}' -m ...`
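A minimal sketch of how these variables typically fit together in an incremental model, assembled from the example setups above; the source model and CTE name are illustrative, not from this repo:
```
{{ config(
    materialized = 'incremental',
    tags = 'heal'
) }}

SELECT
    *
FROM
    {{ ref('silver__example_source') }} -- hypothetical source model

{% if is_incremental() and 'example_cte' not in var('HEAL_MODELS') %}
WHERE _inserted_timestamp >= (
    SELECT
        MAX(_inserted_timestamp) - INTERVAL '{{ var("LOOKBACK", "4 hours") }}'
    FROM
        {{ this }}
)
{% endif %}
```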
#### Example Commands
1. Update UDFs and SPs:
```
dbt run --vars '{"UPDATE_UDFS_AND_SPS":true}' -m ...
```
2. Invoke Streamline and use dev for external tables:
```
dbt run --vars '{"STREAMLINE_INVOKE_STREAMS":true,"STREAMLINE_USE_DEV_FOR_EXTERNAL_TABLES":true}' -m ...
```
3. Heal specific curated models:
```
dbt run --vars '{"HEAL_CURATED_MODEL":["axelar","across","celer_cbridge"]}' -m ...
```
4. Update Snowflake tags for a specific model:
```
dbt run --vars '{"UPDATE_SNOWFLAKE_TAGS":true}' -s models/silver/utilities/silver__number_sequence.sql
```
5. Start GHA tasks:
```
dbt seed -s github_actions__workflows && dbt run -m models/github_actions --full-refresh && dbt run-operation fsc_utils.create_gha_tasks --vars '{"START_GHA_TASKS":True}'
```
6. Using two or more variables:
```
dbt run --vars '{"UPDATE_UDFS_AND_SPS":true,"STREAMLINE_INVOKE_STREAMS":true,"STREAMLINE_USE_DEV_FOR_EXTERNAL_TABLES":true}' -m ...
```
> Note: Replace `-m ...` with appropriate model selections or tags as needed for your project structure.
## FSC_EVM
`fsc_evm` is a collection of macros, models, and other resources that are used to build the Flipside Crypto EVM models.
For more information on the `fsc_evm` package, see the [FSC_EVM Wiki](https://github.com/FlipsideCrypto/fsc-evm/wiki).
## Applying Model Tags
### Database / Schema level tags
Database and schema tags are applied via the `fsc_evm.add_database_or_schema_tags` macro. These tags are inherited by their downstream objects. To add or modify tags, call the appropriate tag set function within the macro.
```
{{ fsc_evm.set_database_tag_value('SOME_DATABASE_TAG_KEY','SOME_DATABASE_TAG_VALUE') }}
{{ fsc_evm.set_schema_tag_value('SOME_SCHEMA_TAG_KEY','SOME_SCHEMA_TAG_VALUE') }}
```
### Model tags
To add/update a model's Snowflake tags, add/modify the `meta` model property under `config`. Only table-level tags are supported at this time via dbt.
```
{% raw %}
{{ config(
...,
meta={
@ -100,24 +116,17 @@ To add/update a model's snowflake tags, add/modify the `meta` model property und
},
...
) }}
{% endraw %}
```
By default, model tags are pushed to Snowflake on each load. You can disable this by setting the `UPDATE_SNOWFLAKE_TAGS` project variable to `False` during a run.
```
dbt run --vars '{"UPDATE_SNOWFLAKE_TAGS":False}' -s models/core/core__fact_blocks.sql
dbt run --vars '{"UPDATE_SNOWFLAKE_TAGS":False}' -s models/silver/utilities/silver__number_sequence.sql
```
### Querying for existing tags on a model in Snowflake
```
select *
from table(<chain>.information_schema.tag_references('<chain>.core.fact_blocks', 'table'));
```
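To find tagged objects across the whole account rather than one object at a time, Snowflake's `ACCOUNT_USAGE` view offers an alternative (note that `ACCOUNT_USAGE` lags real time by up to a couple of hours):
```
select *
from snowflake.account_usage.tag_references
where object_name = 'FACT_BLOCKS';
```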

View File

@ -25,7 +25,8 @@ clean-targets: # directories to be removed by `dbt clean`
- "dbt_packages"
tests:
  blast_models:
    +store_failures: true # all tests
on-run-start:
- "{{ create_sps() }}"
@ -49,8 +50,18 @@ query-comment:
# Full documentation: https://docs.getdbt.com/docs/configuring-models
models:
  optimism_models:
    +copy_grants: true
    +persist_docs:
      relation: true
      columns: true
    +on_schema_change: "append_new_columns"
  livequery_models:
    +materialized: ephemeral
  fsc_evm:
    +enabled: false
  github_actions_package:
    +enabled: true
# In this example config, we tell dbt to build all models in the example/ directory
# as tables. These settings can be overridden in the individual model files
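With `+store_failures: true`, dbt writes the failing rows of each test to an audit table (by default in a schema suffixed `_dbt_test__audit`), so failures can be inspected after a run; the database, schema, and test table name below are hypothetical:
```
select *
from <chain>_dev.silver_dbt_test__audit.unique_silver__blocks_block_number; -- hypothetical audit table
```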

makefile Normal file
View File

@ -0,0 +1,46 @@
DBT_TARGET ?= dev

deploy_streamline_functions:
	rm -f package-lock.yml && dbt clean && dbt deps
	dbt run -s livequery_models.deploy.core --vars '{"UPDATE_UDFS_AND_SPS":True}' -t $(DBT_TARGET)
	dbt run-operation fsc_utils.create_evm_streamline_udfs --vars '{"UPDATE_UDFS_AND_SPS":True}' -t $(DBT_TARGET)

cleanup_time:
	rm -f package-lock.yml && dbt clean && dbt deps

deploy_streamline_tables:
	rm -f package-lock.yml && dbt clean && dbt deps
ifeq ($(findstring dev,$(DBT_TARGET)),dev)
	dbt run -m "fsc_evm,tag:bronze_external" --vars '{"STREAMLINE_USE_DEV_FOR_EXTERNAL_TABLES":True}' -t $(DBT_TARGET)
else
	dbt run -m "fsc_evm,tag:bronze_external" -t $(DBT_TARGET)
endif
	dbt run -m "fsc_evm,tag:streamline_core_complete" "fsc_evm,tag:streamline_core_realtime" "fsc_evm,tag:utils" --full-refresh -t $(DBT_TARGET)

deploy_streamline_requests:
	rm -f package-lock.yml && dbt clean && dbt deps
	dbt run -m "fsc_evm,tag:streamline_core_complete" "fsc_evm,tag:streamline_core_realtime" --vars '{"STREAMLINE_INVOKE_STREAMS":True}' -t $(DBT_TARGET)

deploy_github_actions:
	dbt run -s livequery_models.deploy.marketplace.github --vars '{"UPDATE_UDFS_AND_SPS":True}' -t $(DBT_TARGET)
	dbt seed -s github_actions__workflows -t $(DBT_TARGET)
	dbt run -m "fsc_evm,tag:gha_tasks" --full-refresh -t $(DBT_TARGET)
ifeq ($(findstring dev,$(DBT_TARGET)),dev)
	dbt run-operation fsc_utils.create_gha_tasks --vars '{"START_GHA_TASKS":False}' -t $(DBT_TARGET)
else
	dbt run-operation fsc_utils.create_gha_tasks --vars '{"START_GHA_TASKS":True}' -t $(DBT_TARGET)
endif

deploy_new_github_action:
	dbt seed -s github_actions__workflows -t $(DBT_TARGET)
	dbt run -m "fsc_evm,tag:gha_tasks" --full-refresh -t $(DBT_TARGET)
ifeq ($(findstring dev,$(DBT_TARGET)),dev)
	dbt run-operation fsc_utils.create_gha_tasks --vars '{"START_GHA_TASKS":False}' -t $(DBT_TARGET)
else
	dbt run-operation fsc_utils.create_gha_tasks --vars '{"START_GHA_TASKS":True}' -t $(DBT_TARGET)
endif

regular_incremental:
	dbt run -m "fsc_evm,tag:core" -t $(DBT_TARGET)

.PHONY: deploy_streamline_functions deploy_streamline_tables deploy_streamline_requests deploy_github_actions cleanup_time regular_incremental deploy_new_github_action
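Since the makefile defaults `DBT_TARGET` to `dev` via `?=`, any target can be pointed at another profile by overriding the variable on the command line (standard make behavior):
```
make deploy_streamline_tables DBT_TARGET=prod
```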

View File

@ -1,6 +0,0 @@
{{ config(
    materialized = 'view',
    tags = ['gha_tasks']
) }}

{{ fsc_utils.gha_task_current_status_view() }}

View File

@ -1,16 +0,0 @@
version: 2
models:
  - name: github_actions__current_task_status
    columns:
      - name: PIPELINE_ACTIVE
        tests:
          - dbt_expectations.expect_column_values_to_be_in_set:
              value_set:
                - TRUE
      - name: SUCCESSES
        tests:
          - dbt_expectations.expect_column_values_to_be_in_set:
              value_set:
                - 2
              config:
                severity: warn

View File

@ -1,5 +0,0 @@
{{ config(
    materialized = 'view'
) }}

{{ fsc_utils.gha_task_history_view() }}

View File

@ -1,5 +0,0 @@
{{ config(
    materialized = 'view'
) }}

{{ fsc_utils.gha_task_performance_view() }}

View File

@ -1,5 +0,0 @@
{{ config(
    materialized = 'view'
) }}

{{ fsc_utils.gha_task_schedule_view() }}

View File

@ -1,5 +0,0 @@
{{ config(
    materialized = 'view'
) }}

{{ fsc_utils.gha_tasks_view() }}

View File

@ -6,22 +6,36 @@
SELECT
    A.block_number AS block_number,
    HASH AS block_hash, --new column
    block_timestamp,
    'mainnet' AS network,
    d.tx_count,
    miner, --new column
    difficulty,
    total_difficulty,
    extra_data,
    gas_limit,
    gas_used,
    parent_hash,
    sha3_uncles,
    SIZE,
    uncles AS uncle_blocks,
    nonce, --new column
    receipts_root,
    state_root, --new column
    transactions_root, --new column
    logs_bloom, --new column
    blocks_id AS fact_blocks_id,
    GREATEST(
        A.inserted_timestamp,
        d.inserted_timestamp
    ) AS inserted_timestamp,
    GREATEST(
        A.modified_timestamp,
        d.modified_timestamp
    ) AS modified_timestamp,
    'blast' AS blockchain, --deprecate
    HASH, --deprecate
    OBJECT_CONSTRUCT(
        'baseFeePerGas',
        base_fee_per_gas,
@ -61,16 +75,8 @@ SELECT
        transactions_root,
        'uncles',
        uncles
    ) AS block_header_json, --deprecate
    withdrawals_root --deprecate
FROM
    {{ ref('silver__blocks') }} A
    LEFT JOIN {{ ref('silver__tx_count') }}

View File

@ -8,18 +8,27 @@ SELECT
    block_number,
    block_timestamp,
    tx_hash,
    {# tx_position, --new column, requires FR on silver.logs #}
    event_index,
    contract_address,
    topics,
    topics[0]::STRING AS topic_0, --new column
    topics[1]::STRING AS topic_1, --new column
    topics[2]::STRING AS topic_2, --new column
    topics[3]::STRING AS topic_3, --new column
    DATA,
    event_removed,
    origin_from_address,
    origin_to_address,
    origin_function_signature,
    CASE
        WHEN tx_status = 'SUCCESS' THEN TRUE
        ELSE FALSE
    END AS tx_succeeded, --new column
    logs_id AS fact_event_logs_id,
    inserted_timestamp,
    modified_timestamp,
    tx_status, --deprecate
    _log_id --deprecate
FROM
    {{ ref('silver__logs') }}

View File

@ -5,28 +5,38 @@
) }}
SELECT
    block_number,
    block_timestamp,
    tx_hash,
    tx_position, --new column
    trace_index,
    from_address,
    to_address,
    input,
    output,
    TYPE,
    trace_address, --new column
    sub_traces,
    DATA,
    VALUE,
    value_precise_raw,
    value_precise,
    value_hex, --new column
    gas,
    gas_used,
    origin_from_address,
    origin_to_address,
    origin_function_signature,
    trace_succeeded, --new column
    tx_succeeded, --new column
    error_reason,
    revert_reason, --new column
    fact_traces_id,
    inserted_timestamp,
    modified_timestamp,
    trace_status, --deprecate
    tx_status, --deprecate
    identifier --deprecate
FROM
    {{ ref('silver__fact_traces2') }}
--ideal state = source from silver.traces2 and materialize this model as a table (core.fact_traces2)

View File

@ -7,39 +7,45 @@
SELECT
    block_number,
    block_timestamp,
    tx_hash,
    from_address,
    to_address,
    origin_function_signature,
    VALUE,
    value_precise_raw,
    value_precise,
    tx_fee,
    tx_fee_precise,
    CASE
        WHEN tx_status = 'SUCCESS' THEN TRUE
        ELSE FALSE
    END AS tx_succeeded, --new column
    tx_type,
    nonce,
    POSITION AS tx_position, --new column
    input_data,
    gas_price,
    effective_gas_price,
    gas_used,
    gas AS gas_limit,
    cumulative_gas_used,
    max_fee_per_gas,
    max_priority_fee_per_gas,
    l1_gas_price,
    l1_gas_used,
    l1_fee_scalar,
    l1_fee,
    l1_fee_precise,
    r,
    s,
    v,
    transactions_id AS fact_transactions_id,
    inserted_timestamp,
    modified_timestamp,
    block_hash, --deprecate
    tx_status AS status, --deprecate
    POSITION, --deprecate
    deposit_nonce, --deprecate, may be separate table
    deposit_receipt_version --deprecate, may be separate table
FROM
    {{ ref('silver__transactions') }}

View File

@ -16,6 +16,7 @@ WITH base AS (
from_address AS origin_from_address,
to_address AS origin_to_address,
tx_status,
{# position AS tx_position, --new column #}
logs,
_inserted_timestamp
FROM
@ -39,6 +40,7 @@ flat_logs AS (
origin_from_address,
origin_to_address,
tx_status,
{# tx_position, #}
VALUE :address :: STRING AS contract_address,
VALUE :blockHash :: STRING AS block_hash,
VALUE :data :: STRING AS DATA,
@ -63,6 +65,7 @@ new_records AS (
l.origin_to_address,
txs.origin_function_signature,
l.tx_status,
{# l.tx_position, #}
l.contract_address,
l.block_hash,
l.data,
@ -107,6 +110,7 @@ missing_data AS (
t.origin_to_address,
txs.origin_function_signature,
t.tx_status,
{# t.tx_position, #}
t.contract_address,
t.block_hash,
t.data,
@ -140,6 +144,7 @@ FINAL AS (
origin_to_address,
origin_function_signature,
tx_status,
{# tx_position, #}
contract_address,
block_hash,
DATA,
@ -162,6 +167,7 @@ SELECT
origin_to_address,
origin_function_signature,
tx_status,
{# tx_position, #}
contract_address,
block_hash,
DATA,

View File

@ -1,4 +1,6 @@
packages:
  - git: https://github.com/FlipsideCrypto/fsc-evm.git
    revision: 6456423f9a9fc14fcc6abd51c550d115db6cd28f
  - package: calogica/dbt_expectations
    version: 0.8.2
  - package: dbt-labs/dbt_external_tables
@ -6,13 +8,11 @@ packages:
  - package: dbt-labs/dbt_utils
    version: 1.0.0
  - git: https://github.com/FlipsideCrypto/fsc-utils.git
    revision: c3ab97e8e06d31e8c6f63819714e0a2d45c45e82
  - package: get-select/dbt_snowflake_query_tags
    version: 2.5.0
  - package: calogica/dbt_date
    version: 0.7.2
  - git: https://github.com/FlipsideCrypto/livequery-models.git
    revision: b024188be4e9c6bc00ed77797ebdc92d351d620e
sha1_hash: 2976b7495d434571280cda0acd0e5d272631eb00
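If `packages.yml` changes, the lock file (and its `sha1_hash`) should be regenerated rather than edited by hand; the makefile targets above do this with:
```
rm -f package-lock.yml && dbt clean && dbt deps
```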

View File

@ -1,13 +1,3 @@
packages:
  - git: https://github.com/FlipsideCrypto/fsc-evm.git
    revision: "AN-5278/sl-other-macros"