stellar-models/.cursor/rules/dbt-overview-standard.mdc
eric-laurello f73bbdb510
AN-6483 daily balances (#16)
* bals

* refs

* .

* final QA updates

* remove ref

* final upds

* move the daily back a bit
2025-09-03 15:46:41 -04:00

187 lines
9.1 KiB
Plaintext

---
description:
globs: __overview__.md,models/descriptions/*,models/gold/**/*.sql
alwaysApply: false
---
# dbt Overview Standards
When working with dbt projects, ensure comprehensive documentation exists about the project and a base reference to the gold models exists in the `__overview__.md` file for both human and LLM consumption.
## Project __overview__
The file `models/descriptions/__overview__.md` is the entry point to the model description and documentation and must contain a rich description of what the project is.
The file MUST START AND END WITH DBT JINJA DOCS TAGS `{% docs __overview__ %}` and `{% ENDDOCS %}`
## Quick Links Section Requirements
The `__overview__.md` file MUST contain a "Quick Links to Table Documentation" section that provides direct navigation to all gold model documentation. This section must include a simple list, organized by gold schema, with the models and a hyperlink to the model documentation. If there is an existing section like "using dbt docs" that instructs the user on how to navigate dbt docs or a list of links to flipside and dbt, remove it! These are outdated.
### Required Elements:
**Hyperlinks to Gold Model Documentation** - A comprehensive list of all gold models organized by schema. The schema can be inferred from the model name as the slug before the double underscore. For example, `core__fact_blocks` is a model named `fact_blocks` in the schema `core`.
### Gold Model Links Structure:
The quicklinks section must be organized by schema and use the relative link to load the page generated by dbt documentation. The relative link structure is `#!/model/dbt_uniqueId` where dbt's `uniqueId` format is `node_type.project_name.model_name`. All of these `node_types` are `model`. `project_name` is the name of the dbt models project established in `dbt_project.yml` by the `name` variable. `model_name` is finally the name of the model as determed by the name of the sql file OR the value of `name` in the model's associated `.yml` file. For example, a uniqueId for the blockchain's `fact_blocks` model would be `model.<blockchain_name>_models.core__fact_blocks` making the relative URL `#!/model/model.<blockchain_name>_models.core__fact_blocks`.
```markdown
## **Quick Links to Table Documentation**
**Click on the links below to jump to the documentation for each schema.**
### [Schema Name] Tables
**[Model Type Tables:]**
- model_1
- model_2
### CORE Tables
**Dimension Tables:**
- [core__fact_blocks](relative/path/to/model)
**Fact Tables:**
- [model_1](relative/path/to/model)
```
### Schema Organization Rules:
1. **Group by Schema**: Organize models by their schema (core, defi, nft, price, social, governance, etc.)
2. **Use Exact Schema Names**: Use the exact schema names as they appear in the database (e.g., `<blockchain_database>.CORE`, `<blockchain_database>.DEFI`, `<blockchain_database>.NFT`)
3. **Model Type Subgrouping**: Within each schema, subgroup by model type:
- **Dimension Tables:** (dim_* models)
- **Fact Tables:** (fact_* models)
- **Easy Views:** (ez_* models)
4. **Link Format**: Use the exact dbt docs link format: `#!/model/model.[project_name].[schema]__[model]`
5. **Model Naming**: Use the exact model name as it appears in the file system (without .sql extension)
### Implementation Guidelines for Coding Agents:
1. **Scan Directory Structure**: Read `models/gold/` directory to identify all schema subdirectories
2. **Extract Model Names**: For each schema directory, list all `.sql` files and remove the `.sql` extension
3. **Determine Schema Mapping**: Map model names to database schema names:
dbt models in this project utilize a double underscore in the model name to denote schema vs table <schema>__<table_name>.sql:
- `core__fact_blocks` → `<blockchain_database>.CORE.FACT_BLOCKS`
- `defi__ez_dex_swaps` → `<blockchain_database>.DEFI.EZ_DEX_SWAPS`
4. **Categorize Models**: Group models by prefix:
- `dim_*` → Dimension Tables
- `fact_*` → Fact Tables
- `ez_*` → Easy Views
- `udf_*`, `udtf_*` → Custom Functions
5. **Generate Links**: Create markdown links using the proper format
6. **Maintain Order**: Keep models in alphabetical order within each category
### Validation Requirements:
- Every gold model must have a corresponding link
- Links must use the correct dbt docs format
- Schema names must match the actual database schema
- Model names must match the actual file names (without .sql extension)
- Links must be organized by schema and model type
- All links must be functional and point to valid dbt documentation
- Do NOT add a hyperlink to the category headers. Only hyperlink individual models
## XML Tag Requirements
Every `__overview__.md` file MUST include structured `<llm>` XML tags for easy interpretation by an LLM.
```xml
<llm>
<blockchain>[Protocol Name]</blockchain>
<aliases>[Common Aliases]</aliases>
<ecosystem>[Execution Environment or Layer Type, for example EVM, SVM, IBC, Layer 1, Layer 2]</ecosystem>
<description>[Rich 3-5 sentence description of the blockchain, its consensus mechanism, key features, and developer/user benefits including if the blockchain was built for a specific usecase.]</description>
<external_resources>
<block_scanner>[Link to the primary block scanner for the blockchain]</block_scanner>
<developer_documenation>[Link to the primary developer documentation, maintained by the blockchain devs]</developer_documentation>
</external_resources>
<expert>
<constraints>
<table_availability>
<!-- Specify which tables/schemas are available for this blockchain -->
<!-- Example: "Ensure that your queries use only available tables for [BLOCKCHAIN]" -->
</table_availability>
<schema_structure>
<!-- Explain how the database is organized (dimensions, facts, naming conventions) -->
<!-- Example: "Understand that dimensions and facts combine to make ez_ tables" -->
</schema_structure>
</constraints>
<optimization>
<performance_filters>
<!-- Define key filtering strategies for query performance -->
<!-- Example: "use filters like block_timestamp over the last N days to improve speed" -->
</performance_filters>
<query_structure>
<!-- Specify preferred SQL patterns and structures -->
<!-- Example: "Use CTEs, not subqueries, as readability is important" -->
</query_structure>
<implementation_guidance>
<!-- Provide guidelines for advanced SQL features -->
<!-- Example: "Be smart with aggregations, window functions, etc." -->
</implementation_guidance>
</optimization>
<domain_mapping>
<token_operations>
<!-- Map token-related queries to specific tables -->
<!-- Example: "For token transfers, use ez_token_transfers table" -->
</token_operations>
<defi_analysis>
<!-- Specify DeFi-related tables and their use cases -->
<!-- Example: "For DeFi analysis, use ez_bridge_activity, ez_dex_swaps, ez_lending" -->
</defi_analysis>
<nft_analysis>
<!-- Define NFT-specific tables and functionality -->
<!-- Example: "For NFT queries, utilize ez_nft_sales table in nft schema" -->
</nft_analysis>
<specialized_features>
<!-- Cover blockchain-specific features or complex data structures -->
<!-- Example: "The XYZ data is complex, so ensure you ask clarifying questions" -->
</specialized_features>
</domain_mapping>
<interaction_modes>
<direct_user>
<!-- Define behavior for direct user interactions -->
<!-- Example: "Ask clarifying questions when dealing with complex data" -->
</direct_user>
<agent_invocation>
<!-- Specify response format when invoked by other AI agents -->
<!-- Example: "When invoked by another AI agent, respond with relevant query text" -->
</agent_invocation>
</interaction_modes>
<engagement>
<exploration_tone>
<!-- Set the overall tone and encouragement for data exploration -->
<!-- Example: "Have fun exploring the [BLOCKCHAIN] ecosystem through data!" -->
</exploration_tone>
</engagement>
</expert>
</llm>
```
Place these XML tags at the end of the documentation (BUT STILL BEFORE THE JINJA ENDDOCS TAG).
## Update Process for Coding Agents:
To update the overview, the coding agent MUST:
1. **Scan Current Gold Models**:
- Read the entire `models/gold/` directory structure
- Identify all `.sql` files across all schema subdirectories
- Extract model names (remove `.sql` extension)
2. **Generate Updated Quicklinks Section**:
- Follow these implementation guidelines
- Create a complete new quicklinks section with all current gold models
- Maintain proper schema organization and model type grouping
3. **Update __overview__.md**:
- Replace the entire "Quick Links to Table Documentation" section with the newly generated content
- Ensure proper markdown formatting and link structure
- Create or update the XML tag block
4. **Validation Check**:
- Verify all gold models have corresponding links
- Confirm links use correct dbt docs format
- Check that schema names and model names are accurate
- Ensure alphabetical ordering within categories