stellar-models/.cursor/rules/dbt-overview-standard.mdc

---
description:
globs: __overview__.md,models/descriptions/*,models/gold/**/*.sql
alwaysApply: false
---
# dbt Overview Standards
When working with dbt projects, ensure comprehensive documentation exists about the project and a base reference to the gold models exists in the `__overview__.md` file for both human and LLM consumption.

## Project __overview__
The file `models/descriptions/__overview__.md` is the entry point to the model description and documentation and must contain a rich description of what the project is.
The file MUST START AND END WITH DBT JINJA DOCS TAGS `{% docs __overview__ %}` and `{% ENDDOCS %}`

## Quick Links Section Requirements
The `__overview__.md` file MUST contain a "Quick Links to Table Documentation" section that provides direct navigation to all gold model documentation. This section must include a simple list, organized by gold schema, with the models and a hyperlink to the model documentation. If there is an existing section like "using dbt docs" that instructs the user on how to navigate dbt docs or a list of links to flipside and dbt, remove it! These are outdated.

### Required Elements:
**Hyperlinks to Gold Model Documentation** - A comprehensive list of all gold models organized by schema. The schema can be inferred from the model name as the slug before the double underscore. For example, `core__fact_blocks` is a model named `fact_blocks` in the schema `core`.

### Gold Model Links Structure:
The quicklinks section must be organized by schema and use the relative link to load the page generated by dbt documentation. The relative link structure is `#!/model/dbt_uniqueId` where dbt's `uniqueId` format is `node_type.project_name.model_name`. All of these `node_types` are `model`. `project_name` is the name of the dbt models project established in `dbt_project.yml` by the `name` variable. `model_name` is finally the name of the model as determed by the name of the sql file OR the value of `name` in the model's associated `.yml` file. For example, a uniqueId for the blockchain's `fact_blocks` model would be `model.<blockchain_name>_models.core__fact_blocks` making the relative URL `#!/model/model.<blockchain_name>_models.core__fact_blocks`.

```markdown
## **Quick Links to Table Documentation**

**Click on the links below to jump to the documentation for each schema.**

### [Schema Name] Tables

**[Model Type Tables:]**

- model_1
- model_2

### CORE Tables
**Dimension Tables:**
- [core__fact_blocks](relative/path/to/model)

**Fact Tables:**
- [model_1](relative/path/to/model)

```

### Schema Organization Rules:
1. **Group by Schema**: Organize models by their schema (core, defi, nft, price, social, governance, etc.)
2. **Use Exact Schema Names**: Use the exact schema names as they appear in the database (e.g., `<blockchain_database>.CORE`, `<blockchain_database>.DEFI`, `<blockchain_database>.NFT`)
3. **Model Type Subgrouping**: Within each schema, subgroup by model type:
   - **Dimension Tables:** (dim_* models)
   - **Fact Tables:** (fact_* models)
   - **Easy Views:** (ez_* models)
4. **Link Format**: Use the exact dbt docs link format: `#!/model/model.[project_name].[schema]__[model]`
5. **Model Naming**: Use the exact model name as it appears in the file system (without .sql extension)

### Implementation Guidelines for Coding Agents:
1. **Scan Directory Structure**: Read `models/gold/` directory to identify all schema subdirectories
2. **Extract Model Names**: For each schema directory, list all `.sql` files and remove the `.sql` extension
3. **Determine Schema Mapping**: Map model names to database schema names:
dbt models in this project utilize a double underscore in the model name to denote schema vs table <schema>__<table_name>.sql:
   - `core__fact_blocks` → `<blockchain_database>.CORE.FACT_BLOCKS`
   - `defi__ez_dex_swaps` → `<blockchain_database>.DEFI.EZ_DEX_SWAPS`
4. **Categorize Models**: Group models by prefix:
   - `dim_*` → Dimension Tables
   - `fact_*` → Fact Tables
   - `ez_*` → Easy Views
   - `udf_*`, `udtf_*` → Custom Functions
5. **Generate Links**: Create markdown links using the proper format
6. **Maintain Order**: Keep models in alphabetical order within each category

### Validation Requirements:
- Every gold model must have a corresponding link
- Links must use the correct dbt docs format
- Schema names must match the actual database schema
- Model names must match the actual file names (without .sql extension)
- Links must be organized by schema and model type
- All links must be functional and point to valid dbt documentation
- Do NOT add a hyperlink to the category headers. Only hyperlink individual models

## XML Tag Requirements
Every `__overview__.md` file MUST include structured `<llm>` XML tags for easy interpretation by an LLM.
```xml
<llm>
<blockchain>[Protocol Name]</blockchain>
<aliases>[Common Aliases]</aliases>
<ecosystem>[Execution Environment or Layer Type, for example EVM, SVM, IBC, Layer 1, Layer 2]</ecosystem>
<description>[Rich 3-5 sentence description of the blockchain, its consensus mechanism, key features, and developer/user benefits including if the blockchain was built for a specific usecase.]</description>
<external_resources>
    <block_scanner>[Link to the primary block scanner for the blockchain]</block_scanner>
    <developer_documenation>[Link to the primary developer documentation, maintained by the blockchain devs]</developer_documentation>
</external_resources>
<expert>
  <constraints>
    <table_availability>
      <!-- Specify which tables/schemas are available for this blockchain -->
      <!-- Example: "Ensure that your queries use only available tables for [BLOCKCHAIN]" -->
    </table_availability>

    <schema_structure>
      <!-- Explain how the database is organized (dimensions, facts, naming conventions) -->
      <!-- Example: "Understand that dimensions and facts combine to make ez_ tables" -->
    </schema_structure>
  </constraints>

  <optimization>
    <performance_filters>
      <!-- Define key filtering strategies for query performance -->
      <!-- Example: "use filters like block_timestamp over the last N days to improve speed" -->
    </performance_filters>

    <query_structure>
      <!-- Specify preferred SQL patterns and structures -->
      <!-- Example: "Use CTEs, not subqueries, as readability is important" -->
    </query_structure>

    <implementation_guidance>
      <!-- Provide guidelines for advanced SQL features -->
      <!-- Example: "Be smart with aggregations, window functions, etc." -->
    </implementation_guidance>
  </optimization>

  <domain_mapping>
    <token_operations>
      <!-- Map token-related queries to specific tables -->
      <!-- Example: "For token transfers, use ez_token_transfers table" -->
    </token_operations>

    <defi_analysis>
      <!-- Specify DeFi-related tables and their use cases -->
      <!-- Example: "For DeFi analysis, use ez_bridge_activity, ez_dex_swaps, ez_lending" -->
    </defi_analysis>

    <nft_analysis>
      <!-- Define NFT-specific tables and functionality -->
      <!-- Example: "For NFT queries, utilize ez_nft_sales table in nft schema" -->
    </nft_analysis>

    <specialized_features>
      <!-- Cover blockchain-specific features or complex data structures -->
      <!-- Example: "The XYZ data is complex, so ensure you ask clarifying questions" -->
    </specialized_features>
  </domain_mapping>

  <interaction_modes>
    <direct_user>
      <!-- Define behavior for direct user interactions -->
      <!-- Example: "Ask clarifying questions when dealing with complex data" -->
    </direct_user>

    <agent_invocation>
      <!-- Specify response format when invoked by other AI agents -->
      <!-- Example: "When invoked by another AI agent, respond with relevant query text" -->
    </agent_invocation>
  </interaction_modes>

  <engagement>
    <exploration_tone>
      <!-- Set the overall tone and encouragement for data exploration -->
      <!-- Example: "Have fun exploring the [BLOCKCHAIN] ecosystem through data!" -->
    </exploration_tone>
  </engagement>
</expert>
</llm>
```
Place these XML tags at the end of the documentation (BUT STILL BEFORE THE JINJA ENDDOCS TAG).

## Update Process for Coding Agents:
To update the overview, the coding agent MUST:

1. **Scan Current Gold Models**:
   - Read the entire `models/gold/` directory structure
   - Identify all `.sql` files across all schema subdirectories
   - Extract model names (remove `.sql` extension)

2. **Generate Updated Quicklinks Section**:
   - Follow these implementation guidelines
   - Create a complete new quicklinks section with all current gold models
   - Maintain proper schema organization and model type grouping

3. **Update __overview__.md**:
   - Replace the entire "Quick Links to Table Documentation" section with the newly generated content
   - Ensure proper markdown formatting and link structure
   - Create or update the XML tag block

4. **Validation Check**:
   - Verify all gold models have corresponding links
   - Confirm links use correct dbt docs format
   - Check that schema names and model names are accurate
   - Ensure alphabetical ordering within categories