Conversation
| @@ -1,4 +1,4 @@ | |||
| year_index,borrower_id_rus,borrower_name_rus,report_year,report_month,debt_description,balance_end_of_report_year,interest,principal,total | |||
| year_index,borrower_id_rus,borrower_name_rus,report_year,report_month,debt_description,debt_balance_end_of_report_year,debt_interest_billed,debt_principal_billed,total | |||
There was a problem hiding this comment.
I updated these columns so it would be more clear what they were. Take note @cmgosnell for rus7 if you think this is a good change.
There was a problem hiding this comment.
i like the add of debt_ at the beginning (i added loan_ but i like this better i will mirror) but i don't love _billed. I'd rather go with ending_balance instead of balance_end_of_report_year bc that's what we use for ferc1 and this is super analogous. the form says "billed this year". I think in general it is expected that the values reported are reported during the report date period and idk billed just seems superfluous but i won't die on that hill.
total should have the same treatment!
I'd shoot towards this schema:
"fields": [
"report_date",
"borrower_id_rus",
"borrower_name_rus",
"debt_description",
"debt_ending_balance",
"debt_interest",
"debt_principal",
"debt_total",
],
There was a problem hiding this comment.
I excluded the debt_total field here because it was redundant, do you think I should add it back in?
There was a problem hiding this comment.
i think we should keep the totals in! if we have time to validate that interest + principal = total then i could imagining removing it but unless/until then i think keeping it in is more safe (also what if there are things in total like fees or other cram that means you can't always add em up)
There was a problem hiding this comment.
this looks real good!! are you waiting on the unit conversion from rus7 for the last two of these four tables? (is that why some of the fields are commented out? seems like it but 🤷🏻 . the second phase of rus7 tables are merged now so that should be accessible now. (be wary of alembic migrations 🙃 - i tired to lump that all together so it'd be easier)
for the fully transformed tables i left some tiny nits + we should get lined up about debt column names which feels easy and minor!
also add release notes!
| @@ -1,4 +1,4 @@ | |||
| year_index,borrower_id_rus,borrower_name_rus,report_year,report_month,debt_description,balance_end_of_report_year,interest,principal,total | |||
| year_index,borrower_id_rus,borrower_name_rus,report_year,report_month,debt_description,debt_balance_end_of_report_year,debt_interest_billed,debt_principal_billed,total | |||
There was a problem hiding this comment.
i like the add of debt_ at the beginning (i added loan_ but i like this better i will mirror) but i don't love _billed. I'd rather go with ending_balance instead of balance_end_of_report_year bc that's what we use for ferc1 and this is super analogous. the form says "billed this year". I think in general it is expected that the values reported are reported during the report date period and idk billed just seems superfluous but i won't die on that hill.
total should have the same treatment!
I'd shoot towards this schema:
"fields": [
"report_date",
"borrower_id_rus",
"borrower_name_rus",
"debt_description",
"debt_ending_balance",
"debt_interest",
"debt_principal",
"debt_total",
],
cmgosnell
left a comment
There was a problem hiding this comment.
three non-blocking suggestions: remove dollars from two column names, added enums and tinsy baby remove additional_details_text. honestly all so small so as to be non-blocking.
but i generated all of these tables locally and they look good!
| "primary_renewable_fuel_type": { | ||
| "type": "string", | ||
| "description": ("Primary renewable fuel type used by the plant."), | ||
| }, | ||
| "primary_renewable_fuel_type_id": { | ||
| "type": "integer", | ||
| "description": ("Unique numeric identifier for each renewable fuel type."), | ||
| }, |
There was a problem hiding this comment.
can or should these be enumed?
There was a problem hiding this comment.
yea I can do that quickly!
| "prime_mover_id": { | ||
| "type": "integer", | ||
| "description": "Unique numeric identifier for each prime mover type used by RUS borrowers.", | ||
| }, | ||
| "prime_mover_type": { | ||
| "type": "string", | ||
| "description": "Type of prime mover (e.g. Hydro, Internal Combustion).", | ||
| }, |
There was a problem hiding this comment.
can these also be enum-ed easily?
src/pudl/metadata/resources/rus12.py
Outdated
There was a problem hiding this comment.
nbd but you left a handful of these empty text guys
There was a problem hiding this comment.
I'm gonna leave the draft ones for now because they get handled in the other PR
Overview
Part of #4901 and #4884.
What problem does this address?
Adds transform functions for top RUS tables
What did you change?
schema.ymlfile for transformed tablesRESOURCE_METADATAfromDRAFT_RESOURCE_METADATATO-DO
RESOURCE_METADATAfromDRAFT_RESOURCE_METADATAfields.pyDocumentation
Make sure to update relevant aspects of the documentation:
docs/data_sources/templates).src/metadata).Testing
How did you make sure this worked? How can a reviewer verify this?
To-do list
dbttests.pixi run pre-commit-runto run linters and static code analysis checks.pixi run pytest-cilocally to ensure that the merge queue will accept your PR.build-deploy-pudlGitHub Action manually and ensure that it succeeds.