Make ref table function way simpler and more efficient #209
+36
−23
I realized I initially built the reference table function to match `dataRetrieval`, thinking that there was a reason we couldn't reuse the `get_ogc_data()` function, but I think this is inefficient and incorrect. I swapped out the repeated code for `get_ogc_data` and changed the "id" column to the singular subject name of the reference table, e.g. "site-type-codes" has the "id" column changed to "site_type_code" (the rest of the columns use underscores in place of dashes, so this keeps the naming consistent).

I also learned that if there aren't any geometries present in the dataframe, `geopandas` will simply return a `pandas` DataFrame. This avoids the issue I created in the last MR, where I thought I had to specify "no geopandas" to get back a regular dataframe, and the logger was yelling about geopandas not being installed due to my oversight.

The only risk is that URLs contain the argument `skipGeometry=False`, like this:

https://api.waterdata.usgs.gov/ogcapi/v0/collections/time-zone-codes/items?skipGeometry=False&limit=50000

However, it doesn't seem to actually affect what comes back.
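
For reviewers, the "id" column renaming described above could look roughly like the sketch below. This is only an illustration, not the code in this PR: the helper names `singular_id_name` and `rename_id` are hypothetical, the record values are made up, and the singularization simply drops one trailing "s", which may not suit every collection name.

```python
def singular_id_name(collection: str) -> str:
    """Hypothetical helper: turn a collection name into an id column name.

    Example: "site-type-codes" -> "site_type_code".
    """
    # Column names use underscores where collection names use dashes.
    name = collection.replace("-", "_")
    # Naive singularization (an assumption): drop a single trailing "s".
    if name.endswith("s"):
        name = name[:-1]
    return name


def rename_id(record: dict, collection: str) -> dict:
    """Return a copy of `record` with the "id" key renamed for `collection`."""
    new_key = singular_id_name(collection)
    return {(new_key if key == "id" else key): value
            for key, value in record.items()}


# Made-up record, for illustration only.
example = rename_id({"id": "X", "some_other_column": "Y"}, "site-type-codes")
print(example)
```

The same mapping applied to "time-zone-codes" would give "time_zone_code".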