Features
- Add `mistral` data holdings to `intake-esm-datastore` ({pr}`133`) [aaronspring](https://github.com/aaronspring)
- Add support for `NA-CORDEX` data holdings. ({pr}`115`) [jukent](https://github.com/jukent)
- Replace `.csv` with `netCDF` as serialization format when saving the built collection to disk. With `netCDF`, we can record very useful information into the global attributes of the netCDF dataset. ({pr}`119`) [andersy005](https://github.com/andersy005)
- Add string representation of `ESMMetadataStoreCatalog` object ({pr}`122`) [andersy005](https://github.com/andersy005)
- Automatically build missing collections by calling `esm_metadatastore(collection_name="GLADE-CMIP5")` when the specified collection is part of the curated collections in `intake-esm-datastore`. ({pr}`124`) [andersy005](https://github.com/andersy005)
  ```python
  In [1]: import intake
  In [2]: col = intake.open_esm_metadatastore(collection_name="GLADE-CMIP5")
  In [3]: # if the "GLADE-CMIP5" collection isn't built already, the above is equivalent to:
  In [4]: col = intake.open_esm_metadatastore(collection_input_definition="GLADE-CMIP5")
  ```
- Revert to using official DRS attributes when building CMIP5 and CMIP6 collections.
({pr}`126`) [andersy005](https://github.com/andersy005)
- Add `.df` property for interfacing with the built collection via a dataframe,
  to maintain backwards compatibility. ({pr}`127`) [andersy005](https://github.com/andersy005)
- Add `unique()` and `nunique()` methods for summarizing count and unique values in a collection.
({pr}`128`) [andersy005](https://github.com/andersy005)
  ```python
  In [1]: import intake
  In [2]: col = intake.open_esm_metadatastore(collection_name="GLADE-CMIP5")
  In [3]: col
  Out[3]: GLADE-CMIP5 collection catalogue with 615853 entries:
          > 3 resource(s)
          > 1 resource_type(s)
          > 1 direct_access(s)
          > 1 activity(s)
          > 218 ensemble_member(s)
          > 51 experiment(s)
          > 312093 file_basename(s)
          > 615853 file_fullpath(s)
          > 6 frequency(s)
          > 25 institute(s)
          > 15 mip_table(s)
          > 53 model(s)
          > 7 modeling_realm(s)
          > 3 product(s)
          > 9121 temporal_subset(s)
          > 454 variable(s)
          > 489 version(s)
  In [4]: col.nunique()
  resource                3
  resource_type           1
  direct_access           1
  activity                1
  ensemble_member       218
  experiment             51
  file_basename      312093
  file_fullpath      615853
  frequency               6
  institute              25
  mip_table              15
  model                  53
  modeling_realm          7
  product                 3
  temporal_subset      9121
  variable              454
  version               489
  dtype: int64
  In [5]: col.unique(columns=['frequency', 'modeling_realm'])
  {'frequency': {'count': 6, 'values': ['mon', 'day', '6hr', 'yr', '3hr', 'fx']},
   'modeling_realm': {'count': 7, 'values': ['atmos', 'land', 'ocean', 'seaIce', 'ocnBgchem',
                                             'landIce', 'aerosol']}}
  ```
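
To illustrate what the `unique()` and `nunique()` summaries report, here is a minimal, library-free sketch that produces output of the same shape from a list of row dictionaries. It is not the intake-esm implementation: the `ROWS` data and the standalone `unique`/`nunique` helpers are hypothetical stand-ins for a built collection and its methods.

```python
# Hypothetical catalog rows; the values are illustrative, not real holdings.
ROWS = [
    {"frequency": "mon", "modeling_realm": "atmos", "variable": "tas"},
    {"frequency": "day", "modeling_realm": "atmos", "variable": "pr"},
    {"frequency": "mon", "modeling_realm": "ocean", "variable": "tos"},
    {"frequency": "mon", "modeling_realm": "atmos", "variable": "pr"},
]


def unique(rows, columns=None):
    """Return {column: {"count": n, "values": [...]}} for the given columns."""
    if columns is None:
        columns = list(rows[0]) if rows else []
    summary = {}
    for column in columns:
        # dict.fromkeys drops duplicates while keeping first-seen order.
        values = list(dict.fromkeys(row[column] for row in rows))
        summary[column] = {"count": len(values), "values": values}
    return summary


def nunique(rows):
    """Return {column: number of distinct values} for every column."""
    return {column: info["count"] for column, info in unique(rows).items()}


print(unique(ROWS, columns=["frequency", "modeling_realm"]))
print(nunique(ROWS))
```

The sketch keeps values in first-seen order so the result is deterministic; whether the real methods guarantee any particular ordering is not stated in these notes.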
Bug Fixes
- For CMIP6, extract `grid_label` from directory path instead of file name. ({pr}`127`) [andersy005](https://github.com/andersy005)
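
As a rough illustration of this fix, the sketch below reads `grid_label` from the directory hierarchy rather than from the file name. It assumes the standard CMIP6 DRS directory layout, in which a file sits under `.../<variable_id>/<grid_label>/<version>/`. The helper name `grid_label_from_path` and the example path are hypothetical, and this is not the code used in intake-esm.

```python
from pathlib import PurePosixPath


def grid_label_from_path(filepath):
    """Return the grid_label directory component of a CMIP6 DRS file path.

    Assumes the layout .../<variable_id>/<grid_label>/<version>/<filename>,
    so grid_label is the directory two levels above the file.
    """
    path = PurePosixPath(filepath)
    # path.parent is the version directory; its parent is grid_label.
    return path.parent.parent.name


# Hypothetical path following the CMIP6 DRS directory template.
example = (
    "/data/CMIP6/CMIP/NCAR/CESM2/historical/r1i1p1f1/Amon/tas/gn/"
    "v20190308/tas_Amon_CESM2_historical_r1i1p1f1_gn_185001-201412.nc"
)
print(grid_label_from_path(example))
```

Reading the value from the directory avoids depending on the position of fields in the file name, which shifts when the optional time-range suffix is absent.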
Contributors to this release
([GitHub contributors page for this release](https://github.com/intake/intake-esm/graphs/contributors?from=2019-10-15&to=2019-12-13&type=c))