This release includes the full endometrial dataset, along with all the utility functions for working with it. We'll include the ovarian and colon cancer datasets in a future release, once we've set up remote data storage.
The dataset is now a class that the user instantiates, rather than a submodule that they import. As a result, the syntax for loading the dataset has changed slightly. Instead of running "import cptac.endometrial as en", the user now runs two commands: first "import cptac", then "en = cptac.Endometrial()". After that, working with the dataset uses the same syntax as before, through the variable the dataset was assigned to, e.g. "clinical = en.get_clinical()" and so on.
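As a rough sketch, the new loading pattern could be wrapped as shown below. The helper names load_endometrial and get_clinical_table are hypothetical and used only for illustration; the only cptac calls assumed are the ones quoted above (cptac.Endometrial() and get_clinical()).

```python
def load_endometrial():
    """Instantiate the endometrial dataset using the new class-based syntax.

    Hypothetical helper for illustration. The import sits inside the
    function so this sketch can be read and imported even where the
    cptac package is not installed.
    """
    # Old syntax, replaced in this release:
    #     import cptac.endometrial as en
    import cptac

    # New syntax: the dataset is a class that you instantiate.
    en = cptac.Endometrial()
    return en


def get_clinical_table(en):
    """Fetch the clinical table from an already-instantiated dataset.

    Hypothetical helper: after loading, the dataset is used through the
    variable it was assigned to, as in earlier releases.
    """
    return en.get_clinical()
```

In a session with cptac installed, this would be used as "en = load_endometrial()" followed by "clinical = get_clinical_table(en)".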
We have changed the syntax for the three merging functions: compare_omics, append_metadata_to_omics, and append_mutations_to_omics. Instead of loading the dataframes you want to merge and then passing them to the function, you now pass strings containing the names of the dataframes you want to merge, e.g. "appended = en.append_metadata_to_omics(metadata_df_name="derived_molecular", omics_df_name="phosphoproteomics")". Note that the parameter names now end in "_name". These functions no longer accept dataframes; you must pass dataframe names instead.
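For reference, the call quoted above could be written out as in the sketch below. The wrapper function append_derived_molecular_to_phospho is hypothetical and exists only to keep the example self-contained; the method name, parameter names, and dataframe names are taken from the example in these notes.

```python
def append_derived_molecular_to_phospho(en):
    """Merge metadata into omics data by passing dataframe names.

    Hypothetical wrapper for illustration; `en` is assumed to be an
    instantiated dataset such as cptac.Endometrial().
    """
    # Pass the *names* of the dataframes as strings, not the dataframes
    # themselves. Note the "_name" suffix on both parameters.
    appended = en.append_metadata_to_omics(
        metadata_df_name="derived_molecular",
        omics_df_name="phosphoproteomics",
    )
    return appended
```

With a loaded dataset, "appended = append_derived_molecular_to_phospho(en)" would be equivalent to the one-line call quoted above.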