What's Changed
* Update Vega-Lite from version 5.8.0 to version 5.14.1; see [Vega-Lite Release Notes](https://github.com/vega/vega-lite/releases).
Enhancements
1. The `chart.transformed_data()` method was added to extract transformed chart data
For example, given an Altair chart that includes aggregations:
```python
import altair as alt
from vega_datasets import data

cars = data.cars.url
chart = alt.Chart(cars).mark_bar().encode(
    y='Cylinders:O',
    x='mean_acc:Q'
).transform_aggregate(
    mean_acc='mean(Acceleration)',
    groupby=["Cylinders"]
)
chart
```
![image](https://github.com/altair-viz/altair/assets/5186265/d2b45c35-fbf4-4ae0-9b1f-08134efc8922)
It's now possible to call the `chart.transformed_data()` method to extract a pandas DataFrame containing the transformed data.
```python
chart.transformed_data()
```
![image](https://github.com/altair-viz/altair/assets/5186265/35891190-18bd-4910-a116-5dbc36af9482)
This method depends on VegaFusion with the `embed` extras enabled.
***
2. Introduction of a new data transformer named `vegafusion`
VegaFusion is an external project that provides efficient Rust implementations of most of Altair's data transformations. Used as a data transformer, VegaFusion can overcome Altair's `MaxRowsError` by performing data-intensive aggregations in Python and pruning unused columns from the source dataset.
The data transformer can be enabled as follows:
```python
import altair as alt
alt.data_transformers.enable("vegafusion")  # default is "default"
```
> ```cmd
> DataTransformerRegistry.enable('vegafusion')
> ```
A very large DataFrame can now be visualized as a histogram, with the binning done within VegaFusion:
```python
import pandas as pd
import altair as alt

# Prepare a DataFrame with 1 million rows
flights = pd.read_parquet(
    "https://vegafusion-datasets.s3.amazonaws.com/vega/flights_1m.parquet"
)

delay_hist = alt.Chart(flights).mark_bar(tooltip=True).encode(
    alt.X("delay", bin=alt.Bin(maxbins=30)),
    alt.Y("count()")
)
delay_hist
```
![image](https://github.com/altair-viz/altair/assets/5186265/773b81ab-280d-4164-9c32-99d2b9567e12)
When the `vegafusion` data transformer is active, data transformations are pre-evaluated when displaying, saving, and converting charts to a dictionary or JSON.
For a detailed overview, see the [VegaFusion Data Transformer](https://altair-viz.github.io/user_guide/large_datasets.html#vegafusion-data-transformer) section of the documentation.
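
To illustrate why pre-evaluating transforms sidesteps the row limit, here is a minimal plain-Python sketch of the general idea. It is not VegaFusion's actual implementation: it only shows that binning and counting on the Python side reduces an arbitrarily long column to one row per bin, which is all a histogram needs. The helper name `prebinned_counts`, the fixed bin edges, and the sample values are hypothetical and chosen purely for illustration.

```python
from collections import Counter
import math


def prebinned_counts(values, start, step):
    """Count values per fixed-width bin (hypothetical helper, illustration only)."""
    counts = Counter()
    for v in values:
        # Index of the bin [start + i*step, start + (i+1)*step) that contains v
        i = math.floor((v - start) / step)
        counts[i] += 1
    # One output row per non-empty bin, ordered by bin start
    return [
        {"bin_start": start + i * step, "bin_end": start + (i + 1) * step, "count": n}
        for i, n in sorted(counts.items())
    ]


# Made-up delay values; a real dataset could hold millions of them
delays = [-5, 3, 12, 14, 27, 31, 33, 58]
rows = prebinned_counts(delays, start=-20, step=20)
for row in rows:
    print(row)
```

However many input values there are, the number of output rows is bounded by the number of bins, so the data embedded in the chart stays small.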
***
3. A `JupyterChart` class was added to support accessing params and selections from Python
The `JupyterChart` class makes it possible to update charts after they have been displayed and access the state of interactions from Python.
For example, given an Altair chart that includes an interval selection used as a brush:
```python
import altair as alt
from vega_datasets import data

source = data.cars()
brush = alt.selection_interval(name="interval", value={"x": [80, 160], "y": [15, 30]})

chart = alt.Chart(source).mark_point().encode(
    x='Horsepower:Q',
    y='Miles_per_Gallon:Q',
    color=alt.condition(brush, 'Cylinders:O', alt.value('grey')),
).add_params(brush)

jchart = alt.JupyterChart(chart)
jchart
```
![image](https://github.com/altair-viz/altair/assets/5186265/44c2a7f6-195b-4a9d-81a8-69b8330957bf)
It is now possible to read the state of the defined interval selection from Python using the `JupyterChart`:
```python
jchart.selections.interval.value
```
> ```cmd
> {'Horsepower': [80, 160], 'Miles_per_Gallon': [15, 30]}
> ```
The selection dictionary may be converted into a pandas query to filter the source DataFrame:
```python
filter = " and ".join([
    f"{v[0]} <= `{k}` <= {v[1]}"
    for k, v in jchart.selections.interval.value.items()
])
source.query(filter)
```
![image](https://github.com/altair-viz/altair/assets/5186265/be57038e-9861-4164-bf84-74f3f83e4959)
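
The query-building step above can be isolated into a small helper that runs without a notebook, which makes it easy to inspect the string handed to `DataFrame.query`. This is only a sketch: the name `selection_to_query` is hypothetical, and the input dictionary is the selection value shown earlier rather than a live `jchart.selections.interval.value`.

```python
def selection_to_query(selection):
    """Build a pandas query string from an interval selection dict.

    Hypothetical helper: each key is a field name and each value is a
    [low, high] pair, matching the selection value shown above.
    """
    return " and ".join(
        # Backticks let pandas query() handle field names that are not plain identifiers
        f"{low} <= `{field}` <= {high}"
        for field, (low, high) in selection.items()
    )


selection = {"Horsepower": [80, 160], "Miles_per_Gallon": [15, 30]}
query = selection_to_query(selection)
print(query)
```

The resulting string can then be passed to `source.query(...)` as in the snippet above.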
Another possibility offered by the new `JupyterChart` class is to use ipywidgets to control parameters in Altair. Here we use an ipywidgets `IntSlider` to control the Altair parameter named `cutoff`.
```python
import altair as alt
import pandas as pd
import numpy as np
from ipywidgets import IntSlider, link, VBox

rand = np.random.RandomState(42)

df = pd.DataFrame({
    'xval': range(100),
    'yval': rand.randn(100).cumsum()
})

cutoff = alt.param(name="cutoff", value=23)

chart = alt.Chart(df).mark_point().encode(
    x='xval',
    y='yval',
    color=alt.condition(
        alt.datum.xval < cutoff,
        alt.value('red'), alt.value('blue')
    )
).add_params(
    cutoff
)
jchart = alt.JupyterChart(chart)

slider = IntSlider(min=0, max=100, description='ipywidget')
link((slider, "value"), (jchart.params, "cutoff"))

VBox([slider, jchart])
```
![image](https://github.com/altair-viz/altair/assets/5186265/9297282d-5b26-4388-b74f-1f1debb5f3e9)
The `JupyterChart` class depends on AnyWidget. For a detailed overview, see the new documentation page on [JupyterChart Interactivity](https://altair-viz.github.io/user_guide/jupyter_chart.html).
***
4. Support for field encoding inference for objects that support the DataFrame Interchange Protocol
We are maturing support for objects built upon the DataFrame Interchange Protocol in Altair.
Given the following pandas DataFrame with an ordered categorical column:
```python
import altair as alt
from vega_datasets import data

# Clean Title column
movies = data.movies()
movies["Title"] = movies["Title"].astype(str)

# Convert MPAA rating to an ordered categorical
rating = movies["MPAA_Rating"].astype("category")
rating = rating.cat.reorder_categories(
    ['Open', 'G', 'PG', 'PG-13', 'R', 'NC-17', 'Not Rated']
).cat.as_ordered()
movies["MPAA_Rating"] = rating

# Build chart using pandas
chart = alt.Chart(movies).mark_bar().encode(
    alt.X("MPAA_Rating"),
    alt.Y("count()")
)
chart
```
![image](https://github.com/altair-viz/altair/assets/5186265/236aa2bf-4cda-4265-8c5b-7eb244dc3b02)
We can convert the DataFrame to a PyArrow Table and observe that the types are now inferred in the same way when rendering the chart.
```python
import pyarrow as pa

# Build chart using PyArrow
chart = alt.Chart(pa.Table.from_pandas(movies)).mark_bar().encode(
    alt.X("MPAA_Rating"),
    alt.Y("count()")
)
chart
```
![image](https://github.com/altair-viz/altair/assets/5186265/236aa2bf-4cda-4265-8c5b-7eb244dc3b02)
Vega-Altair's support for the DataFrame Interchange Protocol depends on PyArrow.
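
As a rough illustration of what "supporting the DataFrame Interchange Protocol" means, objects that implement the protocol expose a `__dataframe__` method. The sketch below checks for that entry point in plain Python. The helper `supports_interchange_protocol` and the `FakeTable` class are hypothetical stand-ins, and Altair's own detection logic may differ from this check.

```python
def supports_interchange_protocol(obj):
    """Return True if obj exposes the protocol's __dataframe__ entry point."""
    return callable(getattr(obj, "__dataframe__", None))


class FakeTable:
    """Hypothetical stand-in for a protocol-aware table (for example a PyArrow Table)."""

    def __dataframe__(self, nan_as_null=False, allow_copy=True):
        # A real implementation would return an interchange DataFrame object
        raise NotImplementedError("illustration only")


print(supports_interchange_protocol(FakeTable()))
print(supports_interchange_protocol([1, 2, 3]))
```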
***
5. A new transform method `transform_extent` is available
The following example shows how this transform can be used:
```python
import pandas as pd
import altair as alt

df = pd.DataFrame(
    [
        {"a": "A", "b": 28},
        {"a": "B", "b": 55},
        {"a": "C", "b": 43},
        {"a": "D", "b": 91},
        {"a": "E", "b": 81},
        {"a": "F", "b": 53},
        {"a": "G", "b": 19},
        {"a": "H", "b": 87},
        {"a": "I", "b": 52},
    ]
)

base = alt.Chart(df, title="A Simple Bar Chart with Lines at Extents").transform_extent(
    extent="b", param="b_extent"
)
bars = base.mark_bar().encode(x="b", y="a")
lower_extent_rule = base.mark_rule(stroke="firebrick").encode(
    x=alt.value(alt.expr("scale('x', b_extent[0])"))
)
upper_extent_rule = base.mark_rule(stroke="firebrick").encode(
    x=alt.value(alt.expr("scale('x', b_extent[1])"))
)
bars + lower_extent_rule + upper_extent_rule
```
![image](https://github.com/altair-viz/altair/assets/5186265/623b36b2-1440-41dc-99bb-876210b6d642)
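
The extent transform finds the minimum and maximum of a field and stores the pair in a parameter, which is why the rules above index `b_extent[0]` and `b_extent[1]`. The plain-Python sketch below mimics that computation on the same records, so the two rule positions can be checked by hand. The helper name `field_extent` is hypothetical and is not part of Altair.

```python
def field_extent(records, field):
    """Return [min, max] of a field across records (hypothetical helper)."""
    values = [row[field] for row in records]
    return [min(values), max(values)]


# Same records as in the chart example above
records = [
    {"a": "A", "b": 28},
    {"a": "B", "b": 55},
    {"a": "C", "b": 43},
    {"a": "D", "b": 91},
    {"a": "E", "b": 81},
    {"a": "F", "b": 53},
    {"a": "G", "b": 19},
    {"a": "H", "b": 87},
    {"a": "I", "b": 52},
]

b_extent = field_extent(records, "b")
print(b_extent)  # [19, 91]
```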
***
6. It is now possible to add configurable pixels-per-inch (ppi) metadata to saved and displayed PNG images
```python
import altair as alt
from vega_datasets import data

source = data.cars()

chart = alt.Chart(source).mark_boxplot(extent="min-max").encode(
    alt.X("Miles_per_Gallon:Q").scale(zero=False),
    alt.Y("Origin:N"),
)
chart.save("box.png", ppi=300)
```
![image](https://user-images.githubusercontent.com/15064365/263293470-dc9ce553-96b2-4e7f-8e13-1dc0c66acd0c.png)
```python
alt.renderers.enable("png", ppi=144)  # default ppi is 72
chart
```
![image](https://github.com/altair-viz/altair/assets/5186265/fce90ec1-bc8b-4ebf-b830-63f535180c2a)
Bug Fixes
* Don't call `len` on DataFrame Interchange Protocol objects (#3111)
Maintenance
* Add support for new referencing logic in version 4.18 of the jsonschema package
Backward-Incompatible Changes
* Drop support for Python 3.7 which is end-of-life (#3100)
* Hard dependencies: Increase minimum required pandas version to 0.25 (#3130)
* Soft dependencies: Increase minimum required vl-convert-python version to 0.13.0 and increase minimum required vegafusion version to 1.4.0 (#3163, #3160)
New Contributors
* thomend made their first contribution in https://github.com/altair-viz/altair/pull/3086
* NickCrews made their first contribution in https://github.com/altair-viz/altair/pull/3155
Release Notes by Pull Request
<details><summary>Click to view all 52 PRs merged for this release</summary>
* Explicitly specify arguments for to_dict and to_json methods for top-level chart objects by binste in https://github.com/altair-viz/altair/pull/3073
* Add Vega-Lite to Vega compiler registry and format arg to to_dict() and to_json() by jonmmease in https://github.com/altair-viz/altair/pull/3071
* Sanitize timestamps in arrow tables by jonmmease in https://github.com/altair-viz/altair/pull/3076
* Fix ridgeline example by binste in https://github.com/altair-viz/altair/pull/3082
* Support extracting transformed chart data using VegaFusion by jonmmease in https://github.com/altair-viz/altair/pull/3081
* Improve troubleshooting docs regarding Vega-Lite 5 by binste in https://github.com/altair-viz/altair/pull/3074
* Make transformed_data public and add initial docs by jonmmease in https://github.com/altair-viz/altair/pull/3084
* MAINT: Gitignore venv folders and use gitignore for black by binste in https://github.com/altair-viz/altair/pull/3087
* Fixed Wheat and Wages case study by thomend in https://github.com/altair-viz/altair/pull/3086
* Type hints: Parts of folders "vegalite", "v5", and "utils" by binste in https://github.com/altair-viz/altair/pull/2976
* Fix CI by jonmmease in https://github.com/altair-viz/altair/pull/3095
* Add VegaFusion data transformer with mime renderer, save, and to_dict/to_json integration by jonmmease in https://github.com/altair-viz/altair/pull/3094
* Unpin vl-convert-python in dev/ci dependencies by jonmmease in https://github.com/altair-viz/altair/pull/3099
* Drop support for Python 3.7 which is end-of-life by binste in https://github.com/altair-viz/altair/pull/3100
* Add support to transformed_data for reconstructed charts (with from_dict/from_json) by binste in https://github.com/altair-viz/altair/pull/3102
* Add VegaFusion data transformer documentation by jonmmease in https://github.com/altair-viz/altair/pull/3107
* Don't call len on DataFrame interchange protocol object by jonmmease in https://github.com/altair-viz/altair/pull/3111
* copied percentage calculation in example by thomend in https://github.com/altair-viz/altair/pull/3116
* Distributions and medians of likert scale ratings by thomend in https://github.com/altair-viz/altair/pull/3120
* Support for type inference for DataFrames using the DataFrame Interchange Protocol by jonmmease in https://github.com/altair-viz/altair/pull/3114
* Add some 5.1.0 release note entries by jonmmease in https://github.com/altair-viz/altair/pull/3123
* Add a code of conduct by joelostblom in https://github.com/altair-viz/altair/pull/3124
* master -> main by jonmmease in https://github.com/altair-viz/altair/pull/3126
* Handle pyarrow-backed columns in pandas 2 DataFrames by jonmmease in https://github.com/altair-viz/altair/pull/3128
* Fix accidental requirement of Pandas 1.5. Bump minimum Pandas version to 0.25. Run tests with it by binste in https://github.com/altair-viz/altair/pull/3130
* Add Roadmap and CoC to the documentation by jonmmease in https://github.com/altair-viz/altair/pull/3129
* MAINT: Use importlib.metadata and packaging instead of deprecated pkg_resources by binste in https://github.com/altair-viz/altair/pull/3133
* Add online JupyterChart widget based on AnyWidget by jonmmease in https://github.com/altair-viz/altair/pull/3119
* feat(widget): prefer lodash-es/debounce to reduce import size by manzt in https://github.com/altair-viz/altair/pull/3135
* Fix contributing descriptions by thomend in https://github.com/altair-viz/altair/pull/3121
* Implement governance structure based on GitHub's MVG by binste in https://github.com/altair-viz/altair/pull/3139
* Type hint schemapi.py by binste in https://github.com/altair-viz/altair/pull/3142
* Add JupyterChart section to Users Guide by jonmmease in https://github.com/altair-viz/altair/pull/3137
* Add governance page to the website by jonmmease in https://github.com/altair-viz/altair/pull/3144
* MAINT: Remove altair viewer as a development dependency by binste in https://github.com/altair-viz/altair/pull/3147
* Add support for new referencing resolution in jsonschema>=4.18 by binste in https://github.com/altair-viz/altair/pull/3118
* Update Vega-Lite to 5.14.1. Add transform_extent by binste in https://github.com/altair-viz/altair/pull/3148
* MAINT: Fix type hint errors which came up with new pandas-stubs release by binste in https://github.com/altair-viz/altair/pull/3154
* JupyterChart: Add support for params defined in the extent transform by jonmmease in https://github.com/altair-viz/altair/pull/3151
* doc: Add tooltip to Line example with custom order by NickCrews in https://github.com/altair-viz/altair/pull/3155
* docs: examples: add line plot with custom order by NickCrews in https://github.com/altair-viz/altair/pull/3156
* docs: line: Improve prose on custom ordering by NickCrews in https://github.com/altair-viz/altair/pull/3158
* docs: examples: remove connected_scatterplot by NickCrews in https://github.com/altair-viz/altair/pull/3159
* Refactor optional import logic and verify minimum versions by jonmmease in https://github.com/altair-viz/altair/pull/3160
* Governance: Mark binste as committee chair by binste in https://github.com/altair-viz/altair/pull/3165
* Add ppi argument for saving and displaying charts as PNG images by jonmmease in https://github.com/altair-viz/altair/pull/3163
* Silence AnyWidget warning (and support hot-reload) in development mode by jonmmease in https://github.com/altair-viz/altair/pull/3166
* Update roadmap.rst by mattijn in https://github.com/altair-viz/altair/pull/3167
* Add return type to transform_extent by binste in https://github.com/altair-viz/altair/pull/3169
* Use import_vl_convert in _spec_to_mimebundle_with_engine for better error message by jonmmease in https://github.com/altair-viz/altair/pull/3168
* update example world projections by mattijn in https://github.com/altair-viz/altair/pull/3170
* Send initial selections to Python in JupyterChart by jonmmease in https://github.com/altair-viz/altair/pull/3172
</details>
**Full Changelog**: https://github.com/altair-viz/altair/compare/v5.0.1...v5.1.0