Improvements
- Add support for writing with Arrow as the transfer mechanism for data
from Python to GDAL (requires GDAL >= 3.8). This is provided through the
new `pyogrio.raw.write_arrow` function, or by using the `use_arrow=True`
option in `pyogrio.write_dataframe` (314, 346).
- Add support for `fids` filter to `read_arrow` and `open_arrow`, and to
`read_dataframe` with `use_arrow=True` (304).
- Add some missing properties to `read_info`, including layer name, geometry name
and FID column name (365).
- `read_arrow` and `open_arrow` now provide
[GeoArrow-compliant extension metadata](https://geoarrow.org/extension-types.html),
including the CRS, when using GDAL 3.8 or higher (366).
- The `open_arrow` function can now be used without a `pyarrow` dependency. By
default, it will now return a stream object implementing the
[Arrow PyCapsule Protocol](https://arrow.apache.org/docs/format/CDataInterface/PyCapsuleInterface.html)
(i.e. having an `__arrow_c_stream__` method). This object can then be consumed
by your Arrow implementation of choice that supports this protocol. To keep
the previous behaviour of returning a `pyarrow.RecordBatchReader`, specify
`use_pyarrow=True` (349).
- Warn when reading from a multilayer file without specifying a layer (362).
- Allow writing to a new in-memory datasource using an `io.BytesIO` object (397).
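
As a rough illustration of the `open_arrow` change above, the sketch below shows one way the returned stream could be consumed with `pyarrow`. It assumes a recent `pyarrow` that provides `RecordBatchReader.from_stream`; `supports_arrow_stream` and `read_table` are hypothetical helper names and not part of pyogrio.

```python
def supports_arrow_stream(obj):
    """Return True if obj exposes the Arrow PyCapsule stream interface."""
    return hasattr(obj, "__arrow_c_stream__")


def read_table(path):
    """Read all features from path into a pyarrow.Table (hypothetical helper)."""
    # Imported lazily so the protocol check above works without either package.
    import pyarrow as pa
    from pyogrio.raw import open_arrow

    with open_arrow(path) as (meta, stream):
        if not supports_arrow_stream(stream):
            raise TypeError("expected an object with __arrow_c_stream__")
        # Materialize the data before leaving the block, since the
        # underlying dataset is closed when the context manager exits.
        reader = pa.RecordBatchReader.from_stream(stream)
        return meta, reader.read_all()
```

Any other library that implements the protocol could take the place of `pyarrow` here; passing `use_pyarrow=True` to `open_arrow` instead returns a `pyarrow.RecordBatchReader` directly.
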
Bug fixes
- Fix error in `write_dataframe` if input has a date column and
non-consecutive index values (325).
- Fix encoding issues on Windows for some formats (e.g. ".csv") and always write ESRI
Shapefiles using UTF-8 by default on all platforms (361).
- Raise an exception in `read_arrow` or `read_dataframe(..., use_arrow=True)` if
a boolean column is detected, due to an error in GDAL when reading boolean values
with the FlatGeobuf / GPKG drivers (335, 387); this has been fixed in GDAL >= 3.8.3.
- Properly ignore fields not listed in the `columns` parameter when reading from
the data source without using the Arrow API (391).
- Properly handle decoding of ESRI Shapefiles with user-provided `encoding`
option for `read`, `read_dataframe`, and `open_arrow`, and correctly encode
Shapefile field names and text values to the user-provided `encoding` for
`write` and `write_dataframe` (384).
- Fix bug preventing reading from bytes or file-like objects in `read_arrow` /
`open_arrow` (407).
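
To illustrate why the `encoding` option in the Shapefile fix above matters, here is a minimal sketch using only the standard library, followed by a hypothetical wrapper around `read_dataframe`. The `cp1252` encoding and the helper name `read_legacy_shapefile` are assumptions chosen for illustration.

```python
text = "Zürich"
raw = text.encode("cp1252")  # bytes as a legacy Shapefile might store them

# Decoding with the encoding that was used for writing recovers the text.
assert raw.decode("cp1252") == text

# Decoding the same bytes as UTF-8 fails; with errors="replace" the
# invalid byte becomes the Unicode replacement character instead.
garbled = raw.decode("utf-8", errors="replace")


def read_legacy_shapefile(path, encoding="cp1252"):
    """Read a Shapefile whose attributes use a non-UTF-8 encoding (hypothetical helper)."""
    # Imported lazily so the demonstration above runs without pyogrio.
    from pyogrio import read_dataframe

    return read_dataframe(path, encoding=encoding)
```
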
Packaging
- The GDAL library included in the wheels is updated from GDAL 3.7.2 to GDAL 3.8.5.
Potentially breaking changes
- Using a `where` expression combined with a list of `columns` that does not include
the column referenced in the expression is not recommended and will now
return results based on driver-dependent behavior, which may include either
returning empty results (even if the `where` expression is expected to match features)
or raising an exception (391). Previous versions of pyogrio incorrectly
set ignored fields against the data source, allowing it to return non-empty
results in these cases.
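
One way to avoid the driver-dependent behavior described above is to always request the columns that the `where` expression references and drop them after reading. The sketch below assumes the caller knows which columns the expression uses; `columns_for_query` and `read_filtered` are hypothetical helpers, not pyogrio functions.

```python
def columns_for_query(columns, where_columns):
    """Return columns plus any where-referenced columns, in order, without duplicates."""
    return list(dict.fromkeys([*columns, *where_columns]))


def read_filtered(path, columns, where, where_columns):
    """Read with a where filter, then drop columns requested only for filtering."""
    # Imported lazily so columns_for_query can be used without pyogrio installed.
    from pyogrio import read_dataframe

    requested = columns_for_query(columns, where_columns)
    df = read_dataframe(path, columns=requested, where=where)
    extra = [name for name in requested if name not in columns]
    return df.drop(columns=extra)


# Hypothetical usage (the path and column names are placeholders):
# read_filtered("countries.gpkg", ["name"], "pop_est > 1000000", ["pop_est"])
```
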