SciPy 1.8.0 is the culmination of 6 months of hard work. It contains
many new features, numerous bug-fixes, improved test coverage and better
documentation. There have been a number of deprecations and API changes
in this release, which are documented below. All users are encouraged to
upgrade to this release, as there are a large number of bug-fixes and
optimizations. Before upgrading, we recommend that users check that
their own code does not use deprecated SciPy functionality (to do so,
run your code with ``python -Wd`` and check for ``DeprecationWarning`` s).
Our development attention will now shift to bug-fix releases on the
1.8.x branch, and on adding new features on the master branch.
This release requires Python 3.8+ and NumPy 1.17.3 or greater.
For running on PyPy, PyPy3 6.0+ is required.
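A programmatic way to surface the same warnings, for runs where the interpreter
flags cannot be changed (a minimal sketch, not part of the release notes)::

    import warnings

    # Make DeprecationWarning visible outside ``__main__`` (Python filters it
    # out by default).
    warnings.simplefilter("default", category=DeprecationWarning)

    # ...then import and exercise your own SciPy-using code; deprecated calls
    # will now emit visible warnings instead of being silently ignored.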
**************************
Highlights of this release
**************************
- A sparse array API has been added for early testing and feedback; this
work is ongoing, and users should expect minor API refinements over
the next few releases.
- The sparse SVD library PROPACK is now vendored with SciPy, and an interface
is exposed via `scipy.sparse.linalg.svds` with ``solver='PROPACK'``. It is currently
default-off due to potential issues on Windows that we aim to
resolve in the next release, but can be optionally enabled at runtime for
friendly testing with an environment variable setting of ``USE_PROPACK=1``
(see the sketch after this list).
- A new `scipy.stats.sampling` submodule that leverages the ``UNU.RAN`` C
library to sample from arbitrary univariate non-uniform continuous and
discrete distributions (also sketched after this list).
- All namespaces that were private but happened to miss underscores in
their names have been deprecated.
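A rough illustration of the PROPACK and `scipy.stats.sampling` highlights above
(a minimal sketch, not part of the release notes; it assumes SciPy 1.8 and opts
into PROPACK at runtime as described)::

    import os
    os.environ["USE_PROPACK"] = "1"   # opt-in switch for the PROPACK solver

    import numpy as np
    from scipy.sparse import random as sparse_random
    from scipy.sparse.linalg import svds
    from scipy.stats.sampling import TransformedDensityRejection

    # Sparse SVD backed by the vendored PROPACK solver (default-off in 1.8.0).
    A = sparse_random(100, 80, density=0.05, format="csr", random_state=0)
    u, s, vt = svds(A, k=5, solver="PROPACK")

    # UNU.RAN-based sampling from a custom continuous distribution; the
    # distribution object only needs ``pdf`` and ``dpdf`` methods.
    class StdNormal:
        def pdf(self, x):
            return np.exp(-0.5 * x * x)

        def dpdf(self, x):
            return -x * np.exp(-0.5 * x * x)

    rng = TransformedDensityRejection(StdNormal(), random_state=42)
    draws = rng.rvs(1000)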
NumPy 1.22.3 is a maintenance release that fixes bugs discovered after the
1.22.2 release. The most noticeable fixes may be those for DLPack. One that may
cause some problems is disallowing strings as inputs to logical ufuncs. It is
still undecided how strings should be treated in those functions and it was
thought best to simply disallow them until a decision was reached. That should
not cause problems with older code.
NumPy 1.22.2 is a maintenance release that fixes bugs discovered after the
1.22.1 release. Notable fixes are:
- Several build related fixes for downstream projects and other platforms.
- Various Annotation fixes/additions.
- Numpy wheels for Windows will use the 1.41 tool chain, fixing downstream link
problems for projects using NumPy provided libraries on Windows.
- Deal with CVE-2021-41495 complaint.
NumPy 1.22.1 is a maintenance release that fixes bugs discovered after the
1.22.0 release. Notable fixes are:
- Fix f2py docstring problems (SciPy)
- Fix reduction type problems (AstroPy)
- Fix various typing bugs.
NumPy 1.22.0 is a big release featuring the work of 153 contributors spread
over 609 pull requests. There have been many improvements, highlights are:
* Annotations of the main namespace are essentially complete. Upstream is a
moving target, so there will likely be further improvements, but the major
work is done. This is probably the most user visible enhancement in this
release.
* A preliminary version of the proposed Array-API is provided. This is a step
in creating a standard collection of functions that can be used across
applications such as CuPy and JAX.
* NumPy now has a DLPack backend. DLPack provides a common interchange format
for array (tensor) data.
* New methods for ``quantile``, ``percentile``, and related functions. The new
methods provide a complete set of the methods commonly found in the
literature (see the example after this list).
* A new configurable allocator for use by downstream projects.
* The universal functions have been refactored to implement most of
:ref:`NEP 43 <NEP43>`. This also unlocks the ability to experiment with the
future DType API.
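For the ``quantile``/``percentile`` item above, a small illustrative example
of the new ``method`` keyword (not part of the release notes)::

    import numpy as np

    a = np.array([1.0, 2.0, 3.0, 10.0])
    # ``method`` selects one of the estimation methods from the literature;
    # "linear" remains the default behaviour.
    print(np.quantile(a, 0.25, method="linear"))
    print(np.quantile(a, 0.25, method="median_unbiased"))
    print(np.percentile(a, 25, method="closest_observation"))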
The R package installs the file lib/R/etc/Makeconf, which is intended
to be used by R packages that themselves compile programs. This
feature is rarely used, but the math/R-nimble package is an example of
one that does. For this to work, the compiler flags embedded within
Makeconf must be compatible with the system compiler. At least on
MacOS, this is not the case by default, and so nimble compilations
fail. This substitutes ${COMPILER_RPATH_FLAG} into configure.ac, so
that it is used when creating Makeconf. Since neither R itself nor
most R packages compile other programs, Makeconf is generally not used
and this fix will have no impact.
lapack creates .mod files in the same location when building the static
and shared libraries; these then interfere with each other. Put the .mod
files created when building the static library in a different directory
to fix this.
Now correctly detects external devel/cpu_features package, so remove
patches for that. Also remove boost dependency since the package was
changed to use C++17 instead of boost.
Upstream changes:
CHANGES IN R 4.1.3:
NEW FEATURES:
* The default version of Bioconductor has been changed to 3.14.
(This is used by setRepositories and the menus in GUIs.)
UTILITIES:
* R CMD check --as-cran has a workaround for a bug in versions of
file up to at least 5.41 which mis-identify DBF files last
changed in 2022 as executables.
C-LEVEL FACILITIES:
* The legacy S-compatibility macros SINGLE_* in R_ext/Constants.h
(included by R.h) are deprecated and will be removed in R 4.2.0.
BUG FIXES:
* Initialization of self-starting nls() models with initialization
functions following the pre-R-4.1.0 API (without the ...
argument) works again for now, with a deprecation warning.
* Fixed quoting of ~autodetect~ in Java setting defaults to avoid
inadvertent user lookup due to leading ~, reported in PR#18231 by
Harold Gutch.
* substr(., start, stop) <- v now treats _negative_ stop values
correctly. Reported with a patch in PR#18228 by Brodie Gaslam.
* Subscripting an array x without dimnames by a
length(dim(x))-column character matrix gave "random" non-sense,
now an error; reported in PR#18244 by Mikael Jagan.
* ...names() now matches names(list(...)) closely, fixing PR#18247.
* all.equal(*, scale = s) now works as intended when length(s) > 1,
partly thanks to Michael Chirico's PR#18272.
* print(x) for long vectors x now also works for named atomic
vectors or lists and prints the correct number when reaching the
getOption("max.print") limit; partly thanks to a report and
proposal by Hugh Parsonage to the R-devel list.
* all.equal(<selfStart>, *) no longer signals a deprecation
warning.
* reformulate(*, response=r) gives a helpful error message now when
length(r) > 1, thanks to Bill Dunlap's PR#18281.
* Modifying globalCallingHandlers inside withCallingHandlers() now
works or fails correctly, thanks to Henrik Bengtsson's PR#18257.
* hist(<Date>, breaks = "days") and hist(<POSIXt>, breaks = "secs")
no longer fail for inputs of length 1.
* qbeta(.001, .9, .009) and similar cases now converge correctly
thanks to Ben Bolker's report in PR#17746.
* window(x, start, end) no longer wrongly signals "'start' cannot
be after 'end'", fixing PR#17527 and PR#18291.
* data() now checks that its (rarely used) list argument is a
character vector - a couple of packages passed other types and
gave incorrect results.
* which() now checks its arr.ind argument is TRUE rather than coercing
to logical and taking the first element - which gave incorrect
results in package code.
* model.weights() and model.offset() more carefully extract their
model components, thanks to Ben Bolker and Tim Taylor's R-devel
post.
* list.files(recursive = TRUE) now shows all broken symlinks
(previously, some of them may have been omitted, PR#18296).
This package contains basic definitions related to indexed
profunctors. These are primarily intended as internal utilities to support
the optics and generic-lens package families.
Haskellers are usually familiar with monoids and semigroups. A monoid has
an appending operation <> (or mappend), and an identity element, mempty. A
semigroup has an appending <> operation, but does not require a mempty
element.
A Semiring has two appending operations, plus and times, and two respective
identity elements, zero and one.
More formally, a Semiring R is a set equipped with two binary operations +
and *, such that:
- (R,+) is a commutative monoid with identity element 0,
- (R,*) is a monoid with identity element 1,
- (*) left and right distributes over addition, and multiplication by '0'
annihilates R.
This package provides tools for working with various Kan extensions and Kan
lifts in Haskell.
Among the interesting bits included are:
* Right and left Kan extensions (Ran and Lan)
* Right and left Kan lifts (Rift and Lift)
* Multiple forms of the Yoneda lemma (Yoneda)
* The Codensity monad, which can be used to improve the asymptotic
complexity of code over free monads (Codensity, Density)
* A "comonad to monad-transformer transformer" that is a special case of a
right Kan lift. (CoT, Co)
Free monads are useful for many tree-like structures and domain specific
languages.
If f is a Functor then the free Monad on f is the type of trees whose nodes
are labeled with the constructors of f. The word "free" is used in the
sense of "unrestricted" rather than "zero-cost": Free f makes no
constraining assumptions beyond those given by f and the definition of
Monad. As used here it is a standard term from the mathematical theory of
adjoint functors.
Cofree comonads are dual to free monads. They provide convenient ways to
talk about branching streams and rose-trees, and can be used to annotate
syntax trees. The cofree comonad can be seen as a stream parameterized by a
Functor that controls its branching factor.
This Haskell library provides an efficient lazy wheel sieve for prime
generation inspired by "Lazy wheel sieves and spirals of primes" by Colin
Runciman and "The Genuine Sieve of Eratosthenes" by Melissa O'Neil.
5.3.7 [2022.01.09]
* Relax the Bind constraints in the following instances to Functor:
-instance (Bind f, Monad f) => Alt (MaybeT f)
-instance (Bind f, Monad f) => Plus (MaybeT f)
+instance (Functor f, Monad f) => Alt (MaybeT f)
+instance (Functor f, Monad f) => Plus (MaybeT f)
-instance (Bind f, Monad f, Semigroup e) => Alt (ExceptT e f)
-instance (Bind f, Monad f, Semigroup e, Monoid e) => Plus (ExceptT e f)
+instance (Functor f, Monad f, Semigroup e) => Alt (ExceptT e f)
+instance (Functor f, Monad f, Semigroup e, Monoid e) => Plus (ExceptT e f)
-- If building with transformers-0.5.* or older
-instance (Bind f, Monad f) => Alt (ErrorT e f)
-instance (Bind f, Monad f, Error e) => Plus (ErrorT e f)
+instance (Functor f, Monad f) => Alt (ErrorT e f)
+instance (Functor f, Monad f, Error e) => Plus (ErrorT e f)
5.3.6 [2021.10.07]
* Allow building with GHC 9.2.
* Allow building with transformers-0.6.*.
* Add Alt instance for Identity.
* Add Conclude, Decide and Divise type classes and instances.
* Add (<.*>), (<*.>), and traverseMaybe functions, which make it easier to
define Traversable1 instances for data types that have fields with a
combination of Traversable and Traversable1 instances.
* Add Semigroupoids.Do module with overloads for use with QualifiedDo.
* Add Apply, Alt, Plus, Bind and BindTrans instances for the CPS versions
of WriterT and RWST.
* Add psum function to Data.Functor.Plus.
* Add Categorical data type.
0.20 [2021.11.15]
* Support hashable-1.4. The Hashable1 instances added in 0.19.2 are removed
for all types except NonEmpty, in accordance with the corresponding
changes from hashable-1.4.
0.19.2 [2021.08.30]
* Backport Hashable1 instances for NonEmpty, Min, Max, First, Last,
WrappedMonoid, and Option.
0.3.7.0
* Make division (/) on Scientifics slightly more efficient.
* Fix the Show instance to surround negative numbers with parentheses when
necessary.
* Add (Template Haskell) Lift Scientific instance
* Mark modules as Safe or Trustworthy (Safe Haskell).
v0.11.4 (2021 Oct 3)
Maintenance and fixes
* Fix standard deviation code in density utils by replacing it with `np.std`.
v0.11.3 (2021 Oct 1)
New features
* Added `labeller` argument to enable label customization in plots and summary
* Added `arviz.labels` module with classes and utilities (see the sketch after this list)
* Added probability estimate within ROPE in `plot_posterior`
* Added `rope_color` and `ref_val_color` arguments to `plot_posterior`
* Improved retrieving of pointwise log likelihood in `from_cmdstanpy`, `from_cmdstan` and `from_pystan`
* Added interactive legend to bokeh `forestplot`
* Added interactive legend to bokeh `ppcplot`
* Add more helpful error message for HDF5 problems reading `InferenceData` from NetCDF
* Added `data.log_likelihood`, `stats.ic_compare_method` and `plot.density_kind` to `rcParams`
* Improve error messages in `stats.compare()`, and `var_name` parameter.
* Added ability to plot HDI contours to `plot_kde` with the new `hdi_probs` parameter.
* Add dtype parsing and setting in all Stan converters
* Add option to specify colors for each element in ppc_plot
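A minimal sketch of the new `labeller` machinery (illustrative only; it
downloads ArviZ's bundled `centered_eight` example dataset):

    import arviz as az
    from arviz.labels import MapLabeller

    idata = az.load_arviz_data("centered_eight")
    # Map the variable name "mu" to a friendlier label on the plot.
    labeller = MapLabeller(var_name_map={"mu": "school mean"})
    az.plot_posterior(idata, var_names=["mu"], labeller=labeller)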
Maintenance and fixes
* Fix conversion for numpyro models with ImproperUniform latent sites
* Fixed conversion of Pyro output fit using GPUs
* Enforced using coordinate values as default labels
* Integrate `index_origin` with all the library
* Fix pareto k threshold typo in reloo function
* Preserve shape from Stan code in `from_cmdstanpy`
* Updated `from_pystan` converters to follow schema convention
* Used generator instead of list wherever possible
* Correctly use chain index when constructing PyMC3 `DefaultTrace` in `from_pymc3`
* Fix bugs in CmdStanPyConverter
* Fix `c` argument in `plot_khat`
* Fix `ax` argument in `plot_elpd`
* Remove warning in `stats.py` compare function
* Fix `ess/rhat` plots in `plot_forest`
* Fix `from_numpyro` crash when importing model with `thinning=x` for `x > 1`
* Upload updated mypy.ini in ci if mypy copilot fails
* Added type checking to raise an error whenever `InferenceData` object is passed using `io_pymc3`'s `trace` argument
* Fix `xlabels` in `plot_elpd`
* Renamed `sample` dim to `__sample__` when stacking `chain` and `draw` to avoid dimension collision
* Removed the `circular` argument in `plot_dist` in favor of `is_circular`
* Fix `legend` argument in `plot_separation`
* Removed testing dependency on http download for radon dataset
* Fixed plot_kde to take labels with kwargs.
* Fixed xarray related tests.
* Fix Bokeh deprecation warnings
* Fix credible interval percentage in legend in `plot_loo_pit`
* Arguments `filter_vars` and `filter_groups` now raise `ValueError` if illegal arguments are passed
* Remove constrained_layout from arviz rcparams
* Fix plot_elpd for a single outlier
Deprecation
* Deprecated `index_origin` and `order` arguments in `az.summary`
Documentation
* Language improvements of the first third of the "Label guide"
* Added "Label guide" page and API section for `arviz.labels` module
* Add "Installation guide" page to the documentation
* Improve documentation on experimental `SamplingWrapper` classes
* Added example to `plot_hdi` using Inference Data
* Removed `geweke` diagnostic from `numba` user guide
* Restructured the documentation sections to improve community and about us information
v0.11.2 (2021 Feb 21)
New features
* Added `to_zarr` and `from_zarr` methods to InferenceData
* Added confidence interval band to auto-correlation plot
Maintenance and fixes
* Updated CmdStanPy converter for compatibility with versions >=0.9.68
* Updated `from_cmdstanpy`, `from_cmdstan`, `from_numpyro` and `from_pymc3` converters to follow schema convention
* Fix calculation of mode as point estimate
* Remove variable name from legend in posterior predictive plot
* Added significant digits formatter to round rope values
* Updated `from_cmdstan` csv reader: fixed a dtype problem and added a dtype kwarg for manual dtype casting
Deprecation
* Removed Geweke diagnostic
* Removed credible_interval and include_circ arguments
Documentation
* Added an example for converting dataframe to InferenceData
* Added example for `coords` argument in `plot_posterior` docstring
v0.11.1 (2021 Feb 2)
Maintenance and fixes
* Fixed overlapping titles and repeating warnings on circular traceplot
* Removed repetitive variable names from forest plots of multivariate variables
* Fixed regression in `plot_pair` labels that prevented coord names from being shown when necessary
Documentation
* Use tabs in ArviZ example gallery
v0.11.0 (2020 Dec 17)
New features
* Added `to_dataframe` method to InferenceData
* Added `__getitem__` magic to InferenceData
* Added group argument to summary
* Add `ref_line`, `bar`, `vlines` and `marker_vlines` kwargs to `plot_rank`
* Add observed argument to (un)plot observed data in `plot_ppc`
* Add support for named dims and coordinates with multivariate observations
* Add support for discrete variables in rank plots and `loo_pit`
* Add `skipna` argument to `plot_posterior`
* Make stacking the default method to compute weights in `compare`
* Add `copy()` method to `InferenceData` class.
Maintenance and fixes
* prevent wrapping group names in InferenceData repr_html
* Updated CmdStanPy interface
* Remove leftover warning about default IC scale in `compare`
* Fixed a typo found in an error message raised in `distplot.py`
* Fix typo in `loo_pit` extraction of log likelihood
* Have `from_pystan` store attrs as strings to allow netCDF storage
* Remove ticks and spines in `plot_violin`
* Use circular KDE function and fix tick labels in circular `plot_trace`
* Fix `pair_plot` for mixed discrete and continuous variables
* Fix in-sample deviance in `plot_compare`
* Fix computation of weights in compare
* Avoid repeated warning in summary
* Fix hdi failure with boolean array
* Automatically get the current axes instance for `plot_kde`, `plot_dist` and `plot_hdi`
* Add grid argument to manually specify the number of rows and columns
* Switch to `compact=True` by default in our plots
* `plot_elpd`, avoid modifying the input dict
* Do not plot divergences in `plot_trace` when `kind=rank_vlines` or `kind=rank_bars`
* Allow ignoring `observed` argument of `pymc3.DensityDist` in `from_pymc3`
* Make `from_pymc3` compatible with theano-pymc 1.1.0
* Improve typing hints
Deprecation
* `plot_khat` deprecate `annotate` argument in favor of `threshold`. The new argument accepts floats
Documentation
* Reorganize documentation and change sphinx theme
* Switch to [MyST](https://myst-parser.readthedocs.io/en/latest/) and [MyST-NB](https://myst-nb.readthedocs.io/en/latest/index.html)
for markdown/notebook parsing in docs
* Incorporated `input_core_dims` in `hdi` and `plot_hdi` docstrings
* Add documentation pages about experimental `SamplingWrapper`s usage
* Show example titles in gallery page
* Add `sample_stats` naming convention to the InferenceData schema
* Extend api documentation about `InferenceData` methods
Experimental
* Modified `SamplingWrapper` base API
v0.10.0 (2020 Sep 24)
New features
* Added InferenceData dataset containing circular variables
* Added `is_circular` argument to `plot_dist` and `plot_kde` allowing for a circular histogram (Matplotlib, Bokeh) or 1D KDE plot (Matplotlib).
* Added `to_dict` method for InferenceData object
* Added `circ_var_names` argument to `plot_trace` allowing for circular traceplot (Matplotlib)
* Ridgeplot is HDI aware: by default it displays densities truncated at the specified `hdi_prob` level
* Added `plot_separation`
* Extended methods from `xr.Dataset` to `InferenceData`
* Add `extend` and `add_groups` to `InferenceData`
* Added `__iter__` method (`.items`) for InferenceData
* Add support for discrete variables in `plot_bpv`
Maintenance and fixes
* Automatic conversion of list/tuple to numpy array in distplot
* `plot_posterior` fix overlap of hdi and rope
* `plot_dist` bins argument error fixed
* Improve handling of circular variables in `az.summary`
* Removed change of default warning in `ELPDData` string representation
* Update `radon` example dataset to current InferenceData schema specification
* Update `from_cmdstan` functionality and add warmup groups
* Restructure plotting code to be compatible with mpl>=3.3
* Replaced `_fast_kde()` with `kde()` which now also supports circular variables via the argument `circular`
* Increased `from_pystan` attrs information content
* Allow `plot_trace` to return and accept axes
* Update diagnostics to be on par with posterior package
* Use method="average" in `scipy.stats.rankdata`
* Add more `plot_parallel` examples
* Bump minimum xarray version to 0.16.1
* Fix multi rope for `plot_forest`
* `from_dict` will now store warmup groups even with the main group missing
* increase robustness for repr_html handling
v0.21.1 (31 January 2022)
-------------------------
This is a bugfix release to resolve a missing dependency on `packaging`.
Bug fixes
~~~~~~~~~
- Add `packaging` as a dependency to Xarray
v0.21.0 (27 January 2022)
-------------------------
New Features
~~~~~~~~~~~~
- New top-level function :py:func:`cross`.
- ``keep_attrs`` support for :py:func:`where` (see the sketch after this list).
- Enable the limit option for dask arrays in the following methods: :py:meth:`DataArray.ffill`, :py:meth:`DataArray.bfill`, :py:meth:`Dataset.ffill` and :py:meth:`Dataset.bfill`.
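A small sketch of the first two features (illustrative only, not part of the
release notes)::

    import xarray as xr

    a = xr.DataArray([1.0, 0.0, 0.0], dims="cartesian", attrs={"units": "m"})
    b = xr.DataArray([0.0, 1.0, 0.0], dims="cartesian")

    # New top-level cross product.
    print(xr.cross(a, b, dim="cartesian"))

    # ``keep_attrs`` is now honoured by ``where``.
    print(xr.where(a > 0, a, 0, keep_attrs=True).attrs)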
Breaking changes
~~~~~~~~~~~~~~~~
- Rely on matplotlib's default datetime converters instead of pandas'
- Improve repr readability when there are a large number of dimensions in datasets or dataarrays by
wrapping the text once the maximum display width has been exceeded.
Deprecations
~~~~~~~~~~~~
- Removed the lock kwarg from the zarr and pydap backends, completing the deprecation cycle started in :issue:`5256`.
- Support for ``python 3.7`` has been dropped.
Bug fixes
~~~~~~~~~
- Preserve chunks when creating a :py:class:`DataArray` from another :py:class:`DataArray`
- Properly support :py:meth:`DataArray.ffill`, :py:meth:`DataArray.bfill`, :py:meth:`Dataset.ffill` and :py:meth:`Dataset.bfill` along chunked dimensions
- Subclasses of ``bytes`` and ``str`` (e.g. ``np.str_`` and ``np.bytes_``) will now serialise to disk rather than raising a ``ValueError: unsupported dtype for netCDF4 variable: object`` as they did previously
- Fix applying function with non-xarray arguments using :py:func:`xr.map_blocks`.
- No longer raise an error for an all-nan-but-one argument to
:py:meth:`DataArray.interpolate_na` when using `method='nearest'`
- `dt.season <https://xarray.pydata.org/en/stable/generated/xarray.DataArray.dt.season.html>`_ can now handle NaN and NaT.
- Determination of zarr chunks handles empty lists for encoding chunks or variable chunks that occur in certain circumstances
Internal Changes
~~~~~~~~~~~~~~~~
- Replace ``distutils.version`` with ``packaging.version``
- Removed internal checks for ``pd.Panel``
- Add ``pyupgrade`` pre-commit hook
Changelog:
0.17.0:
C++ API
set the baseline C++ version to 17.
mdds has been internalized so that the public header no longer contains references to mdds. With this change, the users can use different API versions of mdds between the ixion build and run-time use.
cleaned up public API to make use of std::string_view and std::variant where appropriate.
formula interpreter
implemented built-in LEFT() function.
misc
it is no longer required to set the size of void* at build time to ensure the binaries are fully functional.
fixed a bug where named expressions with names containing invalid characters were still accepted.
Version 0.55.1 (27 January, 2022)
---------------------------------
This is a bugfix release that closes all the remaining issues from the
accelerated release of 0.55.0 and also any release critical regressions
discovered since then.
CUDA target deprecation notices:
* Support for CUDA toolkits < 10.2 is deprecated and will be removed in Numba
0.56.
* Support for devices with Compute Capability < 5.3 is deprecated and will be
removed in Numba 0.56.
0.11.0:
* Remove six, networkx and decorator dependency
* Bump gast and Beniget requirements to support python 3.10
* Bump xsimd to 7.5.0
* Minimal default support for non-linux, non-osx, non-windows platforms
* Numpy improvements for np.bincount, np.transpose, np.searchsorted
* Restore (and test) cython compatibility
* Expose pythran.get_include for toolchain integration
* Improve error message on invalid spec
* Handle static dispatching based on keyword signature
* Raise MemoryError upon (too) large numpy allocation
* Support scalar case of scipy.special.binom
* Trim the number of warnings in pythonic codebase
0.9.12
Kind of hoping this is the last 0.9 release, and that I find time to stabilize as 1.0 and start the 2.0 work some time soon...
Changelog:
Remove Cyclic references (memory leak)
Add left & right shift operations (<< and >>)
Switch to GH actions & CodeCov.io for CI tests
Add extra contributors details
Reformat w/ Black + isort, and have linting of those in CI
This package was BROKEN from the first import.
I've re-imported it to wip; when it's finished there, we can reimport it.
Also remove the two packages that were trying to use it.
Version 0.55.0
This release includes a significant number of important dependency upgrades along with a number of new features and bug fixes.
Version 0.54.1
This is a bugfix release for 0.54.0. It fixes a regression in structured array type handling, a potential leak on initialization failure in the CUDA target, a regression caused by Numba’s vendored cloudpickle module resetting dynamic classes and a few minor testing/infrastructure related problems.
Version 0.53.1
This is a bugfix release for 0.53.0. It contains the following four pull-requests which fix two critical regressions and two build failures reported by the openSuSe team:
* Fix regression on gufunc serialization
* Fix regression in CUDA: Set stream in mapped and managed array device_setup
* Ignore warnings from packaging module when testing import behaviour.
* set non-reported llvm timing values to 0.0
Version 0.53.0
This release continues to add new features, bug fixes and stability improvements to Numba.
Highlights of core changes:
Support for Python 3.9
Function sub-typing
Initial support for dynamic gufuncs
Parallel Accelerator (@njit(parallel=True)) now supports Fortran ordered arrays
Version 0.52.0
This release focuses on performance improvements, but also adds some new features and contains numerous bug fixes and stability improvements.
Version 1.1.4 (September, 2020)
- Switched from Nose to Pytest for testing. Patch courtesy @kmosiejczuk,
[PR #32](https://github.com/bmc/munkres/pull/32), with some additional
cleanup by me.
- Fix to [Issue #34](https://github.com/bmc/munkres/issues/34), in which
`print_matrix` wasn't handling non-integral values. Patch courtesy @finn0,
via [PR #35](https://github.com/bmc/munkres/pull/35).
- Various changes from `http:` URLs to `https:` URLs, courtesy @finn0
via [PR #36](https://github.com/bmc/munkres/pull/36/).
Version 1.1.3:
**Nonexistent**. Accidentally published before check-in. Deleted from
PyPI. Use version 1.1.4.
Version 1.1.2 (February, 2019)
- Removed `NoReturn` type annotations, to allow compatibility with Python 3.5
releases prior to 3.5.4. Thanks to @jackwilsdon for catching that issue.
Version 1.1.1 (February, 2019)
- Version bump to get past a PyPI publishing issue. (Can't republish
partially published 1.1.0.)
Version 1.1.0 (February, 2019)
- Only supports Python 3.5 or better, from this version forward (since Python
2 is at end of life in 11 months).
- Added `typing` type hints.
- Updated docs to use `pdoc`, since `epydoc` is pretty much dead.
2021-12-31: 1.0.8 release:
* src/Soros.py: fix FutureWarning: Possible nested set at position, reported by Rene Engelhard
* fr.sor:
- use hyphens instead of spaces, e.g. cent-deux, reported by "4560041" at GitHub
- new prefix "informal" for 1100–1900 (onze-cents - dix-neuf-cents)
- add prefix "feminine" and "masculine" (1 -> une/un), bug reports by arena94 at GitGub
* hu_Hung.sor:
- fix transliteration of old Hungarian family names, bug report by Zoltán Óvári
- fix 100–199, 1000–1999, 1000000–1999999 and 1000000000–1999999999 (bad ordering)
- fix conversion of single letters "í", "Í" and "NY";
- fix unnecessary conversion of words ending with "q", e.g. "IQ";
- fix unnecessary conversion of words not ending with unknown letters
* mr.sor: Marathi spelling corrections by Shantanu Oak
* pl.sor: fix ordinal 20-29, reported by Gabryha at GitHub
* uk.sor, CalcAddIn.xcu, description.xml.in: spelling fixes by Olexandr Nesterenko
- replace apostrophe symbol with U+02BC, reported by Volodymyr Lisivka
- support numbers up to 10^42
- add cardinal, update help
- add uk locale
* zh.sor: add ordinal numbers, use always 二 for 2, reported by Ming-Hua
This flag should be set for packages that import pkg_resources
and thus need setuptools after the build step.
Set this flag for packages that need it and bump PKGREVISION.
This is a minor version of PyTables. The main feature added is that
compatibility with Python 3.10, numpy 1.21 and HDF5 1.12 has been improved,
while support for Python 3.5 has been dropped.
The CI infrastructure has been moved to GitHub Actions.
Changes from 2.8.0 to 2.8.1
---------------------------
* Fixed dependency list.
* Added ``pyproject.toml`` and modernize the ``setup.py`` script. Thanks to
Antonio Valentino for the PR.
Changes from 2.7.3 to 2.8.0
---------------------------
* Wheels for Python 3.10 are now provided.
* Support for Python 2.7 and 3.5 has been discontinued.
* All residual support for Python 2.X syntax has been removed, and therefore
the setup build no longer makes calls to the `2to3` script. The `setup.py`
has been refactored to be more modern.
* The examples on how to link into Intel VML/MKL/oneAPI now use the dynamic
library.
- Fix crash when missing closing ceil, floor, |
- Describe piecewise functions in README
- Fixed tick symbol resulting in panic
- Prevent constants from being overridden
- Output numbers with the precision specified
- Float error margins for comparison operators
- NaN for comparison with imaginary numbers
- Updated interpreter tests to expect 0 after comparisons
1.21.5:
BUG: Fix shadowed reference of `dtype` in type stub
BUG: Fix headers for universal2 builds
BUG: ``VOID_nonzero`` could sometimes mutate alignment flag
BUG: Do not use nonzero fastpath on unaligned arrays
BUG: Distutils patch to allow for 2 as a minor version (!)
BUG, SIMD: Fix 64-bit/8-bit integer division by a scalar
BUG, SIMD: Workaround broadcasting SIMD 64-bit integers on MSVC...
REL: Prepare for the NumPy 1.21.4 release.
TST: Fix a `Arrayterator` typing test failure
Release 3.4.5
=============
Changes
-------
- The deprecation warning when using :func:`rpy2.robjects.lib.grid.activate`
was missing (indirectly revealed through issue #804).
- The named argument `LINPACK` in :meth:`rpy2.robjects.vectors.Matrix.svd`
has been removed, as it is no longer present in R.
Bugs fixed
----------
- SIGPIPE sent to a process running Python+rpy2 could result in a segfault.
This was caused by an incorrect setting of R signal handlers (issue #809).
Release 3.4.4
==============
Changes
-------
- `RRuntimeError` exceptions raised while evaluating R code in
an R magic (ipython/jupyter) are now propagated (issue #792).
Release 3.4.3
=============
New features
------------
- :mod:`rpy2.robjects.lib.ggplot2` maps more functions in the
R package (issue #767)
- Utility function :func:`rpy2.robjects.lib.ggplot2.dict2rvec`
to convert a Python `Dict[str, str]` into an R named vector
of strings.
Bugs fixed
----------
- Calling :mod:`rpy2.situation` to report on the environment no longer
stops with an uncaught exception when no R home can be determined
(issue #774)
- Converting pandas series with the older numpy types could result
in an error (issue #781)
- Numpy converter was not properly turning R integer or float arrays
into their numpy equivalent (issue #785)
- The HTML representation of R lists without named elements was
incorrect (issue #787)
Release 3.4.2
=============
Bugs fixed
----------
- Multithreading during the initialization of the embedded R no longer
triggers a fatal error (issue #729)
Changes
-------
- :mod:`pytest` is now an optional package. Optional sets of packages are
`numpy`, `pandas`, `test`, and `all` (all optional packages). They
can be specified during the installation. For example
`pip install rpy2[test]`. (issue #670)
Release 3.4.1
=============
Bugs fixed
----------
- The file `requirements.txt` was missing from the source distribution
on pypi (issue #764).
Release 3.4.0
=============
New Features
------------
- The mapping of the R C API now includes `Rf_isSymbol()`.
- Singleton class :class:`rpy2.rinterface_lib.sexp.RVersion` to report
the R version for the embedded R.
- :func:`rpy2.rinterface.local_context` to create a context manager
to evaluate R code within a local environment.
- The `staticmethod` :meth:`rpy2.robjects.vectors.DateVector.isrinstance`
will tell whether an R object is an R `Date` array.
Changes
-------
- The dynamic generation of docstrings for R man pages
is now using R's `Rd2txt`.
- The :func:`rpy2.rinterface_lib._rinterface_capi._findVarInFrame`
is replaced by the function
:func:`rpy2.rinterface_lib._rinterface_capi._findvar_in_frame`
(see fix to issue #710).
- The functions :func:`rpy2.robjects.numpy.activate()` and
:func:`rpy2.robjects.pandas.activate()` are deprecated and will
be removed in rpy2-3.5.0.
- :func:`rpy2.rinterface_lib.embedded.setinitialized` was renamed to
:func:`rpy2.rinterface_lib.embedded._setinitialized` to indicate that
one should not use it.
- :meth:`rpy2.robjects.lib.ggplot2.vars` to map the R function
`ggplot2::vars` (issue #742).
- Report correctly the class of R matrix objects with R>=4.0: it is
now `('matrix', 'array')`. With R<4.0 `('matrix')` is still reported.
- The conversion of R/rpy2 objects to python objects using R class name mapping
is extended to more classes. The documentation about conversion covers the topic.
- If `R_NilValue` is not null when the initialization of the embedded R is attempted,
it is now assumed that R was initialized through other means (e.g., another C library in the
same process) and the C-level initialization is skipped.
- The conversion `rpy2py` is now working with any Python object inheriting
from `_rinterface_capi.SupportsSEXP`.
Bugs fixed
----------
- The C function `Rf_findVarInFrame()` in the R API can trigger
an R-level error and, while this is rare, when it does happen
while embedded in Python it creates a segfault. Calls are
now wrapped in `R_ToplevelExec()` to limit the propagation
of R exceptions. This solved issue #710.
- More complete and correct mapping of R class names in
:func:`rpy2.rinterface_lib.sexp.rclass_get`.
- Initializing the embedded R caused the loss of ability to use Ctrl-C
to send SIGINT to a Python process (issue #723)
- :mod:`rpy2.situation` is now working when the environment variable
`R_HOME` is set even though R is not in the `PATH` or in the Windows
registry (issue #744).
- Handling an R language object could result in a segfault when its
R class was queried (issue #749).
- The conversion of R string arrays to `numpy` arrays was leaving
R's `NA` values as R NA objects. NAs in this type of array are now
turned into `None` in the resulting `numpy` array (issue #751).
- `rpy2.situation.get_rlib_path()` was returning an environment variable
with an invalid separator on Windows (mentioned in issue #754).
- R strings encoded with something other than 'utf-8' could result in
errors when trying to convert to Python strings (issue #754).
- Extracting documentation pages for R objects in packages could
generate spurious warnings when several "section" tags are present.
- R `Date` arrays/vectors were not wrapped into
:class:`rpy2.robjects.vectors.DateVector` objects but left as
R arrays of floats (which they are at the C level).
- The HTML representation of short R lists without names could
fail with an error.
- The :meth:`__repr__` of `robjects`-level objects was not displaying
the rpy2 class the R object is mapped to.
Release 3.3.6
=============
Bugs fixed
----------
- The unit tests for importing R packages with `lib_loc` were
broken (issue #720).
- Trying to create a memoryview for an R array with complex values
was failing with an attribute error.
- Fix the constructor of metaclass
:class:`rpy2.robjects.methods.RS4Auto_Type`.
- Fix call to end the embedded R in :class:`rpy2.robjects.R.__cleanup__`
(issue #734).
Release 3.3.5
=============
Bugs fixed
----------
- The callback handler to read input to R returned an
invalid result, leading to R asking for input
without ever acknowledging it received it.
Release 3.3.4
=============
Bugs fixed
----------
- Creating an R vector object from a Python object implementing
the buffer protocol could give incorrect results as C-level
incompatibilities could be missed (issue #702).
- :func:`rpy2.robjects.packages.importr` could fail when `lib_loc`
was specified (issue #705).
Release 3.3.3
=============
Bugs fixed
----------
- Fallback for when `str2lang` is missing (R < 3.6)
- Fix segfault with :meth:`PairListSexpVector.__getitem__` when
elements of the R pairlist have a `NILSXP` name (issue #700)
Release 3.3.2
=============
Bugs fixed
----------
- Initial fixes to have rpy2 running in ABI mode on Windows.
A few tests are not passing (many in callbacks for R's C API).
- System detection is now checking for FreeBSD.
Release 3.3.1
=============
Bugs fixed
-----------
- :meth:`rpy2.robjects.conversion.NameClassMap.update` can update
the mapping (:class:`dict`) or the default class.
Changes
-------
- Adding local converters was overwriting the base `NameClassMap`.
Release 3.3.0
=============
New features
------------
- Trying to import an R package that is not installed will now raise an
exception :class:`rpy2.robjects.packages.PackageNotInstalledError`.
- The R C API functions `void SET_FRAME(SEXP x, SEXP v)`,
`void SET_ENCLOS(SEXP x, SEXP v)` and `void SET_HASHTAB(SEXP x, SEXP v)`
are now accessible through rpy2.
- The module :mod:`rpy2.situation` can now return `LD_LIBRARY_PATH`
information about R. For example with
`python -m rpy2.situation LD_LIBRARY_PATH`
- :meth:`rpy2.robjects.methods.RS4.extends` lists the class names in the
inheritance line.
- The conversion of R objects to Python allows much more flexibility
and better allows the use of independent code converting different classes.
This is currently limited to R objects that are lists, environments, or
S4 objects. The Sphinx documentation contains an example. While this is
still work in progress this should already address concerns
at the origin of issue #539 about S4 classes.
- :class:`rpy2.robjects.language.LangVector` to map R language objects at
the `robjects` level.
- :class:`rpy2.robjects.vectors.PairlistVector` to map R pairlist objects at
the `robjects` level.
- An alternative function to display the output of R cells can be
specified using `-d` or `--display` in the magic arguments
(in :mod:`rpy2.ipython.rmagic`).
- Python classes representing underlying R objects no longer have to
exclusively rely on inheritance from :mod:`rpy2.rinterface` objects.
An abstract class :class:`rpy2.rinterface_lib.sexp.SupportsSEXP` is added
to identify objects supporting a `__sexp__` protocol, and that abstract
class can also be used with type hints.
- :func:`rpy2.robjects.functions.wrap_r_functions` can create Python functions
with matching signature from R functions.
- New class :class:`rpy2.rinterface_lib._rinterface_capi.UninitializedRCapsule`
to allow the instantiation of "placeholder" rpy2 objects before the
embedded R is initialized. This facilitates the use of static typing checks
such as mypy, mocking for tests that do not involve the execution of R
code, and allows cleaner implementations of module-level globals
that are R objects.
- New class :class:`rpy2.robjects.vectors.DateVector` to represent R dates.
- :class:`pandas.Series` containing date objects can now be converted to R
`Date` vectors.
Changes
-------
- When calling R C-API's `R_ParseVector` and an error occurs, the
exception message now contains the parsing status.
- :mod:`rpy2.rinterface_lib.embedded` has a module-level "constant"
`DEFAULT_C_STACK_LIMIT` used when initializing the embedded R.
- When creating a :mod:`rpy2.robjects.vectors.DataFrame` from (name, vector)
pairs, the names are no longer transformed to syntactically valid R
symbols (issue #660).
- The value `nan` in :mod:`pandas` Series with strings is now converted
to R NA (issue #668).
- Initial support for :const:`pandas.NA` (still experimental in pandas
at the time of writing, and rpy2 support is limited to arrays of strings).
- :mod:`pandas` series of dtype :class:`pandas.StringDType`, experimental in pandas 1.0,
are now supported by the converter (in the pandas-to-R direction) (issue #669)
- Version checking for the mapping of R packages in :mod:`rpy2.robjects.lib` is
now more permissive (check that version prefixes are matching).
Bugs fixed
-----------
- Building ABI only mode could require an API build environment (and fail
with an error when not present).
- SVG output for the R magic was incorrectly a bytes object.
- :meth:`rpy2.rinterface_lib.sexp.StrSexpVector.__getitem__` was returning the string
`'NA'` for an R NA value. Now it returns `rpy2.rinterface_lib.na_values.NA_Character`.
Release 3.2.7
=============
Bugs fixed
----------
- An f-string in `_rinterface_cffi_build.py` prevented installation
on Python 3.5 (issue #654).
Release 3.2.6
=============
Bugs fixed
----------
- The conversion of date/time object with specified timezones
was wrong when different than the local time zone (issue #634)
- Iterating over :mod:`rpy2.situation.iter_info()` could result
in an error because of a typo in the code.
Changes
-------
- :mod:`pandas` 1.0.0 breaks the conversion layer. A warning
is now emitted whenever trying to use `pandas` >= 1.0.
Release 3.2.5
=============
Bugs fixed
----------
- Latest release for R package `rlang` broke import through `importr()`.
A workaround for :mod:`rpy2.robjects.lib.ggplot2` is to rename the
offending R object (issue #631).
Changes
-------
- f-string requiring Python >= 3.6 removed.
Release 3.2.4
=============
Bugs fixed
----------
- An incomplete backport of the bug fixed in 3.2.3 broke the ABI mode.
Release 3.2.3
=============
Bugs fixed
-----------
- Error when parsing strings as R codes could result in a segfault.
Release 3.2.2
=============
Bugs fixed
----------
- Python format error when trying to report that the system is not supported
on Windows (issue #597).
- The setup script would error on build if R is not installed. It is now
printing an error message.
Release 3.2.1
=============
Bugs fixed
----------
- The wrapper for the R package `dbplyr` could not import the underlying
package (refactoring elsewhere was not propagated there).
- Creating R objects called `names` in `globalenv` caused the method
:meth:`Sexp.names` to fail (issue #587).
- Whenever the pandas conversion was activated :class:`FloatSexpVector` instances
with the R class `POSIXct` attached were not correctly mapped back to pandas
datetime arrays. (issue #594).
- Fix installation when a prefix without write access is used
(issue #588).
Release 3.2.0
=============
New features
------------
- rpy2 can be built and used with :mod:`cffi`'s ABI or API modes (releases 3.0.x and
3.1.x were using the ABI mode exclusively). At the time of writing the default
is still the ABI mode but the choice can be controlled through the environment variable
`RPY2_CFFI_MODE`. If set, possible values are `ABI` (default if the environment
variable is not set), `API`, or `BOTH`. With the latter, both `API` and `ABI`
modes are built, and the choice of which one to use can be made at run time
(see the sketch below).
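A minimal sketch of selecting the mode at run time (an assumption-laden
illustration: it presumes an rpy2 installation where both modes were built and
R available on the system)::

    import os

    # Consulted when rpy2's interface layer is imported; "ABI" and "BOTH"
    # are the other recognised values.
    os.environ["RPY2_CFFI_MODE"] = "API"

    import rpy2.rinterface   # picks the requested cffi mode at import time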
Changes
-------
- The "consoleread" callback (reading input to the R console) is now assuming UTF-8
(was previously assuming ASCII) and is no longer trying to add a "new line" character
at the end of the input.
- Querying an R environment with an invalid key will generate a :class:`TypeError`
or a :class:`ValueError` depending on the issue (rather than always a :class:`ValueError`
as before).
Bugs fixed
----------
- `setup.py` is now again compatible with Python2 (issue #580).
- Unit tests were failing if numpy is not installed.
- :mod:`rpy2.situation` is no longer breaking when R is not in the PATH and
there is no environment variable `R_HOME`.
- Build script for the cffi interface is now using the environment
variable `R_HOME` whenever defined (rather than always inferring it from the
R in the PATH).
- Converting R strings back to Python was incorrectly using `Latin1` while `UTF-8` was
intended (issue #537).
Release 3.1.0
=============
New features
------------
- Python matrix multiplication (`__matmul__` / `@`) added to
R :class:`Matrix` objects.
- An :class:`threading.RLock` is added to :mod:`rpy2.rinterface_lib.openrlib` and is
used by the context manager :func:`rpy2.rinterface_lib.memorymanagement.rmemory`
to ensure that protect/unprotect cycles cannot be broken by thread switching, at least
as long as the context manager is used to handle such cycles (see issue #571).
- The documentation covers the use of notebooks (mainly Jupyter/Jupyterlab).
- The PNG output in Jupyter notebooks R cells can now specify an argument `--type`
(passed as the named argument `type` in the R function `png`).
For example on some Linux systems and R installations, the type `cairo`
can fix issues when alpha transparency is used.
Changes
-------
- Added callbacks for `ptr_R_Busy()` and `ptr_R_ProcessEvents()`.
- `rstart` is now an object in :mod:`rpy2.rinterface_lib.embedded`
(set to `None` until R is initialized).
- Unit tests are included in a subpackage :mod:`rpy2.tests` as was the
case before release 3.0.0 (issue #528).
- Experimental initialization for Microsoft Windows.
- :mod:`rpy2.situation` is now also reporting the rpy2 version.
- :func:`rpy2.robjects.package_utils.default_symbol_check_after` was
renamed :func:`rpy2.robjects.package_utils.default_symbol_resolve`.
The named parameters `default_symbol_check_after` present in few methods
in :mod:`rpy2.robjects.packages` and :mod:`rpy2.robjects.functions` were
modified to keep a consistent naming.
- Trying to instantiate an :class:`rpy2.rlike.container.OrdDict` with
a :class:`dict` will result in a :class:`TypeError` rather than a
:class:`ValueError`.
- Methods of :class:`rpy2.rlike.container.OrdDict` now raise a
:class:`NotImplementedError` when not implemented.
- The creation of R vectors from Python sequences is now relying on a method
:meth:`_populate_r_vector` that allows vectorized implementations
to improve speed.
- Continuous integration tests run against Python 3.6, 3.7, and 3.8. It is
no longer checked against Python 3.5.
Bugs fixed
----------
- `aes` in :mod:`rpy2.robjects.lib.ggplot2` had stopped working with the
R package ggplot2 reaching version 3.2.0. (issue #562).
- Better handling of recent :mod:`pandas` arrays with missing values
(related to issue #544).
- The mapping of the R operator `%in%` reachable through the attribute `ro`
of R vectors was always returning `True`. It is now working properly.
- R POSIXct vectors with `NA` dates were triggering an error when converted
in a data frame converted to :mod:`pandas` (issue #561).
Release 3.0.5
=============
Bugs fixed
----------
- No longer allow installation if Python 3 but < 3.5.
- Fixed error `undefined symbol: DATAPTR` if R < 3.5 (issue #565).
Release 3.0.4
=============
Bugs fixed
----------
- Fixed conversion of `pandas` :class:`Series` of dtype `pandas.Int32Dtype`,
or `pandas.Int64Dtype` (issue #544).
Release 3.0.3
=============
Bugs fixed
----------
- Fixed an issue where the evaluation of R code using the "R magic" delayed all
output to the end of the execution of that code, independently of
whether the attribute `cache_display_data` was `True` or `False`
(issue #543).
- Fixed conversion of :class:`pandas.Series` of `dtype` "object" when
all items are either all of the same type or are :obj:`None` (issue #540).
Release 3.0.2
=============
Bugs fixed
----------
- Failing to import `pandas` or `numpy` when loading the "R magic" extension
for jupyter/ipython was hiding the cause of the error in the `ImportError`
exception.
- Fallback when an R `POSIXct` vector does not have an attribute `"tzone"`
(issue #533).
- Callback for console reset was not set during R initialization.
- Fixed rternalized function returning rpy2 objects (issue #538).
- `--vanilla` is no longer among the default options used to initialize R
(issue #534).
Release 3.0.1
=============
Bugs fixed
----------
- Script to install R packages for docker image never made it to version
control.
- Conversion of R arrays/matrices into numpy objects triggered a segfault
during garbage collection (issue #524).
Release 3.0.0
=============
New features
------------
- rpy2 can be installed without a development environment.
- Unit tests are now relying on the Python module `pytest`.
- :attr:`rpy2.rinterface.NA_Integer` is now only defined when the embedded R
is initialized.
Changes
-------
- complete rewrite of :mod:`rpy2.rinterface`.
:mod:`cffi` is now used to interface with the R compiled shared library.
This allows ABI calls and removes the need to compile binaries. However, if
compilation is available (when installing or preparing pre-compiled binaries)
faster implementations of performance bottlenecks will be available.
- calling :func:`rpy2.rinterface.endr` multiple times is now only ending R
the first time it is called (note: an ended R cannot successfully be
re-initialized).
- The conversion system in :mod:`rpy2.robjects.conversion` now has only
two conversions, `py2rpy` and `rpy2py`. `py2rpy` tries to convert any
Python object into an object rpy2 can use with R, and `rpy2py` tries
to convert any rpy2 object into either a non-rpy2 Python object or
a :mod:`rpy2.robjects`-level object.
- The method `get` for R environments is now called `find()` to avoid
confusion with the method of the same name in Python (:meth:`dict.get`);
see the sketch after this list.
- :class:`rpy2.robjects.vectors.Vector`, :class:`rpy2.robjects.vectors.Matrix`,
and :class:`rpy2.robjects.vectors.Array` can no longer be used to create
R arrays of unspecified type. New type-specific classes (for example for
vectors :class:`rpy2.robjects.vectors.IntVector`,
:class:`rpy2.robjects.vectors.BoolVector`,
:class:`rpy2.robjects.vectors.FloatVector`,
:class:`rpy2.robjects.vectors.ComplexVector`, or
:class:`rpy2.robjects.vectors.StrVector`) should be used instead.
- :mod:`rpy2.rpy_classic`, an implementation of the `rpy` interface using
:mod:`rpy2.rinterface`, is no longer available.
- :class:`rpy2.robjects.ParsedCode` and
:class:`rpy2.robjects.SourceCode` are moved to
:class:`rpy2.robjects.packages.ParsedCode` and
:class:`rpy2.robjects.packages.SourceCode`.
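A small sketch of the renamed environment lookup (illustrative only; it
assumes R and rpy2 are installed)::

    import rpy2.robjects as ro

    # ``find`` replaces the old ``get`` and also searches enclosing environments.
    r_sum = ro.globalenv.find("sum")
    print(r_sum(ro.IntVector([1, 2, 3]))[0])   # -> 6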
Bugs fixed
----------
- Row names in R data frames were lost when converting to pandas data frames
(issue #484).
Known issues
------------
- Mismatch between R's POSIXlt `wday` and Python time struct_time's `tm_wday`
(issue #523).
Release 2.9.6
=============
Bugs fixed
----------
- Latest release of :mod:`pandas` deprecated :meth:`DataFrame.from_items`.
(issue #514).
- Latest release of :mod:`pandas` requires categories to be a list
(not another sequence).
Known issues
------------
- The numpy buffer implemented by R arrays is broken for complex numbers
Release 2.9.5
=============
Bugs fixed
----------
- Missing values in pandas :class:`Category` series were creating
invalid R factors when converted (issue #493).
Release 2.9.4
=============
Bugs fixed
----------
- Fallback for failure to import numpy or pandas is now dissociated from
failure to import :mod:`numpy2ri` or :mod:`pandas2ri` (issue #463).
- :func:`repr` for R POSIX date/time vectors is now showing a string
representation of the date/time rather than the timestamp as a float
(issue #467).
- The HTML representation of R data frame (the default representation in the
Jupyter notebook) was displaying an inconsistent number of rows
(found while working on issue #466).
- Handle time zones in timestamps in Pandas when converting to R data frames
(issue #454).
- When exiting the Python process, the R cleanup is now explicitly requested
to happen before Python's exit. This prevents possible segfaults while
the process is terminating (issue #471).
- dplyr method `ungroup()` was missing from
:class:`rpy2.robjects.lib.dplyr.DataFrame` (issue #473).
Release 2.9.3
=============
Bugs fixed
----------
- Delegate finding where the local time zone file is to either a user-specified
module-level variable `default_timezone` or to the third-party
module :mod:`tzlocal` (issue #448).
Release 2.9.2
=============
Changes
-------
- The pandas converter is converting :class:`pandas.Series` of `dtype` `"O"`
to :class:`rpy2.robjects.vectors.StrVector` objects, issuing a warning
about it (See issue #421).
- The conversion of pandas data frames is now working with columns rather
than rows (introduced in the bug fix for issue #442 below) and this is expected
to result in more efficient conversions.
Bugs fixed
----------
- Allow floats in figure sizes for R magic (Pull request #63)
- Fixed pickling unpickling of robjects-level instances,
regression introduced in fix for issue #432 with release 2.9.1 (issue #443).
- Fixed broken unit test for columns of `dtype` `"O"` in `pandas` data frames.
- Fixed incorrect conversion of R factors in data frames to columns of
integers in pandas data frame (issue #442).
version 1.5.8
* Fix Enum bug (issue 1128): the enum_dict member of an EnumType read from a file
contains invalid values when the enum is large enough (more than 127 or 255
members).
* Binary wheels for aarch64 and python 3.10.
version 1.5.7
* don't try to mask vlens with default _FillValue, since vlens don't have a default _FillValue.
This gets rid of numpy DeprecationWarning (issue 1099).
* update docs to reflect the fact that a variable must be in collective mode before writing
compressed data to it in parallel (see the sketch after this list). Added a test for this (examples/mpi_example_compressed.py).
* Fix OverflowError when dimension sizes become greater than 2**32-1 elements on Windows (Issue 1112).
* Don't return masked arrays for vlens (only for primitive and enum types).
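For the collective-mode note above, a hedged sketch of a parallel compressed
write (it assumes netCDF4/HDF5 built with MPI support and ``mpi4py`` installed,
and has to be run under ``mpiexec``)::

    from mpi4py import MPI
    from netCDF4 import Dataset

    rank, size = MPI.COMM_WORLD.rank, MPI.COMM_WORLD.size
    nc = Dataset("parallel_test.nc", "w", parallel=True,
                 comm=MPI.COMM_WORLD, info=MPI.Info())
    nc.createDimension("x", 4 * size)
    v = nc.createVariable("v", "f8", ("x",), zlib=True)
    v.set_collective(True)   # required before writing compressed data in parallel
    v[rank * 4:(rank + 1) * 4] = float(rank)
    nc.close()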
Summary of bugs fixed for version 6.4.0 (2021-10-30)
----------------------------------------------------
Improvements and fixes
- Reduce memory usage in BISTs for `copyobj`, `hgsave`.
- `hgsave.m`, `copyobj.m`: Use `'qt'` graphics toolkit in BISTs.
- `main.cc`: Use `getopt` to parse command line arguments.
- `main.cc`: Remove invalid case.
- Disable `getopt` error reporting in wrapper program.
- `interp1.m`: Don't interpret later numeric input as `xi`.
- `pkg`: Improve similar package name suggestion.
- Store parent name in function object when caching parents in scope.
- Avoid internal error and segfault with `eval` and scripts.
- `rmpath`: Prevent removing the current directory from the load path.
GUI
- Fix missing interpreter event in `octave-scintilla`.
- Fix opening a file in a custom editor.
Documentation
- Improve docstring for `disable_diagonal_matrix`, `disable_permutation_matrix`,
and `disable_range`.
- `cbrt`: Clarify that function errors for non-real input.
- `dsearchn.m`: Added optional distance output description.
- Add Hungarian translation for project description files.
- Document fsolve output "info" -2.
Build system
- Correct error message for incompatible CXSparse.
Summary of bugs fixed for version 6.3.0 (2021-07-11)
----------------------------------------------------
Important notice
- This bug fix release breaks ABI compatibility with Octave 6.2.0. Re-build
binaries (like .oct or .mex files) when updating to this version.
Improvements and fixes
- `ls-hdf5.cc`: Avoid throwing inside HDF5 function.
- `ls-hdf5.cc`: Handle non-zero terminated type strings.
- Fix occasional segfault in `symbfact`.
- `fsolve.m`: Fix undefined output error when using `Output` function.
- Fix compilation error with `iconv_t` on Solaris.
- build: Check for `stropts.h`.
- Avoid ambiguous call to `pow`.
- Fix context link when creating handle to nested function.
- `print.m`: Warn when figure is too large to be printed to output page.
- Defer clearing function vars until stack frame is deleted.
- Avoid memory leaks when returning handles to nested functions.
- Hold references to closure frames in anon functions if needed.
- `eigs`: Prevent possible segmentation fault at exit.
- Issue warning when gnuplot graphics toolkit is initialized.
- mpoles.m: Fix detection of pole multiplicity.
- Perform shutdown actions in interpreter destructor.
- build: Make relocation of dependencies with Octave optional.
- `qz.cc`: Return correct number of eigenvalues.
- `qz.cc`: Let test pass with LAPACK 3.9.1 and earlier versions.
- `pkg.m`: Use default prefixes unless otherwise set previously.
- `betaincinv.m`: Correctly handle small inputs.
- `betaincinv.m`: Correctly handle inputs very close to 1.0.
- `unistd-wrappers.c`: Allocate sufficient memory for `new_argv`.
- Mark system functions correctly if `OCTAVE_HOME` is non-canonical.
- Mark compiled system functions correctly if `OCTAVE_HOME` is non-canonical.
- Fix error if test suite is run before Octave is installed.
- `lo-array-errwarn.cc`: Include `<limits>`.
- Use `std::size_t` and `std::ptrdiff_t` in C++ code.
- Use `std::size_t` in more instances.
- Return proper number of stack frames for `dbstack (N)` call.
- Avoid ambiguous match of overloaded function.
- `lscov.m`: Relax BIST tolerance to pass with OpenBLAS.
- `print`: Fix error when `"px"` word is present in a figure.
- `logm.m`: Fix check for real negative values in complex vector.
- build: Set necessary flags to allow execution on Windows Vista.
- Declare base_parser destructor virtual.
- `hist.m`: Improve handling and docstring for third parameter "norm".
- `logm.m`: Allow tolerance in check for real negative values in complex vector.
- `expm.m`, `logm.m`: Use function `isdiag` to detect if input is a diagonal matrix.
- tests: Relax tolerance for some tests on macOS.
- `logspace.m`: Mark tests as known to fail on macOS.
- `hist.m`: Use deterministic test.
- `rgb2ind.m`: Reduce memory usage and eliminate randomness in test.
- `logm.m`: Allow larger tolerance for test on macOS.
- build: Use correct path to `octave` binary in build tree on macOS.
- build: Fix typo in folder to libraries when building `.oct` or `.mex` files.
- build: Set DL_LDFLAGS in the build rules for .oct or .mex files.
- `rgb2ind.m`: Suppress output in test.
- Improve documentation for `log2` function.
- `ind2sub`: Fix typo in "see also" section of docstring.
- `mrdivide`, `mldivide`: Document that functions might return minimum norm solutions.
- Fix scoping issue for handles to sibling nested functions.
- `ls-mat5.cc`: Avoid integer overflow in calculation of buffer size for zlib.
- Move top-level REPL from interpreter to evaluator.
- Avoid crash with `dbquit` when executing command in terminal from GUI.
GUI
- Fix calling external editor.
- Fix missing file suffix .m when saving a new script.
- Do not run files that are not saved as Octave files.
- Fix confirm overwrite for native editor file "save as" dialogs.
- Fix crash when GUI tries to restore missing previous Octave dir.
- Fix restoring the horizontal position of docked GUI widgets.
- Prevent floating widgets from re-opening after restart.
- Avoid crash in GUI for `rmdir("")`.
- Fix EOL mode when saving files under new names.
- Fix auto indentation of switch-structure in GUI editor.
- Avoid crash when closing GUI with open editor tabs.
- `octave-qscintilla.cc` (`contextmenu_run`): Fix keyboard command handling.
Documentation
- Improve Differential Equations chapter and example for lsode.
- Clarify usage of "Depends" keyword in package `DESCRIPTION` file.
- Add note that wildcard patterns for `save` are glob patterns.
- Change example for Delaunay triangulation to match the generating code.
- Document single precision issues with OpenGL graphics toolkits.
- Minor changes to documentation of single precision issues with OpenGL.
- Expand on documentation for command syntax.
- `isprop.m`: Document that function only works on graphics objects in Octave 6.X.
- Explain how to write dual-purpose executable scripts and Octave functions.
- Update keyword docstrings.
- Use Texinfo commands to improve `transpose()` docstring rendering.
- `betainc.m`, `betaincinv.m`: Correct non-TeX definition of beta incomplete integral.
- Grammarcheck documentation ahead of 6.3 release.
- Spellcheck documentation ahead of 6.3 release.
2.8.3 (2021-12-13)
------------------
- Fix more use of 'python' where 'python3' is intended.
2.8.2 (2021-12-06)
------------------
- Update documentation to reflect new 2.8 features.
- Fix array compression for non-native byte order
- Fix use of 'python' where 'python3' is intended.
- Fix schema URI resolving when the URI prefix is also
claimed by a legacy extension.
- Remove 'name' and 'version' attributes from NDArrayType
instances.
2.8.1 (2021-06-09)
------------------
- Fix bug in block manager when a new block is added to an existing
file without a block index.
2.8.0 (2021-05-12)
------------------
- Add ``yaml_tag_handles`` property to allow definition of custom yaml
``%TAG`` handles in the asdf file header.
- Add new resource mapping API for extending asdf with additional
schemas.
- Add global configuration mechanism.
- Drop support for automatic serialization of subclass
attributes.
- Support asdf:// as a URI scheme.
- Include only extensions used during serialization in
a file's metadata.
- Drop support for Python 3.5.
- Add new extension API to support versioned extensions.
- Permit wildcard in tag validator URIs.
- Implement support for ASDF Standard 1.6.0. This version of
the standard limits mapping keys to string, integer, or
boolean.
- Stop removing schema defaults for all ASDF Standard versions,
and automatically fill defaults only for versions <= 1.5.0.
- Stop removing keys with ``None`` values from the tree on write. This
fixes a long-standing issue where the tree structure is not preserved
on write, but will break ``ExtensionType`` subclasses that depend on
this behavior. Extension developers will need to modify their
``to_tree`` methods to check for ``None`` before adding a key to
the tree (or modify the schema to permit nulls, if that is the
intention).
- Deprecated the ``auto_inline`` argument to ``AsdfFile.write_to`` and
``AsdfFile.update`` and added ``AsdfConfig.array_inline_threshold`` (a short
sketch follows this list).
- Add ``edit`` subcommand to asdftool for efficient editing of
the YAML portion of an ASDF file.
- Increase limit on integer literals to signed 64-bit.
- Remove the ``asdf.test`` method and ``asdf.__githash__`` attribute.
- Add support for custom compression via extensions.
- Remove unnecessary ``.tree`` from search result paths.
- Drop support for bugs in older operating systems and Python versions.
- Add argument to ``asdftool diff`` that ignores tree nodes that match
a JMESPath expression.
- Fix behavior of ``exception`` argument to ``GenericFile.seek_until``.
- Fix issues in file type detection to allow non-seekable input and
filenames without recognizable extensions. Remove the ``asdf.asdf.is_asdf_file``
function.
- Update ``asdftool extensions`` and ``asdftool tags`` to incorporate
the new extension API.
- Add ``AsdfSearchResult.replace`` method for assigning new values to
search results.
- Search for block index starting from end of file. Fixes rare bug when
a data block contains a block index.
- Update asdf-standard to 1.6.0 tag.
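A minimal sketch of the new inline-threshold configuration, using the ``asdf.config_context`` API added with the global configuration mechanism; the threshold value and file name here are illustrative.
import asdf
import numpy as np

with asdf.config_context() as config:
    # Arrays with at most 100 elements are written inline in the YAML tree
    # instead of as binary blocks (replaces auto_inline on write_to/update).
    config.array_inline_threshold = 100
    af = asdf.AsdfFile({"small": np.arange(10)})
    af.write_to("example.asdf")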
2.7.5 (2021-06-09)
------------------
- Fix bug in ``asdf.schema.check_schema`` causing relative references in
metaschemas to be resolved incorrectly.
- Fix bug in block manager when a new block is added to an existing
file without a block index.
2.7.4 (2021-04-30)
------------------
- Fix pytest plugin failure under older versions of pytest.
- Copy array views when the base array is non-contiguous.
- Prohibit views over FITS arrays that change dtype.
- Add support for HTTPS URLs and following redirects.
- Prevent astropy warnings in tests when opening known bad files.
What's new in 1.3.5 (December 12, 2021)
---------------------------------------
Fixed regressions
~~~~~~~~~~~~~~~~~
- Fixed regression in :meth:`Series.equals` when comparing floats with dtype object to None (:issue:`44190`)
- Fixed regression in :func:`merge_asof` raising error when array was supplied as join key (:issue:`42844`)
- Fixed regression when resampling :class:`DataFrame` with :class:`DateTimeIndex` with empty groups and ``uint8``, ``uint16`` or ``uint32`` columns incorrectly raising ``RuntimeError`` (:issue:`43329`)
- Fixed regression in creating a :class:`DataFrame` from a timezone-aware :class:`Timestamp` scalar near a Daylight Savings Time transition (:issue:`42505`)
- Fixed performance regression in :func:`read_csv` (:issue:`44106`)
- Fixed regression in :meth:`Series.duplicated` and :meth:`Series.drop_duplicates` when Series has :class:`Categorical` dtype with boolean categories (:issue:`44351`)
- Fixed regression in :meth:`.GroupBy.sum` with ``timedelta64[ns]`` dtype containing ``NaT`` failing to treat that value as NA (:issue:`42659`)
- Fixed regression in :meth:`.RollingGroupby.cov` and :meth:`.RollingGroupby.corr` when ``other`` had the same shape as each group would incorrectly return superfluous groups in the result (:issue:`42915`)
v0.20.2
This is a bugfix release to resolve (:issue:`3391`, :issue:`5715`). It also
includes performance improvements in unstacking to a ``sparse`` array and a
number of documentation improvements.
Breaking changes
- Use complex nan when interpolating complex values out of bounds by default (instead of real nan) (:pull:`6019`).
Performance
- Significantly faster unstacking to a ``sparse`` array. :pull:`5577`
Bug fixes
- :py:func:`xr.map_blocks` and :py:func:`xr.corr` now work when dask is not installed (:issue:`3391`, :issue:`5715`, :pull:`5731`).
- Fix plot.line crash for data of shape ``(1, N)`` in _title_for_slice on format_item (:pull:`5948`).
- Fix a regression in the removal of duplicate backend entrypoints (:issue:`5944`, :pull:`5959`)
Documentation
- Better examples in docstrings for groupby and resampling reductions (:pull:`5871`).
Internal Changes
- Use ``importlib`` to replace functionality of ``pkg_resources`` in
backend plugins tests. (:pull:`5959`).
Kernels
volk_32f_stddev_and_mean_32f_x2: implemented Young and Cramer's algorithm
volk_32fc_accumulator_s32fc: Add new kernel
volk_16ic_x2_dot_prod_16ic_u_avx2: Fix Typo, was _axv2.
Remove _mm256_zeroupper() calls
Enforce consistent function prototypes
32fc_index_max: Improve speed of AVX2 version
conv_k7_r2: Disable broken AVX2 code
improve volk_8i_s32f_convert_32f for ARM NEON
Calculate cos in AVX512F
Calculate sin using AVX512F
Build
Fix python version detection
cmake: Check that 'distutils' is available
c11: Remove pre-C11 preprocessor instructions
2021-12-04:
Update version number.
2021-10-04:
Consistently use semicolons after DOUBLE_CONVERSION_ASSERT.
2021-07-16:
Fix spelling.
2021-05-19:
LoongArch is a RISC-style instruction set architecture.
Add support for the LoongArch architecture.
SciPy 1.7.3 is a bug-fix release that provides binary wheels
for macOS arm64 with Python 3.8, 3.9, and 3.10. The macOS arm64 wheels
are only available for macOS version 12.0 and greater, as explained
in Issue 14688, linked below.
Issues closed for 1.7.3
-----------------------
* Segmentation fault on import of scipy.integrate on Apple M1 ARM...
* BUG: ARPACK's eigsh & OpenBLAS from Apple Silicon M1 (arm64)...
* four CI failures on pre-release job
* Remaining test failures for macOS arm64 wheel
* BUG: Segmentation fault caused by scipy.stats.qmc.qmc.update_discrepancy
Pull requests for 1.7.3
-----------------------
* BLD: update pyproject.toml for Python 3.10 changes
* BUG: out of bounds indexing in stats.qmc.update_discrepancy
* MAINT: skip a few failing tests in ``1.7.x`` for macOS arm64
GLPK 5.0
The copyright was transferred to the Free Software Foundation.
To fix some licensing problems the routines in the following
files were disabled by replacing them with dummy ones that print an
error message:
src/api/gridgen.c
src/api/netgen.c
src/api/rmfgen.c
src/misc/qmd.c
src/misc/relax4.c
Note that this change does not affect the main functionality
of the package.
Some minor bugs were fixed.
v0.20.1 (5 November 2021)
-------------------------
This is a bugfix release to fix :issue:`5930`.
Bug fixes
~~~~~~~~~
- Fix a regression in the detection of the backend entrypoints (:issue:`5930`, :pull:`5931`)
By `Justus Magin <https://github.com/keewis>`_.
Documentation
~~~~~~~~~~~~~
- Significant improvements to :ref:`api`. By `Deepak Cherian <https://github.com/dcherian>`_.
.. _whats-new.0.20.0:
v0.20.0 (1 November 2021)
-------------------------
This release brings improved support for pint arrays, methods for weighted standard deviation, variance,
and sum of squares, the option to disable the use of the bottleneck library, significantly improved performance of
unstack, as well as many bugfixes and internal changes.
Many thanks to the 40 contributors to this release!:
Aaron Spring, Akio Taniguchi, Alan D. Snow, arfy slowy, Benoit Bovy, Christian Jauvin, crusaderky, Deepak Cherian,
Giacomo Caria, Illviljan, James Bourbeau, Joe Hamman, Joseph K Aicher, Julien Herzen, Kai Mühlbauer,
keewis, lusewell, Martin K. Scherer, Mathias Hauser, Max Grover, Maxime Liquet, Maximilian Roos, Mike Taves, Nathan Lis,
pmav99, Pushkar Kopparla, Ray Bell, Rio McMahon, Scott Staniewicz, Spencer Clark, Stefan Bender, Taher Chegini,
Thomas Nicholas, Tomas Chor, Tom Augspurger, Victor Negîrneac, Zachary Blackwood, Zachary Moon, and Zeb Nicholls.
New Features
~~~~~~~~~~~~
- Add ``std``, ``var``, ``sum_of_squares`` to :py:class:`~core.weighted.DatasetWeighted` and :py:class:`~core.weighted.DataArrayWeighted`.
By `Christian Jauvin <https://github.com/cjauvin>`_.
- Added a :py:func:`get_options` method to xarray's root namespace (:issue:`5698`, :pull:`5716`)
By `Pushkar Kopparla <https://github.com/pkopparla>`_.
- Xarray now does a better job rendering variable names that are long LaTeX sequences when plotting (:issue:`5681`, :pull:`5682`).
By `Tomas Chor <https://github.com/tomchor>`_.
- Add an option (``"use_bottleneck"``) to disable the use of ``bottleneck`` using :py:func:`set_options` (:pull:`5560`)
By `Justus Magin <https://github.com/keewis>`_.
- Added ``**kwargs`` argument to :py:meth:`open_rasterio` to access overviews (:issue:`3269`).
By `Pushkar Kopparla <https://github.com/pkopparla>`_.
- Added ``storage_options`` argument to :py:meth:`to_zarr` (:issue:`5601`, :pull:`5615`).
By `Ray Bell <https://github.com/raybellwaves>`_, `Zachary Blackwood <https://github.com/blackary>`_ and
`Nathan Lis <https://github.com/wxman22>`_.
- Histogram plots are set with a title displaying the scalar coords if any, similarly to the other plots (:issue:`5791`, :pull:`5792`).
By `Maxime Liquet <https://github.com/maximlt>`_.
- Slice plots display the coords units in the same way as x/y/colorbar labels (:pull:`5847`).
By `Victor Negîrneac <https://github.com/caenrigen>`_.
- Added a new :py:attr:`Dataset.chunksizes`, :py:attr:`DataArray.chunksizes`, and :py:attr:`Variable.chunksizes`
property, which will always return a mapping from dimension names to the chunking pattern along that dimension,
regardless of whether the object is a Dataset, DataArray, or Variable; a short example follows this list. (:issue:`5846`, :pull:`5900`)
By `Tom Nicholas <https://github.com/TomNicholas>`_.
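As a brief, hedged illustration of the new ``chunksizes`` property (assumes dask is installed; the data is synthetic):
import numpy as np
import xarray as xr

da = xr.DataArray(np.zeros((4, 6)), dims=("x", "y")).chunk({"x": 2, "y": 3})
print(da.chunksizes)                         # {'x': (2, 2), 'y': (3, 3)}
print(da.to_dataset(name="var").chunksizes)  # same mapping form for a Dataset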
Breaking changes
~~~~~~~~~~~~~~~~
- The minimum versions of some dependencies were changed:
=============== ====== ====
Package Old New
=============== ====== ====
cftime 1.1 1.2
dask 2.15 2.30
distributed 2.15 2.30
lxml 4.5 4.6
matplotlib-base 3.2 3.3
numba 0.49 0.51
numpy 1.17 1.18
pandas 1.0 1.1
pint 0.15 0.16
scipy 1.4 1.5
seaborn 0.10 0.11
sparse 0.8 0.11
toolz 0.10 0.11
zarr 2.4 2.5
=============== ====== ====
- The ``__repr__`` of a :py:class:`xarray.Dataset`'s ``coords`` and ``data_vars``
ignore ``xarray.set_option(display_max_rows=...)`` and show the full output
when called directly as, e.g., ``ds.data_vars`` or ``print(ds.data_vars)``
(:issue:`5545`, :pull:`5580`).
By `Stefan Bender <https://github.com/st-bender>`_.
Deprecations
~~~~~~~~~~~~
- Deprecate :py:func:`open_rasterio` (:issue:`4697`, :pull:`5808`).
By `Alan Snow <https://github.com/snowman2>`_.
- Set the default argument for `roll_coords` to `False` for :py:meth:`DataArray.roll`
and :py:meth:`Dataset.roll`; a brief example follows this list. (:pull:`5653`)
By `Tom Nicholas <https://github.com/TomNicholas>`_.
- :py:meth:`xarray.open_mfdataset` will now error instead of warn when a value for ``concat_dim`` is
passed alongside ``combine='by_coords'``.
By `Tom Nicholas <https://github.com/TomNicholas>`_.
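A small, hedged example of the changed ``roll_coords`` default (synthetic data):
import xarray as xr

da = xr.DataArray([1, 2, 3], dims="x", coords={"x": [10, 20, 30]})
print(da.roll(x=1))                    # data is rolled, coordinates stay put
print(da.roll(x=1, roll_coords=True))  # explicitly request the old behaviour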
Bug fixes
~~~~~~~~~
- Fix ZeroDivisionError from saving dask array with empty dimension (:issue:`5741`).
By `Joseph K Aicher <https://github.com/jaicher>`_.
- Fixed performance bug where a ``cftime`` import was attempted within various core operations if ``cftime`` was not
installed (:pull:`5640`).
By `Luke Sewell <https://github.com/lusewell>`_
- Fixed bug when combining named DataArrays using :py:func:`combine_by_coords`. (:pull:`5834`).
By `Tom Nicholas <https://github.com/TomNicholas>`_.
- When a custom engine was used in :py:func:`~xarray.open_dataset` the engine
wasn't initialized properly, causing missing argument errors or inconsistent
method signatures. (:pull:`5684`)
By `Jimmy Westling <https://github.com/illviljan>`_.
- Numbers are properly formatted in a plot's title (:issue:`5788`, :pull:`5789`).
By `Maxime Liquet <https://github.com/maximlt>`_.
- Faceted plots will no longer raise a `pint.UnitStrippedWarning` when a `pint.Quantity` array is plotted,
and will correctly display the units of the data in the colorbar (if there is one) (:pull:`5886`).
By `Tom Nicholas <https://github.com/TomNicholas>`_.
- With backends, check for path-like objects rather than the ``pathlib.Path``
type, using ``os.fspath`` (:pull:`5879`).
By `Mike Taves <https://github.com/mwtoews>`_.
- ``open_mfdataset()`` now accepts a single ``pathlib.Path`` object (:issue:`5881`).
By `Panos Mavrogiorgos <https://github.com/pmav99>`_.
- Improved performance of :py:meth:`Dataset.unstack` (:pull:`5906`). By `Tom Augspurger <https://github.com/TomAugspurger>`_.
Documentation
~~~~~~~~~~~~~
- Users are instructed to try ``use_cftime=True`` if a ``TypeError`` occurs when combining datasets and one of the types involved is a subclass of ``cftime.datetime`` (:pull:`5776`).
By `Zeb Nicholls <https://github.com/znicholls>`_.
- A clearer error is now raised if a user attempts to assign a Dataset to a single key of
another Dataset. (:pull:`5839`)
By `Tom Nicholas <https://github.com/TomNicholas>`_.
Internal Changes
~~~~~~~~~~~~~~~~
- Explicit indexes refactor: avoid ``len(index)`` in ``map_blocks`` (:pull:`5670`).
By `Deepak Cherian <https://github.com/dcherian>`_.
- Explicit indexes refactor: decouple ``xarray.Index`` from ``xarray.Variable`` (:pull:`5636`).
By `Benoit Bovy <https://github.com/benbovy>`_.
- Fix ``Mapping`` argument typing to allow mypy to pass on ``str`` keys (:pull:`5690`).
By `Maximilian Roos <https://github.com/max-sixty>`_.
- Annotate many of our tests, and fix some of the resulting typing errors. This will
also mean our typing annotations are tested as part of CI. (:pull:`5728`).
By `Maximilian Roos <https://github.com/max-sixty>`_.
- Improve the performance of reprs for large datasets or dataarrays. (:pull:`5661`)
By `Jimmy Westling <https://github.com/illviljan>`_.
- Use isort's `float_to_top` config. (:pull:`5695`).
By `Maximilian Roos <https://github.com/max-sixty>`_.
- Remove use of the deprecated ``kind`` argument in
:py:meth:`pandas.Index.get_slice_bound` inside :py:class:`xarray.CFTimeIndex`
tests (:pull:`5723`). By `Spencer Clark <https://github.com/spencerkclark>`_.
- Refactor `xarray.core.duck_array_ops` to no longer special-case dispatching to
dask versions of functions when acting on dask arrays, instead relying on numpy
and dask's adherence to NEP-18 to dispatch automatically. (:pull:`5571`)
By `Tom Nicholas <https://github.com/TomNicholas>`_.
- Add an ASV benchmark CI and improve performance of the benchmarks (:pull:`5796`)
By `Jimmy Westling <https://github.com/illviljan>`_.
- Use ``importlib`` to replace functionality of ``pkg_resources`` such
as version setting and loading of resources. (:pull:`5845`).
By `Martin K. Scherer <https://github.com/marscher>`_.
.. _whats-new.0.19.0:
v0.19.0 (23 July 2021)
----------------------
This release brings improvements to plotting of categorical data, the ability to specify how attributes
are combined in xarray operations, a new high-level :py:func:`unify_chunks` function, as well as various
deprecations, bug fixes, and minor improvements.
Many thanks to the 29 contributors to this release!:
Andrew Williams, Augustus, Aureliana Barghini, Benoit Bovy, crusaderky, Deepak Cherian, ellesmith88,
Elliott Sales de Andrade, Giacomo Caria, github-actions[bot], Illviljan, Joeperdefloep, joooeey, Julia Kent,
Julius Busecke, keewis, Mathias Hauser, Matthias Göbel, Mattia Almansi, Maximilian Roos, Peter Andreas Entschev,
Ray Bell, Sander, Santiago Soler, Sebastian, Spencer Clark, Stephan Hoyer, Thomas Hirtz, Thomas Nicholas.
New Features
~~~~~~~~~~~~
- Allow passing argument ``missing_dims`` to :py:meth:`Variable.transpose` and :py:meth:`Dataset.transpose`
(:issue:`5550`, :pull:`5586`)
By `Giacomo Caria <https://github.com/gcaria>`_.
- Allow passing a dictionary as coords to a :py:class:`DataArray` (:issue:`5527`,
reverts :pull:`1539`, which had deprecated this due to python's inconsistent ordering in earlier versions).
By `Sander van Rijn <https://github.com/sjvrijn>`_.
- Added :py:meth:`Dataset.coarsen.construct`, :py:meth:`DataArray.coarsen.construct` (:issue:`5454`, :pull:`5475`).
By `Deepak Cherian <https://github.com/dcherian>`_.
- Xarray now uses consolidated metadata by default when writing and reading Zarr
stores (:issue:`5251`).
By `Stephan Hoyer <https://github.com/shoyer>`_.
- New top-level function :py:func:`unify_chunks`.
By `Mattia Almansi <https://github.com/malmans2>`_.
- Allow assigning values to a subset of a dataset using positional or label-based
indexing (:issue:`3015`, :pull:`5362`).
By `Matthias Göbel <https://github.com/matzegoebel>`_.
- Attempting to reduce a weighted object over missing dimensions now raises an error (:pull:`5362`).
By `Mattia Almansi <https://github.com/malmans2>`_.
- Add ``.sum`` to :py:meth:`~xarray.DataArray.rolling_exp` and
:py:meth:`~xarray.Dataset.rolling_exp` for exponentially weighted rolling
sums. These require numbagg 0.2.1 (:pull:`5178`).
By `Maximilian Roos <https://github.com/max-sixty>`_.
- :py:func:`xarray.cov` and :py:func:`xarray.corr` now lazily check for missing
values if inputs are dask arrays (:issue:`4804`, :pull:`5284`).
By `Andrew Williams <https://github.com/AndrewWilliams3142>`_.
- Attempting to ``concat`` list of elements that are not all ``Dataset`` or all ``DataArray`` now raises an error (:issue:`5051`, :pull:`5425`).
By `Thomas Hirtz <https://github.com/thomashirtz>`_.
- allow passing a function to ``combine_attrs`` (:pull:`4896`).
By `Justus Magin <https://github.com/keewis>`_.
- Allow plotting categorical data (:pull:`5464`).
By `Jimmy Westling <https://github.com/illviljan>`_.
- Allow removal of the coordinate attribute ``coordinates`` on variables by setting ``.attrs['coordinates']= None``
(:issue:`5510`).
By `Elle Smith <https://github.com/ellesmith88>`_.
- Added :py:meth:`DataArray.to_numpy`, :py:meth:`DataArray.as_numpy`, and :py:meth:`Dataset.as_numpy`; a brief example follows this list. (:pull:`5568`).
By `Tom Nicholas <https://github.com/TomNicholas>`_.
- Units in plot labels are now automatically inferred from wrapped :py:meth:`pint.Quantity` arrays. (:pull:`5561`).
By `Tom Nicholas <https://github.com/TomNicholas>`_.
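A small, hedged example of the new conversion helpers; plain NumPy-backed data is cheap to convert, while dask-, pint-, or sparse-backed data is converted to NumPy:
import xarray as xr

da = xr.DataArray([1, 2, 3], dims="x", coords={"x": [10, 20, 30]})
arr = da.to_numpy()    # numpy.ndarray with the DataArray's values
da_np = da.as_numpy()  # DataArray whose data and coordinates are numpy-backed
print(type(arr), type(da_np.data))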
Breaking changes
~~~~~~~~~~~~~~~~
- The default ``mode`` for :py:meth:`Dataset.to_zarr` when ``region`` is set
has changed to the new ``mode="r+"``, which only allows overriding
pre-existing array values. This is a safer default than the prior ``mode="a"``,
and allows for higher performance writes; a sketch of a region write follows this list (:pull:`5252`).
By `Stephan Hoyer <https://github.com/shoyer>`_.
- The main parameter to :py:func:`combine_by_coords` is renamed to `data_objects` instead
of `datasets` so anyone calling this method using a named parameter will need to update
the name accordingly (:issue:`3248`, :pull:`4696`).
By `Augustus Ijams <https://github.com/aijams>`_.
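A minimal, hedged sketch of a region write under the new default (assumes the zarr package is installed; the store path is illustrative):
import numpy as np
import xarray as xr

ds = xr.Dataset({"v": ("x", np.zeros(10))})
ds.to_zarr("example.zarr", mode="w")  # create the full store up front
# With region given, mode now defaults to "r+": existing values may be
# overridden, but no new arrays or metadata can be added.
ds.isel(x=slice(0, 5)).to_zarr("example.zarr", region={"x": slice(0, 5)})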
Deprecations
~~~~~~~~~~~~
- Removed the deprecated ``dim`` kwarg to :py:func:`DataArray.integrate` (:pull:`5630`)
- Removed the deprecated ``keep_attrs`` kwarg to :py:func:`DataArray.rolling` (:pull:`5630`)
- Removed the deprecated ``keep_attrs`` kwarg to :py:func:`DataArray.coarsen` (:pull:`5630`)
- Completed deprecation of passing an ``xarray.DataArray`` to :py:func:`Variable` - will now raise a ``TypeError`` (:pull:`5630`)
Bug fixes
~~~~~~~~~
- Fix a minor incompatibility between partial datetime string indexing with a
:py:class:`CFTimeIndex` and upcoming pandas version 1.3.0 (:issue:`5356`,
:pull:`5359`).
By `Spencer Clark <https://github.com/spencerkclark>`_.
- Fix 1-level multi-index incorrectly converted to single index (:issue:`5384`,
:pull:`5385`).
By `Benoit Bovy <https://github.com/benbovy>`_.
- Don't cast a duck array in a coordinate to :py:class:`numpy.ndarray` in
:py:meth:`DataArray.differentiate` (:pull:`5408`)
By `Justus Magin <https://github.com/keewis>`_.
- Fix the ``repr`` of :py:class:`Variable` objects with ``display_expand_data=True``
(:pull:`5406`)
By `Justus Magin <https://github.com/keewis>`_.
- Plotting a pcolormesh with ``xscale="log"`` and/or ``yscale="log"`` works as
expected after improving the way the interval breaks are generated (:issue:`5333`).
By `Santiago Soler <https://github.com/santisoler>`_
- :py:func:`combine_by_coords` can now handle combining a list of unnamed
``DataArray`` as input (:issue:`3248`, :pull:`4696`).
By `Augustus Ijams <https://github.com/aijams>`_.
Internal Changes
~~~~~~~~~~~~~~~~
- Run CI on the first & last python versions supported only; currently 3.7 & 3.9.
(:pull:`5433`)
By `Maximilian Roos <https://github.com/max-sixty>`_.
- Publish test results & timings on each PR.
(:pull:`5537`)
By `Maximilian Roos <https://github.com/max-sixty>`_.
- Explicit indexes refactor: add a ``xarray.Index.query()`` method in which
one may eventually provide a custom implementation of label-based data
selection (not ready yet for public use). Also refactor the internal,
pandas-specific implementation into ``PandasIndex.query()`` and
``PandasMultiIndex.query()`` (:pull:`5322`).
By `Benoit Bovy <https://github.com/benbovy>`_.
What's new in 1.3.4 (October 17, 2021)
These are the changes in pandas 1.3.4. See Release notes for a full changelog
including other versions of pandas.
-------------------------------------------------------------------------------
Fixed regressions
* Fixed regression in DataFrame.convert_dtypes() incorrectly converting byte
strings to strings (GH43183)
* Fixed regression in GroupBy.agg() where it was failing silently with mixed
data types along axis=1 and MultiIndex (GH43209)
* Fixed regression in merge() with integer and NaN keys failing with outer
merge (GH43550)
* Fixed regression in DataFrame.corr() raising ValueError with method=
"spearman" on 32-bit platforms (GH43588)
* Fixed performance regression in MultiIndex.equals() (GH43549)
* Fixed performance regression in GroupBy.first() and GroupBy.last() with
StringDtype (GH41596)
* Fixed regression in Series.cat.reorder_categories() failing to update the
categories on the Series (GH43232)
* Fixed regression in Series.cat.categories() setter failing to update the
categories on the Series (GH43334)
* Fixed regression in read_csv() raising UnicodeDecodeError exception when
memory_map=True (GH43540)
* Fixed regression in DataFrame.explode() raising AssertionError when column
is any scalar which is not a string (GH43314)
* Fixed regression in Series.aggregate() attempting to pass args and kwargs
multiple times to the user supplied func in certain cases (GH43357)
* Fixed regression when iterating over a DataFrame.groupby.rolling object
causing the resulting DataFrames to have an incorrect index if the input
groupings were not sorted (GH43386)
* Fixed regression in DataFrame.groupby.rolling.cov() and
DataFrame.groupby.rolling.corr() computing incorrect results if the input
groupings were not sorted (GH43386)
-------------------------------------------------------------------------------
Bug fixes
* Fixed bug in pandas.DataFrame.groupby.rolling() and
pandas.api.indexers.FixedForwardWindowIndexer leading to segfaults and
window endpoints being mixed across groups (GH43267)
* Fixed bug in GroupBy.mean() with datetimelike values including NaT values
returning incorrect results (GH43132)
* Fixed bug in Series.aggregate() not passing the first args to the user
supplied func in certain cases (GH43357)
* Fixed memory leaks in Series.rolling.quantile() and Series.rolling.median()
(GH43339)
-------------------------------------------------------------------------------
Other
* The minimum version of Cython needed to compile pandas is now 0.29.24 (
GH43729)
What's new in 1.3.3 (September 12, 2021)
These are the changes in pandas 1.3.3. See Release notes for a full changelog
including other versions of pandas.
-------------------------------------------------------------------------------
Fixed regressions
* Fixed regression in DataFrame constructor failing to broadcast for a defined
Index and a length-one list of Timestamp (GH42810)
* Fixed regression in GroupBy.agg() incorrectly raising in some cases (
GH42390)
* Fixed regression in GroupBy.apply() where nan values were dropped even with
dropna=False (GH43205)
* Fixed regression in GroupBy.quantile() which was failing with pandas.NA (
GH42849)
* Fixed regression in merge() where on columns with ExtensionDtype or bool
data types were cast to object in right and outer merge (GH40073)
* Fixed regression in RangeIndex.where() and RangeIndex.putmask() raising
AssertionError when result did not represent a RangeIndex (GH43240)
* Fixed regression in read_parquet() where the fastparquet engine would not
work properly with fastparquet 0.7.0 (GH43075)
* Fixed regression in DataFrame.loc.__setitem__() raising ValueError when
setting array as cell value (GH43422)
* Fixed regression in is_list_like() where objects with __iter__ set to None
would be identified as iterable (GH43373)
* Fixed regression in DataFrame.__getitem__() raising error for slice of
DatetimeIndex when index is non monotonic (GH43223)
* Fixed regression in Resampler.aggregate() when used after column selection
would raise if func is a list of aggregation functions (GH42905)
* Fixed regression in DataFrame.corr() where Kendall correlation would
produce incorrect results for columns with repeated values (GH43401)
* Fixed regression in DataFrame.groupby() where aggregation on columns with
object types dropped results on those columns (GH42395, GH43108)
* Fixed regression in Series.fillna() raising TypeError when filling float
Series with list-like fill value having a dtype which couldn't be cast
losslessly (like float32 filled with float64) (GH43424)
* Fixed regression in read_csv() raising AttributeError when the file handle
is a tempfile.SpooledTemporaryFile object (GH43439)
* Fixed performance regression in
core.window.ewm.ExponentialMovingWindow.mean() (GH42333)
-------------------------------------------------------------------------------
Performance improvements
* Performance improvement for DataFrame.__setitem__() when the key or value
is not a DataFrame, or key is not list-like (GH43274)
-------------------------------------------------------------------------------
Bug fixes
* Fixed bug in DataFrameGroupBy.agg() and DataFrameGroupBy.transform() with
engine="numba" where index data was not being correctly passed into func (
GH43133)
What's new in 1.3.2 (August 15, 2021)
These are the changes in pandas 1.3.2. See Release notes for a full changelog
including other versions of pandas.
-------------------------------------------------------------------------------
Fixed regressions
* Performance regression in DataFrame.isin() and Series.isin() for nullable
data types (GH42714)
* Regression in updating values of Series using boolean index, created by
using DataFrame.pop() (GH42530)
* Regression in DataFrame.from_records() with empty records (GH42456)
* Fixed regression in DataFrame.shift() where TypeError occurred when
shifting DataFrame created by concatenation of slices and fills with values
(GH42719)
* Regression in DataFrame.agg() when the func argument returned lists and
axis=1 (GH42727)
* Regression in DataFrame.drop() does nothing if MultiIndex has duplicates
and indexer is a tuple or list of tuples (GH42771)
* Fixed regression where read_csv() raised a ValueError when parameters names
and prefix were both set to None (GH42387)
* Fixed regression in comparisons between Timestamp object and datetime64
objects outside the implementation bounds for nanosecond datetime64 (
GH42794)
* Fixed regression in Styler.highlight_min() and Styler.highlight_max() where
pandas.NA was not successfully ignored (GH42650)
* Fixed regression in concat() where copy=False was not honored in axis=1
Series concatenation (GH42501)
* Regression in Series.nlargest() and Series.nsmallest() with nullable
integer or float dtype (GH42816)
* Fixed regression in Series.quantile() with Int64Dtype (GH42626)
* Fixed regression in Series.groupby() and DataFrame.groupby() where
supplying the by argument with a Series named with a tuple would
incorrectly raise (GH42731)
-------------------------------------------------------------------------------
Bug fixes
* Bug in read_excel() modifies the dtypes dictionary when reading a file with
duplicate columns (GH42462)
* 1D slices over extension types turn into N-dimensional slices over
ExtensionArrays (GH42430)
* Fixed bug in Series.rolling() and DataFrame.rolling() not calculating
window bounds correctly for the first row when center=True and window is an
offset that covers all the rows (GH42753)
* Styler.hide_columns() now hides the index name header row as well as column
headers (GH42101)
* Styler.set_sticky() has amended CSS to control the column/index names and
ensure the correct sticky positions (GH42537)
* Bug in de-serializing datetime indexes in PYTHONOPTIMIZED mode (GH42866)
What's new in 1.3.1 (July 25, 2021)
These are the changes in pandas 1.3.1. See Release notes for a full changelog
including other versions of pandas.
-------------------------------------------------------------------------------
Fixed regressions
* Pandas could not be built on PyPy (GH42355)
* DataFrame constructed with an older version of pandas could not be
unpickled (GH42345)
* Performance regression in constructing a DataFrame from a dictionary of
dictionaries (GH42248)
* Fixed regression in DataFrame.agg() dropping values when the DataFrame had
an Extension Array dtype, a duplicate index, and axis=1 (GH42380)
* Fixed regression in DataFrame.astype() changing the order of noncontiguous
data (GH42396)
* Performance regression in DataFrame in reduction operations requiring
casting such as DataFrame.mean() on integer data (GH38592)
* Performance regression in DataFrame.to_dict() and Series.to_dict() when
orient argument one of 'records', 'dict', or 'split' (GH42352)
* Fixed regression in indexing with a list subclass incorrectly raising
TypeError (GH42433, GH42461)
* Fixed regression in DataFrame.isin() and Series.isin() raising TypeError
with nullable data containing at least one missing value (GH42405)
* Regression in concat() between objects with bool dtype and integer dtype
casting to object instead of to integer (GH42092)
* Bug in Series constructor not accepting a dask.Array (GH38645)
* Fixed regression for SettingWithCopyWarning displaying incorrect stacklevel
(GH42570)
* Fixed regression for merge_asof() raising KeyError when one of the by
columns is in the index (GH34488)
* Fixed regression in to_datetime() returning pd.NaT for inputs that produce
duplicated values, when cache=True (GH42259)
* Fixed regression in SeriesGroupBy.value_counts() that resulted in an
IndexError when called on a Series with one row (GH42618)
-------------------------------------------------------------------------------
Bug fixes
* Fixed bug in DataFrame.transpose() dropping values when the DataFrame had
an Extension Array dtype and a duplicate index (GH42380)
* Fixed bug in DataFrame.to_xml() raising KeyError when called with index=
False and an offset index (GH42458)
* Fixed bug in Styler.set_sticky() not handling index names correctly for
single index columns case (GH42537)
* Fixed bug in DataFrame.copy() failing to consolidate blocks in the result (
GH42579)
What's new in 1.3.0 (July 2, 2021)
These are the changes in pandas 1.3.0. See Release notes for a full changelog
including other versions of pandas.
Warning
When reading new Excel 2007+ (.xlsx) files, the default argument engine=None to
read_excel() will now result in using the openpyxl engine in all cases when the
option io.excel.xlsx.reader is set to "auto". Previously, some cases would use
the xlrd engine instead. See What's new 1.2.0 for background on this change.
-------------------------------------------------------------------------------
Enhancements
-------------------------------------------------------------------------------
Custom HTTP(s) headers when reading csv or json files
When reading from a remote URL that is not handled by fsspec (e.g. HTTP and
HTTPS) the dictionary passed to storage_options will be used to create the
headers included in the request. This can be used to control the User-Agent
header or send other custom headers (GH36688). For example:
In [1]: headers = {"User-Agent": "pandas"}
In [2]: df = pd.read_csv(
...: "https://download.bls.gov/pub/time.series/cu/cu.item",
...: sep="\t",
...: storage_options=headers
...: )
...:
-------------------------------------------------------------------------------
Read and write XML documents
We added I/O support to read and render shallow versions of XML documents with
read_xml() and DataFrame.to_xml(). Using lxml as parser, both XPath 1.0 and
XSLT 1.0 are available. (GH27554)
In [1]: xml = """<?xml version='1.0' encoding='utf-8'?>
...: <data>
...: <row>
...: <shape>square</shape>
...: <degrees>360</degrees>
...: <sides>4.0</sides>
...: </row>
...: <row>
...: <shape>circle</shape>
...: <degrees>360</degrees>
...: <sides/>
...: </row>
...: <row>
...: <shape>triangle</shape>
...: <degrees>180</degrees>
...: <sides>3.0</sides>
...: </row>
...: </data>"""
In [2]: df = pd.read_xml(xml)
In [3]: df
Out[3]:
shape degrees sides
0 square 360 4.0
1 circle 360 NaN
2 triangle 180 3.0
In [4]: df.to_xml()
Out[4]:
<?xml version='1.0' encoding='utf-8'?>
<data>
<row>
<index>0</index>
<shape>square</shape>
<degrees>360</degrees>
<sides>4.0</sides>
</row>
<row>
<index>1</index>
<shape>circle</shape>
<degrees>360</degrees>
<sides/>
</row>
<row>
<index>2</index>
<shape>triangle</shape>
<degrees>180</degrees>
<sides>3.0</sides>
</row>
</data>
For more, see Writing XML in the user guide on IO tools.
-------------------------------------------------------------------------------
Styler enhancements
We provided some focused development on Styler. See also the Styler
documentation which has been revised and improved (GH39720, GH39317, GH40493).
+ The method Styler.set_table_styles() can now accept more natural CSS
language for arguments, such as 'color:red;' instead of [('color',
'red')] (GH39563)
+ The methods Styler.highlight_null(), Styler.highlight_min(), and
Styler.highlight_max() now allow custom CSS highlighting instead of the
default background coloring (GH40242)
+ Styler.apply() now accepts functions that return an ndarray when axis=
None, making it now consistent with the axis=0 and axis=1 behavior (
GH39359)
+ When incorrectly formatted CSS is given via Styler.apply() or
Styler.applymap(), an error is now raised upon rendering (GH39660)
+ Styler.format() now accepts the keyword argument escape for optional
HTML and LaTeX escaping (GH40388, GH41619)
+ Styler.background_gradient() has gained the argument gmap to supply a
specific gradient map for shading (GH22727)
+ Styler.clear() now clears Styler.hidden_index and Styler.hidden_columns
as well (GH40484)
+ Added the method Styler.highlight_between() (GH39821)
+ Added the method Styler.highlight_quantile() (GH40926)
+ Added the method Styler.text_gradient() (GH41098)
+ Added the method Styler.set_tooltips() to allow hover tooltips; this
can be used to enhance interactive displays (GH21266, GH40284)
+ Added the parameter precision to the method Styler.format() to control
the display of floating point numbers (GH40134)
+ Styler rendered HTML output now follows the w3 HTML Style Guide (
GH39626)
+ Many features of the Styler class are now either partially or fully
usable on a DataFrame with non-unique indexes or columns (GH41143)
+ One has greater control of the display through separate sparsification
of the index or columns using the new styler options, which are also
usable via option_context() (GH41142)
+ Added the option styler.render.max_elements to avoid browser overload
when styling large DataFrames (GH40712)
+ Added the method Styler.to_latex() (GH21673, GH42320), which also
allows some limited CSS conversion (GH40731)
+ Added the method Styler.to_html() (GH13379)
+ Added the method Styler.set_sticky() to make index and column headers
permanently visible in scrolling HTML frames (GH29072)
-------------------------------------------------------------------------------
DataFrame constructor honors copy=False with dict
When passing a dictionary to DataFrame with copy=False, a copy will no longer
be made (GH32960).
In [3]: arr = np.array([1, 2, 3])
In [4]: df = pd.DataFrame({"A": arr, "B": arr.copy()}, copy=False)
In [5]: df
Out[5]:
A B
0 1 1
1 2 2
2 3 3
df["A"] remains a view on arr:
In [6]: arr[0] = 0
In [7]: assert df.iloc[0, 0] == 0
The default behavior when not passing copy will remain unchanged, i.e. a copy
will be made.
-------------------------------------------------------------------------------
PyArrow backed string data type
We've enhanced the StringDtype, an extension type dedicated to string data. (
GH39908)
It is now possible to specify a storage keyword option to StringDtype. Use
pandas options or specify the dtype using dtype='string[pyarrow]' to allow the
StringArray to be backed by a PyArrow array instead of a NumPy array of Python
objects.
The PyArrow backed StringArray requires pyarrow 1.0.0 or greater to be
installed.
Warning
string[pyarrow] is currently considered experimental. The implementation and
parts of the API may change without warning.
In [8]: pd.Series(['abc', None, 'def'], dtype=pd.StringDtype(storage="pyarrow"))
Out[8]:
0 abc
1 <NA>
2 def
dtype: string
You can use the alias "string[pyarrow]" as well.
In [9]: s = pd.Series(['abc', None, 'def'], dtype="string[pyarrow]")
In [10]: s
Out[10]:
0 abc
1 <NA>
2 def
dtype: string
You can also create a PyArrow backed string array using pandas options.
In [11]: with pd.option_context("string_storage", "pyarrow"):
....: s = pd.Series(['abc', None, 'def'], dtype="string")
....:
In [12]: s
Out[12]:
0 abc
1 <NA>
2 def
dtype: string
The usual string accessor methods work. Where appropriate, the return type of
the Series or columns of a DataFrame will also have string dtype.
In [13]: s.str.upper()
Out[13]:
0 ABC
1 <NA>
2 DEF
dtype: string
In [14]: s.str.split('b', expand=True).dtypes
Out[14]:
0 string
1 string
dtype: object
String accessor methods returning integers will return a value with Int64Dtype
In [15]: s.str.count("a")
Out[15]:
0 1
1 <NA>
2 0
dtype: Int64
-------------------------------------------------------------------------------
Centered datetime-like rolling windows
When performing rolling calculations on DataFrame and Series objects with a
datetime-like index, a centered datetime-like window can now be used (GH38780).
For example:
In [16]: df = pd.DataFrame(
....: {"A": [0, 1, 2, 3, 4]}, index=pd.date_range("2020", periods=5, freq="1D")
....: )
....:
In [17]: df
Out[17]:
A
2020-01-01 0
2020-01-02 1
2020-01-03 2
2020-01-04 3
2020-01-05 4
In [18]: df.rolling("2D", center=True).mean()
Out[18]:
A
2020-01-01 0.5
2020-01-02 1.5
2020-01-03 2.5
2020-01-04 3.5
2020-01-05 4.0
-------------------------------------------------------------------------------
Other enhancements
* DataFrame.rolling(), Series.rolling(), DataFrame.expanding(), and
Series.expanding() now support a method argument with a 'table' option that
performs the windowing operation over an entire DataFrame. See Window
Overview for performance and functional benefits (GH15095, GH38995)
* ExponentialMovingWindow now supports an online method that can perform mean
calculations in an online fashion. See Window Overview (GH41673)
* Added MultiIndex.dtypes() (GH37062)
* Added end and end_day options for the origin argument in
DataFrame.resample() (GH37804)
* Improved error message when usecols and names do not match for read_csv()
and engine="c" (GH29042)
* Improved consistency of error messages when passing an invalid win_type
argument in Window methods (GH15969)
* read_sql_query() now accepts a dtype argument to cast the columnar data
from the SQL database based on user input (GH10285)
* read_csv() now raises a ParserWarning if the length of header or given names
does not match the length of the data when usecols is not specified (GH21768)
* Improved integer type mapping from pandas to SQLAlchemy when using
DataFrame.to_sql() (GH35076)
* to_numeric() now supports downcasting of nullable ExtensionDtype objects (
GH33013)
* Added support for dict-like names in MultiIndex.set_names and
MultiIndex.rename (GH20421)
* read_excel() can now auto-detect .xlsb files and older .xls files (GH35416,
GH41225)
* ExcelWriter now accepts an if_sheet_exists parameter to control the
behavior of append mode when writing to existing sheets (GH40230)
* Rolling.sum(), Expanding.sum(), Rolling.mean(), Expanding.mean(),
ExponentialMovingWindow.mean(), Rolling.median(), Expanding.median(),
Rolling.max(), Expanding.max(), Rolling.min(), and Expanding.min() now
support Numba execution with the engine keyword (GH38895, GH41267)
* DataFrame.apply() can now accept NumPy unary operators as strings, e.g.
df.apply("sqrt"), which was already the case for Series.apply() (GH39116)
* DataFrame.apply() can now accept non-callable DataFrame properties as
strings, e.g. df.apply("size"), which was already the case for
Series.apply() (GH39116)
* DataFrame.applymap() can now accept kwargs to pass on to the user-provided
func (GH39987)
* Passing a DataFrame indexer to iloc is now disallowed for
Series.__getitem__() and DataFrame.__getitem__() (GH39004)
* Series.apply() can now accept list-like or dictionary-like arguments that
aren't lists or dictionaries, e.g. ser.apply(np.array(["sum", "mean"])),
which was already the case for DataFrame.apply() (GH39140)
* DataFrame.plot.scatter() can now accept a categorical column for the
argument c (GH12380, GH31357)
* Series.loc() now raises a helpful error message when the Series has a
MultiIndex and the indexer has too many dimensions (GH35349)
* read_stata() now supports reading data from compressed files (GH26599)
* Added support for parsing ISO 8601-like timestamps with negative signs to
Timedelta (GH37172)
* Added support for unary operators in FloatingArray (GH38749)
* RangeIndex can now be constructed by passing a range object directly e.g.
pd.RangeIndex(range(3)) (GH12067)
* Series.round() and DataFrame.round() now work with nullable integer and
floating dtypes (GH38844)
* read_csv() and read_json() expose the argument encoding_errors to control
how encoding errors are handled (GH39450)
* GroupBy.any() and GroupBy.all() use Kleene logic with nullable data types (
GH37506)
* GroupBy.any() and GroupBy.all() return a BooleanDtype for columns with
nullable data types (GH33449)
* GroupBy.any() and GroupBy.all() raising with object data containing pd.NA
even when skipna=True (GH37501)
* GroupBy.rank() now supports object-dtype data (GH38278)
* Constructing a DataFrame or Series with the data argument being a Python
iterable that is not a NumPy ndarray consisting of NumPy scalars will now
result in a dtype with a precision the maximum of the NumPy scalars; this
was already the case when data is a NumPy ndarray (GH40908)
* Add keyword sort to pivot_table() to allow non-sorting of the result (
GH39143)
* Add keyword dropna to DataFrame.value_counts() to allow counting rows that
include NA values (GH41325)
* Series.replace() will now cast results to PeriodDtype where possible
instead of object dtype (GH41526)
* Improved error message in corr and cov methods on Rolling, Expanding, and
ExponentialMovingWindow when other is not a DataFrame or Series (GH41741)
* Series.between() can now accept left or right as arguments to inclusive to
include only the left or right boundary (GH40245)
* DataFrame.explode() now supports exploding multiple columns. Its column
argument now also accepts a list of str or tuples for exploding on multiple
columns at the same time; see the short example after this list (GH39240)
* DataFrame.sample() now accepts the ignore_index argument to reset the index
after sampling, similar to DataFrame.drop_duplicates() and
DataFrame.sort_values() (GH38581)
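A brief, hedged example of multi-column explode; the data here is made up, and each row's lists must have matching lengths across the exploded columns:
import pandas as pd

df = pd.DataFrame({"a": [[1, 2], [3]], "b": [["x", "y"], ["z"]], "c": [0, 1]})
print(df.explode(["a", "b"]))
#    a  b  c
# 0  1  x  0
# 0  2  y  0
# 1  3  z  1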
-------------------------------------------------------------------------------
Notable bug fixes
These are bug fixes that might have notable behavior changes.
-------------------------------------------------------------------------------
Categorical.unique now always maintains same dtype as original
Previously, when calling Categorical.unique() with categorical data, unused
categories in the new array would be removed, making the dtype of the new array
different than the original (GH18291)
As an example of this, given:
In [19]: dtype = pd.CategoricalDtype(['bad', 'neutral', 'good'], ordered=True)
In [20]: cat = pd.Categorical(['good', 'good', 'bad', 'bad'], dtype=dtype)
In [21]: original = pd.Series(cat)
In [22]: unique = original.unique()
Previous behavior:
In [1]: unique
['good', 'bad']
Categories (2, object): ['bad' < 'good']
In [2]: original.dtype == unique.dtype
False
New behavior:
In [23]: unique
Out[23]:
['good', 'bad']
Categories (3, object): ['bad' < 'neutral' < 'good']
In [24]: original.dtype == unique.dtype
Out[24]: True
-------------------------------------------------------------------------------
Preserve dtypes in DataFrame.combine_first()
DataFrame.combine_first() will now preserve dtypes (GH7509)
In [25]: df1 = pd.DataFrame({"A": [1, 2, 3], "B": [1, 2, 3]}, index=[0, 1, 2])
In [26]: df1
Out[26]:
A B
0 1 1
1 2 2
2 3 3
In [27]: df2 = pd.DataFrame({"B": [4, 5, 6], "C": [1, 2, 3]}, index=[2, 3, 4])
In [28]: df2
Out[28]:
B C
2 4 1
3 5 2
4 6 3
In [29]: combined = df1.combine_first(df2)
Previous behavior:
In [1]: combined.dtypes
Out[2]:
A float64
B float64
C float64
dtype: object
New behavior:
In [30]: combined.dtypes
Out[30]:
A float64
B int64
C float64
dtype: object
-------------------------------------------------------------------------------
Groupby methods agg and transform no longer change return dtype for callables
Previously the methods DataFrameGroupBy.aggregate(), SeriesGroupBy.aggregate(),
DataFrameGroupBy.transform(), and SeriesGroupBy.transform() might cast the
result dtype when the argument func is callable, possibly leading to
undesirable results (GH21240). The cast would occur if the result is numeric
and casting back to the input dtype does not change any values as measured by
np.allclose. Now no such casting occurs.
In [31]: df = pd.DataFrame({'key': [1, 1], 'a': [True, False], 'b': [True, True]})
In [32]: df
Out[32]:
key a b
0 1 True True
1 1 False True
Previous behavior:
In [5]: df.groupby('key').agg(lambda x: x.sum())
Out[5]:
a b
key
1 True 2
New behavior:
In [33]: df.groupby('key').agg(lambda x: x.sum())
Out[33]:
a b
key
1 1 2
-------------------------------------------------------------------------------
float result for GroupBy.mean(), GroupBy.median(), and GroupBy.var()
Previously, these methods could result in different dtypes depending on the
input values. Now, these methods will always return a float dtype. (GH41137)
In [34]: df = pd.DataFrame({'a': [True], 'b': [1], 'c': [1.0]})
Previous behavior:
In [5]: df.groupby(df.index).mean()
Out[5]:
a b c
0 True 1 1.0
New behavior:
In [35]: df.groupby(df.index).mean()
Out[35]:
a b c
0 1.0 1.0 1.0
-------------------------------------------------------------------------------
Try operating inplace when setting values with loc and iloc
When setting an entire column using loc or iloc, pandas will try to insert the
values into the existing data rather than create an entirely new array.
In [36]: df = pd.DataFrame(range(3), columns=["A"], dtype="float64")
In [37]: values = df.values
In [38]: new = np.array([5, 6, 7], dtype="int64")
In [39]: df.loc[[0, 1, 2], "A"] = new
In both the new and old behavior, the data in values is overwritten, but in the
old behavior the dtype of df["A"] changed to int64.
Previous behavior:
In [1]: df.dtypes
Out[1]:
A int64
dtype: object
In [2]: np.shares_memory(df["A"].values, new)
Out[2]: False
In [3]: np.shares_memory(df["A"].values, values)
Out[3]: False
In pandas 1.3.0, df continues to share data with values
New behavior:
In [40]: df.dtypes
Out[40]:
A float64
dtype: object
In [41]: np.shares_memory(df["A"], new)
Out[41]: False
In [42]: np.shares_memory(df["A"], values)
Out[42]: True
-------------------------------------------------------------------------------
Never operate inplace when setting frame[keys] = values
When setting multiple columns using frame[keys] = values, new arrays will
replace the pre-existing arrays for these keys, which will not be overwritten (
GH39510). As a result, the columns will retain the dtype(s) of values, never
casting to the dtypes of the existing arrays.
In [43]: df = pd.DataFrame(range(3), columns=["A"], dtype="float64")
In [44]: df[["A"]] = 5
In the old behavior, 5 was cast to float64 and inserted into the existing array
backing df:
Previous behavior:
In [1]: df.dtypes
Out[1]:
A float64
In the new behavior, we get a new array, and retain an integer-dtyped 5:
New behavior:
In [45]: df.dtypes
Out[45]:
A int64
dtype: object
-------------------------------------------------------------------------------
Consistent casting with setting into Boolean Series
Setting non-boolean values into a Series with dtype=bool now consistently casts
to dtype=object (GH38709)
In [46]: orig = pd.Series([True, False])
In [47]: ser = orig.copy()
In [48]: ser.iloc[1] = np.nan
In [49]: ser2 = orig.copy()
In [50]: ser2.iloc[1] = 2.0
Previous behavior:
In [1]: ser
Out[1]:
0 1.0
1 NaN
dtype: float64
In [2]: ser2
Out[2]:
0 True
1 2.0
dtype: object
New behavior:
In [51]: ser
Out[51]:
0 True
1 NaN
dtype: object
In [52]: ser2
Out[52]:
0 True
1 2.0
dtype: object
-------------------------------------------------------------------------------
GroupBy.rolling no longer returns grouped-by column in values
The group-by column will now be dropped from the result of a groupby.rolling
operation (GH32262)
In [53]: df = pd.DataFrame({"A": [1, 1, 2, 3], "B": [0, 1, 2, 3]})
In [54]: df
Out[54]:
A B
0 1 0
1 1 1
2 2 2
3 3 3
Previous behavior:
In [1]: df.groupby("A").rolling(2).sum()
Out[1]:
          A    B
A
1 0     NaN  NaN
  1     2.0  1.0
2 2     NaN  NaN
3 3     NaN  NaN
New behavior:
In [55]: df.groupby("A").rolling(2).sum()
Out[55]:
       B
A
1 0  NaN
  1  1.0
2 2  NaN
3 3  NaN
-------------------------------------------------------------------------------
Removed artificial truncation in rolling variance and standard deviation
Rolling.std() and Rolling.var() will no longer artificially truncate results
that are less than ~1e-8 and ~1e-15 respectively to zero (GH37051, GH40448,
GH39872).
However, floating point artifacts may now exist in the results when rolling
over larger values.
In [56]: s = pd.Series([7, 5, 5, 5])
In [57]: s.rolling(3).var()
Out[57]:
0 NaN
1 NaN
2 1.333333e+00
3 4.440892e-16
dtype: float64
-------------------------------------------------------------------------------
GroupBy.rolling with MultiIndex no longer drops levels in the result
GroupBy.rolling() will no longer drop levels of a DataFrame with a MultiIndex
in the result. This can lead to a perceived duplication of levels in the
resulting MultiIndex, but this change restores the behavior that was present in
version 1.1.3 (GH38787, GH38523).
In [58]: index = pd.MultiIndex.from_tuples([('idx1', 'idx2')], names=['label1', 'label2'])
In [59]: df = pd.DataFrame({'a': [1], 'b': [2]}, index=index)
In [60]: df
Out[60]:
a b
label1 label2
idx1 idx2 1 2
Previous behavior:
In [1]: df.groupby('label1').rolling(1).sum()
Out[1]:
a b
label1
idx1 1.0 2.0
New behavior:
In [61]: df.groupby('label1').rolling(1).sum()
Out[61]:
a b
label1 label1 label2
idx1 idx1 idx2 1.0 2.0
-------------------------------------------------------------------------------
Backwards incompatible API changes
-------------------------------------------------------------------------------
Increased minimum versions for dependencies
Some minimum supported versions of dependencies were updated. If installed, we
now require:
Package          Minimum Version  Required  Changed
numpy            1.17.3           X         X
pytz             2017.3           X
python-dateutil  2.7.3            X
bottleneck       1.2.1
numexpr          2.7.0                      X
pytest (dev)     6.0                        X
mypy (dev)       0.812                      X
setuptools       38.6.0                     X
For optional libraries the general recommendation is to use the latest version.
The following table lists the lowest version per library that is currently
being tested throughout the development of pandas. Optional libraries below the
lowest tested version may still work, but are not considered supported.
Package          Minimum Version   Changed
beautifulsoup4   4.6.0
fastparquet      0.4.0             X
fsspec           0.7.4
gcsfs            0.6.0
lxml             4.3.0
matplotlib       2.2.3
numba            0.46.0
openpyxl         3.0.0             X
pyarrow          0.17.0            X
pymysql          0.8.1             X
pytables         3.5.1
s3fs             0.4.0
scipy            1.2.0
sqlalchemy       1.3.0             X
tabulate         0.8.7             X
xarray           0.12.0
xlrd             1.2.0
xlsxwriter       1.0.2
xlwt             1.3.0
pandas-gbq       0.12.0
See Dependencies and Optional dependencies for more.
-------------------------------------------------------------------------------
Other API changes
* Partially initialized CategoricalDtype objects (i.e. those with categories=
None) will no longer compare as equal to fully initialized dtype objects (
GH38516)
* Accessing _constructor_expanddim on a DataFrame and _constructor_sliced on
a Series now raise an AttributeError. Previously a NotImplementedError was
raised (GH38782)
* Added new engine and **engine_kwargs parameters to DataFrame.to_sql() to
support other future 'SQL engines'. Currently we still only use
SQLAlchemy under the hood, but more engines are planned to be supported
such as turbodbc (GH36893)
* Removed redundant freq from PeriodIndex string representation (GH41653)
* ExtensionDtype.construct_array_type() is now a required method instead of
an optional one for ExtensionDtype subclasses (GH24860)
* Calling hash on non-hashable pandas objects will now raise TypeError with
the built-in error message (e.g. unhashable type: 'Series'). Previously it
would raise a custom message such as 'Series' objects are mutable, thus
they cannot be hashed. Furthermore, isinstance(<Series>,
abc.collections.Hashable) will now return False (GH40013)
* Styler.from_custom_template() now has two new arguments for template names,
and removed the old name, due to template inheritance having been
introduced for better parsing (GH42053). Subclassing modifications to
Styler attributes are also needed.
-------------------------------------------------------------------------------
Build
* Documentation in .pptx and .pdf formats is no longer included in wheels or
source distributions. (GH30741)
-------------------------------------------------------------------------------
Deprecations
-------------------------------------------------------------------------------
Deprecated dropping nuisance columns in DataFrame reductions and
DataFrameGroupBy operations
When calling a reduction (e.g. .min, .max, .sum) on a DataFrame with
numeric_only=None (the default), columns where the reduction raises a TypeError
are silently ignored and dropped from the result.
This behavior is deprecated. In a future version, the TypeError will be raised,
and users will need to select only valid columns before calling the function.
For example:
In [62]: df = pd.DataFrame({"A": [1, 2, 3, 4], "B": pd.date_range("2016-01-01", periods=4)})
In [63]: df
Out[63]:
A B
0 1 2016-01-01
1 2 2016-01-02
2 3 2016-01-03
3 4 2016-01-04
Old behavior:
In [3]: df.prod()
Out[3]:
A 24
dtype: int64
Future behavior:
In [4]: df.prod()
...
TypeError: 'DatetimeArray' does not implement reduction 'prod'
In [5]: df[["A"]].prod()
Out[5]:
A 24
dtype: int64
Similarly, when applying a function to DataFrameGroupBy, columns on which the
function raises TypeError are currently silently ignored and dropped from the
result.
This behavior is deprecated. In a future version, the TypeError will be raised,
and users will need to select only valid columns before calling the function.
For example:
In [64]: df = pd.DataFrame({"A": [1, 2, 3, 4], "B": pd.date_range("2016-01-01", periods=4)})
In [65]: gb = df.groupby([1, 1, 2, 2])
Old behavior:
In [4]: gb.prod(numeric_only=False)
Out[4]:
A
1 2
2 12
Future behavior:
In [5]: gb.prod(numeric_only=False)
...
TypeError: datetime64 type does not support prod operations
In [6]: gb[["A"]].prod(numeric_only=False)
Out[6]:
A
1 2
2 12
-------------------------------------------------------------------------------
Other Deprecations
* Deprecated allowing scalars to be passed to the Categorical constructor (
GH38433)
* Deprecated constructing CategoricalIndex without passing list-like data (
GH38944)
* Deprecated allowing subclass-specific keyword arguments in the Index
constructor, use the specific subclass directly instead (GH14093, GH21311,
GH22315, GH26974)
* Deprecated the astype() method of datetimelike (timedelta64[ns], datetime64
[ns], Datetime64TZDtype, PeriodDtype) to convert to integer dtypes, use
values.view(...) instead (GH38544)
* Deprecated MultiIndex.is_lexsorted() and MultiIndex.lexsort_depth(), use
MultiIndex.is_monotonic_increasing() instead (GH32259)
* Deprecated keyword try_cast in Series.where(), Series.mask(),
DataFrame.where(), DataFrame.mask(); cast results manually if desired (
GH38836)
* Deprecated comparison of Timestamp objects with datetime.date objects.
Instead of e.g. ts <= mydate use ts <= pd.Timestamp(mydate) or ts.date() <=
mydate; see the sketch after this list (GH36131)
* Deprecated Rolling.win_type returning "freq" (GH38963)
* Deprecated Rolling.is_datetimelike (GH38963)
* Deprecated DataFrame indexer for Series.__setitem__() and
DataFrame.__setitem__() (GH39004)
* Deprecated ExponentialMovingWindow.vol() (GH39220)
* Using .astype to convert between datetime64[ns] dtype and DatetimeTZDtype
is deprecated and will raise in a future version, use obj.tz_localize or
obj.dt.tz_localize instead (GH38622)
* Deprecated casting datetime.date objects to datetime64 when used as
fill_value in DataFrame.unstack(), DataFrame.shift(), Series.shift(), and
DataFrame.reindex(), pass pd.Timestamp(dateobj) instead (GH39767)
* Deprecated Styler.set_na_rep() and Styler.set_precision() in favor of
Styler.format() with na_rep and precision as existing and new input
arguments respectively (GH40134, GH40425)
* Deprecated Styler.where() in favor of using an alternative formulation with
Styler.applymap() (GH40821)
* Deprecated allowing partial failure in Series.transform() and
DataFrame.transform() when func is list-like or dict-like and raises
anything but TypeError; func raising anything but a TypeError will raise in
a future version (GH40211)
* Deprecated arguments error_bad_lines and warn_bad_lines in read_csv() and
read_table() in favor of argument on_bad_lines (GH15122)
* Deprecated support for np.ma.mrecords.MaskedRecords in the DataFrame
constructor, pass {name: data[name] for name in data.dtype.names} instead (
GH40363)
* Deprecated using merge(), DataFrame.merge(), and DataFrame.join() on a
different number of levels (GH34862)
* Deprecated the use of **kwargs in ExcelWriter; use the keyword argument
engine_kwargs instead (GH40430)
* Deprecated the level keyword for DataFrame and Series aggregations; use
groupby instead (GH39983)
* Deprecated the inplace parameter of Categorical.remove_categories(),
Categorical.add_categories(), Categorical.reorder_categories(),
Categorical.rename_categories(), Categorical.set_categories() and will be
removed in a future version (GH37643)
* Deprecated merge() producing duplicated columns through the suffixes
keyword and already existing columns (GH22818)
* Deprecated setting Categorical._codes, create a new Categorical with the
desired codes instead (GH40606)
* Deprecated the convert_float optional argument in read_excel() and
ExcelFile.parse() (GH41127)
* Deprecated behavior of DatetimeIndex.union() with mixed timezones; in a
future version both will be cast to UTC instead of object dtype (GH39328)
* Deprecated using usecols with out of bounds indices for read_csv() with
engine="c" (GH25623)
* Deprecated special treatment of lists with first element a Categorical in
the DataFrame constructor; pass as pd.DataFrame({col: categorical, ...})
instead (GH38845)
* Deprecated behavior of DataFrame constructor when a dtype is passed and the
data cannot be cast to that dtype. In a future version, this will raise
instead of being silently ignored (GH24435)
* Deprecated the Timestamp.freq attribute. For the properties that use it (
is_month_start, is_month_end, is_quarter_start, is_quarter_end,
is_year_start, is_year_end), when you have a freq, use e.g.
freq.is_month_start(ts) (GH15146)
* Deprecated construction of Series or DataFrame with DatetimeTZDtype data
and datetime64[ns] dtype. Use Series(data).dt.tz_localize(None) instead (
GH41555, GH33401)
* Deprecated behavior of Series construction with large-integer values and
small-integer dtype silently overflowing; use Series(data).astype(dtype)
instead (GH41734)
* Deprecated behavior of DataFrame construction with floating data and
integer dtype casting even when lossy; in a future version this will remain
floating, matching Series behavior (GH41770)
* Deprecated inference of timedelta64[ns], datetime64[ns], or DatetimeTZDtype
dtypes in Series construction when data containing strings is passed and no
dtype is passed (GH33558)
* In a future version, constructing Series or DataFrame with datetime64[ns]
data and DatetimeTZDtype will treat the data as wall-times instead of as
UTC times (matching DatetimeIndex behavior). To treat the data as UTC
times, use pd.Series(data).dt.tz_localize("UTC").dt.tz_convert(dtype.tz) or
pd.Series(data.view("int64"), dtype=dtype) (GH33401)
* Deprecated passing lists as key to DataFrame.xs() and Series.xs() (GH41760)
* Deprecated boolean arguments of inclusive in Series.between() to have
{"left", "right", "neither", "both"} as standard argument values (GH40628)
* Deprecated passing arguments as positional for all of the following, with
exceptions noted (GH41485):
+ concat() (other than objs)
+ read_csv() (other than filepath_or_buffer)
+ read_table() (other than filepath_or_buffer)
+ DataFrame.clip() and Series.clip() (other than upper and lower)
+ DataFrame.drop_duplicates() (except for subset), Series.drop_duplicates
(), Index.drop_duplicates() and MultiIndex.drop_duplicates()
+ DataFrame.drop() (other than labels) and Series.drop()
+ DataFrame.dropna() and Series.dropna()
+ DataFrame.ffill(), Series.ffill(), DataFrame.bfill(), and Series.bfill
()
+ DataFrame.fillna() and Series.fillna() (apart from value)
+ DataFrame.interpolate() and Series.interpolate() (other than method)
+ DataFrame.mask() and Series.mask() (other than cond and other)
+ DataFrame.reset_index() (other than level) and Series.reset_index()
+ DataFrame.set_axis() and Series.set_axis() (other than labels)
+ DataFrame.set_index() (other than keys)
+ DataFrame.sort_index() and Series.sort_index()
+ DataFrame.sort_values() (other than by) and Series.sort_values()
+ DataFrame.where() and Series.where() (other than cond and other)
+ Index.set_names() and MultiIndex.set_names() (except for names)
+ MultiIndex.set_codes() (except for codes)
+ MultiIndex.set_levels() (except for levels)
+ Resampler.interpolate() (other than method)
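A minimal sketch of the migration suggested in the Timestamp/datetime.date
deprecation entry above (GH36131); the variable names and dates are
illustrative only:

import datetime
import pandas as pd

ts = pd.Timestamp("2021-06-15 12:00")
mydate = datetime.date(2021, 6, 16)

# Instead of the deprecated:  ts <= mydate
print(ts <= pd.Timestamp(mydate))   # compare as Timestamps
print(ts.date() <= mydate)          # or compare as dates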
-------------------------------------------------------------------------------
Performance improvements
* Performance improvement in IntervalIndex.isin() (GH38353)
* Performance improvement in Series.mean() for nullable data types (GH34814)
* Performance improvement in Series.isin() for nullable data types (GH38340)
* Performance improvement in DataFrame.fillna() with method="pad" or method=
"backfill" for nullable floating and nullable integer dtypes (GH39953)
* Performance improvement in DataFrame.corr() for method=kendall (GH28329)
* Performance improvement in DataFrame.corr() for method=spearman (GH40956,
GH41885)
* Performance improvement in Rolling.corr() and Rolling.cov() (GH39388)
* Performance improvement in RollingGroupby.corr(), ExpandingGroupby.corr()
and ExpandingGroupby.cov() (GH39591)
* Performance improvement in unique() for object data type (GH37615)
* Performance improvement in json_normalize() for basic cases (including
separators) (GH40035, GH15621)
* Performance improvement in ExpandingGroupby aggregation methods (GH39664)
* Performance improvement in Styler where render times are more than 50%
reduced and now match DataFrame.to_html() (GH39972, GH39952, GH40425)
* The method Styler.set_td_classes() is now as performant as Styler.apply()
and Styler.applymap(), and even more so in some cases (GH40453)
* Performance improvement in ExponentialMovingWindow.mean() with times (
GH39784)
* Performance improvement in GroupBy.apply() when requiring the Python
fallback implementation (GH40176)
* Performance improvement in the conversion of a PyArrow Boolean array to a
pandas nullable Boolean array (GH41051)
* Performance improvement for concatenation of data with type
CategoricalDtype (GH40193)
* Performance improvement in GroupBy.cummin() and GroupBy.cummax() with
nullable data types (GH37493)
* Performance improvement in Series.nunique() with nan values (GH40865)
* Performance improvement in DataFrame.transpose(), Series.unstack() with
DatetimeTZDtype (GH40149)
* Performance improvement in Series.plot() and DataFrame.plot() with entry
point lazy loading (GH41492)
-------------------------------------------------------------------------------
Bug fixes
-------------------------------------------------------------------------------
Categorical
* Bug in CategoricalIndex incorrectly failing to raise TypeError when scalar
data is passed (GH38614)
* Bug in CategoricalIndex.reindex failing when the Index passed was not
categorical but its values were all labels in the category (GH28690)
* Bug where constructing a Categorical from an object-dtype array of date
objects did not round-trip correctly with astype (GH38552)
* Bug in constructing a DataFrame from an ndarray and a CategoricalDtype (
GH38857)
* Bug in setting categorical values into an object-dtype column in a
DataFrame (GH39136)
* Bug in DataFrame.reindex() was raising an IndexError when the new index
contained duplicates and the old index was a CategoricalIndex (GH38906)
* Bug in Categorical.fillna() with a tuple-like category raising
NotImplementedError instead of ValueError when filling with a non-category
tuple (GH41914)
-------------------------------------------------------------------------------
Datetimelike
* Bug in DataFrame and Series constructors sometimes dropping nanoseconds
from Timestamp (resp. Timedelta) data, with dtype=datetime64[ns] (resp.
timedelta64[ns]) (GH38032)
* Bug in DataFrame.first() and Series.first() with an offset of one month
returning an incorrect result when the first day is the last day of a month
(GH29623)
* Bug in constructing a DataFrame or Series with mismatched datetime64 data
and timedelta64 dtype, or vice-versa, failing to raise a TypeError (GH38575
, GH38764, GH38792)
* Bug in constructing a Series or DataFrame with a datetime object out of
bounds for datetime64[ns] dtype or a timedelta object out of bounds for
timedelta64[ns] dtype (GH38792, GH38965)
* Bug in DatetimeIndex.intersection(), DatetimeIndex.symmetric_difference(),
PeriodIndex.intersection(), PeriodIndex.symmetric_difference() always
returning object-dtype when operating with CategoricalIndex (GH38741)
* Bug in DatetimeIndex.intersection() giving incorrect results with non-Tick
frequencies with n != 1 (GH42104)
* Bug in Series.where() incorrectly casting datetime64 values to int64 (
GH37682)
* Bug in Categorical incorrectly typecasting datetime object to Timestamp (
GH38878)
* Bug in comparisons between Timestamp object and datetime64 objects just
outside the implementation bounds for nanosecond datetime64 (GH39221)
* Bug in Timestamp.round(), Timestamp.floor(), Timestamp.ceil() for values
near the implementation bounds of Timestamp (GH39244)
* Bug in Timedelta.round(), Timedelta.floor(), Timedelta.ceil() for values
near the implementation bounds of Timedelta (GH38964)
* Bug in date_range() incorrectly creating DatetimeIndex containing NaT
instead of raising OutOfBoundsDatetime in corner cases (GH24124)
* Bug in infer_freq() incorrectly failing to infer the 'H' frequency of a
DatetimeIndex if the latter has a timezone and crosses DST boundaries (
GH39556)
* Bug in Series backed by DatetimeArray or TimedeltaArray sometimes failing
to set the array's freq to None (GH41425)
-------------------------------------------------------------------------------
Timedelta
* Bug in constructing Timedelta from np.timedelta64 objects with
non-nanosecond units that are out of bounds for timedelta64[ns] (GH38965)
* Bug in constructing a TimedeltaIndex incorrectly accepting np.datetime64
("NaT") objects (GH39462)
* Bug in constructing Timedelta from an input string with only symbols and no
digits failing to raise an error (GH39710)
* Bug in TimedeltaIndex and to_timedelta() failing to raise when passed
non-nanosecond timedelta64 arrays that overflow when converting to
timedelta64[ns] (GH40008)
-------------------------------------------------------------------------------
Timezones
* Bug in different tzinfo objects representing UTC not being treated as
equivalent (GH39216)
* Bug in dateutil.tz.gettz("UTC") not being recognized as equivalent to other
UTC-representing tzinfos (GH39276)
-------------------------------------------------------------------------------
Numeric
* Bug in DataFrame.quantile(), DataFrame.sort_values() causing incorrect
subsequent indexing behavior (GH38351)
* Bug in DataFrame.sort_values() raising an IndexError for empty by (GH40258)
* Bug in DataFrame.select_dtypes() with include=np.number would drop numeric
ExtensionDtype columns (GH35340)
* Bug in DataFrame.mode() and Series.mode() not keeping consistent integer
Index for empty input (GH33321)
* Bug in DataFrame.rank() when the DataFrame contained np.inf (GH32593)
* Bug in DataFrame.rank() with axis=0 and columns holding incomparable types
raising an IndexError (GH38932)
* Bug in Series.rank(), DataFrame.rank(), and GroupBy.rank() treating the
most negative int64 value as missing (GH32859)
* Bug in DataFrame.select_dtypes() different behavior between Windows and
Linux with include="int" (GH36596)
* Bug in DataFrame.apply() and DataFrame.agg() when passed the argument func=
"size" would operate on the entire DataFrame instead of rows or columns (
GH39934)
* Bug in DataFrame.transform() would raise a SpecificationError when passed a
dictionary and columns were missing; will now raise a KeyError instead (
GH40004)
* Bug in GroupBy.rank() giving incorrect results with pct=True and equal
values between consecutive groups (GH40518)
* Bug in Series.count() would result in an int32 result on 32-bit platforms
when argument level=None (GH40908)
* Bug in Series and DataFrame reductions with methods any and all not
returning Boolean results for object data (GH12863, GH35450, GH27709)
* Bug in Series.clip() would fail if the Series contains NA values and has
nullable int or float as a data type (GH40851)
* Bug in UInt64Index.where() and UInt64Index.putmask() with an np.int64 dtype
other incorrectly raising TypeError (GH41974)
* Bug in DataFrame.agg() not sorting the aggregated axis in the order of the
provided aggregation functions when one or more aggregation function fails
to produce results (GH33634)
* Bug in DataFrame.clip() not interpreting missing values as no threshold (
GH40420)
-------------------------------------------------------------------------------
Conversion
* Bug in Series.to_dict() with orient='records' now returns Python native
types (GH25969)
* Bug in Series.view() and Index.view() when converting between datetime-like
(datetime64[ns], datetime64[ns, tz], timedelta64, period) dtypes (GH39788)
* Bug in creating a DataFrame from an empty np.recarray not retaining the
original dtypes (GH40121)
* Bug in DataFrame failing to raise a TypeError when constructing from a
frozenset (GH40163)
* Bug in Index construction silently ignoring a passed dtype when the data
cannot be cast to that dtype (GH21311)
* Bug in StringArray.astype() falling back to NumPy and raising when
converting to dtype='categorical' (GH40450)
* Bug in factorize() where, when given an array with a numeric NumPy dtype
lower than int64, uint64 and float64, the unique values did not keep their
original dtype (GH41132)
* Bug in DataFrame construction with a dictionary containing an array-like
with ExtensionDtype and copy=True failing to make a copy (GH38939)
* Bug in qcut() raising error when taking Float64DType as input (GH40730)
* Bug in DataFrame and Series construction with datetime64[ns] data and dtype
=object resulting in datetime objects instead of Timestamp objects (GH41599
)
* Bug in DataFrame and Series construction with timedelta64[ns] data and
dtype=object resulting in np.timedelta64 objects instead of Timedelta
objects (GH41599)
* Bug in DataFrame construction when given a two-dimensional object-dtype
np.ndarray of Period or Interval objects failing to cast to PeriodDtype or
IntervalDtype, respectively (GH41812)
* Bug in constructing a Series from a list and a PandasDtype (GH39357)
* Bug in creating a Series from a range object that does not fit in the
bounds of int64 dtype (GH30173)
* Bug in creating a Series from a dict with all-tuple keys and an Index that
requires reindexing (GH41707)
* Bug in infer_dtype() not recognizing Series, Index, or array with a Period
dtype (GH23553)
* Bug in infer_dtype() raising an error for general ExtensionArray objects.
It will now return "unknown-array" instead of raising (GH37367)
* Bug in DataFrame.convert_dtypes() incorrectly raised a ValueError when
called on an empty DataFrame (GH40393)
-------------------------------------------------------------------------------
Strings
* Bug in the conversion from pyarrow.ChunkedArray to StringArray when the
original had zero chunks (GH41040)
* Bug in Series.replace() and DataFrame.replace() ignoring replacements with
regex=True for StringDType data (GH41333, GH35977)
* Bug in Series.str.extract() with StringArray returning object dtype for an
empty DataFrame (GH41441)
* Bug in Series.str.replace() where the case argument was ignored when regex=
False (GH41602)
-------------------------------------------------------------------------------
Interval
* Bug in IntervalIndex.intersection() and IntervalIndex.symmetric_difference
() always returning object-dtype when operating with CategoricalIndex (
GH38653, GH38741)
* Bug in IntervalIndex.intersection() returning duplicates when at least one
of the Index objects have duplicates which are present in the other (
GH38743)
* IntervalIndex.union(), IntervalIndex.intersection(),
IntervalIndex.difference(), and IntervalIndex.symmetric_difference() now
cast to the appropriate dtype instead of raising a TypeError when operating
with another IntervalIndex with incompatible dtype (GH39267)
* PeriodIndex.union(), PeriodIndex.intersection(),
PeriodIndex.symmetric_difference(), PeriodIndex.difference() now cast to
object dtype instead of raising IncompatibleFrequency when operating with
another PeriodIndex with incompatible dtype (GH39306)
* Bug in IntervalIndex.is_monotonic(), IntervalIndex.get_loc(),
IntervalIndex.get_indexer_for(), and IntervalIndex.__contains__() when NA
values are present (GH41831)
-------------------------------------------------------------------------------
Indexing
* Bug in Index.union() and MultiIndex.union() dropping duplicate Index values
when Index was not monotonic or sort was set to False (GH36289, GH31326,
GH40862)
* Bug in CategoricalIndex.get_indexer() failing to raise InvalidIndexError
when non-unique (GH38372)
* Bug in IntervalIndex.get_indexer() when target has CategoricalDtype and
both the index and the target contain NA values (GH41934)
* Bug in Series.loc() raising a ValueError when input was filtered with a
Boolean list and values to set were a list with lower dimension (GH20438)
* Bug in inserting many new columns into a DataFrame causing incorrect
subsequent indexing behavior (GH38380)
* Bug in DataFrame.__setitem__() raising a ValueError when setting multiple
values to duplicate columns (GH15695)
* Bug in DataFrame.loc(), Series.loc(), DataFrame.__getitem__() and
Series.__getitem__() returning incorrect elements for non-monotonic
DatetimeIndex for string slices (GH33146)
* Bug in DataFrame.reindex() and Series.reindex() with timezone aware indexes
raising a TypeError for method="ffill" and method="bfill" and specified
tolerance (GH38566)
* Bug in DataFrame.reindex() with datetime64[ns] or timedelta64[ns]
incorrectly casting to integers when the fill_value requires casting to
object dtype (GH39755)
* Bug in DataFrame.__setitem__() raising a ValueError when setting on an
empty DataFrame using specified columns and a nonempty DataFrame value (
GH38831)
* Bug in DataFrame.loc.__setitem__() raising a ValueError when operating on a
unique column when the DataFrame has duplicate columns (GH38521)
* Bug in DataFrame.iloc.__setitem__() and DataFrame.loc.__setitem__() with
mixed dtypes when setting with a dictionary value (GH38335)
* Bug in Series.loc.__setitem__() and DataFrame.loc.__setitem__() raising
KeyError when provided a Boolean generator (GH39614)
* Bug in Series.iloc() and DataFrame.iloc() raising a KeyError when provided
a generator (GH39614)
* Bug in DataFrame.__setitem__() not raising a ValueError when the right hand
side is a DataFrame with wrong number of columns (GH38604)
* Bug in Series.__setitem__() raising a ValueError when setting a Series with
a scalar indexer (GH38303)
* Bug in DataFrame.loc() dropping levels of a MultiIndex when the DataFrame
used as input has only one row (GH10521)
* Bug in DataFrame.__getitem__() and Series.__getitem__() always raising
KeyError when slicing with existing strings where the Index has
milliseconds (GH33589)
* Bug in setting timedelta64 or datetime64 values into numeric Series failing
to cast to object dtype (GH39086, GH39619)
* Bug in setting Interval values into a Series or DataFrame with mismatched
IntervalDtype incorrectly casting the new values to the existing dtype (
GH39120)
* Bug in setting datetime64 values into a Series with integer-dtype
incorrectly casting the datetime64 values to integers (GH39266)
* Bug in setting np.datetime64("NaT") into a Series with Datetime64TZDtype
incorrectly treating the timezone-naive value as timezone-aware (GH39769)
* Bug in Index.get_loc() not raising KeyError when key=NaN and method is
specified but NaN is not in the Index (GH39382)
* Bug in DatetimeIndex.insert() when inserting np.datetime64("NaT") into a
timezone-aware index incorrectly treating the timezone-naive value as
timezone-aware (GH39769)
* Bug where Index.insert() incorrectly raised when setting a new column that
cannot be held in the existing frame.columns, and where Series.reset_index()
or DataFrame.reset_index() raised instead of casting to a compatible dtype (
GH39068)
* Bug in RangeIndex.append() where a single object of length 1 was
concatenated incorrectly (GH39401)
* Bug in RangeIndex.astype() where, when converting to CategoricalIndex, the
categories became an Int64Index instead of a RangeIndex (GH41263)
* Bug in setting numpy.timedelta64 values into an object-dtype Series using a
Boolean indexer (GH39488)
* Bug in setting numeric values into a boolean-dtype Series using at or iat
failing to cast to object-dtype (GH39582)
* Bug in DataFrame.__setitem__() and DataFrame.iloc.__setitem__() raising
ValueError when trying to index with a row-slice and setting a list as
values (GH40440)
* Bug in DataFrame.loc() not raising KeyError when the key was not found in
MultiIndex and the levels were not fully specified (GH41170)
* Bug in DataFrame.loc.__setitem__() when setting-with-expansion incorrectly
raising when the index in the expanding axis contained duplicates (GH40096)
* Bug in DataFrame.loc.__getitem__() with MultiIndex casting to float when at
least one index column has float dtype and we retrieve a scalar (GH41369)
* Bug in DataFrame.loc() incorrectly matching non-Boolean index elements (
GH20432)
* Bug in indexing with np.nan on a Series or DataFrame with a
CategoricalIndex incorrectly raising KeyError when np.nan keys are present
(GH41933)
* Bug in Series.__delitem__() with ExtensionDtype incorrectly casting to
ndarray (GH40386)
* Bug in DataFrame.at() with a CategoricalIndex returning incorrect results
when passed integer keys (GH41846)
* Bug in DataFrame.loc() returning a MultiIndex in the wrong order if an
indexer has duplicates (GH40978)
* Bug in DataFrame.__setitem__() raising a TypeError when using a str
subclass as the column name with a DatetimeIndex (GH37366)
* Bug in PeriodIndex.get_loc() failing to raise a KeyError when given a
Period with a mismatched freq (GH41670)
* Bug in .loc.__getitem__ with a UInt64Index and negative-integer keys raising
OverflowError instead of KeyError in some cases, wrapping around to
positive integers in others (GH41777)
* Bug in Index.get_indexer() failing to raise ValueError in some cases with
invalid method, limit, or tolerance arguments (GH41918)
* Bug when slicing a Series or DataFrame with a TimedeltaIndex when passing
an invalid string raising ValueError instead of a TypeError (GH41821)
* Bug in Index constructor sometimes silently ignoring a specified dtype (
GH38879)
* Index.where() behavior now mirrors Index.putmask() behavior, i.e.
index.where(mask, other) matches index.putmask(~mask, other) (GH39412)
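A minimal sketch of the Index.where()/Index.putmask() equivalence noted in the
last entry above (the data here is illustrative only):

import numpy as np
import pandas as pd

idx = pd.Index([1, 2, 3, 4])
mask = np.array([True, False, True, False])
other = pd.Index([10, 20, 30, 40])

left = idx.where(mask, other)       # keep values where mask is True, else take from other
right = idx.putmask(~mask, other)   # overwrite values where ~mask is True with other
print(left.equals(right))           # True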
-------------------------------------------------------------------------------
Missing
* Bug in Grouper did not correctly propagate the dropna argument;
DataFrameGroupBy.transform() now correctly handles missing values for
dropna=True (GH35612)
* Bug in isna(), Series.isna(), Index.isna(), DataFrame.isna(), and the
corresponding notna functions not recognizing Decimal("NaN") objects (
GH39409)
* Bug in DataFrame.fillna() not accepting a dictionary for the downcast
keyword (GH40809)
* Bug in isna() not returning a copy of the mask for nullable types, causing
any subsequent mask modification to change the original array (GH40935)
* Bug in DataFrame construction with float data containing NaN and an integer
dtype casting instead of retaining the NaN (GH26919)
* Bug in Series.isin() and MultiIndex.isin() not treating all NaNs as
equivalent when they were contained in tuples (GH41836)
-------------------------------------------------------------------------------
MultiIndex
* Bug in DataFrame.drop() raising a TypeError when the MultiIndex is
non-unique and level is not provided (GH36293)
* Bug in MultiIndex.intersection() duplicating NaN in the result (GH38623)
* Bug in MultiIndex.equals() incorrectly returning True when the MultiIndex
contained NaN even when they are differently ordered (GH38439)
* Bug in MultiIndex.intersection() always returning an empty result when
intersecting with CategoricalIndex (GH38653)
* Bug in MultiIndex.difference() incorrectly raising TypeError when indexes
contain non-sortable entries (GH41915)
* Bug in MultiIndex.reindex() raising a ValueError when used on an empty
MultiIndex and indexing only a specific level (GH41170)
* Bug in MultiIndex.reindex() raising TypeError when reindexing against a
flat Index (GH41707)
-------------------------------------------------------------------------------
I/O
* Bug in Index.__repr__() when display.max_seq_items=1 (GH38415)
* Bug in read_csv() not recognizing scientific notation if the argument
decimal is set and engine="python" (GH31920)
* Bug in read_csv() interpreting an NA value as a comment when the NA value
contains the comment string; fixed for engine="python" (GH34002)
* Bug in read_csv() raising an IndexError with multiple header columns and
index_col is specified when the file has no data rows (GH38292)
* Bug in read_csv() not accepting usecols with a different length than names
for engine="python" (GH16469)
* Bug in read_csv() returning object dtype when delimiter="," with usecols
and parse_dates specified for engine="python" (GH35873)
* Bug in read_csv() raising a TypeError when names and parse_dates is
specified for engine="c" (GH33699)
* Bug in read_clipboard() and DataFrame.to_clipboard() not working in WSL (
GH38527)
* Allow custom error values for the parse_dates argument of read_sql(),
read_sql_query() and read_sql_table() (GH35185)
* Bug in DataFrame.to_hdf() and Series.to_hdf() raising a KeyError when
trying to apply for subclasses of DataFrame or Series (GH33748)
* Bug in HDFStore.put() raising a wrong TypeError when saving a DataFrame
with non-string dtype (GH34274)
* Bug in json_normalize() resulting in the first element of a generator
object not being included in the returned DataFrame (GH35923)
* Bug in read_csv() applying the thousands separator to date columns when the
column should be parsed for dates and usecols is specified for engine=
"python" (GH39365)
* Bug in read_excel() forward filling MultiIndex names when multiple header
and index columns are specified (GH34673)
* Bug in read_excel() not respecting set_option() (GH34252)
* Bug in read_csv() not switching true_values and false_values for nullable
Boolean dtype (GH34655)
* Bug in read_json() when orient="split" not maintaining a numeric string
index (GH28556)
* read_sql() returned an empty generator if chunksize was non-zero and the
query returned no results. Now returns a generator with a single empty
DataFrame (GH34411)
* Bug in read_hdf() returning unexpected records when filtering on
categorical string columns using the where parameter (GH39189)
* Bug in read_sas() raising a ValueError when datetimes were null (GH39725)
* Bug in read_excel() dropping empty values from single-column spreadsheets (
GH39808)
* Bug in read_excel() loading trailing empty rows/columns for some filetypes
(GH41167)
* Bug in read_excel() raising an AttributeError when the excel file had a
MultiIndex header followed by two empty rows and no index (GH40442)
* Bug in read_excel(), read_csv(), read_table(), read_fwf(), and
read_clipboard() where one blank row after a MultiIndex header with no
index would be dropped (GH40442)
* Bug in DataFrame.to_string() misplacing the truncation column when index=
False (GH40904)
* Bug in DataFrame.to_string() adding an extra dot and misaligning the
truncation row when index=False (GH40904)
* Bug in read_orc() always raising an AttributeError (GH40918)
* Bug in read_csv() and read_table() silently ignoring prefix if names and
prefix are defined, now raising a ValueError (GH39123)
* Bug in read_csv() and read_excel() not respecting the dtype for a
duplicated column name when mangle_dupe_cols is set to True (GH35211)
* Bug in read_csv() silently ignoring sep if delimiter and sep are defined,
now raising a ValueError (GH39823)
* Bug in read_csv() and read_table() misinterpreting arguments when
sys.setprofile had been previously called (GH41069)
* Bug in the conversion from PyArrow to pandas (e.g. for reading Parquet)
with nullable dtypes and a PyArrow array whose data buffer size is not a
multiple of the dtype size (GH40896)
* Bug in read_excel() would raise an error when pandas could not determine
the file type even though the user specified the engine argument (GH41225)
* Bug in read_clipboard() where copying from an Excel file shifted values into
the wrong column if there were null values in the first column (GH41108)
* Bug in DataFrame.to_hdf() and Series.to_hdf() raising a TypeError when
trying to append a string column to an incompatible column (GH41897)
-------------------------------------------------------------------------------
Period
* Comparisons of Period objects or Index, Series, or DataFrame with
mismatched PeriodDtype now behave like other mismatched-type comparisons,
returning False for equals, True for not-equal, and raising TypeError for
inequality checks (GH39274)
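A minimal sketch of the mismatched-PeriodDtype comparison behavior described
above (assumes pandas >= 1.3; the example periods are illustrative only):

import pandas as pd

p1 = pd.Period("2021-01", freq="M")
p2 = pd.Period("2021-01-01", freq="D")

print(p1 == p2)   # False: mismatched PeriodDtype compares as not equal
print(p1 != p2)   # True
# Inequality checks such as p1 < p2 raise, per the entry above.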
-------------------------------------------------------------------------------
Plotting
* Bug in plotting.scatter_matrix() raising when 2d ax argument passed (
GH16253)
* Prevent warnings when Matplotlib's constrained_layout is enabled (GH25261)
* Bug in DataFrame.plot() showing the wrong colors in the legend if the
function was called repeatedly and some calls used yerr while others didn't
(GH39522)
* Bug in DataFrame.plot() showing the wrong colors in the legend if the
function was called repeatedly and some calls used secondary_y and others
used legend=False (GH40044)
* Bug in DataFrame.plot.box() when dark_background theme was selected, caps
or min/max markers for the plot were not visible (GH40769)
-------------------------------------------------------------------------------
Groupby/resample/rolling
* Bug in GroupBy.agg() with PeriodDtype columns incorrectly casting results
too aggressively (GH38254)
* Bug in SeriesGroupBy.value_counts() where unobserved categories in a
grouped categorical Series were not tallied (GH38672)
* Bug in SeriesGroupBy.value_counts() where an error was raised on an empty
Series (GH39172)
* Bug in GroupBy.indices() would contain non-existent indices when null
values were present in the groupby keys (GH9304)
* Fixed bug in GroupBy.sum() causing a loss of precision by now using Kahan
summation (GH38778)
* Fixed bug in GroupBy.cumsum() and GroupBy.mean() causing loss of precision
through using Kahan summation (GH38934)
* Bug in Resampler.aggregate() and DataFrame.transform() raising a TypeError
instead of SpecificationError when missing keys had mixed dtypes (GH39025)
* Bug in DataFrameGroupBy.idxmin() and DataFrameGroupBy.idxmax() with
ExtensionDtype columns (GH38733)
* Bug in Series.resample() would raise when the index was a PeriodIndex
consisting of NaT (GH39227)
* Bug in RollingGroupby.corr() and ExpandingGroupby.corr() where the groupby
column would return 0 instead of np.nan when providing other that was
longer than each group (GH39591)
* Bug in ExpandingGroupby.corr() and ExpandingGroupby.cov() where 1 would be
returned instead of np.nan when providing other that was longer than each
group (GH39591)
* Bug in GroupBy.mean(), GroupBy.median() and DataFrame.pivot_table() not
propagating metadata (GH28283)
* Bug in Series.rolling() and DataFrame.rolling() not calculating window
bounds correctly when window is an offset and dates are in descending order
(GH40002)
* Bug in Series.groupby() and DataFrame.groupby() on an empty Series or
DataFrame would lose index, columns, and/or data types when directly using
the methods idxmax, idxmin, mad, min, max, sum, prod, and skew or using
them through apply, aggregate, or resample (GH26411)
* Bug in GroupBy.apply() where a MultiIndex would be created instead of an
Index when used on a RollingGroupby object (GH39732)
* Bug in DataFrameGroupBy.sample() where an error was raised when weights was
specified and the index was an Int64Index (GH39927)
* Bug in DataFrameGroupBy.aggregate() and Resampler.aggregate() would
sometimes raise a SpecificationError when passed a dictionary and columns
were missing; will now always raise a KeyError instead (GH40004)
* Bug in DataFrameGroupBy.sample() where column selection was not applied
before computing the result (GH39928)
* Bug in ExponentialMovingWindow when calling __getitem__ would incorrectly
raise a ValueError when providing times (GH40164)
* Bug in ExponentialMovingWindow when calling __getitem__ would not retain
com, span, alpha or halflife attributes (GH40164)
* ExponentialMovingWindow now raises a NotImplementedError when specifying
times with adjust=False due to an incorrect calculation (GH40098)
* Bug in ExponentialMovingWindowGroupby.mean() where the times argument was
ignored when engine='numba' (GH40951)
* Bug in ExponentialMovingWindowGroupby.mean() where the wrong times were
used in the case of multiple groups (GH40951)
* Bug in ExponentialMovingWindowGroupby where the times vector and values
became out of sync for non-trivial groups (GH40951)
* Bug in Series.asfreq() and DataFrame.asfreq() dropping rows when the index
was not sorted (GH39805)
* Bug in aggregation functions for DataFrame not respecting numeric_only
argument when level keyword was given (GH40660)
* Bug in SeriesGroupBy.aggregate() where using a user-defined function to
aggregate a Series with an object-typed Index causes an incorrect Index
shape (GH40014)
* Bug in RollingGroupby where as_index=False argument in groupby was ignored
(GH39433)
* Bug in GroupBy.any() and GroupBy.all() raising a ValueError when using with
nullable type columns holding NA even with skipna=True (GH40585)
* Bug in GroupBy.cummin() and GroupBy.cummax() incorrectly rounding integer
values near the int64 implementation bounds (GH40767)
* Bug in GroupBy.rank() with nullable dtypes incorrectly raising a TypeError
(GH41010)
* Bug in GroupBy.cummin() and GroupBy.cummax() computing wrong result with
nullable data types too large to roundtrip when casting to float (GH37493)
* Bug in DataFrame.rolling() returning a mean of zero for an all-NaN window
with min_periods=0 if the calculation is not numerically stable (GH41053)
* Bug in DataFrame.rolling() returning a non-zero sum for an all-NaN window
with min_periods=0 if the calculation is not numerically stable (GH41053)
* Bug in SeriesGroupBy.agg() failing to retain ordered CategoricalDtype on
order-preserving aggregations (GH41147)
* Bug in GroupBy.min() and GroupBy.max() with multiple object-dtype columns
and numeric_only=False incorrectly raising a ValueError (GH41111)
* Bug in DataFrameGroupBy.rank() with the GroupBy object's axis=0 and the
rank method's keyword axis=1 (GH41320)
* Bug in DataFrameGroupBy.__getitem__() with non-unique columns incorrectly
returning a malformed SeriesGroupBy instead of DataFrameGroupBy (GH41427)
* Bug in DataFrameGroupBy.transform() with non-unique columns incorrectly
raising an AttributeError (GH41427)
* Bug in Resampler.apply() with non-unique columns incorrectly dropping
duplicated columns (GH41445)
* Bug in Series.groupby() aggregations incorrectly returning empty Series
instead of raising TypeError on aggregations that are invalid for its
dtype, e.g. .prod with datetime64[ns] dtype (GH41342)
* Bug in DataFrameGroupBy aggregations incorrectly failing to drop columns
with invalid dtypes for that aggregation when there are no valid columns (
GH41291)
* Bug in DataFrame.rolling.__iter__() where on was not assigned to the index
of the resulting objects (GH40373)
* Bug in DataFrameGroupBy.transform() and DataFrameGroupBy.agg() with engine=
"numba" where *args were being cached with the user passed function (
GH41647)
* Bug in DataFrameGroupBy methods agg, transform, sum, bfill, ffill, pad,
pct_change, shift, ohlc dropping .columns.names (GH41497)
-------------------------------------------------------------------------------
Reshaping
* Bug in merge() raising error when performing an inner join with partial
index and right_index=True when there was no overlap between indices (
GH33814)
* Bug in DataFrame.unstack() with missing levels led to incorrect index names
(GH37510)
* Bug in merge_asof() propagating the right Index with left_index=True and
right_on specification instead of left Index (GH33463)
* Bug in DataFrame.join() on a DataFrame with a MultiIndex returning the wrong
result when one of the two indexes had only one level (GH36909)
* merge_asof() now raises a ValueError instead of a cryptic TypeError in case
of non-numerical merge columns (GH29130)
* Bug in DataFrame.join() not assigning values correctly when the DataFrame
had a MultiIndex where at least one dimension had dtype Categorical with
non-alphabetically sorted categories (GH38502)
* Series.value_counts() and Series.mode() now return consistent keys in
original order (GH12679, GH11227 and GH39007)
* Bug in DataFrame.stack() not handling NaN in MultiIndex columns correctly (
GH39481)
* Bug in DataFrame.apply() would give incorrect results when the argument
func was a string, axis=1, and the axis argument was not supported; now
raises a ValueError instead (GH39211)
* Bug in DataFrame.sort_values() not reshaping the index correctly after
sorting on columns when ignore_index=True (GH39464)
* Bug in DataFrame.append() returning incorrect dtypes with combinations of
ExtensionDtype dtypes (GH39454)
* Bug in DataFrame.append() returning incorrect dtypes when used with
combinations of datetime64 and timedelta64 dtypes (GH39574)
* Bug in DataFrame.append() with a DataFrame with a MultiIndex and appending
a Series whose Index is not a MultiIndex (GH41707)
* Bug in DataFrame.pivot_table() returning a MultiIndex for a single value
when operating on an empty DataFrame (GH13483)
* Index can now be passed to the numpy.all() function (GH40180)
* Bug in DataFrame.stack() not preserving CategoricalDtype in a MultiIndex (
GH36991)
* Bug in to_datetime() raising an error when the input sequence contained
unhashable items (GH39756)
* Bug in Series.explode() preserving the index when ignore_index was True and
values were scalars (GH40487)
* Bug in to_datetime() raising a ValueError when Series contains None and NaT
and has more than 50 elements (GH39882)
* Bug in Series.unstack() and DataFrame.unstack() with object-dtype values
containing timezone-aware datetime objects incorrectly raising TypeError (
GH41875)
* Bug in DataFrame.melt() raising InvalidIndexError when DataFrame has
duplicate columns used as value_vars (GH41951)
-------------------------------------------------------------------------------
Sparse
* Bug in DataFrame.sparse.to_coo() raising a KeyError with columns that are a
numeric Index without a 0 (GH18414)
* Bug in SparseArray.astype() with copy=False producing incorrect results
when going from integer dtype to floating dtype (GH34456)
* Bug in SparseArray.max() and SparseArray.min() would always return an empty
result (GH40921)
-------------------------------------------------------------------------------
ExtensionArray
* Bug in DataFrame.where() when other is a Series with an ExtensionDtype (
GH38729)
* Fixed bug where Series.idxmax(), Series.idxmin(), Series.argmax(), and
Series.argmin() would fail when the underlying data is an ExtensionArray (
GH32749, GH33719, GH36566)
* Fixed bug where some properties of subclasses of PandasExtensionDtype were
improperly cached (GH40329)
* Bug in DataFrame.mask() where masking a DataFrame with an ExtensionDtype
raises a ValueError (GH40941)
-------------------------------------------------------------------------------
Styler
* Bug in Styler where the subset argument in methods raised an error for some
valid MultiIndex slices (GH33562)
* Styler rendered HTML output has seen minor alterations to support w3 good
code standards (GH39626)
* Bug in Styler where rendered HTML was missing a column class identifier for
certain header cells (GH39716)
* Bug in Styler.background_gradient() where text-color was not determined
correctly (GH39888)
* Bug in Styler.set_table_styles() where multiple elements in CSS-selectors
of the table_styles argument were not correctly added (GH34061)
* Bug in Styler where copying from Jupyter dropped the top left cell and
misaligned headers (GH12147)
* Bug in Styler.where() where kwargs were not passed to the applicable
callable (GH40845)
* Bug in Styler causing CSS to duplicate on multiple renders (GH39395,
GH40334)
-------------------------------------------------------------------------------
Other
* inspect.getmembers(Series) no longer raises an AbstractMethodError (GH38782
)
* Bug in Series.where() with numeric dtype and other=None not casting to nan
(GH39761)
* Bug in assert_series_equal(), assert_frame_equal(), assert_index_equal()
and assert_extension_array_equal() incorrectly raising when an attribute
has an unrecognized NA type (GH39461)
* Bug in assert_index_equal() with exact=True not raising when comparing
CategoricalIndex instances with Int64Index and RangeIndex categories (
GH41263)
* Bug in DataFrame.equals(), Series.equals(), and Index.equals() with
object-dtype containing np.datetime64("NaT") or np.timedelta64("NaT") (
GH39650)
* Bug in show_versions() where console JSON output was not proper JSON (
GH39701)
* pandas can now compile on z/OS when using xlc (GH35826)
* Bug in pandas.util.hash_pandas_object() not recognizing hash_key, encoding
and categorize when the input object type is a DataFrame (GH41404)
What's new in 1.2.5 (June 22, 2021)
These are the changes in pandas 1.2.5. See Release notes for a full changelog
including other versions of pandas.
-------------------------------------------------------------------------------
Fixed regressions
* Fixed regression in concat() between two DataFrame where one has an Index
that is all-None and the other is DatetimeIndex incorrectly raising (
GH40841)
* Fixed regression in DataFrame.sum() and DataFrame.prod() when min_count and
numeric_only are both given (GH41074)
* Fixed regression in read_csv() when using memory_map=True with a non-UTF8
encoding (GH40986)
* Fixed regression in DataFrame.replace() and Series.replace() when the
values to replace are a NumPy float array (GH40371)
* Fixed regression in ExcelFile() when a corrupt file is opened but not
closed (GH41778)
* Fixed regression in DataFrame.astype() with dtype=str failing to convert
NaN in categorical columns (GH41797)
Changes:
- show binary conversion output in octets for readability
- handle ^D
- quit the program on 'exit' or 'quit'
- fix broken terminal with calc as backend on "undefined input" (#36)
1.9
Highlights
The internal implementation of Matrix and other matrix classes (SparseMatrix etc) is now DomainMatrix. The ZZ and QQ domains are used for matrices with only integer or rational elements. Otherwise the new EXRAW domain is used. This should be backwards compatible although many internal methods and attributes are changed. At the time of this change the DomainMatrix routines are only used for addition and multiplication of matrices and some other simple low-level operations. Further changes will use DomainMatrix routines for operations like rref, det, lu etc and are expected to lead to big speedups for these computations. At this stage those big speedups are not realised but some basic operations such as indexing a matrix like M[0, 0] could potentially be slower. The new implementation can be much faster for most operations and is expected to lead to significant speed ups over the next few SymPy releases.
Leading term methods now raise PoleError at singularities. There was a long-standing issue of incorrect handling of leading term at singularities, where earlier, for compatibility reasons, the original expression itself was incorrectly returned. exp(1/x).as_leading_term(x) returned exp(1/x), but it does not have any leading term as x->0, so an error must be raised. Note that leadterm used to throw a ValueError even in the previous implementation as the original expression depends on the symbol x. A few examples of functions where this change would be visible - Pow, exp, log, factorial and gamma.
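A minimal sketch of the new behavior described above (assumes SymPy >= 1.9):

from sympy import exp, symbols

x = symbols('x')
try:
    exp(1/x).as_leading_term(x)   # previously returned exp(1/x) unchanged
except Exception as err:          # expected: PoleError, per the note above
    print(type(err).__name__, err)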
Version 2.16
Fixed a ValueError raised in the excerpt command when an ephemeris segment needs to be entirely skipped because it has no overlap with the user-specified range of dates.
Added a __version__ constant to the package’s top level.
Upstream changes:
1.999827 2021-10-03
* Improve error message for missing library argument.
* Skip tests that don't work on older Perls. Also skip tests that compare
floating point numbers.
1.999826 2021-10-01
* Improve documentation related to floating point literals.
* Skip tests that fail due to Perl's broken handling of floating point literals
before v5.32.0.
Version 1.0.3 Release Notes (October 14, 2021)
==============================================
Potentially breaking change:
- argument ``x`` is now required for the ``guess`` method of Models
To get reasonable estimates for starting values one should always supply both ``x`` and ``y`` values; in some cases it would work
when only providing ``data`` (i.e., y-values). With the change above, ``x`` is now required in the ``guess`` method call, so scripts might
need to be updated to explicitly supply ``x``.
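A minimal sketch of updating a script for this change, using the built-in
GaussianModel (the sample data is illustrative only):

import numpy as np
from lmfit.models import GaussianModel

x = np.linspace(-5, 5, 201)
y = np.exp(-x**2 / 2) + np.random.normal(scale=0.02, size=x.size)

model = GaussianModel()
# params = model.guess(y)        # may have worked before; now requires x
params = model.guess(y, x=x)     # x must be supplied explicitly
result = model.fit(y, params, x=x)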
Bug fixes/enhancements:
- do not overwrite user-specified figure titles in Model.plot() functions and allow setting with ``title`` keyword argument
- preserve Parameters subclass in deepcopy
- coerce ``data`` and ``independent_vars`` to NumPy array with ``dtype=float64`` or ``dtype=complex128`` where applicable
- fix collision between parameter names in built-in models and user-specified parameters
- correct error message in PolynomialModel
- improved handling of altered JSON data
- map ``max_nfev`` to ``maxiter`` when using ``differential_evolution``
- correct use of noise versus experimental uncertainty in the documentation
- specify return type of ``eval`` method more precisely and allow for plotting of (Complex)ConstantModel by coercing their
``float``, ``int``, or ``complex`` return value to a ``numpy.ndarray``
- fix ``dho`` (Damped Harmonic Oscillator) lineshape
- reset ``Minimizer._abort`` to ``False`` before starting a new fit
- fix typo in ``guess_from_peak2d``
Various:
- update asteval dependency to >= 0.9.22 to avoid DeprecationWarnings from NumPy v1.20.0
- remove incorrectly spelled ``DonaichModel`` and ``donaich`` lineshape, deprecated in version 1.0.1
- remove occurrences of OrderedDict throughout the code; dict is order-preserving since Python 3.6
- update the contributing instructions
- (again) defer import of matplotlib to when it is needed
- fix description of ``name`` argument in ``Parameters.add``
- update dependencies, make sure a functional development environment is installed on Windows
- use ``setuptools_scm`` for version info instead of ``versioneer``
- transition to using ``f-strings``
- mark ``test_manypeaks_speed.py`` as flaky to avoid intermittent test failures
- update scipy dependency to >= 1.14.0
- improvement to output of examples in sphinx-gallery and use higher resolution figures
- remove deprecated functions ``lmfit.printfuncs.report_errors`` and ``asteval`` argument in ``Parameters`` class
Version 1.0.2 Release Notes (February 7, 2021)
==============================================
Version 1.0.2 officially supports Python 3.9 and has dropped support for Python 3.5. The minimum version
of the following dependencies were updated: asteval>=0.9.21, numpy>=1.18, and scipy>=1.3.
New features:
- added two-dimensional Gaussian lineshape and model
- all built-in models are now registered in ``lmfit.models.lmfit_models``; new Model class attribute ``valid_forms``
- added a SineModel
- add the ``run_mcmc_kwargs`` argument to ``Minimizer.emcee`` to pass to the ``emcee.EnsembleSampler.run_mcmc`` function
Bug fixes:
- ``ModelResult.eval_uncertainty`` should use provided Parameters
- center in lognormal model can be negative
- restore best-fit values after calculation of covariance matrix
- add helper-function ``not_zero`` to prevent ZeroDivisionError in lineshapes and use in exponential lineshape
- save ``last_internal_values`` and use to restore internal values if fit is aborted
- dumping a fit using the ``lbfgsb`` method now works, convert bytes to string if needed
- fix use of callable Jacobian for scalar methods
- preserve float/int types when encoding for JSON
- better support for saving/loading of ExpressionModels and assure that ``init_params`` and ``init_fit`` are set when loading a ``ModelResult``
Various:
- update minimum dependencies
- improvements in coding style, docstrings, CI, and test coverage
- fix typo in Oscillator
- add example using SymPy
- allow better custom pool for emcee()
- update NIST Strd reference functions and tests
- make building of documentation cross-platform
- relax module name check in ``test_check_ast_errors`` for Python 3.9
- fix/update layout of documentation, now uses the sphinx13 theme
- fixed DeprecationWarnings reported by NumPy v1.20.0
- increase value of ``tiny`` and check for it in bounded parameters to avoid "parameter not moving from initial value"
- add ``max_nfev`` to ``basinhopping`` and ``brute`` (now supported everywhere in lmfit) and set to more uniform default values
- use Azure Pipelines for CI, drop Travis
SciPy 1.7.2 is a bug-fix release with no new features
compared to 1.7.1. Notably, the release includes wheels
for Python 3.10, and wheels are now built with a newer
version of OpenBLAS, 0.3.17. Python 3.10 wheels are provided
for MacOS x86_64 (thin, not universal2 or arm64 at this time),
and Windows/Linux 64-bit. Many wheels are now built with newer
versions of manylinux, which may require newer versions of pip.
NumPy 1.21.4 is a maintenance release that fixes a few bugs
discovered after 1.21.3. The most important fix here is a fix for the
NumPy header files to make them work for both x86_64 and M1 hardware
when included in the Mac universal2 wheels. Previously, the header files
only worked for M1 and this caused problems for folks building x86_64
extensions. This problem was not seen before Python 3.10 because there
were thin wheels for x86_64 that had precedence. This release also
provides thin x86_64 Mac wheels for Python 3.10.
Upstream changes:
CHANGES IN R 4.1.2:
C-LEVEL FACILITIES:
* The workaround in headers R.h and Rmath.h (using namespace std;)
for the Oracle Developer Studio compiler is no longer needed now that
C++11 is required, so it has been removed. A couple more usages of
log() (which should have been std::log()) with an int argument
were reported on Solaris.
* The undocumented limit of 4095 bytes on messages from the
S-compatibility macros PROBLEM and MESSAGE is now documented and
longer messages will be silently truncated rather than
potentially causing segfaults.
* If the R_NO_SEGV_HANDLER environment variable is non-empty, the
signal handler for SEGV/ILL/BUS signals (which offers recovery
user interface) is not set. This allows more reliable debugging
of crashes that involve the console.
DEPRECATED AND DEFUNCT:
* The legacy S-compatibility macros PROBLEM, MESSAGE, ERROR, WARN,
WARNING, RECOVER, ... are deprecated and will be hidden in R
4.2.0. R's native interface of Rf_error and Rf_warning has long
been preferred.
BUG FIXES:
* .mapply(F, dots, .) no longer segfaults when dots is not a list
and uses match.fun(F) as always documented; reported by Andrew
Simmons in PR#18164.
* hist(<Date>, ...) and hist(<POSIXt>, ...) no longer pass
arguments for rect() (such as col and density) to axis().
(Thanks to Sebastian Meyer's PR#18171.)
* \Sexpr{ch} now preserves Encoding(ch). (Thanks to report and
patch by Jeroen Ooms in PR#18152.)
* Setting the RNG to "Marsaglia-Multicarry" e.g., by RNGkind(), now
warns in more places, thanks to André Gillibert's report and
patch in PR#18168.
* gray(numeric(), alpha=1/2) no longer segfaults, fixing PR#18183,
reported by Till Krenz.
* Fixed dnbinom(x, size=<very_small>, .., log=TRUE) regression,
reported by Martin Morgan.
* as.Date.POSIXlt(x) now keeps names(x), thanks to Davis Vaughan's
report and patch in PR#18188.
* model.response() now strips an "AsIs" class typically, thanks to
Duncan Murdoch's report and other discussants in PR#18190.
* try() is considerably faster in case of an error and long call,
as e.g., from some do.call(). Thanks to Alexander Kaever's
suggestion posted to R-devel.
* qqline(y = <object>) such as y=I(.), now works, see also
PR#18190.
* Non-integer mgp par() settings are now handled correctly in
axis() and mtext(), thanks to Mikael Jagan and Duncan Murdoch's
report and suggestion in PR#18194.
* formatC(x) returns length zero character() now, rather than ""
when x is of length zero, as documented, thanks to Davis
Vaughan's post to R-devel.
* removeSource(fn) now retains (other) attributes(fn).
0.9.25
Fixes import errors for Python 3.6 and 3.7, setting the version with importlib_metadata.version if available.
Also fixes CI testing with GitHub Actions so that the proper version of Python is actually used in the tests.
0.9.24
use setuptools_scm and importlib for version
SciPy 1.7.1 is a bug-fix release with no new features compared to 1.7.0.
1.7.0:
A new submodule for quasi-Monte Carlo, scipy.stats.qmc, was added
The documentation design was updated to use the same PyData-Sphinx theme as NumPy and other ecosystem libraries.
We now vendor and leverage the Boost C++ library to enable numerous improvements for long-standing weaknesses in scipy.stats
scipy.stats has six new distributions, eight new (or overhauled) hypothesis tests, a new function for bootstrapping, a class that enables fast random variate sampling and percentile point function evaluation, and many other enhancements.
cdist and pdist distance calculations are faster for several metrics, especially weighted cases, thanks to a rewrite to a new C++ backend framework
A new class for radial basis function interpolation, RBFInterpolator, was added to address issues with the Rbf class.
1.21.0
New functions
Add PCG64DXSM BitGenerator
Deprecations
The .dtype attribute must return a dtype
Inexact matches for numpy.convolve and numpy.correlate are deprecated
np.typeDict has been formally deprecated
Exceptions will be raised during array-like creation
Four ndarray.ctypes methods have been deprecated
Expired deprecations
Remove deprecated PolyBase and unused PolyError and PolyDomainError
Compatibility notes
Error type changes in universal functions
__array_ufunc__ argument validation
__array_ufunc__ and additional positional arguments
Validate input values in Generator.uniform
/usr/include removed from default include paths
Changes to comparisons with dtype=...
Changes to dtype and signature arguments in ufuncs
Ufunc signature=... and dtype= generalization and casting
Distutils forces strict floating point model on clang
C API changes
Use of ufunc->type_resolver and “type tuple”
New Features
Added a mypy plugin for handling platform-specific numpy.number precisions
Let the mypy plugin manage extended-precision numpy.number subclasses
New min_digits argument for printing float values
f2py now recognizes Fortran abstract interface blocks
BLAS and LAPACK configuration via environment variables
A runtime-subscriptable alias has been added for ndarray
Improvements
Arbitrary period option for numpy.unwrap
np.unique now returns single NaN
Generator.rayleigh and Generator.geometric performance improved
Placeholder annotations have been improved
Performance improvements
Improved performance in integer division of NumPy arrays
Improve performance of np.save and np.load for small arrays
Changes
numpy.piecewise output class now matches the input class
Enable Accelerate Framework
Upstream changes:
2021.06.23: Changes between NTL 11.5.0 and 11.5.1
Fixed bug that prevented compilation on IBM Z.
2021.06.20: Changes between NTL 11.4.4 and 11.5.0
Added a new configuration option NTL_RANDOM_AES256CTR. The default is off. Configure with NTL_RANDOM_AES256CTR=on to replace the default ChaCha20 Pseudo-Random Number Generator (PRNG) with 256-bit AES counter mode. On certain platforms (modern x86 and IBM System/390x), special instructions are exploited to improve performance.
Using AES in place of ChaCha may break inter-operability of applications that depend on the behavior of the PRNG.
Using AES in place of ChaCha may affect the performance positively or negatively. On IBM System/390x, there is a marked performance improvement. On x86 there may be a moderate performance improvement or degradation. On any other platform, where there is no hardware support for AES (or none that is exploited by NTL), there will likely be a marked performance degradation.
Thanks to Patrick Steuer for contributing this code.
Release 1.3.2
CI: fix building wheels on GHA
* ci: fix wheel build command
* ci: remove references to submodules
* ci: fix sdist command and remove Python 3.6 from the matrix
* ci: slightly alter invocation
* ci: disable emulation
* ci: smaller matrix
* ci: use a small matrix but with all python versions
* ci: use manylinux 2010 for CPython 3.9+
* ci: split again matrix per python version given how slow emulation is
Fix also the artifact upload
* ci: fix typo
* ci: typo
Pythran is an ahead of time compiler for a subset of the Python language, with
a focus on scientific computing. It takes a Python module annotated with a few
interface descriptions and turns it into a native Python module with the same
interface, but (hopefully) faster.
Upstream changes:
1.999825 2021-09-28
* Make Math::BigInt accept integers regardless of whether they are written as
decimal, binary, octal, or hexadecimal integers or decimal, binary, octal, or
hexadecimal floating point numbers.
* When numeric constants are overloaded (with the ":constant" option) in
Math::BigInt, every numeric constant that represents an integer is converted
to an object regardless of how it is written. All finite non-integers are
converted to a NaN.
* When numeric constants are overloaded (with the ":constant" option) in
Math::BigFloat, every numeric constant is converted to an object regardless
of how it is written.
* Add method from_dec() (cf. from_bin(), from_oct(), and from_hex()). It is
like new() except that it does not accept anything but a string representing a
finite decimal number.
1.999824 2021-09-20
* Don't allow mixing math libraries. Use the first backend math library that is
successfully loaded, and ignore any further attempts at loading a different
backend library. This is a solution to the recurring problem of combining
objects that use different math libraries.
* Add missing documentation.
* Miscellaneous minor improvements.
1.999823 2021-07-12
* Improve the handling of the backend libraries. Provide more useful warnings
and error messages. Update the documentation.
1.999822 2021-07-09
* Make the from_hex(), from_oct(), and from_bin() methods consistent with
CORE::oct(), which does not require a leading "0" before the letter ("x",
"o", or "b").
* Make the from_oct() and new() methods accept octal numbers with prefix
"0o", "0O", "o" (lowercase letter o), and "O" (capital letter O).
* Make the from_bin() and new() methods accept binary numbers with
prefix "0b", "0B", "b", and "B".
* Make the from_hex() and new() methods accept hexadecimal numbers with
prefix "0x", "0X", "x", and "X".
* Update test files to match with the above.
1.999821 2021-07-06
* Make new() and from_hex() accept the "0X" prefix, not just the "0x" prefix,
but not accept just "X" or "x". Now, "0XFF" returns 255, not NaN.
* Make new() and from_bin() accept the "0B" prefix, not just the "0b" prefix, but
not accept just "B" or "b". Now, "0B1111" returns 15, not NaN.
* Make new() and from_oct() accept the "0o" and "0O" prefixes, but not accept
just "O" (capital letter O) or "o" (lowercase letter o). Now, "0o377" and
"0O377" return 255, not NaN. Also intepret floating point numbers with a
leading zero and a binary exponent as an octal number, so that "01.4p0"
returns 1.5, not NaN. There is still no ambiguety, since decimal floating
point numbers use "e" or "E" before the exponent, and binary and hexadecimal
floating point numbers use a "0b"/"0B" or "0x"/"0x" prefix, respectively.
1.999820 2021-07-06
* Fix bug and improve error messages in Math::BigInt::import().
1.999819 2021-07-02
* Add method btfac() (triple factorial) and bmfac() (multi-factorial),
including tests and documentation.
* Add missing and correct erroneous documentation for bfac() (factorial)
and bdfac() (double factorial). Also correct handling of special cases
and add tests for these cases.
* Fix error in bsin() and bcos() causing them to hang indefinitely if the
invocand is +/-inf.
* Make it possible for the end user to specify the base length used internally
in Math::BigInt::Calc.
FFTW 3.3.10:
* Fix bug that would cause 2-way SIMD (notably SSE2 in double precision)
to attempt unaligned accesses in certain obscure cases, causing
segfaults.
The following test triggers the bug (SSE2, double precision):
./tests/bench -oexhaustive r4*2:5:3
This test computes a pair of length-4 real->complex transforms where
the second input is 5 real numbers away from the first input. That
is, there is a gap of one real number between the first and second
input array. The -oexhaustive level allows FFTW to attempt to
compute this transform by reducing it to a pair of complex
transforms of length 2, but now the second input is not aligned to a
complex-number boundary. The fact that 5 is odd is the problem.
The bug cannot occur in complex->complex transforms because the
complex interface accepts strides in units of complex numbers, so
strides are aligned by construction.
This bug has been around at least since fftw-3.1.2 (July 2006), and
probably since fftw-3.0 (2003).
A tool to provide an easy, intuitive and consistent access to
information contained in various R models, like model formulas, model
terms, information about random effects, data that was used to fit the
model or data from response variables. 'insight' mainly revolves
around two types of functions: Functions that find (the names of)
information, starting with 'find_', and functions that get the
underlying data, starting with 'get_'. The package has a consistent
syntax and works with many different model objects, where functions to
access this information are otherwise missing.
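A minimal sketch of the 'find_*' / 'get_*' split described above, using an
ordinary lm() fit as the model object; the accessors shown are among those
the package exports.
```{r}
library(insight)
m <- lm(mpg ~ wt + cyl, data = mtcars)
find_formula(m)     # find the model formula
find_predictors(m)  # find the names of the predictors
get_data(m)         # get the data used to fit the model
get_response(m)     # get the response variable
```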
29 February 2020:
* Replaced deprecated functions from the testthat framework in unit tests (contributed by Avraham Adler).
26 February 2020:
* Fixed warnings (as requested by CRAN): R CMD config variables 'CPP' and 'CXXCPP' are deprecated.
20 October 2018:
* Exposed CCSAQ algorithm in R interface (contributed by Julien Chiquet).
03 October 2018:
* Build process was changed to solve issues on several OS (many thanks to the CRAN maintainers).
4.1-1 CRAN
4.1 svyquantile() has been COMPLETELY REWRITTEN. The old version is available
as oldsvyquantile() (for David Eduardo Jorquera Petersen)
svycontrast()'s improvements for statistics with replicates are now also available with
svyby(), for domain comparisons (Robert Baskin)
svyttest() now gives an error message if the binary group variable isn't binary
(for StackOverflow 60930323)
confint.svyglm Wald-type intervals now correctly label the columns (eg 2.5%, 97.5%)
(for Molly Petersen)
svyolr() using linearisation had the wrong standard errors for intercepts
other than the first, if extracted using vcov (it was correct in summary() output)
svyglm() gave deffs that were too large by a factor of nrow(design). (Adrianne Bradford)
svycoxph() now warns if you try to use frailty or other penalised terms, because they
just come from calling coxph and I have no reason to believe they work correctly
in complex samples (for Claudia Rivera)
coef.svyglm() now has a complete= argument to match coef.default(). (for Thomas Leeper)
summary.svyglm() now gives NA p-values and a warning, rather than Inf standard errors,
when the residual df are zero or negative (for Dan Simpson and Lauren Kennedy)
In the multigroup case, svyranktest() now documents which elements of the 'htest'
object have which parts of the result, because it's a bit weird (for Justin Allen)
svycontrast() gets a new argument add=TRUE to keep the old coefficients as well
twophase() can now take strata= arguments that are character, not just factor
or numeric. (for Pam Shaw)
add reference to Chen & Lumley on tail probabilities for quadratic forms.
add reference to Breslow et al for calibrate()
add svyqqplot and svyqqmath for quantile-quantile plots
SE.svyby would grab confidence interval limits instead of SEs if vartype=c("ci","se").
svylogrank(method="small") was wrong (though method="score" and method="large" are ok),
because of problems in obtaining the at-risk matrix from coxph.detail. (for Zhiwen Yao)
added as.svrepdesign.svyimputationList and withReplicates.svyimputationList
(for Ángel Rodríguez Laso)
logLik.svyglm used to return the deviance and now divides it by -2
svybys() to make multiple tables by separate variables rather than a joint table
(for Hannah Evans)
added predictat= option to svypredmeans for Steven Johnston.
Fixed bug in postStratify.svyrep.design, was reweighting all reps the same (Steven Johnston)
Fix date for Thomas & Rao (1987) (Neil Diamond)
Add svygofchisq() for one-sample chisquared goodness of fit (for Natalie Gallagher)
confint.svyglm(method="Wald") now uses t distribution with design df by default.
(for Ehsan Karim)
confint.svyglm() checks for zero/negative degrees of freedom
mrb bootstrap now doesn't throw an error when there's a single PSU in a stratum
(Steve White)
oldsvyquantile() bug with producing replicate-weight confidence intervals for
multiple quantiles (Ben Schneider)
regTermTest(,method="LRT") didn't work if the survey design object and model were
defined in a function (for Keiran Shao)
svyglm() has clearer error message when the subset= argument contains NAs (for Pam Shaw)
and when the weights contain NAs (for Paige Johnson)
regTermTest was dropping the first term for coxph() models (Adam Elder)
svydesign() is much faster for very large datasets with character ids or strata.
svyglm() now works with na.action=na.exclude (for Terry Therneau)
extractAIC.svylm does the design-based AIC for the two-parameter Gaussian model,
estimating the variance parameter as well as the regression parameters.
(for Benmei Liu and Barry Graubard)
svydesign(, pps=poisson_sampling()) for Poisson sampling, and ppscov() for
specifying PPS design with weighted or unweighted covariance of sampling indicators
(for Claudia Rivera Rodriguez)
4.0 Some (and eventually nearly all) functions now return influence functions when
called with a survey.design2 object and the influence=TRUE option. These allow
svyby() to estimate covariances between domains, which could previously only be
done for replicate-weight designs, and so allow svycontrast() to do domain contrasts
- svymean, svytotal, svyratio, svymle, svyglm, svykappa
Nonlinear least squares with svynls() now available
Document that predict.svyglm() doesn't use a rescaled residual mean square
to estimate standard errors, and so disagrees with some textbooks. (for Trent Buskirk)
3.38 When given a statistic including replicates, svycontrast() now transforms the replicates
and calculates the variance, rather than calculating the variance then using the
delta method. Allows geometric means to exactly match SAS/SUDAAN (for Robert Baskin)
vcov.svyrep.design to simplify computing variances from replicates (for William Pelham)
svykm() no longer throws an error with single-observation domains (for Guy Cafri)
Documentation for svyglm() specifies that it has always returned
model-robust standard errors. (for various people wanting to fit relative risk
regression models).
3.37 RODBC database connections are no longer supported.
Use the DBI-compatible 'odbc' package
set scale<-1 if it is still NULL after processing, inside svrepdesign()
[https://stats.stackexchange.com/questions/409463]
Added withPV for replicate-weight designs [for Tomasz Żółtak]
svyquantile for replicate-weight designs now uses a supplied alpha to get
confidence intervals and estimates SE by dividing confidence interval length
by twice abs(qnorm(alpha/2)). [For Klaus Ignacio Lehmann Melendez]
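A minimal sketch of that SE rule in plain R (the alpha and interval limits below
are made up, not survey output):
```{r}
alpha <- 0.05
ci <- c(2.1, 2.9)                            # hypothetical CI for a quantile
SE <- diff(ci) / (2 * abs(qnorm(alpha / 2)))
SE                                           # roughly 0.204
```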
All the svyquantile methods now take account of design degrees of freedom and
use t distributions for confidence intervals. Specify df=Inf to get a Normal.
[For Klaus Ignacio Lehmann Melendez]
svyivreg() for 2-stage least-squares (requires the AER package)
warn when rho= is used with type="BRR" in svrepdesign [for Tomasz Żółtak]
Add "ACS" and "successive-difference" to type= in svrepdesign(),
for the American Community Survey weights
Add "JK2" to type= in svrepdesign
Warn when scale, rscales are supplied unnecessarily to svrepdesign
More explanation of 'symbolically nested' in anova.svyglm
Link to blog post about design df with replicate weights.
Chase 'Encyclopedia of Design Theory' link again.
# tibble 3.1.4
## Features
- `as.data.frame.tbl_df()` strips inner column names (#837).
- `new_tibble()` allows omitting the `nrow` argument again (#781).
## Documentation
- Move `vignette("digits")`, `vignette("numbers")`, `?num` and `?char`
from the pillar package here (#913).
- Replace `iris` by `trees` (#943).
- Various documentation improvements.
- New `?tibble_options` help page (#912).
## Performance
- `x[i, j] <- one_row_value` avoids explicit recycling of the
right-hand side, the recycling happens implicitly in
`vctrs::vec_assign()` for performance (#922).
## Internal
- Vignettes are now tested with a snapshot test (#919).
- `new_tibble()` uses `vctrs::new_data_frame()` internally (#726,
@DavisVaughan).
- Adapt to pillar 1.6.2.
- Fix tests for compatibility with pillar 1.6.2.
# tibble 3.1.3
## Bug fixes
- `tbl[row, col] <- rhs` treats an all-`NA` logical vector as a
missing value both for existing data (#773) and for the right-hand
side value (#868). This means that a column initialized with `NA`
(of type `logical`) will change its type when a row is updated to a
value of a different type.
- `[[<-()` supports symbols (#893).
## Features
- `as_tibble_row()` supports arbitrary vectors (#797), as sketched after this list.
- `enframe()` and `deframe()` support arbitrary vectors (#730).
- `tibble()` and `tibble_row()` ignore all columns that evaluate to
`NULL`, not only those where a verbatim `NULL` is passed (#895,
#900).
- `new_tibble()` is now faster (#901, @mgirlich).
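A minimal sketch of the row/vector helpers mentioned above; the 3.1.3 change
extends them to a wider range of vector types, and the values here are made up.
```{r}
library(tibble)
as_tibble_row(c(x = 1, y = 2))      # one-row tibble from a named vector
enframe(c(a = 1, b = 2))            # two-column tibble with `name` and `value`
deframe(enframe(c(a = 1, b = 2)))   # and back to a named vector
```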
## Internal
- Establish compatibility with rlang > 0.4.11 (#908).
- Use `pillar::dim_desc()` (#859).
- Establish compatibility with testthat > 3.0.3 (#896, @lionel-).
- Bump required versions of ellipsis and vctrs to avoid warning during
package load.
CHANGES in robustbase VERSION 0.93-8 (2021-06-01, svn r879):
NEW FEATURES:
* 'scaleTau2()' gets new optional 'iter = 1' and 'tol.iter'
arguments; mostly experimentally to see if or when iteration
makes sense.
* 'Qn(x, *)' gets new optional 'k = .' to indicate the
"quantile" i.e., order statistic to be computed (with
default as previously hard-coded).
Experimentally to try for cases where more than n/2
observations coincide (with the median), i.e., 'x[i] == x0
== median(x[])', and hence 'Qn(x)' and 'mad(x)' are zero.
* 'adjOutlyingness()' gets new option 'IQRtype = 7'.
Tweaks:
* For tests: *again* differences found in the non-sensical
'adjOutlyingness()' example (with large p/n, hence many
"random" values in the order of 1e15). Disable the test for
now (and record the result in *.Rout).
BUG FIXES:
* The 'test()' utility in 'tests/lmrob-ex12.R' no longer calls
'matrix(x, n,4)' where the length of x does not match '4n'.
Similar change in 'tests/mc-strict.R'
CHANGES in robustbase VERSION 0.93-7 (2021-01-03, svn r865):
NEW FEATURES:
* Use '\CRANpkg{.}' in most places, providing web links to the
respective CRAN package page.
* 'adjOutlyingness()' now gains optional parameters to be
passed to 'mc()'.
BUG FIXES:
* update the internal man page, so new 'checkRdContents()' is
happy.
* fix several '\url{.}''s that now are diagnosed as 'moved'.
* 'adjOutlyingness()' finally works with 'p.samp > p'.
* 'scaleTau2()' now works with 'Inf' and very large values,
and obeys new 'na.rm = FALSE' argument.
* add 'check.environment=FALSE' to some of the 'all.equal()'
calls (for 'R-devel', i.e., future R 4.1.x).
* 'wgt.himedian(numeric())' now returns 'NA' instead of
occasionally seg.faulting or inf.looping. Ditto for a case
when called from 'Qn()'.
CHANGES in robustbase VERSION 0.93-6 (2020-03-20, svn r854):
NEW FEATURES:
* 'splitFrame()' now treats 'character' columns also as
categorical (the same as 'factor's).
Tweaks:
* Small updates, also in checks for newer compiler settings,
e.g., 'FCLEN' macro; also F77_*() etc, in order to fix 'LTO'
issues.
* Call 'intpr()' more carefully, or _less_ often: correct "Rank" of
array (for gfortran/gcc 10, when '-fallow-argument-mismatch'
is not set).
2021-07-26 Tomoaki NISHIYAMA <tomoakin@staff.kanazawa-u.ac.jp>
* Change LICENCE to GPL-3
* import new config.guess and config.sub
* Drop an unused variable RS_PostgreSQL_closeManager_t
* Use seq_along() instead of seq(along=)
* -Wno-stringop-truncation for libpq compilation on windows
* Change Description for new version and license.
* fix type as pointed out by PR #109
* http to https transition for URLs
Version 2.5-2, 2021-08-20
* Support hdf5 filters via multi-filter interface (netcdf>=4.8.0)
* Windows: update binary packages to netcdf 4.7.4 with OpenDAP
* Generate type conversions with m4 macros
* Reduce CPU time for utcal.nc example to pass CRAN checks
pbkrtest v0.5.1 (Release date: 2021-03-09)
============================================
Changes
* Improved documentation
pbkrtest v0.5-0.0 (Release date: 2020-08-04)
============================================
Changes
* Satterthwaite approximation added via the SATmodcomp function; see the sketch after these notes.
* Checks that models are nested are no longer performed for the parametric
bootstrap. The reason is that the simr package uses the parametric
bootstrap for testing whether variance components are zero.
* doi added to DESCRIPTION file
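A hedged sketch of the SATmodcomp() comparison mentioned above; the sleepstudy
data come from lme4 and are used only for illustration.
```{r}
library(lme4)
library(pbkrtest)
large <- lmer(Reaction ~ Days + (Days | Subject), data = sleepstudy)
small <- lmer(Reaction ~ 1 + (Days | Subject), data = sleepstudy)
SATmodcomp(large, small)   # Satterthwaite-based F test of the Days effect
```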
pbkrtest v0.4-8.6 (Release date: 2020-02-20)
============================================
Bug fixes:
* documentation fixed; ddf_Lb is now exported
* mclapply issue for Windows fixed
* vcovAdj.lmerMod is exported to make emmeans work. Contact Russ Lenth
to make emmeans use the generic function vcovAdj.
pbkrtest v0.4-8 (Release date: 2020-02-20)
==========================================
Bug fixes:
* Issue related to class() versus inherits() fixed.
Changes:
* NEWS file added
* NAMESPACE file is now generated automatically
Summarizes key information about statistical objects in tidy tibbles.
This makes it easy to report results, create plots and consistently
work with large numbers of models at once. Broom provides three verbs,
each of which returns a different type of information about a model. tidy()
summarizes information about model components such as coefficients of
a regression. glance() reports information about an entire model, such
as goodness of fit measures like AIC and BIC. augment() adds
information about individual observations to a dataset, such as fitted
values or influence measures.
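A minimal sketch of the three verbs applied to an ordinary linear model:
```{r}
library(broom)
fit <- lm(mpg ~ wt, data = mtcars)
tidy(fit)      # one row per coefficient (estimate, std.error, p.value, ...)
glance(fit)    # one row per model (r.squared, AIC, BIC, ...)
augment(fit)   # one row per observation (.fitted, .resid, ...)
```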
# tidyr 1.1.3
* tidyr verbs no longer have "default" methods for lazyeval fallbacks. This
means that you'll get clearer error messages (#1036).
* `uncount()` errors for non-integer weights and gives a clearer error message
for negative weights (@mgirlich, #1069).
* You can once again unnest dates (#1021, #1089).
* `pivot_wider()` works with data.table and empty key variables (@mgirlich, #1066).
* `separate_rows()` works for factor columns (@mgirlich, #1058).
# tidyr 1.1.2
* `separate_rows()` returns to 1.1.0 behaviour for empty strings
(@rjpatm, #1014).
# tidyr 1.1.1
* New tidyr logo!
* stringi dependency has been removed; this was a substantial dependency that
made tidyr hard to compile in resource-constrained environments
(@rjpat, #936).
* Replace Rcpp with cpp11. See <https://cpp11.r-lib.org/articles/motivations.html>
for reasons why.
# tidyr 1.1.0
## General features
* `pivot_longer()`, `hoist()`, `unnest_wider()`, and `unnest_longer()` gain
new `transform` arguments; these allow you to transform values "in flight".
They are partly needed because vctrs coercion rules have become stricter,
but they give you greater flexibility than was available previously (#921).
* Arguments that use tidy selection syntax are now clearly documented and
have been updated to use tidyselect 1.1.0 (#872).
## Pivoting improvements
* Both `pivot_wider()` and `pivot_longer()` are considerably more performant,
thanks largely to improvements in the underlying vctrs code
(#790, @DavisVaughan).
* `pivot_longer()` now supports `names_to = character()` which prevents the
name column from being created (#961).
```{r}
df <- tibble(id = 1:3, x_1 = 1:3, x_2 = 4:6)
df %>% pivot_longer(-id, names_to = character())
```
* `pivot_longer()` no longer creates a `.copy` variable in the presence of
duplicate column names. This makes it more consistent with the handling
of non-unique specs.
* `pivot_longer()` automatically disambiguates non-unique outputs, which can
occur when the input variables include some additional component that you
don't care about and want to discard (#792, #793).
```{r}
df <- tibble(id = 1:3, x_1 = 1:3, x_2 = 4:6)
df %>% pivot_longer(-id, names_pattern = "(.)_.")
df %>% pivot_longer(-id, names_sep = "_", names_to = c("name", NA))
df %>% pivot_longer(-id, names_sep = "_", names_to = c(".value", NA))
```
* `pivot_wider()` gains a `names_sort` argument which allows you to sort
column names in order. The default, `FALSE`, orders columns by their
first appearance (#839). In a future version, I'll consider changing the
default to `TRUE`.
* `pivot_wider()` gains a `names_glue` argument that allows you to construct
output column names with a glue specification.
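A hedged sketch of both arguments, using made-up toy data (the `tibble()`
constructor and pipe are the ones used in the examples above):
```{r}
library(tidyr)
df <- tibble(id = c(1, 1, 2, 2), name = c("b", "a", "b", "a"), value = 1:4)
df %>% pivot_wider(names_from = name, values_from = value, names_sort = TRUE)
df %>% pivot_wider(names_from = name, values_from = value, names_glue = "{name}_value")
```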
* `pivot_wider()` arguments `values_fn` and `values_fill` can now be single
values; you now only need to use a named list if you want to use different
values for different value columns (#739, #746). They also get improved
errors if they're not of the expected type.
## Rectangling
* `hoist()` now automatically names pluckers that are a single string (#837).
It errors if you use duplicated column names (@mgirlich, #834), and now uses
`rlang::list2()` behind the scenes (which means that you can now use `!!!`
and `:=`) (#801).
* `unnest_longer()`, `unnest_wider()`, and `hoist()` do a better job
simplifying list-cols. They no longer add unneeded `unspecified()` when
the result is still a list (#806), and work when the list contains
non-vectors (#810, #848).
* `unnest_wider(names_sep = "")` now provides default names for unnamed inputs,
suppressing the many previous name repair messages (#742).
## Nesting
* `pack()` and `nest()` gain a `.names_sep` argument that allows you to strip outer
names from inner names, in a symmetrical way to how the same argument to
`unpack()` and `unnest()` combines inner and outer names (#795, #797).
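A hedged sketch of `.names_sep` with `pack()`, using made-up column names:
```{r}
library(tidyr)
df <- tibble(x_1 = 1:2, x_2 = 3:4, y = 5:6)
df %>% pack(x = c(x_1, x_2), .names_sep = "_")
```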
* `unnest_wider()` and `unnest_longer()` can now unnest `list_of` columns. This
is important for unnesting columns created from `nest()` and with
`pivot_wider()`, which will create `list_of` columns if the id columns are
non-unique (#741).
## Bug fixes and minor improvements
* `chop()` now creates list-columns of class `vctrs::list_of()`. This helps
keep track of the type in case the chopped data frame is empty, allowing
`unchop()` to reconstitute a data frame with the correct number and types
of columns even when there are no observations.
* `drop_na()` now preserves attributes of unclassed vectors (#905).
* `expand()`, `expand_grid()`, `crossing()`, and `nesting()` once again
evaluate their inputs iteratively, so you can refer to freshly created
columns, e.g. `crossing(x = seq(-2, 2), y = x)` (#820).
* `expand()`, `expand_grid()`, `crossing()`, and `nesting()` gain a
`.name_repair` argument giving you control over their name repair strategy
(@jeffreypullin, #798).
* `extract()` lets you use `NA` in `into`, as documented (#793).
* `extract()`, `separate()`, `hoist()`, `unnest_longer()`, and `unnest_wider()`
give a better error message if `col` is missing (#805).
* `pack()`'s first argument is now `.data` instead of `data` (#759).
* `pivot_longer()` now errors if `values_to` is not a length-1 character vector
(#949).
* `pivot_longer()` and `pivot_wider()` are now generic so implementations
can be provided for objects other than data frames (#800).
* `pivot_wider()` can now pivot data frame columns (#926)
* `unite(na.rm = TRUE)` now works for all types of variable, not just character
vectors (#765).
* `unnest_wider()` gives a better error message if you attempt to unnest
multiple columns (#740).
* `unnest_auto()` works when the input data contains a column called `col`
(#959).
version 0.9.8
- Fixed some issues on C-level causing problems with the
CLANG compiler. (Thanks to Brian Ripley for not only
reporting this, but also sending updated code with
fixes).
version 0.9.7
- Fixes in use of INTEGER() and VECTOR_ELT() after updates in R's C API.
This affected 'afind' and 'max_length' (internally). (Thanks to Luke
Tierney and Kurt Hornik for the notification).
- Fix in 'amatch' causing utf-8 characters to be ignored in some
cases (thanks to Joan Mime for reporting #78).
- Fix: segfault when 'afind' was called with many search patterns or many
texts to be searched.
- Fix: stringsimmatrix was not normalized correctly (Thanks to Tamas Ferenci
for reporting GH).
version 0.9.6.3
- Resubmit. Fixed a URL redirect that was detected by CRAN.
version 0.9.6.2
- Resubmit. Fixed url issues detected by CRAN, added doi to description
as per CRAN request.
version 0.9.6.1
- Bugfix: afind/grab/grabl returned wrong results on MacOS only.
(thanks to Prof. Brian Ripley for the notification and for running tests
on his personal machine and to Tomas Kalibera for making the
ubuntu-rchk docker image available).
version 0.9.6
- New function 'afind': find approximate matches in text based on string distance.
- New functions 'grab', 'grabl': fuzzy matching equivalent to 'grep' and 'grepl'.
- New function 'extract': fuzzy matching equivalent of stringr::str_extract.
- New algorithm 'running_cosine': fast fuzzy text search using cosine distance.
- New function 'stringsimmatrix' (Thanks to Johannes Gruber).
- Number of threads used is now reported when loading 'stringdist'.
- Internal fixes (in some cases class() == 'class' was used).
25 Aug 2020: Statmod 1.4.35
- Fix bug in tweedie(link.power=0) so that the resulting functions
$linkinv() and $mu.eta() preserve the attributes of their
arguments.
16 Feb 2020: statmod 1.4.34
- Improve the model description provided in the remlscoregamma() help
page.
- tweedie() now checks whether `var.power` or `link.power` are
character strings instead of numeric. If `var.power` is one of the
standard family names ("gaussian", "poisson", "gamma" or
"inverse.gaussian") or `link.power` is one of the standard link
functions ("identity","log","inverse") then the argument is reset
to the corresponding numerical value with a message, otherwise an
informative error message is given (see the sketch after these notes).
- Cleaning up of internal code to avoid partial matching of function
arguments, attributes or list component names. The automatic package
tests are now run with the warnPartialMatchArgs,
warnPartialMatchAttr and warnPartialMatchDollar options all set to
TRUE.
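A hedged sketch of the character-argument handling described in the tweedie()
note above; the family components inspected are standard glm family fields.
```{r}
library(statmod)
fam <- tweedie(var.power = "poisson", link.power = "log")  # reset to 1 and 0, with a message
fam$variance(2)   # Poisson-type variance function: V(mu) = mu, so 2
fam$linkfun(2)    # log link, so log(2)
```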
4 Jan 2020: statmod 1.4.33
- The components returned by mixedModel2Fit() relating to fixed
coefficients are now documented explicitly. The help page has been
corrected to refer to the argument `only.varcomp` instead of
`fixed.estimates`. The vector of `reml.residuals` is no longer
part of the output.
- The test file has been slightly revised using zapsmall() to ensure
that the test output file remains correct for R with ATLAS BLAS.