5.1.0
Changed
Strip debugging symbols from Linux binaries
5.0.0
Added
Use cibuildwheel to build wheels
Removed
Drop support for soon-EOL Python 3.6
Fixed
Install Twine to upload to PyPI
documentation:
* configure: Add missing AC_ARG_VAR for all programs so that they are
documented in --help output, and so that autoconf knows they are "precious"
and preserves them if configure is rerun even when they're specified via an
environment variable.
* Add usage examples for $jsonobject.
* Fix path to omega in quickstart document. Fixes #813, reported by Jim Lynch.
* Update for the IRC channel move from freenode to libera.chat.
indexers:
* Fix handling of UTF-16 BOMs in XML and HTML - we had the sense of the
endianness indicated by the BOM the wrong way round.
* Avoid making an extra temporary copy of HTML/XML data which has a UTF-16 BOM.
* We now ignore an end of line immediately after a PHP close tag to match what
PHP does.
* omindex:
+ Fix handling of formatted xlsx dates in certain cases.
* scriptindex:
+ Add new scriptindex whitespace removal actions `ltrim`, `rtrim`, `squash`,
and `trim`.
+ Improve `truncate` action - if a word ends exactly on the requested length
we now leave it in place rather than removing it.
+ Report the location of previous `unique` action in the error given when
`unique` is used more than once.
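To illustrate the UTF-16 BOM fix listed under indexers above, here is a minimal Python sketch of mapping a BOM to its byte order. It is not the indexers' actual code, and the function names `utf16_endianness` and `decode_utf16_with_bom` are hypothetical. The bytes FF FE indicate little-endian and FE FF indicate big-endian; swapping the two is the kind of mistake the fix describes.

```python
import codecs


def utf16_endianness(data: bytes):
    """Return 'little', 'big', or None based on a leading UTF-16 BOM."""
    if data.startswith(codecs.BOM_UTF16_LE):  # b'\xff\xfe'
        return "little"
    if data.startswith(codecs.BOM_UTF16_BE):  # b'\xfe\xff'
        return "big"
    return None


def decode_utf16_with_bom(data: bytes) -> str:
    """Decode UTF-16 data, using the BOM to pick the byte order."""
    order = utf16_endianness(data)
    if order is None:
        raise ValueError("no UTF-16 BOM found")
    codec = "utf-16-le" if order == "little" else "utf-16-be"
    # Skip the 2-byte BOM and decode the remainder in place.
    return data[2:].decode(codec)
```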
omega:
* Clamp START and END with packed timestamps. The 4-byte unsigned packed
time_t format can't represent dates before 1970 or after Sun 07 Feb 2106
06:28:15 UTC so clamp dates before or after these - previously they would
wrap around.
* The JSON produced by $jsonobject no longer contains newlines, which makes it
usable as a single line serialisation format without post-processing.
* Add $base64 OmegaScript command.
* omega: Add flag_no_positions to wrap new
Xapian::QueryParser::FLAG_NO_POSITIONS.
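The START/END clamping described above can be sketched in Python as follows. This is an illustration only, not omega's implementation: the helper name `pack_clamped` is hypothetical, and big-endian byte order is an assumption made for the example.

```python
import struct
from datetime import datetime, timedelta, timezone

# Largest value a 4-byte unsigned field can hold.
MAX_PACKED = 0xFFFFFFFF


def pack_clamped(ts: int) -> bytes:
    """Clamp a Unix timestamp to the representable range, then pack it.

    Without the clamp, out-of-range values would wrap around (or fail to
    pack) instead of saturating at the nearest representable date.
    """
    clamped = min(max(ts, 0), MAX_PACKED)
    return struct.pack(">I", clamped)  # big-endian is an assumption here


def latest_representable() -> datetime:
    """The last instant the 4-byte unsigned format can represent."""
    epoch = datetime(1970, 1, 1, tzinfo=timezone.utc)
    return epoch + timedelta(seconds=MAX_PACKED)
```

Dates before 1970 pack as all zero bytes, and dates after the upper limit pack as all 0xFF bytes.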
templates:
* Fix topterms template to not trigger early matching. We were checking $msize
before including the `query` template, but doing so would trigger the query
to be run, which means that settings early in the `query` template which
should affect the result (such as $setmap{prefix,...}) were being ignored
when the `topterms` template was used. Partly addresses #815, reported by
Gennadiy.
* Add field support to opensearch and xml templates. These templates now also
search title, topic and filename by default and support `title:`, `author:`
and `topic:` in the query string (both like the template `query` already
does). Fixes remaining issue in #815, reported by Gennadiy.
testsuite:
* Expand omegatest. All scriptindex actions now have test coverage.
build system:
* Replace uses of obsolete autoconf macros, fixing warnings if configure is
regenerated with a recent release of autoconf.
portability:
* Don't automatically use _FORTIFY_SOURCE on mingw-w64. Recent mingw-w64
versions require -lssp to be linked when _FORTIFY_SOURCE is enabled, so just
skip the automatic enabling. Users who want to enable it can specify it
explicitly.
Fixes #808, reported by xpbxf4.
* Automatically enable GCC warnings -Wduplicated-cond and -Wduplicated-branches
if using a GCC version new enough to support them. The usefulness of
-Wduplicated-cond was highlighted by dcb in #816.
* Fix GCC -Wshadow warning.
* Use clock_gettime() and nanosleep() under modern mingw as these allow higher
precision than what we previously used.
API:
* New QueryParser::FLAG_NO_POSITIONS flag. With this flag enabled, any query
operations which would use positional information are replaced by the nearest
equivalent which doesn't (so phrase searches, NEAR and ADJ will result in
OP_AND). This is intended to replace the automatic conversion of OP_PHRASE,
etc to OP_AND when a database has no positional information, which will no
longer happen in the release series after 1.4.
* Give a compile error for code which adds a Database to WritableDatabase.
Prior to 1.4.19, this compiled and effectively created a "black-hole" shard
which quietly discarded any changes made to it.
In 1.4.19 it's still possible to perform this operation by assigning the
WritableDatabase to a Database first, which is harder to fix. This case
throws an exception on git master where it's easier to address.
Reported by David Bremner on #xapian.
* Fix TermIterator::skip_to() with sharded databases which sometimes was
failing to advance all the way to the requested term. Uncovered while
addressing warning from GCC's -Wduplicated-cond, reported by dcb in #816.
* Clamp edit distance to one less than the length of the word we've been asked
to correct, which makes the algorithm we use more efficient. We already
require a suggestion to have at least one character in common, so the only
change to suggestions is we'll no longer suggest corrections which are
twice as long or longer even if the edit distance would allow it, which
seems like an improvement in itself.
* Minor optimisation expanding wildcards.
* PostingIterator::get_description(): Fix for an all-docs iterator on a glass
database, where get_description() would call get_docid(), which isn't valid
to do once the iterator has reached the end.
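The effect of the edit distance clamp noted above can be illustrated with a small Python sketch. The helper names are hypothetical, lengths are counted in characters for simplicity, and this is not Xapian's spelling correction code. Each edit changes the length by at most one, so a candidate whose length differs from the word's by more than the clamped limit cannot be within that limit.

```python
def clamp_edit_distance(word: str, max_edit_distance: int) -> int:
    """Clamp the requested distance to one less than the word length."""
    return min(max_edit_distance, len(word) - 1)


def length_could_match(word: str, candidate: str, max_edit_distance: int) -> bool:
    """Cheap pre-check: can the candidate be within the clamped distance?

    A length difference larger than the limit already rules the candidate
    out, because each insertion or deletion changes the length by one.
    """
    limit = clamp_edit_distance(word, max_edit_distance)
    return abs(len(candidate) - len(word)) <= limit
```

For a 3-letter word the limit is at most 2, so a 6-letter candidate (twice as long) is never suggested, whatever distance was requested.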
testsuite:
* Expand allterms test coverage.
matcher:
* Fetch wdf upper bound from postlist which avoids an extra postlist table
cursor seek per weighted query term, and also means we now use a per-shard
wdf upper bound for local shards which will typically give a tighter
weight upper bound which will tend to make various other matcher
optimisations more effective. Eric Wong reported this speeds up a
particularly slow case from ~2 minutes to ~3 seconds.
With this change, OP_ELITE_SET can now select a different subset of terms for
each shard regardless of shard type (previously this only happened for remote
shards).
* Avoid triggering a pointless maximum weight recalculation if an unweighted
child of a MultiAndPostList prunes.
* Only check if the database has positional information when the query
uses positional information. This should help improve notmuch delete
performance. Thanks to andreas on #notmuch for analysis of the problem.
glass backend:
* Optimise Glass::Inverter::has_positions(). Use const auto& instead of just
auto for the loop variables. Reported to be faster by andreas on #notmuch.
* Cache result of Glass::Inverter::has_positions() since calculating it is
potentially very expensive, while maintaining a cached answer is very cheap.
remote backend:
* Add missing closing parenthesis to reported remote prog context, which has
been missing since this code was first added over 20 years ago! Spotted by
Gaurav Arora.
build system:
* Enable compiler option -fno-semantic-interposition if supported.
This GCC option allows the compiler to optimise essentially assuming
that functions/variables aren't replaced at dynamic link time.
Such replacement is not something that it's useful to do for Xapian
symbols, and we already turn on -Bsymbolic-functions by default which
prevents such replacement anyway by resolving references within the
library at build time.
Reduces the size of the stripped library on x86-64 Debian unstable by
~1%, and likely makes it faster too.
* Avoid bogus deprecation warning when compiling with GCC without optimisation.
In this situation, GCC emits a deprecation warning for code in the definition
of QueryParser::add_valuerangeprocessor() which is provided for backwards
API compatibility even if this method is never used anywhere.
This isn't helpful, especially if the user is using -Werror, so disable the
-Wdeprecated-declarations warning for this code.
Reported by starmad on #xapian.
* Fix GCC -Wmaybe-uninitialized warning. The warning seems bogus as it's about
the this pointer being passed to a method which doesn't reference the object,
but we can just make the method static to avoid the warning, and that's
arguably cleaner for a method called from the object initialiser list.
* Automatically enable GCC warnings -Wduplicated-cond and -Wduplicated-branches
if using a GCC version new enough to support them. The usefulness of
-Wduplicated-cond was highlighted by dcb in #816.
* Replace uses of obsolete autoconf macros, fixing warnings if configure is
regenerated with a recent release of autoconf.
* Simplify configure probe for sigsetjmp and siglongjmp. Just probe
individually with AC_CHECK_DECLS and then check that both exist with a
preprocessor check.
* Update XO_LIB_XAPIAN to fix warning that AC_ERROR is obsolete with modern
autoconf.
* Support linking against static libxapian with cmake. Patch from Anonymous
Maarten in https://github.com/xapian/xapian/pull/317
* Clean up handling of libs we link libxapian with - previously any libraries
explicitly specified to configure by the user via LIBS=... as well as -lm
(if configure determined it was needed) could get added to XAPIAN_LIBS
multiple times, as well as also getting added to the libxapian link command
anyway by automake/libtool standard handling.
Specifying a library more than once on the link line is not a problem on
common platforms, but may be an issue somewhere (and it's on less common
platforms where the user is more likely to have to specify LIBS to configure
and/or where -lm may be needed).
documentation:
* configure: Add missing AC_ARG_VAR for all programs so that they are
documented in --help output, and so that autoconf knows they are "precious"
and preserves them if configure is rerun even when they're specified via an
environment variable.
* Don't use x^2 to mean x squared in API docs. This is potentially confusing
since in C/C++ (and some other languages), ^ means exclusive-or. Write x²
instead, which should be clear to all readers.
* Improve docs for Xapian::Stopper and SimpleStopper.
* docs/intro_ir.rst: Fixed an incorrect term index. Patch from Jaak Ristioja
in https://github.com/xapian/xapian/pull/321.
* Update for the IRC channel move from freenode to libera.chat.
examples:
* quest: Don't enable spelling correction by default. It was really only on by
default because the spelling correction support in quest was added before
--flags. It seems more helpful for the default to match the
Xapian::QueryParser API, and also this fixes the weird situation that
`--flags default` isn't the default you get without any `--flags` option.
* quest: Multiple `--flags` options now get combined - previously only the last
was used.
portability:
* Don't automatically use _FORTIFY_SOURCE on mingw-w64. Recent mingw-w64
versions require -lssp to be linked when _FORTIFY_SOURCE is enabled, so just
skip the automatic enabling. Users who want to enable it can specify it
explicitly.
Fixes #808, reported by xpbxf4.
* Workaround NFS issue in test harness function for deleting test databases.
On NFS, rmdir() can fail with EEXIST or ENOTEMPTY (POSIX allows either)
due to .nfs* files which are used by NFS clients to implement the Unix
semantics of a deleted but open file continuing to exist. We now sleep
and retry a few times in this situation to give the NFS client a chance
to process the closing of the open handle. Problem mentioned in #631.
* configure: Drop -lm special case for Sun C++ as this no longer seems to
be required. Tested with Sun C++ 5.13, which is the oldest version we
now support due to us now requiring C++11.
* Use strerrordesc_np() if available. This is a GNU-specific replacement for
sys_errlist and sys_nerr. It was added in glibc 2.32 since which sys_errlist
and sys_nerr are no longer declared in the headers.
* Update debug logging to use std::uncaught_exceptions() under C++17 and later
since this allows the debug logging to detect a function without RETURN()
annotation which exits normally while there's an uncaught exception
(previously the debug logging would think the stack was being unwound through
the function). This also avoids deprecation warnings - the old
std::uncaught_exception() (note: singular) function was deprecated by
C++17 and removed in C++20.
* Increase size of buffer passed to strerror_r() from 128 to 1024 bytes, which
is the size recommended by the man page on Linux.
* Fix -Wdeprecated-copy warning from clang 13.
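The sleep-and-retry workaround for rmdir() on NFS described above might look like the following Python sketch. It is an illustration under stated assumptions, not the test harness code: the function name, the injectable `rmdir` parameter, and the retry count and delay are all hypothetical.

```python
import errno
import os
import time


def rmdir_with_retry(path, rmdir=os.rmdir, attempts=5, delay=0.1):
    """Remove a directory, retrying if NFS reports it as non-empty.

    EEXIST and ENOTEMPTY are both treated as "not empty yet", since POSIX
    allows either. Any other error is re-raised immediately.
    """
    for attempt in range(attempts):
        try:
            rmdir(path)
            return
        except OSError as e:
            if e.errno not in (errno.EEXIST, errno.ENOTEMPTY):
                raise
            if attempt == attempts - 1:
                raise  # out of retries, report the failure
            # Give the NFS client time to clean up its .nfs* files.
            time.sleep(delay)
```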
Version 2.11.0
--------------
- Added lexers:
* BDD
* Elpi
* LilyPond
* Maxima
* Rita
* Savi
* Sed
* Sophia contracts
* Spice
* ``.SRCINFO``
- Updated lexers:
* ABNF: Allow one-character rules
* Assembly: Fix incorrect token endings
* Bibtex: Distinguish between ``comment`` and ``commentary``
* C family: Support unicode identifiers
* CDDL: Fix slow lexing speed
* Debian control: Add missing fields
* Devicetree: Recognize hexadecimal addresses for nodes
* GDScript: Add ``void`` data type
* GSQL
- Fix comment handling
- Fix catastrophic backtracking
* HTML, XML: Improve comment handling
* Java: Add ``yield``
* Makefiles
* objdump-nasm: Improve handling of ``--no-show-raw-insn`` dumps
* Prolog: Support escaped ``\`` inside quoted strings
* Python:
- Support ``~`` in tracebacks
- Support the pattern matching keywords
* RobotFramework: Improve empty brace handling
* Terraform
- Add the 'set' type
- Support heredocs
- Added styles:
* Dracula
* Friendly Grayscale
* LilyPond
* One-Dark
.. note::
All of the new styles unfortunately do not conform to WCAG recommendations.
- There is new infrastructure in place to improve style accessibility. The default style has been updated to conform to WCAG recommendations. All styles are now checked for sufficient contrast by default to prevent regressions.
- Clean up unused imports
- Fix multiple lexers producing repeated single-character tokens
- Fix multiple lexers marking whitespace as ``Text``
- Remove duplicated assignments in the Paraiso style
- ``pygmentize`` supports JSON output for the various list functions now, making it easier to consume them from scripts.
- Use the ``shell`` lexer for ``kshrc`` files
- Use the ``ruby`` lexer for ``Vagrantfile`` files
- Use the C lexer for ``.xbm`` and ``.xpm`` files
- Add a ``groff`` formatter
- Update documentation
- Line anchors now link to themselves
- Add official support for Python 3.10
- Fix several missing colors in dark styles: Gruvbox dark, Monokai, Rrt, Sas, Strata dark
- Associate more file types with ``man`` pages
- The ``HtmlFormatter`` can now emit tooltips for each token to ease debugging of lexers
- Add ``f90`` as an alias for ``fortran``
Release 3.2.1
The release contains a fix for the inclusion of both cpp11 and cpp17 headers in C++17 compilation. It also adds tests for using string literals and string objects with modern compilers.
Release 3.2
Optional support for C++ 17 std::string_view.
Release 3.1.2
Fix for Issue 72.
Release 3.1.1
Include the commits from the previous year.
Release 3.1
This release adds one new API call: unchecked::replace_invalid().
Other changes are mostly about testing and installation.
Release 3.0.3
A minor release that contains a fix for Issue 31: Program fails to link when including utf8.h in multiple files.
Release 3.0.2
This minor release contains:
Fix of the project version number in CMakeLists.txt
Continuous Integration with Google Tests and CircleCI
Release 3.0.1
A minor release with a fix for a header guard.
Release 3.0
This is a major release that introduces the following functionality:
New convenience API for C++ 11 and later compilers. The library still works with C++ 98/03 compliant compilers, just without the new functions.
advance() function works in both directions.
The following deprecated functions were removed:
previous() - deprecated since version 1.02.
is_bom() - deprecated since version 2.3.
Changes:
0.8
---
- Optimize number parsing for large number datasets
- Add -F and -R options to allow specifying a different field and record
separator
- Print \n and \t also when using -F/-R options
- Documentation improvements
0.7
---
- Use unlocked I/O by default
- Fix gcc warnings
- Documentation improvements
Release 4.3.2 (released Dec 19, 2021)
=====================================
Bugs fixed
----------
* C and C++, parse fundamental types no matter the order of simple type
specifiers.
IMAP mailbox names are encoded in a modified UTF-7 when names
contain international characters outside of the printable ASCII
range. The modified UTF-7 encoding is defined in RFC2060 (section
5.1.3).
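As a rough illustration of the modified UTF-7 encoding described above, here is a minimal Python encoder sketch. The function name `encode_imap_utf7` is hypothetical and this is not the package's own code. It follows the rules in the RFC: printable ASCII passes through, "&" becomes "&-", and runs of other characters are encoded as UTF-16BE, then base64 with "," in place of "/" and no "=" padding, wrapped in "&" and "-".

```python
import base64


def encode_imap_utf7(name: str) -> str:
    """Encode a mailbox name using IMAP's modified UTF-7."""
    out = []
    pending = []  # non-ASCII characters waiting to be base64-encoded

    def flush():
        if pending:
            raw = "".join(pending).encode("utf-16-be")
            b64 = base64.b64encode(raw).decode("ascii").rstrip("=")
            out.append("&" + b64.replace("/", ",") + "-")
            pending.clear()

    for ch in name:
        if ch == "&":
            flush()
            out.append("&-")  # a literal ampersand is escaped as "&-"
        elif 0x20 <= ord(ch) <= 0x7E:
            flush()
            out.append(ch)  # printable ASCII is passed through unchanged
        else:
            pending.append(ch)
    flush()
    return "".join(out)
```

For example, the German mailbox name "Entwürfe" encodes as "Entw&APw-rfe".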
ChangeLog (from NEWS):
* Version 4.5.2 (Feb 3, 2021)
- Add missing modules to record.scm.
* Version 4.5.1 (Jan 11, 2021)
- Allow false values in JSON mappings.
(Fixes #70)
* Version 4.5.0 (Jan 3, 2021)
- Introduce (define-json-type) a much simpler way to define JSON
objects and record mappings. It makes use of the existing
(define-json-mapping).
* Version 4.4.1 (Nov 29, 2020)
- Fixed a few parsing issues from JSON Parsing Test Suite
(https://github.com/nst/JSONTestSuite).
(Fixes #67)
* Version 4.4.0 (Oct 22, 2020)
- Record-JSON mapping can now define another optional procedure
record->scm to convert a record to an alist. (Fixes #63)
- Record-JSON mapping now allows using *unspecified* values to
indicate a record field should not be serialized. (Fixes #61)
- Improve pretty printing.
(thanks to Jonas Schürmann)
* Version 4.3.2 (Jul 23, 2020)
- Fix unicode for values from E000 and upwards.
(thanks again to pkill9 and RhodiumToad from #guile)
* Version 4.3.1 (Jul 22, 2020)
- Fix unicode codepoint with surrogate pairs.
(thanks to pkill9 and RhodiumToad from #guile)
* Version 4.3.0 (Jul 3, 2020)
- Make RECORD->JSON optional in (define-json-mapping).
* Version 4.2.0 (Jun 30, 2020)
- Introduce (define-json-mapping) which allows converting a JSON
object into a record type and vice versa. The initial code for
this feature was copied from the GNU Guix project.
4.7.1 (2021-12-13)
Features added
Chunked Unicode string parsing via parser.feed() now encodes the input data to the native UTF-8 encoding directly, instead of going through Py_UNICODE / wchar_t encoding first, which previously required duplicate recoding in most cases.
Bugs fixed
The standard namespace prefixes were mishandled during "C14N2" serialisation on Python 3. See https://mail.python.org/archives/list/lxml@python.org/thread/6ZFBHFOVHOS5GFDOAMPCT6HM5HZPWQ4Q/
lxml.objectify previously accepted non-XML numbers with underscores (like "1_000") as integers or float values in Python 3.6 and later. It now adheres to the number format of the XML spec again.
Static wheels of lxml now contain the header files of zlib and libiconv (in addition to the already provided headers of libxml2/libxslt/libexslt).
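The objectify fix above concerns a Python behaviour that is easy to demonstrate: since Python 3.6, `int()` and `float()` accept underscores as digit separators. The sketch below is not lxml's implementation. The helper `is_plain_integer` is hypothetical, and it assumes, for simplicity, that a valid integer is an optional sign followed by ASCII digits.

```python
import re

# Simplifying assumption: an optional sign followed by ASCII digits only.
_PLAIN_INTEGER = re.compile(r"[+-]?[0-9]+")


def is_plain_integer(text: str) -> bool:
    """Return True only for strings made of an optional sign and digits."""
    return _PLAIN_INTEGER.fullmatch(text) is not None


def parse_integer(text: str) -> int:
    """Parse an integer, rejecting Python-only spellings such as '1_000'."""
    if not is_plain_integer(text):
        raise ValueError(f"not a plain integer: {text!r}")
    return int(text)
```

Relying on `int()` alone would accept "1_000" as 1000, which is why a stricter check of the text is needed before conversion.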
Other changes
Wheels include libxml2 2.9.12+ and libxslt 1.1.34 (also on Windows).
4.7.0 (2021-12-13)
Release retracted due to missing files in lxml/includes/.
4.6.5 (2021-12-12)
Bugs fixed
A vulnerability (GHSL-2021-1038) in the HTML cleaner allowed sneaking script content through SVG images (CVE-2021-43818).
A vulnerability (GHSL-2021-1037) in the HTML cleaner allowed sneaking script content through CSS imports and other crafted constructs (CVE-2021-43818).
v1.9.1
* Improve error reporting for encoded data
* Fix attribute duplicates in attribute group
* Add process_skipped optional argument to decoding/encoding
ugrep v3.3.12
Updated Windows CRLF output while maintaining grep compatibility. Faster column -k through on-demand computation. Faster searching of files containing long lines of text and of binary files. This update corrects and further improves the input buffering method, which performed sub-optimally for long lines containing many pattern matches.
No release notes or change log. Please refer to the commit log
<https://github.com/s-yata/marisa-trie/commits/master> for details.
0.2.5 (2018-05-19)
0.2.6 (2020-06-14)
This commit also updates language binding packages: p5-marisa, py-marisa and
ruby-marisa.
Fix a build failure that occurred after a new RC version of the clap crate was released.
Fix a dynamic link error of the pcre2 library by linking the library
statically. The error could happen when Homebrew was installed to a
non-default location on macOS (#6).
Add --regex-size-limit option for built-in grep feature.
Add --dfa-size-limit option for built-in grep feature.
Use Rust compiler v1.57 to build binaries.
3.13.1
Fixed
Temporarily comment out to avoid warning during import humanize
3.13.0
Added
Add da_DK language
Fix and add Russian and Ukrainian words
Add missing strings for Polish translation
Add Traditional Chinese (zh-HK)
Changed
Remove redundant setuptools from install_requires
Deprecated
This is the last release to support Python 3.6
Deprecate private functions
Reinstate VERSION and deprecate it
0.9.27 (2021-11-29)
* Add support for Ruby 3.0 endless method definitions. (#1376, #1381)
* Add existence check for README file (#1367)
* Support module_function decorator (#1365)
* Add CommonMarker markup support (-m commonmarker) (#1157, #1388)
* Fix nested array parsing (#1389)
* Add WEBrick as a runtime dependency for Ruby 3.0 support (#1400)
* Support fail_on_warning option in yard stats command (#1392)
* Better integration with Sorbet (#1401)
* Handle include mixins on complex paths (#1386)
* Fix @!scope maintaining state in lone comment blocks (#1411)
* Remove support for Travis CI