* Add U+2116 NUMERO SIGN
* Add U+05BE HEBREW PUNCTUATION MAQAF
unidecode 0.04.20:
* Fixed transliteration of circled Latin letters and numbers
* Add square unit symbols.
* Add Latin variants in U+20xx and U+21xx pages.
* Fix U+02B1 MODIFIER LETTER SMALL H WITH HOOK.
* Fix U+205F MEDIUM MATHEMATICAL SPACE.
* Add "DIGIT ... COMMA" and "PARENTHESIZED LATIN CAPITAL LETTER"
in U+1F1xx page.
* Add missing vulgar fractions and a/c, a/s, c/o, c/u symbols.
* Add universal Wheel release
go14 has no relro support AFAICT.
go-1.8.3 has it if you use -buildmode=pie, but it claims that is not supported
on Linux.
Disable relro checking for go packages until bsiegert has time to
look at this.
Changes include:
ocaml-expat-1.0.0
- New maintainer (whitequark@whitequark.org)
- Support for the bytes type
- Build system improvements to support cross-compilation and systems
without shared libraries
v1.0.1 2017-03-07 La Forclaz (VS)
---------------------------------
- OCaml 4.05.0 compatibility (removal of `Uchar.dump`).
v1.0.0 2016-11-23 Zagreb
------------------------
- Support for RFC 7159/ECMA-404. This means that any JSON value can
now be encoded or decoded as JSON text; in RFC 4627 (obsoleted by 7159) this
could only be an array or an object. If your code was relying on the
fact that the first decoded lexeme was either an `Os` or an `As`,
you will need to review that.
- Fix `Jsonm.decode` not eventually returning `End` on toplevel
decode error.
- OCaml standard library `Uchar.t` support. At the API level only
some cases of `Jsonm.error` change.
- Uutf 1.0.0 support.
- Safe string support.
- Build depend on topkg.
- Relicensed from BSD3 to ISC.
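
The top-level value change noted above can be illustrated with a small sketch. It uses Python's standard `json` module rather than Jsonm, on the assumption that the reader only needs to see the RFC 7159 behaviour; `top_level_kind` is a hypothetical helper name introduced for this example.

```python
import json

def top_level_kind(text):
    """Return the Python type name of the top-level JSON value in text."""
    return type(json.loads(text)).__name__

# RFC 4627 only allowed these two shapes as a complete JSON text.
print(top_level_kind('{"a": 1}'))
print(top_level_kind('[1, 2]'))

# RFC 7159 also allows scalars as a complete JSON text.
print(top_level_kind('42'))
print(top_level_kind('"text"'))
print(top_level_kind('null'))
```

A consumer that assumed the first decoded item was always the start of an object or array would need an extra branch for the scalar cases.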
v1.0.1 2017-03-07 La Forclaz (VS)
---------------------------------
- OCaml 4.05.0 compatibility (removal of `Uchar.dump`).
v1.0.0 2016-11-23 Zagreb
------------------------
- `Uutf.String.fold_utf_{8,16be,16le}`, allow substring folding via
optional arguments. Thanks to Raphaël Proust for the idea and the
patch.
- OCaml standard library `Uchar.t` support.
- Replaces `type Uutf.uchar = int` with the (abstract)
`Uchar.t` type. `Uchar.{of,to}_int` can be used to recover the previous
representation.
- Removes `Uutf.{is_uchar,cp_to_string,pp_cp}`. `Uchar.{is_valid,dump}`
can be used instead.
- Safe string support. Manual sources and destinations now work on bytes
rather than strings.
- Build depend on topkg.
- Relicensed from BSD3 to ISC.
Upstream changes:
1.600 2017-06-23
- New maintainer: LEEJO
- Add Changes file
- Add link to github repo
- Add strict and warnings
- Add LICENSE to POD + LICENSE file
- Add META.* files through make dist
- Add .travis.yml for CI
indexers:
* omindex:
+ 1.4.3 added a new --sample option, but contrary to the documentation
the default behaviour was to take the sample from the meta description
(which was the hard-wired behaviour in 1.4.2 and earlier). The default
has now been changed to take the sample from the body.
+ Index .shtm, .xhtml and .xhtm as HTML by default - .shtm is another
extension used for server-parsed HTML (in addition to the more common
.shtml), and .xhtm and .xhtml are XHTML.
+ Fix fallback lookup for extension containing upper case. User mappings
worked, but built-in extension to MIME type mappings were effectively being
ignored (because the result of the function call was not being checked).
Bug introduced in 1.3.4.
+ Fix term-based date ranges, broken by changes in 1.4.2. Found and
diagnosed by Gaurav Arora.
+ Handle date range with start after end better - with term-based ranges,
this used to generate a bogus filter, but now just generates Dlatest.
+ Use Y-term when range starts/ends at year start/end. Previously we used 12
M-terms for these cases.
+ Use full leap-year check when constructing term-based date ranges -
previous code was good until 2100, but even then it would only result
in an extra term being included for a non-existent February 29th in
rare cases.
+ Add support for indexing vCard files if Perl and its Text::vCard module
are available.
+ Recognise application/x-rpm as alternative type since libmagic reports this
rather than application/x-redhat-package-manager.
+ Use official MIME type application/vnd.debian.binary-package for debian
packages. We used to map .deb and .udeb to application/x-debian-package,
but in 2014 (after we added that support for .deb) an official type was
registered with IANA. We now map extensions .deb and .udeb to the official
type, but the unofficial type is still recognised (older versions of
libmagic probably report it, and users may be mapping to it).
+ Handle PHP as MIME type text/x-php. The main difference this makes is that
PHP files which don't have extension '.php' (e.g. .phtml, .phps, .php5,
.ph4, etc) get identified by libmagic as text/x-php and will now be indexed.
It also means that the user can now more easily configure different filters
for HTML and PHP.
+ Don't use meta description as sample by default. Now we have dynamic
snippets (via $snippet), the body text is a better default. Also generated
HTML sometimes has unhelpful content in the meta description. To get the
previous behaviour, use the new omindex command line option:
--sample=description
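
The leap-year item above can be sketched as follows. This is a minimal illustration, not omindex's code: the function names are hypothetical, and the `DYYYYMMDD` day-term format is an assumption made for the example.

```python
def is_leap_year(year):
    """Full Gregorian rule: every 4th year, except centuries not divisible by 400."""
    return year % 4 == 0 and (year % 100 != 0 or year % 400 == 0)

def days_in_month(year, month):
    """Number of days in a month, using the full leap-year check for February."""
    if month == 2:
        return 29 if is_leap_year(year) else 28
    return 30 if month in (4, 6, 9, 11) else 31

def day_terms(year, month):
    """Hypothetical day terms (assumed format DYYYYMMDD) covering one whole month."""
    return ["D%04d%02d%02d" % (year, month, day)
            for day in range(1, days_in_month(year, month) + 1)]
```

A check of `year % 4 == 0` alone agrees with this through 2099 and first differs for 2100, where it would add a term for a non-existent February 29th.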
omega:
* New OmegaScript command $cgiparams which returns a list of the parameter
names.
* Handle tab in a CGI parameter name in the same way as space. Mostly this is
a way to avoid having tabs in CGI parameter names - they aren't useful, and
if names could contain tabs we couldn't put CGI parameter names in a list.
templates:
* query: Fix highlighting of matching terms. We were using both $snippet and
$highlight, which results in double highlighting and HTML escaping, most
noticeable by literal <strong> and </strong> appearing around matching terms
in the rendered HTML snippet. Reported by Mark Thomas on xapian-discuss.
build system:
* If gen-mimemap failed after creating mimemap.h, the rule wouldn't get rerun.
API:
* Database::check():
+ Fix checking a single table - changes in 1.4.2 broke such checks unless you
specified the table without any extension.
+ Errors from failing to find the file specified are now thrown as
DatabaseOpeningError (was DatabaseError, of which DatabaseOpeningError is
a subclass so existing code should continue to work). Also improved the
error message when the file doesn't exist.
* Drop OP_SCALE_WEIGHT over OP_VALUE_RANGE, OP_VALUE_GE and OP_VALUE_LE in the
Query constructor. These operators always return weight 0 so OP_SCALE_WEIGHT
over them has no effect. Eliminating it at query construction time is cheap
(we only need to check the type of the subquery), eliminates the confusing
"0 * " from the query description, and means the OP_SCALE_WEIGHT Query object
can be released sooner. Inspired by Shivanshu Chauhan asking about the query
description on IRC.
* Drop OP_SCALE_WEIGHT on the right side of OP_AND_NOT in the Query
constructor. OP_AND_NOT takes no weight from the right so OP_SCALE_WEIGHT
has no effect there. Eliminating it at query construction time is cheap
(just need to check the subquery's type), eliminates the confusing "0 * "
from the query description, and means the OP_SCALE_WEIGHT object can be
released sooner.
* MSet::snippet(): Favour candidate snippets which contain more of a diversity
of matching terms by discounting the relevance of repeated terms using an
exponential decay. A snippet which contains more terms from the query is
likely to be better than one which contains the same term or terms multiple
times, but a repeated term is still interesting, just less with each
additional appearance. Diversity issue highlighted by Robert Stepanek's
patch in https://github.com/xapian/xapian/pull/117 - testcases taken from his
patch.
* MSet::snippet(): New flag SNIPPET_EMPTY_WITHOUT_MATCH to get an empty snippet
if there are no matches in the text passed in. Implemented by Robert
Stepanek.
* Round MSet::get_matches_estimated() to an appropriate number of significant
figures. The algorithm used looks at the lower and upper bound and where the
estimate sits between them, and then picks an appropriate number of
significant figures. Thanks to Sébastien Le Callonnec for help sorting out a
portability issue on OS X.
* Add Database::locked() method - where possible this non-invasively checks if
the database is currently open for writing, which can be useful for
dashboards and other status reporting tools.
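
The exponential-decay idea described for MSet::snippet() above can be sketched roughly as below. This is not Xapian's implementation: the function name, the per-term weights and the decay factor of 0.5 are all assumptions chosen for illustration.

```python
def diversity_score(candidate_terms, query_weights, decay=0.5):
    """Score a candidate snippet, discounting each repeat of a query term.

    The n-th occurrence (counting from 0) of a term contributes
    weight * decay**n, so repeats still count, but less each time.
    """
    seen = {}
    score = 0.0
    for term in candidate_terms:
        weight = query_weights.get(term)
        if weight is None:
            continue  # not a query term, contributes nothing
        n = seen.get(term, 0)
        score += weight * decay ** n
        seen[term] = n + 1
    return score

weights = {"xapian": 1.0, "snippet": 1.0}
diverse = diversity_score(["xapian", "snippet"], weights)
repeated = diversity_score(["xapian", "xapian"], weights)
print(diverse > repeated)
```

With equal weights, a candidate containing two different query terms outscores one that repeats a single term, which is the preference the changelog entry describes.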
testsuite:
* Add more tests of Database::check(). Fixes #238, reported by Richard
Boulton.
* Make apitest testcase nosuchdb1 fail if we manage to open the DB.
* Skip testcases which throw NetworkError with errno value ECHILD - this
indicates system resource starvation rather than a Xapian bug. Such failures
are seen on Debian buildds from time to time, see:
https://bugs.debian.org/681941
* Use terms that exist in the database for most snippet tests. It's good to
test that snippet highlighting works for terms that aren't in the database,
but it's not good for all our snippet tests to feature such terms - it's
not the common usage.
matcher:
* Fix incorrect results due to uninitialised memory. The array holding max
weight values in MultiAndPostList is never initialised if the operator is
unweighted, but the values are still used to calculate the max weight to pass
to subqueries, leading to incorrect results. This can be observed with an OR
under an unweighted AND (e.g. OR under AND on the right side of AND_NOT).
The fix applied is to simply default initialise this array, which should lead
to a max weight of 0.0 being passed on to subqueries. Bug reported in
notmuch by Kirill A. Shutemov, and forwarded by David Bremner.
* Improve value range upper bound and estimated matches. The value slot
frequency provides a tighter upper bound than Database::get_doccount().
The estimate is now calculated by working out the proportion of possible
values between the slot lower and upper bounds which the range covers
(assuming a uniform distribution). This seems to work fairly well in
practice, and is certainly better than the crude estimate we were using:
Database::get_doccount() / 2
* Handle arbitrary combinations of OP_OR under OP_NEAR/OP_PHRASE, partly
addressing #508. Thanks to Jean-Francois Dockes for motivation and testing.
* Only convert OP_PHRASE to OP_AND if full DB has no positions. Until now the
conversion was done independently for each sub-database, but being consistent
with the results from a database containing all the same documents seems more
useful.
* Avoid double get_wdf() call for first subquery of OP_NEAR and OP_PHRASE,
which will speed them up by a small amount.
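
The value range estimate described above can be sketched as follows, assuming numeric slot values and a uniform distribution. The function and parameter names are hypothetical and this is not the matcher's actual code.

```python
def estimate_range_matches(slot_freq, slot_lo, slot_hi, range_lo, range_hi):
    """Estimate matches for a value range, assuming uniformly spread values.

    slot_freq is the number of documents with a value in the slot, which is
    also the upper bound on the number of matches.
    """
    lo = max(slot_lo, range_lo)
    hi = min(slot_hi, range_hi)
    if hi < lo:
        return 0  # the range does not overlap the stored values
    if slot_hi == slot_lo:
        return slot_freq  # all stored values are equal and inside the range
    proportion = (hi - lo) / (slot_hi - slot_lo)
    return round(slot_freq * proportion)
```

For a slot holding 1000 values between 0 and 100, a range of 25 to 75 covers half of the possible values, so the sketch estimates 500 matches. The estimate never exceeds `slot_freq`, the tighter upper bound mentioned above.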
documentation:
* Correct "Query::feature_flag" -> "QueryParser::feature_flag". Fixes #747,
reported by James Aylett.
* Rename set_metadata() `value` parameter to `metadata`. This change is
particularly motivated by making it easier to map this case specially in SWIG
bindings, but the new name is also clearer and better documents its purpose.
* Rename value range parameters. The new names (`range_limit` instead of
`limit`, `range_lower` instead of `begin` and `range_upper` instead of `end`)
are particularly motivated by making it easier to map them specially in SWIG
bindings, but they're also clearer names which better document their
purposes.
* Change "(key, tag)" to "(key, value)" in user metadata docs. The user
metadata is essentially what's often called a "key-value store" so users
are likely to be familiar with that terminology.
* Consistently name parameter of Weight::unserialise() overridden forms.
In xapian/weight.h it was almost always named `serialised`, but LMWeight
named it `s` and CoordWeight omitted the name.
* Fix various minor documentation comment typos.
* INSTALL: Update section about -Bsymbolic-functions which is not a new
GNU ld feature at this point.
tools:
* xapian-delve: Uses new Database::locked() method to report if the database
is currently locked.
portability:
* Fix configure probe for __builtin_exp10() to work around bug on mingw - there
GCC generates a call to exp10() for __builtin_exp10() but there is no exp10()
function in the C library, so we get a link failure. Use a full link test
instead to avoid this issue. Reported by Mario Emmenlauer on xapian-devel.
* Fix configure probe for log2() which was failing on at least some platforms
due to ambiguity between overloaded forms of log2(). Make the probe
explicitly check for log2(double) to avoid this problem.
* Work around the unhelpful semantics of AI_ADDRCONFIG on platforms which follow
the old RFC instead of POSIX (such as Linux) - if only loopback networking is
configured, localhost won't resolve by name or IP address, which causes
testsuites using the remote backend over localhost to fail in auto-build
environments which deliberately disable networking during builds. The
workaround implemented is to check if the hostname is "::1", "127.0.0.1" or
"localhost" and disable AI_ADDRCONFIG for these. This doesn't catch all
possible ways to specify localhost, but should catch all the ways these might
be specified in a testsuite. Fixes https://bugs.debian.org/853107, reported
by Daniel Schepler and the root cause uncovered by James Clarke.
* Fix build failure cross-compiling for android due to not pulling in header
for errno.
* Fix compiler warnings.
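
The AI_ADDRCONFIG workaround described above can be sketched in Python, whose `socket` module exposes the same `getaddrinfo` flags. The helper names are hypothetical and this is an illustration of the approach, not Xapian's C++ code.

```python
import socket

# The three spellings of localhost that the changelog entry lists.
LOCALHOST_NAMES = {"::1", "127.0.0.1", "localhost"}

def resolver_flags(hostname):
    """Return getaddrinfo flags, leaving out AI_ADDRCONFIG for localhost forms."""
    if hostname in LOCALHOST_NAMES:
        return 0
    return socket.AI_ADDRCONFIG

def resolve(hostname, port):
    """Resolve hostname for a TCP connection using the flags chosen above."""
    return socket.getaddrinfo(hostname, port, type=socket.SOCK_STREAM,
                              flags=resolver_flags(hostname))
```

As the entry notes, a fixed list of names does not catch every way of specifying localhost, but it covers the forms a testsuite is likely to use.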
debug code:
* Adjust assertion in InMemoryPostList. Calling skip_to() is fine when the
postlist hasn't been started yet (but the assertion was failing for a term
not in the database). Latent bug, triggered by testcases complexphrase1 and
complexnear1 as updated for addition of support for OP_OR subqueries of
OP_PHRASE/OP_NEAR.
Upstream changes:
- 1.18 H78M5qm1 Sat Jul 8 05:52:48:01 -0500 2017
* fixed new() to check file or xml to detect standalone in
declaration, from <HTTPS://RT.CPAN.Org/Ticket/Display.html?id=122389>
(Thanks Alex!)
* traced tidy() memory leak from
<HTTPS://RT.CPAN.Org/Ticket/Display.html?id=120296> (Thanks Jozef!)
which seems to come from every XPath->findnodes() call
* aligned synopsis comments
* updated write() to use output encoding UTF-8 since that's what
almost all XML should rely on (with thanks to RJBS for teaching me
much from his great talk at <HTTPS://YouTube.Com/watch?v=TmTeXcEixEg>)
* collapsed trailing curly braces on code blocks
* added croak for any failed file open attempt
Upstream changes:
version 3.22: Fri 30 Jun 10:03:10 CEST 2017
Fixes:
- ::XOP::Include read from file always died
rt.cpan.org#119955 [Pavel Trushkin]
- ::XOP::Include read should enforce raw mode
rt.cpan.org#119955 [Pavel Trushkin]
Update DEPENDS
Upstream changes:
version 1.58: Tue 27 Jun 16:50:29 CEST 2017
Fixes:
- early facet on missing field [Bernhard Reutner-Fischer]
Improvements:
- move to Log::Report 1.20, which has considerable changes.
version 1.57: Wed 14 Jun 14:48:18 CEST 2017
Fixes:
- better separation between lexical- and value-space facets.
rt.cpan.org#121946 [Nils Barkald]
- json_friendly changes broke some (semi-illegal) enumeration and
pattern facets. Now new solution with dualvar [Wesley Schwengle]
Upstream changes:
2.033 2017-07-06
- [RT #122371] Remove a couple of improperly-placed weaken statements
(reported by Phil Perry).
- [RT #122372] Fix weakening when a page is added to the end of a multiple
page document (reported by Phil Perry).
- Fix Bank Gothic core font (reported by Phil Perry).
2.032 2017-07-02
- PDF::API2 has many circular references, and the end() method doesn't clear
them all, so memory is leaked. This release uses Scalar::Util's weaken()
function to improve garbage collection. A significant number of circular
references have been weakened, though many likely still remain.
- [RT #120756] Eliminate a warning for an ambiguous call to CORE::open
(first reported by Abdelbaki Brahmia).
- $text->text_justified() and $text->text_fill_justified() now adjust the
space between words rather than stretching individual characters in order
to get the text to fit.
- [RT #120397] Indirect references and indirect objects can have comments
embedded in their whitespace, and their object number and generation may
be split across multiple lines, which may not all be buffered (reported by
SPROUT).
- [RT #120450] Fix PDF::API2->open($filename)->stringify() (reported by
SPROUT).
- Fix an off-by-one error when calculating text width while charspace is
non-zero.
- [RT #120048] Fix PDF::API2->synfont() (broken in 2.029, fixed by Vadim
Repin) and add basic testing.