- buffer: add buffer.transcode to transcode a buffer's content from one
encoding to another primarily using ICU
- child_process: add public API for IPC channel
- icu:
- Upgraded to ICU 58 - small icu
- Add cldr, tz, and unicode to process.versions
- lib: make String(global) === '[object global]'
- libuv: Upgraded to 1.10.0
- readline: use icu based string width calculation
- src:
- add NODE_PRESERVE_SYMLINKS environment variable that has the same effect
as the --preserve-symlinks flag
- Fix String#toLocaleUpperCase() and String#toLocaleLowerCase()
---------------------------------
0.35 2016/11/03 08:30:00
- Minor POD updates.
- Added catastrophic failure protection to _croak_or_return() by adding
local $SIG{PIPE} = "IGNORE"; before connection termination logic. Limits
the scope to just this one code block.
0.34 2016/07/27 08:30:00
- BEHAVIOR CHANGE - Added fix_supported() as a way to make corrections to
supported(). Editing the returned hash reference of _help() no longer
works! This new method does both additions & removals.
- BEHAVIOR CHANGE - Modified _mfmt() & _mdtm() to be able to handle localtime
vs gmtime based on changes to how PreserveTimestamp works. As an alternate
way, the behavior can be overridden by a new $local_flag option. The
default behavior is still GMT time. See the POD for a description of how
PreserveTimestamp now works.
- Made POD clarification & other comment updates.
- Increased TRACE_MOD from 5 to 10 blocks.
- Added BEGIN block to detect if IPv6 support is possible. It does this by
asking IO::Socket::SSL instead of reinventing the wheel.
- Moved the generation of the Debug log header info for CPAN support to BEGIN
as well so that this info gets centralized instead of repeated.
- Added Domain/Family as a new option for choosing IPv4 vs IPv6.
- Added OverrideHELP => -1 option to use FEAT instead, when HELP is broken!
- Updated quot() to recognize that MLSD also requires a data channel.
Also improved the disable HELP logic used here.
- Broke up _feat() into _feat() & feat(). Also added feat() to the POD.
Done since under some circumstances the feature list can be dynamic!
Also changed logic on how to tell if OPTS is supported or not.
Finally drops HELP from the list of FEAT commands returned if OverrideHELP
was used.
- Rewrote _help() to make it less confusing. Adding OverrideHELP=>-1 made it
clear it was too messy to support ongoing. Much more understandable now.
Also made it more reliable to get the list of site commands supported.
- Fixed PreserveTimestamp bug in transfer() & xtransfer().
- Added new option xWait for use by xput() & xtransfer(). Some servers won't
honor the rename of the scratch file to its final name without instituting
a delay. So this option allows you to specify one.
- README - Added more notes about turning on/off SSL logging. Newer
versions allow for dynamic turning on/off. Also updated comments on the
naming of the trace logs.
- t/10-complex.t - Changes to the main test script!
* Now uses fix_supported() in its is_file() tests since these test cases
hit some lies told by some servers!
* Fixed so its main logs are named after this test program, as the other
test cases do.
* Added new test to verify if the MDTM command correctly uses GMT time
instead of local time. (Assumes MFMT will use the same time zone!) Did
this test early enough so that the last connection used the correct
PreserveTimestamp settings for tests depending on it!
* Added xWait of 1 second to deal with problem FTP/S servers that require
a wait for the xput & xtransfer tests to work.
- t/05-readonly.t - Renamed 05-simple to 05-readonly to more accurately
describe the types of tests this test script does! Updated in MANIFEST &
README as well.
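A minimal sketch of the GMT-versus-local distinction described for _mdtm() and
_mfmt() above, written in Python purely for illustration. The function name and
the `local` argument are hypothetical stand-ins for the $local_flag option, not
the module's actual interface; the only assumption is the usual YYYYMMDDHHMMSS
timestamp layout.

```python
import calendar
import time


def mdtm_to_epoch(stamp: str, local: bool = False) -> int:
    """Convert a YYYYMMDDHHMMSS string to seconds since the epoch.

    By default the string is read as GMT; with local=True it is read in
    the client's local time zone instead (hypothetical flag).
    """
    parsed = time.strptime(stamp, "%Y%m%d%H%M%S")
    if local:
        # mktime() interprets the broken-down time as local time.
        return int(time.mktime(parsed))
    # timegm() interprets the broken-down time as GMT/UTC.
    return calendar.timegm(parsed)
```

With the default, a given string always maps to the same instant regardless of
the client's time zone; with local=True the result shifts by the client's UTC
offset, which is the mismatch the new MDTM test is meant to catch.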
- build: It is now possible to build the documentation from the release
tarball
- buffer: Buffer.alloc() will no longer incorrectly return a zero filled
buffer when an encoding is passed
- deps: upgrade npm in LTS to 2.15.11
- repl: Enable tab completion for global properties
- url: url.format() will now encode all # in search
Changes in 1.7.7:
At the suggestion of Peter Spiess-Knafl, we will bump the
SOVERSION independent of the MAJOR.MINOR.MICRO version, in case
we break binary compatibility.
Changes in 1.7.6:
Prevent possible SEGV. (Thanks to @ngg.)
Add RPATH for OSX libs. (Please let us know if this causes a problem.)
Changes in 1.7.5:
Fix locale for decimal points
Plus a fix for Android
int64_t for 64-bit integers
Optionally suppress space after comma
Avoid null for empty stringValue
Fix null ctor/dtor, using a "Meyers Singleton"
Thanks to @marklakata and @BillyDonahue in #488 and #490.
* Clear up dependencies, enable test target.
5.01
- Doc fixes
5.00
- This version adds Elasticsearch 5.x compatibility, and makes it
the default.
- It also adds deprecation logging which logs to STDERR by default.
- The Hijk backend will not work with Elasticsearch 5.x until this bug
is fixed: https://rt.cpan.org/Ticket/Display.html?id=118425
BREAKING CHANGES:
- The 0.90, 1.x, and 2.x compatible clients no longer ship by default.
You should install one of the following:
* Search::Elasticsearch::Client::2_0
* Search::Elasticsearch::Client::2_0::Async
* Search::Elasticsearch::Client::1_0
* Search::Elasticsearch::Client::1_0::Async
* Search::Elasticsearch::Client::0_90
* Search::Elasticsearch::Client::0_90::Async
- The code has been reorganised so that all client-related modules
are under the S::E::API_VERSION::Client namespace.
This includes S::E::Bulk and S::E::Scroll.
- Plugin authors note: the format for the API in ...Role::API has changed.
- S::E::Cxn::HTTP has been rolled into S::E::Cxn as Elasticsearch
no longer supports other protocols.
Send Log::Any logs to a subroutine.
This adapter lets you specify a callback subroutine to be called by
Log::Any's logging methods (like $log->debug(), $log->error(), etc) and
detection methods (like $log->is_warning(), $log->is_fatal(), etc.).
---------------------------
version 2.76
Include 0.0.0.0/8 in DNS rebind checks. This range
translates to hosts on the local network, or, at
least, 0.0.0.0 accesses the local host, so could
be targets for DNS rebinding. See RFC 5735 section 3
for details. Thanks to Stephen Röttger for the bug report.
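As a rough illustration of the check described above, here is a Python sketch
that tests an answer address against a list of ranges. The list and the
function name are assumptions made for this example; dnsmasq's own check is
implemented in C and covers its own set of ranges. The only point shown is
that 0.0.0.0/8 is now treated like the other local ranges.

```python
import ipaddress

# Hypothetical range list for illustration only.
REBIND_RANGES = [
    ipaddress.ip_network(net)
    for net in (
        "0.0.0.0/8",       # newly included: "this" network / local host
        "10.0.0.0/8",
        "127.0.0.0/8",
        "169.254.0.0/16",
        "172.16.0.0/12",
        "192.168.0.0/16",
    )
]


def is_rebind_candidate(addr: str) -> bool:
    """Return True if addr falls inside one of the listed local ranges."""
    ip = ipaddress.ip_address(addr)
    return any(ip in net for net in REBIND_RANGES)
```

An upstream answer of 0.0.0.0 or 0.1.2.3 would be flagged by this sketch,
while a public address such as 8.8.8.8 would not.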
Enhance --add-subnet to allow arbitrary subnet addresses.
Thanks to Ed Barsley for the patch.
Respect the --no-resolv flag in inotify code. Fixes bug
which caused dnsmasq to fail to start if a resolv-file
was a dangling symbolic link, even if --no-resolv is set.
Thanks to Alexander Kurtz for spotting the problem.
Fix crash when an A or AAAA record is defined locally,
in a hosts file, and an upstream server sends a reply
that the same name is empty. Thanks to Edwin Török for
the patch.
Fix failure to correctly calculate cache-size when
reading a hosts-file fails. Thanks to André Glüpker
for the patch.
Fix wrong answer to simple name query when --domain-needed
set, but no upstream servers configured. Dnsmasq returned
REFUSED, in this case, when it should be the same as when
upstream servers are configured - NOERROR. Thanks to
Allain Legacy for spotting the problem.
Return REFUSED when running out of forwarding table slots,
not SERVFAIL.
Add --max-port configuration. Thanks to Hans Dedecker for
the patch.
Add --script-arp and two new functions for the dhcp-script.
These are "arp" and "arp-old" which announce the arrival and
removal of entries in the ARP or neighbour tables.
Extend --add-mac to allow a new encoding of the MAC address
as base64, by configuring --add-mac=base64
Add --add-cpe-id option.
Don't crash with divide-by-zero if an IPv6 dhcp-range
is declared as a whole /64.
(ie xx::0 to xx::ffff:ffff:ffff:ffff)
Thanks to Laurent Bendel for spotting this problem.
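One plausible reading of this crash (an assumption, not taken from the dnsmasq
source): a whole /64 holds 2^64 addresses, and a count kept in an unsigned
64-bit integer wraps to zero, so any later "value modulo count" divides by
zero. The Python sketch below emulates that wraparound with a mask; all names
are hypothetical.

```python
MASK64 = (1 << 64) - 1


def range_size_u64(start_low: int, end_low: int) -> int:
    """Number of addresses in the range, as a wrapped unsigned 64-bit value."""
    return (end_low - start_low + 1) & MASK64


def pick_offset(seed: int, start_low: int, end_low: int) -> int:
    """Map seed to an offset inside the range, guarding the wrapped case."""
    size = range_size_u64(start_low, end_low)
    if size == 0:
        # Whole /64: every 64-bit value is a valid offset, so skip the modulo.
        return seed & MASK64
    return seed % size
```

Here range_size_u64(0, MASK64) wraps to 0, which is the case the guard in
pick_offset() avoids dividing by.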
Add support for a TTL parameter in --host-record and
--cname.
Add --dhcp-ttl option.
Add --tftp-mtu option. Thanks to Patrick McLean for the
initial patch.
Check return-code of inet_pton() when parsing dhcp-option.
Bad addresses could fail to generate errors and result in
garbage dhcp-options being sent. Thanks to Marc Branchaud
for spotting this.
Fix wrong value for EDNS UDP packet size when using
--servers-file to define upstream DNS servers. Thanks to
Scott Bonar for the bug report.
Move the dhcp_release and dhcp_lease_time tools from
contrib/wrt to contrib/lease-tools.
Add dhcp_release6 to contrib/lease-tools. Many thanks
to Sergey Nechaev for this code.
To avoid filling logs in configurations which define
many upstream nameservers, don't log more than 30 servers.
The number to be logged can be changed as SERVERS_LOGGED
in src/config.h.
Swap the values of BC_EFI and x86-64_EFI in --pxe-service.
These were previously wrong due to an error in RFC 4578.
If you're using BC_EFI to boot 64-bit EFI machines, you
will need to update your config.
Add ARM32_EFI and ARM64_EFI as valid architectures in
--pxe-service.
Fix PXE booting for UEFI architectures. Modify PXE boot
sequence in this case to force the client to talk to dnsmasq
over port 4011. This makes PXE and especially proxy-DHCP PXE
work with these architectures.
Workaround problems with UEFI PXE clients. There exist
in the wild PXE clients which have problems with PXE
boot menus. To work around this, when there's a single
--pxe-service which applies to the client, then that target
will be booted directly, rather than sending a
single-item boot menu.
Many thanks to Jarek Polok, Michael Kuron and Dreamcat4
for their work on the long-standing UEFI PXE problem.
Subtle change in the semantics of "basename" in
--pxe-service. The historical behaviour has always been
that the actual filename downloaded from the TFTP server
is <basename>.<layer> where <layer> is an integer which
corresponds to the layer parameter supplied by the client.
It's not clear what the function of the "layer"
actually is in the PXE protocol, and in practice layer
is always zero, so the filename is <basename>.0
The new behaviour is the same as the old, except when
<basename> includes a file suffix, in which case
the layer suffix is no longer added. This allows
sensible suffixes to be used, rather than the
meaningless ".0". Only in the unlikely event that you
have a config with a basename which already has a
suffix, is this an incompatible change, since the file
downloaded will change from name.suffix.0 to just
name.suffix
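The old and new filename rules described above can be sketched in Python as
follows. The function names are hypothetical, and the test used to decide
whether a basename "includes a file suffix" (a dot in the last path component)
is an assumption of this sketch, not necessarily the rule dnsmasq applies.

```python
def legacy_pxe_filename(basename: str, layer: int = 0) -> str:
    """Historical behaviour: always append the layer number."""
    return f"{basename}.{layer}"


def pxe_filename(basename: str, layer: int = 0) -> str:
    """New behaviour: append the layer only when basename has no suffix."""
    last_component = basename.rsplit("/", 1)[-1]
    if "." in last_component:
        # Assumed suffix test: keep the name exactly as configured.
        return basename
    return f"{basename}.{layer}"
```

Under these assumptions a basename of pxelinux still yields pxelinux.0, while
grub/bootx64.efi is requested as is instead of as grub/bootx64.efi.0.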
pisa is an html2pdf converter using ReportLab, HTML5lib and pyPDF.
It supports HTML 5 and CSS 2.1.
This package is obsolete. Please use print/py-weasyprint instead.
Release 0.48.0
core:
* Fix crashes and memory leaks in invalid files.
* Small memory usage improvements.
* TextOutputDev: Remove null characters from PDF text. Bug #97144
* TextOutputDev: Break words on all whitespace characters. Bug #97399
* Fix UTF16 decoding of document outline title. Bug #97156
* Add functions for named destination name in name-tree/dict
glib:
* Increase glib requirement to 2.41
Release 0.47.0
core:
* Fix abort on documents where the docinfo obj is not a dict. Bug #97134
* Check for XRefEntry existing before using it. Bug #97005
* Fix memory leak on PDFDoc::setDocInfoStringEntry() with empty string
* Don't presume that DocInfo is a dictionary in XRef::createDocInfoIfNoneExists()
build system:
* configure: Work with non gnu greps
* Now requires JDK 8.0.
* Use a different installation layout as some locations are now impossible to
change without patching the Java sources.
=== Breaking changes in 5.0 ===
Migration Plugin
- The elasticsearch-migration plugin (compatible with Elasticsearch 2.3.0
and above) will help you to find issues that need to be addressed when
upgrading to Elasticsearch 5.0.
Indices created before 5.0
- Elasticsearch 5.0 can read indices created in version 2.0 or above. An
Elasticsearch 5.0 node will not start in the presence of indices created
in a version of Elasticsearch before 2.0.
- Indices created in Elasticsearch 1.x or before will need to be reindexed
with Elasticsearch 2.x in order to be readable by Elasticsearch 5.x. It is
not sufficient to use the upgrade API.
=== Breaking changes
Aggregations::
- Remove size 0 options in aggregations
Aliases::
- make get alias expand to open and closed indices by default
- Remove deprecated indices.get_aliases
Allocation::
- Remove DisableAllocationDecider
Analysis::
- Remove `token_filter` in _analyze API
- Removes support for adding aliases to analyzers
- Analyze API : Rename filters/token_filters/char_filter in Analyze API in
master
CAT API::
- Improve cat thread pool API
- Row-centric output for _cat/fielddata
- Add raw recovery progress to cat recovery API
- Remove host from cat nodes API
- Using the accept header in the request instead of content-type in _cat
API.
CRUD::
- Fixed naming inconsistency for fields/stored_fields in the APIs
- Disallow creating indices starting with '-' or '+'
- Wait for changes to be visible by search
- Remove object notation for core types.
Cache::
- Remove deprecated query cache settings
Cluster::
- Persistent Node Ids
- Remove validation errors from cluster health response
- Remove memory section
Core::
- Remove ignore system bootstrap checks
- Remove minimum master nodes bootstrap check
- Keep input time unit when parsing TimeValues
- Remove cluster name from data path
- Add max number of processes check
- Add mlockall bootstrap check
- One log
Engine::
- Optimize indexing for the autogenerated ID append-only case
- Remove `index.compound_on_flush` setting and default to `true`
Exceptions::
- Die with dignity
Fielddata::
- Remove "uninverted" and "binary" fielddata support for numeric and boolean
fields.
Geo::
- Deprecate GeoDistance enums and remove geo distance script helpers
Index APIs::
- Removes write consistency level across replication action APIs in favor of
wait_for_active_shards
- Remove `GET` option for /_forcemerge
- Remove /_optimize REST API endpoint
Indexed Scripts/Templates::
- Store indexed scripts in the cluster state instead of the `.scripts` index
Inner Hits::
- Also do not serialize `_index` key in search response for parent/child
inner hits
- Don't include `_id`, `_type` and `_index` keys in search response for
inner hits
- Nested inner hits shouldn't use relative paths
- Drop top level inner hits in favour of inner hits defined in the query dsl
Internal::
- `_flush` should block by default
- Actually bound the generic thread pool
- Remove support for pre 2.0 indices
Logging::
- Introduce Log4j 2
Mapping::
- Remove `_timestamp` and `_ttl` on 5.x indices.
- Add a soft limit on the mapping depth.
- Disable fielddata on text fields by default.
- Add limit to total number of fields in mapping
- Change the field mapping index time boost into a query time boost.
- Deprecate string in favor of text/keyword.
- Term vector APIs should no longer update mappings
- Remove the `format` option of the `_source` field.
- Remove transform
Packaging::
- Rename service.bat to elasticsearch-service.bat
- Remove -D handling in args for windows plugin script
- Set default min heap equal to default max heap
- Remove allow running as root
- Require /bin/bash in packaging
- Remove plugin script parsing of system properties
- Add JVM options configuration file
Parent/Child::
- Removed `total` score mode in favour of `sum` score mode.
- Removed pre 2.x parent child implementation
Percolator::
- Remove `.percolator` type in favour of `percolator` field type
- Change the percolate api to not dynamically add fields to mapping
Plugin Delete By Query::
- Remove Delete-By-Query plugin
Plugin Lang Painless::
- Remove all date 'now' methods from Painless
- Make Painless the Default Language
Plugins::
- Plugins cleanup
- Rename bin/plugin to bin/elasticsearch-plugin
- Change the inner structure of the plugins zip
- Remove multicast plugin
- Plugins: Remove site plugins
Query DSL::
- Lessen leniency of the query dsl.
- Function score query: remove deprecated support for boost_factor
- Remove support for deprecated queries.
REST::
- Change separator for shards preference
- Parameter improvements to Cluster Health API wait for shards
- Switch indices.exists_type from `{index}/{type}` to
`{index}/_mapping/{type}`.
- Only use `PUT` for index creation, not POST.
- Remove camelCase support
- Remove 'case' parameter from rest apis
- Disallow unquoted field names
- Limit the accepted length of the _id
Scripting::
- Hardcode painless as the default scripting lang and add legacy script
default for stored scripts
- Remove deprecated 1.x script and template syntax
- Allow only a single extension for a scripting engine
- Remove 'sandbox' option for script settings, allow only registering a
single language.
Search::
- Rename `fields` to `stored_fields` and add `docvalue_fields`
- Remove only node preference
- Add search preference to prefer multiple nodes
- Add a soft limit on the number of shards that can be queried in a single
search request.
- Remove deprecated reverse option from sorting
- Remove some deprecations
- Remove search exists api
- Remove the scan and count search types.
Search Refactoring::
- Remove deprecated parameter from field sort builder.
- Remove "query" query and fix related parsing bugs
Settings::
- Default max local storage nodes to one
- Persistent Node Names
- Remove support for properties
- Rename bootstrap.mlockall to bootstrap.memory_lock
- Register `indices.query.bool.max_clause_count` setting
- Remove settings and system properties entanglement
- Remove `action.get.realtime` setting
- Remove ability to specify arbitrary node attributes with `node.` prefix
- Enforce `discovery.zen.minimum_master_nodes` is set when bound to a public
ip
- Prevent index level setting from being configured on a node level
- Remove support for node.client setting
- Remove es.max-open-files flag
- Enforce node level limits if node is started in production env
- Make settings validation strict
- Remove the ability to fsync on every operation and only schedule fsync
task if really needed
- Script settings
- Remove index.flush_on_close entirely
- Restore chunksize of 512kb on recovery and remove configurability
- Remove ancient deprecated and alternative recovery settings
Similarities::
- Renames `default` similarity into `classic`
Snapshot/Restore::
- Change the default of `include_global_state` from true to false for
snapshot restores
- Fail closing or deleting indices during a full snapshot
Stats::
- Modify load average format
- Reintroduce five-minute and fifteen-minute load averages on Linux
- Add system CPU percent to OS stats
Term Vectors::
- Remove DFS support from TermVector API
Translog::
- Drop support for simple translog and hard-wire buffer to 8kb
- Simplify translog-based flush settings
Warmers::
- Remove query warmers and the warmer API.
=== Breaking Java changes
Aggregations::
- getKeyAsString and key_as_string should be the same for terms aggregation
on boolean field
Allocation::
- Move parsing of allocation commands into REST and remove support for
plugins to register allocation commands
- Simplify shard balancer interface
Analysis::
- Simplify Analysis registration and configuration
CRUD::
- Removing isCreated and isFound from the Java API
Cache::
- Refactor IndicesRequestCache to make it testable.
- Fold IndexCacheModule into IndexModule
Core::
- Remove ability to plug-in TransportService
- Register thread pool settings
- Bootstrap does not set system properties
- Remove es.useLinkedTransferQueue
Discovery::
- Introduce node handshake
- Include pings from client nodes in master election
Highlighting::
- Register Highlighter instances instead of classes
Internal::
- Remove TransportService#registerRequestHandler leniency
- Consolidate search parser registries
- Move all FetchSubPhases to o.e.search.fetch.subphase
- Squash the rest of o.e.rest.action
- Clean up BytesReference
- Cleanup ClusterService dependencies and detached from Guice
- Simplify SubFetchPhase interface
- Simplify FetchSubPhase registration and detach it from Guice
- Remove duplicate getters from DiscoveryNode and DiscoveryNodes
- Cli: Switch to jopt-simple
- Replace ContextAndHeaders with a ThreadPool based ThreadLocal
implementation
- Remove NodeBuilder
- Fix IndexSearcherWrapper interface to not depend on the EngineConfig
- Cleanup query parsing and remove IndexQueryParserService
- Remove circular dependency between IndicesService and IndicesStore
- Remove guice injection from IndexStore and friends
- Replace IndicesLifecycle with a per-index IndexEventListener
- Simplify similarity module and friends
- Refactor SearchRequest to be parsed on the coordinating node
Java API::
- Add a dedicated client/transport project for transport-client
- Remove setRefresh
- Remove the count api
- IdsQueryBuilder to accept only non null ids and types
Mapping::
- [Mapping] Several MappingService cleanups
Network::
- Factor out abstract TCPTransport* classes to reduce the netty footprint
- Remove ability to disable Netty gathering writes
Parent/Child::
- Cleanup ParentFieldMapper
- Several other parent/child cleanups
Percolator::
- Move the percolator from core to its own module
- Remove percolator cache
Plugins::
- Cleanup sub fetch phase extension point
- Remove IndexTemplateFilter
- Switch custom ShardsAllocators to pull based model
- Make custom allocation deciders use pull based extensions
- Migrate query registration from push to pull
- Add components getter as bridge between guice and new plugin init world
- Remove CustomNodeAttributes extension point
- Add RepositoryPlugin interface for registering snapshot repositories
- Simplified repository api for snapshot/restore
- Switch most search extensions from push to pull
- Move RestHandler registration to ActionModule and ActionPlugin
- Pull actions from plugins
- Switch analysis from push to pull
- Remove guice from Mapper plugins
- Fail to start if plugin tries broken onModule
- Simplify ScriptModule and script registration
- Cut over settings registration to a pull model
- Enforce isolated mode for all plugins
- Don't use guice for QueryParsers
- Remove guice from the index level
- Remove shard-level injector
Query DSL::
- Remove the MissingQueryBuilder which was deprecated in 2.2.0.
- Remove NotQueryBuilder
Scripting::
- Remove o.e.script.Template class and move template query to lang-mustache
module
- Move search template to lang-mustache module
- Remove LeafSearchScript.runAsFloat(): Nothing calls it.
Search::
- Remove FetchSubPhaseParseElement
- Refactor of query profile classes to make way for other profile
implementations
- Query refactoring: split parse phase into fromXContent and toQuery for all
queries
Search Refactoring::
- Refactored inner hits parsing and introduced InnerHitBuilder
- Remove support for query_binary and filter_binary
- Validate query api: move query parsing to the coordinating node
Settings::
- Remove `node.mode` and `node.local` settings
- Remove Settings.settingsBuilder.
- Move remaining settings in NettyHttpServerTransport to the new infra
- Replace IndexSettings annotation with a full-fledged class
- Fix ping timeout settings inconsistencies
Snapshot/Restore::
- Removes extra writeBlob method in BlobContainer
Store::
- Standardize state format type for global and index level metadata
Suggesters::
- Remove suggest threadpool
- Remove suggest transport action
=== Deprecations
CRUD::
- Deprecate found and created in delete and index rest responses
Plugin Discovery Azure Classic::
- Deprecate discovery-azure and rename it to discovery-azure-classic
Plugin Mapper Attachment::
- Deprecate mapper-attachments plugin
Query DSL::
- Deprecate Indices query
- Deprecate mlt, in and geo_bbox query name shortcuts
Query Refactoring::
- Splits `phrase` and `phrase_prefix` in match query into
`MatchPhraseQueryBuilder` and `MatchPhrasePrefixQueryBuilder`
Scripting::
- Deprecate Groovy, Python, and Javascript
Search::
- Deprecate fuzzy query
Templates::
- Deprecate template query
=== New features
Aggregations::
- Split regular histograms from date histograms.
- Adds aggregation profiling to the profile API
- New Matrix Stats Aggregation module
Aliases::
- Add an alias action to delete an index
Allocation::
- Add API to explain why a shard is or isn't assigned
Analysis::
- Exposing lucene 6.x minhash filter.
- Add `fingerprint` token filter and `fingerprint` analyzer
Circuit Breakers::
- Circuit break on aggregation bucket numbers with request breaker
Discovery::
- Add two phased commit to Cluster State publishing
Geo::
- Cut over geo_point field and queries to new LatLonPoint type
Index APIs::
- Add rollover API to switch index aliases given some predicates
Ingest::
- ingest-useragent plugin
- Add a Sort ingest processor
- Add date_index_name processor
- Merge feature/ingest branch into master branch
Java REST Client::
- Introduce async performRequest method
- Low level Rest Client
Mapping::
- Add `scaled_float`.
- Expose half-floats.
- Add a text field.
- Add a new `keyword` field.
Percolator::
- index the query terms from the percolator query
Plugin Analysis ICU::
- Adding support for customizing the rule file in ICU tokenizer
Plugin Discovery File::
- File-based discovery plugin
Plugin Ingest Attachment::
- Ingest: Add attachment processor
Plugin Mapper Attachment::
- Migrate mapper attachments plugin to main repository
Plugin Repository HDFS::
- HDFS Snapshot/Restore plugin
Plugin Repository S3::
- Add support for path_style_access
Query DSL::
- Adds a rewrite phase to queries on the shard level
Reindex API::
- Reindex from remote
- Port Delete By Query to Reindex infrastructure
- Merge reindex to master
Scripting::
- Exceptions and Infinite Loop Checking
- Added a new scripting language (PlanA)
Scroll::
- Add the ability to partition a scroll in multiple slices.
Search::
- Add the ability to disable the retrieval of the stored fields entirely
- Add `search_after` parameter in the SearchAPI
Settings::
- Add infrastructure to transactionally apply and reset dynamic settings
Snapshot/Restore::
- Add Google Cloud Storage repository plugin
Stats::
- Extend field stats to report searchable/aggregatable fields
- API for listing index file sizes
Store::
- Expose MMapDirectory.preLoad().
- Add primitive to shrink an index into a single shard
Suggesters::
- Add support for returning documents with completion suggester
- Add document-oriented completion suggester
Task Manager::
- Add task cancellation mechanism
- Make the Task object available to the action caller
- Task Management: Add framework for registering and communicating with
tasks
Translog::
- Add `elasticsearch-translog` CLI tool with `truncate` command
=== Enhancements
Aggregations::
- Make the heuristic to compute the default shard size less aggressive.
- Add _bucket_count option to buckets_path
- Remove AggregationStreams
- Migrate serial_diff aggregation to NamedWriteable
- Migrate most remaining pipeline aggregations to NamedWriteable
- Migrate moving_avg pipeline aggregation to NamedWriteable
- Migrate matrix_stats to NamedWriteable
- Migrate derivative pipeline aggregation to NamedWriteable
- Migrate top_hits, histogram, and ip_range aggregations to NamedWriteable
- Migrate nested, reverse_nested, and children aggregations to
NamedWriteable
- Migrate geohash_grid and geo_bounds aggregations to NamedWriteable
- Clean up significant terms aggregation results
- Migrate range, date_range, and geo_distance aggregations to NamedWriteable
- Migrate terms aggregation to NamedWriteable
- Migrate sampler and missing aggregations to NamedWriteable
- Migrate global, filter, and filters aggregation to NamedWriteable
- Migrate the cardinality, scripted_metric, and geo_centroid aggregations to
NamedWriteable
- Use a static default precision for the cardinality aggregation.
- Migrate more aggregations to NamedWriteable
- Migrate stats and extended stats to NamedWriteable
- Migrate sum, min, and max aggregations over to NamedWriteable
- Start migration away from aggregation streams
- Automatically set the collection mode to breadth_first in the terms
aggregation when the cardinality of the field is unknown or smaller than
the requested size.
- Rename PipelineAggregatorBuilder to PipelineAggregationBuilder.
- AggregatorBuilder and PipelineAggregatorBuilder do not need generics.
- Rename AggregatorBuilder to AggregationBuilder
- Add the ability to use the breadth_first mode with nested aggregations
(such as `top_hits`) which require access to score information.
- Make significant terms work on fields that are indexed with points.
- Add tests and documentation for using `time_zone` in date range
aggregation
- Fixes serialisation of Ranges
Allocation::
- Verify AllocationIDs in replication actions
- Mark shard as stale on non-replicated write, not on node shutdown
- Add routing changes API to RoutingAllocation
- Primary shard allocator observes limits in forcing allocation
- Use primary terms as authority to fail shards
- Add recovery source to ShardRouting
- Allow `_shrink` to N shards if source shards is a multiple of N
- Only filter initial recovery (post API) when shrinking an index
- Estimate shard size for shrinked indices
- Only fail relocation target shard if failing source shard is a primary
- Simplify delayed shard allocation
- Limit retries of failed allocations per index
- Immutable ShardRouting
- Add the shard's store status to the explain API
- Write shard state metadata as soon as shard is created / initializing
- Reuse existing allocation id for primary shard allocation
- Remove version in ShardRouting (now obsolete)
- Prefer nodes that previously held primary shard for primary shard
allocation
- Extend reroute with an option to force assign stale primary shard copies
- Allocate primary shards based on allocation IDs
- Persist currently started allocation IDs to index metadata
- Use ObjectParser to parse AllocationID
- Persist allocation ID with shard state metadata on nodes
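To illustrate the `_shrink` entry in the Allocation list above ("N shards if
source shards is a multiple of N"), a small Python sketch that lists the shard
counts satisfying that rule. The function name is hypothetical, and limiting
targets to counts strictly smaller than the source is an assumption of this
sketch.

```python
from typing import List


def valid_shrink_targets(source_shards: int) -> List[int]:
    """Shard counts N < source_shards where source_shards is a multiple of N."""
    return [n for n in range(1, source_shards) if source_shards % n == 0]
```

For an index with 8 primary shards this gives 1, 2 and 4; for 15 shards it
gives 1, 3 and 5.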
Analysis::
- Stop using cached component in _analyze API
- Specify custom char_filters/tokenizer/token_filters in the analyze API
- Add a MultiTermAwareComponent marker interface to analysis factories.
- Add Flags Parameter for Char Filter
- Core: better error message when analyzer created without tokenizer or…
- Move AsciiFolding earlier in FingerprintAnalyzer filter chain
- Improve error message if resource files have illegal encoding
Benchmark::
- Add client-benchmark-noop-api-plugin to stress clients even more in
benchmarks
CAT API::
- Add health status parameter to cat indices API
- Includes the index UUID in the _cat/indices API
- Add node name to Cat Recovery
- Add support for documented byte/size units and for micros as a time unit
in _cat API
- Add _cat/tasks
- Cat health supports ts=0 option
- Expose http address in cat/nodes
- [cat/recovery] Make recovery time a TimeValue()
- :CAT API: remove space at the end of a line
CRUD::
- Renaming operation to result and reworking responses
- Adding _operation field to index, update, delete response.
- CRUD: Allow to get and set ttl as a time value/string
Cache::
- Enable option to use request cache for size > 0
- Cache FieldStats in the request cache
- Allow the query cache to be disabled.
- Enable the indices request cache by default
Circuit Breakers::
- Cluster Settings Updates should not trigger circuit breakers.
- Circuit break the number of inline scripts compiled per minute
Cluster::
- Add clusterUUID to RestMainAction output
- Batch process node left and node failure
- Index creation waits for write consistency shards
- Inline reroute with process of node join/master election
- Index creation does not cause the cluster health to go RED
- Cluster Health class improvements
- Adds tombstones to cluster state for index deletions
- Enable acked indexing
- Cluster Health should run on applied states, even if waitFor=0
- Resolve index names to Index instances early
- Remove DiscoveryNode#shouldConnectTo method
- Fail demoted primary shards and retry request
- Illegal shard failure requests
- Shard failure requests for non-existent shards
- Add handling of channel failures when starting a shard
- Wait for new master when failing shard
- Master should wait on cluster state publication when failing a shard
- Split cluster state update tasks into roles
- Add timeout mechanism for sending shard failures
- Add listener mechanism for failures to send shard failed
Core::
- Add production warning for pre-release builds
- Add serial collector bootstrap check
- Rename Netty TCP transports thread factories from http_* to transport_*
- Do not log full bootstrap checks exception
- Mark halting the virtual machine as privileged
- Makes index creation more friendly
- Clearer error when handling fractional time values
- Read Elasticsearch manifest via URL
- Throw if the local node is not set
- Bootstrap check for OnOutOfMemoryError and seccomp
- Log OS and JVM on startup
- Add GC overhead logging
- Refactor JvmGcMonitorService for testing
- Default to server VM and add client VM check
- Add system bootstrap checks escape hatch
- Avoid sliced locked contention in internal engine
- Add heap size bootstrap check
- Remove hostname from NetworkAddress.format
- Bootstrapping bootstrap checks
- Add max map count check
- Remove PROTOTYPE from BulkItemResponse.Failure
- Throw an exception if Writeable.Reader reads null
- Remove PROTOTYPE from RescorerBuilders
- Port Primary Terms to master
- Use index UUID to lookup indices on IndicesService
- Add -XX:+AlwaysPreTouch JVM flag
- Add max size virtual memory check
- Use and test relative time in TransportBulkAction
- Bump Elasticsearch version to 5.0.0-SNAPSHOT
- Assert that we can write in all data-path on startup
- Add G1GC check on startup
- Shards with heavy indexing should get more of the indexing buffer
- Remove and ban ImmutableMap
- Finish banning ImmutableSet
- Removes and bans ImmutableSet
- Remove and ban ImmutableMap#entrySet
- Forbid ForwardingSet
Dates::
- Improve TimeZoneRoundingTests error messages
- Support full range of Java Long for epoch DateTime
Discovery::
- Do not log cluster service errors after joining a master
- Log warning if minimum_master_nodes set to less than quorum
- Add a dedicated queue for incoming ClusterStates
Engine::
- Only try to read new segments info if we really flushed the index
- Use _refresh instead of reading from Translog in the RT GET case
- Remove writeLockTimeout from InternalEngine
- Don't guard IndexShard#refresh calls by a check to isRefreshNeeded
- Never call a listener under lock in InternalEngine
- Use System.nanoTime() to initialize Engine.lastWriteNanos
- Flush big merges automatically if shard is inactive
- Remove Engine.Create
- Remove the disabled autogenerated id optimization from InternalEngine
Exceptions::
- Improve startup exception
- Make NotMasterException a first class citizen
- Do not catch throwable
- Make the index-too-old exception more explicit
- Add index name in IndexAlreadyExistsException default message
- Fix typos in exception/assert/log messages in core module.
- Add field names to several mapping errors
- Add serialization support for more important IOExceptions
- Adds exception objects to log messages.
- Add stack traces to logged exceptions where missing
- Remove reflection hacks from ElasticsearchException
- Rename QueryParsingException to a more generic ParsingException
- Add *Exception(Throwable cause) constructors/ call where appropriate
Expressions::
- improve date api for expressions/painless fields
- Support geo_point fields in lucene expressions
- Add support for .empty to expressions, and some docs improvements
Geo::
- GeoBoundingBoxQueryBuilder should throw IAE when topLeft and bottomRight
are the same coordinate
- Enhanced lat/long error handling
- Fix a potential parsing problem in GeoDistanceSortParser
- Geo: Add validation of shapes to ShapeBuilders
- Make remaining ShapeBuilders implement Writeable
- Geo: Remove internal `translated` flag from LineStringBuilder
- Make PointBuilder, CircleBuilder & EnvelopeBuilder implement Writeable
- Merging BaseLineString and BasePolygonBuilder with subclass
- Moving static factory methods to ShapeBuilders
- Remove InternalLineStringBuilder and InternalPolygonBuilder
Highlighting::
- Switch Highlighting to ObjectParser
- Use HighlightBuilder in SearchSourceBuilder
- Joint parsing of common global Highlighter and subfield parameters
- Enable HighlightBuilder to create SearchContextHighlight
- Add fromXContent method to HighlightBuilder
Index APIs::
- Add date-math support to `_rollover`
- Add Shrink request source parser to parse create index request body
- Fail hot_threads in a better way if unsupported by JDK
Index Templates::
- Add "version" field to Templates
- Parse and validate mappings on index template creation
Ingest::
- Add "version" field to Pipelines
- Make it possible for Ingest Processors to access AnalysisRegistry
- add ignore_missing option to convert,trim,lowercase,uppercase,grok,rename
- Add support for parameters to the script ingest processor
- introduce the JSON Processor
- Allow rename processor to turn leaf fields into branch fields
- remove ability to set field value in script-processor configuration
- Add REST _ingest/pipeline to get all pipelines
- Show ignored errors in verbose simulate result
- update foreach processor to only support one applied processor.
- Skip the execution of an empty pipeline
- Add `ignore_failure` option to all ingest processors
- new ScriptProcessor for Ingest
- Expose underlying processor to blame for thrown exception within
CompoundProcessor
- Avoid string concatenation in IngestDocument.FieldPath
- add ability to specify multiple grok patterns
- add option to disable overriding values of existing fields in
set processor
- Streamline option naming for several processors
- add automatic type conversion support to ConvertProcessor
- Give the foreach processor access to the rest of the document
- Added ingest statistics to node stats API
- Add `ingest_took` to bulk response
- Add ingest info to node info API, which contains a list of available
processors
- Use diffs for ingest metadata in cluster state
- hide null-valued metadata fields from WriteableIngestDocument#toXContent
- Ingest: use bulk thread pool for bulk request processing (was index
before)
- Add foreach processor
- revert PipelineFactoryError handling with throwing
ElasticsearchParseException in ingest pipeline creation
- Add processor tags to on_failure metadata in ingest pipeline
- catch processor/pipeline factory exceptions and return structured error
responses
- Ingest: move get/put/delete pipeline methods to ClusterAdminClient
- Geoip processor: remove redundant latitude and longitude fields and make
location an object with lat and lon subfields
Inner Hits::
- Change scriptFields member in InnerHitBuilder to set
Internal::
- Remove poor-mans compression in InternalSearchHit and friends
- Don't register SearchTransportService handlers more than once
- Unguice SearchModule
- Deguice SearchService and friends
- NodeStats classes to implement Writeable rather than Streamable
- More info classes to implement Writeable rather than Streamable
- Internal: Split disk threshold monitoring from decider
- Switching LockObtainFailedException over to ShardLockObtainFailedException
- update and delete by query requests to implement
IndicesRequest.Replaceable
- VersionFetchSubPhase should not use Versions#loadDocIdAndVersion
- Remove useless PK lookup in IndicesTTLService
- ignore some docker craziness in seccomp environment checks
- Make Priority an enum
- Snapshot UUIDs in blob names
- Add RestController method for deprecating in one step
- Tighten ensure atomic move cleanup
- Enable checkstyle ModifierOrder
- Expose task information from NodeClient
- Changed rest handler interface to take NodeClient
- Deprecate ExceptionsHelper.detailedMessage
- Factor out ChannelBuffer from BytesReference
- Cleanup Compressor interface
- Hot methods redux
- Remove forked joda time BaseDateTime class
- Support optional ctor args in ConstructingObjectParser
- Remove thread pool from page cache recycler
- Do not automatically close XContent objects/arrays
- Remove use of a Fields class in snapshot responses
- Removes multiple toXContent entry points for SnapshotInfo
- Removes unused methods in the o/e/common/Strings class
- Determine content length eagerly in HttpServer
- Consolidate query generation in QueryShardContext
- Make reset in QueryShardContext private
- Remove Strings#splitStringToArray
- Add toString() to GetResponse
- ConstructingObjectParser adapts ObjectParser for ctor args
- Makes Script type writeable
- FiltersAggregatorBuilder: Don't create new context for inner parsing
- Clean up serialization on some stats
- Normalize registration for SignificanceHeuristics
- Make (read|write)NamedWriteable public
- Use try-with-resource when creating new parser instances where possible
- Don't pass XContentParser to ParseFieldRegistry#lookup
- Internal: Remove threadlocal from document parser
- Cut range aggregations to registerAggregation
- Remove ParseFieldMatcher from AbstractXContentParser
- Remove parser argument from methods where we already pass in a parse
context
- Switch SearchAfterBuilder to writeGenericValue
- Remove StreamableReader
- Cleanup nested, has_child & has_parent query builders for inner hits
construction
- Make AllocationCommands NamedWriteables
- Isolate StreamableReader
- Create registration methods for aggregations similar to those for queries
- Remove PROTOTYPEs from QueryBuilders
- Remove registerQueryParser
- ParseField#getAllNamesIncludedDeprecated to not return duplicate names
- Rework a query parser and improve registration
- Clean up QueryParseContext and don't hold it inside
QueryRewrite/ShardContext
- Remove PROTOTYPE from MLT.Item
- Remove PROTOTYPE from VersionType
- Remove PROTOTYPEs from highlighting
- Remove PROTOTYPEs from ingest
- Start to rework query registration
- Factor out slow logs into Search and IndexingOperationListeners
- Remove PROTOTYPE from Suggesters
- Remove PROTOTYPE from SortBuilders
- Remove PROTOTYPE from ShapeBuilders
- Replace FieldStatsProvider with a method on MappedFieldType.
- Stop using PROTOTYPE in NamedWriteableRegistry
- Support scheduled commands in current context
- Thread limits
- Remove leniency from segments info integrity checks
- Rename SearchServiceTransportAction to SearchTransportService
- Decouple the TransportService and ClusterService
- Refactor bootstrap checks
- Add LifecycleRunnable
- Hot inlined methods in your area
- Move IndicesQueryCache and IndicesRequestCache into IndicesService
- Forbid use of java.security.MessageDigest#clone()
- Make IndicesWarmer a private class of IndexService
- Simplify IndicesFieldDataCache and detach from guice
- Uppercase ells ('L') in long literals
- ShardId equality and hash code inconsistency
- Ensure all resources are closed on Node#close()
- Make index uuid available in Index, ShardRouting & ShardId
- Move RefreshTask into IndexService and use single task per index
- Make IndexingMemoryController private to IndicesService
- Cleanup IndexingOperationListeners infrastructure
- Remove and forbid use of j.u.c.ThreadLocalRandom
- Fix IntelliJ query builder type inference issues
- Remove and forbid use of Collections#shuffle(List) and Random#<init>()
- Remove and forbid use of the type-unsafe empty Collections fields
- Move IndicesService.canDeleteShardContent to use IndexSettings
- Simplify MonitorService construction and detach from guice
- Use Supplier for StreamInput#readOptionalStreamable
- Add variable-length long encoding
- Extend usage of IndexSetting class
- Fold SimilarityModule into IndexModule
- Move to lucene BoostQuery
- Use built-in method for computing hash code of longs
- Refactor ShardFailure listener infrastructure
- Add methods for variable-length encoding integral arrays
- Fold IndexAliasesService into IndexService
- Remove unneeded Module abstractions
- Query refactoring: simplify IndexQueryParserService parse methods
- Remove and forbid use of com.google.common.collect.Iterators
- Remove and forbid use of com.google.common.collect.ImmutableCollection
- Remove and forbid use of com.google.common.io.Resources
- Remove and forbid use of com.google.common.hash.*
- Remove and forbid use of com.google.common.net.InetAddresses
- Remove and forbid use of com.google.common.collect.EvictingQueue
- Replace Guava cache with simple concurrent LRU cache
- Remove ClusterService and IndexSettingsService dependency from IndexShard
- Start making RecoverySourceHandler unittestable
- Remove IndexService dep. from IndexShard
- Remove ES internal deletion policies in favour of Lucene's implementations
- Move ShardTermVectorService to be on indices level as TermVectorService
- Move ShardPercolateService creation into IndexShard
- Remove `ExpressionScriptCompilationException` and
`ExpressionScriptExecutionException`
- Reduced the number of ClusterStateUpdateTask variants
- Add a BaseParser helper for stream parsing
- Remove and forbid use of com.google.common.primitives.Ints
- Remove and forbid use of com.google.common.math.LongMath
- Remove and forbid use of com.google.common.base.Joiner
- Replace and ban next batch of Guava classes
- Remove and forbid use of com.google.common.collect.Iterables
- Replace LoadingCache usage with a simple ConcurrentHashMap
- Use Supplier instead of Reflection
- Remove and forbid use of com.google.common.base.Preconditions
- Remove and forbid use of guava Function, Charsets, Collections2
- Remove and forbid use of com.google.common.collect.ImmutableSortedMap
- Remove and forbid use of several com.google.common.util. classes
- Cleanup SearchRequest & SearchRequestBuilder
- Remove and forbid use of com.google.common.collect.Queues
- Remove and forbid use of com.google.common.base.Preconditions#checkNotNull
- Remove and forbid use of com.google.common.collect.Sets
- Remove and forbid use of com.google.common.collect.Maps
- Remove use of underscore as an identifier
- Remove and forbid the use of com.google.common.base.Predicate(s)?
- Remove com.google.common.io
Java API::
- Ensure PutMappingRequest.buildFromSimplifiedDef input are pairs
- Start from a random node number so that clients do not overload the first
node configured
- Switch QueryBuilders to new MatchPhraseQueryBuilder
- Improve adding clauses to `span_near` and `span_or` query
- QueryBuilder does not need generics.
- Remove copy constructors from request classes and TransportMessage type
Java REST Client::
- Add support for a RestClient path prefix
- Add "Async" to the end of each Async RestClient method
- Allow RestClient to send array-based headers
- Add response body to ResponseException error message
- Simplify Sniffer initialization and automatically create the default
HostsSniffer
- Remove duplicate dependency declaration for http client
- Add callback to customize http client settings
- Rest Client: add short performRequest method variants without params
and/or body
Logging::
- Ensure logging is initialized in CLI tools
- Give useful error message if log config is missing
- Complete Elasticsearch logger names
- Add node name to decider trace logging
- Logging shutdown hack
- Disable console logging
- Skip loading of jansi from log4j2
- Configure AWS SDK logging configuration
- Warn if unsupported logging configuration present
- Size limit deprecation logs
- Increase visibility of deprecation logger
- Add log message about enforcing bootstrap checks
- Improve logging for batched cluster state updates
- Send HTTP Warning Header(s) for any Deprecation Usage from a REST request
- Throw IllegalStateException when handshake fails due to version or cluster
mismatch
Mapping::
- Automatically downgrade text and keyword to string on indexes imported
from 2.x
- Do not parse numbers as both strings and numbers when not included in
`_all`.
- Don't index the `_version` field
- The root object mapper should support updating `numeric_detection`,
`date_detection` and `dynamic_date_formats`.
- Automatically upgrade analyzed string fields that have `index_options` or
`position_increment_gap` set.
- Mappings: Support dots in field names in mapping parsing
- Save one utf8 conversion in KeywordFieldMapper.
- Do not parse the created version from the settings every time a field is
parsed.
- Elasticsearch should reject dynamic templates with unknown
`match_mapping_type`.
- Upgrade `string` fields to `text`/`keyword` even if `include_in_all` is
set.
- Adds methods to find (and dynamically create) the mappers for the
parents of a field with dots in the field name
- Automatically upgrade analyzed strings with an analyzer to `text`.
- Support dots in field names when mapping already exists
- Use the new points API to index numeric fields.
- Simplify AllEntries, AllField and AllFieldMapper.
- Make `parseMultiField` part of `parseField`.
- Automatically add a sub keyword field to string dynamic mappings.
- Remove friction from the mapping changes in 5.0.
- Rework norms parameters for 5.0.
- Moved dynamic field handling in doc parsing to end of parsing
- Remove the MapperBuilders utility class.
- Make the `index` property a boolean.
- Remove the ability to enable doc values with the `fielddata.format`
setting.
- Be stricter about parsing boolean values in mappings.
- Fix default doc values to be enabled when a field is not indexed.
- Dynamically map floating-point numbers as floats instead of doubles.
- Simplify MetaDataMappingService.
- Remove MergeMappingException.
Network::
- Avoid early initializing Netty
- Network: Allow to listen on virtual interfaces.
- Explicitly tell Netty to not use unsafe
- Enable Netty 4 extensions
- Modularize netty
- Simplify TcpTransport interface by reducing send code to a single send
method
- Do not start scheduled pings until transport start
Packaging::
- Add quiet option to disable console logging
- Explicitly disable Netty key set replacement
- Remove explicit parallel new GC flag
- Use JAVA_HOME or java.exe in PATH like the Linux scripts do
- Don't mkdir directly in deb init script
- Increase default heap size to 2g
- Switch init.d scripts to use bash
- Switch scripts to use bash
- Further simplifications of plugin script
- Pass ES_JAVA_OPTS to JVM for plugins script
- Remove unnecessary sleep from init script restart
- Explicitly set packaging permissions
- rpm uses non-portable `--system` flag to `useradd`
- Adding JAVA_HOME to documents and env config file
- Added RPM metadata
- Elasticsearch ownership for data, logs, and configs
- Fail early on JDK with compiler bug
- Make security non-optional
- Remove RuntimePermission("accessDeclaredMembers")
- Remove Guava as a dependency
Percolator::
- Also support query term extract for queries wrapped inside a
FunctionScoreQuery
- Add support for synonym query to percolator query term extraction
- Add percolator query extraction support for dismax query
- Improve percolate query performance by not verifying certain candidate
matches
- Improve percolator query term extraction
- PercolatorQueryBuilder cleanup by using MemoryIndex#fromDocument(...)
helper
- Add scoring support to the percolator query
- Add query extract support for the blended term query and the common terms
query
- Add support for several span queries in ExtractQueryTermsService
- Add support for TermsQuery in ExtractQueryTermsService
- Replace percolate APIs with a percolator query
Plugin Analysis Kuromoji::
- Add nbest options and NumberFilter
Plugin Discovery EC2::
- Use `DefaultAWSCredentialsProviderChain` AWS SDK class for credentials
- Support new Asia Pacific (Mumbai) ap-south-1 AWS region
- Add support for proxy authentication for s3 and ec2
Plugin Discovery GCE::
- Allow `_gce_` network when not using discovery gce
Plugin Ingest Attachment::
- Minor attachment processor improvements
Plugin Lang Painless::
- Disable regexes by default in painless
- Catch OutOfMemory and StackOverflow errors in Painless
- Change Painless Tree Structure for Variable/Method Chains
- Add replaceAll and replaceFirst
- Painless Initializers
- Add augmentation
- Infer lambda arguments/return type
- Fix explicit casts and improve tests.
- Add lambda captures
- improve Debugger to print code even if it hits exception
- Move semicolon hack into lexer
- Add flag support to regexes
- improve lambda syntax (allow single expression)
- Remove useless dropArguments in megamorphic cache
- non-capturing lambda support
- fix bugs in operators and more improvements for the dynamic case
- improve unary operators and cleanup tests
- Add support for the find operator (=~) and the match operator (==~)
- Remove casts and boxing for dynamic math
- Refactor def math
- Add support for /regex/
- Array constructor references
- Method references to user functions
- Add } as a delimiter.
- Add Lambda Stub Node
- Add capturing method references
- Add Functions to Painless
- Add Method to Get New MethodWriters
- Static For Each
- Method reference support
- Add support for the new Java 9 MethodHandles#arrayLength() factory
- Improve painless compile-time exceptions
- add java.time packages to painless whitelist
- Add Function Reference Stub to Painless
- improve painless whitelist coverage of java api
- Definition cleanup
- Made def variable casting consistent with invokedynamic rules
- Use Java 9 Indy String Concats, if available
- Add method overloading based on arity
- Refactor WriterUtils to extend ASM GeneratorAdapter
- Whitelist expansion
- Remove boxing when loading and storing values in "def" fields/arrays,
remove boxing on simple method calls of "def" methods
- Some cleanups
- Use isAssignableFrom instead of relying on ClassCastException
- Build descriptor of array and field load/store in code
- Rename the dynamic call site factory to DefBootstrap
- Cleanup of DynamicCallSite
- Improve exception stacktraces
- Make Line Number Available in Painless
- Remove input, support params instead
- Decouple ANTLR AST from Painless
- _value support in painless?
- Long priority over Float
- _score as double, not float
- Add 'ctx' keyword to painless.
- Painless doc access
- Retrieve _score directly from Scorer
- Implement needsScore() correctly.
- Add synthetic length property as alias to Lists, so they can be used like
arrays
- Use better typing for dynamic method calls
- Array load/store and length with invokedynamic
- Switch painless dynamic calls to invokedynamic, remove perf hack/cheat
- Add fielddata accessors (.value/.values/.distance()/etc)
- painless: optimize/simplify dynamic field and method access
- Painless: Single-Quoted Strings
- Painless Clean Up
- Make Painless a Module
- Minor Clean up
- Remove Extra String Concat Token
Plugin Mapper Attachment::
- minor attachments cleanups: IDE test support and EPUB format
Plugin Mapper Size::
- Add doc values support to the _size field in the mapper-size plugin
Plugin Repository Azure::
- Support global `repositories.azure.` settings
- Add timeout settings (default to 5 minutes)
- Remove AbstractLegacyBlobContainer
Plugin Repository HDFS::
- merge current hdfs improvements to master
Plugin Repository S3::
- Extract AWS Key from KeyChain instead of using potential null value
- Check that S3 setting `buffer_size` is always lower than `chunk_size`
Plugins::
- Revert "Display plugins versions"
- Provide error message when plugin id is missing
- Print message when removing plugin with config
- Plugins: Update official plugin location with unified release
- Allow plugins to upgrade global custom metadata on startup
- Switch aggregations from push to pull
- Display plugins versions
- Add ScriptService to dependencies available for plugin components
- Make NamedWriteableRegistry immutable and add extension point for named
writeables
- Log one plugin info per line
- Make rest headers registration pull based
- Add resource watcher to services available for plugin components
- Add some basic services to createComponents for plugins
- Make plugins closeable
- Plugins: Add status bar on download
- Add did-you-mean for plugin cli
- Plugins: Remove name() and description() from api
- Emit nicer error message when trying to install unknown plugin
- Add plugin information for Verbose mode
- Cli: Improve output for usage errors
- Cli: Add verbose output with zip url when installing plugin
- PluginManager: Add xpack as official plugin
- CliTool: Cleanup and document Terminal
- Plugin cli: Improve maven coordinates detection
- Enforce plugin zip does not contain zip entries outside of the plugin dir
- CliTool: Allow unexpected exceptions to propagate
- Reduce complexity of plugin cli
- Remove Plugin.onIndexService.
- Open up QueryCache and SearcherWrapper extension points
Query DSL::
- Throw exception when multiple field names are provided as part of query
short syntax
- Query parsers to throw exception when multiple field names are provided
- Allow empty json object in request body in `_count` API
- Treat zero token in `common` terms query as MatchNoDocsQuery
- Handle empty query bodies at parse time and remove EmptyQueryBuilder
- Enforce MatchQueryBuilder#maxExpansions() to be strictly positive
- Don't allow `fuzziness` for `multi_match` types `cross_fields`, `phrase`
and `phrase_prefix`
- Add MatchNoDocsQuery, a query that matches no documents and prints the
reason why in the toString method.
- Adds `ignore_unmapped` option to geo queries
- Adds `ignore_unmapped` option to nested and P/C queries
- SimpleQueryParser should call MappedFieldType.termQuery when appropriate.
- An `exists` query on an object should query a single term.
- Function Score Query: make parsing stricter
- Parsers should throw exception on unknown objects
- UNICODE_CHARACTER_CLASS fix
Query Refactoring::
- Add infrastructure to rewrite query builders
- Switch geo validation to enum
REST::
- Add a REST spec for the create API
- Add response params to REST params did you mean
- Add did you mean to strict REST params
- Add exclusion support to response filtering
- Only write forced_refresh if we forced a refresh
- Add Location header to the index, update, and create APIs
- Add support for `wait_for_events` to the `_cluster/health` REST endpoint
- Rename Search Template REST spec names
- Adding status field in _msearch error request bodies
- Add semicolon query string parameter delimiter
- Enable HTTP compression by default with compression level 3
- Allow JSON with unquoted field names by enabling system property
- More robust handling of CORS HTTP Access Control
- Add option to exclude based on paths in XContent
Recovery::
- Pass on maxUnsafeAutoIdTimestamp on recovery / relocation
- Non-blocking primary relocation hand-off
- index shard should be able to cancel check index on close.
- TransportNodesListGatewayStartedShards should fall back to disk based
index metadata if not found in cluster state
- Recover broken IndexMetaData as closed
- Relocation source should be marked as relocating before starting recovery
to primary relocation target
- Operation counter for IndexShard
- Primary relocation handoff
- Remove recovery threadpools and throttle outgoing recoveries on the master
- Refactor StoreRecoveryService to be a simple package private util class
Reindex API::
- Only ask for `_version` if we need it
- Use fewer threads when reindexing-from-remote
- Support authentication with reindex-from-remote
- Support requests_per_second=-1 to mean no throttling in reindex
- Implement ctx.op = "delete" on _update_by_query and _reindex
- Make Reindex cancellation tests more uniform
- Make DeleteByQueryRequest implement IndicesRequest
- Teach reindex to retry on search failures
- Remove ReindexResponse in favor of BulkIndexByScrollResponse
- Stricter validation of Reindex's requests_per_second
- Properly mark reindex's child tasks as child tasks
- Make reindex throttling dynamic
- Throttling support for reindex
- Add ingest pipeline support to reindex
Scripting::
- Parse script on storage instead of on retrieval
- Migrate elasticsearch native script examples to the main repo
- Remove ClusterState from compile api
- Mustache: Render Map as JSON
- Compile each Groovy script in its own classloader
- Include script field even if its value is null
- Skipping hidden files compilation for script service
- Rename Plan A to Painless
- Add plumbing for script compile-time parameters
- Factor mustache -> modules/lang-mustache
Scroll::
- Add an index setting to limit the maximum number of slices allowed in a
scroll request.
Search::
- Limit batch size when scrolling
- Record method counts while profiling query components
- Change default similarity to BM25
- Add a parameter to cap the number of searches the msearch api will
concurrently execute
- Introduces GeoValidationMethod to GeoDistanceSortBuilder
- Switches from empty boolean query to matchNoDocs
- Allow binary sort values.
- Fail query if it contains very large rescores
- Type filters should not have a performance impact when there is a single
type.
- Store _all payloads on 1 byte instead of 4.
- Refuse to load fields from _source when using the `fields` option and
support wildcards.
- Add response into ClearScrollResponse
- Shuffle shards for _only_nodes + support multiple specifications like
cluster API
Search Refactoring::
- Removes the now obsolete SearchParseElement implementations
- Remove RescoreParseElement
- Remove HighlighterParseElement
- Move top level parsing of sort element to SortBuilder
- Switch to using refactored SortBuilder instead of using BytesReference in
serialization
- Add build() method to SortBuilder implementations
- Refactoring of Suggestions
- Move sort `order` field up into SortBuilder
- Moves SortParser:parse(...) to only require QueryShardContext
- Change internal representation of suggesters
- Make GeoDistanceSortBuilder serializable, 2nd try
- Move missing() from SortBuilder interface to class
- Remove deprecated parameters from ScriptSortBuilder
- Refactor GeoSortBuilder
- Refactor FieldSortBuilder
- Make sort order enum writable.
- Make DistanceUnit writable.
- RescoreBuilder: Add parsing and creating of RescoreSearchContext
- Make RescoreBuilder and nested QueryRescorer Writeable
- Explain api: move query parsing to the coordinating node
- Switch query parsers to use ParseField
- Refactoring of Aggregations
Sequence IDs::
- Persist sequence number checkpoints
- Add sequence numbers to cat shards API
Settings::
- Add precise logging on unknown or invalid settings
- Make `action.auto_create_index` setting a dynamic cluster setting
- Removes space between # and the setting in elasticsearch.yml
- Validates new dynamic settings from the current state
- Improve error message if a setting is not found
- Cleanup placeholder replacement
- Switch to registered Settings for all IndexingMemoryController settings
- Add guard against null-valued settings
- Useful error message for null property placeholder
- Archive cluster level settings if unknown or broken
- Improve error message if setting is not found
- Improve upgrade experience of node level index settings
- Settings with complex matchers should not overlap
- Moves GCE settings to the new infra
- Add filtering support within Setting class
- Migrate AWS settings to new settings infrastructure
- Remove `gateway.initial_meta` and always rely on min master nodes
- Rewrite SettingsFilter to be immutable
- Simplify azure settings
- Convert PageCacheRecycler settings
- Monitor settings
- Cut over tribe node settings to new settings infra
- Convert multicast plugin settings to the new infra
- Convert `request.headers.*` to the new settings infra
- Migrate Azure settings to new settings infrastructure
- Validate logger settings and allow them to be reset via API
- Switch NodeEnvironment's settings to new settings
- Simplify AutoCreateIndex and add more tests
- Convert several pending settings
- Migrate query caching settings to the new settings infra.
- Convert `action.auto_create_index` and `action.master.force_local` to the
new settings infra
- Convert `cluster.routing.allocation.type` and `processors` to the new
settings infra.
- Validate tribe node settings on startup
- Move node.client, node.data, node.master, node.local and node.mode to new
settings infra
- Moved http settings to the new settings infrastructure
- Migrate network service to the new infra
- Convert client.transport settings to new infra
- Move discovery.* settings to new Setting infrastructure
- Change over to o.e.common.settings.Setting for http settings
- Convert "path.*" and "pidfile" to new settings infra
- Migrate repository settings to the new settings API
- Convert "indices.*" settings to new infra.
- Migrate gateway settings to the new settings API.
- Convert several node and test level settings
- Run Metadata upgrade tool on every version
- Check for invalid index settings on metadata upgrade
- Validate the settings key if it's simple chars separated by `.`
- Validate known global settings on startup
- Cut over all index scope settings to the new setting infrastructure
- Remove updatability of `index.flush_on_close`
- Move all dynamic settings and their config classes to the index level
- Always require units for bytes and time settings
- Make MetaData parsing less lenient.
- Move async translog sync logic into IndexService
- Remove `index.merge.scheduler.notify_on_failure` and default to `true`
- Remove cache concurrency level settings that no longer apply
Similarities::
- Defining a global default similarity
Snapshot/Restore::
- Delete differing files in the store before restoring
- Adds ignoreUnavailable option to the snapshot status API
- Check restores in progress before deleting a snapshot
- Snapshot repository cleans up empty index folders
- BlobContainer#writeBlob no longer can overwrite a blob
- More resilient blob handling in snapshot repositories
- Adding repository index generational files
- Raised IOException on deleteBlob
- Adds UUIDs to snapshots
- Clarify the semantics of the BlobContainer interface
- Change BlobPath.buildAsString() method
- Remove the Snapshot class in favor of using SnapshotInfo
Stats::
- Add mem section back to cluster stats
- Add network types to cluster stats
- Add missing field type in the FieldStats response.
- Expose the ClusterInfo object in the allocation explain output
- Add total_indexing_buffer/_in_bytes to nodes info API
- Allow FieldStatsRequest to disable cache
- Remove index_writer_max_memory stat from segment stats
- Move DocStats under Engine to get more accurate numbers
- Do not return fieldstats information for fields that exist in the mapping
but not in the index.
- Add whether the shard state fetch is pending to the allocation explain API
- Add Failure Details to every NodesResponse
- Add I/O statistics on Linux
- Add points to SegmentStats.
- Remove FieldStats.Float.
- Show configured and remaining delay for an unassigned shard.
- indexing stats now contain indexing ops from recovery [ISSUE]
- Normalize unavailable load average
- Add load averages to OS stats on FreeBSD
- Expose pending cluster state queue size in node stats
Store::
- Use `mmapfs` by default.
- Remove support for legacy checksums
- Rename index folder to index_uuid
Suggesters::
- Move SuggestUtils methods to their respective caller classes
- Remove payload option from completion suggester
- Add bwc support for reading pre-5.0 completion index
Task Manager::
- Rename Task Persistence into Storing Task Results
- Fetch result when wait_for_completion
- Create get task API that falls back to the .tasks index
- Add ability to store results for long running tasks
- Move parentTaskId into TransportRequest
- Shorten the serialization of the empty TaskId
- Expose whether a task is cancellable in the _tasks list API
- Add ability to group tasks by common parent
- Add start time and duration to tasks
- Combine node name and task id into single string task id
- Add task status
- Extend tracking of parent tasks to master node, replication and broadcast
actions
Translog::
- Fsync documents in an async fashion
- Add checksumming and versions to the Translog's Checkpoint files
- Beef up Translog testing with random channel exceptions
- Do not replay into translog on local recovery
- FSync translog outside of the writers global lock
- Remove ChannelReference and simplify Views
- Simplify TranslogWriter to always write to a stream
- Remove TranslogService and fold it into synchronous IndexShard API
=== Bug fixes
Aggregations::
- Fixed writeable name from range to geo_distance
- Fix date_range aggregation to not cache if now is used
- The `top_hits` aggregation should compile scripts only once.
- Fix agg profiling when using breadth_first collect mode
- Throw exception when minBounds is greater than maxBounds
- Undeprecates `aggs` in the search request
- Change how `nested` and `reverse_nested` aggs know about their nested
depth level
- Make ExtendedBounds immutable
- Aggregations fix: support include/exclude strings for IP and dates
- Fix xcontent rendering of ip terms aggs.
- Improving parsing of sigma param for Extended Stats Bucket Aggregation
- Fixes NPE when no window is specified in moving average request
- Fixes Filter and FiltersAggregation to work with empty query
- Fixes the defaults for `keyed` in the percentiles aggregations
- Correct typo in class name of StatsAggregator
Allocation::
- Keep a shadow replica's allocation id when it is promoted to primary
- IndicesClusterStateService should clean local started when re-assigns an
initializing shard with the same aid
- IndexRoutingTable.initializeEmpty shouldn't override supplied primary
RecoverySource
- Update incoming recoveries stats when shadow replica is reinitialized
- `index.routing.allocation.initial_recovery` limits replica allocation
- Upon being elected as master, prefer joins' node info to existing cluster
state
- Fix NPE when initializing replica shard has no UnassignedInfo
- Make shard store fetch less dependent on the current cluster state, both
on master and non data nodes
- Fix recovery throttling to properly handle relocating non-primary shards
- Replica shards must be failed before primary shards
Analysis::
- Named analyzer should close the analyzer that it wraps
- Can load non-PreBuiltTokenFilter in Analyze API
- Fix analyzer alias processing
Bulk::
- Add not-null precondition check in BulkRequest
CAT API::
- Fixes cat tasks operation in detailed mode
- Add index pattern wildcards support to _cat/shards
CRUD::
- GET operations should not extract fields from `_source`.
- Squash a race condition in RefreshListeners
- Prevent TransportReplicationAction to route request based on stale local
routing table
- Resolves the conflict between alias routing and parent routing by applying
the alias routing and ignoring the parent routing.
Cache::
- Prevent requests that use scripts or now() from being cached
- Serialize index boost and phrase suggest collation keys in a consistent
order
Circuit Breakers::
- Never trip circuit breaker in liveness request
- Free bytes reserved on request breaker
Cluster::
- Fixes issue with dangling index being deleted instead of re-imported
- Allow routing table to be filtered by index pattern
- Use executor's describeTasks method to log task information in cluster
service
- Acknowledge index deletion requests based on standard cluster state
acknowledgment
- Dangling indices are not imported if a tombstone for the index exists
- Fix issue with tombstones matching active indices in cluster state
- Shard state action channel exceptions
Core::
- Makes `m` case sensitive in TimeValue
- Guard against negative result from FileStore.getUsableSpace when picking
data path for a new shard
- Handle rejected execution exception on reschedule
- Fix concurrency bug in IMC that could cause it to check too infrequently
- Iterables.flatten should not pre-cache the first iterator
- Avoid race while retiring executors
- Refactor UUID-generating methods out of Strings
- Node names cleanup
- NullPointerException from IndexingMemoryController when a version conflict
happens during recovery
- Handle RejectedExecution gracefully in TransportService during shutdown
Discovery::
- Update discovery nodes after cluster state is published
- Add current cluster state version to zen pings and use them in master
election
Engine::
- Take refresh IOExceptions into account when catching ACE in InternalEngine
- Don't suppress AlreadyClosedException
Expressions::
- replace ScriptException with a better one
Geo::
- Incomplete results when using geo_distance for large distances [ISSUE]
- Fix multi-field support for GeoPoint types
- Enforce distance in distance query is > 0 [ISSUE]
Highlighting::
- Enable BoostingQuery with FVH highlighter
Index APIs::
- Fixes active shard count check in the case of `all` shards
- Add zero-padding to auto-generated rollover index name increment
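As an illustration of the zero-padding entry above, here is a minimal Python sketch of how a rollover target name could be derived from the current index name. The function name `next_rollover_name` and the six-digit default width are assumptions made for this example; it is not Elasticsearch's own implementation.

```python
import re

# Hypothetical helper: not part of Elasticsearch, only illustrates the idea.
def next_rollover_name(index_name, width=6):
    """Increment the trailing number of an index name and zero-pad it."""
    match = re.fullmatch(r"(.*-)(\d+)", index_name)
    if match is None:
        raise ValueError("index name must end with '-' followed by a number")
    prefix, counter = match.groups()
    # Zero-pad the incremented counter to a fixed width.
    return f"{prefix}{int(counter) + 1:0{width}d}"

print(next_rollover_name("logs-1"))
print(next_rollover_name("logs-000009"))
```

With a fixed width, the generated names sort lexicographically in the same order in which they were created, which plain unpadded counters would not.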
Ingest::
- no null values in ingest configuration error messages
- JSON Processor was not properly added
- Don't rebuild pipeline on every cluster state update
- Add dotexpander processor
- Fix NPE when simulating a pipeline with no id
- Change foreach processor to use ingest metadata for array element
- No other processors should be executed after on_failure is called
- rethrow script compilation exceptions into ingest configuration exceptions
- Rename from `ingest-useragent` plugin to `ingest-user-agent` and its
processor from `useragent` to `user_agent`
- Fix ignore_failure behavior in _simulate?verbose and more cleanup
- Pipeline Stats: Fix concurrent modification exception
- Validate properties values according to database type
- Ingest does not close its factories
- Handle regex parsing errors in Gsub and Grok Processors
- add on_failure exception metadata to ingest document for verbose simulate
- The IngestDocument copy constructor should make a deep copy
Inner Hits::
- Ensure that InnerHitBuilder uses rewritten queries
Internal::
- Prevent AbstractArrays from releasing bytes more than once
- IndicesAliasesRequest should not implement CompositeIndicesRequest
- Ensure elasticsearch doesn't start with unsupported indices
- Remove ListTasksResponse#setDiscoveryNodes()
- Priority values should be unmodifiable
- Extract AbstractBytesReferenceTestCase
- Add XPointValues
- Fix BulkItemResponse.Failure.toString
- Enable unmap hack for java 9
- Fix issues with failed cache loads
- Allow parser to move on the START_OBJECT token when parsing search source
- Ensure searcher is released if wrapping fails
- Avoid deadlocks in Cache#computeIfAbsent
Java API::
- fix IndexResponse#toString to print out shards info
- Add NamedWriteables from plugins to TransportClient
- Fix potential NPE in SearchSourceBuilder
Java REST Client::
- Rest Client: add slash to log line when missing between host and uri
- Rest Client: HostsSniffer to set http as default scheme
Logging::
- Fix logger when you can not create an azure storage client
- Avoid unnecessary creation of prefix loggers
- Fix logging hierarchy configs
- Fix prefix logging
- Hack around Log4j bug rendering exceptions
- Avoid prematurely triggering logger initialization
- Only log running out of slots when out of slots
Mapping::
- Allow position_gap_increment for fields in indices created prior to 5.0
- Validate blank field name
- Better error message when mapping configures null
- Make doc_values accessible for _type
- Fix and test handling of `null_value`.
- Fail automatic string upgrade if the value of `index` is not recognized.
- Fix dynamic check to properly handle parents
- Fix array parsing to remove its context when finished parsing
- Disallow fielddata loading on text fields that are not indexed.
- Make dynamic template parsing less lenient.
- Fix dynamic mapper when its parent already has an update
- Fix copy_to when the target is a dynamic object field.
- Preserve existing mappings on batch mapping updates
Network::
- Fix connection close header handling
- Ensure port range is readable in the exception message
- Fix expect 100 continue header handling
- Fixes netty4 module's CORS config to use defaults
- Fix various concurrency issues in transport
- Verify lower level transport exceptions don't bubble up on disconnects
Packaging::
- [Packaging] Do not remove scripts directory on upgrade
- [Package] Remove bin/lib/modules directories on RPM uninstall/upgrade
- Fix handling of spaces for jvm.options on Windows
- Disable service in pre-uninstall
- Remove extra bin/ directory in bin folder
- Filter client/server VM options from jvm.options
- Preserve config files from RPM install
- Fix typo in message for variable setup ES_MAX_MEM
- Don't run `mkdir` when $DATA_DIR contains a comma-separated list
- Fix exit code
- Set MAX_OPEN_FILES to 65536
- [windows] Service command still had positional start command
- Do not pass double-dash arguments on startup
Parent/Child::
- Make sure that no `_parent#null` gets introduced as default _parent
mapping
Percolator::
- Fail indexing percolator queries containing either a has_child or
has_parent query
- Add support for MatchNoDocsQuery in percolator's query terms extract
service
- Let PercolatorQuery's explain use the two phase iterator
Plugin Discovery Azure Classic::
- Make discovery-azure plugin work again
Plugin Discovery EC2::
- Fix EC2 discovery settings
- Add TAG_SETTING to list of allowed tags for the ec2 discovery plugin.
- Fix EC2 Discovery settings
Plugin Discovery GCE::
- Fix NPE when GCE region is empty
Plugin Ingest Attachment::
- Adds content-length as number
Plugin Ingest GeoIp::
- [ingest-geoip] update geoip to not include null-valued results from
Plugin Lang Painless::
- Fix String Concatenation Bug In Painless
- Fix break bug in for/foreach loops.
- Fix compound assignment with string concats
- Fix horrible capture
- Fix Casting Bug
- Remove Grammar Ambiguities
- Remove if/else ANTLR ambiguity.
- Fix insanely slow compilation
- Fix Bug in Painless Assignment
- Fix bracket shortcuts
Plugin Repository Azure::
- Register group setting for repository-azure accounts
- Fix azure files removal
Plugin Repository S3::
- Fixes leading forward slash in S3 repository base_path
- Add missing permission to repository-s3
- Fix repository S3 Settings and add more tests
Plugin Store SMB::
- Fix calling ensureOpen() on the wrong directory (master forwardport)
Plugins::
- Use sysprop like with es.path.home to pass conf dir
- Quote path to java binary
- CliTool: Messages printed in Terminal should have percent char escaped
Query DSL::
- Fixes MultiMatchQuery so that it doesn't provide a null context
- Fix silently accepting malformed queries
- query_string_query should take term length into consideration when
fuzziness is auto
- Throw ParsingException if a query is wrapped in an array
- Restore parameter name auto_generate_phrase_queries
- Resolve string dates and date math to millis before evaluating for rewrite
in range query
- `constant_score` query should throw error on more than one filter
- Single IPv4 addresses in IP field term queries
- Make strategy optional in GeoShapeQueryBuilder readFrom and writeTo
Query Refactoring::
- Query refactoring: set has_parent & has_child types context properly
- Make sure equivalent geohashCellQueries are equal after toQuery called
REST::
- Remove lenient URL parameter parsing
- Fixes CORS handling so that it uses the defaults
- Get XContent params from request in Nodes rest actions
- Fixes reading of CORS pre-flight headers and methods
Recovery::
- Fix concurrency issues between cancelling a relocation and marking shard
as relocated
- Move `reset recovery` into RecoveriesCollection
- Fix replica-primary inconsistencies when indexing during primary
relocation with ongoing replica recoveries
- Invoke `IndexingOperationListeners` also when recovering from store or
remote
- Prevent interruption while store checks lucene files for consistency
- Mark shard as recovering on the cluster state thread
Reindex API::
- Fix reindex with transport client
- Fix a race condition in reindex's rethrottle
- Reindex should never report negative throttled_until
- Reindex should gracefully handle when _source is disabled
Scripting::
- Add support for booleans in scripts
- Fix Javascript OOM build Failure
- Fix propagating the default value for script settings
- Catch and wrap AssertionError and NoClassDefFoundError in groovy scripts
Search::
- Do not cache script queries.
- Throw error when trying to fetch fields from source and source is disabled
- Source filtering should keep working when the source contains numbers
greater than `Long.MAX_VALUE`.
- Fix NPE when running a range query on a `scaled_float` with no upper
bound.
- Fix NPE during search with source filtering if the source is disabled.
- Restore assignment of time value when deserializing a scroll instance
- Fix explain output for dfs query
- Don't recursively count children profile timings
- fix explain in function_score if no function filter matches
- Fix NPEs due to disabled source
- Require timeout units when parsing query body
- Close SearchContext if query rewrite failed
- Fix parsing single `rescore` element in SearchSourceBuilder
- Fail queries on not indexed fields.
- Fix for search after
- Do not be lenient when parsing CIDRs
Settings::
- Fix Setting.timeValue() method
- Add a hard limit for `index.number_of_shard`
- Include complex settings in settings requests
- Fix filter cache setting to allow percentages
- Move cluster.routing.allocation.same_shard.host setting to new settings
infrastructure
- Validate settings against dynamic updaters on the master
- Register "cloud.node.auto_attributes" setting in EC2 discovery plugin
- Use object equality to compare versions in IndexSettings
- fix exists method for list settings when using numbered setting format
- convert settings for ResourceWatcherService to new infrastructure
- Register bootstrap settings
- Add settings filtering to node info requests
- Ban write access to system properties
Snapshot/Restore::
- Better handling of an empty shard's segments_N file
- Fix race condition in snapshot initialization
- Fix the semantics for the BlobContainer interface
Stats::
- Fix FieldStats deserialization of `ip` field
- Fix serialization bug in allocation explain API.
- Allocation explain: Also serialize `includeDiskInfo` field
- Add missing builder.endObject() in FsInfo
Store::
- Tighten up concurrent store metadata listing and engine writes
- Make static Store access shard lock aware
- Catch assertion errors on commit and turn it into a real exception
Task Manager::
- Shard level tasks in Bulk Action lose reference to their parent tasks
- Take filterNodeIds into consideration while sending task requests to nodes
Term Vectors::
- Fix calculation of took time of term vectors request
Translog::
- Fix RAM usage estimation of LiveVersionMap.
- Fix translog replay multiple operations same doc
- Snapshotting and sync could cause a deadlock in TranslogWriter
- Move translog recover outside of the engine
- Mark shard active during recovery; push settings after engine finally
inits
=== Regressions
Highlighting::
- Handle SynonymQuery extraction for the FastVectorHighlighter
=== Upgrades
Core::
- Upgrade to Lucene 6.2.0
- Update to jackson 2.8.1
- Upgrade to Lucene 6.1.0.
- Upgrade to lucene-6.1.0-snapshot-3a57bea.
- Upgrade to Lucene 6.0.1.
- Upgrade to lucene 6 release
- Upgrade to lucene-6.0.0-f0aa4fc.
- upgrade to lucene 6.0.0-snapshot-bea235f
- Upgrade to Jackson 2.7.1
Ingest::
- Update MaxMind geoip2 version to 2.6
Internal::
- Bump master (3.0-snapshot) to java 8
Network::
- Upgrade to Netty 4.1.5
- Dependencies: Upgrade to netty 4.1.4
- Introduce Netty 4
Packaging::
- Upgrade JNA to 4.2.2 and remove optionality
Plugin Discovery EC2::
- Update aws sdk to 1.10.69 and add use_throttle_retries repository setting
Scripting::
- Dependencies: Updates to mustache 0.9.3
Search Templates::
- Update mustache.java to version 0.9.1
5.0.1 (2016-11-02)
- Fixed performance regression in scan helper
5.0.0 (2016-10-19)
- Version compatible with elasticsearch 5.0
- when using SSL, certificate validation is now on by default.
Install certifi or supply root certificate bundle.
- elasticsearch.trace logger now also logs failed requests, signature
of internal logging method log_request_fail has changed, all custom
connection classes need to be updated
- added headers arg to connections to support custom http headers
- passing in a keyword parameter with None as value will cause that
param to be ignored
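The last entry above can be pictured with a short sketch. The helper `drop_none_params` is hypothetical and written only for this example; it is not the client's actual code, just one way to obtain the described behaviour, where a keyword passed as None never reaches the request.

```python
# Hypothetical helper illustrating "None means ignore"; not the client's code.
def drop_none_params(**params):
    """Return only the keyword parameters whose value is not None."""
    return {name: value for name, value in params.items() if value is not None}

# routing=None is dropped; the other parameters are kept unchanged.
query_params = drop_none_params(size=10, routing=None, timeout="5s")
print(query_params)
```

Note that only None is filtered out in this sketch: falsy values such as 0 or False are still passed through.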
-------------------------------------------
1.028003 23.10.16
* Removed AutoPrereqs from dist.ini (Mickey)
1.028002 23.10.16
* GH #53 a few small dist.ini tweaks (Karen Etheridge)
* Even more dist.ini tweaks (Mickey, thanks to Grinnz)
1.028001 22.10.16
* GH #51 Adds eumm_version to dist.ini (Olaf Alders)
* GH #52 Stop excluding cpanfile from being copied to
build (Olaf Alders)
1.028000 21.10.16
* GH #50 Remove hard-deps for HTTP::Tiny::Mech and
WWW::Mechanize::Cached (Paul Howarth)
* dist.ini: don't automatically update cpanfile (Mickey)
1.027000 20.10.16
* GH #49 Convert values of JSON::PP::Boolean objects in output
so they are not skipped when expecting scalars (Mickey)
1.026001 19.10.16
* Fixed version range for Search::Elasticsearch (Mickey)
1.026000 19.10.16
* Moved distini prereqs to cpanfile (Mickey)
* Limit Search::Elasticsearch version to 2.02 (Mickey)
* Updated docs (Thomas Sibley)
(pkgsrc changes)
- Add BUILD_DEPENDS+= p5-ExtUtils-MakeMaker>=7.11.01
------------------------------------
1.124 2016-11-05
- avoid an uninitialized warning when array_each() compares to a
non-reference (thanks, Максим Вуец!, Maxim Vuets)
-------------------------------------
1.35 2016-11-03
- readline-7.0 support
new function
rl_clear_visible_line
rl_tty_set_echoing
rl_pending_signal
new variable
rl_persistent_signal_handlers
- Gnu.xs: fix a bug with the rl_readline_state variable that occurred on
big-endian platforms with sizeof(int)==4 and sizeof(long)==8
when using the GNU Readline Library 7.0. [rt.cpan.org #118371]
------------------------------
0.31 2016-11-05
- The stack trace contained by Specio::Exception objects no longer includes
stack frames for the Specio::Exception package.
- Made the inline_environment() and description() methods public on type and
coercion objects.
Sigil-0.9.7
Bug Fixes
- Allow tags in the svg and mathml namespace to automatically self-close if empty to help work around
a bug in Kindlegen that will not seem to accept a closing svg image tag even though image is non-void
- Prevent TextTab from constantly recentering page when focus is lost
- Fix bugs in plugin basename_to_id when used with xpgt files or any unrecognized extensions
- Fix typos in pls mimetype in plugins
- Fix code synchronization issues among 3 places where file extensions are mapped to mimetypes
- Fix plugin readotherfile interface to rebuild the opf on the fly only if it has been modified
- Fix plugin validation issues with integer vs string representations of line number and character offsets
- Fix duplicate filename in multiple directories bug when updating CSS urls
- Fix bug in page-map.xml mimetype when "Add Existing ..." is used
- Fix undefined behaviour shifting signed negative values in 3rdparty libs and fix many warnings
- Fix text vs binary file type recognition in the plugin interface (CSS and js files are text not binary)
- Fix too small toolbar icons on high dpi displays
- Fix bug that caused text highlighting to get lost on some systems when doing a CSS Find & Replace.
- Fix bug in plugin interface basename_to_id to recognize .htm extensions
- Fix bug in epub3 semantic popups to always reflect the local name of epub:type setting
- Fix bug where creation of an HTML TOC could overwrite an existing Nav under epub3
- Fix manifest id not starting with alpha character bug
New Features
- Extend validation plugin interface with add_extended_result() method to allow better cursor positioning
- Extend TextTab and Tabs derived by it to position cursor based on offset
- Allow editing of page-map.xml files, xpgt files and other misc xml based files inside Sigil
- Update Windows builds to use Python 3.5, VS2015
- Update Mac OS X builds and build instructions to use Python-3.5.2
- Remove support for python2.7 *only* plugins and simplify Manage Plugins settings
- Update to Qt 5.6.1-1 with QtWebKit added back for release builds for Windows (VS2015) and Mac OSX
- Update Mac OS X and Linux build instruction documentation for recent changes
- Allow Linux Dictionaries to look up default paths for dictionaries passed in by build cmake settings
- Make the columns in the Manage Plugins table sortable by the user
- Better detect undefined and non-existing url fragments to prevent issues when splitting or merging files
- Make tooltips for Run Plugin Icons show the name of the selected plugin on hover
- Upgrade from jquery 1.6.2 to version 2.2.4
- Upgrade from jquery.scrollTo 1.4.2 to version 2.1.2
- Upgrade to double sized 48x48 pixel icons for High DPI displays, Special Thanks to PatNY for creating our icons
Sigil-0.9.6
Bug Fixes
- Make StdWarningDialog resizeable when "Show Details" is used
- Fix CleanSource svg prefix removal bug that sometimes broke valid svg code
- Remove svg image and html5 menuitem from the list of void elements in the Sigil and plugin code
- Properly xml escape "&" in metadata attribute values
- Properly perform source updates on epub load even when they do not follow recommended spec
- Make handling of comments in both xhtml and xml more robust
- Properly url escape css file names to handle css files with spaces in them
- Try to make direct editing of content.opf more safe by auto-fixing errors when possible
- Properly handle WellFormed checks for pure XML in XMLResource.cpp by using embedded python3 lxml
- Make opf_newparser.py and xmlprocessor.py more robust to broken user input in content.opf
- Make ProcessXML (repairXML in xmlprocessor.py) - leave untouched anything well-formed
- Fix thinko in plugin bookcontainer.py and outputcontainer.py contributed by wrCisco
- Fix for improper encoding in plugins on Mac OS X due to missing inherited plugin environment
- Fix for typos in epub xmlns when splitting epub3 ebooks in BookView
- Update testplugin_v012.zip to testplugin_v013.zip to handle sgc-nav.css new feature
- Fix bugs in DeleteUnusedStyles when selector exists more than once in the same stylesheet
- Fix bugs in DeleteUnusedStyles when group selectors span more than one line
- Fix bugs in Reports: CSS Styles missing cases when selector exists more than once
- Fix bug in Reports: All Files to use Landmark Semantics under epub3 not guide semantics
- Fix slow loading of the ini file when the clipboard history is too large; the user can delete entries via dialog
- Stop cosmetic double-spaces being introduced into OPF manifest, spine and guide entries
- Fix bug when user selects too much in BookView and then tries to change case
- Fix bug in Delete Unused Media when css urls do not use quotes
- Try to set all ways of updating the ncx to use 2 character indentation of head element
- Fix Building Relocatable Python on Mac build instructions to remove BeautifulSoup4 requirement
- Fix for generating empty guide for epub3 when in plugins
New Features
- create sgc-nav.css stylesheet for nav and allow templates in Prefs Dir for user to control it
- Add General Setting to allow user to set own temporary directory location
- Add Qt Stylesheet support - Recognize and load "qt_styles.qss" file if stored in Sigil Preferences folder
- Extended the plugin interface to add support for epub3 bindings elements
- Add option + forward delete shortcut to active Metadata Editor remove
[Changes contributed by Nick Morrott]
- Fix typo in POD documentation (fixes #730).
[Changes contributed by Olly Betts]
- Allow building against xapian-core 1.4.x as well as xapian-core
1.2.x.
omindex:
+ Also index leafname with _ and & replaced by spaces. Literal spaces are
often avoided in filenames, and "hello_world.txt" ought to be searchable for
via "hello" and "world". Partly addresses #618, reported by Julien
Pfefferkorn.
+ Make named entity look-up (e.g. &eacute; -> 233) use the same keyword-lookup
table approach we already use for HTML tags and built-in MIME content-types,
rather than a std::map, which makes it faster while using less memory.
+ Avoid using the shell to run most external commands as it's unnecessary
overhead. For the built-in filters, the only cases which now use a shell
are where we run two unzip commands. For user-specified commands, a simple
and slightly conservative test is used, which should avoid a shell in most
common cases where it isn't needed. Notably, environment variables set
before the command are handled.
+ Track files which couldn't be indexed in the user metadata and skip them by
default on subsequent runs to avoid the costs of repeatedly running a
filter on a file it can't handle. Run omindex with --retry-failed to retry
such files.
+ Overhaul the "per-site" terms:
- 'H' prefix is hostname as before, except that if the term would be > 240
bytes (unlikely but possible) the end is hashed in the same way 'U'
prefix terms are.
- 'P' terms are now added for every directory level, not just the start
URL's path.
- A new 'J' prefix term is added with the start URL (less any trailing
'/'), which means all files indexed from a particular "site" are now
indexed by one term. See #376.
+ Add 'skip' pseudo-mimetype which extensions can be mapped to, and they will
then be reported and skipped (to complement the existing 'ignore'
pseudo-mimetype which causes files with the specified extension to be
quietly ignored).
+ Treat a command of 'true' specially as meaning make the text extraction a
no-op (as actually running /bin/true effectively would). This provides a
way to index some file types by only meta-data. Fixes #519, reported by
Brian Burton.
+ Add support for wildcard mimetypes */* and *. Combined with filter command
``true`` for indexing by meta-data only, you can specify a fall back case
of indexing by meta-data only using ``--filter '*:true'``. From a
suggestion by Brian Burton on xapian-discuss.
+ Index message/rfc822 and message/news. These are individually saved email
messages and news articles.
+ Index archived web page formats MAFF and MHTML.
+ Handle .xla, yet another XL extension.
+ Handle metadata in LibreOffice HTML export (dcterms.subject,
dcterms.description, dcterms.creator and dcterms.contributor).
+ Use zlib's gzopen() instead of invoking "gzip -dc" for compressed Abiword
documents.
+ Add support for %f in command passed to --filter to allow specifying
commands where the input file is not the final argument. Fixed #570,
reported by Charles Atkinson.
+ Allow --filter to handle commands which produce output in a temporary file
rather than on stdout.
+ Allow --filter to specify the character set of the output the filter
produces.
+ Handle application/vnd.ms-excel, text/x-perl and application/x-dvi via
default --filter settings instead of hardcoded cases (now possible thanks
to the new abilities that --filter has).
+ Add support for specifying a MIME subtype of '*' in --filter arguments.
+ Add --track-ctime option to allow omindex to pick up changes to file
ownership and permissions.
+ Index terms from the leafname with an 'F' prefix, rather than treating them
as more body text. (Fixes #633, reported by Emmanuel Garette)
+ The starting URL wasn't previously URL encoded. In 1.2.18, a minimally
intrusive fix was implemented. In 1.3.2, we now encode the starting URL
as we do for the rest of the filename.
+ Don't assume .doc is application/msword but let libmagic decide, since .doc
files may actually be RTF, and sometimes people use .doc for plain-text
documentation.
+ Add support for indexing 'topic' and 'created date' meta-data for
OpenDocument format and HTML.
+ Index "topic" for PDF documents.
+ Commit changes and exit, rather than skipping the current file on most
unexpected errors reading directories or initialising libmagic - otherwise
we can end up deleting a lot of database entries on errors like EHOSTDOWN
when indexing network mounts.
+ Add --opendir-sleep=SECS option to allow working around problems with
indexing files on Microsoft DFS shares.
+ If we get ENOTDIR trying to index a file, skip it quietly (unless in
verbose mode) as we already do if we get ENOENT, since ENOTDIR is what we
get if the file and the directory it was in got removed between us getting
the filename and trying to open it.
+ Handle ENOENT, ENOTDIR and EACCES from readdir().
+ If we've already opened the file (as we often will have if using a modern
libmagic with magic_descriptor() available), then use fstat() on that fd
rather than stat()/lstat() on the pathname.
+ Pass error message string and errno value in ReadError exceptions.
+ Report strerror(errno) if we can't read a file.
+ Filtering via text/html now handles HTML documents which specify a charset.
+ Add support for indexing Microsoft Publisher files using pub2xhtml.
+ Restrict the length of what we consider to be an extension, currently to 7
characters or whatever the longest extension in the mime_map is if it is
longer.
+ Avoid '//' in temporary filenames (cosmetic only).
+ Extend --filter to handle commands which produce HTML on stdout.
+ Don't report an error if a file is deleted (or renamed) between us reading
the directory entry for it and trying to read the file itself by default.
In --verbose mode, the situation is still reported, but now with a
specific message.
+ If omindex receives any of the signals SIGHUP, SIGINT, SIGQUIT or SIGTERM,
then kill any active external filter child process, then handle the signal
as we did before. If setpgid() is available, put each external filter in
its own process group and kill the whole process group when we get a
signal.
+ Use magic_descriptor() if the version of libmagic we're building against
is new enough to have it. This eliminates an extra opening of a file
being indexed in certain cases.
+ Use rst2html to handle .rst and .rest files.
+ Index title with an 'S' prefix rather than no prefix.
+ If the document with the highest existing docid before the run was updated,
we were reporting it as "added", but now we correctly report it as
"updated".
+ Catch and report std::exception explicitly, so failing to allocate memory
is no longer reported as "Unknown exception".
omindex-list: New tool to list URLs of all the documents in a database
(or list of databases) indexed by omindex.
* The HTML parser now explicitly handles <APPLET>, <OBJECT> and <TR>.
* Use a generated compact and efficient table to convert HTML tag names
to enum codes - this is both faster and smaller than the approach we were
using, with the benefit that the table is auto-generated.
* Always use our built-in conversion code for the character sets it can handle
(previously we'd use iconv if available; now we only use iconv for other
character sets). This gives us more consistent results, and in particular
means we now handle BOMs better (at least when using GNU iconv).
* A lot of data labelled as "iso-8859-1" is actually "windows-1252". The two
only differ in characters which are control characters in iso-8859-1, so
assume the latter when we see the former.
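The "iso-8859-1" versus "windows-1252" entry above can be illustrated with a small sketch. This is not the conversion code Omega uses; the helper name decode_latin1_as_cp1252 is hypothetical, and passing the five unassigned windows-1252 bytes through unchanged is an assumption made for this example. It only shows that the two character sets differ in the 0x80 to 0x9F range.

```cpp
#include <cstdint>

// Code points for windows-1252 bytes 0x80..0x9F.  Bytes with no assigned
// character (0x81, 0x8D, 0x8F, 0x90, 0x9D) are passed through unchanged,
// which is a choice made for this sketch.
static const std::uint16_t cp1252_high[32] = {
    0x20AC, 0x0081, 0x201A, 0x0192, 0x201E, 0x2026, 0x2020, 0x2021,
    0x02C6, 0x2030, 0x0160, 0x2039, 0x0152, 0x008D, 0x017D, 0x008F,
    0x0090, 0x2018, 0x2019, 0x201C, 0x201D, 0x2022, 0x2013, 0x2014,
    0x02DC, 0x2122, 0x0161, 0x203A, 0x0153, 0x009D, 0x017E, 0x0178
};

// Hypothetical helper: decode one byte labelled "iso-8859-1" as if it were
// windows-1252.  Outside 0x80..0x9F the two character sets agree.
std::uint32_t decode_latin1_as_cp1252(unsigned char byte) {
    if (byte >= 0x80 && byte <= 0x9F) return cp1252_high[byte - 0x80];
    return byte;
}
```

For example, byte 0x93 is a control character in iso-8859-1 but a left curly double quote (U+201C) in windows-1252, which is much more likely to be what the author of the data intended.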
scriptindex:
+ Remove special error handling case noting that index=nopos was replaced
with indexnopos - this was removed in 1.1.0 so there's been enough time to
upgrade.
omega:
* Add support for sorting by more than one value - e.g. SORT=+1,-2
* Add $msizelower and $msizeupper which provide access to the lower and upper
bounds on the number of matches.
* Add support for $set{weighting,coord}.
* Add weightingpurefilter option. Normally a query consisting only of filter
terms won't have relevance weights calculated. This new option allows you to
specify a weighting scheme to use for such queries, with the same values
supported as for the existing weighting option. For example,
$set{weightingpurefilter,coord} will weight such queries by how many filter
terms match each document.
* $filters now includes DATEVALUE, which means we'll force the first page when
reloading or changing page starting from existing URLs upon upgrade to 1.4.1,
but the exact same existing URL could be for a search without the date filter
where we want to force the first page, so there's an inherent ambiguity
there. Forcing first page in this case seems the least problematic
side-effect.
* Implement $match command for omegascript. Patch from Richhiey Thomas.
* Add optional prefix argument to $terms.
* $snippet now uses MSet::snippet() instead of the Snipper class.
* Add $contains{STRING1,STRING2}. Contributed by Ayush Gupta.
* Add support for negated boolean filter terms, specified by CGI parameter "N".
* Support a direction prefix on SORT: '+' for ascending, '-' for descending.
SORTREVERSE set to non-0 now flips the direction. Fixes #697, reported by
Andy Chilton.
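As a rough illustration of the SORT syntax described above (for example SORT=+1,-2), the following sketch parses such a specification into a list of sort keys. It is not Omega's parser: SortKey and parse_sort_spec are hypothetical names, and the direction used for a key without a prefix is left to the caller because this entry does not say what Omega does in that case. SORTREVERSE is not modelled.

```cpp
#include <string>
#include <vector>

struct SortKey {
    unsigned slot;    // value slot number
    bool ascending;   // true for '+', false for '-'
};

// Hypothetical parser for a SORT-style specification such as "+1,-2".
// default_ascending is applied to keys which have no '+' or '-' prefix.
std::vector<SortKey> parse_sort_spec(const std::string& spec,
                                     bool default_ascending) {
    std::vector<SortKey> keys;
    std::string::size_type pos = 0;
    while (pos < spec.size()) {
        std::string::size_type comma = spec.find(',', pos);
        if (comma == std::string::npos) comma = spec.size();
        std::string item = spec.substr(pos, comma - pos);
        pos = comma + 1;
        if (item.empty()) continue;
        bool ascending = default_ascending;
        std::string::size_type start = 0;
        if (item[0] == '+' || item[0] == '-') {
            ascending = (item[0] == '+');
            start = 1;
        }
        // Skip anything which is not a plain decimal slot number.
        if (start == item.size() ||
            item.find_first_not_of("0123456789", start) != std::string::npos)
            continue;
        unsigned slot = static_cast<unsigned>(std::stoul(item.substr(start)));
        keys.push_back({slot, ascending});
    }
    return keys;
}
```

Calling parse_sort_spec("+1,-2", false) yields two keys: slot 1 ascending, then slot 2 descending.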
* Add options argument to $transform.
* Cache compiled regexps used in $transform.
* Add $ord OmegaScript command which returns the Unicode codepoint for the
first character of a UTF-8 string.
* Add $chr OmegaScript command which returns the UTF-8 string for given Unicode
codepoint.
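The conversions behind $ord and $chr can be sketched as follows. This is an illustration of UTF-8 encoding and decoding, not the OmegaScript implementation; utf8_encode and utf8_first_codepoint are hypothetical names, and the handling of malformed input (returning the raw lead byte) is an assumption of this sketch.

```cpp
#include <cstddef>
#include <string>

// Encode a Unicode code point as UTF-8 (the conversion $chr is described
// as performing).  Returns an empty string for values above 0x10FFFF;
// surrogates are not rejected in this sketch.
std::string utf8_encode(unsigned long cp) {
    std::string out;
    if (cp < 0x80) {
        out += static_cast<char>(cp);
    } else if (cp < 0x800) {
        out += static_cast<char>(0xC0 | (cp >> 6));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    } else if (cp < 0x10000) {
        out += static_cast<char>(0xE0 | (cp >> 12));
        out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    } else if (cp < 0x110000) {
        out += static_cast<char>(0xF0 | (cp >> 18));
        out += static_cast<char>(0x80 | ((cp >> 12) & 0x3F));
        out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    }
    return out;
}

// Decode the first code point of a UTF-8 string (the conversion $ord is
// described as performing).  Returns 0 for an empty string and the raw
// lead byte if the sequence is malformed or truncated.
unsigned long utf8_first_codepoint(const std::string& s) {
    if (s.empty()) return 0;
    unsigned char b0 = static_cast<unsigned char>(s[0]);
    std::size_t extra;
    unsigned long cp;
    if (b0 < 0x80) return b0;
    if ((b0 & 0xE0) == 0xC0) { extra = 1; cp = b0 & 0x1F; }
    else if ((b0 & 0xF0) == 0xE0) { extra = 2; cp = b0 & 0x0F; }
    else if ((b0 & 0xF8) == 0xF0) { extra = 3; cp = b0 & 0x07; }
    else return b0;
    if (s.size() < extra + 1) return b0;
    for (std::size_t i = 1; i <= extra; ++i) {
        unsigned char b = static_cast<unsigned char>(s[i]);
        if ((b & 0xC0) != 0x80) return b0;
        cp = (cp << 6) | (b & 0x3F);
    }
    return cp;
}
```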
* Add $csv OmegaScript command which escapes a string for use as a field in a
CSV file ("always quote" mode inspired by patch from Gaurav Arora.)
* New $filters encoding which avoids collisions. We also compare CGI parameter
xFILTERS to what $filters would have returned in previous releases, so that
on upgrades old format serialised filters are handled correctly.
* Fix $jsonarray not to prepend ']' to the first array element.
* Skip weighting scheme setup for a pure date range query - it won't be
weighted anyway, so we can avoid having to parse weighting scheme parameters,
etc.
* Use value ranges when date range filtering by value. Should be more
efficient than a MatchDecider, and will automatically take advantage of any
future value range optimisations in xapian-core.
* Add default_db and default_template config options. These allow the default
database name and default template to be set via the config file, rather than
being stuck with the respective defaults of "default" and "query". Fixes
#310, reported by Marco Hennigs.
* Add support for non-exclusive filters. Fixes #234, reported by Thomas
Viehmann.
* Fix handling of multiple P.<prefix> fields - previously only the first seen
was used. These fields are also now taken into account when deciding if the
query has changed. $query now returns an OmegaScript list with one entry for
each CGI parameter passed.
* Allow setting query expansion scheme to "bo1".
* Make $json and $jsonarray force the text to be valid UTF-8, since
otherwise the output isn't valid JSON.
* Check parameters to $set{weighting,bm25 ...} and $set{weighting,trad ...}
converted OK. Based on patch from Aarsh Shah.
* Add support to $set{weighting,...} for bb2, dlh, dph, ifb2, ineb2, inl2, lm,
pl2 when we're built against a xapian-core which is new enough to have these
schemes.
* Add $snippet to generate a snippet of text tailored to the search.
* Add new $json and $jsonarray OmegaScript commands to support producing JSON
output.
* Add $truncate command which truncates a string after a word.
* Add support for $set{weighting,tfidf} to allow the new TfIdfWeight weighting
scheme to be used.
* DEFAULTOP now defaults to AND rather than OR, since that matches what pretty
much every search engine does these days. Closes ticket #512.
* Allow mapping a query string prefix to more than one term prefix (which
xapian-core has supported since 1.0.4).
* Add support for search inputs for multiple probabilistic prefixes, with
support for per-prefix stemmers.
* Drop legacy support for handling '.' separated terms in xP - that changed in
Omega 0.9.7, more than 5 years ago now.
* Remove support for OLDP CGI parameter which was superseded by xP
approximately a decade ago, and isn't even documented!
* Drop special handling for R-prefixed terms in $prettyterm - we stopped
generating these in Xapian 1.0.
templates:
* Lower case all HTML tags, attributes and values; explicitly close <option>
tags. Patches from Vivek Pal and Nirmal Singhania.
* Migrate Omega Templates to HTML5. Patch from Nirmal Singhania.
* templates/query: Remove stray double quote from generated URL for spelling
suggestion when THRESHOLD is set. Patch from Nirmal Singhania.
* templates/opensearch: Change response feeds to support OpenSearch 1.1.
Patch from Nirmal Singhania.
* templates/query: Fix setting of prefix map for P - in 1.3.2, this
would fail to also search in the subject. Now it also searches in the
subject and topic.
* templates/query:
+ We now map unprefixed queries to include S-prefixed terms to match the
change in omindex to prefixing terms from the title with S. You may want
to make the same update to your own templates.
+ Set up prefixes for 'author:' and 'title:'.
API:
* Constructing a Query for a non-reference counted PostingSource object will
now try to clone the PostingSource object (as happened in 1.3.4 and
earlier). This clone code was removed as part of the changes in 1.3.5 to
support optional reference counting of PostingSource objects, but that breaks
the case when the PostingSource object is on the stack and goes out of scope
before the Query object is used. Issue reported by Till Schäfer and analysed
by Daniel Vrátil in a bug report against Akonadi:
https://bugs.kde.org/show_bug.cgi?id=363741
* Add BM25PlusWeight class implementing the BM25+ weighting scheme, implemented
by Vivek Pal (https://github.com/xapian/xapian/pull/104).
* Add PL2PlusWeight class implementing the PL2+ weighting scheme, implemented
by Vivek Pal (https://github.com/xapian/xapian/pull/108).
* LMWeight: Implement Dir+ weighting scheme as DIRICHLET_PLUS_SMOOTHING.
Patch from Vivek Pal.
* Add CoordWeight class implementing coordinate matching. This can be useful
for specialised uses - e.g. to implement sorting by the number of matching
filters.
* DLHWeight,DPHWeight,PL2Weight: With these weighting schemes, the formulae
can give a negative weight contribution for a term in extreme cases. We
used to try to handle this by calculating a per-term lower bound on the
contribution and subtracting this from the contribution, but this idea
is fundamentally flawed as the total offset it adds to a document depends on
what combination of terms that document matches, meaning in general the
offset isn't the same for every matching document. So instead we now clamp
each term's weight contribution to be >= 0.
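The reasoning in the DLHWeight, DPHWeight and PL2Weight entry above can be made concrete with a small sketch. The function names are hypothetical and the numbers in the usage note are made up; this is not the code in xapian-core, only an illustration of why subtracting per-term lower bounds shifts documents by different amounts while clamping does not introduce such an offset.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Old idea: subtract each matching term's (possibly negative) lower bound
// from its contribution.  Both vectors must have the same length, with one
// entry per term the document matches.
double offset_document_weight(const std::vector<double>& contributions,
                              const std::vector<double>& lower_bounds) {
    double total = 0.0;
    for (std::size_t i = 0; i < contributions.size(); ++i)
        total += contributions[i] - lower_bounds[i];
    return total;
}

// New approach: clamp each term's contribution to be >= 0 before summing.
double clamped_document_weight(const std::vector<double>& contributions) {
    double total = 0.0;
    for (double w : contributions)
        total += std::max(w, 0.0);
    return total;
}
```

With lower bounds of -0.5 and -0.25 for two terms, a document matching only the first term is shifted up by 0.5, while a document matching both is shifted up by 0.75, so the offset changes the relative ranking. Clamping leaves non-negative contributions untouched.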
* TfIdfWeight: Always scale term weight by wqf - this seems the logical
approach as it matches the weighting we'd get if we weighted every non-unique
term in the query, as well as being explicit in the Piv+ formula.
* Fix OP_SCALE_WEIGHT to work with all weighting schemes - previously it was
ignored when using PL2Weight and LMWeight.
* PL2Weight: Greatly improve upper bound on weight:
+ Split the weight equation into two parts and maximise each separately as
that gives an easily solvable problem, and in common cases the maximum is
at the same value of wdfn for both parts. In a simple test, the upper
bounds are now just over double the highest weight actually achieved -
previously they were several hundred times. This approach was suggested by
Aarsh Shah in: https://github.com/xapian/xapian/pull/48
+ Improve upper bound on normalised wdf (wdfn) - when wdf_upper_bound >
doclength_lower_bound, we get a tighter bound by evaluating at
wdf=wdf_upper_bound. In a simple test, this reduces the upper bound on
wdfn by 36-64%, and the upper bound on the weight by 9-33%.
* PL2Weight: Fix calculation of upper_bound when P2>0. P2 is typically
negative, but for a very common term it can be positive and then we should
use wdfn_lower not wdfn_upper to adjust P_max.
* Weight::unserialise(): Check serialised form is empty when unserialising
parameter-free schemes BoolWeight, DLHWeight and DPHWeight.
* TermGenerator::set_stopper_strategy(): New method to control how the Stopper
object is used. Patch from Arnav Jain.
* QueryParser: Fix handling of CJK query over multiple prefixes. Previously
all the n-gram terms were AND-ed together - now we AND together for each
prefix, then OR the results. Fixes #719, reported by Aaron Li.
* Add Database::get_revision() method which provides access to the database
revision number for chert and glass, intended for use by xapiand. Marked
as experimental, so we don't have to go through the usual deprecation cycle
if this proves not to be the approach we want to take. Fixes #709,
reported by German M. Bravo.
* Mark RangeProcessor constructor as `explicit`.
* Update to Unicode 9.0.0.
* Reimplement ESet and ESetIterator as we did for MSet and MSetIterator in
1.3.5. ESetIterator internally now counts down to the end of the ESet, so
the end test is now against 0, rather than against eset.size(). And more of
the trivial methods are now inlined, which reduces the number of relocations
needed to load the library, and should give faster code which is a very
similar size to before.
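The "counts down to the end" design mentioned above can be sketched with a toy container. ResultSet and CountdownIterator are hypothetical stand-ins, not the Xapian ESet and ESetIterator classes; the sketch only shows how storing the number of remaining items turns the end test into a comparison against 0.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a result set.
struct ResultSet {
    std::vector<int> items;
};

// Iterator which stores how many items remain rather than an index, so
// the "at end" test is a comparison against 0 and needs no size() lookup.
class CountdownIterator {
    const ResultSet* set;
    std::size_t off_from_end;

  public:
    CountdownIterator(const ResultSet& s, std::size_t remaining)
        : set(&s), off_from_end(remaining) {}

    bool at_end() const { return off_from_end == 0; }

    // The current position is recovered from the size only on dereference.
    int operator*() const {
        return set->items[set->items.size() - off_from_end];
    }

    CountdownIterator& operator++() {
        --off_from_end;
        return *this;
    }
};

CountdownIterator result_begin(const ResultSet& s) {
    return CountdownIterator(s, s.items.size());
}

CountdownIterator result_end(const ResultSet& s) {
    return CountdownIterator(s, 0);
}

// Example loop: the termination check never consults the container.
int sum_all(const ResultSet& s) {
    int total = 0;
    for (CountdownIterator it = result_begin(s); !it.at_end(); ++it)
        total += *it;
    return total;
}
```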
* MSetIterator and ESetIterator are now STL-compatible random_access_iterators
(previously they were only bidirectional_iterators).
* TfIdfWeight: Support freq and squared IDF normalisations. Patch from Vivek
Pal.
* New Xapian::Query::OP_INVALID to provide an "invalid" query object.
* Reject OP_NEAR/OP_PHRASE with non-leaf subqueries early to avoid a
potential segmentation fault if the non-leaf subquery decayed at
just the wrong moment. See #508.
* Reduce positional queries with a MatchAll or PostingSource subquery to
MatchNothing (since these subqueries have no positional information, so
the query can't match).
* Deprecate ValueRangeProcessor and introduce new RangeProcessor class as
a replacement. The RangeProcessor::operator()() method returns Xapian::Query,
so a range can expand to any query. OP_INVALID is used to signal that
a range is not recognised. Fixes #663.
* Combining of ranges over the same quantity with OP_OR is now handled by
an explicit "grouping" parameter, with a sensible default which works
for value range queries. Boolean term prefixes and FieldProcessor now
support "grouping" too, so ranges and other filters can now be grouped
together.
* Formally deprecate WritableDatabase::flush(). The replacement commit()
method was added in 1.1.0, so code can be switched to use this and still
work with 1.2.x.
* Fix handling of a self-initialised PIMPL object (e.g. Xapian::Query q(q);).
Previously the uninitialised pointer was copied to itself, resulting in
undefined behaviour when the object was used or destroyed. This isn't
something you'd see in normal code, but it's a cheap check which can probably
be optimised away by the compiler (GCC 6 does).
* The Snipper class has been replaced with a new MSet::snippet() method.
The implementation has also been redone - the existing implementation was
slower than ideal, and didn't directly consider the query so would sometimes
select a snippet which doesn't contain any of the query terms (which users
quite reasonably found surprising). The new implementation is faster, will
always prefer snippets containing query terms, and also understands exact
phrases and wildcards. Fixes #211.
* Add optional reference counting support for ErrorHandler, ExpandDecider,
KeyMaker, PostingSource, Stopper and TermGenerator. Fixes #186, reported
by Richard Boulton. (ErrorHandler's reference counting isn't actually used
anywhere in xapian-core currently, but means we can hook it up in 1.4.x if
ticket #3 gets addressed).
* Deprecate public member variables of PostingSource. The new getters and/or
setters added in 1.2.23 and 1.3.5 are preferred. Fixes #499, reported by
Joost Cassee.
* Reimplement MSet and MSetIterator. MSetIterator internally now counts down
to the end of the MSet, so the end test is now against 0, rather than against
mset.size(). And more of the trivial methods are now inlined, which reduces
the number of relocations needed to load the library, and should give faster
code which is a very similar size to before.
* Only issue prefetch hints for documents if MSet::fetch() is called. It's not
useful to send the prefetch hint right before the actual read, which was
happening since the implementation of prefetch hints in 1.3.4. Fixes #671,
reported by Will Greenberg.
* Fix OP_ELITE_SET selection in multi-database case - we were selecting
different sets for each subdatabase, but removing the special case check for
termfreq_max == 0 solves that.
* Remove "experimental" marker from FieldProcessor, since we're happy with the
API as-is. Reported by David Bremner on xapian-discuss.
* Remove "experimental" marker from Database::check(). We've not had any
negative feedback on the current API.
* Database::check() now checks that doccount <= last_docid.
* Database::compact() on a WritableDatabase with uncommitted changes could
produce a corrupted output. We now throw Xapian::InvalidOperationError in
this case, with a message suggesting you either commit() or open the database
from disk to compact from. Reported by Will Greenberg on #xapian-discuss.
* Add Arabic stemmer. Patch from Assem Chelli in
https://github.com/xapian/xapian/pull/45
* Improve the Arabic stopword list. Patch from Assem Chelli.
* Make functions defined in xapian/iterator.h 'inline'.
* Don't force the user to specify the metric in the geospatial API -
GreatCircleMetric is probably what most users will want, so a sensible
default.
* Xapian::DBCHECK_SHOW_BITMAP: This was added in 1.3.0 (so has never been in
a stable release) and was superseded by Xapian::DBCHECK_SHOW_FREELIST in
1.3.2, so just remove it.
* Make setting an ErrorHandler a no-op - this feature is deprecated and we're
not aware of anyone using it. We're hoping to rework ErrorHandler in 1.4.x,
which will be simpler without having to support the current behaviour as well
as the new. See #3.
* Update to Unicode 8.0.0. Fixes #680.
* Overhaul database compaction API. Add a Xapian::Database::compact() method,
with the Database object specifying the source database(s).
Xapian::Compactor is now just a functor to use if you want to control
progress reporting and/or the merging of user metadata. The existing API
has been reimplemented using the new one, but is marked as deprecated.
* Add support for a default value when sorting. Fixes #452, patch from
Richard Boulton.
* Make all functor objects non-copyable. Previously some were, some weren't,
but it's hard to correctly make use of this ability. Fixes #681.
* Fix use after free with WILDCARD_LIMIT_MOST_FREQUENT. If we tried to open a
postlist after processing such a wildcard, the postlist hint could be
pointing to a PostList object which had been deleted. Fixes #696, reported
by coventry.
* Add support for optional reference counting of MatchSpy objects.
* Improve Document::get_description() - the output is now always valid UTF-8,
doesn't contain implementation details like "Document::Internal", and more
clearly reports if the document is linked to a database.
* Remove XAPIAN_CONST_FUNCTION marker from sortable_serialise_() helper, as it
writes to the passed in buffer, so it isn't const or pure. Fixes
decvalwtsource2 testcase failure when compiled with clang.
* Make PostingSource::set_maxweight() public - it's hard to wrap for the
bindings as a protected method. Fixes #498, reported by Richard Boulton.
* Database:
+ Add new flag Xapian::DB_RETRY_LOCK which allows opening a database for
writing to wait until it can get a write lock. (Fixes #275, reported by
Richard Boulton).
+ Fix Database::get_doclength_lower_bound() over multiple databases when some
are empty or consist only of zero-length documents. Previously this would
report a lower bound of zero, now it reports the same lowest bound as a
single database containing all the same documents.
+ Database::check(): When checking a single table, handle the ".glass"
extension on glass database tables, and use the extension to guide the
decision of which backend the table is from.
* Query:
+ Add new OP_WILDCARD query operator, which expands wildcards lazily, so now
we create the PostList tree for a wildcard directly, rather than creating
an intermediate Query tree. OP_WILDCARD offers a choice of ways to limit
wildcard expansion (no limit, throw an exception, use the first N by term
name, or use the most frequent N). (See tickets #48 and #608).
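The four ways of limiting wildcard expansion listed above can be sketched as follows. This is not how xapian-core implements OP_WILDCARD: LimitMode, TermFreq and limit_expansion are hypothetical names, the candidates are assumed to arrive sorted by term name, and breaking frequency ties by term name is a choice made for this example.

```cpp
#include <algorithm>
#include <cstddef>
#include <stdexcept>
#include <string>
#include <utility>
#include <vector>

// Illustrative names only; these are not the Xapian constants.
enum class LimitMode { None, Error, FirstN, MostFrequentN };

// A candidate term and its term frequency.
using TermFreq = std::pair<std::string, unsigned>;

// candidates: the terms matching the wildcard, assumed sorted by term name.
std::vector<TermFreq> limit_expansion(std::vector<TermFreq> candidates,
                                      std::size_t max_terms,
                                      LimitMode mode) {
    if (mode == LimitMode::None || candidates.size() <= max_terms)
        return candidates;
    if (mode == LimitMode::Error)
        throw std::runtime_error("wildcard expands to too many terms");
    if (mode == LimitMode::MostFrequentN) {
        // Move the max_terms most frequent terms to the front.
        std::partial_sort(
            candidates.begin(), candidates.begin() + max_terms,
            candidates.end(),
            [](const TermFreq& a, const TermFreq& b) {
                if (a.second != b.second) return a.second > b.second;
                return a.first < b.first;
            });
    }
    // For FirstN the input order (by term name) is kept as-is.
    candidates.erase(candidates.begin() + max_terms, candidates.end());
    return candidates;
}
```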
* QueryParser:
+ Add new set_max_expansion() method which provides access to OP_WILDCARD's
choice of ways to limit expansion and can set limits for partial terms as
well as for wildcards. Partial terms now default to the 100 most frequent
matching terms. (Completes #608, reported by boomboo).
+ Deprecate set_max_wildcard_expansion() in favour of set_max_expansion().
* Add support for optional reference counting of FieldProcessor and
ValueRangeProcessor objects.
* Update Unicode character database to Unicode 7.0.0.
* New Xapian::Snipper class from Mihai Bivol's GSOC 2012 project. (mostly
fixes #211)
* Fix all get_description() methods to always return UTF-8 text. (fixes #620)
* Database::check():
+ Alter to take its "out" parameter as a pointer to std::ostream instead of a
reference, and make passing NULL mean "do not produce output", and make
the second and third parameters optional, defaulting to a quiet check.
+ Escape invalid UTF-8 data in keys and tags reported by xapian-check, using
the same code we use to clean up strings returned by get_description()
methods.
+ Correct failure message which talks about the root block when it's actually
testing a leaf key.
+ Rename DBCHECK_SHOW_BITMAP to DBCHECK_SHOW_FREELIST (old name still
provided for now, but flagged as deprecated - DBCHECK_SHOW_BITMAP was new
in 1.3.0, so will likely be removed before 1.4.0).
* Methods and functions which take a string to unserialise now consistently
call that parameter "serialised".
* Weight: Make number of distinct terms indexing each document and the
collection frequency of the term available to subclasses. Patch from
Gaurav Arora's Language Modelling branch.
* WritableDatabase: Add support for multiple subdatabases, and support opening
a stub database containing multiple subdatabases as a WritableDatabase.
* WritableDatabase can now be constructed from just a pathname (defaulting to
opening the database with DB_CREATE_OR_OPEN).
* WritableDatabase: Add flags which can be bitwise OR-ed into the second
argument when constructing:
+ Xapian::DB_NO_SYNC: to disable use of fsync, etc
+ Xapian::DB_DANGEROUS: to enable in-place updates
+ Xapian::DB_BACKEND_CHERT: if creating, create a chert database
+ Xapian::DB_BACKEND_GLASS: if creating, create a glass database
+ Xapian::DB_NO_TERMLIST: create a database without a termlist (see #181)
+ Xapian::DB_FULL_SYNC flag - if this is set for a database, we use the Mac
OS X F_FULLFSYNC instead of fdatasync()/fsync()/etc on the version file
when committing.
* Database: Add optional flags argument to constructor - the following can be
bitwise OR-ed into it:
+ Xapian::DB_BACKEND_CHERT (only open a chert database)
+ Xapian::DB_BACKEND_GLASS (only open a glass database)
+ Xapian::DB_BACKEND_STUB (only open a stub database)
* Xapian::Auto::open_stub() and Xapian::Chert::open() are now deprecated in
favour of these new flags.
* Add LMWeight class, which implements the Unigram Language Modelling weighting
scheme. Patch from Gaurav Arora.
* Add implementations of a number of DfR weighting schemes (BB2, DLH, DPH,
IfB2, IneB2, InL2, PL2). Patches from Aarsh Shah.
* Add support for the Bo1 query expansion scheme. Patch from Aarsh Shah.
* Add Enquire::set_time_limit() method which sets a time limit after which
check_at_least will be disabled.
* Database: Trying to perform operations on a database with no subdatabases now
throws InvalidOperationError not DocNotFoundError.
* Query: Implement new OP_MAX query operator, which returns the maximum weight
of any of its subqueries. (see #360)
* Query: Add methods to allow introspection on Query objects - currently you
can read the leaf type/operator, how many subqueries there are, and get a
particular subquery. For a query which is a term, Query::get_terms_begin()
allows you to get the term. (see #159)
* Query: Only simplify OP_SYNONYM with a single subquery if that subquery is a
term or MatchAll.
* Avoid two vector copies when storing term positions in most common cases.
* Reimplement version functions to use a single function in libxapian which
returns a pointer to a static const struct containing the version
information, with inline wrappers in the API header which call this. This
means we only need one relocation instead of 4, reducing library load time a
little.
* Make TermGenerator flags an anonymous enum, and typedef TermGenerator::flags
to int for backward compatibility with existing user code which uses it.
* Stem: Fix incorrect Unicode codepoints for o-double-acute and u-double-acute
in the Hungarian Snowball stemmer. Reported by Tom Lane to snowball-discuss.
* Stem: Add an Early English stemmer.
* Provide the stopword lists from Snowball plus an Arabic one, installed in
${prefix}/share/xapian-core/stopwords/. Patch from Assem Chelli, fixes #269.
* Improve check for direct inclusion of Xapian subheaders in user code to
catch more cases.
* Add simple API to help with creating language-idiomatic iterator wrappers
in <xapian/iterator.h>.
* Give a compilation error if user code tries to include API headers other
than xapian.h directly - these other headers are an internal implementation
detail, but experience has shown that some people try to include them
directly. Please just use '#include <xapian.h>' instead.
* Update Unicode character database to Unicode 6.2.0.
* Add FieldProcessor class (ticket #128) - currently marked as an experimental
API while we work out exactly how it interacts with
other QueryParser features.
* Add implementation of several TF-IDF weighting schemes via a new TfIdfWeight
class.
* Add ExpandDeciderFilterPrefix class which only returns terms with a particular
prefix. (fixes #467)
* QueryParser: Adjust handling of Unicode opening/closing double quotes - if a
quoted boolean term was started with ASCII double quote, then only ASCII
double quote can end it, as otherwise it's impossible to quote a term
containing Unicode double quotes.
* Database::check(): If the database can't be opened, don't emit a bogus
warning about there being too many documents to cross-check doclens.
* TradWeight,BM25Weight: Throw SerialisationError instead of NetworkError if
unserialise() fails.
* QueryParser: Change the default stemming strategy to STEM_SOME, to eliminate
the API gotcha that setting a stemmer is ignored until you also set a
strategy.
* Deprecate Xapian::ErrorHandler. (ticket #3)
* Stem: Generate a compact and efficient table to decode language names. This
is both faster and smaller than the approach we were using, with the added
benefit that the table is auto-generated.
* xapian.h:
+ Add check for Qt headers being included before us and defining
'slots' as a macro - if they are, give a clear error advising how to work
around this (previously compilation would fail with a confusing error).
+ Add a similar check for Wt headers which also define 'slots' as a macro
by default.
* Update Unicode character database to Unicode 6.1.0. (ticket #497)
* TermIterator returned by Enquire::get_matching_terms_begin(),
Query::get_terms_begin(), Database::synonyms_begin(),
QueryParser::stoplist_begin(), and QueryParser::unstem_begin() now stores the
list of terms to iterate much more compactly.
* QueryParser:
+ Allow Unicode curly double quote characters to start and/or end phrases.
+ The set_default_op() method will now reject operators which don't make
sense to set. The operators which are allowed are now explicitly
documented in the API docs.
* Query: The internals have been completely reimplemented (ticket #280). The
notable changes are:
+ Query objects are smaller and should be faster.
+ More readable format for Query::get_description().
+ More compact serialisation format for Query objects.
+ Query operators are no longer flattened as you build up a tree (but the
query optimiser still combines groups of the same operator). This means
that Query objects are truly immutable, and so we don't need to copy Query
objects when composing them. This should also fix a few O(n*n) cases when
building up an n-way query pair-wise. (ticket #273)
+ The Query optimiser can do a few extra optimisations.
* There's now explicit support for geospatial search (this API is currently
marked as experimental). (ticket #481)
* There's now an API (currently experimental) for checking the integrity of
databases (partly addresses ticket #238).
* Database::reopen() now returns true if the database may have been reopened
(previously it returned void). (ticket #548)
* Deprecate Xapian::timeout in favour of POSIX type useconds_t.
* Deprecate Xapian::percent and use int instead in the API and our own code.
* Deprecate Xapian::weight typedef in favour of just using double and change
all uses in the API and our own code. (ticket #560)
* Rearrange members of Xapian::Error to reduce its size (from 48 to 40 bytes on
x86-64 Linux).
* Assignment operators for PositionIterator and TermIterator now return *this
rather than void.
* PositionIterator, PostingIterator, TermIterator and ValueIterator now
handle their reference counts in hand-crafted code rather than using
intrusive_ptr/RefCntPtr, which means the compiler can inline the destructor
and default constructor, so a comparison to an end iterator should now
optimise to a simple NULL pointer check, but without the issues which the
ValueIteratorEnd_ proxy class approach had (such as not working in templates
or some cases of overload resolution).
* Enquire:
+ Previously, Enquire::get_matching_terms_begin() threw InvalidArgumentError
if the query was empty. Now we just return an end iterator, which is more
consistent with how empty queries behave elsewhere.
+ Remove the deprecated old-style match spy approach of using a MatchDecider.
* Remove deprecated Sorter class and MultiValueSorter subclass.
* Xapian::Stem:
+ Add stemmers for Armenian (hy), Basque (eu), and Catalan (ca).
+ Stem::operator= now returns a reference to the assigned-to object.
testsuite:
* OP_SCALE_WEIGHT: Check top weight is non-zero - if it is zero, tests which
try to check that OP_SCALE_WEIGHT works will always pass.
* testsuite: Check SerialisationError descriptions from Xapian::Weight
subclasses mention the weighting scheme name.
* Merge queryparsertest and termgentest into apitest. Their testcases now use
the backend manager machinery in the testharness, so we don't have to
hard-code use of inmemory and chert backends, but instead run them under all
backends which support the required features. This fixes some test failures
when both chert and glass are disabled due to trying to run spelling tests
with the inmemory backend.
* Avoid overflowing collection frequency in totaldoclen1. We're trying to test
total document length doesn't wrap, so avoid collection freq overflowing in
the process, as that triggers errors when running the testsuite under ubsan.
We should handle collection frequency overflow better, but that's a separate
issue.
* Add some test coverage for ESet::get_ebound().
* Fix testcase notermlist1 to check correct table extension - ".glass" not
".DB" (chert doesn't support DB_NO_TERMLIST).
* unittest: We can't use Assert() to unit test noexcept code as it throws an
exception if it fails. Instead set up macros to set a variable and return if
an assertion fails in a unittest testcase, and check that variable in the
harness.
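The approach in the unittest entry above (set a variable and return, rather than throw) might look roughly like this. All names here (unittest_failed, UNITTEST_CHECK, run_testcase and the sample testcases) are hypothetical; the real macros in the Xapian test harness are not shown in this changelog.

```cpp
// Set by a failed check; inspected by the harness after each testcase.
static bool unittest_failed = false;

// Record the failure and leave the testcase without throwing, so the
// check can be used in and around noexcept code.
#define UNITTEST_CHECK(COND) \
    do { \
        if (!(COND)) { \
            unittest_failed = true; \
            return; \
        } \
    } while (false)

// A noexcept function under test.
int clamp_to_byte(int v) noexcept {
    return v < 0 ? 0 : (v > 255 ? 255 : v);
}

void test_clamp() noexcept {
    UNITTEST_CHECK(clamp_to_byte(-5) == 0);
    UNITTEST_CHECK(clamp_to_byte(300) == 255);
    UNITTEST_CHECK(clamp_to_byte(42) == 42);
}

// A deliberately failing testcase to show the failure path.
void test_failing() noexcept {
    UNITTEST_CHECK(1 + 1 == 3);
}

// Minimal harness: run one testcase and report whether it passed.
bool run_testcase(void (*testcase)()) {
    unittest_failed = false;
    testcase();
    return !unittest_failed;
}
```

Because nothing is thrown, a failed check inside a noexcept testcase does not end up calling std::terminate().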
* Add unit test for internal C_isupper(), etc functions.
* If command line option --verbose/-v isn't specified, set the verbosity level
from environmental variable VERBOSE.
* Re-enable replicate3 for glass, as it no longer fails.
* Add more test coverage for get_unique_terms().
* Don't leave an extra fd open when starting xapian-tcpsrv for remotetcp tests.
* Extend checkstatsweight1 to check that Weight::get_collection_freq() returns
the same number as Database::get_collection_freq().
* queryparsertest: Add testcase for FieldProcessor on boolean prefix with
quoted contents.
* queryparsertest: Enable some disabled cases which actually work (in some
cases with slightly tweaked expected answers which are equivalent to those
that were shown).
* Make use of the new writable multidatabase feature to simplify the
multi-database handling in the test harness.
* Change querypairwise1_helper to repeat the query build 100 times, as with a
fast modern machine we were sometimes trying with so many subqueries that we
would run out of stack.
* apitest: Use Xapian::Database::check() in cursordelbug1. (partly addresses
#238)
* apitest: Test Query ops with a single MatchAll subquery.
* apitest: New testcase readonlyparentdir1 to ensure that commit works with a
read-only parent directory.
* tests/generate-api_generated: Test that the string returned by a
get_description() method isn't empty.
* Use git commit hash in title of test coverage reports generated from a git
tree.
* Make unittest use the test harness, so it gets all the valgrind and fd leak
checks, and other handy features all the other tests have.
* Improve test coverage in several places.
* Compress generated HTML files in coverage report.
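
The unittest entry above (macros which set a variable and return, rather than
throwing) can be sketched roughly as below. This is only an illustration of
the idea: UNITTEST_CHECK, unittest_failed, run_testcase and the two testcases
are hypothetical names, not the macros actually used in the Xapian tree.

```cpp
#include <cstdio>

// Hypothetical flag which the harness inspects after each testcase.
bool unittest_failed = false;

// Hypothetical check macro: on failure it reports, sets the flag and returns
// from the testcase, so nothing is thrown through noexcept code.
#define UNITTEST_CHECK(COND) \
    do { \
        if (!(COND)) { \
            std::fprintf(stderr, "%s:%d: check failed: %s\n", \
                         __FILE__, __LINE__, #COND); \
            unittest_failed = true; \
            return; \
        } \
    } while (false)

// Two example testcases, both declared noexcept.
void check_addition() noexcept { UNITTEST_CHECK(1 + 1 == 2); }
void check_broken() noexcept { UNITTEST_CHECK(1 + 1 == 3); }

// Minimal harness step: run one testcase and report whether it passed.
bool run_testcase(void (*testcase)()) {
    unittest_failed = false;
    testcase();
    return !unittest_failed;
}
```

A throwing assertion inside a noexcept function would call std::terminate()
and take the whole test run down, whereas here run_testcase(check_broken)
simply reports a failure.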
matcher:
* Fix stats passed to Weight with OP_SYNONYM. Previously the number of
unique terms was never calculated, and a term which matched all documents
would be optimised to an all-docs postlist, which fails to supply the
correct wdf info.
* Use floating point calculation for OR synonym freq estimates. The division
was being done as an integer division, which means the result was always
getting rounded down rather than rounded to the nearest integer.
* Fix upper bound on matches for OP_XOR. Due to a reversed conditional, the
estimate could be one too low in some cases where the XOR matched all the
documents in the database.
* Improve lower bound on matches for OP_XOR. Previously the lower bound was
always set to 0, which is valid, but we can often do better.
* Optimise value range which is a superset of the bounds. If the value
frequency is equal to the doccount, such a range is equivalent to MatchAll,
and we now avoid having to read the valuestream at all.
* Optimise OP_VALUE_RANGE when the upper bound can't be exceeded. In this
case, we now use ValueGePostList instead of ValueRangePostList.
* Streamline collation of statistics for use by weighting schemes - tests show
a 2% or so increase in speed in some cases.
* If a term matches all documents and its weight doesn't depend on its wdf, we
can optimise it to MatchAll (the previous requirement that maxpart == 0 was
unnecessarily strict).
* Fix the check for a term which matches all documents to use the sub-db
termfreq, not the combined db termfreq.
* When we optimise a postlist for a term which matches all documents to use
MatchAll, we still need to set a weight object on it to get percentages
calculated correctly.
* Drop MatchNothing subqueries in OR-like situations in add_subquery() rather
than adding them and then handling it later.
* Handle the left side of AND_NOT and AND_MAYBE being MatchNothing in
add_subquery() rather than in done().
* Handle QueryAndLike with a MatchNothing subquery in add_subquery() rather
than done().
* Query: Multi-way operators now store their subquery pointers in a custom
class rather than std::vector<Xapian::Query>. The custom class takes the
same amount of space, or often less. It's particularly efficient when
there are two subqueries, which is very desirable as we no longer flatten a
subtree of the same operator as we build the query.
* Optimise an unweighted query term which matches all the documents in a
subdatabase to use the "MatchAll" postlist. (ticket#387)
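
To illustrate the OR synonym freq estimate entry above, here is a minimal
sketch of the difference between integer division and a floating point
calculation rounded to the nearest integer. The helper names and the
est * num / den form are assumptions made for the example; this is not the
formula the matcher actually uses.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>

// Hypothetical helper: scale an estimate by num/den using integer division,
// which always rounds the result down.
std::uint32_t scale_truncating(std::uint32_t est, std::uint32_t num,
                               std::uint32_t den) {
    assert(den != 0);
    return static_cast<std::uint32_t>(std::uint64_t(est) * num / den);
}

// The same scaling done in floating point and rounded to the nearest integer.
std::uint32_t scale_rounding(std::uint32_t est, std::uint32_t num,
                             std::uint32_t den) {
    assert(den != 0);
    double scaled = double(est) * num / den;
    return static_cast<std::uint32_t>(std::lround(scaled));
}
```

For example, scaling 10 by 2/3 gives 6 with the truncating version and 7 with
the rounding version, since the exact value is about 6.67.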
glass backend:
* Fix allterms with prefix on glass with uncommitted changes. Glass aims to
flush just the relevant postlist changes in this case but the end of the
range to flush was wrong, so we'd only actually flush changes for a term
exactly matching the prefix. Fixes #721.
* Fix Database::check() parsing of glass changes file header. In practice this
was unlikely to actually cause problems.
* Make glass the default backend. The format should now be stable, except
perhaps in the unlikely event that a bug emerges which requires a format
change to address.
* Don't explicitly store the 2 byte "component_of" counter for the first
component of every Btree entry in leaf blocks - instead use one of the upper
bits of the length to store a "first component" flag. This directly saves 2
bytes per entry in the Btree, plus additional space due to fewer blocks and
fewer levels being needed as a result. This particularly helps the position
table, which has a lot of entries, many of them very small. The saving would
be expected to be a little less than the saving from the change which shaved
2 bytes off every Btree item in 1.3.4 (since that saved 2 bytes multiple times
for large entries which get split into multiple items). A simple test
suggests a saving of several percent in total DB size, which fits that. This
change reduces the maximum component size to 8194, which affects tables
with a 64KB blocksize in normal use and tables with >= 16KB blocksize with
full compaction.
* Refactor glass backend key comparison - == and < operations are replaced by
a compare() function which returns negative, 0 or positive (like strcmp(),
memcmp() and std::string::compare()). This allows us to avoid a final compare
to check for equality when binary chopping, and to terminate early if the
binary chop hits the exact entry.
* If a cursor is moved to an entry which doesn't exist, we need to step back to
the first component of the previous entry before we can read its tag. However
we often don't actually read its tag (e.g. if we only wanted the key), so make
this stepping back lazy so we can avoid doing it when we don't want to read
the tag.
* Avoid creating std::string objects to hold data when compressing and
decompressing tags with zlib.
* Store minimum compression length per table in the version file, with 0
meaning "don't compress". Currently you can only change this setting with a
hex editor on the file, but now it is there we can later make use of it
without needing a database format change.
* Database::check() now performs additional consistency checks for glass.
Reported by Jean-Francois Dockes and Bob Cargill via xapian-discuss.
* Database::check(): check docids don't exceed db_last_docid when checking
a single glass table.
* We now throw DatabaseCorruptError in a few cases where it's appropriate
but we didn't previously, in particular in the case where all the files in a
DB have been truncated to zero size (which makes handling of this case
consistent with chert).
* Fix compaction to a single file which already exists. This was hanging.
Noted by Will Greenberg on #xapian.
* Shave 2 bytes off every Btree item (which will probably typically reduce
database size by several percent).
* More compact item format for branch blocks - 2 bytes per item smaller. This
means each branch block can branch more ways, reducing the number of Btree
levels needed, which is especially helpful for cold-cache search times.
* Track an upper bound on spelling word frequency. This isn't currently used,
but will be useful for improving the spelling algorithm, and we want to
stabilise the glass backend format. See #225, reported by Philip Neustrom.
* Support 64-bit docids in the glass backend on-disk format. This changes the
encoding used by pack_uint_preserving_sort() to one which supports 64 bit
values, and is a byte smaller for values 16384-32767, and the same size for
all other 32 bit values. Fixes #686, from original report by James Aylett.
* Use memcpy() not memmove() when no risk of overlap.
* Store length of just the key data itself, allowing keys to be up to 255 bytes
long - the previous limit was 252.
* Change glass to store DB stats in the version file. Previously we stored
them in a special item in the postlist table, but putting them in the version
file reduces the number of block reads required to open the database, is
simpler to deal with, and means we can potentially recalculate tight upper
and lower bounds for an existing database without having to commit a new
revision.
* Add support for a single-file variant for glass. Currently such databases
can only be opened for reading - to create one you need to use
xapian-compact (or its API equivalent). You can embed such databases within
another file, and open them by passing in a file descriptor open on that file
and positioned at the offset the database starts at. Database::check() also
supports them. Fixes #666, reported by Will Greenberg (and previously
suggested on xapian-discuss by Emmanuel Engelhart).
* Avoid potential DB corruption with full-compaction when using 64K blocks.
* Where posix_fadvise() is available, use it to prefetch postlist Btree blocks
from the level below the root block which will be needed for postlists of
terms in the query, and similarly for the docdata table when MSet::fetch() is
called. Based on patch by Will Greenberg in #671.
* When reporting freelist errors during a database check, distinguish between a
block in use and in the freelist, and a block in the freelist more than once.
* Fix compaction and database checking for the change to the format of keys
in the positionlist table which happened in 1.3.2.
* After splitting a block, we always insert the new block in the parent right
after the block it was split from - there's no need to binary chop.
* Avoid infinite recursion when we hit the end of the freelist block we're
reading and the end of the block we're writing at the same time.
* Fix freelist handling to allow for the newly loaded first block of the
freelist being already used up.
* 'brass' backend renamed to 'glass' - we decided to use names in ascending
alphabetical order to make it easier to understand which backend is newest,
and since 'flint' was used recently, we skipped over 'd', 'e' and 'f'.
* Change positionlist keys to be ordered by term first rather than docid first,
which helps phrase searching significantly. For more efficient indexing,
positionlist changes are now batched up in memory and written out in key
order.
* Use a separate cursor for each position list - now we're ordering the
position B-tree by term first, phrase matching would cause a single cursor
to cycle between disparate areas of the B-tree and reread the same blocks
repeatedly.
* Reference count blocks in the btree cursor, so cursors can cheaply share
blocks. This can significantly reduce the amount of memory used by cursors
for queries which contain a lot of terms (e.g. wildcards which expand to a
lot of terms).
* Under glass, optimise the turning of a query into a postlist to reuse the
cursor blocks which are the same as the previous term's postlist. This is
particularly effective for a wildcard query which expands to a lot of terms.
* Keep track of unused blocks in the Btrees using freelists rather than
bitmaps. (fixes #40)
* Eliminate the base files, and instead store the root block and freelist
pointers in the "iamglass" file.
* When compacting, sync all the tables together at the end.
* In DB_DANGEROUS mode, update the version file in-place.
* Only actually store the document data if it is non-empty. The table which
holds the document data is now lazily created, so won't exist if you never
set the document data.
* Iterating positional data now decodes it lazily, which should speed up
phrases which include common words.
* Compress changesets in brass replication. Increments the changeset version.
Ticket #348
* Restore two missing lines in database checking where we report a block with
the wrong level.
* When checking if a block was newly allocated in this revision, just look
at its revision number rather than consulting the base file's bitmap.
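
The key comparison refactoring noted above (a three-way compare() which lets
the binary chop stop early and skip the final equality check) can be sketched
as follows. This works on a plain sorted std::vector of std::string keys
rather than on Btree blocks; ChopResult and binary_chop are hypothetical names
for the example and are not taken from the glass backend source.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Result of a chop: the position found, and whether the key there is an
// exact match for the target.
struct ChopResult {
    std::size_t pos;
    bool exact;
};

// Binary chop over sorted keys using a single three-way comparison per step.
// On an exact hit we return at once; otherwise pos is the index of the first
// key greater than the target (keys.size() if there is none).
ChopResult binary_chop(const std::vector<std::string>& keys,
                       const std::string& target) {
    std::size_t lo = 0;
    std::size_t hi = keys.size();
    while (lo < hi) {
        std::size_t mid = lo + (hi - lo) / 2;
        int cmp = keys[mid].compare(target);
        if (cmp == 0) return {mid, true};  // Early exit on the exact entry.
        if (cmp < 0) {
            lo = mid + 1;
        } else {
            hi = mid;
        }
    }
    return {lo, false};  // No separate == test is needed here.
}
```

With separate == and < operations the chop would need one more comparison
after the loop to find out whether the entry it landed on is an exact match.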
remote backend:
* Improve handling of invalid remote stub entries: Entries without a colon now
give an error rather than being quietly skipped; IPv6 isn't yet supported,
but entries with IPv6 addresses now result in saner errors (previously the
colons confused the code which looks for a port number).
* Fix hook for remote support of user weighting schemes. The commented-out
code used entirely the wrong class - now we use the server object we have
access to, and forward the method to the class which needs it.
* Avoid dividing zero by zero when calculating the average length for an empty
database.
* Bump remote protocol version to 38.0, due to extra statistics being tracked
for weighting.
* Make Weight::Internal track if any max_part values are set, so we don't need
to serialise them when they've not been set.
* Prefix compress list of terms and metadata keys in the remote protocol.
This requires a remote protocol major version bump.
* When propagating exceptions from a remote backend server, the protocol now
sends a numeric code to represent which exception is being propagated, rather
than the name of the type, as a number can be turned back into an exception
with a simple switch statement and is also less data to transfer.
(ticket#471)
* Remote protocol (these changes require a protocol major version bump):
+ Unify REPLY_GREETING and REPLY_UPDATE.
+ Send (last_docid - doccount) instead of last_docid and (doclen_ubound -
doclen_lbound) instead of doclen_ubound.
* Remove special check which gives a more helpful error message when a modern
client is used against a remote server running Xapian <= 0.9.6.
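
As an illustration of the prefix compression entry above, the sketch below
encodes a sorted list of terms by storing, for each term, how many leading
bytes it shares with the previous term followed by the remaining suffix. The
byte layout (one byte of reuse length, one byte of suffix length) and the
function names are assumptions for the example only, not the actual remote
protocol encoding.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Encode sorted terms as: <reuse length byte> <suffix length byte> <suffix>.
// Assumes each suffix is at most 255 bytes long.
std::string prefix_compress(const std::vector<std::string>& sorted_terms) {
    std::string out, prev;
    for (const std::string& term : sorted_terms) {
        std::size_t limit =
            std::min({prev.size(), term.size(), std::size_t(255)});
        std::size_t reuse = 0;
        while (reuse < limit && prev[reuse] == term[reuse]) ++reuse;
        std::size_t suffix_len = term.size() - reuse;
        assert(suffix_len <= 255);
        out += static_cast<char>(reuse);
        out += static_cast<char>(suffix_len);
        out.append(term, reuse, suffix_len);
        prev = term;
    }
    return out;
}

// Reverse of prefix_compress(); stops at the first malformed entry.
std::vector<std::string> prefix_decompress(const std::string& data) {
    std::vector<std::string> terms;
    std::string prev;
    std::size_t i = 0;
    while (i + 2 <= data.size()) {
        std::size_t reuse = static_cast<unsigned char>(data[i]);
        std::size_t suffix_len = static_cast<unsigned char>(data[i + 1]);
        i += 2;
        if (reuse > prev.size() || suffix_len > data.size() - i) break;
        prev.resize(reuse);
        prev.append(data, i, suffix_len);
        i += suffix_len;
        terms.push_back(prev);
    }
    return terms;
}
```

Sorted term lists tend to share long prefixes between neighbouring entries,
which is why this kind of encoding reduces the amount of data sent.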
chert backend:
* When using 64-bit Xapian::docid, consistently use the actual maximum valid
docid value rather than the maximum value the type can hold.
* Where posix_fadvise() is available, use it to prefetch postlist Btree blocks
from the level below the root block which will be needed for postlists of
terms in the query, and similarly for the record table when MSet::fetch() is
called. Based on patch by Will Greenberg in #671.
* Fix problems with get_unique_terms() on a modified chert database.
* Fix xapian-check on a single chert table, which seg faulted in 1.3.2.
* Improve DBCHECK_FIX:
+ if fixing a whole database, we now take the revision from the first table
we successfully look at, which should be correct in most cases, and is
definitely better than trying to determine the revision of each broken
table independently.
+ handle a zero-sized .DB file.
+ After we successfully regenerate baseA, remove any empty baseB file to
prevent it causing problems. Tracked down with help from Phil Hands.
* Iterating positional data now decodes it lazily, which should speed up
phrases which include common words.
flint backend:
* Remove flint backend.