26 commits

Author SHA1 Message Date
schmonz
bdc2ac9967 Update to 1.2.18. From the changelog:
indexers:

* omindex:

  + Work around libmagic returning a MIME content-type of "Composite Document
    File V2 Document[...]" or "application/CDFV2-corrupt" by returning a more
    suitable filetype based on looking at the file's extension.

  + The starting URL wasn't previously URL encoded.  In 1.3.2, this will be
    fixed by URL encoding it as we do for the rest of the path, for the 1.2
    branch we only URL encode it if it contains a character <= 31 or at least
    one of '#', '%', ':' or '?'.  This avoids a one-off reindex of every
    document in the database in cases which work OK in practice.

  + When we skip a file because it exceeds the configured size limit, include
    that size limit in the message.
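
As a rough illustration of the 1.2 branch rule for the starting URL described above, the check could be sketched in Python as below. The function names are hypothetical, and the use of urllib.parse.quote with '/' left unescaped is an assumption made for the example; omindex is written in C++ and its real encoder may treat other characters differently.

```python
from urllib.parse import quote

# Characters which trigger encoding under the 1.2 branch rule.
TRIGGER_CHARS = set("#%:?")


def needs_encoding(start_url: str) -> bool:
    """True if the URL has a character <= 31 or one of '#', '%', ':', '?'."""
    return any(ord(ch) <= 31 or ch in TRIGGER_CHARS for ch in start_url)


def encode_start_url(start_url: str) -> str:
    """Encode only when the rule fires, so other URLs are left as they were.

    Assumption: quote() with safe="/" stands in for the real encoder.
    """
    if needs_encoding(start_url):
        return quote(start_url, safe="/")
    return start_url
```

Leaving URLs that don't trigger the rule untouched is what avoids the one-off reindex mentioned above, since the stored URLs for those documents do not change.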

omega:

* Add support for setting the query expansion scheme to use.

portability:

* Don't compile in unixperm.cc - it isn't currently used, and it fails to build
  with mingw.  (fixes #635, reported by Alexis Denis)

* Fix warning when built with GCC 4.7.2 using -Os.

* Removed unused inline function, fixing compiler warning.
2014-07-06 15:21:32 +00:00
wiz
7eeb51b534 Bump for perl-5.20.0.
Do it for all packages that
* mention perl, or
* have a directory name starting with p5-*, or
* depend on a package starting with p5-
like last time, for 5.18, where this didn't lead to complaints.
Let me know if you have any this time.
2014-05-29 23:35:13 +00:00
schmonz
13878cb17f Update to 1.2.17. From the changelog:
documentation:

* docs/overview.html: Add Abiword as an example use of --filter, based on patch
  from Frank J Bruzzaniti (fixes #383).

portability:

* Fix "no previous declaration" warning on platforms which don't have
  mkdtemp().

indexers:

* omindex:

  + Fix off-by-one when finding documents to delete which would sometimes cause
    omindex to fail to delete documents from the database when they weren't
    refound during an index update.

  + Decode dates in xlsx files.

  + Ignore extensions 'adm', 'cur', and 'ico' by default.

  + Group-readable files which are owner-readable but not world-readable should
    still get a "readable by owner" term added.  Reported by Emmanuel Garette.

build system:

* Compress source tarballs with xz instead of gzip.

* configure: Sync compiler warning flag machinery against xapian-core.  The
  changes are special handling for clang, passing -fshow-column where
  supported, and handling for new warning flags in GCC 4.6 and 4.7.
2014-02-20 19:15:43 +00:00
schmonz
1fdd3822e7 Update to 1.2.15. From the changelog:
Omega 1.2.15 (2013-04-16):

omega:

* Don't pointlessly link utf8convert.o into the omega CGI.

Omega 1.2.14 (2013-03-14):

indexers:

* omindex:

  + Correct "max" -> "min" when reserving space for shared strings in .xlsx
    files.  This just means we now reserve a more appropriate amount of space
    to start with.

  + Ignore .com files by default.

Omega 1.2.13 (2013-01-09):

indexers:

* omindex:

  + Extracting text using external filters now works for filenames containing a
    newline character - previously the newline got lost during escaping for the
    shell.

  + Fix segfault when -F option without a ':' is passed.

  + Skip a file if we get a read error while calculating the MD5 checksum (used
    for duplicate detection) - previously we used a checksum of the file up to
    that point.

  + Avoid rereading SVG and Atom files when we calculate their MD5 checksums.

  + Improve --help output and man page, most notably:

    - Say explicitly that --sample-size accepts the same formats as --max-size.

    - Note default size limit on files to index is unlimited.

  + When generating a sample for a CSV file, limit the size we pre-allocate to
    the CSV file size if that's smaller than the requested sample size, in case
    the user sets that limit very high.

omega:

* Fix to decode %-encoded character at the end of the query string.
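
The 1.2.13 change to skip a file on a read error during MD5 calculation, rather than use a checksum of the data read so far, can be pictured with a minimal Python sketch. The function name md5_or_none and the chunk size are hypothetical, and this is not omindex's own code (which is C++); it only shows the "all or nothing" behaviour.

```python
import hashlib
from typing import Optional


def md5_or_none(path: str, chunk_size: int = 65536) -> Optional[str]:
    """Return the hex MD5 of the whole file, or None if it can't be read.

    A partial digest is never returned: any OSError (including a failure
    to open the file) discards what was hashed so far.
    """
    digest = hashlib.md5()
    try:
        with open(path, "rb") as handle:
            # Read in chunks until read() returns an empty bytes object.
            for chunk in iter(lambda: handle.read(chunk_size), b""):
                digest.update(chunk)
    except OSError:
        return None  # Caller should skip this file.
    return digest.hexdigest()
```

A caller would treat None as "skip this file", so a truncated checksum can never be mistaken for a duplicate of another document.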

Omega 1.2.12 (2012-06-27):

No changes since 1.2.11 except to bump the version - this release was made to
fix an incorrect library version information update in xapian-core 1.2.11.

Omega 1.2.11 (2012-06-26):

indexers:

* Change HTML parser's handling of multiple <body> tags and of text outside of
  <body> to match the behaviour of modern web browsers.  (ticket#599)

* omindex:

  + Add command line option to control the size of the document sample stored.
    Patch from Mihai Bivol.

  + Rework .xlsx parsing to substitute the shared strings into the positions
    they are used in, so that the sample actually matches what appears in the
    spreadsheet, and to index calculated cell contents.

  + Improve handling of headers and footers in OpenDocument documents.

  + pdftotext outputs a formfeed between each page, which messes up our "empty
    body" check, so trim any trailing formfeeds before this check.
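
A small Python sketch of the formfeed fix in 1.2.11 follows. The helper names are hypothetical, and treating "empty body" as "empty string after trimming" is an assumption; the changelog does not say exactly how omindex defines the check.

```python
def strip_trailing_formfeeds(text: str) -> str:
    """Remove formfeed characters from the end of the text only."""
    return text.rstrip("\f")


def is_empty_body(text: str) -> bool:
    """Assumed check: the body is empty once trailing formfeeds are gone."""
    return strip_trailing_formfeeds(text) == ""
```

With this ordering, output consisting only of page separators counts as empty, while formfeeds between pages of real text are left alone.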

Omega 1.2.10 (2012-05-09):

indexers:

* Add support for CDATA to HTML/XML parser.

* omindex:

  + Add --max-size option, based on patch from ndaley in ticket#587.

  + Add support for atom feed files, patch from Mihai Bivol in ticket#595.

  + If the document with the highest existing docid before the run was updated,
    we were reporting it as "added", but now we correctly report it as
    "updated".  (Backported from 1.3.0).

  + Catch and report std::exception explicitly, so failing to allocate memory
    is no longer reported as "Unknown exception".  (Backported from 1.3.0).

Omega 1.2.9 (2012-03-08):

documentation:

* docs/overview.html:

  + Document that libmagic is used to determine the MIME type if the extension
    isn't known.  Partly addresses ticket#569.

  + We now limit time as well as CPU and memory for external filters.

indexers:

* Our HTML parser now ignores sections bracketed by <!--UdmComment--> and
  <!--/UdmComment-->, like we already do for <!--htdig_noindex-->.

* omindex: Add more extensions to the default ignore list: bin dat db fon jar
  lnk pyc pyd pyo sqlite sqlite3 sqlite-journal tmp ttf
2013-06-04 21:28:26 +00:00
wiz
d2ca14a3f1 Bump all packages for perl-5.18, that
a) refer to 'perl' in their Makefile, or
b) have a directory name of p5-*, or
c) have any dependency on any p5-* package

Like last time, where this caused no complaints.
2013-05-31 12:39:57 +00:00
asau
1f96787c11 Drop superfluous PKG_DESTDIR_SUPPORT, "user-destdir" is default these days. 2012-10-25 06:55:37 +00:00
wiz
8b5d49eb78 Bump all packages that use perl, or depend on a p5-* package, or
are called p5-*.

I hope that's all of them.
2012-10-03 21:53:53 +00:00
wiz
ee311e3b36 Recursive bump for pcre-8.30* (shlib major change) 2012-03-03 00:11:51 +00:00
sbd
9905b0c76c Add missing sysutils/file buildlink
Bump PKGREVISION
2012-01-27 08:08:13 +00:00
schmonz
87b8c4d573 Update to 1.2.8. Changelog since 1.0.18 is way too long and highlights
aren't obvious. Lots of bug fixes.
2012-01-10 01:03:59 +00:00
sbd
45bf5505a7 Recursive bump for textproc/xapian buildlink additions. 2011-12-03 03:44:54 +00:00
wiz
410d26d738 Update to 1.0.18.
The rlimit issue addressed in patches ac,ad,ae was already addressed in
release 1.0.11, so remove them.

Omega 1.0.18 (2010-02-14):

indexers:

* Make the default charset "utf-8" not "UTF-8" as we lower case explicitly
  specified character sets to compare to see if we need to reparse.  Previously
  XML documents which explicitly specified their character set as UTF-8 would
  cause a needless restart of the parser.

* omindex:

  + Increase the wdf boost for the document title from 2 to 5, since 2 isn't
    really enough.

* scriptindex:

  + Don't abort with "Unknown Exception" if indexing is disallowed or we hit
    </body> for a document which had an overridden character set.  Fixes
    ticket#410.
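
The default charset change in 1.0.18 is easier to see with a toy Python sketch. The function name needs_reparse is hypothetical and the comparison is a simplification of whatever the C++ parser does; the point is only that a lower-cased declared charset never equals an upper-case default.

```python
def needs_reparse(current_charset: str, declared_charset: str) -> bool:
    """Explicitly declared charsets are lower-cased before comparing."""
    return declared_charset.lower() != current_charset


# Old default "UTF-8": a document declaring UTF-8 looks like a mismatch.
old_behaviour = needs_reparse("UTF-8", "UTF-8")

# New default "utf-8": the same document no longer triggers a reparse.
new_behaviour = needs_reparse("utf-8", "UTF-8")
```

Here old_behaviour is True and new_behaviour is False, which matches the needless restart described above and its fix.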

Omega 1.0.17 (2009-11-18):

indexers:

* omindex:

  + On Linux, change the memory limit on external filters to use _SC_PHYS_PAGES
    since _SC_AVPHYS_PAGES excludes pages used by the OS cache and so will
    often report a really low value.  Fixes Debian bug#548987 and ticket#358.

  + Fix likely crash when reading output from external filter program if read()
    is interrupted by a signal.

  + Fix potential crash when indexing PostScript files (fixed by using delete[]
    (not delete) for array allocated by new[]).

testsuite:

* utf8converttest: Charset "8859_1" isn't understood by Solaris libiconv, and
  isn't a standard charset name, so just test it when using our built-in
  converter and GNU libc.

portability:

* Fix build failure on Mac OS X 10.6.

* Also check for socketpair() in -lxnet if it isn't found without, which
  enables resource limits on external filter programs called by omindex on
  Solaris, and possibly some other platforms.  Fixes ticket#412.
2010-02-16 14:53:13 +00:00
schmonz
69188bd5aa Update to 1.0.16. From the changelog:
* Fix cross-site scripting vulnerability in reporting of exceptions
  (CVE-2009-2947).
2009-09-10 18:54:29 +00:00
schmonz
9197dbef43 Update to 1.0.15. From the changelog:
general:
* omegascript.vim: The list of OmegaScript commands in the vim mode was rather
  out of date, and a few commands were misclassified.  Fix both problems and
  avoid future recurrences by automatically generating those lists from the
  command list in query.cc.

documentation:
* omegascript.html: Document that $date uses UTC.  (ticket#314)

templates:
* query: Link to "xapian.org" rather than "www.xapian.org".
* inc/toptermsjs: Use double-quotes rather than single quotes for parameter
  values on the <script> tag.

portability:
* omindex: Implement correct handling of paths when calling external filter
  programs on Microsoft Windows.
2009-08-27 13:22:42 +00:00
schmonz
d9d04a0eb4 Update to 1.0.14:
indexers:
* omindex: Make sure that output is flushed after every message, not just after
  some of them.

portability:
* Avoid infinite loop in omindex and scriptindex when reading files under
  Cygwin with automatic end of line translation enabled.  This same bug can
  also manifest on Unix platforms if the file is truncated by another process
  while being read.
2009-07-23 19:27:21 +00:00
schmonz
27f3ec2fb6 Update to 1.0.13. From the changelog:
* omindex:
  + If the filter program needed for a file format isn't installed, report this
    explicitly when skipping subsequent files with the extension instead of
    misleadingly reporting "Unknown extension".
  + Make -s actually work as a short-form for --stemmer (as documented by
    "omindex --help" and "man omindex").
  + Drop the copyright info from the output of --version as it's perennially
    out of date and we don't report it for any other Xapian programs.
* scriptindex:
  + Add new "valuenumeric" action to add a document value using
    Xapian::sortable_serialise() to allow numeric sorting (ticket#260).
2009-07-18 22:28:28 +00:00
joerg
75ca860d36 user-destdir support 2009-07-07 21:24:04 +00:00
joerg
75fc561a65 Convert @exec/@unexec to @pkgdir or drop it. 2009-06-14 21:28:46 +00:00
joerg
73ae0afd90 Remove @dirrm entries from PLISTs 2009-06-14 18:17:11 +00:00
schmonz
0005baa74d Needs zlib. 2009-05-01 23:38:38 +00:00
schmonz
690db7101d Meant to add LICENSE in previous (gnu-gpl-v2). 2009-04-20 22:29:09 +00:00
schmonz
48b9c06813 Update to 1.0.12. From the changelog:
* $log now retries a partial write, or one interrupted by a signal.
* cgiparams.html: Note the technique of using a stub database file to allow a
  default of searching over multiple databases.
* omindex:
  + Add support for indexing Microsoft Office 2007 formats and XPS files
    (bug#290).
  + Fix the extraction of metadata from OpenDocument formats.
  + Fix "-l" which would previously always cause a segmentation fault if used
    ("--depth-limit" wasn't affected).
* Fix to compile when RLIMIT_AS isn't available (as on NetBSD and OpenBSD).
  Instead use RLIMIT_VMEM or RLIMIT_DATA if either is available, else don't try
  to limit the memory the filter process can use.
2009-04-20 22:25:38 +00:00
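The RLIMIT_AS fallback described in the 1.0.12 entry above can be sketched with Python's Unix-only resource module. The function name pick_memory_rlimit is hypothetical, and this is an illustration of the selection order only, not the C++ code used by omindex.

```python
import resource  # Unix-only standard library module.

# Preference order taken from the changelog entry above.
CANDIDATES = ("RLIMIT_AS", "RLIMIT_VMEM", "RLIMIT_DATA")


def pick_memory_rlimit():
    """Return (name, constant) for the first limit this platform defines.

    Returns None if none is available, in which case the caller should
    not try to limit the memory the filter process can use.
    """
    for name in CANDIDATES:
        constant = getattr(resource, name, None)
        if constant is not None:
            return name, constant
    return None
```

A caller that gets a result could pass the constant to resource.setrlimit() in the child process before running the filter; with None it would simply run the filter unrestricted.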
wiz
28cbc56325 Update to 1.0.10:
Omega 1.0.10 (2008-12-23):

build system:

* This release now uses newer versions of the autotools (autoconf 2.62 ->
  2.63; automake 1.10.1 -> 1.10.2).  The newer autoconf fixes a regression
  in autoconf 2.62 (and so Omega 1.0.7) with detecting the endian-ness of some
  platforms.

Omega 1.0.9 (2008-10-31):

documentation:

* docs/overview.html: Document HTML parsing a bit, including robots
  meta and htdig_noindex.

omega:

* omega: Catch std::exception and report what its what() method returns.

* omega: Remove undocumented and non-functional support for numeric sorting
  via CGI parameter SORT=#<slot> (SORT=<slot> works as before).

build system:

* configure: Sync warning flag handling changes from xapian-core to eliminate
  many warnings from GCC 4.3.

Omega 1.0.8 (2008-09-04):

documentation:

* Fix a few typos and improve wording in a few places.

indexers:

* omindex:

  + If the character encoding is specified using <meta http-equiv=...> in an
    HTML document then reparse the document if it isn't the encoding we're
    already using so that any preceding <title> is converted correctly
    (bug#292).

  + Convert text from meta tag parameters to UTF-8 (bug#293).

  + Handle <meta charset="..."> (new in HTML 5).

  + Fix bug in HTML tag parameter parsing which was probably just a small
    performance penalty in real world cases, but could perhaps result in
    parsing bogus extra parameters in carefully contrived situations.

portability:

* Add missing <signal.h>, noted on FreeBSD by Henrik Brix Andersen.
2009-01-07 22:40:14 +00:00
schmonz
3aca243727 Add missing dependency on Perl, found by joerg's bulk build. Bump
PKGREVISION.
2008-07-31 15:11:31 +00:00
schmonz
e16b509e2d Fix build on NetBSD (4.0, at least): include <signal.h> and avoid
RLIMIT_AS on systems without it. Also fix path to Perl interpreter
in installed scripts, and as a result, bump PKGREVISION.
2008-07-27 04:06:00 +00:00
schmonz
c2e477db4a Initial import of Omega, which operates on a set of Xapian databases.
Each database is created and updated separately using either omindex
or scriptindex. You can search these databases (or any other Xapian
database with suitable contents) via a web front-end provided by
omega, a CGI application.  A search can also be done over more than
one database at once.
2008-07-26 23:37:29 +00:00