- Path fixes to the build system for the benefit of Windows users (sengels)
- Clean up of class ArchiveReader
- Support for LZMA compressed streams in archives, notably .deb and .rpm
- Remove preceding ./ from file path in tar archives.
- Make parsing ar and deb files easier to abort: useful in e.g. Dolphin
- Better method of removing deleted files from the CLucene index
- Do not tokenize the URL in the index to improve polling speed
- Fix the bz2 header check: more bz2 archives are recognized (pino)
- Fix infinite loop when parsing SGI image files
- Fix reading of zip files without central directory.
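The tar path cleanup listed above can be illustrated with a minimal sketch. This is not Strigi's code; the function name strip_leading_dot_slash is hypothetical, and the exact normalization Strigi applies is an assumption.

```python
def strip_leading_dot_slash(path: str) -> str:
    """Remove leading './' components from an archive entry name.

    Hypothetical helper for illustration only; names such as '.hidden'
    or '../x' are left untouched because they do not start with './'.
    """
    while path.startswith("./"):
        path = path[2:]
    return path


# An entry stored as './usr/bin/ls' is reported as 'usr/bin/ls'.
print(strip_leading_dot_slash("./usr/bin/ls"))
```

With this normalization, the same file gets the same path whether or not the archive was created with a leading './'.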
Add LICENSE.
This changes the buildlink3.mk files to use an include guard for the
recursive include. The use of BUILDLINK_DEPTH, BUILDLINK_DEPENDS,
BUILDLINK_PACKAGES and BUILDLINK_ORDER is handled by a single new
variable BUILDLINK_TREE. Each buildlink3.mk file adds a pair of
enter/exit markers, which can be used to reconstruct the tree and
to determine first level includes. Avoiding := for large variables
(BUILDLINK_ORDER) speeds up parse time as += has linear complexity.
The include guard reduces system time by avoiding reading files over and
over again. For complex packages this reduces both %user and %sys time to
half of the former time.
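How first level includes could be recovered from such a marker list is sketched below in Python rather than make syntax. The marker format is an assumption made for the example: a bare name "pkg" enters a package and "-pkg" exits it. The function name first_level_includes and the sample package names are hypothetical.

```python
def first_level_includes(tree_tokens):
    """Return the packages entered at depth 1 of an enter/exit marker list.

    Assumed marker format (illustration only): "pkg" enters a package,
    "-pkg" exits it.
    """
    depth = 0
    first_level = []
    for token in tree_tokens:
        if token.startswith("-"):
            depth -= 1  # exit marker: leave the current package
        else:
            depth += 1  # enter marker: descend into a package
            if depth == 1 and token not in first_level:
                first_level.append(token)
    return first_level


# 'iconv' is pulled in by 'gettext', so it is not a first level include.
tokens = "gettext iconv -iconv -gettext zlib -zlib".split()
print(first_level_includes(tokens))  # ['gettext', 'zlib']
```

A single pass with a depth counter is enough here; no separate depth or order variables need to be maintained alongside the list.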
Changes:
0.6.3
- Move Strigi::DirLister in archivereader.h to ArchiveReader::DirLister.
Two classes with this name were present in the code. The one in
archivereader.h was not used in any code outside of Strigi, so we are
changing it. Note that this change means that one should not use
Strigi 0.6.2.
- Change type of EntryInfo.mtime from 'unsigned' to time_t.
- The spec of SDF files was found and used to implement a more precise
syntax check for the header of SDF files.
- Fix memory corruption bug in ArchiveReader.
- Change type of ontology entry 'exposureTime' to string. In theory
something like duration would make sense, but in practice xsd:string is
the type actually used.
- Add a default rule to find mail box directories with pattern
'.*.directory'. Since these directory names start with a dot, they are
normally not found.
- Add '$HOME/.kde4' to the directories that are indexed by default.
- Simplify matching of file paths in the rules for including or excluding
directories from the index. The code is now more readable and easier to
maintain.
- Fix a big performance problem: Whenever a directory mtime changed, all
files inside the directory were re-indexed.
- Fix bug where a gz archive that contains a file that is identical to
the original archive is indexed over and over. The depth of nested files
that are indexed is now limited to 127.
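The nesting limit in the last entry can be pictured with a small sketch. It is not Strigi's implementation: the dictionary layout, the name index_entry, and the choice to treat 127 as the deepest level still indexed are assumptions.

```python
MAX_DEPTH = 127  # limit named in the changelog entry


def index_entry(entry, depth=0, indexed=None):
    """Record an entry and its nested entries, stopping past MAX_DEPTH.

    'entry' is a hypothetical dict with a 'name' and optional 'children'.
    """
    if indexed is None:
        indexed = []
    if depth > MAX_DEPTH:
        return indexed  # refuse to descend any further
    indexed.append((depth, entry["name"]))
    for child in entry.get("children", []):
        index_entry(child, depth + 1, indexed)
    return indexed


# Model an archive that contains a copy of itself.
quine = {"name": "quine.gz"}
quine["children"] = [quine]

result = index_entry(quine)
print(len(result))  # 128 entries: depths 0 through 127
```

Without the depth check, the self-referencing archive above would recurse without end; with it, the walk terminates after a fixed number of levels.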
0.6.2
- Better support for nice IO priorities on Linux (Sebastian Trueg)
- Compile with development version of CLucene (Ben van Klinken)
- Explicitly use 'unsigned char' or 'signed char' instead of 'char'
since 'char' can be either signed or unsigned on different processors.
E.g. on ARM 'char' means 'unsigned char' and on i386 'char' means
'signed char'. This change makes libstreamanalyzer 0.6.2 binary
incompatible with versions < 0.6.0. (Jos van den Oever)
- Many CMake cleanups (Alexander Neundorf)
- 6.5x speedup of C++ comment analyzer (Jakub Stachowski)
- Various stability fixes (Jos van den Oever, Sebastian Trueg)
- Support for ePub format (Jakub Stachowski)
- Handle RIFF files with unspecified size for the RIFF packet. (Jos van
den Oever)
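Why the signedness of 'char' matters can be shown with one byte read both ways. The sketch uses Python's struct module to mimic the two C interpretations; the byte value 0xE9 is an arbitrary example.

```python
import struct

raw = b"\xe9"  # one byte with the high bit set

signed_value = struct.unpack("b", raw)[0]    # read like 'signed char'
unsigned_value = struct.unpack("B", raw)[0]  # read like 'unsigned char'

print(signed_value, unsigned_value)  # -23 233

# A test such as "is this byte >= 0x80?" gives different answers
# depending on which interpretation the platform picks for plain 'char'.
print(signed_value >= 0x80, unsigned_value >= 0x80)  # False True
```

Spelling out the signedness in the source removes this platform dependence, at the cost of the binary incompatibility mentioned in the entry above.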
0.5.11
- Fix a bug that can cause a crash on an executable zip file.
- Fix parsing of empty headers when CRLFCRLF is followed by a space. In
other words, fix parsing of emails that have a space as the first
character in the body.
- Fix two broken (by design) throughanalyzers by replacing them with one
eventanalyzer.
- Updated xesam ontology to include proper ranges. This is necessary for
the Nepomuk backend but does not change anything for CLucene (where
everything is a string anyway).
- Make sure the app can handle environments where HOME is not defined.
- Make the zip analyzer check more often if it should stop analyzing.
- Fix wrong comparison when checking if we are finished yet.
- Make the analyzer respect a configuration that only wants part of the
stream to be analyzed.
- Add an analyzer for Windows self-extracting zip archives.
- Ask the analyzerconfiguration if we should continue and put a cap on
the maximum length of the stream we read.
- Log parse errors in the analysisresult.
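The combination of "ask the configuration whether to continue" and "cap the amount read" might look roughly like the sketch below. It is not the Strigi API: read_capped, should_continue, and the chunk size are hypothetical names and values chosen for the example.

```python
import io


def read_capped(stream, should_continue, max_bytes, chunk_size=4096):
    """Read from 'stream' until it ends, the cap is hit, or we are told to stop.

    'should_continue' is a hypothetical callback standing in for a query
    to the analyzer configuration.
    """
    total = 0
    chunks = []
    while total < max_bytes and should_continue():
        # Never request more than what is left under the cap.
        chunk = stream.read(min(chunk_size, max_bytes - total))
        if not chunk:
            break  # end of stream
        chunks.append(chunk)
        total += len(chunk)
    return b"".join(chunks)


data = io.BytesIO(b"x" * 10000)
print(len(read_capped(data, lambda: True, max_bytes=5000)))  # 5000
```

Checking the callback once per chunk keeps the reader responsive to an abort request without adding a check for every byte.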
Strigi is a daemon which uses a very fast and efficient crawler that can index
data on your hard drive. Indexing operations are performed without hammering
your system; this makes Strigi the fastest and smallest desktop searching
program. Strigi can index different file formats, including the contents of
archive files.