- use Algorithm::Diff instead of an external diff tool in synccompare;

Algorithm::Diff is embedded in synccompare to keep it self-contained
- tput must be called without redirecting its stderr, otherwise it
  does not find the current number of columns


git-svn-id: https://zeitsenke.de/svn/SyncEvolution/trunk@333 15ad00c4-1369-45f4-8270-35d70d36bdcd
Patrick Ohly 2007-03-25 16:42:27 +00:00
parent d1b2452170
commit f471c872ef
6 changed files with 2000 additions and 23 deletions

131
src/Algorithm/Artistic Normal file

@ -0,0 +1,131 @@
The "Artistic License"
Preamble
The intent of this document is to state the conditions under which a
Package may be copied, such that the Copyright Holder maintains some
semblance of artistic control over the development of the package,
while giving the users of the package the right to use and distribute
the Package in a more-or-less customary fashion, plus the right to make
reasonable modifications.
Definitions:
"Package" refers to the collection of files distributed by the
Copyright Holder, and derivatives of that collection of files
created through textual modification.
"Standard Version" refers to such a Package if it has not been
modified, or has been modified in accordance with the wishes
of the Copyright Holder as specified below.
"Copyright Holder" is whoever is named in the copyright or
copyrights for the package.
"You" is you, if you're thinking about copying or distributing
this Package.
"Reasonable copying fee" is whatever you can justify on the
basis of media cost, duplication charges, time of people involved,
and so on. (You will not be required to justify it to the
Copyright Holder, but only to the computing community at large
as a market that must bear the fee.)
"Freely Available" means that no fee is charged for the item
itself, though there may be fees involved in handling the item.
It also means that recipients of the item may redistribute it
under the same conditions they received it.
1. You may make and give away verbatim copies of the source form of the
Standard Version of this Package without restriction, provided that you
duplicate all of the original copyright notices and associated disclaimers.
2. You may apply bug fixes, portability fixes and other modifications
derived from the Public Domain or from the Copyright Holder. A Package
modified in such a way shall still be considered the Standard Version.
3. You may otherwise modify your copy of this Package in any way, provided
that you insert a prominent notice in each changed file stating how and
when you changed that file, and provided that you do at least ONE of the
following:
a) place your modifications in the Public Domain or otherwise make them
Freely Available, such as by posting said modifications to Usenet or
an equivalent medium, or placing the modifications on a major archive
site such as uunet.uu.net, or by allowing the Copyright Holder to include
your modifications in the Standard Version of the Package.
b) use the modified Package only within your corporation or organization.
c) rename any non-standard executables so the names do not conflict
with standard executables, which must also be provided, and provide
a separate manual page for each non-standard executable that clearly
documents how it differs from the Standard Version.
d) make other distribution arrangements with the Copyright Holder.
4. You may distribute the programs of this Package in object code or
executable form, provided that you do at least ONE of the following:
a) distribute a Standard Version of the executables and library files,
together with instructions (in the manual page or equivalent) on where
to get the Standard Version.
b) accompany the distribution with the machine-readable source of
the Package with your modifications.
c) give non-standard executables non-standard names, and clearly
document the differences in manual pages (or equivalent), together
with instructions on where to get the Standard Version.
d) make other distribution arrangements with the Copyright Holder.
5. You may charge a reasonable copying fee for any distribution of this
Package. You may charge any fee you choose for support of this
Package. You may not charge a fee for this Package itself. However,
you may distribute this Package in aggregate with other (possibly
commercial) programs as part of a larger (possibly commercial) software
distribution provided that you do not advertise this Package as a
product of your own. You may embed this Package's interpreter within
an executable of yours (by linking); this shall be construed as a mere
form of aggregation, provided that the complete Standard Version of the
interpreter is so embedded.
6. The scripts and library files supplied as input to or produced as
output from the programs of this Package do not automatically fall
under the copyright of this Package, but belong to whoever generated
them, and may be sold commercially, and may be aggregated with this
Package. If such scripts or library files are aggregated with this
Package via the so-called "undump" or "unexec" methods of producing a
binary executable image, then distribution of such an image shall
neither be construed as a distribution of this Package nor shall it
fall under the restrictions of Paragraphs 3 and 4, provided that you do
not represent such an executable image as a Standard Version of this
Package.
7. C subroutines (or comparably compiled subroutines in other
languages) supplied by you and linked into this Package in order to
emulate subroutines and variables of the language defined by this
Package shall not be considered part of this Package, but are the
equivalent of input as in Paragraph 6, provided these subroutines do
not change the language in any way that would cause it to fail the
regression tests for the language.
8. Aggregation of this Package with a commercial distribution is always
permitted provided that the use of this Package is embedded; that is,
when no overt attempt is made to make this Package's interfaces visible
to the end user of the commercial distribution. Such use shall not be
construed as a distribution of this Package.
9. The name of the Copyright Holder may not be used to endorse or promote
products derived from this software without specific prior written permission.
10. THIS PACKAGE IS PROVIDED "AS IS" AND WITHOUT ANY EXPRESS OR
IMPLIED WARRANTIES, INCLUDING, WITHOUT LIMITATION, THE IMPLIED
WARRANTIES OF MERCHANTIBILITY AND FITNESS FOR A PARTICULAR PURPOSE.
The End

1713
src/Algorithm/Diff.pm Normal file

File diff suppressed because it is too large.

81
src/Algorithm/README Normal file

@ -0,0 +1,81 @@
This is a module for computing the difference between two files, two
strings, or any other two lists of things. It uses an intelligent
algorithm similar to (or identical to) the one used by the Unix "diff"
program. It is guaranteed to find the *smallest possible* set of
differences.
This package contains a few parts.
Algorithm::Diff is the module that contains several interfaces for
computing the differences between two lists.
The several "diff" programs also included in this package use
Algorithm::Diff to find the differences and then they format the output.
Algorithm::Diff also includes some other useful functions such as "LCS",
which computes the longest common subsequence of two lists.
A::D is suitable for many uses. You can use it for finding the smallest
set of differences between two strings, or for computing the most
efficient way to update the screen if you were replacing "curses".
Algorithm::DiffOld is a previous version of the module which is included
primarily for those wanting to use a custom comparison function rather
than a key generating function (and who don't mind the significant
performance penalty of perhaps 20-fold).
diff.pl implements a "diff" in Perl that is as simple as (was
previously) possible so that you can see how it works. The output
format is not compatible with regular "diff". It needs to be
reimplemented using the OO interface to greatly simplify the code.
diffnew.pl implements a "diff" in Perl with full bells and whistles. By
Mark-Jason, with code from cdiff.pl included.
cdiff.pl implements "diff" that generates real context diffs in either
traditional format or GNU unified format. Original contextless
"context" diff supplied by Christian Murphy. Modifications to make it
into a real full-featured diff with -c and -u options supplied by Amir
D. Karger.
Yes, you can use this program to generate patches.
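
To make the "LCS" idea mentioned above concrete, here is a minimal sketch of a longest-common-subsequence computation over two lists. It is an illustration only: the name lcs_sketch is hypothetical, and the plain quadratic table used here is an assumption chosen for readability, not a description of how Algorithm::Diff works internally.

```perl
use strict;
use warnings;

# Hypothetical helper, for illustration only: returns one longest common
# subsequence of two lists, using an O(n*m) dynamic-programming table.
sub lcs_sketch {
    my ($left, $right) = @_;
    my ($n, $m) = (scalar @$left, scalar @$right);

    # $len[$i][$j] = LCS length of @$left[$i..] and @$right[$j..]
    my @len = map { [ (0) x ($m + 1) ] } 0 .. $n;
    for my $i (reverse 0 .. $n - 1) {
        for my $j (reverse 0 .. $m - 1) {
            if ($left->[$i] eq $right->[$j]) {
                $len[$i][$j] = $len[$i + 1][$j + 1] + 1;
            } else {
                my ($down, $across) = ($len[$i + 1][$j], $len[$i][$j + 1]);
                $len[$i][$j] = $down >= $across ? $down : $across;
            }
        }
    }

    # Walk the table from the top-left corner and collect the common items.
    my @lcs;
    my ($i, $j) = (0, 0);
    while ($i < $n && $j < $m) {
        if ($left->[$i] eq $right->[$j]) {
            push @lcs, $left->[$i];
            $i++;
            $j++;
        } elsif ($len[$i + 1][$j] >= $len[$i][$j + 1]) {
            $i++;
        } else {
            $j++;
        }
    }
    return @lcs;
}

my @old = qw(a b c e h j l m n p);
my @new = qw(b c d e f j k l m r s t);
print join(" ", lcs_sketch(\@old, \@new)), "\n";
```

Everything that is not part of the common subsequence is what a diff reports as removed (present only in the first list) or added (present only in the second).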
OTHER RESOURCES
"Longest Common Subsequences", at
http://www.ics.uci.edu/~eppstein/161/960229.html
This code was adapted from the Smalltalk code of Mario Wolczko
<mario@wolczko.com>, which is available at
ftp://st.cs.uiuc.edu/pub/Smalltalk/MANCHESTER/manchester/4.0/diff.st
THANKS SECTION
Thanks to Ned Konz for rewriting the module to greatly improve
performance, for maintaining it over the years, and for readily handing
it over to me so I could plod along with my improvements.
(From Ned Konz's earlier versions):
Thanks to Mark-Jason Dominus for doing the original Perl version and
maintaining it over the last couple of years. Mark-Jason has been a huge
contributor to the Perl community and CPAN; it's because of people like
him that Perl has become a success.
Thanks to Mario Wolczko <mario@wolczko.com> for writing and making
publicly available his Smalltalk version of diff, which this Perl
version is heavily based on.
Thanks to Mike Schilli <m@perlmeister.com> for writing sdiff and
traverse_balanced and making them available for the Algorithm::Diff
distribution.
(From Mark-Jason Dominus' earlier versions):
Huge thanks to Amir Karger for adding full context diff support to
"cdiff.pl", and then for waiting patiently for five months while I let
it sit in a closet and didn't release it. Thank you thank you thank
you, Amir!
Thanks to Christian Murphy for adding the first context diff format
support to "cdiff.pl".

18
src/Algorithm/copyright Normal file

@ -0,0 +1,18 @@
This is a subset of the original Algorithm::Diff distribution, added
here to avoid the external dependency. No other changes were made.
The original sources should always be available from the Comprehensive
Perl Archive Network (CPAN). Visit <URL:http://www.perl.com/CPAN/> to
find a CPAN site near you.
The Algorithm::Diff copyright is as follows:
| Parts Copyright (c) 2000-2004 Ned Konz. All rights reserved.
| Parts by Tye McQueen.
|
| This program is free software; you can redistribute it and/or modify it
| under the same terms as Perl.
The content of this directory is distributed under the original license:
dual-licensed under GPL-2 (../../COPYING) and Larry Wall's "Artistic
License" (./Artistic).

View File

@ -17,8 +17,10 @@ DISTCLEANFILES = synccompare
MAINTAINERCLEANFILES = Makefile.in
CLEANFILES = libstdc++.a
synccompare : normalize_vcard.pl
perl -p -e '' @MODIFY_SYNCCOMPARE@ $< >$@
# synccompare is created by replacing its 'import Algorithm::Diff;'
# with a simplified copy of Diff.pm.
synccompare : Algorithm/Diff.pm normalize_vcard.pl
perl -e '$$diff = shift; open(DIFF, "<$$diff"); ($$_) = split(/__END__/, join("", <DIFF>)); s/\*import.*//m; s/require +Exporter;//; s/^#.*\n//mg; s/ +#.*\n//mg; $$diff = $$_;' -e 'while(<>) {' @MODIFY_SYNCCOMPARE@ -e 's/use +Algorithm::Diff;/"# embedded version of Algorithm::Diff follows, copyright by the original authors\n" . $$diff . "# end of embedded Algorithm::Diff\n"/e; print;}' $+ >$@
chmod u+x $@
VOCL_SOURCES = \
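
The one-line perl command in the rule above is dense, so here is a minimal sketch of the same embedding idea, spelled out on in-memory strings. The module and script texts are hypothetical stand-ins for Algorithm/Diff.pm and normalize_vcard.pl. Unlike the rule, this sketch keeps the newline when it strips a trailing comment, which is an intentional deviation for readability.

```perl
use strict;
use warnings;

# Hypothetical stand-in for Algorithm/Diff.pm.
my $module = join("\n",
    'package Algorithm::Diff;',
    '# whole-line comment, stripped below',
    'require Exporter;',
    '*import = \&Exporter::import;',
    'sub LCS { return }    # trailing comment, stripped below',
    '1;',
    '__END__',
    'POD documentation would follow here',
    '');

# Hypothetical stand-in for normalize_vcard.pl.
my $script = join("\n",
    'use strict;',
    'use Algorithm::Diff;',
    'print "hello\n";',
    '');

# Keep only the code in front of __END__, then drop what an embedded copy
# does not need: the Exporter hookup and the comments.
my ($code) = split /__END__/, $module;
for ($code) {
    s/\*import.*//m;
    s/require +Exporter;//;
    s/^#.*\n//mg;
    s/ +#.*$//mg;    # deviation: the rule above also removes the newline
}

my $embedded =
    "# embedded version of Algorithm::Diff follows, copyright by the original authors\n"
    . $code
    . "# end of embedded Algorithm::Diff\n";

# Replace the 'use' statement with the module text itself.
(my $result = $script) =~ s/use +Algorithm::Diff;/$embedded/;
print $result;
```

The resulting script no longer needs Algorithm::Diff to be installed, which is the point of the rule: synccompare stays a single self-contained file.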

View File

@ -2,6 +2,7 @@
use strict;
use encoding 'utf8';
use Algorithm::Diff;
# ignore differences caused by specific servers?
my $server = $ENV{TEST_EVOLUTION_SERVER} || "";
@ -18,9 +19,10 @@ sub Usage {
print "Also works for iCalendar files.\n";
}
# parameters: file handle with input, width to use for reformatted lines
# returns list of lines without line breaks
sub Normalize {
my $in = shift;
my $out = shift;
my $width = shift;
$_ = join( "", <$in> );
@ -191,13 +193,13 @@ sub Normalize {
push @items, ${$formatted[0]}[0];
}
print $out join( "\n\n", sort @items ), "\n";
return split( /\n/, join( "\n\n", sort @items ));
}
# number of columns available for output:
# try tput without printing the shell's error if not found,
# default to 80
my $columns = `which tput >/dev/null && tput cols 2>/dev/null`;
my $columns = `which tput >/dev/null 2>/dev/null && tput cols`;
if ($? || !$columns) {
$columns = 80;
}
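
A minimal sketch of the column lookup with its fallback, wrapped in a function. The name terminal_columns and the extra sanity check on the result are assumptions added for illustration. The shell command is the one from the hunk above: only the lookup of tput is silenced, while tput itself keeps its stderr.

```perl
use strict;
use warnings;

# Hypothetical helper: ask tput for the terminal width and fall back to
# a default when tput is missing, fails, or prints something unusable.
sub terminal_columns {
    my ($default) = @_;

    # stderr is redirected only for 'which'; 'tput cols' runs with the
    # stderr it inherited.
    my $cols = `which tput >/dev/null 2>/dev/null && tput cols`;
    return $default if $? || !defined $cols;

    chomp $cols;
    return $cols =~ /^\d+$/ && $cols > 0 ? $cols : $default;
}

my $columns     = terminal_columns(80);
my $singlewidth = int(($columns - 3) / 2);    # two columns plus " | "
print "columns=$columns, per side=$singlewidth\n";
```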
@ -210,32 +212,52 @@ if($#ARGV > 1) {
# comparison
my ($file1, $file2) = ($ARGV[0], $ARGV[1]);
my $tmp = $ENV{TMPDIR} || "/tmp";
my $normal1 = `mktemp $tmp/synccompare.XXXXXXXXXX`;
my $normal2 = `mktemp $tmp/synccompare.XXXXXXXXXX`;
chomp($normal1);
chomp($normal2);
open(IN1, "<:utf8", $file1) || die "$file1: $!";
open(IN2, "<:utf8", $file2) || die "$file2: $!";
open(OUT1, ">:utf8", $normal1) || die "$normal1: $!";
open(OUT2, ">:utf8", $normal2) || die "$normal2: $!";
my $singlewidth = int(($columns - 3) / 2);
$columns = $singlewidth * 2 + 3;
Normalize(*IN1{IO}, *OUT1{IO}, $singlewidth);
Normalize(*IN2{IO}, *OUT2{IO}, $singlewidth);
my @normal1 = Normalize(*IN1{IO}, $singlewidth);
my @normal2 = Normalize(*IN2{IO}, $singlewidth);
close(IN1);
close(IN2);
close(OUT1);
close(OUT2);
# Produce output where each line is marked as old (aka remove) with o,
# as new (aka added) with n, and as unchanged with u at the beginning.
# This allows simpler processing below.
$_ = `diff "--old-line-format=o %L" "--new-line-format=n %L" "--unchanged-line-format=u %L" "$normal1" "$normal2"`;
my $res = $?;
my $res = 1;
if (0) {
# $_ = `diff "--old-line-format=o %L" "--new-line-format=n %L" "--unchanged-line-format=u %L" "$normal1" "$normal2"`;
# $res = $?;
} else {
# convert into same format as diff above - this allows reusing the
# existing output formatting code
my $diffs_ref = Algorithm::Diff::sdiff(\@normal1, \@normal2);
@_ = ();
my $hunk;
foreach $hunk ( @{$diffs_ref} ) {
my ($type, $left, $right) = @{$hunk};
if ($type eq "-") {
push @_, "o $left";
} elsif ($type eq "+") {
push @_, "n $right";
} elsif ($type eq "c") {
push @_, "o $left";
push @_, "n $right";
} else {
push @_, "u $left";
}
}
$_ = join("\n", @_);
}
if ($res) {
printf "%*s | %s\n", $singlewidth, "before sync", "after sync";
printf "%*s <\n", $singlewidth, "removed during sync";
printf "%*s > %s\n", $singlewidth, "", "added during sync";
print "-" x $columns, "\n";
# fix confusing output like:
# BEGIN:VCARD BEGIN:VCARD
# > N:new;entry
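
The conversion from sdiff results to the o/n/u line markers can be tried in isolation. The sketch below uses a hand-written list in the shape that Algorithm::Diff::sdiff returns (entries of type, left item, right item), so it runs without the module. The sample data and the name mark_lines are hypothetical.

```perl
use strict;
use warnings;

# Hand-written sample in the shape returned by Algorithm::Diff::sdiff():
# a list of [ $type, $left, $right ] with $type one of "u", "c", "-", "+".
my @hunks = (
    [ 'u', 'BEGIN:VCARD', 'BEGIN:VCARD' ],
    [ 'c', 'N:old;entry', 'N:new;entry' ],
    [ '-', 'TEL:123',     ''            ],
    [ '+', '',            'EMAIL:a@b'   ],
    [ 'u', 'END:VCARD',   'END:VCARD'   ],
);

# Hypothetical helper: prefix each line with o (old/removed), n (new/added)
# or u (unchanged); a changed pair becomes one o line plus one n line.
sub mark_lines {
    my @marked;
    for my $hunk (@_) {
        my ($type, $left, $right) = @$hunk;
        if    ($type eq '-') { push @marked, "o $left" }
        elsif ($type eq '+') { push @marked, "n $right" }
        elsif ($type eq 'c') { push @marked, "o $left", "n $right" }
        else                 { push @marked, "u $left" }
    }
    return @marked;
}

print "$_\n" for mark_lines(@hunks);

# One possible way to tell whether anything differs at all:
my $changed = grep { $_->[0] ne 'u' } @hunks;
print "changed entries: $changed\n";
```

Producing the same markers as the old external diff call means the formatting code that follows needs no changes.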
@ -259,12 +281,22 @@ if($#ARGV > 1) {
# n
# n BEGIN:VCARD
#
# The alternative case is also possible:
# o END:VCARD
# o
# o BEGIN:VCARD
# o N:old;entry
# u END:VCARD
# case one above
while( s/^u BEGIN:(VCARD|VCALENDAR)\n((?:^n .*\n)+?)^n BEGIN:/n BEGIN:$1\n$2u BEGIN:/m) {}
# same for the other way around
# same for the other direction
while( s/^u BEGIN:(VCARD|VCALENDAR)\n((?:^o .*\n)+?)^o BEGIN:/o BEGIN:$1\n$2u BEGIN:/m) {}
# case two
while( s/^o END:(VCARD|VCALENDAR)\n((?:^o .*\n)+?)^u END:/u END:$1\n$2o END:/m) {}
while( s/^n END:(VCARD|VCALENDAR)\n((?:^n .*\n)+?)^u END:/u END:$1\n$2n END:/m) {}
# split at end of each record
my $spaces = " " x $singlewidth;
foreach $_ (split /(?:(?<=. END:VCARD\n)|(?<=. END:VCALENDAR\n))(?:^. \n)*/m, $_) {
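
The marker shuffling above is easier to follow on a concrete input. This minimal sketch applies the "case one" substitution and the record split to a small hand-written sample, in which a new record was added in front of an existing one and the diff paired the wrong BEGIN:VCARD lines. The sample text and variable names are hypothetical.

```perl
use strict;
use warnings;

# Marked lines for a new record inserted in front of an existing one.
# The "n " entry is a marked empty line between the two records.
my $text = join("\n",
    "u BEGIN:VCARD",
    "n N:new;entry",
    "n END:VCARD",
    "n ",
    "n BEGIN:VCARD",
    "u N:old;entry",
    "u END:VCARD",
) . "\n";

# Case one: move the "unchanged" marker from the first BEGIN line to the
# later one, so the added record is marked as new from start to end.
while ($text =~ s/^u BEGIN:(VCARD|VCALENDAR)\n((?:^n .*\n)+?)^n BEGIN:/n BEGIN:$1\n${2}u BEGIN:/m) {}

# Split after each END line and drop the marked empty lines in between.
my @records = split /(?:(?<=. END:VCARD\n)|(?<=. END:VCALENDAR\n))(?:^. \n)*/m, $text;

print "--- record ---\n$_" for @records;
```

After the substitution the first record consists only of n lines and the second only of u lines, so each record can be printed as either entirely added or entirely unchanged.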
@ -304,8 +336,8 @@ if($#ARGV > 1) {
}
}
unlink($normal1);
unlink($normal2);
# unlink($normal1);
# unlink($normal2);
exit($res ? 1 : 0);
} else {
# normalize
@ -317,5 +349,5 @@ if($#ARGV > 1) {
$in = *STDIN{IO};
}
Normalize($in, *STDOUT{IO}, $columns);
print STDOUT join("\n", Normalize($in, $columns)), "\n";
}