podiff: Implement podiff to optionally work on a word-by-word basis

Registered by TLE

Moved to: https://github.com/pyg3t/pyg3t/issues/22

When only a few words in a string have changed, the ordinary diff output is unnecessarily complicated for proofreading purposes. Therefore podiff should implement a word-wise diff. The implementation should provide an option that activates the word-wise diff, and possibly define a set of criteria for when to use the word-wise diff and when to fall back on the ordinary diff.

Blueprint information

Status:
Complete
Approver:
None
Priority:
High
Drafter:
TLE
Direction:
Approved
Assignee:
TLE
Definition:
Superseded
Series goal:
None
Implementation:
Unknown
Milestone target:
None
Completed by
TLE

Related branches

Sprints

Whiteboard

Implementation considerations:
While GNU wdiff provides this basic word-wise diff functionality as a command line tool, it would be nice not to (re)introduce a dependency on a command line tool in podiff. Therefore a custom implementation is the best way to go.

For the implementation it would be fruitful to look at the GNU wdiff method[1] of converting each text to a new text with one word per line, running a standard diff tool on the result, and then reformatting the output.
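
A minimal sketch of that approach using only difflib could look like the following. The function name word_diff and the [-...-] / {+...+} markers are assumptions made for illustration, not existing podiff code. Instead of writing one word per line and running a line diff, the sketch feeds the word lists directly to difflib.SequenceMatcher, which amounts to the same comparison.

```python
import difflib


def word_diff(old, new):
    """Return a wdiff-like string marking removed and added words.

    Hypothetical helper, not part of podiff. Removed words are wrapped
    in [-...-] and added words in {+...+}.
    """
    # Split on whitespace: the in-memory equivalent of "one word per line".
    old_words = old.split()
    new_words = new.split()

    # autojunk=False disables difflib's popularity heuristic, so frequent
    # words are never silently ignored in long strings.
    matcher = difflib.SequenceMatcher(None, old_words, new_words, autojunk=False)

    pieces = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal":
            pieces.append(" ".join(old_words[i1:i2]))
            continue
        if tag in ("delete", "replace"):
            pieces.append("[-" + " ".join(old_words[i1:i2]) + "-]")
        if tag in ("insert", "replace"):
            pieces.append("{+" + " ".join(new_words[j1:j2]) + "+}")
    return " ".join(pieces)


print(word_diff("Open the file", "Open a file"))
# Open [-the-] {+a+} file
```

Note that this sketch collapses all whitespace, including newlines, into single spaces. A real implementation would probably need to preserve the line breaks of the original strings.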

Unanswered questions:
podiff uses the standard difflib[2] for ordinary diffs.
* Should the worddiff functionality be implemented as a function in podiff, as a class in the podiff file, or as a class in a "utilities" file?
* And if we go with a class, should we pull the ordinary diff functionality into this class along with the worddiff criteria and do all the work there?
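
Whichever structure is chosen, the worddiff criterion itself could be quite small. The following is only a sketch under stated assumptions: the function name should_use_word_diff and the threshold of 0.5 are hypothetical and nothing in this blueprint fixes them. It uses the similarity ratio from difflib.SequenceMatcher on the word lists and falls back to the ordinary diff when too few words are shared.

```python
import difflib


def should_use_word_diff(old, new, threshold=0.5):
    """Decide between word-wise diff and ordinary diff.

    Hypothetical criterion: use the word-wise diff only when the two
    strings share enough words. The threshold is an assumed value.
    """
    old_words = old.split()
    new_words = new.split()

    # A string that was added or removed entirely gains nothing from
    # word-wise markup, so fall back to the ordinary diff.
    if not old_words or not new_words:
        return False

    matcher = difflib.SequenceMatcher(None, old_words, new_words, autojunk=False)
    # ratio() is 2*M/T, where M is the number of matching words and
    # T is the total number of words in both strings.
    return matcher.ratio() >= threshold


print(should_use_word_diff("Open the file now", "Open a file now"))
# True
print(should_use_word_diff("Open the file", "Completely different text here"))
# False
```

A plain function like this can be called from podiff directly or wrapped in a class later, so it does not settle the questions above either way.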

[1] http://www.gnu.org/software/wdiff/
[2] http://docs.python.org/library/difflib.html


Work Items
