Registered by Noah Gift

Command Line Tool and Library To Eliminate Duplicates and Facilitate Intelligent Merging of Data Structures.

LITEN IS IN VERY ACTIVE DEVELOPMENT

Installation

You can use setuptools, via easy_install, to install the module and to place the script in your bin directory automatically. You can download the setuptools bootstrap script, ez_setup.py, here:

http://peak.telecommunity.com/DevCenter/EasyInstall#installing-easy-install

Then easy_install the egg for your version of Python, either Python 2.4 or Python 2.5.

Easy Install Example:

easy_install-2.4 http://liten.googlecode.com/files/liten-0.1.3-py2.4.egg

or

easy_install liten (This downloads the 2.5 egg from the Cheese Shop)

You can also visit the Cheese Shop page: http://pypi.python.org/pypi/Liten/0.1.3

RPM Version Example:

Download RPM (Requires Python 2.4 or Python 2.5): http://liten.googlecode.com/files/liten-0.1.3-linux-2.6-intel.rpm

Install:

rpm -ivh liten-0.1.3-linux-2.6-intel.rpm

Debian/Ubuntu Package Version

Download the .deb (Requires Python 2.5): http://liten.googlecode.com/files/liten-0.1.3-linux-2.6-intel.deb

Note: This should work on almost any Ubuntu or Debian system with Python 2.5.

Install:

dpkg -i liten-0.1.3-linux-2.6-intel.deb

Download the Script

If you are in a huge rush, or don't have a package for your distribution, then you can just download liten.py and run it:

Do:
wget http://liten.googlecode.com/files/liten.py

or perhaps:

curl http://liten.googlecode.com/files/liten.py > /usr/local/bin/liten.py

Version: 0.1.3

Description

A deduplication command line tool and library. To find duplicate files or file objects, Liten uses a relatively efficient algorithm: it first filters candidates down to those with the same size in bytes, and then performs a full md5 checksum to confirm each match.
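
The two-stage approach described above can be sketched as follows. This is a minimal illustration in modern Python 3, not Liten's actual code: the function names md5_of_file and find_duplicates, the min_size parameter, and the chunk size are all assumptions made for this example.

```python
import hashlib
import os
from collections import defaultdict


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def find_duplicates(root, min_size=0):
    """Return a list of (original, duplicate) path pairs found under root.

    Hypothetical helper: the name and signature are not taken from Liten.
    """
    # Stage 1: group files by byte size. A file whose size is unique
    # cannot have a duplicate, so it never needs to be checksummed.
    by_size = defaultdict(list)
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in sorted(filenames):
            path = os.path.join(dirpath, name)
            if os.path.isfile(path) and not os.path.islink(path):
                size = os.path.getsize(path)
                if size >= min_size:
                    by_size[size].append(path)

    # Stage 2: checksum only the files that share a size with another file.
    duplicates = []
    for paths in by_size.values():
        if len(paths) < 2:
            continue
        seen = {}  # checksum -> first path seen with that checksum
        for path in paths:
            checksum = md5_of_file(path)
            if checksum in seen:
                duplicates.append((seen[checksum], path))
            else:
                seen[checksum] = path
    return duplicates
```

The size filter is what keeps the approach cheap: reading a file's size costs a single stat call, while a checksum requires reading the whole file, so the expensive step only runs on files that already look like candidates.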

Example CLI Usage:

liten.py -s 1 /mnt/raid (equivalent to liten.py -s 1MB /mnt/raid)
liten.py -s 1bytes /mnt/raid
liten.py -s 1KB /mnt/raid
liten.py -s 1MB /mnt/raid
liten.py -s 1GB /mnt/raid
liten.py -s 1TB /mnt/raid
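
One way a size argument like the ones above could be turned into a byte count is sketched below. This is an illustration only: parse_size is a hypothetical name, and the 1024-based multipliers are an assumption, since the source does not say whether Liten uses 1000 or 1024. The only behavior taken from the examples is that a bare number is treated as MB.

```python
import re

# Assumed multipliers: binary (1024-based) units. Liten may use
# different values; the source does not specify them.
_UNITS = {
    "BYTES": 1,
    "KB": 1024,
    "MB": 1024 ** 2,
    "GB": 1024 ** 3,
    "TB": 1024 ** 4,
}


def parse_size(text, default_unit="MB"):
    """Convert a string such as '1', '1KB' or '1bytes' to a byte count."""
    match = re.fullmatch(r"\s*(\d+)\s*([A-Za-z]*)\s*", text)
    if match is None:
        raise ValueError("unrecognized size: %r" % text)
    number, unit = match.groups()
    # A missing unit falls back to the default, matching "-s 1" == "-s 1MB".
    unit = unit.upper() or default_unit
    if unit not in _UNITS:
        raise ValueError("unknown unit: %r" % unit)
    return int(number) * _UNITS[unit]
```

With these assumptions, parse_size("1") and parse_size("1MB") return the same value, mirroring the first CLI example.
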
Example Library Usage:

Currently, Liten is optimized for CLI use, but more library-friendly changes are coming.

    >>> Liten = LitenBaseClass(spath='testData')
    >>> dupeFileOne = 'testData/testDocOne.txt'
    >>> checksumOne = Liten.createChecksum(dupeFileOne)
    >>> dupeFileTwo = 'testData/testDocTwo.txt'
    >>> checksumTwo = Liten.createChecksum(dupeFileTwo)
    >>> nonDupeFile = 'testData/testDocThree_wrong_match.txt'
    >>> checksumThree = Liten.createChecksum(nonDupeFile)
    >>> checksumOne == checksumTwo
    True
    >>> checksumOne == checksumThree
    False

Tests:

Run doctests: ./liten -t or ./liten --test
Run test_liten.py

Display Options:

STDOUT: stdout will show you duplicate file paths and sizes such as:

Printing dups over 1 MB using md5 checksum: [SIZE] [ORIG] [DUP]
7 MB Orig: /Users/ngift/Downloads/bzr-0-2.17.tar
Dupe: /Users/ngift/Downloads/bzr-0-4.17.tar

REPORT:

A report named LitenDuplicateReport??.txt will be created in your current working directory.

Duplicate Version, Path, Size, ModDate
Original, /Users/ngift/Downloads/bzr-0-2.17.tar, 7 MB, 07/10/2007 01:43:12 AM
Duplicate, /Users/ngift/Downloads/bzr-0-3.17.tar, 7 MB, 07/10/2007 01:43:27 AM
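
A report row in the layout shown above could be produced along these lines. This is a sketch, not Liten's code: format_row is a hypothetical helper, the size is assumed to be reported in whole 1024-based megabytes, and the date is assumed to be month/day/year with a 12-hour clock.

```python
from datetime import datetime

REPORT_HEADER = "Duplicate Version, Path, Size, ModDate"


def format_row(label, path, size_bytes, modified):
    """Build one report line: label, path, size in MB, modification date."""
    # Assumption: sizes are truncated to whole megabytes (1024 * 1024 bytes).
    size_mb = size_bytes // (1024 ** 2)
    # Assumption: month/day/year ordering with an AM/PM marker.
    stamp = modified.strftime("%m/%d/%Y %I:%M:%S %p")
    return ", ".join([label, path, "%d MB" % size_mb, stamp])


if __name__ == "__main__":
    print(REPORT_HEADER)
    print(format_row("Original", "/Users/ngift/Downloads/bzr-0-2.17.tar",
                     7 * 1024 ** 2, datetime(2007, 7, 10, 1, 43, 12)))
```
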
DEBUG MODE ENVIRONMENT VARIABLES:

To enable print statement debugging, set LITEN_DEBUG to 1.
To enable pdb breakpoint debugging, set LITEN_DEBUG to 2.
LITEN_DEBUG_MODE = int(os.environ.get('LITEN_DEBUG', 0))
Note: When debug mode is enabled, a message is printed to standard output.
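
The two debug levels could be wired up roughly as below. Only the LITEN_DEBUG variable and its values 1 and 2 come from the text above; the function names read_debug_mode and debug, the message prefix, and the breakpoint_hook parameter are assumptions made for this sketch.

```python
import os
import pdb


def read_debug_mode(environ=os.environ):
    """Return the LITEN_DEBUG level as an int, defaulting to 0 (off)."""
    return int(environ.get("LITEN_DEBUG", 0))


def debug(message, mode, breakpoint_hook=pdb.set_trace):
    """Hypothetical helper: print at level 1, start the debugger at level 2."""
    if mode == 1:
        print("DEBUG: %s" % message)
    elif mode == 2:
        # breakpoint_hook is injectable so the behavior can be tested
        # without actually stopping in pdb.
        breakpoint_hook()
```

From a shell, the level would be set for a single run with something like LITEN_DEBUG=1 liten.py -s 1MB /mnt/raid.
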
RoadMap: A roadmap for Liten is available at http://code.google.com/p/liten/wiki/RoadMap

Project information

Maintainer: Noah Gift
Driver: Not yet selected
Licence: MIT / X / Expat Licence

Series and milestones

trunk series is the current focus of development.

Code

Version control system: Bazaar
Programming languages: Python

Downloads

Latest version is 0.1.3