
Imported Upstream version 0.9.5

Jelmer Vernooij 11 years ago
parent
commit
a20931b04a

+ 34 - 0
NEWS

@@ -1,3 +1,37 @@
+0.9.5	2014-02-23
+
+ IMPROVEMENTS
+
+ * Add porcelain 'tag'. (Ryan Faulkner)
+
+ * New module `dulwich.objectspec` for parsing strings referencing
+   objects and commit ranges. (Jelmer Vernooij)
+
+ * Add shallow branch support. (milki)
+
+ * Allow passing urllib2 `opener` into HttpGitClient.
+   (Dov Feldstern, #909037)
+
+ CHANGES
+
+ * Drop support for Python 2.4 and 2.5. (Jelmer Vernooij)
+
+ API CHANGES
+
+ * Remove long deprecated ``Repo.commit``, ``Repo.get_blob``,
+   ``Repo.tree`` and ``Repo.tag``. (Jelmer Vernooij)
+
+ * Remove long deprecated ``Repo.revision_history`` and ``Repo.ref``.
+   (Jelmer Vernooij)
+
+ * Remove long deprecated ``Tree.entries``. (Jelmer Vernooij)
+
+ BUG FIXES
+
+ * Raise KeyError rather than TypeError when passing in
+   unicode object of length 20 or 40 to Repo.__getitem__.
+   (Jelmer Vernooij)
+
 0.9.4	2013-11-30
 
  IMPROVEMENTS

+ 6 - 6
PKG-INFO

@@ -1,20 +1,20 @@
 Metadata-Version: 1.0
 Name: dulwich
-Version: 0.9.4
+Version: 0.9.5
 Summary: Python Git Library
-Home-page: http://samba.org/~jelmer/dulwich
+Home-page: https://samba.org/~jelmer/dulwich
 Author: Jelmer Vernooij
 Author-email: jelmer@samba.org
 License: GPLv2 or later
 Description: 
-              Simple Python implementation of the Git file formats and
-              protocols.
+              Python implementation of the Git file formats and protocols,
+              without the need to have git installed.
         
               All functionality is available in pure Python. Optional
               C extensions can be built for improved performance.
         
-              Dulwich takes its name from the area in London where the friendly
-              Mr. and Mrs. Git once attended a cocktail party.
+              The project is named after the part of London that Mr. and Mrs. Git live in
+              in the particular Monty Python sketch.
               
 Keywords: git
 Platform: UNKNOWN

+ 14 - 8
README

@@ -1,16 +1,16 @@
 This is the Dulwich project.
 
-It aims to give an interface to git repos (both local and remote) that doesn't
-call out to git directly but instead uses pure Python.
+It aims to provide an interface to git repos (both local and remote) that
+doesn't call out to git directly but instead uses pure Python.
 
-The project is named after the part of London that Mr. and Mrs. Git live in 
-in the particular Monty Python sketch. It is based on the Python-Git module 
-that James Westby <jw+debian@jameswestby.net> released in 2007 and now 
-maintained by Jelmer Vernooij et al.
+Homepage: https://samba.org/~jelmer/dulwich/
+Author: Jelmer Vernooij <jelmer@samba.org>
 
-Please file bugs in the Dulwich project on Launchpad: 
+The project is named after the part of London that Mr. and Mrs. Git live in
+in the particular Monty Python sketch.
 
-https://bugs.launchpad.net/dulwich/+filebug
+Further documentation
+---------------------
 
 The dulwich documentation can be found in doc/ and on the web:
 
@@ -19,3 +19,9 @@ http://www.samba.org/~jelmer/dulwich/docs/
 The API reference can be generated using pydoctor, by running "make pydoctor", or on the web:
 
 http://www.samba.org/~jelmer/dulwich/apidocs
+
+Help
+----
+
+There is a #dulwich IRC channel on Freenode, and a dulwich mailing list at
+https://launchpad.net/~dulwich-users.

+ 1 - 1
bin/dulwich

@@ -205,7 +205,7 @@ def cmd_symbolic_ref(args):
 
 def cmd_show(args):
     opts, args = getopt(args, "", [])
-    porcelain.show(".")
+    porcelain.show(".", args)
 
 
 def cmd_diff_tree(args):

+ 99 - 0
docs/tutorial/file-format.txt

@@ -0,0 +1,99 @@
+Git File format
+===============
+
+For a better understanding of Dulwich, we'll start by explaining most of the
+Git secrets.
+
+Open the ".git" folder of any Git-managed repository. You'll find folders
+like "branches", "hooks"... We're only interested in "objects" here. Open it.
+
+You'll mostly see 2 hex-digits folders. Git identifies content by its SHA-1
+digest. The 2 hex-digits plus the 38 hex-digits of files inside these folders
+form the 40 characters (or 20 bytes) id of Git objects you'll manage in
+Dulwich.
+
+We'll first study the three main objects:
+
+- The Commit;
+
+- The Tree;
+
+- The Blob.
+
+The Commit
+----------
+
+You're used to generating commits using Git. You have set up your name and
+e-mail, and you know how to see the history using ``git log``.
+
+A commit file looks like this::
+
+  commit <content length><NUL>tree <tree sha>
+  parent <parent sha>
+  [parent <parent sha> if several parents from merges]
+  author <author name> <author e-mail> <timestamp> <timezone>
+  committer <committer name> <committer e-mail> <timestamp> <timezone>
+ 
+  <commit message>
+
+But where are the changes you committed? The commit contains a reference to a
+tree.
+
+The Tree
+--------
+
+A tree is a collection of file information, the state of a single directory at
+a given point in time.
+
+A tree file looks like this::
+
+  tree <content length><NUL><file mode> <filename><NUL><item sha>...
+
+And repeats for every file in the tree.
+
+Note that the SHA-1 digest is in binary form here.
+
+The file mode is like the octal argument you could give to the ``chmod``
+command.  Except it is in extended form to tell regular files from
+directories and other types.
+
+We now know how our files are referenced but we haven't found their actual
+content yet. That's where the reference to a blob comes in.
+
+The Blob
+--------
+
+A blob is simply the content of files you are versioning.
+
+A blob file looks like this::
+
+  blob <content length><NUL><content>
+
+If you change a single line, another blob will be generated by Git at commit
+time. This is how Git can quickly check out any version in time.
+
+Conversely, several identical files with different filenames generate
+only one blob. That's mostly how renames are so cheap and efficient in Git.
+
+Dulwich Objects
+---------------
+
+Dulwich implements these three objects with an API to easily access the
+information you need, while abstracting some more secrets Git is using to
+accelerate operations and reduce space.
+
+More About Git formats
+----------------------
+
+These three objects make up most of the contents of a Git repository and are
+used for the history. They can either appear as simple files on disk (one file
+per object) or in a ``pack`` file, which is a container for a number of these
+objects.
+
+There is also an index of the current state of the working copy in the
+repository as well as files to track the existing branches and tags.
+
+For a more detailed explanation of object formats and SHA-1 digests, see:
+http://www-cs-students.stanford.edu/~blynn/gitmagic/ch08.html
+
+Just note that recent versions of Git compress object files using zlib.
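
The object layout described in docs/tutorial/file-format.txt can be checked with standard-library Python alone. The following is a minimal sketch, not Dulwich code: `encode_object`, `git_object_id`, `loose_object_path` and `parse_tree_entries` are hypothetical helper names, and the example is written for Python 3 rather than the Python 2 this release targets.

```python
import binascii
import hashlib
import zlib


def encode_object(type_name, content):
    """Build "<type> <content length><NUL><content>" as bytes."""
    header = type_name + b" " + str(len(content)).encode("ascii") + b"\x00"
    return header + content


def git_object_id(type_name, content):
    """Return the 40-character hex SHA-1 of the encoded object."""
    return hashlib.sha1(encode_object(type_name, content)).hexdigest()


def loose_object_path(hex_id):
    """Split an id into the 2-hex-digit folder and 38-hex-digit file name."""
    return "objects/%s/%s" % (hex_id[:2], hex_id[2:])


def parse_tree_entries(body):
    """Parse a tree body (header already stripped) into (mode, name, hex sha)."""
    entries = []
    pos = 0
    while pos < len(body):
        space = body.index(b" ", pos)
        nul = body.index(b"\x00", space)
        mode = int(body[pos:space].decode("ascii"), 8)  # octal file mode
        name = body[space + 1:nul]
        # The item SHA-1 is stored as 20 raw bytes, not as 40 hex digits.
        sha = binascii.hexlify(body[nul + 1:nul + 21]).decode("ascii")
        entries.append((mode, name, sha))
        pos = nul + 21
    return entries


blob_id = git_object_id(b"blob", b"hello world\n")
print(blob_id)  # 3b18e512dba79e4c8300dd08aeb37f8e728b8dad
print(loose_object_path(blob_id))

# A one-entry tree body that points at the blob above.
tree_body = b"100644 hello.txt\x00" + binascii.unhexlify(blob_id)
print(parse_tree_entries(tree_body))

# Loose object files hold the same bytes, compressed with zlib.
stored = zlib.compress(encode_object(b"blob", b"hi"))
print(zlib.decompress(stored))
```

Because the id depends only on the type and content, two files with identical content map to the same blob, which is the property the tutorial text relies on when it discusses renames.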

+ 1 - 0
docs/tutorial/index.txt

@@ -8,6 +8,7 @@ Tutorial
    :maxdepth: 2
 
    introduction
+   file-format
    repo
    object-store
    remote

+ 9 - 97
docs/tutorial/introduction.txt

@@ -3,102 +3,14 @@
 Introduction
 ============
 
-Git repository format
----------------------
+Like Git itself, Dulwich consists of two main layers; the so-called plumbing
+and the porcelain.
 
-For a better understanding of Dulwich, we'll start by explaining most of the
-Git secrets.
+The plumbing is the lower layer and it deals with the Git object database and the
+nitty gritty internals. The porcelain is roughly what you would expect to
+be exposed to as a user of the ``git`` command-line tool.
 
-Open the ".git" folder of any Git-managed repository. You'll find folders
-like "branches", "hooks"... We're only interested in "objects" here. Open it.
-
-You'll mostly see 2 hex-digits folders. Git identifies content by its SHA-1
-digest. The 2 hex-digits plus the 38 hex-digits of files inside these folders
-form the 40 characters (or 20 bytes) id of Git objects you'll manage in
-Dulwich.
-
-We'll first study the three main objects:
-
-- The Commit;
-
-- The Tree;
-
-- The Blob.
-
-The Commit
-----------
-
-You're used to generate commits using Git. You have set up your name and
-e-mail, and you know how to see the history using ``git log``.
-
-A commit file looks like this::
-
-  commit <content length><NUL>tree <tree sha>
-  parent <parent sha>
-  [parent <parent sha> if several parents from merges]
-  author <author name> <author e-mail> <timestamp> <timezone>
-  committer <author name> <author e-mail> <timestamp> <timezone>
- 
-  <commit message>
-
-But where are the changes you commited? The commit contains a reference to a
-tree.
-
-The Tree
---------
-
-A tree is a collection of file information, the state of a single directory at
-a given point in time.
-
-A tree file looks like this::
-
-  tree <content length><NUL><file mode> <filename><NUL><item sha>...
-
-And repeats for every file in the tree.
-
-Note that the SHA-1 digest is in binary form here.
-
-The file mode is like the octal argument you could give to the ``chmod``
-command.  Except it is in extended form to tell regular files from
-directories and other types.
-
-We now know how our files are referenced but we haven't found their actual
-content yet. That's where the reference to a blob comes in.
-
-The Blob
---------
-
-A blob is simply the content of files you are versionning.
-
-A blob file looks like this::
-
-  blob <content length><NUL><content>
-
-If you change a single line, another blob will be generated by Git at commit
-time. This is how Git can fastly checkout any version in time.
-
-On the opposite, several identical files with different filenames generate
-only one blob. That's mostly how renames are so cheap and efficient in Git.
-
-Dulwich Objects
----------------
-
-Dulwich implements these three objects with an API to easily access the
-information you need, while abstracting some more secrets Git is using to
-accelerate operations and reduce space.
-
-More About Git formats
-----------------------
-
-These three objects make up most of the contents of a Git repository and are
-used for the history. They can either appear as simple files on disk (one file
-per object) or in a ``pack`` file, which is a container for a number of these
-objects.
-
-The is also an index of the current state of the working copy in the
-repository as well as files to track the existing branches and tags.
-
-For a more detailed explanation of object formats and SHA-1 digests, see:
-http://www-cs-students.stanford.edu/~blynn/gitmagic/ch08.html
-
-Just note that recent versions of Git compress object files using zlib.
+Dulwich has a fairly complete plumbing implementation, and only a somewhat
+smaller porcelain implementation. The porcelain code lives in
+``dulwich.porcelain``. For the most part, this tutorial introduces you to the
+internal concepts of Git and the main plumbing parts of Dulwich.

+ 4 - 4
docs/tutorial/remote.txt

@@ -54,7 +54,7 @@ which claims that the client doesn't have any objects::
    ...     def ack(self, sha): pass
    ...     def next(self): pass
 
-With the determine_wants function in place, we can now fetch a pack,
+With the ``determine_wants`` function in place, we can now fetch a pack,
 which we will write to a ``StringIO`` object::
 
    >>> from cStringIO import StringIO
@@ -70,14 +70,14 @@ which we will write to a ``StringIO`` object::
 Fetching objects into a local repository
 ----------------------------------------
 
-It also possible to fetch from a remote repository into a local repository,
-in which case dulwich takes care of providing the right graph walker, and
+It is also possible to fetch from a remote repository into a local repository,
+in which case Dulwich takes care of providing the right graph walker, and
 importing the received pack file into the local repository::
 
    >>> from dulwich.repo import Repo
    >>> local = Repo.init("local", mkdir=True)
    >>> remote_refs = client.fetch("/", local)
 
-Let's show down the server now that all tests have been run::
+Let's shut down the server now that all tests have been run::
 
    >>> dul_server.shutdown()

+ 6 - 6
dulwich.egg-info/PKG-INFO

@@ -1,20 +1,20 @@
 Metadata-Version: 1.0
 Name: dulwich
-Version: 0.9.4
+Version: 0.9.5
 Summary: Python Git Library
-Home-page: http://samba.org/~jelmer/dulwich
+Home-page: https://samba.org/~jelmer/dulwich
 Author: Jelmer Vernooij
 Author-email: jelmer@samba.org
 License: GPLv2 or later
 Description: 
-              Simple Python implementation of the Git file formats and
-              protocols.
+              Python implementation of the Git file formats and protocols,
+              without the need to have git installed.
         
               All functionality is available in pure Python. Optional
               C extensions can be built for improved performance.
         
-              Dulwich takes its name from the area in London where the friendly
-              Mr. and Mrs. Git once attended a cocktail party.
+              The project is named after the part of London that Mr. and Mrs. Git live in
+              in the particular Monty Python sketch.
               
 Keywords: git
 Platform: UNKNOWN

+ 6 - 1
dulwich.egg-info/SOURCES.txt

@@ -17,6 +17,7 @@ docs/performance.txt
 docs/protocol.txt
 docs/tutorial/Makefile
 docs/tutorial/conclusion.txt
+docs/tutorial/file-format.txt
 docs/tutorial/index.txt
 docs/tutorial/introduction.txt
 docs/tutorial/object-store.txt
@@ -40,6 +41,7 @@ dulwich/log_utils.py
 dulwich/lru_cache.py
 dulwich/object_store.py
 dulwich/objects.py
+dulwich/objectspec.py
 dulwich/pack.py
 dulwich/patch.py
 dulwich/porcelain.py
@@ -68,10 +70,12 @@ dulwich/tests/test_lru_cache.py
 dulwich/tests/test_missing_obj_finder.py
 dulwich/tests/test_object_store.py
 dulwich/tests/test_objects.py
+dulwich/tests/test_objectspec.py
 dulwich/tests/test_pack.py
 dulwich/tests/test_patch.py
 dulwich/tests/test_porcelain.py
 dulwich/tests/test_protocol.py
+dulwich/tests/test_refs.py
 dulwich/tests/test_repository.py
 dulwich/tests/test_server.py
 dulwich/tests/test_utils.py
@@ -164,4 +168,5 @@ dulwich/tests/data/tags/71/033db03a03c6a36721efcf1968dd8f8e0cf023
 dulwich/tests/data/trees/70/c190eb48fa8bbb50ddc692a17b44cb781af7f6
 examples/clone.py
 examples/config.py
-examples/diff.py
+examples/diff.py
+examples/latest_change.py

+ 1 - 1
dulwich/__init__.py

@@ -21,4 +21,4 @@
 
 """Python implementation of the Git file formats and protocols."""
 
-__version__ = (0, 9, 4)
+__version__ = (0, 9, 5)

+ 2 - 250
dulwich/_compat.py

@@ -16,259 +16,11 @@
 # Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston,
 # MA  02110-1301, USA.
 
-"""Misc utilities to work with python <2.6.
+"""Misc utilities to work with python <2.7.
 
 These utilities can all be deleted when dulwich decides it wants to stop
-support for python <2.6.
+support for python <2.7.
 """
-try:
-    import hashlib
-except ImportError:
-    import sha
-
-try:
-    from urlparse import parse_qs
-except ImportError:
-    from cgi import parse_qs
-
-try:
-    from os import SEEK_CUR, SEEK_END
-except ImportError:
-    SEEK_CUR = 1
-    SEEK_END = 2
-
-import struct
-
-
-class defaultdict(dict):
-    """A python 2.4 equivalent of collections.defaultdict."""
-
-    def __init__(self, default_factory=None, *a, **kw):
-        if (default_factory is not None and
-            not hasattr(default_factory, '__call__')):
-            raise TypeError('first argument must be callable')
-        dict.__init__(self, *a, **kw)
-        self.default_factory = default_factory
-
-    def __getitem__(self, key):
-        try:
-            return dict.__getitem__(self, key)
-        except KeyError:
-            return self.__missing__(key)
-
-    def __missing__(self, key):
-        if self.default_factory is None:
-            raise KeyError(key)
-        self[key] = value = self.default_factory()
-        return value
-
-    def __reduce__(self):
-        if self.default_factory is None:
-            args = tuple()
-        else:
-            args = self.default_factory,
-        return type(self), args, None, None, self.items()
-
-    def copy(self):
-        return self.__copy__()
-
-    def __copy__(self):
-        return type(self)(self.default_factory, self)
-
-    def __deepcopy__(self, memo):
-        import copy
-        return type(self)(self.default_factory,
-                          copy.deepcopy(self.items()))
-    def __repr__(self):
-        return 'defaultdict(%s, %s)' % (self.default_factory,
-                                        dict.__repr__(self))
-
-
-def make_sha(source=''):
-    """A python2.4 workaround for the sha/hashlib module fiasco."""
-    try:
-        return hashlib.sha1(source)
-    except NameError:
-        sha1 = sha.sha(source)
-        return sha1
-
-
-def unpack_from(fmt, buf, offset=0):
-    """A python2.4 workaround for struct missing unpack_from."""
-    try:
-        return struct.unpack_from(fmt, buf, offset)
-    except AttributeError:
-        b = buf[offset:offset+struct.calcsize(fmt)]
-        return struct.unpack(fmt, b)
-
-
-try:
-    from itertools import permutations
-except ImportError:
-    # Implementation of permutations from Python 2.6 documentation:
-    # http://docs.python.org/2.6/library/itertools.html#itertools.permutations
-    # Copyright (c) 2001-2010 Python Software Foundation; All Rights Reserved
-    # Modified syntax slightly to run under Python 2.4.
-    def permutations(iterable, r=None):
-        # permutations('ABCD', 2) --> AB AC AD BA BC BD CA CB CD DA DB DC
-        # permutations(range(3)) --> 012 021 102 120 201 210
-        pool = tuple(iterable)
-        n = len(pool)
-        if r is None:
-            r = n
-        if r > n:
-            return
-        indices = range(n)
-        cycles = range(n, n-r, -1)
-        yield tuple(pool[i] for i in indices[:r])
-        while n:
-            for i in reversed(range(r)):
-                cycles[i] -= 1
-                if cycles[i] == 0:
-                    indices[i:] = indices[i+1:] + indices[i:i+1]
-                    cycles[i] = n - i
-                else:
-                    j = cycles[i]
-                    indices[i], indices[-j] = indices[-j], indices[i]
-                    yield tuple(pool[i] for i in indices[:r])
-                    break
-            else:
-                return
-
-
-try:
-    all = all
-except NameError:
-    # Implementation of permutations from Python 2.6 documentation:
-    # http://docs.python.org/2.6/library/functions.html#all
-    # Copyright (c) 2001-2010 Python Software Foundation; All Rights Reserved
-    # Licensed under the Python Software Foundation License.
-    def all(iterable):
-        for element in iterable:
-            if not element:
-                return False
-        return True
-
-
-try:
-    from collections import namedtuple
-except ImportError:
-    # Recipe for namedtuple from http://code.activestate.com/recipes/500261/
-    # Copyright (c) 2007 Python Software Foundation; All Rights Reserved
-    # Licensed under the Python Software Foundation License.
-    from operator import itemgetter as _itemgetter
-    from keyword import iskeyword as _iskeyword
-    import sys as _sys
-
-    def namedtuple(typename, field_names, verbose=False, rename=False):
-        """Returns a new subclass of tuple with named fields.
-
-        >>> Point = namedtuple('Point', 'x y')
-        >>> Point.__doc__                   # docstring for the new class
-        'Point(x, y)'
-        >>> p = Point(11, y=22)             # instantiate with positional args or keywords
-        >>> p[0] + p[1]                     # indexable like a plain tuple
-        33
-        >>> x, y = p                        # unpack like a regular tuple
-        >>> x, y
-        (11, 22)
-        >>> p.x + p.y                       # fields also accessable by name
-        33
-        >>> d = p._asdict()                 # convert to a dictionary
-        >>> d['x']
-        11
-        >>> Point(**d)                      # convert from a dictionary
-        Point(x=11, y=22)
-        >>> p._replace(x=100)               # _replace() is like str.replace() but targets named fields
-        Point(x=100, y=22)
-
-        """
-
-        # Parse and validate the field names.  Validation serves two purposes,
-        # generating informative error messages and preventing template injection attacks.
-        if isinstance(field_names, basestring):
-            field_names = field_names.replace(',', ' ').split() # names separated by whitespace and/or commas
-        field_names = tuple(map(str, field_names))
-        if rename:
-            names = list(field_names)
-            seen = set()
-            for i, name in enumerate(names):
-                if (not min(c.isalnum() or c=='_' for c in name) or _iskeyword(name)
-                    or not name or name[0].isdigit() or name.startswith('_')
-                    or name in seen):
-                        names[i] = '_%d' % i
-                seen.add(name)
-            field_names = tuple(names)
-        for name in (typename,) + field_names:
-            if not min(c.isalnum() or c=='_' for c in name):
-                raise ValueError('Type names and field names can only contain alphanumeric characters and underscores: %r' % name)
-            if _iskeyword(name):
-                raise ValueError('Type names and field names cannot be a keyword: %r' % name)
-            if name[0].isdigit():
-                raise ValueError('Type names and field names cannot start with a number: %r' % name)
-        seen_names = set()
-        for name in field_names:
-            if name.startswith('_') and not rename:
-                raise ValueError('Field names cannot start with an underscore: %r' % name)
-            if name in seen_names:
-                raise ValueError('Encountered duplicate field name: %r' % name)
-            seen_names.add(name)
-
-        # Create and fill-in the class template
-        numfields = len(field_names)
-        argtxt = repr(field_names).replace("'", "")[1:-1]   # tuple repr without parens or quotes
-        reprtxt = ', '.join('%s=%%r' % name for name in field_names)
-        template = '''class %(typename)s(tuple):
-        '%(typename)s(%(argtxt)s)' \n
-        __slots__ = () \n
-        _fields = %(field_names)r \n
-        def __new__(_cls, %(argtxt)s):
-            return _tuple.__new__(_cls, (%(argtxt)s)) \n
-        @classmethod
-        def _make(cls, iterable, new=tuple.__new__, len=len):
-            'Make a new %(typename)s object from a sequence or iterable'
-            result = new(cls, iterable)
-            if len(result) != %(numfields)d:
-                raise TypeError('Expected %(numfields)d arguments, got %%d' %% len(result))
-            return result \n
-        def __repr__(self):
-            return '%(typename)s(%(reprtxt)s)' %% self \n
-        def _asdict(self):
-            'Return a new dict which maps field names to their values'
-            return dict(zip(self._fields, self)) \n
-        def _replace(_self, **kwds):
-            'Return a new %(typename)s object replacing specified fields with new values'
-            result = _self._make(map(kwds.pop, %(field_names)r, _self))
-            if kwds:
-                raise ValueError('Got unexpected field names: %%r' %% kwds.keys())
-            return result \n
-        def __getnewargs__(self):
-            return tuple(self) \n\n''' % locals()
-        for i, name in enumerate(field_names):
-            template += '        %s = _property(_itemgetter(%d))\n' % (name, i)
-        if verbose:
-            print template
-
-        # Execute the template string in a temporary namespace
-        namespace = dict(_itemgetter=_itemgetter, __name__='namedtuple_%s' % typename,
-                         _property=property, _tuple=tuple)
-        try:
-            exec template in namespace
-        except SyntaxError, e:
-            raise SyntaxError(e.message + ':\n' + template)
-        result = namespace[typename]
-
-        # For pickling to work, the __module__ variable needs to be set to the frame
-        # where the named tuple is created.  Bypass this step in enviroments where
-        # sys._getframe is not defined (Jython for example) or sys._getframe is not
-        # defined for arguments greater than 0 (IronPython).
-        try:
-            result.__module__ = _sys._getframe(1).f_globals.get('__name__', '__main__')
-        except (AttributeError, ValueError):
-            pass
-
-        return result
-
 
 # Backport of OrderedDict() class that runs on Python 2.4, 2.5, 2.6, 2.7 and pypy.
 # Passes Python2.7's test suite and incorporates all the latest updates.
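
Each shim removed from dulwich/_compat.py in this hunk has a standard-library counterpart, which is why dropping Python 2.4 and 2.5 lets callers import them directly. The sketch below exercises those counterparts; it is illustrative only, uses Python 3 spellings, and the `Point` type and sample values are made up for the example.

```python
import hashlib
import struct
from collections import defaultdict, namedtuple
from itertools import permutations

# make_sha() -> hashlib.sha1()
empty_sha = hashlib.sha1(b"").hexdigest()

# unpack_from() -> struct.unpack_from(): read a big-endian uint32 at offset 2
(value,) = struct.unpack_from(">L", b"\x00\x00\x00\x00\x01\x02", 2)

# _compat.defaultdict -> collections.defaultdict
groups = defaultdict(list)
groups["blob"].append("README")

# _compat.namedtuple -> collections.namedtuple
Point = namedtuple("Point", "x y")
p = Point(11, y=22)

# _compat.permutations -> itertools.permutations
pairs = list(permutations("AB"))

print(empty_sha, value, dict(groups), p, pairs)
```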

+ 18 - 15
dulwich/client.py

@@ -55,6 +55,7 @@ from dulwich.protocol import (
     _RBUFSIZE,
     PktLineParser,
     Protocol,
+    ProtocolFile,
     TCP_GIT_PORT,
     ZERO_SHA,
     extract_capabilities,
@@ -582,7 +583,7 @@ class TCPGitClient(TraditionalGitClient):
             try:
                 s.connect(sockaddr)
                 break
-            except socket.error, err:
+            except socket.error as err:
                 if s is not None:
                     s.close()
                 s = None
@@ -697,7 +698,15 @@ class LocalGitClient(GitClient):
         :param pack_data: Callback called for each bit of data in the pack
         :param progress: Callback for progress reports (strings)
         """
-        raise NotImplementedError(self.fetch_pack)
+        from dulwich.repo import Repo
+        r = Repo(path)
+        objects_iter = r.fetch_objects(determine_wants, graph_walker, progress)
+
+        # Did the process short-circuit (e.g. in a stateless RPC call)? Note
+        # that the client still expects a 0-object pack in most cases.
+        if objects_iter is None:
+            return
+        write_pack_objects(ProtocolFile(None, pack_data), objects_iter)
 
 
 # What Git client to use for local access
@@ -885,9 +894,13 @@ class SSHGitClient(TraditionalGitClient):
 
 class HttpGitClient(GitClient):
 
-    def __init__(self, base_url, dumb=None, *args, **kwargs):
+    def __init__(self, base_url, dumb=None, opener=None, *args, **kwargs):
         self.base_url = base_url.rstrip("/") + "/"
         self.dumb = dumb
+        if opener is None:
+            self.opener = urllib2.build_opener()
+        else:
+            self.opener = opener
         GitClient.__init__(self, *args, **kwargs)
 
     def _get_url(self, path):
@@ -896,24 +909,14 @@ class HttpGitClient(GitClient):
     def _http_request(self, url, headers={}, data=None):
         req = urllib2.Request(url, headers=headers, data=data)
         try:
-            resp = self._perform(req)
-        except urllib2.HTTPError, e:
+            resp = self.opener.open(req)
+        except urllib2.HTTPError as e:
             if e.code == 404:
                 raise NotGitRepository()
             if e.code != 200:
                 raise GitProtocolError("unexpected http response %d" % e.code)
         return resp
 
-    def _perform(self, req):
-        """Perform a HTTP request.
-
-        This is provided so subclasses can provide their own version.
-
-        :param req: urllib2.Request instance
-        :return: matching response
-        """
-        return urllib2.urlopen(req)
-
     def _discover_references(self, service, url):
         assert url[-1] == "/"
         url = urlparse.urljoin(url, "info/refs")
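
The `opener` parameter added to `HttpGitClient.__init__` replaces the `_perform` override hook with constructor injection. The sketch below shows that pattern in isolation. It is not the Dulwich class: `HttpClientSketch` and `RecordingOpener` are hypothetical names, and Python 3's `urllib.request` stands in for `urllib2`.

```python
import urllib.request


class HttpClientSketch(object):
    """Hypothetical stand-in mirroring the constructor logic in this hunk."""

    def __init__(self, base_url, opener=None):
        self.base_url = base_url.rstrip("/") + "/"
        if opener is None:
            # Default: an opener with the standard handlers.
            self.opener = urllib.request.build_opener()
        else:
            self.opener = opener

    def request(self, path):
        req = urllib.request.Request(self.base_url + path)
        return self.opener.open(req)


class RecordingOpener(object):
    """Test double: records requested URLs instead of using the network."""

    def __init__(self):
        self.urls = []

    def open(self, req):
        self.urls.append(req.full_url)
        return b"fake response"


opener = RecordingOpener()
client = HttpClientSketch("http://example.com/repo.git", opener=opener)
client.request("info/refs")
print(opener.urls)
```

Anything with a compatible `open()` method can be passed in, so a caller could supply an opener built with authentication or proxy handlers without subclassing the client.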

+ 1 - 1
dulwich/config.py

@@ -359,7 +359,7 @@ class StackedConfig(Config):
         for path in paths:
             try:
                 cf = ConfigFile.from_path(path)
-            except (IOError, OSError), e:
+            except (IOError, OSError) as e:
                 if e.errno != errno.ENOENT:
                     raise
                 else:

+ 4 - 7
dulwich/diff_tree.py

@@ -18,18 +18,15 @@
 
 """Utilities for diffing files and trees."""
 
-try:
-    from collections import defaultdict
-except ImportError:
-    from dulwich._compat import defaultdict
+from collections import (
+    defaultdict,
+    namedtuple,
+    )
 
 from cStringIO import StringIO
 import itertools
 import stat
 
-from dulwich._compat import (
-    namedtuple,
-    )
 from dulwich.objects import (
     S_ISGITLINK,
     TreeEntry,

+ 7 - 7
dulwich/file.py

@@ -26,7 +26,7 @@ def ensure_dir_exists(dirname):
     """Ensure a directory exists, creating if necessary."""
     try:
         os.makedirs(dirname)
-    except OSError, e:
+    except OSError as e:
         if e.errno != errno.EEXIST:
             raise
 
@@ -36,7 +36,7 @@ def fancy_rename(oldname, newname):
     if not os.path.exists(newname):
         try:
             os.rename(oldname, newname)
-        except OSError, e:
+        except OSError as e:
             raise
         return
 
@@ -45,17 +45,17 @@ def fancy_rename(oldname, newname):
         (fd, tmpfile) = tempfile.mkstemp(".tmp", prefix=oldname+".", dir=".")
         os.close(fd)
         os.remove(tmpfile)
-    except OSError, e:
+    except OSError as e:
         # either file could not be created (e.g. permission problem)
         # or could not be deleted (e.g. rude virus scanner)
         raise
     try:
         os.rename(newname, tmpfile)
-    except OSError, e:
+    except OSError as e:
         raise   # no rename occurred
     try:
         os.rename(oldname, newname)
-    except OSError, e:
+    except OSError as e:
         os.rename(tmpfile, newname)
         raise
     os.remove(tmpfile)
@@ -123,7 +123,7 @@ class _GitFile(object):
         try:
             os.remove(self._lockfilename)
             self._closed = True
-        except OSError, e:
+        except OSError as e:
             # The file may have been removed already, which is ok.
             if e.errno != errno.ENOENT:
                 raise
@@ -146,7 +146,7 @@ class _GitFile(object):
         try:
             try:
                 os.rename(self._lockfilename, self._filename)
-            except OSError, e:
+            except OSError as e:
                 # Windows versions prior to Vista don't support atomic renames
                 if e.errno != errno.EEXIST:
                     raise
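
The `except OSError, e` to `except OSError as e` rewrites throughout this commit use the syntax introduced in Python 2.6, which became usable once 2.4 and 2.5 support was dropped; it is also the only form Python 3 accepts. A small runnable sketch, modelled on the `ensure_dir_exists` function shown in the dulwich/file.py hunk above (the temporary directory setup is added for the example):

```python
import errno
import os
import tempfile


def ensure_dir_exists(dirname):
    """Ensure a directory exists, creating it if necessary."""
    try:
        os.makedirs(dirname)
    except OSError as e:  # the old "except OSError, e" form fails on Python 3
        # Only "already exists" is ignored; other errors propagate.
        if e.errno != errno.EEXIST:
            raise


base = tempfile.mkdtemp()
target = os.path.join(base, "a", "b")
ensure_dir_exists(target)
ensure_dir_exists(target)  # second call hits EEXIST and is silently ignored
print(os.path.isdir(target))
```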

+ 1 - 1
dulwich/index.py

@@ -423,7 +423,7 @@ def build_index_from_tree(prefix, index_path, object_store, tree_id,
             src_path = object_store[entry.sha].as_raw_string()
             try:
                 os.symlink(src_path, full_path)
-            except OSError, e:
+            except OSError as e:
                 if e.errno == errno.EEXIST:
                     os.unlink(full_path)
                     os.symlink(src_path, full_path)

+ 29 - 16
dulwich/object_store.py

@@ -162,7 +162,8 @@ class BaseObjectStore(object):
                 yield entry
 
     def find_missing_objects(self, haves, wants, progress=None,
-                             get_tagged=None):
+                             get_tagged=None,
+                             get_parents=lambda commit: commit.parents):
         """Find the missing objects required for a set of revisions.
 
         :param haves: Iterable over SHAs already in common.
@@ -171,9 +172,10 @@ class BaseObjectStore(object):
             updated progress strings.
         :param get_tagged: Function that returns a dict of pointed-to sha -> tag
             sha for including tags.
+        :param get_parents: Optional function for getting the parents of a commit.
         :return: Iterator over (sha, path) pairs.
         """
-        finder = MissingObjectFinder(self, haves, wants, progress, get_tagged)
+        finder = MissingObjectFinder(self, haves, wants, progress, get_tagged, get_parents=get_parents)
         return iter(finder.next, None)
 
     def find_common_revisions(self, graphwalker):
@@ -215,12 +217,14 @@ class BaseObjectStore(object):
             obj = self[sha]
         return obj
 
-    def _collect_ancestors(self, heads, common=set()):
+    def _collect_ancestors(self, heads, common=set(),
+                           get_parents=lambda commit: commit.parents):
         """Collect all ancestors of heads up to (excluding) those in common.
 
         :param heads: commits to start from
         :param common: commits to end at, or empty set to walk repository
             completely
+        :param get_parents: Optional function for getting the parents of a commit.
         :return: a tuple (A, B) where A - all commits reachable
             from heads but not present in common, B - common (shared) elements
             that are directly reachable from heads
@@ -236,7 +240,7 @@ class BaseObjectStore(object):
             elif e not in commits:
                 commits.add(e)
                 cmt = self[e]
-                queue.extend(cmt.parents)
+                queue.extend(get_parents(cmt))
         return (commits, bases)
 
     def close(self):
@@ -408,6 +412,9 @@ class DiskObjectStore(PackBasedObjectStore):
         self._pack_cache_time = 0
         self._alternates = None
 
+    def __repr__(self):
+        return "<%s(%r)>" % (self.__class__.__name__, self.path)
+
     @property
     def alternates(self):
         if self._alternates is not None:
@@ -421,7 +428,7 @@ class DiskObjectStore(PackBasedObjectStore):
         try:
             f = GitFile(os.path.join(self.path, "info", "alternates"),
                     'rb')
-        except (OSError, IOError), e:
+        except (OSError, IOError) as e:
             if e.errno == errno.ENOENT:
                 return []
             raise
@@ -444,7 +451,7 @@ class DiskObjectStore(PackBasedObjectStore):
         """
         try:
             os.mkdir(os.path.join(self.path, "info"))
-        except OSError, e:
+        except OSError as e:
             if e.errno != errno.EEXIST:
                 raise
         alternates_path = os.path.join(self.path, "info/alternates")
@@ -452,7 +459,7 @@ class DiskObjectStore(PackBasedObjectStore):
         try:
             try:
                 orig_f = open(alternates_path, 'rb')
-            except (OSError, IOError), e:
+            except (OSError, IOError) as e:
                 if e.errno != errno.ENOENT:
                     raise
             else:
@@ -478,7 +485,7 @@ class DiskObjectStore(PackBasedObjectStore):
                 if name.startswith("pack-") and name.endswith(".pack"):
                     filename = os.path.join(self.pack_dir, name)
                     pack_files.append((os.stat(filename).st_mtime, filename))
-        except OSError, e:
+        except OSError as e:
             if e.errno == errno.ENOENT:
                 return []
             raise
@@ -497,7 +504,7 @@ class DiskObjectStore(PackBasedObjectStore):
     def _pack_cache_stale(self):
         try:
             return os.stat(self.pack_dir).st_mtime > self._pack_cache_time
-        except OSError, e:
+        except OSError as e:
             if e.errno == errno.ENOENT:
                 return True
             raise
@@ -517,7 +524,7 @@ class DiskObjectStore(PackBasedObjectStore):
         path = self._get_shafile_path(sha)
         try:
             return ShaFile.from_path(path)
-        except (OSError, IOError), e:
+        except (OSError, IOError) as e:
             if e.errno == errno.ENOENT:
                 return None
             raise
@@ -663,7 +670,7 @@ class DiskObjectStore(PackBasedObjectStore):
         dir = os.path.join(self.path, obj.id[:2])
         try:
             os.mkdir(dir)
-        except OSError, e:
+        except OSError as e:
             if e.errno != errno.EEXIST:
                 raise
         path = os.path.join(dir, obj.id[2:])
@@ -679,7 +686,7 @@ class DiskObjectStore(PackBasedObjectStore):
     def init(cls, path):
         try:
             os.mkdir(path)
-        except OSError, e:
+        except OSError as e:
             if e.errno != errno.EEXIST:
                 raise
         os.mkdir(os.path.join(path, "info"))
@@ -970,12 +977,14 @@ class MissingObjectFinder(object):
     :param progress: Optional function to report progress to.
     :param get_tagged: Function that returns a dict of pointed-to sha -> tag
         sha for including tags.
+    :param get_parents: Optional function for getting the parents of a commit.
     :param tagged: dict of pointed-to sha -> tag sha for including tags
     """
 
     def __init__(self, object_store, haves, wants, progress=None,
-                 get_tagged=None):
+            get_tagged=None, get_parents=lambda commit: commit.parents):
         self.object_store = object_store
+        self._get_parents = get_parents
         # process Commits and Tags differently
         # Note, while haves may list commits/tags not available locally,
         # and such SHAs would get filtered out by _split_commits_and_tags,
@@ -987,12 +996,16 @@ class MissingObjectFinder(object):
                 _split_commits_and_tags(object_store, wants, False)
         # all_ancestors is a set of commits that shall not be sent
         # (complete repository up to 'haves')
-        all_ancestors = object_store._collect_ancestors(have_commits)[0]
+        all_ancestors = object_store._collect_ancestors(
+                have_commits,
+                get_parents=self._get_parents)[0]
         # all_missing - complete set of commits between haves and wants
         # common - commits from all_ancestors we hit into while
         # traversing parent hierarchy of wants
-        missing_commits, common_commits = \
-            object_store._collect_ancestors(want_commits, all_ancestors)
+        missing_commits, common_commits = object_store._collect_ancestors(
+            want_commits,
+            all_ancestors,
+            get_parents=self._get_parents)
         self.sha_done = set()
         # Now, fill sha_done with commits and revisions of
         # files and directories known to be both locally
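
The `get_parents` hook threaded through `find_missing_objects`, `_collect_ancestors` and `MissingObjectFinder` above lets a caller change how the commit graph is walked. The following is a minimal, self-contained sketch of that idea, not the Dulwich implementation: `ToyCommit`, `collect_ancestors`, `shallow_parents` and the dict-based `store` are hypothetical stand-ins, and the walk only mirrors the shape of the loop shown in the diff.

```python
from collections import deque, namedtuple

# Hypothetical stand-in for a commit object; only the two fields used here.
ToyCommit = namedtuple("ToyCommit", ["id", "parents"])


def collect_ancestors(store, heads, common=frozenset(),
                      get_parents=lambda commit: commit.parents):
    """Walk from heads towards the roots, stopping at anything in common."""
    commits, bases = set(), set()
    queue = deque(heads)
    while queue:
        sha = queue.popleft()
        if sha in common:
            bases.add(sha)
        elif sha not in commits:
            commits.add(sha)
            # The hook decides which parents are followed.
            queue.extend(get_parents(store[sha]))
    return commits, bases


# Linear toy history: c3 -> c2 -> c1
store = {
    "c1": ToyCommit("c1", []),
    "c2": ToyCommit("c2", ["c1"]),
    "c3": ToyCommit("c3", ["c2"]),
}

shallows = {"c2"}


def shallow_parents(commit):
    # Treat a shallow commit as if it had no parents, cutting the walk there.
    return [] if commit.id in shallows else commit.parents


full, _ = collect_ancestors(store, ["c3"])
cut, _ = collect_ancestors(store, ["c3"], get_parents=shallow_parents)
stopped, bases = collect_ancestors(store, ["c3"], common={"c1"})
```

With the default hook the walk reaches every commit; with `shallow_parents` it stops at the shallow boundary, which is the behaviour the optional parameter appears intended to allow.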

+ 10 - 28
dulwich/objects.py

@@ -23,6 +23,7 @@ import binascii
 from cStringIO import (
     StringIO,
     )
+from collections import namedtuple
 import os
 import posixpath
 import stat
@@ -38,10 +39,7 @@ from dulwich.errors import (
     ObjectFormatException,
     )
 from dulwich.file import GitFile
-from dulwich._compat import (
-    make_sha,
-    namedtuple,
-    )
+from hashlib import sha1
 
 ZERO_SHA = "0" * 40
 
@@ -90,10 +88,10 @@ def hex_to_sha(hex):
     assert len(hex) == 40, "Incorrect length of hexsha: %s" % hex
     try:
         return binascii.unhexlify(hex)
-    except TypeError, exc:
+    except TypeError as exc:
         if not isinstance(hex, str):
             raise
-        raise ValueError(exc.message)
+        raise ValueError(exc.args[0])
 
 
 def hex_to_filename(path, hex):
@@ -402,7 +400,7 @@ class ShaFile(object):
             obj._needs_serialization = True
             obj._file = f
             return obj
-        except (IndexError, ValueError), e:
+        except (IndexError, ValueError) as e:
             raise ObjectFormatException("invalid object header")
 
     @staticmethod
@@ -463,7 +461,7 @@ class ShaFile(object):
             self._deserialize(self.as_raw_chunks())
             self._sha = None
             new_sha = self.id
-        except Exception, e:
+        except Exception as e:
             raise ObjectFormatException(e)
         if old_sha != new_sha:
             raise ChecksumMismatch(new_sha, old_sha)
@@ -479,7 +477,7 @@ class ShaFile(object):
         return ret
 
     def _make_sha(self):
-        ret = make_sha()
+        ret = sha1()
         ret.update(self._header())
         for chunk in self.as_raw_chunks():
             ret.update(chunk)
@@ -489,7 +487,7 @@ class ShaFile(object):
         """The SHA1 object that is the name of this object."""
         if self._sha is None or self._needs_serialization:
             # this is a local because as_raw_chunks() overwrites self._sha
-            new_sha = make_sha()
+            new_sha = sha1()
             new_sha.update(self._header())
             for chunk in self.as_raw_chunks():
                 new_sha.update(chunk)
@@ -705,7 +703,7 @@ class Tag(ShaFile):
                         self._tag_time = int(timetext)
                         self._tag_timezone, self._tag_timezone_neg_utc = \
                                 parse_timezone(timezonetext)
-                    except ValueError, e:
+                    except ValueError as e:
                         raise ObjectFormatException(e)
             elif field is None:
                 self._message = value
@@ -889,22 +887,6 @@ class Tree(ShaFile):
         self._entries[name] = mode, hexsha
         self._needs_serialization = True
 
-    def entries(self):
-        """Return a list of tuples describing the tree entries.
-
-        :note: The order of the tuples that are returned is different from that
-            returned by the items and iteritems methods. This function will be
-            deprecated in the future.
-        """
-        warnings.warn("Tree.entries() is deprecated. Use Tree.items() or"
-            " Tree.iteritems() instead.", category=DeprecationWarning,
-            stacklevel=2)
-        self._ensure_parsed()
-        # The order of this is different from iteritems() for historical
-        # reasons
-        return [
-            (mode, name, hexsha) for (name, mode, hexsha) in self.iteritems()]
-
     def iteritems(self, name_order=False):
         """Iterate over entries.
 
@@ -925,7 +907,7 @@ class Tree(ShaFile):
         """Grab the entries in the tree"""
         try:
             parsed_entries = parse_tree("".join(chunks))
-        except ValueError, e:
+        except ValueError as e:
             raise ObjectFormatException(e)
         # TODO: list comprehension is for efficiency in the common (small) case;
         # if memory efficiency in the large case is a concern, use a genexp.

+ 42 - 0
dulwich/objectspec.py

@@ -0,0 +1,42 @@
+# objectspec.py -- Object specification
+# Copyright (C) 2014 Jelmer Vernooij <jelmer@samba.org>
+#
+# This program is free software; you can redistribute it and/or
+# modify it under the terms of the GNU General Public License
+# as published by the Free Software Foundation; version 2
+# of the License or (at your option) a later version of the License.
+#
+# This program is distributed in the hope that it will be useful,
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+# GNU General Public License for more details.
+#
+# You should have received a copy of the GNU General Public License
+# along with this program; if not, write to the Free Software
+# Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston,
+# MA  02110-1301, USA.
+
+"""Object specification."""
+
+
+def parse_object(repo, objectish):
+    """Parse a string referring to an object.
+
+    :param repo: A `Repo` object
+    :param objectish: A string referring to an object
+    :return: A git object
+    :raise KeyError: If the object can not be found
+    """
+    return repo[objectish]
+
+
+def parse_commit_range(repo, committishs):
+    """Parse a string referring to a range of commits.
+
+    :param repo: A `Repo` object
+    :param committishs: A string referring to a range of commits.
+    :return: An iterator over `Commit` objects
+    :raise KeyError: When the reference commits can not be found
+    :raise ValueError: If the range can not be parsed
+    """
+    return iter([repo[committishs]])

+ 20 - 22
dulwich/pack.py

@@ -30,10 +30,7 @@ match for the object name. You then use the pointer got from this as
 a pointer in to the corresponding packfile.
 """
 
-try:
-    from collections import defaultdict
-except ImportError:
-    from dulwich._compat import defaultdict
+from collections import defaultdict
 
 import binascii
 from cStringIO import (
@@ -54,12 +51,14 @@ except ImportError:
     has_mmap = False
 else:
     has_mmap = True
+from hashlib import sha1
 import os
+from os import (
+    SEEK_CUR,
+    SEEK_END,
+    )
 import struct
-try:
-    from struct import unpack_from
-except ImportError:
-    from dulwich._compat import unpack_from
+from struct import unpack_from
 import sys
 import warnings
 import zlib
@@ -72,11 +71,6 @@ from dulwich.file import GitFile
 from dulwich.lru_cache import (
     LRUSizeCache,
     )
-from dulwich._compat import (
-    make_sha,
-    SEEK_CUR,
-    SEEK_END,
-    )
 from dulwich.objects import (
     ShaFile,
     hex_to_sha,
@@ -257,10 +251,10 @@ def iter_sha1(iter):
     :param iter: Iterator over string objects
     :return: 40-byte hex sha1 digest
     """
-    sha1 = make_sha()
+    sha = sha1()
     for name in iter:
-        sha1.update(name)
-    return sha1.hexdigest()
+        sha.update(name)
+    return sha.hexdigest()
 
 
 def load_pack_index(path):
@@ -540,7 +534,7 @@ class FilePackIndex(PackIndex):
 
         :return: This is a 20-byte binary digest
         """
-        return make_sha(self._contents[:-20]).digest()
+        return sha1(self._contents[:-20]).digest()
 
     def get_pack_checksum(self):
         """Return the SHA1 checksum stored for the corresponding packfile.
@@ -745,7 +739,7 @@ class PackStreamReader(object):
             self.read_some = read_all
         else:
             self.read_some = read_some
-        self.sha = make_sha()
+        self.sha = sha1()
         self._offset = 0
         self._rbuf = StringIO()
         # trailer is a deque to avoid memory allocation on small reads
@@ -910,7 +904,7 @@ class PackStreamCopier(PackStreamReader):
 
 def obj_sha(type, chunks):
     """Compute the SHA for a numeric type and object chunks."""
-    sha = make_sha()
+    sha = sha1()
     sha.update(object_header(type, chunks_length(chunks)))
     for chunk in chunks:
         sha.update(chunk)
@@ -927,7 +921,7 @@ def compute_file_sha(f, start_ofs=0, end_ofs=0, buffer_size=1<<16):
     :param buffer_size: A buffer size for reading.
     :return: A new SHA object updated with data read from the file.
     """
-    sha = make_sha()
+    sha = sha1()
     f.seek(0, SEEK_END)
     todo = f.tell() + end_ofs - start_ofs
     f.seek(start_ofs)
@@ -986,6 +980,10 @@ class PackData(object):
             compute_size=_compute_object_size)
         self.pack = None
 
+    @property
+    def filename(self):
+        return os.path.basename(self._filename)
+
     @classmethod
     def from_file(cls, file, size):
         return cls(str(file), file=file, size=size)
@@ -1341,7 +1339,7 @@ class SHA1Reader(object):
 
     def __init__(self, f):
         self.f = f
-        self.sha1 = make_sha('')
+        self.sha1 = sha1('')
 
     def read(self, num=None):
         data = self.f.read(num)
@@ -1366,7 +1364,7 @@ class SHA1Writer(object):
     def __init__(self, f):
         self.f = f
         self.length = 0
-        self.sha1 = make_sha('')
+        self.sha1 = sha1('')
 
     def write(self, data):
         self.sha1.update(data)

+ 1 - 1
dulwich/patch.py

@@ -52,7 +52,7 @@ def write_commit_patch(f, commit, contents, progress, version=None):
         import subprocess
         p = subprocess.Popen(["diffstat"], stdout=subprocess.PIPE,
                              stdin=subprocess.PIPE)
-    except (ImportError, OSError), e:
+    except (ImportError, OSError):
         pass # diffstat not available?
     else:
         (diffstat, _) = p.communicate(contents)

+ 128 - 10
dulwich/porcelain.py

@@ -18,9 +18,16 @@
 
 import os
 import sys
+import time
 
 from dulwich import index
 from dulwich.client import get_transport_and_path
+from dulwich.objects import (
+    Commit,
+    Tag,
+    parse_timezone,
+    )
+from dulwich.objectspec import parse_object
 from dulwich.patch import write_tree_diff
 from dulwich.repo import (BaseRepo, Repo)
 from dulwich.server import update_server_info as server_update_server_info
@@ -36,7 +43,9 @@ Currently implemented:
  * diff-tree
  * init
  * remove
+ * reset
  * rev-list
+ * tag
  * update-server-info
  * symbolic-ref
 
@@ -215,32 +224,99 @@ def print_commit(commit, outstream):
     outstream.write("\n")
 
 
-def log(repo=".", outstream=sys.stdout):
+def print_tag(tag, outstream):
+    """Write a human-readable tag.
+
+    :param tag: A `Tag` object
+    :param outstream: A stream to write to
+    """
+    outstream.write("Tagger: %s\n" % tag.tagger)
+    outstream.write("Date:   %s\n" % tag.tag_time)
+    outstream.write("\n")
+    outstream.write("%s\n" % tag.message)
+    outstream.write("\n")
+
+
+def show_blob(repo, blob, outstream):
+    """Write a blob to a stream.
+
+    :param repo: A `Repo` object
+    :param blob: A `Blob` object
+    :param outstream: A stream file to write to
+    """
+    outstream.write(blob.data)
+
+
+def show_commit(repo, commit, outstream):
+    """Show a commit to a stream.
+
+    :param repo: A `Repo` object
+    :param commit: A `Commit` object
+    :param outstream: Stream to write to
+    """
+    print_commit(commit, outstream)
+    parent_commit = repo[commit.parents[0]]
+    write_tree_diff(outstream, repo.object_store, parent_commit.tree, commit.tree)
+
+
+def show_tree(repo, tree, outstream):
+    """Print a tree to a stream.
+
+    :param repo: A `Repo` object
+    :param tree: A `Tree` object
+    :param outstream: Stream to write to
+    """
+    for n in tree:
+        outstream.write("%s\n" % n)
+
+
+def show_tag(repo, tag, outstream):
+    """Print a tag to a stream.
+
+    :param repo: A `Repo` object
+    :param tag: A `Tag` object
+    :param outstream: Stream to write to
+    """
+    print_tag(tag, outstream)
+    show_object(repo, repo[tag.object[1]], outstream)
+
+
+def show_object(repo, obj, outstream):
+    return {
+        "tree": show_tree,
+        "blob": show_blob,
+        "commit": show_commit,
+        "tag": show_tag,
+            }[obj.type_name](repo, obj, outstream)
+
+
+def log(repo=".", outstream=sys.stdout, max_entries=None):
     """Write commit logs.
 
     :param repo: Path to repository
     :param outstream: Stream to write log output to
+    :param max_entries: Optional maximum number of entries to display
     """
     r = open_repo(repo)
-    walker = r.get_walker()
+    walker = r.get_walker(max_entries=max_entries)
     for entry in walker:
         print_commit(entry.commit, outstream)
 
 
-def show(repo=".", committish=None, outstream=sys.stdout):
+def show(repo=".", objects=None, outstream=sys.stdout):
     """Print the changes in a commit.
 
     :param repo: Path to repository
-    :param committish: Commit to write
+    :param objects: Objects to show (defaults to [HEAD])
     :param outstream: Stream to write to
     """
-    if committish is None:
-        committish = "HEAD"
+    if objects is None:
+        objects = ["HEAD"]
+    if not isinstance(objects, list):
+        objects = [objects]
     r = open_repo(repo)
-    commit = r[committish]
-    parent_commit = r[commit.parents[0]]
-    print_commit(commit, outstream)
-    write_tree_diff(outstream, r.object_store, parent_commit.tree, commit.tree)
+    for objectish in objects:
+        show_object(r, parse_object(r, objectish), outstream)
 
 
 def diff_tree(repo, old_tree, new_tree, outstream=sys.stdout):
@@ -265,3 +341,45 @@ def rev_list(repo, commits, outstream=sys.stdout):
     r = open_repo(repo)
     for entry in r.get_walker(include=[r[c].id for c in commits]):
         outstream.write("%s\n" % entry.commit.id)
+
+
+def tag(repo, tag, author, message):
+    """Creates a tag in git via dulwich calls:
+
+    :param repo: Path to repository
+    :param tag: tag string
+    :param author: tag author
+    :param message: tag message
+    """
+
+    r = open_repo(repo)
+
+    # Create the tag object
+    tag_obj = Tag()
+    tag_obj.tagger = author
+    tag_obj.message = message
+    tag_obj.name = tag
+    tag_obj.object = (Commit, r.refs['HEAD'])
+    tag_obj.tag_time = int(time.time())
+    tag_obj.tag_timezone = parse_timezone('-0200')[0]
+
+    # Add tag to the object store
+    r.object_store.add_object(tag_obj)
+    r.refs['refs/tags/' + tag] = tag_obj.id
+
+
+def reset(repo, mode, committish="HEAD"):
+    """Reset current HEAD to the specified state.
+
+    :param repo: Path to repository
+    :param mode: Mode ("hard", "soft", "mixed")
+    """
+
+    if mode != "hard":
+        raise ValueError("hard is the only mode currently supported")
+
+    r = open_repo(repo)
+
+    indexfile = r.index_path()
+    tree = r[committish].tree
+    index.build_index_from_tree(r.path, indexfile, r.object_store, tree)
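
The new `show_object` helper above picks a handler by looking up the object's `type_name` in a dict. The sketch below illustrates that dispatch pattern in isolation; `ToyBlob`, `ToyTree` and `HANDLERS` are hypothetical names used only for illustration, and the handlers drop the `repo` argument that the real functions take.

```python
import io


class ToyBlob(object):
    """Hypothetical blob-like object with only the attributes used here."""
    type_name = "blob"

    def __init__(self, data):
        self.data = data


class ToyTree(object):
    """Hypothetical tree-like object that iterates over entry names."""
    type_name = "tree"

    def __init__(self, names):
        self.names = names

    def __iter__(self):
        return iter(self.names)


def show_blob(obj, out):
    out.write(obj.data)


def show_tree(obj, out):
    for name in obj:
        out.write(u"%s\n" % name)


HANDLERS = {"blob": show_blob, "tree": show_tree}


def show_object(obj, out):
    # An unknown type_name raises KeyError, as with any dict lookup.
    return HANDLERS[obj.type_name](obj, out)


out = io.StringIO()
show_object(ToyTree([u"README", u"setup.py"]), out)
show_object(ToyBlob(u"hello\n"), out)
```

A dict keeps the mapping from type to handler in one place, so supporting another object type means adding one entry rather than another `if` branch.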

+ 5 - 5
dulwich/protocol.py

@@ -20,15 +20,15 @@
 """Generic functions for talking the git smart server protocol."""
 
 from cStringIO import StringIO
+from os import (
+    SEEK_END,
+    )
 import socket
 
 from dulwich.errors import (
     HangupException,
     GitProtocolError,
     )
-from dulwich._compat import (
-    SEEK_END,
-    )
 
 TCP_GIT_PORT = 9418
 
@@ -109,7 +109,7 @@ class Protocol(object):
             if self.report_activity:
                 self.report_activity(size, 'read')
             return read(size-4)
-        except socket.error, e:
+        except socket.error as e:
             raise GitProtocolError(e)
 
     def eof(self):
@@ -160,7 +160,7 @@ class Protocol(object):
             self.write(line)
             if self.report_activity:
                 self.report_activity(len(line), 'write')
-        except socket.error, e:
+        except socket.error as e:
             raise GitProtocolError(e)
 
     def write_file(self):

+ 2 - 2
dulwich/refs.py

@@ -442,7 +442,7 @@ class DiskRefsContainer(RefsContainer):
             path = os.path.join(self.path, 'packed-refs')
             try:
                 f = GitFile(path, 'rb')
-            except IOError, e:
+            except IOError as e:
                 if e.errno == errno.ENOENT:
                     return {}
                 raise
@@ -504,7 +504,7 @@ class DiskRefsContainer(RefsContainer):
                     return header + f.read(40 - len(SYMREF))
             finally:
                 f.close()
-        except IOError, e:
+        except IOError as e:
             if e.errno == errno.ENOENT:
                 return None
             raise

+ 32 - 78
dulwich/repo.py

@@ -248,14 +248,38 @@ class BaseRepo(object):
         wants = determine_wants(self.get_refs())
         if type(wants) is not list:
             raise TypeError("determine_wants() did not return a list")
+
+        shallows = getattr(graph_walker, 'shallow', frozenset())
+        unshallows = getattr(graph_walker, 'unshallow', frozenset())
+
         if wants == []:
             # TODO(dborowitz): find a way to short-circuit that doesn't change
             # this interface.
+
+            if shallows or unshallows:
+                # Do not send a pack in shallow short-circuit path
+                return None
+
             return []
+
         haves = self.object_store.find_common_revisions(graph_walker)
+
+        # Deal with shallow requests separately because the haves do
+        # not reflect what objects are missing
+        if shallows or unshallows:
+            haves = []  # TODO: filter the haves commits from iter_shas.
+                        # the specific commits aren't missing.
+
+        def get_parents(commit):
+            if commit.id in shallows:
+                return []
+            return self.get_parents(commit.id, commit)
+
         return self.object_store.iter_shas(
-          self.object_store.find_missing_objects(haves, wants, progress,
-                                                 get_tagged))
+          self.object_store.find_missing_objects(
+              haves, wants, progress,
+              get_tagged,
+              get_parents=get_parents))
 
     def get_graph_walker(self, heads=None):
         """Retrieve a graph walker.
@@ -270,18 +294,6 @@ class BaseRepo(object):
             heads = self.refs.as_dict('refs/heads').values()
         return ObjectStoreGraphWalker(heads, self.get_parents)
 
-    def ref(self, name):
-        """Return the SHA1 a ref is pointing to.
-
-        :param name: Name of the ref to look up
-        :raise KeyError: when the ref (or the one it points to) does not exist
-        :return: SHA1 it is pointing at
-        """
-        warnings.warn(
-            "Repo.ref(name) is deprecated. Use Repo.refs[name] instead.",
-            category=DeprecationWarning, stacklevel=2)
-        return self.refs[name]
-
     def get_refs(self):
         """Get dictionary with all refs.
 
@@ -372,54 +384,6 @@ class BaseRepo(object):
         backends = [self.get_config()] + StackedConfig.default_backends()
         return StackedConfig(backends, writable=backends[0])
 
-    def commit(self, sha):
-        """Retrieve the commit with a particular SHA.
-
-        :param sha: SHA of the commit to retrieve
-        :raise NotCommitError: If the SHA provided doesn't point at a Commit
-        :raise KeyError: If the SHA provided didn't exist
-        :return: A `Commit` object
-        """
-        warnings.warn("Repo.commit(sha) is deprecated. Use Repo[sha] instead.",
-            category=DeprecationWarning, stacklevel=2)
-        return self._get_object(sha, Commit)
-
-    def tree(self, sha):
-        """Retrieve the tree with a particular SHA.
-
-        :param sha: SHA of the tree to retrieve
-        :raise NotTreeError: If the SHA provided doesn't point at a Tree
-        :raise KeyError: If the SHA provided didn't exist
-        :return: A `Tree` object
-        """
-        warnings.warn("Repo.tree(sha) is deprecated. Use Repo[sha] instead.",
-            category=DeprecationWarning, stacklevel=2)
-        return self._get_object(sha, Tree)
-
-    def tag(self, sha):
-        """Retrieve the tag with a particular SHA.
-
-        :param sha: SHA of the tag to retrieve
-        :raise NotTagError: If the SHA provided doesn't point at a Tag
-        :raise KeyError: If the SHA provided didn't exist
-        :return: A `Tag` object
-        """
-        warnings.warn("Repo.tag(sha) is deprecated. Use Repo[sha] instead.",
-            category=DeprecationWarning, stacklevel=2)
-        return self._get_object(sha, Tag)
-
-    def get_blob(self, sha):
-        """Retrieve the blob with a particular SHA.
-
-        :param sha: SHA of the blob to retrieve
-        :raise NotBlobError: If the SHA provided doesn't point at a Blob
-        :raise KeyError: If the SHA provided didn't exist
-        :return: A `Blob` object
-        """
-        warnings.warn("Repo.get_blob(sha) is deprecated. Use Repo[sha] "
-            "instead.", category=DeprecationWarning, stacklevel=2)
-        return self._get_object(sha, Blob)
-
     def get_peeled(self, ref):
         """Get the peeled value of a ref.
 
@@ -468,20 +432,6 @@ class BaseRepo(object):
 
         return Walker(self.object_store, include, *args, **kwargs)
 
-    def revision_history(self, head):
-        """Returns a list of the commits reachable from head.
-
-        :param head: The SHA of the head to list revision history for.
-        :return: A list of commit objects reachable from head, starting with
-            head itself, in descending commit time order.
-        :raise MissingCommitError: if any missing commits are referenced,
-            including if the head parameter isn't the SHA of a commit.
-        """
-        warnings.warn("Repo.revision_history() is deprecated."
-            "Use dulwich.walker.Walker(repo) instead.",
-            category=DeprecationWarning, stacklevel=2)
-        return [e.commit for e in self.get_walker(include=[head])]
-
     def __getitem__(self, name):
         """Retrieve a Git object by SHA1 or ref.
 
@@ -489,7 +439,7 @@ class BaseRepo(object):
         :return: A `ShaFile` object, such as a Commit or Blob
         :raise KeyError: when the specified ref or object does not exist
         """
-        if len(name) in (20, 40):
+        if len(name) in (20, 40) and isinstance(name, str):
             try:
                 return self.object_store[name]
             except (KeyError, ValueError):
@@ -706,9 +656,13 @@ class Repo(BaseRepo):
         refs = DiskRefsContainer(self.controldir())
         BaseRepo.__init__(self, object_store, refs)
 
+        self._graftpoints = {}
         graft_file = self.get_named_file(os.path.join("info", "grafts"))
         if graft_file:
-            self._graftpoints = parse_graftpoints(graft_file)
+            self._graftpoints.update(parse_graftpoints(graft_file))
+        graft_file = self.get_named_file("shallow")
+        if graft_file:
+            self._graftpoints.update(parse_graftpoints(graft_file))
 
         self.hooks['pre-commit'] = PreCommitShellHook(self.controldir())
         self.hooks['commit-msg'] = CommitMsgShellHook(self.controldir())

+ 91 - 10
dulwich/server.py

@@ -59,6 +59,7 @@ from dulwich.errors import (
 from dulwich import log_utils
 from dulwich.objects import (
     hex_to_sha,
+    Commit,
     )
 from dulwich.pack import (
     write_pack_objects,
@@ -226,7 +227,7 @@ class UploadPackHandler(Handler):
     @classmethod
     def capabilities(cls):
         return ("multi_ack_detailed", "multi_ack", "side-band-64k", "thin-pack",
-                "ofs-delta", "no-progress", "include-tag")
+                "ofs-delta", "no-progress", "include-tag", "shallow")
 
     @classmethod
     def required_capabilities(cls):
@@ -277,7 +278,7 @@ class UploadPackHandler(Handler):
 
         # Did the process short-circuit (e.g. in a stateless RPC call)? Note
         # that the client still expects a 0-object pack in most cases.
-        if objects_iter is None:
+        if len(objects_iter) == 0:
             return
 
         self.progress("dul-daemon says what\n")
@@ -315,14 +316,55 @@ def _split_proto_line(line, allowed):
     try:
         if len(fields) == 1 and command in ('done', None):
             return (command, None)
-        elif len(fields) == 2 and command in ('want', 'have'):
-            hex_to_sha(fields[1])
-            return tuple(fields)
+        elif len(fields) == 2:
+            if command in ('want', 'have', 'shallow', 'unshallow'):
+                hex_to_sha(fields[1])
+                return tuple(fields)
+            elif command == 'deepen':
+                return command, int(fields[1])
     except (TypeError, AssertionError), e:
         raise GitProtocolError(e)
     raise GitProtocolError('Received invalid line from client: %s' % line)
 
 
+def _find_shallow(store, heads, depth):
+    """Find shallow commits according to a given depth.
+
+    :param store: An ObjectStore for looking up objects.
+    :param heads: Iterable of head SHAs to start walking from.
+    :param depth: The depth of ancestors to include.
+    :return: A tuple of (shallow, not_shallow), sets of SHAs that should be
+        considered shallow and unshallow according to the arguments. Note that
+        these sets may overlap if a commit is reachable along multiple paths.
+    """
+    parents = {}
+    def get_parents(sha):
+        result = parents.get(sha, None)
+        if not result:
+            result = store[sha].parents
+            parents[sha] = result
+        return result
+
+    todo = []  # stack of (sha, depth)
+    for head_sha in heads:
+        obj = store.peel_sha(head_sha)
+        if isinstance(obj, Commit):
+            todo.append((obj.id, 0))
+
+    not_shallow = set()
+    shallow = set()
+    while todo:
+        sha, cur_depth = todo.pop()
+        if cur_depth < depth:
+            not_shallow.add(sha)
+            new_depth = cur_depth + 1
+            todo.extend((p, new_depth) for p in get_parents(sha))
+        else:
+            shallow.add(sha)
+
+    return shallow, not_shallow
+
+
 class ProtocolGraphWalker(object):
     """A graph walker that knows the git protocol.
 
@@ -344,6 +386,9 @@ class ProtocolGraphWalker(object):
         self.http_req = handler.http_req
         self.advertise_refs = handler.advertise_refs
         self._wants = []
+        self.shallow = set()
+        self.client_shallow = set()
+        self.unshallow = set()
         self._cached = False
         self._cache = []
         self._cache_index = 0
@@ -357,6 +402,12 @@ class ProtocolGraphWalker(object):
         same regardless of ack type, and in fact is used to set the ack type of
         the ProtocolGraphWalker.
 
+        If the client has the 'shallow' capability, this method also reads and
+        responds to the 'shallow' and 'deepen' lines from the client. These are
+        not part of the wants per se, but they set up necessary state for
+        walking the graph. Additionally, later code depends on this method
+        consuming everything up to the first 'have' line.
+
         :param heads: a dict of refname->SHA1 to advertise
         :return: a list of SHA1s requested by the client
         """
@@ -380,7 +431,7 @@ class ProtocolGraphWalker(object):
             self.proto.write_pkt_line(None)
 
             if self.advertise_refs:
-                return None
+                return []
 
         # Now the client will send want want want commands
         want = self.proto.read_pkt_line()
@@ -389,11 +440,11 @@ class ProtocolGraphWalker(object):
         line, caps = extract_want_line_capabilities(want)
         self.handler.set_client_capabilities(caps)
         self.set_ack_type(ack_type(caps))
-        allowed = ('want', None)
+        allowed = ('want', 'shallow', 'deepen', None)
         command, sha = _split_proto_line(line, allowed)
 
         want_revs = []
-        while command != None:
+        while command == 'want':
             if sha not in values:
                 raise GitProtocolError(
                   'Client wants invalid object %s' % sha)
@@ -401,6 +452,9 @@ class ProtocolGraphWalker(object):
             command, sha = self.read_proto_line(allowed)
 
         self.set_wants(want_revs)
+        if command in ('shallow', 'deepen'):
+            self.unread_proto_line(command, sha)
+            self._handle_shallow_request(want_revs)
 
         if self.http_req and self.proto.eof():
             # The client may close the socket at this point, expecting a
@@ -410,6 +464,9 @@ class ProtocolGraphWalker(object):
 
         return want_revs
 
+    def unread_proto_line(self, command, value):
+        self.proto.unread_pkt_line('%s %s' % (command, value))
+
     def ack(self, have_ref):
         return self._impl.ack(have_ref)
 
@@ -432,10 +489,34 @@ class ProtocolGraphWalker(object):
 
         :param allowed: An iterable of command names that should be allowed.
         :return: A tuple of (command, value); see _split_proto_line.
-        :raise GitProtocolError: If an error occurred reading the line.
+        :raise UnexpectedCommandError: If an error occurred reading the line.
         """
         return _split_proto_line(self.proto.read_pkt_line(), allowed)
 
+    def _handle_shallow_request(self, wants):
+        while True:
+            command, val = self.read_proto_line(('deepen', 'shallow'))
+            if command == 'deepen':
+                depth = val
+                break
+            self.client_shallow.add(val)
+        self.read_proto_line((None,))  # consume client's flush-pkt
+
+        shallow, not_shallow = _find_shallow(self.store, wants, depth)
+
+        # Update self.shallow instead of reassigning it since we passed a
+        # reference to it before this method was called.
+        self.shallow.update(shallow - not_shallow)
+        new_shallow = self.shallow - self.client_shallow
+        unshallow = self.unshallow = not_shallow & self.client_shallow
+
+        for sha in sorted(new_shallow):
+            self.proto.write_pkt_line('shallow %s' % sha)
+        for sha in sorted(unshallow):
+            self.proto.write_pkt_line('unshallow %s' % sha)
+
+        self.proto.write_pkt_line(None)
+
     def send_ack(self, sha, ack_type=''):
         if ack_type:
             ack_type = ' %s' % ack_type
@@ -838,7 +919,7 @@ def generate_info_refs(repo):
 def generate_objects_info_packs(repo):
     """Generate an index for packs."""
     for pack in repo.object_store.packs:
-        yield 'P pack-%s.pack\n' % pack.name()
+        yield 'P %s\n' % pack.data.filename
 
 
 def update_server_info(repo):
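
The depth-limited walk in `_find_shallow` and the set arithmetic in `_handle_shallow_request` can be sketched without a repository. This is a minimal illustration only, not the dulwich implementation: `find_shallow`, the `parents_of` dict (a stand-in for the object store) and the single-letter commit ids are hypothetical names invented for the example.

```python
def find_shallow(parents_of, heads, depth):
    """Depth-limited walk over a commit graph.

    parents_of is a hypothetical stand-in for the object store: a dict
    mapping each commit id to the list of its parent ids.
    """
    todo = [(head, 0) for head in heads]  # stack of (commit id, depth)
    shallow = set()
    not_shallow = set()
    while todo:
        sha, cur_depth = todo.pop()
        if cur_depth < depth:
            # Still within the requested depth: keep walking its parents.
            not_shallow.add(sha)
            todo.extend((p, cur_depth + 1) for p in parents_of[sha])
        else:
            # Reached the depth limit along this path.
            shallow.add(sha)
    return shallow, not_shallow


# m is a merge of a and b; b also has a as its parent; r is the root.
graph = {'m': ['a', 'b'], 'b': ['a'], 'a': ['r'], 'r': []}
shallow, not_shallow = find_shallow(graph, ['m'], 2)

# Same set arithmetic as the request handler, assuming the client
# already reported 'a' as one of its shallow commits.
client_shallow = set(['a'])
server_shallow = shallow - not_shallow
new_shallow = server_shallow - client_shallow
unshallow = not_shallow & client_shallow
```

Commit `a` is reached at depth 1 through `m` and at depth 2 through `b`, so it lands in both sets. Subtracting `not_shallow` leaves only `r` as a real boundary, and `a` is reported back as unshallow because the client previously held it as shallow.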

+ 3 - 0
dulwich/tests/__init__.py

@@ -122,12 +122,14 @@ def self_test_suite():
         'index',
         'lru_cache',
         'objects',
+        'objectspec',
         'object_store',
         'missing_obj_finder',
         'pack',
         'patch',
         'porcelain',
         'protocol',
+        'refs',
         'repository',
         'server',
         'walk',
@@ -141,6 +143,7 @@ def self_test_suite():
 def tutorial_test_suite():
     tutorial = [
         'introduction',
+        'file-format',
         'repo',
         'object-store',
         'remote',

+ 101 - 0
dulwich/tests/compat/server_utils.py

@@ -28,6 +28,7 @@ import tempfile
 import threading
 
 from dulwich.repo import Repo
+from dulwich.objects import hex_to_sha
 from dulwich.server import (
     ReceivePackHandler,
     )
@@ -40,6 +41,32 @@ from dulwich.tests.compat.utils import (
     )
 
 
+class _StubRepo(object):
+    """A stub repo that just contains a path to tear down."""
+
+    def __init__(self, name):
+        temp_dir = tempfile.mkdtemp()
+        self.path = os.path.join(temp_dir, name)
+        os.mkdir(self.path)
+
+
+def _get_shallow(repo):
+    shallow_file = repo.get_named_file('shallow')
+    if not shallow_file:
+        return []
+    shallows = []
+    try:
+        for line in shallow_file:
+            sha = line.strip()
+            if not sha:
+                continue
+            hex_to_sha(sha)
+            shallows.append(sha)
+    finally:
+        shallow_file.close()
+    return shallows
+
+
 class ServerTests(object):
     """Base tests for testing servers.
 
@@ -71,7 +98,9 @@ class ServerTests(object):
 
     def test_push_to_dulwich_no_op(self):
         self._old_repo = import_repo('server_old.export')
+        self.addCleanup(tear_down_repo, self._old_repo)
         self._new_repo = import_repo('server_old.export')
+        self.addCleanup(tear_down_repo, self._new_repo)
         self.assertReposEqual(self._old_repo, self._new_repo)
         port = self._start_server(self._old_repo)
 
@@ -81,7 +110,9 @@ class ServerTests(object):
 
     def test_push_to_dulwich_remove_branch(self):
         self._old_repo = import_repo('server_old.export')
+        self.addCleanup(tear_down_repo, self._old_repo)
         self._new_repo = import_repo('server_old.export')
+        self.addCleanup(tear_down_repo, self._new_repo)
         self.assertReposEqual(self._old_repo, self._new_repo)
         port = self._start_server(self._old_repo)
 
@@ -104,7 +135,9 @@ class ServerTests(object):
 
     def test_fetch_from_dulwich_no_op(self):
         self._old_repo = import_repo('server_old.export')
+        self.addCleanup(tear_down_repo, self._old_repo)
         self._new_repo = import_repo('server_old.export')
+        self.addCleanup(tear_down_repo, self._new_repo)
         self.assertReposEqual(self._old_repo, self._new_repo)
         port = self._start_server(self._new_repo)
 
@@ -118,6 +151,7 @@ class ServerTests(object):
         old_repo_dir = os.path.join(tempfile.mkdtemp(), 'empty_old')
         run_git_or_fail(['init', '--quiet', '--bare', old_repo_dir])
         self._old_repo = Repo(old_repo_dir)
+        self.addCleanup(tear_down_repo, self._old_repo)
         port = self._start_server(self._old_repo)
 
         new_repo_base_dir = tempfile.mkdtemp()
@@ -132,6 +166,73 @@ class ServerTests(object):
             # may have occurred, so don't depend on tearDown to clean it up.
             shutil.rmtree(new_repo_base_dir)
 
+    def test_lsremote_from_dulwich(self):
+        self._repo = import_repo('server_old.export')
+        port = self._start_server(self._repo)
+        o = run_git_or_fail(['ls-remote', self.url(port)])
+        self.assertEqual(len(o.split('\n')), 4)
+
+    def test_new_shallow_clone_from_dulwich(self):
+        self._source_repo = import_repo('server_new.export')
+        self.addCleanup(tear_down_repo, self._source_repo)
+        self._stub_repo = _StubRepo('shallow')
+        self.addCleanup(tear_down_repo, self._stub_repo)
+        port = self._start_server(self._source_repo)
+
+        # Fetch at depth 1
+        run_git_or_fail(['clone', '--mirror', '--depth=1', '--no-single-branch',
+                        self.url(port), self._stub_repo.path])
+        clone = self._stub_repo = Repo(self._stub_repo.path)
+        expected_shallow = ['94de09a530df27ac3bb613aaecdd539e0a0655e1',
+                            'da5cd81e1883c62a25bb37c4d1f8ad965b29bf8d']
+        self.assertEqual(expected_shallow, _get_shallow(clone))
+        self.assertReposNotEqual(clone, self._source_repo)
+
+    def test_fetch_same_depth_into_shallow_clone_from_dulwich(self):
+        self._source_repo = import_repo('server_new.export')
+        self.addCleanup(tear_down_repo, self._source_repo)
+        self._stub_repo = _StubRepo('shallow')
+        self.addCleanup(tear_down_repo, self._stub_repo)
+        port = self._start_server(self._source_repo)
+
+        # Fetch at depth 1
+        run_git_or_fail(['clone', '--mirror', '--depth=1', '--no-single-branch',
+                        self.url(port), self._stub_repo.path])
+        clone = self._stub_repo = Repo(self._stub_repo.path)
+
+        # Fetching at the same depth is a no-op.
+        run_git_or_fail(
+          ['fetch', '--depth=1', self.url(port)] + self.branch_args(),
+          cwd=self._stub_repo.path)
+        expected_shallow = ['94de09a530df27ac3bb613aaecdd539e0a0655e1',
+                            'da5cd81e1883c62a25bb37c4d1f8ad965b29bf8d']
+        self.assertEqual(expected_shallow, _get_shallow(clone))
+        self.assertReposNotEqual(clone, self._source_repo)
+
+    def test_fetch_full_depth_into_shallow_clone_from_dulwich(self):
+        self._source_repo = import_repo('server_new.export')
+        self.addCleanup(tear_down_repo, self._source_repo)
+        self._stub_repo = _StubRepo('shallow')
+        self.addCleanup(tear_down_repo, self._stub_repo)
+        port = self._start_server(self._source_repo)
+
+        # Fetch at depth 1
+        run_git_or_fail(['clone', '--mirror', '--depth=1', '--no-single-branch',
+                        self.url(port), self._stub_repo.path])
+        clone = self._stub_repo = Repo(self._stub_repo.path)
+
+        # Fetching at the same depth is a no-op.
+        run_git_or_fail(
+          ['fetch', '--depth=1', self.url(port)] + self.branch_args(),
+          cwd=self._stub_repo.path)
+
+        # The whole repo only has depth 3, so it should equal server_new.
+        run_git_or_fail(
+          ['fetch', '--depth=3', self.url(port)] + self.branch_args(),
+          cwd=self._stub_repo.path)
+        self.assertEqual([], _get_shallow(clone))
+        self.assertReposEqual(clone, self._source_repo)
+
 
 class ShutdownServerMixIn:
     """Mixin that allows serve_forever to be shut down.

+ 16 - 1
dulwich/tests/compat/test_web.py

@@ -134,9 +134,24 @@ class DumbWebTestCase(WebTests, CompatTestCase):
         return make_wsgi_chain(backend, dumb=True)
 
     def test_push_to_dulwich(self):
-        # Note: remove this if dumb pushing is supported
+        # Note: remove this if dulwich implements dumb web pushing.
         raise SkipTest('Dumb web pushing not supported.')
 
     def test_push_to_dulwich_remove_branch(self):
         # Note: remove this if dumb pushing is supported
         raise SkipTest('Dumb web pushing not supported.')
+
+    def test_new_shallow_clone_from_dulwich(self):
+        # Note: remove this if C git and dulwich implement dumb web shallow
+        # clones.
+        raise SkipTest('Dumb web shallow cloning not supported.')
+
+    def test_fetch_same_depth_into_shallow_clone_from_dulwich(self):
+        # Note: remove this if C git and dulwich implement dumb web shallow
+        # clones.
+        raise SkipTest('Dumb web shallow cloning not supported.')
+
+    def test_fetch_full_depth_into_shallow_clone_from_dulwich(self):
+        # Note: remove this if C git and dulwich implement dumb web shallow
+        # clones.
+        raise SkipTest('Dumb web shallow cloning not supported.')

+ 21 - 0
dulwich/tests/test_client.py

@@ -559,3 +559,24 @@ class LocalGitClientTests(TestCase):
         t = MemoryRepo()
         s = open_repo('a.git')
         self.assertEquals(s.get_refs(), c.fetch(s.path, t))
+
+    def test_fetch_empty(self):
+        c = LocalGitClient()
+        s = open_repo('a.git')
+        out = StringIO()
+        walker = {}
+        c.fetch_pack(s.path, lambda heads: [], graph_walker=walker,
+            pack_data=out.write)
+        self.assertEquals("PACK\x00\x00\x00\x02\x00\x00\x00\x00\x02\x9d\x08"
+            "\x82;\xd8\xa8\xea\xb5\x10\xadj\xc7\\\x82<\xfd>\xd3\x1e", out.getvalue())
+
+    def test_fetch_pack_none(self):
+        c = LocalGitClient()
+        s = open_repo('a.git')
+        out = StringIO()
+        walker = MemoryRepo().get_graph_walker()
+        c.fetch_pack(s.path,
+            lambda heads: ["a90fa2d900a17e99b433217e988c4eb4a2e9a097"],
+            graph_walker=walker, pack_data=out.write)
+        # Hardcoding is not ideal, but we'll fix that some other day.
+        self.assertTrue(out.getvalue().startswith('PACK\x00\x00\x00\x02\x00\x00\x00\x07'))

+ 4 - 3
dulwich/tests/test_diff_tree.py

@@ -18,6 +18,10 @@
 
 """Tests for file and tree diff utilities."""
 
+from itertools import (
+    permutations,
+    )
+
 from dulwich.diff_tree import (
     CHANGE_MODIFY,
     CHANGE_RENAME,
@@ -39,9 +43,6 @@ from dulwich.diff_tree import (
 from dulwich.index import (
     commit_tree,
     )
-from dulwich._compat import (
-    permutations,
-    )
 from dulwich.object_store import (
     MemoryObjectStore,
     )

+ 3 - 3
dulwich/tests/test_objects.py

@@ -24,6 +24,9 @@
 
 from cStringIO import StringIO
 import datetime
+from itertools import (
+    permutations,
+    )
 import os
 import stat
 import warnings
@@ -31,9 +34,6 @@ import warnings
 from dulwich.errors import (
     ObjectFormatException,
     )
-from dulwich._compat import (
-    permutations,
-    )
 from dulwich.objects import (
     Blob,
     Tree,

+ 70 - 0
dulwich/tests/test_objectspec.py

@@ -0,0 +1,70 @@
+# test_objectspec.py -- tests for objectspec.py
+# Copyright (C) 2014 Jelmer Vernooij <jelmer@samba.org>
+#
+# This program is free software; you can redistribute it and/or
+# modify it under the terms of the GNU General Public License
+# as published by the Free Software Foundation; version 2
+# of the License or (at your option) any later version of
+# the License.
+#
+# This program is distributed in the hope that it will be useful,
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+# GNU General Public License for more details.
+#
+# You should have received a copy of the GNU General Public License
+# along with this program; if not, write to the Free Software
+# Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston,
+# MA  02110-1301, USA.
+
+"""Tests for revision spec parsing."""
+
+# TODO: Round-trip parse-serialize-parse and serialize-parse-serialize tests.
+
+
+from dulwich.objects import (
+    Blob,
+    Commit,
+    Tag,
+    Tree,
+    )
+from dulwich.objectspec import (
+    parse_object,
+    parse_commit_range,
+    )
+from dulwich.repo import MemoryRepo
+from dulwich.tests import (
+    TestCase,
+    )
+from dulwich.tests.utils import (
+    build_commit_graph,
+    make_object,
+    )
+
+
+class ParseObjectTests(TestCase):
+    """Test parse_object."""
+
+    def test_nonexistent(self):
+        r = MemoryRepo()
+        self.assertRaises(KeyError, parse_object, r, "thisdoesnotexist")
+
+    def test_blob_by_sha(self):
+        r = MemoryRepo()
+        b = Blob.from_string("Blah")
+        r.object_store.add_object(b)
+        self.assertEquals(b, parse_object(r, b.id))
+
+
+class ParseCommitRangeTests(TestCase):
+    """Test parse_commit_range."""
+
+    def test_nonexistent(self):
+        r = MemoryRepo()
+        self.assertRaises(KeyError, parse_commit_range, r, "thisdoesnotexist")
+
+    def test_commit_by_sha(self):
+        r = MemoryRepo()
+        c1, c2, c3 = build_commit_graph(r.object_store, [[1], [2, 1],
+            [3, 1, 2]])
+        self.assertEquals([c1], list(parse_commit_range(r, c1.id)))

+ 7 - 9
dulwich/tests/test_pack.py

@@ -21,14 +21,12 @@
 
 
 from cStringIO import StringIO
+from hashlib import sha1
 import os
 import shutil
 import tempfile
 import zlib
 
-from dulwich._compat import (
-    make_sha,
-    )
 from dulwich.errors import (
     ChecksumMismatch,
     )
@@ -250,16 +248,16 @@ class TestPackData(PackTests):
 
     def test_compute_file_sha(self):
         f = StringIO('abcd1234wxyz')
-        self.assertEqual(make_sha('abcd1234wxyz').hexdigest(),
+        self.assertEqual(sha1('abcd1234wxyz').hexdigest(),
                          compute_file_sha(f).hexdigest())
-        self.assertEqual(make_sha('abcd1234wxyz').hexdigest(),
+        self.assertEqual(sha1('abcd1234wxyz').hexdigest(),
                          compute_file_sha(f, buffer_size=5).hexdigest())
-        self.assertEqual(make_sha('abcd1234').hexdigest(),
+        self.assertEqual(sha1('abcd1234').hexdigest(),
                          compute_file_sha(f, end_ofs=-4).hexdigest())
-        self.assertEqual(make_sha('1234wxyz').hexdigest(),
+        self.assertEqual(sha1('1234wxyz').hexdigest(),
                          compute_file_sha(f, start_ofs=4).hexdigest())
         self.assertEqual(
-          make_sha('1234').hexdigest(),
+          sha1('1234').hexdigest(),
           compute_file_sha(f, start_ofs=4, end_ofs=-4).hexdigest())
 
 
@@ -504,7 +502,7 @@ class WritePackTests(TestCase):
         f = StringIO()
         f.write('header')
         offset = f.tell()
-        sha_a = make_sha('foo')
+        sha_a = sha1('foo')
         sha_b = sha_a.copy()
         write_pack_object(f, Blob.type_num, 'blob', sha=sha_a)
         self.assertNotEqual(sha_a.digest(), sha_b.digest())
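
The switch from `dulwich._compat.make_sha` to `hashlib.sha1` relies on the standard hash-object interface. The sketch below is independent of dulwich and shows the `copy()` behaviour that the `sha_a`/`sha_b` comparison above appears to depend on. The byte strings are arbitrary example data.

```python
from hashlib import sha1

sha_a = sha1(b'foo')
sha_b = sha_a.copy()    # independent snapshot of the running hash state
sha_a.update(b'bar')    # feeding more data changes only sha_a

digests_differ = sha_a.digest() != sha_b.digest()
snapshot_unchanged = sha_b.hexdigest() == sha1(b'foo').hexdigest()
combined_matches = sha_a.hexdigest() == sha1(b'foobar').hexdigest()
```

Because the copy keeps its own state, a test can pass one hash object to a writer and compare it against the untouched copy afterwards to confirm that the writer fed data into it.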

+ 72 - 2
dulwich/tests/test_porcelain.py

@@ -25,6 +25,7 @@ import tarfile
 import tempfile
 
 from dulwich import porcelain
+from dulwich.diff_tree import tree_changes
 from dulwich.objects import (
     Blob,
     Tree,
@@ -207,19 +208,42 @@ class LogTests(PorcelainTestCase):
         self.repo.refs["HEAD"] = c3.id
         outstream = StringIO()
         porcelain.log(self.repo.path, outstream=outstream)
-        self.assertTrue(outstream.getvalue().startswith("-" * 50))
+        self.assertEquals(3, outstream.getvalue().count("-" * 50))
+
+    def test_max_entries(self):
+        c1, c2, c3 = build_commit_graph(self.repo.object_store, [[1], [2, 1],
+            [3, 1, 2]])
+        self.repo.refs["HEAD"] = c3.id
+        outstream = StringIO()
+        porcelain.log(self.repo.path, outstream=outstream, max_entries=1)
+        self.assertEquals(1, outstream.getvalue().count("-" * 50))
 
 
 class ShowTests(PorcelainTestCase):
 
+    def test_nolist(self):
+        c1, c2, c3 = build_commit_graph(self.repo.object_store, [[1], [2, 1],
+            [3, 1, 2]])
+        self.repo.refs["HEAD"] = c3.id
+        outstream = StringIO()
+        porcelain.show(self.repo.path, objects=c3.id, outstream=outstream)
+        self.assertTrue(outstream.getvalue().startswith("-" * 50))
+
     def test_simple(self):
         c1, c2, c3 = build_commit_graph(self.repo.object_store, [[1], [2, 1],
             [3, 1, 2]])
         self.repo.refs["HEAD"] = c3.id
         outstream = StringIO()
-        porcelain.show(self.repo.path, committish=c3.id, outstream=outstream)
+        porcelain.show(self.repo.path, objects=[c3.id], outstream=outstream)
         self.assertTrue(outstream.getvalue().startswith("-" * 50))
 
+    def test_blob(self):
+        b = Blob.from_string("The Foo\n")
+        self.repo.object_store.add_object(b)
+        outstream = StringIO()
+        porcelain.show(self.repo.path, objects=[b.id], outstream=outstream)
+        self.assertEquals(outstream.getvalue(), "The Foo\n")
+
 
 class SymbolicRefTests(PorcelainTestCase):
 
@@ -303,3 +327,49 @@ class RevListTests(PorcelainTestCase):
         self.assertEquals(
             "%s\n%s\n%s\n" % (c3.id, c2.id, c1.id),
             outstream.getvalue())
+
+
+class TagTests(PorcelainTestCase):
+
+    def test_simple(self):
+        tag = 'tryme'
+        author = 'foo'
+        message = 'bar'
+
+        c1, c2, c3 = build_commit_graph(self.repo.object_store, [[1], [2, 1],
+            [3, 1, 2]])
+        self.repo.refs["HEAD"] = c3.id
+
+        porcelain.tag(self.repo.path, tag, author, message)
+
+        tags = self.repo.refs.as_dict("refs/tags")
+        self.assertEquals(tags.keys()[0], tag)
+
+
+class ResetTests(PorcelainTestCase):
+
+    def test_hard_head(self):
+        f = open(os.path.join(self.repo.path, 'foo'), 'w')
+        try:
+            f.write("BAR")
+        finally:
+            f.close()
+        porcelain.add(self.repo.path, paths=["foo"])
+        porcelain.commit(self.repo.path, message="Some message",
+                committer="Jane <jane@example.com>",
+                author="John <john@example.com>")
+
+        f = open(os.path.join(self.repo.path, 'foo'), 'w')
+        try:
+            f.write("OOH")
+        finally:
+            f.close()
+
+        porcelain.reset(self.repo, "hard", "HEAD")
+
+        index = self.repo.open_index()
+        changes = list(tree_changes(self.repo,
+                       index.commit(self.repo.object_store),
+                       self.repo['HEAD'].tree))
+
+        self.assertEquals([], changes)

+ 473 - 0
dulwich/tests/test_refs.py

@@ -0,0 +1,473 @@
+# test_refs.py -- tests for refs.py
+# Copyright (C) 2013 Jelmer Vernooij <jelmer@samba.org>
+#
+# This program is free software; you can redistribute it and/or
+# modify it under the terms of the GNU General Public License
+# as published by the Free Software Foundation; version 2
+# of the License or (at your option) any later version of
+# the License.
+#
+# This program is distributed in the hope that it will be useful,
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+# GNU General Public License for more details.
+#
+# You should have received a copy of the GNU General Public License
+# along with this program; if not, write to the Free Software
+# Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston,
+# MA  02110-1301, USA.
+
+"""Tests for dulwich.refs."""
+
+from cStringIO import StringIO
+import os
+import tempfile
+
+from dulwich import errors
+from dulwich.file import (
+    GitFile,
+    )
+from dulwich.refs import (
+    DictRefsContainer,
+    InfoRefsContainer,
+    check_ref_format,
+    _split_ref_line,
+    read_packed_refs_with_peeled,
+    read_packed_refs,
+    write_packed_refs,
+    )
+from dulwich.repo import Repo
+
+from dulwich.tests import (
+    TestCase,
+    )
+
+from dulwich.tests.utils import (
+    open_repo,
+    tear_down_repo,
+    )
+
+
+class CheckRefFormatTests(TestCase):
+    """Tests for the check_ref_format function.
+
+    These are the same tests as in the git test suite.
+    """
+
+    def test_valid(self):
+        self.assertTrue(check_ref_format('heads/foo'))
+        self.assertTrue(check_ref_format('foo/bar/baz'))
+        self.assertTrue(check_ref_format('refs///heads/foo'))
+        self.assertTrue(check_ref_format('foo./bar'))
+        self.assertTrue(check_ref_format('heads/foo@bar'))
+        self.assertTrue(check_ref_format('heads/fix.lock.error'))
+
+    def test_invalid(self):
+        self.assertFalse(check_ref_format('foo'))
+        self.assertFalse(check_ref_format('heads/foo/'))
+        self.assertFalse(check_ref_format('./foo'))
+        self.assertFalse(check_ref_format('.refs/foo'))
+        self.assertFalse(check_ref_format('heads/foo..bar'))
+        self.assertFalse(check_ref_format('heads/foo?bar'))
+        self.assertFalse(check_ref_format('heads/foo.lock'))
+        self.assertFalse(check_ref_format('heads/v@{ation'))
+        self.assertFalse(check_ref_format('heads/foo\bar'))
+
+
+ONES = "1" * 40
+TWOS = "2" * 40
+THREES = "3" * 40
+FOURS = "4" * 40
+
+class PackedRefsFileTests(TestCase):
+
+    def test_split_ref_line_errors(self):
+        self.assertRaises(errors.PackedRefsException, _split_ref_line,
+                          'singlefield')
+        self.assertRaises(errors.PackedRefsException, _split_ref_line,
+                          'badsha name')
+        self.assertRaises(errors.PackedRefsException, _split_ref_line,
+                          '%s bad/../refname' % ONES)
+
+    def test_read_without_peeled(self):
+        f = StringIO('# comment\n%s ref/1\n%s ref/2' % (ONES, TWOS))
+        self.assertEqual([(ONES, 'ref/1'), (TWOS, 'ref/2')],
+                         list(read_packed_refs(f)))
+
+    def test_read_without_peeled_errors(self):
+        f = StringIO('%s ref/1\n^%s' % (ONES, TWOS))
+        self.assertRaises(errors.PackedRefsException, list, read_packed_refs(f))
+
+    def test_read_with_peeled(self):
+        f = StringIO('%s ref/1\n%s ref/2\n^%s\n%s ref/4' % (
+          ONES, TWOS, THREES, FOURS))
+        self.assertEqual([
+          (ONES, 'ref/1', None),
+          (TWOS, 'ref/2', THREES),
+          (FOURS, 'ref/4', None),
+          ], list(read_packed_refs_with_peeled(f)))
+
+    def test_read_with_peeled_errors(self):
+        f = StringIO('^%s\n%s ref/1' % (TWOS, ONES))
+        self.assertRaises(errors.PackedRefsException, list, read_packed_refs(f))
+
+        f = StringIO('%s ref/1\n^%s\n^%s' % (ONES, TWOS, THREES))
+        self.assertRaises(errors.PackedRefsException, list, read_packed_refs(f))
+
+    def test_write_with_peeled(self):
+        f = StringIO()
+        write_packed_refs(f, {'ref/1': ONES, 'ref/2': TWOS},
+                          {'ref/1': THREES})
+        self.assertEqual(
+          "# pack-refs with: peeled\n%s ref/1\n^%s\n%s ref/2\n" % (
+          ONES, THREES, TWOS), f.getvalue())
+
+    def test_write_without_peeled(self):
+        f = StringIO()
+        write_packed_refs(f, {'ref/1': ONES, 'ref/2': TWOS})
+        self.assertEqual("%s ref/1\n%s ref/2\n" % (ONES, TWOS), f.getvalue())
+
+
+# Dict of refs that we expect all RefsContainerTests subclasses to define.
+_TEST_REFS = {
+  'HEAD': '42d06bd4b77fed026b154d16493e5deab78f02ec',
+  'refs/heads/40-char-ref-aaaaaaaaaaaaaaaaaa': '42d06bd4b77fed026b154d16493e5deab78f02ec',
+  'refs/heads/master': '42d06bd4b77fed026b154d16493e5deab78f02ec',
+  'refs/heads/packed': '42d06bd4b77fed026b154d16493e5deab78f02ec',
+  'refs/tags/refs-0.1': 'df6800012397fb85c56e7418dd4eb9405dee075c',
+  'refs/tags/refs-0.2': '3ec9c43c84ff242e3ef4a9fc5bc111fd780a76a8',
+  }
+
+
+class RefsContainerTests(object):
+
+    def test_keys(self):
+        actual_keys = set(self._refs.keys())
+        self.assertEqual(set(self._refs.allkeys()), actual_keys)
+        # ignore the symref loop if it exists
+        actual_keys.discard('refs/heads/loop')
+        self.assertEqual(set(_TEST_REFS.iterkeys()), actual_keys)
+
+        actual_keys = self._refs.keys('refs/heads')
+        actual_keys.discard('loop')
+        self.assertEqual(
+            ['40-char-ref-aaaaaaaaaaaaaaaaaa', 'master', 'packed'],
+            sorted(actual_keys))
+        self.assertEqual(['refs-0.1', 'refs-0.2'],
+                         sorted(self._refs.keys('refs/tags')))
+
+    def test_as_dict(self):
+        # refs/heads/loop does not show up even if it exists
+        self.assertEqual(_TEST_REFS, self._refs.as_dict())
+
+    def test_setitem(self):
+        self._refs['refs/some/ref'] = '42d06bd4b77fed026b154d16493e5deab78f02ec'
+        self.assertEqual('42d06bd4b77fed026b154d16493e5deab78f02ec',
+                         self._refs['refs/some/ref'])
+        self.assertRaises(errors.RefFormatError, self._refs.__setitem__,
+                          'notrefs/foo', '42d06bd4b77fed026b154d16493e5deab78f02ec')
+
+    def test_set_if_equals(self):
+        nines = '9' * 40
+        self.assertFalse(self._refs.set_if_equals('HEAD', 'c0ffee', nines))
+        self.assertEqual('42d06bd4b77fed026b154d16493e5deab78f02ec',
+                         self._refs['HEAD'])
+
+        self.assertTrue(self._refs.set_if_equals(
+          'HEAD', '42d06bd4b77fed026b154d16493e5deab78f02ec', nines))
+        self.assertEqual(nines, self._refs['HEAD'])
+
+        self.assertTrue(self._refs.set_if_equals('refs/heads/master', None,
+                                                 nines))
+        self.assertEqual(nines, self._refs['refs/heads/master'])
+
+    def test_add_if_new(self):
+        nines = '9' * 40
+        self.assertFalse(self._refs.add_if_new('refs/heads/master', nines))
+        self.assertEqual('42d06bd4b77fed026b154d16493e5deab78f02ec',
+                         self._refs['refs/heads/master'])
+
+        self.assertTrue(self._refs.add_if_new('refs/some/ref', nines))
+        self.assertEqual(nines, self._refs['refs/some/ref'])
+
+    def test_set_symbolic_ref(self):
+        self._refs.set_symbolic_ref('refs/heads/symbolic', 'refs/heads/master')
+        self.assertEqual('ref: refs/heads/master',
+                         self._refs.read_loose_ref('refs/heads/symbolic'))
+        self.assertEqual('42d06bd4b77fed026b154d16493e5deab78f02ec',
+                         self._refs['refs/heads/symbolic'])
+
+    def test_set_symbolic_ref_overwrite(self):
+        nines = '9' * 40
+        self.assertFalse('refs/heads/symbolic' in self._refs)
+        self._refs['refs/heads/symbolic'] = nines
+        self.assertEqual(nines, self._refs.read_loose_ref('refs/heads/symbolic'))
+        self._refs.set_symbolic_ref('refs/heads/symbolic', 'refs/heads/master')
+        self.assertEqual('ref: refs/heads/master',
+                         self._refs.read_loose_ref('refs/heads/symbolic'))
+        self.assertEqual('42d06bd4b77fed026b154d16493e5deab78f02ec',
+                         self._refs['refs/heads/symbolic'])
+
+    def test_check_refname(self):
+        self._refs._check_refname('HEAD')
+        self._refs._check_refname('refs/stash')
+        self._refs._check_refname('refs/heads/foo')
+
+        self.assertRaises(errors.RefFormatError, self._refs._check_refname,
+                          'refs')
+        self.assertRaises(errors.RefFormatError, self._refs._check_refname,
+                          'notrefs/foo')
+
+    def test_contains(self):
+        self.assertTrue('refs/heads/master' in self._refs)
+        self.assertFalse('refs/heads/bar' in self._refs)
+
+    def test_delitem(self):
+        self.assertEqual('42d06bd4b77fed026b154d16493e5deab78f02ec',
+                          self._refs['refs/heads/master'])
+        del self._refs['refs/heads/master']
+        self.assertRaises(KeyError, lambda: self._refs['refs/heads/master'])
+
+    def test_remove_if_equals(self):
+        self.assertFalse(self._refs.remove_if_equals('HEAD', 'c0ffee'))
+        self.assertEqual('42d06bd4b77fed026b154d16493e5deab78f02ec',
+                         self._refs['HEAD'])
+        self.assertTrue(self._refs.remove_if_equals(
+          'refs/tags/refs-0.2', '3ec9c43c84ff242e3ef4a9fc5bc111fd780a76a8'))
+        self.assertFalse('refs/tags/refs-0.2' in self._refs)
+
+
+
+
+class DictRefsContainerTests(RefsContainerTests, TestCase):
+
+    def setUp(self):
+        TestCase.setUp(self)
+        self._refs = DictRefsContainer(dict(_TEST_REFS))
+
+    def test_invalid_refname(self):
+        # FIXME: Move this test into RefsContainerTests, but requires
+        # some way of injecting invalid refs.
+        self._refs._refs["refs/stash"] = "00" * 20
+        expected_refs = dict(_TEST_REFS)
+        expected_refs["refs/stash"] = "00" * 20
+        self.assertEqual(expected_refs, self._refs.as_dict())
+
+
+class DiskRefsContainerTests(RefsContainerTests, TestCase):
+
+    def setUp(self):
+        TestCase.setUp(self)
+        self._repo = open_repo('refs.git')
+        self._refs = self._repo.refs
+
+    def tearDown(self):
+        tear_down_repo(self._repo)
+        TestCase.tearDown(self)
+
+    def test_get_packed_refs(self):
+        self.assertEqual({
+          'refs/heads/packed': '42d06bd4b77fed026b154d16493e5deab78f02ec',
+          'refs/tags/refs-0.1': 'df6800012397fb85c56e7418dd4eb9405dee075c',
+          }, self._refs.get_packed_refs())
+
+    def test_get_peeled_not_packed(self):
+        # not packed
+        self.assertEqual(None, self._refs.get_peeled('refs/tags/refs-0.2'))
+        self.assertEqual('3ec9c43c84ff242e3ef4a9fc5bc111fd780a76a8',
+                         self._refs['refs/tags/refs-0.2'])
+
+        # packed, known not peelable
+        self.assertEqual(self._refs['refs/heads/packed'],
+                         self._refs.get_peeled('refs/heads/packed'))
+
+        # packed, peeled
+        self.assertEqual('42d06bd4b77fed026b154d16493e5deab78f02ec',
+                         self._refs.get_peeled('refs/tags/refs-0.1'))
+
+    def test_setitem(self):
+        RefsContainerTests.test_setitem(self)
+        f = open(os.path.join(self._refs.path, 'refs', 'some', 'ref'), 'rb')
+        self.assertEqual('42d06bd4b77fed026b154d16493e5deab78f02ec',
+                          f.read()[:40])
+        f.close()
+
+    def test_setitem_symbolic(self):
+        ones = '1' * 40
+        self._refs['HEAD'] = ones
+        self.assertEqual(ones, self._refs['HEAD'])
+
+        # ensure HEAD was not modified
+        f = open(os.path.join(self._refs.path, 'HEAD'), 'rb')
+        self.assertEqual('ref: refs/heads/master', iter(f).next().rstrip('\n'))
+        f.close()
+
+        # ensure the symbolic link was written through
+        f = open(os.path.join(self._refs.path, 'refs', 'heads', 'master'), 'rb')
+        self.assertEqual(ones, f.read()[:40])
+        f.close()
+
+    def test_set_if_equals(self):
+        RefsContainerTests.test_set_if_equals(self)
+
+        # ensure symref was followed
+        self.assertEqual('9' * 40, self._refs['refs/heads/master'])
+
+        # ensure lockfile was deleted
+        self.assertFalse(os.path.exists(
+          os.path.join(self._refs.path, 'refs', 'heads', 'master.lock')))
+        self.assertFalse(os.path.exists(
+          os.path.join(self._refs.path, 'HEAD.lock')))
+
+    def test_add_if_new_packed(self):
+        # don't overwrite packed ref
+        self.assertFalse(self._refs.add_if_new('refs/tags/refs-0.1', '9' * 40))
+        self.assertEqual('df6800012397fb85c56e7418dd4eb9405dee075c',
+                         self._refs['refs/tags/refs-0.1'])
+
+    def test_add_if_new_symbolic(self):
+        # Use an empty repo instead of the default.
+        tear_down_repo(self._repo)
+        repo_dir = os.path.join(tempfile.mkdtemp(), 'test')
+        os.makedirs(repo_dir)
+        self._repo = Repo.init(repo_dir)
+        refs = self._repo.refs
+
+        nines = '9' * 40
+        self.assertEqual('ref: refs/heads/master', refs.read_ref('HEAD'))
+        self.assertFalse('refs/heads/master' in refs)
+        self.assertTrue(refs.add_if_new('HEAD', nines))
+        self.assertEqual('ref: refs/heads/master', refs.read_ref('HEAD'))
+        self.assertEqual(nines, refs['HEAD'])
+        self.assertEqual(nines, refs['refs/heads/master'])
+        self.assertFalse(refs.add_if_new('HEAD', '1' * 40))
+        self.assertEqual(nines, refs['HEAD'])
+        self.assertEqual(nines, refs['refs/heads/master'])
+
+    def test_follow(self):
+        self.assertEqual(
+          ('refs/heads/master', '42d06bd4b77fed026b154d16493e5deab78f02ec'),
+          self._refs._follow('HEAD'))
+        self.assertEqual(
+          ('refs/heads/master', '42d06bd4b77fed026b154d16493e5deab78f02ec'),
+          self._refs._follow('refs/heads/master'))
+        self.assertRaises(KeyError, self._refs._follow, 'refs/heads/loop')
+
+    def test_delitem(self):
+        RefsContainerTests.test_delitem(self)
+        ref_file = os.path.join(self._refs.path, 'refs', 'heads', 'master')
+        self.assertFalse(os.path.exists(ref_file))
+        self.assertFalse('refs/heads/master' in self._refs.get_packed_refs())
+
+    def test_delitem_symbolic(self):
+        self.assertEqual('ref: refs/heads/master',
+                          self._refs.read_loose_ref('HEAD'))
+        del self._refs['HEAD']
+        self.assertRaises(KeyError, lambda: self._refs['HEAD'])
+        self.assertEqual('42d06bd4b77fed026b154d16493e5deab78f02ec',
+                         self._refs['refs/heads/master'])
+        self.assertFalse(os.path.exists(os.path.join(self._refs.path, 'HEAD')))
+
+    def test_remove_if_equals_symref(self):
+        # HEAD is a symref, so shouldn't equal its dereferenced value
+        self.assertFalse(self._refs.remove_if_equals(
+          'HEAD', '42d06bd4b77fed026b154d16493e5deab78f02ec'))
+        self.assertTrue(self._refs.remove_if_equals(
+          'refs/heads/master', '42d06bd4b77fed026b154d16493e5deab78f02ec'))
+        self.assertRaises(KeyError, lambda: self._refs['refs/heads/master'])
+
+        # HEAD is now a broken symref
+        self.assertRaises(KeyError, lambda: self._refs['HEAD'])
+        self.assertEqual('ref: refs/heads/master',
+                          self._refs.read_loose_ref('HEAD'))
+
+        self.assertFalse(os.path.exists(
+            os.path.join(self._refs.path, 'refs', 'heads', 'master.lock')))
+        self.assertFalse(os.path.exists(
+            os.path.join(self._refs.path, 'HEAD.lock')))
+
+    def test_remove_packed_without_peeled(self):
+        refs_file = os.path.join(self._repo.path, 'packed-refs')
+        f = GitFile(refs_file)
+        refs_data = f.read()
+        f.close()
+        f = GitFile(refs_file, 'wb')
+        f.write('\n'.join(l for l in refs_data.split('\n')
+                          if not l or l[0] not in '#^'))
+        f.close()
+        self._repo = Repo(self._repo.path)
+        refs = self._repo.refs
+        self.assertTrue(refs.remove_if_equals(
+          'refs/heads/packed', '42d06bd4b77fed026b154d16493e5deab78f02ec'))
+
+    def test_remove_if_equals_packed(self):
+        # test removing ref that is only packed
+        self.assertEqual('df6800012397fb85c56e7418dd4eb9405dee075c',
+                         self._refs['refs/tags/refs-0.1'])
+        self.assertTrue(
+          self._refs.remove_if_equals('refs/tags/refs-0.1',
+          'df6800012397fb85c56e7418dd4eb9405dee075c'))
+        self.assertRaises(KeyError, lambda: self._refs['refs/tags/refs-0.1'])
+
+    def test_read_ref(self):
+        self.assertEqual('ref: refs/heads/master', self._refs.read_ref("HEAD"))
+        self.assertEqual('42d06bd4b77fed026b154d16493e5deab78f02ec',
+            self._refs.read_ref("refs/heads/packed"))
+        self.assertEqual(None,
+            self._refs.read_ref("nonexistant"))
+
+
+_TEST_REFS_SERIALIZED = (
+'42d06bd4b77fed026b154d16493e5deab78f02ec\trefs/heads/40-char-ref-aaaaaaaaaaaaaaaaaa\n'
+'42d06bd4b77fed026b154d16493e5deab78f02ec\trefs/heads/master\n'
+'42d06bd4b77fed026b154d16493e5deab78f02ec\trefs/heads/packed\n'
+'df6800012397fb85c56e7418dd4eb9405dee075c\trefs/tags/refs-0.1\n'
+'3ec9c43c84ff242e3ef4a9fc5bc111fd780a76a8\trefs/tags/refs-0.2\n')
+
+
+class InfoRefsContainerTests(TestCase):
+
+    def test_invalid_refname(self):
+        text = _TEST_REFS_SERIALIZED + '00' * 20 + '\trefs/stash\n'
+        refs = InfoRefsContainer(StringIO(text))
+        expected_refs = dict(_TEST_REFS)
+        del expected_refs['HEAD']
+        expected_refs["refs/stash"] = "00" * 20
+        self.assertEqual(expected_refs, refs.as_dict())
+
+    def test_keys(self):
+        refs = InfoRefsContainer(StringIO(_TEST_REFS_SERIALIZED))
+        actual_keys = set(refs.keys())
+        self.assertEqual(set(refs.allkeys()), actual_keys)
+        # ignore the symref loop if it exists
+        actual_keys.discard('refs/heads/loop')
+        expected_refs = dict(_TEST_REFS)
+        del expected_refs['HEAD']
+        self.assertEqual(set(expected_refs.iterkeys()), actual_keys)
+
+        actual_keys = refs.keys('refs/heads')
+        actual_keys.discard('loop')
+        self.assertEqual(
+            ['40-char-ref-aaaaaaaaaaaaaaaaaa', 'master', 'packed'],
+            sorted(actual_keys))
+        self.assertEqual(['refs-0.1', 'refs-0.2'],
+                         sorted(refs.keys('refs/tags')))
+
+    def test_as_dict(self):
+        refs = InfoRefsContainer(StringIO(_TEST_REFS_SERIALIZED))
+        # refs/heads/loop does not show up even if it exists
+        expected_refs = dict(_TEST_REFS)
+        del expected_refs['HEAD']
+        self.assertEqual(expected_refs, refs.as_dict())
+
+    def test_contains(self):
+        refs = InfoRefsContainer(StringIO(_TEST_REFS_SERIALIZED))
+        self.assertTrue('refs/heads/master' in refs)
+        self.assertFalse('refs/heads/bar' in refs)
+
+    def test_get_peeled(self):
+        refs = InfoRefsContainer(StringIO(_TEST_REFS_SERIALIZED))
+        # refs/heads/loop does not show up even if it exists
+        self.assertEqual(
+            _TEST_REFS['refs/heads/master'],
+            refs.get_peeled('refs/heads/master'))

+ 6 - 521
dulwich/tests/test_repository.py

@@ -19,7 +19,6 @@
 
 """Tests for the repository."""
 
-from cStringIO import StringIO
 import os
 import stat
 import shutil
@@ -27,26 +26,14 @@ import tempfile
 import warnings
 
 from dulwich import errors
-from dulwich.file import (
-    GitFile,
-    )
 from dulwich.object_store import (
     tree_lookup_path,
     )
 from dulwich import objects
 from dulwich.config import Config
-from dulwich.refs import (
-    _split_ref_line,
-    )
 from dulwich.repo import (
-    check_ref_format,
-    DictRefsContainer,
-    InfoRefsContainer,
     Repo,
     MemoryRepo,
-    read_packed_refs,
-    read_packed_refs_with_peeled,
-    write_packed_refs,
     )
 from dulwich.tests import (
     TestCase,
@@ -115,18 +102,18 @@ class RepositoryTests(TestCase):
         r = self._repo = open_repo('a.git')
         self.assertEqual(r.controldir(), r.path)
 
-    def test_ref(self):
-        r = self._repo = open_repo('a.git')
-        warnings.simplefilter("ignore", DeprecationWarning)
-        self.assertEqual(r.ref('refs/heads/master'),
-                         'a90fa2d900a17e99b433217e988c4eb4a2e9a097')
-
     def test_setitem(self):
         r = self._repo = open_repo('a.git')
         r["refs/tags/foo"] = 'a90fa2d900a17e99b433217e988c4eb4a2e9a097'
         self.assertEqual('a90fa2d900a17e99b433217e988c4eb4a2e9a097',
                           r["refs/tags/foo"].id)
 
+    def test_getitem_notfound_unicode(self):
+        r = self._repo = open_repo('a.git')
+        # In the future, this might raise a TypeError since we don't
+        # handle unicode strings properly (what encoding?) for refs.
+        self.assertRaises(KeyError, r.__getitem__, u"11" * 19 + "--")
+
     def test_delitem(self):
         r = self._repo = open_repo('a.git')
 
@@ -191,53 +178,6 @@ class RepositoryTests(TestCase):
         r = self._repo = open_repo('a.git')
         self.assertFalse("bar" in r)
 
-    def test_commit(self):
-        r = self._repo = open_repo('a.git')
-        warnings.simplefilter("ignore", DeprecationWarning)
-        self.addCleanup(warnings.resetwarnings)
-        obj = r.commit(r.head())
-        self.assertEqual(obj.type_name, 'commit')
-
-    def test_commit_not_commit(self):
-        r = self._repo = open_repo('a.git')
-        warnings.simplefilter("ignore", DeprecationWarning)
-        self.addCleanup(warnings.resetwarnings)
-        self.assertRaises(errors.NotCommitError,
-            r.commit, '4f2e6529203aa6d44b5af6e3292c837ceda003f9')
-
-    def test_tree(self):
-        r = self._repo = open_repo('a.git')
-        commit = r[r.head()]
-        warnings.simplefilter("ignore", DeprecationWarning)
-        self.addCleanup(warnings.resetwarnings)
-        tree = r.tree(commit.tree)
-        self.assertEqual(tree.type_name, 'tree')
-        self.assertEqual(tree.sha().hexdigest(), commit.tree)
-
-    def test_tree_not_tree(self):
-        r = self._repo = open_repo('a.git')
-        warnings.simplefilter("ignore", DeprecationWarning)
-        self.addCleanup(warnings.resetwarnings)
-        self.assertRaises(errors.NotTreeError, r.tree, r.head())
-
-    def test_tag(self):
-        r = self._repo = open_repo('a.git')
-        tag_sha = '28237f4dc30d0d462658d6b937b08a0f0b6ef55a'
-        warnings.simplefilter("ignore", DeprecationWarning)
-        self.addCleanup(warnings.resetwarnings)
-        tag = r.tag(tag_sha)
-        self.assertEqual(tag.type_name, 'tag')
-        self.assertEqual(tag.sha().hexdigest(), tag_sha)
-        obj_class, obj_sha = tag.object
-        self.assertEqual(obj_class, objects.Commit)
-        self.assertEqual(obj_sha, r.head())
-
-    def test_tag_not_tag(self):
-        r = self._repo = open_repo('a.git')
-        warnings.simplefilter("ignore", DeprecationWarning)
-        self.addCleanup(warnings.resetwarnings)
-        self.assertRaises(errors.NotTagError, r.tag, r.head())
-
     def test_get_peeled(self):
         # unpacked ref
         r = self._repo = open_repo('a.git')
@@ -257,23 +197,6 @@ class RepositoryTests(TestCase):
         r = self._repo = open_repo('a.git')
         self.assertEqual(r.get_peeled('HEAD'), r.head())
 
-    def test_get_blob(self):
-        r = self._repo = open_repo('a.git')
-        commit = r[r.head()]
-        tree = r[commit.tree]
-        blob_sha = tree.items()[0][2]
-        warnings.simplefilter("ignore", DeprecationWarning)
-        self.addCleanup(warnings.resetwarnings)
-        blob = r.get_blob(blob_sha)
-        self.assertEqual(blob.type_name, 'blob')
-        self.assertEqual(blob.sha().hexdigest(), blob_sha)
-
-    def test_get_blob_notblob(self):
-        r = self._repo = open_repo('a.git')
-        warnings.simplefilter("ignore", DeprecationWarning)
-        self.addCleanup(warnings.resetwarnings)
-        self.assertRaises(errors.NotBlobError, r.get_blob, r.head())
-
     def test_get_walker(self):
         r = self._repo = open_repo('a.git')
         # include defaults to [r.head()]
@@ -286,15 +209,6 @@ class RepositoryTests(TestCase):
             [e.commit.id for e in r.get_walker('2a72d929692c41d8554c07f6301757ba18a65d91')],
             ['2a72d929692c41d8554c07f6301757ba18a65d91'])
 
-    def test_linear_history(self):
-        r = self._repo = open_repo('a.git')
-        warnings.simplefilter("ignore", DeprecationWarning)
-        self.addCleanup(warnings.resetwarnings)
-        history = r.revision_history(r.head())
-        shas = [c.sha().hexdigest() for c in history]
-        self.assertEqual(shas, [r.head(),
-                                '2a72d929692c41d8554c07f6301757ba18a65d91'])
-
     def test_clone(self):
         r = self._repo = open_repo('a.git')
         tmp_dir = tempfile.mkdtemp()
@@ -352,13 +266,6 @@ class RepositoryTests(TestCase):
                                 '60dacdc733de308bb77bb76ce0fb0f9b44c9769e',
                                 '0d89f20333fbb1d2f3a94da77f4981373d8f4310'])
 
-    def test_revision_history_missing_commit(self):
-        r = self._repo = open_repo('simple_merge.git')
-        warnings.simplefilter("ignore", DeprecationWarning)
-        self.addCleanup(warnings.resetwarnings)
-        self.assertRaises(errors.MissingCommitError, r.revision_history,
-                          missing_sha)
-
     def test_out_of_order_merge(self):
         """Test that revision history is ordered by date, not parent order."""
         r = self._repo = open_repo('ooo_merge.git')
@@ -770,425 +677,3 @@ class BuildRepoTests(TestCase):
         r.stage(['a'])
         r.stage(['a'])  # double-stage a deleted path
 
-
-class CheckRefFormatTests(TestCase):
-    """Tests for the check_ref_format function.
-
-    These are the same tests as in the git test suite.
-    """
-
-    def test_valid(self):
-        self.assertTrue(check_ref_format('heads/foo'))
-        self.assertTrue(check_ref_format('foo/bar/baz'))
-        self.assertTrue(check_ref_format('refs///heads/foo'))
-        self.assertTrue(check_ref_format('foo./bar'))
-        self.assertTrue(check_ref_format('heads/foo@bar'))
-        self.assertTrue(check_ref_format('heads/fix.lock.error'))
-
-    def test_invalid(self):
-        self.assertFalse(check_ref_format('foo'))
-        self.assertFalse(check_ref_format('heads/foo/'))
-        self.assertFalse(check_ref_format('./foo'))
-        self.assertFalse(check_ref_format('.refs/foo'))
-        self.assertFalse(check_ref_format('heads/foo..bar'))
-        self.assertFalse(check_ref_format('heads/foo?bar'))
-        self.assertFalse(check_ref_format('heads/foo.lock'))
-        self.assertFalse(check_ref_format('heads/v@{ation'))
-        self.assertFalse(check_ref_format('heads/foo\bar'))
-
-
-ONES = "1" * 40
-TWOS = "2" * 40
-THREES = "3" * 40
-FOURS = "4" * 40
-
-class PackedRefsFileTests(TestCase):
-
-    def test_split_ref_line_errors(self):
-        self.assertRaises(errors.PackedRefsException, _split_ref_line,
-                          'singlefield')
-        self.assertRaises(errors.PackedRefsException, _split_ref_line,
-                          'badsha name')
-        self.assertRaises(errors.PackedRefsException, _split_ref_line,
-                          '%s bad/../refname' % ONES)
-
-    def test_read_without_peeled(self):
-        f = StringIO('# comment\n%s ref/1\n%s ref/2' % (ONES, TWOS))
-        self.assertEqual([(ONES, 'ref/1'), (TWOS, 'ref/2')],
-                         list(read_packed_refs(f)))
-
-    def test_read_without_peeled_errors(self):
-        f = StringIO('%s ref/1\n^%s' % (ONES, TWOS))
-        self.assertRaises(errors.PackedRefsException, list, read_packed_refs(f))
-
-    def test_read_with_peeled(self):
-        f = StringIO('%s ref/1\n%s ref/2\n^%s\n%s ref/4' % (
-          ONES, TWOS, THREES, FOURS))
-        self.assertEqual([
-          (ONES, 'ref/1', None),
-          (TWOS, 'ref/2', THREES),
-          (FOURS, 'ref/4', None),
-          ], list(read_packed_refs_with_peeled(f)))
-
-    def test_read_with_peeled_errors(self):
-        f = StringIO('^%s\n%s ref/1' % (TWOS, ONES))
-        self.assertRaises(errors.PackedRefsException, list, read_packed_refs(f))
-
-        f = StringIO('%s ref/1\n^%s\n^%s' % (ONES, TWOS, THREES))
-        self.assertRaises(errors.PackedRefsException, list, read_packed_refs(f))
-
-    def test_write_with_peeled(self):
-        f = StringIO()
-        write_packed_refs(f, {'ref/1': ONES, 'ref/2': TWOS},
-                          {'ref/1': THREES})
-        self.assertEqual(
-          "# pack-refs with: peeled\n%s ref/1\n^%s\n%s ref/2\n" % (
-          ONES, THREES, TWOS), f.getvalue())
-
-    def test_write_without_peeled(self):
-        f = StringIO()
-        write_packed_refs(f, {'ref/1': ONES, 'ref/2': TWOS})
-        self.assertEqual("%s ref/1\n%s ref/2\n" % (ONES, TWOS), f.getvalue())
-
-
-# Dict of refs that we expect all RefsContainerTests subclasses to define.
-_TEST_REFS = {
-  'HEAD': '42d06bd4b77fed026b154d16493e5deab78f02ec',
-  'refs/heads/40-char-ref-aaaaaaaaaaaaaaaaaa': '42d06bd4b77fed026b154d16493e5deab78f02ec',
-  'refs/heads/master': '42d06bd4b77fed026b154d16493e5deab78f02ec',
-  'refs/heads/packed': '42d06bd4b77fed026b154d16493e5deab78f02ec',
-  'refs/tags/refs-0.1': 'df6800012397fb85c56e7418dd4eb9405dee075c',
-  'refs/tags/refs-0.2': '3ec9c43c84ff242e3ef4a9fc5bc111fd780a76a8',
-  }
-
-
-class RefsContainerTests(object):
-
-    def test_keys(self):
-        actual_keys = set(self._refs.keys())
-        self.assertEqual(set(self._refs.allkeys()), actual_keys)
-        # ignore the symref loop if it exists
-        actual_keys.discard('refs/heads/loop')
-        self.assertEqual(set(_TEST_REFS.iterkeys()), actual_keys)
-
-        actual_keys = self._refs.keys('refs/heads')
-        actual_keys.discard('loop')
-        self.assertEqual(
-            ['40-char-ref-aaaaaaaaaaaaaaaaaa', 'master', 'packed'],
-            sorted(actual_keys))
-        self.assertEqual(['refs-0.1', 'refs-0.2'],
-                         sorted(self._refs.keys('refs/tags')))
-
-    def test_as_dict(self):
-        # refs/heads/loop does not show up even if it exists
-        self.assertEqual(_TEST_REFS, self._refs.as_dict())
-
-    def test_setitem(self):
-        self._refs['refs/some/ref'] = '42d06bd4b77fed026b154d16493e5deab78f02ec'
-        self.assertEqual('42d06bd4b77fed026b154d16493e5deab78f02ec',
-                         self._refs['refs/some/ref'])
-        self.assertRaises(errors.RefFormatError, self._refs.__setitem__,
-                          'notrefs/foo', '42d06bd4b77fed026b154d16493e5deab78f02ec')
-
-    def test_set_if_equals(self):
-        nines = '9' * 40
-        self.assertFalse(self._refs.set_if_equals('HEAD', 'c0ffee', nines))
-        self.assertEqual('42d06bd4b77fed026b154d16493e5deab78f02ec',
-                         self._refs['HEAD'])
-
-        self.assertTrue(self._refs.set_if_equals(
-          'HEAD', '42d06bd4b77fed026b154d16493e5deab78f02ec', nines))
-        self.assertEqual(nines, self._refs['HEAD'])
-
-        self.assertTrue(self._refs.set_if_equals('refs/heads/master', None,
-                                                 nines))
-        self.assertEqual(nines, self._refs['refs/heads/master'])
-
-    def test_add_if_new(self):
-        nines = '9' * 40
-        self.assertFalse(self._refs.add_if_new('refs/heads/master', nines))
-        self.assertEqual('42d06bd4b77fed026b154d16493e5deab78f02ec',
-                         self._refs['refs/heads/master'])
-
-        self.assertTrue(self._refs.add_if_new('refs/some/ref', nines))
-        self.assertEqual(nines, self._refs['refs/some/ref'])
-
-    def test_set_symbolic_ref(self):
-        self._refs.set_symbolic_ref('refs/heads/symbolic', 'refs/heads/master')
-        self.assertEqual('ref: refs/heads/master',
-                         self._refs.read_loose_ref('refs/heads/symbolic'))
-        self.assertEqual('42d06bd4b77fed026b154d16493e5deab78f02ec',
-                         self._refs['refs/heads/symbolic'])
-
-    def test_set_symbolic_ref_overwrite(self):
-        nines = '9' * 40
-        self.assertFalse('refs/heads/symbolic' in self._refs)
-        self._refs['refs/heads/symbolic'] = nines
-        self.assertEqual(nines, self._refs.read_loose_ref('refs/heads/symbolic'))
-        self._refs.set_symbolic_ref('refs/heads/symbolic', 'refs/heads/master')
-        self.assertEqual('ref: refs/heads/master',
-                         self._refs.read_loose_ref('refs/heads/symbolic'))
-        self.assertEqual('42d06bd4b77fed026b154d16493e5deab78f02ec',
-                         self._refs['refs/heads/symbolic'])
-
-    def test_check_refname(self):
-        self._refs._check_refname('HEAD')
-        self._refs._check_refname('refs/stash')
-        self._refs._check_refname('refs/heads/foo')
-
-        self.assertRaises(errors.RefFormatError, self._refs._check_refname,
-                          'refs')
-        self.assertRaises(errors.RefFormatError, self._refs._check_refname,
-                          'notrefs/foo')
-
-    def test_contains(self):
-        self.assertTrue('refs/heads/master' in self._refs)
-        self.assertFalse('refs/heads/bar' in self._refs)
-
-    def test_delitem(self):
-        self.assertEqual('42d06bd4b77fed026b154d16493e5deab78f02ec',
-                          self._refs['refs/heads/master'])
-        del self._refs['refs/heads/master']
-        self.assertRaises(KeyError, lambda: self._refs['refs/heads/master'])
-
-    def test_remove_if_equals(self):
-        self.assertFalse(self._refs.remove_if_equals('HEAD', 'c0ffee'))
-        self.assertEqual('42d06bd4b77fed026b154d16493e5deab78f02ec',
-                         self._refs['HEAD'])
-        self.assertTrue(self._refs.remove_if_equals(
-          'refs/tags/refs-0.2', '3ec9c43c84ff242e3ef4a9fc5bc111fd780a76a8'))
-        self.assertFalse('refs/tags/refs-0.2' in self._refs)
-
-
-class DictRefsContainerTests(RefsContainerTests, TestCase):
-
-    def setUp(self):
-        TestCase.setUp(self)
-        self._refs = DictRefsContainer(dict(_TEST_REFS))
-
-    def test_invalid_refname(self):
-        # FIXME: Move this test into RefsContainerTests, but requires
-        # some way of injecting invalid refs.
-        self._refs._refs["refs/stash"] = "00" * 20
-        expected_refs = dict(_TEST_REFS)
-        expected_refs["refs/stash"] = "00" * 20
-        self.assertEqual(expected_refs, self._refs.as_dict())
-
-
-class DiskRefsContainerTests(RefsContainerTests, TestCase):
-
-    def setUp(self):
-        TestCase.setUp(self)
-        self._repo = open_repo('refs.git')
-        self._refs = self._repo.refs
-
-    def tearDown(self):
-        tear_down_repo(self._repo)
-        TestCase.tearDown(self)
-
-    def test_get_packed_refs(self):
-        self.assertEqual({
-          'refs/heads/packed': '42d06bd4b77fed026b154d16493e5deab78f02ec',
-          'refs/tags/refs-0.1': 'df6800012397fb85c56e7418dd4eb9405dee075c',
-          }, self._refs.get_packed_refs())
-
-    def test_get_peeled_not_packed(self):
-        # not packed
-        self.assertEqual(None, self._refs.get_peeled('refs/tags/refs-0.2'))
-        self.assertEqual('3ec9c43c84ff242e3ef4a9fc5bc111fd780a76a8',
-                         self._refs['refs/tags/refs-0.2'])
-
-        # packed, known not peelable
-        self.assertEqual(self._refs['refs/heads/packed'],
-                         self._refs.get_peeled('refs/heads/packed'))
-
-        # packed, peeled
-        self.assertEqual('42d06bd4b77fed026b154d16493e5deab78f02ec',
-                         self._refs.get_peeled('refs/tags/refs-0.1'))
-
-    def test_setitem(self):
-        RefsContainerTests.test_setitem(self)
-        f = open(os.path.join(self._refs.path, 'refs', 'some', 'ref'), 'rb')
-        self.assertEqual('42d06bd4b77fed026b154d16493e5deab78f02ec',
-                          f.read()[:40])
-        f.close()
-
-    def test_setitem_symbolic(self):
-        ones = '1' * 40
-        self._refs['HEAD'] = ones
-        self.assertEqual(ones, self._refs['HEAD'])
-
-        # ensure HEAD was not modified
-        f = open(os.path.join(self._refs.path, 'HEAD'), 'rb')
-        self.assertEqual('ref: refs/heads/master', iter(f).next().rstrip('\n'))
-        f.close()
-
-        # ensure the symbolic link was written through
-        f = open(os.path.join(self._refs.path, 'refs', 'heads', 'master'), 'rb')
-        self.assertEqual(ones, f.read()[:40])
-        f.close()
-
-    def test_set_if_equals(self):
-        RefsContainerTests.test_set_if_equals(self)
-
-        # ensure symref was followed
-        self.assertEqual('9' * 40, self._refs['refs/heads/master'])
-
-        # ensure lockfile was deleted
-        self.assertFalse(os.path.exists(
-          os.path.join(self._refs.path, 'refs', 'heads', 'master.lock')))
-        self.assertFalse(os.path.exists(
-          os.path.join(self._refs.path, 'HEAD.lock')))
-
-    def test_add_if_new_packed(self):
-        # don't overwrite packed ref
-        self.assertFalse(self._refs.add_if_new('refs/tags/refs-0.1', '9' * 40))
-        self.assertEqual('df6800012397fb85c56e7418dd4eb9405dee075c',
-                         self._refs['refs/tags/refs-0.1'])
-
-    def test_add_if_new_symbolic(self):
-        # Use an empty repo instead of the default.
-        tear_down_repo(self._repo)
-        repo_dir = os.path.join(tempfile.mkdtemp(), 'test')
-        os.makedirs(repo_dir)
-        self._repo = Repo.init(repo_dir)
-        refs = self._repo.refs
-
-        nines = '9' * 40
-        self.assertEqual('ref: refs/heads/master', refs.read_ref('HEAD'))
-        self.assertFalse('refs/heads/master' in refs)
-        self.assertTrue(refs.add_if_new('HEAD', nines))
-        self.assertEqual('ref: refs/heads/master', refs.read_ref('HEAD'))
-        self.assertEqual(nines, refs['HEAD'])
-        self.assertEqual(nines, refs['refs/heads/master'])
-        self.assertFalse(refs.add_if_new('HEAD', '1' * 40))
-        self.assertEqual(nines, refs['HEAD'])
-        self.assertEqual(nines, refs['refs/heads/master'])
-
-    def test_follow(self):
-        self.assertEqual(
-          ('refs/heads/master', '42d06bd4b77fed026b154d16493e5deab78f02ec'),
-          self._refs._follow('HEAD'))
-        self.assertEqual(
-          ('refs/heads/master', '42d06bd4b77fed026b154d16493e5deab78f02ec'),
-          self._refs._follow('refs/heads/master'))
-        self.assertRaises(KeyError, self._refs._follow, 'refs/heads/loop')
-
-    def test_delitem(self):
-        RefsContainerTests.test_delitem(self)
-        ref_file = os.path.join(self._refs.path, 'refs', 'heads', 'master')
-        self.assertFalse(os.path.exists(ref_file))
-        self.assertFalse('refs/heads/master' in self._refs.get_packed_refs())
-
-    def test_delitem_symbolic(self):
-        self.assertEqual('ref: refs/heads/master',
-                          self._refs.read_loose_ref('HEAD'))
-        del self._refs['HEAD']
-        self.assertRaises(KeyError, lambda: self._refs['HEAD'])
-        self.assertEqual('42d06bd4b77fed026b154d16493e5deab78f02ec',
-                         self._refs['refs/heads/master'])
-        self.assertFalse(os.path.exists(os.path.join(self._refs.path, 'HEAD')))
-
-    def test_remove_if_equals_symref(self):
-        # HEAD is a symref, so shouldn't equal its dereferenced value
-        self.assertFalse(self._refs.remove_if_equals(
-          'HEAD', '42d06bd4b77fed026b154d16493e5deab78f02ec'))
-        self.assertTrue(self._refs.remove_if_equals(
-          'refs/heads/master', '42d06bd4b77fed026b154d16493e5deab78f02ec'))
-        self.assertRaises(KeyError, lambda: self._refs['refs/heads/master'])
-
-        # HEAD is now a broken symref
-        self.assertRaises(KeyError, lambda: self._refs['HEAD'])
-        self.assertEqual('ref: refs/heads/master',
-                          self._refs.read_loose_ref('HEAD'))
-
-        self.assertFalse(os.path.exists(
-            os.path.join(self._refs.path, 'refs', 'heads', 'master.lock')))
-        self.assertFalse(os.path.exists(
-            os.path.join(self._refs.path, 'HEAD.lock')))
-
-    def test_remove_packed_without_peeled(self):
-        refs_file = os.path.join(self._repo.path, 'packed-refs')
-        f = GitFile(refs_file)
-        refs_data = f.read()
-        f.close()
-        f = GitFile(refs_file, 'wb')
-        f.write('\n'.join(l for l in refs_data.split('\n')
-                          if not l or l[0] not in '#^'))
-        f.close()
-        self._repo = Repo(self._repo.path)
-        refs = self._repo.refs
-        self.assertTrue(refs.remove_if_equals(
-          'refs/heads/packed', '42d06bd4b77fed026b154d16493e5deab78f02ec'))
-
-    def test_remove_if_equals_packed(self):
-        # test removing ref that is only packed
-        self.assertEqual('df6800012397fb85c56e7418dd4eb9405dee075c',
-                         self._refs['refs/tags/refs-0.1'])
-        self.assertTrue(
-          self._refs.remove_if_equals('refs/tags/refs-0.1',
-          'df6800012397fb85c56e7418dd4eb9405dee075c'))
-        self.assertRaises(KeyError, lambda: self._refs['refs/tags/refs-0.1'])
-
-    def test_read_ref(self):
-        self.assertEqual('ref: refs/heads/master', self._refs.read_ref("HEAD"))
-        self.assertEqual('42d06bd4b77fed026b154d16493e5deab78f02ec',
-            self._refs.read_ref("refs/heads/packed"))
-        self.assertEqual(None,
-            self._refs.read_ref("nonexistant"))
-
-
-_TEST_REFS_SERIALIZED = (
-'42d06bd4b77fed026b154d16493e5deab78f02ec\trefs/heads/40-char-ref-aaaaaaaaaaaaaaaaaa\n'
-'42d06bd4b77fed026b154d16493e5deab78f02ec\trefs/heads/master\n'
-'42d06bd4b77fed026b154d16493e5deab78f02ec\trefs/heads/packed\n'
-'df6800012397fb85c56e7418dd4eb9405dee075c\trefs/tags/refs-0.1\n'
-'3ec9c43c84ff242e3ef4a9fc5bc111fd780a76a8\trefs/tags/refs-0.2\n')
-
-
-class InfoRefsContainerTests(TestCase):
-
-    def test_invalid_refname(self):
-        text = _TEST_REFS_SERIALIZED + '00' * 20 + '\trefs/stash\n'
-        refs = InfoRefsContainer(StringIO(text))
-        expected_refs = dict(_TEST_REFS)
-        del expected_refs['HEAD']
-        expected_refs["refs/stash"] = "00" * 20
-        self.assertEqual(expected_refs, refs.as_dict())
-
-    def test_keys(self):
-        refs = InfoRefsContainer(StringIO(_TEST_REFS_SERIALIZED))
-        actual_keys = set(refs.keys())
-        self.assertEqual(set(refs.allkeys()), actual_keys)
-        # ignore the symref loop if it exists
-        actual_keys.discard('refs/heads/loop')
-        expected_refs = dict(_TEST_REFS)
-        del expected_refs['HEAD']
-        self.assertEqual(set(expected_refs.iterkeys()), actual_keys)
-
-        actual_keys = refs.keys('refs/heads')
-        actual_keys.discard('loop')
-        self.assertEqual(
-            ['40-char-ref-aaaaaaaaaaaaaaaaaa', 'master', 'packed'],
-            sorted(actual_keys))
-        self.assertEqual(['refs-0.1', 'refs-0.2'],
-                         sorted(refs.keys('refs/tags')))
-
-    def test_as_dict(self):
-        refs = InfoRefsContainer(StringIO(_TEST_REFS_SERIALIZED))
-        # refs/heads/loop does not show up even if it exists
-        expected_refs = dict(_TEST_REFS)
-        del expected_refs['HEAD']
-        self.assertEqual(expected_refs, refs.as_dict())
-
-    def test_contains(self):
-        refs = InfoRefsContainer(StringIO(_TEST_REFS_SERIALIZED))
-        self.assertTrue('refs/heads/master' in refs)
-        self.assertFalse('refs/heads/bar' in refs)
-
-    def test_get_peeled(self):
-        refs = InfoRefsContainer(StringIO(_TEST_REFS_SERIALIZED))
-        # refs/heads/loop does not show up even if it exists
-        self.assertEqual(
-            _TEST_REFS['refs/heads/master'],
-            refs.get_peeled('refs/heads/master'))

+ 127 - 0
dulwich/tests/test_server.py

@@ -27,6 +27,13 @@ from dulwich.errors import (
     NotGitRepository,
     UnexpectedCommandError,
     )
+from dulwich.objects import (
+    Commit,
+    Tag,
+    )
+from dulwich.object_store import (
+    MemoryObjectStore,
+    )
 from dulwich.repo import (
     MemoryRepo,
     Repo,
@@ -40,6 +47,7 @@ from dulwich.server import (
     MultiAckDetailedGraphWalkerImpl,
     _split_proto_line,
     serve_command,
+    _find_shallow,
     ProtocolGraphWalker,
     ReceivePackHandler,
     SingleAckGraphWalkerImpl,
@@ -49,6 +57,7 @@ from dulwich.server import (
 from dulwich.tests import TestCase
 from dulwich.tests.utils import (
     make_commit,
+    make_object,
     )
 from dulwich.protocol import (
     ZERO_SHA,
@@ -197,6 +206,81 @@ class UploadPackHandlerTestCase(TestCase):
         self.assertEqual({}, self._handler.get_tagged(refs, repo=self._repo))
 
 
+class FindShallowTests(TestCase):
+
+    def setUp(self):
+        self._store = MemoryObjectStore()
+
+    def make_commit(self, **attrs):
+        commit = make_commit(**attrs)
+        self._store.add_object(commit)
+        return commit
+
+    def make_linear_commits(self, n, message=''):
+        commits = []
+        parents = []
+        for _ in xrange(n):
+            commits.append(self.make_commit(parents=parents, message=message))
+            parents = [commits[-1].id]
+        return commits
+
+    def assertSameElements(self, expected, actual):
+        self.assertEqual(set(expected), set(actual))
+
+    def test_linear(self):
+        c1, c2, c3 = self.make_linear_commits(3)
+
+        self.assertEqual((set([c3.id]), set([])),
+                         _find_shallow(self._store, [c3.id], 0))
+        self.assertEqual((set([c2.id]), set([c3.id])),
+                         _find_shallow(self._store, [c3.id], 1))
+        self.assertEqual((set([c1.id]), set([c2.id, c3.id])),
+                         _find_shallow(self._store, [c3.id], 2))
+        self.assertEqual((set([]), set([c1.id, c2.id, c3.id])),
+                         _find_shallow(self._store, [c3.id], 3))
+
+    def test_multiple_independent(self):
+        a = self.make_linear_commits(2, message='a')
+        b = self.make_linear_commits(2, message='b')
+        c = self.make_linear_commits(2, message='c')
+        heads = [a[1].id, b[1].id, c[1].id]
+
+        self.assertEqual((set([a[0].id, b[0].id, c[0].id]), set(heads)),
+                         _find_shallow(self._store, heads, 1))
+
+    def test_multiple_overlapping(self):
+        # Create the following commit tree:
+        # 1--2
+        #  \
+        #   3--4
+        c1, c2 = self.make_linear_commits(2)
+        c3 = self.make_commit(parents=[c1.id])
+        c4 = self.make_commit(parents=[c3.id])
+
+        # 1 is shallow along the path from 4, but not along the path from 2.
+        self.assertEqual((set([c1.id]), set([c1.id, c2.id, c3.id, c4.id])),
+                         _find_shallow(self._store, [c2.id, c4.id], 2))
+
+    def test_merge(self):
+        c1 = self.make_commit()
+        c2 = self.make_commit()
+        c3 = self.make_commit(parents=[c1.id, c2.id])
+
+        self.assertEqual((set([c1.id, c2.id]), set([c3.id])),
+                         _find_shallow(self._store, [c3.id], 1))
+
+    def test_tag(self):
+        c1, c2 = self.make_linear_commits(2)
+        tag = make_object(Tag, name='tag', message='',
+                          tagger='Tagger <test@example.com>',
+                          tag_time=12345, tag_timezone=0,
+                          object=(Commit, c2.id))
+        self._store.add_object(tag)
+
+        self.assertEqual((set([c1.id]), set([c2.id])),
+                         _find_shallow(self._store, [tag.id], 1))
+
+
 class TestUploadPackHandler(UploadPackHandler):
     @classmethod
     def required_capabilities(self):
@@ -302,6 +386,10 @@ class ProtocolGraphWalkerTestCase(TestCase):
         self._repo.refs._update(heads)
         self.assertEqual([ONE, TWO], self._walker.determine_wants(heads))
 
+        self._walker.advertise_refs = True
+        self.assertEqual([], self._walker.determine_wants(heads))
+        self._walker.advertise_refs = False
+
         self._walker.proto.set_output(['want %s multi_ack' % FOUR])
         self.assertRaises(GitProtocolError, self._walker.determine_wants, heads)
 
@@ -350,6 +438,45 @@ class ProtocolGraphWalkerTestCase(TestCase):
 
     # TODO: test commit time cutoff
 
+    def _handle_shallow_request(self, lines, heads):
+        self._walker.proto.set_output(lines)
+        self._walker._handle_shallow_request(heads)
+
+    def assertReceived(self, expected):
+        self.assertEquals(
+          expected, list(iter(self._walker.proto.get_received_line, None)))
+
+    def test_handle_shallow_request_no_client_shallows(self):
+        self._handle_shallow_request(['deepen 1\n'], [FOUR, FIVE])
+        self.assertEquals(set([TWO, THREE]), self._walker.shallow)
+        self.assertReceived([
+          'shallow %s' % TWO,
+          'shallow %s' % THREE,
+          ])
+
+    def test_handle_shallow_request_no_new_shallows(self):
+        lines = [
+          'shallow %s\n' % TWO,
+          'shallow %s\n' % THREE,
+          'deepen 1\n',
+          ]
+        self._handle_shallow_request(lines, [FOUR, FIVE])
+        self.assertEquals(set([TWO, THREE]), self._walker.shallow)
+        self.assertReceived([])
+
+    def test_handle_shallow_request_unshallows(self):
+        lines = [
+          'shallow %s\n' % TWO,
+          'deepen 2\n',
+          ]
+        self._handle_shallow_request(lines, [FOUR, FIVE])
+        self.assertEquals(set([ONE]), self._walker.shallow)
+        self.assertReceived([
+          'shallow %s' % ONE,
+          'unshallow %s' % TWO,
+          # THREE is unshallow but was not shallow in the client
+          ])
+
 
 class TestProtocolGraphWalker(object):
 

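The shallow tests above pin down two behaviours: how a depth limit splits commits into "shallow" and "not shallow" sets, and which `shallow`/`unshallow` lines the server sends back. The following is a minimal sketch that reproduces those expectations, not dulwich's actual `_find_shallow` or `_handle_shallow_request`. The function names, the plain `parents` dict standing in for an object store, and the ONE..FIVE graph shape mentioned in the usage note are assumptions made for illustration.

```python
def find_shallow(parents, heads, depth):
    """Return (shallow, not_shallow) sets of commit ids.

    `parents` maps a commit id to a list of parent ids. It is a
    hypothetical stand-in for an object store, not dulwich's API.
    """
    # Stack of (commit id, distance from the head it was reached from).
    todo = [(head, 0) for head in heads]
    shallow = set()
    not_shallow = set()
    while todo:
        sha, cur_depth = todo.pop()
        if cur_depth < depth:
            # Within the depth limit: keep it and continue to its parents.
            not_shallow.add(sha)
            todo.extend((p, cur_depth + 1) for p in parents.get(sha, ()))
        else:
            # At the limit along this path: history is cut here.
            shallow.add(sha)
    return shallow, not_shallow


def shallow_update_lines(parents, heads, depth, client_shallow):
    """Return (effective_shallow, protocol_lines) for a deepen request.

    Hypothetical helper: a commit reachable within the limit along any
    path is not treated as shallow, matching the overlapping-paths test.
    """
    client_shallow = set(client_shallow)
    shallow, not_shallow = find_shallow(parents, heads, depth)
    effective = shallow - not_shallow
    # Only report shallows the client does not already know about.
    new_shallow = effective - client_shallow
    # Commits the client holds as shallow that are now fully reachable.
    unshallow = not_shallow & client_shallow
    lines = ['shallow %s' % sha for sha in sorted(new_shallow)]
    lines += ['unshallow %s' % sha for sha in sorted(unshallow)]
    return effective, lines
```

For example, assuming the graph is ONE with children TWO and THREE, which are in turn the parents of FOUR and FIVE, calling `shallow_update_lines(graph, [FOUR, FIVE], 2, {TWO})` yields a `shallow` line for ONE and an `unshallow` line for TWO. This mirrors `test_handle_shallow_request_unshallows`.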
+ 2 - 1
dulwich/tests/test_walk.py

@@ -18,9 +18,10 @@
 
 """Tests for commit walking functionality."""
 
-from dulwich._compat import (
+from itertools import (
     permutations,
     )
+
 from dulwich.diff_tree import (
     CHANGE_ADD,
     CHANGE_MODIFY,

+ 6 - 4
dulwich/tests/test_web.py

@@ -254,12 +254,14 @@ class DumbHandlersTestCase(WebTestCase):
         self.assertFalse(self._req.cached)
 
     def test_get_info_packs(self):
-        class TestPack(object):
+        class TestPackData(object):
+
             def __init__(self, sha):
-                self._sha = sha
+                self.filename = "pack-%s.pack" % sha
 
-            def name(self):
-                return self._sha
+        class TestPack(object):
+            def __init__(self, sha):
+                self.data = TestPackData(sha)
 
         packs = [TestPack(str(i) * 40) for i in xrange(1, 4)]
 

+ 1 - 7
dulwich/walk.py

@@ -19,18 +19,12 @@
 """General implementation of walking commits and their contents."""
 
 
-try:
-    from collections import defaultdict
-except ImportError:
-    from _compat import defaultdict
+from collections import defaultdict
 
 import collections
 import heapq
 import itertools
 
-from dulwich._compat import (
-    all,
-    )
 from dulwich.diff_tree import (
     RENAME_CHANGE_TYPES,
     tree_changes,

+ 1 - 4
dulwich/web.py

@@ -25,11 +25,8 @@ import os
 import re
 import sys
 import time
+from urlparse import parse_qs
 
-try:
-    from urlparse import parse_qs
-except ImportError:
-    from dulwich._compat import parse_qs
 from dulwich import log_utils
 from dulwich.protocol import (
     ReceivableProtocol,

+ 21 - 0
examples/latest_change.py

@@ -0,0 +1,21 @@
+#!/usr/bin/python
+# Example printing the last author of a specified file
+
+import sys
+import time
+from dulwich.repo import Repo
+
+if len(sys.argv) < 2:
+    print "usage: %s filename" % (sys.argv[0], )
+    sys.exit(1)
+
+r = Repo(".")
+
+w = r.get_walker(paths=[sys.argv[1]], max_entries=1)
+try:
+    c = iter(w).next().commit
+except StopIteration:
+    print "No file %s anywhere in history." % sys.argv[1]
+else:
+    print "%s was last changed by %s at %s (commit %s)" % (
+        sys.argv[1], c.author, time.ctime(c.author_time), c.id)

+ 6 - 6
setup.py

@@ -10,7 +10,7 @@ except ImportError:
     has_setuptools = False
 from distutils.core import Distribution
 
-dulwich_version_string = '0.9.4'
+dulwich_version_string = '0.9.5'
 
 include_dirs = []
 # Windows MSVC support
@@ -57,19 +57,19 @@ setup(name='dulwich',
       description='Python Git Library',
       keywords='git',
       version=dulwich_version_string,
-      url='http://samba.org/~jelmer/dulwich',
+      url='https://samba.org/~jelmer/dulwich',
       license='GPLv2 or later',
       author='Jelmer Vernooij',
       author_email='jelmer@samba.org',
       long_description="""
-      Simple Python implementation of the Git file formats and
-      protocols.
+      Python implementation of the Git file formats and protocols,
+      without the need to have git installed.
 
       All functionality is available in pure Python. Optional
       C extensions can be built for improved performance.
 
-      Dulwich takes its name from the area in London where the friendly
-      Mr. and Mrs. Git once attended a cocktail party.
+      The project is named after the part of London that Mr. and Mrs. Git live in
+      in the particular Monty Python sketch.
       """,
       packages=['dulwich', 'dulwich.tests', 'dulwich.tests.compat'],
       scripts=['bin/dulwich', 'bin/dul-daemon', 'bin/dul-web', 'bin/dul-receive-pack', 'bin/dul-upload-pack'],