
Merge tag 'dulwich-0.9.0' into debian

Jelmer Vernooij, 12 years ago
parent
commit
80fc304b00

+ 1 - 0
AUTHORS

@@ -3,6 +3,7 @@ James Westby <jw+debian@jameswestby.net>
 John Carr <john.carr@unrouted.co.uk>
 Dave Borowitz <dborowitz@google.com>
 Chris Eberle <eberle1080@gmail.com>
+"milki" <milki@rescomp.berkeley.edu>
 
 Hervé Cauwelier <herve@itaapy.com> wrote the original tutorial.
 

+ 9 - 7
HACKING

@@ -1,3 +1,12 @@
+All functionality should be available in pure Python. Optional C
+implementations may be written for performance reasons, but should never
+replace the Python implementation. The C implementations should follow the
+kernel/git coding style.
+
+Where possible include updates to NEWS along with your improvements.
+
+New functionality and bug fixes should be accompanied by matching unit tests.
+
 Coding style
 ------------
 Where possible, please follow PEP8 with regard to coding style.
@@ -5,17 +14,10 @@ Where possible, please follow PEP8 with regard to coding style.
 Furthermore, triple-quotes should always be """, single quotes are ' unless
 using " would result in less escaping within the string.
 
-All functionality should be available in pure Python. Optional C
-implementations may be written for performance reasons, but should never
-replace the Python implementation. The C implementations should follow the
-kernel/git coding style.
-
 Public methods, functions and classes should all have doc strings. Please use
 epydoc style docstrings to document parameters and return values.
 You can generate the documentation by running "make doc".
 
-Where possible please include updates to NEWS along with your improvements.
-
 Running the tests
 -----------------
 To run the testsuite, you should be able to simply run "make check". This
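
The first HACKING rule above (everything available in pure Python, C only as an optional accelerator) is usually realised with an import fallback. The following is a minimal sketch of that pattern, not code from dulwich: the extension module name `_fast_hex` and the helper `_py_sha_to_hex` are hypothetical, and the docstring uses the epydoc field style the guidelines ask for.

```python
import binascii


def _py_sha_to_hex(sha):
    """Convert a binary SHA to its hexadecimal form.

    :param sha: Binary SHA as a bytes object
    :return: Hexadecimal representation as bytes
    """
    return binascii.hexlify(sha)


try:
    # Hypothetical optional C implementation; the module name is illustrative.
    from _fast_hex import sha_to_hex
except ImportError:
    # The pure-Python version always remains available as the fallback.
    sha_to_hex = _py_sha_to_hex
```

Callers only ever use `sha_to_hex`, so the C module can be absent without changing behaviour.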

+ 55 - 0
NEWS

@@ -1,3 +1,58 @@
+0.9.0	2013-05-31
+
+ BUG FIXES
+
+  * Push efficiency - report missing objects only. (#562676, Artem Tikhomirov)
+
+  * Use indentation consistent with C Git in config files.
+    (#1031356, Curt Moore, Jelmer Vernooij)
+
+  * Recognize and skip binary files in diff function.
+    (Takeshi Kanemoto)
+
+  * Fix handling of relative paths in dulwich.client.get_transport_and_path.
+    (Brian Visel, #1169368)
+
+  * Preserve ordering of entries in configuration.
+    (Benjamin Pollack)
+
+  * Support ~ expansion in SSH client paths. (milki, #1083439)
+
+  * Support relative paths in alternate paths.
+    (milki, Michel Lespinasse, #1175007)
+
+  * Log all error messages from wsgiref server to the logging module. This
+    makes the test suite quiet again. (Gary van der Merwe)
+
+  * Support passing None for empty tree in changes_from_tree.
+    (Kevin Watters)
+
+  * Support fetching empty repository in client. (milki, #1060462)
+
+ IMPROVEMENTS:
+
+  * Add optional honor_filemode flag to build_index_from_tree.
+    (Mark Mikofski)
+
+  * Support core/filemode setting when building trees. (Jelmer Vernooij)
+
+  * Add chapter on tags in tutorial. (Ryan Faulkner)
+
+ FEATURES
+
+  * Add support for mergetags. (milki, #963525)
+
+  * Add support for posix shell hooks. (milki)
+
+0.8.7	2012-11-27
+
+ BUG FIXES
+
+  * Fix use of alternates in ``DiskObjectStore``.{__contains__,__iter__}.
+    (Dmitriy)
+
+  * Fix compatibility with Python 2.4. (David Carr)
+
 0.8.6	2012-11-09
 
  API CHANGES

+ 1 - 0
docs/tutorial/index.txt

@@ -11,5 +11,6 @@ Tutorial
    repo
    object-store
    remote
+   tag
    conclusion
 

+ 57 - 0
docs/tutorial/tag.txt

@@ -0,0 +1,57 @@
+.. _tutorial-tag:
+
+Tagging
+=======
+
+This tutorial will demonstrate how to add a tag to a commit via dulwich.
+
+First let's initialize the repository:
+
+    >>> from dulwich.repo import Repo
+    >>> _repo = Repo("myrepo", mkdir=True)
+
+Next we build the commit object and add it to the object store:
+
+    >>> from dulwich.objects import Blob, Tree, Commit, parse_timezone
+    >>> permissions = 0100644
+    >>> author = "John Smith"
+    >>> blob = Blob.from_string("empty")
+    >>> tree = Tree()
+    >>> tree.add(tag, permissions, blob.id)
+    >>> commit = Commit()
+    >>> commit.tree = tree.id
+    >>> commit.author = commit.committer = author
+    >>> commit.commit_time = commit.author_time = int(time())
+    >>> tz = parse_timezone('-0200')[0]
+    >>> commit.commit_timezone = commit.author_timezone = tz
+    >>> commit.encoding = "UTF-8"
+    >>> commit.message = 'Tagging repo: ' + message
+
+Add objects to the repo store instance:
+
+    >>> object_store = _repo.object_store
+    >>> object_store.add_object(blob)
+    >>> object_store.add_object(tree)
+    >>> object_store.add_object(commit)
+    >>> master_branch = 'master'
+    >>> _repo.refs['refs/heads/' + master_branch] = commit.id
+
+Finally, add the tag to the repo:
+
+    >>> _repo['refs/tags/v0.1'] = commit.id
+
+Alternatively, we can use the tag object if we'd like to annotate the tag:
+
+    >>> from dulwich.objects import Blob, Tree, Commit, parse_timezone, Tag
+    >>> tag_message = "Tag Annotation"
+    >>> tag = Tag()
+    >>> tag.tagger = author
+    >>> tag.message = tag_message
+    >>> tag.name = "v0.1"
+    >>> tag.object = (Commit, commit.id)
+    >>> tag.tag_time = commit.author_time
+    >>> tag.tag_timezone = tz
+    >>> object_store.add_object(tag)
+    >>> _repo['refs/tags/' + tag.name] = tag.id
+
+

+ 1 - 1
dulwich/__init__.py

@@ -21,4 +21,4 @@
 
 """Python implementation of the Git file formats and protocols."""
 
-__version__ = (0, 8, 6)
+__version__ = (0, 9, 0)

+ 260 - 0
dulwich/_compat.py

@@ -268,3 +268,263 @@ except ImportError:
             pass
 
         return result
+
+
+# Backport of OrderedDict() class that runs on Python 2.4, 2.5, 2.6, 2.7 and pypy.
+# Passes Python2.7's test suite and incorporates all the latest updates.
+# Copyright (C) Raymond Hettinger, MIT license
+
+try:
+    from thread import get_ident as _get_ident
+except ImportError:
+    from dummy_thread import get_ident as _get_ident
+
+try:
+    from _abcoll import KeysView, ValuesView, ItemsView
+except ImportError:
+    pass
+
+class OrderedDict(dict):
+    'Dictionary that remembers insertion order'
+    # An inherited dict maps keys to values.
+    # The inherited dict provides __getitem__, __len__, __contains__, and get.
+    # The remaining methods are order-aware.
+    # Big-O running times for all methods are the same as for regular dictionaries.
+
+    # The internal self.__map dictionary maps keys to links in a doubly linked list.
+    # The circular doubly linked list starts and ends with a sentinel element.
+    # The sentinel element never gets deleted (this simplifies the algorithm).
+    # Each link is stored as a list of length three:  [PREV, NEXT, KEY].
+
+    def __init__(self, *args, **kwds):
+        '''Initialize an ordered dictionary.  Signature is the same as for
+        regular dictionaries, but keyword arguments are not recommended
+        because their insertion order is arbitrary.
+
+        '''
+        if len(args) > 1:
+            raise TypeError('expected at most 1 arguments, got %d' % len(args))
+        try:
+            self.__root
+        except AttributeError:
+            self.__root = root = []                     # sentinel node
+            root[:] = [root, root, None]
+            self.__map = {}
+        self.__update(*args, **kwds)
+
+    def __setitem__(self, key, value, dict_setitem=dict.__setitem__):
+        'od.__setitem__(i, y) <==> od[i]=y'
+        # Setting a new item creates a new link which goes at the end of the linked
+        # list, and the inherited dictionary is updated with the new key/value pair.
+        if key not in self:
+            root = self.__root
+            last = root[0]
+            last[1] = root[0] = self.__map[key] = [last, root, key]
+        dict_setitem(self, key, value)
+
+    def __delitem__(self, key, dict_delitem=dict.__delitem__):
+        'od.__delitem__(y) <==> del od[y]'
+        # Deleting an existing item uses self.__map to find the link which is
+        # then removed by updating the links in the predecessor and successor nodes.
+        dict_delitem(self, key)
+        link_prev, link_next, key = self.__map.pop(key)
+        link_prev[1] = link_next
+        link_next[0] = link_prev
+
+    def __iter__(self):
+        'od.__iter__() <==> iter(od)'
+        root = self.__root
+        curr = root[1]
+        while curr is not root:
+            yield curr[2]
+            curr = curr[1]
+
+    def __reversed__(self):
+        'od.__reversed__() <==> reversed(od)'
+        root = self.__root
+        curr = root[0]
+        while curr is not root:
+            yield curr[2]
+            curr = curr[0]
+
+    def clear(self):
+        'od.clear() -> None.  Remove all items from od.'
+        try:
+            for node in self.__map.itervalues():
+                del node[:]
+            root = self.__root
+            root[:] = [root, root, None]
+            self.__map.clear()
+        except AttributeError:
+            pass
+        dict.clear(self)
+
+    def popitem(self, last=True):
+        """od.popitem() -> (k, v), return and remove a (key, value) pair.
+        Pairs are returned in LIFO order if last is true or FIFO order if false.
+
+        """
+        if not self:
+            raise KeyError('dictionary is empty')
+        root = self.__root
+        if last:
+            link = root[0]
+            link_prev = link[0]
+            link_prev[1] = root
+            root[0] = link_prev
+        else:
+            link = root[1]
+            link_next = link[1]
+            root[1] = link_next
+            link_next[0] = root
+        key = link[2]
+        del self.__map[key]
+        value = dict.pop(self, key)
+        return key, value
+
+    # -- the following methods do not depend on the internal structure --
+
+    def keys(self):
+        """'od.keys() -> list of keys in od"""
+        return list(self)
+
+    def values(self):
+        """od.values() -> list of values in od"""
+        return [self[key] for key in self]
+
+    def items(self):
+        """od.items() -> list of (key, value) pairs in od"""
+        return [(key, self[key]) for key in self]
+
+    def iterkeys(self):
+        """od.iterkeys() -> an iterator over the keys in od"""
+        return iter(self)
+
+    def itervalues(self):
+        """od.itervalues -> an iterator over the values in od"""
+        for k in self:
+            yield self[k]
+
+    def iteritems(self):
+        """od.iteritems -> an iterator over the (key, value) items in od"""
+        for k in self:
+            yield (k, self[k])
+
+    def update(*args, **kwds):
+        """od.update(E, **F) -> None.  Update od from dict/iterable E and F.
+
+        If E is a dict instance, does:           for k in E: od[k] = E[k]
+        If E has a .keys() method, does:         for k in E.keys(): od[k] = E[k]
+        Or if E is an iterable of items, does:   for k, v in E: od[k] = v
+        In either case, this is followed by:     for k, v in F.items(): od[k] = v
+
+        """
+        if len(args) > 2:
+            raise TypeError('update() takes at most 2 positional '
+                            'arguments (%d given)' % (len(args),))
+        elif not args:
+            raise TypeError('update() takes at least 1 argument (0 given)')
+        self = args[0]
+        # Make progressively weaker assumptions about "other"
+        other = ()
+        if len(args) == 2:
+            other = args[1]
+        if isinstance(other, dict):
+            for key in other:
+                self[key] = other[key]
+        elif hasattr(other, 'keys'):
+            for key in other.keys():
+                self[key] = other[key]
+        else:
+            for key, value in other:
+                self[key] = value
+        for key, value in kwds.items():
+            self[key] = value
+
+    __update = update  # let subclasses override update without breaking __init__
+
+    __marker = object()
+
+    def pop(self, key, default=__marker):
+        """od.pop(k[,d]) -> v, remove specified key and return the corresponding value.
+        If key is not found, d is returned if given, otherwise KeyError is raised.
+
+        """
+        if key in self:
+            result = self[key]
+            del self[key]
+            return result
+        if default is self.__marker:
+            raise KeyError(key)
+        return default
+
+    def setdefault(self, key, default=None):
+        'od.setdefault(k[,d]) -> od.get(k,d), also set od[k]=d if k not in od'
+        if key in self:
+            return self[key]
+        self[key] = default
+        return default
+
+    def __repr__(self, _repr_running={}):
+        'od.__repr__() <==> repr(od)'
+        call_key = id(self), _get_ident()
+        if call_key in _repr_running:
+            return '...'
+        _repr_running[call_key] = 1
+        try:
+            if not self:
+                return '%s()' % (self.__class__.__name__,)
+            return '%s(%r)' % (self.__class__.__name__, self.items())
+        finally:
+            del _repr_running[call_key]
+
+    def __reduce__(self):
+        'Return state information for pickling'
+        items = [[k, self[k]] for k in self]
+        inst_dict = vars(self).copy()
+        for k in vars(OrderedDict()):
+            inst_dict.pop(k, None)
+        if inst_dict:
+            return (self.__class__, (items,), inst_dict)
+        return self.__class__, (items,)
+
+    def copy(self):
+        'od.copy() -> a shallow copy of od'
+        return self.__class__(self)
+
+    @classmethod
+    def fromkeys(cls, iterable, value=None):
+        '''OD.fromkeys(S[, v]) -> New ordered dictionary with keys from S
+        and values equal to v (which defaults to None).
+
+        '''
+        d = cls()
+        for key in iterable:
+            d[key] = value
+        return d
+
+    def __eq__(self, other):
+        '''od.__eq__(y) <==> od==y.  Comparison to another OD is order-sensitive
+        while comparison to a regular mapping is order-insensitive.
+
+        '''
+        if isinstance(other, OrderedDict):
+            return len(self)==len(other) and self.items() == other.items()
+        return dict.__eq__(self, other)
+
+    def __ne__(self, other):
+        return not self == other
+
+    # -- the following methods are only used in Python 2.7 --
+
+    def viewkeys(self):
+        "od.viewkeys() -> a set-like object providing a view on od's keys"
+        return KeysView(self)
+
+    def viewvalues(self):
+        "od.viewvalues() -> an object providing a view on od's values"
+        return ValuesView(self)
+
+    def viewitems(self):
+        "od.viewitems() -> a set-like object providing a view on od's items"
+        return ItemsView(self)
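
The properties the backport above documents (insertion-ordered iteration, LIFO/FIFO `popitem`, order-sensitive equality between ordered dictionaries) can be checked against the standard-library class it mirrors. This is an illustrative sketch: the import fallback is the one `dulwich/config.py` uses in this commit, and the keys and values are made up.

```python
try:
    from collections import OrderedDict
except ImportError:
    # Only reached on interpreters without collections.OrderedDict.
    from dulwich._compat import OrderedDict

od = OrderedDict()
od["core"] = 1
od["remote"] = 2
od["branch"] = 3

# Iteration follows insertion order.
keys = list(od)

# popitem(last=False) removes the oldest entry, popitem() the newest.
first = od.popitem(last=False)
last = od.popitem()

# Two ordered dictionaries compare order-sensitively ...
a = OrderedDict([("x", 1), ("y", 2)])
b = OrderedDict([("y", 2), ("x", 1)])
order_sensitive = (a == b)

# ... while comparison with a plain dict ignores order.
plain_equal = (a == {"y": 2, "x": 1})
```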

+ 31 - 21
dulwich/client.py

@@ -169,6 +169,9 @@ class GitClient(object):
             if server_capabilities is None:
                 (ref, server_capabilities) = extract_capabilities(ref)
             refs[ref] = sha
+
+        if len(refs) == 0:
+            return None, set([])
         return refs, set(server_capabilities)
 
     def send_pack(self, path, determine_wants, generate_pack_contents,
@@ -199,11 +202,10 @@ class GitClient(object):
         if determine_wants is None:
             determine_wants = target.object_store.determine_wants_all
         f, commit = target.object_store.add_pack()
-        try:
-            return self.fetch_pack(path, determine_wants,
+        result = self.fetch_pack(path, determine_wants,
                 target.get_graph_walker(), f.write, progress)
-        finally:
-            commit()
+        commit()
+        return result
 
     def fetch_pack(self, path, determine_wants, graph_walker, pack_data,
                    progress=None):
@@ -470,6 +472,11 @@ class TraditionalGitClient(GitClient):
         proto, can_read = self._connect('upload-pack', path)
         refs, server_capabilities = self._read_refs(proto)
         negotiated_capabilities = self._fetch_capabilities & server_capabilities
+
+        if refs is None:
+            proto.write_pkt_line(None)
+            return refs
+
         try:
             wants = determine_wants(refs)
         except:
@@ -622,6 +629,8 @@ class SSHGitClient(TraditionalGitClient):
         return self.alternative_paths.get(cmd, 'git-%s' % cmd)
 
     def _connect(self, cmd, path):
+        if path.startswith("/~"):
+            path = path[1:]
         con = get_ssh_vendor().connect_ssh(
             self.host, ["%s '%s'" % (self._get_cmd_path(cmd), path)],
             port=self.port, username=self.username)
@@ -639,6 +648,17 @@ class HttpGitClient(GitClient):
     def _get_url(self, path):
         return urlparse.urljoin(self.base_url, path).rstrip("/") + "/"
 
+    def _http_request(self, url, headers={}, data=None):
+        req = urllib2.Request(url, headers=headers, data=data)
+        try:
+            resp = self._perform(req)
+        except urllib2.HTTPError as e:
+            if e.code == 404:
+                raise NotGitRepository()
+            if e.code != 200:
+                raise GitProtocolError("unexpected http response %d" % e.code)
+        return resp
+
     def _perform(self, req):
         """Perform a HTTP request.
 
@@ -656,13 +676,7 @@ class HttpGitClient(GitClient):
         if self.dumb != False:
             url += "?service=%s" % service
             headers["Content-Type"] = "application/x-%s-request" % service
-        req = urllib2.Request(url, headers=headers)
-        resp = self._perform(req)
-        if resp.getcode() == 404:
-            raise NotGitRepository()
-        if resp.getcode() != 200:
-            raise GitProtocolError("unexpected http response %d" %
-                resp.getcode())
+        resp = self._http_request(url, headers)
         self.dumb = (not resp.info().gettype().startswith("application/x-git-"))
         proto = Protocol(resp.read, None)
         if not self.dumb:
@@ -676,15 +690,8 @@ class HttpGitClient(GitClient):
     def _smart_request(self, service, url, data):
         assert url[-1] == "/"
         url = urlparse.urljoin(url, service)
-        req = urllib2.Request(url,
-            headers={"Content-Type": "application/x-%s-request" % service},
-            data=data)
-        resp = self._perform(req)
-        if resp.getcode() == 404:
-            raise NotGitRepository()
-        if resp.getcode() != 200:
-            raise GitProtocolError("Invalid HTTP response from server: %d"
-                % resp.getcode())
+        headers = {"Content-Type": "application/x-%s-request" % service}
+        resp = self._http_request(url, headers, data)
         if resp.info().gettype() != ("application/x-%s-result" % service):
             raise GitProtocolError("Invalid content-type from server: %s"
                 % resp.info().gettype())
@@ -776,8 +783,11 @@ def get_transport_and_path(uri, **kwargs):
         return (TCPGitClient(parsed.hostname, port=parsed.port, **kwargs),
                 parsed.path)
     elif parsed.scheme == 'git+ssh':
+        path = parsed.path
+        if path.startswith('/'):
+            path = parsed.path[1:]
         return SSHGitClient(parsed.hostname, port=parsed.port,
-                            username=parsed.username, **kwargs), path
+                            username=parsed.username, **kwargs), path
     elif parsed.scheme in ('http', 'https'):
         return HttpGitClient(urlparse.urlunparse(parsed), **kwargs), parsed.path
 

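The two path adjustments in the client.py hunks above (dropping the leading slash of a `git+ssh` URL path, and dropping the slash in front of `~` before the path reaches the remote command) can be tried in isolation. This is a rough sketch under stated assumptions: it uses Python 3's `urllib.parse` where the diff uses the Python 2 `urlparse` module, and both helper names are hypothetical, not dulwich functions.

```python
from urllib.parse import urlparse


def ssh_path_from_url(url):
    """Hypothetical helper mirroring the git+ssh branch of the diff."""
    path = urlparse(url).path
    if path.startswith("/"):
        # Same adjustment as in get_transport_and_path: drop the leading slash.
        path = path[1:]
    return path


def strip_slash_before_tilde(path):
    """Hypothetical helper mirroring the check added to SSHGitClient._connect."""
    if path.startswith("/~"):
        # "/~user/repo" becomes "~user/repo" so the remote shell can expand it.
        path = path[1:]
    return path
```

For example, `ssh_path_from_url("git+ssh://example.com/~user/repo.git")` yields `"~user/repo.git"`, which is the form the NEWS entry on `~` expansion refers to.
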
+ 13 - 8
dulwich/config.py

@@ -28,6 +28,11 @@ import errno
 import os
 import re
 
+try:
+    from collections import OrderedDict
+except ImportError:
+    from dulwich._compat import OrderedDict
+
 from UserDict import DictMixin
 
 from dulwich.file import GitFile
@@ -38,7 +43,7 @@ class Config(object):
 
     def get(self, section, name):
         """Retrieve the contents of a configuration setting.
-        
+
         :param section: Tuple with section name and optional subsection namee
         :param subsection: Subsection name
         :return: Contents of the setting
@@ -67,7 +72,7 @@ class Config(object):
 
     def set(self, section, name, value):
         """Set a configuration value.
-        
+
         :param name: Name of the configuration value, including section
             and optional subsection
         :param: Value of the setting
@@ -81,7 +86,7 @@ class ConfigDict(Config, DictMixin):
     def __init__(self, values=None):
         """Create a new ConfigDict."""
         if values is None:
-            values = {}
+            values = OrderedDict()
         self._values = values
 
     def __repr__(self):
@@ -94,10 +99,10 @@ class ConfigDict(Config, DictMixin):
 
     def __getitem__(self, key):
         return self._values[key]
-      
+
     def __setitem__(self, key, value):
         self._values[key] = value
-        
+
     def keys(self):
         return self._values.keys()
 
@@ -122,7 +127,7 @@ class ConfigDict(Config, DictMixin):
     def set(self, section, name, value):
         if isinstance(section, basestring):
             section = (section, )
-        self._values.setdefault(section, {})[name] = value
+        self._values.setdefault(section, OrderedDict())[name] = value
 
 
 def _format_string(value):
@@ -236,7 +241,7 @@ class ConfigFile(ConfigDict):
                             section = (pts[0], pts[1])
                         else:
                             section = (pts[0], )
-                    ret._values[section] = {}
+                    ret._values[section] = OrderedDict()
                 if _strip_comments(line).strip() == "":
                     continue
                 if section is None:
@@ -304,7 +309,7 @@ class ConfigFile(ConfigDict):
             else:
                 f.write("[%s \"%s\"]\n" % (section_name, subsection_name))
             for key, value in values.iteritems():
-                f.write("%s = %s\n" % (key, _escape_value(value)))
+                f.write("\t%s = %s\n" % (key, _escape_value(value)))
 
 
 class StackedConfig(Config):
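
Taken together, the config.py hunks above change two visible things in a written config file: sections and keys keep their insertion order, and each key line is indented with a tab, as C Git does. The sketch below illustrates the effect. It is not the dulwich writer: `write_sections` is a hypothetical stand-in, it skips the `_escape_value` step the real code applies, and it targets Python 3 (`items()` rather than `iteritems()`).

```python
import io
from collections import OrderedDict


def write_sections(f, sections):
    """Write {section tuple: {key: value}} in insertion order (illustrative)."""
    for section, values in sections.items():
        if len(section) == 1:
            f.write("[%s]\n" % section[0])
        else:
            f.write('[%s "%s"]\n' % section)
        for key, value in values.items():
            # Tab-indented key lines, matching the "\t%s = %s\n" format above.
            f.write("\t%s = %s\n" % (key, value))


sections = OrderedDict()
sections.setdefault(("core",), OrderedDict())["filemode"] = "true"
sections.setdefault(("core",), OrderedDict())["bare"] = "false"
sections.setdefault(("remote", "origin"), OrderedDict())["url"] = "git://example.com/repo.git"

buf = io.StringIO()
write_sections(buf, sections)
output = buf.getvalue()
```

Here `output` lists `[core]` before `[remote "origin"]` and `filemode` before `bare`, exactly as they were set.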

+ 4 - 0
dulwich/errors.py

@@ -171,3 +171,7 @@ class CommitError(Exception):
 
 class RefFormatError(Exception):
     """Indicates an invalid ref name."""
+
+
+class HookError(Exception):
+    """An error occurred while executing a hook."""

+ 147 - 0
dulwich/hooks.py

@@ -0,0 +1,147 @@
+# hooks.py -- for dealing with git hooks
+#
+# This program is free software; you can redistribute it and/or
+# modify it under the terms of the GNU General Public License
+# as published by the Free Software Foundation; version 2
+# of the License or (at your option) a later version of the License.
+#
+# This program is distributed in the hope that it will be useful,
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+# GNU General Public License for more details.
+#
+# You should have received a copy of the GNU General Public License
+# along with this program; if not, write to the Free Software
+# Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston,
+# MA  02110-1301, USA.
+
+"""Access to hooks."""
+
+import os
+import subprocess
+import tempfile
+import warnings
+
+from dulwich.errors import (
+    HookError,
+)
+
+
+class Hook(object):
+    """Generic hook object."""
+
+    def execute(self, *args):
+        """Execute the hook with the given args
+
+        :param args: argument list to hook
+        :raise HookError: hook execution failure
+        :return: a hook may return a useful value
+        """
+        raise NotImplementedError(self.execute)
+
+
+class ShellHook(Hook):
+    """Hook by executable file
+
+    Implements standard githooks(5) [0]:
+
+    [0] http://www.kernel.org/pub/software/scm/git/docs/githooks.html
+    """
+
+    def __init__(self, name, path, numparam,
+                 pre_exec_callback=None, post_exec_callback=None):
+        """Setup shell hook definition
+
+        :param name: name of hook for error messages
+        :param path: absolute path to executable file
+        :param numparam: number of required parameters
+        :param pre_exec_callback: closure for setup before execution
+            Defaults to None. Takes in the variable argument list from the
+            execute functions and returns a modified argument list for the
+            shell hook.
+        :param post_exec_callback: closure for cleanup after execution
+            Defaults to None. Takes in a boolean for hook success and the
+            modified argument list and returns the final hook return value
+            if applicable
+        """
+        self.name = name
+        self.filepath = path
+        self.numparam = numparam
+
+        self.pre_exec_callback = pre_exec_callback
+        self.post_exec_callback = post_exec_callback
+
+    def execute(self, *args):
+        """Execute the hook with given args"""
+
+        if len(args) != self.numparam:
+            raise HookError("Hook %s executed with wrong number of args. "
+                            "Expected %d. Saw %d. Args: %s"
+                            % (self.name, self.numparam, len(args), args))
+
+        if (self.pre_exec_callback is not None):
+            args = self.pre_exec_callback(*args)
+
+        try:
+            ret = subprocess.call([self.filepath] + list(args))
+            if ret != 0:
+                if (self.post_exec_callback is not None):
+                    self.post_exec_callback(0, *args)
+                raise HookError("Hook %s exited with non-zero status"
+                                % (self.name))
+            if (self.post_exec_callback is not None):
+                return self.post_exec_callback(1, *args)
+        except OSError:  # no file. silent failure.
+            if (self.post_exec_callback is not None):
+                self.post_exec_callback(0, *args)
+
+
+class PreCommitShellHook(ShellHook):
+    """pre-commit shell hook"""
+
+    def __init__(self, controldir):
+        filepath = os.path.join(controldir, 'hooks', 'pre-commit')
+
+        ShellHook.__init__(self, 'pre-commit', filepath, 0)
+
+
+class PostCommitShellHook(ShellHook):
+    """post-commit shell hook"""
+
+    def __init__(self, controldir):
+        filepath = os.path.join(controldir, 'hooks', 'post-commit')
+
+        ShellHook.__init__(self, 'post-commit', filepath, 0)
+
+
+class CommitMsgShellHook(ShellHook):
+    """commit-msg shell hook
+
+    :param args[0]: commit message
+    :return: new commit message or None
+    """
+
+    def __init__(self, controldir):
+        filepath = os.path.join(controldir, 'hooks', 'commit-msg')
+
+        def prepare_msg(*args):
+            (fd, path) = tempfile.mkstemp()
+
+            f = os.fdopen(fd, 'wb')
+            try:
+                f.write(args[0])
+            finally:
+                f.close()
+
+            return (path,)
+
+        def clean_msg(success, *args):
+            if success:
+                with open(args[0], 'rb') as f:
+                    new_msg = f.read()
+                os.unlink(args[0])
+                return new_msg
+            os.unlink(args[0])
+
+        ShellHook.__init__(self, 'commit-msg', filepath, 1,
+                           prepare_msg, clean_msg)
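
Review note: the commit-msg hook above passes the message to the hook script through a temporary file and reads it back afterwards. A minimal sketch of that round trip, written for Python 3 and without running a real hook script (the append step below is a stand-in assumption for whatever the script would do):

```python
import os
import tempfile


def prepare_msg(message):
    """Write the commit message to a temp file and return it as hook arguments."""
    fd, path = tempfile.mkstemp()
    with os.fdopen(fd, 'wb') as f:
        f.write(message)
    return (path,)


def clean_msg(success, path):
    """Return the possibly edited message on success; always remove the file."""
    try:
        if success:
            with open(path, 'rb') as f:
                return f.read()
        return None
    finally:
        os.unlink(path)


(path,) = prepare_msg(b"Initial commit\n")
# Stand-in for the hook script: a real commit-msg hook would edit the file itself.
with open(path, 'ab') as f:
    f.write(b"Signed-off-by: Example <example@example.com>\n")
new_message = clean_msg(True, path)
```

Removing the file in a `finally` block keeps the cleanup in one place for both the success and the failure path.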

+ 29 - 14
dulwich/index.py

@@ -18,6 +18,7 @@
 
 """Parser for the git index file format."""
 
+import errno
 import os
 import stat
 import struct
@@ -354,22 +355,24 @@ def changes_from_tree(names, lookup_entry, object_store, tree,
     :param names: Iterable of names in the working copy
     :param lookup_entry: Function to lookup an entry in the working copy
     :param object_store: Object store to use for retrieving tree contents
-    :param tree: SHA1 of the root tree
+    :param tree: SHA1 of the root tree, or None for an empty tree
     :param want_unchanged: Whether unchanged files should be reported
     :return: Iterator over tuples with (oldpath, newpath), (oldmode, newmode),
         (oldsha, newsha)
     """
     other_names = set(names)
-    for (name, mode, sha) in object_store.iter_tree_contents(tree):
-        try:
-            (other_sha, other_mode) = lookup_entry(name)
-        except KeyError:
-            # Was removed
-            yield ((name, None), (mode, None), (sha, None))
-        else:
-            other_names.remove(name)
-            if (want_unchanged or other_sha != sha or other_mode != mode):
-                yield ((name, name), (mode, other_mode), (sha, other_sha))
+
+    if tree is not None:
+        for (name, mode, sha) in object_store.iter_tree_contents(tree):
+            try:
+                (other_sha, other_mode) = lookup_entry(name)
+            except KeyError:
+                # Was removed
+                yield ((name, None), (mode, None), (sha, None))
+            else:
+                other_names.remove(name)
+                if (want_unchanged or other_sha != sha or other_mode != mode):
+                    yield ((name, name), (mode, other_mode), (sha, other_sha))
 
 
     # Mention added files
     for name in other_names:
@@ -391,13 +394,16 @@ def index_entry_from_stat(stat_val, hex_sha, flags, mode=None):
             stat_val.st_gid, stat_val.st_size, hex_sha, flags)
 
 
-def build_index_from_tree(prefix, index_path, object_store, tree_id):
+def build_index_from_tree(prefix, index_path, object_store, tree_id,
+                          honor_filemode=True):
     """Generate and materialize index from a tree
 
     :param tree_id: Tree to materialize
     :param prefix: Target dir for materialized index files
     :param index_path: Target path for generated index
     :param object_store: Non-empty object store holding tree contents
+    :param honor_filemode: Whether to chmod each file to its tree mode
+        (executable bit), as core.filemode=True does; defaults to True
 
 
     :note:: existing index is wiped and contents are not merged
         in a working dir. Suitable only for fresh clones.
@@ -414,7 +420,15 @@ def build_index_from_tree(prefix, index_path, object_store, tree_id):
         # FIXME: Merge new index into working tree
         if stat.S_ISLNK(entry.mode):
             # FIXME: This will fail on Windows. What should we do instead?
-            os.symlink(object_store[entry.sha].as_raw_string(), full_path)
+            src_path = object_store[entry.sha].as_raw_string()
+            try:
+                os.symlink(src_path, full_path)
+            except OSError, e:
+                if e.errno == errno.EEXIST:
+                    os.unlink(full_path)
+                    os.symlink(src_path, full_path)
+                else:
+                    raise
         else:
             f = open(full_path, 'wb')
             try:
@@ -423,7 +437,8 @@ def build_index_from_tree(prefix, index_path, object_store, tree_id):
             finally:
                 f.close()
 
-            os.chmod(full_path, entry.mode)
+            if honor_filemode:
+                os.chmod(full_path, entry.mode)
 
 
         # Add file to index
         st = os.lstat(full_path)
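
Review note: with `tree=None`, `changes_from_tree` now treats the committed side as empty, so every working-copy name is reported as added. The sketch below illustrates that behaviour under simplifying assumptions: a plain list of `(name, mode, sha)` tuples replaces `object_store.iter_tree_contents(tree)`, `changes_from_entries` is a hypothetical name, and the body of the "added files" loop is assumed because the hunk only shows its header.

```python
def changes_from_entries(names, lookup_entry, tree_entries, want_unchanged=False):
    """Yield ((oldpath, newpath), (oldmode, newmode), (oldsha, newsha)) tuples."""
    other_names = set(names)
    if tree_entries is not None:
        for name, mode, sha in tree_entries:
            try:
                other_sha, other_mode = lookup_entry(name)
            except KeyError:
                # Present in the tree, missing from the working copy: removed.
                yield ((name, None), (mode, None), (sha, None))
            else:
                other_names.remove(name)
                if want_unchanged or other_sha != sha or other_mode != mode:
                    yield ((name, name), (mode, other_mode), (sha, other_sha))
    # Whatever is left exists only in the working copy: added (assumed loop body).
    for name in sorted(other_names):
        other_sha, other_mode = lookup_entry(name)
        yield ((None, name), (None, other_mode), (None, other_sha))


index = {"a.txt": ("sha-a", 0o100644), "b.txt": ("sha-b2", 0o100644)}
tree_entries = [("b.txt", 0o100644, "sha-b1"), ("c.txt", 0o100644, "sha-c")]

# No tree yet (first commit): everything in the index counts as added.
initial = list(changes_from_entries(index, index.__getitem__, None))
# With a tree: b.txt is modified, c.txt is removed, a.txt is added.
later = list(changes_from_entries(index, index.__getitem__, tree_entries))
```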

+ 150 - 26
dulwich/object_store.py

@@ -1,5 +1,6 @@
 # object_store.py -- Object store for git objects
-# Copyright (C) 2008-2009 Jelmer Vernooij <jelmer@samba.org>
+# Copyright (C) 2008-2012 Jelmer Vernooij <jelmer@samba.org>
+#                         and others
 #
 # This program is free software; you can redistribute it and/or
 # modify it under the terms of the GNU General Public License
@@ -220,6 +221,30 @@ class BaseObjectStore(object):
             obj = self[sha]
         return obj
 
+    def _collect_ancestors(self, heads, common=set()):
+        """Collect all ancestors of heads up to (excluding) those in common.
+
+        :param heads: commits to start from
+        :param common: commits to end at, or empty set to walk repository
+            completely
+        :return: a tuple (A, B) where A - all commits reachable
+            from heads but not present in common, B - common (shared) elements
+            that are directly reachable from heads
+        """
+        bases = set()
+        commits = set()
+        queue = []
+        queue.extend(heads)
+        while queue:
+            e = queue.pop(0)
+            if e in common:
+                bases.add(e)
+            elif e not in commits:
+                commits.add(e)
+                cmt = self[e]
+                queue.extend(cmt.parents)
+        return (commits, bases)
+
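
Review note: `_collect_ancestors` is a breadth-first walk over commit parents that stops at anything already in `common`. A self-contained sketch of the same walk, assuming a plain dict from commit id to parent ids in place of the object store (`collect_ancestors` and `history` are illustrative names, not dulwich API):

```python
from collections import deque


def collect_ancestors(parents_of, heads, common=frozenset()):
    """Return (commits reachable from heads outside common, common commits hit)."""
    bases = set()
    commits = set()
    queue = deque(heads)
    while queue:
        sha = queue.popleft()
        if sha in common:
            # Boundary reached: record it and do not walk past it.
            bases.add(sha)
        elif sha not in commits:
            commits.add(sha)
            queue.extend(parents_of[sha])
    return commits, bases


# History: a <- b <- c <- d, plus a side branch b <- e.
history = {"a": [], "b": ["a"], "c": ["b"], "d": ["c"], "e": ["b"]}

# Everything the other side already has, starting from its head "c".
have_ancestors, _ = collect_ancestors(history, ["c"])
# What is missing when "d" and "e" are wanted, and where the walk met the haves.
missing, bases = collect_ancestors(history, ["d", "e"], have_ancestors)
```

The sketch uses `deque.popleft()` and an immutable default for `common`; the method in the diff uses `list.pop(0)` and a `set()` default, which behaves the same as long as the default is never mutated.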
 
 
 class PackBasedObjectStore(BaseObjectStore):
 
@@ -231,12 +256,27 @@ class PackBasedObjectStore(BaseObjectStore):
         return []
 
     def contains_packed(self, sha):
-        """Check if a particular object is present by SHA1 and is packed."""
+        """Check if a particular object is present by SHA1 and is packed.
+
+        This does not check alternates.
+        """
         for pack in self.packs:
             if sha in pack:
                 return True
         return False
 
+    def __contains__(self, sha):
+        """Check if a particular object is present by SHA1.
+
+        This method makes no distinction between loose and packed objects.
+        """
+        if self.contains_packed(sha) or self.contains_loose(sha):
+            return True
+        for alternate in self.alternates:
+            if sha in alternate:
+                return True
+        return False
+
     def _load_packs(self):
         raise NotImplementedError(self._load_packs)
 
@@ -258,6 +298,12 @@ class PackBasedObjectStore(BaseObjectStore):
             self._pack_cache = self._load_packs()
         return self._pack_cache
 
+    def _iter_alternate_objects(self):
+        """Iterate over the SHAs of all the objects in alternate stores."""
+        for alternate in self.alternates:
+            for alternate_object in alternate:
+                yield alternate_object
+
     def _iter_loose_objects(self):
         """Iterate over the SHAs of all loose objects."""
         raise NotImplementedError(self._iter_loose_objects)
@@ -283,11 +329,14 @@ class PackBasedObjectStore(BaseObjectStore):
 
     def __iter__(self):
         """Iterate over the SHAs that are present in this store."""
-        iterables = self.packs + [self._iter_loose_objects()]
+        iterables = self.packs + [self._iter_loose_objects()] + [self._iter_alternate_objects()]
         return itertools.chain(*iterables)
 
     def contains_loose(self, sha):
-        """Check if a particular object is present by SHA1 and is loose."""
+        """Check if a particular object is present by SHA1 and is loose.
+
+        This does not check alternates.
+        """
         return self._get_loose_object(sha) is not None
 
     def get_raw(self, name):
@@ -372,9 +421,10 @@ class DiskObjectStore(PackBasedObjectStore):
                 l = l.rstrip("\n")
                 if l[0] == "#":
                     continue
-                if not os.path.isabs(l):
-                    continue
-                ret.append(l)
+                if os.path.isabs(l):
+                    ret.append(l)
+                else:
+                    ret.append(os.path.join(self.path, l))
             return ret
         finally:
             f.close()
@@ -403,6 +453,9 @@ class DiskObjectStore(PackBasedObjectStore):
             f.write("%s\n" % path)
         finally:
             f.close()
+
+        if not os.path.isabs(path):
+            path = os.path.join(self.path, path)
         self.alternates.append(DiskObjectStore(path))
 
     def _load_packs(self):
@@ -769,6 +822,54 @@ def tree_lookup_path(lookup_obj, root_sha, path):
     return tree.lookup_path(lookup_obj, path)
 
 
+def _collect_filetree_revs(obj_store, tree_sha, kset):
+    """Collect SHA1s of files and directories for specified tree.
+
+    :param obj_store: Object store to get objects by SHA from
+    :param tree_sha: tree reference to walk
+    :param kset: set to fill with references to files and directories
+    """
+    filetree = obj_store[tree_sha]
+    for name, mode, sha in filetree.iteritems():
+        if not S_ISGITLINK(mode) and sha not in kset:
+            kset.add(sha)
+            if stat.S_ISDIR(mode):
+                _collect_filetree_revs(obj_store, sha, kset)
+
+
+def _split_commits_and_tags(obj_store, lst, ignore_unknown=False):
+    """Split object id list into two lists with commit SHA1s and tag SHA1s.
+
+    Commits referenced by tags are included in the commits
+    list as well. Only SHA1s known in this repository will get
+    through; unless the ignore_unknown argument is True, KeyError
+    is raised for any SHA1 missing from the repository.
+
+    :param obj_store: Object store to get objects by SHA1 from
+    :param lst: Collection of commit and tag SHAs
+    :param ignore_unknown: True to skip SHA1 missing in the repository
+        silently.
+    :return: A tuple of (commits, tags) SHA1s
+    """
+    commits = set()
+    tags = set()
+    for e in lst:
+        try:
+            o = obj_store[e]
+        except KeyError:
+            if not ignore_unknown:
+                raise
+        else:
+            if isinstance(o, Commit):
+                commits.add(e)
+            elif isinstance(o, Tag):
+                tags.add(e)
+                commits.add(o.object[1])
+            else:
+                raise KeyError('Not a commit or a tag: %s' % e)
+    return (commits, tags)
+
+
 class MissingObjectFinder(object):
     """Find the objects missing from another object store.
 
@@ -784,11 +885,44 @@ class MissingObjectFinder(object):
 
     def __init__(self, object_store, haves, wants, progress=None,
                  get_tagged=None):
-        haves = set(haves)
-        self.sha_done = haves
-        self.objects_to_send = set([(w, None, False) for w in wants
-                                    if w not in haves])
         self.object_store = object_store
+        # Process commits and tags differently.
+        # haves may list commits/tags that are not available locally; such
+        # SHAs are filtered out by _split_commits_and_tags. wants must list
+        # only known SHAs, otherwise _split_commits_and_tags raises
+        # KeyError.
+        have_commits, have_tags = \
+                _split_commits_and_tags(object_store, haves, True)
+        want_commits, want_tags = \
+                _split_commits_and_tags(object_store, wants, False)
+        # all_ancestors is a set of commits that shall not be sent
+        # (complete repository up to 'haves')
+        all_ancestors = object_store._collect_ancestors(have_commits)[0]
+        # missing_commits - complete set of commits between haves and wants
+        # common_commits - commits from all_ancestors that we reach while
+        # traversing the parent hierarchy of wants
+        missing_commits, common_commits = \
+            object_store._collect_ancestors(want_commits, all_ancestors)
+        self.sha_done = set()
+        # Now fill sha_done with the common commits and the SHAs of the
+        # files and directories they reference, which are known to exist
+        # both locally and on the target. These commits and files
+        # won't get selected for fetch.
+        for h in common_commits:
+            self.sha_done.add(h)
+            cmt = object_store[h]
+            _collect_filetree_revs(object_store, cmt.tree, self.sha_done)
+        # record tags we have as visited, too
+        for t in have_tags:
+            self.sha_done.add(t)
+
+        missing_tags = want_tags.difference(have_tags)
+        # in fact, what we 'want' is commits and tags
+        # we've found missing
+        wants = missing_commits.union(missing_tags)
+
+        self.objects_to_send = set([(w, None, False) for w in wants])
+
         if progress is None:
             self.progress = lambda x: None
         else:
@@ -799,18 +933,6 @@ class MissingObjectFinder(object):
         self.objects_to_send.update([e for e in entries
                                      if not e[0] in self.sha_done])
 
-    def parse_tree(self, tree):
-        self.add_todo([(sha, name, not stat.S_ISDIR(mode))
-                       for name, mode, sha in tree.iteritems()
-                       if not S_ISGITLINK(mode)])
-
-    def parse_commit(self, commit):
-        self.add_todo([(commit.tree, "", False)])
-        self.add_todo([(p, None, False) for p in commit.parents])
-
-    def parse_tag(self, tag):
-        self.add_todo([(tag.object[1], None, False)])
-
     def next(self):
         while True:
             if not self.objects_to_send:
@@ -821,11 +943,13 @@ class MissingObjectFinder(object):
         if not leaf:
             o = self.object_store[sha]
             if isinstance(o, Commit):
-                self.parse_commit(o)
+                self.add_todo([(o.tree, "", False)])
             elif isinstance(o, Tree):
-                self.parse_tree(o)
+                self.add_todo([(s, n, not stat.S_ISDIR(m))
+                               for n, m, s in o.iteritems()
+                               if not S_ISGITLINK(m)])
             elif isinstance(o, Tag):
-                self.parse_tag(o)
+                self.add_todo([(o.object[1], None, False)])
         if sha in self._tagged:
             self.add_todo([(self._tagged[sha], None, True)])
         self.sha_done.add(sha)
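
Review note: the new `MissingObjectFinder.__init__` first separates `haves` and `wants` into commits and tags, tolerating unknown SHAs only on the `haves` side. The sketch below mirrors that split with stand-in classes. The `Commit`, `Tag` and `Blob` classes here are simplified assumptions (dulwich reads the tag target from `tag.object[1]`), and a plain dict plays the role of the object store.

```python
class Commit(object):
    def __init__(self, parents=()):
        self.parents = list(parents)


class Tag(object):
    def __init__(self, target):
        # Stand-in for dulwich's tag.object[1], the SHA the tag points at.
        self.target = target


class Blob(object):
    pass


def split_commits_and_tags(store, shas, ignore_unknown=False):
    """Return (commits, tags); commits pointed at by tags count as commits too."""
    commits, tags = set(), set()
    for sha in shas:
        try:
            obj = store[sha]
        except KeyError:
            if not ignore_unknown:
                raise
        else:
            if isinstance(obj, Commit):
                commits.add(sha)
            elif isinstance(obj, Tag):
                tags.add(sha)
                commits.add(obj.target)
            else:
                raise KeyError('Not a commit or a tag: %s' % sha)
    return commits, tags


store = {"c1": Commit(), "c2": Commit(["c1"]), "t1": Tag("c2"), "b1": Blob()}

# haves may mention objects we do not know about; they are skipped.
have_commits, have_tags = split_commits_and_tags(
    store, ["c1", "unknown"], ignore_unknown=True)
# wants must all be known; the tag contributes the commit it points at.
want_commits, want_tags = split_commits_and_tags(store, ["t1"])
```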

+ 29 - 7
dulwich/objects.py

@@ -51,7 +51,7 @@ _PARENT_HEADER = "parent"
 _AUTHOR_HEADER = "author"
 _COMMITTER_HEADER = "committer"
 _ENCODING_HEADER = "encoding"
-
+_MERGETAG_HEADER = "mergetag"
 
 
 # Header fields for objects
 _OBJECT_HEADER = "object"
@@ -583,12 +583,18 @@ def _parse_tag_or_commit(text):
         field named None for the freeform tag/commit text.
     """
     f = StringIO(text)
+    k = None
+    v = ""
     for l in f:
-        l = l.rstrip("\n")
-        if l == "":
-            # Empty line indicates end of headers
-            break
-        yield l.split(" ", 1)
+        if l.startswith(" "):
+            v += l[1:]
+        else:
+            if k is not None:
+                yield (k, v.rstrip("\n"))
+            if l == "\n":
+                # Empty line indicates end of headers
+                break
+            (k, v) = l.split(" ", 1)
     yield (None, f.read())
     f.close()
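
Review note: the rewritten parser folds header lines that start with a space into the previous header's value, which is what lets a multi-line `mergetag` survive parsing. A Python 3 sketch of the same loop on text input (the diff itself targets Python 2 byte strings; `parse_headers` is an illustrative name and the sample text is made up):

```python
from io import StringIO


def parse_headers(text):
    """Yield (field, value) pairs, then (None, message) for the free-form body."""
    f = StringIO(text)
    k = None
    v = ""
    for l in f:
        if l.startswith(" "):
            # Continuation line: drop the leading space and append to the value.
            v += l[1:]
        else:
            if k is not None:
                yield (k, v.rstrip("\n"))
            if l == "\n":
                # Empty line indicates end of headers.
                break
            (k, v) = l.split(" ", 1)
    yield (None, f.read())
    f.close()


text = ("tree abc\n"
        "mergetag object def\n"
        " type commit\n"
        " tag v1\n"
        "author Someone\n"
        "\n"
        "message body\n")
fields = list(parse_headers(text))
```

As in the diff, the last header is only emitted when the blank separator line is reached, so this sketch assumes input that contains one.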
 
 
@@ -1038,12 +1044,13 @@ class Commit(ShaFile):
                  '_commit_timezone_neg_utc', '_commit_time',
                  '_author_time', '_author_timezone', '_commit_timezone',
                  '_author', '_committer', '_parents', '_extra',
-                 '_encoding', '_tree', '_message')
+                 '_encoding', '_tree', '_message', '_mergetag')
 
 
     def __init__(self):
         super(Commit, self).__init__()
         self._parents = []
         self._encoding = None
+        self._mergetag = []
         self._extra = []
         self._author_timezone_neg_utc = False
         self._commit_timezone_neg_utc = False
@@ -1078,6 +1085,8 @@ class Commit(ShaFile):
                 self._encoding = value
             elif field is None:
                 self._message = value
+            elif field == _MERGETAG_HEADER:
+                self._mergetag.append(Tag.from_string(value + "\n"))
             else:
                 self._extra.append((field, value))
 
@@ -1132,6 +1141,16 @@ class Commit(ShaFile):
                           self._commit_timezone_neg_utc)))
         if self.encoding:
             chunks.append("%s %s\n" % (_ENCODING_HEADER, self.encoding))
+        for mergetag in self.mergetag:
+            mergetag_chunks = mergetag.as_raw_string().split("\n")
+
+            chunks.append("%s %s\n" % (_MERGETAG_HEADER, mergetag_chunks[0]))
+            # Embedded extra header needs leading space
+            for chunk in mergetag_chunks[1:]:
+                chunks.append(" %s\n" % chunk)
+
+            # No trailing empty line
+            chunks[-1] = chunks[-1].rstrip(" \n")
         for k, v in self.extra:
             if "\n" in k or "\n" in v:
                 raise AssertionError("newline in extra data: %r -> %r" % (k, v))
@@ -1186,6 +1205,9 @@ class Commit(ShaFile):
     encoding = serializable_property("encoding",
         "Encoding of the commit message.")
 
+    mergetag = serializable_property("mergetag",
+        "Associated signed tag.")
+
 
 
 OBJECT_CLASSES = (
     Commit,

+ 39 - 11
dulwich/patch.py

@@ -31,6 +31,9 @@ from dulwich.objects import (
     S_ISGITLINK,
     )
 
+FIRST_FEW_BYTES = 8000
+
+
 def write_commit_patch(f, commit, contents, progress, version=None):
     """Write an individual file patch.
 
@@ -103,14 +106,25 @@ def unified_diff(a, b, fromfile='', tofile='', n=3):
                     yield '+' + line
 
 
+def is_binary(content):
+    """See if the first few bytes contain any null characters.
+
+    :param content: Bytestring to check for binary content
+    """
+    return '\0' in content[:FIRST_FEW_BYTES]
+
+
 def write_object_diff(f, store, (old_path, old_mode, old_id),
-                                (new_path, new_mode, new_id)):
+                                (new_path, new_mode, new_id),
+                                diff_binary=False):
     """Write the diff for an object.
 
     :param f: File-like object to write to
     :param store: Store to retrieve objects from, if necessary
     :param (old_path, old_mode, old_hexsha): Old file
     :param (new_path, new_mode, new_hexsha): New file
+    :param diff_binary: Whether to diff files even if they
+        are considered binary files by is_binary().
 
 
     :note: the tuple elements should be None for nonexistent files
     """
@@ -119,13 +133,21 @@ def write_object_diff(f, store, (old_path, old_mode, old_id),
             return "0" * 7
         else:
             return hexsha[:7]
-    def lines(mode, hexsha):
+
+    def content(mode, hexsha):
         if hexsha is None:
-            return []
+            return ''
         elif S_ISGITLINK(mode):
-            return ["Submodule commit " + hexsha + "\n"]
+            return "Submodule commit " + hexsha + "\n"
         else:
-            return store[hexsha].data.splitlines(True)
+            return store[hexsha].data
+
+    def lines(content):
+        if not content:
+            return []
+        else:
+            return content.splitlines(True)
+
     if old_path is None:
         old_path = "/dev/null"
     else:
@@ -146,10 +168,13 @@ def write_object_diff(f, store, (old_path, old_mode, old_id),
     if new_mode is not None:
         f.write(" %o" % new_mode)
     f.write("\n")
-    old_contents = lines(old_mode, old_id)
-    new_contents = lines(new_mode, new_id)
-    f.writelines(unified_diff(old_contents, new_contents,
-        old_path, new_path))
+    old_content = content(old_mode, old_id)
+    new_content = content(new_mode, new_id)
+    if not diff_binary and (is_binary(old_content) or is_binary(new_content)):
+        f.write("Binary files %s and %s differ\n" % (old_path, new_path))
+    else:
+        f.writelines(unified_diff(lines(old_content), lines(new_content),
+            old_path, new_path))
 
 
 
 
 def write_blob_diff(f, (old_path, old_mode, old_blob),
@@ -198,17 +223,20 @@ def write_blob_diff(f, (old_path, old_mode, old_blob),
         old_path, new_path))
 
 
-def write_tree_diff(f, store, old_tree, new_tree):
+def write_tree_diff(f, store, old_tree, new_tree, diff_binary=False):
     """Write tree diff.
 
     :param f: File-like object to write to.
     :param old_tree: Old tree id
     :param new_tree: New tree id
+    :param diff_binary: Whether to diff files even if they
+        are considered binary files by is_binary().
     """
     changes = store.tree_changes(old_tree, new_tree)
     for (oldpath, newpath), (oldmode, newmode), (oldsha, newsha) in changes:
         write_object_diff(f, store, (oldpath, oldmode, oldsha),
-                                    (newpath, newmode, newsha))
+                                    (newpath, newmode, newsha),
+                                    diff_binary=diff_binary)
 
 
 
 
 def git_am_patch_split(f):
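
Review note on the `is_binary()` heuristic added to this file: content is treated as binary when a NUL byte appears within the first `FIRST_FEW_BYTES` bytes, and `write_object_diff` then emits a one-line "Binary files ... differ" message unless `diff_binary` is set. A Python 3 sketch on byte strings; `binary_notice` is a hypothetical helper that isolates only the decision, not the diff output:

```python
FIRST_FEW_BYTES = 8000


def is_binary(content):
    """Return True if a NUL byte occurs in the first FIRST_FEW_BYTES bytes."""
    return b'\0' in content[:FIRST_FEW_BYTES]


def binary_notice(old_path, new_path, old_content, new_content, diff_binary=False):
    """Return the notice written instead of a diff, or None for a normal diff."""
    if not diff_binary and (is_binary(old_content) or is_binary(new_content)):
        return "Binary files %s and %s differ\n" % (old_path, new_path)
    return None


text_blob = b"hello\nworld\n"
image_blob = b"\x89PNG\r\n\x1a\n\0\0\0\rIHDR"
# A NUL byte beyond the inspected window does not count.
late_nul = b"a" * FIRST_FEW_BYTES + b"\0"

notice = binary_notice("a/logo.png", "b/logo.png", text_blob, image_blob)
forced = binary_notice("a/logo.png", "b/logo.png", text_blob, image_blob,
                       diff_binary=True)
```

Only a prefix is inspected, so the check stays cheap on large blobs, at the cost of missing NUL bytes that appear later in the content.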

+ 52 - 5
dulwich/repo.py

@@ -41,6 +41,7 @@ from dulwich.errors import (
     PackedRefsException,
     CommitError,
     RefFormatError,
+    HookError,
     )
 from dulwich.file import (
     ensure_dir_exists,
@@ -58,6 +59,13 @@ from dulwich.objects import (
     Tree,
     hex_to_sha,
     )
+
+from dulwich.hooks import (
+    PreCommitShellHook,
+    PostCommitShellHook,
+    CommitMsgShellHook,
+)
+
 import warnings
 
 
@@ -813,6 +821,8 @@ class BaseRepo(object):
         self.object_store = object_store
         self.refs = refs
 
+        self.hooks = {}
+
     def _init_files(self, bare):
         """Initialize a default set of named files."""
         from dulwich.config import ConfigFile
@@ -1179,6 +1189,14 @@ class BaseRepo(object):
             if len(tree) != 40:
                 raise ValueError("tree must be a 40-byte hex sha string")
             c.tree = tree
+
+        try:
+            self.hooks['pre-commit'].execute()
+        except HookError, e:
+            raise CommitError(e)
+        except KeyError:  # no hook defined, silent fallthrough
+            pass
+
         if merge_heads is None:
             # FIXME: Read merge heads from .git/MERGE_HEADS
             merge_heads = []
@@ -1206,7 +1224,16 @@ class BaseRepo(object):
         if message is None:
             # FIXME: Try to read commit message from .git/MERGE_MSG
             raise ValueError("No commit message specified")
-        c.message = message
+
+        try:
+            c.message = self.hooks['commit-msg'].execute(message)
+            if c.message is None:
+                c.message = message
+        except HookError, e:
+            raise CommitError(e)
+        except KeyError:  # no hook defined, message not modified
+            c.message = message
+
         try:
             old_head = self.refs[ref]
             c.parents = [old_head] + merge_heads
@@ -1221,6 +1248,13 @@ class BaseRepo(object):
             # all its objects as garbage.
             raise CommitError("%s changed during commit" % (ref,))
 
+        try:
+            self.hooks['post-commit'].execute()
+        except HookError, e:  # non-fatal: warn and keep the commit
+            warnings.warn("post-commit hook failed: %s" % e, UserWarning)
+        except KeyError:  # no hook defined, silent fallthrough
+            pass
+
         return c.id
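
Review note: taken together, the three hook call sites in `do_commit` give each hook a different failure policy. pre-commit and commit-msg abort the commit, while post-commit only warns. A compact Python 3 sketch of that control flow, assuming plain callables in the `hooks` dict instead of hook objects with `.execute()`, locally defined exception classes, and a hypothetical `run_commit_hooks` wrapper:

```python
import warnings


class HookError(Exception):
    """Stand-in for dulwich.errors.HookError."""


class CommitError(Exception):
    """Stand-in for dulwich.errors.CommitError."""


def run_commit_hooks(hooks, message):
    """Apply the hook policies around a commit and return the final message."""
    try:
        hooks['pre-commit']()
    except HookError as e:
        raise CommitError(e)
    except KeyError:  # no hook defined
        pass

    try:
        new_message = hooks['commit-msg'](message)
        if new_message is None:
            new_message = message
    except HookError as e:
        raise CommitError(e)
    except KeyError:  # no hook defined, message not modified
        new_message = message

    # ... the commit object would be written and the ref updated here ...

    try:
        hooks['post-commit']()
    except HookError as e:  # the commit already exists, so only warn
        warnings.warn("post-commit hook failed: %s" % e, UserWarning)
    except KeyError:
        pass
    return new_message


def failing_hook(*args):
    raise HookError("hook exited with non-zero status")


plain = run_commit_hooks({}, b"msg")
rewritten = run_commit_hooks({'commit-msg': lambda m: m.upper()}, b"msg")
```

One consequence of using `except KeyError` for the missing-hook case, in the sketch as in the diff, is that a `KeyError` raised inside a hook would be treated the same way as an absent hook.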
 
 
 
 
@@ -1243,8 +1277,11 @@ class Repo(BaseRepo):
             self._controldir = root
         elif (os.path.isfile(os.path.join(root, ".git"))):
             import re
-            with open(os.path.join(root, ".git"), 'r') as f:
+            f = open(os.path.join(root, ".git"), 'r')
+            try:
                 _, path = re.match('(gitdir: )(.+$)', f.read()).groups()
+            finally:
+                f.close()
             self.bare = False
             self._controldir = os.path.join(root, path)
         else:
@@ -1257,6 +1294,10 @@ class Repo(BaseRepo):
         refs = DiskRefsContainer(self.controldir())
         BaseRepo.__init__(self, object_store, refs)
 
+        self.hooks['pre-commit'] = PreCommitShellHook(self.controldir())
+        self.hooks['commit-msg'] = CommitMsgShellHook(self.controldir())
+        self.hooks['post-commit'] = PostCommitShellHook(self.controldir())
+
     def controldir(self):
     def controldir(self):
         """Return the path of the control directory."""
         """Return the path of the control directory."""
         return self._controldir
@@ -1380,12 +1421,18 @@ class Repo(BaseRepo):
 
             if not bare:
                 # Checkout HEAD to target dir
-                from dulwich.index import build_index_from_tree
-                build_index_from_tree(target.path, target.index_path(),
-                        target.object_store, target['HEAD'].tree)
+                target._build_tree()
 
         return target
 
+    def _build_tree(self):
+        from dulwich.index import build_index_from_tree
+        config = self.get_config()
+        honor_filemode = config.get_boolean('core', 'filemode', os.name != "nt")
+        return build_index_from_tree(self.path, self.index_path(),
+                self.object_store, self['HEAD'].tree,
+                honor_filemode=honor_filemode)
+
     def get_config(self):
         """Retrieve the config object.
 

+ 2 - 0
dulwich/tests/__init__.py

@@ -117,10 +117,12 @@ def self_test_suite():
         'diff_tree',
         'fastexport',
         'file',
+        'hooks',
         'index',
         'lru_cache',
         'objects',
         'object_store',
+        'missing_obj_finder',
         'pack',
         'patch',
         'protocol',

+ 5 - 4
dulwich/tests/compat/test_web.py

@@ -36,7 +36,8 @@ from dulwich.tests import (
 from dulwich.web import (
     make_wsgi_chain,
     HTTPGitApplication,
-    HTTPGitRequestHandler,
+    WSGIRequestHandlerLogger,
+    WSGIServerLogger,
     )
 
 from dulwich.tests.compat.server_utils import (
@@ -50,9 +51,9 @@ from dulwich.tests.compat.utils import (
 
 
 if getattr(simple_server.WSGIServer, 'shutdown', None):
-    WSGIServer = simple_server.WSGIServer
+    WSGIServer = WSGIServerLogger
 else:
-    class WSGIServer(ShutdownServerMixIn, simple_server.WSGIServer):
+    class WSGIServer(ShutdownServerMixIn, WSGIServerLogger):
         """Subclass of WSGIServer that can be shut down."""
 
         def __init__(self, *args, **kwargs):
@@ -77,7 +78,7 @@ class WebTests(ServerTests):
         app = self._make_app(backend)
         dul_server = simple_server.make_server(
           'localhost', 0, app, server_class=WSGIServer,
-          handler_class=HTTPGitRequestHandler)
+          handler_class=WSGIRequestHandlerLogger)
         self.addCleanup(dul_server.shutdown)
         threading.Thread(target=dul_server.serve_forever).start()
         self._server = dul_server

+ 98 - 1
dulwich/tests/test_client.py

@@ -18,6 +18,9 @@
 
 from cStringIO import StringIO
 
+from dulwich import (
+    client,
+    )
 from dulwich.client import (
     TraditionalGitClient,
     TCPGitClient,
@@ -75,6 +78,11 @@ class GitClientTests(TestCase):
         self.client.archive('bla', 'HEAD', None, None)
         self.assertEqual(self.rout.getvalue(), '0011argument HEAD0000')
 
+    def test_fetch_empty(self):
+        self.rin.write('0000')
+        self.rin.seek(0)
+        self.client.fetch_pack('/', lambda heads: [], None, None)
+
     def test_fetch_pack_none(self):
         self.rin.write(
             '008855dcc6bf963f922e1ed5c4bbaaefcfacef57b1d7 HEAD.multi_ack '
@@ -92,6 +100,7 @@ class GitClientTests(TestCase):
         self.assertEqual(TCP_GIT_PORT, client._port)
         self.assertEqual('/bar/baz', path)
 
+    def test_get_transport_and_path_tcp_port(self):
         client, path = get_transport_and_path('git://foo.com:1234/bar/baz')
         self.assertTrue(isinstance(client, TCPGitClient))
         self.assertEqual('foo.com', client._host)
@@ -104,13 +113,30 @@ class GitClientTests(TestCase):
         self.assertEqual('foo.com', client.host)
         self.assertEqual(None, client.port)
         self.assertEqual(None, client.username)
-        self.assertEqual('/bar/baz', path)
+        self.assertEqual('bar/baz', path)
 
+    def test_get_transport_and_path_ssh_port_explicit(self):
         client, path = get_transport_and_path(
             'git+ssh://foo.com:1234/bar/baz')
         self.assertTrue(isinstance(client, SSHGitClient))
         self.assertEqual('foo.com', client.host)
         self.assertEqual(1234, client.port)
+        self.assertEqual('bar/baz', path)
+
+    def test_get_transport_and_path_ssh_abspath_explicit(self):
+        client, path = get_transport_and_path('git+ssh://foo.com//bar/baz')
+        self.assertTrue(isinstance(client, SSHGitClient))
+        self.assertEqual('foo.com', client.host)
+        self.assertEqual(None, client.port)
+        self.assertEqual(None, client.username)
+        self.assertEqual('/bar/baz', path)
+
+    def test_get_transport_and_path_ssh_port_abspath_explicit(self):
+        client, path = get_transport_and_path(
+            'git+ssh://foo.com:1234//bar/baz')
+        self.assertTrue(isinstance(client, SSHGitClient))
+        self.assertEqual('foo.com', client.host)
+        self.assertEqual(1234, client.port)
         self.assertEqual('/bar/baz', path)
 
     def test_get_transport_and_path_ssh_implicit(self):
@@ -121,6 +147,7 @@ class GitClientTests(TestCase):
         self.assertEqual(None, client.username)
         self.assertEqual('/bar/baz', path)
 
+    def test_get_transport_and_path_ssh_host(self):
         client, path = get_transport_and_path('foo.com:/bar/baz')
         self.assertTrue(isinstance(client, SSHGitClient))
         self.assertEqual('foo.com', client.host)
@@ -128,6 +155,7 @@ class GitClientTests(TestCase):
         self.assertEqual(None, client.username)
         self.assertEqual('/bar/baz', path)
 
+    def test_get_transport_and_path_ssh_user_host(self):
         client, path = get_transport_and_path('user@foo.com:/bar/baz')
         self.assertTrue(isinstance(client, SSHGitClient))
         self.assertEqual('foo.com', client.host)
@@ -135,6 +163,30 @@ class GitClientTests(TestCase):
         self.assertEqual('user', client.username)
         self.assertEqual('/bar/baz', path)
 
+    def test_get_transport_and_path_ssh_relpath(self):
+        client, path = get_transport_and_path('foo:bar/baz')
+        self.assertTrue(isinstance(client, SSHGitClient))
+        self.assertEqual('foo', client.host)
+        self.assertEqual(None, client.port)
+        self.assertEqual(None, client.username)
+        self.assertEqual('bar/baz', path)
+
+    def test_get_transport_and_path_ssh_host_relpath(self):
+        client, path = get_transport_and_path('foo.com:bar/baz')
+        self.assertTrue(isinstance(client, SSHGitClient))
+        self.assertEqual('foo.com', client.host)
+        self.assertEqual(None, client.port)
+        self.assertEqual(None, client.username)
+        self.assertEqual('bar/baz', path)
+
+    def test_get_transport_and_path_ssh_user_host_relpath(self):
+        client, path = get_transport_and_path('user@foo.com:bar/baz')
+        self.assertTrue(isinstance(client, SSHGitClient))
+        self.assertEqual('foo.com', client.host)
+        self.assertEqual(None, client.port)
+        self.assertEqual('user', client.username)
+        self.assertEqual('bar/baz', path)
+
     def test_get_transport_and_path_subprocess(self):
         client, path = get_transport_and_path('foo.bar/baz')
         self.assertTrue(isinstance(client, SubprocessGitClient))
@@ -170,12 +222,42 @@ class GitClientTests(TestCase):
             self.client.send_pack, "blah", lambda x: {}, lambda h,w: [])
 
 
+class TestSSHVendor(object):
+
+    def __init__(self):
+        self.host = None
+        self.command = ""
+        self.username = None
+        self.port = None
+
+    def connect_ssh(self, host, command, username=None, port=None):
+        self.host = host
+        self.command = command
+        self.username = username
+        self.port = port
+
+        class Subprocess: pass
+        setattr(Subprocess, 'read', lambda: None)
+        setattr(Subprocess, 'write', lambda: None)
+        setattr(Subprocess, 'can_read', lambda: None)
+        return Subprocess()
+
+
 class SSHGitClientTests(TestCase):
 
     def setUp(self):
         super(SSHGitClientTests, self).setUp()
+
+        self.server = TestSSHVendor()
+        self.real_vendor = client.get_ssh_vendor
+        client.get_ssh_vendor = lambda: self.server
+
         self.client = SSHGitClient('git.samba.org')
 
+    def tearDown(self):
+        super(SSHGitClientTests, self).tearDown()
+        client.get_ssh_vendor = self.real_vendor
+
     def test_default_command(self):
         self.assertEqual('git-upload-pack',
                 self.client._get_cmd_path('upload-pack'))
@@ -186,6 +268,21 @@ class SSHGitClientTests(TestCase):
         self.assertEqual('/usr/lib/git/git-upload-pack',
             self.client._get_cmd_path('upload-pack'))
 
+    def test_connect(self):
+        server = self.server
+        client = self.client
+
+        client.username = "username"
+        client.port = 1337
+
+        client._connect("command", "/path/to/repo")
+        self.assertEquals("username", server.username)
+        self.assertEquals(1337, server.port)
+        self.assertEquals(["git-command '/path/to/repo'"], server.command)
+
+        client._connect("relative-command", "/~/path/to/repo")
+        self.assertEquals(["git-relative-command '~/path/to/repo'"],
+                          server.command)
 
 class ReportStatusParserTests(TestCase):
 

+ 3 - 3
dulwich/tests/test_config.py

@@ -137,14 +137,14 @@ class ConfigFileTests(TestCase):
         c.set(("core", ), "foo", "bar")
         f = StringIO()
         c.write_to_file(f)
-        self.assertEqual("[core]\nfoo = bar\n", f.getvalue())
+        self.assertEqual("[core]\n\tfoo = bar\n", f.getvalue())
 
     def test_write_to_file_subsection(self):
         c = ConfigFile()
         c.set(("branch", "blie"), "foo", "bar")
         f = StringIO()
         c.write_to_file(f)
-        self.assertEqual("[branch \"blie\"]\nfoo = bar\n", f.getvalue())
+        self.assertEqual("[branch \"blie\"]\n\tfoo = bar\n", f.getvalue())
 
     def test_same_line(self):
         cf = self.from_file("[branch.foo] foo = bar\n")
@@ -175,7 +175,7 @@ class ConfigDictTests(TestCase):
         cd.set(("core", ), "foo", "bla")
         cd.set(("core2", ), "foo", "bloe")
 
-        self.assertEqual([("core2", ), ("core", )], cd.keys())
+        self.assertEqual([("core", ), ("core2", )], cd.keys())
         self.assertEqual(cd[("core", )], {'foo': 'bla'})
 
         cd['a'] = 'b'

+ 147 - 0
dulwich/tests/test_hooks.py

@@ -0,0 +1,147 @@
+# test_hooks.py -- Tests for executing hooks
+#
+# This program is free software; you can redistribute it and/or
+# modify it under the terms of the GNU General Public License
+# as published by the Free Software Foundation; either version 2
+# or (at your option) a later version of the License.
+#
+# This program is distributed in the hope that it will be useful,
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+# GNU General Public License for more details.
+#
+# You should have received a copy of the GNU General Public License
+# along with this program; if not, write to the Free Software
+# Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston,
+# MA  02110-1301, USA.
+
+"""Tests for executing hooks."""
+
+import os
+import stat
+import shutil
+import tempfile
+import warnings
+
+from dulwich import errors
+
+from dulwich.hooks import (
+    PreCommitShellHook,
+    PostCommitShellHook,
+    CommitMsgShellHook,
+)
+
+from dulwich.tests import TestCase
+
+
+class ShellHookTests(TestCase):
+
+    def setUp(self):
+        if os.name != 'posix':
+            self.skipTest('shell hook tests require a POSIX shell')
+
+    def test_hook_pre_commit(self):
+        pre_commit_fail = """#!/bin/sh
+exit 1
+"""
+
+        pre_commit_success = """#!/bin/sh
+exit 0
+"""
+
+        repo_dir = os.path.join(tempfile.mkdtemp())
+        os.mkdir(os.path.join(repo_dir, 'hooks'))
+        self.addCleanup(shutil.rmtree, repo_dir)
+
+        pre_commit = os.path.join(repo_dir, 'hooks', 'pre-commit')
+        hook = PreCommitShellHook(repo_dir)
+
+        f = open(pre_commit, 'wb')
+        try:
+            f.write(pre_commit_fail)
+        finally:
+            f.close()
+        os.chmod(pre_commit, stat.S_IREAD | stat.S_IWRITE | stat.S_IEXEC)
+
+        self.assertRaises(errors.HookError, hook.execute)
+
+        f = open(pre_commit, 'wb')
+        try:
+            f.write(pre_commit_success)
+        finally:
+            f.close()
+        os.chmod(pre_commit, stat.S_IREAD | stat.S_IWRITE | stat.S_IEXEC)
+
+        hook.execute()
+
+    def test_hook_commit_msg(self):
+
+        commit_msg_fail = """#!/bin/sh
+exit 1
+"""
+
+        commit_msg_success = """#!/bin/sh
+exit 0
+"""
+
+        repo_dir = os.path.join(tempfile.mkdtemp())
+        os.mkdir(os.path.join(repo_dir, 'hooks'))
+        self.addCleanup(shutil.rmtree, repo_dir)
+
+        commit_msg = os.path.join(repo_dir, 'hooks', 'commit-msg')
+        hook = CommitMsgShellHook(repo_dir)
+
+        f = open(commit_msg, 'wb')
+        try:
+            f.write(commit_msg_fail)
+        finally:
+            f.close()
+        os.chmod(commit_msg, stat.S_IREAD | stat.S_IWRITE | stat.S_IEXEC)
+
+        self.assertRaises(errors.HookError, hook.execute, 'failed commit')
+
+        f = open(commit_msg, 'wb')
+        try:
+            f.write(commit_msg_success)
+        finally:
+            f.close()
+        os.chmod(commit_msg, stat.S_IREAD | stat.S_IWRITE | stat.S_IEXEC)
+
+        hook.execute('empty commit')
+
+    def test_hook_post_commit(self):
+
+        (fd, path) = tempfile.mkstemp()
+        post_commit_msg = """#!/bin/sh
+unlink %(file)s
+""" % {'file': path}
+
+        post_commit_msg_fail = """#!/bin/sh
+exit 1
+"""
+
+        repo_dir = os.path.join(tempfile.mkdtemp())
+        os.mkdir(os.path.join(repo_dir, 'hooks'))
+        self.addCleanup(shutil.rmtree, repo_dir)
+
+        post_commit = os.path.join(repo_dir, 'hooks', 'post-commit')
+        hook = PostCommitShellHook(repo_dir)
+
+        f = open(post_commit, 'wb')
+        try:
+            f.write(post_commit_msg_fail)
+        finally:
+            f.close()
+        os.chmod(post_commit, stat.S_IREAD | stat.S_IWRITE | stat.S_IEXEC)
+
+        self.assertRaises(errors.HookError, hook.execute)
+
+        f = open(post_commit, 'wb')
+        try:
+            f.write(post_commit_msg)
+        finally:
+            f.close()
+        os.chmod(post_commit, stat.S_IREAD | stat.S_IWRITE | stat.S_IEXEC)
+
+        hook.execute()
+        self.assertFalse(os.path.exists(path))

+ 7 - 0
dulwich/tests/test_index.py

@@ -76,6 +76,13 @@ class SimpleIndexTestCase(IndexTestCase):
         self.assertEqual(0, len(i))
         self.assertFalse(os.path.exists(i._filename))
 
+    def test_against_empty_tree(self):
+        i = self.get_simple_index("index")
+        changes = list(i.changes_from_tree(MemoryObjectStore(), None))
+        self.assertEqual(1, len(changes))
+        (oldname, newname), (oldmode, newmode), (oldsha, newsha) = changes[0]
+        self.assertEqual('bla', newname)
+        self.assertEqual('e69de29bb2d1d6434b8b29ae775ad8c2e48c5391', newsha)
 
 class SimpleIndexWriterTestCase(IndexTestCase):
 

+ 193 - 0
dulwich/tests/test_missing_obj_finder.py

@@ -0,0 +1,193 @@
+# test_missing_obj_finder.py -- tests for MissingObjectFinder
+# Copyright (C) 2012 syntevo GmbH
+#
+# This program is free software; you can redistribute it and/or
+# modify it under the terms of the GNU General Public License
+# as published by the Free Software Foundation; version 2
+# or (at your option) any later version of the License.
+#
+# This program is distributed in the hope that it will be useful,
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+# GNU General Public License for more details.
+#
+# You should have received a copy of the GNU General Public License
+# along with this program; if not, write to the Free Software
+# Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston,
+# MA  02110-1301, USA.
+
+from dulwich.object_store import (
+    MemoryObjectStore,
+    )
+from dulwich.objects import (
+    Blob,
+    )
+from dulwich.tests import TestCase
+from utils import (
+    make_object,
+    build_commit_graph,
+    )
+
+
+class MissingObjectFinderTest(TestCase):
+
+    def setUp(self):
+        super(MissingObjectFinderTest, self).setUp()
+        self.store = MemoryObjectStore()
+        self.commits = []
+
+    def cmt(self, n):
+        return self.commits[n-1]
+
+    def assertMissingMatch(self, haves, wants, expected):
+        for sha, path in self.store.find_missing_objects(haves, wants):
+            self.assertTrue(sha in expected,
+                "(%s,%s) erroneously reported as missing" % (sha, path))
+            expected.remove(sha)
+
+        self.assertEquals(len(expected), 0,
+            "some objects are not reported as missing: %s" % (expected, ))
+
+
+class MOFLinearRepoTest(MissingObjectFinderTest):
+
+    def setUp(self):
+        super(MOFLinearRepoTest, self).setUp()
+        f1_1 = make_object(Blob, data='f1') # present in 1, removed in 3
+        f2_1 = make_object(Blob, data='f2') # present in all revisions, changed in 2 and 3
+        f2_2 = make_object(Blob, data='f2-changed')
+        f2_3 = make_object(Blob, data='f2-changed-again')
+        f3_2 = make_object(Blob, data='f3') # added in 2, left unmodified in 3
+
+        commit_spec = [[1], [2, 1], [3, 2]]
+        trees = {1: [('f1', f1_1), ('f2', f2_1)],
+                2: [('f1', f1_1), ('f2', f2_2), ('f3', f3_2)],
+                3: [('f2', f2_3), ('f3', f3_2)] }
+        # commit 1: f1 and f2
+        # commit 2: f3 added, f2 changed. Missing shall report commit id and a
+        # tree referenced by commit
+        # commit 3: f1 removed, f2 changed. Commit sha and root tree sha shall
+        # be reported as modified
+        self.commits = build_commit_graph(self.store, commit_spec, trees)
+        self.missing_1_2 = [self.cmt(2).id, self.cmt(2).tree, f2_2.id, f3_2.id]
+        self.missing_2_3 = [self.cmt(3).id, self.cmt(3).tree, f2_3.id]
+        self.missing_1_3 = [
+            self.cmt(2).id, self.cmt(3).id,
+            self.cmt(2).tree, self.cmt(3).tree,
+            f2_2.id, f3_2.id, f2_3.id]
+
+    def test_1_to_2(self):
+        self.assertMissingMatch([self.cmt(1).id], [self.cmt(2).id],
+            self.missing_1_2)
+
+    def test_2_to_3(self):
+        self.assertMissingMatch([self.cmt(2).id], [self.cmt(3).id],
+            self.missing_2_3)
+
+    def test_1_to_3(self):
+        self.assertMissingMatch([self.cmt(1).id], [self.cmt(3).id],
+            self.missing_1_3)
+
+    def test_bogus_haves_failure(self):
+        """Ensure non-existent SHA in haves are not tolerated"""
+        bogus_sha = self.cmt(2).id[::-1]
+        haves = [self.cmt(1).id, bogus_sha]
+        wants = [self.cmt(3).id]
+        self.assertRaises(KeyError, self.store.find_missing_objects,
+            self.store, haves, wants)
+
+    def test_bogus_wants_failure(self):
+        """Ensure non-existent SHA in wants are not tolerated"""
+        bogus_sha = self.cmt(2).id[::-1]
+        haves = [self.cmt(1).id]
+        wants = [self.cmt(3).id, bogus_sha]
+        self.assertRaises(KeyError, self.store.find_missing_objects,
+            self.store, haves, wants)
+
+    def test_no_changes(self):
+        self.assertMissingMatch([self.cmt(3).id], [self.cmt(3).id], [])
+
+
+class MOFMergeForkRepoTest(MissingObjectFinderTest):
+    # 1 --- 2 --- 4 --- 6 --- 7
+    #          \        /
+    #           3  ---
+    #            \
+    #             5
+
+    def setUp(self):
+        super(MOFMergeForkRepoTest, self).setUp()
+        f1_1 = make_object(Blob, data='f1')
+        f1_2 = make_object(Blob, data='f1-2')
+        f1_4 = make_object(Blob, data='f1-4')
+        f1_7 = make_object(Blob, data='f1-2') # same data as in rev 2
+        f2_1 = make_object(Blob, data='f2')
+        f2_3 = make_object(Blob, data='f2-3')
+        f3_3 = make_object(Blob, data='f3')
+        f3_5 = make_object(Blob, data='f3-5')
+        commit_spec = [[1], [2, 1], [3, 2], [4, 2], [5, 3], [6, 3, 4], [7, 6]]
+        trees = {1: [('f1', f1_1), ('f2', f2_1)],
+                2: [('f1', f1_2), ('f2', f2_1)], # f1 changed
+                # f3 added, f2 changed
+                3: [('f1', f1_2), ('f2', f2_3), ('f3', f3_3)],
+                4: [('f1', f1_4), ('f2', f2_1)],  # f1 changed
+                5: [('f1', f1_2), ('f3', f3_5)], # f2 removed, f3 changed
+                6: [('f1', f1_4), ('f2', f2_3), ('f3', f3_3)], # merged 3 and 4
+                # f1 changed to match rev2. f3 removed
+                7: [('f1', f1_7), ('f2', f2_3)]}
+        self.commits = build_commit_graph(self.store, commit_spec, trees)
+
+        self.f1_2_id = f1_2.id
+        self.f1_4_id = f1_4.id
+        self.f1_7_id = f1_7.id
+        self.f2_3_id = f2_3.id
+        self.f3_3_id = f3_3.id
+
+        self.assertEquals(f1_2.id, f1_7.id, "[sanity]")
+
+    def test_have6_want7(self):
+        # have 6, want 7. Ideally, f1_7 would not be reported, as it is the
+        # same as f1_2. To detect that, however, MissingObjectFinder would
+        # have to record not only the trees of common commits but also all
+        # parent trees and tree items, which is overkill (i.e. in sha_done it
+        # records f1_4 as known, but not that f1_2 was known before that, so
+        # it can't detect that f1_7 is in fact f1_2 and need not be reported).
+        self.assertMissingMatch([self.cmt(6).id], [self.cmt(7).id],
+            [self.cmt(7).id, self.cmt(7).tree, self.f1_7_id])
+
+    def test_have4_want7(self):
+        # have 4, want 7. Shall not include rev5 as it is not in the tree
+        # between 4 and 7 (well, it is, but its SHA's are irrelevant for 4..7
+        # commit hierarchy)
+        self.assertMissingMatch([self.cmt(4).id], [self.cmt(7).id], [
+            self.cmt(7).id, self.cmt(6).id, self.cmt(3).id,
+            self.cmt(7).tree, self.cmt(6).tree, self.cmt(3).tree,
+            self.f2_3_id, self.f3_3_id])
+
+    def test_have1_want6(self):
+        # have 1, want 6. Shall not include rev5
+        self.assertMissingMatch([self.cmt(1).id], [self.cmt(6).id], [
+            self.cmt(6).id, self.cmt(4).id, self.cmt(3).id, self.cmt(2).id,
+            self.cmt(6).tree, self.cmt(4).tree, self.cmt(3).tree,
+            self.cmt(2).tree, self.f1_2_id, self.f1_4_id, self.f2_3_id,
+            self.f3_3_id])
+
+    def test_have3_want6(self):
+        # have 3, want 7. Shall not report rev2 and its tree, because
+        # haves(3) means has parents, i.e. rev2, too
+        # BUT shall report any changes descending rev2 (excluding rev3)
+        # Shall NOT report f1_7 as it's technically == f1_2
+        self.assertMissingMatch([self.cmt(3).id], [self.cmt(7).id], [
+              self.cmt(7).id, self.cmt(6).id, self.cmt(4).id,
+              self.cmt(7).tree, self.cmt(6).tree, self.cmt(4).tree,
+              self.f1_4_id])
+
+    def test_have5_want7(self):
+        # have 5, want 7. Common parent is rev2, hence children of rev2 from
+        # a descent line other than rev5 shall be reported
+        # expects f1_4 from rev6. f3_5 is known in rev5;
+        # f1_7 shall be the same as f1_2 (known, too)
+        self.assertMissingMatch([self.cmt(5).id], [self.cmt(7).id], [
+              self.cmt(7).id, self.cmt(6).id, self.cmt(4).id,
+              self.cmt(7).tree, self.cmt(6).tree, self.cmt(4).tree,
+              self.f1_4_id])

+ 14 - 0
dulwich/tests/test_object_store.py

@@ -237,6 +237,7 @@ class DiskObjectStoreTests(PackBasedObjectStoreTests, TestCase):
         store = DiskObjectStore(self.store_dir)
         self.assertRaises(KeyError, store.__getitem__, b2.id)
         store.add_alternate_path(alternate_dir)
+        self.assertIn(b2.id, store)
         self.assertEqual(b2, store[b2.id])
 
     def test_add_alternate_path(self):
@@ -249,6 +250,19 @@ class DiskObjectStoreTests(PackBasedObjectStoreTests, TestCase):
             ["/foo/path", "/bar/path"],
             store._read_alternate_paths())
 
+    def test_rel_alternative_path(self):
+        alternate_dir = tempfile.mkdtemp()
+        self.addCleanup(shutil.rmtree, alternate_dir)
+        alternate_store = DiskObjectStore(alternate_dir)
+        b2 = make_object(Blob, data="yummy data")
+        alternate_store.add_object(b2)
+        store = DiskObjectStore(self.store_dir)
+        self.assertRaises(KeyError, store.__getitem__, b2.id)
+        store.add_alternate_path(os.path.relpath(alternate_dir, self.store_dir))
+        self.assertEqual(list(alternate_store), list(store.alternates[0]))
+        self.assertIn(b2.id, store)
+        self.assertEqual(b2, store[b2.id])
+
     def test_pack_dir(self):
         o = DiskObjectStore(self.store_dir)
         self.assertEqual(os.path.join(self.store_dir, "pack"), o.pack_dir)

+ 111 - 1
dulwich/tests/test_objects.py

@@ -157,7 +157,8 @@ class BlobReadTests(TestCase):
 
     def test_read_tag_from_file(self):
         t = self.get_tag(tag_sha)
-        self.assertEqual(t.object, (Commit, '51b668fd5bf7061b7d6fa525f88803e6cfadaa51'))
+        self.assertEqual(t.object,
+            (Commit, '51b668fd5bf7061b7d6fa525f88803e6cfadaa51'))
         self.assertEqual(t.name,'signed')
         self.assertEqual(t.tagger,'Ali Sabil <ali.sabil@gmail.com>')
         self.assertEqual(t.tag_time, 1231203091)
@@ -313,6 +314,115 @@ class CommitSerializationTests(TestCase):
         d._deserialize(c.as_raw_chunks())
         self.assertEqual(c, d)
 
+    def test_serialize_mergetag(self):
+        tag = make_object(
+            Tag, object=(Commit, "a38d6181ff27824c79fc7df825164a212eff6a3f"),
+            object_type_name="commit",
+            name="v2.6.22-rc7",
+            tag_time=1183319674,
+            tag_timezone=0,
+            tagger="Linus Torvalds <torvalds@woody.linux-foundation.org>",
+            message=default_message)
+        commit = self.make_commit(mergetag=[tag])
+
+        self.assertEqual("""tree d80c186a03f423a81b39df39dc87fd269736ca86
+parent ab64bbdcc51b170d21588e5c5d391ee5c0c96dfd
+parent 4cffe90e0a41ad3f5190079d7c8f036bde29cbe6
+author James Westby <jw+debian@jameswestby.net> 1174773719 +0000
+committer James Westby <jw+debian@jameswestby.net> 1174773719 +0000
+mergetag object a38d6181ff27824c79fc7df825164a212eff6a3f
+ type commit
+ tag v2.6.22-rc7
+ tagger Linus Torvalds <torvalds@woody.linux-foundation.org> 1183319674 +0000
+ 
+ Linux 2.6.22-rc7
+ -----BEGIN PGP SIGNATURE-----
+ Version: GnuPG v1.4.7 (GNU/Linux)
+ 
+ iD8DBQBGiAaAF3YsRnbiHLsRAitMAKCiLboJkQECM/jpYsY3WPfvUgLXkACgg3ql
+ OK2XeQOiEeXtT76rV4t2WR4=
+ =ivrA
+ -----END PGP SIGNATURE-----
+
+Merge ../b
+""", commit.as_raw_string())
+
+    def test_serialize_mergetags(self):
+        tag = make_object(
+            Tag, object=(Commit, "a38d6181ff27824c79fc7df825164a212eff6a3f"),
+            object_type_name="commit",
+            name="v2.6.22-rc7",
+            tag_time=1183319674,
+            tag_timezone=0,
+            tagger="Linus Torvalds <torvalds@woody.linux-foundation.org>",
+            message=default_message)
+        commit = self.make_commit(mergetag=[tag, tag])
+
+        self.assertEqual("""tree d80c186a03f423a81b39df39dc87fd269736ca86
+parent ab64bbdcc51b170d21588e5c5d391ee5c0c96dfd
+parent 4cffe90e0a41ad3f5190079d7c8f036bde29cbe6
+author James Westby <jw+debian@jameswestby.net> 1174773719 +0000
+committer James Westby <jw+debian@jameswestby.net> 1174773719 +0000
+mergetag object a38d6181ff27824c79fc7df825164a212eff6a3f
+ type commit
+ tag v2.6.22-rc7
+ tagger Linus Torvalds <torvalds@woody.linux-foundation.org> 1183319674 +0000
+ 
+ Linux 2.6.22-rc7
+ -----BEGIN PGP SIGNATURE-----
+ Version: GnuPG v1.4.7 (GNU/Linux)
+ 
+ iD8DBQBGiAaAF3YsRnbiHLsRAitMAKCiLboJkQECM/jpYsY3WPfvUgLXkACgg3ql
+ OK2XeQOiEeXtT76rV4t2WR4=
+ =ivrA
+ -----END PGP SIGNATURE-----
+mergetag object a38d6181ff27824c79fc7df825164a212eff6a3f
+ type commit
+ tag v2.6.22-rc7
+ tagger Linus Torvalds <torvalds@woody.linux-foundation.org> 1183319674 +0000
+ 
+ Linux 2.6.22-rc7
+ -----BEGIN PGP SIGNATURE-----
+ Version: GnuPG v1.4.7 (GNU/Linux)
+ 
+ iD8DBQBGiAaAF3YsRnbiHLsRAitMAKCiLboJkQECM/jpYsY3WPfvUgLXkACgg3ql
+ OK2XeQOiEeXtT76rV4t2WR4=
+ =ivrA
+ -----END PGP SIGNATURE-----
+
+Merge ../b
+""", commit.as_raw_string())
+
+    def test_deserialize_mergetag(self):
+        tag = make_object(
+            Tag, object=(Commit, "a38d6181ff27824c79fc7df825164a212eff6a3f"),
+            object_type_name="commit",
+            name="v2.6.22-rc7",
+            tag_time=1183319674,
+            tag_timezone=0,
+            tagger="Linus Torvalds <torvalds@woody.linux-foundation.org>",
+            message=default_message)
+        commit = self.make_commit(mergetag=[tag])
+
+        d = Commit()
+        d._deserialize(commit.as_raw_chunks())
+        self.assertEqual(commit, d)
+
+    def test_deserialize_mergetags(self):
+        tag = make_object(
+            Tag, object=(Commit, "a38d6181ff27824c79fc7df825164a212eff6a3f"),
+            object_type_name="commit",
+            name="v2.6.22-rc7",
+            tag_time=1183319674,
+            tag_timezone=0,
+            tagger="Linus Torvalds <torvalds@woody.linux-foundation.org>",
+            message=default_message)
+        commit = self.make_commit(mergetag=[tag, tag])
+
+        d = Commit()
+        d._deserialize(commit.as_raw_chunks())
+        self.assertEqual(commit, d)
+
 
 default_committer = 'James Westby <jw+debian@jameswestby.net> 1174773719 +0000'
 

+ 79 - 0
dulwich/tests/test_patch.py

@@ -369,6 +369,85 @@ class DiffTests(TestCase):
             '-same'
             ], f.getvalue().splitlines())
 
+    def test_object_diff_bin_blob_force(self):
+        f = StringIO()
+        # Prepare two slightly different PNG headers
+        b1 = Blob.from_string(
+            "\x89\x50\x4e\x47\x0d\x0a\x1a\x0a\x00\x00\x00\x0d\x49\x48\x44\x52"
+            "\x00\x00\x01\xd5\x00\x00\x00\x9f\x08\x04\x00\x00\x00\x05\x04\x8b")
+        b2 = Blob.from_string(
+            "\x89\x50\x4e\x47\x0d\x0a\x1a\x0a\x00\x00\x00\x0d\x49\x48\x44\x52"
+            "\x00\x00\x01\xd5\x00\x00\x00\x9f\x08\x03\x00\x00\x00\x98\xd3\xb3")
+        store = MemoryObjectStore()
+        store.add_objects([(b1, None), (b2, None)])
+        write_object_diff(f, store, ('foo.png', 0644, b1.id),
+                                    ('bar.png', 0644, b2.id), diff_binary=True)
+        self.assertEqual([
+            'diff --git a/foo.png b/bar.png',
+            'index f73e47d..06364b7 644',
+            '--- a/foo.png',
+            '+++ b/bar.png',
+            '@@ -1,4 +1,4 @@',
+            ' \x89PNG',
+            ' \x1a',
+            ' \x00\x00\x00',
+            '-IHDR\x00\x00\x01\xd5\x00\x00\x00\x9f\x08\x04\x00\x00\x00\x05\x04\x8b',
+            '\\ No newline at end of file',
+            '+IHDR\x00\x00\x01\xd5\x00\x00\x00\x9f\x08\x03\x00\x00\x00\x98\xd3\xb3',
+            '\\ No newline at end of file'
+            ], f.getvalue().splitlines())
+
+    def test_object_diff_bin_blob(self):
+        f = StringIO()
+        # Prepare two slightly different PNG headers
+        b1 = Blob.from_string(
+            "\x89\x50\x4e\x47\x0d\x0a\x1a\x0a\x00\x00\x00\x0d\x49\x48\x44\x52"
+            "\x00\x00\x01\xd5\x00\x00\x00\x9f\x08\x04\x00\x00\x00\x05\x04\x8b")
+        b2 = Blob.from_string(
+            "\x89\x50\x4e\x47\x0d\x0a\x1a\x0a\x00\x00\x00\x0d\x49\x48\x44\x52"
+            "\x00\x00\x01\xd5\x00\x00\x00\x9f\x08\x03\x00\x00\x00\x98\xd3\xb3")
+        store = MemoryObjectStore()
+        store.add_objects([(b1, None), (b2, None)])
+        write_object_diff(f, store, ('foo.png', 0644, b1.id),
+                                    ('bar.png', 0644, b2.id))
+        self.assertEqual([
+            'diff --git a/foo.png b/bar.png',
+            'index f73e47d..06364b7 644',
+            'Binary files a/foo.png and b/bar.png differ'
+            ], f.getvalue().splitlines())
+
+    def test_object_diff_add_bin_blob(self):
+        f = StringIO()
+        b2 = Blob.from_string(
+            '\x89\x50\x4e\x47\x0d\x0a\x1a\x0a\x00\x00\x00\x0d\x49\x48\x44\x52'
+            '\x00\x00\x01\xd5\x00\x00\x00\x9f\x08\x03\x00\x00\x00\x98\xd3\xb3')
+        store = MemoryObjectStore()
+        store.add_object(b2)
+        write_object_diff(f, store, (None, None, None),
+                                    ('bar.png', 0644, b2.id))
+        self.assertEqual([
+            'diff --git /dev/null b/bar.png',
+            'new mode 644',
+            'index 0000000..06364b7 644',
+            'Binary files /dev/null and b/bar.png differ'
+            ], f.getvalue().splitlines())
+
+    def test_object_diff_remove_bin_blob(self):
+        f = StringIO()
+        b1 = Blob.from_string(
+            '\x89\x50\x4e\x47\x0d\x0a\x1a\x0a\x00\x00\x00\x0d\x49\x48\x44\x52'
+            '\x00\x00\x01\xd5\x00\x00\x00\x9f\x08\x04\x00\x00\x00\x05\x04\x8b')
+        store = MemoryObjectStore()
+        store.add_object(b1)
+        write_object_diff(f, store, ('foo.png', 0644, b1.id),
+                                    (None, None, None))
+        self.assertEqual([
+            'diff --git a/foo.png /dev/null',
+            'deleted mode 644',
+            'index f73e47d..0000000',
+            'Binary files a/foo.png and /dev/null differ'
+            ], f.getvalue().splitlines())
+
     def test_object_diff_kind_change(self):
         f = StringIO()
         b1 = Blob.from_string("new\nsame\n")

+ 159 - 0
dulwich/tests/test_repository.py

@@ -21,6 +21,7 @@
 
 from cStringIO import StringIO
 import os
+import stat
 import shutil
 import tempfile
 import warnings
@@ -51,6 +52,7 @@ from dulwich.tests import (
 from dulwich.tests.utils import (
     open_repo,
     tear_down_repo,
+    setup_warning_catcher,
     )
 
 missing_sha = 'b91fa4d900e17e99b433218e988c4eb4a3e9a097'
@@ -412,6 +414,163 @@ class RepositoryTests(TestCase):
             shutil.rmtree(r1_dir)
             shutil.rmtree(r2_dir)
 
+    def test_shell_hook_pre_commit(self):
+        if os.name != 'posix':
+            self.skipTest('shell hook tests require a POSIX shell')
+
+        pre_commit_fail = """#!/bin/sh
+exit 1
+"""
+
+        pre_commit_success = """#!/bin/sh
+exit 0
+"""
+
+        repo_dir = os.path.join(tempfile.mkdtemp())
+        r = Repo.init(repo_dir)
+        self.addCleanup(shutil.rmtree, repo_dir)
+
+        pre_commit = os.path.join(r.controldir(), 'hooks', 'pre-commit')
+
+        f = open(pre_commit, 'wb')
+        try:
+            f.write(pre_commit_fail)
+        finally:
+            f.close()
+        os.chmod(pre_commit, stat.S_IREAD | stat.S_IWRITE | stat.S_IEXEC)
+
+        self.assertRaises(errors.CommitError, r.do_commit, 'failed commit',
+                          committer='Test Committer <test@nodomain.com>',
+                          author='Test Author <test@nodomain.com>',
+                          commit_timestamp=12345, commit_timezone=0,
+                          author_timestamp=12345, author_timezone=0)
+
+        f = open(pre_commit, 'wb')
+        try:
+            f.write(pre_commit_success)
+        finally:
+            f.close()
+        os.chmod(pre_commit, stat.S_IREAD | stat.S_IWRITE | stat.S_IEXEC)
+
+        commit_sha = r.do_commit(
+            'empty commit',
+            committer='Test Committer <test@nodomain.com>',
+            author='Test Author <test@nodomain.com>',
+            commit_timestamp=12395, commit_timezone=0,
+            author_timestamp=12395, author_timezone=0)
+        self.assertEqual([], r[commit_sha].parents)
+
+    def test_shell_hook_commit_msg(self):
+        if os.name != 'posix':
+            self.skipTest('shell hook tests require a POSIX shell')
+
+        commit_msg_fail = """#!/bin/sh
+exit 1
+"""
+
+        commit_msg_success = """#!/bin/sh
+exit 0
+"""
+
+        repo_dir = os.path.join(tempfile.mkdtemp())
+        r = Repo.init(repo_dir)
+        self.addCleanup(shutil.rmtree, repo_dir)
+
+        commit_msg = os.path.join(r.controldir(), 'hooks', 'commit-msg')
+
+        f = open(commit_msg, 'wb')
+        try:
+            f.write(commit_msg_fail)
+        finally:
+            f.close()
+        os.chmod(commit_msg, stat.S_IREAD | stat.S_IWRITE | stat.S_IEXEC)
+
+        self.assertRaises(errors.CommitError, r.do_commit, 'failed commit',
+                          committer='Test Committer <test@nodomain.com>',
+                          author='Test Author <test@nodomain.com>',
+                          commit_timestamp=12345, commit_timezone=0,
+                          author_timestamp=12345, author_timezone=0)
+
+        f = open(commit_msg, 'wb')
+        try:
+            f.write(commit_msg_success)
+        finally:
+            f.close()
+        os.chmod(commit_msg, stat.S_IREAD | stat.S_IWRITE | stat.S_IEXEC)
+
+        commit_sha = r.do_commit(
+            'empty commit',
+            committer='Test Committer <test@nodomain.com>',
+            author='Test Author <test@nodomain.com>',
+            commit_timestamp=12395, commit_timezone=0,
+            author_timestamp=12395, author_timezone=0)
+        self.assertEqual([], r[commit_sha].parents)
+
+    def test_shell_hook_post_commit(self):
+        if os.name != 'posix':
+            self.skipTest('shell hook tests require a POSIX shell')
+
+        repo_dir = os.path.join(tempfile.mkdtemp())
+        r = Repo.init(repo_dir)
+        self.addCleanup(shutil.rmtree, repo_dir)
+
+        (fd, path) = tempfile.mkstemp(dir=repo_dir)
+        post_commit_msg = """#!/bin/sh
+unlink %(file)s
+""" % {'file': path}
+
+        root_sha = r.do_commit(
+            'empty commit',
+            committer='Test Committer <test@nodomain.com>',
+            author='Test Author <test@nodomain.com>',
+            commit_timestamp=12345, commit_timezone=0,
+            author_timestamp=12345, author_timezone=0)
+        self.assertEqual([], r[root_sha].parents)
+
+        post_commit = os.path.join(r.controldir(), 'hooks', 'post-commit')
+
+        f = open(post_commit, 'wb')
+        try:
+            f.write(post_commit_msg)
+        finally:
+            f.close()
+        os.chmod(post_commit, stat.S_IREAD | stat.S_IWRITE | stat.S_IEXEC)
+
+        commit_sha = r.do_commit(
+            'empty commit',
+            committer='Test Committer <test@nodomain.com>',
+            author='Test Author <test@nodomain.com>',
+            commit_timestamp=12345, commit_timezone=0,
+            author_timestamp=12345, author_timezone=0)
+        self.assertEqual([root_sha], r[commit_sha].parents)
+
+        self.assertFalse(os.path.exists(path))
+
+        post_commit_msg_fail = """#!/bin/sh
+exit 1
+"""
+        f = open(post_commit, 'wb')
+        try:
+            f.write(post_commit_msg_fail)
+        finally:
+            f.close()
+        os.chmod(post_commit, stat.S_IREAD | stat.S_IWRITE | stat.S_IEXEC)
+
+        warnings.simplefilter("always", UserWarning)
+        self.addCleanup(warnings.resetwarnings)
+        warnings_list = setup_warning_catcher()
+
+        commit_sha2 = r.do_commit(
+            'empty commit',
+            committer='Test Committer <test@nodomain.com>',
+            author='Test Author <test@nodomain.com>',
+            commit_timestamp=12345, commit_timezone=0,
+            author_timestamp=12345, author_timezone=0)
+        self.assertEqual(len(warnings_list), 1)
+        self.assertIsInstance(warnings_list[-1], UserWarning)
+        self.assertTrue("post-commit hook failed: " in str(warnings_list[-1]))
+        self.assertEqual([commit_sha], r[commit_sha2].parents)
+
 
 class BuildRepoTests(TestCase):
     """Tests that build on-disk repos from scratch.

+ 14 - 0
dulwich/tests/utils.py

@@ -26,6 +26,7 @@ import shutil
 import tempfile
 import time
 import types
+import warnings
 
 from dulwich.index import (
     commit_tree,
@@ -310,3 +311,16 @@ def build_commit_graph(object_store, commit_spec, trees=None, attrs=None):
         commits.append(commit_obj)
 
     return commits
+
+
+def setup_warning_catcher():
+    """Wrap warnings.showwarning with code that records warnings."""
+
+    caught_warnings = []
+    original_showwarning = warnings.showwarning
+
+    def custom_showwarning(*args, **kwargs):
+        caught_warnings.append(args[0])
+
+    warnings.showwarning = custom_showwarning
+    return caught_warnings

+ 44 - 11
dulwich/web.py

@@ -398,16 +398,30 @@ def make_wsgi_chain(*args, **kwargs):
 
 
 # The reference server implementation is based on wsgiref, which is not
-# distributed with python 2.4. If wsgiref is not present, users will not be able
-# to use the HTTP server without a little extra work.
+# distributed with python 2.4. If wsgiref is not present, users will not be
+# able to use the HTTP server without a little extra work.
 try:
     from wsgiref.simple_server import (
         WSGIRequestHandler,
+        ServerHandler,
+        WSGIServer,
         make_server,
-        )
+    )
+    class ServerHandlerLogger(ServerHandler):
+        """ServerHandler that uses dulwich's logger for logging exceptions."""
+
+        def log_exception(self, exc_info):
+            logger.exception('Exception happened during processing of request',
+                             exc_info=exc_info)
 
-    class HTTPGitRequestHandler(WSGIRequestHandler):
-        """Handler that uses dulwich's logger for logging exceptions."""
+        def log_message(self, format, *args):
+            logger.info(format, *args)
+
+        def log_error(self, *args):
+            logger.error(*args)
+
+    class WSGIRequestHandlerLogger(WSGIRequestHandler):
+        """WSGIRequestHandler that uses dulwich's logger for logging exceptions."""
 
         def log_exception(self, exc_info):
             logger.exception('Exception happened during processing of request',
@@ -418,7 +432,24 @@ try:
 
         def log_error(self, *args):
             logger.error(*args)
-
+
+        def handle(self):
+            """Handle a single HTTP request."""
+
+            self.raw_requestline = self.rfile.readline()
+            if not self.parse_request():
+                # An error code has been sent, just exit
+                return
+
+            handler = ServerHandlerLogger(
+                self.rfile, self.wfile, self.get_stderr(), self.get_environ()
+            )
+            handler.request_handler = self      # backpointer for logging
+            handler.run(self.server.get_app())
+
+    class WSGIServerLogger(WSGIServer):
+        def handle_error(self, request, client_address):
+            """Handle an error."""
+            logger.exception(
+                'Exception happened during processing of request from %s',
+                str(client_address))
 
     def main(argv=sys.argv):
         """Entry point for starting an HTTP git server."""
@@ -428,22 +459,24 @@ try:
             gitdir = os.getcwd()
 
         # TODO: allow serving on other addresses/ports via command-line flag
-        listen_addr=''
+        listen_addr = ''
         port = 8000
 
         log_utils.default_logging_config()
         backend = DictBackend({'/': Repo(gitdir)})
         app = make_wsgi_chain(backend)
         server = make_server(listen_addr, port, app,
-                             handler_class=HTTPGitRequestHandler)
+                             handler_class=WSGIRequestHandlerLogger,
+                             server_class=WSGIServerLogger)
         logger.info('Listening for HTTP connections on %s:%d', listen_addr,
                     port)
         server.serve_forever()
 
 except ImportError:
-    # No wsgiref found; don't provide the reference functionality, but leave the
-    # rest of the WSGI-based implementation.
+    # No wsgiref found; don't provide the reference functionality, but leave
+    # the rest of the WSGI-based implementation.
     def main(argv=sys.argv):
         """Stub entry point for failing to start a server without wsgiref."""
-        sys.stderr.write('Sorry, the wsgiref module is required for dul-web.\n')
+        sys.stderr.write(
+            'Sorry, the wsgiref module is required for dul-web.\n')
         sys.exit(1)

+ 8 - 8
setup.py

@@ -10,7 +10,7 @@ except ImportError:
     has_setuptools = False
 from distutils.core import Distribution
 
-dulwich_version_string = '0.8.6'
+dulwich_version_string = '0.9.0'
 
 include_dirs = []
 # Windows MSVC support
@@ -30,8 +30,8 @@ class DulwichDistribution(Distribution):
         return not self.pure and not '__pypy__' in sys.modules
 
     global_options = Distribution.global_options + [
-        ('pure', None, 
-            "use pure Python code instead of C extensions (slower on CPython)")]
+        ('pure', None, "use pure Python code instead of C "
+                       "extensions (slower on CPython)")]
 
     pure = False
 
@@ -45,8 +45,7 @@ if sys.platform == 'darwin' and os.path.exists('/usr/bin/xcodebuild'):
     out, err = p.communicate()
     for l in out.splitlines():
         # Also parse only first digit, because 3.2.1 can't be parsed nicely
-        if (l.startswith('Xcode') and
-            int(l.split()[1].split('.')[0]) >= 4):
+        if l.startswith('Xcode') and int(l.split()[1].split('.')[0]) >= 4:
             os.environ['ARCHFLAGS'] = ''
 
 setup_kwargs = {}
@@ -59,7 +58,8 @@ setup(name='dulwich',
       keywords='git',
       version=dulwich_version_string,
       url='http://samba.org/~jelmer/dulwich',
-      download_url='http://samba.org/~jelmer/dulwich/dulwich-%s.tar.gz' % dulwich_version_string,
+      download_url='http://samba.org/~jelmer/dulwich/'
+                   'dulwich-%s.tar.gz' % dulwich_version_string,
       license='GPLv2 or later',
       author='Jelmer Vernooij',
       author_email='jelmer@samba.org',
@@ -73,14 +73,14 @@ setup(name='dulwich',
       """,
       packages=['dulwich', 'dulwich.tests'],
       scripts=['bin/dulwich', 'bin/dul-daemon', 'bin/dul-web'],
-      ext_modules = [
+      ext_modules=[
           Extension('dulwich._objects', ['dulwich/_objects.c'],
                     include_dirs=include_dirs),
           Extension('dulwich._pack', ['dulwich/_pack.c'],
               include_dirs=include_dirs),
           Extension('dulwich._diff_tree', ['dulwich/_diff_tree.c'],
               include_dirs=include_dirs),
-          ],
+      ],
       distclass=DulwichDistribution,
       **setup_kwargs
       )