.. _tutorial-object-store:

The object store
================

The objects are stored in the ``object store`` of the repository.

    >>> from dulwich.repo import Repo
    >>> repo = Repo.init("myrepo", mkdir=True)

Initial commit
--------------

When you use Git, you generally add or modify content. As our repository is
empty for now, we'll start by adding a new file::

    >>> from dulwich.objects import Blob
    >>> blob = Blob.from_string(b"My file content\n")
    >>> blob.id
    b'c55063a4d5d37aa1af2b2dad3a70aa34dae54dc6'

Of course, you could also create a blob from an existing file using
``from_file`` instead.

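The id shown above is derived from the content alone. As a rough sketch that uses only the standard library (the helper name ``git_blob_id`` is hypothetical and not part of Dulwich), the id is the SHA-1 of a short header followed by the raw bytes:

```python
import hashlib


def git_blob_id(data: bytes) -> str:
    """Return the hex SHA-1 that Git assigns to a blob holding ``data``."""
    # Git hashes "blob <size>\0" followed by the content itself.
    header = b"blob " + str(len(data)).encode("ascii") + b"\x00"
    return hashlib.sha1(header + data).hexdigest()


blob_id = git_blob_id(b"My file content\n")
print(blob_id)
```

This should print the same id as ``blob.id``, because the id depends only on the bytes and not on any file name.
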
As said in the introduction, file content is separated from file name. Let's
give this content a name::

    >>> from dulwich.objects import Tree
    >>> tree = Tree()
    >>> tree.add(b"spam", 0o100644, blob.id)

Note that ``0o100644`` is the octal form for a regular file with common
permissions. You can hardcode the mode or build it with the ``stat`` module.

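As a small illustration of the ``stat`` route, the mode of a regular file can be assembled from the file-type bits and the permission bits. The variable names below are only illustrative:

```python
import stat

# Regular-file type bits combined with rw-r--r-- permissions.
regular_file_mode = stat.S_IFREG | 0o644
print(oct(regular_file_mode))

# Two other modes that show up in Git trees: executables and sub-trees.
executable_mode = stat.S_IFREG | 0o755
directory_mode = stat.S_IFDIR
```

Building the value this way makes the intent visible, while the hardcoded octal literal is shorter; both yield the same integer.
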
The tree state of our repository still needs to be placed in time. That's the
job of the commit::

    >>> from dulwich.objects import Commit, parse_timezone
    >>> from time import time
    >>> commit = Commit()
    >>> commit.tree = tree.id
    >>> author = b"Your Name <your.email@example.com>"
    >>> commit.author = commit.committer = author
    >>> commit.commit_time = commit.author_time = int(time())
    >>> tz = parse_timezone(b'-0200')[0]
    >>> commit.commit_timezone = commit.author_timezone = tz
    >>> commit.encoding = b"UTF-8"
    >>> commit.message = b"Initial commit"

Note that the initial commit has no parents.

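The timezone fields hold an offset from UTC in seconds. The sketch below is a rough stand-in for the first element returned by ``parse_timezone``; the helper ``tz_offset_seconds`` is hypothetical and is not Dulwich's implementation:

```python
def tz_offset_seconds(text: str) -> int:
    """Convert a "+HHMM" or "-HHMM" string into a signed offset in seconds."""
    sign = -1 if text[0] == "-" else 1
    digits = text.lstrip("+-")
    hours, minutes = int(digits[:-2]), int(digits[-2:])
    return sign * (hours * 3600 + minutes * 60)


offset = tz_offset_seconds("-0200")
print(offset)
```

Two hours west of UTC therefore becomes a negative number of seconds, and UTC itself is simply ``0``.
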
At this point, the repository is still empty because all operations happen in
memory. Let's "commit" it.

    >>> object_store = repo.object_store
    >>> object_store.add_object(blob)

Now the ".git/objects" folder contains a first SHA-1 file. Let's continue
saving the changes::

    >>> object_store.add_object(tree)
    >>> object_store.add_object(commit)

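To make "a SHA-1 file" more concrete, here is a minimal sketch of how Git lays out a loose object on disk, written against a temporary directory with the standard library only. The function ``write_loose_object`` is hypothetical; it illustrates the layout and is not the code path Dulwich uses, and packed objects are stored differently:

```python
import hashlib
import os
import tempfile
import zlib


def write_loose_object(objects_dir: str, obj_type: bytes, data: bytes) -> str:
    """Store ``data`` the way Git lays out a loose object; return its path."""
    raw = obj_type + b" " + str(len(data)).encode("ascii") + b"\x00" + data
    sha = hashlib.sha1(raw).hexdigest()
    # The first two hex digits name a directory, the other 38 name the file.
    path = os.path.join(objects_dir, sha[:2], sha[2:])
    os.makedirs(os.path.dirname(path), exist_ok=True)
    with open(path, "wb") as f:
        f.write(zlib.compress(raw))
    return path


with tempfile.TemporaryDirectory() as objects_dir:
    path = write_loose_object(objects_dir, b"blob", b"My file content\n")
    relative = os.path.relpath(path, objects_dir)
    with open(path, "rb") as f:
        restored = zlib.decompress(f.read())
print(relative)
```

The printed path is the blob id split after its first two characters, which is the file you should find under ".git/objects" after adding the blob.
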
Now the physical repository contains three objects but still has no branch.
Let's create the master branch like Git would::

    >>> repo.refs[b'refs/heads/master'] = commit.id

The master branch now has a commit to start from. When we commit to master, we
are also moving HEAD, which is Git's currently checked out branch:

    >>> head = repo.refs[b'HEAD']
    >>> head == commit.id
    True
    >>> head == repo.refs[b'refs/heads/master']
    True

How did that work? As it turns out, HEAD is a special kind of ref called a
symbolic ref, and it points at master. Most functions on the refs container
work transparently with symbolic refs, but we can also take a peek inside HEAD:

    >>> repo.refs.read_ref(b'HEAD')
    b'ref: refs/heads/master'

Normally, you won't need to use ``read_ref``. If you want to change which ref
HEAD points to, in order to check out another branch, just use
``set_symbolic_ref``.

Now our repository is officially tracking a branch named "master" referring to a
single commit.

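The way a symbolic ref is followed can be sketched with a plain dictionary standing in for the refs container. Everything here is a toy model: ``resolve_ref`` is a hypothetical helper and the ids are placeholders, not real commit ids:

```python
SYMREF_PREFIX = b"ref: "


def resolve_ref(refs: dict, name: bytes, max_depth: int = 5) -> bytes:
    """Follow symbolic refs in ``refs`` until a plain id is reached."""
    for _ in range(max_depth):
        value = refs[name]
        if not value.startswith(SYMREF_PREFIX):
            return value  # a plain hex id
        name = value[len(SYMREF_PREFIX):]
    raise ValueError("too many levels of symbolic refs")


# Toy refs container: HEAD points at master, master points at a commit.
refs = {
    b"HEAD": b"ref: refs/heads/master",
    b"refs/heads/master": b"1" * 40,  # placeholder id
}
head = resolve_ref(refs, b"HEAD")

# Moving the branch moves HEAD with it, since HEAD only stores the ref name.
refs[b"refs/heads/master"] = b"2" * 40  # placeholder id
moved_head = resolve_ref(refs, b"HEAD")
```

This is why assigning to the branch ref was enough to make HEAD resolve to the new commit.
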
Playing again with Git
----------------------

At this point you can come back to the shell, go into the "myrepo" folder and
type ``git status`` to let Git confirm that this is a regular repository on
branch "master".

Git will tell you that the file "spam" is deleted, which is normal because
Git is comparing the repository state with the current working copy. And we
have absolutely no working copy using Dulwich because we don't need it at
all!

You can check out the last state using ``git checkout -f``. The force flag
will prevent Git from complaining that there are uncommitted changes in the
working copy.

The file ``spam`` appears and, unsurprisingly, contains the same bytes as the
blob::

    $ cat spam
    My file content

Changing a File and Committing it
---------------------------------

Now that we have a first commit, the next one will show a difference.

As seen in the introduction, it's about making a path in a tree point to a
new blob. The old blob will remain to compute the diff. The tree is altered
and the new commit's task is to point to this new version.

Let's first build the blob::

    >>> from dulwich.objects import Blob
    >>> spam = Blob.from_string(b"My new file content\n")
    >>> spam.id
    b'16ee2682887a962f854ebd25a61db16ef4efe49f'

An alternative is to alter the previously constructed blob object::

    >>> blob.data = b"My new file content\n"
    >>> blob.id
    b'16ee2682887a962f854ebd25a61db16ef4efe49f'

In any case, update the blob id known as "spam". You also have the
opportunity to change its mode::

    >>> tree[b"spam"] = (0o100644, spam.id)

Now let's record the change::

    >>> from dulwich.objects import Commit
    >>> from time import time
    >>> c2 = Commit()
    >>> c2.tree = tree.id
    >>> c2.parents = [commit.id]
    >>> c2.author = c2.committer = b"John Doe <john@example.com>"
    >>> c2.commit_time = c2.author_time = int(time())
    >>> c2.commit_timezone = c2.author_timezone = 0
    >>> c2.encoding = b"UTF-8"
    >>> c2.message = b'Changing "spam"'

In this new commit we record the changed tree id and, most importantly, the
previous commit as the parent. Parents are actually a list because a commit
may have several parents after merging branches.

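To show where the parent list ends up, here is a simplified sketch of the text Git stores for a commit. The helper ``commit_payload`` is hypothetical, the ids are placeholders, and optional headers such as the encoding are left out:

```python
import hashlib


def commit_payload(tree, parents, author, timestamp, tz, message):
    """Build the text of a commit object; each parent gets its own line."""
    lines = [b"tree " + tree]
    lines += [b"parent " + p for p in parents]
    stamp = str(timestamp).encode("ascii") + b" " + tz
    lines.append(b"author " + author + b" " + stamp)
    lines.append(b"committer " + author + b" " + stamp)
    # A blank line separates the headers from the message.
    return b"\n".join(lines) + b"\n\n" + message


author = b"John Doe <john@example.com>"
root = commit_payload(b"a" * 40, [], author, 0, b"+0000", b"Initial commit")
merge = commit_payload(
    b"b" * 40, [b"c" * 40, b"d" * 40], author, 0, b"+0000", b"Merge"
)

# Like blobs, commits are identified by the SHA-1 of a header plus the payload.
merge_id = hashlib.sha1(b"commit %d\x00" % len(merge) + merge).hexdigest()
```

A root commit has no ``parent`` line at all, an ordinary commit has one, and a merge has one per merged branch.
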
Let's put the objects in the object store::

    >>> repo.object_store.add_object(spam)
    >>> repo.object_store.add_object(tree)
    >>> repo.object_store.add_object(c2)

You can already ask Git to inspect this commit using ``git show`` with the
value of ``c2.id`` as an argument. You'll see the difference with the
previous blob recorded as "spam".

The diff between the previous head and the new one can be printed using
``write_tree_diff``::

    >>> from dulwich.patch import write_tree_diff
    >>> from io import BytesIO
    >>> import sys
    >>> out = BytesIO()
    >>> write_tree_diff(out, repo.object_store, commit.tree, tree.id)
    >>> _ = sys.stdout.write(out.getvalue().decode('ascii'))
    diff --git a/spam b/spam
    index c55063a..16ee268 100644
    --- a/spam
    +++ b/spam
    @@ -1 +1 @@
    -My file content
    +My new file content

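The body of that patch is an ordinary unified diff of the two blob contents. As a rough illustration with the standard library's ``difflib`` (this is not necessarily how Dulwich computes it, and the ``diff --git`` and ``index`` lines are Git-specific additions that ``difflib`` does not produce):

```python
import difflib

old = "My file content\n".splitlines(keepends=True)
new = "My new file content\n".splitlines(keepends=True)

# fromfile/tofile mimic the "a/" and "b/" prefixes Git uses for the two sides.
diff_lines = list(
    difflib.unified_diff(old, new, fromfile="a/spam", tofile="b/spam")
)
print("".join(diff_lines), end="")
```

Because the file has a single line, the whole content shows up as one removed line and one added line.
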
You won't see it using ``git log`` because the head is still the previous
commit. It's easy to remedy::

    >>> repo.refs[b'refs/heads/master'] = c2.id

Now all Git tools will work as expected.

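The reason the ref matters is that history is walked backwards from whatever commit a ref points to. A toy sketch, using a dictionary of parent lists in place of a real object store (the ``log`` helper and the short ids are hypothetical):

```python
def log(commits: dict, start: str) -> list:
    """Return the ids reachable from ``start``, newest first (first parent only)."""
    history = []
    current = start
    while current is not None:
        history.append(current)
        parents = commits[current]
        current = parents[0] if parents else None
    return history


# Toy graph: "c2" has "c1" as its parent; "c1" is the root commit.
commits = {"c1": [], "c2": ["c1"]}

before = log(commits, "c1")  # master still points at the first commit
after = log(commits, "c2")   # master has been moved to the new commit
```

Before the ref is updated, the walk never reaches the second commit even though the object is already in the store.
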