===================
Porting to Python 3
===================

Django 1.5 is the first version of Django to support Python 3. The same code
runs both on Python 2 (≥ 2.6.5) and Python 3 (≥ 3.2), thanks to the six_
compatibility layer.

.. _six: http://packages.python.org/six/

This document is primarily targeted at authors of pluggable applications
who want to support both Python 2 and 3. It also describes guidelines that
apply to Django's code.

Philosophy
==========

This document assumes that you are familiar with the changes between Python 2
and Python 3. If you aren't, read `Python's official porting guide`_ first.
Refreshing your knowledge of unicode handling on Python 2 and 3 will help; the
`Pragmatic Unicode`_ presentation is a good resource.

Django uses the *Python 2/3 Compatible Source* strategy. Of course, you're
free to choose another strategy for your own code, especially if you don't need
to stay compatible with Python 2. But authors of pluggable applications are
encouraged to use the same porting strategy as Django itself.

Writing compatible code is much easier if you target Python ≥ 2.6. Django 1.5
introduces compatibility tools such as :mod:`django.utils.six`. For
convenience, forwards-compatible aliases were introduced in Django 1.4.2. If
your application takes advantage of these tools, it will require Django ≥
1.4.2.

Obviously, writing compatible source code adds some overhead, and that can
cause frustration. Django's developers have found that attempting to write
Python 3 code that's compatible with Python 2 is much more rewarding than the
opposite. Not only does that make your code more future-proof, but Python 3's
advantages (like the saner string handling) start shining quickly. Dealing
with Python 2 becomes a backwards compatibility requirement, and we as
developers are used to dealing with such constraints.

Porting tools provided by Django are inspired by this philosophy, and it's
reflected throughout this guide.

.. _Python's official porting guide: http://docs.python.org/py3k/howto/pyporting.html
.. _Pragmatic Unicode: http://nedbatchelder.com/text/unipain.html

Porting tips
============

Unicode literals
----------------

This step consists of:

- Adding ``from __future__ import unicode_literals`` at the top of your Python
  modules -- it's best to put it in each and every module, otherwise you'll
  keep checking the top of your files to see which mode is in effect;
- Removing the ``u`` prefix before unicode strings;
- Adding a ``b`` prefix before bytestrings.

Performing these changes systematically guarantees backwards compatibility.

However, Django applications generally don't need bytestrings, since Django
only exposes unicode interfaces to the programmer. Python 3 discourages using
bytestrings, except for binary data or byte-oriented interfaces. Python 2
makes bytestrings and unicode strings effectively interchangeable, as long as
they only contain ASCII data. Take advantage of this to use unicode strings
wherever possible and avoid the ``b`` prefixes.

.. note::

    Python 2's ``u`` prefix is a syntax error in Python 3.2 but it will be
    allowed again in Python 3.3 thanks to :pep:`414`. Thus, this
    transformation is optional if you target Python ≥ 3.3. It's still
    recommended, per the "write Python 3 code" philosophy.

String handling
---------------

Python 2's :func:`unicode` type was renamed :func:`str` in Python 3,
:func:`str` was renamed ``bytes()``, and :func:`basestring` disappeared.
six_ provides :ref:`tools <string-handling-with-six>` to deal with these
changes.

Django also contains several string-related classes and functions in the
:mod:`django.utils.encoding` and :mod:`django.utils.safestring` modules. Their
names used the words ``str``, which doesn't mean the same thing in Python 2
and Python 3, and ``unicode``, which doesn't exist in Python 3. In order to
avoid ambiguity and confusion these concepts were renamed ``bytes`` and
``text``.

Here are the name changes in :mod:`django.utils.encoding`:

==================  ==================
Old name            New name
==================  ==================
``smart_str``       ``smart_bytes``
``smart_unicode``   ``smart_text``
``force_unicode``   ``force_text``
==================  ==================

For backwards compatibility, the old names still work on Python 2. Under
Python 3, ``smart_str`` is an alias for ``smart_text``.

For forwards compatibility, the new names work as of Django 1.4.2.

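The ``bytes`` versus ``text`` distinction behind these names can be sketched
in plain Python. The ``to_text`` and ``to_bytes`` helpers below are
hypothetical and only handle string inputs; they are not Django's
implementation, just an illustration of the two directions of conversion::

    def to_text(value, encoding="utf-8", errors="strict"):
        """Hypothetical helper: return a string value as text."""
        if isinstance(value, bytes):
            # Bytes must be decoded explicitly to become text.
            return value.decode(encoding, errors)
        return value


    def to_bytes(value, encoding="utf-8", errors="strict"):
        """Hypothetical helper: return a string value as bytes."""
        if isinstance(value, bytes):
            return value
        # Text must be encoded explicitly to become bytes.
        return value.encode(encoding, errors)


    raw = b"caf\xc3\xa9"        # UTF-8 encoded bytes
    text = to_text(raw)         # text on both Python 2 and Python 3
    round_trip = to_bytes(text)

Converting in one well-defined place, rather than relying on implicit
conversions, is what keeps the same code working on both versions.
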
.. note::

    :mod:`django.utils.encoding` was deeply refactored in Django 1.5 to
    provide a more consistent API. Check its documentation for more
    information.

:mod:`django.utils.safestring` is mostly used via the
:func:`~django.utils.safestring.mark_safe` and
:func:`~django.utils.safestring.mark_for_escaping` functions, which didn't
change. In case you're using the internals, here are the name changes:

==================  ==================
Old name            New name
==================  ==================
``EscapeString``    ``EscapeBytes``
``EscapeUnicode``   ``EscapeText``
``SafeString``      ``SafeBytes``
``SafeUnicode``     ``SafeText``
==================  ==================

For backwards compatibility, the old names still work on Python 2. Under
Python 3, ``EscapeString`` and ``SafeString`` are aliases for ``EscapeText``
and ``SafeText`` respectively.

For forwards compatibility, the new names work as of Django 1.4.2.

:meth:`~object.__str__` and :meth:`~object.__unicode__` methods
---------------------------------------------------------------

In Python 2, the object model specifies :meth:`~object.__str__` and
:meth:`~object.__unicode__` methods. If these methods exist, they must return
``str`` (bytes) and ``unicode`` (text) respectively.

The ``print`` statement and the :func:`str` built-in call
:meth:`~object.__str__` to determine the human-readable representation of an
object. The :func:`unicode` built-in calls :meth:`~object.__unicode__` if it
exists, and otherwise falls back to :meth:`~object.__str__` and decodes the
result with the system encoding. Conversely, the
:class:`~django.db.models.Model` base class automatically derives
:meth:`~object.__str__` from :meth:`~object.__unicode__` by encoding to UTF-8.

In Python 3, there's simply :meth:`~object.__str__`, which must return ``str``
(text).

(It is also possible to define ``__bytes__()``, but Django applications have
little use for that method, because they hardly ever deal with
``bytes``.)

Django provides a simple way to define :meth:`~object.__str__` and
:meth:`~object.__unicode__` methods that work on Python 2 and 3: you must
define a :meth:`~object.__str__` method returning text and apply the
:func:`~django.utils.encoding.python_2_unicode_compatible` decorator.

On Python 3, the decorator is a no-op. On Python 2, it defines appropriate
:meth:`~object.__unicode__` and :meth:`~object.__str__` methods (replacing the
original :meth:`~object.__str__` method in the process). Here's an example::

    from __future__ import unicode_literals
    from django.utils.encoding import python_2_unicode_compatible

    @python_2_unicode_compatible
    class MyClass(object):
        def __str__(self):
            return "Instance of my class"

This technique is the best match for Django's porting philosophy.

For forwards compatibility, this decorator is available as of Django 1.4.2.

Finally, note that :meth:`~object.__repr__` must return a ``str`` on all
versions of Python.

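To make the decorator's behavior concrete, here is a minimal sketch of how a
class decorator of this kind could be written, following the description
above. The name ``py2_unicode_compatible`` is hypothetical and the code is an
illustration, not necessarily how Django implements
``python_2_unicode_compatible``::

    import sys

    PY2 = sys.version_info[0] == 2


    def py2_unicode_compatible(klass):
        """Hypothetical stand-in: a no-op on Python 3."""
        if PY2:
            # Keep the text-returning method under the name Python 2 expects.
            klass.__unicode__ = klass.__str__
            # Derive a bytes-returning __str__ by encoding to UTF-8.
            klass.__str__ = lambda self: self.__unicode__().encode("utf-8")
        return klass


    @py2_unicode_compatible
    class Greeting(object):
        def __str__(self):
            return "Instance of my class"


    label = str(Greeting())

The class author only writes the Python 3 flavored ``__str__``; the Python 2
specifics stay inside the decorator.
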
:class:`dict` and :class:`dict`-like classes
--------------------------------------------

:meth:`dict.keys`, :meth:`dict.items` and :meth:`dict.values` return lists in
Python 2 and iterators in Python 3. :class:`~django.http.QueryDict` and the
:class:`dict`-like classes defined in :mod:`django.utils.datastructures`
behave likewise in Python 3.

six_ provides compatibility functions to work around this change:
:func:`~six.iterkeys`, :func:`~six.iteritems`, and :func:`~six.itervalues`.
It also contains an undocumented ``iterlists`` function that works well for
``django.utils.datastructures.MultiValueDict`` and its subclasses.

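The idea behind these helpers can be sketched without any dependency. The
``iteritems`` function below is a hypothetical stand-in written for
illustration; in real code, prefer the functions six provides::

    def iteritems(mapping):
        """Hypothetical stand-in: iterate over (key, value) pairs lazily."""
        # Python 2 dicts have iteritems(); Python 3 dicts only have items().
        method = getattr(mapping, "iteritems", None)
        if method is None:
            method = mapping.items
        return iter(method())


    prices = {"apple": 3, "pear": 5}

    # Lazy iteration that behaves the same on both versions.
    total = sum(price for _name, price in iteritems(prices))

    # When a real list is needed (indexing, len(), mutating the dict while
    # looping), wrapping the call in list() is portable too.
    names = sorted(list(prices.keys()))
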
:class:`~django.http.HttpRequest` and :class:`~django.http.HttpResponse` objects
--------------------------------------------------------------------------------

According to :pep:`3333`:

- headers are always ``str`` objects,
- input and output streams are always ``bytes`` objects.

Specifically, :attr:`HttpResponse.content <django.http.HttpResponse.content>`
contains ``bytes``, which may become an issue if you compare it with a
``str`` in your tests. The preferred solution is to rely on
:meth:`~django.test.TestCase.assertContains` and
:meth:`~django.test.TestCase.assertNotContains`. These methods accept a
response and a unicode string as arguments.

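The pitfall is easy to reproduce without Django. In the sketch below,
``content`` is a stand-in for ``response.content`` (an assumption made for
illustration, no real response object is involved)::

    # Stand-in for response.content, which holds bytes.
    content = "Hello, world".encode("utf-8")

    # On Python 3, bytes never compare equal to text, so this is False there
    # even though the payload "looks" identical.
    naive_match = (content == "Hello, world")

    # Comparing bytes with bytes works on both versions ...
    bytes_match = (content == b"Hello, world")

    # ... and so does decoding the payload before comparing it with text.
    decoded_match = (content.decode("utf-8") == "Hello, world")

When a test has a response object at hand, ``assertContains`` remains the
preferred option, since it takes care of the decoding.
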
Coding guidelines
=================

The following guidelines are enforced in Django's source code. They're also
recommended for third-party applications that follow the same porting strategy.

Syntax requirements
-------------------

Unicode
~~~~~~~

In Python 3, all strings are considered Unicode by default. The ``unicode``
type from Python 2 is called ``str`` in Python 3, and ``str`` becomes
``bytes``.

You mustn't use the ``u`` prefix before a unicode string literal because it's
a syntax error in Python 3.2. You must prefix byte strings with ``b``.

In order to enable the same behavior in Python 2, every module must import
``unicode_literals`` from ``__future__``::

    from __future__ import unicode_literals

    my_string = "This is a unicode literal"
    my_bytestring = b"This is a bytestring"

If you need a byte string literal under Python 2 and a unicode string literal
under Python 3, use the :func:`str` builtin::

    str('my string')

In Python 3, there aren't any automatic conversions between ``str`` and
``bytes``, and the :mod:`codecs` module became stricter. :meth:`str.encode`
always returns ``bytes``, and ``bytes.decode`` always returns ``str``. As a
consequence, the following pattern is sometimes necessary::

    value = value.encode('ascii', 'ignore').decode('ascii')

Be cautious if you have to `index bytestrings`_.

.. _index bytestrings: http://docs.python.org/py3k/howto/pyporting.html#bytes-literals

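The two points above (the encode/decode round trip and the indexing caveat)
can be illustrated with a short, dependency-free sketch. The sample values
are made up for the example::

    # Build a text value containing a non-ASCII character (e with acute).
    text = b"caf\xc3\xa9".decode("utf-8")

    # Drop everything that isn't ASCII, and end up with text again.
    ascii_only = text.encode("ascii", "ignore").decode("ascii")

    data = b"abc"

    # Indexing differs: a one-character bytestring on Python 2, an integer
    # (97) on Python 3.
    first = data[0]

    # Slicing returns a bytestring on both versions, so it is the portable
    # way to pick out a single byte.
    first_slice = data[0:1]
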
Exceptions
~~~~~~~~~~

When you capture exceptions, use the ``as`` keyword::

    try:
        ...
    except MyException as exc:
        ...

This older syntax was removed in Python 3::

    try:
        ...
    except MyException, exc:    # Don't do that!
        ...

The syntax to reraise an exception with a different traceback also changed.
Use :func:`six.reraise`.

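As an illustration of what reraising with a given traceback means, here is a
Python 3 only sketch. ``reraise_py3``, ``WrappedError``, ``risky`` and
``call_and_wrap`` are hypothetical names; the Python 2 branch is omitted
because its three-argument ``raise`` statement is a syntax error on Python 3,
which is precisely why a helper such as :func:`six.reraise` is needed in
shared code::

    import sys


    def reraise_py3(exc_type, value, tb=None):
        """Hypothetical, Python 3 only: raise ``value`` with traceback ``tb``."""
        # exc_type is accepted only to mirror the order of sys.exc_info().
        if value.__traceback__ is not tb:
            raise value.with_traceback(tb)
        raise value


    class WrappedError(Exception):
        pass


    def risky():
        raise ValueError("original problem")


    def call_and_wrap():
        try:
            risky()
        except ValueError as exc:
            # Keep the original traceback so the failing frame stays visible.
            tb = sys.exc_info()[2]
            reraise_py3(WrappedError, WrappedError(str(exc)), tb)

The caller sees a ``WrappedError`` whose traceback still points into
``risky()``.
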
Magic methods
-------------

Use the patterns below to handle magic methods renamed in Python 3.

Iterators
~~~~~~~~~

::

    from django.utils import six

    class MyIterator(six.Iterator):
        def __iter__(self):
            return self             # implement some logic here

        def __next__(self):
            raise StopIteration     # implement some logic here

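For a fuller picture, the sketch below fills in the placeholders with a small
countdown iterator. To stay dependency-free it defines its own ``Iterator``
base class as a hypothetical stand-in for ``six.Iterator``; in a Django
project you would inherit from ``six.Iterator`` instead::

    class Iterator(object):
        """Hypothetical stand-in for the six.Iterator base class."""

        def next(self):
            # Python 2 calls next(); forward it to the Python 3 style method.
            return type(self).__next__(self)


    class Countdown(Iterator):
        def __init__(self, start):
            self.current = start

        def __iter__(self):
            return self

        def __next__(self):
            if self.current <= 0:
                raise StopIteration
            self.current -= 1
            return self.current + 1


    values = list(Countdown(3))

Only ``__next__()`` is written by hand; the base class supplies the Python 2
spelling.
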
Boolean evaluation
~~~~~~~~~~~~~~~~~~

::

    class MyBoolean(object):

        def __bool__(self):
            return True                 # implement some logic here

        def __nonzero__(self):          # Python 2 compatibility
            return type(self).__bool__(self)

Division
~~~~~~~~

::

    class MyDivisible(object):

        def __truediv__(self, other):
            return self / other         # implement some logic here

        def __div__(self, other):       # Python 2 compatibility
            return type(self).__truediv__(self, other)

        def __itruediv__(self, other):
            return self // other        # implement some logic here

        def __idiv__(self, other):      # Python 2 compatibility
            return type(self).__itruediv__(self, other)

.. module: django.utils.six

Writing compatible code with six
--------------------------------

six_ is the canonical compatibility library for supporting Python 2 and 3 in
a single codebase. Read its documentation!

:mod:`six` is bundled with Django as of version 1.4.2. You can import it as
:mod:`django.utils.six`.

Here are the most common changes required to write compatible code.

.. _string-handling-with-six:

String handling
~~~~~~~~~~~~~~~

The ``basestring`` and ``unicode`` types were removed in Python 3, and the
meaning of ``str`` changed. To test these types, use the following idioms::

    isinstance(myvalue, six.string_types)       # replacement for basestring
    isinstance(myvalue, six.text_type)          # replacement for unicode
    isinstance(myvalue, bytes)                  # replacement for str

Python ≥ 2.6 provides ``bytes`` as an alias for ``str``, so you don't need
:data:`six.binary_type`.

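The sketch below shows these checks at work without importing six. The
``string_types`` and ``text_type`` names are defined locally as stand-ins for
the six attributes of the same name, and ``describe`` is a hypothetical
helper::

    import sys

    if sys.version_info[0] >= 3:
        string_types = (str,)
        text_type = str
    else:
        # These names only exist on Python 2; this branch never runs on 3.
        string_types = (basestring,)
        text_type = unicode


    def describe(value):
        """Hypothetical helper: classify a value as bytes, text or other."""
        # Test bytes first: on Python 2, bytes is str, which is a basestring.
        if isinstance(value, bytes):
            return "bytes"
        if isinstance(value, text_type):
            return "text"
        return "other"


    kinds = [describe(b"raw"), describe(b"raw".decode("ascii")), describe(42)]
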
``long``
~~~~~~~~

The ``long`` type no longer exists in Python 3. ``1L`` is a syntax error. Use
:data:`six.integer_types` to check if a value is an integer or a long::

    isinstance(myvalue, six.integer_types)      # replacement for (int, long)

``xrange``
~~~~~~~~~~

Import ``six.moves.xrange`` wherever you use ``xrange``.

Moved modules
~~~~~~~~~~~~~

Some modules were renamed in Python 3. The :mod:`django.utils.six.moves
<six.moves>` module provides a compatible location to import them.

The ``urllib``, ``urllib2`` and ``urlparse`` modules were reworked in depth
and :mod:`django.utils.six.moves <six.moves>` doesn't handle them. Django
explicitly tries both locations, as follows::

    try:
        from urllib.parse import urlparse, urlunparse
    except ImportError:     # Python 2
        from urlparse import urlparse, urlunparse

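Once the import succeeds, the calling code no longer needs to care which
branch was taken. A short usage sketch, with a made-up URL::

    try:
        from urllib.parse import urlparse, urlunparse
    except ImportError:     # Python 2
        from urlparse import urlparse, urlunparse

    parts = urlparse("http://example.com/path/?q=1")
    host = parts.netloc

    # The parsed result is a named tuple, so a modified copy can be rebuilt
    # into a URL string.
    secure_url = urlunparse(parts._replace(scheme="https"))
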
PY3
~~~

If you need different code in Python 2 and Python 3, check :data:`six.PY3`::

    from django.utils import six

    if six.PY3:
        # do stuff Python 3-wise
        pass
    else:
        # do stuff Python 2-wise
        pass

This is a last resort solution when :mod:`six` doesn't provide an appropriate
function.

.. module:: django.utils.six

Customizations of six
---------------------

The version of six bundled with Django includes a few extras.

.. function:: assertRaisesRegex(testcase, *args, **kwargs)

    This replaces ``testcase.assertRaisesRegexp`` on Python 2, and
    ``testcase.assertRaisesRegex`` on Python 3. ``assertRaisesRegexp`` still
    exists in current Python 3 versions, but issues a warning.

.. function:: assertRegex(testcase, *args, **kwargs)

    This replaces ``testcase.assertRegexpMatches`` on Python 2, and
    ``testcase.assertRegex`` on Python 3. ``assertRegexpMatches`` still
    exists in current Python 3 versions, but issues a warning.

In addition to six's default moves, Django's version provides ``thread`` as
``_thread`` and ``dummy_thread`` as ``_dummy_thread``.

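A wrapper like ``assertRaisesRegex`` can be sketched as a small dispatch
function. The ``assert_raises_regex`` helper below is hypothetical and shown
only to illustrate the idea; it is not the code bundled with Django::

    import unittest


    def assert_raises_regex(testcase, *args, **kwargs):
        """Hypothetical helper: call whichever method name is available."""
        method = getattr(testcase, "assertRaisesRegex", None)
        if method is None:
            # Older spelling, used when the new name is missing.
            method = testcase.assertRaisesRegexp
        return method(*args, **kwargs)


    class ParsingTests(unittest.TestCase):
        def test_bad_integer(self):
            with assert_raises_regex(self, ValueError, "invalid literal"):
                int("not a number")


    result = unittest.TestResult()
    ParsingTests("test_bad_integer").run(result)

The test body stays identical on both versions; only the helper knows about
the two method names.
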