===================
Porting to Python 3
===================

Django 1.5 is the first version of Django to support Python 3. The same code
runs both on Python 2 (≥ 2.6.5) and Python 3 (≥ 3.2), thanks to the six_
compatibility layer.

.. _six: https://pythonhosted.org/six/

This document is primarily targeted at authors of pluggable applications
who want to support both Python 2 and 3. It also describes guidelines that
apply to Django's code.

Philosophy
==========

This document assumes that you are familiar with the changes between Python 2
and Python 3. If you aren't, read :ref:`Python's official porting guide
<pyporting-howto>` first. Refreshing your knowledge of unicode handling on
Python 2 and 3 will help; the `Pragmatic Unicode`_ presentation is a good
resource.

Django uses the *Python 2/3 Compatible Source* strategy. Of course, you're
free to choose another strategy for your own code, especially if you don't need
to stay compatible with Python 2. But authors of pluggable applications are
encouraged to use the same porting strategy as Django itself.

Writing compatible code is much easier if you target Python ≥ 2.6. Django 1.5
introduces compatibility tools such as :mod:`django.utils.six`, which is a
customized version of the :mod:`six module <six>`. For convenience,
forwards-compatible aliases were introduced in Django 1.4.2. If your
application takes advantage of these tools, it will require Django ≥ 1.4.2.

Obviously, writing compatible source code adds some overhead, and that can
cause frustration. Django's developers have found that attempting to write
Python 3 code that's compatible with Python 2 is much more rewarding than the
opposite. Not only does that make your code more future-proof, but Python 3's
advantages (like the saner string handling) start shining quickly. Dealing
with Python 2 becomes a backwards compatibility requirement, and we as
developers are used to dealing with such constraints.

Porting tools provided by Django are inspired by this philosophy, and it's
reflected throughout this guide.

.. _Pragmatic Unicode: http://nedbatchelder.com/text/unipain.html

Porting tips
============

Unicode literals
----------------

This step consists of:

- Adding ``from __future__ import unicode_literals`` at the top of your Python
  modules -- it's best to put it in each and every module, otherwise you'll
  keep checking the top of your files to see which mode is in effect;
- Removing the ``u`` prefix before unicode strings;
- Adding a ``b`` prefix before bytestrings.

Performing these changes systematically guarantees backwards compatibility.

However, Django applications generally don't need bytestrings, since Django
only exposes unicode interfaces to the programmer. Python 3 discourages using
bytestrings, except for binary data or byte-oriented interfaces. Python 2
makes bytestrings and unicode strings effectively interchangeable, as long as
they only contain ASCII data. Take advantage of this to use unicode strings
wherever possible and avoid the ``b`` prefixes.

.. note::

    Python 2's ``u`` prefix is a syntax error in Python 3.2, but it is
    allowed again as of Python 3.3 thanks to :pep:`414`. Thus, this
    transformation is optional if you target Python ≥ 3.3. It's still
    recommended, per the "write Python 3 code" philosophy.

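As a rough illustration of the three steps above, the following sketch shows
how literals behave once ``unicode_literals`` is imported. The variable names
are only illustrative.

```python
from __future__ import unicode_literals

# Unprefixed literals are text on both Python 2 and Python 3.
text = "caf\u00e9"
# The b prefix marks a bytestring on both versions.
data = b"caf\xc3\xa9"

# Conversions between text and bytes are always explicit.
encoded = text.encode("utf-8")
decoded = data.decode("utf-8")
```
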
String handling
---------------

Python 2's `unicode`_ type was renamed :class:`str` in Python 3,
``str()`` was renamed :class:`bytes`, and `basestring`_ disappeared.
six_ provides :ref:`tools <string-handling-with-six>` to deal with these
changes.

Django also contains several string-related classes and functions in the
:mod:`django.utils.encoding` and :mod:`django.utils.safestring` modules. Their
names used the words ``str``, which doesn't mean the same thing in Python 2
and Python 3, and ``unicode``, which doesn't exist in Python 3. In order to
avoid ambiguity and confusion these concepts were renamed ``bytes`` and
``text``.

Here are the name changes in :mod:`django.utils.encoding`:

==================  ==================
Old name            New name
==================  ==================
``smart_str``       ``smart_bytes``
``smart_unicode``   ``smart_text``
``force_unicode``   ``force_text``
==================  ==================

For backwards compatibility, the old names still work on Python 2. Under
Python 3, ``smart_str`` is an alias for ``smart_text``.

For forwards compatibility, the new names work as of Django 1.4.2.

.. note::

    :mod:`django.utils.encoding` was deeply refactored in Django 1.5 to
    provide a more consistent API. Check its documentation for more
    information.

:mod:`django.utils.safestring` is mostly used via the
:func:`~django.utils.safestring.mark_safe` function, which didn't change. In
case you're using the internals, here are the name changes:

==================  ==================
Old name            New name
==================  ==================
``SafeString``      ``SafeBytes``
``SafeUnicode``     ``SafeText``
==================  ==================

For backwards compatibility, the old names still work on Python 2. On Python 3,
``SafeString`` is an alias for ``SafeText``.

For forwards compatibility, the new names work as of Django 1.4.2.

``__str__()`` and ``__unicode__()`` methods
-------------------------------------------

In Python 2, the object model specifies :meth:`~object.__str__` and
` __unicode__()`_ methods. If these methods exist, they must return
``str`` (bytes) and ``unicode`` (text) respectively.

The ``print`` statement and the :class:`str` built-in call
:meth:`~object.__str__` to determine the human-readable representation of an
object. The ``unicode`` built-in calls ` __unicode__()`_ if it
exists, and otherwise falls back to :meth:`~object.__str__` and decodes the
result with the system encoding. Conversely, the
:class:`~django.db.models.Model` base class automatically derives
:meth:`~object.__str__` from ` __unicode__()`_ by encoding to UTF-8.

In Python 3, there's simply :meth:`~object.__str__`, which must return ``str``
(text).

(It is also possible to define :meth:`~object.__bytes__`, but Django applications
have little use for that method, because they hardly ever deal with ``bytes``.)

Django provides a simple way to define :meth:`~object.__str__` and
` __unicode__()`_ methods that work on Python 2 and 3: define a
:meth:`~object.__str__` method that returns text and apply the
:func:`~django.utils.encoding.python_2_unicode_compatible` decorator.

On Python 3, the decorator is a no-op. On Python 2, it defines appropriate
` __unicode__()`_ and :meth:`~object.__str__` methods (replacing the
original :meth:`~object.__str__` method in the process). Here's an example::

    from __future__ import unicode_literals
    from django.utils.encoding import python_2_unicode_compatible

    @python_2_unicode_compatible
    class MyClass(object):
        def __str__(self):
            return "Instance of my class"

This technique is the best match for Django's porting philosophy.

For forwards compatibility, this decorator is available as of Django 1.4.2.

Finally, note that :meth:`~object.__repr__` must return a ``str`` on all
versions of Python.

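To make the decorator's effect more concrete, here is a simplified,
standard-library-only sketch of the behavior described above. It is not
Django's actual implementation, and ``unicode_compatible_sketch`` is a
hypothetical name used only for illustration.

```python
import sys

PY2 = sys.version_info[0] == 2


def unicode_compatible_sketch(klass):
    """Simplified stand-in for python_2_unicode_compatible (hypothetical)."""
    if PY2:
        # Keep the text-returning method under the name __unicode__ ...
        klass.__unicode__ = klass.__str__
        # ... and make __str__ return UTF-8 encoded bytes.
        klass.__str__ = lambda self: self.__unicode__().encode("utf-8")
    # On Python 3 the class is returned unchanged.
    return klass


@unicode_compatible_sketch
class MyClass(object):
    def __str__(self):
        return "Instance of my class"


label = str(MyClass())
```
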
:class:`dict` and :class:`dict`-like classes
--------------------------------------------

:meth:`dict.keys`, :meth:`dict.items` and :meth:`dict.values` return lists in
Python 2 and iterable view objects, not lists, in Python 3.
:class:`~django.http.QueryDict` and the :class:`dict`-like classes defined in
``django.utils.datastructures`` likewise return iterables instead of lists in
Python 3.

six_ provides compatibility functions to work around this change:
:func:`~six.iterkeys`, :func:`~six.iteritems`, and :func:`~six.itervalues`.
It also contains an undocumented ``iterlists`` function that works well for
``django.utils.datastructures.MultiValueDict`` and its subclasses.

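For readers who want to see what such a helper boils down to, here is a
minimal sketch written without six. ``iteritems_sketch`` is a hypothetical
name; in real code you would normally call ``six.iteritems`` instead.

```python
import sys


def iteritems_sketch(mapping):
    """Return an iterator over (key, value) pairs on Python 2 and 3."""
    if sys.version_info[0] >= 3:
        return iter(mapping.items())
    return mapping.iteritems()


counts = {"a": 1, "b": 2}
pairs = sorted(iteritems_sketch(counts))

# Wrap the result in list() when a real list is needed, e.g. for indexing.
keys = sorted(list(counts.keys()))
```
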
:class:`~django.http.HttpRequest` and :class:`~django.http.HttpResponse` objects
--------------------------------------------------------------------------------

According to :pep:`3333`:

- headers are always ``str`` objects,
- input and output streams are always ``bytes`` objects.

Specifically, :attr:`HttpResponse.content <django.http.HttpResponse.content>`
contains ``bytes``, which may become an issue if you compare it with a
``str`` in your tests. The preferred solution is to rely on
:meth:`~django.test.SimpleTestCase.assertContains` and
:meth:`~django.test.SimpleTestCase.assertNotContains`. These methods accept a
response and a unicode string as arguments.

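The pitfall can be reproduced without Django. In this sketch, ``content`` is
only a stand-in for ``response.content``, UTF-8 is an assumed encoding, and
the results described in the comments are those of Python 3.

```python
# Stand-in for response.content, which always holds bytes.
content = "caf\u00e9".encode("utf-8")

# On Python 3, bytes never compare equal to text.
naive_match = content == "caf\u00e9"

# Decoding first (UTF-8 assumed here) gives a meaningful comparison.
decoded_match = content.decode("utf-8") == "caf\u00e9"
```
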
Coding guidelines
=================

The following guidelines are enforced in Django's source code. They're also
recommended for third-party applications that follow the same porting strategy.

Syntax requirements
-------------------

Unicode
~~~~~~~

In Python 3, all strings are considered Unicode by default. The ``unicode``
type from Python 2 is called ``str`` in Python 3, and ``str`` becomes
``bytes``.

You mustn't use the ``u`` prefix before a unicode string literal because it's
a syntax error in Python 3.2. You must prefix byte strings with ``b``.

In order to enable the same behavior in Python 2, every module must import
``unicode_literals`` from ``__future__``::

    from __future__ import unicode_literals

    my_string = "This is a unicode literal"
    my_bytestring = b"This is a bytestring"

If you need a byte string literal under Python 2 and a unicode string literal
under Python 3, use the :class:`str` builtin::

    str('my string')

In Python 3, there aren't any automatic conversions between ``str`` and
``bytes``, and the :mod:`codecs` module became more strict. :meth:`str.encode`
always returns ``bytes``, and ``bytes.decode`` always returns ``str``. As a
consequence, the following pattern is sometimes necessary::

    value = value.encode('ascii', 'ignore').decode('ascii')

Be cautious if you have to `index bytestrings`_.

.. _index bytestrings: https://docs.python.org/3/howto/pyporting.html#text-versus-binary-data

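The following sketch illustrates both points: the encode/decode round trip
that drops non-ASCII characters, and the reason indexing bytestrings needs
care. The sample values are made up for the example.

```python
# Drop non-ASCII characters while keeping a text (not bytes) result.
value = "caf\u00e9 au lait"
ascii_value = value.encode("ascii", "ignore").decode("ascii")

# Indexing bytes yields an int on Python 3 (a one-character str on Python 2),
# whereas slicing yields a bytestring on both versions.
data = b"abc"
first_by_slice = data[0:1]
```

Slicing is therefore the safer way to extract a single byte in code that has
to run on both versions.
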
Exceptions
~~~~~~~~~~

When you capture exceptions, use the ``as`` keyword::

    try:
        ...
    except MyException as exc:
        ...

This older syntax was removed in Python 3::

    try:
        ...
    except MyException, exc:    # Don't do that!
        ...

The syntax to reraise an exception with a different traceback also changed.
Use :func:`six.reraise`.

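To show what reraising with a given traceback involves, here is a sketch
restricted to Python 3 syntax. ``reraise_py3`` and ``lookup`` are hypothetical
names; code that must also run on Python 2 should call ``six.reraise``
instead.

```python
import sys


def reraise_py3(exc_type, value, tb=None):
    """Python 3-only sketch: raise ``value`` with traceback ``tb``."""
    if value is None:
        value = exc_type()
    if value.__traceback__ is not tb:
        raise value.with_traceback(tb)
    raise value


def lookup(mapping, key):
    try:
        return mapping[key]
    except KeyError:
        # Keep the original traceback but raise a different exception type.
        tb = sys.exc_info()[2]
        reraise_py3(ValueError, ValueError("unknown key: %r" % (key,)), tb)
```
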
Magic methods
-------------

Use the patterns below to handle magic methods renamed in Python 3.

Iterators
~~~~~~~~~

::

    class MyIterator(six.Iterator):
        def __iter__(self):
            return self             # implement some logic here

        def __next__(self):
            raise StopIteration     # implement some logic here

Boolean evaluation
~~~~~~~~~~~~~~~~~~

::

    class MyBoolean(object):

        def __bool__(self):
            return True             # implement some logic here

        def __nonzero__(self):      # Python 2 compatibility
            return type(self).__bool__(self)

Division
~~~~~~~~

::

    class MyDivisible(object):

        def __truediv__(self, other):
            return NotImplemented   # implement some logic here

        def __div__(self, other):   # Python 2 compatibility
            return type(self).__truediv__(self, other)

        def __itruediv__(self, other):
            return NotImplemented   # implement some logic here

        def __idiv__(self, other):  # Python 2 compatibility
            return type(self).__itruediv__(self, other)

Special methods are looked up on the class and not on the instance to reflect
the behavior of the Python interpreter.

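The class-level lookup can be observed directly. In this sketch, ``Flag`` is a
hypothetical class following the boolean evaluation pattern above.

```python
class Flag(object):
    def __init__(self, value):
        self.value = value

    def __bool__(self):
        return bool(self.value)

    def __nonzero__(self):  # Python 2 compatibility
        return type(self).__bool__(self)


flag = Flag(0)
# An attribute set on the instance does not affect bool(): the interpreter
# looks the special method up on the class.
flag.__bool__ = lambda: True
result = bool(flag)
```

Here ``result`` is ``False``, because the method defined on the class is the
one that gets called. Going through ``type(self)`` in the compatibility
method keeps that behavior consistent.
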
.. module: django.utils.six

Writing compatible code with six
--------------------------------

six_ is the canonical compatibility library for supporting Python 2 and 3 in
a single codebase. Read its documentation!

A :mod:`customized version of six <django.utils.six>` is bundled with Django
as of version 1.4.2. You can import it as ``django.utils.six``.

Here are the most common changes required to write compatible code.

.. _string-handling-with-six:

String handling
~~~~~~~~~~~~~~~

The ``basestring`` and ``unicode`` types were removed in Python 3, and the
meaning of ``str`` changed. To test these types, use the following idioms::

    isinstance(myvalue, six.string_types)       # replacement for basestring
    isinstance(myvalue, six.text_type)          # replacement for unicode
    isinstance(myvalue, bytes)                  # replacement for str

Python ≥ 2.6 provides ``bytes`` as an alias for ``str``, so you don't need
:data:`six.binary_type`.

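If six is not at hand, equivalent aliases can be sketched with the standard
library alone. The module-level names below mirror the six attributes
mentioned above but are defined locally, and ``describe`` is a hypothetical
helper.

```python
import sys

if sys.version_info[0] == 2:
    string_types = (basestring,)  # Python 2 only
    text_type = unicode           # Python 2 only
else:
    string_types = (str,)
    text_type = str


def describe(value):
    """Classify a value as text, bytes, or something else."""
    if isinstance(value, text_type):
        return "text"
    if isinstance(value, bytes):
        return "bytes"
    return "other"
```
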
``long``
~~~~~~~~

The ``long`` type no longer exists in Python 3. ``1L`` is a syntax error. Use
:data:`six.integer_types` to check if a value is an integer or a long::

    isinstance(myvalue, six.integer_types)      # replacement for (int, long)

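The tuple behind this check is small. A standard-library sketch of it could
look like this, where ``integer_types`` is defined locally and ``is_integer``
is a hypothetical helper.

```python
import sys

# ``long`` is only evaluated on Python 2, where it exists.
integer_types = (int,) if sys.version_info[0] >= 3 else (int, long)


def is_integer(value):
    return isinstance(value, integer_types)
```
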
``xrange``
~~~~~~~~~~

If you use ``xrange`` on Python 2, import ``six.moves.range`` and use that
instead. You can also import ``six.moves.xrange`` (it's equivalent to
``six.moves.range``) but the first technique allows you to simply drop the
import when dropping support for Python 2.

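The idea behind that import can be sketched without six. ``lazy_range`` is a
hypothetical name chosen to avoid shadowing the builtin.

```python
try:
    lazy_range = xrange  # Python 2: the lazy variant
except NameError:
    lazy_range = range   # Python 3: range is already lazy

total = sum(lazy_range(5))
```
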
Moved modules
~~~~~~~~~~~~~

Some modules were renamed in Python 3. The ``django.utils.six.moves``
module (based on the :mod:`six.moves module <six.moves>`) provides a
compatible location to import them.

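Without such a compatible location, each renamed module needs its own import
fallback. The sketch below uses the standard library's URL parsing module as
an example of a rename; it is the kind of boilerplate that ``six.moves`` is
meant to replace.

```python
try:
    from urllib.parse import urlparse  # Python 3 location
except ImportError:
    from urlparse import urlparse      # Python 2 location

parts = urlparse("https://example.com/path?q=1")
```
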
``PY2``
~~~~~~~

If you need different code in Python 2 and Python 3, check :data:`six.PY2`::

    if six.PY2:
        # compatibility code for Python 2
        pass

This is a last resort solution when :mod:`six` doesn't provide an appropriate
function.

.. module:: django.utils.six

Django customized version of ``six``
------------------------------------

The version of six bundled with Django (``django.utils.six``) includes a few
customizations for internal use only.

.. _unicode: https://docs.python.org/2/library/functions.html#unicode
.. _ __unicode__(): https://docs.python.org/2/reference/datamodel.html#object.__unicode__
.. _basestring: https://docs.python.org/2/library/functions.html#basestring