============
File Uploads
============

.. currentmodule:: django.core.files

When Django handles a file upload, the file data ends up placed in
:attr:`request.FILES <django.http.HttpRequest.FILES>` (for more on the
``request`` object see the documentation for :doc:`request and response objects
</ref/request-response>`). This document explains how files are stored on disk
and in memory, and how to customize the default behavior.

Basic file uploads
==================

Consider a simple form containing a :class:`~django.forms.FileField`::

    from django import forms

    class UploadFileForm(forms.Form):
        title = forms.CharField(max_length=50)
        file = forms.FileField()

A view handling this form will receive the file data in
:attr:`request.FILES <django.http.HttpRequest.FILES>`, which is a dictionary
containing a key for each :class:`~django.forms.FileField` (or
:class:`~django.forms.ImageField`, or other :class:`~django.forms.FileField`
subclass) in the form. So the data from the above form would
be accessible as ``request.FILES['file']``.

Note that :attr:`request.FILES <django.http.HttpRequest.FILES>` will only
contain data if the request method was ``POST`` and the ``<form>`` that posted
the request has the attribute ``enctype="multipart/form-data"``. Otherwise,
``request.FILES`` will be empty.

Most of the time, you'll simply pass the file data from ``request`` into the
form as described in :ref:`binding-uploaded-files`. This would look
something like::

    from django.http import HttpResponseRedirect
    from django.shortcuts import render_to_response

    # Imaginary function to handle an uploaded file.
    from somewhere import handle_uploaded_file

    def upload_file(request):
        if request.method == 'POST':
            form = UploadFileForm(request.POST, request.FILES)
            if form.is_valid():
                handle_uploaded_file(request.FILES['file'])
                return HttpResponseRedirect('/success/url/')
        else:
            form = UploadFileForm()
        return render_to_response('upload.html', {'form': form})

Notice that we have to pass :attr:`request.FILES <django.http.HttpRequest.FILES>`
into the form's constructor; this is how file data gets bound into a form.

Handling uploaded files
-----------------------

The final piece of the puzzle is handling the actual file data from
:attr:`request.FILES <django.http.HttpRequest.FILES>`. Each entry in this
dictionary is an ``UploadedFile`` object -- a simple wrapper around an uploaded
file. You'll usually use one of these methods to access the uploaded content:

    ``UploadedFile.read()``
        Read the entire uploaded data from the file. Be careful with this
        method: if the uploaded file is huge it can overwhelm your system if you
        try to read it into memory. You'll probably want to use ``chunks()``
        instead; see below.

    ``UploadedFile.multiple_chunks()``
        Returns ``True`` if the uploaded file is big enough to require
        reading in multiple chunks. By default this will be any file
        larger than 2.5 megabytes, but that's configurable; see below.

    ``UploadedFile.chunks()``
        A generator returning chunks of the file. If ``multiple_chunks()`` is
        ``True``, you should use this method in a loop instead of ``read()``.

        In practice, it's often easiest simply to use ``chunks()`` all the time;
        see the example below.

    ``UploadedFile.name``
        The name of the uploaded file (e.g. ``my_file.txt``).

    ``UploadedFile.size``
        The size, in bytes, of the uploaded file.

There are a few other methods and attributes available on ``UploadedFile``
objects; see `UploadedFile objects`_ for a complete reference.

Putting it all together, here's a common way you might handle an uploaded file::

    def handle_uploaded_file(f):
        # The ``with`` block closes the destination file even if a write fails.
        with open('some/file/name.txt', 'wb+') as destination:
            for chunk in f.chunks():
                destination.write(chunk)

Looping over ``UploadedFile.chunks()`` instead of using ``read()`` ensures that
large files don't overwhelm your system's memory.
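
To make the memory argument concrete, here is a minimal stand-alone sketch of
the idea behind chunked reading. It is not Django's implementation: the
``iter_chunks`` helper and its ``chunk_size`` default are hypothetical names
used only for illustration, and an in-memory ``io.BytesIO`` object stands in
for an uploaded file.

.. code-block:: python

    import io

    def iter_chunks(fileobj, chunk_size=64 * 1024):
        """Yield successive pieces of at most ``chunk_size`` bytes."""
        while True:
            data = fileobj.read(chunk_size)
            if not data:
                break
            yield data

    # Stand-in for an upload: 150 KB of data read in 64 KB pieces.
    fake_upload = io.BytesIO(b"x" * (150 * 1024))
    sizes = [len(chunk) for chunk in iter_chunks(fake_upload)]
    print(sizes)

The code consuming the generator only ever holds one chunk at a time, which is
the property that looping over ``chunks()`` relies on.
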

Where uploaded data is stored
-----------------------------

Before you save uploaded files, the data needs to be stored somewhere.

By default, if an uploaded file is smaller than 2.5 megabytes, Django will hold
the entire contents of the upload in memory. This means that saving the file
involves only a read from memory and a write to disk and thus is very fast.

However, if an uploaded file is too large, Django will write the uploaded file
to a temporary file stored in your system's temporary directory. On a Unix-like
platform this means you can expect Django to generate a file called something
like ``/tmp/tmpzfp6I6.upload``. If an upload is large enough, you can watch this
file grow in size as Django streams the data onto disk.

These specifics -- 2.5 megabytes; ``/tmp``; etc. -- are simply "reasonable
defaults". Read on for details on how you can customize or completely replace
upload behavior.

Changing upload handler behavior
--------------------------------

Four settings control Django's file upload behavior:

    :setting:`FILE_UPLOAD_MAX_MEMORY_SIZE`
        The maximum size, in bytes, for files that will be uploaded into memory.
        Files larger than :setting:`FILE_UPLOAD_MAX_MEMORY_SIZE` will be
        streamed to disk.

        Defaults to 2.5 megabytes.

    :setting:`FILE_UPLOAD_TEMP_DIR`
        The directory where uploaded files larger than
        :setting:`FILE_UPLOAD_MAX_MEMORY_SIZE` will be stored.

        Defaults to your system's standard temporary directory (e.g. ``/tmp`` on
        most Unix-like systems).

    :setting:`FILE_UPLOAD_PERMISSIONS`
        The numeric mode (e.g. ``0644``) to set newly uploaded files to. For
        more information about what these modes mean, see the `documentation for
        os.chmod`_.

        If this isn't given or is ``None``, you'll get operating-system
        dependent behavior. On most platforms, temporary files will have a mode
        of ``0600``, and files saved from memory will be saved using the
        system's standard umask.

        .. warning::

            If you're not familiar with file modes, please note that the leading
            ``0`` is very important: it indicates an octal number, which is the
            way that modes must be specified. If you try to use ``644``, you'll
            get totally incorrect behavior.

            **Always prefix the mode with a 0.**

    :setting:`FILE_UPLOAD_HANDLERS`
        The actual handlers for uploaded files. Changing this setting allows
        complete customization -- even replacement -- of Django's upload
        process. See `upload handlers`_, below, for details.

        Defaults to::

            ("django.core.files.uploadhandler.MemoryFileUploadHandler",
             "django.core.files.uploadhandler.TemporaryFileUploadHandler",)

        Which means "try to upload to memory first, then fall back to temporary
        files."

.. _documentation for os.chmod: http://docs.python.org/library/os.html#os.chmod
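
As a rough illustration, the size, directory, and permission settings might be
combined in a project's settings module as follows. The values and the
``/var/tmp/uploads`` path are assumptions chosen for the example, not
recommendations. Note that Python 3 writes octal literals with a ``0o`` prefix
(``0o644``); the bare ``0644`` spelling is Python 2 syntax.

.. code-block:: python

    # Hold uploads of up to 5 MB in memory (the default is 2.5 MB).
    FILE_UPLOAD_MAX_MEMORY_SIZE = 5 * 1024 * 1024

    # Hypothetical directory for larger uploads that are streamed to disk.
    FILE_UPLOAD_TEMP_DIR = "/var/tmp/uploads"

    # Octal mode: owner read/write, group and others read-only.
    FILE_UPLOAD_PERMISSIONS = 0o644

    # The decimal number 644 is a different mode altogether.
    print(oct(FILE_UPLOAD_PERMISSIONS), oct(644))

The final line shows why the octal prefix matters: the decimal value ``644``
corresponds to a different set of permission bits than the octal mode.
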

``UploadedFile`` objects
========================

.. class:: UploadedFile

In addition to those inherited from :class:`File`, all ``UploadedFile`` objects
define the following methods/attributes:

    ``UploadedFile.content_type``
        The content-type header uploaded with the file (e.g. ``text/plain`` or
        ``application/pdf``). Like any data supplied by the user, you shouldn't
        trust that the uploaded file is actually this type. You'll still need to
        validate that the file contains the content that the content-type header
        claims -- "trust but verify."

    ``UploadedFile.charset``
        For ``text/*`` content-types, the character set (e.g. ``utf8``) supplied
        by the browser. Again, "trust but verify" is the best policy here.

    ``UploadedFile.temporary_file_path()``
        Only files uploaded onto disk will have this method; it returns the full
        path to the temporary uploaded file.

.. note::

    Like regular Python files, you can read the file line-by-line simply by
    iterating over the uploaded file:

    .. code-block:: python

        for line in uploadedfile:
            do_something_with(line)

    However, *unlike* standard Python files, :class:`UploadedFile` only
    understands ``\n`` (also known as "Unix-style") line endings. If you know
    that you need to handle uploaded files with different line endings, you'll
    need to do so in your view.
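
If you do need to cope with other line endings, one option is to normalize
them before splitting the content into lines. The sketch below is a minimal
example that assumes the content is small enough to hold in memory;
``normalize_newlines`` is a hypothetical helper, not part of Django.

.. code-block:: python

    def normalize_newlines(data):
        # Convert Windows (CR LF) and old Mac (CR) line endings to LF.
        # The two-character CR LF sequence must be replaced first.
        return data.replace(b"\r\n", b"\n").replace(b"\r", b"\n")

    # In a view, ``raw`` might come from reading a small uploaded file.
    raw = b"first\r\nsecond\rthird\n"
    lines = normalize_newlines(raw).splitlines()
    print(lines)

For large uploads you would want to apply the same idea chunk by chunk, taking
care with a CR LF pair that happens to be split across two chunks.
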

Upload Handlers
===============

When a user uploads a file, Django passes off the file data to an *upload
handler* -- a small class that handles file data as it gets uploaded. Upload
handlers are initially defined in the ``FILE_UPLOAD_HANDLERS`` setting, which
defaults to::

    ("django.core.files.uploadhandler.MemoryFileUploadHandler",
     "django.core.files.uploadhandler.TemporaryFileUploadHandler",)

Together the ``MemoryFileUploadHandler`` and ``TemporaryFileUploadHandler``
provide Django's default file upload behavior of reading small files into memory
and large ones onto disk.

You can write custom handlers that customize how Django handles files. You
could, for example, use custom handlers to enforce user-level quotas, compress
data on the fly, render progress bars, and even send data to another storage
location directly without storing it locally.

Modifying upload handlers on the fly
------------------------------------

Sometimes particular views require different upload behavior. In these cases,
you can override upload handlers on a per-request basis by modifying
``request.upload_handlers``. By default, this list will contain the upload
handlers given by ``FILE_UPLOAD_HANDLERS``, but you can modify the list as you
would any other list.

For instance, suppose you've written a ``ProgressBarUploadHandler`` that
provides feedback on upload progress to some sort of AJAX widget. You'd add this
handler to your upload handlers like this::

    request.upload_handlers.insert(0, ProgressBarUploadHandler())

You'd probably want to use ``list.insert()`` in this case (instead of
``append()``) because a progress bar handler would need to run *before* any
other handlers. Remember, the upload handlers are processed in order.

If you want to replace the upload handlers completely, you can just assign a new
list::

    request.upload_handlers = [ProgressBarUploadHandler()]

.. note::

    You can only modify upload handlers *before* accessing
    ``request.POST`` or ``request.FILES`` -- it doesn't make sense to
    change upload handlers after upload handling has already
    started. If you try to modify ``request.upload_handlers`` after
    reading from ``request.POST`` or ``request.FILES``, Django will
    raise an error.

    Thus, you should always modify upload handlers as early in your view as
    possible.

    Also, ``request.POST`` is accessed by
    :class:`~django.middleware.csrf.CsrfViewMiddleware`, which is enabled by
    default. This means you will probably need to use
    :func:`~django.views.decorators.csrf.csrf_exempt` on your view to allow you
    to change the upload handlers. Assuming you do need CSRF protection, you
    will then need to use :func:`~django.views.decorators.csrf.csrf_protect` on
    the function that actually processes the request. Note that this means that
    the handlers may start receiving the file upload before the CSRF checks have
    been done. Example code:

    .. code-block:: python

        from django.views.decorators.csrf import csrf_exempt, csrf_protect

        @csrf_exempt
        def upload_file_view(request):
            request.upload_handlers.insert(0, ProgressBarUploadHandler())
            return _upload_file_view(request)

        @csrf_protect
        def _upload_file_view(request):
            ... # Process request

Writing custom upload handlers
------------------------------

All file upload handlers should be subclasses of
``django.core.files.uploadhandler.FileUploadHandler``. You can define upload
handlers wherever you wish.

Required methods
~~~~~~~~~~~~~~~~

Custom file upload handlers **must** define the following methods:

    ``FileUploadHandler.receive_data_chunk(self, raw_data, start)``
        Receives a "chunk" of data from the file upload.

        ``raw_data`` is a byte string containing the uploaded data.

        ``start`` is the position in the file where this ``raw_data`` chunk
        begins.

        The data you return will get fed into the subsequent upload handlers'
        ``receive_data_chunk`` methods. In this way, one handler can be a
        "filter" for other handlers.

        Return ``None`` from ``receive_data_chunk`` to short-circuit remaining
        upload handlers from getting this chunk. This is useful if you're
        storing the uploaded data yourself and don't want future handlers to
        store a copy of the data.

        If you raise a ``StopUpload`` or a ``SkipFile`` exception, the upload
        will abort or the file will be completely skipped, respectively.

    ``FileUploadHandler.file_complete(self, file_size)``
        Called when a file has finished uploading.

        The handler should return an ``UploadedFile`` object that will be stored
        in ``request.FILES``. Handlers may also return ``None`` to indicate that
        the ``UploadedFile`` object should come from subsequent upload handlers.
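
The way return values flow between handlers can be pictured with a simplified,
stand-alone model. The classes and the ``feed_chunk`` function below are
hypothetical stand-ins written for this illustration; they do not subclass
``FileUploadHandler`` and are not Django's actual dispatch code.

.. code-block:: python

    class UppercaseFilter:
        """Transforms each chunk, then passes it on to later handlers."""
        def receive_data_chunk(self, raw_data, start):
            return raw_data.upper()

    class Collector:
        """Stores each chunk itself and returns None to stop the chain."""
        def __init__(self):
            self.parts = []

        def receive_data_chunk(self, raw_data, start):
            self.parts.append(raw_data)
            return None

    def feed_chunk(handlers, chunk, start):
        """Offer one chunk to each handler in order until one returns None."""
        for handler in handlers:
            chunk = handler.receive_data_chunk(chunk, start)
            if chunk is None:
                break

    first, second = Collector(), Collector()
    handlers = [UppercaseFilter(), first, second]
    feed_chunk(handlers, b"hello ", 0)
    feed_chunk(handlers, b"world", 6)
    print(b"".join(first.parts), second.parts)

In this model the first collector sees the filtered (upper-cased) data, while
the second collector never receives anything because the first one returned
``None``.
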

Optional methods
~~~~~~~~~~~~~~~~

Custom upload handlers may also define any of the following optional methods or
attributes:

    ``FileUploadHandler.chunk_size``
        Size, in bytes, of the "chunks" Django should store into memory and feed
        into the handler. That is, this attribute controls the size of chunks
        fed into ``FileUploadHandler.receive_data_chunk``.

        For maximum performance the chunk sizes should be divisible by ``4`` and
        should not exceed 2 GB (2\ :sup:`31` bytes) in size. When there are
        multiple chunk sizes provided by multiple handlers, Django will use the
        smallest chunk size defined by any handler.

        The default is 64*2\ :sup:`10` bytes, or 64 KB.

    ``FileUploadHandler.new_file(self, field_name, file_name, content_type, content_length, charset)``
        Callback signaling that a new file upload is starting. This is called
        before any data has been fed to any upload handlers.

        ``field_name`` is a string name of the file ``<input>`` field.

        ``file_name`` is the unicode filename that was provided by the browser.

        ``content_type`` is the MIME type provided by the browser, e.g.
        ``'image/jpeg'``.

        ``content_length`` is the length of the file given by the browser.
        Sometimes this won't be provided and will be ``None``.

        ``charset`` is the character set (e.g. ``utf8``) given by the browser.
        Like ``content_length``, this sometimes won't be provided.

        This method may raise a ``StopFutureHandlers`` exception to prevent
        future handlers from handling this file.

    ``FileUploadHandler.upload_complete(self)``
        Callback signaling that the entire upload (all files) has completed.

    ``FileUploadHandler.handle_raw_input(self, input_data, META, content_length, boundary, encoding)``
        Allows the handler to completely override the parsing of the raw
        HTTP input.

        ``input_data`` is a file-like object that supports ``read()``-ing.

        ``META`` is the same object as ``request.META``.

        ``content_length`` is the length of the data in ``input_data``. Don't
        read more than ``content_length`` bytes from ``input_data``.

        ``boundary`` is the MIME boundary for this request.

        ``encoding`` is the encoding of the request.

        Return ``None`` if you want upload handling to continue, or a tuple of
        ``(POST, FILES)`` if you want to return the new data structures suitable
        for the request directly.
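
The rule that the smallest ``chunk_size`` wins can be sketched in a few lines.
This is a simplified model under stated assumptions: the ``FakeHandler`` class
and the ``effective_chunk_size`` function are hypothetical, and Django's real
selection logic may differ in detail.

.. code-block:: python

    DEFAULT_CHUNK_SIZE = 64 * 2 ** 10  # 64 KB, the documented default

    class FakeHandler:
        """Stand-in that carries only a chunk_size attribute."""
        def __init__(self, chunk_size=DEFAULT_CHUNK_SIZE):
            self.chunk_size = chunk_size

    def effective_chunk_size(handlers):
        """Return the smallest chunk size requested by any handler."""
        return min(handler.chunk_size for handler in handlers)

    handlers = [FakeHandler(), FakeHandler(chunk_size=16 * 2 ** 10)]
    print(effective_chunk_size(handlers))

Here one handler keeps the default and another asks for 16 KB chunks, so every
handler in the list would be fed 16 KB chunks.
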