  1. ============
  2. File Uploads
  3. ============
  4. .. currentmodule:: django.core.files.uploadedfile
When Django handles a file upload, the file data is placed in
  6. :attr:`request.FILES <django.http.HttpRequest.FILES>` (for more on the
  7. ``request`` object see the documentation for :doc:`request and response objects
  8. </ref/request-response>`). This document explains how files are stored on disk
  9. and in memory, and how to customize the default behavior.
  10. Basic file uploads
  11. ==================
  12. Consider a simple form containing a :class:`~django.forms.FileField`::

    from django import forms

    class UploadFileForm(forms.Form):
        title = forms.CharField(max_length=50)
        file = forms.FileField()

  17. A view handling this form will receive the file data in
  18. :attr:`request.FILES <django.http.HttpRequest.FILES>`, which is a dictionary
  19. containing a key for each :class:`~django.forms.FileField` (or
  20. :class:`~django.forms.ImageField`, or other :class:`~django.forms.FileField`
  21. subclass) in the form. So the data from the above form would
  22. be accessible as ``request.FILES['file']``.
  23. Note that :attr:`request.FILES <django.http.HttpRequest.FILES>` will only
  24. contain data if the request method was ``POST`` and the ``<form>`` that posted
  25. the request has the attribute ``enctype="multipart/form-data"``. Otherwise,
  26. ``request.FILES`` will be empty.
  27. Most of the time, you'll simply pass the file data from ``request`` into the
  28. form as described in :ref:`binding-uploaded-files`. This would look
  29. something like::

    from django.http import HttpResponseRedirect
    from django.shortcuts import render_to_response

    # Imaginary function to handle an uploaded file.
    from somewhere import handle_uploaded_file

    def upload_file(request):
        if request.method == 'POST':
            form = UploadFileForm(request.POST, request.FILES)
            if form.is_valid():
                handle_uploaded_file(request.FILES['file'])
                return HttpResponseRedirect('/success/url/')
        else:
            form = UploadFileForm()
        return render_to_response('upload.html', {'form': form})

  43. Notice that we have to pass :attr:`request.FILES <django.http.HttpRequest.FILES>`
  44. into the form's constructor; this is how file data gets bound into a form.
  45. Handling uploaded files
  46. -----------------------
  47. .. class:: UploadedFile
  48. The final piece of the puzzle is handling the actual file data from
  49. :attr:`request.FILES <django.http.HttpRequest.FILES>`. Each entry in this
  50. dictionary is an ``UploadedFile`` object -- a simple wrapper around an uploaded
  51. file. You'll usually use one of these methods to access the uploaded content:
  52. .. method:: read()
  53. Read the entire uploaded data from the file. Be careful with this
  54. method: if the uploaded file is huge it can overwhelm your system if you
  55. try to read it into memory. You'll probably want to use ``chunks()``
  56. instead; see below.
  57. .. method:: multiple_chunks()
  58. Returns ``True`` if the uploaded file is big enough to require
  59. reading in multiple chunks. By default this will be any file
  60. larger than 2.5 megabytes, but that's configurable; see below.
  61. .. method:: chunks()
  62. A generator returning chunks of the file. If ``multiple_chunks()`` is
  63. ``True``, you should use this method in a loop instead of ``read()``.
  64. In practice, it's often easiest simply to use ``chunks()`` all the time;
  65. see the example below.
  66. .. attribute:: name
  67. The name of the uploaded file (e.g. ``my_file.txt``).
  68. .. attribute:: size
  69. The size, in bytes, of the uploaded file.
  70. There are a few other methods and attributes available on ``UploadedFile``
  71. objects; see `UploadedFile objects`_ for a complete reference.
  72. Putting it all together, here's a common way you might handle an uploaded file::

    def handle_uploaded_file(f):
        # The with block closes the destination file even if a write fails.
        with open('some/file/name.txt', 'wb+') as destination:
            for chunk in f.chunks():
                destination.write(chunk)

  78. Looping over ``UploadedFile.chunks()`` instead of using ``read()`` ensures that
  79. large files don't overwhelm your system's memory.
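
The same loop can do more than copy bytes. As a minimal sketch, assuming only
that ``f`` exposes the ``chunks()`` method described above, the hypothetical
helper ``save_upload`` below (it is not part of Django) writes the upload to
disk while computing its size and SHA-256 digest in a single pass, so the file
is never held in memory as a whole:

```python
import hashlib


def save_upload(f, destination_path):
    """Write an uploaded file to disk chunk by chunk.

    ``f`` is any object with a ``chunks()`` method, such as an
    ``UploadedFile``. Returns ``(bytes_written, sha256_hexdigest)``.
    """
    digest = hashlib.sha256()
    bytes_written = 0
    with open(destination_path, 'wb') as destination:
        for chunk in f.chunks():
            destination.write(chunk)
            digest.update(chunk)
            bytes_written += len(chunk)
    return bytes_written, digest.hexdigest()
```

In a view you might call this as ``save_upload(request.FILES['file'], path)``,
where ``path`` is a destination you have chosen yourself.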
  80. Where uploaded data is stored
  81. -----------------------------
  82. Before you save uploaded files, the data needs to be stored somewhere.
  83. By default, if an uploaded file is smaller than 2.5 megabytes, Django will hold
  84. the entire contents of the upload in memory. This means that saving the file
  85. involves only a read from memory and a write to disk and thus is very fast.
However, if an uploaded file is larger than that, Django will write the uploaded file
  87. to a temporary file stored in your system's temporary directory. On a Unix-like
  88. platform this means you can expect Django to generate a file called something
  89. like ``/tmp/tmpzfp6I6.upload``. If an upload is large enough, you can watch this
  90. file grow in size as Django streams the data onto disk.
  91. These specifics -- 2.5 megabytes; ``/tmp``; etc. -- are simply "reasonable
  92. defaults". Read on for details on how you can customize or completely replace
  93. upload behavior.
  94. Changing upload handler behavior
  95. --------------------------------
Four settings control Django's file upload behavior:
  97. :setting:`FILE_UPLOAD_MAX_MEMORY_SIZE`
  98. The maximum size, in bytes, for files that will be uploaded into memory.
  99. Files larger than :setting:`FILE_UPLOAD_MAX_MEMORY_SIZE` will be
  100. streamed to disk.
  101. Defaults to 2.5 megabytes.
  102. :setting:`FILE_UPLOAD_TEMP_DIR`
  103. The directory where uploaded files larger than
  104. :setting:`FILE_UPLOAD_MAX_MEMORY_SIZE` will be stored.
    Defaults to your system's standard temporary directory (e.g. ``/tmp`` on
    most Unix-like systems).
  107. :setting:`FILE_UPLOAD_PERMISSIONS`
    The numeric mode (e.g. ``0644``) to set newly uploaded files to. For
    more information about what these modes mean, see the documentation for
    :func:`os.chmod`.
  111. If this isn't given or is ``None``, you'll get operating-system
  112. dependent behavior. On most platforms, temporary files will have a mode
  113. of ``0600``, and files saved from memory will be saved using the
  114. system's standard umask.
  115. .. warning::
  116. If you're not familiar with file modes, please note that the leading
  117. ``0`` is very important: it indicates an octal number, which is the
  118. way that modes must be specified. If you try to use ``644``, you'll
  119. get totally incorrect behavior.
  120. **Always prefix the mode with a 0.**
  121. :setting:`FILE_UPLOAD_HANDLERS`
  122. The actual handlers for uploaded files. Changing this setting allows
  123. complete customization -- even replacement -- of Django's upload
  124. process. See `upload handlers`_, below, for details.
    Defaults to::

        ("django.core.files.uploadhandler.MemoryFileUploadHandler",
         "django.core.files.uploadhandler.TemporaryFileUploadHandler",)

    This means "try to upload to memory first, then fall back to temporary
    files."
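
As an illustration of the settings above, a project might override them as
follows. This is only a sketch: the values and the directory name are
hypothetical, not recommendations. The mode is written with the ``0o`` prefix,
which Python 2.6 and later accept as an equivalent spelling of the octal
literal ``0644``:

```python
# settings.py (illustrative values, not recommendations)

# Keep uploads of up to 5 MB in memory; larger ones are streamed to disk.
FILE_UPLOAD_MAX_MEMORY_SIZE = 5 * 1024 * 1024

# Hypothetical path; adjust to a directory the server process can write to.
FILE_UPLOAD_TEMP_DIR = '/var/tmp/django-uploads'

# Octal mode rw-r--r--. The decimal number 644 would instead be read as
# octal 1204, which is a very different set of permissions.
FILE_UPLOAD_PERMISSIONS = 0o644
```

``FILE_UPLOAD_HANDLERS`` is left at its default here, so the "memory first,
then temporary file" behavior is unchanged; only the threshold, the temporary
directory, and the final file mode differ.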
  130. ``UploadedFile`` objects
  131. ========================
  132. In addition to those inherited from :class:`File`, all ``UploadedFile`` objects
  133. define the following methods/attributes:
  134. .. attribute:: UploadedFile.content_type
  135. The content-type header uploaded with the file (e.g. :mimetype:`text/plain`
  136. or :mimetype:`application/pdf`). Like any data supplied by the user, you
  137. shouldn't trust that the uploaded file is actually this type. You'll still
  138. need to validate that the file contains the content that the content-type
  139. header claims -- "trust but verify."
  140. .. attribute:: UploadedFile.charset
    For :mimetype:`text/*` content-types, the character set (e.g. ``utf8``)
    supplied by the browser. Again, "trust but verify" is the best policy here.
.. method:: UploadedFile.temporary_file_path()

    Only files uploaded onto disk will have this method; it returns the full
    path to the temporary uploaded file.

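
Because only disk-backed uploads have ``temporary_file_path()``, code that
wants to treat the two cases differently can test for the method. A minimal
sketch, in which ``upload_location`` is a hypothetical helper name rather than
anything Django provides:

```python
def upload_location(f):
    """Return the temporary path of an upload stored on disk, or ``None``
    if the object has no ``temporary_file_path`` method (held in memory)."""
    get_path = getattr(f, 'temporary_file_path', None)
    if get_path is None:
        return None
    return get_path()
```

A view could use the returned path to move a large upload into place instead
of copying it chunk by chunk, and fall back to ``chunks()`` when the result is
``None``.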
  146. .. note::
    Like regular Python files, you can read the file line-by-line simply by
    iterating over the uploaded file:

    .. code-block:: python

        for line in uploadedfile:
            do_something_with(line)

    However, *unlike* standard Python files, :class:`UploadedFile` only
    understands ``\n`` (also known as "Unix-style") line endings. If you know
    that you need to handle uploaded files with different line endings, you'll
    need to do so in your view.

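
One way to cope with other line endings in a view is to split the data
yourself. The sketch below is an assumption about how you might do this, not
something Django provides: the hypothetical helper ``iter_lines`` accepts any
iterable of byte strings, such as the result of ``chunks()``, and yields lines
with ``\n``, ``\r\n`` or ``\r`` terminators removed. It holds back any trailing
piece that does not end in ``\n`` so that a ``\r\n`` pair split across two
chunks is still treated as a single line ending:

```python
def iter_lines(chunks):
    """Yield lines, without terminators, from an iterable of byte strings.

    Accepts \\n, \\r\\n and \\r as line endings, including a \\r\\n pair
    that straddles a chunk boundary.
    """
    pending = b""
    for chunk in chunks:
        lines = (pending + chunk).splitlines(True)  # keep the terminators
        pending = b""
        # An unterminated tail, or one ending in a bare \r, may continue
        # in the next chunk, so keep it back for now.
        if lines and not lines[-1].endswith(b"\n"):
            pending = lines.pop()
        for line in lines:
            yield line.rstrip(b"\r\n")
    if pending:
        yield pending.rstrip(b"\r\n")
```

In a view this could be used as ``for line in iter_lines(f.chunks()): ...``.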
  156. Upload Handlers
  157. ===============
  158. When a user uploads a file, Django passes off the file data to an *upload
  159. handler* -- a small class that handles file data as it gets uploaded. Upload
  160. handlers are initially defined in the :setting:`FILE_UPLOAD_HANDLERS` setting,
  161. which defaults to::

    ("django.core.files.uploadhandler.MemoryFileUploadHandler",
     "django.core.files.uploadhandler.TemporaryFileUploadHandler",)

  164. Together the ``MemoryFileUploadHandler`` and ``TemporaryFileUploadHandler``
  165. provide Django's default file upload behavior of reading small files into memory
  166. and large ones onto disk.
  167. You can write custom handlers that customize how Django handles files. You
  168. could, for example, use custom handlers to enforce user-level quotas, compress
  169. data on the fly, render progress bars, and even send data to another storage
  170. location directly without storing it locally.
  171. .. _modifying_upload_handlers_on_the_fly:
  172. Modifying upload handlers on the fly
  173. ------------------------------------
  174. Sometimes particular views require different upload behavior. In these cases,
  175. you can override upload handlers on a per-request basis by modifying
  176. ``request.upload_handlers``. By default, this list will contain the upload
  177. handlers given by :setting:`FILE_UPLOAD_HANDLERS`, but you can modify the list
  178. as you would any other list.
  179. For instance, suppose you've written a ``ProgressBarUploadHandler`` that
  180. provides feedback on upload progress to some sort of AJAX widget. You'd add this
  181. handler to your upload handlers like this::

    request.upload_handlers.insert(0, ProgressBarUploadHandler())

  183. You'd probably want to use ``list.insert()`` in this case (instead of
  184. ``append()``) because a progress bar handler would need to run *before* any
  185. other handlers. Remember, the upload handlers are processed in order.
  186. If you want to replace the upload handlers completely, you can just assign a new
  187. list::

    request.upload_handlers = [ProgressBarUploadHandler()]

  189. .. note::
    You can only modify upload handlers *before* accessing
    ``request.POST`` or ``request.FILES`` -- it doesn't make sense to
    change upload handlers after upload handling has already
    started. If you try to modify ``request.upload_handlers`` after
    reading from ``request.POST`` or ``request.FILES``, Django will
    raise an error.

    Thus, you should always modify upload handlers as early in your view as
    possible.

    Also, ``request.POST`` is accessed by
    :class:`~django.middleware.csrf.CsrfViewMiddleware`, which is enabled by
    default. This means you will need to use
    :func:`~django.views.decorators.csrf.csrf_exempt` on your view to allow you
    to change the upload handlers. You will then need to use
    :func:`~django.views.decorators.csrf.csrf_protect` on the function that
    actually processes the request. Note that this means that the handlers may
    start receiving the file upload before the CSRF checks have been done.
    Example code:

    .. code-block:: python

        from django.views.decorators.csrf import csrf_exempt, csrf_protect

        @csrf_exempt
        def upload_file_view(request):
            # Install the handler before anything touches request.POST/FILES.
            request.upload_handlers.insert(0, ProgressBarUploadHandler())
            return _upload_file_view(request)

        @csrf_protect
        def _upload_file_view(request):
            ... # Process request

  216. Writing custom upload handlers
  217. ------------------------------
  218. All file upload handlers should be subclasses of
  219. ``django.core.files.uploadhandler.FileUploadHandler``. You can define upload
  220. handlers wherever you wish.
  221. Required methods
  222. ~~~~~~~~~~~~~~~~
  223. Custom file upload handlers **must** define the following methods:
  224. ``FileUploadHandler.receive_data_chunk(self, raw_data, start)``
  225. Receives a "chunk" of data from the file upload.
  226. ``raw_data`` is a byte string containing the uploaded data.
  227. ``start`` is the position in the file where this ``raw_data`` chunk
  228. begins.
  229. The data you return will get fed into the subsequent upload handlers'
  230. ``receive_data_chunk`` methods. In this way, one handler can be a
  231. "filter" for other handlers.
    Return ``None`` from ``receive_data_chunk`` to prevent the remaining
    upload handlers from getting this chunk. This is useful if you're
    storing the uploaded data yourself and don't want future handlers to
    store a copy of the data.

    If you raise a ``StopUpload`` or a ``SkipFile`` exception, the upload
    will abort or the file will be completely skipped, respectively.

  238. ``FileUploadHandler.file_complete(self, file_size)``
  239. Called when a file has finished uploading.
  240. The handler should return an ``UploadedFile`` object that will be stored
  241. in ``request.FILES``. Handlers may also return ``None`` to indicate that
  242. the ``UploadedFile`` object should come from subsequent upload handlers.
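
The chaining rules for these two methods can be modelled without Django. The
sketch below is a simplified stand-in for the upload machinery, not Django's
actual parsing code: ``ByteCounter``, ``Collector`` and ``run_chain`` are
hypothetical names, and the classes omit the ``FileUploadHandler`` base class
only so that the example runs on its own. In a real project the handlers would
subclass ``FileUploadHandler`` and ``file_complete`` would return an
``UploadedFile`` rather than a byte string:

```python
class ByteCounter(object):
    """Pass-through handler: counts bytes, then hands the chunk on."""

    def __init__(self):
        self.total = 0

    def receive_data_chunk(self, raw_data, start):
        self.total += len(raw_data)
        return raw_data  # returned data is fed to the next handler

    def file_complete(self, file_size):
        return None  # defer to a later handler for the file object


class Collector(object):
    """Terminal handler: stores the data itself and stops the chain."""

    def __init__(self):
        self.parts = []

    def receive_data_chunk(self, raw_data, start):
        self.parts.append(raw_data)
        return None  # later handlers never see this chunk

    def file_complete(self, file_size):
        return b"".join(self.parts)  # stand-in for an UploadedFile


def run_chain(handlers, chunks):
    """Simplified model of the dispatch rules described above."""
    counters = [0] * len(handlers)  # per-handler ``start`` offsets
    for chunk in chunks:
        for i, handler in enumerate(handlers):
            length = len(chunk)
            chunk = handler.receive_data_chunk(chunk, counters[i])
            counters[i] += length
            if chunk is None:
                break
    # The first handler to return something supplies the file object.
    for i, handler in enumerate(handlers):
        result = handler.file_complete(counters[i])
        if result is not None:
            return result
    return None
```

Running ``run_chain([ByteCounter(), Collector(), Collector()], chunks)`` shows
the ordering effect: the counter sees every byte, the first collector stores
the data, and the second collector receives nothing.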
  243. Optional methods
  244. ~~~~~~~~~~~~~~~~
  245. Custom upload handlers may also define any of the following optional methods or
  246. attributes:
  247. ``FileUploadHandler.chunk_size``
  248. Size, in bytes, of the "chunks" Django should store into memory and feed
  249. into the handler. That is, this attribute controls the size of chunks
  250. fed into ``FileUploadHandler.receive_data_chunk``.
  251. For maximum performance the chunk sizes should be divisible by ``4`` and
  252. should not exceed 2 GB (2\ :sup:`31` bytes) in size. When there are
  253. multiple chunk sizes provided by multiple handlers, Django will use the
  254. smallest chunk size defined by any handler.
  255. The default is 64*2\ :sup:`10` bytes, or 64 KB.
  256. ``FileUploadHandler.new_file(self, field_name, file_name, content_type, content_length, charset)``
  257. Callback signaling that a new file upload is starting. This is called
  258. before any data has been fed to any upload handlers.
    ``field_name`` is a string name of the file ``<input>`` field.

    ``file_name`` is the unicode filename that was provided by the browser.

    ``content_type`` is the MIME type provided by the browser, e.g.
    ``'image/jpeg'``.

    ``content_length`` is the length of the file given by the browser.
    Sometimes this won't be provided and will be ``None``.

    ``charset`` is the character set (e.g. ``utf8``) given by the browser.
    Like ``content_length``, this sometimes won't be provided.

  267. This method may raise a ``StopFutureHandlers`` exception to prevent
  268. future handlers from handling this file.
  269. ``FileUploadHandler.upload_complete(self)``
  270. Callback signaling that the entire upload (all files) has completed.
  271. ``FileUploadHandler.handle_raw_input(self, input_data, META, content_length, boundary, encoding)``
  272. Allows the handler to completely override the parsing of the raw
  273. HTTP input.
  274. ``input_data`` is a file-like object that supports ``read()``-ing.
  275. ``META`` is the same object as ``request.META``.
  276. ``content_length`` is the length of the data in ``input_data``. Don't
  277. read more than ``content_length`` bytes from ``input_data``.
  278. ``boundary`` is the MIME boundary for this request.
  279. ``encoding`` is the encoding of the request.
  280. Return ``None`` if you want upload handling to continue, or a tuple of
  281. ``(POST, FILES)`` if you want to return the new data structures suitable
  282. for the request directly.