123456789101112131415161718192021222324252627282930313233343536373839404142434445464748495051525354555657585960616263646566676869707172737475767778798081828384858687888990919293949596979899100101102103104105106107108109110111112113114115116117118119120121122123124125126127128129130131132133134135136137138139140141142143144145146147148149150151152153154155156157158159160161162163164165166167168169170171172173174175176177178179180181182183184185186187188189190191192193194195196197198199200201202203204205206207208209210211212213214215216217218219220221222223224225226227228229230231232233234235236237238239240241242243244245246247248249250251252253254255256257258259260261262263264265266267268269270271272273274275276277278279280281282283284285286287288289290291292293294 |
- ==========================
- Performing raw SQL queries
- ==========================
- .. currentmodule:: django.db.models
- When the :doc:`model query APIs </topics/db/queries>` don't go far enough, you
- can fall back to writing raw SQL. Django gives you two ways of performing raw
- SQL queries: you can use :meth:`Manager.raw()` to `perform raw queries and
- return model instances`__, or you can avoid the model layer entirely and
- `execute custom SQL directly`__.
- __ `performing raw queries`_
- __ `executing custom SQL directly`_
- .. _executing-raw-queries:
- Performing raw queries
- ======================
- .. versionadded:: 1.2
- The ``raw()`` manager method can be used to perform raw SQL queries that
- return model instances:
- .. method:: Manager.raw(raw_query, params=None, translations=None)
- This method method takes a raw SQL query, executes it, and returns a
- :class:`~django.db.models.query.RawQuerySet` instance. This
- :class:`~django.db.models.query.RawQuerySet` instance can be iterated
- over just like an normal QuerySet to provide object instances.
- This is best illustrated with an example. Suppose you've got the following model::
- class Person(models.Model):
- first_name = models.CharField(...)
- last_name = models.CharField(...)
- birth_date = models.DateField(...)
- You could then execute custom SQL like so::
- >>> for p in Person.objects.raw('SELECT * FROM myapp_person'):
- ... print p
- John Smith
- Jane Jones
- Of course, this example isn't very exciting -- it's exactly the same as
- running ``Person.objects.all()``. However, ``raw()`` has a bunch of other
- options that make it very powerful.
- .. admonition:: Model table names
- Where'd the name of the ``Person`` table come from in that example?
- By default, Django figures out a database table name by joining the
- model's "app label" -- the name you used in ``manage.py startapp`` -- to
- the model's class name, with an underscore between them. In the example
- we've assumed that the ``Person`` model lives in an app named ``myapp``,
- so its table would be ``myapp_person``.
- For more details check out the documentation for the
- :attr:`~Options.db_table` option, which also lets you manually set the
- database table name.
- .. warning::
- No checking is done on the SQL statement that is passed in to ``.raw()``.
- Django expects that the statement will return a set of rows from the
- database, but does nothing to enforce that. If the query does not
- return rows, a (possibly cryptic) error will result.
- Mapping query fields to model fields
- ------------------------------------
- ``raw()`` automatically maps fields in the query to fields on the model.
- The order of fields in your query doesn't matter. In other words, both
- of the following queries work identically::
- >>> Person.objects.raw('SELECT id, first_name, last_name, birth_date FROM myapp_person')
- ...
- >>> Person.objects.raw('SELECT last_name, birth_date, first_name, id FROM myapp_person')
- ...
- Matching is done by name. This means that you can use SQL's ``AS`` clauses to
- map fields in the query to model fields. So if you had some other table that
- had ``Person`` data in it, you could easily map it into ``Person`` instances::
- >>> Person.objects.raw('''SELECT first AS first_name,
- ... last AS last_name,
- ... bd AS birth_date,
- ... pk as id,
- ... FROM some_other_table''')
- As long as the names match, the model instances will be created correctly.
- Alternatively, you can map fields in the query to model fields using the
- ``translations`` argument to ``raw()``. This is a dictionary mapping names of
- fields in the query to names of fields on the model. For example, the above
- query could also be written::
- >>> name_map = {'first': 'first_name', 'last': 'last_name', 'bd': 'birth_date', 'pk': 'id'}
- >>> Person.objects.raw('SELECT * FROM some_other_table', translations=name_map)
- Index lookups
- -------------
- ``raw()`` supports indexing, so if you need only the first result you can
- write::
- >>> first_person = Person.objects.raw('SELECT * from myapp_person')[0]
- However, the indexing and slicing are not performed at the database level. If
- you have a big amount of ``Person`` objects in your database, it is more
- efficient to limit the query at the SQL level::
- >>> first_person = Person.objects.raw('SELECT * from myapp_person LIMIT 1')[0]
- Deferring model fields
- ----------------------
- Fields may also be left out::
- >>> people = Person.objects.raw('SELECT id, first_name FROM myapp_person')
- The ``Person`` objects returned by this query will be deferred model instances
- (see :meth:`~django.db.models.QuerySet.defer()`). This means that the fields
- that are omitted from the query will be loaded on demand. For example::
- >>> for p in Person.objects.raw('SELECT id, first_name FROM myapp_person'):
- ... print p.first_name, # This will be retrieved by the original query
- ... print p.last_name # This will be retrieved on demand
- ...
- John Smith
- Jane Jones
- From outward appearances, this looks like the query has retrieved both
- the first name and last name. However, this example actually issued 3
- queries. Only the first names were retrieved by the raw() query -- the
- last names were both retrieved on demand when they were printed.
- There is only one field that you can't leave out - the primary key
- field. Django uses the primary key to identify model instances, so it
- must always be included in a raw query. An ``InvalidQuery`` exception
- will be raised if you forget to include the primary key.
- Adding annotations
- ------------------
- You can also execute queries containing fields that aren't defined on the
- model. For example, we could use `PostgreSQL's age() function`__ to get a list
- of people with their ages calculated by the database::
- >>> people = Person.objects.raw('SELECT *, age(birth_date) AS age FROM myapp_person')
- >>> for p in people:
- ... print "%s is %s." % (p.first_name, p.age)
- John is 37.
- Jane is 42.
- ...
- __ http://www.postgresql.org/docs/8.4/static/functions-datetime.html
- Passing parameters into ``raw()``
- ---------------------------------
- If you need to perform parameterized queries, you can use the ``params``
- argument to ``raw()``::
- >>> lname = 'Doe'
- >>> Person.objects.raw('SELECT * FROM myapp_person WHERE last_name = %s', [lname])
- ``params`` is a list of parameters. You'll use ``%s`` placeholders in the
- query string (regardless of your database engine); they'll be replaced with
- parameters from the ``params`` list.
- .. warning::
- **Do not use string formatting on raw queries!**
- It's tempting to write the above query as::
- >>> query = 'SELECT * FROM myapp_person WHERE last_name = %s' % lname
- >>> Person.objects.raw(query)
- **Don't.**
- Using the ``params`` list completely protects you from `SQL injection
- attacks`__, a common exploit where attackers inject arbitrary SQL into
- your database. If you use string interpolation, sooner or later you'll
- fall victim to SQL injection. As long as you remember to always use the
- ``params`` list you'll be protected.
- __ http://en.wikipedia.org/wiki/SQL_injection
- .. _executing-custom-sql:
- Executing custom SQL directly
- =============================
- Sometimes even :meth:`Manager.raw` isn't quite enough: you might need to
- perform queries that don't map cleanly to models, or directly execute
- ``UPDATE``, ``INSERT``, or ``DELETE`` queries.
- In these cases, you can always access the database directly, routing around
- the model layer entirely.
- The object ``django.db.connection`` represents the
- default database connection, and ``django.db.transaction`` represents the
- default database transaction. To use the database connection, call
- ``connection.cursor()`` to get a cursor object. Then, call
- ``cursor.execute(sql, [params])`` to execute the SQL and ``cursor.fetchone()``
- or ``cursor.fetchall()`` to return the resulting rows. After performing a data
- changing operation, you should then call
- ``transaction.commit_unless_managed()`` to ensure your changes are committed
- to the database. If your query is purely a data retrieval operation, no commit
- is required. For example::
- def my_custom_sql():
- from django.db import connection, transaction
- cursor = connection.cursor()
- # Data modifying operation - commit required
- cursor.execute("UPDATE bar SET foo = 1 WHERE baz = %s", [self.baz])
- transaction.commit_unless_managed()
- # Data retrieval operation - no commit required
- cursor.execute("SELECT foo FROM bar WHERE baz = %s", [self.baz])
- row = cursor.fetchone()
- return row
- If you are using more than one database you can use
- ``django.db.connections`` to obtain the connection (and cursor) for a
- specific database. ``django.db.connections`` is a dictionary-like
- object that allows you to retrieve a specific connection using its
- alias::
- from django.db import connections
- cursor = connections['my_db_alias'].cursor()
- # Your code here...
- transaction.commit_unless_managed(using='my_db_alias')
- By default, the Python DB API will return results without their field
- names, which means you end up with a ``list`` of values, rather than a
- ``dict``. At a small performance cost, you can return results as a
- ``dict`` by using something like this::
-
- def dictfetchall(cursor):
- "Returns all rows from a cursor as a dict"
- desc = cursor.description
- return [
- dict(zip([col[0] for col in desc], row))
- for row in cursor.fetchall()
- ]
- Here is an example of the difference between the two::
- >>> cursor.execute("SELECT id, parent_id from test LIMIT 2");
- >>> cursor.fetchall()
- ((54360982L, None), (54360880L, None))
-
- >>> cursor.execute("SELECT id, parent_id from test LIMIT 2");
- >>> dictfetchall(cursor)
- [{'parent_id': None, 'id': 54360982L}, {'parent_id': None, 'id': 54360880L}]
- .. _transactions-and-raw-sql:
- Transactions and raw SQL
- ------------------------
- When you make a raw SQL call, Django will automatically mark the
- current transaction as dirty. You must then ensure that the
- transaction containing those calls is closed correctly. See :ref:`the
- notes on the requirements of Django's transaction handling
- <topics-db-transactions-requirements>` for more details.
- .. versionchanged:: 1.3
- Prior to Django 1.3, it was necessary to manually mark a transaction
- as dirty using ``transaction.set_dirty()`` when using raw SQL calls.
- Connections and cursors
- -----------------------
- ``connection`` and ``cursor`` mostly implement the standard Python DB-API
- described in :pep:`249` (except when it comes to :doc:`transaction handling
- </topics/db/transactions>`). If you're not familiar with the Python DB-API, note
- that the SQL statement in ``cursor.execute()`` uses placeholders, ``"%s"``,
- rather than adding parameters directly within the SQL. If you use this
- technique, the underlying database library will automatically add quotes and
- escaping to your parameter(s) as necessary. (Also note that Django expects the
- ``"%s"`` placeholder, *not* the ``"?"`` placeholder, which is used by the SQLite
- Python bindings. This is for the sake of consistency and sanity.)
|