- ================
- Full text search
- ================
- .. versionadded:: 1.10
- The database functions in the ``django.contrib.postgres.search`` module ease
- the use of PostgreSQL's `full text search engine
- <http://www.postgresql.org/docs/current/static/textsearch.html>`_.
- For the examples in this document, we'll use the models defined in
- :doc:`/topics/db/queries`.
- .. seealso::
- For a high-level overview of searching, see the :doc:`topic documentation
- </topics/db/search>`.
- .. currentmodule:: django.contrib.postgres.search
- The ``search`` lookup
- =====================
- .. fieldlookup:: search
- The simplest way to use full text search is to search a single term against a
- single column in the database. For example::
- >>> Entry.objects.filter(body_text__search='Cheese')
- [<Entry: Cheese on Toast recipes>, <Entry: Pizza Recipes>]
- This creates a ``to_tsvector`` in the database from the ``body_text`` field
- and a ``plainto_tsquery`` from the search term ``'Cheese'``, both using the
- default database search configuration. The results are obtained by matching the
- query and the vector.
- To use the ``search`` lookup, ``'django.contrib.postgres'`` must be in your
- :setting:`INSTALLED_APPS`.
- ``SearchVector``
- ================
- .. class:: SearchVector(\*expressions, config=None, weight=None)
- Searching against a single field is great but rather limiting. The ``Entry``
- instances we're searching belong to a ``Blog``, which has a ``tagline`` field.
- To query against both fields, use a ``SearchVector``::
- >>> from django.contrib.postgres.search import SearchVector
- >>> Entry.objects.annotate(
- ... search=SearchVector('body_text', 'blog__tagline'),
- ... ).filter(search='Cheese')
- [<Entry: Cheese on Toast recipes>, <Entry: Pizza Recipes>]
- The arguments to ``SearchVector`` can be any
- :class:`~django.db.models.Expression` or the name of a field. Multiple
- arguments will be concatenated together using a space so that the search
- document includes them all.
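- As a rough illustration of the concatenation described above, the following
- plain-Python sketch builds a search document from several field values. The
- helper ``build_search_document`` is hypothetical and not part of Django; it
- assumes that missing values are treated as empty strings before joining.

```python
def build_search_document(*values):
    """Join field values with single spaces, treating None as an empty string.

    Hypothetical helper for illustration only; in practice the database
    performs the concatenation.
    """
    return " ".join("" if value is None else str(value) for value in values)


body_text = "Cheese on toast"
tagline = "All the latest recipes"

# The space separator keeps "toast" and "All" as two distinct words.
document = build_search_document(body_text, tagline)
print(document)
```

- Without the separator, the last word of one field and the first word of the
- next would fuse into a single token and neither would match on its own.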
- ``SearchVector`` objects can be combined, allowing you to reuse them.
- For example::
- >>> Entry.objects.annotate(
- ... search=SearchVector('body_text') + SearchVector('blog__tagline'),
- ... ).filter(search='Cheese')
- [<Entry: Cheese on Toast recipes>, <Entry: Pizza Recipes>]
- See :ref:`postgresql-fts-search-configuration` and
- :ref:`postgresql-fts-weighting-queries` for an explanation of the ``config``
- and ``weight`` parameters.
- ``SearchQuery``
- ===============
- .. class:: SearchQuery(value, config=None)
- ``SearchQuery`` translates the terms the user provides into a search query
- object that the database compares to a search vector. By default, all the words
- the user provides are passed through the stemming algorithms, and the database
- then looks for documents that match all of the resulting terms.
- ``SearchQuery`` terms can be combined logically to provide more flexibility::
- >>> from django.contrib.postgres.search import SearchQuery
- >>> SearchQuery('potato') & SearchQuery('ireland') # potato AND ireland
- >>> SearchQuery('potato') | SearchQuery('penguin') # potato OR penguin
- >>> ~SearchQuery('sausage') # NOT sausage
- See :ref:`postgresql-fts-search-configuration` for an explanation of the
- ``config`` parameter.
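- The semantics of the combined queries above can be sketched with a toy model
- in plain Python. This is not how Django or PostgreSQL evaluate queries; the
- ``matches`` function and the tuple-based query format are hypothetical, and
- the document is assumed to be an already-normalized set of terms.

```python
def matches(query, lexemes):
    """Evaluate a toy query against a set of terms.

    A query is either a bare term (str) or a tuple whose first item is
    "and", "or", or "not", followed by its operand queries.
    """
    if isinstance(query, str):
        return query in lexemes
    op, *operands = query
    if op == "and":
        return all(matches(q, lexemes) for q in operands)
    if op == "or":
        return any(matches(q, lexemes) for q in operands)
    if op == "not":
        (operand,) = operands  # NOT takes exactly one operand
        return not matches(operand, lexemes)
    raise ValueError(f"unknown operator: {op!r}")


document = {"potato", "ireland"}

potato_and_ireland = ("and", "potato", "ireland")   # like SearchQuery & SearchQuery
potato_or_penguin = ("or", "potato", "penguin")     # like SearchQuery | SearchQuery
not_sausage = ("not", "sausage")                    # like ~SearchQuery

print(matches(potato_and_ireland, document))
print(matches(potato_or_penguin, document))
print(matches(not_sausage, document))
```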
- ``SearchRank``
- ==============
- .. class:: SearchRank(vector, query, weights=None)
- So far, we've just returned the results that match the vector and the query.
- You will often want to order the results by relevance as well.
- PostgreSQL provides a ranking function which takes into
- account how often the query terms appear in the document, how close together
- the terms are in the document, and how important the part of the document is
- where they occur. The better the match, the higher the value of the rank. To
- order by relevancy::
- >>> from django.contrib.postgres.search import SearchQuery, SearchRank, SearchVector
- >>> vector = SearchVector('body_text')
- >>> query = SearchQuery('cheese')
- >>> Entry.objects.annotate(rank=SearchRank(vector, query)).order_by('-rank')
- [<Entry: Cheese on Toast recipes>, <Entry: Pizza recipes>]
- See :ref:`postgresql-fts-weighting-queries` for an explanation of the
- ``weights`` parameter.
- .. _postgresql-fts-search-configuration:
- Changing the search configuration
- =================================
- You can pass the ``config`` argument to a :class:`SearchVector` and a
- :class:`SearchQuery` to use a different search configuration. This allows using
- different language parsers and dictionaries as defined by the database::
- >>> from django.contrib.postgres.search import SearchQuery, SearchVector
- >>> Entry.objects.annotate(
- ... search=SearchVector('body_text', config='french'),
- ... ).filter(search=SearchQuery('œuf', config='french'))
- [<Entry: Pain perdu>]
- The value of ``config`` can also be read from another column::
- >>> from django.db.models import F
- >>> Entry.objects.annotate(
- ... search=SearchVector('body_text', config=F('blog__language')),
- ... ).filter(search=SearchQuery('œuf', config=F('blog__language')))
- [<Entry: Pain perdu>]
- .. _postgresql-fts-weighting-queries:
- Weighting queries
- =================
- Not every field has the same relevance in a query, so you can set the weight
- of each vector before you combine them::
- >>> from django.contrib.postgres.search import SearchQuery, SearchRank, SearchVector
- >>> vector = SearchVector('body_text', weight='A') + SearchVector('blog__tagline', weight='B')
- >>> query = SearchQuery('cheese')
- >>> Entry.objects.annotate(rank=SearchRank(vector, query)).filter(rank__gte=0.3).order_by('-rank')
- The weight should be one of the following letters: D, C, B, A. By default,
- these weights refer to the numbers ``0.1``, ``0.2``, ``0.4``, and ``1.0``,
- respectively. If you wish to weight them differently, pass a list of four
- floats to :class:`SearchRank` as ``weights``, in the same order as above
- (D, C, B, A)::
- >>> rank = SearchRank(vector, query, weights=[0.2, 0.4, 0.6, 0.8])
- >>> Entry.objects.annotate(rank=rank).filter(rank__gte=0.3).order_by('-rank')
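- Because the ``weights`` list is positional, it is easy to misread which float
- applies to which letter. The following plain-Python sketch spells out the
- mapping for the list used above. The helper ``weights_by_letter`` is
- hypothetical and not part of Django.

```python
# Default value for each weight letter, as listed in the text above.
DEFAULT_WEIGHTS = {"D": 0.1, "C": 0.2, "B": 0.4, "A": 1.0}


def weights_by_letter(weights):
    """Map a four-float ``weights`` list, ordered D, C, B, A, to its letters.

    Hypothetical helper for illustration only.
    """
    if len(weights) != 4:
        raise ValueError("expected exactly four weights, ordered D, C, B, A")
    return dict(zip("DCBA", weights))


custom = weights_by_letter([0.2, 0.4, 0.6, 0.8])
print(custom["A"])  # value applied to vectors created with weight='A'
print(custom["B"])  # value applied to vectors created with weight='B'
```

- With this list, a match in a vector weighted ``'A'`` counts for ``0.8`` and a
- match in a vector weighted ``'B'`` counts for ``0.6``.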
- Performance
- ===========
- Special database configuration isn't necessary to use any of these functions.
- However, if you're searching more than a few hundred records, you're likely to
- run into performance problems. Full text search is a more intensive process
- than comparing the size of an integer, for example.
- In the event that all the fields you're querying on are contained within one
- particular model, you can create a functional index which matches the search
- vector you wish to use. For example:
- .. code-block:: sql
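- CREATE INDEX body_text_search ON blog_entry USING GIN (to_tsvector('english', body_text));
- PostgreSQL only accepts ``to_tsvector`` in an index expression when the search
- configuration is named explicitly (``'english'`` here is only an example). It
- uses the index only for queries whose search vector expression matches the
- indexed one, so at a minimum pass the same ``config`` to ``SearchVector`` and
- check the query plan with ``EXPLAIN``. In many cases this will be sufficient.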
- ``SearchVectorField``
- ---------------------
- .. class:: SearchVectorField
- If this approach becomes too slow, you can add a ``SearchVectorField`` to your
- model. You'll need to keep it populated with triggers, for example, as
- described in the `PostgreSQL documentation`_. You can then query the field as
- if it were an annotated ``SearchVector``::
- >>> Entry.objects.update(search_vector=SearchVector('body_text'))
- >>> Entry.objects.filter(search_vector='cheese')
- [<Entry: Cheese on Toast recipes>, <Entry: Pizza recipes>]
- .. _PostgreSQL documentation: http://www.postgresql.org/docs/current/static/textsearch-features.html#TEXTSEARCH-UPDATE-TRIGGERS