================
Full text search
================

.. versionadded:: 1.10

The database functions in the ``django.contrib.postgres.search`` module ease
the use of PostgreSQL's `full text search engine
<http://www.postgresql.org/docs/current/static/textsearch.html>`_.

For the examples in this document, we'll use the models defined in
:doc:`/topics/db/queries`.

.. seealso::

    For a high-level overview of searching, see the :doc:`topic documentation
    </topics/db/search>`.

.. currentmodule:: django.contrib.postgres.search

The ``search`` lookup
=====================

.. fieldlookup:: search

The simplest way to use full text search is to search a single term against a
single column in the database. For example::

    >>> Entry.objects.filter(body_text__search='Cheese')
    [<Entry: Cheese on Toast recipes>, <Entry: Pizza Recipes>]

This creates a ``to_tsvector`` in the database from the ``body_text`` field
and a ``plainto_tsquery`` from the search term ``'Cheese'``, both using the
default database search configuration. The results are obtained by matching the
query and the vector.

To use the ``search`` lookup, ``'django.contrib.postgres'`` must be in your
:setting:`INSTALLED_APPS`.

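
The matching step can be pictured with a small pure-Python sketch. It is only a
toy model and makes several assumptions: ``to_terms`` and ``plain_match`` are
hypothetical helpers (they are not part of Django or PostgreSQL), the sample
entries are invented, and the real database also applies stemming and stop-word
removal according to the search configuration, which this sketch skips.

```python
import re


def to_terms(text):
    """Toy stand-in for a search vector: the set of lowercased words in the text."""
    return set(re.findall(r"\w+", text.lower()))


def plain_match(document, search):
    """Toy stand-in for a plain query: every word of the search must occur in the document."""
    return to_terms(search) <= to_terms(document)


# Invented sample data, keyed by headline.
entries = {
    "Cheese on Toast recipes": "Melt the cheese over thick slices of toast.",
    "Pizza Recipes": "Top the base with tomato sauce and plenty of cheese.",
    "Green salad": "Toss the leaves with olive oil.",
}

# Keep the headlines whose body text matches the search term.
matches = [title for title, body in entries.items() if plain_match(body, "Cheese")]
print(matches)
```

The search is case-insensitive here only because both sides are lowercased
before comparison; the database reaches a similar effect through its own
normalization rules.
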
``SearchVector``
================

.. class:: SearchVector(\*expressions, config=None, weight=None)

Searching against a single field is great but rather limiting. The ``Entry``
instances we're searching belong to a ``Blog``, which has a ``tagline`` field.
To query against both fields, use a ``SearchVector``::

    >>> from django.contrib.postgres.search import SearchVector
    >>> Entry.objects.annotate(
    ...     search=SearchVector('body_text', 'blog__tagline'),
    ... ).filter(search='Cheese')
    [<Entry: Cheese on Toast recipes>, <Entry: Pizza Recipes>]

The arguments to ``SearchVector`` can be any
:class:`~django.db.models.Expression` or the name of a field. Multiple
arguments will be concatenated together using a space so that the search
document includes them all.

``SearchVector`` objects can be combined together, allowing you to reuse them.
For example::

    >>> Entry.objects.annotate(
    ...     search=SearchVector('body_text') + SearchVector('blog__tagline'),
    ... ).filter(search='Cheese')
    [<Entry: Cheese on Toast recipes>, <Entry: Pizza Recipes>]

See :ref:`postgresql-fts-search-configuration` and
:ref:`postgresql-fts-weighting-queries` for an explanation of the ``config``
and ``weight`` parameters.

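
To illustrate why concatenating several fields into one search document widens
what a query can find, here is a minimal pure-Python sketch. The helper names
``build_document`` and ``terms`` and the sample strings are assumptions made
for this illustration; they do not exist in Django.

```python
import re


def build_document(*field_values):
    """Join the field values with a single space to form one search document."""
    return " ".join(field_values)


def terms(text):
    """Split the text into a set of lowercased words."""
    return set(re.findall(r"\w+", text.lower()))


# Invented values standing in for Entry.body_text and Blog.tagline.
body_text = "Slice the bread and grill it."
tagline = "All about cheese"

document = build_document(body_text, tagline)

# The term only occurs in the tagline, so searching the body alone misses it.
found_body_only = "cheese" in terms(body_text)
found = "cheese" in terms(document)
print(found_body_only, found)
```
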
``SearchQuery``
===============

.. class:: SearchQuery(value, config=None)

``SearchQuery`` translates the terms the user provides into a search query
object that the database compares to a search vector. By default, all the words
the user provides are passed through the stemming algorithms, and then the
database looks for matches for all of the resulting terms.

``SearchQuery`` terms can be combined logically to provide more flexibility::

    >>> from django.contrib.postgres.search import SearchQuery
    >>> SearchQuery('potato') & SearchQuery('ireland')  # potato AND ireland
    >>> SearchQuery('potato') | SearchQuery('penguin')  # potato OR penguin
    >>> ~SearchQuery('sausage')  # NOT sausage

See :ref:`postgresql-fts-search-configuration` for an explanation of the
``config`` parameter.

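
The logical combinations can be modelled with Python's operator overloading.
The sketch below is a toy, not Django's implementation: ``ToyQuery`` and
``word`` are hypothetical names, and matching is done on plain lowercased
words without any stemming.

```python
class ToyQuery:
    """Hypothetical query node wrapping a predicate over a set of words."""

    def __init__(self, predicate):
        self.predicate = predicate

    def __and__(self, other):
        # Both sub-queries must match.
        return ToyQuery(lambda words: self.predicate(words) and other.predicate(words))

    def __or__(self, other):
        # Either sub-query may match.
        return ToyQuery(lambda words: self.predicate(words) or other.predicate(words))

    def __invert__(self):
        # The sub-query must not match.
        return ToyQuery(lambda words: not self.predicate(words))

    def matches(self, text):
        return self.predicate(set(text.lower().split()))


def word(value):
    """Build a query that matches documents containing the given word."""
    return ToyQuery(lambda words: value in words)


potato_and_ireland = word("potato") & word("ireland")
potato_or_penguin = word("potato") | word("penguin")
not_sausage = ~word("sausage")

print(potato_and_ireland.matches("Potato farming in Ireland"))
print(potato_or_penguin.matches("penguin colony"))
print(not_sausage.matches("sausage roll"))
```
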
``SearchRank``
==============

.. class:: SearchRank(vector, query, weights=None)

So far, we've returned every result for which the vector and the query match
at all. You will often want to order the results by
some sort of relevancy. PostgreSQL provides a ranking function which takes into
account how often the query terms appear in the document, how close together
the terms are in the document, and how important the part of the document is
where they occur. The better the match, the higher the value of the rank. To
order by relevancy::

    >>> from django.contrib.postgres.search import SearchQuery, SearchRank, SearchVector
    >>> vector = SearchVector('body_text')
    >>> query = SearchQuery('cheese')
    >>> Entry.objects.annotate(rank=SearchRank(vector, query)).order_by('-rank')
    [<Entry: Cheese on Toast recipes>, <Entry: Pizza Recipes>]

See :ref:`postgresql-fts-weighting-queries` for an explanation of the
``weights`` parameter.

.. _postgresql-fts-search-configuration:

Changing the search configuration
=================================

You can specify the ``config`` attribute to a :class:`SearchVector` and
:class:`SearchQuery` to use a different search configuration. This allows using
different language parsers and dictionaries as defined by the database::

    >>> from django.contrib.postgres.search import SearchQuery, SearchVector
    >>> Entry.objects.annotate(
    ...     search=SearchVector('body_text', config='french'),
    ... ).filter(search=SearchQuery('œuf', config='french'))
    [<Entry: Pain perdu>]

The value of ``config`` could also be stored in another column::

    >>> from django.db.models import F
    >>> Entry.objects.annotate(
    ...     search=SearchVector('body_text', config=F('blog__language')),
    ... ).filter(search=SearchQuery('œuf', config=F('blog__language')))
    [<Entry: Pain perdu>]

.. _postgresql-fts-weighting-queries:

Weighting queries
=================

Not every field has the same relevance in a query, so you can set the weights
of various vectors before you combine them::

    >>> from django.contrib.postgres.search import SearchQuery, SearchRank, SearchVector
    >>> vector = SearchVector('body_text', weight='A') + SearchVector('blog__tagline', weight='B')
    >>> query = SearchQuery('cheese')
    >>> Entry.objects.annotate(rank=SearchRank(vector, query)).filter(rank__gte=0.3).order_by('-rank')

The weight should be one of the following letters: D, C, B, A. By default,
these weights refer to the numbers ``0.1``, ``0.2``, ``0.4``, and ``1.0``,
respectively. If you wish to weight them differently, pass a list of four
floats to :class:`SearchRank` as ``weights`` in the same order as above::

    >>> rank = SearchRank(vector, query, weights=[0.2, 0.4, 0.6, 0.8])
    >>> Entry.objects.annotate(rank=rank).filter(rank__gte=0.3).order_by('-rank')

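
The letter-to-number mapping, including the D, C, B, A ordering of the
``weights`` list, can be sketched in plain Python. ``weight_for`` and
``toy_score`` are hypothetical helpers written for this illustration, and
``toy_score`` is deliberately simplistic: it only adds up the weight of each
labelled field containing the term and is not the formula PostgreSQL uses for
ranking.

```python
LETTERS = "DCBA"                       # order in which the weights are listed
DEFAULT_WEIGHTS = (0.1, 0.2, 0.4, 1.0)  # defaults for D, C, B, A


def weight_for(letter, weights=DEFAULT_WEIGHTS):
    """Return the numeric weight that a letter refers to."""
    return weights[LETTERS.index(letter)]


def toy_score(labelled_fields, term, weights=DEFAULT_WEIGHTS):
    """Toy score: sum the weights of the labelled fields that contain the term."""
    return sum(
        weight_for(letter, weights)
        for letter, text in labelled_fields
        if term in text.lower().split()
    )


# Invented data: body text labelled 'A', blog tagline labelled 'B'.
in_body = [("A", "cheese on toast"), ("B", "cooking for everyone")]
in_tagline = [("A", "pizza dough"), ("B", "cheese lovers")]

print(toy_score(in_body, "cheese"))     # 1.0
print(toy_score(in_tagline, "cheese"))  # 0.4

# Custom weights, still listed in D, C, B, A order.
custom = (0.2, 0.4, 0.6, 0.8)
print(weight_for("A", custom), weight_for("B", custom))
```

With the default weights, a hit in the ``A`` field counts two and a half times
as much as a hit in the ``B`` field; the custom list narrows that gap.
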
Performance
===========

Special database configuration isn't necessary to use any of these functions;
however, if you're searching more than a few hundred records, you're likely to
run into performance problems. Full text search is a more intensive process
than comparing the size of an integer, for example.

In the event that all the fields you're querying on are contained within one
particular model, you can create a functional index which matches the search
vector you wish to use. For example:

.. code-block:: sql

    CREATE INDEX body_text_search ON blog_entry
        USING gin (to_tsvector('english', body_text));

PostgreSQL only accepts the two-argument form of ``to_tsvector`` in an index
expression, so the configuration must be named explicitly (``'english'`` is
only an example here). Queries must use the same configuration, for example
``SearchVector('body_text', config='english')``, for the index to be used. In
many cases this will be sufficient.

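
Why an index helps can be seen from a toy inverted index, the general idea
behind PostgreSQL's GIN index type. The sketch below is an assumption-laden
illustration in pure Python: ``build_index`` and ``lookup`` are hypothetical
helpers, the rows are invented, and a real database index also handles
stemming, updates, and storage on disk.

```python
import re
from collections import defaultdict


def build_index(rows):
    """Map each lowercased word to the set of row ids whose text contains it."""
    index = defaultdict(set)
    for row_id, text in rows.items():
        for term in re.findall(r"\w+", text.lower()):
            index[term].add(row_id)
    return index


def lookup(index, term):
    """Find matching row ids with one dictionary access instead of scanning every row."""
    return index.get(term.lower(), set())


# Invented rows standing in for blog_entry.body_text.
rows = {
    1: "Cheese on toast",
    2: "Pizza with extra cheese",
    3: "Green salad",
}

index = build_index(rows)
print(lookup(index, "Cheese"))
print(lookup(index, "penguin"))
```

The work of splitting the text into terms is done once, when the index is
built, rather than for every row on every query.
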
``SearchVectorField``
---------------------

.. class:: SearchVectorField

If this approach becomes too slow, you can add a ``SearchVectorField`` to your
model. You'll need to keep it populated, for example with triggers as
described in the `PostgreSQL documentation`_. You can then query the field as
if it were an annotated ``SearchVector``::

    >>> Entry.objects.update(search_vector=SearchVector('body_text'))
    >>> Entry.objects.filter(search_vector='cheese')
    [<Entry: Cheese on Toast recipes>, <Entry: Pizza Recipes>]

.. _PostgreSQL documentation: http://www.postgresql.org/docs/current/static/textsearch-features.html#TEXTSEARCH-UPDATE-TRIGGERS