123456789101112131415161718192021222324252627282930313233343536373839404142434445464748495051525354555657585960616263646566676869707172737475767778798081828384858687888990919293949596979899100101102103104105106107108109110111112113114115116117118119120121122123124125126127128129130131132133134135136137138139140 |
- ======
- Search
- ======
- A common task for web applications is to search some data in the database with
- user input. In a simple case, this could be filtering a list of objects by a
- category. A more complex use case might require searching with weighting,
- categorization, highlighting, multiple languages, and so on. This document
- explains some of the possible use cases and the tools you can use.
- We'll refer to the same models used in :doc:`/topics/db/queries`.
- Use Cases
- =========
- Standard textual queries
- ------------------------
- Text-based fields have a selection of matching operations. For example, you may
- wish to allow lookup up an author like so:
- .. code-block:: pycon
- >>> Author.objects.filter(name__contains="Terry")
- [<Author: Terry Gilliam>, <Author: Terry Jones>]
- This is a very fragile solution as it requires the user to know an exact
- substring of the author's name. A better approach could be a case-insensitive
- match (:lookup:`icontains`), but this is only marginally better.
- A database's more advanced comparison functions
- -----------------------------------------------
- If you're using PostgreSQL, Django provides :doc:`a selection of database
- specific tools </ref/contrib/postgres/search>` to allow you to leverage more
- complex querying options. Other databases have different selections of tools,
- possibly via plugins or user-defined functions. Django doesn't include any
- support for them at this time. We'll use some examples from PostgreSQL to
- demonstrate the kind of functionality databases may have.
- .. admonition:: Searching in other databases
- All of the searching tools provided by :mod:`django.contrib.postgres` are
- constructed entirely on public APIs such as :doc:`custom lookups
- </ref/models/lookups>` and :doc:`database functions
- </ref/models/database-functions>`. Depending on your database, you should
- be able to construct queries to allow similar APIs. If there are specific
- things which cannot be achieved this way, please open a ticket.
- In the above example, we determined that a case insensitive lookup would be
- more useful. When dealing with non-English names, a further improvement is to
- use :lookup:`unaccented comparison <unaccent>`:
- .. code-block:: pycon
- >>> Author.objects.filter(name__unaccent__icontains="Helen")
- [<Author: Helen Mirren>, <Author: Helena Bonham Carter>, <Author: Hélène Joy>]
- This shows another issue, where we are matching against a different spelling of
- the name. In this case we have an asymmetry though - a search for ``Helen``
- will pick up ``Helena`` or ``Hélène``, but not the reverse. Another option
- would be to use a :lookup:`trigram_similar` comparison, which compares
- sequences of letters.
- For example:
- .. code-block:: pycon
- >>> Author.objects.filter(name__unaccent__lower__trigram_similar="Hélène")
- [<Author: Helen Mirren>, <Author: Hélène Joy>]
- Now we have a different problem - the longer name of "Helena Bonham Carter"
- doesn't show up as it is much longer. Trigram searches consider all
- combinations of three letters, and compares how many appear in both search and
- source strings. For the longer name, there are more combinations that don't
- appear in the source string, so it is no longer considered a close match.
- The correct choice of comparison functions here depends on your particular data
- set, for example the language(s) used and the type of text being searched. All
- of the examples we've seen are on short strings where the user is likely to
- enter something close (by varying definitions) to the source data.
- Document-based search
- ---------------------
- Standard database operations stop being a useful approach when you start
- considering large blocks of text. Whereas the examples above can be thought of
- as operations on a string of characters, full text search looks at the actual
- words. Depending on the system used, it's likely to use some of the following
- ideas:
- - Ignoring "stop words" such as "a", "the", "and".
- - Stemming words, so that "pony" and "ponies" are considered similar.
- - Weighting words based on different criteria such as how frequently they
- appear in the text, or the importance of the fields, such as the title or
- keywords, that they appear in.
- There are many alternatives for using searching software, some of the most
- prominent are Elastic_ and Solr_. These are full document-based search
- solutions. To use them with data from Django models, you'll need a layer which
- translates your data into a textual document, including back-references to the
- database ids. When a search using the engine returns a certain document, you
- can then look it up in the database. There are a variety of third-party
- libraries which are designed to help with this process.
- .. _Elastic: https://www.elastic.co/
- .. _Solr: https://solr.apache.org/
- PostgreSQL support
- ~~~~~~~~~~~~~~~~~~
- PostgreSQL has its own full text search implementation built-in. While not as
- powerful as some other search engines, it has the advantage of being inside
- your database and so can easily be combined with other relational queries such
- as categorization.
- The :mod:`django.contrib.postgres` module provides some helpers to make these
- queries. For example, a query might select all the blog entries which mention
- "cheese":
- .. code-block:: pycon
- >>> Entry.objects.filter(body_text__search="cheese")
- [<Entry: Cheese on Toast recipes>, <Entry: Pizza recipes>]
- You can also filter on a combination of fields and on related models:
- .. code-block:: pycon
- >>> Entry.objects.annotate(
- ... search=SearchVector("blog__tagline", "body_text"),
- ... ).filter(search="cheese")
- [
- <Entry: Cheese on Toast recipes>,
- <Entry: Pizza Recipes>,
- <Entry: Dairy farming in Argentina>,
- ]
- See the ``contrib.postgres`` :doc:`/ref/contrib/postgres/search` document for
- complete details.
|