SQL- LIKE performance question

Discussion in 'OT Technology' started by elevator, Jan 6, 2008.

  1. elevator

    elevator The tenants think it's wonderful! ヽ(´ OT Supporter

    Joined:
    Mar 25, 2005
    Messages:
    10,967
    Likes Received:
    0
    Location:
    Houston, TX
    I have this web app. The search page has 3 text fields that you can search by (Name, Addres, and Contact Name) and a Search button.

    I have an SQLDataSource1 with the statement: SELECT [Name], [Address], [City], [State], [ContactName] FROM [Owner] WHERE (([Address] LIKE '%' + @Address + '%') AND ([Name] LIKE '%' + @Name + '%') AND ([ContactName] LIKE '%' + @ContactName + '%')). The three search fields have the wildcard (%) as the default in case nothing is entered.

    I have a GridView that uses SQLDataSource1. I have set it to Visible = false.

    In the code behind for the button, it sets the Visible property of the GridView to true.

    I have 3 records in the database now and it looks and operates EXACTLY how I need it to be when I click the search button.



    MY Question:
    I've heard this would be very slow. I've heard that I should use full-text indexing, but it seems like overkill for this simple search. I am using SQL Server 2005. Can you guys estimate the seconds it would take to search through the following (completely returning all results, no TOP 10) number of records using LIKE in a varchar(60) address:
    1,000 -
    10,000 -
    100,000 -
    1,000,000 -
     
  2. elevator

    elevator The tenants think it's wonderful! ヽ(´ OT Supporter

    Joined:
    Mar 25, 2005
    Messages:
    10,967
    Likes Received:
    0
    Location:
    Houston, TX
    no performance elitists?? :o
     
  3. Peyomp

    Peyomp New Member

    Joined:
    Jan 11, 2002
    Messages:
    14,017
    Likes Received:
    0
    This is not something to guess at. Its something to test. Generate a bunch of data and see.
     
  4. elevator

    elevator The tenants think it's wonderful! ヽ(´ OT Supporter

    Joined:
    Mar 25, 2005
    Messages:
    10,967
    Likes Received:
    0
    Location:
    Houston, TX
    I have it set up on a local development server, will this reflect real life situation? I really just want an estimate. Like 10 seconds for 1,000,000 records... Is that an unreasonable estimate?
     
  5. Pork Chop Sandwiches

    Pork Chop Sandwiches Extraordinary Member

    Joined:
    Feb 25, 2005
    Messages:
    7,217
    Likes Received:
    0
    Location:
    TGP
    Why are name, address, and contact information set to text? You can use character for name and varchar for address/contact. That alone will give you better performance just because of the different ways SQL queries the two different data types. You absolutely do not need FTI for this.

    It's impossible to estimate the performance of these queries because of the billion different factors involved (memory, processor, server load, etc.). I just recently ran a query at work that had about 8 different like clauses on a text field for about 1.2 million rows, and that took a half hour, if that tells you anything.
     
  6. elevator

    elevator The tenants think it's wonderful! ヽ(´ OT Supporter

    Joined:
    Mar 25, 2005
    Messages:
    10,967
    Likes Received:
    0
    Location:
    Houston, TX
    Textboxes on the asp.net page. They are all data type varchar(50) in the database. Character? The name of the company cannot be a fixed length. People tell me full text index would improve performance. Is that true?

    :noes: I am only getting the TOP 30 or so... and then having advanced search options to return TOP 100 or whatever. Hopefully I won't have any 30 minute searches.
     
  7. Pork Chop Sandwiches

    Pork Chop Sandwiches Extraordinary Member

    Joined:
    Feb 25, 2005
    Messages:
    7,217
    Likes Received:
    0
    Location:
    TGP
    If the fields' data type is not text, then FTI can't do anything for you.

    Even if you're returning only the 'top 30', it still has to find all the records that have the values you're looking for (and, presumably, sorting them in a specific order).

    Unless you're using a shit server, query performance shouldn't be an issue for you.
     
  8. 5Gen_Prelude

    5Gen_Prelude There might not be an "I" in the word "Team", but

    Joined:
    Mar 14, 2000
    Messages:
    14,519
    Likes Received:
    1
    Location:
    Vancouver, BC, CANADA
    You can also speed it up by changing the SQL Statement if nothing is entered into a field - the code obviously is slightly longer, but it should yield better results.
     
  9. elevator

    elevator The tenants think it's wonderful! ヽ(´ OT Supporter

    Joined:
    Mar 25, 2005
    Messages:
    10,967
    Likes Received:
    0
    Location:
    Houston, TX
    I thought TOP xx searched until xx records were found and then stopped searching? After that, I would sort them using a sub query type structure. This, of course, has its own flaw of not seeing the most recent records if there are more than 30 matches.

    You are the first person I've talked to to say it wouldn't be an issue. I didn't think it would be such a big issue, but I hear "half an hour" and it scares me. I want to do it right the first time.
     
  10. elevator

    elevator The tenants think it's wonderful! ヽ(´ OT Supporter

    Joined:
    Mar 25, 2005
    Messages:
    10,967
    Likes Received:
    0
    Location:
    Houston, TX
    Would this be done in the SQL statement, or in the code behind for the asp.net page? :o
     
    Last edited: Jan 7, 2008
  11. 5Gen_Prelude

    5Gen_Prelude There might not be an "I" in the word "Team", but

    Joined:
    Mar 14, 2000
    Messages:
    14,519
    Likes Received:
    1
    Location:
    Vancouver, BC, CANADA
    Either - SQL can do if statements and build the SQL statement on the fly, or asp can do it and simply pass the final SQL statement as a whole to the SQL server to run.
     
  12. 5Gen_Prelude

    5Gen_Prelude There might not be an "I" in the word "Team", but

    Joined:
    Mar 14, 2000
    Messages:
    14,519
    Likes Received:
    1
    Location:
    Vancouver, BC, CANADA
    How many tables did it involve though?
     
  13. Nocera

    Nocera ...

    Joined:
    Aug 9, 2000
    Messages:
    1,307
    Likes Received:
    0
    Location:
    Long Island, NY
    Your query WILL result in a full table scan on each execution and WILL NOT scale with database size or application usage. The best solution is to not allow wildcards at the start of the search criteria. This will allow you to take advantage of the indices on the columns in question (don't forget to add them).

    For instance, the following predicate will result in a constant time lookup independent of the size of the table (provided there is an index on each of the three columns):
    Code:
    WHERE (([Address] LIKE '' + @Address + '%') 
         AND ([Name] LIKE '' + @Name + '%') 
         AND ([ContactName] LIKE '' + @ContactName + '%'))
    
     
  14. Astro

    Astro Code Monkey

    Joined:
    Mar 18, 2000
    Messages:
    2,047
    Likes Received:
    0
    Location:
    Cleveland Ohio
    Hmm... Wonder if some indexing would help here or maybe even a clustered index. Toss it in query analyzer and do an explain plan might help, if you know what to look for. Otherwise, best thing to do is toss a bunch of data at it and see how it performs under the conditions you have.
     
  15. elevator

    elevator The tenants think it's wonderful! ヽ(´ OT Supporter

    Joined:
    Mar 25, 2005
    Messages:
    10,967
    Likes Received:
    0
    Location:
    Houston, TX
    If I don't allow wildcards at the start of the search criteria, the user can't list by street or the latter half of the number address. It needs the flexibility. What do you mean take advantage of the indices on the columns in question? Add them? Add what?
     
  16. Nocera

    Nocera ...

    Joined:
    Aug 9, 2000
    Messages:
    1,307
    Likes Received:
    0
    Location:
    Long Island, NY
    Then you should store and search on each of the address fields separately (and preferably place them in a separate address table).

    If you search on the Address, Name, and ContactName columns often, you should add an index on each.

    http://en.wikipedia.org/wiki/Index_(database)
     
  17. elevator

    elevator The tenants think it's wonderful! ヽ(´ OT Supporter

    Joined:
    Mar 25, 2005
    Messages:
    10,967
    Likes Received:
    0
    Location:
    Houston, TX
    This would not solve me searching for 560 and expecting 12560 coming up, AND 125 and expecting 12560 to come up. This application requires it.
     
  18. elevator

    elevator The tenants think it's wonderful! ヽ(´ OT Supporter

    Joined:
    Mar 25, 2005
    Messages:
    10,967
    Likes Received:
    0
    Location:
    Houston, TX
    So I guess if I require true front and back wildcard searching, I can't use any index? This is so weird because I use the same technique as a user searching an existing SQL database of 40,000 records and it returns the results in no time at all.
     
    Last edited: Jan 8, 2008
  19. Pork Chop Sandwiches

    Pork Chop Sandwiches Extraordinary Member

    Joined:
    Feb 25, 2005
    Messages:
    7,217
    Likes Received:
    0
    Location:
    TGP
    A TOP query is almost useless unless you sort the results. Otherwise, you're just getting 30 (or whatever) random results. If that's OK, then go without a sort and it will just return the first 30 it finds.
     
  20. Pork Chop Sandwiches

    Pork Chop Sandwiches Extraordinary Member

    Joined:
    Feb 25, 2005
    Messages:
    7,217
    Likes Received:
    0
    Location:
    TGP
    4 One table had the text field, and the other three had names for the ID fields in the main table.
     
  21. Pork Chop Sandwiches

    Pork Chop Sandwiches Extraordinary Member

    Joined:
    Feb 25, 2005
    Messages:
    7,217
    Likes Received:
    0
    Location:
    TGP
    Indexing on these tables is a must. It's a must on anything that is repeatedly searched.

    EDIT: I don't believe an index seek is done for a LIKE clause. I could check on that.
     
  22. elevator

    elevator The tenants think it's wonderful! ヽ(´ OT Supporter

    Joined:
    Mar 25, 2005
    Messages:
    10,967
    Likes Received:
    0
    Location:
    Houston, TX
    Can I use indexing with a LIKE '%' + whatever + '%' query?
     
  23. elevator

    elevator The tenants think it's wonderful! ヽ(´ OT Supporter

    Joined:
    Mar 25, 2005
    Messages:
    10,967
    Likes Received:
    0
    Location:
    Houston, TX
    If I sort all the records in the table and then get TOP 30 (which after I sorted would technically be the LAST 30), wouldn't that be just as inefficient as doing a whole table search?
     
  24. Pork Chop Sandwiches

    Pork Chop Sandwiches Extraordinary Member

    Joined:
    Feb 25, 2005
    Messages:
    7,217
    Likes Received:
    0
    Location:
    TGP
    Are you talking about full-text indexing, or regular table indexes? You can't use FTI on anything but a text field.

    Index scans are not done on LIKE clauses when you use a wildcard (the %).
     
  25. Pork Chop Sandwiches

    Pork Chop Sandwiches Extraordinary Member

    Joined:
    Feb 25, 2005
    Messages:
    7,217
    Likes Received:
    0
    Location:
    TGP
    Yes, a table scan is necessary. Just create an index with the fields you'll be querying the most.

    I think your concern is unwarranted here. Again, unless you're using like a 286 processor with 64MB of RAM, your query should run very quickly. The query I referred to above was a very expensive multi-table query run on heavy production data. Your query involves three fields in one table. If the table is sufficiently indexed and you have a decent server, it should run almost instantaneously.
     

Share This Page