Looking for a fast way to compare two large databases

Discussion in 'OT Technology' started by Seeders, Jul 23, 2008.

  1. Seeders

    Seeders OT Supporter

    Joined:
    Jul 10, 2007
    Messages:
    3,750
    Likes Received:
    0
    i have two mysql databases that are supposed to be relatively synched. One is the student database for my school (about 28,000 records), and the other is a Moodle database (about 33,000) which should contain a user account for each student.

    The two databases share a few fields, the primary key being the student's idnumber. The other important fields are firstname, lastname, and email.

    Im trying to write a php program to report the differences between the two databases. For example, the moodle database has some instructors and other users who are not in the Student database. Sometimes a student will change their name or email address, etc etc.

    What methods would you use to do this semi-efficiently?

    I've thought of sorting the records in arrays by their idnumber (6 digit unique integer) so that i dont have to do a nested mysql select statement like

    for each idnumber in student_ids
    select firstname, lastname from moodle_table where id = idnumber

    because 28000 select statements takes awhile. However, then i would still be searching through two arrays that dont quite match and that would take awhile as well.
     
    Last edited: Jul 24, 2008
  2. EkriirkE

    EkriirkE Zika Xenu OT Supporter

    Joined:
    Jan 11, 2004
    Messages:
    14,799
    Likes Received:
    0
    Location:
    Dublin & San Francisco, CA
    are the 2 db on the same sql server instance?
     
  3. Ameter

    Ameter Active Member

    Joined:
    Mar 31, 2005
    Messages:
    97,795
    Likes Received:
    1
    Location:
    Calgary, AB
    use a hash table
     
  4. Peyomp

    Peyomp New Member

    Joined:
    Jan 11, 2002
    Messages:
    14,017
    Likes Received:
    0
    28000 and 33000 are not large databases. The problem is you are doing individual selects. Why don't you just grab all the data you need from each in ONE select on each database, and then sort/compare? With 60K rows you shouldn't run out of memory doing that in PHP.

    SELECT firstname, lastname, id FROM moddle_table ORDER by id;

    Do that twice, and fill a hash or an array of hashes or an array of arrays: whatever. Then do your sort/compare with a big ass grep for each value and note the differences.
     
  5. critter783

    critter783 OT Supporter

    Joined:
    Jul 15, 2005
    Messages:
    1,785
    Likes Received:
    0
    Why not just do it all with an SQL statement?
     
  6. Peyomp

    Peyomp New Member

    Joined:
    Jan 11, 2002
    Messages:
    14,017
    Likes Received:
    0
    I kinda figured these were independent, and not on one database server?
     

Share This Page