Previous Topic: Exercise 3-5Next Topic: Exercise 3-6


Eliminating Duplicate Rows

Sometimes a row in a selected column contains information that is the same as information in another row.

Why Duplicates Occur

The EXPERTISE table contains skill IDs, the IDs of employees who have the skills, the level of ability an employee has in a skill, and the date the ability was acquired. Commonwealth Auto may have several employees who match a particular skill in the SKILL table and may have no employees who match another skill.

How It's Done

To obtain a list of skill IDs associated with at least one employee, enter:

select skill_id
       from expertise;

This gives a list of skill IDs that have been matched to employees who have that skill. Any skill that has no employees associated with it will not show up in the result.

The result looks like this:

SKILL_ID --------     1000     6470     1000     6770     6770     7000     3065     3333     6770     4430     7000     5309     1000     6670     6470     3333     4444     7000     4250     4370     5180     1030     4490     5200     6666     5420

    5430     1000     5500     5309     5180     1000     4430     3333     6650     6670     6770     6770     5309     5500     6650     5200     7000     7000     7000     5309     5200     6666     4370     4410     7000     7000     4370     4410     4420     7000     1000     4430     5500     3065     6670     7000     4250     5130     5309     5130     6770     7000     5200  69 rows processed

Why are some of the skill IDs repeated?

The result shows the skill ID for each employee. If more than one employee has that particular skill, the skill ID is repeated.

To see this more clearly, look at the skills and the associated employees by entering:

select skill_id, emp_id
       from expertise;

The result looks like this:

SKILL_ID EMP_ID --------  ------     1000    1003     6470    1034     1000    1234     6770    1765     6770    2004     7000    2010     3065    2096     3333    2096     6770    2106     4430    2174     7000    2180     5309    2209     1000    2246     6670    2246     6470    2424     3333    2437     4444    2437     7000    2448     4250    2461     4370    2461     5180    2461     1030    2466     4490    2466     5200    2466     6666    2598     5420    2781     5430    2781     1000    2894     5500    2894     5309    3082     5180    3118     1000    3222     4430    3222     3333    3288     6650    3288     6670    3288     6770    3294     6770    3338     5309    3341     5500    3411     6650    3433     5200    3449     7000    3704     7000    3764     7000    3767     5309    3769     5200    3778     6666    3778     4370    3841     4410    3841     7000    3991     7000    4001     4370    4002     4410    4002     4420    4008

    7000    4027     1000    4321     4430    4321     5500    4358     3065    4456     6670    4456     7000    4660     4250    4703     5130    4703     5309    4773     5130    4962     6770    5008     7000    5090     5200    5103  69 rows processed

You want to eliminate the duplicate rows resulting from your first SELECT statement in order to see each skill ID only once.

To eliminate these rows, you can use the DISTINCT option immediately after the word SELECT.

How It's Done with DISTINCT

Using the first SELECT statement, add DISTINCT after SELECT:

select distinct skill_id
       from expertise;

The result looks like this:

 SKILL_ID  --------      1000      1030      3065      3333      4250      4370      4410      4420      4430      4444      4490      5130      5180      5200      5309      5420      5430      5500      6470      6650      6666      6670      6770      7000  24 rows processed

Now the result shows a list of skill IDs with no duplicates.

Since using DISTINCT eliminates duplicate rows, fewer rows are returned.