Wednesday, November 23, 2011

SIMPLE QUERIES IN SQL AND.....

SIMPLE BUT POWERFUL QUERIES IN SQL AND INTERESTING GLOBAL GEOGRAPHY QUIZ



Let’s assume we have a database with the following basic characteristics:


Where region is the name of the table and name, region, area, population, and gdp are the variables or columns where data is stored within the table. This is a table with a relatively low number of observations because there are only less than 200 countries in the world. However, it serves as a great case study to practice SQL queries. 

I am going to be showing 3 different queries to retrieve different groups or points of data. While this is nothing more than a mental exercise (and a fun geography quiz), the same logic for retrieving data can be applied to solve important business needs. 
 

1.       List each country name where the population is larger than 'Russia'

SELECT name FROM bbc WHERE population>
(SELECT population FROM bbc WHERE name='Russia')
What the query in red is telling SQL is that you want to select the values of the variable “name” from the table “bbc” where the population is greater than. The second query (with a yellow background) is telling SQL that to identify the comparison group (the country of Russia in this case) and that it needs to look in the “population” variable in the table “bbc” and identify the country with the “name” =’Russia’. Notice that we use single quotation marks  ‘…’ for the word ‘Russia’ because it is the specific word within that column of data (“name”) that we want SQL to identify.
Curious to know what the final answer is?

2.       Show the countries in Europe with a per capita GDP greater than 'United Kingdom'.
SELECT name FROM bbc WHERE  gdp/population>
(SELECT gdp/population FROM bbc  WHERE name ='United Kingdom')
AND region='Europe'

We observe a similar pattern in the first query (the on in red). It wants the names of countries from the table bbc where gdp/capita is greater than. In this case SQL calculates the gpd/capita since it is not one of the original variables in this table. It simply divides the variable gdp over the population variable.  Then the second query (yellow background) selects the same calculated variable (gdp/capita) from the table bbc for countries with the name ‘United Kingdom’. We use the quotation marks again to tell SQL to look for that particular set of characters. There is an additional query here, which is telling to select the conditions in the yellow query PLUS (that is why we include the command AND ) countries that are located in the “region” =’Europe’.
Curious to know what the final answer is?

3.       Which country has a population that is more than Canada but less than Algeria?

SELECT name FROM  bbc WHERE population >
(SELECT population FROM bbc WHERE name='Canada')
AND population < (SELECT population FROM bbc WHERE name='Algeria')
My exposure to the world of command writing whether it is an Stata, Access, Eviews, Excel, or SQL has shown me that is all about patterns. The 3 queries in this blog entry have been increasing in complexity (the 3 of them are still considered to be at the ‘novice’ level)  but we observe that the basic query in red letters follows the same pattern. So the query in red is telling SQL to look for the name of a country in the bbc table where the population is greater than. The second query in yellow is telling SQL to population value in the bbc table for an entry with the name =’Canada’. The interesting addition here is the AND command and the less than (<) sign. The green query is very similar to the yellow query but is the necessary item we need to solve our geography challenge! 
Curious to know what the final answer is?