I’m not sure about everybody else, but I definitely feel that knowing a few human languages, knowing a few computer languages, certainly makes remembering things a little more challenging. So here is a mnemonic to remember the order of statements. So few workers go home on time. If you are a brand-new SQL user, this might be handy. If you have already been using SQL, then I’d appreciate hearing from you with a comment down below on the mnemonic that you have used to make your SQL remembering super easy. Here is my library, already assigned. It’s pointing to a location on my C drive. I have three datasets.
Again, as a reminder, our datasets the Ingredients dataset, the Chef dataset, and the spices’ dataset. So to take a look, very beginning, very fundamental, at the rows and columns in the Ingredients dataset, we’ll provide two statements. We’ll invoke PROC SQL, and then a semicolon. And then, we’ll submit the SELECT statement and FROM statement. On this SELECT statement, we are going to add asterisks, which is a wild card for saying, give me all the columns on the Ingredients dataset. In the front clause, we are going to give the name of the table, which is the Ingredients table in their dinner library. Above the PROC SQL statement, I have a TITLE statement. This is global, so I can stick it anywhere in my code. And because it actually corresponds to the SQL query right below it, it makes sense to stick the TITLE just before my SQL step. So you’ll notice that the SELECT is like one big paragraph that ends in a semicolon. And this is SQL.
So SQL says as soon as I see the semicolon, I’m going to instantly execute the query. So this might also make sense why we did not introduce a TITLE statement below the PROC SQL step. Because once a query is over and done, it’s already executed and it’s printed the results in the Results window. There is no point in having a TITLE statement after that because this query is finished. So that’s the position of the TITLE statement. The QUIT tells SQL I’d like you to leave a memory. And if you didn’t leave the QUIT, then what you could do is execute some more subsequent queries. But we keep it basic and simple. We’ll go ahead and execute this query so that we can see the rows and columns in the Ingredients’ data set.
A full look at the Ingredients dataset. Eight rows and all the columns, plus their values. So if you had a lot of data, it would be really difficult to scroll through. If you had, for example, a million rows, it would be virtually impossible for you to know how many rows there are. So there’s a couple of options that can help me. I’ll head back to the code and show you this really clever macro option, with a %put statement. I’m telling SAS, please go ahead and give me the value of this macro variable, slobs. So slobs are basically saying give me the number of rows that this query returned. So I’ll submit this. And chances are when I look at my log, I do see the number eight. But it’s kind of hidden among the column numberings, so it’s not super clear to me.
What I’d like to do instead, is perhaps add a bit of text and ask SQL to give me the number of rows this way. So there’s a piece of text, and then after that, followed by the macro variable. Now, this makes it so much easier to check out in the long. What we saw is an asterisk, which was, basically, give me all the columns in this table. And if you look at the columns in the table we later read on the query, we have some interesting columns here. We have the Name of the dish. Then, I’m also curious about the OilInTbsp column, as well as the spice quotient.
If I want just those three columns on the SELECT, I’m just going to list those three columns separated by a comma. And I want to list these columns in the order in which I want them to print in the Results window. This now is going to give me the same number of– heading back to the code, once again, verifying that I’m going to list the columns in the order in which I wish to see them. And indeed, I got exactly what I wanted. Now, that’s the beginning. The introduction of two statements. The SELECT and the FROM. Two mandatory statements.
Now, let’s move a little further into our code, and let’s see if we can subset the data. For example, we are working from home and we might have started becoming more conscious of heart-healthy food choices. So how can we subset the data? Here is the WHERE clause. It is one of the most powerful clauses in the language of SQL. In fact, it also makes its appearance in the SAS data step and other SAS procedures as the WHERE statement. Essentially, it operates in a very similar manner. It needs something that you’re asking for– the column that I’m requesting. And then, on the right side, it needs a value. And in between, it needs some kind of comparison operator.
I’m curious to see which of my dinners are low fat. And so, that’s why I have the WHERE clause. Is the order of statements important? Yes. The order syntax is important to SQL. And did I say SQL? Sometimes you’ll hear this interchangeably. SQL, SQL, and they both are the same. So we have a select asterisk again. On the FROM clause, I put my Ingredients table. On the WHERE clause, I’m asking for only those rows that need a certain condition. My condition could be anything. The thing to remember about the WHERE clause is that it needs you to act on existing columns. And what does that mean? An existing column is a column that is appearing from a table on the FROM clause. A calculated column would be something that you built in the query. And there are some workarounds around it.
But for this presentation, suffice to say the WHERE clause acts on existing columns. OilInTbsp is an existing column. The other thing to remember about the WHERE clause is like has to match. If the column on the left side is numeric in nature, any comparison data value, which is 3, needs to also be a number. However, if he had queried a character column, a comparison on the right side needs to be for character data. Very simple. Like has to match like. That is rule number two with the WHERE. And the first rule was I need to subset on existing columns.
I’m going to submit this code and I already have a %put. I called it a number, but we can go back and call it rose. We can call it whatever our heart desires. It’s a text value. So how many rows am I going to get for dinners which are low fat? We let the WHERE clause examine this. It tells me two rows of data. So these are the two dishes that have low fat, relatively low fat compared to the rest of my dishes.