What is SASTRACE, and how can we use it?

I’m using SASTRACE, as you’ve already seen. SASTRACELOC puts the notes in the log, and this option will make the notes more readable. And here’s my PROC SQL: the first SELECT takes all the columns and filters with the LIKE operator, and then in the second SELECT, I use the DROP= option. And then I turn the trace off so it doesn’t continue to check and write notes to the log. All right, we get two reports. Here’s the first, with the star. Prepared. You can see the SELECT expanded with all the columns, even the supplier ones, such as the supplier country, those that begin with the word supplier. And check it out. Even the WHERE on product name with LIKE went through; we already verified that LIKE would be handled by the database. And then in the next one, I’ll choose all of the columns but DROP the supplier ones.
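A minimal sketch of what that demo might look like. The transcript does not show the actual code, so the libref `orion`, the connection values, the table name `product_dim`, the filter value, and the `supplier:` prefix list are all assumptions for illustration; NOSTSUFFIX is assumed to be the option that makes the notes more readable.

```sas
/* Hypothetical connection details: replace path, user, and password. */
libname orion oracle path=orapath user=myuser password=mypass;

/* Write trace notes for the SQL sent to the DBMS into the SAS log.   */
/* NOSTSUFFIX (assumed here) shortens each trace line.                */
options sastrace=',,,d' sastraceloc=saslog nostsuffix;

proc sql;
   /* First query: all columns, filtered with LIKE. */
   select *
      from orion.product_dim
      where product_name like '%Racket%';   /* hypothetical filter value */

   /* Second query: same filter, but DROP= removes every column whose */
   /* name begins with "supplier" from the generated SELECT list.     */
   select *
      from orion.product_dim (drop=supplier:)
      where product_name like '%Racket%';
quit;

/* Turn tracing off so later steps do not add trace notes to the log. */
options sastrace=off;
```

With tracing on, the log should show the prepared SELECT statement for each query, which is how you can confirm which parts the database handled.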


As you look at this prepared SELECT, it does not include anything supplier. So both work. Both were handled by the database, which is good. That’s what we want. So I’m going to switch back to my slides. What portion of the SQL query submitted to the database was modified by the DROP= option? And I think you saw that. It wasn’t WHERE. It wasn’t FROM. It was the SELECT clause, where you choose which columns you want. That’s the answer. All right, so there are data set options like KEEP= and DROP=. Well, SAS/ACCESS data set options also exist for the Oracle database, like DBCONDITION=, READBUFF=, and ORHINTS=. DBCONDITION= allows you to specify an SQL clause that’s processed by the database.


READBUFF= controls how many rows to read into the buffer. That can potentially decrease network activity if you are reading in bigger chunks of data at a time. And remember, when you improve one resource, it typically comes at the expense of a different one. So if you think about it, if you’re pulling in more data, you might be taking up more memory. And then there is ORHINTS=. That influences the decision made by the Oracle optimizer, whether to have the database process part of the query or all of the query. And these can improve efficiency. Let’s go back to the WEEKDAY story, with DATEPART. And here are the notes. It could not be passed through to the database. We saw this earlier. Even though SAS cannot translate WEEKDAY, there is some DBMS-specific SQL, if you know it and understand it, that you can specify to improve the processing.


All right, so we’ll use some data set options to improve the performance of a PROC MEANS on the SAS platform and then replicate the PROC MEANS for the database to do that task, explicitly passing it through to the database. Let’s try these options. DBCONDITION= allows you to pass a database-specific, in this case Oracle-specific, WHERE clause for the filter. All right, so give it a shot. Oh, also notice you put it in quotation marks. So here it is. Next to the Order_Fact Oracle table, I’ve put DBCONDITION=, and in quotation marks is basically what I want: where TO_CHAR of Order_Date, with the D format, equals 7. Basically, I’m looking for day of week 7, which is Saturday. Well, this is specific to the Oracle database syntax, not the SAS syntax. So if you understand this, you can type it yourself. If you know the ANSI standard and Oracle’s extra features on top of it, if you’re familiar with them, go for it. That’s what DBCONDITION= will allow you to do.
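A sketch of how DBCONDITION= might be attached to the Order_Fact table in a PROC MEANS step. The libref `orion`, the connection values, and the analysis column `total_retail_price` are assumptions, and the exact quoting used in the demo is not shown in the transcript.

```sas
/* Hypothetical connection details. */
libname orion oracle path=orapath user=myuser password=mypass;

options sastrace=',,,d' sastraceloc=saslog nostsuffix;

/* The quoted text is Oracle SQL, not SAS syntax: SAS passes it to    */
/* the database as written, so Oracle filters the rows before they    */
/* travel to SAS.                                                     */
proc means data=orion.order_fact
      (dbcondition="where to_char(order_date, 'D') = 7");
   var total_retail_price;   /* hypothetical analysis column */
run;

options sastrace=off;
```

Which weekday the `D` format maps to a given number depends on the Oracle session's territory setting, so the value 7 for Saturday is worth checking on your own database.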


It says Oracle prepared: SELECT these columns WHERE TO_CHAR of Order_Date, with the D format, equals 1. That minimizes the movement of the data from the database table down to SAS. Now READBUFF=. As I mentioned earlier, you can use this option to control the amount of data coming in. How many rows are in the buffer? You can increase it or decrease it. It might result in more memory usage. Here it is. Perhaps you could speed things up by increasing that READBUFF= number? This particular change made almost no difference, and I’ve put in READBUFF=1000. If I made it higher, it might make a difference. With it. Without it. Look at the statistics. The clock time includes any waiting period. The CPU time is the time needed to actually perform the task. They’re very close, and so is how much memory was used. There’s not a big distinction in terms of performance between them. But sometimes minimizing the I/O can improve performance.
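A sketch of the with-and-without comparison, assuming the same hypothetical libref and analysis column as before. FULLSTIMER is assumed here as the way to get real time, CPU time, and memory into the log; the transcript does not say which option the demo used.

```sas
/* Hypothetical connection details. */
libname orion oracle path=orapath user=myuser password=mypass;

/* Report real time, CPU time, and memory for each step in the log. */
options fullstimer;

/* Without READBUFF=: the default buffer size is used. */
proc means data=orion.order_fact;
   var total_retail_price;   /* hypothetical analysis column */
run;

/* With READBUFF=: fetch 1,000 rows per read from Oracle. */
proc means data=orion.order_fact (readbuff=1000);
   var total_retail_price;
run;
```

Comparing the two sets of log statistics shows whether the larger buffer bought anything; larger values trade memory for fewer round trips.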


Let’s look at this data set option, ORHINTS=, and the Oracle hint text. This can provide hints to the Oracle optimizer so it makes a better decision and speeds up the processing. So let’s try this option. We put ORHINTS=, and there’s the DBCONDITION=. The hints ask for highly parallel, full table scans to retrieve the data. All right. So let’s look at the speed boost. Here it is without the hints, and look at your real time, 0.48. User CPU time. System CPU time. And look at the drop with the hints. All right? So it pays to know the specifics of that database. If you are comfortable, and you’ve been writing SQL for Oracle or Teradata, and you know that syntax, go for it. You can use it. And this option can improve the likelihood of the database performing your task. So what are the advantages and disadvantages of implicit SQL pass-through? Well, here a decision is made between SAS and Oracle as to who will handle it. First, it provides transparent access to the database tables.
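A sketch of combining ORHINTS= with DBCONDITION= on the same table. The hint text below is an assumption: the transcript only describes the hints as parallel, full table scans and does not show the exact text. The libref, connection values, and analysis column are hypothetical as well.

```sas
/* Hypothetical connection details. */
libname orion oracle path=orapath user=myuser password=mypass;

options fullstimer sastrace=',,,d' sastraceloc=saslog nostsuffix;

/* ORHINTS= places the quoted Oracle hint right after SELECT in the   */
/* SQL that SAS generates. The hint shown is illustrative only.       */
proc means data=orion.order_fact
      (orhints='/*+ FULL(ORDER_FACT) PARALLEL(ORDER_FACT) */'
       dbcondition="where to_char(order_date, 'D') = 7");
   var total_retail_price;   /* hypothetical analysis column */
run;

options sastrace=off;
```

Oracle treats hints as suggestions and silently ignores ones it cannot apply, so the trace output and timing statistics are the way to check whether a hint had any effect.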


The DATA and PROC step syntax looks as you’ve always used it. Knowing the database SQL, in this case Oracle SQL, is not necessary, because SAS will try to translate things for you. Some SAS functions like MIN and MAX will automatically translate to the database. Remember that we had YEAR, and we had WEEKDAY. YEAR worked. The YEAR function was translated, but not WEEKDAY. And remember SCAN, the SCAN issue. All right? What are some of the disadvantages? If you reference the Oracle tables in the DATA step when you’re using MERGE, the work will not be passed to the Oracle database as a JOIN. The subsetting IF will always be done by SAS. There are certain options, if you’re familiar with them, that you can try to use to improve the likelihood of the database doing the processing, especially when you’re pulling down a lot of data from large database tables. So the idea is, if you can, get the database to do some of that processing on its side. That will speed things up. All right, 3.3. We go from implicit to explicit with PROC SQL and PROC FEDSQL.
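A small sketch of the subsetting IF point, contrasted with a WHERE statement, which SAS can hand to the database when the expression is translatable. The libref, connection values, and the `quantity` column are assumptions for illustration.

```sas
/* Hypothetical connection details. */
libname orion oracle path=orapath user=myuser password=mypass;

options sastrace=',,,d' sastraceloc=saslog nostsuffix;

/* WHERE: SAS can include this filter in the SQL it sends to Oracle,  */
/* so only matching rows are transferred.                             */
data work.big_orders_where;
   set orion.order_fact;
   where quantity > 2;       /* hypothetical column */
run;

/* Subsetting IF: every row is read into SAS first, and SAS then      */
/* discards the rows that fail the test.                              */
data work.big_orders_if;
   set orion.order_fact;
   if quantity > 2;
run;

options sastrace=off;
```

Both steps produce the same rows; the trace notes in the log show whether the filter appears in the SQL sent to Oracle.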


Here are the objectives. We’ll take a look at PROC SQL and PROC FEDSQL. We’ll look at some of the advantages and challenges of both. We’ll also do explicit pass-through to the database with both and then do a review of both at the end. So what is PROC SQL? SQL is Structured Query Language. It’s been around a long time, going back to the ’70s. That’s when it was developed to go against relational databases, so there’s history associated with it here. The latest version is based on ANSI SQL2, the 1992 standard. And here are some standards most vendors follow, like using a SELECT, using a FROM, using a WHERE. You can use it against any SAS data set. It includes many SAS extensions, like data set options and SAS functions you can use with PROC SQL. You can use case logic like TRUE or FALSE. It can process DOUBLE and CHAR data types. And you can do explicit pass-through. Check out the syntax.
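The slide itself is not part of the transcript, so here is a sketch of the general shape of explicit pass-through in PROC SQL. The connection values and the `order_id` column are assumptions; the inner query is reused from the earlier Order_Fact example.

```sas
proc sql;
   /* Open a connection to Oracle (hypothetical connection details). */
   connect to oracle (path=orapath user=myuser password=mypass);

   /* Everything inside the inner parentheses is Oracle SQL. SAS      */
   /* sends it to the database as written and reads back the result.  */
   select *
      from connection to oracle
         (select order_id, order_date
             from order_fact
             where to_char(order_date, 'D') = 7);

   /* Close the connection when finished. */
   disconnect from oracle;
quit;
```

Because the inner query is never translated by SAS, Oracle-only functions such as TO_CHAR can be used directly.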
