Wednesday, February 4, 2015

What SQL Statements are Currently Running in my Oracle Database

We are oftentimes faced with the situation where we need to know what statements are running right now in our Oracle database.  Maybe we are having some sort of issue where applications are not responding, or responding very slowly, and we need some information fast about what is happening in the system.  One of the first things I always look at is what is happening in the database.  This may just be from experience, but on (too) many occasions, I've found the issue to be a long-running, inefficient query that is consuming way too many resources.  In any case, one of the things I always want to do is narrow down which tier of the system the problem might be in.  By scanning through the results of this query, I can quickly determine whether the problem is in Oracle or whether I need to look somewhere else.

So here is the query:

 SELECT  
     s.sid, s.username, s.osuser,   
     s.machine, s.process, s.program, s.module,   
     q.sql_text, q.optimizer_cost,   
     s.blocking_session, bs.username as blocking_user,   
     bs.machine as blocking_machine, bs.module as blocking_module,  
     bq.sql_text AS blocking_sql, s.event AS wait_event,  
     q.sql_fulltext  
   FROM v$session s  
   INNER JOIN v$sql q  
     ON s.sql_id = q.sql_id  
   LEFT OUTER JOIN v$session bs -- blocking sessions  
     ON s.blocking_session = bs.sid  
   LEFT OUTER JOIN v$sql bq -- blocking queries  
     ON bs.sql_id = bq.sql_id  
   WHERE s.type = 'USER';  


What do I look for when I run this query?  First, statements with a high optimizer cost.  The cost is usually high because the statement is expensive, so I check those out first.  I also look to see if there is a blocking session for a statement.  I have seen cases where someone created a lock and then everyone piled up behind them, bringing an entire environment to a halt (fortunately this was in DEV, but still).  And finally, I'll run this statement a couple of times and see if any statements are still there on the second or even third run of the query.  Most statements (in an OLTP environment, anyway) execute in sub-second time, so if a statement is sticking around for 10, 15 or even 30 seconds, I know it is trouble.
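If you want to zero in on just the long runners, one variation I find handy is to filter on the LAST_CALL_ET column of V$SESSION, which for an active session is roughly the number of seconds its current call has been running.  A minimal sketch of that variation (the 10-second threshold is just an example):

 SELECT s.sid, s.username, s.last_call_et AS seconds_active, q.sql_text
   FROM v$session s
   INNER JOIN v$sql q
     ON s.sql_id = q.sql_id
  WHERE s.type = 'USER'
    AND s.status = 'ACTIVE'
    AND s.last_call_et > 10;  -- active for more than 10 seconds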

What I also get from the statement is enough information to find out who is running it.  Session id, Oracle user, OS user and machine are all there.  I've seen cases where someone ran a statement in production but forgot a join condition or didn't realize they weren't using an index.  From the data in the above statement, I am usually able to track down who it is so I can give them a call and get them to cancel the statement.
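If you can't reach the person and the statement truly has to go, a DBA can terminate the session.  To do this you also need the session's SERIAL#, which you could add to the select list of the query above.  A minimal sketch (the SID and SERIAL# values below are placeholders):

 -- Look up the serial# that goes with the SID
 SELECT sid, serial# FROM v$session WHERE sid = 123;

 -- Then, as a privileged user, terminate the session
 ALTER SYSTEM KILL SESSION '123,4567';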

I usually keep a cheat sheet around of about 10 statements like this that I commonly use to diagnose what is going on with an Oracle database, and this is always one of the most frequently used statements in that list.  So put this in your own cheat sheet or bookmark this page so that the next time someone asks "Does anyone know what is going on in the Oracle database right now?" you will be prepared to find the answers.


Monday, January 26, 2015

Archiving Rows in Oracle Using a Trigger

Oftentimes, we are faced with the requirement that any time a row is updated or deleted in a table, we want to store the previous version of the row in an archive table.  In some cases this is a requirement for audit purposes.  Even when it is not required for audit, it is still quite nice to be able to go back and see what a row looked like on a certain date and to know when the value last changed.

For example, several places I have worked have kept application settings in a table in the database.  And I've run into situations where a value was updated and all of a sudden things started to go wrong.  Those were the times I wished we had some sort of archiving mechanism so we could easily tell what changed, when, and what the previous values were.

This is something that can be easily accomplished with a trigger.  And the nice thing about the trigger solution is that it happens transparently in the background.  It doesn't matter whether a row is updated by application code or by an ad-hoc query in a SQL editor; the prior version of the row will get archived.  So let's see how this works.

First, consider that we have the following data table.

CREATE TABLE students
(
    student_id      NUMBER(10)      GENERATED ALWAYS AS IDENTITY NOT NULL,
    first_name      VARCHAR2(30)    NOT NULL,
    last_name       VARCHAR2(30)    NOT NULL,
    email           VARCHAR2(50)    NULL,
    phone           VARCHAR2(20)    NULL,
    create_date     DATE            NOT NULL,
    modified_date   DATE            NOT NULL,
    CONSTRAINT pk_students PRIMARY KEY (student_id)
);

Then, we have this as an archive table.  Note that I am not enforcing a foreign key from the STUDENT_ID column of STUDENTS_ARCHIVE back to the STUDENTS table.  If we deleted a row from the STUDENTS table, there would be no parent key left to reference, so a foreign key would not work here.

CREATE TABLE students_archive
(
    archive_id          NUMBER(10)      GENERATED ALWAYS AS IDENTITY NOT NULL,
    student_id          NUMBER(10)      NOT NULL,
    first_name          VARCHAR2(30)    NOT NULL,
    last_name           VARCHAR2(30)    NOT NULL,
    email               VARCHAR2(50)    NULL,
    phone               VARCHAR2(20)    NULL,
    create_date         DATE            NOT NULL,
    active_start_time   DATE            NOT NULL,
    active_end_time     DATE            NOT NULL,
    operation_type      VARCHAR2(1)     NOT NULL,
    CONSTRAINT pk_students_archive PRIMARY KEY (archive_id)
);

And now for the trigger.  It's quite simple, really.

CREATE OR REPLACE TRIGGER tr_students_archive_row
    AFTER UPDATE OR DELETE
    ON students
    FOR EACH ROW
DECLARE
    op_type  VARCHAR2(1);
BEGIN
    CASE
        WHEN UPDATING THEN
            op_type := 'U';
        WHEN DELETING THEN
            op_type := 'D';
    END CASE;

    INSERT INTO students_archive (student_id, first_name, last_name,
        email, phone, create_date,
        active_start_time, active_end_time, operation_type)
    VALUES (:old.student_id, :old.first_name, :old.last_name,
        :old.email, :old.phone, :old.create_date,
        :old.modified_date, sysdate, op_type);
END;

The trigger fires AFTER each row affected by an UPDATE or DELETE statement.  I fire it for each row because, of course, I want to archive every row affected, and I fire it AFTER because I want to make sure the UPDATE or DELETE was actually successful.

Then, I use a CASE statement to put a code into a variable indicating whether an UPDATE or a DELETE operation is being performed.

Finally, I insert the old data (via the :OLD pseudorecord) into the archive table.  The time this version of the record started being active is the MODIFIED_DATE on the old version of the record, and the time it stopped being active is right now.  In this case, I am using DATE values, but you could just as well use TIMESTAMP values for more accuracy.

And that is it.  Now, whenever a record is changed or deleted in our data table, we'll have a history of it in our archive table.  It is a very clean solution to the problem.
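To see it in action, here is a quick test with some hypothetical sample data:

 INSERT INTO students (first_name, last_name, email, create_date, modified_date)
 VALUES ('Jane', 'Doe', 'jane.doe@example.com', sysdate, sysdate);

 UPDATE students
    SET email = 'jdoe@example.com', modified_date = sysdate
  WHERE last_name = 'Doe';

 -- The pre-update version of the row is now in the archive
 SELECT student_id, email, active_start_time, active_end_time, operation_type
   FROM students_archive;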

If the updates and deletes against your table are mostly of the single row variety, then this trigger will work great for you.  If you are doing a lot of bulk updates against the data in your table, then a compound trigger will likely offer better performance, which I'll cover in a future blog post.

SQL Server DMV to get Stored Procedure Usage Stats

If you are working in a SQL Server environment that uses stored procedures for data access, then it is very useful to get some execution stats around those stored procedures.  What procedures are executing most often?  How much CPU and logical IO has a procedure used?  What is the average time it takes for a procedure to execute?

The following query uses the dm_exec_procedure_stats DMV in SQL Server to do just that.


SELECT sc.name AS schema_name, p.name AS procedure_name, st.cached_time,
    st.execution_count, st.total_elapsed_time, st.total_worker_time, st.total_logical_reads,
    st.total_elapsed_time / st.execution_count AS avg_elapsed_time,
    st.total_worker_time / st.execution_count AS avg_worker_time,
    st.total_logical_reads / st.execution_count AS avg_logical_reads
FROM sys.procedures AS p
INNER JOIN sys.schemas AS sc
    ON p.[schema_id] = sc.[schema_id]
LEFT OUTER JOIN sys.dm_exec_procedure_stats AS st
    ON p.[object_id] = st.[object_id]
ORDER BY st.execution_count DESC;


Now, it is possible to go down to the query level by looking at the dm_exec_query_stats DMV, and that is also useful.  But sometimes it's nice to look at things from the procedure level before diving down to the individual statement level, and that is what this query offers.
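If you do want to drop down to the statement level, here is a minimal sketch using dm_exec_query_stats along with the dm_exec_sql_text function to recover the statement text:

SELECT TOP 25
    st.text AS statement_text,
    qs.execution_count,
    qs.total_worker_time / qs.execution_count AS avg_worker_time,
    qs.total_logical_reads / qs.execution_count AS avg_logical_reads
FROM sys.dm_exec_query_stats AS qs
CROSS APPLY sys.dm_exec_sql_text(qs.sql_handle) AS st
ORDER BY qs.execution_count DESC;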

It is important to note that this view only holds items that are currently in the SQL cache, so it is not a cumulative count since SQL Server was last restarted or anything like that.  It is just what is in the cache, which is usually quite a bit.  If you are trying to see whether a procedure is no longer used, though, you might want to run this query multiple times at regular intervals over the course of a few days, to make sure the procedure has not simply aged out of the cache before any single run of the query.

Updating Administrative Columns Using a Trigger in Oracle

Consider a table like the one below:

CREATE TABLE students
(
    student_id      NUMBER(10)      GENERATED ALWAYS AS IDENTITY NOT NULL,
    first_name      VARCHAR2(30)    NOT NULL,
    last_name       VARCHAR2(30)    NOT NULL,
    email           VARCHAR2(50)    NULL,
    phone           VARCHAR2(20)    NULL,
    create_date     DATE            NOT NULL,
    modified_date   DATE            NOT NULL,
    CONSTRAINT pk_students PRIMARY KEY (student_id)
);

Specifically, we have two columns in this table, CREATE_DATE and MODIFIED_DATE, that I call administrative columns.  These are here not because they have anything to do with the business domain data for the record (in this case a student), but purely for tracking purposes.  It's useful to know when a row was created and last modified.  In fact, if you are using optimistic locking for your updates, it is imperative to know when a row was last modified, but that is a topic for a different day.

The problem we face is that we want to keep these columns accurate and in sync.  If rows are only created and updated by our application, no problem.  We can make sure that our data access code or stored procedures appropriately set these columns for each use case and we are gold.

But here is the catch.  We can't always guarantee that someone will insert or update rows from our application.

Maybe someone writes another application that needs to modify data in our database.  Yes, we would like for it to use the same data access libraries, but in the real world, we know this is not always the case.

And what about the use case of someone modifying data via their SQL editor?  Sure, in production, things should be locked down such that this doesn't happen often, but inevitably it will need to happen at some point.  Some rows will get messed up and someone will need to go in and fix them.

And so our challenge is this: how do we make sure these columns are always correct and always updated, no matter how a row is inserted or updated?

This is a place where a trigger can really help.

Consider the following:

CREATE OR REPLACE TRIGGER tr_students_create_mod_date
    BEFORE INSERT OR UPDATE
    ON students
    FOR EACH ROW
BEGIN
    CASE
        WHEN INSERTING THEN
            :new.create_date := sysdate;
            :new.modified_date := sysdate;
        WHEN UPDATING THEN
            :new.create_date := :old.create_date;
            :new.modified_date := sysdate;
    END CASE;
END;

This is a very simple trigger.  When a row is being inserted, it sets the value of CREATE_DATE and MODIFIED_DATE to the current server date (sysdate).  When a row is being updated, MODIFIED_DATE is set to sysdate, but CREATE_DATE is set back to the original create date.  The effect is that even if someone tries to change the CREATE_DATE for a row after the row is inserted, it won't matter.  We'll always preserve that original CREATE_DATE.

And so this is one of the little niceties that make our life easier.  We don't have to include these values in our statements and we can be assured that they will always be correct, no matter how a row is inserted or updated in our database.
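Here is a quick, hypothetical demonstration (the sample values are mine).  Note that the INSERT doesn't mention the administrative columns at all, and the UPDATE tries, and fails, to tamper with CREATE_DATE:

INSERT INTO students (first_name, last_name, email)
VALUES ('Jane', 'Doe', 'jane.doe@example.com');

-- The trigger silently overrides the bogus create_date below
UPDATE students
   SET email = 'jdoe@example.com',
       create_date = DATE '1900-01-01'
 WHERE last_name = 'Doe';

-- create_date still holds the original insert time
SELECT create_date, modified_date FROM students WHERE last_name = 'Doe';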

An Oracle Trigger for an Auto-Increment Column

In Oracle 12c, you can now define columns as auto-increment (identity) columns, similar to what you can do in SQL Server and other databases.  Under the covers, Oracle creates a sequence in the background and wires it up to the column in your table.

Here is an example of an auto-incrementing column in Oracle 12c:

CREATE TABLE students
(
    student_id     NUMBER(10)      GENERATED ALWAYS AS IDENTITY NOT NULL,
    first_name     VARCHAR2(30)    NOT NULL,
    last_name      VARCHAR2(30)    NOT NULL,
    email          VARCHAR2(50)    NULL,
    phone          VARCHAR2(20)    NULL,
    create_date    DATE            NOT NULL,
    modified_date  DATE            NOT NULL,
    CONSTRAINT pk_students PRIMARY KEY (student_id)
);

However, most databases around today were designed before 12c was available.  No worries though: you can use the combination of a trigger and a sequence to synthesize this behavior in Oracle.  In fact, people have been doing this for years, at least as long as I have been working with Oracle, which goes all the way back to 1996.  So how do we do it?

First, let's consider a table of the form:

CREATE TABLE students
(
    student_id     NUMBER(10)      NOT NULL,
    first_name     VARCHAR2(30)    NOT NULL,
    last_name      VARCHAR2(30)    NOT NULL,
    email          VARCHAR2(50)    NULL,
    phone          VARCHAR2(20)    NULL,
    create_date    DATE            NOT NULL,
    modified_date  DATE            NOT NULL,
    CONSTRAINT pk_students PRIMARY KEY (student_id)
);

And now, we need to define a sequence and a trigger:

-- Start at any number you want; here I'll start at 100
CREATE SEQUENCE seq_student_id_sequence START WITH 100;
    
    
CREATE OR REPLACE TRIGGER tr_students_assign_id
    BEFORE INSERT
    ON students
    FOR EACH ROW
BEGIN
    :new.student_id := seq_student_id_sequence.nextval;
END;

The trigger is pretty simple.  Before a row is inserted, it grabs the next value from the sequence and uses the :NEW pseudorecord to put that value in the appropriate column, in this case STUDENT_ID.  (One note: the direct assignment from the sequence works in Oracle 11g and later; on older versions you would write SELECT seq_student_id_sequence.nextval INTO :new.student_id FROM dual; instead.)
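To verify the behavior, a quick hypothetical insert (the sample values are mine):

INSERT INTO students (first_name, last_name, create_date, modified_date)
VALUES ('John', 'Smith', sysdate, sysdate);

-- The trigger assigned student_id from the sequence, starting at 100
SELECT student_id, first_name, last_name FROM students;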

As I said in my Pluralsight course, if you work with Oracle for any length of time, you are going to see triggers very similar to this in the databases you work with.  So even though we might not need to create triggers like this in a post-12c world, it is still good to understand how they work, because databases using these constructs will be around for years.

Friday, December 12, 2014

Useful Websites for Web Developers

Many times a blog post functions as a place where you can write something down so you will be able to find the information later, and that is what this blog post is.  There are a number of useful websites that I have bookmarked (or sometimes not), and I want to keep the list all in one place.  If you are a web developer, you probably already know about most or all of these sites.  But if you don't, you might want to check them out as well.


Browserscope (http://www.browserscope.org/)

This site allows you to test the capabilities of your browser as well as see the capabilities of other browsers people have tested.  It tests browser capability, but it also gets a little further down into the details than the other sites on this list.  For example, if you are wondering how many concurrent network connections a particular browser supports, then this is the site to go to.

HTML5 Test (https://html5test.com/)

More and more browsers support HTML5 these days, but there are still important differences out there.  This site will not only test your browser, but also compare different browsers to each other, so you know what capabilities are and are not supported.

Can I Use (http://caniuse.com/)

Another site about which features are available in which browsers.  On this site, you select a feature from the main page, and it takes you to a page showing which browser versions support that feature.

Modern IE (https://www.modern.ie/en-us)

This site by Microsoft allows you to scan a page to see if you are using any old or deprecated features that you should be moving away from.  What is even better is that they have an interface to BrowserStack, which will take screenshots of your website right from a webpage.


Google PageSpeed Insights (https://developers.google.com/speed/pagespeed/insights/)

Web performance is a big deal, and PageSpeed Insights quickly scans a target website for front end best practices.  You will get a score for both mobile and desktop page speed as well as mobile usability.  In addition, you will get specific recommendations about how the performance of your site can be improved, along with links to Google's documentation explaining exactly what needs to be done.

The online version of PageSpeed Insights can only target publicly available web sites.  You can also check out the Chrome Extension.  One of the nice things about the extension is that for things like optimized images, the extension will provide you with an already optimized version.


Web Page Test (http://www.webpagetest.org/)

This site allows you to run a performance test against your website from locations around the world.  This is very useful to get some real data about what your performance might look like for someone across the country or across the world.  You get back detailed reports including waterfall charts and a breakdown of the type of data that was loaded.

Alexa (http://www.alexa.com/)

Alexa provides site rankings and demographic data about who is viewing a site.  This is useful for comparing your site to competitors in your categories.  The data provided has an SEO bent.  There is a paid version, but there is still a good amount of interesting data available for free.


Built With (http://builtwith.com/)

Ever wonder what technologies a website is using?  Built With scans the site and delivers a nice report on everything from the web server to the JavaScript and CSS libraries used.  Could you get all of this information yourself by inspecting HTTP headers and HTML?  Sure.  But why do that when this site can do it for you?

Sunday, September 7, 2014

Sample External Tables

When creating external tables in Oracle, it is much easier to start with a sample definition and then modify it to fit your specific needs.  So that is what this post is all about: providing some sample definitions that we can all use the next time we need to come up with an external table.

With all of these, you will need to adjust the DEFAULT DIRECTORY clause in the external table definition if you just want to try out my files.


Comma Separated Values

Probably the most popular flat file format. The sample file contains data that has a header row and all of the values are quoted.

The sample file is states.csv
CREATE TABLE import_states_csv
(
    abbreviation        VARCHAR2(50),
    state_name          VARCHAR2(30),
    date_of_statehood   DATE,
    capital_city        VARCHAR2(30)
)
ORGANIZATION EXTERNAL
(
    TYPE ORACLE_LOADER
    DEFAULT DIRECTORY data_import
    ACCESS PARAMETERS
    (
        RECORDS DELIMITED BY NEWLINE
        SKIP 1
        FIELDS TERMINATED BY ',' ENCLOSED BY '"'
        MISSING FIELD VALUES ARE NULL
        REJECT ROWS WITH ALL NULL FIELDS
        (
           abbreviation,
           state_name,
           date_of_statehood DATE "YYYY-MM-DD",
           capital_city
        )
    )
    LOCATION ('states.csv') 
); 
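Once the table is created (and states.csv is in the OS directory that the DATA_IMPORT directory object points to), you can query it like any other table, or use it to load a permanent table.  For example (the permanent states table here is hypothetical):

-- Read directly from the flat file
SELECT abbreviation, state_name, date_of_statehood
  FROM import_states_csv
 ORDER BY date_of_statehood;

-- Or load a permanent table from it
-- INSERT INTO states SELECT * FROM import_states_csv;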


Tab Separated Values

This type of file is not seen as often these days, but you may still run into it from time to time.  Notice how, in the external table definition, the field separator character is specified as hex value 9 (the tab character).

The sample data file is states.tsv


CREATE TABLE import_states_tsv
(
    abbreviation        VARCHAR2(50),
    state_name          VARCHAR2(30),
    date_of_statehood   DATE,
    capital_city        VARCHAR2(30)
)
ORGANIZATION EXTERNAL
(
    TYPE ORACLE_LOADER
    DEFAULT DIRECTORY data_import
    ACCESS PARAMETERS
    (
        RECORDS DELIMITED BY NEWLINE
        SKIP 1
        FIELDS TERMINATED BY X'9' OPTIONALLY ENCLOSED BY '"'
        MISSING FIELD VALUES ARE NULL
        REJECT ROWS WITH ALL NULL FIELDS
        (
           abbreviation,
           state_name,
           date_of_statehood DATE "YYYY-MM-DD",
           capital_city
        )
    )
    LOCATION ('states.tsv') 
); 

Fixed Width Data File

You can download a list of all of the US banks that participate in ACH transactions from the US Federal Reserve at this link - http://www.fededirectory.frb.org/download.cfm.

What is useful about this is that you can put the list in a database table; then, if you have customers enter an ACH routing number on your website, you can validate that the routing number they gave you belongs to a bank that actually exists.  Yes, you can calculate a checksum on a routing number, but this is one step better.

So how would we import this file into Oracle using an external table?  Simple; the definition is below.
What is also interesting about this is that we can use the NULLIF clause.  Banks that have undergone mergers have a new routing number assigned, but for most banks this field is "000000000", so we can use NULLIF to translate this value to NULL in our external table.  The same is true of the zip code extension.


CREATE TABLE fedwire_ach_file_import
(
    routing_number           VARCHAR2(9),
    office_code              VARCHAR2(1),
    servicing_frb_number     VARCHAR2(9),
    record_type_code         VARCHAR2(1),
    change_date              DATE,
    new_routing_number       VARCHAR2(9),
    customer_name            VARCHAR2(36),
    address                  VARCHAR2(36),
    city                     VARCHAR2(20),
    state_code               VARCHAR2(2),
    zip_code                 VARCHAR2(5),
    zip_code_extension       VARCHAR2(4),
    telephone                VARCHAR2(10),
    institution_status_code  VARCHAR2(1),
    data_view_code           VARCHAR2(1)
)
ORGANIZATION EXTERNAL
(
    TYPE ORACLE_LOADER
    DEFAULT DIRECTORY data_import
    ACCESS PARAMETERS
    (
      RECORDS FIXED 157 
      LOGFILE data_import:'fedwire_import.log'
      BADFILE data_import:'fedwire_import.bad'   
      FIELDS
      (
          routing_number           CHAR(9),
          office_code              CHAR(1),
          servicing_frb_number     CHAR(9),
          record_type_code         CHAR(1),
          change_date              CHAR(6) DATE_FORMAT DATE MASK "MMDDYY",
          new_routing_number       CHAR(9) NULLIF new_routing_number = '000000000',
          customer_name            CHAR(36),
          address                  CHAR(36),
          city                     CHAR(20),
          state_code               CHAR(2),
          zip_code                 CHAR(5),
          zip_code_extension       CHAR(4) NULLIF zip_code_extension = '0000',
          telephone                CHAR(10),
          institution_status_code  CHAR(1),
          data_view_code           CHAR(1),   
          filler                   CHAR(5)
      )
    )
    LOCATION ('FedACHdir.txt')
)
REJECT LIMIT UNLIMITED;
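As an aside, the routing number checksum mentioned above is easy to implement if you ever need it: weights of 3, 7 and 1 are applied to the nine digits in order, and the weighted sum must be a multiple of 10.  Here is a minimal sketch as a PL/SQL function (the function name is mine):

CREATE OR REPLACE FUNCTION is_valid_routing_number (p_routing IN VARCHAR2)
    RETURN NUMBER
IS
    v_sum NUMBER := 0;
BEGIN
    -- Must be exactly nine digits
    IF NOT REGEXP_LIKE(p_routing, '^\d{9}$') THEN
        RETURN 0;
    END IF;

    -- Apply the repeating 3, 7, 1 weights to each digit
    FOR i IN 1 .. 9 LOOP
        v_sum := v_sum + TO_NUMBER(SUBSTR(p_routing, i, 1)) *
                 CASE MOD(i, 3) WHEN 1 THEN 3 WHEN 2 THEN 7 ELSE 1 END;
    END LOOP;

    RETURN CASE WHEN MOD(v_sum, 10) = 0 THEN 1 ELSE 0 END;
END;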