Oracle optimizer doesn’t understand elementary math


If A < B and B < C then A < C

Right? We all know the transitive rule… all of us except the Oracle optimizer.
Here’s a seasonal example to illustrate how the optimizer changes its methods based on seemingly trivial options.

Let’s say I bake 10 types of cookies and then I want to see how many distinct bundles of 3 types I can make.
Mathematically we’d represent that as C(10,3) = 120.
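As a quick sanity check outside the database, the count is easy to verify with a short Python sketch. This is purely illustrative; it just confirms the arithmetic and shows why the strict-ordering trick used in the SQL below counts each bundle exactly once.

```python
from itertools import product
from math import comb

cookies = ["oatmeal", "m&m", "chocolate chip", "molasses", "gingerbread",
           "macadamia nut", "peanut butter", "snickerdoodle",
           "sugar cookie", "shortbread"]

# C(10,3) directly
print(comb(10, 3))  # 120

# The same count via the ordering trick: each unordered bundle of 3
# distinct types appears exactly once when a < b < c alphabetically
bundles = [(a, b, c)
           for a, b, c in product(cookies, repeat=3)
           if a < b and b < c]
print(len(bundles))  # 120
```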

In SQL, we could solve for them like this:

CREATE TABLE cookies
AS
SELECT 'oatmeal' cookie FROM DUAL
UNION ALL
SELECT 'm&m' cookie FROM DUAL
UNION ALL
SELECT 'chocolate chip' cookie FROM DUAL
UNION ALL
SELECT 'molasses' cookie FROM DUAL
UNION ALL
SELECT 'gingerbread' cookie FROM DUAL
UNION ALL
SELECT 'macadamia nut' cookie FROM DUAL
UNION ALL
SELECT 'peanut butter' cookie FROM DUAL
UNION ALL
SELECT 'snickerdoodle' cookie FROM DUAL
UNION ALL
SELECT 'sugar cookie' cookie FROM DUAL
UNION ALL
SELECT 'shortbread' cookie FROM DUAL;

EXEC dbms_stats.gather_table_stats(ownname=>user,tabname=>'COOKIES');

SELECT *
FROM cookies a, cookies b, cookies c
WHERE a.cookie < b.cookie AND b.cookie < c.cookie;

SELECT *
FROM cookies a, cookies b, cookies c
WHERE a.cookie < b.cookie AND b.cookie < c.cookie AND a.cookie < c.cookie;

Note, both queries return the expected 120 rows.
However, the optimizer came up with 2 different plans for these functionally identical queries.

With implied transitivity
--------------------------------------------------
| Id | Operation | Name | E-Rows |
--------------------------------------------------
| 0 | SELECT STATEMENT | | |
| 1 | MERGE JOIN | | 203 |
| 2 | SORT JOIN | | 45 |
| 3 | MERGE JOIN | | 45 |
| 4 | SORT JOIN | | 10 |
| 5 | TABLE ACCESS FULL| COOKIES | 10 |
|* 6 | SORT JOIN | | 10 |
| 7 | TABLE ACCESS FULL| COOKIES | 10 |
|* 8 | SORT JOIN | | 10 |
| 9 | TABLE ACCESS FULL | COOKIES | 10 |
--------------------------------------------------

With explicit transitivity
-------------------------------------------------
| Id | Operation | Name | E-Rows |
-------------------------------------------------
| 0 | SELECT STATEMENT | | |
| 1 | MERGE JOIN | | 91 |
| 2 | MERGE JOIN | | 45 |
| 3 | SORT JOIN | | 10 |
| 4 | TABLE ACCESS FULL| COOKIES | 10 |
|* 5 | SORT JOIN | | 10 |
| 6 | TABLE ACCESS FULL| COOKIES | 10 |
|* 7 | FILTER | | |
|* 8 | SORT JOIN | | 10 |
| 9 | TABLE ACCESS FULL| COOKIES | 10 |
-------------------------------------------------

We can take it a step further. Let’s say I want all of the cookies to be the same type.
This is obviously going to be just 10 different combinations and no joins are needed; but I’ll write it this way to further illustrate the point.

SELECT *
FROM cookies a, cookies b, cookies c
WHERE a.cookie = b.cookie AND b.cookie = c.cookie;

------------------------------------------------
| Id | Operation | Name | E-Rows |
------------------------------------------------
| 0 | SELECT STATEMENT | | 10 |
|* 1 | HASH JOIN | | 10 |
|* 2 | HASH JOIN | | 10 |
| 3 | TABLE ACCESS FULL| COOKIES | 10 |
| 4 | TABLE ACCESS FULL| COOKIES | 10 |
| 5 | TABLE ACCESS FULL | COOKIES | 10 |
------------------------------------------------

SELECT *
FROM cookies a, cookies b, cookies c
WHERE a.cookie = b.cookie AND b.cookie = c.cookie AND a.cookie = c.cookie;


------------------------------------------------
| Id | Operation | Name | E-Rows |
------------------------------------------------
| 0 | SELECT STATEMENT | | 1 |
|* 1 | HASH JOIN | | 1 |
|* 2 | HASH JOIN | | 10 |
| 3 | TABLE ACCESS FULL| COOKIES | 10 |
| 4 | TABLE ACCESS FULL| COOKIES | 10 |
| 5 | TABLE ACCESS FULL | COOKIES | 10 |
------------------------------------------------

Again note the change in estimated rows for an evaluation that we might assume the optimizer could figure out on its own.

The optimizer doesn’t understand transitivity between columns; but what if we add variables or constants?
Let’s go back to the original query and take a closer look:

SELECT *
FROM cookies a, cookies b, cookies c
WHERE a.cookie < b.cookie AND b.cookie < c.cookie;

The plan’s predicates include:

Predicate Information (identified by operation id):
---------------------------------------------------

6 - access(INTERNAL_FUNCTION("A"."COOKIE")<INTERNAL_FUNCTION("B"."COOKIE"))
filter(INTERNAL_FUNCTION("A"."COOKIE")<INTERNAL_FUNCTION("B"."COOKIE"))
8 - access("B"."COOKIE"<"C"."COOKIE")
filter("B"."COOKIE"<"C"."COOKIE")

But what if we add a bind variable? Obviously this changes the results; but it’s also interesting to note how the condition is interpreted as filter conditions within the optimizer’s plan predicates.

SELECT *
FROM cookies a, cookies b, cookies c
WHERE a.cookie < b.cookie AND b.cookie < c.cookie and a.cookie= :b1;

------------------------------------------------
| Id | Operation | Name | E-Rows |
------------------------------------------------
| 0 | SELECT STATEMENT | | 1 |
| 1 | NESTED LOOPS | | 1 |
| 2 | NESTED LOOPS | | 1 |
|* 3 | TABLE ACCESS FULL| COOKIES | 1 |
|* 4 | TABLE ACCESS FULL| COOKIES | 1 |
|* 5 | TABLE ACCESS FULL | COOKIES | 1 |
------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

3 - filter("A"."COOKIE"=:B1)
4 - filter("B"."COOKIE">:B1 AND "A"."COOKIE"<"B"."COOKIE")
5 - filter("C"."COOKIE">:B1 AND "B"."COOKIE"<"C"."COOKIE")

The variable does get propagated through to the other conditions, as we might expect.
Now, let’s try adding a literal value. Again Oracle is able to derive transitivity: it propagates the constant literal through to all of the table predicates, which radically alters the estimates and the execution plan.

SELECT *
FROM cookies a, cookies b, cookies c
WHERE a.cookie < b.cookie AND b.cookie < c.cookie and a.cookie= 'gingerbread';

--------------------------------------------------
| Id | Operation | Name | E-Rows |
--------------------------------------------------
| 0 | SELECT STATEMENT | | 14 |
| 1 | MERGE JOIN | | 14 |
| 2 | MERGE JOIN CARTESIAN| | 8 |
|* 3 | TABLE ACCESS FULL | COOKIES | 1 |
| 4 | BUFFER SORT | | 8 |
|* 5 | TABLE ACCESS FULL | COOKIES | 8 |
|* 6 | FILTER | | |
|* 7 | SORT JOIN | | 8 |
|* 8 | TABLE ACCESS FULL | COOKIES | 8 |
--------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

3 - filter("A"."COOKIE"='gingerbread')
5 - filter("C"."COOKIE">'gingerbread')
6 - filter("B"."COOKIE"<"C"."COOKIE")
7 - access("A"."COOKIE"<"B"."COOKIE")
filter("A"."COOKIE"<"B"."COOKIE")
8 - filter("A"."COOKIE">'gingerbread')

But, even though the optimizer now seems to understand the transitivity, that understanding is still flawed: if we add the implied condition a.cookie < c.cookie, which changes nothing functionally, we still get a different plan with different joins and estimates.

SELECT *
FROM cookies a, cookies b, cookies c
WHERE a.cookie < b.cookie AND b.cookie < c.cookie AND a.cookie < c.cookie and a.cookie='gingerbread';

-------------------------------------------------
| Id | Operation | Name | E-Rows |
-------------------------------------------------
| 0 | SELECT STATEMENT | | |
| 1 | MERGE JOIN | | 8 |
| 2 | NESTED LOOPS | | 4 |
|* 3 | TABLE ACCESS FULL | COOKIES | 1 |
|* 4 | TABLE ACCESS FULL | COOKIES | 4 |
|* 5 | FILTER | | |
|* 6 | SORT JOIN | | 8 |
|* 7 | TABLE ACCESS FULL| COOKIES | 8 |
-------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

3 - filter("A"."COOKIE"='gingerbread')
4 - filter(("C"."COOKIE">'gingerbread' AND "A"."COOKIE"<"C"."COOKIE"))
5 - filter("B"."COOKIE"<"C"."COOKIE")
6 - access("A"."COOKIE"<"B"."COOKIE")
filter("A"."COOKIE"<"B"."COOKIE" AND "B"."COOKIE">'gingerbread')
7 - filter("B"."COOKIE">'gingerbread')

So, what are the lessons here?

First, I really like oatmeal cookies.

Second, I’m reminded that I should bake more.

Third, if you’re stuck on a query where the optimizer seems like it’s ignoring some crucial information, it may be a hole in its logic where it doesn’t extend “obvious” rules throughout the plan. If so, it may be worth changing the where conditions to include, exclude or alter functionally identical conditions.

For example:
WHERE a=b and b=c
could be written
WHERE a=b and a=c
Or
WHERE a=b and a=c and b=c

All three conditions are functionally identical; but writing them in different ways could hide or expose information the optimizer can’t derive on its own (even if we think it should.)
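If there’s any doubt that such rewrites are equivalent, a brute-force check over a small domain settles it. This Python sketch (illustrative only, not part of the post’s SQL) compares all three forms of the equality predicate:

```python
from itertools import product

# Evaluate each predicate form over every (a, b, c) triple in a small domain
domain = range(5)
f1 = {t for t in product(domain, repeat=3) if t[0] == t[1] and t[1] == t[2]}
f2 = {t for t in product(domain, repeat=3) if t[0] == t[1] and t[0] == t[2]}
f3 = {t for t in product(domain, repeat=3)
      if t[0] == t[1] and t[0] == t[2] and t[1] == t[2]}

print(f1 == f2 == f3)  # True: all three select exactly the same triples
```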

Book Review: Oracle PL/SQL Performance Tuning Tips & Techniques


I finished “Oracle PL/SQL Performance Tuning Tips & Techniques” by Michael Rosenblum and Paul Dorsey today.

I recommend it as a lesson, a reference and an inspiration for things to try.
Steven Feuerstein still gets the nod as the “go to” for first learning how to use pl/sql; but this book has great examples for learning how to use it better.

It’s quite readable, definitely one of the best out of Oracle Press I’ve seen in a while.
The examples cover 11g and some 12c as well when warranted. The code snippet construction is particularly good. They are not only complete and functional but don’t come off as completely arbitrary. They are modeled after real world tasks, making it much easier to relate to them.

Update to LDAP package


I’m in the process of updating my OID package from last year to version 2.0. The new version will include a significant change in API but will also include more functionality, in particular, support for TNS aliases. I’m also changing the way exceptions are raised as well as improving the efficiency of searches.

Along the way to 2.0, I have made a few incremental updates to the original release; so, I’ve updated the original post with a link to Version 1.3 and updated the example to show the new authentication method.

If someone still wants the original release, it can be found here: Version 1.2

Parallel PL/SQL with DBMS_PARALLEL_EXECUTE


11gR2 introduced the DBMS_PARALLEL_EXECUTE package.

The basic idea behind it is you divide some large task into smaller chunks and then execute all the chunks in parallel.
The API and the examples in the Oracle Documentation might lead one to believe the package can only be used to split large SQL statements into parallel chunks.

You can however use the same package to execute a pl/sql block multiple times simultaneously.
With proper use of the chunk ids you can even execute multiple different procedures in parallel as single task.

I’ll illustrate both below.
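Before the PL/SQL, the general shape of the pattern can be sketched outside the database. This minimal Python version is purely illustrative (nothing in it is part of the DBMS_PARALLEL_EXECUTE API): a worker pool runs one routine per chunk id, and the id decides which routine gets dispatched, mirroring both techniques shown below.

```python
from concurrent.futures import ThreadPoolExecutor

# Illustrative stand-ins for the PL/SQL chunk procedures
def work_a(chunk_id):
    return f"proc_a handled chunk {chunk_id}"

def work_b(chunk_id):
    return f"proc_b handled chunk {chunk_id}"

def run_chunk(chunk_id):
    # Mirrors the CASE-on-chunk-id trick: different ids dispatch to
    # different routines within a single task
    return work_a(chunk_id) if chunk_id % 2 == 0 else work_b(chunk_id)

chunks = range(1, 11)                            # chunk ids 1..10
with ThreadPoolExecutor(max_workers=5) as pool:  # like parallel_level => 5
    results = list(pool.map(run_chunk, chunks))

print(len(results))  # 10
```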

First, create a table to hold some results:

CREATE TABLE parallel_exec_test_table
(
    test_name   VARCHAR2(100),
    chunk_name  VARCHAR2(100),
    chunk_value INTEGER,
    chunk_start TIMESTAMP,
    chunk_end   TIMESTAMP
);

Next, I’ll create a procedure that does some dummy work, pausing for a second and updating the test table multiple times.

CREATE OR REPLACE PROCEDURE parallel_exec_test1_chunk(
    p_test_name     IN VARCHAR2,
    p_chunk_name    IN VARCHAR2,
    p_chunk_value   IN INTEGER
)
IS
BEGIN
    IF p_chunk_value = 4
    THEN
        RAISE VALUE_ERROR;
    END IF;

    INSERT INTO parallel_exec_test_table(test_name, chunk_name, chunk_start)
    VALUES (p_test_name, p_chunk_name, SYSTIMESTAMP);

    DBMS_LOCK.sleep(2);

    FOR i IN 1 .. p_chunk_value
    LOOP
        UPDATE parallel_exec_test_table
           SET chunk_value = i
         WHERE test_name = p_test_name AND chunk_name = p_chunk_name;
    END LOOP;

    UPDATE parallel_exec_test_table
       SET chunk_end = SYSTIMESTAMP
     WHERE test_name = p_test_name AND chunk_name = p_chunk_name;
END;

You might note the intentional RAISE for an input of 4; that’s there to illustrate the parallel execution exception handling.

Next, create a procedure to call the test chunk multiple times in parallel.
It’s important to note that your pl/sql block must have bind variables called :start_id and :end_id.
If you don’t have both, or name them something else, you’ll get an ORA-01006 exception.
In this example, I generate a list of chunks with ids 1-10. Each chunk starts and ends with a single id, so 10 chunks.
I then pass these ids to the chunk procedure and execute 5 of them at a time. (The to_char(to_date(:start_id,'j'),'jsp') expression in the block is the old Julian-date trick for spelling out a number, e.g. 3 becomes “three”; I use it to generate the chunk names.) Since each invocation of the chunk procedure will log when it starts, this should create 2 clusters of five parallel executions.

CREATE OR REPLACE PROCEDURE parallel_exec_test_proc
IS
    v_task_name   VARCHAR2(30) := 'parallel_test1';
    v_plsql_block VARCHAR2(32767);
BEGIN
    DELETE parallel_exec_test_table
     WHERE test_name = v_task_name;

    DBMS_PARALLEL_EXECUTE.create_task(task_name => v_task_name);

    DBMS_PARALLEL_EXECUTE.create_chunks_by_sql(
        task_name   => v_task_name,
        sql_stmt    => 'SELECT level start_id, level end_id FROM dual connect by level <=10',
        by_rowid    => FALSE
    );


    v_plsql_block :=
        q'[
begin 
   parallel_exec_test1_chunk('parallel_test1',to_char(to_date(:start_id,'j'),'jsp'),:end_id);
end;
]';
    DBMS_PARALLEL_EXECUTE.run_task(
        task_name        => v_task_name,
        sql_stmt         => v_plsql_block,
        language_flag    => DBMS_SQL.native,
        parallel_level   => 5
    );

    DBMS_OUTPUT.put_line(
           TO_CHAR(SYSTIMESTAMP, 'yyyy-mm-dd hh24:mi:ss.ff')
        || '  '
        || DBMS_PARALLEL_EXECUTE.task_status(v_task_name)
    );
END;
/

Testing produces the following results…

SQL> set timing on
SQL> EXEC parallel_exec_test_proc

PL/SQL procedure successfully completed.

Elapsed: 00:00:06.03
SQL> SELECT * FROM parallel_exec_test_table WHERE test_name = 'parallel_test1' ORDER BY chunk_start;

TEST_NAME            CHUNK_NAME CHUNK_VALUE CHUNK_START                CHUNK_END
-------------------- ---------- ----------- -------------------------- ------------------------------
parallel_test1       three                3 2013-10-17 00:44:27.494957 2013-10-17 00:44:29.496812
parallel_test1       five                 5 2013-10-17 00:44:27.495694 2013-10-17 00:44:29.499113
parallel_test1       six                  6 2013-10-17 00:44:27.496019 2013-10-17 00:44:29.499743
parallel_test1       seven                7 2013-10-17 00:44:27.496139 2013-10-17 00:44:29.498747
parallel_test1       eight                8 2013-10-17 00:44:27.510579 2013-10-17 00:44:29.514314
parallel_test1       nine                 9 2013-10-17 00:44:29.498327 2013-10-17 00:44:31.500831
parallel_test1       ten                 10 2013-10-17 00:44:29.500020 2013-10-17 00:44:31.503208
parallel_test1       one                  1 2013-10-17 00:44:29.500746 2013-10-17 00:44:31.502249
parallel_test1       two                  2 2013-10-17 00:44:29.501337 2013-10-17 00:44:31.502495

9 rows selected.

Elapsed: 00:00:00.04

As you can see, 5 runs at 44:27 and 4 runs at 44:29. You might also note that even though the chunks were defined in order 1-10, the actual execution order is effectively random.
Also note, 10 chunks were submitted but only 9 records were written. This is because the chunk for id 4 raised an exception.

Checking the chunk status in the dictionary, we can see the error:

SQL> SELECT start_id,end_id,status,start_ts,error_message FROM dba_parallel_execute_chunks WHERE task_name = 'parallel_test1';

  START_ID     END_ID STATUS               START_TS                                                            ERROR_MESSAGE
---------- ---------- -------------------- --------------------------------------------------------------------------- --------------------------------------------------
         3          3 PROCESSED            2013-10-17 00:44:27.494342
         5          5 PROCESSED            2013-10-17 00:44:27.494354
         6          6 PROCESSED            2013-10-17 00:44:27.495262
         7          7 PROCESSED            2013-10-17 00:44:27.495951
         8          8 PROCESSED            2013-10-17 00:44:27.510169
         9          9 PROCESSED            2013-10-17 00:44:29.497890
        10         10 PROCESSED            2013-10-17 00:44:29.499605
         1          1 PROCESSED            2013-10-17 00:44:29.500361
         2          2 PROCESSED            2013-10-17 00:44:29.500928
         4          4 PROCESSED_WITH_ERROR 2013-10-17 00:44:27.494370                                              ORA-06502: PL/SQL: numeric or value error

10 rows selected.

Another, potentially more useful option is executing different procedures simultaneously.
To demonstrate this I’ll create 5 new procedures.

CREATE OR REPLACE PROCEDURE parallel_exec_test2_chunk1(p_test_name IN VARCHAR2)
IS
    c_chunk_name CONSTANT VARCHAR2(6) := 'chunk1';
BEGIN
    INSERT INTO parallel_exec_test_table(test_name, chunk_name, chunk_start)
    VALUES (p_test_name, c_chunk_name, SYSTIMESTAMP);

    DBMS_LOCK.sleep(2);

    UPDATE parallel_exec_test_table
       SET chunk_end = SYSTIMESTAMP
     WHERE test_name = p_test_name AND chunk_name = c_chunk_name;
END;

CREATE OR REPLACE PROCEDURE parallel_exec_test2_chunk2(p_test_name IN VARCHAR2)
IS
    c_chunk_name CONSTANT VARCHAR2(6) := 'chunk2';
BEGIN
    INSERT INTO parallel_exec_test_table(test_name, chunk_name, chunk_start)
    VALUES (p_test_name, c_chunk_name, SYSTIMESTAMP);

    DBMS_LOCK.sleep(2);

    UPDATE parallel_exec_test_table
       SET chunk_end = SYSTIMESTAMP
     WHERE test_name = p_test_name AND chunk_name = c_chunk_name;
END;

CREATE OR REPLACE PROCEDURE parallel_exec_test2_chunk3(p_test_name IN VARCHAR2)
IS
    c_chunk_name CONSTANT VARCHAR2(6) := 'chunk3';
BEGIN
    INSERT INTO parallel_exec_test_table(test_name, chunk_name, chunk_start)
    VALUES (p_test_name, c_chunk_name, SYSTIMESTAMP);

    DBMS_LOCK.sleep(2);

    UPDATE parallel_exec_test_table
       SET chunk_end = SYSTIMESTAMP
     WHERE test_name = p_test_name AND chunk_name = c_chunk_name;
END;

CREATE OR REPLACE PROCEDURE parallel_exec_test2_chunk4(p_test_name IN VARCHAR2)
IS
    c_chunk_name CONSTANT VARCHAR2(6) := 'chunk4';
BEGIN
    INSERT INTO parallel_exec_test_table(test_name, chunk_name, chunk_start)
    VALUES (p_test_name, c_chunk_name, SYSTIMESTAMP);

    DBMS_LOCK.sleep(2);

    UPDATE parallel_exec_test_table
       SET chunk_end = SYSTIMESTAMP
     WHERE test_name = p_test_name AND chunk_name = c_chunk_name;
END;

CREATE OR REPLACE PROCEDURE parallel_exec_test2_chunk5(p_test_name IN VARCHAR2)
IS
    c_chunk_name CONSTANT VARCHAR2(6) := 'chunk5';
BEGIN
    INSERT INTO parallel_exec_test_table(test_name, chunk_name, chunk_start)
    VALUES (p_test_name, c_chunk_name, SYSTIMESTAMP);

    DBMS_LOCK.sleep(2);

    UPDATE parallel_exec_test_table
       SET chunk_end = SYSTIMESTAMP
     WHERE test_name = p_test_name AND chunk_name = c_chunk_name;
END;

Just as in the first test, I’ll create a procedure that will execute them simultaneously. The trick here is that the API requires chunk ids, not procedure names. This is easily handled, though, with a simple CASE statement to call the procedure I want for each id.

CREATE OR REPLACE PROCEDURE parallel_exec_test2_proc
IS
    v_task_name   VARCHAR2(30) := 'parallel_test2';
    v_plsql_block VARCHAR2(32767);
BEGIN
    DELETE parallel_exec_test_table
     WHERE test_name = v_task_name;

    DBMS_PARALLEL_EXECUTE.create_task(task_name => v_task_name);

    DBMS_PARALLEL_EXECUTE.create_chunks_by_sql(
        task_name   => v_task_name,
        sql_stmt    => 'SELECT level start_id, level end_id FROM dual connect by level <=5',
        by_rowid    => FALSE
    );


    v_plsql_block := q'[
declare
    v_dummy integer := :end_id; 
begin 
  case :start_id
    when 1 then parallel_exec_test2_chunk1('parallel_test2');
    when 2 then parallel_exec_test2_chunk2('parallel_test2');
    when 3 then parallel_exec_test2_chunk3('parallel_test2');
    when 4 then parallel_exec_test2_chunk4('parallel_test2');
    when 5 then parallel_exec_test2_chunk5('parallel_test2');
  end case; 
end;
]';
    DBMS_PARALLEL_EXECUTE.run_task(
        task_name        => v_task_name,
        sql_stmt         => v_plsql_block,
        language_flag    => DBMS_SQL.native,
        parallel_level   => 5
    );

    DBMS_OUTPUT.put_line(
           TO_CHAR(SYSTIMESTAMP, 'yyyy-mm-dd hh24:mi:ss.ff')
        || '  '
        || DBMS_PARALLEL_EXECUTE.task_status(v_task_name)
    );
END;
/

And testing produces these results…

SQL> EXEC parallel_exec_test2_proc

PL/SQL procedure successfully completed.

Elapsed: 00:00:06.02
SQL> SELECT *
  2    FROM parallel_exec_test_table
  3   WHERE test_name = 'parallel_test2'
  4  ORDER BY chunk_start;

TEST_NAME            CHUNK_NAME CHUNK_VALUE CHUNK_START                CHUNK_END
-------------------- ---------- ----------- -------------------------- ----------------------------
parallel_test2       chunk3                 2013-10-17 01:11:18.236996 2013-10-17 01:11:20.238041
parallel_test2       chunk4                 2013-10-17 01:11:18.242705 2013-10-17 01:11:20.244119
parallel_test2       chunk5                 2013-10-17 01:11:18.251068 2013-10-17 01:11:20.252110
parallel_test2       chunk1                 2013-10-17 01:11:18.259176 2013-10-17 01:11:20.260017
parallel_test2       chunk2                 2013-10-17 01:11:18.269944 2013-10-17 01:11:20.271156

If I were working with 11gR1 or older I’d use dbms_scheduler, or dbms_job for really old versions like 9i and lower. Using dbms_parallel_execute is certainly easier. One important thing to note: each chunk is committed if it completes successfully, so this is not a good technique if you need to maintain an atomic transaction across all the chunks. Still, it’s another tool for the toolbox and one I appreciate.

I hope it helps.
Questions and comments, as always, are welcome.

Managing TNS lookups in OpenLDAP via PL/SQL


I have an Oracle Express Edition (XE) database running on almost every pc in my home. They’re not so numerous that I can’t manage them by editing TNSNAMES.ORA; but I always thought it would be nice to manage them centrally via LDAP service name lookup.

Unfortunately, while Oracle Internet Directory (OID) licensing is included with a valid Oracle Database license, that doesn’t apply when the database is XE. I don’t really need or want a full OID installation anyway. So, I looked into other LDAP alternatives and decided on OpenLDAP. It’s open source, free, and I can install it on Linux or Windows. I chose Linux and followed the instructions at
http://diznix.com/articles/oracle-names-resolution-with-openldap without issue.

Configuring one of the Windows XE installs to use my new Linux server was simple enough:
C:\oraclexe\app\oracle\product\11.2.0\server\network\ADMIN\sqlnet.ora
NAMES.DIRECTORY_PATH= (LDAP, EZCONNECT)

C:\oraclexe\app\oracle\product\11.2.0\server\network\ADMIN\ldap.ora
DIRECTORY_SERVERS= (mylinuxserver:389:636)
DEFAULT_ADMIN_CONTEXT = "dc=home"
DIRECTORY_SERVER_TYPE = OID

The next step was getting my databases defined in the ldap server. I could do it through ldif files as shown in the link above; but I thought it would be nicer to have a sql and pl/sql interface to the server. Oracle facilitates that with the DBMS_LDAP package. To make it even easier, I built a wrapper package for it.

First, set up a user to own the package and, since I’m using 11g XE I need to create an Access Control List to allow my owner to talk to my ldap/oid server. I’m using the default non-SSL port 389.

create user oidowner identified by pa55w0rd;
grant create session to oidowner;
grant create procedure to oidowner;
grant create type to oidowner;

BEGIN
DBMS_NETWORK_ACL_ADMIN.create_acl(acl => 'oid.xml',
description => 'ACL that lets users see oid server',
principal => 'OIDOWNER',
is_grant => TRUE,
privilege => 'connect'
);

DBMS_NETWORK_ACL_ADMIN.assign_acl('oid.xml', 'mylinuxserver',389,389);
COMMIT;
END;

Next, I create my package and a supporting type to make listing the contents easier:

create or replace type vcarray as table of varchar2(4000);

Package code is available from my dropbox https://dl.dropboxusercontent.com/u/75336811/sdsoid_pkg_1.3.txt

And finally, I add an entry for an XE installation on one of my PCs.

BEGIN
sdsoid.insert_oid(
'dc=home',
'mylinuxserver',
'seanxe',
'(DESCRIPTION=(ADDRESS=(PROTOCOL=TCP)(HOST=seanpc)(PORT=1521))(CONNECT_DATA=(SERVICE_NAME=xe)))',
'cn=myadminuser,dc=home',
'myadminpassword'
);
END;

Then I check my work…

C:\oraclexe\app\oracle\product\11.2.0\server\network\ADMIN>tnsping seanxe

TNS Ping Utility for 32-bit Windows: Version 11.2.0.2.0 - Production on 24-AUG-2013 13:38:16

Copyright (c) 1997, 2010, Oracle. All rights reserved.

Used parameter files:
c:\oracle\product\11.2.0.2\network\admin\sqlnet.ora

Used LDAP adapter to resolve the alias
Attempting to contact (DESCRIPTION=(ADDRESS=(PROTOCOL=TCP)(HOST=seanpc)(PORT=1521))(CONNECT_DATA=(SERVICE_NAME=xe)))
OK (0 msec)

But, I want a SQL interface to the LDAP entries. This is where the collection type defined before the package comes in.
Using the list_oid table function it’s easy to retrieve some of the entries for other databases I’ve created.

C:\oraclexe\app\oracle\product\11.2.0\server\network\ADMIN>sqlplus oidowner/pa55w0rd@seanxe

SQL*Plus: Release 11.2.0.2.0 Production on Sat Aug 24 14:01:54 2013

Copyright (c) 1982, 2010, Oracle. All rights reserved.

Connected to:
Oracle Database 11g Express Edition Release 11.2.0.2.0 - Production

SQL> set lines 1000
SQL> SELECT * FROM TABLE(sdsoid.list_oid('dc=home', 'mylinuxserver'));

COLUMN_VALUE
-------------------------------------------------------------------------------------------------------------------------
kidsxe: (DESCRIPTION=(ADDRESS=(PROTOCOL=TCP)(HOST=familypc)(PORT=1521))(CONNECT_DATA=(SERVICE_NAME=xe)))
testxe: (DESCRIPTION=(ADDRESS=(PROTOCOL=TCP)(HOST=testpc)(PORT=1521))(CONNECT_DATA=(SERVICE_NAME=xe)))
seanxe: (DESCRIPTION=(ADDRESS=(PROTOCOL=TCP)(HOST=seanpc)(PORT=1521))(CONNECT_DATA=(SERVICE_NAME=xe)))

Currently all entries are maintained and retrieved with the full connect descriptor. In my next release I’ll add support for parameterized entries. I have tested this code on 12c databases and it works without modification.

Questions and comments are welcome.

Analogies to help explain views


A not-uncommon request I hear is to explain when the contents of a view are updated.
Sometimes this will be to explain the difference between a “view” and a “materialized view.”

I like analogies as a tool to help describe what I’m trying to say.

I wear glasses.
Querying through a view is like looking at your data through glasses.

When I look at the world without my glasses I get one image.
When I put my glasses on I get another way of looking at the same thing.
If somebody walks into the room where I am, I don’t need to wait for my glasses to update.
What I see reflects the contents of the room just as if I were looking without my glasses and there is no lag.
What I see with or without my glasses is “updated” instantaneously.

If I take a picture of the room with my camera, I get another view through that lens and the view is static.
These are materialized views. Not only is what I see determined by the lens of the camera, that image won’t ever change unless I force the view to refresh (take another picture.) Here the terminology is helpful. One image capture of a picture is called a snapshot, similarly one data capture with a materialized view is also called a snapshot.
In older versions of Oracle, the terms Materialized Views and Snapshots were used interchangeably.

So… if a normal view reflects changes to the underlying data instantaneously, why are materialized views often considered “faster” than normal views?
Well, that’s a trick question.

First, neither type of view is actually fast or slow.
A view holds no data, it’s simply a query filter. Like tinted glasses, a view doesn’t really change the data, it just applies an additional set of operations when you look at it.
In the case of glasses, the speed of light makes the analogy break down a little bit because it passes through the filters so fast we can’t perceive the operation.
If our glasses were hundreds of thousands of meters thick we’d be able to detect the difference between looking through the lens and not because the light would move slower through our glasses versus through the air.

In the case of a view, if we assume the view actually does something for us like join to another table, call a function, include a sub-query or a combination of these, then of course we’ll see a lag as opposed to simply looking at a single table’s data.

The view is able to instantly pick up changes in the underlying data; but when you query for that new data, your query still needs to process it through whatever joins, functions or sub-queries you use, and it’s those operations that create the lag.

With a materialized view, you can hide some of the work by doing it ahead of time. I can drive to the Grand Canyon and look at it myself or I can look at a picture of it. The picture is faster because the data has already been recorded and delivered to me. However, the content is limited by the delay and scope of the captured image. If you could somehow deliver an image that was simultaneously as vast, detailed and up-to-date as looking at the real thing live, then your image would be completely equivalent to the original. However, that’s not really possible.

So that’s why I look through my own normal view (my glasses), rather than a materialized view.

If, however, I merely need a minor update without the full detail, like an update on the current weather, I might be willing to use a small photo taken every hour from a web cam.
That snapshot isn’t completely accurate and doesn’t have all the information; but it captures what I need with adequate frequency to tell me what I need.

I hope these analogies help more than hinder. If I have confused anything, please feel free to ask and I’ll be happy to try explaining it a different way.
If you have other analogies that you’ve used to help explain the topic to others, I’d like to hear about them.

12c big strings – quasi-lobs


One of the new features of 12c is the ability to create varchar2 columns up to 32k in size.
This feature isn’t enabled by default.
You have to put the db into upgrade mode, set max_string_size to EXTENDED and restart the db.
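For reference, on a non-CDB instance the enabling sequence looks roughly like this. This is a sketch, not a definitive procedure; check the 12c documentation for your configuration, and note that multitenant databases need additional per-PDB steps.

```sql
-- Run as SYSDBA; this requires downtime
SHUTDOWN IMMEDIATE;
STARTUP UPGRADE;
ALTER SYSTEM SET max_string_size = EXTENDED;
@?/rdbms/admin/utl32k.sql   -- adjusts existing objects for 32k strings
SHUTDOWN IMMEDIATE;
STARTUP;
```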

However, once you do that, the big strings are just like normal varchar2 columns:

SQL> CREATE TABLE test_table (big_col VARCHAR2(32767));

Table created.

Or are they? Where does all that data go?


SQL> SELECT table_name,column_name,segment_name,in_row FROM user_lobs;

TABLE_NAME COLUMN_NAME SEGMENT_NAME IN_ROW
--------------- --------------- ------------------------------ ----------
TEST_TABLE BIG_COL SYS_LOB0000092677C00001$$ YES

The big varchar2 columns act like mini-clobs and are written to LOB segments.
So that got me to thinking…
I can create an index on a varchar2 column but I can’t on a clob column.
I’ll get ORA-02327: cannot create index on expression with datatype LOB

So what about these hybrid things?

SQL> CREATE INDEX test_index ON test_table(big_col);
CREATE INDEX test_index ON test_table(big_col)
*
ERROR at line 1:
ORA-01450: maximum key length (6398) exceeded

Ah, now that’s different. That’s an old error message, but not one I encounter very often.

So, one last test. What about a varchar2 that’s bigger than the old limits but not quite so huge to run into the max key length?

SQL> alter table test_table modify (big_col varchar2(5555));

Table altered.

SQL> CREATE INDEX test_index ON test_table(big_col);

Index created.

So, now I have data stored as if it were a clob but I can index it because we call it a varchar2.
It’s not likely I’d be using a big column like these as an indexed field, but it’s interesting how the new structures behave.
