My blog is moving


I’m moving my blog to a new host under my own domain: seanstuber.com

Raw content has already been moved; I still need to work out some cosmetics.


Using MERGE to INSERT overlapping source data


Recently I’ve been working with a data source that generates a new file each day for me to import. That seemed simple enough: I created an external table to read the file and then ran an insert statement to load the data into a target table. However, the source files sometimes overlap each other. That is, on day 1 I might get ids 1,2,3,4,5; then on day 2 I might get ids 3,4,5,6,7,8,9. I could simply use a MERGE statement to update matching records and then insert the new, unmatched ones. That’s what MERGE is for; but in this case I know the source data has static history. So, if I update ID 3 on the second day because I find a match from the first day, I know the update won’t actually change any data, making for an inefficient process doing dummy work.

As an alternate solution, I could use an insert with a unique constraint to prevent duplicates, and use DBMS_ERRLOG to create a log table to catch the duplicate rows, allowing the insert to continue rather than abort when a duplicate is found. This too, though, seems like a waste of resources.
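For reference, that error-logging variant might look something like the sketch below. It assumes a unique constraint exists on target.id (the test tables later in this article don’t declare one), and ERR$_TARGET is the default log table name generated by DBMS_ERRLOG.

-- Sketch only: assumes a unique constraint on TARGET(ID)
BEGIN
    DBMS_ERRLOG.create_error_log(dml_table_name => 'TARGET');  -- creates ERR$_TARGET
END;
/

INSERT INTO target (id, textval, dateval, numval)
    SELECT id, textval, dateval, numval
      FROM source
       LOG ERRORS INTO err$_target ('daily load')
       REJECT LIMIT UNLIMITED;

The duplicate rows land in ERR$_TARGET with an ORA-00001 error text instead of aborting the insert; they still cost the work of attempting the insert and logging the failure.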

Of course, another option would be to do a select from the source table that excludes the matching ids with an anti-join, a not-exists subquery, or a not-in subquery.
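For example, a not-exists version of the load might look like this (using the same source and target tables defined below):

INSERT INTO target (id, textval, dateval, numval)
    SELECT s.id, s.textval, s.dateval, s.numval
      FROM source s
     WHERE NOT EXISTS
               (SELECT NULL
                  FROM target t
                 WHERE t.id = s.id);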

All of these variations would work, but they are less desirable in that they require extra steps or are needlessly complex.

The easiest method I found was to simply use a one-sided merge. That is, only define a WHEN NOT MATCHED condition, so the matching records simply drop out of the merge statement when they are detected without extra processing or raising exceptions.

As a test case to illustrate, I’ll use a normal table instead of an external table as the source and simulate 3 days of results to load into the target table.

CREATE TABLE target
(
    id         NUMBER,
    textval    VARCHAR2(10),
    dateval    DATE,
    numval     NUMBER
);

CREATE TABLE source
(
    id         NUMBER,
    textval    VARCHAR2(10),
    dateval    DATE,
    numval     NUMBER
);

On day 1 we receive 5 rows of data.

-- Source data Day-1   
truncate table source;            
Insert into source (id, textval, dateval, numval) values (1,'one',  date '2018-09-02',1);
Insert into source (id, textval, dateval, numval) values (2,'two',  date '2018-09-02',4);
Insert into source (id, textval, dateval, numval) values (3,'three',date '2018-09-02',9);
Insert into source (id, textval, dateval, numval) values (4,'four', date '2018-09-02',16);
Insert into source (id, textval, dateval, numval) values (5,'five', date '2018-09-02',25);
commit;

Then use merge to insert the first day’s data.

SQL> MERGE INTO target t
  2       USING source s
  3          ON (s.id = t.id)
  4  WHEN NOT MATCHED
  5  THEN
  6      INSERT     (id,
  7                  textval,
  8                  dateval,
  9                  numval)
 10          VALUES (s.id,
 11                  s.textval,
 12                  s.dateval,
 13                  s.numval);

5 rows merged.

SQL> COMMIT;

Commit complete.

SQL> SELECT * FROM target ORDER BY id;

        ID TEXTVAL    DATEVAL       NUMVAL
---------- ---------- --------- ----------
         1 one        02-SEP-18          1
         2 two        02-SEP-18          4
         3 three      02-SEP-18          9
         4 four       02-SEP-18         16
         5 five       02-SEP-18         25

Then on day 2 we receive 4 rows of data: 2 repeats and 2 new ones.

-- Source data Day-2 
truncate table source;  
Insert into source (id, textval, dateval, numval) values (4,'four', date '2018-09-02',16);
Insert into source (id, textval, dateval, numval) values (5,'five', date '2018-09-02',25);
Insert into source (id, textval, dateval, numval) values (6,'six',  date '2018-09-03',36);
Insert into source (id, textval, dateval, numval) values (7,'seven',date '2018-09-03',49);
commit;

Then merge again. Note only the two new rows are merged.

SQL> MERGE INTO target t
  2       USING source s
  3          ON (s.id = t.id)
  4  WHEN NOT MATCHED
  5  THEN
  6      INSERT     (id,
  7                  textval,
  8                  dateval,
  9                  numval)
 10          VALUES (s.id,
 11                  s.textval,
 12                  s.dateval,
 13                  s.numval);

2 rows merged.

SQL> COMMIT;

Commit complete.

SQL> SELECT * FROM target ORDER BY id;

        ID TEXTVAL    DATEVAL       NUMVAL
---------- ---------- --------- ----------
         1 one        02-SEP-18          1
         2 two        02-SEP-18          4
         3 three      02-SEP-18          9
         4 four       02-SEP-18         16
         5 five       02-SEP-18         25
         6 six        03-SEP-18         36
         7 seven      03-SEP-18         49

On day 3 we receive 4 more rows, this time with 1 duplicate and 3 new ones.

-- Source data Day-3
truncate table source;
Insert into source (id, textval, dateval, numval) values (7, 'seven',date '2018-09-03',49);
Insert into source (id, textval, dateval, numval) values (8, 'eight',date '2018-09-04',64);
Insert into source (id, textval, dateval, numval) values (9, 'nine', date '2018-09-04',81);
Insert into source (id, textval, dateval, numval) values (10,'ten',  date '2018-09-04',100);
commit;

Merge one more time and we have 3 new rows merged, leaving only the 10 distinct rows from all 3 days in the target table.

SQL> MERGE INTO target t
  2       USING source s
  3          ON (s.id = t.id)
  4  WHEN NOT MATCHED
  5  THEN
  6      INSERT     (id,
  7                  textval,
  8                  dateval,
  9                  numval)
 10          VALUES (s.id,
 11                  s.textval,
 12                  s.dateval,
 13                  s.numval);

3 rows merged.

SQL> COMMIT;

Commit complete.

SQL> SELECT * FROM target ORDER BY id;

        ID TEXTVAL    DATEVAL       NUMVAL
---------- ---------- --------- ----------
         1 one        02-SEP-18          1
         2 two        02-SEP-18          4
         3 three      02-SEP-18          9
         4 four       02-SEP-18         16
         5 five       02-SEP-18         25
         6 six        03-SEP-18         36
         7 seven      03-SEP-18         49
         8 eight      04-SEP-18         64
         9 nine       04-SEP-18         81
        10 ten        04-SEP-18        100

I’m not Steven Feuerstein, should I even bother blogging?


If you’re not into PL/SQL, substitute Jonathan Lewis, Tanel Poder, Maria Colgan, or any other technology great you might respect. Doesn’t even need to be Oracle related.

The answer, quite simply, is yes.

First – it’s not a competition. If it were, then yes, I concede to the guy that literally wrote the book on “Oracle PL/SQL Programming.” I don’t blog to show I know more or even as much as he does. I blog to share examples, interesting quirks, and use cases I’ve encountered. For all of the syntax descriptions in the Oracle manuals and all of the code downloads and explanatory videos out there, maybe my contribution will be the one that clicks with some reader googling for a particular problem. The point of my code isn’t that nobody else could have written what I did, but simply that I did write it, and maybe somebody else can benefit from it.

So, if you’ve written something you thought was interesting, by all means share it. Maybe it’s something with nested table collections that has been covered a thousand times before. Post it anyway, maybe your specific example, with your choice of descriptive phrasing will be the one that catches the next reader’s eye.

Second – more diverse voices matter. I’ve been told my style of writing is too casual and sometimes unprofessionally conversational. I’ve also been told my writing helps make new syntax and complicated problems more approachable. So even if my content might be exactly what someone needs, if my delivery doesn’t work for them, maybe the message gets lost. I’d feel a little bad about that; but it happens. Some of it is just implicit. I only read one language well, English, so I only read blogs and watch videos in English. I’m sure I miss out on some good content because of that limitation. Others might too.

So, maybe you have something to say about collections that you know for sure Feuerstein has covered before. Go ahead, say it anyway. Use your code, your words, your language. Get your message, explained your way, out there. If you happen to be a polyglot, even better. Translate your own work.

Third – what you’re posting is probably new to someone. I’ve been coding in PL/SQL since version 7. I’ve written tens of thousands of lines of code. I don’t believe there is anything I’ve written that couldn’t have been written by lots of other people, but nobody is writing everything. You might be surprised by what others will look for. Month after month, the hottest article on my blog (by far) from search engines is one I wrote several years ago on using DBMS_PARALLEL_EXECUTE to run parallel PL/SQL. I posted it because I thought it was a kind of quirky thing to do. I had no idea so many other people would search for exactly that. I’m going to guess I’m not the first or only person to do it; but I bothered to blog about it, so now that article is a resource out there for others. If I picked my blog articles based on what I thought would garner the most “hits” then I never would have written that article. My goal is one. If one person likes a post then I’ve done my job. Two and up is just bonus.

So, if you’ve got some strange use case and think it’s too obscure to be helpful to anyone else – post it anyway. Even if you’re right that it can’t be used in its exact form in other code, your approach, or the general idea behind your solution, might be the inspiration somebody else needs to get their problem solved.

Fourth – blogging is also helpful to the author. The act of writing an instructional blog about code is different than the act of writing the code itself. It’s also different than writing comments for the code. When you write a blog article about some block of code you are removing that code from the context in which it was originally written. So, comments you might leave in the code for yourself or future maintainers might be meaningless when read with a stand-alone example. Similarly, intent may be obvious within the context of an application, but in isolation a code block could be opaque. Also, you don’t know your audience, so you need to write in a way that aids the intended reader. This may mean exploring syntax minutiae and documenting features that are or are not used in order to give scope to your code. These extra steps improve your skill as a writer; and you will often find you learn more about your own code as you start fleshing out the areas around it.

So, if you’ve got some code or practice in which you have some expertise, challenge yourself to write about it so a novice can understand it. Or, you can flip it around: if there is something you want to learn about for yourself, start writing about it. You don’t have to publish immediately at each twist, turn, trial, and error; but as you reach milestones – write about them. Your own blog can then be a reference for yourself as well.

Fifth – start the conversation. If it’s your first line of code or your millionth, you have something to say about what you’re doing. Write it down. If you’ve got something good – show it off. If you’ve got a question – ask it. Maybe it’s something in between, a little code block, nothing fancy, but explained well. Put the code out there, get some feedback. You might get kudos and thanks, you might get questions, you might get corrections. Any kind of feedback is good. You might get nothing. That’s fine too. Most people don’t comment, “like”, or “vote” on posts. It’s ok, post anyway. When I started I was getting 1 or 2 visits per month, then a few more, then a dozen, then a few dozen, then a hundred, and another hundred, and another hundred, and so on. The audience is growing in silence, but growing. But, let’s say it didn’t. A lack of traffic would itself be feedback. Maybe you need a different topic, different writing style, different audience. Maybe there are no problems with your content, but simply your blog’s interaction with search engines.

So, everyone has something to share; go ahead and share it. You might feel self-conscious about posting some code you’re not sure about. You can either take a little more time to refine it, or go ahead and post it and ask for improvements. Maybe you’re afraid you’ll post your awesome toolkit and somebody better will come along and tear it to shreds, detailing hole after hole. For one, that’s probably not going to happen. If it does, and it’s bad enough, you can delete the article and repost later after applying corrections, or simply append to the article showing before and after. The work of changing could itself become an article series. I once wrote a 3-part series exploring syntax variations of a single query to solve a question posed by a friend. All but the very last example in the third article are objectively bad ways to do the work; but the point of the series was to explore and explain different approaches to the problem. I intentionally posted answers that I would not recommend for the sake of the exploration. Had I jumped immediately to the simple solution I gave him, it wouldn’t have been as interesting or educational. When my friend asked me how to do it, he already assumed I could. When he read all the different ways I came up with he responded “I didn’t know you could do that!” I both solved the problem and showed him something new, making the whole exercise worthwhile.

In conclusion, if you’re thinking about starting a blog, just start. Some blog every day, some once a week, some once a month. Some are regular, some in bursts. It doesn’t matter, just start. Push out some content, talk to some people. Who knows? The next person to read your blog might be a potential employer that didn’t know you existed until he or she saw your work.

Go, put fingers to keyboard and welcome aboard!

How to use DBMS_LDAP (part 6: Modifying)


Table of Contents

  1. Introduction
  2. Establishing a Connection
  3. Searching Hierarchical Data
  4. Browsing Attributes
  5. Timeouts
  6. Modifying Data (this page)

Modifying data in the LDAP hierarchy

The DBMS_LDAP package allows you to write as well as read data from the directory. However, it does not provide an API for manipulating the schemas within the tree. That is, you can add, delete, or modify records of predefined structures but you may not create new structure types of your own.

In my previous articles in this series I used publicly available servers for my examples. As you might expect, those are read-only. So, in this article my examples will be based on the LDAP structures used for Oracle database name resolution. These could be from Oracle’s own OID product, or some other LDAP server hosting the same schema.

Inserting a new record

To insert a new record, you might think of it as similar to a SQL insert into an index-organized table. That is, unlike a normal heap table, where rows can go anywhere in the table, in an IOT, or an LDAP structure, each row can only exist within a specific location. Thus an LDAP insert consists of defining the unique key of the record, i.e. where the new record will go, as well as the content fields of the new record.

The “where” portion consists of the DN, as we’ve used in searching in previous chapters. The content is defined by an array of attributes. Each attribute is itself an array. In the example below, each attribute has only a single value, so as the modification array is constructed, each attribute contributes an array of just one element. Since every node within a directory must have an object class, there will always be at least one attribute populated in the modification array to set that value.

In the case of an Oracle Net Service Name, the object class is “orclNetService”. In order to be a useful record it must also include the name to be resolved, as well as the TNS description to which the name resolves. So, in this example I am inserting a new name “test_name” that resolves to “(DESCRIPTION=(ADDRESS=(PROTOCOL=TCP)(HOST=myserver)(PORT=1521))(CONNECT_DATA=(SERVICE_NAME=myservice)))”

DECLARE
    v_session      DBMS_LDAP.session;
    v_result       PLS_INTEGER;
    v_attributes   DBMS_LDAP.mod_array;
    v_strings      DBMS_LDAP.string_collection;
BEGIN
    DBMS_LDAP.use_exception := FALSE;
    v_session := DBMS_LDAP.init(hostname => 'ldap.example.net', portnum => DBMS_LDAP.port);

    v_result :=
        DBMS_LDAP.simple_bind_s(ld       => v_session,
                                dn       => 'uid=myuserid,ou=users,dc=example,dc=net',
                                passwd   => 'secret_password');

    IF v_result != DBMS_LDAP.success
    THEN
        DBMS_OUTPUT.put_line('Bind Result: ' || v_result || ' ' || DBMS_LDAP.err2string(v_result));
    ELSE
        v_attributes := DBMS_LDAP.create_mod_array(num => 3);

        v_strings(1) := 'orclNetService';
        DBMS_LDAP.populate_mod_array(modptr     => v_attributes,
                                     mod_op     => DBMS_LDAP.mod_add,
                                     mod_type   => 'objectclass',
                                     modval     => v_strings);

        v_strings(1) := 'test_name';
        DBMS_LDAP.populate_mod_array(modptr     => v_attributes,
                                     mod_op     => DBMS_LDAP.mod_add,
                                     mod_type   => 'cn',
                                     modval     => v_strings);

        v_strings(1) :=
            '(DESCRIPTION=(ADDRESS=(PROTOCOL=TCP)(HOST=myserver)(PORT=1521))(CONNECT_DATA=(SERVICE_NAME=myservice)))';
        DBMS_LDAP.populate_mod_array(modptr     => v_attributes,
                                     mod_op     => DBMS_LDAP.mod_add,
                                     mod_type   => 'orclNetDescString',
                                     modval     => v_strings);

        v_result :=
            DBMS_LDAP.add_s(ld        => v_session,
                            entrydn   => 'cn=test_name,cn=OracleContext,dc=example,dc=net',
                            modptr    => v_attributes);

        IF v_result = DBMS_LDAP.success
        THEN
            DBMS_OUTPUT.put_line('Add successful');
        ELSE
            DBMS_OUTPUT.put_line('Add Result: ' || v_result || ' ' || DBMS_LDAP.err2string(v_result));
        END IF;

        DBMS_LDAP.free_mod_array(modptr => v_attributes);
        v_result := DBMS_LDAP.unbind_s(v_session);
    END IF;
END;
/
Add successful

If you run the block twice, you should get an error on the second run for trying to create two entries at the same DN.

Add Result: 68 Already exists

Deleting an old record

Deleting a record is fairly simple. To make a correlation with SQL, you simply specify the unique key to delete that record. DBMS_LDAP does not provide an API for mass deletion. If you wanted to delete multiple records, you could call the delete_s function for each DN, or, if you didn’t know them, you could perform a search as shown in prior chapters, and then retrieve the DN from each search result to do the delete.

DECLARE
    v_session          DBMS_LDAP.session;
    v_result           PLS_INTEGER;
BEGIN
    DBMS_LDAP.use_exception := FALSE;
    v_session := DBMS_LDAP.init(hostname => 'ldap.example.net', portnum => DBMS_LDAP.port);

    v_result :=
        DBMS_LDAP.simple_bind_s(ld       => v_session,
                                dn       => 'uid=myuserid,ou=users,dc=example,dc=net',
                                passwd   => 'secret_password');

    IF v_result != DBMS_LDAP.success
    THEN
        DBMS_OUTPUT.put_line('Bind Result: ' || v_result || ' ' || DBMS_LDAP.err2string(v_result));
    ELSE
        v_result := DBMS_LDAP.delete_s(ld => v_session, entrydn => 'cn=test_name,cn=OracleContext,dc=example,dc=net');

        IF v_result = DBMS_LDAP.success
        THEN
            DBMS_OUTPUT.put_line('Delete successful');
        ELSE
            DBMS_OUTPUT.put_line('Delete Result: ' || v_result || ' ' || DBMS_LDAP.err2string(v_result));
        END IF;

        v_result := DBMS_LDAP.unbind_s(v_session);
    END IF;
END;
/
Delete successful

If you run the block twice, you should get an error on the second run for trying to delete a DN that doesn’t exist.

Delete Result: 32 No such object
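As mentioned above, deleting multiple entries means combining a search with delete_s. A minimal sketch of that pattern follows; the base DN and filter here are illustrative, and the per-call error checking shown in the other examples is trimmed for brevity.

DECLARE
    v_session   DBMS_LDAP.session;
    v_result    PLS_INTEGER;
    v_message   DBMS_LDAP.message;
    v_entry     DBMS_LDAP.message;
    v_attrs     DBMS_LDAP.string_collection;
    v_dn        VARCHAR2(1000);
BEGIN
    DBMS_LDAP.use_exception := FALSE;
    v_session := DBMS_LDAP.init(hostname => 'ldap.example.net', portnum => DBMS_LDAP.port);

    v_result :=
        DBMS_LDAP.simple_bind_s(ld       => v_session,
                                dn       => 'uid=myuserid,ou=users,dc=example,dc=net',
                                passwd   => 'secret_password');

    -- Find the entries to remove (illustrative filter matching our test names)
    v_attrs(1) := 'cn';
    v_result :=
        DBMS_LDAP.search_s(ld       => v_session,
                           base     => 'cn=OracleContext,dc=example,dc=net',
                           scope    => DBMS_LDAP.scope_onelevel,
                           filter   => 'cn=test_*',
                           attrs    => v_attrs,
                           attronly => 0,
                           res      => v_message);

    -- Walk the results, retrieving each DN and deleting it
    v_entry := DBMS_LDAP.first_entry(ld => v_session, msg => v_message);

    WHILE v_entry IS NOT NULL
    LOOP
        v_dn := DBMS_LDAP.get_dn(ld => v_session, ldapentry => v_entry);
        v_result := DBMS_LDAP.delete_s(ld => v_session, entrydn => v_dn);
        v_entry := DBMS_LDAP.next_entry(ld => v_session, msg => v_entry);
    END LOOP;

    v_result := DBMS_LDAP.unbind_s(v_session);
END;
/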

Modifying a record

Modifying a record in LDAP can, but doesn’t always, correlate with a SQL update. At the simplest level, you can specify the unique key, the DN, as you would with a delete; but instead of removing the record, you replace the value of an attribute with a new value. In SQL terms this would be analogous to something like this:

update my_table set my_attribute = new_value where id = my_dn;

You can update more than one attribute by populating the modification array with more than one replace command. Here I update our test record with a new TNS description pointing to another server with a different service name for that listener.

DECLARE
    v_session      DBMS_LDAP.session;
    v_result       PLS_INTEGER;
    v_attributes   DBMS_LDAP.mod_array;
    v_strings      DBMS_LDAP.string_collection;
BEGIN
    DBMS_LDAP.use_exception := FALSE;
    v_session := DBMS_LDAP.init(hostname => 'ldap.example.net', portnum => DBMS_LDAP.port);

    v_result := DBMS_LDAP.simple_bind_s(ld => v_session, dn => 'uid=myuserid,ou=users,dc=example,dc=net', passwd => 'secret_password');

    IF v_result != DBMS_LDAP.success
    THEN
        DBMS_OUTPUT.put_line('Bind Result: ' || v_result || ' ' || DBMS_LDAP.err2string(v_result));
    ELSE
        v_attributes := DBMS_LDAP.create_mod_array(num => 1);

        v_strings(1) :=
            '(DESCRIPTION=(ADDRESS=(PROTOCOL=TCP)(HOST=myotherserver)(PORT=1521))(CONNECT_DATA=(SERVICE_NAME=myotherservice)))';
        DBMS_LDAP.populate_mod_array(modptr     => v_attributes,
                                     mod_op     => DBMS_LDAP.mod_replace,
                                     mod_type   => 'orclNetDescString',
                                     modval     => v_strings);

        v_result :=
            DBMS_LDAP.modify_s(ld        => v_session,
                               entrydn   => 'cn=test_name,cn=OracleContext,dc=example,dc=net',
                               modptr    => v_attributes);

        IF v_result = DBMS_LDAP.success
        THEN
            DBMS_OUTPUT.put_line('Modify successful');
        ELSE
            DBMS_OUTPUT.put_line('Modify Result: ' || v_result || ' ' || DBMS_LDAP.err2string(v_result));
        END IF;

        DBMS_LDAP.free_mod_array(modptr => v_attributes);
        v_result := DBMS_LDAP.unbind_s(v_session);
    END IF;
END;
/
Modify successful

While that replace looks like a fairly standard update, correlating well with SQL, you can also modify records to remove attributes. In SQL terms, that would be like removing a column from just one row of a table. Not setting it to NULL, but actually removing the column entirely.

Or, maybe a closer analogy, if your table had a collection type for a column and you deleted an element from that collection.

Even that, though, still doesn’t quite compare, because if you delete all attributes of a certain type, there is no placeholder attribute left behind in an LDAP directory; with a collection type in a table, the column would still exist, but hold a null or empty collection.

So, the lesson here is that LDAP modify operations should be seen as more complex operations than simple SQL updates.

Next I’ll perform an equivalent update as before, except instead of doing a replace, I’ll add a new description attribute and then delete the old one. This might seem like a constraint violation because there will temporarily be two descriptions for a single name; but the modify_s function is essentially atomic, no matter how many attribute modifications are applied. So, the LDAP constraint ensuring a single description for a name isn’t applied until after all of the modifications are completed.

This example is merely for illustration, not a recommendation. If you have a single value to update, using replace is more efficient than an add and delete. Also note this example assumes the directory is populated after the initial insert above, not after the other modify example.

DECLARE
    v_session      DBMS_LDAP.session;
    v_result       PLS_INTEGER;
    v_attributes   DBMS_LDAP.mod_array;
    v_strings      DBMS_LDAP.string_collection;
BEGIN
    DBMS_LDAP.use_exception := FALSE;
    v_session := DBMS_LDAP.init(hostname => 'ldap.example.net', portnum => DBMS_LDAP.port);

    v_result :=
        DBMS_LDAP.simple_bind_s(ld       => v_session,
                                dn       => 'uid=myuserid,ou=users,dc=example,dc=net',
                                passwd   => 'secret_password');

    IF v_result != DBMS_LDAP.success
    THEN
        DBMS_OUTPUT.put_line('Bind Result: ' || v_result || ' ' || DBMS_LDAP.err2string(v_result));
    ELSE
        v_attributes := DBMS_LDAP.create_mod_array(num => 2);

        v_strings(1) :=
            '(DESCRIPTION=(ADDRESS=(PROTOCOL=TCP)(HOST=myotherserver)(PORT=1521))(CONNECT_DATA=(SERVICE_NAME=myotherservice)))';
        DBMS_LDAP.populate_mod_array(modptr     => v_attributes,
                                     mod_op     => DBMS_LDAP.mod_add,
                                     mod_type   => 'orclNetDescString',
                                     modval     => v_strings);

        v_strings(1) :=
            '(DESCRIPTION=(ADDRESS=(PROTOCOL=TCP)(HOST=myserver)(PORT=1521))(CONNECT_DATA=(SERVICE_NAME=myservice)))';
        DBMS_LDAP.populate_mod_array(modptr     => v_attributes,
                                     mod_op     => DBMS_LDAP.mod_delete,
                                     mod_type   => 'orclNetDescString',
                                     modval     => v_strings);


        v_result :=
            DBMS_LDAP.modify_s(ld        => v_session,
                               entrydn   => 'cn=test_name,cn=OracleContext,dc=example,dc=net',
                               modptr    => v_attributes);

        IF v_result = DBMS_LDAP.success
        THEN
            DBMS_OUTPUT.put_line('Modify successful');
        ELSE
            DBMS_OUTPUT.put_line('Modify Result: ' || v_result || ' ' || DBMS_LDAP.err2string(v_result));
        END IF;

        DBMS_LDAP.free_mod_array(modptr => v_attributes);
        v_result := DBMS_LDAP.unbind_s(v_session);
    END IF;
END;

If the block is run twice you’ll get an error trying to add the same description value twice.

Modify Result: 20 Type or value exists

Rearranging the delete to occur first won’t help, because the delete is for the old description value. So, if you put the delete first and run it twice, you’ll get a different error, but the block will still fail.

Modify Result: 16 No such attribute

In addition to the inefficiency of using two modifications instead of one, these errors also help illustrate why the replace modification is a better choice when it applies.

Of course, if you have a record with a multi-value attribute, it may be appropriate to delete one value and add another instead of doing a replace. For example, when updating a person’s record you might delete one or more old email values and add new ones. Or you might not do a replace at all, maybe adding children to a parent or making changes to the employees of a department; the modification action may consist only of add operations or only of deletes. During a business’s reorganization a department might change names using a replace, change managers, also using a replace, and then add several employees and remove others, all of which could be executed with a single modify_s call to the department’s record.

This concludes my series on DBMS_LDAP. I hope you found it enlightening and the examples useful starting points for your own applications. I know I learned more in the process of writing it.

Questions on this chapter or any of the previous chapters are always welcome.

Strange (erroneous) query from Oracle Audit Vault


Today I was asked about a query from an Audit Vault job that was failing. Unfortunately, the job didn’t report any useful error messages to explain the failure. After some digging, the problem was narrowed down to a query pulling user account information. In particular, a calculated column was proving both erroneous and confusing, thus prompting the question: “What is this doing?”

The calculated column (with a little formatting) was:

TO_CHAR(lock_date
        - (TO_NUMBER(SUBSTR(SYSTIMESTAMP,
                            (INSTR(SYSTIMESTAMP,':',-1,1)) - 3,
                            3)
                    )
          ) / 24,
        'MM/dd/yyyy HH24:mi:ss')

So, the first question was – what is this trying to do, and why does it fail? The first thing I noticed was that the function was attempting to parse SYSTIMESTAMP as if it were text. That is possible because Oracle will implicitly convert it to a VARCHAR2 type, but we don’t know the format the job was using when it failed. I ran the query in a test database with my session’s NLS_TIMESTAMP_TZ_FORMAT set to ‘yyyy-mm-dd hh24:mi:ssxff tzr’, producing results like this:

SQL> alter session set nls_timestamp_tz_format='yyyy-mm-dd hh24:mi:ssxff tzr';

Session altered.

SQL> alter session set nls_date_format='yyyy-mm-dd hh24:mi:ss';

Session altered.

SQL> select lock_date,TO_CHAR(lock_date
  2          - (TO_NUMBER(SUBSTR(SYSTIMESTAMP,
  3                              (INSTR(SYSTIMESTAMP,':',-1,1)) - 3,
  4                              3)
  5                      )
  6            ) / 24,
  7          'MM/dd/yyyy HH24:mi:ss') ae$date from sys.cdb_users;

LOCK_DATE           AE$DATE
------------------- -------------------
2018-02-07 20:00:11 02/08/2018 00:00:11
2018-08-21 23:42:02 08/22/2018 03:42:02
2018-02-07 19:21:02 02/07/2018 23:21:02
2018-02-07 20:15:41 02/08/2018 00:15:41
2018-02-07 20:15:45 02/08/2018 00:15:45
2018-02-07 20:00:25 02/08/2018 00:00:25
2018-02-07 20:16:04 02/08/2018 00:16:04
2018-02-07 21:05:41 02/08/2018 01:05:41
2018-08-25 14:25:05 08/25/2018 18:25:05

Thus it seems the calculation is shifting the LOCK_DATE by 4 hours; but why, and why does it work for me in the test db but not for Audit Vault in the real one? The answer lies in the text parsing.

In the problem db there was a logon trigger setting NLS_TIMESTAMP_TZ_FORMAT to a custom format. The calculation assumes that either the ‘TZR’ format model (with a numeric offset) or ‘TZH:TZM’ is the last part of the converted text. In my test scenario that was the case; in the problem db it was not, because that database used a format supporting the assumptions of other processes. This is the reason implicit conversions should be avoided: you can’t always count on implicit formats being what you expect. Furthermore, there are security implications when implicit conversions are used. A better solution would have been to replace the implicit conversions with explicit ones. For example, rewriting the query as shown below works regardless of the session timestamp format mask.

SQL> select lock_date,TO_CHAR(lock_date
  2          - (TO_NUMBER(SUBSTR(TO_CHAR(SYSTIMESTAMP,'TZR'),
  3                              (INSTR(TO_CHAR(SYSTIMESTAMP,'TZR'),':',-1,1)) - 3,
  4                              3)
  5                      )
  6            ) / 24,
  7          'MM/dd/yyyy HH24:mi:ss') ae$date from sys.cdb_users;

LOCK_DATE           AE$DATE
------------------- -------------------
2018-02-07 20:00:11 02/08/2018 00:00:11
2018-08-21 23:42:02 08/22/2018 03:42:02
2018-02-07 19:21:02 02/07/2018 23:21:02
2018-02-07 20:15:41 02/08/2018 00:15:41
2018-02-07 20:15:45 02/08/2018 00:15:45
2018-02-07 20:00:25 02/08/2018 00:00:25
2018-02-07 20:16:04 02/08/2018 00:16:04
2018-02-07 21:05:41 02/08/2018 01:05:41
2018-08-25 14:25:05 08/25/2018 18:25:05

That solves the immediate problem of why the query fails and what needs to be done to fix it; but still doesn’t answer the how and intent. The function is extracting the time zone hour offset from SYSTIMESTAMP, but how? In my test case the offset was -04:00 because I was running in US/Eastern time zone during Daylight Saving Time.

INSTR finds the colon separating the hours from the minutes, then the index is adjusted 3 characters to the left to accommodate hour offsets of -23 to +23 (real time zones only range from -12 to +14, though). SUBSTR then parses the converted SYSTIMESTAMP text from that index, returning the hour portion of the offset; and finally TO_NUMBER changes that substring into a number – here too it would be better to use an explicit format, such as TO_NUMBER(…,’S00’), to ensure a valid conversion.
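To make the string surgery easier to follow, here is a hypothetical Python re-creation of the same parse; the sample string '-04:00' is an assumption standing in for the text TO_CHAR(SYSTIMESTAMP,'TZR') produced in my test case:

```python
# Hypothetical re-creation of the INSTR/SUBSTR/TO_NUMBER logic.
# "-04:00" stands in for the offset text TO_CHAR(SYSTIMESTAMP,'TZR')
# produced while running in US/Eastern Daylight Saving Time.
tzr = "-04:00"

pos = tzr.rfind(":") + 1      # INSTR(...,':',-1,1): 1-based index of the last colon
start = pos - 3               # back up 3 characters to allow hour offsets of -23 to +23
hour = int(tzr[start - 1 : start - 1 + 3])  # SUBSTR(tzr, start, 3), then TO_NUMBER

print(hour)  # -4
```

Seeing the same steps in a scripting language makes it clearer just how much work is being done to recover a single two-digit number.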

Thus a “final” form with all conversions made explicit and reliable regardless of session settings might look like this:

TO_CHAR(lock_date
- (TO_NUMBER(SUBSTR(TO_CHAR(SYSTIMESTAMP,'TZH:TZM'),
                    (INSTR(TO_CHAR(SYSTIMESTAMP,'TZH:TZM'),':',-1,1)) - 3,
                    3),
             'S00'
            )
  ) / 24,
'MM/dd/yyyy HH24:mi:ss')

While this syntax would function reliably, it would still fail a code inspection for violating my rule of “Does this code jump through too many hoops?”
This was evident before ever running the code myself. The initial question was “What is this doing?” With four data type conversions, multiple substring parses, an index adjustment, and date math, the result is certainly not immediately obvious, prompting the next question: “Is there a better way?”

Yes, in multiple ways. First, it is possible to simplify the syntax above by using only the TZH format, removing the need to find that field within the string. Even better, when timestamps were introduced in 9i they came with the EXTRACT function to pull pieces from them. So a much simpler, more efficient, and self-documenting syntax would be: extract(timezone_hour from systimestamp)

SQL> select extract(timezone_hour from systimestamp) from dual;

EXTRACT(TIMEZONE_HOURFROMSYSTIMESTAMP)
--------------------------------------
                                    -4

If used in the context of the original query…

SQL> SELECT lock_date, TO_CHAR(lock_date - EXTRACT(TIMEZONE_HOUR FROM SYSTIMESTAMP)/24,
  2                            'MM/dd/yyyy HH24:mi:ss') ae$date
  3    FROM sys.cdb_users;

LOCK_DATE           AE$DATE
------------------- -------------------
2018-02-07 20:00:11 02/08/2018 00:00:11
2018-08-21 23:42:02 08/22/2018 03:42:02
2018-02-07 19:21:02 02/07/2018 23:21:02
2018-02-07 20:15:41 02/08/2018 00:15:41
2018-02-07 20:15:45 02/08/2018 00:15:45
2018-02-07 20:00:25 02/08/2018 00:00:25
2018-02-07 20:16:04 02/08/2018 00:16:04
2018-02-07 21:05:41 02/08/2018 01:05:41
2018-08-25 14:25:05 08/25/2018 18:25:05

That’s much better. Using the offset, the LOCK_DATE values are now adjusted to UTC/GMT time. This is helpful within the Audit Vault so all of the data is stored in a consistent form for easier comparison, joins, filtering, etc.

But…

The query is still not entirely correct!
Note all of the lock_date values from February. Their adjusted AE$DATE value is NOT equivalent to UTC/GMT time, but the August values are. The reason is that the conversion logic does not take Daylight Saving Time into account. During February my system would be on US/Eastern Standard Time, with an offset of -5 hours. If I wait a few more months until November, when I’m back on Standard Time, the February values will convert correctly but then the August values will be wrong.
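The mismatch is easy to confirm outside the database with Python’s zoneinfo module; a sketch (assuming the IANA zone America/New_York, equivalent to US/Eastern) computing the true offsets for the February and August sample values:

```python
from datetime import datetime
from zoneinfo import ZoneInfo

tz = ZoneInfo("America/New_York")  # US/Eastern

feb = datetime(2018, 2, 7, 20, 0, 11, tzinfo=tz)   # Eastern Standard Time
aug = datetime(2018, 8, 21, 23, 42, 2, tzinfo=tz)  # Eastern Daylight Time

print(feb.utcoffset())  # -1 day, 19:00:00  (i.e. -5 hours)
print(aug.utcoffset())  # -1 day, 20:00:00  (i.e. -4 hours)
```

The two rows genuinely have different offsets, so a single offset captured from SYSTIMESTAMP at query time cannot be correct for both.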

Unfortunately, there isn’t a good solution for this problem. It is possible, with a Java stored procedure or an external procedure call, to retrieve the system time zone setting. Ideally that will return a time zone name or unique abbreviation; if so, that zone could be applied to the lock date value instead of a fixed offset from the current system time.

That still isn’t completely reliable though!
The transition from Daylight Saving Time to Standard Time has two instances of the 1am-2am hour (in the US; other countries have their own variations of the same issue), and nothing in the DATE value of LOCK_DATE indicates whether a value is from the first or second pass through the transition period.
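Python’s datetime models exactly this ambiguity with its fold attribute; a sketch, assuming the 2018 US fall-back date of November 4:

```python
from datetime import datetime
from zoneinfo import ZoneInfo

tz = ZoneInfo("America/New_York")

# 1:30 AM on 2018-11-04 occurred twice; fold selects which instance is meant.
first  = datetime(2018, 11, 4, 1, 30, tzinfo=tz, fold=0)  # still EDT
second = datetime(2018, 11, 4, 1, 30, tzinfo=tz, fold=1)  # now EST

print(first.utcoffset())   # -1 day, 20:00:00  (i.e. -4 hours)
print(second.utcoffset())  # -1 day, 19:00:00  (i.e. -5 hours)
```

A plain Oracle DATE carries no equivalent of fold, so the stored value alone cannot distinguish the two instances.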

But, even if you somehow solve that problem, it’s STILL not completely correct!
All of the math to adjust dates to UTC/GMT assumes an integer hour offset; but some time zones around the world are not on whole-hour offsets. Asia/Kathmandu is +05:45, Canada/Newfoundland is -02:30. So the entire premise of the adjustment math is incorrect. This problem is possible to fix, though, if you know the full offset: simply include the minutes in the calculation as well as the hours.

lock_date - EXTRACT(TIMEZONE_HOUR FROM SYSTIMESTAMP)/24  - EXTRACT(TIMEZONE_MINUTE FROM SYSTIMESTAMP)/1440
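Those fractional offsets can be cross-checked with a hypothetical Python sketch (the IANA zone names Asia/Kathmandu and America/St_Johns are assumptions here, the latter being the canonical name for Canada/Newfoundland):

```python
from datetime import datetime
from zoneinfo import ZoneInfo

offsets = {}
for name in ("Asia/Kathmandu", "America/St_Johns"):
    local = datetime(2018, 8, 21, 12, 0, tzinfo=ZoneInfo(name))
    hours = local.utcoffset().total_seconds() / 3600
    offsets[name] = hours
    print(name, hours)  # Asia/Kathmandu 5.75, America/St_Johns -2.5
```

Neither offset is a whole number of hours, which is why the minute component must be carried into the date math as well (the /1440 term: minutes per day).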

In summary, the original query has a multitude of failings:

  1. Unreliable syntax that simply fails if specific system assumptions are not met
  2. Overly complex computation when native functionality exists
  3. Unreliable date math not supporting Daylight Saving Time
  4. Incorrect time adjustment not taking into account non-hourly offsets

The point of this article wasn’t to pick on Audit Vault; it’s a fine product, and these issues were relatively easy to work around within the real system. No, the real message is that I see these same types of problems all the time, especially when date/time values are involved. This one query highlighted several from a family of logic and syntax errors that developers fall prey to with alarming frequency. Hopefully the breakdown, rewrites, and comments will help others avoid some of these pitfalls and correct the ones they do encounter.

Questions and comments, as always, are welcome!

Updated Oracle password algorithm article for 18c.


https://seanstuber.wordpress.com/how-oracle-stores-passwords/

Just minor updates.

18c supports the same hashes as 12cR2 and doesn’t add any new ones. 18c also uses the same sqlnet.ora parameters and values as 12cR2.

Updated links to 18c documentation and added note that 18c is the same as 12.1.0.2 and 12cR2.

How to use DBMS_LDAP (part 5: Timeouts)


Table of Contents

  1. Introduction
  2. Establishing a Connection
  3. Searching Hierarchical Data
  4. Browsing Attributes
  5. Timeouts (this page)
  6. Modifying Data

Timeouts

LDAP servers are designed to be fast; but it is still possible for an action to take a long time. A deep sub-tree search with open search criteria could take seconds or even minutes to return data if the directory is large and no other criteria limit the results.

Thus clients are able to put timeout limits on their requests and abort them if they take too long. The public servers I’ve been using are relatively small in scope and/or have result limits, so they tend to be fast; but we can create an intentionally expensive search and time it. I’ll use the approximate equality operator to do an expensive search into a deep and wide subtree.

DECLARE
    v_result           PLS_INTEGER;
    v_session          DBMS_LDAP.session;
    v_search_attrs     DBMS_LDAP.string_collection;
    v_search_results   DBMS_LDAP.MESSAGE;
    v_start            TIMESTAMP WITH TIME ZONE;
    v_end              TIMESTAMP WITH TIME ZONE;
BEGIN
    DBMS_LDAP.use_exception := TRUE;

    v_session := DBMS_LDAP.init(hostname => 'ldap.andrew.cmu.edu', portnum => DBMS_LDAP.port);

    v_search_attrs(1) := '*';

    v_start := SYSTIMESTAMP;
    v_result :=
        DBMS_LDAP.search_s(ld         => v_session,
                           base       => 'dc=cmu,dc=edu',
                           scope      => DBMS_LDAP.scope_subtree,
                           filter     => 'maillocaladdress~=ayushb',
                           attrs      => v_search_attrs,
                           attronly   => 0,
                           res        => v_search_results);
    v_end := SYSTIMESTAMP;

    DBMS_OUTPUT.put_line(v_end - v_start);
    v_result := DBMS_LDAP.unbind_s(v_session);
END;
/

+000000000 00:00:20.977627000

So, just over 20 seconds for that search. If that is too long, a timeout can be specified using the DBMS_LDAP.TIMEVAL type. Then, instead of SEARCH_S, we use SEARCH_ST – “T” for timed. If the timeout is exceeded, an exception will be raised. In order to more clearly capture the timing, I’ll disable exceptions and examine the result code of the SEARCH_ST function instead. We’ll put a 2 second time limit on the search and check the timing when we’re done.

DECLARE
    v_result           PLS_INTEGER;
    v_session          DBMS_LDAP.session;
    v_search_attrs     DBMS_LDAP.string_collection;
    v_search_results   DBMS_LDAP.MESSAGE;
    v_start            TIMESTAMP WITH TIME ZONE;
    v_end              TIMESTAMP WITH TIME ZONE;
    v_timeout          DBMS_LDAP.timeval;
BEGIN
    DBMS_LDAP.use_exception := FALSE;

    v_session := DBMS_LDAP.init(hostname => 'ldap.andrew.cmu.edu', portnum => DBMS_LDAP.port);

    v_search_attrs(1) := '*';

    v_timeout.seconds := 2;

    v_start := SYSTIMESTAMP;
    v_result :=
        DBMS_LDAP.search_st(ld         => v_session,
                            base       => 'dc=cmu,dc=edu',
                            scope      => DBMS_LDAP.scope_subtree,
                            filter     => 'maillocaladdress~=ayushb',
                            attrs      => v_search_attrs,
                            attronly   => 0,
                            tv         => v_timeout,
                            res        => v_search_results);
    v_end := SYSTIMESTAMP;

    DBMS_OUTPUT.put_line('Error code: ' || v_result || ' ' || DBMS_LDAP.err2string(v_result));
    DBMS_OUTPUT.put_line(v_end - v_start);
    v_result := DBMS_LDAP.unbind_s(v_session);
END;
/

Error code: 85 Timed out
+000000000 00:00:02.000382000

As expected, the search quit after 2 seconds and exited.

The TIMEVAL type is actually a record of two fields: seconds and microseconds.

    TYPE TIMEVAL IS RECORD
      ( seconds  PLS_INTEGER,
        useconds PLS_INTEGER
      );

Either or both fields may be populated. If both are populated, the timeout value is the sum of the two times, as in the following example where seconds is populated with 1 and useconds with 1.5 million, for a total timeout of 2.5 seconds.

DECLARE
    v_result           PLS_INTEGER;
    v_session          DBMS_LDAP.session;
    v_search_attrs     DBMS_LDAP.string_collection;
    v_search_results   DBMS_LDAP.MESSAGE;
    v_start            TIMESTAMP WITH TIME ZONE;
    v_end              TIMESTAMP WITH TIME ZONE;
    v_timeout          DBMS_LDAP.timeval;
BEGIN
    DBMS_LDAP.use_exception := FALSE;

    v_session := DBMS_LDAP.init(hostname => 'ldap.andrew.cmu.edu', portnum => DBMS_LDAP.port);

    v_search_attrs(1) := '*';

    v_timeout.seconds := 1;
    v_timeout.useconds := 1500000;

    v_start := SYSTIMESTAMP;
    v_result :=
        DBMS_LDAP.search_st(ld         => v_session,
                            base       => 'dc=cmu,dc=edu',
                            scope      => DBMS_LDAP.scope_subtree,
                            filter     => 'maillocaladdress~=ayushb',
                            attrs      => v_search_attrs,
                            attronly   => 0,
                            tv         => v_timeout,
                            res        => v_search_results);
    v_end := SYSTIMESTAMP;

    DBMS_OUTPUT.put_line('Error code: ' || v_result || ' ' || DBMS_LDAP.err2string(v_result));
    DBMS_OUTPUT.put_line(v_end - v_start);
    v_result := DBMS_LDAP.unbind_s(v_session);
END;
/

Error code: 85 Timed out
+000000000 00:00:02.500799000

While the datatype of each field is PLS_INTEGER, which accepts negative values, if either value is negative you will receive error code 1027 (Unknown error). Similarly, if both values are left unpopulated (both NULL), a 1027 error is returned.

If exceptions are enabled then the following exception will be raised.

ORA-31207: DBMS_LDAP: PL/SQL - Invalid LDAP search time value.

If both values are set to 0, that is treated as no time limit; the search will run to completion or until the server’s limit is reached.
