

Estimate_percent

Here’s a live one from OTN – here are a couple of extracts from the problem statement:

We’re experiencing an issue where it seems that the query plan changes from day to day for a particular procedure that runs once a night.
It’s resulting in a performance variance of 10 second completion time vs 20 minutes (nothing in between).
It started occurring about 2 months ago and now it’s becoming more prevalent where the bad query plan is coming up more often.
I noticed that the query plans vary for a simple query.
We do run gather statistics every night. (DBMS_STATS.GATHER_SCHEMA_STATS (ownname => sys_context('userenv', 'current_schema'), estimate_percent => 1);)

The query and two execution plans look like this:

select count(*) from cs_bucket_member_v2 where bucket_type='P' and sec_id > 0 and order_id=0;

----------------------------------------------------------------------------------------------------
| Id  | Operation                    | Name                | Rows  | Bytes | Cost (%CPU)| Time     |
----------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT             |                     |     1 |    12 |   155   (0)| 00:00:02 |
|   1 |  SORT AGGREGATE              |                     |     1 |    12 |            |          |
|*  2 |   TABLE ACCESS BY INDEX ROWID| CS_BUCKET_MEMBER_V2 |  1148 | 13776 |   155   (0)| 00:00:02 |
|*  3 |    INDEX RANGE SCAN          | CS_BUCKET_MEMBER_N1 |  1272 |       |     3   (0)| 00:00:01 |
----------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   2 - filter("BUCKET_TYPE"='P' AND "SEC_ID">0)
   3 - access("ORDER_ID"=0)


------------------------------------------------------------------------------------------
| Id  | Operation          | Name                | Rows  | Bytes | Cost (%CPU)| Time     |
------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT   |                     |     1 |    12 | 11215   (2)| 00:01:41 |
|   1 |  SORT AGGREGATE    |                     |     1 |    12 |            |          |
|*  2 |   TABLE ACCESS FULL| CS_BUCKET_MEMBER_V2 |  1522K|    17M| 11215   (2)| 00:01:41 |
------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   2 - filter("ORDER_ID"=0 AND "SEC_ID">0 AND "BUCKET_TYPE"='P')

There are a couple of bits of information that would be useful – such as the database version, the number of rows in the table, the number of distinct values in each column, and whether any of the columns have histograms – but there are a couple of reasonable guesses we might make about the problem. Notice particularly that the number of rows estimated from the index range scan is 1272, and only a small volume is then eliminated by the table filter predicates on sec_id and bucket_type. This suggests that the optimizer has information telling it that most of the rows in the table have sec_id > 0 and bucket_type = ‘P’, which in turn suggests that there’s a histogram on bucket_type.
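If we had access to the OP’s system, a quick look at the dictionary would answer most of those questions (a minimal sketch against the standard views, assuming cs_bucket_member_v2 is a table in the current schema):

rem     table-level row count and sample size from the last stats gather
select  table_name, num_rows, sample_size, last_analyzed
from    user_tables
where   table_name = 'CS_BUCKET_MEMBER_V2';

rem     distinct values and histogram (if any) for the three predicate columns
select  column_name, num_distinct, num_buckets, histogram
from    user_tab_col_statistics
where   table_name = 'CS_BUCKET_MEMBER_V2'
and     column_name in ('BUCKET_TYPE', 'SEC_ID', 'ORDER_ID')
order by
        column_name;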

Rather than stating the most obvious guesses about the problem, though, I’ll create a data set and emulate the problem, starting from an empty schema on 11.2.0.4:

create table t1
nologging
as
with generator as (
        select  --+ materialize
                rownum id 
        from dual
        connect by 
                level <= 1e4
)
select
        rownum                  sec_id,
        case
                when mod(rownum,1000) = 0
                        then 'X'
                        else 'P'
        end                     bucket_type,
        case
                when rownum < 1e6 - 50000 
                        then mod(rownum-1,1e5)
                        else 1000
        end                     order_id,
        lpad(rownum,10,'0')     id_vc,
        rpad('x',100,'x')       padding
from
        generator       v1,
        generator       v2
where
        rownum <= 1e6
;

create index t1_i1 on t1(order_id) nologging; 

select count(*) from t1 where order_id = 1000 and bucket_type = 'P' and sec_id > 1000;

The column names in the table match those needed by the query, and the bucket_type column has a very skewed distribution that will eliminate very little data; the sec_id column is also not going to eliminate data, but it’s very evenly distributed with no large gaps so it’s not a good candidate for a histogram in any case. The order_id column has 50,000 rows out of 1,000,000 (5%) set to a single value, and most of those special rows are at the end of the table – it’s a pretty good candidate for a histogram (if Oracle spots it, and if we actually write queries to access that data).
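As a quick sanity check on the generated data (a sketch, not part of the original test), an aggregate query should show that only the special value 1000 appears in bulk – roughly 50,000 rows – while every other order_id appears only a handful of times:

rem     report any order_id values that appear in bulk
select  order_id, count(*) row_count
from    t1
group by
        order_id
having  count(*) > 100
order by
        row_count desc;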

I’ve run a query that references all three columns so that the default method_opt of “for all columns size auto” will apply to them when I gather stats. So here’s the code that gathers stats and checks the resulting execution plans, first for “auto_sample_size” then for the 1% used by the OP:

set autotrace traceonly explain

begin
        dbms_stats.gather_schema_stats(
/*              estimate_percent => 1, */
                ownname          => user
        );
end;
/

select count(*) from t1 where order_id = 1000 and bucket_type = 'P' and sec_id > 1000;

begin
        dbms_stats.gather_schema_stats(
                estimate_percent => 1,
                ownname          => user
        );
end;
/

select count(*) from t1 where order_id = 1000 and bucket_type = 'P' and sec_id > 1000;

set autotrace off

And here are the two plans – in the same order:

---------------------------------------------------------------------------
| Id  | Operation          | Name | Rows  | Bytes | Cost (%CPU)| Time     |
---------------------------------------------------------------------------
|   0 | SELECT STATEMENT   |      |     1 |    12 |  2333   (4)| 00:00:12 |
|   1 |  SORT AGGREGATE    |      |     1 |    12 |            |          |
|*  2 |   TABLE ACCESS FULL| T1   | 51063 |   598K|  2333   (4)| 00:00:12 |
---------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   2 - filter("ORDER_ID"=1000 AND "SEC_ID">1000 AND "BUCKET_TYPE"='P')


--------------------------------------------------------------------------------------
| Id  | Operation                    | Name  | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT             |       |     1 |    12 |    23   (0)| 00:00:01 |
|   1 |  SORT AGGREGATE              |       |     1 |    12 |            |          |
|*  2 |   TABLE ACCESS BY INDEX ROWID| T1    |    20 |   240 |    23   (0)| 00:00:01 |
|*  3 |    INDEX RANGE SCAN          | T1_I1 |    20 |       |     3   (0)| 00:00:01 |
--------------------------------------------------------------------------------------

Predicate Information (identified by operation id):  
---------------------------------------------------
   2 - filter("SEC_ID">1000 AND "BUCKET_TYPE"='P')
   3 - access("ORDER_ID"=1000)


I don’t know why, but with a 1% sample (which really did sample 10,000 rows) the optimizer didn’t spot the need for a histogram on order_id, while with the auto_sample_size (which sampled 5,500 rows – yes, roughly half as many) it did spot the need for the histogram. Checking the trace files, the only difference visible in the sampling SQL was that the 1% sample included the id_vc and padding columns, which the auto_sample_size ignored because they hadn’t been logged as used in col_usage$.
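If you want to check which of the two gathers produced a histogram, and how many rows each one actually sampled, the information is visible in the dictionary straight after each call (a minimal sketch; the views and columns are standard in 11.2.0.4):

rem     table-level sample size from the most recent gather
select  table_name, num_rows, sample_size
from    user_tables
where   table_name = 'T1';

rem     column-level stats: expect a histogram on ORDER_ID after the
rem     auto_sample_size gather, and NONE after the 1% gather
select  column_name, num_distinct, num_buckets, histogram, sample_size
from    user_tab_col_statistics
where   table_name = 'T1'
order by
        column_name;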

Moral

Histograms are tricky things – and you can only make things worse in 11g by NOT using the auto_sample_size.

Footnote

Based on previous experience – my “obvious” guess about the OP’s data was that there was a special-case value for order_id, that the rows for that value were fairly well clustered, probably towards the end of the table, and constituted a small percentage of the table, and that the rest of the data reported “a few” rows per value. That’s why I built the model you see above.


NEWS: Oracle Updates Development Tools For Public Cloud Services

Oracle Technology Network Blog (aka TechBlog) by Laura Ramsey-Oracle

Today the Oracle Database Application Development Tools team released two important updates to the development tools line up. This guest post from Jeff Smith, Senior Principal Product Manager at Oracle, provides the full story.

Oracle Updates Development Tools for Public Cloud Services

by Jeff Smith

Oracle SQL Developer version 4.1 and Oracle SQL Developer Data Modeler 4.1 are now generally available. The two products empower customers to design, develop, and deploy best-of-breed applications running on-premises or in Oracle Database Cloud Services.

“We are very pleased with the strong adoption of Oracle SQL Developer,” said Michael J. Hichwa, Vice President of Oracle Database Tools. “SQL Developer has been embraced by the developer community as the de-facto standard database IDE for designing, developing, and DBA activities. Don’t let the dot 1 version fool you, this release is packed full of good stuff.”

Oracle SQL Developer version 4.1
Oracle SQL Developer, with its 4,000,000+ active users, serves as the IDE for Oracle SQL and PL/SQL and as the GUI for Oracle Database. In our latest release we have introduced several new features as well as enhanced some of the more popular ones to be easier to use and to provide additional value.

Perhaps the most noteworthy new feature is our one-button-click solution for copying on-premises Oracle Databases to the Oracle Database Cloud Service.

A database administrator simply right-clicks on a 12c Multitenant Pluggable Database and chooses to either move, or copy to their Cloud Service. When the operation is complete, the database is automatically made available in the Oracle Database Cloud Service.

Additionally, Oracle SQL Developer offers tighter integration with Oracle REST Data Services. Starting with the 4.1 release, SQL Developer can now install, configure, and run Oracle REST Data Services version 3.0 without any additional downloads.

SQL Developer will also make automatic RESTful services available for entire schemas or selected tables when connected to an Oracle Database being serviced by Oracle REST Data Services.
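For readers who prefer scripts to wizards, the same AutoREST enablement that SQL Developer drives through the GUI can also be done in PL/SQL with the ORDS package (a sketch assuming Oracle REST Data Services 3.0 is already installed in the target database; the schema name and URL alias below are illustrative):

begin
        -- enable AutoREST for a schema; 'hr' becomes the URL path alias (illustrative values)
        ords.enable_schema(
                p_enabled             => true,
                p_schema              => 'HR',
                p_url_mapping_type    => 'BASE_PATH',
                p_url_mapping_pattern => 'hr',
                p_auto_rest_auth      => false
        );
        commit;
end;
/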

Database Administrators can also enjoy a new interface for checking on the activity in their database using SQL Developer’s DB Instance page. All in a single view, administrators can observe critical metrics in real-time around IO, networking, wait events, storage, and more. Each activity graph can be drilled down into a detailed report so that immediate action can be taken to address any problems.

[Figure 1: DB Status Page in Version 4.1 of Oracle SQL Developer]

Oracle SQL Developer Data Modeler version 4.1
Oracle SQL Developer Data Modeler provides a no-cost solution for designing and deploying both Oracle Database and Oracle Big Data Appliance data models. Version 4.1 continues to add additional support for taking advantage of new Oracle Database 12c features.

Support for comparing physical data model elements such as storage properties for tables, PL/SQL code objects, and security around users, roles, and their permissions is now available. In addition, users are able to define user-defined properties or import them from existing Oracle Designer or ERwin data models.


OTN Virtual Technology Summit – Available OnDemand

Thanks to all of you that participated in the recent OTN Virtual Technology Summit – It was a success!

In case you missed the live event, you can visit our on-demand site to explore each content track at your convenience. As a reminder, we feature content on Database, Java, Middleware and Systems presented by Oracle ACEs, Java Champions, and Oracle product experts. A replay of the presentations is now available for viewing.

Call For Sessions –

We want to make sure the OTN Virtual Technology Summit series provides you with the information you want and need. To that end, we have created “Idea Spaces” in which you can submit suggestions for sessions in future Virtual Technology Summits. If you are interested in presenting a VTS session, use these spaces to submit your session proposal today! The deadline for submissions for an upcoming VTS is May 15.

Not the presenter type? Go here and vote on ideas that have been submitted to ensure you get to take part in choosing what future event content will be.

Database Track

Java Track

Middleware Track

Database Innovation Online Forum – March 18, 2015 at 10:00 a.m. PST

Are you leveraging Oracle’s database innovations for Cloud and Big Data? Join Tom Kyte, Oracle Database Architect, to learn what’s new in Oracle Database 12c and understand the top reasons why customers choose Oracle. Register today!

Become an Oracle ACE

The ACE program is a peer-reviewed community of industry thought leaders and Oracle technology experts.