Oracle

Consolidating Diverse Data Sources With Oracle Database 12c

Consolidating Diverse Data Sources With Oracle Database 12c

By Yuli Vasiliev

Oracle Database 12c cannot only store different types of data natively, but also allows you to access that data stored outside of the database. This article overviews some Oracle Database 12c features that allow you to consolidate diverse data sources located outside of the database. In particular, you’ll look into how to derive and consolidate data from non-relational external sources as if it were in regular tables in the database.

The list of such features keeps growing from release to release. Thus, Oracle 9i came with a set of technologies for handling XML in the database, named Oracle XML DB. Oracle Database 10g came with a native implementation of XQuery, the W3C standard for querying XML, integrated into the database. Then, the preprocessor feature for external tables was introduced in Oracle Database 11g, allowing you to automatically preprocess the data being loaded within an external script. And Oracle Database 12c introduced support for JSON. The above, of course, is not an entire list of features Oracle Database has to let you consolidate diverse data sources located outside of the database. For details, you can check out the Oracle Database Online Documentation.

Let’s now take a closer look at some of those features mentioned above. Thus, the following example illustrates how you can use Oracle XQuery to extract data from an external RSS document. In particular, you’re pulling an RSS feed that contains information about latest OTN technical articles, retrieving only the title and the link for each item.

SELECT XMLQuery(

‘for $i in $h//channel

return

New OTN articles on Oracle SOA

{for $j in $h//item

where ora:contains($j, "Oracle SOA")

return {($j/title, $j/link)}}

PASSING xmlparse (document httpuritype (‘http://feeds.delicious.com/v2/rss/OracleTechnologyNetwork/

otntecharticle’).getCLOB()) as "h"

RETURNING CONTENT).getStringVal() as RESULT FROM DUAL;

Note that the XQuery query shown in this example uses filtering, so that only items containing Oracle SOA in the title will be retrieved. So, the output might look like this:

New OTN articles on Oracle SOA

</span></p> <p class="MsoNormal"><span style="font-size:10.0pt;mso-bidi-font-size:12.0pt; font-family:Courier;mso-bidi-font-family:Arial"><span style="mso-spacerun:yes"> </span>Tech Article: Oracle SOA Suite 12.2.1 New Features</span></p> <p class="MsoNormal"><span style="font-size:10.0pt;mso-bidi-font-size:12.0pt; font-family:Courier;mso-bidi-font-family:Arial"><span style="mso-spacerun:yes"> </span>

https://community.oracle.com/docs/DOC-994154

</span></p> <p class="MsoNormal"><span style="font-size:10.0pt;mso-bidi-font-size:12.0pt; font-family:Courier;mso-bidi-font-family:Arial"><span style="mso-spacerun:yes"> </span>Sample Chapter: Oracle SOA Suite 12c: Startup and Shutdown</span></p> <p class="MsoNormal"><span style="font-size:10.0pt;mso-bidi-font-size:12.0pt; font-family:Courier;mso-bidi-font-family:Arial"><span style="mso-spacerun:yes"> </span>

https://community.oracle.com/docs/DOC-992148

As you can see in the preceding example, XQuery works fine when it comes to querying XML sources. But what if the data you need to access is plain text? This might be, for example, our operating system data or any other data obtained from a static source or generated dynamically on the fly. The following example illustrates the later. You put the ls command output data into an external table, using the preprocessor feature first introduced in Oracle Database 11g Release 2.

In fact, you don’t specify the ls command as a preprocessor program directly. Instead, you use a shell script as a wrapper. So, let’s first create a directory to place a shell script wrapping the ls command:

$ mkdir /home/oracle/ext_tbl_dir

Then, create the exec_ls_sh shell script in this directory:

#!/bin/bash

/bin/ls /home/oracle/app/oracle/oradata/orcl -s

As you can see, the command used here will work with an oracle directory. Of course, this might be another directory. Before proceeding though, allow execute the above file as a program:

chmod +x /home/oracle/ext_tbl_dir/exec_ls.sh

To make sure that everything works as planned so far, try to execute the script:

$ /home/oracle/ext_tbl_dir/exec_ls.sh

The output might look like this:

total 2982200

4 afiedt.buf 9536 control01.ctl 862040 system01.dbf

2060 APEX_1930613455248703.dbf 84064 example01.dbf 161956 temp01.dbf

2060 APEX_2041602962184952.dbf 51260 redo01.log 174264 undotbs01.dbf

2060 APEX_2610402357158758.dbf 51260 redo02.log 230640 users01.dbf

2060 APEX_2611417663389985.dbf 51260 redo03.log

7188 APEX_2614203650434107.dbf 1290488 sysaux01.dbf

Now you can go to the database and create an external table that will use the above script as a preprocessor program. To start with, create a new database user and grant it a minimal set of priviliges:

CREATE USER usr IDENTIFIED BY pswd;

GRANT connect, resource TO usr;

The next step is to specify an alias in your database for the directory where the external table data file is located.

CREATE OR REPLACE DIRECTORY my_dir AS ‘/home/oracle/ext_tbl_dir’;

GRANT READ,WRITE,EXECUTE ON DIRECTORY my_dir TO usr;

Now, connect as the newly created user and create an external table:

CONNECT usr/pswd

CREATE TABLE ls_tbl(

file_sz NUMBER(12),

file_nm VARCHAR2(250 ))

ORGANIZATION EXTERNAL (

TYPE ORACLE_LOADER

DEFAULT DIRECTORY my_dir

ACCESS PARAMETERS (

records delimited by newline

preprocessor my_dir:‘exec_ls.sh’

skip 1

fields terminated by whitespace ldrtrim

)

LOCATION(my_dir:’exec_ls.sh’)

)

/

Provided the table has been successfully created, you can query it like a regular database table:

SELECT * FROM ls_tbl

The output generated might look like this:

file_sz file_nm

———- ———

4 afiedt.buf

9536 control01.ctl

About the Author

Yuli Vasiliev is a software developer, freelance author, and consultant currently specializing in open source development, Java technologies, business intelligence (BI), databases, and service-oriented architecture (SOA). He is the author of Oracle Business Intelligence: The Condensed Guide to Analysis and Reporting (Packt, 2010) and PHP Oracle Web Development (Packt, 2007) as well as a series of other books on the Oracle technology.