SQL Server Technical Article
Writer: Doug Wheaton (Attunity Ltd.)
Technical Reviewer: Ramakrishnan Krishnan (Microsoft), Robert Zare (Microsoft), Jeff Bernhardt (Microsoft), David Noor (Microsoft)
Published: September 2009
Applies to: SQL Server 2008
Summary: The Microsoft Connector for Oracle by Attunity provides optimal performance during loading or unloading of data from Oracle with Microsoft SQL Server Integration Services packages. The Attunity connector offers a user experience similar to that of the built-in connectors, with intuitive capabilities including configuration, metadata browsing, and error routing. It is available for download at no cost to users of SQL Server 2008 Enterprise. This paper discusses how to install and configure the Microsoft Connector for Oracle by Attunity, the use cases for the connector, and the performance of the connector. The paper also discusses how to troubleshoot and analyze run-time failures related to the connector.
IntroductionThe new high-speed Microsoft® Connector for Oracle by Attunity is available download for Microsoft SQL Server® Enterprise customers as a Web download from the Microsoft and Attunity Web sites. This connector was designed to be significantly faster than existing connectors for Oracle. For more information, see “Performance” later in this paper. The connector enables massive data extraction from and to Oracle within the Microsoft SQL Server environment, via standard interfaces in both Full and Incremental modes. The Oracle connector supports the Fast Load method and can run several processes in parallel.
The Microsoft Oracle Connector 1.0 by Attunity has three main components:
- An Oracle source, to extract bulk data from Oracle
- An Oracle destination, to import bulk data into Oracle
- An Oracle Connection Manager, to enable a package to connect to an Oracle data source
Supported product versions:
- Microsoft SQL Server 2008 data management software
- Oracle Client version 10.x or 11.x
Supported operating systems:
- The 32-bit (x86) and 64-bit (x64) editions of the Windows® XP operating system
- The 32-bit (x86) and 64-bit (x64) editions of the Windows Vista® operating system
- The 32-bit (x86), 64-bit (x64), and 64-bit Itanium (IA64) editions of the Windows Server® 2003 operating system
- The 32-bit (x86), 64-bit (x64), and 64-bit Itanium (IA64) editions of Windows Server 2008 32-bit (x86), 64-bit (x64), and 64-bit Itanium (IA64)
Installation:
- The Oracle client software must be installed on the computer running SQL Server Integration Services (SSIS)
- Oracle database versions 9.2.0.4 and higher are supported
Additional considerations for 64-bit environments:
- On 64-bit platforms (both x64 and IA64), you need only install the Attunity Oracle 64 bit connector for the appropriate architecture (x64 or IA64). You should not install the Attunity Oracle 32 bit connector alongside the 64-bit versions, because the 64 bit connector installs both 32-bit and 64-bit versions.
- The Oracle 32 Bit client is required in order to develop packages on a 64-bit computer because SQL Server Business Intelligence Studio runs in 32-bit mode.
- The Oracle 64 Bit client is required in order to run packages on a 64-bit computer.
- If you run the package in 32-bit mode, you do not need the Oracle 64 Bit client.
In summary, if you want both to develop and to run SSIS packages on a 64-bit computer:
- Install the Attunity Oracle 64 bit connector.
- Install the Oracle 11G 32 Bit client.
- Install the Oracle 11G 64 Bit client.
- Restart the computer after the Oracle installations are complete.
- You should be able to develop and run SSIS packages.
If after completing these steps you still experience functional issues, you can use the SQLPLUS Oracle utility to connect to the database in order to confirm that the client was installed properly.
The following screen capture shows both Oracle clients installed on a computer running a 64-bit edition of Windows Server 2008.
Figure 1: Registry entries showing the 32-bit and 64-bit Oracle client software installed on the same computer, which is running a 64-bit edition of Windows Server.
Figure 2: Supported platforms and versions for the Oracle connector
Install and Add the Microsoft Connector for Oracle by Attunity
The installation steps for the connector are the same for all of the Use Cases described in this article. Before you proceed with any individual Use Case, follow these steps:
- Install the Microsoft Connector for Oracle by Attunity.
- Enable and as data flow sources in the Toolbox in Business Intelligence Development Studio.
- Set up the Connection Manager for Oracle.
Each step is described in more detail in the following sections.
Install the Microsoft Connector for Oracle by Attunity
Installation Process
Figure 3: Starting the installation process for the Oracle Connector by Attunity
Figure 4: Accepting the license agreement
Figure 5: Selecting the installation folder
Figure 6: Finishing the installation
After installation is complete, you are prompted to restart SQL Server Integration Services.
Figure 7: Restarting the Integration Services service after installation of the Oracle Connector by Attunity
In addition to the component installation, a Start menu item is created to provide easy access to documentation.
Figure 8: New item on the Start menu for the Help file for the Oracle Connector by Attunity
Add the Oracle Source and Oracle Destination Components to the Toolbox
After installation, open SQL Server Business Intelligence Development Studio and create a new Integration Services project. The first time after the SQL Server 2008 Feature Pack installation, the Microsoft Connector components are not enabled by default as data flow sources. To enable them, in the Toolbox, right-click Data Flow Sources, click Choose Items, and then on the SSIS Data Flow Items tab, and select the Oracle Source check box, as shown in the following figures.
Oracle Source Component
1. In the Toolbox, right-click Data Flow Sources and then click Choose Items.
Figure 9: The first step in adding the Oracle source to the Toolbox
2. Click the SSIS Data Flow Items tab.
3. Under Name, select the Oracle Source check box, and then click OK.
Figure 10: Selecting the Oracle source
At this point Oracle Source is now available as a data flow source.
Figure 11: Oracle Source now appears in the Toolbox
Oracle Destination Component
Repeat the steps to enable display of the Oracle destination component.
Configure the Microsoft Connector for Oracle by Attunity
Oracle Connection Manager
In the SSIS project, add a new connection of type MSORA.
Figure 12: Adding an Oracle connection to a package
After the connection is created, edit it and fill out the system and logon information. To validate connectivity, click Test Connection.
- The connection parameters are the same you would use for SQLPLUS.
- The connection manager is used by both the source and destination components.
- The connection manager uses the Oracle client to connect to the Oracle database.
Figure 13: Setting up the Oracle connection
Oracle Source Component
The Oracle source extracts data from Oracle databases by using a database table, a view, or an SQL command.
The Oracle source has the following data access modes for extracting data:
- A table or view.
- The results of an SQL statement.
The source uses an Oracle connection manager, which specifies the Oracle provider to use.
The Oracle source has one regular output and one error output.
Error Handling in the Oracle Source Component
The Oracle source has an error output. The component error output includes the following output columns:
- Oracle Error Code: The number that corresponds to the current error. See the Oracle documentation for a list of relevant error codes.
- Error Column: The source column causing the error (for conversion errors).
- Error Row Columns: The record data that causes the error.
Depending on the error behavior setting, the Oracle source supports returning errors (data conversion, truncation) that occur during the extraction process in the error output.
Parallelism
There is no limit on the number of Oracle sources that can run in parallel against the same table or different tables, on the same computer, or on different computers (other than normal global session limits). The Oracle source can also run in parallel to the Oracle destination working against the same table.
Troubleshooting the Oracle Source
You can log the calls that the Oracle source makes to the Oracle Call Interface (OCI). You can use this logging ability to troubleshoot the extraction of data from Oracle data sources that the Oracle source performs. To log the calls that the Oracle source makes to an Oracle data source, enable package logging and select the Diagnostic event at the package level.
Configuring the Oracle Source
You configure the Oracle source in the Oracle Source dialog box.
Oracle Source Dialog Box (Connection Manager Page)
Use the Connection Manager page of the Oracle Source dialog box to select the Oracle connection manager for the source. On this page you can also select a table or view from the database.
Figure 14: Connection Manager page of the Oracle Source dialog box
Options on the Connection Manager Page of the Oracle Source Dialog Box
Connection Manager - Select an existing connection manager from the list, or click New to create a new connection.
New - Click New to open the Oracle Connection Manager Editor, where you can create a new connection manager.
Data Access Mode - Select the method for selecting data from the source. The options are shown in the following table.
Option | Description |
---|---|
Table or view |
Retrieve data from a table or view in the Oracle data source. When you select this option, select a value from the drop-down list for the name of the table or the view. Select an available table or view from the database from the list. This list contains the first 1,000 tables only. If your database contains more than 1,000 tables, you can type the beginning of a table name or use the (*) wild card to enter any part of the name to display the table or tables you want to use. If no default database is selected for the connection manager, all tables for all databases are returned except for SYS, MDSYS, OLAPSYS, WKSYS, CTXSYS, XDB, WMSYS, SYSTEM, EXFSYS, ORDSYS, DMSYS, and WK_TEST. |
SQL command |
Retrieve data from the Oracle data source by using an SQL query. When you select this option, enter a query in one of the following ways: Enter the text of the SQL query in SQL command. Click Browse to load the SQL query from a text file. Click Parse query to verify the syntax of the query text. |
Table 1: Data access mode options for the Oracle source
Oracle Source Dialog Box (Columns Page)
Use the Columns page of the Oracle Source dialog box to map an output column to each external (source) column.
Figure 15: Columns page of the Oracle Source dialog box
Options on the Columns Page of the Oracle Source Dialog Box
Available External Columns - A list of available external columns in the data source. You cannot use this table to add or delete columns. Select the columns to use in the source. The selected columns are added to the External Column list in the order they are selected. Select the Select All check box to select all of the columns.
External Column - A view of the external (source) columns in the order that you see them when configuring components that consume data from the Oracle source. To change this order, first clear the selected columns in the Available External Columns list, and then select external columns from the list in a different order. The selected columns are added to the External Column list in the order you select them.
Output Column - Enter a unique name for each output column. The default is the name of the selected external (source) column; however, you can choose any unique, descriptive name. The name entered is displayed in the SSIS Designer.
Note: Columns of unsupported data types are shown as external columns, but they are not exposed as output columns.
Oracle Source Dialog Box (Error Output Page)
Use the Error Output page of the Oracle Source dialog box to select error handling options.
Figure 16: Error Output page of the Oracle Source dialog box
Options on the Error Output Page of the Oracle Source Dialog Box
Error - Select how the Oracle source should handle errors in a flow: ignore the failure, redirect the row, or fail the component.
Truncation - Select how the Oracle source should handle truncation in a flow: ignore the failure, redirect the row, or fail the component.
The following are the options for handling errors and truncation:
- Fail Component. The Data Flow task fails when an error or a truncation occurs. This is the default behavior.
- Ignore Failure. The error or the truncation is ignored and the data row is directed to the Oracle source output.
- Redirect Flow. The error or the truncation data row is directed to the error output of the Oracle source. In this case the Oracle source error handling is used.
Oracle Source Advanced Editor
The Advanced Editor contains properties that cannot be set in the Oracle Source dialog box. You can view and configure these properties using the Component Properties tab.
To open the Advanced Editor: In the Data Flow screen of your Integration Services project, right-click the Oracle source and then click Show Advanced Editor.
Figure 17: The Component Properties tab of the Advanced Editor for the Oracle source
Custom Properties of the Oracle Source
The Component Properties tab includes a Custom Properties list. The following table describes the custom properties of the Oracle source. All properties are read/write.
Custom Property | Data Type | Description |
---|---|---|
AccessMode |
Integer (enumeration) |
The mode used to access the database. The possible options are Table Name and SQL Command. The default is Table Name. |
BatchSize |
Integer |
The size of the batch for bulk loading. This is the number of records extracted as an array. Note: This property is not available in the Oracle Source dialog box, but it can be set by using the Advanced Editor. |
DefaultCodePage |
Integer |
The code page to use when code page information is unavailable from the data source. Note: This property is not available in the Oracle Source dialog box, but can be set by using the Advanced Editor. |
PreFetchCount |
Integer |
The possible number of prefetched (look ahead) rows. Note: This property is not available in the Oracle Source dialog box, but can be set by using the Advanced Editor. |
SqlCommand |
String |
The SQL command to be executed. |
TableName |
String |
The name of the table with the data that is being used. |
Table 2: Custom properties of the Oracle source
Oracle Destination Component
The Oracle destination connects to a local or remote Oracle database and bulk loads data into Oracle databases.
The destination uses the Oracle Connection Manager to connect to a data source.
An Oracle destination includes mappings between input columns and columns in the destination data source. You do not have to map input columns to all destination columns, but depending on the properties of the destination columns, errors can occur if no input columns are mapped to the destination columns. For example, if a destination column does not allow null values, an input column must be mapped to that column. In addition, the data types of mapped columns must be compatible. For example, you cannot map an input column with a string data type to a destination column with a numeric data type.
The input column and the column in the destination data source are compatible if:
- They have the same data type, precision, scale, and code page. The length can be different.
- The input column type is DT_NUMERIC and the destination data source type is DT_R8.
The Oracle destination has one regular input and one error output.
Note: Columns of unsupported data types are shown, but cannot be mapped. Tables that have columns of unsupported data types that are not nullable cannot be loaded.
Load Options
The Oracle destination can use one of two access load modes. You set the mode on the Connection Manager page of the Oracle Destination Editor. The two modes are:
- Arrayed load: In this mode, data is loaded into Oracle table in batches and the entire batch is inserted under the same transaction.
- Fast load using Direct Path: In this mode, the destination component uses the OCI direct path protocol for loading the Oracle table.
When using fast load mode, the following are some restrictions:
- Triggers are not supported.
- Referential integrity constraints are not supported.
- Clustered tables are not supported.
- Loading of remote objects is not supported.
For more information, see Oracle Corporation’s Call Interface Programmer’s Guide.
Error Handling
The Oracle destination has a single error output. The component error output includes the following output columns:
- Error Code: The number that corresponds to the current error. See the Oracle documentation for a list of relevant error codes.
- Error Column: The source column causing the error (for conversion errors).
- Error Row Columns: The record data that causes the error.
Depending on the error behavior setting, the Oracle destination supports returning errors (such as, data conversion, truncation, or constraint violation) that occur during the loading process in the error output.
You can set the maximum number of errors that can occur using the Maximum number of errors (MaxErrors) property. When the error limit is reached, the Oracle destination returns an error and stops. In all cases the target table will include all records completed successfully up to the point where the component stopped. Only the record that exceeded the limit is not included in the target table.
Parallelism
When arrayed loading is used, there is no restriction on the number of Oracle destinations that run in parallel against the same tables as long as all of the Oracle destinations are configured for arrayed loading. However, this may reduce performance because of standard record locking. The amount of performance loss depends on the data and table organization.
When the direct path protocol (fast load) is used, only one Oracle destination can be configured to run against the same table in the same time. However, you can choose to use Parallel mode, which denotes a parallel direct path load. A parallel direct path load allows multiple direct path load sessions to concurrently load the same data segments (multisegment parallelism). A parallel direct path is naturally more restrictive than a normal direct path. In this case, you should plan any use of parallelism in advance by splitting the source data into segments and applying them concurrently in parallel mode. There is no reason to use a single parallel session.
The following restrictions apply when parallel direct path loads are used:
- The load cannot maintain local or global indexes.
- Referential integrity and CHECK constraints must be disabled.
- Triggers must be disabled.
For more information, see the Oracle documentation.
Troubleshooting the Oracle Destination
You can log the calls that the Oracle destination makes to the Oracle Call Interface (OCI). You can use this logging ability to troubleshoot the saving of data to Oracle data sources that the Oracle destination performs. To log the calls that the Oracle destination makes to an Oracle data source, enable package logging and select the Diagnostic event at the package level.
Configuring the Oracle Destination
You configure the Oracle destination in the Oracle Destination Editor.
Oracle Destination Editor (Connection Manager Page)
Use the Connection Manager page of the Oracle Destination Editor to select the Oracle connection manager for the destination. This page also lets you select a table or view from the database.
Options on the Connection Manager Page of the Oracle Destination Editor
Connection manager - Select an existing connection manager from the list, or click New to create a new connection.
New - Click New to open the Oracle Connection Manager Editor, where you can create a new connection manager.
Data access mode - Select the method for selecting data from the source. The options are shown in the following table.
Option | Description |
---|---|
Table Name |
Select this option to configure the Oracle destination to work in arrayed mode. When you select this option, the following options are available. Name of the table or the view: Select an available table or view from the database from the list. This list contains the first 1,000 tables only. If your database contains more than 1,000 tables, you can type the beginning of a table name or use the (*) wild card to enter any part of the name to display the table or tables you want to use. If no default database is selected for the connection manager, all tables for all databases are returned except for SYS, MDSYS, OLAPSYS, WKSYS, CTXSYS, XDB, WMSYS, SYSTEM, EXFSYS, ORDSYS, DMSYS, and WK_TEST. Maximum number of errors: Type the number of errors that can occur before the data flow is stopped. By default the value is 0, which indicates that there is no error limit. If you select Redirect flow in the Error handling page, all errors that are returned before the data flow reaches the error limit are returned in the error output. For more information, see the “Options on the Error Output Page of the Oracle Destination Editor” section. Transaction size: Type the number of inserts that can be carried out in a single transaction. The default is the value of BatchSize. Batch size: Type the size of the batch for bulk loading. This is the number of rows loaded as a batch. |
Table Name-Fast Load |
Select this option to configure the Oracle destination to work in fast (Direct Path) load mode. If this option is not used, the Oracle destination is configured to work in arrayed mode. When you select this option, the following options are available. Name of the table or the view: Select an available table or view from the database from the list. This list contains the first 1,000 tables only. If your database contains more than 1,000 tables, you can type the beginning of a table name or use the (*) wild card to enter any part of the name to display the table or tables you want to use. If no default database is selected for the connection manager, all tables for all databases are returned except for SYS, MDSYS, OLAPSYS, WKSYS, CTXSYS, XDB, WMSYS, SYSTEM, EXFSYS, ORDSYS, DMSYS, and WK_TEST. Parallel load: Select this check box to indicate whether parallel loading is enabled. No logging: Select this check box to disable database logging. This is not related not to tracing but to the logging that the Oracle database carries out for recovery purposes. Maximum number of errors: Type the number of errors that can occur before the data flow is stopped. By default the value is 0, which indicates that there is no error limit. All errors that are returned before the data flow reaches the error limit are returned in the error output. For more information, see the “Options on the Error Output Page of the Oracle Destination Editor” section. Transfer buffer size (KB): Type the size of the transfer buffer. The default size is 64 KB. |
Table 3: Data Access Mode options for the Oracle destination
Oracle Destination Editor (Mappings Page)
Use the Mappings page of the Oracle Destination Editor to map input columns to destination columns.
Options on the Mappings Page of the Oracle Destination Editor
Available Input Columns - The list of available input columns. Drag an input column to an available destination column to map the columns.
Note: Columns of unsupported data types are shown, but they cannot be mapped.
Available Destination Columns - The list of available destination columns. Drag a destination column to an available input column to map the columns.
Note: Columns of unsupported data types are shown, but they cannot be mapped.
Input Column - View the input columns that you selected. You can remove mappings by selecting <ignore> to exclude columns from the output.
Destination Column - View all available destination columns, both mapped and unmapped.
Oracle Destination Editor (Error Output Page)
Use the Error Output page of the Oracle Destination Editor to select error handling options.
Options on the Error Output Page of the Oracle Destination Editor
Error behavior - Select how the Oracle destination should handle errors in a flow: ignore the failure, redirect the row, or fail the component.
Truncation - Select how the Oracle destination should handle truncation in a flow: ignore the failure, redirect the row, or fail the component.
The following are the options for handling errors and truncation:
- Fail Component. The Data Flow task fails when an error or a truncation occurs. This is the default behavior.
- Ignore Failure. The error or the truncation is ignored and the data row is not inserted.
- Redirect Flow. The error or the truncation data row is directed to the error output of the Oracle destination.
Oracle Destination Advanced Editor
The Advanced Editor contains the properties that can be set programmatically.
To open the Advanced Editor: In the Data Flow screen of your Integration Services project, right-click the Oracle destination and then click Show Advanced Editor.
Figure 18: The Component Properties page of the Advanced Editor for the Oracle destination
Oracle Destination Custom Properties
The following table describes the custom properties of the Oracle destination. All properties are read/write.
Property | Data Type | Description | Loading Mode |
---|---|---|---|
BatchSize |
Integer |
The size of the batch for bulk loading. This is the number of rows loaded as a batch. |
Used only in arrayed mode. |
DefaultCodePage |
Integer |
The code page to use when code page information is unavailable from the data source. Note: This property is not available in the Oracle Destination Editor, but can be set by using the Advanced Editor. |
Used for both modes. |
FastLoad |
Boolean |
A value that indicates whether Fast Loading is used. The default value is True. This can also be set in the Oracle Destination Editor (Connection Manager Page). For more information, the “Options on the Error Output Page of the Oracle Destination Editor” section. |
Used for both modes. |
MaxErrors |
Integer |
The number of errors that can occur before the data flow is stopped. By default the value is 0, which indicates that there is no error limit. If you select Redirect flow in the Error handling page, all errors that are returned before the data flow reaches the error limit are returned in the error output. For more information, see the “Options on the Error Output Page of the Oracle Destination Editor” section. |
Used for both modes. |
NoLogging |
Boolean |
A value that indicates whether database logging is disabled. The default value is False, indicating that logging is enabled. |
Used for both modes. |
Parallel |
Boolean |
A value that indicates whether parallel loading is enabled. For information about how to use parallel loading and restrictions on its use, see the “Options on the Error Output Page of the Oracle Destination Editor” section. |
Used only in Fast Load mode. |
TableName |
String |
The name of the table with the data that is being used. |
Used for both modes. |
TableSubName |
String |
The subname or subpartition. This value is optional. Note: This property is not available in the Oracle Destination Editor, but it can be set by using the Advanced Editor. |
Used only in Fast Load mode. |
TransactionSize |
Integer |
The number of inserts that can be made in a single transaction. The default is the value of BatchSize. |
Used only in arrayed mode |
TransferBufferSize |
Integer |
The size of the transfer buffer. The default value is 64 KB. |
Used only in Fast Load mode. |
Table 4: Custom properties of the Oracle destination
Use CasesUsage Scenario 1: Bulk Extract from Oracle to SQL Server Using OCI Array Binding
Solution Architecture Overview
Figure 19: Extracting from Oracle using OCI array binding
- Source table: in Oracle
- Target table: in SQL Server
- Data flow: Configure Oracle source component and configure SQL Server destination component
Define the SSIS Data Flow
Add and Configure the Oracle Source Component
1. Add the source component to the SSIS package data flow.
2. Select the Oracle source component from the Data Flow Sources box.
Figure 20: Selecting the Oracle source in the Toolbox
Figure 21: The Data Flow tab of SSIS Designer, showing the Oracle source and the SQL Server destination before they are connected
3. Configure the Oracle source component.
The Oracle source extracts data from Oracle databases by using a database table, a view, or an SQL command. The source uses an Oracle connection manager, which specifies the Oracle provider to use.
The Oracle source has one regular output and one error output.
Data Access Modes
Option | Description |
---|---|
Table Name |
Retrieve data from a table or view in the Oracle data source defined in the Connection Manager. When you select this option, select a value from the Name of the table or the view list This list contains the first 1,000 tables only. If your database contains more than 1,000 tables, you can type the beginning of a table name or use the (*) wild card to enter any part of the name to display the table or tables you want to use. If no default database is selected for the connection manager, all tables for all databases are returned except for SYS, MDSYS, OLAPSYS, WKSYS, CTXSYS, XDB, WMSYS, SYSTEM, EXFSYS, ORDSYS, DMSYS, and WK_TEST. |
SQL Statement |
Retrieve data from the Oracle data source by using an SQL query. When you select this option, enter a query in one of the following ways: Enter the text of the SQL query in SQL command. Click Browse to load the SQL query from a text file. Click Parse query to verify the syntax of the query text. |
Table 5: Data access modes for the Oracle source
Figure 22: Connection Manager page of the Oracle Source dialog box, configured to use a table or view as the data access mode
Figure 23: Connection Manager page of the Oracle Source dialog box, configured to use an SQL command as the data access mode
Add and Configure SQL Server Destination Component
After the Oracle source component is set up, define the destination node. After the mapping has been completed, the SQL Server Integration Services package is ready to run.
Figure 24: The Data Flow tab of SSIS Designer, showing the Oracle source and the SQL Server destination after they are connected
Usage Scenario 2: Bulk Load Using FastLoad
Solution Architecture Overview
Figure 25: Loading into Oracle using OCI Direct Path
Description
This scenario typically entails moving large amounts of data into an Oracle database.
Define SSIS Data Flow
Add and Configure Source and Oracle Destination Components
1. Add a source component. Any supported source can be used.
Figure 26: Selecting a source in the Toolbox
2. Add the Oracle destination component.
Figure 27: Selecting the Oracle destination in the Toolbox
3. Link the source and the destination components.
Figure 28: The Data Flow tab of SSIS Designer, showing the OLE DB source and the Oracle destination after they are connected
Configure the Oracle destination component.
Figure 29: Connection Manager page of the Oracle Destination Editor, configured to use a table and fast load using direct path as the data access mode
After the mapping is done, the SSIS package is ready to run.
Usage Scenario 3: Incremental Data Load
Description
The Oracle destination connects to a local or remote Oracle database and bulk loads data into Oracle databases using the Array Binding API. In this mode, data is loaded into Oracle table in batches and the batches are inserted under the transactions. This mode allows for bulk loading to an Oracle table that already has data in it.
The destination uses the Oracle Connection Manager to connect to a data source.
Architecture
Figure 30: Loading into Oracle using OCI Array Binding
Define SSIS Data Flow
Add and Configure Source and Oracle Destination Components
1. Add a source component. Any supported source can be used.
Figure 31: Selecting a source in the Toolbox
2. Add the Oracle destination component.
Figure 32: Selecting the Oracle destination in the Toolbox
3. Link the source and the destination components.
Figure 33: The Data Flow tab of SSIS Designer, showing the OLE DB source and the Oracle destination after they are connected
4. Configure the Oracle destination component.
Figure 34: Connection Manager page of the Oracle Destination Editor, configured to use a table as the data access mode
After the mapping is done, the SSIS package is ready to run.
PerformanceMethodology
Test Scenario and Configuration
This section discusses test scenarios that were performed and the results of the tests.
All testing was done using the LINEITEM table from the TPC-H dataset. The LINEITEM table is the largest table and contains the largest variety of data types.
Two identical servers were used, each with Intel 64-bit processors (24 cores at 2.4 GHz), 32 GB of RAM and an HP disk array partitioned into 16 logical drives for data plus one for database logs (each logical drive two physical disks, striped).
Software used:
- Windows Server 2008
- SQL Server 2008
- Oracle 11g
- Attunity SSIS Connectors 1.0 GA release – 2008-09-16.
64-bit software is used in all tests except where noted.
Up to eight streams of data were used. A stream is a single SSIS package reading from a single flat file and loading into the database (or in the reverse direction).
To ensure that I/O does not become the bottleneck, each stream sources data from a different logical drive, so that LINEITEM.TBL.1 is on Drive1, LINEITEM.TBL.2 is on Drive2, and so on. Except where noted, source files have about 75,000,000 rows and are about 10 GB in size.
Likewise, the target database is striped across eight logical drives, Drive9 to Drive16. For SQL Server, the target database is striped onto eight file groups, each of which has one file on each drive. For Oracle, storage extents were preallocated across the eight drives, and the target table mapped to a tablespace, which spans the extents.
In SQL Server, the following table definition was used:
create table LINEITEM (L_SHIPDATE smalldatetime not null, L_ORDERKEY bigint not null, L_DISCOUNT smallmoney not null, L_EXTENDEDPRICE money not null, L_SUPPKEY int not null, L_QUANTITY smallint not null, L_RETURNFLAG char(1) not null, L_PARTKEY int not null, L_LINESTATUS char(1) not null, L_TAX smallmoney not null, L_COMMITDATE smalldatetime not null, L_RECEIPTDATE smalldatetime not null, L_SHIPMODE varchar(10) not null, L_LINENUMBER int not null, L_SHIPINSTRUCT varchar(25) not null, L_COMMENT varchar(44) not null )
SQL Server and Oracle data types do not have a simple one-to-one mapping. The following table definition was used for Oracle:
create table LINEITEM (L_SHIPDATE DATE not null, L_ORDERKEY NUMBER(19,0) not null, L_DISCOUNT NUMBER(10,4) not null, L_EXTENDEDPRICE NUMBER(19,4) not null, L_SUPPKEY NUMBER(10,0) not null, L_QUANTITY NUMBER(5,0) not null, L_RETURNFLAG char(1) not null, L_PARTKEY NUMBER(10,0) not null, L_LINESTATUS char(1) not null, L_TAX NUMBER(10,4) not null, L_COMMITDATE DATE not null, L_RECEIPTDATE DATE not null, L_SHIPMODE varchar2(10) not null, L_LINENUMBER NUMBER(10,0) not null, L_SHIPINSTRUCT varchar2(25) not null, L_COMMENT varchar2(44) not null )
The SSIS packages were kept as simple as possible. Except as noted, each package contains a single data flow that reads data from one source file and writes to the database. Below is an example. There are also versions of the packages that read from the database and write to flat files.
Figure 35: Simple package structure used for performance testing
Test Definitions
Three primary tests scenarios were performed:
- “EmptyTable” tests, which loaded data from flat files into an empty database table.
- “AddToTable” tests, which loaded data from flat files into a database table that already had data in it.
- “Extract” tests, which pulled data from a database table and wrote to a flat file. The size of the resulting file was somewhat larger than the original flat files due to differences in the formatting of the output strings.
Results of Performance Testing
Figure 36: Attunity connectors vs. Oracle OLE DB vs. Microsoft OLE DB
Figure 37: 64-bit vs. 32- bit connectors
Data Type MappingThe following table shows the Oracle database data types and their default mapping to SSIS data types. SSIS components for Oracle do not support all data types. Columns with unsupported data types cannot be mapped. Tables with columns of unsupported data types that are not nullable cannot be loaded.
Oracle Data Type | Integration Services Data Type |
---|---|
VARCHAR2 |
DT_STR |
NVARCHAR2 |
DT_WSTR |
NUMBER |
DT_R81 |
NUMBER(P, S)
|
When the scale is 0, according to the precision (P) DT_I1, DT_I2, DT_I4, DT_NUMERIC(P, 0) |
DT_NUMERIC(P, S) |
|
LONG |
Not supported |
DATE |
DT_DBTIMESTAMP |
RAW |
DT_BYTES |
LONG RAW |
Not supported |
ROWID |
Not supported |
CHAR |
DT_STR |
User-defined type (object type, VARRAY, Nested Table) |
Not supported |
REF |
Not supported |
CLOB, NCLOB, BLOB, BFILE |
Not supported |
TIMESTAMP TIMESTAMP WITH TIME ZONE INTERVAL YEAR TO MONTH INTERVAL DAY TO SECOND TIMESTAMP WITH LOCAL TIME ZONE |
DT_STR |
UROWID |
Not supported |
Table 6: Mapping of Oracle data types
You can change the output column type to DT_NUMERIC with specific precision and scale. You can set the precision and scale according to the data stored in the column. When the output column type is changed to DT_Numeric with specific precision and scale, the component will extract the column data as a number with fixed precision and scale.
TroubleshootingThis section contains troubleshooting information, including specific steps for resolution, to help you address common errors you may encounter when you are working with the Microsoft Connector for Oracle by Attunity.
Troubleshooting: Run-Time Failures
This step-by-step section describes how to troubleshoot and analyze run-time failures related to the Microsoft Connector for Oracle by Attunity.
The debugging process depends on the logging facility that SSIS provides for the external providers. Using the verbose log files is necessary if the other debugging facilities of the Microsoft SQL Server Business Intelligence Development Studio have not helped, or if the nature of the problems is related to the data provider.
STEP 1 – Eliminate the common problems
Problem | Solution |
---|---|
Permission problems at run time. |
Check the relevant error message and confirm that the SQLServer service and the SQL Server Agent has the required permissions in the specified account. |
Problem running a package on 64-bit platform, if the package worked on 32-bit platform. |
Make sure you are using the 64-bit connector, for the Microsoft Connector for Oracle. There are special installation kits for 64-bit editions of Windows and Itanium IA64. If you are calling DTS packages using the Execute DTS 2000 Package task to run a SQL Server 2000 DTS package, you must run the package in a 32-bit environment. |
General errors in the Event Viewer. |
Always check the Windows Event Viewer for general error messages. If the problem is indeed related to the data provider, proceed to STEP 2. |
Table 7: Eliminating common run-time failures
STEP 2 - Using the Logging Facility
The Microsoft Connector for Oracle outputs meaningful error messages to SSIS; however, there are cases where there is a need for verbose debugging log file, which can show the complete lifecycle of the interaction with the connector and the back-end database.
In these cases SSIS provides a complete logging facility and several logging providers.
To enable logging, perform the following steps:
1. In the SQL Server Business Intelligence Development Studio, open the package in which you want to enable logging.
2. On the SSIS menu, click Logging.
Figure 38: Selecting Logging on the SSIS menu in Business Intelligence Development Studio
3. On the Providers and Logs tab of the Configure SSIS Logs dialog box, choose a logging provider. For example, SSIS log provider for Text files enables you to output the logging to a simple text file in your file system.
Figure 39: Selecting a log provider on the Providers and Logs tab of the Configure SSIS Logs dialog box
4. Click Add to add the selected provider.
5. Select the check box next to the provider to enable it.
6. Configure the logging provider by clicking on the Configuration column. You can either create a new connection or use an existing one. For the Text provider, you can choose to create a new text file and output the logging to it. Another option may be to always append to an existing file. You can define several log file providers to output to multiple log files of different formats.
Figure 40: Configuring the connection for logging to a text file
7. Next, select the diagnostic level for debugging. Click the Details tab, and then select the events to log. For the Oracle connector, choose the Diagnostic debugging level, which returns important interactions with the OCI interface of Oracle as well as other types of information. Note that you can select other events to be logged as needed, such as OnError, OnInformation, and OnWarning.
Figure 41: Selecting the Diagnostic event for logging
8. To save the current logging configuration for reuse as a template, click Save.
9. To accept your logging configuration, click OK.
10. To save the changes to the package, click the Save icon.
Run the package that you want to debug, and then review the output log file.
STEP 3 - Understanding the log file
The verbose log file contains the details and interaction of the different components in your package. Quickly reviewing the log file can reveal problems that you may be able to resolve without involving your technical support department.
Many general problems related to the SQL query, the backend database, or SQL Server Integration Services can be solved by reviewing the log file. If for any reason you cannot understand the cause of the failure, a support call should be made in order to get an explanation of the failure.
Analysis Tips
1. The log contains many Pre/Post and Enter/Exit messages. Make sure for every request there is a response, for example:
2:46:34 PM,0,0x,ExternalRequest_pre: The object is ready to make the following external request: 'IRowset::GetData'.
2:46:34 PM,0,0x,ExternalRequest_post: 'IRowset::GetData succeeded'. The external request has completed.
2. For OCI calls you will see Enter/Exit pairs as follows:
2:46:36 PM,0,0x,*Enter > OCIAttrGet
2:46:36 PM,0,0x,*Exit < OCIAttrGet
Troubleshooting: Error Message: “Failed to load OCI DLL”
This error indicates that a failure occurred while the Oracle oci.dll was being loaded. The error can occur if there are multiple Oracle homes or permission problems.
SCENARIO 1 – Multiple Oracle Homes
In this scenario, the computer is configured for development. The computer is running 64-bit edition of the Windows operating system, and the following components, which are required for development, are installed:
- Oracle Client 32Bit (required for design)
- Oracle Client 64Bit
- Microsoft SSIS Oracle Connector 64Bit
The following error may appear when the Oracle component is run in 32-bit (design time):
Error at Package [Connection manager "Oracle Connector 1"]: Failed to load OCI DLL.
SOLUTION
First, check that the environments are working and configured properly, and that SQLPlus is responding on both installations of the Oracle client.
On the computer on which the error occurred, it is noticed that the Oracle Home for the 64-bit Oracle client is defined in the 32-bit registry portion (Wow6432Node); this is causing the wrong version of oci.dll to be loaded.
To work around the problem, define a dummy registry entry (Z_SSIS) as follows:
Note: Incorrectly editing the registry can severely damage your system. Before making changes to the registry, you should back up any valued data on the computer.
- Open the Registry Editor.
- Locate the following key: HKEY_LOCAL_COMPUTER\SOFTWARE\Wow6432Node\ORACLE.
- Right-click the ORACLE node, click New, and then click Key.
- Call the new key Z_SSIS (to make sure it's the last entry).
- Right-click on the Z_SSIS node, click New, and then click String.
- Name the property ORACLE_HOME.
- Double-click ORACLE_HOME and set it to the location of the Oracle 32-bit installation home directory.
SCENARIO 2 – Permission problems to Oracle HOME directory and files
The user is working in a 32-bit or 64-bit environment and receiving the error message while designing the SSIS process.
SOLUTION
The error could occur for several reasons:
- The wrong ORACLE_HOME is set up in your registry. If this is the case, use the Oracle Installer to configure your environment.
- The interactive user does not have the proper permission to access the Oracle HOME directory and its files. If this is the case, check the permissions and configure them properly.
Finally, verify the connection using the Oracle SQLPlus utility.
For more information, see this Attunity Forum post (http://www.attunity.com/forums/microsoft-ssis-oracle-connector/error-failed-load-oci-dll-1308.html#post1942).
ConclusionThis paper discussed the functionality of the SSIS Connector for Oracle by Attunity 1.0, and it provided detailed step-by-step instructions on how to use the connector with SQL Server Integration Services. Three general use cases are presented with the design highlights, rationale, and performance statistics. Overall, the connector provides a high-performance means of loading and extracting data from Oracle databases.
For more information:
See the recent blog post, Using SSIS to get data out of Oracle: A big surprise! (http://blogs.msdn.com/sqlperf/archive/2009/08/14/using-ssis-to-get-data-out-of-oracle-a-big-surprise.aspx) by Len Wyatt of the Microsoft SQL Server Performance team.
http://www.microsoft.com/sqlserver/: SQL Server Web site
http://msdn.microsoft.com/en-us/sqlserver/cc511477.aspx: SQL Server Integration Services TechCenter
http://technet.microsoft.com/en-us/sqlserver/cc510302.aspx: SQL Server Integration Services DevCenter
Did this paper help you? Please give us your feedback. Tell us on a scale of 1 (poor) to 5 (excellent), how would you rate this paper and why have you given it this rating? For example:
- Are you rating it high due to having good examples, excellent screen shots, clear writing, or another reason?
- Are you rating it low due to poor examples, fuzzy screen shots, or unclear writing?
This feedback will help us improve the quality of white papers we release.