Oracle’s Business Intelligence Cloud Service (BICS) offers several different ways to load data into the complementary cloud database schema that comes with a subscription to BICS. The most basic data loading utility is the Data Loader which is accessible off the home screen after logging into BICS. This post describes the data load process and then digs into what goes on behind the scenes when data is loaded. For anyone responsible for managing the database in their own BICS environment, this should be useful information.
Data Loader Basics
The Data Loader allows for uploading data from either a cut-and-paste operation or uploading CSV, TXT, ZIP, or Microsoft Excel files. ZIP files can only contain TXT or CSC data files and the first file in the ZIP archive will be loaded by the Data Loader.
Accessing the Data Loader begins with clicking on the “Load” button on the lower left of the BICS home screen.
The Data Loader Home Page
The BICS Data Loader home page shows a summary of all data load activities including links to show the actual data records loaded and any records that were not loaded due to errors.
- Date: The date / timestamp of when the load was performed
- User: The application user who performed the data load
- Destination Table: The name of the table in the associated Oracle Cloud database into which the load was performed
- Inserted Records: The count of records which were successfully inserted into the target table
- Updated Records: The count of records which were successfully updated in the target table
- Not Loaded Records: The count of records which had errors while performing the load; this includes records which had errors while inserting and updating
- Processed Records: The count number of records processed; this includes inserted, updated and not loaded records
Clicking on the “Inserted Records” value for a load job will show a detail screen listing all of the data that was successfully loaded into the database. A dropdown list will then shows updated or error records.
After viewing this detailed data summary, you may ask a few important questions, especially if you are responsible for maintaining the cloud database, such as:
- Where is the detailed information about the data loads stored?
- Is it necessary to keep duplicate copies of every record loaded by the data load in the database tables?
- How do we go about purging the data load information when it is no longer needed?
Under The Covers
By using the “Object Browser” in the Oracle Application Express (APEX) environment associated with the BICS cloud database, we see three tables named “EBA_ADL_LOADS,” “EBA_ADL_LOAD_MAPPINGS,” and “EBA_ADL_LOAD_RECORDS.”
Clicking on “EBA_ADL_LOADS” and then selecting the “Model” tab from the view shows the relationship between the three tables. The “EBA_ADL_LOADS” table is the master table, while the other two tables contain the details about each load.
“EBA_ADL_LOADS” contains one row for each attempted data load and stores who performed the load, when the load was performed, and what table was loaded.
“EBA_ADL_LOAD_MAPPINGS” contains one row for every column loaded in each instance of a data load. The table has a foreign key reference between “EBA_DATA_LOAD_ID” and the “EBA_ADL_LOADS.ID” column.
Most of the columns are self-explanatory except for these two:
“UPDATE_ROW_ON_MATCH” is a flag that determines whether to use the column as a key to identify rows that will be updated versus inserted during a load.
“DATA_TRANSFORMATION” is a numeric code that relates to which optional basic transformation rule was selected for the column during the data load process.
- 1 = Uppercase
- 2 = Lowercase
- 3 = Trim Spaces
- 4 = Format as number
“EBA_ADL_LOAD_RECORDS” contains one row per row inserted by each data load.
Let’s repeat that statement just to be clear: “EBA_ADL_LOAD_RECORDS” contains one row per row inserted by each data load.
This might be a cause for concern over time as data is loaded into the Oracle Database Cloud instance and there may be limitations on available storage space within that instance.
The structure of “EBA_ADL_LOAD_RECORDS” includes a primary key ID column; an “ACTION” column showing whether the row was inserted or updated; an “ERROR” column containing the error message that may have occurred when inserting the row; and a “SEQ_ID” containing the sequential row number from the data loaded. The table also maintains a foreign key reference between “EBA_DATA_LOAD_ID” and the “EBA_ADL_LOADS.ID” column.
In addition, there are 50 columns prefixed with “C00%” that map to the individual columns in the “EBA_ADL_LOAD_MAPPINGS” columns. This is why a data load using the BICS Data Loader tool is limited to 50 columns, as this is a limitation imposed by the underlying tables used to track the data loads.
The combination of the three “EBA_ADL_%” tables forms the metadata that controls the BICS data load process.
Do we really want to keep duplicate copies of every record loaded by the BICS Data Loader in our database?
In most cases, the answer would presumably be “No.” The Oracle Cloud Database instance that is provided with BICS is limited to 50 GB of space. Retaining duplicate copies of every row from every data load would reduce the actual available space for analysis by half or more.
It is important to note that the “EBA_ADL_%” tables are only utilized when the data is loaded via the BICS loader. Loading data via other methods like Oracle APEX Data Workshop, Oracle SQL Developer carts, or REST API services will not cause information to be loaded into those tables.
How do we clean up the metadata?
There does not appear to be any purge utility for the data load tables included within the BICS environment. Hopefully Oracle will add this functionality at a later point.
For now, the only way to purge the data appears to be via direct “DELETE” or “TRUNCATE” statements executed within the Oracle Database Cloud APEX environment.
For a complete purge of the data load information, the following SQL commands can be executed within the APEX environment:
DELETE FROM EBA_ADL_LOAD_RECORDS;
DELETE FROM EBA_ADL_LOAD_MAPPINGS;
DELETE FROM EBA_ADL_LOADS;
To delete only the data related to a specific load, first identify the ID column from the “EBA_ADL_LOADS” table of the load to be purged.
SELECT * FROM EBA_ADL_LOADS;
DELETE FROM EBA_ADL_LOAD_RECORDS WHERE EBA_ADL_LOAD_ID = value;
DELETE FROM EBA_ADL_LOAD_MAPPINGS WHERE EBA_ADL_LOAD_ID = value;
DELETE FROM EBA_ADL_LOADS WHERE ID = value;
COMMIT; — If necessary
It is also possible to create a PL/SQL package to purge the tables based on more complex criteria like dates or target table names.