Saturday, November 9, 2013

Reading and Writing Delimited and Fixed-Length Files in ODI

In my last post, Table to Table load, I showed how to load data from a source table to a target table. In this tutorial, I want to expand on that by loading a fixed-length/delimited file into a table and producing a fixed-length/delimited file from a table.

In this post, I will use employee files similar to the EMPLOYEES table in the HR schema, except that the EMPLOYEE_ID column is missing from the files; we will generate it through an Oracle sequence while loading into the table.
I have a comma-delimited file and a fixed-length file. This is what they look like. They can be downloaded from here: CSV File, Fixed Length.
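For illustration only (the real files are linked above), the first rows of each format would look something like this; these sample rows are hypothetical, in the HR EMPLOYEES style, with no EMPLOYEE_ID field:

CSV file:

FIRST_NAME,LAST_NAME,EMAIL,PHONE_NUMBER,HIRE_DATE,JOB_ID,SALARY,COMMISSION_PCT,MANAGER_ID,DEPARTMENT_ID
"Steven","King","SKING","515.123.4567","17-JUN-03","AD_PRES",24000,,,90

Fixed-length file (each field padded with spaces to a fixed width; only the first three fields shown):

FIRST_NAME                    LAST_NAME                     EMAIL
"Steven"                      "King"                        "SKING"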



There are two tables which I will create in the target schema DATA_TARGET (which I created in the Table to Table post) to hold the data loaded from the CSV file and the fixed-length file. Their DDL is shown below.

CREATE TABLE "DATA_TARGET"."EMPLOYEES_CSV"
  (
    "EMPLOYEE_ID"    NUMBER(6,0) NOT NULL ENABLE,
    "FIRST_NAME"     VARCHAR2(20 BYTE),
    "LAST_NAME"      VARCHAR2(25 BYTE) NOT NULL ENABLE,
    "EMAIL"          VARCHAR2(25 BYTE) NOT NULL ENABLE,
    "PHONE_NUMBER"   VARCHAR2(20 BYTE),
    "HIRE_DATE"      DATE NOT NULL ENABLE,
    "JOB_ID"         VARCHAR2(10 BYTE) NOT NULL ENABLE,
    "SALARY"         NUMBER(8,2),
    "COMMISSION_PCT" NUMBER(2,2),
    "MANAGER_ID"     NUMBER(6,0),
    "DEPARTMENT_ID"  NUMBER(4,0),
    CONSTRAINT "EMPLOYEES_CSV_PK" PRIMARY KEY ("EMPLOYEE_ID")
  );

CREATE TABLE "DATA_TARGET"."EMPLOYEES_FL"
  (
    "EMPLOYEE_ID"    NUMBER(6,0) NOT NULL ENABLE,
    "FIRST_NAME"     VARCHAR2(20 BYTE),
    "LAST_NAME"      VARCHAR2(25 BYTE) NOT NULL ENABLE,
    "EMAIL"          VARCHAR2(25 BYTE) NOT NULL ENABLE,
    "PHONE_NUMBER"   VARCHAR2(20 BYTE),
    "HIRE_DATE"      DATE NOT NULL ENABLE,
    "JOB_ID"         VARCHAR2(10 BYTE) NOT NULL ENABLE,
    "SALARY"         NUMBER(8,2),
    "COMMISSION_PCT" NUMBER(2,2),
    "MANAGER_ID"     NUMBER(6,0),
    "DEPARTMENT_ID"  NUMBER(4,0),
    CONSTRAINT "EMPLOYEES_FL_PK" PRIMARY KEY ("EMPLOYEE_ID")
  );

Let's log in to the ODI repository now, configure the file information in Topology, and create the model.

Topology and Model


Go to the Topology tab, expand the File technology, and select the FILE_GENERIC data server (ODI provides it out of the box; files are not synonymous with database schemas, but this is how they fit into the ODI architecture). Right-click FILE_GENERIC and select New Physical Schema. Provide the directory where the files reside in both the Schema and Work Schema fields. Since I am going to keep both the CSV and the fixed-length file in this directory, one physical schema is enough for this use case. If you place the fixed-length file in a different directory, you will have to create a separate physical schema for it here.


Now that we have the data server and physical schema in place, let's create a logical schema. In the Logical Architecture section under the File technology, right-click File and select New Logical Schema. Provide a name and map the physical schema we created earlier to the Global context.



Let's now create a model. Go to the Designer tab and, in the Models section, create a new model. Provide a name, select the File technology and the logical schema we created before, and save it.


For files we can't reverse-engineer the data stores at the model level; we need to create them manually (a data store is the representation of a file). It will be a single model with a separate data store for the CSV file and for the fixed-length file.
Right-click the data model we created and select New Data Store. In the Definition tab, provide a name, set the data store type to Table, and make sure the exact file name is given in the Resource Name field. You can also use the search icon to select the file from the directory.



Go to the Files tab. Set the file format to Delimited, the heading (number of lines) to 1, the field separator to comma, and the text delimiter to double quotes. Make sure the heading number of lines is provided; otherwise the first line will be treated as data, and when we reverse-engineer the columns (coming up next), the column names will not be identified.

Go to the Columns tab and click Reverse Engineer. Since we set the first row as the heading, it picks up the column names from the header. By default it identifies all data as String; we can convert fields to their proper data types while loading into the table using database functions. Adjust the size (automatically set to 50) if you think a field will hold more characters. Save the CSV data store.
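Since reverse engineering types every file column as String, conversion expressions like the following can be used later in the interface mappings; the date format mask 'DD-MON-RR' is an assumption and must match your file:

TO_DATE(EMPLOYEES_CSV.HIRE_DATE, 'DD-MON-RR')
TO_NUMBER(EMPLOYEES_CSV.SALARY)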

Let's create a similar data store for the fixed-length file. Here, however, everything is driven by fixed column lengths. Provide the details in the Definition tab.

Under the Files tab, set the file format to Fixed, the heading (number of lines) to 1, the field separator to space, and the text delimiter to double quotes.

Go to the Columns tab and click Reverse Engineer. A popup appears where you decide the fixed length of each column. If you are from the Oracle SOA world, it will look similar to configuring the file adapter. A vertical line appears inside the dialog as you move the mouse along the ruler; click to fix a column boundary at the width you need. If you play with it for a while, you will figure out how it works even if you are new to this.
Change the column names from Cn to the actual column names, since they were not captured during reverse engineering. I also set the type to String explicitly, as it was not set by default.
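To make the layout concrete, a fixed-length data store boils down to a start position and a width per column; the widths below are hypothetical (they match the 30-character fields mentioned later in this post):

COLUMN         START   LENGTH
FIRST_NAME     1       30
LAST_NAME      31      30
EMAIL          61      30
PHONE_NUMBER   91      30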

We have completed the topology and model configuration for the files. I will also reverse-engineer the DATA_TARGET model, since we need the two new tables we created in DATA_TARGET to appear in that model.
Right-click the model and select Reverse Engineer. Two new data stores will appear in the model.

Now that the models are complete, let's create interfaces to load data from file to table.

Create Interfaces

I will use the same project for this post that we created for the Table to Table load. Let's create the sequence we are going to use for loading, and import the file knowledge modules, before we create the interfaces in the project.

Under the project PRJ_TEST, right-click Sequences and select New Sequence. I am going to use the sequence available in the HR schema even though we are not dealing with the HR schema here. That's the beauty of ODI: you can create your own ODI sequence or use a native sequence from the respective technology.

Provide a name for the sequence, select Native Sequence, and select the HR logical schema to pick up the employee sequence in the HR schema. Save it.
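For reference, the native sequence being reused is the one the sample HR schema ships with; its definition is roughly the following (the exact START WITH value depends on your HR install):

CREATE SEQUENCE HR.EMPLOYEES_SEQ
  START WITH 207
  INCREMENT BY 1
  NOCACHE
  NOORDER;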
Import the knowledge modules needed to load data from a file and to a file. LKM File to SQL is used to load data from a file into a table, and IKM SQL to File Append is used to load data from a table into a file:
LKM File to SQL
IKM SQL to File Append
Let's create an interface for CSV file to EMPLOYEES_CSV. Right-click Interfaces and create an interface named INT_CSV_TO_TABLE.

Go to the Mapping tab, drag and drop EMPLOYEES_CSV from the file data model into the source section and EMPLOYEES_CSV from DATA_TARGET into the target section. Map the source fields to the target.

For the EMPLOYEE_ID field in the target, select it and, in the properties pane below, set the sequence as its value.
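Because this is a native sequence, ODI substitutes the database sequence directly into the generated DML, so the final insert is equivalent to a statement along these lines (a simplified sketch: the real generated code goes through ODI's C$/I$ work tables, and C$_EMPLOYEES_CSV is a hypothetical staging table name):

INSERT INTO DATA_TARGET.EMPLOYEES_CSV
  (EMPLOYEE_ID, FIRST_NAME, LAST_NAME, EMAIL, HIRE_DATE, JOB_ID)
SELECT HR.EMPLOYEES_SEQ.NEXTVAL,
       FIRST_NAME,
       LAST_NAME,
       EMAIL,
       TO_DATE(HIRE_DATE, 'DD-MON-RR'),
       JOB_ID
FROM   DATA_TARGET.C$_EMPLOYEES_CSV;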
Go to the Flow tab, select the target, and make sure IKM SQL Incremental Update is selected to merge the rows into the target (insert or update). We could use the IKM SQL Control Append knowledge module instead if our requirement were just to append rows to the target. LKM File to SQL is automatically selected, since a file is the source here.
Go to the Controls tab and check that CKM Oracle is selected for constraint checking. Save the interface.
Let's go ahead and run this interface to see the results. Right-click the interface and run it with the Global context, no agent, and the desired log level. Go to the Operator tab and check the run results under All Executions.
It is successful.

Let's check the target table for the resulting data.

Let's create a similar interface for the fixed-length file. The steps are exactly the same; the only difference is that in the Mapping tab the source table is EMPLOYEES_FL from the file model and the target table is EMPLOYEES_FL from the DATA_TARGET model.
Let's run the interface INT_FL_TO_TABLE and verify the results. This failed, since the target column size is only 20 but my source fields are 30 characters wide. We can either trim the extra spaces from the mapped values with Oracle functions before loading, or increase the target column size.
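The failure surfaces as the standard Oracle "value too large" error; a minimal reproduction (with a hypothetical 30-character, space-padded value) looks like this:

INSERT INTO DATA_TARGET.EMPLOYEES_FL
  (EMPLOYEE_ID, FIRST_NAME, LAST_NAME, EMAIL, HIRE_DATE, JOB_ID)
VALUES
  (999, '"Steven"                      ', 'King', 'SKING', SYSDATE, 'AD_PRES');
-- ORA-12899: value too large for column "DATA_TARGET"."EMPLOYEES_FL"."FIRST_NAME"
--            (actual: 30, maximum: 20)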

I went down the path of adding functions, since the actual data doesn't exceed the target field size; it is the surrounding spaces and double quotes that inflate it. I added the expression REPLACE(TRIM(field), '"', '') to remove the double quotes and trim the spaces.
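You can verify the cleanup expression on its own before putting it in the mapping; the padded sample value here is hypothetical:

SELECT REPLACE(TRIM('"Steven"                      '), '"', '') AS cleaned
FROM   dual;
-- returns: Steven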

Let's run it and check the output table. I can see the run is successful and the output table is populated.

Let's create another two interfaces to load from table to file. There are a couple of differences between the previous interfaces and these two new ones:
1. The source will be a table and the target will be a file.
2. IKM SQL to File Append will be used to produce the output file; it is selected in the Flow tab for the target.

Let me show the Definition, Mapping, and Flow tabs for one interface here; the other steps are similar to what we have seen already. The important point is that under the Definition tab, the 'Staging Area Different From Target' check box should be enabled and a staging area logical schema selected (DATA_TARGET is selected in my case).
Definition tab:
Mapping tab:

Let's look at the Flow tab: select the target and choose IKM SQL to File Append. Optionally, set GENERATE_HEADER to true to generate a header line in the file.
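With GENERATE_HEADER set to true, the output file begins with a line of column names; a hypothetical first two lines of the generated CSV:

EMPLOYEE_ID,FIRST_NAME,LAST_NAME,EMAIL,PHONE_NUMBER,HIRE_DATE,JOB_ID,SALARY,COMMISSION_PCT,MANAGER_ID,DEPARTMENT_ID
100,Steven,King,SKING,515.123.4567,17-JUN-03,AD_PRES,24000,,,90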

Let's run this interface. Before that, rename the existing files in the input directory so that they are not overwritten.
The run is successful and we can see the file in the output directory.


Follow a similar procedure for the table to fixed-length file load.

Another question that comes to mind: how do we specify the file name dynamically each time the interface is run, for reading as well as writing?
Variables come to the rescue here. Create a variable in ODI that holds the dynamic file name and, once reverse engineering is done, specify this variable in place of the resource name in the data store. In a package, we can refresh the variable and then run the interface so it takes the file name from the variable. I will be covering procedures and packages in my next posts and will demonstrate a similar use case then.
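A minimal sketch of the idea (the variable name VAR_FILE_NAME and the date-stamped naming convention are assumptions, not from this post): give the variable a refreshing query that builds the name, then point the data store at it.

Refreshing query for the variable (Oracle):

SELECT 'EMPLOYEES_' || TO_CHAR(SYSDATE, 'YYYYMMDD') || '.csv' FROM dual

Resource Name field of the file data store:

#PRJ_TEST.VAR_FILE_NAME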

With this, we have completed reading and writing delimited and fixed-length files in ODI. In my next posts I will show reading and writing complex files and XML, and also introduce the procedure and package concepts.

Thanks for reading and let me know your comments and questions.


