Saturday, November 9, 2013

Reading and Writing Complex XML files in ODI - Part1

Reading and Writing Complex XML files in ODI - Part1

In my last post Read write Files, I showed how to read and write delimited files and fixed length files in ODI. In this blog, I will show, how to read and write to complex files in ODI.

For this post, I would like to take the HR schema tables for xml schema. The following database diagram shows the relationship between tables in the HR schema. we will represent this Region->Country->Location->Departments->Employees relationships as XML.

I have added a addtional colum COUNTRY_NUM (numeric) in the country and location tables of HR schema to use for element ordering in XML. This is needed as COUNTRY_ID which exists already in country table is varchar. I have also provided numeric values for this column in countries and location.

This data hierarchy will be converted into the following xml hierarchy using xsd.

I have created the xsd for this representation and it can be downloaded here. xsd
I have created a sample xml also using the xsd and it can be downloaded here. xml

Lets get started with writing xml first with ODI and use the same output to load it into different set of tables to demonstrate loading.

Creating Topology and XML Model configuration

As usual we need to create a physical schema in ODI for the schema similar to table and file examples.
This xml schema is created as tables in ODI to represent xml. Each complex type in xml is considered as a table. Each element and attribute inside a complex element are represented as columns in the table. So ODI will represent the hierarchy with region, country, location, department, employees table internally when we create a physical schema for the XSD.
The table representation by default is created in memory but it can be created using external database also and external database schema is the suggested approach for production environments so that they are accessible any time for debugging as opposed to memory approach.
To use external database schema approach we need to use properties file to tell ODI the schema, connection details to store the table representation of xsd.
For this purpose, I have created a new schema XML_STAGING in my XE database and I have created the properties file. The property file looks as follows. I have named the property file as all_employees.properties. Use the encode tool to encode the password to specify here. The encode tool is available in agent/bin directory.


driver=oracle.jdbc.driver.OracleDriver
url=jdbc:oracle:thin:@localhost:1521:xe
user=XML_STAGING
password=g0yXfhlSfzhz4bro5P7HCBQ
schema=XML_STAGING
drop_on_connect=Y
create_tables=AUTO
create_indexes=Y
truncate_before_load=Y
ids_in_db=Y
drop_tables_on_drop_schema=Y
use_prepared_statements=Y
use_batch_update=Y
batch_update_size=30
commit_periodically=Y
num_inserts_before_commit=1000
reserve_chars_for_column=3
reserve_chars_for_table=3

This all_employees.properties need to be placed in the classpath odi client or agents classpath wherever xml is used so that it can use this file to identify the xml_staging.
For windows odi client class path is : <UserHome>/AppData/Roaming/odi/oracledi/userlib
For standalone agent this classpath is: <standalone agen dir>/oracledi/agent/drivers

Since i am using windows odi client, I have placed in appdata odi userlib directory.
Lets create the topology physical schema
Under Topology tab under physical architecture section, right click on XML technology and select create data server. Provide name for the data server
Go to jdbc tab and provide jdbc details. Select jdbc driver for xml and provide jdbc url.
jdbc url provided is 
jdbc:snps:xml?f=C:\temp\odi\input\all_employees.xml&d=C:\temp\odi\input\all_employees.xsd&s=EMPLOYEES&re=regions&dp=all_employees.properties&lf=C:\temp\odi\input\all_employees.log&ll=31

Here
f - represents xml sample file
d - represents xsd file
re- root element name in xsd (regions in our case)
s - logical schema name for this xml
dp - property file for external database schema to store xsd representation
lf - log file to redirect logs
ll - log level (31 is the highest level)
Check the connection by testing it.

Lets create a physical schema now. Right click on data server and select create physical schema.

Lets create a logical schema and map the physical schema to logical schema.
Lets go ahead and create model now for the xml. Go to designer tab, under data model section, create a new model. Provide names, xml as technology and select logical schema we created earlier.

Click on reverse engineer button at the top. ODI converts all complex types to tables and puts them in the model. all elements and attributes becomes columns for the table. There are addition things to note here that those tables have some special columns.
They are
<element>Order column and FK column.
1. elementOrder columns decide the order in which elements need to appear in the xml since xsd made the sequence mandatory through xs:sequence for complex element.
2. FK column stores the id of parent so that child comes correctly under the parent. In our snapeshot it is regionfk for country.

Actually the tables shown in the model are physically present in the XML_STAGING table. when we do reverse engineer those tables are created in the schema and the meta data of those tables are represented as data stores in the model. EMPLOYEES represent the schema name we provided in the s attribute of jdbc url of the data server.

Please also increase the column sizes in data store and xml_staging schema tables  if needed that the input xml or the tables in hr schema has bigger column sizes.

Writing XML File

Lets create series of interfaces for loading each complex types in xml from hr tables. First lets create region interface. I am creating another folder xml under PRJ_TEST project and placing all interfaces and package here.

Import LKM SQL to SQL and IKM XML Control Append knowledge modules before creating the interface.
1. LKM SQL to SQL - loads data from source to staging
2. IKM XML Control Append - writes data from staging to target.

Create interface for region. Provide name and set staging different from target. select xml logical schema for staging.


In the mapping tab, provide the mapping for region interface with region of hr as source and region of xml model as target.

1. Note that here regions parent (REGIONSFK)  is only one element and so it is mapped to 0
2. Name is the first element in region and so it is set to 0
3. REGIONORDER is set to source region id since it is a sequence and so it will automatically maintain the order of region under regions based on this source regions id.


In the flow tab, select source and staging tables and select  LKM SQL to SQL as knowledge modules.
Select target and select IKM XML Control Append as knowledge module.
Set the following properties and these need to be set only for the top element (region) under ROOT (regions)
1. Select TRUNCATE_SCHEMA to yes (this truncates all xml data available in previous run)
2. Set ROOT_TABLE to regions (to show it is the root node in xml)
We need to create similar interfaces for country, locations, department and employees.
For country, location and department we should NOT set any options for IKM XML Control Append leave all the options default. So I am just showing here the mapping for country, location and department.

Mapping for country, Note here that newly added numeric column country_num is used for countrypk and countryorder as country_id is varchar which i cant use for order column. In the flow tab, make sure all options are default for IKM XML Control Append.

Mapping for location, similar to country,
1. staging different from target, set xml logical schema
2. for mapping note the sequence is provided from 0 to 3 for elements under location.
3. locationpk and locationorder are set to locationid from source since it is a numeric and it is a sequence.
4. leave deafult option for IKM XML control append for target in flow tab.
Mapping for department, similar to country,
1. staging different from target, set xml logical schema
2. for mapping note the sequence is provided from 0 to 1 for elements under department.
3. departmentpk and departmentorder are set to dpeartmentid from source since it is a numeric and it is a sequence.
4. leave deafult option for IKM XML control append for target in flow tab.
Mapping for employee,
flow tab for employee, make sure that CREATE_XML option is set to true here as it is the last interface which will be populated and it needs to create the xml file to disk. Optionally we can set XML_PATH option to full path of the xml file so that xml file is written here. I have set this to C:\temp\odi\input\employees_<%=odiRef.getSession("SESS_NO")%>.xml in this example.
<%=odiRef.getSession("SESS_NO")%> get the currenr session number for the odi run appends to the file name. so if session number is 300, file will be created as C:\temp\odi\input\employees_300.xml

Lets put everything together in a package so that they run in a sequence and produce a file.
Right click on packages and select new package. Provide a name.
select diagram tab, drag and drop region, country, location , department and employees interfaces in sequence and connect them with ok for success flow. save the package.

Lets run the package and verify the results. Right click on package and execute. Select global and no agent and click ok. package execution is successful.

Lets verify the file in the directory and it is created successfully. File name also has the session number appended to it.

Lets open the file and verify the data. All data is successfully produced in the file and it is a well formed xml file.


With this we have completed the xml writing part.Since this is a big, I am separating reading from xml file into a seperate post. You can continue reading xml file post here. Part2 - Read from xml file


52 comments:

  1. OH MAN!!!! YOU THE BEST!!! THIS IS JUST PERFECT!!!!

    I have spend 3 weeks i didn't understand how to do this. I will recommend to everyone!!!!

    Thanks a lot

    ReplyDelete
    Replies
    1. Hi Can oyu give the XSD and XML which you used for this.

      Delete
    2. While ODI is a very good ETL tool in general it is not brilliant at loading complex XML files. It struggles to parse complex XSD schemas. Performance is poor for large volumes of XML files as the workload can't be parallelised. It also takes a lot of development time and effort to map the XML sources to targets, orchestration of XML interfaces/mappings is painful, you can't viusalise the source schema, error handling is hard, scalability is limited to one CPU core. All of the issues are documented in this post on parsing XML in ODI. Because of all of these problems you may be better off using a dedicated XML parser.

      Delete
  2. I wish to thank as well for this impeccable article that saved us quite some time in trying to figure out how to make XML files and ODI behave.

    May your days be plenty and interesting =)

    ReplyDelete
  3. This article has been most useful. I've been working on similar need to write down to xml-file the data to be to be delivered to other system.

    I've done all the necessary steps explained here. The staging goes fine, but when the package is writing down the xml as a last step, I'm getting this nasty Java error below. How can I narrow down why this is happening.

    Thanks in advance for your help.


    ODI-1226: Step INT_xx_table_XML fails after 1 attempt(s).
    ODI-1240: Flow INT_xx_table_XML fails while performing a Integration operation. This flow loads target table xx_table.
    ODI-1228: Task INT_xx_ID_XML (Integration) fails on the target XML connection XML_xx_EXPORT.
    Caused By: java.sql.SQLException: ODI-40768: Could not save the file C:\tmp\XML_xx_EXPORT_166201111.xml because a class java.sql.SQLException occurred and said: java.sql.SQLException: java.sql.SQLException: Numeerinen ylivuoto
    at com.sunopsis.jdbc.driver.xml.SnpsXmlFile.writeToFile(SnpsXmlFile.java:756)
    at com.sunopsis.jdbc.driver.xml.SnpsXmlConnection.internalExecute(SnpsXmlConnection.java:744)
    at com.sunopsis.jdbc.driver.xml.SnpsXmlPreparedStatement.execute(SnpsXmlPreparedStatement.java:46)
    :
    :

    ReplyDelete
  4. Do you have an update for 12c? As i cant find a flow option to select the XML Path?

    ReplyDelete
  5. This is very good post, Thank you so much, I am able to get the XML and XSD file mentioned in the LINK ( used in above example).
    Can you please provide that

    ReplyDelete
  6. Hi ,
    Is it possible to convert xml to excel in oracle odi...if it is possible pls share me the steps...Thanks

    ReplyDelete
  7. Hi,

    I followed your instructions and everything was fine in the last days, but now when I execute the procedure to create XML file I get 'Statement is not in batch mode' ...Can you please suggest a solution or is that an ODI bug? I am using ODI11. thanks

    ReplyDelete
    Replies
    1. In the definition of the Data Server (Topology/Physical Architecture), set 'Batch Update Size' to 1.

      Delete
  8. Priscila Luft SmaniottoMay 7, 2015 at 11:00 AM

    Hello,
    This is very good post, Thank you so much, I am able to get the XML and XSD file mentioned in the LINK ( used in above example).
    Can you please provide that?

    Thanks.

    ReplyDelete
  9. Hi, I implemented in odi 12.c. I did this data transfers in one mapping. I had an output xml file. But some childs are replicated. I cannot find the problem? Can you please help me?
    Thank you.

    ReplyDelete
  10. Hi again, my output's half part is here:




    20140730005051


    0.3


    0.7


    20140730005051


    0.3


    0.7


    1
    11
    2610415
    0
    1
    0
    4001
    TRL
    1
    000000000000000
    019495885290
    8790569242
    013544815146
    97
    18
    2
    2610415


    20140730005051


    0.3


    0.7


    20140730005051


    0.3


    0.7


    1
    11
    2610415
    0
    1
    0
    4001
    TRL
    1
    000000000000000
    019495885290
    8790569242
    013544815146
    97
    18
    2
    2610415

    ReplyDelete
  11. Sorry my xml output cannot be displayed here...:( If you ask, i can send via email.
    Thank you...

    ReplyDelete
    Replies
    1. Hi sir, could you provide me XSD and XML file
      my email: hongcntt88@gmail.com

      Best regards,
      HongTT

      Delete
    2. Hi,

      Could you please share the XML and XSD file used in this post to: mukherjee.promit@gmail.com.

      This seems to be a very useful and helpful topics.

      Thanks, in advance.

      Regards..

      Delete
  12. Hi sir ,could some one provide me XSD and XML files for the above post.....

    ReplyDelete
  13. Lick to XSD and XML are broken. Can you please attach them again..

    ReplyDelete
  14. Can you give us new links to XML and XSD files?
    Thank you so much!

    ReplyDelete
  15. Can you give us new links to XML and XSD files?
    Thank you so much!

    ReplyDelete
  16. Can you give us new links to XML and XSD files?!?!
    Thank you!

    ReplyDelete
  17. please provide links for XSD and XML files or mail me ksckr.20@gmail.com

    ReplyDelete
  18. please provide links for XSD and XML files or mail me madhavapps077@gmail.com

    Thank you!

    ReplyDelete
  19. could you please provide links to XSD and XML files or email to Buddanlou@hotmail.co.uk thanks you

    ReplyDelete
  20. could you please provide links to XSD and XML files or email to sadettincil@gmail.com thanks you

    ReplyDelete
  21. Thanks for sharing these information. It’s a very nice topic. We are providing online training classesoracleodionlinetraining

    ReplyDelete
  22. Thanks for sharing these information. It’s a very nice topic. We are providing online training classesoracleodionlinetraining

    ReplyDelete
  23. could you please provide links to XSD and XML files or email to hoanm1412@yahoo.com

    thanks you

    ReplyDelete
  24. Thanks for this Article, I followed above steps, I am getting below error.
    ODI-1228: Task Insert new lines-IKM XML Control Append- fails on the target connection COMPLEX_XML_POC.
    Caused By: java.sql.SQLException: invalid schema name: S_COMPLEX3

    ReplyDelete
    Replies
    1. Hi Satya ,
      I'm getting same error "Caused By: java.sql.SQLException: invalid schema".
      Could you please explain how to fix this error if you solved.
      It's very significant for me.

      Delete
  25. Thanks for all these details.
    Could you provide the xsd and xml link please?
    Best regards

    ReplyDelete
  26. could you please provide links to XSD and XML files or email to ahmadndiaye200@hotmail.com thanks you.

    ReplyDelete
  27. As per the above post,
    I have installed ODI11g in my windows laptop and put the all_employees.properties file in the below location C:\Oracle\product\11.1.1\Oracle_ODI_1\oracledi\agent\drivers. But while create the physical schema and trying to test the connection, I am getting the below errors:
    java.sql.SQLException: ODI-40848: Could not find the file called all_employees.properties

    Could you please advise quickly.

    ReplyDelete
  28. Muruganantham Muniyasamy very gud post ...Great work..Expecting more concepts on ODI...

    Thanks,
    suresh.

    ReplyDelete
  29. Hi,
    Can't we load more that one record at a time using the above interface?
    i am getting multiple keys found on the table while doing so?

    Could you please help here.
    Thanks

    ReplyDelete
  30. Hi,

    How to add that country_num column in XML target table.

    Thanks
    Suman

    ReplyDelete
  31. Hi sir, could you provide me XSD and XML file
    my email: gilson.fonsaca@gmail.com

    Best regards,
    Gilson

    ReplyDelete
  32. could you please provide links to XSD and XML files on email to juanmabozalmorales@gmail.com
    thank you.

    ReplyDelete
  33. could you please provide links to XSD and XML files on email to shirkeshashi04@gmail.com

    I m getting 404 error while redirecting to XML path.

    ReplyDelete
  34. Can you please provide me the XSD and XML files ,Below is my email id .

    Jyothirmai.ps@gmail.com

    ReplyDelete
  35. Hi can you please mail me the xml and xsd thats been used the link is not working, please i really need this. Thank you.

    my mail id : jagasrik@gmail.com

    ReplyDelete
  36. Hi could you provide xsd and xml to tkosnik@yahoo.com

    ReplyDelete
  37. Hi , thank you for your useful sharing.
    Could you please send me the files (XSD and XML).
    e-mail : volkan.camas@gmail.com
    Thanks in advance.

    ReplyDelete
  38. Hi ,

    I am using 12c and could you please show me where we do the mapping in order(0,1,2,3..) in XML datstores. Could you alsoplease send me the files (XSD and XML).
    e-mail : replyme.deva@gmail.com

    Thanks in advance.

    ReplyDelete
  39. "Great blog created by you. I read your blog, its best and useful information. You have done a great work. Super blogging and keep it up.php jobs in hyderabad.
    "

    ReplyDelete
  40. I read this blog You explained clearly, it's easy to understand everyone keep sharing..check this information for a knowledge on Devops Online Training Hyderabad

    ReplyDelete
  41. • Nice and good article. It is very useful for me to learn and understand easily. Thanks for sharing your valuable information and time. Please keep updatingAzure Online Training

    ReplyDelete
  42. Does anyone of you encounter below error when loading large xml? Is there anything you might suggest to avoid this error? already tried to increase the heap size but nothing works

    ODI-40761: Not enough memory to read the XML file

    ReplyDelete
  43. Hi can you please mail me the XML and XSD thats been used the link is not working, please i really need this. Thank you.
    email : nguyentrungkien21956@gmail.com

    ReplyDelete