Configure XML Input Stage with EXTERNAL SOURCE Stage

In this article we will look at two important DataStage stages through a simple scenario:

  1. XML Input stage

  2. External Source stage

Here the External Source stage is used to read the path of the stored XML file. An advantage of the External Source stage is that we can use Unix commands to list or read files. The XML Input stage then uses this file path to convert the data stored in the XML file into tabular form. Let's study these two stages in detail by implementing the scenario step by step.

Step #1: Design the job as shown below.

Job Design for XML Input

In the figure above, the External Source stage is the input stage, named extrnl_src_books_det. The target is a Data Set file named ds_trgt_book_det.

The XML Input stage, named XML_Input, converts the XML data into tabular form.

Step #2: Double-click the External Source stage. The following window will pop up.

Configure External Source Stage

To configure the External Source stage, we can use a Unix command to read the file path. Select Specific Program(s), because we are going to get the list of stored XML files by running a command in Source Program, as shown in the figure above.

Filepath is a job parameter; here its value is G:\Study\Project.

The command ls #Filepath#*oks.XML lists all files whose names end in oks.XML.
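To make the wildcard behaviour concrete, here is a minimal Python sketch of what the ls command above does. It is only an illustration, not DataStage code: the function name list_xml_paths and the sample file names are hypothetical. It assumes the job parameter is pasted into the command as plain text, so the parameter value needs to end with a path separator for files inside the folder to match.

```python
import glob
import os
import tempfile


def list_xml_paths(filepath_param, pattern="*oks.XML"):
    """Mimic `ls #Filepath#*oks.XML`: substitute the parameter, then expand the wildcard."""
    # The parameter value is concatenated with the pattern as plain text;
    # no path separator is added automatically.
    return sorted(glob.glob(filepath_param + pattern))


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as folder:
        # Hypothetical sample files, created only for this demonstration.
        for name in ("Books.XML", "Notebooks.XML", "Authors.XML"):
            open(os.path.join(folder, name), "w").close()

        # Assumption: the parameter value ends with a path separator.
        for path in list_xml_paths(folder + os.sep):
            print(path)
```

Each printed line corresponds to one row that the External Source stage would pass downstream, with the full path in a single column.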

Step #3: To read this file path we need to define metadata as shown below.

Define Column

After defining the metadata, click View Data at the upper right corner of the window above. It will show the following output.

C:\Study\Project\Books.XML

This is the file we will use in the next part of the job. In this way we have used the External Source stage to read the file path of an XML file. Our next task is to convert the data in this XML file into tabular form.

Step #4: To load the XML file above, we first need to define a table definition. Use the following procedure to create the XML table definition.

  1. Click Import -> Table Definition -> XML Table Definitions from the menu bar.
  2. The XML Meta Data Importer window will pop up.
  3. Open the XML file through File -> Open -> Required XML File.
  4. The column names will appear.
  5. Expand each column and tick the check box against Text.
  6. In the Table Definition pane, the Description field gives the address (path) of each column within the XML document.
  7. Save this as BooksTD.
  8. Refer to the image below for a better understanding.

XML Table Definition
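The idea behind the table definition, where each column is tied to an address inside the XML document, can be sketched in Python as follows. This is a rough illustration under stated assumptions, not the stage's actual implementation: the article does not show the contents of Books.XML, so the sample document, the repeating element book, and the column names id, title, and price are all hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document; the real Books.XML is not shown in the article.
SAMPLE_XML = """<books>
  <book><id>1</id><title>First Book</title><price>10.50</price></book>
  <book><id>2</id><title>Second Book</title><price>7.25</price></book>
</books>"""

# Assumed repeating element: one output row is produced per occurrence.
REPEATING_ELEMENT = "book"

# Assumed mapping of output column name -> path relative to the repeating element.
COLUMN_PATHS = {"id": "id", "title": "title", "price": "price"}


def xml_to_rows(xml_text, repeating_element=REPEATING_ELEMENT, column_paths=COLUMN_PATHS):
    """Return one dict per repeating element, reading the text at each column's path."""
    root = ET.fromstring(xml_text)
    rows = []
    for element in root.iter(repeating_element):
        rows.append({column: element.findtext(path) for column, path in column_paths.items()})
    return rows


if __name__ == "__main__":
    for row in xml_to_rows(SAMPLE_XML):
        print(row)
```

If the repeating element is not present in the document, the function returns no rows at all, which is the same symptom as a job that runs cleanly but produces an empty output.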


Step #5: Now we have to configure the XML Input stage. Double-click it and the following window will pop up.

Configure XML Input Stage

Set XML Source Column to File_name, the column we declared in the metadata of the External Source stage in Step #3. It is marked by the green box in the image above.

Because the External Source stage outputs an XML file path rather than XML text, select URL/File path, as marked by the red box in the image above.
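The difference that the URL/File path option makes can be sketched in Python like this. It is a hedged illustration only: the function parse_source_column is hypothetical, and the alternative mode, in which the column carries the XML text itself, is assumed here purely for contrast.

```python
import os
import tempfile
import xml.etree.ElementTree as ET


def parse_source_column(value, is_file_path):
    """Return the XML root element for one incoming row."""
    if is_file_path:
        # The column holds a path, so open and parse the file it points to.
        return ET.parse(value).getroot()
    # Otherwise the column value is treated as the XML document itself.
    return ET.fromstring(value)


if __name__ == "__main__":
    xml_text = "<books><book><id>1</id></book></books>"  # hypothetical content
    with tempfile.TemporaryDirectory() as folder:
        path = os.path.join(folder, "Books.XML")
        with open(path, "w", encoding="utf-8") as handle:
            handle.write(xml_text)

        from_path = parse_source_column(path, is_file_path=True)
        from_text = parse_source_column(xml_text, is_file_path=False)
        print(from_path.tag, from_text.tag)
```

Passing a file path while the stage expects XML text (or the reverse) would fail at parse time, which is why the setting has to match what the upstream stage actually emits.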

Step #6: This step is to load the XML table definition which we have created in Step #4.

On the Output tab we need to define the columns. To do this, click the Load button at the bottom and load the XML table definition we saved earlier, i.e. BooksTD. After loading it you can see the metadata, as shown in the following two images.

Load XML Table Definition

Load XML Table Definition 1

Step #7: Compile and run the job. View the data on the target Data Set file and you will see the contents of the XML file in tabular form.
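For readers who want to sanity-check the expected result outside DataStage, the whole flow (list the files, read each path, emit one row per repeating element) can be approximated with the short Python sketch below. Everything specific in it is an assumption: the repeating element book, the sample content, and the use of CSV text as a stand-in for the target Data Set.

```python
import csv
import glob
import io
import os
import tempfile
import xml.etree.ElementTree as ET


def run_job(filepath_param, pattern="*oks.XML", repeating_element="book"):
    """List matching XML files, parse each one, and collect one row per repeating element."""
    rows = []
    for path in sorted(glob.glob(filepath_param + pattern)):  # file listing step
        root = ET.parse(path).getroot()                       # read the file at that path
        for element in root.iter(repeating_element):
            # Each child element of the repeating element becomes one column.
            rows.append({child.tag: (child.text or "") for child in element})
    return rows


def rows_to_csv(rows):
    """Render the rows as CSV text (a stand-in for viewing the target data)."""
    if not rows:
        return ""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=list(rows[0].keys()), lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as folder:
        # Hypothetical Books.XML content, created only for this demonstration.
        with open(os.path.join(folder, "Books.XML"), "w", encoding="utf-8") as handle:
            handle.write("<books><book><id>1</id><title>First Book</title></book></books>")
        print(rows_to_csv(run_job(folder + os.sep)))
```

Comparing this kind of output with what View Data shows on the target can help confirm that the column paths and the repeating element were picked correctly.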

Hope this tutorial to configure XML Input Stage in Datastage is useful to you.
 
Explore the External Source stage further by practicing with different scenarios. Best of luck.


7 Responses to “Configure XML Input Stage with EXTERNAL SOURCE Stage”

  1. Girish says:

    XML_Input_60,0: Error occurred in call to ORPHCallActivePluginInitialize().
    i am getting this error. Do u know why? Do i need to have any permission to process xml file in xml input stage?

    • admin says:

      This error can occur if the key is mapped incorrectly. Also, please check the metadata definition for the XML.

  2. jp says:

    hey my job was executed without any errors but I am not getting the output , i.e tabular form of my XML file. It shows 0 rows selected from the xml input stage.

    Thanks

  3. Fanny says:

    this ”ls #Filepath#*oks.XML” is only for unix how about datastage which is installed in windows?
    THANK YOU

  4. Jeya Praveena J says:

    Please choose the repetition element you need and set Key to Yes for that column in the column values. For example, here Id is the repetition column, so we need to set its Key option to Yes instead of No.

  5. Jeya Praveena J says:

    It works fine for me after selecting the Id column as YES(In Key column).Pls let me know if that works fine


© 2017 Database ETL. All rights reserved.