GET EXCEL FIRST SHEET NAME USING UNSTRUCTURED DATA STAGE

We know we can give hard coded sheet name or parameterized sheet name while reading excel data from unstructured data stage.

But the problem comes when you want to read data from excel in specific sheet number such as first sheet or second sheet and so on. If you know sheet name, you can directly hard code sheet name there. But if sheet name is changing everytime, there is no direct option to read data from specific sheet.

I will show you how to read specific sheet name using a parallel job and you can pass this input to next parallel job to read it.

Below parallel job will demonstratie how to read sheet name.

excel-first-sheet-name-parallel-datastage-job

Of course there is no other option to read excel sheet unless we use unstructured data stage. I used this stage only to read excel workbook.

Step 1: Click on ‘Configure’ in unstructured data stage

unstructured-data-stage-configure

Step 2: Select ‘Sheet name’ from the Import properties as shown below.

unstructured-data-stage-sheet-name-property

Step 3: Do not select any other columns in the Import, we are not reading entire excel sheet here. Keep all the options exactly like below. Specify the range according to your excel sheet metadata.

unstructured-data-stage-configure-window

Step 4: ‘SheetName’ column will be there in Columns tab, it comes automatically.

sheet-name-column-tab

Step 5: Use constraint in transformer stage as like below.

@INROWNUM=1

If you are using more than one node to run this job, please run the job in sequential mode. Otherwise you may not see accurate result for @INROWNUM system variable.

transformer-stage-to-resrtict-by-inrownum

That’s it!

Just write this FirstSheetName into one flat file and read that using execute command stage in sequence. Pass that value to your parallel job as a parameter.

If stage is not reading your sheet name, that means we don’t have any data in excel sheet. That being said, unstructured data stage reads ‘SheetName’ from non-empty sheets only.

You can parameterize the sheet name in ‘Range expression’ field in unstructured data stage.

Thanks for reading! Let me now your thoughts on this by commenting below.

Comments

comments

Leave a Reply