DATASTAGE SCENARIO BASED QUESTION – 2 : CITY NAMES AND DISTANCE BETWEEN THEM : PROBLEM AND SOLUTION

Problem:

I have a file that contains city names and the distances between them. Each pair of cities appears twice, once in each direction (for example HYD,CHN and CHN,HYD). We need to remove these duplicates by using stage variables in a Transformer stage.

Input file:

SOURCE,DESTINATION,DISTANCE
HYD,CHN,500
CHN,HYD,500
BANG,HYD,600
HYD,BANG,600
PUN,HYD,750
HYD,PUN,750
CHN,BANG,500
BANG,CHN,500

Expected output:

HYD,CHN,500
BANG,HYD,600
PUN,HYD,750
CHN,BANG,500

Solution:
Read the file with a Sequential File stage.

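Next, add a Transformer stage and set its execution mode to sequential. The logic compares each record with the one before it, so all records must pass through a single instance of the stage in their original order.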

Declare two stage variables:
Flag  : Holds a yes/no indicator for whether to keep the record. I use 1 (keep) and 0 (drop) for this purpose.
city1 : Buffer variable that holds the concatenated SOURCE and DESTINATION fields of the previous record.
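The stage variable derivations are:
Flag = If input.DESTINATION:input.SOURCE=city1 Then 0 Else 1
city1 = input.SOURCE:input.DESTINATION
Keep the variables in this order. Stage variables are evaluated top to bottom, so Flag is derived while city1 still holds the previous record's value. Use Flag = 1 as the output link constraint so that only the first record of each pair is passed on. This approach works only when the two records of a pair are adjacent, as they are in the input file.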
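
To make the previous-record comparison easier to follow, here is a rough Python simulation of the two stage variables. It is only a sketch, not DataStage code: the function name `drop_reversed_neighbours` and the inline copy of the sample data are assumptions made for this illustration, and the `if flag == 1` check stands in for an output link constraint.

```python
import csv
import io

# Inline copy of the sample input file (assumed here for illustration).
RAW = """SOURCE,DESTINATION,DISTANCE
HYD,CHN,500
CHN,HYD,500
BANG,HYD,600
HYD,BANG,600
PUN,HYD,750
HYD,PUN,750
CHN,BANG,500
BANG,CHN,500
"""


def drop_reversed_neighbours(rows):
    """Keep a row unless it is the reverse of the row directly before it."""
    city1 = ""  # mimics the city1 stage variable (previous SOURCE:DESTINATION)
    kept = []
    for source, destination, distance in rows:
        # Flag is derived first, while city1 still holds the previous row.
        flag = 0 if destination + source == city1 else 1
        # city1 is updated afterwards, ready for the next row.
        city1 = source + destination
        if flag == 1:  # stands in for the output link constraint
            kept.append((source, destination, distance))
    return kept


reader = csv.reader(io.StringIO(RAW))
next(reader)  # skip the header line
# strip() guards against stray spaces around a field value.
rows = [tuple(field.strip() for field in record) for record in reader]

result = drop_reversed_neighbours(rows)
for row in result:
    print(",".join(row))
```

Because only the immediately preceding row is remembered, a reversed pair that is separated by other rows would not be caught by this logic.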

Compile and run the job.

After the run, the output contains the four unique records listed under Expected output.
