DATASTAGE SCENARIO BASED QUESTION – 2 : CITY NAMES AND DISTANCE BETWEEN THEM : PROBLEM AND SOLUTION

Problem:

I have one file that contains city names and the distances between them. Each route appears twice, once in each direction (for example HYD,CHN and CHN,HYD). We need to remove these duplicates by using stage variables in a Transformer stage.

Input file:

SOURCE,DESTINATION,DISTANCE
HYD,CHN,500
CHN,HYD,500
BANG,HYD,600
HYD,BANG,600
PUN,HYD,750
HYD,PUN,750
CHN,BANG,500
BANG,CHN,500

Expected output:

HYD,CHN,500
BANG,HYD,600
PUN,HYD,750
CHN,BANG,500
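
As a reference check of what "duplicate" means here, the following minimal Python sketch treats each SOURCE/DESTINATION pair as unordered and keeps the first record seen for each pair. It is not the Transformer solution; the function name `unique_routes` and the in-memory `rows` list are assumptions made for illustration.

```python
# Sample records from the input file (header row omitted).
rows = [
    ("HYD", "CHN", 500),
    ("CHN", "HYD", 500),
    ("BANG", "HYD", 600),
    ("HYD", "BANG", 600),
    ("PUN", "HYD", 750),
    ("HYD", "PUN", 750),
    ("CHN", "BANG", 500),
    ("BANG", "CHN", 500),
]


def unique_routes(records):
    """Keep the first record for each unordered (source, destination) pair."""
    seen = set()
    kept = []
    for source, destination, distance in records:
        # frozenset ignores direction, so HYD->CHN and CHN->HYD share one key.
        key = frozenset((source, destination))
        if key not in seen:
            seen.add(key)
            kept.append((source, destination, distance))
    return kept


for source, destination, distance in unique_routes(rows):
    print(f"{source},{destination},{distance}")
```

For the sample data this keeps the same four records listed in the expected output.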

Solution:
Read the file with a Sequential File stage.

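Next, add a Transformer stage and set its execution mode to sequential. The stage variables compare each record with the one before it, so all records must pass through a single partition in their original order.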

Declare two stage variables:
Flag  : Marks whether the current record should be kept. It holds 1 (keep) or 0 (duplicate) in place of YES/NO.
city1 : Buffer variable that holds the Source and Destination fields of the previous record, concatenated.
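The stage variable derivations are as follows. Flag must be listed above city1: stage variables are evaluated from top to bottom, so Flag is compared against the previous record's value before city1 is overwritten with the current record.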
Flag = If input.DESTINATION:input.SOURCE=city1 Then 0 Else 1
city1 = input.SOURCE:input.DESTINATION
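
To make the row-by-row behaviour concrete, here is a small Python sketch that mimics the two stage variables outside DataStage. It is only an illustration under stated assumptions: city1 is assumed to start as an empty string, field values are trimmed of surrounding spaces, and only records with Flag = 1 are kept. The function name `dedupe_adjacent_reversed` is hypothetical.

```python
import csv
import io

INPUT = """SOURCE,DESTINATION,DISTANCE
HYD,CHN,500
CHN,HYD,500
BANG,HYD,600
HYD,BANG,600
PUN,HYD,750
HYD,PUN,750
CHN,BANG,500
BANG,CHN,500
"""


def dedupe_adjacent_reversed(records):
    """Mimic the Flag and city1 stage variables, evaluated in that order."""
    city1 = ""  # assumed initial value of the buffer variable
    kept = []
    for source, destination, distance in records:
        # Flag is derived first, while city1 still holds the previous record.
        flag = 0 if destination + source == city1 else 1
        # city1 is then overwritten with the current SOURCE:DESTINATION.
        city1 = source + destination
        if flag == 1:  # assumed filter: pass only records flagged 1
            kept.append((source, destination, distance))
    return kept


reader = csv.reader(io.StringIO(INPUT))
next(reader)  # skip the header row
records = [tuple(field.strip() for field in row) for row in reader]
result = dedupe_adjacent_reversed(records)

for row in result:
    print(",".join(row))
```

Because each record is compared only with the one immediately before it, this logic removes a reversed pair only when the two records are adjacent, as they are in the sample file.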

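Use Flag in the output link constraint (Flag = 1) so that only the first record of each pair is passed on, then compile and run the job.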

After the run, the output contains one record per city pair, matching the expected output above.
