Pig lets you remove unwanted records from a data set based on a condition, using the FILTER operator. It works much like the WHERE clause in SQL. The syntax of the FILTER operator is shown below:
<new relation> = FILTER <relation> BY <condition>;
Here, relation is the data set to which the filter is applied, condition is the filter condition, and new relation is the relation created after filtering. Only the rows for which the condition evaluates to true are kept in the new relation.
Pig Filter Examples:
Let's consider the sales data set below as an example:
year,product,quantity
---------------------
2000, iphone, 1000
2001, iphone, 1500
2002, iphone, 2000
2000, nokia, 1200
2001, nokia, 1500
2002, nokia, 900
1. Select products whose quantity is greater than or equal to 1000.
grunt> A = LOAD '/user/hadoop/sales' USING PigStorage(',') AS (year:int,product:chararray,quantity:int);
grunt> B = FILTER A BY quantity >= 1000;
grunt> DUMP B;
(2000,iphone,1000)
(2001,iphone,1500)
(2002,iphone,2000)
(2000,nokia,1200)
(2001,nokia,1500)
2. Select products whose quantity is greater than 1000 and whose year is 2001.
grunt> C = FILTER A BY quantity > 1000 AND year == 2001;
grunt> DUMP C;
(2001,iphone,1500)
(2001,nokia,1500)
3. Select products whose year is not 2000.
grunt> D = FILTER A BY year != 2000;
grunt> DUMP D;
(2001,iphone,1500)
(2002,iphone,2000)
(2001,nokia,1500)
(2002,nokia,900)
You can use all the logical operators (NOT, AND, OR) and relational operators (<, >, ==, !=, >=, <=) in the filter conditions.
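The examples above only combine conditions with AND. As a minimal sketch of how OR and NOT might be used on the same sales data, consider the script below. The relation names E and F are illustrative and not part of the original examples, and the expected rows in the comments assume the file holds exactly the six records listed earlier.

```pig
-- Load the same sales file used in the examples above.
A = LOAD '/user/hadoop/sales' USING PigStorage(',') AS (year:int,product:chararray,quantity:int);

-- OR: keep rows from the year 2000, or rows with a quantity below 1000.
-- Expected rows: (2000,iphone,1000) (2000,nokia,1200) (2002,nokia,900)
E = FILTER A BY (year == 2000) OR (quantity < 1000);
DUMP E;

-- NOT combined with AND: keep rows that are not from 2001
-- and whose quantity is at least 1200.
-- Expected rows: (2002,iphone,2000) (2000,nokia,1200)
F = FILTER A BY (NOT (year == 2001)) AND (quantity >= 1200);
DUMP F;
```

The parentheses around each sub-condition are optional here, but they make the grouping explicit when several operators appear in one condition.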