HBase Learning Path (6): Filters

Filters (Filter)

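The basic query API falls short when faced with large volumes of data, so HBase provides a more advanced query mechanism: the Filter. A Filter can narrow results by column family, column, version, and other conditions. Because HBase keeps data sorted along three dimensions (row key, column, and version), filters can do this work efficiently. An RPC query that carries a Filter ships the Filter to each RegionServer, where it is evaluated server-side; this also reduces the amount of data sent over the network.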

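A filtering operation needs at least two parameters. The first is an abstract operator; HBase provides an enum for these: LESS, LESS_OR_EQUAL, EQUAL, NOT_EQUAL, and so on. The second is a concrete comparator (Comparator), which carries the actual comparison logic, such as byte-level or string-level comparison. With these two parameters we can state a filtering condition precisely and filter the data.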

Abstract operators (comparison operators)

LESS <

LESS_OR_EQUAL <=

EQUAL =

NOT_EQUAL <>

GREATER_OR_EQUAL >=

GREATER >

NO_OP excludes everything
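
To make the operator list concrete, here is a minimal plain-Java sketch of how an operator can turn a three-way comparison result into a keep/drop decision. The names `CompareOpModel`, `Op`, and `keeps` are hypothetical and exist only for illustration; this is not HBase's `CompareOp` implementation. The sketch assumes the comparison result follows the usual `compareTo` sign convention, with the cell data on the left and the filter's reference value on the right.

```java
/** Plain-Java model of the comparison operators; illustrative only, not HBase code. */
class CompareOpModel {

    enum Op {
        LESS, LESS_OR_EQUAL, EQUAL, NOT_EQUAL, GREATER_OR_EQUAL, GREATER, NO_OP;

        /**
         * @param cmp result of compare(cellData, referenceValue): negative, zero, or positive
         * @return true if a cell with this comparison result is kept
         */
        boolean keeps(int cmp) {
            switch (this) {
                case LESS:             return cmp < 0;
                case LESS_OR_EQUAL:    return cmp <= 0;
                case EQUAL:            return cmp == 0;
                case NOT_EQUAL:        return cmp != 0;
                case GREATER_OR_EQUAL: return cmp >= 0;
                case GREATER:          return cmp > 0;
                default:               return false; // NO_OP keeps nothing
            }
        }
    }

    public static void main(String[] args) {
        // "95008" sorts after "95007", so the comparison result is positive.
        int cmp = "95008".compareTo("95007");
        System.out.println("GREATER keeps it: " + Op.GREATER.keeps(cmp));
        System.out.println("LESS keeps it:    " + Op.LESS.keeps(cmp));
        System.out.println("NO_OP keeps it:   " + Op.NO_OP.keeps(cmp));
    }
}
```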

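Comparators (specify the comparison mechanism)

BinaryComparator compares the given byte array in byte-index order, using Bytes.compareTo(byte[])

BinaryPrefixComparator same as above, but only checks whether the leading (left-hand) bytes match

NullComparator checks whether the given value is null

BitComparator compares bit by bit

RegexStringComparator a regular-expression comparator; supports only EQUAL and NOT_EQUAL

SubstringComparator checks whether the given substring occurs in the value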
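
The matching rules above can be approximated in plain Java, which may help when predicting what a filter will return. The sketch below is a model under stated assumptions, not HBase code: `ComparatorModel` and its methods are hypothetical names, the substring match is modeled as case-insensitive, and the regex match is assumed to succeed when the pattern is found anywhere in the value.

```java
import java.nio.charset.StandardCharsets;
import java.util.Locale;
import java.util.regex.Pattern;

/** Plain-Java approximations of the comparators' matching rules; illustrative only. */
class ComparatorModel {

    static byte[] utf8(String s) {
        return s.getBytes(StandardCharsets.UTF_8);
    }

    /** Unsigned lexicographic comparison of two byte arrays (BinaryComparator-style). */
    static int compareBytes(byte[] a, byte[] b) {
        int n = Math.min(a.length, b.length);
        for (int i = 0; i < n; i++) {
            int x = a[i] & 0xff; // treat bytes as unsigned
            int y = b[i] & 0xff;
            if (x != y) {
                return x - y;
            }
        }
        return a.length - b.length; // on a tie, the shorter array sorts first
    }

    /** Prefix match: only the first prefix.length bytes of data are compared. */
    static boolean hasPrefix(byte[] data, byte[] prefix) {
        if (data.length < prefix.length) {
            return false;
        }
        for (int i = 0; i < prefix.length; i++) {
            if (data[i] != prefix[i]) {
                return false;
            }
        }
        return true;
    }

    /** Substring match on the decoded value (assumption: case-insensitive). */
    static boolean containsSubstring(byte[] data, String needle) {
        String value = new String(data, StandardCharsets.UTF_8);
        return value.toLowerCase(Locale.ROOT).contains(needle.toLowerCase(Locale.ROOT));
    }

    /** Regex match on the decoded value (assumption: a match anywhere counts). */
    static boolean matchesRegex(byte[] data, String regex) {
        String value = new String(data, StandardCharsets.UTF_8);
        return Pattern.compile(regex).matcher(value).find();
    }

    public static void main(String[] args) {
        System.out.println(compareBytes(utf8("9501"), utf8("95007")) > 0);
        System.out.println(hasPrefix(utf8("95012"), utf8("9501")));
        System.out.println(containsSubstring(utf8("Liu Chen"), "chen"));
        System.out.println(matchesRegex(utf8("95012"), "^950\\d"));
    }
}
```

Note that the byte comparison is not numeric: "9501" sorts after "95007" because the fourth byte decides the result.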

Categories of HBase filters

Comparison filters

1. Row key filter: RowFilter

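Filter rowFilter = new RowFilter(CompareOp.GREATER, new BinaryComparator(Bytes.toBytes("95007")));
scan.setFilter(rowFilter);

Full example:

import java.util.List;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.Cell;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.client.Connection;
import org.apache.hadoop.hbase.client.ConnectionFactory;
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.client.ResultScanner;
import org.apache.hadoop.hbase.client.Scan;
import org.apache.hadoop.hbase.client.Table;
import org.apache.hadoop.hbase.filter.BinaryComparator;
import org.apache.hadoop.hbase.filter.CompareFilter.CompareOp;
import org.apache.hadoop.hbase.filter.Filter;
import org.apache.hadoop.hbase.filter.RowFilter;
import org.apache.hadoop.hbase.util.Bytes;

public class HbaseFilterTest {

    private static final String ZK_CONNECT_KEY = "hbase.zookeeper.quorum";
    private static final String ZK_CONNECT_VALUE = "hadoop1:2181,hadoop2:2181,hadoop3:2181";

    public static void main(String[] args) throws Exception {
        Configuration conf = HBaseConfiguration.create();
        conf.set(ZK_CONNECT_KEY, ZK_CONNECT_VALUE);

        // try-with-resources closes the connection, table, and scanner
        try (Connection conn = ConnectionFactory.createConnection(conf);
             Table table = conn.getTable(TableName.valueOf("student"))) {
            Scan scan = new Scan();
            // Keep only rows whose row key sorts after "95007" (byte-wise comparison).
            // CompareOp is the HBase 1.x enum; HBase 2.x deprecates it in favor of CompareOperator.
            Filter rowFilter = new RowFilter(CompareOp.GREATER, new BinaryComparator(Bytes.toBytes("95007")));
            scan.setFilter(rowFilter);

            try (ResultScanner resultScanner = table.getScanner(scan)) {
                for (Result result : resultScanner) {
                    List<Cell> cells = result.listCells();
                    for (Cell cell : cells) {
                        System.out.println(cell);
                    }
                }
            }
        }
    }
}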

[Screenshot: part of the run output]
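
Because the RowFilter above compares row keys as bytes, the result follows lexicographic order rather than numeric order. The following plain-Java sketch illustrates that with a sorted set of strings standing in for row keys. The class `RowKeyOrderModel` and the sample keys are hypothetical (the actual keys in the `student` table are not shown here), and `String` ordering is used as a stand-in that matches byte order only for ASCII keys such as these.

```java
import java.util.Arrays;
import java.util.Collection;
import java.util.List;
import java.util.NavigableSet;
import java.util.TreeSet;

/** Models RowFilter(GREATER, "95007") over ASCII row keys; illustrative only. */
class RowKeyOrderModel {

    /** Returns the keys that sort strictly after the bound, in sorted order. */
    static NavigableSet<String> greaterThan(Collection<String> keys, String bound) {
        // TreeSet sorts lexicographically, like HBase sorts ASCII row keys
        return new TreeSet<>(keys).tailSet(bound, false);
    }

    public static void main(String[] args) {
        List<String> keys = Arrays.asList("95001", "95007", "95008", "9501", "95010");
        // "9501" is kept although 9501 < 95007 numerically: its fourth byte '1' > '0'
        System.out.println(greaterThan(keys, "95007")); // prints [95008, 9501, 95010]
    }
}
```

Fixed-width row keys avoid this surprise, since lexicographic and numeric order then agree.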

2. Column family filter: FamilyFilter

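Filter familyFilter = new FamilyFilter(CompareOp.EQUAL, new BinaryComparator(Bytes.toBytes("info")));
scan.setFilter(familyFilter);

Full example:

import java.util.List;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.Cell;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.client.Connection;
import org.apache.hadoop.hbase.client.ConnectionFactory;
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.client.ResultScanner;
import org.apache.hadoop.hbase.client.Scan;
import org.apache.hadoop.hbase.client.Table;
import org.apache.hadoop.hbase.filter.BinaryComparator;
import org.apache.hadoop.hbase.filter.CompareFilter.CompareOp;
import org.apache.hadoop.hbase.filter.FamilyFilter;
import org.apache.hadoop.hbase.filter.Filter;
import org.apache.hadoop.hbase.util.Bytes;

public class HbaseFilterTest {

    private static final String ZK_CONNECT_KEY = "hbase.zookeeper.quorum";
    private static final String ZK_CONNECT_VALUE = "hadoop1:2181,hadoop2:2181,hadoop3:2181";

    public static void main(String[] args) throws Exception {
        Configuration conf = HBaseConfiguration.create();
        conf.set(ZK_CONNECT_KEY, ZK_CONNECT_VALUE);

        // try-with-resources closes the connection, table, and scanner
        try (Connection conn = ConnectionFactory.createConnection(conf);
             Table table = conn.getTable(TableName.valueOf("student"))) {
            Scan scan = new Scan();
            // Keep only cells that belong to the "info" column family
            Filter familyFilter = new FamilyFilter(CompareOp.EQUAL, new BinaryComparator(Bytes.toBytes("info")));
            scan.setFilter(familyFilter);

            try (ResultScanner resultScanner = table.getScanner(scan)) {
                for (Result result : resultScanner) {
                    List<Cell> cells = result.listCells();
                    for (Cell cell : cells) {
                        System.out.println(cell);
                    }
                }
            }
        }
    }
}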

3. Column (qualifier) filter: QualifierFilter

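Filter qualifierFilter = new QualifierFilter(CompareOp.EQUAL, new BinaryComparator(Bytes.toBytes("name")));
scan.setFilter(qualifierFilter);

Full example:

import java.util.List;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.Cell;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.client.Connection;
import org.apache.hadoop.hbase.client.ConnectionFactory;
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.client.ResultScanner;
import org.apache.hadoop.hbase.client.Scan;
import org.apache.hadoop.hbase.client.Table;
import org.apache.hadoop.hbase.filter.BinaryComparator;
import org.apache.hadoop.hbase.filter.CompareFilter.CompareOp;
import org.apache.hadoop.hbase.filter.Filter;
import org.apache.hadoop.hbase.filter.QualifierFilter;
import org.apache.hadoop.hbase.util.Bytes;

public class HbaseFilterTest {

    private static final String ZK_CONNECT_KEY = "hbase.zookeeper.quorum";
    private static final String ZK_CONNECT_VALUE = "hadoop1:2181,hadoop2:2181,hadoop3:2181";

    public static void main(String[] args) throws Exception {
        Configuration conf = HBaseConfiguration.create();
        conf.set(ZK_CONNECT_KEY, ZK_CONNECT_VALUE);

        // try-with-resources closes the connection, table, and scanner
        try (Connection conn = ConnectionFactory.createConnection(conf);
             Table table = conn.getTable(TableName.valueOf("student"))) {
            Scan scan = new Scan();
            // Keep only cells whose column qualifier is exactly "name"
            Filter qualifierFilter = new QualifierFilter(CompareOp.EQUAL, new BinaryComparator(Bytes.toBytes("name")));
            scan.setFilter(qualifierFilter);

            try (ResultScanner resultScanner = table.getScanner(scan)) {
                for (Result result : resultScanner) {
                    List<Cell> cells = result.listCells();
                    for (Cell cell : cells) {
                        System.out.println(cell);
                    }
                }
            }
        }
    }
}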

4. Value filter: ValueFilter

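Filter valueFilter = new ValueFilter(CompareOp.EQUAL, new SubstringComparator("男"));
scan.setFilter(valueFilter);

Full example:

import java.util.List;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.Cell;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.client.Connection;
import org.apache.hadoop.hbase.client.ConnectionFactory;
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.client.ResultScanner;
import org.apache.hadoop.hbase.client.Scan;
import org.apache.hadoop.hbase.client.Table;
import org.apache.hadoop.hbase.filter.CompareFilter.CompareOp;
import org.apache.hadoop.hbase.filter.Filter;
import org.apache.hadoop.hbase.filter.SubstringComparator;
import org.apache.hadoop.hbase.filter.ValueFilter;

public class HbaseFilterTest {

    private static final String ZK_CONNECT_KEY = "hbase.zookeeper.quorum";
    private static final String ZK_CONNECT_VALUE = "hadoop1:2181,hadoop2:2181,hadoop3:2181";

    public static void main(String[] args) throws Exception {
        Configuration conf = HBaseConfiguration.create();
        conf.set(ZK_CONNECT_KEY, ZK_CONNECT_VALUE);

        // try-with-resources closes the connection, table, and scanner
        try (Connection conn = ConnectionFactory.createConnection(conf);
             Table table = conn.getTable(TableName.valueOf("student"))) {
            Scan scan = new Scan();
            // Keep only cells whose value contains "男" (male)
            Filter valueFilter = new ValueFilter(CompareOp.EQUAL, new SubstringComparator("男"));
            scan.setFilter(valueFilter);

            try (ResultScanner resultScanner = table.getScanner(scan)) {
                for (Result result : resultScanner) {
                    List<Cell> cells = result.listCells();
                    for (Cell cell : cells) {
                        System.out.println(cell);
                    }
                }
            }
        }
    }
}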
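
One point worth making explicit is that ValueFilter works cell by cell: only the cells whose value matches come back, not the rest of the row. The plain-Java sketch below models that behavior with a map from column name to value. `ValueFilterModel`, its method, and the sample row are hypothetical, and the comparator is represented by an ordinary predicate rather than an HBase class.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.Predicate;

/** Models cell-level value filtering on a single row; illustrative only. */
class ValueFilterModel {

    /** Keeps only the cells (column -> value) whose value satisfies the predicate. */
    static Map<String, String> filterCells(Map<String, String> row, Predicate<String> valueTest) {
        Map<String, String> kept = new LinkedHashMap<>();
        for (Map.Entry<String, String> cell : row.entrySet()) {
            if (valueTest.test(cell.getValue())) {
                kept.put(cell.getKey(), cell.getValue());
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        Map<String, String> row = new LinkedHashMap<>();
        row.put("info:name", "Liu Chen");
        row.put("info:age", "19");
        row.put("info:dept", "IS");

        // Only the matching cell survives; the other columns of the row are dropped
        System.out.println(filterCells(row, value -> value.contains("Liu")));
    }
}
```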

5. Timestamp filter: TimestampsFilter

List<Long> list = new ArrayList<>();
list.add(1522469029503L);
TimestampsFilter timestampsFilter = new TimestampsFilter(list);
scan.setFilter(timestampsFilter);

Full example:

import java.util.ArrayList;
import java.util.List;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.Cell;
import org.apache.hadoop.hbase.CellUtil;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.client.Connection;
import org.apache.hadoop.hbase.client.ConnectionFactory;
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.client.ResultScanner;
import org.apache.hadoop.hbase.client.Scan;
import org.apache.hadoop.hbase.client.Table;
import org.apache.hadoop.hbase.filter.TimestampsFilter;
import org.apache.hadoop.hbase.util.Bytes;

public class HbaseFilterTest {

    private static final String ZK_CONNECT_KEY = "hbase.zookeeper.quorum";
    private static final String ZK_CONNECT_VALUE = "hadoop1:2181,hadoop2:2181,hadoop3:2181";

    public static void main(String[] args) throws Exception {
        Configuration conf = HBaseConfiguration.create();
        conf.set(ZK_CONNECT_KEY, ZK_CONNECT_VALUE);

        // try-with-resources closes the connection, table, and scanner
        try (Connection conn = ConnectionFactory.createConnection(conf);
             Table table = conn.getTable(TableName.valueOf("student"))) {
            Scan scan = new Scan();
            // Keep only cells whose timestamp is in the given list
            List<Long> list = new ArrayList<>();
            list.add(1522469029503L);
            TimestampsFilter timestampsFilter = new TimestampsFilter(list);
            scan.setFilter(timestampsFilter);

            try (ResultScanner resultScanner = table.getScanner(scan)) {
                for (Result result : resultScanner) {
                    List<Cell> cells = result.listCells();
                    for (Cell cell : cells) {
                        // CellUtil.cloneXxx replaces the deprecated cell.getRow()/getFamily()/getQualifier()/getValue()
                        System.out.println(Bytes.toString(CellUtil.cloneRow(cell)) + "\t"
                                + Bytes.toString(CellUtil.cloneFamily(cell)) + "\t"
                                + Bytes.toString(CellUtil.cloneQualifier(cell)) + "\t"
                                + Bytes.toString(CellUtil.cloneValue(cell)) + "\t"
                                + cell.getTimestamp());
                    }
                }
            }
        }
    }
}

Dedicated filters

1. Single column value filter: SingleColumnValueFilter (returns the entire row when the condition is met)

SingleColumnValueFilter singleColumnValueFilter = new SingleColumnValueFilter(
        Bytes.toBytes("info"), // column family
        Bytes.toBytes("name"), // column
        CompareOp.EQUAL,
        new SubstringComparator("刘晨"));
// If this is not set to true, rows that lack the specified column are returned as well
singleColumnValueFilter.setFilterIfMissing(true);
scan.setFilter(singleColumnValueFilter);

Full example:

import java.util.List;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.Cell;
import org.apache.hadoop.hbase.CellUtil;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.client.Connection;
import org.apache.hadoop.hbase.client.ConnectionFactory;
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.client.ResultScanner;
import org.apache.hadoop.hbase.client.Scan;
import org.apache.hadoop.hbase.client.Table;
import org.apache.hadoop.hbase.filter.CompareFilter.CompareOp;
import org.apache.hadoop.hbase.filter.SingleColumnValueFilter;
import org.apache.hadoop.hbase.filter.SubstringComparator;
import org.apache.hadoop.hbase.util.Bytes;

public class HbaseFilterTest2 {

    private static final String ZK_CONNECT_KEY = "hbase.zookeeper.quorum";
    private static final String ZK_CONNECT_VALUE = "hadoop1:2181,hadoop2:2181,hadoop3:2181";

    public static void main(String[] args) throws Exception {
        Configuration conf = HBaseConfiguration.create();
        conf.set(ZK_CONNECT_KEY, ZK_CONNECT_VALUE);

        // try-with-resources closes the connection, table, and scanner
        try (Connection conn = ConnectionFactory.createConnection(conf);
             Table table = conn.getTable(TableName.valueOf("student"))) {
            Scan scan = new Scan();
            // Return whole rows whose info:name value contains "刘晨"
            SingleColumnValueFilter singleColumnValueFilter = new SingleColumnValueFilter(
                    Bytes.toBytes("info"),
                    Bytes.toBytes("name"),
                    CompareOp.EQUAL,
                    new SubstringComparator("刘晨"));
            // Skip rows that do not have the info:name column at all
            singleColumnValueFilter.setFilterIfMissing(true);
            scan.setFilter(singleColumnValueFilter);

            try (ResultScanner resultScanner = table.getScanner(scan)) {
                for (Result result : resultScanner) {
                    List<Cell> cells = result.listCells();
                    for (Cell cell : cells) {
                        // CellUtil.cloneXxx replaces the deprecated cell.getRow()/getFamily()/getQualifier()/getValue()
                        System.out.println(Bytes.toString(CellUtil.cloneRow(cell)) + "\t"
                                + Bytes.toString(CellUtil.cloneFamily(cell)) + "\t"
                                + Bytes.toString(CellUtil.cloneQualifier(cell)) + "\t"
                                + Bytes.toString(CellUtil.cloneValue(cell)) + "\t"
                                + cell.getTimestamp());
                    }
                }
            }
        }
    }
}
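
The effect of `setFilterIfMissing` is easy to get wrong, so here is a small plain-Java model of the row-level decision. It is a sketch under assumptions, not HBase code: `SingleColumnValueModel`, its methods, and the three sample rows are hypothetical, and the comparator is represented by a plain predicate. A row is kept when the tested column matches; a row without that column is kept only when `filterIfMissing` is false.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Predicate;

/** Models the row-level decision of a single-column value filter; illustrative only. */
class SingleColumnValueModel {

    /** Returns the keys of the rows that are kept; a kept row comes back with all its columns. */
    static List<String> keptRows(Map<String, Map<String, String>> table, String column,
                                 Predicate<String> valueTest, boolean filterIfMissing) {
        List<String> kept = new ArrayList<>();
        for (Map.Entry<String, Map<String, String>> row : table.entrySet()) {
            String value = row.getValue().get(column);
            // Missing column: kept unless filterIfMissing is set
            boolean keep = (value == null) ? !filterIfMissing : valueTest.test(value);
            if (keep) {
                kept.add(row.getKey());
            }
        }
        return kept;
    }

    /** Hypothetical sample data: row 95003 has no info:name column. */
    static Map<String, Map<String, String>> sampleTable() {
        Map<String, Map<String, String>> table = new LinkedHashMap<>();
        table.put("95001", row("info:name", "Li Yong", "info:age", "20"));
        table.put("95002", row("info:name", "Liu Chen", "info:age", "19"));
        table.put("95003", row("info:age", "18"));
        return table;
    }

    private static Map<String, String> row(String... columnValuePairs) {
        Map<String, String> r = new LinkedHashMap<>();
        for (int i = 0; i < columnValuePairs.length; i += 2) {
            r.put(columnValuePairs[i], columnValuePairs[i + 1]);
        }
        return r;
    }

    public static void main(String[] args) {
        Predicate<String> containsLiuChen = value -> value.contains("Liu Chen");
        System.out.println(keptRows(sampleTable(), "info:name", containsLiuChen, false)); // prints [95002, 95003]
        System.out.println(keptRows(sampleTable(), "info:name", containsLiuChen, true));  // prints [95002]
    }
}
```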

2. Single column value exclude filter: SingleColumnValueExcludeFilter

SingleColumnValueExcludeFilter singleColumnValueExcludeFilter = new SingleColumnValueExcludeFilter(
        Bytes.toBytes("info"),
        Bytes.toBytes("name"),
        CompareOp.EQUAL,
        new SubstringComparator("刘晨"));
singleColumnValueExcludeFilter.setFilterIfMissing(true);

scan.setFilter(singleColumnValueExcludeFilter);

Full example:

import java.util.List;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.Cell;
import org.apache.hadoop.hbase.CellUtil;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.client.Connection;
import org.apache.hadoop.hbase.client.ConnectionFactory;
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.client.ResultScanner;
import org.apache.hadoop.hbase.client.Scan;
import org.apache.hadoop.hbase.client.Table;
import org.apache.hadoop.hbase.filter.CompareFilter.CompareOp;
import org.apache.hadoop.hbase.filter.SingleColumnValueExcludeFilter;
import org.apache.hadoop.hbase.filter.SubstringComparator;
import org.apache.hadoop.hbase.util.Bytes;

public class HbaseFilterTest2 {

    private static final String ZK_CONNECT_KEY = "hbase.zookeeper.quorum";
    private static final String ZK_CONNECT_VALUE = "hadoop1:2181,hadoop2:2181,hadoop3:2181";

    public static void main(String[] args) throws Exception {
        Configuration conf = HBaseConfiguration.create();
        conf.set(ZK_CONNECT_KEY, ZK_CONNECT_VALUE);

        // try-with-resources closes the connection, table, and scanner
        try (Connection conn = ConnectionFactory.createConnection(conf);
             Table table = conn.getTable(TableName.valueOf("student"))) {
            Scan scan = new Scan();
            // Match rows on info:name, but leave that column out of the returned row
            SingleColumnValueExcludeFilter singleColumnValueExcludeFilter = new SingleColumnValueExcludeFilter(
                    Bytes.toBytes("info"),
                    Bytes.toBytes("name"),
                    CompareOp.EQUAL,
                    new SubstringComparator("刘晨"));
            // Skip rows that do not have the info:name column at all
            singleColumnValueExcludeFilter.setFilterIfMissing(true);
            scan.setFilter(singleColumnValueExcludeFilter);

            try (ResultScanner resultScanner = table.getScanner(scan)) {
                for (Result result : resultScanner) {
                    List<Cell> cells = result.listCells();
                    for (Cell cell : cells) {
                        // CellUtil.cloneXxx replaces the deprecated cell.getRow()/getFamily()/getQualifier()/getValue()
                        System.out.println(Bytes.toString(CellUtil.cloneRow(cell)) + "\t"
                                + Bytes.toString(CellUtil.cloneFamily(cell)) + "\t"
                                + Bytes.toString(CellUtil.cloneQualifier(cell)) + "\t"
                                + Bytes.toString(CellUtil.cloneValue(cell)) + "\t"
                                + cell.getTimestamp());
                    }
                }
            }
        }
    }
}
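
The difference from SingleColumnValueFilter is in what comes back: the row is selected by the tested column, but that column itself is left out of the result. A minimal plain-Java model of this, assuming `filterIfMissing` is true as in the example above, might look like the following. `SingleColumnValueExcludeModel` and the sample row are hypothetical, and a plain predicate stands in for the comparator.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Optional;
import java.util.function.Predicate;

/** Models a single-column value exclude filter on one row; illustrative only. */
class SingleColumnValueExcludeModel {

    /**
     * Returns the row without the tested column if the column matches,
     * or an empty Optional if the row is dropped (filterIfMissing = true is assumed).
     */
    static Optional<Map<String, String>> apply(Map<String, String> row, String column,
                                               Predicate<String> valueTest) {
        String value = row.get(column);
        if (value == null || !valueTest.test(value)) {
            return Optional.empty();
        }
        Map<String, String> result = new LinkedHashMap<>(row); // copy; the input row is not modified
        result.remove(column);
        return Optional.of(result);
    }

    public static void main(String[] args) {
        Map<String, String> row = new LinkedHashMap<>();
        row.put("info:name", "Liu Chen");
        row.put("info:age", "19");
        row.put("info:dept", "IS");

        // The row matches on info:name, and info:name is excluded from the output
        System.out.println(apply(row, "info:name", v -> v.contains("Liu Chen")).get()); // prints {info:age=19, info:dept=IS}
    }
}
```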

3. Prefix filter: PrefixFilter (applies to row keys)

PrefixFilter prefixFilter = new PrefixFilter(Bytes.toBytes("9501"));

scan.setFilter(prefixFilter);

Full example:

import java.util.List;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.Cell;
import org.apache.hadoop.hbase.CellUtil;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.client.Connection;
import org.apache.hadoop.hbase.client.ConnectionFactory;
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.client.ResultScanner;
import org.apache.hadoop.hbase.client.Scan;
import org.apache.hadoop.hbase.client.Table;
import org.apache.hadoop.hbase.filter.PrefixFilter;
import org.apache.hadoop.hbase.util.Bytes;

public class HbaseFilterTest2 {

    private static final String ZK_CONNECT_KEY = "hbase.zookeeper.quorum";
    private static final String ZK_CONNECT_VALUE = "hadoop1:2181,hadoop2:2181,hadoop3:2181";

    public static void main(String[] args) throws Exception {
        Configuration conf = HBaseConfiguration.create();
        conf.set(ZK_CONNECT_KEY, ZK_CONNECT_VALUE);

        // try-with-resources closes the connection, table, and scanner
        try (Connection conn = ConnectionFactory.createConnection(conf);
             Table table = conn.getTable(TableName.valueOf("student"))) {
            Scan scan = new Scan();
            // Keep only rows whose row key starts with "9501"
            PrefixFilter prefixFilter = new PrefixFilter(Bytes.toBytes("9501"));
            scan.setFilter(prefixFilter);

            try (ResultScanner resultScanner = table.getScanner(scan)) {
                for (Result result : resultScanner) {
                    List<Cell> cells = result.listCells();
                    for (Cell cell : cells) {
                        // CellUtil.cloneXxx replaces the deprecated cell.getRow()/getFamily()/getQualifier()/getValue()
                        System.out.println(Bytes.toString(CellUtil.cloneRow(cell)) + "\t"
                                + Bytes.toString(CellUtil.cloneFamily(cell)) + "\t"
                                + Bytes.toString(CellUtil.cloneQualifier(cell)) + "\t"
                                + Bytes.toString(CellUtil.cloneValue(cell)) + "\t"
                                + cell.getTimestamp());
                    }
                }
            }
        }
    }
}
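
Because row keys are stored in sorted order, a prefix match over them can stop as soon as the scan moves past the prefix range. The plain-Java sketch below illustrates that idea on a sorted list of strings. It is a model of the principle rather than HBase's implementation; `PrefixScanModel`, `ScanResult`, and the sample keys are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Models a prefix scan over sorted row keys with early termination; illustrative only. */
class PrefixScanModel {

    /** Matching keys plus the number of keys that had to be examined. */
    static final class ScanResult {
        final List<String> matches = new ArrayList<>();
        int examined;
    }

    /** Scans keys in sorted order and stops once a key sorts past the prefix range. */
    static ScanResult scan(List<String> sortedKeys, String prefix) {
        ScanResult result = new ScanResult();
        for (String key : sortedKeys) {
            result.examined++;
            if (key.startsWith(prefix)) {
                result.matches.add(key);
            } else if (key.compareTo(prefix) > 0) {
                // Sorted input: this key and every later key lie beyond the prefix range
                break;
            }
        }
        return result;
    }

    public static void main(String[] args) {
        List<String> keys = Arrays.asList("95001", "95009", "95010", "95011", "95020", "95100");
        ScanResult result = scan(keys, "9501");
        System.out.println(result.matches);  // prints [95010, 95011]
        System.out.println(result.examined); // prints 5
    }
}
```

In this model the keys before the prefix range are still examined one by one; only the tail after the range is skipped.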

4. Column prefix filter: ColumnPrefixFilter

ColumnPrefixFilter columnPrefixFilter = new ColumnPrefixFilter(Bytes.toBytes("name"));

scan.setFilter(columnPrefixFilter);

Full example:

import java.util.List;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.Cell;
import org.apache.hadoop.hbase.CellUtil;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.client.Connection;
import org.apache.hadoop.hbase.client.ConnectionFactory;
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.client.ResultScanner;
import org.apache.hadoop.hbase.client.Scan;
import org.apache.hadoop.hbase.client.Table;
import org.apache.hadoop.hbase.filter.ColumnPrefixFilter;
import org.apache.hadoop.hbase.util.Bytes;

public class HbaseFilterTest2 {

    private static final String ZK_CONNECT_KEY = "hbase.zookeeper.quorum";
    private static final String ZK_CONNECT_VALUE = "hadoop1:2181,hadoop2:2181,hadoop3:2181";

    public static void main(String[] args) throws Exception {
        Configuration conf = HBaseConfiguration.create();
        conf.set(ZK_CONNECT_KEY, ZK_CONNECT_VALUE);

        // try-with-resources closes the connection, table, and scanner
        try (Connection conn = ConnectionFactory.createConnection(conf);
             Table table = conn.getTable(TableName.valueOf("student"))) {
            Scan scan = new Scan();
            // Keep only cells whose column qualifier starts with "name"
            ColumnPrefixFilter columnPrefixFilter = new ColumnPrefixFilter(Bytes.toBytes("name"));
            scan.setFilter(columnPrefixFilter);

            try (ResultScanner resultScanner = table.getScanner(scan)) {
                for (Result result : resultScanner) {
                    List<Cell> cells = result.listCells();
                    for (Cell cell : cells) {
                        // CellUtil.cloneXxx replaces the deprecated cell.getRow()/getFamily()/getQualifier()/getValue()
                        System.out.println(Bytes.toString(CellUtil.cloneRow(cell)) + "\t"
                                + Bytes.toString(CellUtil.cloneFamily(cell)) + "\t"
                                + Bytes.toString(CellUtil.cloneQualifier(cell)) + "\t"
                                + Bytes.toString(CellUtil.cloneValue(cell)) + "\t"
                                + cell.getTimestamp());
                    }
                }
            }
        }
    }
}

5. Pagination filter: PageFilter