Analyze Big Data Platforms For Security and Performance

Hive Query Activity Monitoring Quick Start

Since Apache Eagle 0.3.0-incubating. Apache Eagle will be called Eagle in the following.

This Guide describes the steps to enable HIVE1 query activity monitoring.

  • Prerequisite
  • Stream HIVE query logs into Eagle platform
  • Demos “Hive Query Activity Monitoring”

Prerequisite

Stream HIVE query logs into Eagle platform

There are a couple of methods to capture HIVE query logs. As of 0.4.0, Eagle uses YARN API to periodically poll running HIVE jobs and in realtime parse query expressions. So here Eagle assumes resource manager is installed in Hadoop[^HADOOP] cluster.

Demos

  • Hive:
    1. Click on menu “DAM” and select “Hive” to view Hive policy
    2. You should see policy with name “queryPhoneNumber”. This Policy generates alert when hive table with sensitivity(Phone_Number) information is queried.
    3. In sandbox read restricted sensitive HIVE column. ( To learn more about data sensitivity settings click Data Classification Tutorial)
$ su hive
$ hive
$ set hive.execution.engine=mr;
$ use xademo;
$ select a.phone_number from customer_details a, call_detail_records b where a.phone_number=b.phone_number;

From UI click on alert tab and you should see alert for your attempt to read restricted column.


Footnotes

  1. All mentions of “hive” on this page represent Apache Hive.

Copyright © 2015 The Apache Software Foundation, Licensed under the Apache License, Version 2.0.
Apache Eagle, Eagle, Apache Hadoop, Hadoop, Apache HBase, HBase, Apache Hive, Hive, Apache Ambari, Ambari, Apache Spark, Spark, Apache Kafka, Kafka, Apache Storm, Storm, Apache Maven, Maven, Apache Tomcat, Tomcat, Apache Derby, Derby, Apache Cassandra, Cassandra, Apache ZooKeeper, ZooKeeper, Apache, the Apache feather logo, and the Apache project logo are trademarks of The Apache Software Foundation.