Analyze Big Data Platforms For Security and Performance

Terminology

Here are some terms we are using in Apache Eagle (called Eagle in the following), please check them for your reference. They are basic knowledge of Eagle which also will help to well understand Eagle.

  • Site: a site can be considered as a physical data center. Big data platform e.g. Hadoop1 may be deployed to multiple data centers in an enterprise.

  • Application: an application is composed of data integration, policies and insights for one data source.

  • Policy: a policy defines the rule to alert. Policy can be simply a filter expression or a complex window based aggregation rules etc.

  • Data source: a data source is a monitoring target data. Eagle supports many data sources HDFS audit logs, Hive2 query, MapReduce job etc.

  • Stream: a stream is the streaming data from a data source. Each data source has its own stream.

  • Data activity monitoring: Data activity monitoring is to monitor how user exploits data in Hadoop system etc.

  • User profile: a user profile is the historical activity model generated using machine learning algorithm which could be used for showing insights.

  • Data classification: data classification provides the ability to classify different data sources with different levels of sensitivity.


Footnotes

  1. All mentions of “hadoop” on this page represent Apache Hadoop.

  2. Apache Hive.

Copyright © 2015 The Apache Software Foundation, Licensed under the Apache License, Version 2.0.
Apache Eagle, Eagle, Apache Hadoop, Hadoop, Apache HBase, HBase, Apache Hive, Hive, Apache Ambari, Ambari, Apache Spark, Spark, Apache Kafka, Kafka, Apache Storm, Storm, Apache Maven, Maven, Apache Tomcat, Tomcat, Apache Derby, Derby, Apache Cassandra, Cassandra, Apache ZooKeeper, ZooKeeper, Apache, the Apache feather logo, and the Apache project logo are trademarks of The Apache Software Foundation.