1.Atlas Type System
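Atlas lets users define a model for the metadata objects they want to manage. The model is made up of definitions called "types". Instances of types, called "entities", represent the actual metadata objects being managed. The type system is the component that lets users define and manage types and entities. Every metadata object managed by Atlas (a Hive table, for example) is modeled with a type and represented as an entity. To store a new kind of metadata in Atlas, you need to understand the concepts of the type system.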
2.Atlas Type System Concepts
2.1 Type
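A Type in Atlas represents one kind of data, for example hdfs_path, hive_db, hive_table, or hive_column. Put simply, a Type is comparable to a Class in object-oriented Java: it defines one kind of data.
A "type" in Atlas defines how a particular kind of metadata object is stored and accessed. A type represents one attribute, or a collection of attributes, of the metadata object it defines. Readers with a development background can think of a "type" as a "class" definition in an object-oriented language, or as a "table schema" in a relational database.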
2.2 Entity
An Entity is an Instance of a Type, much like a concrete Object of a Class in object-oriented programming.
2.3 Attribute
In the type system, both Types and Entities have attributes. An Attribute defines one specific property of a Type or Entity. In addition, Atlas ships with a number of predefined types, such as Referenceable, Asset, Infrastructure, DataSet, and Process.
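The class/object analogy can be made concrete with a small sketch. The Python classes below (`TypeDef`, `Entity`) are hypothetical illustrations and not part of Atlas itself, and the supertype chain and attribute lists are assumptions chosen for the example rather than a copy of the real type definitions.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List, Optional


@dataclass
class TypeDef:
    """A Type: a named set of attributes, optionally extending a supertype."""
    name: str
    attributes: List[str]
    super_type: Optional["TypeDef"] = None

    def all_attributes(self) -> List[str]:
        # Attributes declared on supertypes are inherited by subtypes.
        inherited = self.super_type.all_attributes() if self.super_type else []
        return inherited + self.attributes


@dataclass
class Entity:
    """An Entity: one instance of a Type, holding concrete attribute values."""
    type_def: TypeDef
    values: Dict[str, Any] = field(default_factory=dict)

    def unknown_attributes(self) -> List[str]:
        # Names of values that are not declared anywhere in the type hierarchy.
        allowed = set(self.type_def.all_attributes())
        return sorted(name for name in self.values if name not in allowed)


# Illustrative hierarchy (assumed for this sketch).
referenceable = TypeDef("Referenceable", ["qualifiedName"])
asset = TypeDef("Asset", ["name", "description", "owner"], referenceable)
data_set = TypeDef("DataSet", [], asset)
hive_table = TypeDef("hive_table", ["db", "columns", "tableType"], data_set)

# One entity of type hive_table, using values from the examples in this post.
table = Entity(hive_table, {
    "qualifiedName": "gmall.ads_gmv_sum_day@primary",
    "name": "ads_gmv_sum_day",
    "owner": "luomk",
    "tableType": "EXTERNAL_TABLE",
})
```

Here `hive_table` plays the role of the class and `table` the role of the object; `table.unknown_attributes()` reports any value that the type hierarchy does not declare.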
3.Atlas Rest API
This section summarizes a subset of the REST API. For the remaining endpoints, see the Atlas REST API pages on the official Atlas site.
Atlas REST API reference: http://atlas.apache.org/api/v2/
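• Check the status of an Atlas instance GET /admin/status
curl -s -u admin:admin "http://localhost:21000/api/atlas/admin/status"
ACTIVE: the instance is active and can respond to user requests.
PASSIVE: the instance is passive. It redirects any user request it receives to the current ACTIVE instance.
BECOMING_ACTIVE: the instance is transitioning to ACTIVE and cannot serve user requests while in this state.
BECOMING_PASSIVE: the instance is transitioning to PASSIVE and cannot serve user requests while in this state.
Note: under normal conditions exactly one instance should be ACTIVE and all other instances should be PASSIVE.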
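As a rough illustration of how a client might act on these four states, the sketch below maps each state to the behaviour described above and picks the ACTIVE instance from a set of status values. The function names and the host names in the usage note are hypothetical; the status strings are the ones listed above.

```python
from typing import Dict, Optional

ACTIVE = "ACTIVE"
PASSIVE = "PASSIVE"
TRANSITIONAL = {"BECOMING_ACTIVE", "BECOMING_PASSIVE"}


def describe(status: str) -> str:
    """Map a status value to the behaviour described in the list above."""
    if status == ACTIVE:
        return "serves user requests"
    if status == PASSIVE:
        return "redirects user requests to the ACTIVE instance"
    if status in TRANSITIONAL:
        return "cannot serve user requests while transitioning"
    raise ValueError(f"unknown status: {status!r}")


def find_active(statuses: Dict[str, str]) -> Optional[str]:
    """Return the URL of the ACTIVE instance, or None unless exactly one exists."""
    active = [url for url, status in statuses.items() if status == ACTIVE]
    return active[0] if len(active) == 1 else None
```

For example, `find_active({"http://atlas1:21000": "PASSIVE", "http://atlas2:21000": "ACTIVE"})` selects the second (made-up) host, and the function returns `None` when zero or several instances report ACTIVE, which signals a cluster that is mid-failover or misconfigured.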
• Show the Atlas version and description GET /admin/version
curl -s -u admin:admin "http://localhost:21000/api/atlas/admin/version"
For example:
{
"Version": "0.8.4",
"Revision": "release",
"Name": "apache-atlas",
"Description": "Metadata Management and Data Governance Platform over Hadoop"
}
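The `curl -s -u admin:admin` calls in this post can also be issued from a script. The following is a minimal sketch using only the Python standard library; the helper names `build_request` and `get_json` are made up for this example, and the base URL and credentials are simply the ones used in the curl commands.

```python
import base64
import json
import urllib.request

ATLAS_BASE = "http://localhost:21000/api/atlas"


def build_request(path: str, user: str = "admin", password: str = "admin") -> urllib.request.Request:
    """Build a GET request with HTTP Basic auth, like `curl -u user:password`."""
    token = base64.b64encode(f"{user}:{password}".encode("utf-8")).decode("ascii")
    return urllib.request.Request(
        ATLAS_BASE + path,
        headers={"Authorization": "Basic " + token, "Accept": "application/json"},
    )


def get_json(path: str) -> dict:
    """Send the request and decode the JSON body; needs a reachable Atlas server."""
    with urllib.request.urlopen(build_request(path)) as response:
        return json.loads(response.read().decode("utf-8"))
```

Against a running server, `get_json("/admin/version")` should return a dictionary shaped like the example above, so the version string would be read from its `"Version"` key.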
# Query all Hive tables
curl -s -u admin:admin "http://localhost:21000/api/atlas/v2/search/basic?typeName=hive_table"
# Query Hive tables that match a given keyword
curl -s -u admin:admin "http://localhost:21000/api/atlas/v2/search/basic?query=ads_gmv_sum_day&typeName=hive_table"
For example:
{
"queryType": "BASIC",
"searchParameters": {
"query": "ads_gmv_sum_day",
"typeName": "hive_table",
"excludeDeletedEntities": false,
"includeClassificationAttributes": false,
"includeSubTypes": true,
"includeSubClassifications": true,
"limit": 100,
"offset": 0
},
"queryText": "ads_gmv_sum_day",
"entities": [{
"typeName": "hive_table",
"attributes": {
"owner": "luomk",
"createTime": 1570460212000,
"qualifiedName": "gmall.ads_gmv_sum_day@primary",
"name": "ads_gmv_sum_day"
},
"guid": "2dd4ca4c-9d33-4c19-bca3-f60e162debf2",
"status": "ACTIVE",
"displayText": "ads_gmv_sum_day",
"classificationNames": []
}]
}
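To show how the two search calls relate to the response, here is a small sketch that builds the same query strings and reduces a response to the fields most callers need. Both function names are hypothetical, and the code relies only on the fields visible in the sample response above.

```python
from typing import Any, Dict, List, Optional
from urllib.parse import urlencode


def basic_search_url(base: str, type_name: str, query: Optional[str] = None) -> str:
    """Build a /v2/search/basic URL; `query` is the optional keyword filter."""
    params: Dict[str, str] = {}
    if query:
        params["query"] = query
    params["typeName"] = type_name
    return f"{base}/v2/search/basic?{urlencode(params)}"


def summarize_entities(response: Dict[str, Any]) -> List[Dict[str, str]]:
    """Keep guid, qualifiedName and status for each entity in a search response."""
    # .get() is used so that a response without an "entities" key yields [].
    return [
        {
            "guid": entity["guid"],
            "qualifiedName": entity["attributes"]["qualifiedName"],
            "status": entity["status"],
        }
        for entity in response.get("entities", [])
    ]
```

The `guid` values returned by `summarize_entities` are what the entity and lineage endpoints later in this post take as input.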
3.3 TypesREST
• Retrieve all Types with their full definitions GET /v2/types/typedefs
curl -s -u admin:admin "http://localhost:21000/api/atlas/v2/types/typedefs"
For example:
{
"enumDefs":Array[2],
"structDefs":Array[3],
"classificationDefs":Array[1],
"entityDefs":Array[31]
}
• Retrieve all Types with minimal information GET /v2/types/typedefs/headers
curl -s -u admin:admin "http://localhost:21000/api/atlas/v2/types/typedefs/headers"
For example:
[{
"guid": "fb973127-eaf5-40d8-b7e2-76a8d9277647",
"name": "hive_principal_type",
"category": "ENUM"
}, {
"guid": "a50ae8c7-d3ce-4841-9f17-ffccfb965ba8",
"name": "file_action",
"category": "ENUM"
}, {
"guid": "79c4fea8-c318-4c55-848c-da967fcdc27d",
"name": "hive_serde",
"category": "STRUCT"
}, {
"guid": "b001c35b-4a27-4121-bf2b-f6dc3b114a98",
"name": "hive_order",
"category": "STRUCT"
}, {
"guid": "ed9105e9-d007-4ba4-9245-0249f65d10c8",
"name": "fs_permissions",
"category": "STRUCT"
}, {
"guid": "affdcdca-074d-4a83-aca6-355cb4f9923d",
"name": "TaxonomyTerm",
"category": "CLASSIFICATION"
}, {
"guid": "311875de-7272-41f7-9e95-e7bca61a631d",
"name": "falcon_process",
"category": "ENTITY"
}, {
"guid": "603ba60e-5291-44d9-8cf1-ada9c84cd766",
"name": "falcon_feed_replication",
"category": "ENTITY"
}, {
"guid": "5b6e3f38-fe2f-4022-b34a-3bc8201983ab",
"name": "DataSet",
"category": "ENTITY"
}, {
"guid": "6f6b8df6-bdbd-4102-862a-7b85e3c1ed14",
"name": "falcon_feed_creation",
"category": "ENTITY"
}, {
"guid": "21f2cddb-8975-4db9-8cd8-fdedd7ae0a76",
"name": "Process",
"category": "ENTITY"
}, {
"guid": "e54dc459-a8ca-4b57-81ae-1756cb6075de",
"name": "hive_table",
"category": "ENTITY"
}, {
"guid": "f950c86d-7c42-4735-87f8-6e44f9abd194",
"name": "hive_db",
"category": "ENTITY"
}, {
"guid": "7ed996cd-ef61-46b6-ab98-e76d685f84f9",
"name": "sqoop_dbdatastore",
"category": "ENTITY"
}, {
"guid": "9098252b-8797-45da-8710-fb23ef7f1d60",
"name": "hbase_namespace",
"category": "ENTITY"
}, {
"guid": "82d29db6-dd5a-404c-bb11-682a4d65ab07",
"name": "hive_process",
"category": "ENTITY"
}, {
"guid": "1f7320cd-b2fd-4398-bd90-3e79cf477909",
"name": "storm_node",
"category": "ENTITY"
}, {
"guid": "f678cf64-aca4-4198-a4ca-04a64c69b4ba",
"name": "hbase_column",
"category": "ENTITY"
}, {
"guid": "3b4c4912-694b-4bc7-a711-12fcb4803ebb",
"name": "AtlasServer",
"category": "ENTITY"
}, {
"guid": "f96ed296-3711-4640-9a82-f7350ace8d97",
"name": "Referenceable",
"category": "ENTITY"
}, {
"guid": "a45c5739-d128-4256-9a9b-7f7fe725f9d8",
"name": "hbase_table",
"category": "ENTITY"
}, {
"guid": "e50991b4-e023-44fc-9db0-54e5dd0c35b8",
"name": "falcon_feed",
"category": "ENTITY"
}, {
"guid": "ac643124-a524-4ad5-ae9f-a6cfeed56afc",
"name": "jms_topic",
"category": "ENTITY"
}, {
"guid": "a0dda35d-fcb3-4a37-bc1e-4639606b92db",
"name": "storm_topology",
"category": "ENTITY"
}, {
"guid": "d212fc52-4d7c-4b93-aa27-e861b5f618af",
"name": "Infrastructure",
"category": "ENTITY"
}, {
"guid": "e3fe2788-1ba0-40cd-adc4-3fb1e2bab913",
"name": "hbase_column_family",
"category": "ENTITY"
}, {
"guid": "252d5a57-bc27-4e32-a28a-56ba130f8711",
"name": "storm_spout",
"category": "ENTITY"
}, {
"guid": "3e98f481-2758-4f73-8927-c70ccc6a54c7",
"name": "Asset",
"category": "ENTITY"
}, {
"guid": "ba5574ff-c893-46a8-ba31-f005a9da6ed0",
"name": "hive_column",
"category": "ENTITY"
}, {
"guid": "6ec0acc8-ddaf-4c7a-8cb1-8660abfbe985",
"name": "kafka_topic",
"category": "ENTITY"
}, {
"guid": "1984cde2-e963-4de3-8604-60a26c40165b",
"name": "hive_storagedesc",
"category": "ENTITY"
}, {
"guid": "cf50e617-f069-4159-9608-7cfb890fbd69",
"name": "hdfs_path",
"category": "ENTITY"
}, {
"guid": "9f4d9527-38aa-4bac-9fca-1dcec6885e98",
"name": "sqoop_process",
"category": "ENTITY"
}, {
"guid": "4bdeaf91-9d1d-47b5-a912-64253e3e7e17",
"name": "hive_column_lineage",
"category": "ENTITY"
}, {
"guid": "f7adaaa6-c0a2-4a61-a5ed-7810a3a52c32",
"name": "storm_bolt",
"category": "ENTITY"
}, {
"guid": "981f7e29-7182-4ad8-8d66-b52c181c2af4",
"name": "falcon_cluster",
"category": "ENTITY"
}, {
"guid": "395e3dd4-726f-4a65-a3a8-44475bef68bc",
"name": "fs_path",
"category": "ENTITY"
}]
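A flat header list like the one above is easier to read once grouped by category. The sketch below is one way to do that; the function name is hypothetical and the code assumes only the `name` and `category` fields shown in the sample.

```python
from collections import defaultdict
from typing import Dict, List


def group_by_category(headers: List[Dict[str, str]]) -> Dict[str, List[str]]:
    """Group type names by category (ENUM, STRUCT, CLASSIFICATION, ENTITY)."""
    grouped: Dict[str, List[str]] = defaultdict(list)
    for header in headers:
        grouped[header["category"]].append(header["name"])
    # Sort names within each category for stable, readable output.
    return {category: sorted(names) for category, names in grouped.items()}
```

Applied to the full list above, the `ENTITY` group would contain the predefined types from section 2.3 (Referenceable, Asset, Infrastructure, DataSet, Process) alongside the Hive, HBase, Storm, Falcon, and Sqoop types.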
• Get Entity definitions in bulk by GUID GET /v2/entity/bulk
curl -s -u admin:admin "http://localhost:21000/api/atlas/v2/entity/bulk?minExtInfo=yes&guid=2dd4ca4c-9d33-4c19-bca3-f60e162debf2"
For example:
{
"referredEntities": {
"6093c0d2-e90d-4cbc-81ee-850fdfb06528": {
"typeName": "hive_column",
"attributes": {
"owner": "luomk",
"replicatedTo": null,
"replicatedFrom": null,
"qualifiedName": "gmall.ads_gmv_sum_day.dt@primary",
"name": "dt",
"description": null,
"comment": "????",
"position": 0,
"type": "string",
"table": {
"guid": "2dd4ca4c-9d33-4c19-bca3-f60e162debf2",
"typeName": "hive_table"
}
},
"guid": "6093c0d2-e90d-4cbc-81ee-850fdfb06528",
"status": "ACTIVE",
"createdBy": "admin",
"updatedBy": "admin",
"createTime": 1570869511992,
"updateTime": 1570869511992,
"version": 0,
"classifications": []
},
"82812804-baec-4f04-8978-540972a0f10a": {
"typeName": "hive_storagedesc",
"attributes": {
"replicatedTo": null,
"replicatedFrom": null,
"qualifiedName": "gmall.ads_gmv_sum_day@primary_storage",
"inputFormat": "org.apache.hadoop.mapred.TextInputFormat",
"bucketCols": null,
"sortCols": null,
"storedAsSubDirectories": false,
"location": "hdfs://hadoop102:9000/warehouse/gmall/ads/ads_gmv_sum_day",
"compressed": false,
"outputFormat": "org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat",
"parameters": null,
"table": {
"guid": "2dd4ca4c-9d33-4c19-bca3-f60e162debf2",
"typeName": "hive_table"
},
"serdeInfo": {
"typeName": "hive_serde",
"attributes": {
"serializationLib": "org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe",
"name": null,
"parameters": {
"serialization.format": "\t",
"field.delim": "\t"
}
}
},
"numBuckets": -1
},
"guid": "82812804-baec-4f04-8978-540972a0f10a",
"status": "ACTIVE",
"createdBy": "admin",
"updatedBy": "admin",
"createTime": 1570869511992,
"updateTime": 1570869511992,
"version": 0,
"classifications": []
},
"fa07e2c3-245d-46ce-b753-42dda6afba48": {
"typeName": "hive_column",
"attributes": {
"owner": "luomk",
"replicatedTo": null,
"replicatedFrom": null,
"qualifiedName": "gmall.ads_gmv_sum_day.gmv_amount@primary",
"name": "gmv_amount",
"description": null,
"comment": "??gmv?????",
"position": 2,
"type": "decimal(16,2)",
"table": {
"guid": "2dd4ca4c-9d33-4c19-bca3-f60e162debf2",
"typeName": "hive_table"
}
},
"guid": "fa07e2c3-245d-46ce-b753-42dda6afba48",
"status": "ACTIVE",
"createdBy": "admin",
"updatedBy": "admin",
"createTime": 1570869511992,
"updateTime": 1570869511992,
"version": 0,
"classifications": []
},
"206d5311-6799-4deb-8014-2b64bbcfd3c5": {
"typeName": "hive_column",
"attributes": {
"owner": "luomk",
"replicatedTo": null,
"replicatedFrom": null,
"qualifiedName": "gmall.ads_gmv_sum_day.gmv_payment@primary",
"name": "gmv_payment",
"description": null,
"comment": "??????",
"position": 3,
"type": "decimal(16,2)",
"table": {
"guid": "2dd4ca4c-9d33-4c19-bca3-f60e162debf2",
"typeName": "hive_table"
}
},
"guid": "206d5311-6799-4deb-8014-2b64bbcfd3c5",
"status": "ACTIVE",
"createdBy": "admin",
"updatedBy": "admin",
"createTime": 1570869511992,
"updateTime": 1570869511992,
"version": 0,
"classifications": []
},
"726f654d-10cd-4914-8a27-27b2ce35274f": {
"typeName": "hive_column",
"attributes": {
"owner": "luomk",
"replicatedTo": null,
"replicatedFrom": null,
"qualifiedName": "gmall.ads_gmv_sum_day.gmv_count@primary",
"name": "gmv_count",
"description": null,
"comment": "??gmv????",
"position": 1,
"type": "bigint",
"table": {
"guid": "2dd4ca4c-9d33-4c19-bca3-f60e162debf2",
"typeName": "hive_table"
}
},
"guid": "726f654d-10cd-4914-8a27-27b2ce35274f",
"status": "ACTIVE",
"createdBy": "admin",
"updatedBy": "admin",
"createTime": 1570869511992,
"updateTime": 1570869511992,
"version": 0,
"classifications": []
}
},
"entities": [{
"typeName": "hive_table",
"attributes": {
"owner": "luomk",
"temporary": false,
"lastAccessTime": 1570460212000,
"aliases": null,
"replicatedTo": null,
"replicatedFrom": null,
"qualifiedName": "gmall.ads_gmv_sum_day@primary",
"columns": [{
"guid": "6093c0d2-e90d-4cbc-81ee-850fdfb06528",
"typeName": "hive_column"
}, {
"guid": "726f654d-10cd-4914-8a27-27b2ce35274f",
"typeName": "hive_column"
}, {
"guid": "fa07e2c3-245d-46ce-b753-42dda6afba48",
"typeName": "hive_column"
}, {
"guid": "206d5311-6799-4deb-8014-2b64bbcfd3c5",
"typeName": "hive_column"
}],
"description": null,
"viewExpandedText": null,
"sd": {
"guid": "82812804-baec-4f04-8978-540972a0f10a",
"typeName": "hive_storagedesc"
},
"tableType": "EXTERNAL_TABLE",
"createTime": 1570460212000,
"name": "ads_gmv_sum_day",
"comment": "GMV",
"partitionKeys": null,
"parameters": {
"totalSize": "0",
"EXTERNAL": "TRUE",
"numRows": "5",
"rawDataSize": "141",
"COLUMN_STATS_ACCURATE": "true",
"numFiles": "0",
"transient_lastDdlTime": "1575182190",
"comment": "GMV"
},
"db": {
"guid": "5b85a7d4-6315-4947-9b74-28852dc57195",
"typeName": "hive_db"
},
"retention": 0,
"viewOriginalText": null
},
"guid": "2dd4ca4c-9d33-4c19-bca3-f60e162debf2",
"status": "ACTIVE",
"createdBy": "admin",
"updatedBy": "luomk",
"createTime": 1570869511992,
"updateTime": 1575182190929,
"version": 0,
"classifications": []
}]
}
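In the response above, the table's `columns` attribute holds only references (guid plus typeName); the column details live under `referredEntities`. The following sketch, with a hypothetical function name, joins the two and returns the columns in their declared order. It assumes the response has exactly the shape shown above.

```python
from typing import Any, Dict, List, Tuple


def table_columns(bulk_response: Dict[str, Any], table_guid: str) -> List[Tuple[str, str]]:
    """Return (column name, column type) pairs for a table, ordered by position."""
    referred = bulk_response["referredEntities"]
    # Locate the requested table among the returned entities.
    table = next(e for e in bulk_response["entities"] if e["guid"] == table_guid)
    # Resolve each column reference to its full entity.
    columns = [referred[ref["guid"]] for ref in table["attributes"]["columns"]]
    columns.sort(key=lambda column: column["attributes"]["position"])
    return [(c["attributes"]["name"], c["attributes"]["type"]) for c in columns]
```

The same lookup pattern applies to the `sd` reference, which resolves to the `hive_storagedesc` entity in `referredEntities`.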
• Get the definition of a single Entity GET /v2/entity/guid/{guid}
curl -s -u admin:admin "http://localhost:21000/api/atlas/v2/entity/guid/2dd4ca4c-9d33-4c19-bca3-f60e162debf2"
The response carries the same table and referredEntities data as the bulk example above, except that the table is returned as a single "entity" object instead of an "entities" array.
• Get the classifications of an Entity GET /v2/entity/guid/{guid}/classifications
curl -s -u admin:admin "http://localhost:21000/api/atlas/v2/entity/guid/2dd4ca4c-9d33-4c19-bca3-f60e162debf2/classifications"
For example:
{
"list": [],
"startIndex": 0,
"pageSize": 0,
"totalCount": 0,
"sortType": "NONE"
}
3.5 LineageREST
• Query the Lineage of an Entity GET /v2/lineage/{guid}
curl -s -u admin:admin "http://localhost:21000/api/atlas/v2/lineage/2dd4ca4c-9d33-4c19-bca3-f60e162debf2"
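No lineage response is shown here, so the sketch below rests on assumptions: it assumes the response contains a `relations` list whose items have `fromEntityId` and `toEntityId` fields, and that the endpoint accepts `direction` and `depth` query parameters. Check both against the API reference for your Atlas version. Under those assumptions, the code collects every upstream entity of a given GUID.

```python
from collections import defaultdict
from typing import Any, Dict, Set
from urllib.parse import urlencode


def lineage_url(base: str, guid: str, direction: str = "BOTH", depth: int = 3) -> str:
    """Build a lineage URL; the `direction` and `depth` parameters are assumed."""
    return f"{base}/v2/lineage/{guid}?{urlencode({'direction': direction, 'depth': depth})}"


def upstream_guids(lineage: Dict[str, Any], guid: str) -> Set[str]:
    """Collect all GUIDs that feed, directly or indirectly, into `guid`."""
    # Reverse adjacency: for each entity, the set of entities pointing at it.
    parents: Dict[str, Set[str]] = defaultdict(set)
    for relation in lineage.get("relations", []):
        parents[relation["toEntityId"]].add(relation["fromEntityId"])

    seen: Set[str] = set()
    stack = [guid]
    while stack:
        current = stack.pop()
        for parent in parents.get(current, ()):
            if parent not in seen:  # guards against cycles
                seen.add(parent)
                stack.append(parent)
    return seen
```

For a table produced by a Hive process, the result would typically include the process entity and the source tables it read from.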