The Query API
A common challenge for users of IaaS software is how to find a wanted resource quickly and precisely; for example, finding the VM that has EIP(16.16.16.16) from 10,000 VMs. Most IaaS software solves this problem by ad-hoc query logic in APIs. ZStack, instead of ad-hoc, equips with a framework that can automatically generate queries for every field of every resource, and join queries that span multiple resources, helping users to manage enormous resources in the cloud.
The motivation
A moderate cloud may manage several hundreds of physical hosts and tens of thousands of VMs, leading to a challenge of finding
wanted resources because IaaS software rarely have comprehensive query APIs. Most IaaS software
allows users to query a resource with a handful of conditions(e.g. name, UUID) that are
hard coded in query APIs. If users want to do a query with conditions exceeding those hard coded, for example, query a VM with the created date,
they may have to end up listing all VMs then filtering the result in a for..loop
. Querying a resource with arbitrary fields
have not been fully supported in most IaaS software, let alone join queries; for example, if users want to find a VM whose nic is
applied with a specific security group rule, they may have to list both resources(VM, security group) and do twice of for..loop
.
On the other side, compared to software like JIRA, most UI of IaaS software are coarse and crude.
Many developers may not realize that the root of the poor UI is not because UI developers lack skills of CSS/HTML/JavaScript, but
the software itself could not provide robust APIs supporting a sophisticated UI; for example, to implement a feature
like JIRA filter that shows only resources meeting specified conditions, UI may have to do a lot of post-processing work of listing all then
filtering by for..loop
, which will screw up with a large number of resources.
IaaS software has been suffering this for a while; the antidote is to provide a mechanism that can automatically generate queries for every field of every resource and can handle join queries.
Ashamed: It's a little embarrassing that we claim the UI of IaaS software as poor user experience while, so ours is. However, that's all because our UI is made by a Java/Python developer who is not good at CSS/HTML/JavaScript. ZStack has comprehensive query APIs that can help build JIRA like UI; we are sure our next version of UI will get much better.
The problem
Most IaaS software use a relational database(e.g. MySQL) as backend database that resources are normally arranged in individual tables;
such as virtual machine table, host table, volume table. For every resource, there is an API, which may be differently named
as describe API
, list API
or query API
, to get a single of a list of that resource; those APIs typically have hard coded parameters
that expose a part of columns of database tables, allowing users to query a resource with a small number of conditions; these parameters are selected
elaborately, usually are columns that matter to API designers themselves, for example, name, uuid. However, as not all columns are exposed,
users often run into situations that they have to retrieve all resources then do post-processing in one or more for..loop
,
because the column they want to query is absent from the API.
A complex query scenario may need to use join queries that often span multiple database tables; for example, to find a VM with EIP 16.16.16.16, it may involve VM table, nic table, and EIP table. Some IaaS software solves this problem using database views, which is another kind of hard coding that can only join selected tables in fixed forms, while tables in reality can be joined in very complex manners. In the software upgrade, views also require database migration if any tables a view refers to have changed.
Query APIs
To avoid hand-coded query logic in APIs and provide users the flexibility of querying everything everywhere, ZStack creates a framework that can automatically generate queries for all resources and doesn't require developers to write code to implement query logic; what's more, the framework can also generate various join queries as long as tables are connected by foreign keys.
In following contents, we will cite zone as an example to elaborate this awesome framework. A zone has following columns in database:
FIELD | DESCRIPTION |
uuid | zone UUID |
name | zone name |
description | zone description |
state | zone state |
type | zone type |
createDate | the time the zone was created |
lastOpDate | the last time the zone was operated |
Users can query a zone by any single field or combinations of fields, using regular SQL comparison operators such as '=', '!=', '>', '>=', '<', '<=', 'in', 'not in', 'is null', 'is not null', 'like', 'not like'.
Note: In the command line tool, some operators have different forms: 'in'(?=), 'not in'(!?=), 'is null'(=null), 'is not null'(!=null), 'like'(~=), 'not like'(!~=).
QueryZone name=west-coast-zone
QueryZone name=west-coast-zone state=Enabled
As a zone is the ancestor of major ZStack resources, many resources more or less have relationships to it; for example, a running VM is always in some zone. Such kind of relationship can produce the join query like:
QueryZone vmInstance.name=web-vm1
As showed in the prior table, a zone doesn't expose any field called vmInstance
, but there is a condition starting with 'vmInstance' in above query.
This kind of query is called expanded query in ZStack. Here vmInstance
represents the VM table that has a
field zoneUuid
(foreign key) referring to the zone table, so the query framework can understand their relationship and generate the join query.
The above example is interpreted as "find zones that run VMs of name web-vm1". Expanding the example further, as the VM nic table has
foreign key to the VM table, and the EIP table has foreign key to the VM nic table, a query of zone can also use an EIP as condition:
QueryZone vmInstance.vmNics.eip.vipIp=16.16.16.16
The query is interpreted as "find zones that run VMs whose nics associated with EIP 16.16.16.16". Now you see the power of the query APIs! We can even create some very complex query:
QueryVolumeSnapshot volume.vmInstance.vmNics.l3Network.l2Network.attachedClusterUuids=13238c8e0591444e9160df4d3636be82
This complex query is to find volume snapshots created from volumes of VMs that have nics on L3 networks whose parent L2 networks are attached to a cluster of uuid equal to 13238c8e0591444e9160df4d3636be82. Don't panic, you rarely need that complex query, but it does prove the competency. Besides those, SQL features like selecting fields, sort, count, and pagination are also supported:
QueryL3Network name=L3-SYSTEM-PUBLIC count=true
QueryL3Network l2NetworkUuid=33107835aee84c449ac04c9622892dec limit=10
QueryL3Network l2NetworkUuid=33107835aee84c449ac04c9622892dec start=10 limit=100
QueryL3Network fields=name,uuid l2NetworkUuid=33107835aee84c449ac04c9622892dec
QueryL3Network l2NetworkUuid=33107835aee84c449ac04c9622892dec sortBy=createDate sortDirection=desc
The implementation
Despite query APIs are so powerful, the magic under the cover is very concise. When adding a new resource, developers do not need to write any code about query logic, but define the query API and the resource itself. To implement the query API for zone, a developer needs to:
1. annotate zone inventory with query metadata
@Inventory(mappingVOClass = ZoneVO.class)
@PythonClassInventory
@ExpandedQueries({
@ExpandedQuery(expandedField = "vmInstance", inventoryClass = VmInstanceInventory.class,
foreignKey = "uuid", expandedInventoryKey = "zoneUuid"),
@ExpandedQuery(expandedField = "cluster", inventoryClass = ClusterInventory.class,
foreignKey = "uuid", expandedInventoryKey = "zoneUuid"),
@ExpandedQuery(expandedField = "host", inventoryClass = HostInventory.class,
foreignKey = "uuid", expandedInventoryKey = "zoneUuid"),
@ExpandedQuery(expandedField = "primaryStorage", inventoryClass = PrimaryStorageInventory.class,
foreignKey = "uuid", expandedInventoryKey = "zoneUuid"),
@ExpandedQuery(expandedField = "l2Network", inventoryClass = L2NetworkInventory.class,
foreignKey = "uuid", expandedInventoryKey = "zoneUuid"),
@ExpandedQuery(expandedField = "l3Network", inventoryClass = L3NetworkInventory.class,
foreignKey = "uuid", expandedInventoryKey = "zoneUuid"),
@ExpandedQuery(expandedField = "backupStorageRef", inventoryClass = BackupStorageZoneRefInventory.class,
foreignKey = "uuid", expandedInventoryKey = "zoneUuid", hidden = true),
})
@ExpandedQueryAliases({
@ExpandedQueryAlias(alias = "backupStorage", expandedField = "backupStorageRef.backupStorage")
})
public class ZoneInventory implements Serializable{
private String uuid;
private String name;
private String description;
private String state;
private String type;
private Timestamp createDate;
private Timestamp lastOpDate;
}
Annotations above declare relationships among zone and other resources, which are bases of zone's expanded queries.
2. define a query API
@AutoQuery(replyClass = APIQueryZoneReply.class, inventoryClass = ZoneInventory.class)
public class APIQueryZoneMsg extends APIQueryMessage {
}
3. declare the query API in zone's API configuration file
<?xml version="1.0" encoding="UTF-8"?>
<service xmlns="http://zstack.org/schema/zstack">
<id>zone</id>
<interceptor>ZoneApiInterceptor</interceptor>
<message>
<name>org.zstack.header.zone.APIQueryZoneMsg</name>
<serviceId>query</serviceId>
</message>
</service>
The API APIQueryZoneMsg
is routed to the query service by specifying the service id query
. That's it,
zero line of code about query logic; the query service takes care of all the rest. All ZStack query APIs are defined like this,
adding query API for a new resource is extremely easy.
Current limitation
The main limitation is only logic AND
is supported amid query conditions, no OR
. For example:
QueryZone name=west-coast-zone state=Enabled
is saying "find zones of name west-coast and state Enabled". We do this because we found 99% combinations
of query conditions are based on logic AND
, from the SQL usage in ZStack source code. On the other hide, it's very hard to
keep query APIs concise if logic OR
is introduced without creating a DSL. However, for many cases, comparison operation
in(?=)
can be used to achieve OR
:
QueryZone name=west-coast-zone state?=Enabled,Disabled
The above example is saying "find zones of name west-coast and state of Enabled OR Disabled". In future, we will introduce a query language in DSL style, for example:
QueryZone name=west-coast-zone AND (state=Enabled OR state=Disabled)
Summary
In this article, we demonstrated ZStack's query APIs. With the powerful tool, users can query any resources in the way similar to the rational database. In future, ZStack will build a sophisticated UI that can create various views(filters) leveraging query APIs, for example, showing all VMs running on the same L3 network, bringing a revolutionary change to user experience of IaaS UI.