This project has retired. For details please refer to its Attic page.
Lens –

Lens User Guide

Overview

Lens Server runs several services which can be used from their REST endpoints. This document covers some of the important services, their purpose and key API calls.

Lens server provides metastore service for managing metadata. Metadata exposed by lens is inspired by OLAP data cubes. See Metastore Model for metastore constructs that lens provides.

Lens server also provides query service for querying data exposed by lens. See Query Language for the grammar of the query.

To access any service on lens, user should be working in a session. User can also pass various configuration parameters from client.

The sections below give more details on each service and configuration.

Configuration

Client configuration can be overridden in lens-client-site.xml. See client configuration for all the configuration parameters available and their default values.

Session Service

To use any Lens service the user must first create a session. Each Lens session is associated with a unique session handle, which must be passed when making queries, or doing metadata operations in the same session. To check if the session service is deployed in the Lens Server, user can send a GET request to /session. An OK response means that the session service is deployed.

Sessions also allow users to set configuration or resources which could be shared across a group of queries. For example, if a group of queries need to call a UDF available in some specific jar, then the jar file can be added as a resource in the session. All queries started within the same session can make use of the jar file.

LENS provides REST api, Java client api and CLI for doing all session level operations.

The important API calls exposed by the session resource are -

Query Execution Service

The Query Execution Service is used to query data exposed by Lens.

Query Submission Workflow

  • Create a session using the session service
  • Submit a query by sending a POST to the /queryapi/queries endpoint. By default this call should return immediately with the query handle of the newly created query. Each query in Lens is associated with a unique query handle. Query handle can be used to check query status and get results. If users wish to submit an interactive query which finishes fast, the 'op' parameter must be set to EXECUTE_WITH_TIMEOUT. This behaviour is explained in detail below.
  • In case of async execution, poll for query status by sending a GET to /queryapi/queries/queryhandle}. Once the query reaches SUCCESSFUL state, its results can be retrieved using the /queryapi/queries/queryhandle/resultset endpoint.
  • By default the create query call returns immediately. This behaviour is intended to suit batch queries. However, for interactive queries it may be necessary to issue the query and get its result in a single call to the server. For such cases the create query call takes an 'op' argument. If the op parameter is set to EXECUTE_WITH_TIMEOUT, then an additional timeout value must also be passed. If the query completes before this timeout is reached, the call immediately returns with the query result set. If however, the query doesn't finish, only the query handle is returned, and users can further poll for query status and fetch results when the query is SUCCESSFUL.
  • At any time, user can cancel the execution of the query by sending a DELETE to /queryapi/queries/queryhandle

To summarize, given below are steps for batch (async) queries

  1. Create session, note returned session handle.
  2. Create query by passing session handle, note returned query handle.
  3. Poll for query status by passing session handle and query handle.
  4. If query is SUCCESSFUL, get results.

Steps for interactive queries.

  1. Create a session.
  2. Create a query by setting op=EXECUTE_WITH_TIMEOUT, Also set the timeout value.
  3. Check the response, if it contains only the query handle, then poll for status as is the case in async queries. If it contains both query handle and result set, then that means query did complete successfully within the timeout.

Getting query results

A query can be run once, but its results can be fetched any number of times, if the results are persisted, until its purged from server memory. Results can be obtained by sending a GET to /queryapi/queries/queryhandle/resultset. This endpoint takes optional fromindex and fetchsize parameters which can be used for pagination of results.

Various query result formatting and fetching options are described in query result doc.

Life of a Query in the Lens Server

The following diagram shows query state transition in the Lens Server

Query States in Lens

When user submits a query to the Lens Server, its starts in the NEW state. After the query is submitted, it moves into the QUEUED state. Until Lens server is free to take up the query, it remains in the QUEUED state. As soon as Lens server starts processing the query, it enters the LAUNCHED state. At this stage Lens has decided which backend engine will be used to execute the query.

For each query Lens server will poll the chosen backend engine for query status. A GET on the query endpoint returns the latest status of the query.

The RUNNING state indicates that the query is currently being processed by the query backend. After the RUNNING state, the query can enter either the EXECUTED or FAILED states, depending on the result of query execution. If the execution is successful, then server would format the result if required and then set the state to SUCCESSFUL or FAILED if formatting fails.

If the query is SUCCESSFUL, its result set can be retrieved using the result set API call, by passing the session handle and query handle. The query can be executed once, and its results can be fetched multiple times unless the query has been purged from Lens server state.

In any state, if the user requests that the query be cancelled, the query will enter into CANCELLED state. Query can be cancelled by sending a DELETE at the query endpoint.

FAILED, SUCCESSFUL and CANCELLED are end states for a query. Once a query reaches these states, it becomes eligible to purging. The query is purged when its purge delay expires, after which it is not possible to retrieve results of the query. This purge delay is configurable. After purging the query enters the CLOSED state.

Prepared queries

A query can be prepared for execution. Once prepared the query can be submitted for execution as many times as required. When prepared query is no longer required, it should be destroyed. REST api, JavaClient api and CLI commands are available for all the operations supported.

Metastore service

The Metastore service is used for DDL operations like creating, updating cubes, fact tables and dimensions. It also pprovides endpoints to create storage tables and to add partitions to a storage table. For more detailed information see the metastore service resource documentation.

For resource exposed endpoint for cubes, facts, dimensions and storage tables. For each of the resource, HTTP methods specify the operation to be performed. For example, a POST on the cubes resource creates a cube, whereas a GET on the cubes reource will get list of all cubes. Similar convention is followed for fact, dimension, and storage tables.

For Java clients, JAXB classes corresponding to each of the endpoints are available.