Pivotal Knowledge Base

GemFire function usage patterns

Environment

Product Version
Pivotal GemFire 7.0.2 and later

Introduction

The purpose of this article is to describe some typical usage patterns for GemFire functions, adding to the general description of functions found in the User's Guide.

Description

Executing queries in functions

To execute a query in a function:

  • Invoke the function with onRegion.
  • Have the function return true from optimizeForWrite, so that it is executed only on primary buckets.
  • Use the Query Execute API with a RegionFunctionContext in the function. Otherwise, you could end up executing the same query on more than one member.

If you set a filter, the function (and query) will execute on only the member containing the primary or primaries for that filter.

For example (based on stock trades):

If you route all trades on a specific cusip number to the same bucket using a PartitionResolver, then querying for all trades for a specific cusip number can be done efficiently using a Function. The trades could be stored with a simple String key like cusip-id or a complex key containing both the cusip number and ID. Either way, the PartitionResolver will need to be able to return the cusip number for the routing object.
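The routing rule described above can be sketched in plain Java. This is a simplified model rather than a GemFire PartitionResolver implementation: the class name CusipRouting, the "cusip-id" key layout, the sample cusip value, and the hash-modulo bucket choice are all assumptions made for illustration, and GemFire's internal bucket hashing may differ. The point it shows is that a full key and a bare cusip must resolve to the same routing object.

```java
import java.util.Objects;

// Plain-Java model of the routing rule; it does not use the GemFire API.
final class CusipRouting {

    // Assumed key layout "<cusip>-<id>". A bare cusip contains no '-'
    // and is returned unchanged.
    static String routingObject(String keyOrCusip) {
        Objects.requireNonNull(keyOrCusip, "keyOrCusip");
        int dash = keyOrCusip.indexOf('-');
        return dash < 0 ? keyOrCusip : keyOrCusip.substring(0, dash);
    }

    // Illustrative bucket choice only; GemFire's internal hashing may differ.
    static int bucketFor(String keyOrCusip, int totalBuckets) {
        return Math.floorMod(routingObject(keyOrCusip).hashCode(), totalBuckets);
    }

    public static void main(String[] args) {
        int buckets = 113; // assumed bucket count for the demo
        // Two trades on the same (sample) cusip land in the same bucket.
        System.out.println(bucketFor("123456789-1", buckets) == bucketFor("123456789-2", buckets));
        // A bare cusip resolves to the same bucket as the full keys.
        System.out.println(bucketFor("123456789", buckets) == bucketFor("123456789-1", buckets));
    }
}
```

In a real resolver, the same extraction logic would sit behind the routing-object callback, applied to whatever key type the region actually uses.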

Invoke the function as shown below:

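// The filter routes the call to the member hosting the bucket for this cusip.
Execution execution = FunctionService.onRegion(this.region).withFilter(Collections.singleton(cusip));
ResultCollector collector = execution.execute("TradeQueryFunction");
// getResult() blocks until the function's results have arrived.
Object result = collector.getResult();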

In the TradeQueryFunction, execute the query as shown below:

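// The filter set carries the cusip that the caller passed to withFilter.
RegionFunctionContext rfc = (RegionFunctionContext) context;
String cusip = (String) rfc.getFilter().iterator().next();
// Executing with the RegionFunctionContext restricts the query to the filtered bucket's data.
SelectResults results = (SelectResults) this.query.execute(rfc, new String[] {cusip});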

The query is as follows:

select * from /Trade where cusip = $1

This will route the function execution to the member whose primary bucket contains the cusip filter. Then it will execute the query on the RegionFunctionContext, which will be just the data for that bucket. Note, however, that the PartitionResolver will also need to be able to return the cusip for that filter (which is just the input string itself).

General info on functions

When a function is executed with onRegion on a replicated region, it runs on a single member, which can be any member that defines that region. Since the region is replicated, every server has the same data.

When a function is executed with onRegion on a partitioned region, where it runs depends on the result of optimizeForWrite. If optimizeForWrite returns true, the function is invoked on all the members containing primary buckets for that region. If it returns false, the function is invoked on as few members as possible that together encompass all the buckets (generally, some mix of primary and secondary buckets). For example, if you have 2 members, each hosting a copy of every bucket, and the primaries are split between them, then the function is invoked on both members when optimizeForWrite returns true. Returning false, by contrast, causes the function to be invoked on only one member, since each member has all the buckets. The normal usage is to always have optimizeForWrite return true.
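The two-member example can be modeled in plain Java as follows. This is a simplified sketch and not GemFire's actual selection algorithm: the class TargetSelection, its method names, and the greedy covering strategy used for the false case are assumptions chosen only to reproduce the behavior described above.

```java
import java.util.LinkedHashSet;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;

// Simplified model of member selection; not GemFire's actual algorithm.
final class TargetSelection {

    // optimizeForWrite == true: every member that hosts at least one primary bucket.
    static Set<String> primaryTargets(Map<Integer, String> primaryOwner) {
        return new TreeSet<>(primaryOwner.values());
    }

    // optimizeForWrite == false: greedily pick members until all buckets are covered.
    static Set<String> minimalTargets(Map<String, Set<Integer>> hosted, Set<Integer> allBuckets) {
        Set<Integer> uncovered = new TreeSet<>(allBuckets);
        Set<String> chosen = new LinkedHashSet<>();
        while (!uncovered.isEmpty()) {
            String best = null;
            int bestGain = 0;
            for (Map.Entry<String, Set<Integer>> e : hosted.entrySet()) {
                Set<Integer> gain = new TreeSet<>(e.getValue());
                gain.retainAll(uncovered); // buckets this member would newly cover
                if (gain.size() > bestGain) {
                    best = e.getKey();
                    bestGain = gain.size();
                }
            }
            if (best == null) {
                throw new IllegalStateException("some bucket has no host");
            }
            chosen.add(best);
            uncovered.removeAll(hosted.get(best));
        }
        return chosen;
    }

    public static void main(String[] args) {
        // Two members, four buckets, primaries split; each member hosts every bucket.
        Map<Integer, String> primaryOwner = new TreeMap<>(Map.of(0, "A", 1, "B", 2, "A", 3, "B"));
        Set<Integer> all = primaryOwner.keySet();
        Map<String, Set<Integer>> hosted = new TreeMap<>(Map.of("A", all, "B", all));
        System.out.println(primaryTargets(primaryOwner)); // both members
        System.out.println(minimalTargets(hosted, all));  // a single member
    }
}
```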

The onServer and onServers methods are used for "data-unaware" calls (meaning no specific region involved). These are intended mainly for admin-type behavior like:

  • start/stop gateway senders
  • create regions
  • rebalance
  • assign buckets

However, GemFire SHell (GFSH) now handles many of these operations, so a function isn't necessarily needed for them anymore.

A useful onServer use case is the command pattern using a Request/Response API like:

  • define a Request like, for instance, RebalanceCache
  • pass it as an argument to a CommandFunction from the client to a server using onServer
  • execute it on the server
  • return a Response
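The steps above can be sketched in plain Java as follows. This is a model of the pattern only, with no GemFire calls: CommandPattern, Request, Response, and CommandDispatcher are hypothetical names, and the dispatcher stands in for what the body of a server-side CommandFunction might do with the request it receives as its argument. The types are Serializable on the assumption that they would travel between client and server.

```java
import java.io.Serializable;
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

// Plain-Java model of the command pattern; all names here are hypothetical.
final class CommandPattern {

    // Marker type for anything the client can ask a server to do.
    interface Request extends Serializable {}

    static final class RebalanceCache implements Request {
        private static final long serialVersionUID = 1L;
    }

    static final class Response implements Serializable {
        private static final long serialVersionUID = 1L;
        final boolean success;
        final String message;

        Response(boolean success, String message) {
            this.success = success;
            this.message = message;
        }
    }

    // Stand-in for the function body: look up a handler by request type and run it.
    static final class CommandDispatcher {
        private final Map<Class<? extends Request>, Function<Request, Response>> handlers = new HashMap<>();

        <T extends Request> void register(Class<T> type, Function<Request, Response> handler) {
            handlers.put(type, handler);
        }

        Response dispatch(Request request) {
            Function<Request, Response> handler = handlers.get(request.getClass());
            if (handler == null) {
                return new Response(false, "No handler for " + request.getClass().getSimpleName());
            }
            return handler.apply(request);
        }
    }

    public static void main(String[] args) {
        CommandDispatcher dispatcher = new CommandDispatcher();
        // A real handler would start a rebalance on the server; this one only reports.
        dispatcher.register(RebalanceCache.class, r -> new Response(true, "rebalance requested"));
        Response response = dispatcher.dispatch(new RebalanceCache());
        System.out.println(response.success + ": " + response.message);
    }
}
```

Keeping one generic function and adding new Request types means new server-side operations do not each need their own function.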

One use case for invoking a function from another function is member notification. This can be done with a CacheListener on a replicated region too. The basic idea is:

  • invoke a function
  • in the function, invoke another function on all the members notifying them something is about to happen
  • do the thing
  • invoke another function on all the members notifying them something has happened

Be careful when invoking one function from another. Depending on what the second function is doing, it is possible to get into a distributed deadlock situation.
