User Scripting


This feature is experimental

Clusterpoint Server supports user scripting in the Lua language. All standard Lua libraries are enabled and available at all times: string, table, math, debug, io, package and base. No functionality is restricted, so you can use or create additional libraries if needed. Using Lua, you can easily extend Clusterpoint Server functionality.

As noted above, this feature is currently experimental: it may contain bugs, and we do not recommend using it on production servers without extensive testing. The functionality described below is the current state. We are continually adding new functions and hooks and improving existing ones. If you have suggestions about which functions and hooks should be added or changed, feel free to use the forum.

Warning. As the saying goes, "with great power comes great responsibility". Lua can do almost anything, including running endless loops and even crashing Clusterpoint Server. Lua is integrated in a way designed for best performance, but it is not foolproof or crash-proof. If you experience strange issues or crashes while Lua is enabled, the first step is to disable Lua and try to reproduce the issue. If the issue cannot be reproduced with Lua disabled, it is a Lua issue and not a Clusterpoint Server issue. We cannot guarantee anything about the Lua libraries you use. Please see the Known Issues section below for a list of known Lua problems, so you can avoid them.

Basic Principles

Clusterpoint Server is a multi-threaded application. Each API request is processed in a dedicated thread, called a "worker thread". Several worker threads are started during server startup. Each worker thread has its own Lua script context and memory area. This means that a script's global variables are visible only within that context, and that you do not need to worry about thread safety in your script. All Lua code must be in a single file (or have a single "entry point" main script). The file can be either plain text or compiled.
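As a minimal sketch of what the per-thread context implies, consider a script that keeps a counter in a global variable. The names request_count and count_request are hypothetical and are not part of the Clusterpoint API; the point is only that each worker thread would see its own copy of the global.

```lua
-- Global variable: every worker thread runs the script in its own Lua
-- context, so each thread holds an independent copy of this value.
request_count = 0

-- Hypothetical helper (not a Clusterpoint hook or API function).
-- It counts only the calls made inside the current Lua context.
function count_request()
   request_count = request_count + 1
   return request_count
end
```

Under this model the counter reflects the work of one worker thread, not of the whole server. A server-wide total would have to go through the shared storage described under Storage Functions.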

Variables

There are some global variables that can be useful.

Functions

These functions are exported by Clusterpoint Server to Lua for usage in scripts.

Helper Functions

Output Functions

All standard output from a Lua script is discarded. To see your output, you must use the special output functions.

Storage Functions

These functions provide access to global storage that is shared among all worker threads. These functions are thread-safe. All information is lost if the server is stopped. There are two types of storage available: one for integer values and one for string values. The same name can be used for variables of different types. Variable names are case-sensitive.

If you need to store information that is preserved across storage restarts, you must use the PersistentStore functions. These are very similar to the non-persistent functions. All data in the persistent store is automatically flushed to disk when the storage is stopped and loaded back when the storage is started.

The additional parameter "flush" means that the store is flushed to disk immediately after the data is changed. Use this parameter if you need to store critical data. The default value is false. Please note that flushing after each call (passing true) reduces performance.

Control Functions

Clusterpoint Server provides several ways for a user script to return warnings and information to the user, or even abort request processing. Please note that these functions are only a recommendation to the Clusterpoint Server core: calling them does not mean that the request is aborted immediately. The call will be processed at the first opportunity.

Core Functions (Core Calls)

These functions give a user script access to Clusterpoint Server internals. Their main purpose is to provide more information or data when the hook arguments are not enough for the necessary actions. Not all possible arguments are passed to hooks, for performance reasons, because some calls can be very time consuming. Please note that these functions are only available in the specified hooks. When a function is not available, calling it writes an error to the Lua log file and the function returns nil.

Hooks

Hooks are special functions that are used for Lua interactions with Clusterpoint Server. You do not need to define all hooks. A missing hook simply means that you do not need it. Each hook has its own set of Core Functions (Core Calls) available. Hooks can be called in different contexts, which limits which Core Functions are available.


1 These hooks, when used in a cluster environment, may be called more times (and with more documents) than the number of documents returned to the user. This is a current limitation that cannot be avoided. We will try to fix this in future versions.

Custom Function Calls

Clusterpoint Server provides a special API for calling custom user-defined functions. Such a function can have any number of arguments, but all arguments must be strings. The function can return its result as a string.

To call a custom function, send a call-user-function API request.

<?xml version="1.0" encoding="utf-8"?>
<cps:request xmlns:cps="www.clusterpoint.com">
   ...
   <cps:command>call-user-function</cps:command>
   ...
   <cps:content>
      <function>function_name</function>
      <argument>argument_1</argument>
      <argument>argument_2</argument>
      <argument>argument_3</argument>
      ...
   </cps:content>
</cps:request>

This request is very similar to other API requests. The command name must be 'call-user-function', and cps:content must contain the name of the function to call as 'function' and any number of arguments as 'argument'. Arguments are passed to the function in the same order as they appear in the request. If the function has more parameters than arguments given, the remaining parameters will have the value nil. If the function has fewer parameters than arguments given, the function call will fail.

If your function returns a result, it will be returned in the response as 'result'.

<?xml version="1.0" encoding="utf-8"?>
<cps:reply xmlns:cps="www.clusterpoint.com">
   ...
   <cps:command>call-user-function</cps:command>
   ...
   <cps:content>
      <result>function_result</result>
   </cps:content>
</cps:reply>


For example, if you have the Lua function

function myfunc(arg1, arg2)
   ...
   return result_variable
end

you can send the following API request

<?xml version="1.0" encoding="utf-8"?>
<cps:request xmlns:cps="www.clusterpoint.com">
   ...
   <cps:command>call-user-function</cps:command>
   ...
   <cps:content>
      <function>myfunc</function>
      <argument>value1</argument>
   </cps:content>
</cps:request>

Clusterpoint Server will call the function myfunc with arguments arg1 = "value1" and arg2 = nil. The returned value will be sent back in the response

<?xml version="1.0" encoding="utf-8"?>
<cps:reply xmlns:cps="www.clusterpoint.com">
   ...
   <cps:command>call-user-function</cps:command>
   ...
   <cps:content>
      <result>result_value</result>
   </cps:content>
</cps:reply>


You can define and use multiple custom functions for a single storage. Custom functions can perform any task: returning internal script statuses, changing variables that affect hook behaviour, and more.

In a cluster configuration the function will be executed on every cluster node and multiple results will be returned to the user.
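The following is a minimal sketch of a custom function that both reports an internal status and changes a variable that hooks could read. The names set_filter and filter_enabled are hypothetical and chosen only for illustration. The function relies on the rules stated earlier: arguments arrive as strings, an omitted argument is nil, and the result is returned as a string.

```lua
-- Hypothetical flag that hooks in the same script could consult.
filter_enabled = true

-- Hypothetical custom function, callable via call-user-function.
-- "state" is a string ("on" or "off"), or nil when no argument is sent.
function set_filter(state)
   if state == "on" then
      filter_enabled = true
   elseif state == "off" then
      filter_enabled = false
   elseif state ~= nil then
      -- Report bad input as a result string instead of raising an error.
      return "error: expected 'on' or 'off', got '" .. state .. "'"
   end
   -- With no argument the function only reports the current status.
   return "filter is " .. (filter_enabled and "on" or "off")
end
```

Because each worker thread has its own Lua context, a flag changed this way would presumably affect only the thread that handled the request. A setting meant to apply to all threads would need the shared storage functions instead.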

Document Variables

In most cases hooks will have the whole document passed in as a string argument. But if your hook needs only a few predefined tag values, you can use a feature called "document variables" to extract these values.

To enable this feature, you need to modify the storage policy. Add a new policy "variable" with value:

For example, if your document looks like

<document>
  <id>1</id>
  <title>Lua 5.1 Reference Manual</title>
  <language>lua</language>
  <isbn>85-903798-3-3</isbn>
</document>

and you need only the tags <language> and <isbn> in your hook, you can use the following policy

  <language variable=lang>
  <isbn variable=_auto>

In this example your hook will receive the table "lang" => "lua", "isbn" => "85-903798-3-3". Please note that if a document has multiple variables with the same name, only the last one will be available.

Additional rules apply if you are using variables with XML content. When using automatic variable names (policy variable=_auto), the whole XPath to the tag value is used as the variable name. When using predefined variable names, the XPath to nested tags is lost. In the following example (the document is shown merged with the policy)

<document>
  <id>1</id>
  <title>Lua 5.1 Reference Manual</title>
  <language variable=_auto>
    <described>lua</described>
    <written>english</written>
  </language>
  <author variable=author>
    <author1>R. Ierusalimschy</author1>
    <author2>L. H. de Figueiredo</author2>
    <author3>W. Celes</author3>
  </author>
</document>

you will have the table "language/described" => "lua", "language/written" => "english", "author" => "W. Celes". In this example the first two authors are lost because the predefined variable name "author" forces all subtags to share the same variable name, and only the last one remains available.
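The naming rules above can be illustrated with a short simulation in plain Lua. This is not the server's implementation, only a sketch that reproduces the described outcome. The function build_variables and the entry fields path, value and policy are hypothetical names introduced for this illustration.

```lua
-- Illustration only: mimics the naming rules described above.
-- Each entry holds the XPath of a tag value, the value itself, and the
-- "variable" policy inherited from the enclosing tag.
function build_variables(entries)
   local vars = {}
   for _, entry in ipairs(entries) do
      local name = entry.policy
      if name == "_auto" then
         name = entry.path       -- automatic name: the whole XPath
      end
      vars[name] = entry.value   -- a later value overwrites an earlier one
   end
   return vars
end

doc_vars = build_variables({
   { path = "language/described", value = "lua",                 policy = "_auto"  },
   { path = "language/written",   value = "english",             policy = "_auto"  },
   { path = "author/author1",     value = "R. Ierusalimschy",    policy = "author" },
   { path = "author/author2",     value = "L. H. de Figueiredo", policy = "author" },
   { path = "author/author3",     value = "W. Celes",            policy = "author" },
})
```

Here doc_vars ends up with three keys, and doc_vars["author"] holds only the last author, which matches the behaviour described for predefined variable names.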

Important! Document variables are parsed and stored only when the Lua script is enabled; setting the policy is not enough. This means that documents inserted or updated without Lua enabled will not have variables. To enable variables for existing documents, you have to update the policy (if not done already), enable the Lua script, restart the storage and reindex all documents.

Configuration

To enable Lua functionality you need to add a <lua_script> tag containing the filename of the script file to the storage configuration. After the tag is added, the storage must be restarted. If you need to make any changes to your script, storage must ...

string OnAlternative(string xpath, string to, string word, integer count, float h, float idif, float cr) - This hook is called for each alternative word returned by the alternatives command. The passed arguments are the same as those returned by the alternatives command. The hook must return the word as a string. The hook cannot raise errors.
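A minimal sketch of an OnAlternative hook, based only on the signature given above. The source does not explain the meaning of the to, count, h, idif and cr arguments, so this example ignores them. Lowercasing the word is an arbitrary choice made for illustration, and returning an empty string for unexpected input is an assumption about what counts as a safe fallback.

```lua
-- Sketch of the OnAlternative hook: receives one alternative word and
-- must return a word as a string.
function OnAlternative(xpath, to, word, count, h, idif, cr)
   -- The hook cannot raise errors, so guard against unexpected input
   -- before calling string functions (assumed fallback: empty string).
   if type(word) ~= "string" then
      return ""
   end
   return string.lower(word)
end
```

Keeping the body free of calls that can fail is the simplest way to respect the rule that this hook cannot raise errors.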

Log Files

To be more user friendly, all Lua errors, warnings, information and debug messages are written to dedicated log files. Normal storage log files will not contain any information about user scripts. Log files are located in the storage directory and are named lua_script_YY_MM_DD.log. If you have any problems running your script, the first step is to check the log files.

Script Examples

Here are some script examples that are fully tested and working.

Known Issues

We have found some issues with Lua libraries, which are listed below so you can avoid them. If you find any other issues, please let us know, so we can add them to this list.
