A catalog is JSON data that manages the configuration of a Droonga cluster. A Droonga cluster consists of one or more datasets, and a dataset consists of other portions. They all must be explicitly described in a catalog, which must be shared with all the hosts in the cluster.
This version of the catalog will be available from Droonga 1.0.0.
{
"version": <Version number>,
"effectiveDate": "<Effective date>",
"datasets": {
"<Name of the dataset 1>": {
"nWorkers": <Number of workers>,
"plugins": [
"Name of the plugin 1",
...
],
"schema": {
"<Name of the table 1>": {
"type" : <"Array", "Hash", "PatriciaTrie" or "DoubleArrayTrie">,
"keyType" : "<Type of the primary key>",
"tokenizer" : "<Tokenizer>",
"normalizer" : "<Normalizer>",
"columns" : {
"<Name of the column 1>": {
"type" : <"Scalar", "Vector" or "Index">,
"valueType" : "<Type of the value>",
"vectorOptions": {
"weight" : <Weight>
},
"indexOptions" : {
"section" : <Section>,
"weight" : <Weight>,
"position" : <Position>,
"sources" : [
"<Name of a column to be indexed>",
...
]
}
},
"<Name of the column 2>": { ... },
...
}
},
"<Name of the table 2>": { ... },
...
},
"fact": "<Name of the fact table>",
"replicas": [
{
"dimension": "<Name of the dimension column>",
"slicer": "<Name of the slicer function>",
"slices": [
{
"label": "<Label of the slice>",
"volume": {
"address": "<Address string of the volume>"
}
},
...
]
},
...
]
},
"<Name of the dataset 2>": { ... },
...
}
}
version
2. (Specification written in this page is valid only when this value is 2.)

effectiveDate

datasets
dataset definition.

nWorkers
dataset and volume definition.

A version 2 catalog effective after 2013-09-01T00:00:00Z, with no datasets:
{
"version": 2,
"effectiveDate": "2013-09-01T00:00:00Z",
"datasets": {
}
}
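As a rough illustration of how a consumer might read such a catalog, the following Python sketch parses the JSON and checks the two top-level values described above. The helper name `load_catalog` is hypothetical, and the timestamp handling assumes `effectiveDate` is always written in the ISO 8601 style used in the example; neither is part of Droonga itself.

```python
import json
from datetime import datetime

SUPPORTED_VERSION = 2  # this page only describes version 2 catalogs


def load_catalog(text):
    """Parse catalog JSON and run two basic sanity checks (hypothetical helper)."""
    catalog = json.loads(text)
    if catalog.get("version") != SUPPORTED_VERSION:
        raise ValueError("unsupported catalog version: %r" % (catalog.get("version"),))
    # Assumption: effectiveDate is an ISO 8601 style timestamp with a time zone,
    # as in the example above. "Z" is rewritten to "+00:00" so that older
    # Python versions of datetime.fromisoformat() accept it.
    effective = datetime.fromisoformat(catalog["effectiveDate"].replace("Z", "+00:00"))
    return catalog, effective


catalog, effective = load_catalog("""
{
  "version": 2,
  "effectiveDate": "2013-09-01T00:00:00Z",
  "datasets": {}
}
""")
```

A catalog with any other `version` value is rejected early, since the rest of this page would not apply to it.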
plugins
dataset and volume definition.

schema
table definition.
dataset and volume definition.

fact
dataset is stored as more than one slice, one fact table must be selected from tables defined in schema parameter.
dataset and volume definition.

replicas
volume definitions.

A dataset with 4 workers per database instance, with plugins groonga, crud and search:
{
"nWorkers": 4,
"plugins": ["groonga", "crud", "search"],
"schema": {
},
"replicas": [
]
}
type
"Array": for tables which have no keys.
"Hash": for hash tables.
"PatriciaTrie": for patricia trie tables.
"DoubleArrayTrie": for double array trie tables.
"Hash"

keyType
type is "Array".
"Integer": 64bit signed integer.
"Float": 64bit floating-point number.
"Time": Time value with microseconds resolution.
"ShortText": Text value up to 4095 bytes length.
"TokyoGeoPoint": Tokyo Datum based geometric point value.
"WGS84GeoPoint": WGS84 based geometric point value.

tokenizer
keyType is "ShortText".
"TokenDelimit"
"TokenUnigram"
"TokenBigram"
"TokenTrigram"
"TokenBigramSplitSymbol"
"TokenBigramSplitSymbolAlpha"
"TokenBigramSplitSymbolAlphaDigit"
"TokenBigramIgnoreBlank"
"TokenBigramIgnoreBlankSplitSymbol"
"TokenBigramIgnoreBlankSplitSymbolAlpha"
"TokenBigramIgnoreBlankSplitSymbolAlphaDigit"
"TokenDelimitNull"
normalizer
keyType is "ShortText".
"NormalizerAuto"
"NormalizerNFKC51"
columns
column definition.

A Hash table whose key is ShortText type, with no columns:
{
"type": "Hash",
"keyType": "ShortText",
"columns": {
}
}
A PatriciaTrie table with TokenBigram tokenizer and NormalizerAuto normalizer, with no columns:
{
"type": "PatriciaTrie",
"keyType": "ShortText",
"tokenizer": "TokenBigram",
"normalizer": "NormalizerAuto",
"columns": {
}
}
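The value lists above lend themselves to a simple check before a catalog is deployed. The Python sketch below is one possible way to do that; `check_table` is a hypothetical helper, not a Droonga API. It assumes that `tokenizer` and `normalizer` are only meaningful when `keyType` is "ShortText", and it does not validate tokenizer names.

```python
TABLE_TYPES = {"Array", "Hash", "PatriciaTrie", "DoubleArrayTrie"}
KEY_TYPES = {"Integer", "Float", "Time", "ShortText",
             "TokyoGeoPoint", "WGS84GeoPoint"}
NORMALIZERS = {"NormalizerAuto", "NormalizerNFKC51"}


def check_table(table):
    """Return a list of problems found in one table definition (hypothetical helper)."""
    problems = []
    if "type" in table and table["type"] not in TABLE_TYPES:
        problems.append("unknown type: %r" % (table["type"],))
    if "keyType" in table and table["keyType"] not in KEY_TYPES:
        problems.append("unknown keyType: %r" % (table["keyType"],))
    if "normalizer" in table and table["normalizer"] not in NORMALIZERS:
        problems.append("unknown normalizer: %r" % (table["normalizer"],))
    # Assumption: tokenizer and normalizer only apply to "ShortText" keys.
    for name in ("tokenizer", "normalizer"):
        if name in table and table.get("keyType") != "ShortText":
            problems.append("%s requires keyType ShortText" % name)
    return problems


problems = check_table({
    "type": "PatriciaTrie",
    "keyType": "ShortText",
    "tokenizer": "TokenBigram",
    "normalizer": "NormalizerAuto",
    "columns": {},
})
```

Returning a list of problems rather than raising on the first one makes it easier to report every issue in a large schema at once.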
An object with the following key/value pairs.
type
"Scalar": A single value.
"Vector": A list of values.
"Index": A set of unique values with additional properties respectively. Properties can be specified in indexOptions.
"Scalar"

valueType
"Bool": true or false.
"Integer": 64bit signed integer.
"Float": 64bit floating-point number.
"Time": Time value with microseconds resolution.
"ShortText": Text value up to 4,095 bytes length.
"Text": Text value up to 2,147,483,647 bytes length.
"TokyoGeoPoint": Tokyo Datum based geometric point value.
"WGS84GeoPoint": WGS84 based geometric point value.

vectorOptions
vectorOptions definition
{} (Void object).

indexOptions
indexOptions definition
{} (Void object).

A scalar column to store ShortText values:
{
"type": "Scalar",
"valueType": "ShortText"
}
A vector column to store ShortText values with weight:
{
"type": "Vector",
"valueType": "ShortText",
"vectorOptions": {
"weight": true
}
}
A column to index the address column on the Store table:
{
"type": "Index",
"valueType": "Store",
"indexOptions": {
"sources": [
"address"
]
}
}
weight
(true or false).
false.

Store the weight data.
{
"weight": true
}
section
(true or false).
false.

weight
(true or false).
false.

position
(true or false).
false.

sources
valueType.

Store the section data, the weight data and the position data.
Index name and address on the referencing table.
{
"section": true,
"weight": true,
"position": true,
"sources": [
"name",
"address"
]
}
slices. When a volume consists of a single database instance, the address parameter must be assigned and the other parameters must not be assigned. Otherwise, dimension, slicer and slices are required, and vice versa.

address
${host_name}:${port_number}/${tag}.${name}
host_name: The name of the host that has the database instance.
port_number: The port number for the database instance.
tag: The tag of the database instance. The tag name can’t include ".". You can use multiple tags for one host name and port number pair.
name: The name of the database instance. You can use multiple names for one host name, port number and tag triplet.

dimension
columns parameter of the fact table. See dimension.
"_key"
dataset and volume definition.

slicer
"hash"
dataset and volume definition.

To define a volume that consists of a collection of slices, you must decide how records are distributed into slices. The slicer function specified as slicer and the column (or key) specified as dimension, which is the input for the slicer function, define that.
Slicers are categorized into three types:
hash
{High, Middle, Low}
.
Slicers of this type are:
slices
slice definitions.

A volume at “localhost:24224/volume.000”:
{
"address": "localhost:24224/volume.000"
}
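To make the address format `${host_name}:${port_number}/${tag}.${name}` more concrete, here is a small Python sketch that splits an address string into its four parts. `parse_address` and `Address` are hypothetical names used only for illustration. Because the tag cannot include ".", the sketch splits tag and name at the first "." after the slash.

```python
import re
from collections import namedtuple

Address = namedtuple("Address", "host_name port_number tag name")

# host_name ":" port_number "/" tag "." name
# The tag must not contain ".", so everything after the first "." is the name.
ADDRESS_PATTERN = re.compile(
    r"^(?P<host_name>[^:/]+):(?P<port_number>\d+)/(?P<tag>[^./]+)\.(?P<name>.+)$"
)


def parse_address(address):
    """Split a volume address string into its components (hypothetical helper)."""
    match = ADDRESS_PATTERN.match(address)
    if match is None:
        raise ValueError("malformed volume address: %r" % (address,))
    return Address(
        host_name=match.group("host_name"),
        port_number=int(match.group("port_number")),
        tag=match.group("tag"),
        name=match.group("name"),
    )


parsed = parse_address("localhost:24224/volume.000")
```

For the address in the example above, this yields the host name `localhost`, the port number `24224`, the tag `volume` and the name `000`.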
A volume that consists of three slices, with records distributed according to hash, which is a ratio-scaled slicer function, of _key:
{
"dimension": "_key",
"slicer": "hash",
"slices": [
{
"volume": {
"address": "localhost:24224/volume.000"
}
},
{
"volume": {
"address": "localhost:24224/volume.001"
}
},
{
"volume": {
"address": "localhost:24224/volume.002"
}
}
]
}
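The idea of distributing records over slices with a ratio-scaled slicer can be sketched in Python as follows. This is only an illustration under stated assumptions: `zlib.crc32` stands in for whatever hash function Droonga actually uses, `pick_slice` is a hypothetical helper, weights are assumed to be non-negative integers, and a slice without a `weight` is assumed to count as 1.

```python
import zlib


def pick_slice(key, slices):
    """Pick the slice for a record key, proportionally to slice weights."""
    # Assumption: a missing weight counts as 1.
    weights = [slice_.get("weight", 1) for slice_ in slices]
    total = sum(weights)
    if total <= 0:
        raise ValueError("at least one slice with a positive weight is required")
    # Assumption: crc32 is a stand-in for the real "hash" slicer function.
    point = zlib.crc32(key.encode("utf-8")) % total
    for slice_, weight in zip(slices, weights):
        if point < weight:
            return slice_
        point -= weight
    raise AssertionError("unreachable: point is always below the total weight")


slices = [
    {"volume": {"address": "localhost:24224/volume.000"}},
    {"volume": {"address": "localhost:24224/volume.001"}},
    {"volume": {"address": "localhost:24224/volume.002"}},
]
chosen = pick_slice("record-1", slices)
```

The same key always maps to the same slice as long as the list of slices and their weights stay unchanged, which is the property a slicer needs for lookups to find previously stored records.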
weight
slicer is ratio-scaled.
1.

label
label is allowed in slices.

boundary
slicer’s return value. Only available when the slicer is ordinal-scaled.
boundary is allowed in slices.

volume
An object which is a volume definition.
Slice for a ratio-scaled slicer, with the weight 1:
{
"weight": 1,
"volume": {
}
}
Slice for a nominal-scaled slicer, with the label "1":
{
"label": "1",
"volume": {
}
}
Slice for an ordinal-scaled slicer, with the boundary 100:
{
"boundary": 100,
"volume": {
}
}
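Because each slice holds a volume, and a volume is either a single database instance or a collection of slices, volume definitions form a tree. The Python sketch below walks such a tree and returns the addresses of all database instances, enforcing the rule that `address` excludes `dimension`, `slicer` and `slices`, and that otherwise all three are required. `collect_addresses` is a hypothetical helper, not part of Droonga.

```python
SLICED_KEYS = {"dimension", "slicer", "slices"}


def collect_addresses(volume):
    """Return the addresses of every database instance under a volume."""
    present = SLICED_KEYS & set(volume)
    if "address" in volume:
        # A single database instance: no slicing parameters are allowed.
        if present:
            raise ValueError(
                "address cannot be combined with: %s" % ", ".join(sorted(present))
            )
        return [volume["address"]]
    # A collection of slices: all three slicing parameters are required.
    missing = SLICED_KEYS - present
    if missing:
        raise ValueError("missing parameters: %s" % ", ".join(sorted(missing)))
    addresses = []
    for slice_ in volume["slices"]:
        addresses.extend(collect_addresses(slice_["volume"]))
    return addresses


volume = {
    "dimension": "_key",
    "slicer": "hash",
    "slices": [
        {"volume": {"address": "localhost:24224/volume.000"}},
        {"volume": {"address": "localhost:24224/volume.001"}},
        {"volume": {"address": "localhost:24224/volume.002"}},
    ],
}
addresses = collect_addresses(volume)
```

A recursive walk keeps the check valid for nested volumes, where a slice's volume is itself made of slices.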
See the catalog of basic tutorial.