A catalog is JSON data that manages the configuration of a Droonga cluster.
A Droonga cluster consists of one or more datasets, and each dataset consists of smaller parts such as replicas, slices and volumes. All of them must be explicitly described in the catalog, and the catalog must be shared by all the hosts in the cluster.
This version of the catalog will be available from Droonga 1.0.0.
{
  "version": <Version number>,
  "effectiveDate": "<Effective date>",
  "datasets": {
    "<Name of the dataset 1>": {
      "nWorkers": <Number of workers>,
      "plugins": [
        "<Name of the plugin 1>",
        ...
      ],
      "schema": {
        "<Name of the table 1>": {
          "type": <"Array", "Hash", "PatriciaTrie" or "DoubleArrayTrie">,
          "keyType": "<Type of the primary key>",
          "tokenizer": "<Tokenizer>",
          "normalizer": "<Normalizer>",
          "columns": {
            "<Name of the column 1>": {
              "type": <"Scalar", "Vector" or "Index">,
              "valueType": "<Type of the value>",
              "vectorOptions": {
                "weight": <Weight>
              },
              "indexOptions": {
                "section": <Section>,
                "weight": <Weight>,
                "position": <Position>,
                "sources": [
                  "<Name of a column to be indexed>",
                  ...
                ]
              }
            },
            "<Name of the column 2>": { ... },
            ...
          }
        },
        "<Name of the table 2>": { ... },
        ...
      },
      "fact": "<Name of the fact table>",
      "replicas": [
        {
          "dimension": "<Name of the dimension column>",
          "slicer": "<Name of the slicer function>",
          "slices": [
            {
              "label": "<Label of the slice>",
              "volume": {
                "address": "<Address string of the volume>"
              }
            },
            ...
          ]
        },
        ...
      ]
    },
    "<Name of the dataset 2>": { ... },
    ...
  }
}
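The catalog is an object with the following key/value pairs:

version: 2. (The specification on this page is valid only when this value is 2.)
effectiveDate: The date and time from which the catalog is effective.
datasets: An object keyed by dataset name; each value is a dataset definition.
nWorkers: The number of workers. See dataset and volume definition.

A version 2 catalog effective after 2013-09-01T00:00:00Z, with no datasets:

{
  "version": 2,
  "effectiveDate": "2013-09-01T00:00:00Z",
  "datasets": {
  }
}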
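As a rough illustration of how a consumer might read these top-level keys, the following Python sketch loads the example catalog, rejects anything that is not version 2, and checks whether the catalog is effective at a given time. The helper names `load_catalog` and `is_effective` are hypothetical and not part of Droonga. The sketch also assumes that `effectiveDate` is a UTC timestamp written exactly like the example above, and that the effective moment itself counts as effective.

```python
import json
from datetime import datetime, timezone

CATALOG_JSON = """
{
  "version": 2,
  "effectiveDate": "2013-09-01T00:00:00Z",
  "datasets": {}
}
"""


def load_catalog(text):
    """Parse catalog JSON and reject anything that is not version 2."""
    catalog = json.loads(text)
    if catalog.get("version") != 2:
        raise ValueError("this page only describes version 2 catalogs")
    return catalog


def is_effective(catalog, now):
    """Return True when `now` is at or after the catalog's effectiveDate.

    Assumption: the date is a UTC timestamp in the form
    YYYY-MM-DDTHH:MM:SSZ, as in the example on this page.
    """
    effective = datetime.strptime(
        catalog["effectiveDate"], "%Y-%m-%dT%H:%M:%SZ"
    ).replace(tzinfo=timezone.utc)
    return now >= effective


catalog = load_catalog(CATALOG_JSON)
print(is_effective(catalog, datetime(2014, 1, 1, tzinfo=timezone.utc)))
```

A real loader would probably need to accept other time zone notations as well; this sketch handles only the literal form shown in the example.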
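A dataset definition has the following parameters:

plugins: The names of the plugins, as an array of strings. See dataset and volume definition.
schema: An object keyed by table name; each value is a table definition. See dataset and volume definition.
fact: The name of the fact table. When a dataset is stored as more than one slice, one fact table must be selected from the tables defined in the schema parameter. See dataset and volume definition.
replicas: An array of volume definitions.

A dataset with 4 workers per database instance, with the plugins groonga, crud and search:

{
  "nWorkers": 4,
  "plugins": ["groonga", "crud", "search"],
  "schema": {
  },
  "replicas": [
  ]
}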
A table definition has the following parameters:

type: One of the following values. Default: "Hash".
  "Array": for tables which have no keys.
  "Hash": for hash tables.
  "PatriciaTrie": for patricia trie tables.
  "DoubleArrayTrie": for double array trie tables.
keyType: The type of the primary key. Must not be assigned when type is "Array". One of the following values:
  "Integer": 64bit signed integer.
  "Float": 64bit floating-point number.
  "Time": Time value with microseconds resolution.
  "ShortText": Text value up to 4,095 bytes long.
  "TokyoGeoPoint": Tokyo Datum based geometric point value.
  "WGS84GeoPoint": WGS84 based geometric point value.
tokenizer: Only available when keyType is "ShortText". One of the following values:
  "TokenDelimit"
  "TokenUnigram"
  "TokenBigram"
  "TokenTrigram"
  "TokenBigramSplitSymbol"
  "TokenBigramSplitSymbolAlpha"
  "TokenBigramSplitSymbolAlphaDigit"
  "TokenBigramIgnoreBlank"
  "TokenBigramIgnoreBlankSplitSymbol"
  "TokenBigramIgnoreBlankSplitSymbolAlpha"
  "TokenBigramIgnoreBlankSplitSymbolAlphaDigit"
  "TokenDelimitNull"
normalizer: Only available when keyType is "ShortText". One of the following values:
  "NormalizerAuto"
  "NormalizerNFKC51"
columns: An object keyed by column name; each value is a column definition.

A Hash table whose key is ShortText type, with no columns:

{
  "type": "Hash",
  "keyType": "ShortText",
  "columns": {
  }
}

A PatriciaTrie table with TokenBigram tokenizer and NormalizerAuto normalizer, with no columns:

{
  "type": "PatriciaTrie",
  "keyType": "ShortText",
  "tokenizer": "TokenBigram",
  "normalizer": "NormalizerAuto",
  "columns": {
  }
}
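The constraints between type, keyType, tokenizer and normalizer can be checked mechanically. The Python sketch below is one possible reading of them, not Droonga's own validation: `check_table` is a hypothetical helper, and it assumes that "Hash" is the default type, that keyType is rejected for "Array" tables, and that tokenizer and normalizer are accepted only when keyType is "ShortText".

```python
TABLE_TYPES = {"Array", "Hash", "PatriciaTrie", "DoubleArrayTrie"}


def check_table(table):
    """Return a list of problems found in a table definition (empty if none)."""
    problems = []
    table_type = table.get("type", "Hash")  # assumed default: "Hash"
    if table_type not in TABLE_TYPES:
        problems.append("unknown type: %r" % table_type)
    if table_type == "Array" and "keyType" in table:
        problems.append("keyType must not be assigned when type is Array")
    for name in ("tokenizer", "normalizer"):
        # Both options are assumed to depend on a ShortText key.
        if name in table and table.get("keyType") != "ShortText":
            problems.append("%s requires keyType ShortText" % name)
    return problems


lexicon = {
    "type": "PatriciaTrie",
    "keyType": "ShortText",
    "tokenizer": "TokenBigram",
    "normalizer": "NormalizerAuto",
    "columns": {},
}
print(check_table(lexicon))
print(check_table({"type": "Array", "keyType": "ShortText"}))
```

Returning a list of problems rather than raising on the first one makes it easier to report every issue in a catalog at once.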
A column definition is an object with the following key/value pairs:

type: One of the following values. Default: "Scalar".
  "Scalar": A single value.
  "Vector": A list of values.
  "Index": A set of unique values, each with additional properties. The properties can be specified in indexOptions.
valueType: The type of the value. The following data types are listed:
  "Bool": true or false.
  "Integer": 64bit signed integer.
  "Float": 64bit floating-point number.
  "Time": Time value with microseconds resolution.
  "ShortText": Text value up to 4,095 bytes long.
  "Text": Text value up to 2,147,483,647 bytes long.
  "TokyoGeoPoint": Tokyo Datum based geometric point value.
  "WGS84GeoPoint": WGS84 based geometric point value.
vectorOptions: A vectorOptions definition. Default: {} (empty object).
indexOptions: An indexOptions definition. Default: {} (empty object).

A scalar column to store ShortText values:

{
  "type": "Scalar",
  "valueType": "ShortText"
}

A vector column to store ShortText values with weight:

{
  "type": "Vector",
  "valueType": "ShortText",
  "vectorOptions": {
    "weight": true
  }
}

An index column that indexes the address column of the Store table:

{
  "type": "Index",
  "valueType": "Store",
  "indexOptions": {
    "sources": [
      "address"
    ]
  }
}
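
A vectorOptions definition has the following parameter:

weight: Whether to store the weight data. A boolean value (true or false). Default: false.

Store the weight data:

{
  "weight": true
}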
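
An indexOptions definition has the following parameters:

section: Whether to store the section data. A boolean value (true or false). Default: false.
weight: Whether to store the weight data. A boolean value (true or false). Default: false.
position: Whether to store the position data. A boolean value (true or false). Default: false.
sources: The names of the columns to be indexed, on the table given as valueType.

Store the section data, the weight data and the position data, and index name and address on the referencing table:

{
  "section": true,
  "weight": true,
  "position": true,
  "sources": [
    "name",
    "address"
  ]
}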

A volume consists of either a single database instance or a collection of slices. When a volume consists of a single database instance, the address parameter must be assigned and the other parameters must not be assigned. Otherwise, dimension, slicer and slices are required, and address must not be assigned.

A volume definition has the following parameters:

address: The location of the database instance, in the following format:
  ${host_name}:${port_number}/${tag}.${name}
  host_name: The name of the host that has the database instance.
  port_number: The port number for the database instance.
  tag: The tag of the database instance. The tag name can't include ".". You can use multiple tags for one host name and port number pair.
  name: The name of the database instance. You can use multiple names for one host name, port number and tag triplet.
dimension: "_key" or the name of a column defined in the columns parameter of the fact table. Default: "_key". See dataset and volume definition.
slicer: The name of the slicer function. Default: "hash". See dataset and volume definition.
slices: An array of slice definitions.

To define a volume that consists of a collection of slices, you must decide how records are distributed into slices.
This is determined by the slicer function specified as slicer and by the column (or key) specified as dimension, which is the input to the slicer function.

Slicers are categorized into three types:

  ratio-scaled: hash is a slicer of this type.
  ordinal-scaled: for values that have a ranking, for example an element of {High, Middle, Low}.
  nominal-scaled

A volume at "localhost:24224/volume.000":

{
  "address": "localhost:24224/volume.000"
}
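To make the address format concrete, here is a small Python sketch that splits an address string into its four parts. `parse_address` and `VolumeAddress` are hypothetical names used only for illustration. The sketch relies on the rule that a tag cannot contain ".", so the first "." after the "/" separates the tag from the name, and it assumes the host name itself contains no ":".

```python
from collections import namedtuple

VolumeAddress = namedtuple("VolumeAddress", "host_name port_number tag name")


def parse_address(address):
    """Split '${host_name}:${port_number}/${tag}.${name}' into its parts."""
    host_name, colon, rest = address.partition(":")
    port_text, slash, path = rest.partition("/")
    # A tag cannot include ".", so the first "." ends the tag.
    tag, dot, name = path.partition(".")
    if not (colon and slash and dot and host_name and tag and name):
        raise ValueError("malformed address: %r" % address)
    if not (port_text.isascii() and port_text.isdigit()):
        raise ValueError("malformed port number: %r" % port_text)
    return VolumeAddress(host_name, int(port_text), tag, name)


print(parse_address("localhost:24224/volume.000"))
```

Because only the first "." is treated as a separator, a name such as "000.backup" would be kept whole; whether Droonga allows "." in names is not stated here.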

A volume that consists of three slices, with records distributed according to hash (a ratio-scaled slicer function) of _key:

{
  "dimension": "_key",
  "slicer": "hash",
  "slices": [
    {
      "volume": {
        "address": "localhost:24224/volume.000"
      }
    },
    {
      "volume": {
        "address": "localhost:24224/volume.001"
      }
    },
    {
      "volume": {
        "address": "localhost:24224/volume.002"
      }
    }
  ]
}
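
The rule that a volume is either a single database instance (address only) or a collection of slices (dimension, slicer and slices) can be expressed as a short check. The Python sketch below is an illustration under that reading; `check_volume` is a hypothetical helper, and it additionally assumes that every slice carries a nested volume definition, which is checked recursively.

```python
SLICED_KEYS = ("dimension", "slicer", "slices")


def check_volume(volume):
    """Return a list of problems found in a volume definition (empty if none)."""
    problems = []
    if "address" in volume:
        extra = [key for key in SLICED_KEYS if key in volume]
        if extra:
            problems.append("address cannot be combined with: " + ", ".join(extra))
        return problems
    missing = [key for key in SLICED_KEYS if key not in volume]
    if missing:
        problems.append("missing: " + ", ".join(missing))
    for index, slice_def in enumerate(volume.get("slices", [])):
        nested = slice_def.get("volume")
        if nested is None:
            problems.append("slices[%d] has no volume" % index)
        else:
            # Each slice holds a volume definition of its own.
            for problem in check_volume(nested):
                problems.append("slices[%d]: %s" % (index, problem))
    return problems


sliced = {
    "dimension": "_key",
    "slicer": "hash",
    "slices": [
        {"volume": {"address": "localhost:24224/volume.000"}},
        {"volume": {"address": "localhost:24224/volume.001"}},
    ],
}
print(check_volume(sliced))
print(check_volume({"address": "localhost:24224/volume.000", "slicer": "hash"}))
```

The check follows the text literally and requires dimension and slicer even though they appear to have defaults; a more lenient checker could fill those in instead.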

A slice definition has the following parameters:

weight: Only available when the slicer is ratio-scaled. Default: 1.
label: … label is allowed in slices.
boundary: … slicer's return value. Only available when the slicer is ordinal-scaled. … boundary is allowed in slices.
volume: An object which is a volume definition.

Slice for a ratio-scaled slicer, with the weight 1:

{
  "weight": 1,
  "volume": {
  }
}

Slice for a nominal-scaled slicer, with the label "1":

{
  "label": "1",
  "volume": {
  }
}

Slice for an ordinal-scaled slicer, with the boundary 100:

{
  "boundary": 100,
  "volume": {
  }
}
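
The three slice examples suggest that each kind of slicer goes with its own slice parameter: weight for ratio-scaled, label for nominal-scaled, and boundary for ordinal-scaled. The Python sketch below encodes that pairing as an assumption. The names `check_slice` and `effective_weight` are hypothetical, and the pairing of label with nominal-scaled slicers is inferred from the examples rather than stated outright.

```python
# Which slice parameter goes with which kind of slicer (assumed pairing).
PARAMETER_FOR_SCALE = {
    "ratio": "weight",
    "nominal": "label",
    "ordinal": "boundary",
}


def check_slice(slice_def, scale):
    """Return a list of problems for a slice used with a slicer of `scale`."""
    allowed = PARAMETER_FOR_SCALE[scale]
    problems = []
    for parameter in PARAMETER_FOR_SCALE.values():
        if parameter in slice_def and parameter != allowed:
            problems.append(
                "%s is not available for a %s-scaled slicer" % (parameter, scale)
            )
    if "volume" not in slice_def:
        problems.append("volume is missing")
    return problems


def effective_weight(slice_def):
    """Weight of a slice for a ratio-scaled slicer; 1 when not given."""
    return slice_def.get("weight", 1)


print(check_slice({"weight": 1, "volume": {}}, "ratio"))
print(check_slice({"boundary": 100, "volume": {}}, "ratio"))
print(effective_weight({"volume": {}}))
```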

See the catalog of the basic tutorial.