Skip to content
Snippets Groups Projects
Commit 06abd78d authored by Luca Morandini's avatar Luca Morandini
Browse files

- Improved child/parent relationship example

- Added more notes
parent 579a8a86
Branches
No related tags found
No related merge requests found
......@@ -15,7 +15,7 @@
Before accessing Kubernetes services, an SSH tunnel to the bastion node has to be opened in a different shell and kept open.
In addition, the `openrc` file has to be source and the kubeconfig file put under the `~/.kube` directory (see the READM in
the `installaiton` folder for more details).
the `installation` folder for more details).
```shell
......@@ -25,18 +25,16 @@ To access services on the cluster, one has to use the `port-forward` command of
kubectl port-forward service/elasticsearch-master -n elastic 9200:9200
```
> Note: This command will start the port forwarding so please keep this terminal open and do not close it.
> Note: The port forwarding can be stopped by pressing `Ctrl + C` and closing the terminal window. The port forwarding is only active when the terminal window is open. Once it is stopped, you need to re-run the command to start the port forwarding again.
To access the Kibana user interface, one has to use the `port-forward` command of `kubectl` (another terminal window):
```shell
kubectl port-forward service/kibana-kibana -n elastic 5601:5601
```
> Note: This command will start the port forwarding so please keep this terminal open and do not close it.
> Note: Thess commands will start the port forwarding so please keep the terminals open and do not close it.
> Note: The port forwarding can be stopped by pressing `Ctrl + C` and closing the terminal window. The port forwarding is only active when the terminal window is open. Once it is stopped, you need to re-run the command to start the port forwarding again.
Test the ElasticSearch API:
```shell
......@@ -60,7 +58,7 @@ curl -XPUT -k 'https://127.0.0.1:9200/students'\
"mappings": {
"properties": {
"id": {
"type": "text"
"type": "keyword"
},
"name": {
"type": "text"
......@@ -77,10 +75,10 @@ curl -XPUT -k 'https://127.0.0.1:9200/students'\
--user 'elastic:elastic' | jq '.'
```
The index should now be shown in the Kibana dashboard ('Management / Index management').
The index should now be shown in the Kibana dashboard ("Management / Index management"). To be able to see indexes in
Kibana, they have to be added as a data view.
Let's add some documents to the newly created index:
```shell
curl -XPUT -k "https://127.0.0.1:9200/students/_doc/1234567"\
--header 'Content-Type: application/json'\
......@@ -118,14 +116,17 @@ curl -XGET -k "https://127.0.0.1:9200/students/_search"\
--user 'elastic:elastic' | jq '.'
```
(The use of a body in a GET request is not ReSTful, and hardly supported by HTTP... but it is allowed by ElasticSearch.)
(Note that, since "course" is a "text" field, it is case-insensitive and regexp can be used.)
## Create a data view from Kibana
Go to Kibana (Management / Kibana / Data views) , create a data view named "students" with pattern "student\*", and check that the documents have been
Go to Kibana (Management / Kibana / Data views) , create a data view named "students" with pattern "student*", and check that the documents have been
added to the index by going to "Analysis / Discover".
Now Kibana can be used to test search queries or to have a look at data.
In Kibana, go to "Discover" and select the "students" data view. Create the KQL expressione `course : "cloud"`,
two documents should be returned (the expression language used in the Discover tab of Kibana is neither SQL nor Query DSL, but KQL).
## ElasticSearch parent-child join
......@@ -135,8 +136,7 @@ with some limitations.
Let's first delete the `students` index:
```shell
curl -XDELETE -k 'https://127.0.0.1:9200/students'\
--user 'elastic:elastic' | jq '.'
curl -XDELETE -k 'https://127.0.0.1:9200/students' --user 'elastic:elastic' | jq '.'
```
Then let's re-create the database with a mapping that defines the parent-child relationship:
......@@ -153,9 +153,6 @@ curl -XPUT -k 'https://127.0.0.1:9200/students' \
},
"mappings": {
"properties": {
"uomid": {
"type": "text"
},
"name": {
"type": "text"
},
......@@ -176,11 +173,10 @@ curl -XPUT -k 'https://127.0.0.1:9200/students' \
}' \
--user 'elastic:elastic' | jq '.'
```
Let's insert some data about courses and students that use the parent-child relationship:
```shell
curl -XPUT -k "https://127.0.0.1:9200/students/_doc/1?routing=comp90024"\
curl -XPUT -k "https://127.0.0.1:9200/students/_doc/comp90024?routing=comp"\
--header 'Content-Type: application/json'\
--data '{
"name": "COMP90024",
......@@ -191,34 +187,34 @@ curl -XPUT -k "https://127.0.0.1:9200/students/_doc/1?routing=comp90024"\
}'\
--user 'elastic:elastic' | jq '.'
curl -XPUT -k "https://127.0.0.1:9200/students/_doc/2?routing=comp90024"\
curl -XPUT -k "https://127.0.0.1:9200/students/_doc/1234567?routing=comp"\
--header 'Content-Type: application/json'\
--data '{
"name": "John Smith",
"uomid": "1234567",
"mark": 80,
"relation_type": {
"name": "student",
"parent": 1
"parent": "comp90024"
}
}'\
--user 'elastic:elastic' | jq '.'
curl -XPUT -k "https://127.0.0.1:9200/students/_doc/3?routing=comp90024"\
curl -XPUT -k "https://127.0.0.1:9200/students/_doc/0123456?routing=comp"\
--header 'Content-Type: application/json'\
--data '{
"name": "Jane Doe",
"uomid": "0123456",
"mark": 90,
"relation_type": {
"name": "student",
"parent": 1
"parent": "comp90024"
}
}'\
--user 'elastic:elastic' | jq '.'
```
Example of a query that returns all students of a given course that have a mark greater than 80:
NOTE: the "routing" parameter has to be added so that all children are in the same shard as their parent.
Example of a query that returns all students of course whose description contains "computing" hat have a mark greater than 80:
```shell
curl -XGET -k "https://127.0.0.1:9200/students/_search"\
......@@ -239,7 +235,7 @@ curl -XGET -k "https://127.0.0.1:9200/students/_search"\
"parent_type": "course",
"query": {
"match": {
"name": "comp90024"
"coursedescription": "computing"
}
}
}
......@@ -255,7 +251,7 @@ curl -XGET -k "https://127.0.0.1:9200/students/_search"\
### Data setup
Create an ElasticSearch Index to hold temperatures (it uses dynamic mapping):
Let's create an ElasticSearch Index to hold temperatures (it uses dynamic mapping):
```shell
curl -XPUT -k 'https://127.0.0.1:9200/temperatures'\
--header 'Content-Type: application/json'\
......@@ -269,7 +265,7 @@ curl -XPUT -k 'https://127.0.0.1:9200/temperatures'\
"mappings": {
"properties": {
"date": {
"type": "keyword"
"type": "date"
},
"temperature": {
"type": "dense_vector",
......@@ -296,8 +292,6 @@ Load temperatures data as vectors in the index (temperatures are expressed in Ke
)
```
Add the index "temperatures" as a Kibana data view.
### Vector search
Search for the most similar temperature vector to a vector of typical Vancouver temperatures in the month of January (expressed in Kelvin).
......@@ -309,7 +303,7 @@ curl -XGET -k "https://127.0.0.1:9200/temperatures/_search"\
"knn": {
"field": "temperature",
"query_vector": [274.62,275.18,275.9,276.74,277.65,278.56,279.4,280.12,280.68,281.03,281.15,281.03,280.68,280.12,279.4,278.56,277.65,276.74,275.9,275.18,274.62,274.27,274.15,274.27],
"k": 4,
"k": 3,
"num_candidates": 100
},
"fields": [ "date" ]
......@@ -319,13 +313,14 @@ curl -XGET -k "https://127.0.0.1:9200/temperatures/_search"\
## Use of Kibana to test queries
Create a data view out of the "temperatures" index on Kibana (select "date" as timestamp field).
Go to "Dev Tools / Console" and try the following queries:
Match partial strings:
```shell
POST /_sql?format=txt
{
"query": "SELECT * FROM students WHERE MATCH(coursedescription, 'cloud*')"
"query": "SELECT * FROM students WHERE MATCH(coursedescription, 'computing')"
}
```
......@@ -344,9 +339,19 @@ Change the endpoint of thw query to `POST /_sql/translate` and execute it: the r
### Use of Kibana to browse data
In Kibana, go to "Discover" and select the "students" data view. Create the KQL expressione `name : "john"`, a single document should be returned.
When there is a timestamp in your data, as it is the case for the `temperatures` index, you can use Kibana to select data easily by time range
(the "temperatures" index contains data from 2012, hence you have to select an appropriate time range).
When there is a timestamp in your data, as it is the case for the `observations` index (which can be created by following
the instructions in the Fission README), you can use Kibana to select data easily by time range.
## Removal of indexes
To remove the data views in Kibana, go to "Management / Kibana / Data views" and delte the data views you want to remove.
To remove the indexes created in this workshop, use the following commands:
```shell
curl -XDELETE -k 'https://127.0.0.1:9200/students' --user 'elastic:elastic' | jq '.'
curl -XDELETE -k 'https://127.0.0.1:9200/temperatures' --user 'elastic:elastic' | jq '.'
```
0% Loading or .
You are about to add 0 people to the discussion. Proceed with caution.
Please register or to comment