# Application: ELK Stack
- Elasticsearch - distributed NoSQL database
- Logstash - ingests streams of activity data
- Kibana - Visualisation / Dashboard
# Fundamental concepts
[Source: architecture](https://codersite.dev/hot-warm-architecture-elasticsearch/)
The act of storing data in Elasticsearch is called **indexing**.
An index is a collection of documents and each document is a collection of fields, which are the **key-value pairs** that contain your data. Every index has some properties like mappings, settings, and aliases.
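
As a rough sketch of how mappings, settings, and aliases fit together, the request body for creating an index might look like the following. The index name `app-logs-000001`, the alias `app-logs`, and the field names are hypothetical, chosen only for illustration; the snippet builds and prints the JSON and does not contact a cluster.

```python
import json

# Hypothetical index definition; all names and values are illustrative only.
index_name = "app-logs-000001"
index_body = {
    # settings: index-level configuration such as shard and replica counts
    "settings": {"number_of_shards": 1, "number_of_replicas": 1},
    # mappings: the fields the documents contain and their types
    "mappings": {
        "properties": {
            "message": {"type": "text"},
            "level": {"type": "keyword"},
            "timestamp": {"type": "date"},
        }
    },
    # aliases: alternative names that point at this index
    "aliases": {"app-logs": {}},
}

# The body would be sent with the create-index call: PUT /app-logs-000001
create_request = ("PUT", f"/{index_name}", json.dumps(index_body, indent=2))
print(create_request[0], create_request[1])
print(create_request[2])
```
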
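In older versions of Elasticsearch, a document belonged to a type, and those types lived inside an index. Mapping types were deprecated in 6.x and removed in later releases, so a current index holds its documents directly. With that caveat, we can draw a rough parallel to a traditional relational database:
Relational DB ⇒ Databases ⇒ Tables ⇒ Rows ⇒ Columns
Elasticsearch ⇒ Indices ⇒ Types (legacy) ⇒ Documents ⇒ Fields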
In Elasticsearch, the term **document** has a specific meaning. It refers to the **top-level**, or root object that is serialized into JSON and stored in Elasticsearch under a unique ID.
Elasticsearch lets you insert documents without a predefined schema (in RDBMS you need to define tables in advance).
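
To make this concrete, here is a minimal sketch that serializes a root object into JSON and pairs it with the document-indexing endpoint `PUT /<index>/_doc/<id>`. The index name `activity`, the ID, and the fields are made up for illustration, and `build_index_request` is a hypothetical helper, not part of any client library. Nothing is sent over the network.

```python
import json

def build_index_request(index, doc_id, document):
    """Return (method, path, body) for indexing one document (hypothetical helper)."""
    return "PUT", f"/{index}/_doc/{doc_id}", json.dumps(document)

# The root object is the document; nested objects and arrays are part of it.
doc = {
    "user": "alice",
    "action": "login",
    "tags": ["web", "auth"],
    "geo": {"country": "DE"},
}

method, path, body = build_index_request("activity", "1", doc)
print(method, path)
print(body)
```

No mapping for `activity` is declared before this request; when a field is new, Elasticsearch infers its type from the first value it sees (dynamic mapping).
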
## Inverted index
Relational databases add an index, such as a B-tree index, to specific columns in order to improve the speed of data retrieval. Elasticsearch uses a structure called an **inverted index** for the same purpose.
By default, **every field in a document is indexed** (has an inverted index) and is thus searchable, which is what makes **full-text search** possible. A field without an inverted index is not searchable.
An inverted index consists of a list of all the unique words that appear in any document, and for each word, a list of the documents in which it appears.
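
The structure described above can be sketched in a few lines of Python. This is a toy illustration only: it splits on whitespace and lowercases words, whereas Elasticsearch runs text through configurable analyzers, and the function names `build_inverted_index` and `search` are hypothetical.

```python
from collections import defaultdict

def build_inverted_index(docs):
    """Map each unique lowercased word to the sorted IDs of documents containing it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for word in text.lower().split():
            index[word].add(doc_id)
    return {word: sorted(ids) for word, ids in index.items()}

def search(index, query):
    """Return the IDs of documents that contain every word of the query."""
    postings = [set(index.get(word, [])) for word in query.lower().split()]
    return sorted(set.intersection(*postings)) if postings else []

docs = {
    1: "The quick brown fox",
    2: "The lazy brown dog",
    3: "Quick thinking wins",
}

inverted = build_inverted_index(docs)
print(inverted["brown"])                 # [1, 2]
print(search(inverted, "quick brown"))   # [1]
```

A lookup touches only the entries for the query words rather than scanning every document, which is why searches stay fast as the number of documents grows.
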
# Summary
- Elasticsearch provides Google-like search features over your own data
- Scalable ingest / data size / search performance
- Accessible through a REST API
- Can be used as a full-text search engine
- Can be used as a scalable NoSQL database