Raijin is a schemaless database: no schema needs to be defined up front. This lets you cope with data variety, since some records may contain fields that other records lack. This is especially useful when storing event data in a structured format, considering that event logs can contain virtually any kind of information such as user names, locations, IP addresses, timestamps, and other event attributes.
Events are generated 24x7 and arrive in large bursts during peak times. Raijin can ingest data in excess of 100k records per second on commodity hardware. Insertion performance does not degrade as the size of the data grows.
Traditional RDBMS products rely on indexes that must be created to query the data efficiently. This requires additional storage, reduces ingestion speed, and adds maintenance overhead. Raijin does not use indexes. Instead, it stores metadata about data chunks, which enables it to query data just as fast.
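To illustrate how per-chunk metadata can stand in for an index, here is a minimal sketch. It assumes the metadata is a minimum and maximum value per chunk, which is only one common choice; the text does not say what Raijin actually records. The names `Chunk`, `build_chunks`, and `query_range` are hypothetical and are not part of any Raijin API.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Chunk:
    """A block of rows plus summary metadata (assumed here: min/max timestamp)."""
    rows: List[dict]
    min_ts: int
    max_ts: int


def build_chunks(rows: List[dict], chunk_size: int) -> List[Chunk]:
    """Split rows into fixed-size chunks and record min/max 'ts' for each."""
    chunks = []
    for i in range(0, len(rows), chunk_size):
        part = rows[i:i + chunk_size]
        ts_values = [r["ts"] for r in part]
        chunks.append(Chunk(part, min(ts_values), max(ts_values)))
    return chunks


def query_range(chunks: List[Chunk], lo: int, hi: int):
    """Return rows with lo <= ts <= hi, skipping chunks whose range cannot match."""
    scanned = 0
    hits = []
    for chunk in chunks:
        if chunk.max_ts < lo or chunk.min_ts > hi:
            continue  # metadata proves no row in this chunk can match
        scanned += 1
        hits.extend(r for r in chunk.rows if lo <= r["ts"] <= hi)
    return hits, scanned


rows = [{"ts": t, "msg": f"event {t}"} for t in range(1000)]
chunks = build_chunks(rows, chunk_size=100)
hits, scanned = query_range(chunks, 250, 260)
print(len(chunks), scanned, len(hits))
```

In this sketch only one of the ten chunks is scanned for the range query, and no separate index structure has to be built or maintained during ingestion.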
SQL is a universal declarative query language that most data analysts are familiar with. The Raijin Database supports SQL as its primary query language while lifting some of SQL's limitations. Users do not need to learn yet another domain-specific language to work with data: queries written for other SQL databases can be easily migrated and executed by Raijin.
Log data often contains sensitive information that needs to be protected. Raijin uses strong encryption to reduce the chances of data theft or unauthorized access, and it can help to meet compliance and regulatory standards.
Raijin can store data in a compressed format. Data compression not only saves disk space but also provides a performance boost with modern CPUs. Raijin can compress event log data down to about 15 to 20% of its original size.
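As a rough way to see why repetitive log data compresses well, the following sketch compresses synthetic log lines with Python's standard `zlib` module. The codec and the sample data are assumptions made for illustration; the text does not state which compression algorithm Raijin uses, and the ratio printed here says nothing about Raijin's own figures.

```python
import zlib

# Synthetic, highly repetitive log lines (illustrative data only).
lines = [
    f"2024-01-01T00:00:{i % 60:02d}Z host{i % 5} sshd: Accepted password for user{i % 20}"
    for i in range(5000)
]
raw = "\n".join(lines).encode("utf-8")

# zlib is a stand-in codec chosen only because it ships with Python.
compressed = zlib.compress(raw, 6)
ratio = len(compressed) / len(raw)

print(f"raw={len(raw)} bytes, compressed={len(compressed)} bytes, ratio={ratio:.1%}")

# Compression is lossless: the original bytes come back unchanged.
restored = zlib.decompress(compressed)
```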
Raijin uses hybrid columnar data storage. The columnar format combined with vectorized execution greatly increases the data throughput demanded by analytical workloads. Raijin can store and query machine-generated data such as event logs more efficiently than most traditional RDBMS solutions and NoSQL document databases, thereby reducing operational and maintenance costs. It can also function as a data platform for powering BI, reporting, and dashboarding solutions.
For handling sparse data, the Raijin Database engine uses a flat JSON representation for the data records. This is natively supported when loading and querying data, unlike other SQL solutions that introduced it as a bolted-on afterthought.
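A small sketch of what flat, sparse JSON records look like in practice follows. Each record is a single-level JSON object, and records need not share the same fields. The `project` helper is hypothetical; it only mimics the way a query would return an empty value for a field that a record does not carry, and it is not Raijin code.

```python
import json

# Each line is one flat JSON record; the field sets differ between records.
lines = [
    '{"ts": 1, "user": "alice", "src_ip": "10.0.0.5"}',
    '{"ts": 2, "user": "bob", "location": "Berlin"}',
    '{"ts": 3, "event": "reboot"}',
]
records = [json.loads(line) for line in lines]


def project(records, fields):
    """Select the given fields from every record; missing fields become None."""
    return [tuple(r.get(f) for f in fields) for r in records]


# The union of all field names seen so far, discovered from the data itself.
columns = sorted({key for r in records for key in r})
result = project(records, ["ts", "user"])

print(columns)
print(result)
```

No schema was declared ahead of time: the set of columns is derived from the records, and the third record simply has no value for `user`.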
Most NoSQL solutions are inefficient or lack support for analytical queries. Raijin supports groups and aggregations using standard SQL syntax.
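The kind of grouping and aggregation meant here can be shown with a standard SQL query. The sketch below runs it against SQLite through Python's built-in `sqlite3` module, purely as a stand-in engine so the example is runnable; the table name, column names, and data are made up. Note that SQLite requires the `CREATE TABLE` step, which a schemaless database would not.

```python
import sqlite3

# In-memory SQLite database used only as a stand-in SQL engine.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE events (username TEXT, status TEXT, bytes INTEGER)")
conn.executemany(
    "INSERT INTO events VALUES (?, ?, ?)",
    [
        ("alice", "ok", 120),
        ("alice", "error", 30),
        ("bob", "ok", 200),
        ("alice", "ok", 80),
    ],
)

# Standard SQL grouping and aggregation.
rows = conn.execute(
    """
    SELECT username, COUNT(*) AS n, SUM(bytes) AS total
    FROM events
    GROUP BY username
    ORDER BY username
    """
).fetchall()

print(rows)
conn.close()
```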
To be able to process large amounts of data, Raijin uses cache-aware algorithms and data structures to exploit the capabilities of modern CPUs. Instead of processing data one tuple at a time, it operates on data blocks. Using vectorized execution backed by optimized SIMD instructions, Raijin ensures that your CPUs are not wasting cycles.
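The difference between tuple-at-a-time and block-at-a-time processing can be sketched as follows. This is a conceptual illustration only: pure Python cannot issue SIMD instructions, and the function names and block size are hypothetical. In a real engine the per-block step would be a tight compiled loop over a contiguous, cache-sized array, which is what makes SIMD applicable.

```python
from array import array


def sum_tuple_at_a_time(values, threshold):
    """Evaluate the filter and the aggregate for one value per step."""
    total = 0
    for v in values:
        if v > threshold:
            total += v
    return total


def sum_block_at_a_time(values, threshold, block_size=1024):
    """Process contiguous blocks; each block is handled by one bulk operation."""
    total = 0
    for start in range(0, len(values), block_size):
        block = values[start:start + block_size]  # contiguous slice of the column
        # Stand-in for a vectorized kernel applied to the whole block.
        total += sum(v for v in block if v > threshold)
    return total


# A single column stored contiguously as 64-bit integers.
column = array("q", range(10_000))

a = sum_tuple_at_a_time(column, 9000)
b = sum_block_at_a_time(column, 9000)
print(a, b)
```

Both functions return the same result; the block-based version only changes the unit of work, which is the property that cache-aware and vectorized execution builds on.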