Tkrzw: Difference between revisions
Appearance
Content deleted Content added
mNo edit summary |
|||
(42 intermediate revisions by 27 users not shown) | |||
Line 1: | Line 1: | ||
{{Multiple issues| |
|||
{{Underlinked|date=June 2013}} |
|||
{{Missing information|in what ways are Tokyo Cabinet and Kyoto Cabinet important when considering the history of database systems|date=July 2015}} |
|||
{{Missing information|what the Kyoto Cabinet offered that made it a notable successor to Tokyo Cabinet|date=July 2015}} |
|||
{{ref improve|date=October 2014}} |
|||
}} |
|||
{{Infobox software |
|||
| name = Tkrzw |
|||
| title = Tkrzw |
|||
| logo = TKRZW Logo.png |
|||
| logo caption = |
|||
| logo_size = |
|||
| logo_alt = |
|||
| screenshot = <!-- Image name is enough --> |
|||
| caption = |
|||
| screenshot_size = |
|||
| screenshot_alt = |
|||
| collapsible = |
|||
| author = Mikio Hirabayashi |
|||
| developer = [[Google]] |
|||
| released = {{Start date and age|2020|07|11}} |
|||
| discontinued = |
|||
| latest release version = 0.9.3 |
|||
| latest release date = {{Start date and age|2020|08|02}} |
|||
| latest preview version = |
|||
| latest preview date = <!-- {{Start date and age|2020|MM|DD}} --> |
|||
| programming language = [[C++]] |
|||
| operating system = |
|||
| platform = |
|||
| size = |
|||
| language = |
|||
| language count = <!-- DO NOT include this parameter unless you know what it does --> |
|||
| language footnote = |
|||
| genre = [[Database engine]], library |
|||
| license = [[Apache 2.0]] |
|||
| alexa = |
|||
| standard = |
|||
| AsOf = |
|||
| website = {{Official URL}} |
|||
}} |
|||
{{Infobox software |
{{Infobox software |
||
| name = Kyoto Cabinet |
| name = Kyoto Cabinet |
||
Line 12: | Line 50: | ||
| screenshot_alt = |
| screenshot_alt = |
||
| collapsible = |
| collapsible = |
||
| author = |
| author = Mikio Hirabayashi |
||
| developer = FAL Labs |
| developer = FAL Labs |
||
| released = |
| released = {{Start date and age|2009|12|25}} |
||
| discontinued = |
| discontinued = |
||
| latest release version = 1.2. |
| latest release version = 1.2.78 |
||
| latest release date = |
| latest release date = {{Start date and age|2020|07|19}} |
||
| latest preview version = |
| latest preview version = |
||
| latest preview date = <!-- {{Start date and age|YYYY|MM|DD |
| latest preview date = <!-- {{Start date and age|YYYY|MM|DD}} --> |
||
| frequently updated = <!-- DO NOT include this parameter unless you know what it does --> |
|||
| status = |
|||
| programming language = [[C++]] |
| programming language = [[C++]] |
||
| operating system = |
| operating system = |
||
Line 30: | Line 66: | ||
| language footnote = |
| language footnote = |
||
| genre = [[Database engine]], library |
| genre = [[Database engine]], library |
||
| license = [[ |
| license = [[GPL 3]] |
||
| alexa = |
| alexa = |
||
| website = {{URL|fallabs.com/kyotocabinet/}} |
|||
| standard = |
| standard = |
||
| AsOf = |
| AsOf = |
||
| website = {{URL|https://dbmx.net/kyotocabinet/}} |
|||
}} |
}} |
||
{{Infobox software |
|||
'''Kyoto Cabinet''' is a [[Library (computing)|library]] of routines for managing a database. The database is a simple data file containing records, each is a pair of a key and a value. Every key and value is serial bytes with variable length. Both binary data and character string can be used as a key and a value. Each key must be unique within a database. There is neither concept of data tables nor data types. Records are organized in [[hash table]] or [[B+ tree]]. |
|||
| name = Tokyo Cabinet |
|||
| title = Tokyo Cabinet |
|||
The following access methods are provided to the database: |
|||
| logo = <!-- Image name is enough --> |
|||
* storing a record with a key and a value |
|||
| logo caption = |
|||
* deleting a record by a key |
|||
| logo_size = |
|||
* retrieving a record by a key |
|||
| logo_alt = |
|||
| screenshot = <!-- Image name is enough --> |
|||
Moreover, traversal access to every key are provided. These access methods are similar to ones of the original [[dbm]] (and its followers: NDBM and GDBM) library defined in the UNIX standard. Kyoto Cabinet is an alternative for the DBM because of its higher performance. |
|||
| caption = |
|||
| screenshot_size = |
|||
Each operation of the hash database has the time complexity of "O(1)". Therefore, in theory, the performance is constant regardless of the scale of the database. In practice, the performance is determined by the speed of the main memory or the storage device. If the size of the database is less than the capacity of the main memory, the performance will seem on-memory speed, which is faster than <code>std::map</code> of [[Standard Template Library|STL]]. Of course, the database size can be greater than the capacity of the main memory and the upper limit is 8 exabytes. Even in that case, each operation needs only one or two seeking of the storage device. |
|||
| screenshot_alt = |
|||
| collapsible = |
|||
Each operation of the B+ tree database has the time complexity of "O(log N)". Therefore, in theory, the performance is logarithmic to the scale of the database. Although the performance of random access of the B+ tree database is slower than that of the hash database, the B+ tree database supports sequential access in order of the keys, which realizes forward matching search for strings and range search for integers. The performance of sequential access is much faster than that of random access. |
|||
| author = Mikio Hirabayashi |
|||
| developer = FAL Labs |
|||
As the [[Application programming interface|API]] is based on object-oriented design, the hash database and the B+ tree database have same methods which inherited from the upper abstract class. Beside them, seven kinds of databases are provided under the same base class. The prototype hash database is powered by the standard container of <code>std::unordered_map</code>. The prototype tree database is powered by the standard container of <code>std::map</code>. The stash database is powered by the original implementation of naive hash map saving memory. The cache hash database is powered by the original implementation of doubly-linked hash map with LRU deletion algorithm. The cache tree database is powered by the cache hash database and provides B+ tree mechanism. The directory hash database is powered by the directory mechanism of the file system and stores records as respective files in a directory. The directory tree database is powered by the directory hash database and provides B+ tree mechanism. All databases have practical utility methods related to transaction and cursor. Programs for command-line interface are also included in the package. |
|||
| released = {{Start date and age|2006}} |
|||
| discontinued = |
|||
Kyoto Cabinet runs very fast. For example, elapsed time to store one million records is 0.9 seconds for the hash database, and 1.1 seconds for the B+ tree database. Moreover, the size of database is very small. For example, overhead for a record is 16 bytes for the hash database, and 4 bytes for the B+ tree database. Furthermore, scalability of Kyoto Cabinet is great. The database size can be up to 8 EB (9.22e18 bytes). |
|||
| latest release version = 1.4.48 |
|||
| latest release date = {{Start date and age|2012|08|17}} |
|||
Kyoto Cabinet is written in the [[C++]] language, and provided as API of C++, C, [[Java (programming language)|Java]], [[Python (programming language)|Python]], [[Ruby (programming language)|Ruby]], [[Perl]], and [[Lua (programming language)|Lua]]. Kyoto Cabinet is available on platforms which have API conforming to C++03 with the [[C++ Technical Report 1|TR1 library extensions]]. Kyoto Cabinet is a free software licensed under the [[GNU General Public License]] (GPL). The FOSS License Exception is also provided in order to accommodate products under other free and open source licenses. The Specific FOSS Library Linking Exception is also provided in order to be available in some specific FOSS libraries. On the other hand, a commercial license is also provided. If you use Kyoto Cabinet within a proprietary software, the commercial license is required. |
|||
| latest preview version = |
|||
| latest preview date = <!-- {{Start date and age|YYYY|MM|DD}} --> |
|||
==Features== |
|||
| programming language = [[C (programming language)|C]] |
|||
| operating system = |
|||
===Genealogy=== |
|||
| platform = |
|||
The original DBM was developed by [[Ken Thompson]] as a part of the original AT&T UNIX. After that, a lot of followers developed such DBM-like products as NDBM, SDBM, GDBM, TDB, and [[Berkeley DB]]. In 2003, Mikio Hirabayashi developed QDBM to replace GDBM for performance reason. |
|||
| size = |
|||
| language = |
|||
In 2007, Tokyo Cabinet was developed as the successor to QDBM on the following purposes: |
|||
| language count = <!-- DO NOT include this parameter unless you know what it does --> |
|||
* Improves space efficiency: smaller size of database file. |
|||
| language footnote = |
|||
* Improves time efficiency: faster processing speed. |
|||
| genre = [[Database engine]], library |
|||
* Improves parallelism: higher performance in multi-thread environment. |
|||
| license = [[LGPL 2.1]] |
|||
* Improves usability: simplified API. |
|||
| alexa = |
|||
* Improves robustness: database file is not corrupted even under catastrophic situation. |
|||
| standard = |
|||
* Supports 64-bit architecture: enormous memory space and database file are available. |
|||
| AsOf = |
|||
| website = {{URL|https://dbmx.net/tokyocabinet/}} |
|||
These purposes were achieved, making Tokyo Cabinet suitable as a replacement to conventional DBM software. |
|||
}} |
|||
'''Tkrzw''' is a [[library (computing)|library]] of routines for managing [[key–value database]]s. '''Tokyo Cabinet''' was sponsored by the Japanese [[social network]]ing site [[Mixi]], and was a multithreaded [[embedded database]] manager and was announced by its authors as "a modern implementation of [[DBM (computing)|DBM]]".<ref name="website">{{cite web |title=Tokyo Cabinet: a modern implementation of DBM |url=http://fallabs.com/tokyocabinet/ |publisher=FAL Labs |date=5 August 2010 |access-date=18 October 2014 |archive-url=https://web.archive.org/web/20230623105019/http://fallabs.com/tokyocabinet/ |archive-date=2023-06-23}}</ref> '''Kyoto Cabinet''' is the designated successor of Tokyo Cabinet,<ref name="website"/> while Tkrzw is a recommended successor of Kyoto Cabinet. |
|||
In 2009, Kyoto Cabinet was developed as another successor to QDBM. Compared with the sibling product (Tokyo Cabinet), the following advantages were pursued: |
|||
* Improves space efficiency : smaller size of database file. |
|||
* Improves parallelism : higher performance in multi-thread environment. |
|||
* Improves portability : abstraction of the lower layer to support non-POSIX systems. |
|||
* Improves usability : simplified API, object-oriented design. |
|||
* Improves robustness : database file is not corrupted even under catastrophic situation. |
|||
However, at least in single thread operations, the performance of Tokyo Cabinet is better than Kyoto Cabinet. |
|||
===Effective implementation of hash database=== |
|||
Kyoto Cabinet uses hash algorithm to retrieve records. If a bucket array has sufficient number of elements, the time complexity of retrieval is "O(1)". That is, the time required for retrieving a record is constant, regardless of the scale of a database. It is also the same about storing and deleting. Collision of hash values is managed by separate chaining. Data structure of the chains is binary search tree. Even if a bucket array has unusually scarce elements, the time complexity of retrieval is "O(log n)". |
|||
Kyoto Cabinet attains performance improvement in retrieval by loading the whole of the bucket array onto the RAM. If the bucket array is on RAM, it is possible to access a region of a target record by about one set of file operations such as "lseek", "read", and "write". The bucket array saved in a file is not read into RAM with the "read" call but directly mapped to RAM with the "mmap" call. Therefore, preparation time on connecting to a database is very short, and two or more processes can share the same [[memory map]]. |
|||
The hash function used for hash table is [[MurmurHash]] 2.0. If the number of elements of the bucket array is about a half of records stored within a database, although it depends on characteristic of the input, the probability of collision of hash values is about 55.3% (35.5% if the same, 20.4% if twice, 11.0% if four times, 5.7% if eight times). In that case, it is possible to retrieve a record by two or less sets of file operations. If it is made into a performance index, in order to handle a database containing one million of records, a bucket array with half a million of elements is required. The size of each element is 6 bytes. That is, if 3M bytes of RAM is available, a database containing one million records can be handled. |
|||
When overwriting a record with a value whose size is greater than the existing one, it is necessary to move the region to another position of the file. Because the time complexity of the operation depends on the size of the region of a record, extending values successively is inefficient. However, Kyoto Cabinet deals with this problem by alignment. If the incremental data can be placed in the padding region trailing the records, it is not necessary to move the region of the record. |
|||
Generally speaking, while succession of updating, fragmentation of available regions occurs, and the size of a database grows rapidly. Kyoto Cabinet deals with this problem by the free block pool and the automatic defragmentation mechanism. If a record is removed or shifted to another position, the region will be treated as a free block. The free block pool manages free blocks and reuses the best fit region for a new record. The automatic defragmentation is to shift records and free blocks separately. Successive free blocks coalesce into one. |
|||
===Useful implementation of B+ tree database=== |
|||
Although the B+ tree database is slower than the hash database, it features ordering access to each record. The order can be assigned by users. Records in the B+ tree database are sorted and arranged in logical pages. Sparse index organized in B tree that is multiway balanced tree are maintained for each page. Thus, the time complexity of retrieval and so on is "O(log n)". Cursor is provided to access each record in order. The cursor can jump to a position specified by a key and can step forward or backward from the current position. Because each page is arranged as double linked list, the time complexity of stepping cursor is "O(1)". |
|||
The B+ tree database is implemented based on the above the hash database. Because each page of the B+ tree database is stored as each record in the hash database, the B+ tree database inherits efficiency of storage management of the hash database. Because the header of each record is smaller and alignment of each page is adjusted according to the page size, in most cases, the size of database file is cut by half compared to one of the hash database. |
|||
Although operations of many pages are required to update the B+ tree database, Kyoto Cabinet expedites the process by the page cache and reducing file operations. The page cache is implemented with double layered LRU list, which realizes that frequently accessed pages are cached in the "hot" list and recently accessed pages are cached in the "warm" LRU list. If the page cache works efficiently and the whole of the sparse index is cached on memory, it is possible to retrieve a record by one or less set of file operations. |
|||
===Practical functionality=== |
|||
Kyoto Cabinet features transaction mechanisms. It is possible to commit a series of operations between the beginning and the end of the transaction in a lump, or to abort the transaction and perform rollback to the state before the transaction. Two isolation levels are supported: "serializable" and "read uncommitted". Durability is secured by write ahead logging and shadow paging. |
|||
Automatic transaction and automatic recovery mechanisms are also supported. If the automatic transaction option is specified when opening the database, every updating operation is guarded by transaction which is committed implicitly. Therefore, durability can be assured without explicit transaction operations. The automatic recovery mechanism works after the database is crashed outside transaction. If inconsistency of the database is detected when opening the database, all regions are scanned as with "fsck" and the database is reconstructed with surviving records implicitly. |
|||
Kyoto Cabinet provides two modes to connect to a database: "reader" and "writer". A reader can perform retrieving but neither storing nor deleting. A writer can perform all access methods. Exclusion control between processes is performed when connecting to a database by file locking. While a writer is connected to a database, neither readers nor writers can be connected. While a reader is connected to a database, other readers can be connect, but writers can not. According to this mechanism, data consistency is guaranteed with simultaneous connections in multitasking environment. |
|||
Functions of API are reentrant and available in multi-thread environment. Different database objects can be operated in parallel entirely. For simultaneous operations against the same database object, rwlock (reader-writer lock) is used for exclusion control. That is, while a writing thread is operating an object, other reading threads and writing threads are blocked. However, while a reading thread is operating an object, reading threads are not blocked. Locking granularity depends on data structures. The hash database uses record locking. The B+ tree database uses page locking. |
|||
In order to improve performance and concurrency, Kyoto Cabinet uses such atomic operations built in popular CPUs as atomic-increment and CAS ([[compare-and-swap]]). Lock primitives provided by the native environment such as the POSIX thread package are alternated by own primitives using CAS. |
|||
===Simple but flexible interfaces=== |
|||
Kyoto Cabinet provides simple APIs based on object-oriented design. Every operation for database is encapsulated and published as lucid methods as "open", "close", "set", "remove", "get", and so on. The classes of the hash database and the B+ tree database are derived class of the common abstract class which defines the interface. Porting an application from one database to another is easy. Moreover, the polymorphic database API is provided to assign a database in run-time. |
|||
Tokyo Cabinet features on-disk [[B+ tree]]s and [[hash table]]s for key-value storage, with "some" support for [[transaction (database)|transactions]].<ref name="smith">{{cite book |title=Professional Website Performance |first=Peter |last=Smith |publisher=John Wiley & Sons |year=2012}}</ref> |
|||
Kyoto Cabinet supports the [[visitor pattern]]. You can define arbitrary database operations with call back functions. The visitor class encapsulates that call back functions and their state data. The database class has the "accept" method, which accepts an instance of the visitor class and calls its functions with a record data. The return value of the call back function is reflected as the new state of the record. |
|||
==See also== |
|||
In addition, a lot of useful utilities are provided such as "prefix search", "regex search", "logging", "hot backup", "pseudo-snapshot", and "merging". A framework for [[MapReduce]] is also provided. Although it is not distributed, it is useful for aggregate calculation with less CPU loading and less memory usage. The plain text database is an interface to treat a plain text file as a database file. It is useful to use a text file as input or output data for the MapReduce framework. The index database is a wrapper of a polymorphic database in order to improve the efficiency of the "append" operation. It is useful to construct inverted indices. |
|||
{{Portal|Free and open-source software}} |
|||
* [[Berkeley DB]] |
|||
* [[LevelDB]] |
|||
==References== |
|||
While the core API is provided for C++, bindings for other languages such as C, Java, Python, Ruby, Perl, and Lua are also provided. Command-line interfaces are also provided corresponding to each API. They are useful for prototyping, test, and debugging. |
|||
{{reflist}} |
|||
== |
==External links== |
||
* {{official website}} |
|||
{{Portal|Free software}} |
|||
* [https://dbmx.net/kyotocabinet/ Kyoto Cabinet official website] |
|||
* [[dbm]] |
|||
** [https://dbdb.io/db/kyoto-cabinet Kyoto Cabinet (Website Carnegie Mellon Database Group)] |
|||
** [[Berkeley DB]] |
|||
* [https://dbmx.net/tokyocabinet/ Tokyo Cabinet official website] |
|||
** [[LevelDB]] |
|||
** [https://dbdb.io/db/tokyo-cabinet Tokyo Cabinet (Website Carnegie Mellon Database Group)] |
|||
== External links == |
|||
* http://fallabs.com/kyotocabinet/ |
|||
[[Category:Database management systems]] |
|||
[[Category:C++ libraries]] |
[[Category:C++ libraries]] |
||
[[Category:Free software programmed in C++]] |
[[Category:Free software programmed in C++]] |
||
[[Category:Database engines]] |
|||
[[Category:Embedded databases]] |
|||
[[Category:Key-value databases]] |
|||
[[Category:Ordered Key-Value Store]] |
Latest revision as of 22:26, 18 August 2024
This article has multiple issues. Please help improve it or discuss these issues on the talk page. (Learn how and when to remove these messages)
|
Original author(s) | Mikio Hirabayashi |
---|---|
Developer(s) | |
Initial release | July 11, 2020 |
Stable release | 0.9.3
/ August 2, 2020 |
Repository | |
Written in | C++ |
Type | Database engine, library |
License | Apache 2.0 |
Website | dbmx |
Original author(s) | Mikio Hirabayashi |
---|---|
Developer(s) | FAL Labs |
Initial release | December 25, 2009 |
Stable release | 1.2.78
/ July 19, 2020 |
Repository | |
Written in | C++ |
Type | Database engine, library |
License | GPL 3 |
Website | dbmx |
Original author(s) | Mikio Hirabayashi |
---|---|
Developer(s) | FAL Labs |
Initial release | 2006 |
Stable release | 1.4.48
/ August 17, 2012 |
Repository | |
Written in | C |
Type | Database engine, library |
License | LGPL 2.1 |
Website | dbmx |
Tkrzw is a library of routines for managing key–value databases. Tokyo Cabinet was sponsored by the Japanese social networking site Mixi, and was a multithreaded embedded database manager and was announced by its authors as "a modern implementation of DBM".[1] Kyoto Cabinet is the designated successor of Tokyo Cabinet,[1] while Tkrzw is a recommended successor of Kyoto Cabinet.
Tokyo Cabinet features on-disk B+ trees and hash tables for key-value storage, with "some" support for transactions.[2]
See also
[edit]References
[edit]- ^ a b "Tokyo Cabinet: a modern implementation of DBM". FAL Labs. 5 August 2010. Archived from the original on 2023-06-23. Retrieved 18 October 2014.
- ^ Smith, Peter (2012). Professional Website Performance. John Wiley & Sons.