US20020087500A1 - In-memory database system - Google Patents
- Publication number
- US20020087500A1 (application US09/135,917)
- Authority
- US
- United States
- Prior art keywords
- record
- data
- key
- look
- page
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
- G06F16/23—Updating
- G06F16/2308—Concurrency control
- G06F16/2315—Optimistic concurrency control
- G06F16/2329—Optimistic concurrency control using versioning
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
- G06F16/23—Updating
- G06F16/2365—Ensuring data consistency and integrity
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
- G06F16/23—Updating
- G06F16/2308—Concurrency control
-
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y10—TECHNICAL SUBJECTS COVERED BY FORMER USPC
- Y10S—TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y10S707/00—Data processing: database and file management or data structures
- Y10S707/99931—Database or file accessing
- Y10S707/99938—Concurrency, e.g. lock management in shared database
-
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y10—TECHNICAL SUBJECTS COVERED BY FORMER USPC
- Y10S—TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y10S707/00—Data processing: database and file management or data structures
- Y10S707/99951—File or database maintenance
- Y10S707/99952—Coherency, e.g. same view to multiple users
Definitions
- This invention relates generally to databases, and more particularly to enabling multiple concurrent read-only access to database records.
- When the database manager immediately changes the data in the database in response to an update request, the database manager must reverse the changes using a rollback mechanism if the requesting transaction aborts. Therefore, in order to present a consistent view of the data to another transaction, the database manager either denies access to the changed data until the modifying transaction commits the changes, or permits the other transaction access to the data but must also roll back the other transaction if the modifying transaction aborts. The processing of read-only transactions is thus slowed when they execute concurrently with transactions that update common data.
- An in-memory database system uses a shared memory to cache records and keys read from a database and controls the updating of the records and keys through a database manager process.
- When a transaction performs an update, the original, unmodified data is preserved in the shared memory, the new data is written to the shared memory, and a look-aside table for the transaction records the changes.
- a transaction performs read-only access to the shared memory using its own context while a versioning scheme based on the look-aside tables ensures a read-committed isolation level view of the original, unmodified data until the modifying transaction commits the update.
- the database manager is responsible for writing the new data into the shared memory and for maintaining the look-aside tables for all transactions that have made modifications to the data in the shared memory.
- the database manager also writes committed changes to the database and performs rollback on uncommitted changes in the shared memory using the entries in the look-aside table for the committing/aborting transaction.
- the shared memory is divided into logical pages and short duration page latches are employed to maintain consistency on the page while a transaction or the database manager is reading or writing data on the page.
- a method of controlling access to database records which are stored in memory shared among multiple processes is described as creating record and/or index entries in a look-aside table, preserving the original data in the shared memory, and allowing a process access to the modified data if a corresponding record and/or index entry exists in the look-aside table for the process.
- the method also performs rollback and abort processing using the look-aside table.
- the in-memory database system is described as having a plurality of clients which manipulate data, a shared memory for caching the data, and an in-memory database manager that creates the look-aside table entries and writes changes to the shared memory.
- the details of data structures and page latches used by the in-memory database system are given.
- a particular implementation of the in-memory database system is also described.
- the present invention describes systems, clients, servers, methods, and computer-readable media of varying scope.
- FIG. 1 shows a diagram of the hardware and operating environment in conjunction with which embodiments of the invention may be practiced.
- FIG. 2 is a diagram illustrating a system-level overview of an exemplary embodiment of the invention.
- FIGS. 3A and 3B are time line diagrams illustrating the interactions of two client processes operating in the exemplary embodiment shown in FIG. 2.
- FIG. 4 is a flowchart of a method to be performed by a client process according to an exemplary embodiment of the invention.
- FIGS. 5A, 5B, 5C, 6, 7, 8 and 9 are flowcharts of methods to be performed by a database manager process according to an exemplary embodiment of the invention.
- FIG. 10 is a diagram of a look-aside data structure for use in an exemplary implementation of the invention.
- FIG. 11 is a diagram of a transaction data structure for use in an exemplary implementation of the invention.
- FIG. 12 is a diagram of a single level hash table data structure for use in an exemplary implementation of the invention.
- FIG. 13 is a diagram of a two level hash table data structure for use in an exemplary implementation of the invention.
- FIG. 1 is a diagram of the hardware and operating environment in conjunction with which embodiments of the invention may be practiced.
- the description of FIG. 1 is intended to provide a brief, general description of suitable computer hardware and a suitable computing environment in conjunction with which the invention may be implemented.
- the invention is described in the general context of computer-executable instructions, such as program modules, being executed by a computer, such as a personal computer.
- program modules include routines, programs, objects, components, data structures, etc., that perform particular tasks or implement particular abstract data types.
- the invention may be practiced with other computer system configurations, including hand-held devices, multiprocessor systems, microprocessor-based or programmable consumer electronics, network PCs, minicomputers, mainframe computers, and the like.
- the invention may also be practiced in distributed computing environments where tasks are performed by remote processing devices that are linked through a communications network.
- program modules may be located in both local and remote memory storage devices.
- the exemplary hardware and operating environment of FIG. 1 for implementing the invention includes a general purpose computing device in the form of a computer 20, including a processing unit 21, a system memory 22, and a system bus 23 that operatively couples various system components, including the system memory, to the processing unit 21.
- There may be only one or there may be more than one processing unit 21, such that the processor of computer 20 comprises a single central-processing unit (CPU), or a plurality of processing units, commonly referred to as a parallel processing environment.
- the computer 20 may be a conventional computer, a distributed computer, or any other type of computer; the invention is not so limited.
- the system bus 23 may be any of several types of bus structures including a memory bus or memory controller, a peripheral bus, and a local bus using any of a variety of bus architectures.
- the system memory may also be referred to as simply the memory, and includes read only memory (ROM) 24 and random access memory (RAM) 25 .
- a basic input/output system (BIOS), containing the basic routines that help to transfer information between elements within the computer 20, such as during start-up, is stored in ROM 24.
- the computer 20 further includes a hard disk drive 27 for reading from and writing to a hard disk, not shown, a magnetic disk drive 28 for reading from or writing to a removable magnetic disk 29 , and an optical disk drive 30 for reading from or writing to a removable optical disk 31 such as a CD ROM or other optical media.
- the hard disk drive 27 , magnetic disk drive 28 , and optical disk drive 30 are connected to the system bus 23 by a hard disk drive interface 32 , a magnetic disk drive interface 33 , and an optical disk drive interface 34 , respectively.
- the drives and their associated computer-readable media provide nonvolatile storage of computer-readable instructions, data structures, program modules and other data for the computer 20 . It should be appreciated by those skilled in the art that any type of computer-readable media which can store data that is accessible by a computer, such as magnetic cassettes, flash memory cards, digital video disks, Bernoulli cartridges, random access memories (RAMs), read only memories (ROMs), and the like, may be used in the exemplary operating environment.
- a number of program modules may be stored on the hard disk, magnetic disk 29 , optical disk 31 , ROM 24 , or RAM 25 , including an operating system 35 , one or more application programs 36 , other program modules 37 , and program data 38 .
- a user may enter commands and information into the personal computer 20 through input devices such as a keyboard 40 and pointing device 42 .
- Other input devices may include a microphone, joystick, game pad, satellite dish, scanner, or the like.
- These and other input devices are often connected to the processing unit 21 through a serial port interface 46 that is coupled to the system bus, but may be connected by other interfaces, such as a parallel port, game port, or a universal serial bus (USB).
- a monitor 47 or other type of display device is also connected to the system bus 23 via an interface, such as a video adapter 48 .
- computers typically include other peripheral output devices (not shown), such as speakers and printers.
- the computer 20 may operate in a networked environment using logical connections to one or more remote computers, such as remote computer 49 . These logical connections are achieved by a communication device coupled to or a part of the computer 20 ; the invention is not limited to a particular type of communications device.
- the remote computer 49 may be another computer, a server, a router, a network PC, a client, a peer device or other common network node, and typically includes many or all of the elements described above relative to the computer 20 , although only a memory storage device 50 has been illustrated in FIG. 1.
- the logical connections depicted in FIG. 1 include a local-area network (LAN) 51 and a wide-area network (WAN) 52 .
- When used in a LAN-networking environment, the computer 20 is connected to the local network 51 through a network interface or adapter 53, which is one type of communications device.
- When used in a WAN-networking environment, the computer 20 typically includes a modem 54, a type of communications device, or any other type of communications device for establishing communications over the wide area network 52, such as the Internet.
- the modem 54, which may be internal or external, is connected to the system bus 23 via the serial port interface 46.
- program modules depicted relative to the personal computer 20 may be stored in the remote memory storage device. It is appreciated that the network connections shown are exemplary and other means of and communications devices for establishing a communications link between the computers may be used.
- the computer in conjunction with which embodiments of the invention may be practiced may be a conventional computer, a distributed computer, or any other type of computer; the invention is not so limited.
- a computer typically includes one or more processing units as its processor, and a computer-readable medium such as a memory.
- the computer may also include a communications device such as a network adapter or a modem, so that it is able to communicatively couple to other computers.
- an in-memory database system 200 comprises an in-memory database (IMDB) manager 201 and shared memory 202 in a computer such as local computer 20 in FIG. 1.
- the IMDB manager 201 is responsible for reading and writing records from a database 220 into and from shared memory 202 on behalf of a client process 210 .
- Database 220 can be resident on the same computer as the in-memory database system 200 or can be located on a different computer such as remote computer 49 in FIG. 1.
- the client process 210 can reside on the same computer as the in-memory database system 200 or can execute on a different computer as long as the client process 210 can address the shared memory 202 .
- Because the client process 210 can address the shared memory 202 through its context, the client process can directly access the records in shared memory 202 without having to call the IMDB manager.
- the client process 210 has read-only access to the records and calls the IMDB manager to modify or delete an existing record or to create a new record.
- FIG. 3A is a time line diagram illustrating the interactions of two client processes in accordance with the exemplary embodiment of the invention.
- Each client process is represented by a database transaction which performs operations on database records.
- the two database transactions access the same database employee record for an employee named “Smith.”
- the primary key for the employee records is the employee number, which in the case of employee Smith is “123.”
- the actions described below are divided among the transactions for the client processes and the IMDB manager 201 when one client process modifies a database record.
- Transaction1 executes a retrieve command on the employee record “123” which returns copy 301 of the employee record from shared memory 202 at time mark A 1 . If a copy of the record is not already in memory, the IMDB manager 201 reads a copy from the database 220 into shared memory 202 . Transaction1 modifies the last name of the employee from “Smith” to “Jones” at time mark B 1 . Because the name change has not yet been committed by transaction1, the modified record is not written back to the database. Instead, the IMDB manager 201 creates a modified copy 303 of the record in shared memory and sets a “modified” flag 302 in the original copy 301 of the record in the shared memory.
- the IMDB manager 201 also creates a look-aside table 305 for transactions in transaction1's context, if one does not already exist, and creates a record entry 306 in the look-aside table 305 which points to the location of the modified copy 303 of the record in shared memory.
- the look-aside table 305 is accessible only by transaction1 and by the IMDB manager.
- When transaction1 wants to re-read the record at time mark C1, transaction1 specifies the key again and retrieves the original copy 301 from shared memory. Because the modified flag 302 is set in copy 301, transaction1 searches its look-aside table 305 and finds the record entry 306. Transaction1 then retrieves the modified copy 303 of the record using the information in the record entry 306 at time mark D1. When transaction1 commits its changes at time mark E1, the IMDB manager writes all modifications specified in transaction1's look-aside table 305 to the shared memory and to the database. The look-aside table 305 is deleted after all the modifications have been committed.
- transaction2 is executing concurrently with transaction1.
- Transaction2 issues a retrieve command using key “123” at time mark A2, which retrieves the copy 301 from shared memory.
- When transaction2 next retrieves the record using the key “123” at time mark B2, after transaction1 has modified the record, transaction2 reads the copy 301 from shared memory and recognizes that the modified flag 302 is set. Therefore, transaction2 knows that changes to the record are pending and searches its look-aside table 310, if one exists, for a corresponding record entry. Because transaction1 was responsible for the modification, transaction2 does not find a corresponding record entry and therefore continues its processing with the unmodified copy 301 of the record.
- After transaction1 has committed the changes (at time mark E1), a third read operation by transaction2 on key “123” at time mark C2 returns the modified copy 303 of the record in shared memory to transaction2.
- As a result, transaction2 sees an inconsistency between the information in the copy 301 of the record retrieved at time marks A2 and B2, and the copy 303 retrieved at time mark C2.
- the in-memory database system of the present invention guarantees consistency of read-committed transactions but does not guarantee consistency of read-repeatable or serializable transactions.
- transaction1 can abort and rollback the uncommitted changes using the information in the look-aside table. After rollback, the copy 301 of the employee record in the shared memory appears as it was at time mark A 1 , i.e., before transaction1 modified it at time mark B 1 . Rollback processing is described in detail in the next section.
- each record in shared memory is located using a record identifier (RECID) specified in the index entries for the record.
- the RECID is also used as a hash key to search for the corresponding record entry in the look-aside tables.
- the IMDB manager hashes the RECID (OLDRECID) for the original record to determine which record entry to use in the appropriate look-aside table.
- the RECID (NEWRECID) for the modified record is written into the entry.
- FIG. 3A does not show the index entries since only non-key data is modified in the example.
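The FIG. 3A walk-through can be sketched in a few lines of Python. This is a minimal, single-process illustration under stated assumptions, not the patented implementation: the class `SharedMemory`, its method names, and the use of plain dictionaries in place of hashed shared-memory tables are all hypothetical. In particular, `commit` simply overwrites the original slot, because commit processing is not detailed in this section, and conflicting updates by two transactions are not handled.

```python
from dataclasses import dataclass
from itertools import count


@dataclass
class Record:
    data: dict
    modified: bool = False  # set on the original copy while a change is pending


class SharedMemory:
    """Toy stand-in for the shared memory plus the manager's write path."""

    def __init__(self):
        self.records = {}    # RECID -> Record
        self.index = {}      # primary key -> RECID of the committed copy
        self.lookaside = {}  # transaction id -> {OLDRECID: NEWRECID}
        self._next_recid = count(1)

    def load(self, key, data):
        recid = next(self._next_recid)
        self.records[recid] = Record(dict(data))
        self.index[key] = recid
        return recid

    def read(self, txn, key):
        """Client read: follow a look-aside entry only if txn made the change."""
        old_recid = self.index[key]
        record = self.records[old_recid]
        if record.modified:
            new_recid = self.lookaside.get(txn, {}).get(old_recid)
            if new_recid is not None:
                return self.records[new_recid].data
        return record.data

    def modify(self, txn, key, **changes):
        """Manager write: keep the original, store a modified copy, log it."""
        old_recid = self.index[key]
        table = self.lookaside.setdefault(txn, {})
        previous = table.get(old_recid)
        base = self.records[previous if previous is not None else old_recid].data
        new_recid = next(self._next_recid)
        self.records[new_recid] = Record({**base, **changes})
        if previous is not None:
            del self.records[previous]  # drop the superseded uncommitted copy
        self.records[old_recid].modified = True
        table[old_recid] = new_recid

    def commit(self, txn):
        # Assumption: publish by replacing the original slot with the new data.
        for old_recid, new_recid in self.lookaside.pop(txn, {}).items():
            self.records[old_recid] = Record(self.records.pop(new_recid).data)

    def rollback(self, txn):
        for old_recid, new_recid in self.lookaside.pop(txn, {}).items():
            del self.records[new_recid]
            self.records[old_recid].modified = False


mem = SharedMemory()
mem.load("123", {"last_name": "Smith"})
mem.modify("transaction1", "123", last_name="Jones")
print(mem.read("transaction1", "123"))  # {'last_name': 'Jones'}
print(mem.read("transaction2", "123"))  # {'last_name': 'Smith'}
mem.commit("transaction1")
print(mem.read("transaction2", "123"))  # {'last_name': 'Jones'}
```

The last two reads by `transaction2` return different values for the same key, which is the read-committed (but not repeatable) behavior described for time marks B2 and C2.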
- FIG. 3B shows the same series of transactions when the employee name is the primary key for the employee records. Therefore, in FIG. 3B, the primary index for the employee table is shown to illustrate the actions taken when a key is changed.
- a copy 301 of the employee record is read from shared memory at time mark A1, the record entry 306 pointing to the modified copy 303 is created in look-aside table 305, and the modified flag is set in the original copy 301 at time mark B1.
- the IMDB manager also inserts a new key entry 322 for “Jones” into the primary key index table 320 for the employee records.
- the new key entry 322 contains the new RECID (NEWRECID) for the modified record.
- the old entry 321 for “Smith” is marked as uncommitted-deleted (UCD) while the new entry 322 is marked as uncommitted-inserted (UCI).
- Two index entries 307 , 308 are also added to the look-aside table 305 .
- Index entry 307 contains an identifier for the employee table (“EMPLOYEE”), an identifier for the primary index (“NAME”), and the value of the deleted key (“SMITH”).
- Index entry 308 contains the identifier for the employee table (“EMPLOYEE”), the identifier for the primary index (“NAME”), and the value of the inserted key (“JONES”).
- the index entries are located by hashing on table identifier, index identifier, and key value.
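As a small illustration of locating index entries by table identifier, index identifier, and key value, the sketch below builds a composite lookup string and uses a Python dictionary as the hash table. The function name `index_entry_key`, the hyphen separator, the upper-casing of the key value, and the stored entry fields are assumptions made for the example.

```python
def index_entry_key(table: str, index: str, key_value: str) -> str:
    """Join the three identifying parts into one hashable lookup string."""
    return "-".join(part.upper() for part in (table, index, key_value))


# A transaction's look-aside index entries; Python's dict does the hashing.
lookaside_index = {
    index_entry_key("EMPLOYEE", "NAME", "Smith"): {"change": "deleted"},
    index_entry_key("EMPLOYEE", "NAME", "Jones"): {"change": "inserted"},
}

print(index_entry_key("EMPLOYEE", "NAME", "Smith") in lookaside_index)  # True
print(index_entry_key("EMPLOYEE", "NAME", "Brown") in lookaside_index)  # False
```

A joined string can collide if an identifier itself contains the separator; hashing a tuple such as `(table, index, key_value)` would avoid that, at the cost of no longer being a single string.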
- When transaction1 issues a retrieve command on the employee record using the primary key “Smith,” the index entry 321 is marked as uncommitted-deleted, so transaction1 uses the string “EMPLOYEE-NAME-SMITH” to search its look-aside table 305 for a matching entry. Because a matching entry, in this case entry 307, exists, transaction1 knows it is the modifying transaction, so the primary key of “Smith” does not exist for it and no record is returned.
- When transaction1 issues a retrieve command on the employee record using the primary key “Jones” at time mark D1, it determines it is the modifying transaction because entry 308 exists, so it uses NEWRECID in the index entry 322 to retrieve the modified copy 303 of the record (time mark E1).
- When transaction2 issues a retrieve command for the employee record using “Smith” at time mark B2, it determines that the primary key “Smith” is marked as uncommitted-deleted, and that it is not the modifying transaction since its look-aside table 310 does not contain a matching entry. Transaction2 can continue to use the original copy 301 of the record if the name modification is not critical to its processing (time mark C2). Similarly, when transaction2 issues a retrieve command for the employee record using “Jones” at time mark D2, it determines that the primary key “Jones” is marked as uncommitted-inserted, and that it is not the modifying transaction, so it treats the key as if it were not in the index.
- a similar scenario takes place when a secondary key for a record is modified.
- a transaction that is retrieving the record using the secondary key proceeds as described above for FIG. 3B where the index table and the index entries are specific for the secondary key.
- the exemplary embodiment of the IMDB manager combines the secondary key value with the primary key value to yield a unique key value.
- Other commonly used mechanisms to create unique keys for non-unique keys are equally applicable and are within the scope of the invention.
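One way to picture the combination of a non-unique secondary key with the primary key is to pair the two values. The sketch below is only an illustration: the function name `unique_secondary_key`, the tuple representation, and the hypothetical department field used as the secondary key are not taken from the source.

```python
def unique_secondary_key(secondary_value, primary_key):
    """Pair a possibly non-unique secondary value with the unique primary key."""
    return (secondary_value, primary_key)


# Two employees share the same (hypothetical) department value.
department_index = {}
for employee_number, department in [("123", "SALES"), ("456", "SALES")]:
    department_index[unique_secondary_key(department, employee_number)] = employee_number

# Both entries survive because the combined keys differ.
matches = sorted(pk for (dept, pk) in department_index if dept == "SALES")
print(matches)  # ['123', '456']
```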
- a transaction retrieving the record using the primary key reads the unmodified copy of the record since the key entry in the primary key contains the OLDRECID.
- the modified flag in the record alerts the transaction that a change to the data is pending.
- the transaction uses the OLDRECID to search its look-aside table and retrieves the modified copy if it finds a matching entry.
- the IMDB manager creates both index and record entries in the look-aside table when a record is deleted.
- the affected key entry in each index table is marked as uncommitted-deleted, an index entry in each appropriate look-aside table keyed on the record table, index, and deleted key value is created, and a null record entry in each look-aside table is created so that hashing into the look-aside table using the OLDRECID indicates that the record is deleted.
- the IMDB manager creates a new key entry in each index table marked as uncommitted-inserted and an index entry in each appropriate look-aside table keyed on the record table, index, and new key value.
- a record entry is also created in the look-aside table which contains the NEWRECID for the newly created record; the record entry is hashed into using a null value.
- the IMDB system maintains data in the shared memory in both a new, uncommitted state resulting from an update function performed by a transaction, and in the original, committed state to provide versioning control for client processes.
- the IMDB system is predicated on two principles:
- the exemplary embodiment of the invention described by methods in the flowcharts of FIGS. 4-7 requires all index entries in the look-aside table to be unique. Because all secondary keys in a database may not be required to have unique values, the invention combines such secondary keys with the primary key for the record (which is unique) to create a unique key for the corresponding secondary index entry in the look-aside table. Additionally, if a record has been deleted and then the same record is reinserted by a transaction before the deletion is committed, the index entries for the record's keys in the appropriate look-aside table contain a NEWRECID for the reinserted record, which is used when retrieving the record by the transaction that deleted and reinserted the record. The key entries in the index tables contain an OLDRECID for the original record, which is used when retrieving the record by all other transactions.
- Referring to FIG. 4, a flowchart of a method to be performed by a client according to an exemplary embodiment of the invention is shown. This method is inclusive of the acts required to be taken by the client when retrieving a record.
- the client uses an appropriate hashing algorithm, or other suitable method, to find the key entry in the appropriate index table in shared memory (block 401 ).
- the key entry can be either a primary key for the record or a secondary key depending on the criteria specified by the client in the retrieval command.
- the client next determines if the key entry has been changed.
- If the key entry is marked as both uncommitted-deleted and uncommitted-inserted, the client searches its look-aside table for a matching index entry (block 407). If a matching index entry is found (block 409), then the client uses the NEWRECID in the index entry to read the copy of the record it reinserted (block 411). If a matching entry is not found at block 409, then the original key still exists for the client and the client uses the OLDRECID in the key entry in the index table to read the original copy of the record (block 413).
- If the key entry is marked only as uncommitted-deleted, the client searches its look-aside table for a matching entry (block 415). If a matching entry is found, the client has deleted the key so the key does not exist for it and thus no record is retrieved. If a matching entry is not found (block 417), the original key still exists for the client and the client uses the OLDRECID in the key entry in the index table to read the original copy of the record (block 413).
- If the key entry is marked only as uncommitted-inserted, the client searches its look-aside table for a matching index entry (block 421). If a matching index entry is found (block 423), the client knows that it is the transaction that inserted (modified) the key and uses the NEWRECID in the index entry to read the modified copy of the record from shared memory (block 411). If a matching index entry is not found at block 423, the client knows that another transaction modified the key and has not committed the change, so the key value does not exist for the client.
- If the key entry has not been changed, the client reads the record from the shared memory using the RECID in the key entry (block 425).
- the client checks the modified flag in the record to determine if any data has been changed (block 427 ). If the modified flag is set, then the client searches its look-aside table for a matching record entry (block 429 ). If a matching record entry is found (block 431 ), then the client knows it is the transaction that modified the record, and uses the NEWRECID in the record entry to read the modified copy of the record from the shared memory (block 411 ). If the client does not find a matching record entry at block 431 , the client knows that the unmodified copy of the record read at block 425 is the copy that exists for it.
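The retrieval logic of FIG. 4 can be condensed into one function. The sketch below is an illustration under stated assumptions: `KeyEntry`, `Record`, and `retrieve` are hypothetical names, the look-aside tables are modeled as plain dictionaries (key value to RECID for index entries, OLDRECID to NEWRECID for record entries), and the composite table/index/key hashing is reduced to the key value alone.

```python
from dataclasses import dataclass


@dataclass
class KeyEntry:
    recid: int         # RECID stored in the shared index table
    ucd: bool = False  # uncommitted-deleted mark
    uci: bool = False  # uncommitted-inserted mark


@dataclass
class Record:
    data: dict
    modified: bool = False


def retrieve(records, index, la_index, la_records, key):
    """Return the record data visible to one client, or None if no key exists for it.

    la_index and la_records are that client's own look-aside entries.
    """
    entry = index.get(key)
    if entry is None:
        return None
    if entry.ucd and entry.uci:              # key deleted, then reinserted
        new_recid = la_index.get(key)        # blocks 407/409
        if new_recid is not None:
            return records[new_recid].data   # block 411
        return records[entry.recid].data     # block 413
    if entry.ucd:                            # key deleted, not committed
        if key in la_index:                  # block 415: this client deleted it
            return None
        return records[entry.recid].data     # block 413
    if entry.uci:                            # key inserted, not committed
        new_recid = la_index.get(key)        # blocks 421/423
        return records[new_recid].data if new_recid is not None else None
    record = records[entry.recid]            # block 425
    if record.modified:                      # block 427
        new_recid = la_records.get(entry.recid)  # blocks 429/431
        if new_recid is not None:
            return records[new_recid].data   # block 411
    return record.data


# FIG. 3B state after transaction1 renames "Smith" to "Jones" (uncommitted).
records = {1: Record({"name": "Smith"}, modified=True), 2: Record({"name": "Jones"})}
index = {"Smith": KeyEntry(1, ucd=True), "Jones": KeyEntry(2, uci=True)}
t1_index, t1_records = {"Smith": 1, "Jones": 2}, {1: 2}
t2_index, t2_records = {}, {}

print(retrieve(records, index, t1_index, t1_records, "Smith"))  # None
print(retrieve(records, index, t1_index, t1_records, "Jones"))  # {'name': 'Jones'}
print(retrieve(records, index, t2_index, t2_records, "Smith"))  # {'name': 'Smith'}
print(retrieve(records, index, t2_index, t2_records, "Jones"))  # None
```

Only the modifying transaction sees the renamed key; every other client keeps reading the committed copy until the change is committed.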
- the IMDB manager reads and writes records from the database using commands specific to the type of database used to store the records. For example, a relational database such as Oracle is accessed using standard SQL commands.
- the invention is not limited to use with only relational databases, but is applicable to any key-based data structure.
- the IMDB manager is responsible for assigning RECIDs to records and for storing the records in the shared memory.
- the IMDB manager is also responsible for creating the corresponding shared memory indices for a record, and for creating and managing the look-aside tables in shared memory.
- the IMDB manager pre-loads entire tables of database records into shared memory, and creates the RECIDs and shared memory indices during an initialization phase.
- the IMDB manager pre-loads only a subset of database records when a range of key values is specified by a client.
- the client transactions can only read information from shared memory and must call the IMDB manager to request modifications to the records and indices.
- Various techniques can be used by the IMDB manager in managing the shared memory. One particular technique is discussed in detail in the next section.
- the client transaction calls the IMDB manager to perform five functions illustrated in FIGS. 5 A-C (modify), FIG. 6 (delete), FIG. 7 (add), FIG. 8 (commit), and FIG. 9 (rollback).
- the IMDB manager creates a look-aside table for a client transaction when the transaction first requests a modification to a record in the shared memory (not illustrated).
- Alternate embodiments in which the IMDB manager creates the shared memory table at different stages in the processing of the transaction will be readily apparent to one of skill in the art and are contemplated as within the scope of the invention.
- the IMDB manager determines if the record has been previously modified by the same client (block 501), i.e., the modification has not yet been committed so a matching record entry exists in the client's look-aside table. If so, then the previously modified copy of the record is used instead of that supplied in the function call (block 503). In an alternate embodiment, the IMDB manager returns an error message if the modified flag is set in the record and a matching entry in the look-aside table is not found, as a check to ensure a client does not attempt to modify a record having uncommitted modifications made by another client.
- the IMDB manager performs a DeleteKey operation on the old value for each key that is to change (block 507 ).
- the DeleteKey operation is described in more detail below in conjunction with FIG. 5B.
- the IMDB manager creates the modified record in shared memory with a NEWRECID (block 509 ). If the record being modified is newly added (block 511 ), i.e., added by the same transaction and not yet committed, the IMDB manager updates the look-aside table entry for the record by replacing the RECID for the previous copy of the record with the NEWRECID for the modified record (block 513 ). The IMDB manager performs an InsertKey operation on the new value for each key that is to change to equate the new key value with the NEWRECID (block 515 ). Duplicate key entries that are detected by the InsertKey operation, as described in more detail below in conjunction with FIG. 5C, cause the record modification to fail. For each key that is not being modified, the IMDB manager updates all the corresponding key entries for the appropriate indices in shared memory with the NEWRECID (block 517 ).
- the IMDB performs an InsertKey operation on the new value for each key that is to change to equate the new key with the OLDRECID of the copy of the record before the current modification (block 519 ).
- the retrieval function described above maps the new key to the NEWRECID for the client that modifies the record; the new key does not exist for the other clients.
- If the key is a duplicate (block 521), the record modification fails.
- the record entry in the look-aside table is updated by replacing the RECID for the previously modified record with the NEWRECID for the current modified record (block 525 ).
- the DeleteKey operation is illustrated in FIG. 5B and performed by the IMDB manager when executing the modify and delete functions.
- the IMDB manager determines if an index entry in the look-aside table exists with the same key value that is being deleted (block 531 ). If not, then the IMDB manager creates a new index entry in the look-aside table that contains the deleted key value and RECID of the corresponding record (block 533 ). The IMDB manager also marks the key entry for the deleted value in the index table as uncommitted-deleted (block 535 ).
- the IMDB manager determines if the corresponding key entry in the index table is marked as uncommitted-inserted (block 537 ). If not, the entry must be marked as both uncommitted-deleted and uncommitted-inserted so the index entry is retained and the key entry is remarked as uncommitted-deleted (block 535 ). If the key entry is marked as uncommitted-inserted at block 537 , then both the existing index entry and the key entry are deleted (blocks 539 and 541 ).
- the InsertKey operation is illustrated in FIG. 5C and performed by the IMDB manager when executing the modify and add functions.
- the IMDB manager determines if an index entry in the look-aside table exists with the same key value that is being inserted (block 551 ). If not, then the IMDB manager creates a new index entry in the look-aside table that contains the new key value and the RECID specified in the InsertKey operation (block 553 ).
- the IMDB manager also inserts an entry for the new key value in the index table and marks the entry as uncommitted-inserted (block 555 ).
- the IMDB manager determines if the key entry in the index table is marked uncommitted-inserted (block 557 ). If so, then the key to be added is a duplicate and an error flag is set (block 559 ). If the key entry is not marked uncommitted-inserted, then the entry must be uncommitted-deleted. Therefore, the existing key entry is marked as both uncommitted-deleted and uncommitted-inserted (block 561 ), the existing index entry in the look-aside table is deleted (block 563 ), and a new index entry containing the reinserted key value and the NEWRECID for the reinserted record is created (block 565 ).
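The DeleteKey and InsertKey transitions described above can be sketched as a small state machine. This is a minimal illustration for a single transaction, not the patented implementation: the dictionaries `index` and `look_aside`, the function names, and the `DuplicateKey` exception are hypothetical, and the check against an already committed key in `insert_key` is an assumption that the text does not spell out.

```python
UCD, UCI = "uncommitted-deleted", "uncommitted-inserted"

class DuplicateKey(Exception):
    """Raised where the text says an error flag is set (block 559)."""

def delete_key(index, look_aside, key, recid):
    if key not in look_aside:
        # Blocks 533 and 535: record the deletion and mark the key entry.
        look_aside[key] = recid
        index[key] = {UCD}
    elif index[key] == {UCI}:
        # Blocks 539 and 541: the key was inserted by this transaction,
        # so both the look-aside entry and the key entry disappear.
        del look_aside[key]
        del index[key]
    else:
        # Key was deleted and reinserted; keep the look-aside entry and
        # re-mark the key entry as uncommitted-deleted (block 535).
        index[key] = {UCD}

def insert_key(index, look_aside, key, recid):
    if key not in look_aside:
        if key in index:  # assumption: a key already in the index is a duplicate
            raise DuplicateKey(key)
        # Blocks 553 and 555.
        look_aside[key] = recid
        index[key] = {UCI}
    elif UCI in index[key]:
        raise DuplicateKey(key)  # block 559
    else:
        # Blocks 561 to 565: reinsertion of a key this transaction deleted.
        index[key] = {UCD, UCI}
        look_aside[key] = recid
```

Deleting a committed key and then reinserting it leaves the key entry carrying both markings, with the look-aside entry pointing at the new record.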
- the IMDB manager determines if the record was previously modified (block 601 ) so that the modified record can be used rather than the record specified in the function call (block 603 ). As described in conjunction with FIG. 5, in an alternate embodiment, the IMDB manager checks if the same client performed the previous modification and returns an error if not.
- the IMDB manager performs the DeleteKey operation illustrated in FIG. 5B for each key in the deleted record (block 605 ). If the record is newly added (block 607 ), the IMDB deletes the corresponding record entry in the look-aside table (block 609 ) and deletes the newly added record from shared memory (block 611 ).
- the IMDB manager deletes the record entry in the look-aside table (block 615 ) and deletes the modified record from the shared memory (block 617 ).
- the IMDB manager also creates a new record entry in the look-aside table that has a null value for the new RECID to denote that the record has been deleted (block 619 ).
- the null RECID entry is found by hashing on the RECID of the deleted record. If the record is neither newly added nor previously modified, the IMDB manager marks the record as modified (block 621 ) and creates the new null record entry at block 619 .
- FIG. 7 illustrates the acts performed by the IMDB manager when a client requests that a record be added to the database.
- the IMDB manager creates the new record in the shared memory marked as modified (block 701 ), adds a record entry containing the RECID of the new record to the look-aside table (block 703 ), and performs the InsertKey operation illustrated in FIG. 5C for each key in the record (block 705 ). If any of the keys duplicate existing key values (block 707 ), the record is not added.
- Commit and rollback processes are mirror images of each other.
- the client calls the IMDB manager to update the shared memory to reflect the modifications made by the client as shown in FIG. 8.
- the IMDB manager reads each entry in the look-aside table for the client (block 801 ) and determines what type of entry it is. The methods used to determine the entry type depend on the data structure of the look-aside table, as one of skill in the art will immediately appreciate. The details of a particular look-aside table are described in the next section.
- the IMDB manager updates the corresponding key entries in the index tables for the record by replacing the original RECID in the key entries with the RECID for the modified record (block 804 ).
- the IMDB manager also deletes the original record from the shared memory (block 807 ). If the entry is for a deleted record (block 805 ), the IMDB deletes the original record from the shared memory (block 807 ). If the entry is an index entry corresponding to an added key (block 809 ), the IMDB manager removes the UCI marking from the key entry in the shared memory (block 811 ).
- the IMDB manager deletes the key entry from the shared memory (block 815 ). If the entry is an index entry corresponding to a key that has been reinserted (block 817 ), the IMDB manager removes the UCD and UCI markings from the key entry in the shared memory (block 819 ) and updates the key entry with the RECID from the corresponding index entry in the look-aside table (block 821 ). Note that if the entry is for an added record, the IMDB manager takes no action because the newly added indices when committed point to where the new record is stored in shared memory. Once all entries in the look-aside table have been processed (block 823 ), the IMDB manager deletes the look-aside table from the shared memory (block 825 ).
- When a client does not commit its changes (aborts), it requests that the IMDB manager roll back the shared memory to a point prior to the changes by discarding all the modifications in shared memory (FIG. 9).
- the IMDB manager reads each entry from the look-aside table (block 901 ) and determines the type of entry as explained above in conjunction with FIG. 8. If the entry is for a modified record (block 903 ), the IMDB manager clears the modified flag from the original record in the shared memory (block 905 ) and deletes the modified (new) record from the shared memory (block 909 ). If the entry is for an added record (block 907 ), the IMDB manager deletes the new record from the shared memory (block 909 ).
- the IMDB manager deletes the new key entry from the shared memory (block 913 ). If the entry is an index entry for a deleted key (block 915 ), the IMDB manager removes the uncommitted-deleted (UCD) marking from the key entry in the shared memory (block 917 ). If the entry is an index entry for a reinserted key (block 919 ), the IMDB manager removes the UCD and UCI markings from the key entry in the shared memory (block 921 ). Note that when the entry is for a deleted record, the IMDB manager takes no action because the indices when rolled back will point to the original record in the shared memory. Once all entries in the look-aside table have been processed (block 923 ), the IMDB manager deletes the look-aside table from the shared memory (block 925 ).
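The mirror-image handling of index entries at commit and rollback can be sketched as follows. This is a simplified model under stated assumptions: the `index` and `look_aside` dictionaries and the `recid` and `marks` field names are hypothetical, only index entries are modeled (record entries and the back-end database write are omitted), and the marking combination on the key entry is used to tell added, deleted, and reinserted keys apart.

```python
UCD, UCI = "uncommitted-deleted", "uncommitted-inserted"

def commit(index, look_aside):
    """Apply one transaction's index entries to the shared index."""
    for key, new_recid in look_aside.items():
        entry = index[key]
        if entry["marks"] == {UCI}:
            entry["marks"] = set()       # added key: drop the UCI marking
        elif entry["marks"] == {UCD}:
            del index[key]               # deleted key: remove the key entry
        else:
            entry["marks"] = set()       # reinserted key: drop both markings
            entry["recid"] = new_recid   # and point at the new record
    look_aside.clear()                   # the look-aside table is discarded

def rollback(index, look_aside):
    """Discard one transaction's index entries."""
    for key in look_aside:
        entry = index[key]
        if entry["marks"] == {UCI}:
            del index[key]               # added key: remove the key entry
        else:
            entry["marks"] = set()       # deleted or reinserted: unmark only
    look_aside.clear()
```

In this sketch a reinserted key ends up pointing at the new record after commit and still points at the original record after rollback.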
- DTC Distributed Transaction Coordinator
- the in-memory database system employed by the DTC uses page latches to control access to shared memory, and special hash table data structures and hash functions to implement the look-aside table and a transaction table.
- the shared memory for the IMDB is divided into logical fixed length pages.
- the records and index keys from the database are cached on the shared memory pages by the IMDB manager (core process).
- the index keys cached in the shared memory are arranged in balanced (B+) tree structures for quick access.
- the look-aside tables for the client processes are also cached on the shared memory pages.
- the core process maintains a transaction table in the shared memory which associates a transaction identifier, such as a globally unique identifier (GUID), with its look-aside table.
- the client processes are permitted only read access to the look-aside tables and the transaction table.
- a shared memory page comprises a header, a timestamp array, a slot array, and a data section.
- the header contains a page identifier, the number of entries (data base records, index keys, look-aside tables) stored on the page, a pointer to free space within the data section, and the size of the free space.
- the timestamp array stores a timestamp value for each page entry.
- the slot array contains one slot for each page entry; each slot contains the offset of the entry from the start of the data section and the length of the entry.
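The page layout above (header, timestamp array, slot array, data section) can be illustrated with a small model. This is a sketch only: the class and method names are hypothetical, and the data section is a growable byte buffer here rather than the remainder of a fixed-length page.

```python
from dataclasses import dataclass, field

@dataclass
class Page:
    page_id: int                                      # header: page identifier
    timestamps: list = field(default_factory=list)    # one timestamp per entry
    slots: list = field(default_factory=list)         # (offset, length) per entry
    data: bytearray = field(default_factory=bytearray)

    @property
    def entry_count(self):
        # Header field: number of entries stored on the page.
        return len(self.slots)

    def add(self, payload: bytes, timestamp: int) -> int:
        # The free-space pointer is modeled as the current end of the data section.
        self.slots.append((len(self.data), len(payload)))
        self.timestamps.append(timestamp)
        self.data += payload
        return len(self.slots) - 1                    # slot number of the new entry

    def read(self, slot: int) -> bytes:
        offset, length = self.slots[slot]
        return bytes(self.data[offset:offset + length])
```

Because each slot stores an offset and a length, an entry can be located without scanning the data section.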
- Page latches are a synchronization mechanism which ensures the consistency of the data on a page while a transaction is accessing the page.
- the page latches are associated with the page and thus can be maintained for multiple transactions operating on a page.
- page latches are of short duration, lasting for only as long as necessary to read or write data to the page. These characteristics also mean that page latches are not subject to deadlocks.
- traditional database locks are associated with a single transaction to keep the transaction consistent, are held for the duration of the transaction, and can incur deadlock situations which require the implementation of complex deadlock detection and resolution algorithms.
- Each page also has multiple shared page latches. Any process (client or core) can obtain a shared page latch which allows the holder to read data from the page. There are as many shared page latches active at one time as there are transactions accessing the page. Note that a transaction having many threads of execution will use only a single shared page latch for all the threads.
- Because page latches are meant for short duration operations and no deadlock detection scheme is used for them, the client and core processes are designed to obtain page latches in such a way as to prevent deadlock. Typically a thread of execution will obtain only a single latch at a time. However, when multiple latches are required, a predetermined ordering is used. When multiple index pages in the B+ tree structure must be latched, a parent page is latched before any of its child pages. When multiple pages at the same level in the index, or multiple data pages, must be latched, the pages are latched in physical order.
- the page latches for a data page are not stored on the data page because the client process must have write access to the page latch itself in order to obtain the latch and only the core process has write access to the data pages. Instead the page latches are stored in a region of shared memory separate from the database pages themselves and shared by the core and client processes in write mode. In the DTC implementation, the page latch memory region contains eight bytes of latch data for each data page in the shared memory. Therefore, a particular page latch can be found by using the page number to determine the offset for the page latch shared memory, e.g., for page i, the offset in the shared page latch table is i*8.
- Each page latch consists of two fields (both 32 bits in length): dwShareCount, which counts the shared latches currently held on the page, and fExclusive, which is set to indicate there is an exclusive latch requested on the page.
- a page is share latched if dwShareCount is greater than zero.
- a page is exclusively latched if dwShareCount is zero and fExclusive is set (equal to one).
- a page is share latched but the core process is waiting for an exclusive latch if dwShareCount greater than zero and fExclusive is one.
- Increment dwShareCount using an InterlockedIncrement instruction (an instruction that guarantees that only one thread at a time will increment the count; multiple threads trying to increment the count are processed in a serial fashion).
- a thread can only acquire a shared latch if no other thread has an exclusive latch or is waiting for an exclusive latch. Note that after incrementing the share count, the thread determines if fExclusive is set because, in the interval, another thread may come along and successfully obtain an exclusive latch as described in more detail below.
- a thread releases a shared latch by using InterlockedDecrement to decrement dwShareCount.
- a thread releases an exclusive latch by using InterlockedCompareExchange to set fExclusive to zero.
- the InterlockedCompareExchange instruction sets a memory variable to a value only if the memory compares equal to another value.
- the above procedure calls InterlockedCompareExchange(&fExclusive, 1, 0) so that InterlockedCompareExchange will only set fExclusive to one if fExclusive is equal to zero.
- InterlockedCompareExchange can be implemented either on the underlying processor or in the operating system using other synchronization primitives provided by the processor.
- the thread waits for dwShareCount to fall to zero.
- latches are meant for short duration operations so that the share count falls to zero relatively quickly as other threads release their share latches and because no thread can acquire a shared latch on the page since shared latches cannot be acquired when fExclusive is set.
- Because the client processes are running untrusted application code, it is possible that a client process can die while holding a share latch. To recover from this situation, the core process resets the share count if it is unable to acquire an exclusive latch after some period of time (e.g., 5 seconds). The core process does not reset an exclusive latch since exclusive latches are only obtained by the core process threads and the core process only runs trusted code.
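The latch protocol described above can be modeled in a few lines. This is a hedged sketch rather than the DTC code: the class and method names are hypothetical, a `threading.Lock` stands in for the atomicity that InterlockedIncrement, InterlockedDecrement, and InterlockedCompareExchange provide, and backing out the increment when fExclusive turns out to be set is an assumption about a step the text does not spell out.

```python
import threading

LATCH_BYTES = 8  # eight bytes of latch data per page

def latch_offset(page_number):
    """Offset of a page's latch in the separate latch region (i * 8)."""
    return page_number * LATCH_BYTES

class PageLatch:
    def __init__(self):
        self._atomic = threading.Lock()  # stand-in for Interlocked* atomicity
        self.share_count = 0             # models dwShareCount
        self.exclusive = 0               # models fExclusive

    def _add(self, delta):
        with self._atomic:
            self.share_count += delta

    def _compare_exchange(self, new, expected):
        # Sets `exclusive` to `new` only if it equals `expected`; returns the old value.
        with self._atomic:
            old = self.exclusive
            if old == expected:
                self.exclusive = new
            return old

    def try_acquire_shared(self):
        self._add(1)
        if self.exclusive:               # re-check after incrementing
            self._add(-1)                # assumption: back out and let the caller retry
            return False
        return True

    def release_shared(self):
        self._add(-1)

    def try_request_exclusive(self):
        return self._compare_exchange(1, 0) == 0

    def exclusive_granted(self):
        # The requester then waits until the share count falls to zero.
        return self.exclusive == 1 and self.share_count == 0

    def release_exclusive(self):
        self._compare_exchange(0, 1)
```

Once an exclusive request is pending, new shared acquisitions fail, so the share count can only fall as existing holders release their latches.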
- Both the look-aside tables and the transaction table are implemented as hash table data structures.
- the look-aside table data structures are designed to give very high performance and can be scaled to different sizes, as described further below, to accommodate varying numbers of transactions and updates.
- the index and record entries described in the two previous sections are kept in the look-aside tables along with some miscellaneous entries.
- a record entry 1001 comprises three fields: a record identifier for the RECID of the unmodified record 1002 , a record identifier for the RECID of the modified record 1003 , and a bitmap 1004 used to denote which columns of the record have been modified. If a record is modified multiple times by a transaction, the later changes are OR'd together with the existing bitmap 1004 to create a new bitmap. The bitmap is used to construct the proper database calls when writing committed changes to a back-end database as part of the commit process.
- An index entry 1011 comprises five fields: a RECID 1012 for the key, two key length fields 1013 , 1014 for the key and the primary key respectively, an identifier 1015 for the index for the key, and a RECID 1016 of the new data record associated with the key if the key was deleted and then reinserted as described in the previous section.
- Because keys can be variable length in the DTC implementation, the key itself is allocated to a separate record to permit fixed length look-aside table entries.
- the key entry in the index serves as the separate key record for the look-aside table; in an alternate embodiment, the separate key record is distinct from the key entry so that dynamic allocation of additional keys to the index does not require changes in the index entry 1011 .
- the key can be stored in the look-aside table entry if variable length table entries are supported or if the key is restricted to fixed-length values.
- the primary key field 1014 is null.
- a combination of the key and the primary key is used for the index entry and thus both fields 1013 and 1014 contain valid values.
- the particular index or record entry is found by translating a search key into a table address using a hash function shared between the core and client processes.
- the RECID is the search key for record entries.
- a combination of a database table identifier (which identifies the database table with which the index is associated), the index identifier, and the key value is used as the search key for index entries.
- a RECID is eight bytes long where five bytes specify the shared memory page number, one byte specifies the page sequence number, nine bits specify a slot on the page, and seven bits specify the slot sequence number.
- the slot sequence number and the page sequence number are used to distinguish recycled or overflow slots and pages. However, the sequence numbers are not useful in distinguishing one record from another when searching the look-aside table and so only the page number and slot are input into the hash function.
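The eight-byte RECID layout can be illustrated by packing the four fields into a 64-bit integer (40 + 8 + 9 + 7 bits). The field widths come from the description above; the order of the fields within the 64 bits and the function names are assumptions made for illustration.

```python
PAGE_BITS, PAGE_SEQ_BITS, SLOT_BITS, SLOT_SEQ_BITS = 40, 8, 9, 7

def pack_recid(page, page_seq, slot, slot_seq):
    """Pack the four RECID fields into one 64-bit integer (field order assumed)."""
    fields = ((page, PAGE_BITS), (page_seq, PAGE_SEQ_BITS),
              (slot, SLOT_BITS), (slot_seq, SLOT_SEQ_BITS))
    recid = 0
    for value, bits in fields:
        if not 0 <= value < (1 << bits):
            raise ValueError(f"{value} does not fit in {bits} bits")
        recid = (recid << bits) | value
    return recid

def unpack_recid(recid):
    slot_seq = recid & 0x7F
    slot = (recid >> 7) & 0x1FF
    page_seq = (recid >> 16) & 0xFF
    page = recid >> 24
    return page, page_seq, slot, slot_seq

def hash_inputs(recid):
    """Only the page number and slot feed the hash; sequence numbers are ignored."""
    page, _, slot, _ = unpack_recid(recid)
    return page, slot
```

Two RECIDs that differ only in their sequence numbers yield the same hash inputs, which matches the statement that the sequence numbers do not help distinguish records in the look-aside table.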
- the algorithm used by the hash function for record search keys in the DTC implementation is
- ^ specifies a bitwise exclusive OR operation and << specifies a left shift operation.
- the search key for an index entry comprises a database table identifier, an index identifier (indexid), and the key value.
- the database table identifier is a sixteen byte database identifier (DBID) and a double word (32-bit) object identifier (OBJID) assigned by the operating system.
- hash = OBJID ^ (DBID << 16) ^ indexid << 12 ^ keyhash
- the value of “hash” produced by the algorithms is divided by the maximum number of entries in the look-aside table and the remainder is used as an address for the index or record entry.
- the hash algorithms are designed to produce a look-aside table address for an entry which is reasonably unique within the table, and falls in the range of zero to one less than the table size.
- Hash duplicates, or collisions, occur when a record already exists at the table address calculated by the hash function for a new record.
- the IMDB uses a linked list collision resolution scheme in which the new record is allocated to a space in shared memory and is linked to the hash address as illustrated in FIGS. 12 and 13 below.
- the value of the search key RECID is compared with the appropriate RECID field in each hash duplicate entry to find the correct entry.
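The addressing and collision handling described above can be sketched with a chained hash table. This is an illustrative model only: the class name and methods are hypothetical, Python lists stand in for the linked lists allocated in shared memory, and the hash value is supplied by the caller rather than computed by the DTC hash function.

```python
class ChainedTable:
    def __init__(self, size=17):
        self.size = size
        self.buckets = [[] for _ in range(size)]   # one chain per table address

    def address(self, hash_value):
        # The remainder after dividing by the table size is the entry address.
        return hash_value % self.size

    def put(self, hash_value, search_key, entry):
        chain = self.buckets[self.address(hash_value)]
        for i, (key, _) in enumerate(chain):
            if key == search_key:                  # same search key: replace in place
                chain[i] = (search_key, entry)
                return
        chain.append((search_key, entry))          # collision: link a new entry

    def get(self, hash_value, search_key):
        # Walk the hash duplicates and compare search keys to find the right entry.
        for key, entry in self.buckets[self.address(hash_value)]:
            if key == search_key:
                return entry
        return None
```

With a table of seventeen entries, hash values 3 and 20 land on the same address, and the lookup distinguishes them by comparing the stored search key.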
- each entry 1101 comprises a GUID 1102 for a transaction and the shared memory address 1103 for the look-aside table associated with the transaction.
- the GUID 1102 is a 16-byte (four 32-bit words) globally unique identifier assigned by the operating system.
- An entry is located within the transaction table 1100 by exclusively OR'ing the four words of the GUID, dividing the result by the maximum number of entries in the transaction table, and using the remainder to address the entry. Hash duplicates are handled as described above for the look-aside table.
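The transaction-table addressing can be written out directly: XOR the four 32-bit words of the GUID and take the remainder modulo the table size. In this sketch the function name is hypothetical and the little-endian split of the 16 bytes into words is an assumption, since the text does not state a byte order.

```python
import struct
import uuid

def transaction_table_address(guid: uuid.UUID, table_size: int) -> int:
    """XOR the four 32-bit words of the GUID, then reduce modulo the table size."""
    w0, w1, w2, w3 = struct.unpack("<4I", guid.bytes)  # byte order is an assumption
    return (w0 ^ w1 ^ w2 ^ w3) % table_size

# Example GUID built from four known words so the arithmetic is easy to follow:
# 1 ^ 2 ^ 4 ^ 8 == 15, and 15 % 17 == 15.
example = uuid.UUID(bytes=struct.pack("<4I", 1, 2, 4, 8))
address = transaction_table_address(example, 17)
```

The result always falls between zero and one less than the table size, as required for a table address.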
- the address of the transaction table in the shared memory is stored in a fixed location in the shared memory so that it can always be found by the client processes.
- the transaction and look-aside tables reside on fixed length shared memory pages and are capable of being resized when necessary. Both tables are designed to be allocated in various sizes with the smallest table having seventeen entries and the largest having 866,586 entries (the number of entries that fit on 1974 shared memory pages). There are four other intermediate sizes in the DTC implementation: 127, 439 (the number of entries that fit on one shared memory page), 7463 (the number of entries that fit on seventeen shared memory pages), and 55,753 (the number of entries that fit on 127 shared memory pages).
- the table size is factored into the hashing function as described above so that the resulting entry address falls within the number of entries for that size of table. Alternate table sizes are contemplated as within the scope of the invention.
- a hash table that fits on a single shared memory page is illustrated in FIG. 12, e.g., a hash table with seventeen, 127, or 439 entries in the DTC implementation.
- a hash table that spans multiple shared memory pages is illustrated in FIG. 13, e.g., a hash table of 7463, 55,753 or 866,586 entries in the DTC implementation.
- the first four bytes 1201 , 1301 contain the current size of the hash table.
- Both figures also illustrate the use of linked lists 1204 , 1306 to handle collisions and overflow among table entries 1203 , 1303 respectively.
- the difference in the two data structures is that the larger sized hash table 1300 uses a two level page linking mechanism.
- the first level 1301 is an array of page entries 1303 that point to pages 1304 which contain the hash entries 1303 comparable with the hash entries 1203 of hash table 1200 .
- the smallest three hash table sizes are single level data structures as shown in FIG. 12.
- the larger three hash table sizes are two level data structures as shown in FIG. 13.
- a transaction or look-aside table is resized to the next size if the current table size is not the maximum allowed size and the number of entries in the current table is greater than the maximum number of entries allowed under the current size. Performance can also be degraded if a transaction or look-aside table is too large since the dedicated but unused space in shared memory cannot be allocated to other data. Therefore, a table is shrunk to a smaller size if the number of entries is less than one half the number of entries in the next smaller sized table.
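The resizing rule can be checked with a short sketch. The six table sizes are those given for the DTC implementation, where the multi-page sizes are multiples of 439 entries per page (439 × 17 = 7463, 439 × 127 = 55,753, 439 × 1974 = 866,586). The function name is hypothetical, and moving one size step per call is an assumption.

```python
ENTRIES_PER_PAGE = 439
SIZES = [17, 127, ENTRIES_PER_PAGE,
         ENTRIES_PER_PAGE * 17, ENTRIES_PER_PAGE * 127, ENTRIES_PER_PAGE * 1974]

def next_size(current, entries):
    """Return the table size to use after one grow/shrink decision."""
    i = SIZES.index(current)
    # Grow: not already the maximum size and more entries than the current size allows.
    if i + 1 < len(SIZES) and entries > current:
        return SIZES[i + 1]
    # Shrink: fewer than half the entries of the next smaller table.
    if i > 0 and 2 * entries < SIZES[i - 1]:
        return SIZES[i - 1]
    return current
```

For example, a 17-entry table holding 18 entries grows to 127, while a 127-entry table holding 8 entries shrinks back to 17 because 8 is less than half of 17.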
- the process of resizing a transaction or look-aside table is the responsibility of the core process which acquires an exclusive latch on the page or pages involved so that all client processes are denied access to the look-aside table during resizing. All entries in the old table are deleted from the old table and are added to the new table. Each entry is rehashed because the hash function for the new table can result in a different table address for the entry than its table address in the old table.
- An in-memory database system has been described that enables multiple concurrent read-only access to database records through a unique versioning scheme based on look-aside tables associated with modifying transactions.
Abstract
Description
- This invention relates generally to databases, and more particularly to enabling multiple concurrent read-only access to database records.
- A portion of the disclosure of this patent document contains material which is subject to copyright protection. The copyright owner has no objection to the facsimile reproduction by anyone of the patent document or the patent disclosure as it appears in the Patent and Trademark Office patent file or records, but otherwise reserves all copyright rights whatsoever. The following notice applies to the software and data as described below and in the drawing hereto: Copyright © 1997, Microsoft Corporation, All Rights Reserved.
- Existing database systems employ a database manager that controls reads and writes on the database records to guarantee consistency of the data. A transaction issues a record request to the database manager which is executed by switching between the context for the transaction and that for the database manager, typically a very expensive operation in terms of processing cycles. The reverse context switch is performed when the database manager completes the request and returns data to the transaction. However, when a transaction is only reading data and not making changes, the context switch introduces unnecessary overhead and slows the processing of the read-only transaction.
- When the database manager immediately changes the data in the database in response to an update request, the database manager must reverse the changes using a rollback mechanism if the requesting transaction aborts. Therefore, in order to present a consistent view of the data to another transaction, the database manager either denies access to the changed data until the modifying transaction commits the changes, or permits the other transaction access to the data but must also roll back the other transaction if the modifying transaction aborts. The processing of read-only transactions is thus slowed when they execute concurrently with transactions that update common data.
- Therefore, a database system is needed which permits read-only transactions direct access to data and which presents a consistent view of data to a transaction without the complications involved with standard rollback procedures.
- The above-mentioned shortcomings, disadvantages and problems are addressed by the present invention, which will be understood by reading and studying the following specification.
- An in-memory database system uses a shared memory to cache records and keys read from a database and controls the updating of the records and keys through a database manager process. When a transaction performs an update, the original, unmodified data is preserved in the shared memory, the new data is written to the shared memory, and a look-aside table for the transaction records the changes. A transaction performs read-only access to the shared memory using its own context while a versioning scheme based on the look-aside tables ensures a read-committed isolation level view of the original, unmodified data until the modifying transaction commits the update. The database manager is responsible for writing the new data into the shared memory and for maintaining the look-aside tables for all transactions which have made modifications to the data in the shared memory. The database manager also writes committed changes to the database and performs rollback on uncommitted changes in the shared memory using the entries in the look-aside table for the committing/aborting transaction. The shared memory is divided into logical pages and short duration page latches are employed to maintain consistency on the page while a transaction or the database manager is reading or writing data on the page.
- A method of controlling access to database records which are stored in memory shared among multiple processes is described as creating record and/or index entries in a look-aside table, preserving the original data in the shared memory, and allowing a process access to the modified data if a corresponding record and/or index entry exists in the look-aside table for the process. The method also performs rollback and abort processing using the look-aside table.
- The in-memory database system is described as having a plurality of clients which manipulate data, a shared memory for caching the data, an in-memory database manager that creates the look-aside table entries and writes changes to the shared memory. The details of data structures and page latches used by the in-memory database system are given. A particular implementation of the in-memory database system is also described.
- The present invention describes systems, clients, servers, methods, and computer-readable media of varying scope. In addition to the aspects and advantages of the present invention described in this summary, further aspects and advantages of the invention will become apparent by reference to the drawings and by reading the detailed description that follows.
- FIG. 1 shows a diagram of the hardware and operating environment in conjunction with which embodiments of the invention may be practiced;
- FIG. 2 is a diagram illustrating a system-level overview of an exemplary embodiment of the invention;
- FIGS. 3A and 3B are time line diagrams illustrating the interactions of two client processes operating in the exemplary embodiment shown in FIG. 2;
- FIG. 4 is a flowchart of a method to be performed by a client process according to an exemplary embodiment of the invention;
- FIGS. 5A, 5B, 5C, 6, 7, 8 and 9 are flowcharts of methods to be performed by a database manager process according to an exemplary embodiment of the invention;
- FIG. 10 is a diagram of a look-aside data structure for use in an exemplary implementation of the invention;
- FIG. 11 is a diagram of a transaction data structure for use in an exemplary implementation of the invention;
- FIG. 12 is a diagram of a single level hash table data structure for use in an exemplary implementation of the invention; and
- FIG. 13 is a diagram of a two level hash table data structure for use in an exemplary implementation of the invention.
- In the following detailed description of exemplary embodiments of the invention, reference is made to the accompanying drawings which form a part hereof, and in which is shown by way of illustration specific exemplary embodiments in which the invention may be practiced. These embodiments are described in sufficient detail to enable those skilled in the art to practice the invention, and it is to be understood that other embodiments may be utilized and that logical, mechanical, electrical and other changes may be made without departing from the spirit or scope of the present invention. The following detailed description is, therefore, not to be taken in a limiting sense, and the scope of the present invention is defined only by the appended claims.
- The detailed description is divided into five sections. In the first section, the hardware and the operating environment in conjunction with which embodiments of the invention may be practiced are described. In the second section, a system level overview of the invention is presented. In the third section, methods for an exemplary embodiment of the invention are provided. In the fourth section, a particular implementation of the invention is described that operates as part of Microsoft Corp.'s Distributed Transaction Coordinator. Finally, in the fifth section, a conclusion of the detailed description is provided.
- FIG. 1 is a diagram of the hardware and operating environment in conjunction with which embodiments of the invention may be practiced. The description of FIG. 1 is intended to provide a brief, general description of suitable computer hardware and a suitable computing environment in conjunction with which the invention may be implemented. Although not required, the invention is described in the general context of computer-executable instructions, such as program modules, being executed by a computer, such as a personal computer. Generally, program modules include routines, programs, objects, components, data structures, etc., that perform particular tasks or implement particular abstract data types.
- Moreover, those skilled in the art will appreciate that the invention may be practiced with other computer system configurations, including hand-held devices, multiprocessor systems, microprocessor-based or programmable consumer electronics, network PCs, minicomputers, mainframe computers, and the like. The invention may also be practiced in distributed computing environments where tasks are performed by remote processing devices that are linked through a communications network. In a distributed computing environment, program modules may be located in both local and remote memory storage devices.
- The exemplary hardware and operating environment of FIG. 1 for implementing the invention includes a general purpose computing device in the form of a computer 20, including a processing unit 21, a system memory 22, and a system bus 23 that operatively couples various system components, including the system memory, to the processing unit 21. There may be only one or there may be more than one processing unit 21, such that the processor of computer 20 comprises a single central-processing unit (CPU), or a plurality of processing units, commonly referred to as a parallel processing environment. The computer 20 may be a conventional computer, a distributed computer, or any other type of computer; the invention is not so limited.
- The system bus 23 may be any of several types of bus structures including a memory bus or memory controller, a peripheral bus, and a local bus using any of a variety of bus architectures. The system memory may also be referred to as simply the memory, and includes read only memory (ROM) 24 and random access memory (RAM) 25. A basic input/output system (BIOS) 26, containing the basic routines that help to transfer information between elements within the computer 20, such as during start-up, is stored in ROM 24. The computer 20 further includes a hard disk drive 27 for reading from and writing to a hard disk, not shown, a magnetic disk drive 28 for reading from or writing to a removable magnetic disk 29, and an optical disk drive 30 for reading from or writing to a removable optical disk 31 such as a CD ROM or other optical media.
- The hard disk drive 27, magnetic disk drive 28, and optical disk drive 30 are connected to the system bus 23 by a hard disk drive interface 32, a magnetic disk drive interface 33, and an optical disk drive interface 34, respectively. The drives and their associated computer-readable media provide nonvolatile storage of computer-readable instructions, data structures, program modules and other data for the computer 20. It should be appreciated by those skilled in the art that any type of computer-readable media which can store data that is accessible by a computer, such as magnetic cassettes, flash memory cards, digital video disks, Bernoulli cartridges, random access memories (RAMs), read only memories (ROMs), and the like, may be used in the exemplary operating environment.
- A number of program modules may be stored on the hard disk, magnetic disk 29, optical disk 31, ROM 24, or RAM 25, including an operating system 35, one or more application programs 36, other program modules 37, and program data 38. A user may enter commands and information into the personal computer 20 through input devices such as a keyboard 40 and pointing device 42. Other input devices (not shown) may include a microphone, joystick, game pad, satellite dish, scanner, or the like. These and other input devices are often connected to the processing unit 21 through a serial port interface 46 that is coupled to the system bus, but may be connected by other interfaces, such as a parallel port, game port, or a universal serial bus (USB). A monitor 47 or other type of display device is also connected to the system bus 23 via an interface, such as a video adapter 48. In addition to the monitor, computers typically include other peripheral output devices (not shown), such as speakers and printers.
- The computer 20 may operate in a networked environment using logical connections to one or more remote computers, such as remote computer 49. These logical connections are achieved by a communication device coupled to or a part of the computer 20; the invention is not limited to a particular type of communications device. The remote computer 49 may be another computer, a server, a router, a network PC, a client, a peer device or other common network node, and typically includes many or all of the elements described above relative to the computer 20, although only a memory storage device 50 has been illustrated in FIG. 1. The logical connections depicted in FIG. 1 include a local-area network (LAN) 51 and a wide-area network (WAN) 52. Such networking environments are commonplace in offices, enterprise-wide computer networks, intranets and the Internet.
- When used in a LAN-networking environment, the computer 20 is connected to the local network 51 through a network interface or adapter 53, which is one type of communications device. When used in a WAN-networking environment, the computer 20 typically includes a modem 54, a type of communications device, or any other type of communications device for establishing communications over the wide area network 52, such as the Internet. The modem 54, which may be internal or external, is connected to the system bus 23 via the serial port interface 46. In a networked environment, program modules depicted relative to the personal computer 20, or portions thereof, may be stored in the remote memory storage device. It is appreciated that the network connections shown are exemplary and other means of and communications devices for establishing a communications link between the computers may be used.
- The hardware and operating environment in conjunction with which embodiments of the invention may be practiced has been described. The computer in conjunction with which embodiments of the invention may be practiced may be a conventional computer, a distributed computer, or any other type of computer; the invention is not so limited. Such a computer typically includes one or more processing units as its processor, and a computer-readable medium such as a memory. The computer may also include a communications device such as a network adapter or a modem, so that it is able to communicatively couple to other computers.
- A system level overview of the operation of an exemplary embodiment of the invention is described by reference to FIG. 2. As shown in FIG. 2, an in-memory database system 200 comprises an in-memory database (IMDB) manager 201 and shared memory 202 in a computer such as local computer 20 in FIG. 1. The IMDB manager 201 is responsible for reading records from a database 220 into shared memory 202, and for writing them from shared memory 202 back to the database 220, on behalf of a client process 210. Database 220 can be resident on the same computer as the in-memory database system 200 or can be located on a different computer such as remote computer 49 in FIG. 1. The client process 210 can reside on the same computer as the in-memory database system 200 or can execute on a different computer as long as the client process 210 can address the shared memory 202.
- Because the client process 210 can address the shared memory 202 through its context, the client process can directly access the records in shared memory 202 without having to call the IMDB manager. In the exemplary embodiment, the client process 210 has read-only access to the records and calls the IMDB manager to modify or delete an existing record or to create a new record.
- FIG. 3A is a time line diagram illustrating the interactions of two client processes in accordance with the exemplary embodiment of the invention. Each client process is represented by a database transaction which performs operations on database records. In FIG. 3A, the two database transactions access the same database employee record for an employee named “Smith.” The primary key for the employee records is the employee number, which in the case of employee Smith is “123.” The actions described below are divided among the transactions for the client processes and the IMDB manager 201 when one client process modifies a database record.
- Transaction1 executes a retrieve command on the employee record “123” which returns copy 301 of the employee record from shared memory 202 at time mark A1. If a copy of the record is not already in memory, the IMDB manager 201 reads a copy from the database 220 into shared memory 202. Transaction1 modifies the last name of the employee from “Smith” to “Jones” at time mark B1. Because the name change has not yet been committed by transaction1, the modified record is not written back to the database. Instead, the IMDB manager 201 creates a modified copy 303 of the record in shared memory and sets a “modified” flag 302 in the original copy 301 of the record in the shared memory. The IMDB manager 201 also creates a look-aside table 305 for transactions in transaction1's context, if one does not already exist, and creates a record entry 306 in the look-aside table 305 which points to the location of the modified copy 303 of the record in shared memory. The look-aside table 305 is accessible only by transaction1 and by the IMDB manager.
- When transaction1 wants to re-read the record at time mark C1, transaction1 specifies the key again and retrieves the original copy 301 from shared memory. Because the modified flag 302 is set in copy 301, transaction1 searches its look-aside table 305 and finds the record entry 306. Transaction1 then retrieves the modified copy 303 of the record using the information in the record entry 306 at time mark D1. When transaction1 commits its changes at time mark E1, the IMDB manager writes all modifications specified in transaction1's look-aside table 305 to the shared memory and to the database. The look-aside table 305 is deleted after all the modifications have been committed.
- As shown in FIG. 3A, transaction2 is executing concurrently with transaction1. Transaction2 issues a retrieve command using key “123” at time mark A2 which retrieves the copy 301 from shared memory. When transaction2 next retrieves the record using the key “123” at time mark B2 after transaction1 has modified the record, transaction2 reads the copy 301 from shared memory and recognizes that the modified flag 302 is set. Therefore, transaction2 knows that changes to the record are pending and searches its look-aside table 310, if one exists, for a corresponding record entry. Because transaction1 was responsible for the modification, transaction2 does not find a corresponding record entry and therefore continues its processing with the unmodified copy 301 of the record.
- Once transaction1 has committed the changes (at time mark E1), a third read operation by transaction2 on key “123” (at time mark C2) returns the modified copy 303 of the record in shared memory to transaction2. Note that transaction2 sees an inconsistency between the information in the copy 301 of the record retrieved at time marks A2 and B2, and the copy 303 retrieved at time mark C2. The in-memory database system of the present invention guarantees consistency of read-committed transactions but does not guarantee consistency of read-repeatable or serializable transactions.
- Alternatively at time mark E1, transaction1 can abort and roll back the uncommitted changes using the information in the look-aside table. After rollback, the copy 301 of the employee record in the shared memory appears as it was at time mark A1, i.e., before transaction1 modified it at time mark B1. Rollback processing is described in detail in the next section.
- Setting the modified flag in old records reduces the number of accesses required on the look-aside tables. However, alternate embodiments in which the modified flag is not used are also contemplated as within the scope of the invention. In these embodiments, the client process searches the look-aside table each time it retrieves a record from the shared memory.
- Furthermore, as one of skill in the art will readily appreciate, various embodiments for the entries in the look-aside table are possible. In the exemplary embodiment being discussed in this section, each record in shared memory is located using a record identifier (RECID) specified in the index entries for the record. The RECID is also used as a hash key to search for the corresponding record entry in the look-aside tables. When a record is modified, the IMDB manager hashes the RECID (OLDRECID) for the original record to determine which record entry to use in the appropriate look-aside table. The RECID (NEWRECID) for the modified record is written into the entry. In the interest of clarity, FIG. 3A does not show the index entries since only non-key data is modified in the example.
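- The record-level behavior of FIG. 3A can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the patented implementation: the names `Record`, `SharedMemory`, `Transaction`, `store`, `modify`, and `read` are hypothetical, a Python `dict` stands in for the hash lookup on the OLDRECID, and the manager's work is folded into `Transaction.modify` for brevity.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class Record:
    data: dict
    modified: bool = False  # stands in for the "modified" flag 302


@dataclass
class SharedMemory:
    records: Dict[int, Record] = field(default_factory=dict)  # RECID -> record
    next_recid: int = 1

    def store(self, data: dict) -> int:
        """Place a copy of a record in shared memory and return its RECID."""
        recid = self.next_recid
        self.next_recid += 1
        self.records[recid] = Record(dict(data))
        return recid


class Transaction:
    def __init__(self, shm: SharedMemory) -> None:
        self.shm = shm
        # Look-aside table: OLDRECID -> NEWRECID (the dict does the hashing).
        self.look_aside: Dict[int, int] = {}

    def modify(self, old_recid: int, changes: dict) -> int:
        """Work the IMDB manager would do on behalf of this transaction."""
        original = self.shm.records[old_recid]
        new_recid = self.shm.store({**original.data, **changes})
        original.modified = True               # flag the original copy
        self.look_aside[old_recid] = new_recid
        return new_recid

    def read(self, recid: int) -> dict:
        record = self.shm.records[recid]
        # Consult the look-aside table only when the flag signals a pending change.
        if record.modified and recid in self.look_aside:
            return self.shm.records[self.look_aside[recid]].data
        return record.data


shm = SharedMemory()
smith = shm.store({"number": 123, "name": "Smith"})
t1, t2 = Transaction(shm), Transaction(shm)
t1.modify(smith, {"name": "Jones"})
print(t1.read(smith)["name"])  # the modifying transaction sees "Jones"
print(t2.read(smith)["name"])  # any other transaction still sees "Smith"
```

Because the flag lives in the original copy, a transaction that finds it clear never touches its look-aside table, which is the saving the modified flag is meant to provide.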
- FIG. 3B shows the same series of transactions when the employee name is the primary key for the employee records. Therefore, in FIG. 3B, the primary index for the employee table is shown to illustrate the actions taken when a key is changed.
- As in FIG. 3A, a copy 301 of the employee record is read from shared memory at time mark A1, the record entry 306 pointing to the modified copy 303 is created in look-aside table 305, and the modified flag is set in the original copy 301 at time mark B1.
- Because the primary key for the record has changed, at time mark B1 the IMDB manager also inserts a new key entry 322 for “Jones” into the primary key index table 320 for the employee records. The new key entry 322 contains the new RECID (NEWRECID) for the modified record. The old entry 321 for “Smith” is marked as uncommitted-deleted (UCD) while the new entry 322 is marked as uncommitted-inserted (UCI). Two index entries, 307 and 308, are also created in the look-aside table 305. Index entry 307 contains an identifier for the employee table (“EMPLOYEE”), an identifier for the primary index (“NAME”), and the value of the deleted key (“SMITH”). Index entry 308 contains the identifier for the employee table (“EMPLOYEE”), the identifier for the primary index (“NAME”), and the value of the inserted key (“JONES”). The index entries are located by hashing on table identifier, index identifier, and key value.
- At time mark C1, transaction1 issues a retrieve command on the employee record using the primary key “Smith.” The key entry 321 is marked as uncommitted-deleted, so transaction1 uses the string “EMPLOYEE-NAME-SMITH” to search its look-aside table 305 for a matching entry. Because a matching entry, in this case entry 307, exists, transaction1 knows it is the modifying transaction, so the primary key of “Smith” does not exist for it and no record is returned. Similarly, when transaction1 issues a retrieve command on the employee record using the primary key “Jones” at time mark D1, it determines it is the modifying transaction because entry 308 exists, so it uses the NEWRECID in the key entry 322 to retrieve the modified copy 303 of the record (time mark E1).
- On the other hand, when transaction2 issues a retrieve command for the employee record using “Smith” at time mark B2, it determines that the primary key “Smith” is marked as uncommitted-deleted, and that it is not the modifying transaction since its look-aside table 310 does not contain a matching entry. Transaction2 can continue to use the original copy 301 of the record if the name modification is not critical to its processing (time mark C2). Similarly, when transaction2 issues a retrieve command for the employee record using “Jones” at time mark D2, it determines that the primary key “Jones” is marked as uncommitted-inserted, and that it is not the modifying transaction, so it treats the key as if it were not in the index.
- A similar scenario takes place when a secondary key for a record is modified. A transaction that is retrieving the record using the secondary key proceeds as described above for FIG. 3B where the index table and the index entries are specific for the secondary key. For secondary indices that are not required to have unique key values, the exemplary embodiment of the IMDB manager combines the secondary key value with the primary key value to yield a unique key value. Other commonly used mechanisms to create unique keys for non-unique keys are equally applicable and are within the scope of the invention.
- After the secondary key is modified, a transaction retrieving the record using the primary key reads the unmodified copy of the record since the key entry in the primary key index contains the OLDRECID. The modified flag in the record alerts the transaction that a change to the data is pending. The transaction then uses the OLDRECID to search its look-aside table and retrieves the modified copy if it finds a matching entry.
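- The two key-construction steps described for FIG. 3B, building the string that locates an index entry in a look-aside table and making a non-unique secondary key unique, might look as follows. The function names, the upper-casing (inferred from the “EMPLOYEE-NAME-SMITH” example), the use of a tuple for the combined key, and the RECID values 1 and 2 are all assumptions made for illustration.

```python
from typing import Dict, Tuple


def index_entry_key(table: str, index: str, key_value: str) -> str:
    """Build the string a transaction hashes to find a look-aside index entry."""
    return "-".join(part.upper() for part in (table, index, key_value))


def unique_secondary_key(secondary_value: str, primary_value: str) -> Tuple[str, str]:
    """Pair a possibly non-unique secondary key value with the unique primary key value."""
    return (secondary_value, primary_value)


# Index entries in transaction1's look-aside table after "Smith" becomes "Jones".
# A dict stands in for the hash table; the RECID values are illustrative only.
look_aside_index_entries: Dict[str, int] = {
    index_entry_key("employee", "name", "Smith"): 1,  # deleted key
    index_entry_key("employee", "name", "Jones"): 2,  # inserted key
}

print(index_entry_key("employee", "name", "Smith"))  # EMPLOYEE-NAME-SMITH
```

A plain hyphen-joined string can collide if identifiers or key values themselves contain hyphens; a tuple key, as used for the secondary key here, avoids that ambiguity.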
- The IMDB manager creates both index and record entries in the look-aside table when a record is deleted. The affected key entry in each index table is marked as uncommitted-deleted, an index entry in each appropriate look-aside table keyed on the record table, index, and deleted key value is created, and a null record entry in each look-aside table is created so that hashing into the look-aside table using the OLDRECID indicates that the record is deleted. Similarly, when a record is created, the IMDB manager creates a new key entry in each index table marked as uncommitted-inserted and an index entry in each appropriate look-aside table keyed on the record table, index, and new key value. A record entry is also created in the look-aside table which contains the NEWRECID for the newly created record; the record entry is hashed into using a null value.
- Marking key entries as uncommitted-deleted or uncommitted-inserted reduces the number of accesses to the look-aside table in the same fashion as setting the modified flag in an old record. Alternate embodiments in which the key entries are not so marked are contemplated as within the scope of the invention.
- The system level overview of the operation of an exemplary embodiment of the invention has been described in this section of the detailed description. The IMDB system maintains data in the shared memory in both a new, uncommitted state resulting from an update function performed by a transaction, and in the original, committed state to provide versioning control for client processes. The IMDB system is predicated on two principles:
- 1. No record is updated (added, deleted or modified) by more than one transaction at a time so that there is always only one uncommitted copy of any record in the shared memory; and
- 2. No key entry in an index is inserted or deleted by more than one transaction at a time so that there is always only one uncommitted copy of any unique key in the shared memory.
- While the invention is not limited to any particular set of transactions, for sake of clarity the modification of a single record using a simplified version of a look-aside table has been described. Alternate embodiments of the data structures for the look-aside table and the details of suitable hashing algorithms are described in section four.
- In the previous section, a system level overview of the operation of an exemplary embodiment of the invention was described. In this section, the particular methods performed by the clients and the IMDB manager of such an exemplary embodiment are described by reference to a series of flowcharts. The methods to be performed by the clients constitute computer programs made up of computer-executable instructions. Similarly, the methods to be performed by the IMDB manager constitute computer programs also made up of computer-executable instructions. Describing the methods by reference to flowcharts enables one skilled in the art to develop programs including instructions to carry out the methods on a suitable computer (the processor of the computer executing the instructions from computer-readable media).
- The exemplary embodiment of the invention described by methods in the flowcharts of FIGS. 4-7 requires all index entries in the look-aside table to be unique. Because all secondary keys in a database may not be required to have unique values, the invention combines such secondary keys with the primary key for the record (which is unique) to create a unique key for the corresponding secondary index entry in the look-aside table. Additionally, if a record has been deleted and then the same record is reinserted by a transaction before the deletion is committed, the index entries for the record's keys in the appropriate look-aside table contain a NEWRECID for the reinserted record, which is used when retrieving the record by the transaction that deleted and reinserted the record. The key entries in the index tables contain an OLDRECID for the original record, which is used when retrieving the record by all other transactions.
- Referring first to FIG. 4, a flowchart of a method to be performed by a client according to an exemplary embodiment of the invention is shown. This method is inclusive of the acts required to be taken by the client when retrieving a record.
- The client uses an appropriate hashing algorithm, or other suitable method, to find the key entry in the appropriate index table in shared memory (block 401). The key entry can be either a primary key for the record or a secondary key depending on the criteria specified by the client in the retrieval command. The client next determines if the key entry has been changed.
- If the key entry in the index table is marked as both uncommitted-deleted (UCD) (block 403) and uncommitted-inserted (UCI) (block 405), the client searches its look-aside table for a matching index entry (block 407). If a matching index entry is found (block 409), then the client uses the NEWRECID in the index entry to read the copy of the record it reinserted (block 411). If a matching entry is not found at block 409, then the original key still exists for the client and the client uses the OLDRECID in the key entry in the index table to read the original copy of the record (block 413).
- If the key entry in the index table is marked as uncommitted-deleted (block 403) but not uncommitted-inserted (block 405), the client searches its look-aside table for a matching entry (block 415). If a matching entry is found, the client has deleted the key so the key does not exist for it and thus no record is retrieved. If a matching entry is not found (block 417), the original key still exists for the client and the client uses the OLDRECID in the key entry in the index table to read the original copy of the record (block 413).
- If the key entry is not marked as uncommitted-deleted (block 403) but is marked as uncommitted-inserted (block 419), the client searches its look-aside table for a matching index entry (block 421). If a matching index entry is found (block 423), the client knows that it is the transaction that inserted (modified) the key and uses the NEWRECID in the index entry to read the modified copy of the record from shared memory (block 411). If a matching index entry is not found at block 423, the client knows that another transaction modified the key and has not committed the change, so the key value does not exist for the client.
- If the key entry is not marked as either uncommitted-inserted or uncommitted-deleted, the client reads the record from the shared memory using the RECID in the key entry (block 425). The client checks the modified flag in the record to determine if any data has been changed (block 427). If the modified flag is set, then the client searches its look-aside table for a matching record entry (block 429). If a matching record entry is found (block 431), then the client knows it is the transaction that modified the record, and uses the NEWRECID in the record entry to read the modified copy of the record from the shared memory (block 411). If the client does not find a matching record entry at block 431, the client knows that the unmodified copy of the record read at block 425 is the copy that exists for it.
- The IMDB manager reads and writes records from the database using commands specific to the type of database used to store the records. For example, a relational database such as Oracle is accessed using standard SQL commands. The invention is not limited to use with only relational databases, but is applicable to any key-based data structure. The IMDB manager is responsible for assigning RECIDs to records and for storing the records in the shared memory. The IMDB manager is also responsible for creating the corresponding shared memory indices for a record, and for creating and managing the look-aside tables in shared memory. In one embodiment, the IMDB manager pre-loads entire tables of database records into shared memory, and creates the RECIDs and shared memory indices during an initialization phase. In an alternate embodiment, the IMDB manager pre-loads only a subset of database records when a range of key values is specified by a client.
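- The client retrieval method of FIG. 4 described above reduces to a four-way branch on the UCD and UCI markings. The following Python sketch is one possible reading, offered under assumptions: the class and function names are hypothetical, a single index is modeled so the look-aside index entries are keyed on the bare key value rather than on table, index, and key value, and returning `None` stands for "the key does not exist for this client."

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class Record:
    data: dict
    modified: bool = False


@dataclass
class KeyEntry:
    recid: int         # RECID held in the index table (the OLDRECID while a change is pending)
    ucd: bool = False  # uncommitted-deleted marking
    uci: bool = False  # uncommitted-inserted marking


@dataclass
class LookAside:
    index_entries: Dict[str, int] = field(default_factory=dict)             # key value -> RECID
    record_entries: Dict[int, Optional[int]] = field(default_factory=dict)  # OLDRECID -> NEWRECID


def retrieve(key: str, index: Dict[str, KeyEntry], records: Dict[int, Record],
             look_aside: LookAside) -> Optional[Record]:
    entry = index.get(key)                                   # block 401
    if entry is None:
        return None
    mine = key in look_aside.index_entries                   # blocks 407, 415, 421
    if entry.ucd and entry.uci:                              # reinserted key
        recid = look_aside.index_entries[key] if mine else entry.recid
        return records[recid]                                # block 411 or 413
    if entry.ucd:                                            # deleted key
        return None if mine else records[entry.recid]        # block 413 for other clients
    if entry.uci:                                            # inserted key (block 419)
        return records[look_aside.index_entries[key]] if mine else None
    record = records[entry.recid]                            # block 425
    if record.modified and entry.recid in look_aside.record_entries:  # blocks 427-431
        new_recid = look_aside.record_entries[entry.recid]
        # Assumption: a null record entry (deleted record) is reported as "no record".
        return None if new_recid is None else records[new_recid]
    return record


# State of FIG. 3B after transaction1 renames "Smith" to "Jones" (RECIDs are illustrative).
records = {1: Record({"name": "Smith"}, modified=True), 2: Record({"name": "Jones"})}
index = {"Smith": KeyEntry(1, ucd=True), "Jones": KeyEntry(2, uci=True)}
t1 = LookAside(index_entries={"Smith": 1, "Jones": 2}, record_entries={1: 2})
t2 = LookAside()
```

With this state, `retrieve("Smith", index, records, t1)` yields no record while the same call with `t2` yields the original copy, mirroring time marks C1 and B2 of FIG. 3B.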
- The client transactions can only read information from shared memory and must call the IMDB manager to request modifications to the records and indices. One of skill in the art will immediately appreciate that any number of well-known data management techniques can be used by the IMDB manager in managing the shared memory. One particular technique is discussed in detail in the next section.
- The client transaction calls the IMDB manager to perform five functions illustrated in FIGS. 5A-C (modify), FIG. 6 (delete), FIG. 7 (add), FIG. 8 (commit), and FIG. 9 (rollback). In the exemplary embodiment being described in this section, the IMDB manager creates a look-aside table for a client transaction when the transaction first requests a modification to a record in the shared memory (not illustrated). Alternate embodiments in which the IMDB manager creates the shared memory table at different stages in the processing of the transaction will be readily apparent to one of skill in the art and are contemplated as within the scope of the invention.
- Turning first to FIG. 5A, when a client calls the IMDB manager to modify a record, the IMDB manager determines if the record has been previously modified by the same client (block 501), i.e., the modification has not yet been committed so a matching record entry exists in the client's look-aside table. If so, then the previously modified copy of the record is used instead of that supplied in the function call (block 503). In an alternate embodiment, the IMDB manager returns an error message if the modified flag is set in the record and a matching entry in the look-aside table is not found, as a check to ensure a client does not attempt to modify a record having uncommitted modifications made by another client.
- The IMDB manager performs a DeleteKey operation on the old value for each key that is to change (block 507). The DeleteKey operation is described in more detail below in conjunction with FIG. 5B.
- The IMDB manager creates the modified record in shared memory with a NEWRECID (block 509). If the record being modified is newly added (block 511), i.e., added by the same transaction and not yet committed, the IMDB manager updates the look-aside table entry for the record by replacing the RECID for the previous copy of the record with the NEWRECID for the modified record (block 513). The IMDB manager performs an InsertKey operation on the new value for each key that is to change to equate the new key value with the NEWRECID (block 515). Duplicate key entries that are detected by the InsertKey operation, as described in more detail below in conjunction with FIG. 5C, cause the record modification to fail. For each key that is not being modified, the IMDB manager updates all the corresponding key entries for the appropriate indices in shared memory with the NEWRECID (block 517).
- If the record being modified is not newly added, the IMDB manager performs an InsertKey operation on the new value for each key that is to change to equate the new key with the OLDRECID of the copy of the record before the current modification (block 519). The retrieval function described above maps the new key to the NEWRECID for the client that modifies the record; the new key does not exist for the other clients. As before, if the key is a duplicate (block 521), the record modification fails.
- If the record was previously modified (block 523), then the record entry in the look-aside table is updated by replacing the RECID for the previously modified record with the NEWRECID for the current modified record (block 525).
- The DeleteKey operation is illustrated in FIG. 5B and performed by the IMDB manager when executing the modify and delete functions. The IMDB manager determines if an index entry in the look-aside table exists with the same key value that is being deleted (block 531). If not, then the IMDB manager creates a new index entry in the look-aside table that contains the deleted key value and RECID of the corresponding record (block 533). The IMDB manager also marks the key entry for the deleted value in the index table as uncommitted-deleted (block 535).
- If there is a matching index entry in the look-aside table at block 531, then the IMDB manager determines if the corresponding key entry in the index table is marked as uncommitted-inserted (block 537). If not, the entry must be marked as both uncommitted-deleted and uncommitted-inserted, so the index entry is retained and the key entry is re-marked as uncommitted-deleted (block 535). If the key entry is marked as uncommitted-inserted at block 537, then both the existing index entry and the key entry are deleted (blocks 539 and 541).
- The InsertKey operation is illustrated in FIG. 5C and performed by the IMDB manager when executing the modify and add functions. The IMDB manager determines if an index entry in the look-aside table exists with the same key value that is being inserted (block 551). If not, then the IMDB manager creates a new index entry in the look-aside table that contains the new key value and the RECID specified in the InsertKey operation (block 553). The IMDB manager also inserts an entry for the new key value in the index table and marks the entry as uncommitted-inserted (block 555).
- If the index entry does exist at block 551, then the IMDB manager determines if the key entry in the index table is marked uncommitted-inserted (block 557). If so, then the key to be added is a duplicate and an error flag is set (block 559). If the key entry is not marked uncommitted-inserted, then the entry must be uncommitted-deleted. Therefore, the existing key entry is marked as both uncommitted-deleted and uncommitted-inserted (block 561), the existing index entry in the look-aside table is deleted (block 563), and a new index entry containing the reinserted key value and the NEWRECID for the reinserted record is created (block 565).
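- The DeleteKey and InsertKey operations of FIGS. 5B and 5C can be sketched as two small Python functions. This is an interpretation, not the patent's code: the names are hypothetical, a single index and a single look-aside table are modeled as dicts, block 537 is read as "marked only uncommitted-inserted," and the check for a key already present in the index table when no look-aside entry exists is an added assumption suggested by the duplicate-key failure described for FIG. 5A.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass
class KeyEntry:
    recid: int
    ucd: bool = False  # uncommitted-deleted
    uci: bool = False  # uncommitted-inserted


class DuplicateKeyError(Exception):
    """Raised when InsertKey finds the key value already present."""


def delete_key(index: Dict[str, KeyEntry], look_aside: Dict[str, int],
               key: str, recid: int) -> None:
    if key not in look_aside:                 # block 531: no matching index entry
        look_aside[key] = recid               # block 533
        index[key].ucd = True                 # block 535
        return
    entry = index[key]
    if entry.uci and not entry.ucd:           # block 537, read as "only UCI"
        del look_aside[key]                   # block 539
        del index[key]                        # block 541
    else:                                     # marked both UCD and UCI
        entry.uci = False                     # re-mark as UCD only (block 535)
        entry.ucd = True


def insert_key(index: Dict[str, KeyEntry], look_aside: Dict[str, int],
               key: str, recid: int) -> None:
    if key not in look_aside:                 # block 551: no matching index entry
        if key in index:
            # Assumption (not a flowchart block): the key is committed or is being
            # changed by another transaction, so it is treated as a duplicate.
            raise DuplicateKeyError(key)
        look_aside[key] = recid               # block 553
        index[key] = KeyEntry(recid, uci=True)  # block 555
        return
    entry = index[key]
    if entry.uci:                             # block 557
        raise DuplicateKeyError(key)          # block 559
    entry.uci = True                          # block 561: now both UCD and UCI
    look_aside[key] = recid                   # blocks 563 and 565: replace the index entry
```

Deleting "Smith" and then inserting it again within one transaction leaves the key entry marked both UCD and UCI, with the index table still holding the original RECID and the look-aside entry holding the RECID of the reinserted record.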
- When the client calls the IMDB manager to delete a record (referring to FIG. 6), the IMDB manager determines if the record was previously modified (block 601) so that the modified record can be used rather than the record specified in the function call (block 603). As described in conjunction with FIG. 5A, in an alternate embodiment, the IMDB manager checks if the same client performed the previous modification and returns an error if not.
- The IMDB manager performs the DeleteKey operation illustrated in FIG. 5B for each key in the deleted record (block 605). If the record is newly added (block 607), the IMDB manager deletes the corresponding record entry in the look-aside table (block 609) and deletes the newly added record from shared memory (block 611).
- If the record was previously modified (block 613), the IMDB manager deletes the record entry in the look-aside table (block 615) and deletes the modified record from the shared memory (block 617). The IMDB manager also creates a new record entry in the look-aside table that has a null value for the new RECID to denote that the record has been deleted (block 619). The null RECID entry is found by hashing on the RECID of the deleted record. If the record is neither newly added nor previously modified, the IMDB manager marks the record as modified (block 621) and creates the new null record entry at block 619.
- FIG. 7 illustrates the acts performed by the IMDB manager when a client requests that a record be added to the database. The IMDB manager creates the new record in the shared memory marked as modified (block 701), adds a record entry containing the RECID of the new record to the look-aside table (block 703), and performs the InsertKey operation illustrated in FIG. 5C for each key in the record (block 705). If any of the keys duplicate existing key values (block 707), the record is not added.
- Commit and rollback processes are mirror images of each other. When the client commits changes, it calls the IMDB manager to update the shared memory to reflect the modifications made by the client as shown in FIG. 8. The IMDB manager reads each entry in the look-aside table for the client (block801) and determines what type of entry it is. The methods used to determine the entry type depends on the data structure of the look-aside table as one of skill in the art will immediately appreciate. The details of a particular look-aside table are described in the next section.
- If the entry is for a modified record (block803), the IMDB manager updates the corresponding key entries in the index tables for the record by replacing the original RECID in the key entries with the RECID for the modified record (block 804). The IMDB manager also deletes the original record from the shared memory (block 807). If the entry is for a deleted record (block 805), the IMDB deletes the original record from the shared memory (block 807). If the entry is an index entry corresponding to an added key (block 809), the IMDB manager removes the UCI marking from the key entry in the shared memory (block 811). If the entry is an index entry corresponding to a deleted key (block 813), the IMDB manager deletes the key entry from the shared memory (block 815). If the entry is an index entry corresponding to a key that has been reinserted (block 817), the IMDB manager removes the UCD and UCI markings from the key entry in the shared memory (block 819) and updates the key entry with the RECID from the corresponding index entry in the look-aside table (block 821). Note that if the entry is for an added record, the IMDB manager takes no action because the newly added indices when committed point to where the new record is stored in shared memory. Once all entries in the look-aside table have been processed (block 823), the IMDB manager deletes the look-aside table from the shared memory (block 825).
- When a client does not commit its changes (aborts), it requests that the IMDB manager roll back the shared memory to a point prior to the changes by discarding all the modifications in shared memory (FIG. 9). The IMDB manager reads each entry from the look-aside table (block 901) and determines the type of entry as explained above in conjunction with FIG. 8. If the entry is for a modified record (block 903), the IMDB manager clears the modified flag from the original record in the shared memory (block 905) and deletes the modified (new) record from the shared memory (block 909). If the entry is for an added record (block 907), the IMDB manager deletes the new record from the shared memory (block 909). If the entry is an index entry for an added key (block 911), the IMDB manager deletes the new key entry from the shared memory (block 913). If the entry is an index entry for a deleted key (block 915), the IMDB manager removes the uncommitted-deleted (UCD) marking from the key entry in the shared memory (block 917). If the entry is an index entry for a reinserted key (block 919), the IMDB manager removes the UCD and UCI markings from the key entry in the shared memory (block 921). Note that when the entry is for a deleted record, the IMDB manager takes no action because the indices, when rolled back, will point to the original record in the shared memory. Once all entries in the look-aside table have been processed (block 923), the IMDB manager deletes the look-aside table from the shared memory (block 925).
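- The mirror-image relationship between the commit pass (FIG. 8) and the rollback pass (FIG. 9) can be sketched side by side. The following is a minimal illustration only, not the DTC implementation: the `Key` and `Entry` classes, the string tags used for entry types, and the use of Python dictionaries and a set to stand in for shared memory are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class Key:
    recid: int
    uci: bool = False   # uncommitted-inserted marking
    ucd: bool = False   # uncommitted-deleted marking

@dataclass
class Entry:
    kind: str                    # entry type tag (hypothetical encoding)
    old: Optional[int] = None    # RECID of the original record
    new: Optional[int] = None    # RECID of the new or modified record
    key: Optional[str] = None    # key value, for index entries

def commit(records, modified, index, look_aside):
    """Apply every look-aside entry to shared memory (FIG. 8)."""
    for e in look_aside:
        if e.kind == "modified_record":
            for k in index.values():          # repoint keys at the new version
                if k.recid == e.old:
                    k.recid = e.new
            del records[e.old]                # original version is dropped
            modified.discard(e.old)
        elif e.kind == "deleted_record":
            del records[e.old]
        elif e.kind == "added_key":
            index[e.key].uci = False
        elif e.kind == "deleted_key":
            del index[e.key]
        elif e.kind == "reinserted_key":
            k = index[e.key]
            k.uci = k.ucd = False
            k.recid = e.new
        # "added_record": no action, the new record is already in place
    look_aside.clear()                        # the look-aside table is deleted

def rollback(records, modified, index, look_aside):
    """Discard every uncommitted change (FIG. 9)."""
    for e in look_aside:
        if e.kind == "modified_record":
            modified.discard(e.old)           # clear the modified flag
            del records[e.new]                # drop the new version
        elif e.kind == "added_record":
            del records[e.new]
        elif e.kind == "added_key":
            del index[e.key]
        elif e.kind == "deleted_key":
            index[e.key].ucd = False
        elif e.kind == "reinserted_key":
            k = index[e.key]
            k.uci = k.ucd = False
        # "deleted_record": no action, the original record is untouched
    look_aside.clear()
```

- In this sketch each entry type has exactly one branch in each pass, and the entry type that needs no action on commit (added record) is the counterpart of the one that needs no action on rollback (deleted record).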
- The particular methods performed by a client process and an in-memory database manager process of an exemplary embodiment of the invention have been described. The method performed by the client process has been shown by reference to a flowchart including all the acts from 401 until 431. The methods performed by the in-memory database manager process have been shown by reference to six flowcharts including all the acts from 501 until 565, from 601 until 623, from 701 until 715, from 801 until 819, and from 901 until 921. As will be readily apparent to one skilled in the art, the particular order in which certain acts are performed can be varied without departing from the scope of the invention. For example, when a key is modified, the old key can be marked as uncommitted-deleted either before or after the new key is created because both the original and changed keys are present in the shared memory.
- In this section of the detailed description, a particular implementation of the in-memory database system is described that is part of the Distributed Transaction Coordinator (DTC) available from Microsoft Corp. The in-memory database system employed by the DTC uses page latches to control access to shared memory, and special hash table data structures and hash functions to implement the look-aside table and a transaction table.
- Shared Memory
- The shared memory for the IMDB is divided into logical fixed length pages. The records and index keys from the database are cached on the shared memory pages by the IMDB manager (core process). The index keys cached in the shared memory are arranged in balanced (B+) tree structures for quick access.
- The look-aside tables for the client processes are also cached on the shared memory pages. In the DTC embodiment, the core process maintains a transaction table in the shared memory which associates a transaction identifier, such as a globally unique identifier (GUID), with its look-aside table.
- As with the rest of the data in the shared memory, the client processes are permitted only read access to the look-aside tables and the transaction table.
- A shared memory page comprises a header, a timestamp array, a slot array, and a data section. The header contains a page identifier, the number of entries (database records, index keys, look-aside tables) stored on the page, a pointer to free space within the data section, and the size of the free space. The timestamp array stores a timestamp value for each page entry. The slot array contains one slot for each page entry; each slot contains the offset of the entry from the start of the data section and the length of the entry.
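- The page layout just described can be modeled roughly as follows. This is a sketch under stated assumptions: the 4096-byte data section, the `Page` class, and its method names are hypothetical, and the header fields are represented as attributes and properties rather than as a packed binary header.

```python
from dataclasses import dataclass, field

PAGE_DATA_SIZE = 4096  # assumed size of the data section; not given in the text

@dataclass
class Page:
    page_id: int                                     # header: page identifier
    free_offset: int = 0                             # header: pointer to free space
    timestamps: list = field(default_factory=list)   # one timestamp per entry
    slots: list = field(default_factory=list)        # one (offset, length) per entry
    data: bytearray = field(default_factory=lambda: bytearray(PAGE_DATA_SIZE))

    @property
    def entry_count(self) -> int:                    # header: number of entries
        return len(self.slots)

    @property
    def free_size(self) -> int:                      # header: size of free space
        return len(self.data) - self.free_offset

    def add_entry(self, payload: bytes, timestamp: int) -> int:
        """Store an entry in the data section and return its slot number."""
        if len(payload) > self.free_size:
            raise ValueError("entry does not fit on this page")
        offset = self.free_offset
        self.data[offset:offset + len(payload)] = payload
        self.slots.append((offset, len(payload)))
        self.timestamps.append(timestamp)
        self.free_offset += len(payload)
        return len(self.slots) - 1

    def read_entry(self, slot: int) -> bytes:
        """Use the slot's offset and length to fetch an entry."""
        offset, length = self.slots[slot]
        return bytes(self.data[offset:offset + length])
```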
- Page Latches
- A portion of the shared memory is reserved for page latches. Page latches are a synchronization mechanism which ensures the consistency of the data on a page while a transaction is accessing the page. The page latches are associated with the page and thus can be maintained for multiple transactions operating on a page. Additionally, page latches are of short duration, lasting for only as long as necessary to read or write data to the page. These characteristics also mean that page latches are not subject to deadlocks. In contrast, traditional database locks are associated with a single transaction to keep the transaction consistent, are held for the duration of the transaction, and can incur deadlock situations which require the implementation of complex deadlock detection and resolution algorithms.
- There is a single exclusive page latch associated with each page which is used by the core process to prevent client processes from accessing the page while the core process is updating data on the page. Each page also has multiple shared page latches. Any process (client or core) can obtain a shared page latch which allows the holder to read data from the page. There are as many shared page latches active at one time as there are transactions accessing the page. Note that a transaction having many threads of execution will use only a single shared page latch for all the threads.
- If there is an exclusive latch on a page, no shared latches can be active. Similarly, when a thread in the core process requests an exclusive page latch, it must wait until all active shared page latches have been released. Thus, page latches provide increased performance in read-intensive environments, which are the most common types of database transactions.
- Because page latches are meant for short duration operations and no deadlock detection scheme is used for them, the client and core processes are designed to obtain page latches in such a way as to prevent deadlock. Typically a thread of execution will obtain only a single latch at a time. However, when multiple latches are required, a predetermined ordering is used. When multiple index pages in the B+ tree structure must be latched, a parent page is latched before any of its children pages. When multiple pages at the same level in the index, or multiple data pages, must be latched, the pages are latched in physical order. For example, if pages p1, p2, and p3 must be latched where p1 is a non-leaf page and p2 and p3 are leaf pages in the index, then p1 is latched first, then the lower of p2 and p3, then the higher of p2 and p3.
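- One way to obtain an ordering consistent with these two rules is to sort the pages by their depth in the tree and then by page number. The sketch below assumes that each page is known as a `(page_number, depth)` pair, with smaller depth meaning closer to the root; that representation and the function name `latch_order` are illustrative choices, not part of the described system.

```python
def latch_order(pages):
    """Return pages in an order that latches parents before children and
    same-level pages in ascending physical (page number) order.

    pages: iterable of (page_number, depth) tuples, depth 0 being the root.
    """
    return sorted(pages, key=lambda page: (page[1], page[0]))
```

- For the example above, a non-leaf page 50 with leaf pages 30 and 20 would be latched in the order 50, 20, 30.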
- The page latches for a data page are not stored on the data page because the client process must have write access to the page latch itself in order to obtain the latch and only the core process has write access to the data pages. Instead the page latches are stored in a region of shared memory separate from the database pages themselves and shared by the core and client processes in write mode. In the DTC implementation, the page latch memory region contains eight bytes of latch data for each data page in the shared memory. Therefore, a particular page latch can be found by using the page number to determine the offset for the page latch shared memory, e.g., for page i, the offset in the shared page latch table is i*8.
- Each page latch consists of two fields (both 32-bits in length):
- dwShareCount that indicates the number of shared readers of the page; and
- fExclusive, which is set to indicate there is an exclusive latch requested on the page.
- A page is share latched if dwShareCount is greater than zero. A page is exclusively latched if dwShareCount is zero and fExclusive is set (equal to one). A page is share latched but the core process is waiting for an exclusive latch if dwShareCount is greater than zero and fExclusive is one.
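- The latch lookup (offset i*8) and the three latch states can be illustrated with a small decoder. This is a sketch only: the little-endian byte order and the placement of dwShareCount before fExclusive within the eight bytes are assumptions, and the function names are hypothetical.

```python
import struct

LATCH_SIZE = 8        # eight bytes of latch data per page
LATCH_FORMAT = "<II"  # assumed: little-endian dwShareCount, then fExclusive

def latch_offset(page_number: int) -> int:
    """Offset of the latch for page i in the latch region: i * 8."""
    return page_number * LATCH_SIZE

def latch_state(latch_region, page_number: int) -> str:
    """Decode the two 32-bit fields and classify the latch."""
    share_count, exclusive = struct.unpack_from(
        LATCH_FORMAT, latch_region, latch_offset(page_number))
    if share_count > 0 and exclusive:
        return "shared, exclusive pending"
    if share_count > 0:
        return "shared"
    if exclusive:
        return "exclusive"
    return "free"
```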
- When a thread wants to acquire a shared latch, it executes the following procedure:
- 1. Determine if fExclusive is 0. If so, go to 2, otherwise go to 5.
- 2. Increment dwShareCount (using an InterlockedIncrement instruction that guarantees that only one thread will increment the count; multiple threads trying to increment the count are processed in a serial fashion).
- 3. Determine if fExclusive is 0. If so, then return.
- 4. Decrement dwShareCount (using InterlockedDecrement).
- 5. Sleep and go to 1.
- Thus, a thread can only acquire a shared latch if no other thread has an exclusive latch or is waiting for an exclusive latch. Note that after incrementing the share count, the thread checks fExclusive again because, in the interval, another thread may have successfully set fExclusive as described in more detail below.
- A thread releases a shared latch by using InterlockedDecrement to decrement dwShareCount.
- When a thread wants to acquire an exclusive latch, it executes the following procedure:
- 1. Use InterlockedCompareExchange to set fExclusive to 1. The InterlockedCompareExchange instruction guarantees that a single thread sets fExclusive to 1, so either the instruction will succeed in setting fExclusive to 1 or it will fail which indicates that the fExclusive was already set to 1.
- 2. If the instruction fails, then another thread has or is waiting for an exclusive latch. Sleep and retry until it succeeds.
- 3. If the instruction succeeds in setting fExclusive to 1, determine whether dwShareCount is greater than 0.
- 4. If dwShareCount is 0, then return.
- 5. Set a local counter timesThroughLoop to 0.
- 6. If dwShareCount is greater than 0 then determine if timesThroughLoop is greater than some predetermined maximum. If so, then go to 8.
- 7. Increment timesThroughLoop, sleep, and go to 6.
- 8. Set dwShareCount to 0 and return.
- A thread releases an exclusive latch by using InterlockedCompareExchange to set fExclusive to zero.
- Because only one thread is allowed to set fExclusive at a time, the InterlockedCompareExchange instruction is used. The InterlockedCompareExchange instruction sets a memory variable to a value only if the memory compares equal to another value. The above procedure calls InterlockedCompareExchange(&fExclusive, 1, 0) so that InterlockedCompareExchange will only set fExclusive to one if fExclusive is equal to zero. InterlockedCompareExchange can be implemented either on the underlying processor or in the operating system using other synchronization primitives provided by the processor.
- After obtaining fExclusive, the thread waits for dwShareCount to fall to zero. As discussed above, latches are meant for short duration operations, so the share count falls to zero relatively quickly: other threads release their share latches, and no thread can acquire a new shared latch on the page while fExclusive is set. However, because the client processes are running untrusted application code, it is possible that a client process can die while holding a share latch. To recover from this situation, the core process resets the share count if it is unable to acquire an exclusive latch after some period of time (e.g., 5 seconds). The core process does not reset an exclusive latch since exclusive latches are only obtained by the core process threads and the core process only runs trusted code.
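- The shared and exclusive latch procedures above can be sketched together. Python has no counterpart to the InterlockedIncrement, InterlockedDecrement, and InterlockedCompareExchange instructions, so this illustration stands in for them with a small internal `threading.Lock`; that substitution, the `PageLatch` class, and the `max_spins` and `sleep_s` parameters are assumptions of the sketch rather than features of the DTC implementation.

```python
import threading
import time

class PageLatch:
    def __init__(self, max_spins: int = 50, sleep_s: float = 0.001):
        self._atomic = threading.Lock()  # stands in for the Interlocked* instructions
        self.share_count = 0             # dwShareCount
        self.exclusive = 0               # fExclusive
        self.max_spins = max_spins       # predetermined maximum for timesThroughLoop
        self.sleep_s = sleep_s

    def _interlocked_add(self, delta: int) -> None:
        with self._atomic:
            self.share_count += delta

    def _compare_exchange(self, new: int, expected: int) -> int:
        """Set exclusive to new only if it equals expected; return the prior value."""
        with self._atomic:
            old = self.exclusive
            if old == expected:
                self.exclusive = new
            return old

    def acquire_shared(self) -> None:
        while True:
            if self.exclusive == 0:          # step 1
                self._interlocked_add(1)     # step 2
                if self.exclusive == 0:      # step 3: re-check after incrementing
                    return
                self._interlocked_add(-1)    # step 4: back out
            time.sleep(self.sleep_s)         # step 5

    def release_shared(self) -> None:
        self._interlocked_add(-1)

    def acquire_exclusive(self) -> None:
        while self._compare_exchange(1, 0) != 0:   # steps 1-2: retry until set
            time.sleep(self.sleep_s)
        spins = 0                                  # step 5
        while self.share_count > 0:                # steps 3, 4, and 6
            if spins > self.max_spins:
                with self._atomic:
                    self.share_count = 0           # step 8: assume a reader died
                return
            spins += 1                             # step 7
            time.sleep(self.sleep_s)

    def release_exclusive(self) -> None:
        self._compare_exchange(0, 1)
```

- The reset in step 8 is what lets the core process recover from a client that died while holding a share latch; in this sketch the wait before the reset is roughly `max_spins * sleep_s` seconds.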
- Hash Table Data Structures
- Both the look-aside tables and the transaction table are implemented as hash table data structures. The look-aside table data structures are designed to give very high performance and can be scaled to different sizes, as described further below, to accommodate varying numbers of transactions and updates. The index and record entries described in the two previous sections are kept in the look-aside tables along with some miscellaneous entries.
- The DTC embodiment of a look-aside table data structure 1000 is illustrated in FIG. 10. A record entry 1001 comprises three fields: a record identifier for the RECID of the unmodified record 1002, a record identifier for the RECID of the modified record 1003, and a bitmap 1004 used to denote which columns of the record have been modified. If a record is modified multiple times by a transaction, the later changes are OR'd together with the existing bitmap 1004 to create a new bitmap. The bitmap is used to construct the proper database calls when writing committed changes to a back-end database as part of the commit process.
- An index entry 1011 comprises five fields: a RECID 1012 for the key, two key length fields, an identifier 1015 for the index for the key, and a RECID 1016 of the new data record associated with the key if the key was deleted and then reinserted as described in the previous section. Because keys can be variable length in the DTC implementation, the key itself is allocated to a separate record to permit fixed length look-aside table entries. In one embodiment, the key entry in the index serves as the separate key record for the look-aside table; in an alternate embodiment, the separate key record is distinct from the key entry so that dynamic allocation of additional keys to the index does not require changes in the index entry 1011. One of skill in the art will readily recognize that the key can be stored in the look-aside table entry if variable length table entries are supported or if the key is restricted to fixed-length values. When the key corresponding to an index entry is required to have unique values, the primary key field 1014 is null. When the key is not required to be unique, a combination of the key and the primary key is used for the index entry and thus both fields are used.
- The particular index or record entry is found by translating a search key into a table address using a hash function shared between the core and client processes. The RECID is the search key for record entries. A combination of a database table identifier (which identifies the database table with which the index is associated), the index identifier, and the key value is used as the search key for index entries.
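- The record entry and its column bitmap can be illustrated as follows. The sketch assumes that bit i of the bitmap corresponds to column i of the record, which the text does not specify; the `RecordEntry` class and its method names are hypothetical.

```python
from dataclasses import dataclass

@dataclass
class RecordEntry:
    unmodified_recid: int   # RECID of the unmodified record
    modified_recid: int     # RECID of the modified record
    bitmap: int = 0         # assumed encoding: bit i set means column i was modified

    def mark_columns(self, columns) -> None:
        """OR the newly modified columns into the existing bitmap."""
        for column in columns:
            self.bitmap |= 1 << column

    def modified_columns(self) -> list:
        """Columns to include when writing the change to a back-end database."""
        return [c for c in range(self.bitmap.bit_length()) if (self.bitmap >> c) & 1]
```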
- In the DTC implementation, a RECID is eight bytes long where five bytes specify the shared memory page number, one byte specifies the page sequence number, nine bits specify a slot on the page, and seven bits specify the slot sequence number. The slot sequence number and the page sequence number are used to distinguish recycled or overflow slots and pages. However, the sequence numbers are not useful in distinguishing one record from another when searching the look-aside table and so only the page number and slot are input into the hash function. The algorithm used by the hash function for record search keys in the DTC implementation is
- Let dw=low order four bytes of page #, bh=high byte of page #, and slot=slot # then
- hash = dw ^ bh ^ (slot << 23)
- where ^ specifies a bitwise exclusive OR operation and << specifies a left shift operation.
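- The RECID layout and the record hash can be worked through in a short sketch. The field widths (40, 8, 9, and 7 bits) come from the description above, but the order in which the fields are packed into the eight bytes is an assumption, as are the function names.

```python
def pack_recid(page: int, page_seq: int, slot: int, slot_seq: int) -> int:
    """Pack the four RECID fields into 64 bits (field order is assumed)."""
    assert 0 <= page < 1 << 40 and 0 <= page_seq < 1 << 8
    assert 0 <= slot < 1 << 9 and 0 <= slot_seq < 1 << 7
    return (page << 24) | (page_seq << 16) | (slot << 7) | slot_seq

def unpack_recid(recid: int):
    """Return (page, page_seq, slot, slot_seq)."""
    return (recid >> 24, (recid >> 16) & 0xFF, (recid >> 7) & 0x1FF, recid & 0x7F)

def record_hash(page: int, slot: int) -> int:
    """hash = dw ^ bh ^ (slot << 23); the sequence numbers are not used."""
    dw = page & 0xFFFFFFFF        # low order four bytes of the page number
    bh = (page >> 32) & 0xFF      # high byte of the page number
    return (dw ^ bh ^ (slot << 23)) & 0xFFFFFFFF
```

- Because the slot number is nine bits wide, shifting it left by 23 places it in the top nine bits of a 32-bit value, so it does not overlap the high page byte.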
- As described above, the search key for an index entry comprises a database table identifier, an index identifier (indexid), and the key value. The database table identifier is a sixteen byte database identifier (DBID) and a double word (32-bit) object identifier (OBJID) assigned by the operating system. The algorithm used by the hash function for index search keys in the DTC implementation is
- hash = OBJID ^ (DBID << 16) ^ (indexid << 12) ^ keyhash
- where keyhash is the result of a rotating exclusive OR'ing of the bytes of the key, for example:
    let cb be the number of bytes in the key
    keyhash = key[0];
    for (ib = 1; ib < cb; ib++) {
        keyhash = _rotl(keyhash, 1);
        keyhash = keyhash ^ key[ib];
    }
- The value of “hash” produced by the algorithms is divided by the maximum number of entries in the look-aside table and the remainder is used as an address for the index or record entry. The hash algorithms are designed to produce a look-aside table address for an entry which is reasonably unique within the table, and falls in the range of zero to one less than the table size. Hash duplicates, or collisions, occur when a record already exists at the table address calculated by the hash function for a new record. In such a case, the IMDB uses a linked list collision resolution scheme in which the new record is allocated to a space in shared memory and is linked to the hash address as illustrated in FIGS. 12 and 13 below. The value of the search key RECID is compared with the appropriate RECID field in each hash duplicate entry to find the correct entry.
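- The rotating exclusive OR and the remainder step can be tried out with a short sketch. It assumes a 32-bit rotate-left and a non-empty key, and it covers only the keyhash term and the table address; how the sixteen-byte DBID is folded into the full index hash is not specified above and is left out. The function names are hypothetical.

```python
def rotl32(value: int, count: int) -> int:
    """Rotate a 32-bit value left by count bits."""
    value &= 0xFFFFFFFF
    return ((value << count) | (value >> (32 - count))) & 0xFFFFFFFF

def key_hash(key: bytes) -> int:
    """Rotating XOR over the key bytes; the key is assumed to be non-empty."""
    keyhash = key[0]
    for byte in key[1:]:
        keyhash = rotl32(keyhash, 1) ^ byte
    return keyhash

def table_address(hash_value: int, table_size: int) -> int:
    """Remainder of the hash divided by the table size."""
    return hash_value % table_size
```

- For the three-byte key `abc`, the sketch yields a keyhash of 291, which maps to address 2 in a seventeen-entry table.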
- One embodiment for a transaction table 1100 is shown in FIG. 11 in which each entry 1101 comprises a GUID 1102 for a transaction and the shared memory address 1103 for the look-aside table associated with the transaction. The GUID 1102 is a 16-byte (four 32-bit words) globally unique identifier assigned by the operating system. An entry is located within the transaction table 1100 by exclusively OR'ing the four words of the GUID, dividing the result by the maximum number of entries in the transaction table, and using the remainder to address the entry. Hash duplicates are handled as described above for the look-aside table. The address of the transaction table in the shared memory is stored in a fixed location in the shared memory so that it can always be found by the client processes.
- As mentioned above, the transaction and look-aside tables reside on fixed length shared memory pages and are capable of being resized when necessary. Both tables are designed to be allocated in various sizes with the smallest table having seventeen entries and the largest having 866,586 entries (the number of entries that fit on 1974 shared memory pages). There are four other intermediate sizes in the DTC implementation: 127, 439 (the number of entries that fit on one shared memory page), 7463 (the number of entries that fit on seventeen shared memory pages), and 55,753 (the number of entries that fit on 127 shared memory pages). The table size is factored into the hashing function as described above so that the resulting entry address falls within the number of entries for that size of table. Alternate table sizes are contemplated as within the scope of the invention.
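- The transaction table lookup and the relationship between the table sizes can be checked with a brief sketch. Reading the GUID as four little-endian 32-bit words is an assumption (the XOR result depends on word boundaries and byte order), and the function names are hypothetical.

```python
import struct

ENTRIES_PER_PAGE = 439
TABLE_SIZES = (17, 127, 439, 7463, 55753, 866586)

def guid_hash(guid: bytes) -> int:
    """XOR the four 32-bit words of a 16-byte GUID (little-endian assumed)."""
    w0, w1, w2, w3 = struct.unpack("<4I", guid)
    return w0 ^ w1 ^ w2 ^ w3

def transaction_slot(guid: bytes, table_size: int) -> int:
    """Remainder of the XOR'd words divided by the table size."""
    return guid_hash(guid) % table_size
```

- The three multi-page sizes are the single-page capacity multiplied by the page counts given above: 439 × 17 = 7463, 439 × 127 = 55,753, and 439 × 1974 = 866,586.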
- While any given table size can accommodate any possible number of entries because collisions are resolved using the linked list described above, having many more entries than the table is sized to hold leads to reduced performance when it is necessary to traverse the linked list.
- A hash table that fits on a single shared memory page is illustrated in FIG. 12, e.g., a hash table with seventeen, 127, or 439 entries in the DTC implementation. A hash table that spans multiple shared memory pages is illustrated in FIG. 13, e.g., a hash table of 7463, 55,753, or 866,586 entries in the DTC implementation. In both hash table data structures, the first four bytes 1201, 1301 contain the current size of the hash table. Both figures also illustrate the use of linked lists to resolve collisions among table entries. In the multiple-page hash table of FIG. 13, the first level 1301 is an array of page entries 1303 that point to pages 1304 which contain the hash entries 1303 comparable with the hash entries 1203 of hash table 1200. The smallest three hash table sizes are single level data structures as shown in FIG. 12. The larger three hash table sizes are two level data structures as shown in FIG. 13.
- In order to increase the performance of the IMDB system by reducing the number of traverses of a linked collision list, a transaction or look-aside table is resized to the next size if the current table size is not the maximum allowed size and the number of entries in the current table is greater than the maximum number of entries allowed under the current size. Performance can also be degraded if a transaction or look-aside table is too large since the dedicated but unused space in shared memory cannot be allocated to other data. Therefore, a table is shrunk to a smaller size if the number of entries is less than one half the number of entries in the next smaller sized table.
- The process of resizing a transaction or look-aside table is the responsibility of the core process which acquires an exclusive latch on the page or pages involved so that all client processes are denied access to the look-aside table during resizing. All entries in the old table are deleted from the old table and are added to the new table. Each entry is rehashed because the hash function for the new table can result in a different table address for the entry than its table address in the old table.
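- The grow and shrink rules and the rehash step can be sketched as follows. The representation of a table as a Python list of collision chains holding `(hash, value)` pairs is an assumption made for illustration, as are the function names; latching of the affected pages is omitted.

```python
TABLE_SIZES = (17, 127, 439, 7463, 55753, 866586)

def next_size(current_size: int, entry_count: int) -> int:
    """Pick the table size after applying the grow and shrink rules."""
    i = TABLE_SIZES.index(current_size)
    # Grow: not already the maximum size and more entries than the table holds.
    if i + 1 < len(TABLE_SIZES) and entry_count > current_size:
        return TABLE_SIZES[i + 1]
    # Shrink: fewer than half as many entries as the next smaller table holds.
    if i > 0 and 2 * entry_count < TABLE_SIZES[i - 1]:
        return TABLE_SIZES[i - 1]
    return current_size

def rehash(buckets, new_size: int):
    """Move every entry to a new table, recomputing each table address."""
    new_buckets = [[] for _ in range(new_size)]
    for chain in buckets:
        for hash_value, value in chain:
            new_buckets[hash_value % new_size].append((hash_value, value))
    return new_buckets
```

- In this sketch, three entries whose hashes are 1, 18, and 35 share one chain in a seventeen-entry table but occupy three separate addresses after the table grows to 127 entries, which illustrates why each entry is rehashed.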
- The application of the in-memory database system described in the first two sections to support Microsoft's Distributed Transaction Coordinator has been described in this section. A combination of page latches and hashing methodologies enables the unique versioning scheme described in the previous sections, thus providing concurrent database access while reducing the processing time for transactions.
- An in-memory database system has been described that enables multiple concurrent read-only access to database records through a unique versioning scheme based on look-aside tables associated with modifying transactions. Although specific embodiments have been illustrated and described herein, it will be appreciated by those of ordinary skill in the art that any arrangement which is calculated to achieve the same purpose may be substituted for the specific embodiments shown. This application is intended to cover any adaptations or variations of the present invention.
- For example, those of ordinary skill within the art will appreciate that a persistent database is not necessary to practice the invention and that the data structures and methods of the invention can be used to implement a stand-alone, non-persistent database. Additionally, while the invention has been described in terms of transactions that commit or abort related updates as a group, the look-aside table versioning scheme is equally applicable to transactions which commit or abort updates individually by including information in the look-aside table which associates each table entry with the update command that created the entry. Furthermore, those of ordinary skill within the art will appreciate that the invention can be practiced with any type of back-end database server, requiring only that the in-memory database manager process be constructed to execute the appropriate commands to read and write data to the database server.
- The terminology used in this application with respect to is meant to include all of these environments. Therefore, it is manifestly intended that this invention be limited only by the following claims and equivalents thereof.
Claims (26)
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US09/135,917 US6457021B1 (en) | 1998-08-18 | 1998-08-18 | In-memory database system |
Publications (2)
Publication Number | Publication Date |
---|---|
US20020087500A1 (en) | 2002-07-04 |
US6457021B1 (en) | 2002-09-24 |
Family
ID=22470366
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
US09/135,917 Expired - Lifetime US6457021B1 (en) | 1998-08-18 | 1998-08-18 | In-memory database system |
Country Status (1)
Country | Link |
---|---|
US (1) | US6457021B1 (en) |
US20220078236A1 (en) * | 2020-09-10 | 2022-03-10 | EMC IP Holding Company LLC | Multipart upload for distributed file systems |
US11288251B2 (en) * | 2018-05-25 | 2022-03-29 | Microsoft Technology Licensing, Llc | Supporting concurrent updates to a database page |
US11663207B2 (en) * | 2018-09-24 | 2023-05-30 | Salesforce, Inc. | Translation of tenant identifiers |
Families Citing this family (100)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US6513084B1 (en) * | 1999-06-29 | 2003-01-28 | Microsoft Corporation | Arbitration of state changes |
US6718032B1 (en) * | 1999-07-13 | 2004-04-06 | Interactive Intelligence | Read-only in-memory tables for intelligent call processing system |
US7099898B1 (en) * | 1999-08-12 | 2006-08-29 | International Business Machines Corporation | Data access system |
US7110973B1 (en) * | 1999-09-29 | 2006-09-19 | Charles Schwab & Co., Inc. | Method of processing customer transactions |
US6618822B1 (en) * | 2000-01-03 | 2003-09-09 | Oracle International Corporation | Method and mechanism for relational access of recovery logs in a database system |
US6675180B2 (en) * | 2000-06-06 | 2004-01-06 | Matsushita Electric Industrial Co., Ltd. | Data updating apparatus that performs quick restoration processing |
US6631374B1 (en) | 2000-09-29 | 2003-10-07 | Oracle Corp. | System and method for providing fine-grained temporal database access |
US8650169B1 (en) * | 2000-09-29 | 2014-02-11 | Oracle International Corporation | Method and mechanism for identifying transaction on a row of data |
US6959301B2 (en) * | 2001-01-04 | 2005-10-25 | Reuters Limited | Maintaining and reconstructing the history of database content modified by a series of events |
JP3631680B2 (en) * | 2001-02-06 | 2005-03-23 | 株式会社ビーコンインフォメーションテクノロジー | Data processing system, data processing method, and computer program |
US7174363B1 (en) | 2001-02-22 | 2007-02-06 | Charles Schwab & Co., Inc. | Distributed computing system architecture |
US7310653B2 (en) * | 2001-04-02 | 2007-12-18 | Siebel Systems, Inc. | Method, system, and product for maintaining software objects during database upgrade |
US6996600B2 (en) * | 2001-09-28 | 2006-02-07 | Siemens Building Technologies Inc. | System and method for servicing messages between device controller nodes and via a lon network |
US7155499B2 (en) * | 2001-09-28 | 2006-12-26 | Siemens Building Technologies, Inc. | System controller for controlling a control network having an open communication protocol via proprietary communication |
US20030074460A1 (en) * | 2001-09-28 | 2003-04-17 | Michael Soemo | Proprietary protocol for communicating network variables on a control network |
US20030074459A1 (en) * | 2001-09-28 | 2003-04-17 | Michael Soemo | Proprietary protocol for a system controller for controlling device controllers on a network having an open communication protocol |
US7072879B2 (en) * | 2001-10-22 | 2006-07-04 | Siemens Building Technologies, Inc. | Partially embedded database and an embedded database manager for a control system |
US7117069B2 (en) * | 2001-11-28 | 2006-10-03 | Siemens Building Technologies, Inc. | Apparatus and method for executing block programs |
US6993539B2 (en) | 2002-03-19 | 2006-01-31 | Network Appliance, Inc. | System and method for determining changes in two snapshots and for transmitting changes to destination snapshot |
US8099393B2 (en) * | 2002-03-22 | 2012-01-17 | Oracle International Corporation | Transaction in memory object store |
CA2383825A1 (en) * | 2002-04-24 | 2003-10-24 | Ibm Canada Limited-Ibm Canada Limitee | Dynamic configuration and self-tuning of inter-nodal communication resources in a database management system |
CA2383713A1 (en) * | 2002-04-26 | 2003-10-26 | Ibm Canada Limited-Ibm Canada Limitee | Managing attribute-tagged index entries |
US7873700B2 (en) * | 2002-08-09 | 2011-01-18 | Netapp, Inc. | Multi-protocol storage appliance that provides integrated support for file and block access protocols |
US7340486B1 (en) * | 2002-10-10 | 2008-03-04 | Network Appliance, Inc. | System and method for file system snapshot of a virtual logical disk |
US8010491B2 (en) * | 2003-02-28 | 2011-08-30 | Microsoft Corporation | Method for managing multiple file states for replicated files |
US7562214B2 (en) * | 2003-03-31 | 2009-07-14 | International Business Machines Corporation | Data processing systems |
US7383378B1 (en) * | 2003-04-11 | 2008-06-03 | Network Appliance, Inc. | System and method for supporting file and block access to storage object on a storage appliance |
US7113953B2 (en) * | 2003-06-30 | 2006-09-26 | International Business Machines Corporation | System and method for efficiently writing data from an in-memory database to a disk database |
US9412123B2 (en) | 2003-07-01 | 2016-08-09 | The 41St Parameter, Inc. | Keystroke analysis |
WO2005008183A2 (en) * | 2003-07-16 | 2005-01-27 | Prospective Concepts Ag | Modular data recording and display unit |
US8225282B1 (en) * | 2003-11-25 | 2012-07-17 | Nextaxiom Technology, Inc. | Semantic-based, service-oriented system and method of developing, programming and managing software modules and software solutions |
US20050187984A1 (en) * | 2004-02-20 | 2005-08-25 | Tianlong Chen | Data driven database management system and method |
US7421562B2 (en) * | 2004-03-01 | 2008-09-02 | Sybase, Inc. | Database system providing methodology for extended memory support |
US10999298B2 (en) | 2004-03-02 | 2021-05-04 | The 41St Parameter, Inc. | Method and system for identifying users and detecting fraud by use of the internet |
WO2005086003A1 (en) * | 2004-03-08 | 2005-09-15 | Annex Systems Incorporated | Database system |
EP1738287A4 (en) * | 2004-03-18 | 2008-01-23 | Andrew Peter Liebman | A novel media file access and storage solution for multi-workstation/multi-platform non-linear video editing systems |
WO2009129252A2 (en) * | 2008-04-14 | 2009-10-22 | Andrew Liebman | A novel media file for multi-platform non-linear video editing systems |
US7499953B2 (en) | 2004-04-23 | 2009-03-03 | Oracle International Corporation | Online recovery of user tables using flashback table |
US7437355B2 (en) * | 2004-06-24 | 2008-10-14 | Sap Ag | Method and system for parallel update of database |
US7822727B1 (en) * | 2004-07-02 | 2010-10-26 | Borland Software Corporation | System and methodology for performing read-only transactions in a shared cache |
US8095501B1 (en) | 2004-07-27 | 2012-01-10 | Infoblox Inc. | Automatic enforcement or relationships in a database schema |
US7383281B1 (en) * | 2004-09-24 | 2008-06-03 | Infoblox, Inc. | Multiversion database cluster management |
US8364631B1 (en) | 2004-09-24 | 2013-01-29 | Infoblox Inc. | Database migration |
US7698501B1 (en) | 2005-04-29 | 2010-04-13 | Netapp, Inc. | System and method for utilizing sparse data containers in a striped volume set |
US7653682B2 (en) * | 2005-07-22 | 2010-01-26 | Netapp, Inc. | Client failure fencing mechanism for fencing network file system data in a host-cluster environment |
US7454436B2 (en) * | 2005-08-30 | 2008-11-18 | Microsoft Corporation | Generational global name table |
EP1934719A2 (en) * | 2005-09-20 | 2008-06-25 | Sterna Technologies (2005) Ltd. | A method and system for managing data and organizational constraints |
US11301585B2 (en) | 2005-12-16 | 2022-04-12 | The 41St Parameter, Inc. | Methods and apparatus for securely displaying digital images |
US8938671B2 (en) | 2005-12-16 | 2015-01-20 | The 41St Parameter, Inc. | Methods and apparatus for securely displaying digital images |
US7702658B2 (en) * | 2006-01-27 | 2010-04-20 | International Business Machines Corporation | Method for optimistic locking using SQL select, update, delete, and insert statements |
US8151327B2 (en) | 2006-03-31 | 2012-04-03 | The 41St Parameter, Inc. | Systems and methods for detection of session tampering and fraud prevention |
US7698273B2 (en) * | 2006-06-23 | 2010-04-13 | Microsoft Corporation | Solving table locking problems associated with concurrent processing |
WO2008001192A2 (en) * | 2006-06-28 | 2008-01-03 | Nokia Corporation | Apparatus, method and computer program product providing protected feedback signaling transmission in uplink closed-loop mimo |
US20080033908A1 (en) * | 2006-08-04 | 2008-02-07 | Nortel Networks Limited | Method and system for data processing in a shared database environment |
US7609703B2 (en) | 2006-09-15 | 2009-10-27 | Hewlett-Packard Development Company, L.P. | Group communication system and method |
US8181187B2 (en) * | 2006-12-01 | 2012-05-15 | Portico Systems | Gateways having localized in-memory databases and business logic execution |
GB0625698D0 (en) * | 2006-12-21 | 2007-01-31 | Ibm | Rollback support in distributed data management systems |
US8301673B2 (en) * | 2006-12-29 | 2012-10-30 | Netapp, Inc. | System and method for performing distributed consistency verification of a clustered file system |
US8219821B2 (en) | 2007-03-27 | 2012-07-10 | Netapp, Inc. | System and method for signature based data container recognition |
US8312214B1 (en) | 2007-03-28 | 2012-11-13 | Netapp, Inc. | System and method for pausing disk drives in an aggregate |
US8219749B2 (en) * | 2007-04-27 | 2012-07-10 | Netapp, Inc. | System and method for efficient updates of sequential block storage |
US7882304B2 (en) * | 2007-04-27 | 2011-02-01 | Netapp, Inc. | System and method for efficient updates of sequential block storage |
US7827350B1 (en) | 2007-04-27 | 2010-11-02 | Netapp, Inc. | Method and system for promoting a snapshot in a distributed file system |
US7996636B1 (en) | 2007-11-06 | 2011-08-09 | Netapp, Inc. | Uniquely identifying block context signatures in a storage volume hierarchy |
US7984259B1 (en) | 2007-12-17 | 2011-07-19 | Netapp, Inc. | Reducing load imbalance in a storage system |
US8380674B1 (en) | 2008-01-09 | 2013-02-19 | Netapp, Inc. | System and method for migrating lun data between data containers |
US8725986B1 (en) | 2008-04-18 | 2014-05-13 | Netapp, Inc. | System and method for volume block number to disk block number mapping |
CN105373592B (en) * | 2008-06-19 | 2019-01-11 | 安德鲁·利布曼 | For the novel media file access of multiple-workstation/multi-platform non-linear video editing systems and storage solution |
US20100023560A1 (en) * | 2008-07-25 | 2010-01-28 | Mitel Networks Corporation | Method and apparatus for concurrently updating a database |
US9280572B2 (en) * | 2009-01-12 | 2016-03-08 | Oracle International Corporation | Managing product information versions |
JP4672778B2 (en) * | 2009-01-29 | 2011-04-20 | 東芝ストレージデバイス株式会社 | Data storage |
US9112850B1 (en) | 2009-03-25 | 2015-08-18 | The 41St Parameter, Inc. | Systems and methods of sharing information through a tag-based consortium |
US8458217B1 (en) | 2009-08-24 | 2013-06-04 | Advent Software, Inc. | Instantly built information space (IBIS) |
US9053200B2 (en) * | 2009-12-14 | 2015-06-09 | Appfolio, Inc. | Systems and methods for sorting, grouping, and rendering subsets of large datasets over a network |
CN102156700A (en) * | 2010-02-12 | 2011-08-17 | 华为技术有限公司 | Database accessing method and device and system |
JP5459613B2 (en) * | 2010-02-26 | 2014-04-02 | 日本電気株式会社 | Data processing system, data processing method, and data processing program |
JP5460486B2 (en) * | 2010-06-23 | 2014-04-02 | インターナショナル・ビジネス・マシーンズ・コーポレーション | Apparatus and method for sorting data |
WO2012054646A2 (en) | 2010-10-19 | 2012-04-26 | The 41St Parameter, Inc. | Variable risk engine |
CN102117338B (en) * | 2011-04-02 | 2013-03-13 | 天脉聚源(北京)传媒科技有限公司 | Data base caching method |
US9626375B2 (en) | 2011-04-08 | 2017-04-18 | Andrew Liebman | Systems, computer readable storage media, and computer implemented methods for project sharing |
US20120310934A1 (en) * | 2011-06-03 | 2012-12-06 | Thomas Peh | Historic View on Column Tables Using a History Table |
US8769350B1 (en) * | 2011-09-20 | 2014-07-01 | Advent Software, Inc. | Multi-writer in-memory non-copying database (MIND) system and method |
US10754913B2 (en) | 2011-11-15 | 2020-08-25 | Tapad, Inc. | System and method for analyzing user device information |
US8332349B1 (en) | 2012-01-06 | 2012-12-11 | Advent Software, Inc. | Asynchronous acid event-driven data processing using audit trail tools for transaction systems |
US9633201B1 (en) | 2012-03-01 | 2017-04-25 | The 41St Parameter, Inc. | Methods and systems for fraud containment |
US9521551B2 (en) | 2012-03-22 | 2016-12-13 | The 41St Parameter, Inc. | Methods and systems for persistent cross-application mobile device identification |
EP2880619A1 (en) | 2012-08-02 | 2015-06-10 | The 41st Parameter, Inc. | Systems and methods for accessing records via derivative locators |
WO2014078569A1 (en) | 2012-11-14 | 2014-05-22 | The 41St Parameter, Inc. | Systems and methods of global identification |
US20140258212A1 (en) * | 2013-03-06 | 2014-09-11 | Sap Ag | Dynamic in-memory database search |
US9430543B2 (en) | 2013-03-15 | 2016-08-30 | Wal-Mart Stores, Inc. | Incrementally updating a large key-value store |
US9928104B2 (en) * | 2013-06-19 | 2018-03-27 | Nvidia Corporation | System, method, and computer program product for a two-phase queue |
US10909113B2 (en) | 2013-07-31 | 2021-02-02 | Sap Se | Global dictionary for database management systems |
US9659050B2 (en) | 2013-08-06 | 2017-05-23 | Sybase, Inc. | Delta store giving row-level versioning semantics to a non-row-level versioning underlying store |
US10902327B1 (en) | 2013-08-30 | 2021-01-26 | The 41St Parameter, Inc. | System and method for device identification and uniqueness |
US9773048B2 (en) * | 2013-09-12 | 2017-09-26 | Sap Se | Historical data for in memory data warehouse |
US9292564B2 (en) * | 2013-09-21 | 2016-03-22 | Oracle International Corporation | Mirroring, in memory, data from disk to improve query performance |
US9430329B2 (en) * | 2014-04-03 | 2016-08-30 | Seagate Technology Llc | Data integrity management in a data storage device |
US10091312B1 (en) | 2014-10-14 | 2018-10-02 | The 41St Parameter, Inc. | Data structures for intelligently resolving deterministic and probabilistic device identifiers to device profiles and/or groups |
US10019476B2 (en) | 2015-05-27 | 2018-07-10 | Microsoft Technology Licensing, Llc | Multi-version data system nested transactions isolation |
US11164206B2 (en) * | 2018-11-16 | 2021-11-02 | Comenity Llc | Automatically aggregating, evaluating, and providing a contextually relevant offer |
Family Cites Families (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US4914569A (en) * | 1987-10-30 | 1990-04-03 | International Business Machines Corporation | Method for concurrent record access, insertion, deletion and alteration using an index tree |
JP3392236B2 (en) * | 1994-11-04 | 2003-03-31 | 富士通株式会社 | Distributed transaction processing system |
US5745904A (en) | 1996-01-12 | 1998-04-28 | Microsoft Corporation | Buffered table user index |
US5878410A (en) * | 1996-09-13 | 1999-03-02 | Microsoft Corporation | File system sort order indexes |
US5832508A (en) * | 1996-09-18 | 1998-11-03 | Sybase, Inc. | Method for deallocating a log in database systems |
US6029177A (en) * | 1997-11-13 | 2000-02-22 | Electronic Data Systems Corporation | Method and system for maintaining the integrity of a database providing persistent storage for objects |
- 1998-08-18: US application US09/135,917, patent US6457021B1, not active (Expired - Lifetime)
Cited By (148)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20110010299A1 (en) * | 1996-06-28 | 2011-01-13 | Shannon Lee Byrne | System for dynamically encrypting content for secure internet commerce and providing embedded fulfillment software |
US20160217274A1 (en) * | 1996-06-28 | 2016-07-28 | Arvato Digital Services Llc | System for dynamically encrypting content for secure internet commerce and providing embedded fulfillment software |
US20030004945A1 (en) * | 2001-06-28 | 2003-01-02 | International Business Machines Corporation | System and method for avoiding deadlock situations due to pseudo-deleted entries |
US6944615B2 (en) * | 2001-06-28 | 2005-09-13 | International Business Machines Corporation | System and method for avoiding deadlock situations due to pseudo-deleted entries |
US20050027758A1 (en) * | 2003-07-07 | 2005-02-03 | Evyatar Meller | Method and system for updating versions of content stored in a storage device |
US7676479B2 (en) * | 2003-07-07 | 2010-03-09 | Red Bend Ltd. | Method and system for updating versions of content stored in a storage device |
US8615454B2 (en) | 2003-10-22 | 2013-12-24 | Scottrade, Inc. | System and method for the automated brokerage of financial instruments |
US8655755B2 (en) | 2003-10-22 | 2014-02-18 | Scottrade, Inc. | System and method for the automated brokerage of financial instruments |
US20090182656A1 (en) * | 2003-10-22 | 2009-07-16 | Scottrade, Inc. | System and Method for the Automated Brokerage of Financial Instruments |
US8170940B2 (en) * | 2003-10-22 | 2012-05-01 | Scottrade, Inc. | System and method for the automated brokerage of financial instruments |
US8756130B2 (en) | 2003-10-22 | 2014-06-17 | Scottrade, Inc. | System and method for the automated brokerage of financial instruments |
US8612321B2 (en) | 2003-10-22 | 2013-12-17 | Scottrade, Inc. | System and method for the automated brokerage of financial instruments |
US7721062B1 (en) | 2003-11-10 | 2010-05-18 | Netapp, Inc. | Method for detecting leaked buffer writes across file system consistency points |
US7739250B1 (en) | 2003-11-10 | 2010-06-15 | Netapp, Inc. | System and method for managing file data during consistency points |
US7401093B1 (en) * | 2003-11-10 | 2008-07-15 | Network Appliance, Inc. | System and method for managing file data during consistency points |
US7979402B1 (en) | 2003-11-10 | 2011-07-12 | Netapp, Inc. | System and method for managing file data during consistency points |
US7222117B1 (en) | 2003-11-14 | 2007-05-22 | Advent Software, Inc. | Segmented global area database |
DE102004045727B3 (en) * | 2004-09-21 | 2006-04-13 | Infineon Technologies Ag | Microcontroller or microprocessor with interface circuit has demultiplexer, registers in parallel and multiplexer for processing signals including page control bits |
US9141930B2 (en) * | 2005-06-16 | 2015-09-22 | Sap Se | Method and apparatus for making changes to a quantity for a time interval within a time series |
US20070005457A1 (en) * | 2005-06-16 | 2007-01-04 | Andrei Suvernev | Parallel time interval processing through shadowing |
US10068197B2 (en) | 2005-06-16 | 2018-09-04 | Sap Se | Method and apparatus for making changes to a quantity for a time interval within a time series |
US7831624B2 (en) * | 2005-06-24 | 2010-11-09 | Seagate Technology Llc | Skip list with address related table structure |
US20060294118A1 (en) * | 2005-06-24 | 2006-12-28 | Seagate Technology Llc | Skip list with address related table structure |
EP1840768A3 (en) * | 2006-03-28 | 2007-12-26 | Sun Microsystems, Inc. | Systems and method for a distributed in-memory database |
EP1840768A2 (en) * | 2006-03-28 | 2007-10-03 | Sun Microsystems, Inc. | Systems and method for a distributed in-memory database |
EP1840766A3 (en) * | 2006-03-28 | 2007-12-26 | Sun Microsystems, Inc. | Systems and methods for a distributed in-memory database and distributed cache |
US7822792B2 (en) * | 2006-12-15 | 2010-10-26 | Sap Ag | Administration of planning file entries in planning systems with concurrent transactions |
US20080228795A1 (en) * | 2007-03-12 | 2008-09-18 | Microsoft Corporation | Transaction time indexing with version compression |
US7747589B2 (en) * | 2007-03-12 | 2010-06-29 | Microsoft Corporation | Transaction time indexing with version compression |
US20080243966A1 (en) * | 2007-04-02 | 2008-10-02 | Croisettier Ramanakumari M | System and method for managing temporary storage space of a database management system |
US8892558B2 (en) | 2007-09-26 | 2014-11-18 | International Business Machines Corporation | Inserting data into an in-memory distributed nodal database |
US9183284B2 (en) | 2007-09-26 | 2015-11-10 | International Business Machines Corporation | Inserting data into an in-memory distributed nodal database |
US20090083276A1 (en) * | 2007-09-26 | 2009-03-26 | Barsness Eric L | Inserting data into an in-memory distributed nodal database |
US9183283B2 (en) | 2007-09-26 | 2015-11-10 | International Business Machines Corporation | Inserting data into an in-memory distributed nodal database |
US9594794B2 (en) * | 2007-10-19 | 2017-03-14 | Oracle International Corporation | Restoring records using a change transaction log |
US20090106325A1 (en) * | 2007-10-19 | 2009-04-23 | Oracle International Corporation | Restoring records using a change transaction log |
US20090106324A1 (en) * | 2007-10-19 | 2009-04-23 | Oracle International Corporation | Push-model based index deletion |
US20090106216A1 (en) * | 2007-10-19 | 2009-04-23 | Oracle International Corporation | Push-model based index updating |
US9418154B2 (en) | 2007-10-19 | 2016-08-16 | Oracle International Corporation | Push-model based index updating |
US20090106196A1 (en) * | 2007-10-19 | 2009-04-23 | Oracle International Corporation | Transferring records between tables using a change transaction log |
US8682859B2 (en) | 2007-10-19 | 2014-03-25 | Oracle International Corporation | Transferring records between tables using a change transaction log |
US9594784B2 (en) | 2007-10-19 | 2017-03-14 | Oracle International Corporation | Push-model based index deletion |
US20090144337A1 (en) * | 2007-11-29 | 2009-06-04 | Eric Lawrence Barsness | Commitment control for less than an entire record in an in-memory database in a parallel computer system |
US8027996B2 (en) * | 2007-11-29 | 2011-09-27 | International Business Machines Corporation | Commitment control for less than an entire record in an in-memory database in a parallel computer system |
US8397043B2 (en) * | 2007-12-17 | 2013-03-12 | Freescale Semiconductor, Inc. | Memory mapping system, request controller, multi-processing arrangement, central interrupt request controller, apparatus, method for controlling memory access and computer program product |
US20100268905A1 (en) * | 2007-12-17 | 2010-10-21 | Freescale Semiconductor, Inc. | Memory mapping system, request controller, multi-processing arrangement, central interrupt request controller, apparatus, method for controlling memory access and computer program product |
US8250028B2 (en) * | 2008-05-28 | 2012-08-21 | International Business Machines Corporation | Method for coordinating updates to database and in-memory cache |
US20120030429A1 (en) * | 2008-05-28 | 2012-02-02 | International Business Machines Corporation | Method for coordinating updates to database and in-memory cache |
US8131698B2 (en) * | 2008-05-28 | 2012-03-06 | International Business Machines Corporation | Method for coordinating updates to database and in-memory cache |
US20090300286A1 (en) * | 2008-05-28 | 2009-12-03 | International Business Machines Corporation | Method for coordinating updates to database and in-memory cache |
US10496670B1 (en) * | 2009-01-21 | 2019-12-03 | Vmware, Inc. | Computer storage deduplication |
US11899592B2 (en) * | 2009-01-21 | 2024-02-13 | Vmware, Inc. | Computer storage deduplication |
US20200065318A1 (en) * | 2009-01-21 | 2020-02-27 | Vmware, Inc. | Computer storage deduplication |
US8438558B1 (en) | 2009-03-27 | 2013-05-07 | Google Inc. | System and method of updating programs and data |
US10642696B2 (en) | 2009-06-16 | 2020-05-05 | Bmc Software, Inc. | Copying compressed pages without uncompressing the compressed pages |
US9753811B2 (en) | 2009-06-16 | 2017-09-05 | Bmc Software, Inc. | Unobtrusive copies of actively used compressed indices |
US20100318497A1 (en) * | 2009-06-16 | 2010-12-16 | Bmc Software, Inc. | Unobtrusive Copies of Actively Used Compressed Indices |
US8843449B2 (en) * | 2009-06-16 | 2014-09-23 | Bmc Software, Inc. | Unobtrusive copies of actively used compressed indices |
US9710396B2 (en) * | 2009-12-21 | 2017-07-18 | Intel Corporation | Sharing virtual memory-based multi-version data between the heterogeneous processors of a computer platform |
US8868848B2 (en) | 2009-12-21 | 2014-10-21 | Intel Corporation | Sharing virtual memory-based multi-version data between the heterogenous processors of a computer platform |
US20150019825A1 (en) * | 2009-12-21 | 2015-01-15 | Ying Gao | Sharing virtual memory-based multi-version data between the heterogeneous processors of a computer platform |
GB2476360A (en) * | 2009-12-21 | 2011-06-22 | Intel Corp | Passing data from a CPU to a graphics processor by writing multiple versions of the data in a shared memory |
GB2476360B (en) * | 2009-12-21 | 2012-10-31 | Intel Corp | Sharing virtual memory-based multi-version data between the heterogenous processors of a computer platform |
US20110153957A1 (en) * | 2009-12-21 | 2011-06-23 | Ying Gao | Sharing virtual memory-based multi-version data between the heterogenous processors of a computer platform |
US9756469B2 (en) * | 2010-03-24 | 2017-09-05 | Matrixx Software, Inc. | System with multiple conditional commit databases |
US20160350360A1 (en) * | 2010-03-24 | 2016-12-01 | Matrixx Software, Inc. | System with multiple conditional commit databases |
US20160147827A1 (en) * | 2010-04-08 | 2016-05-26 | Microsoft Technology Licensing, Llc | In-memory database system |
US10296615B2 (en) * | 2010-04-08 | 2019-05-21 | Microsoft Technology Licensing, Llc | In-memory database system |
US11048691B2 (en) * | 2010-04-08 | 2021-06-29 | Microsoft Technology Licensing, Llc | In-memory database system |
US10055449B2 (en) * | 2010-04-08 | 2018-08-21 | Microsoft Technology Licensing, Llc | In-memory database system |
US9830350B2 (en) * | 2010-04-08 | 2017-11-28 | Microsoft Technology Licensing, Llc | In-memory database system |
US8655886B1 (en) * | 2011-03-25 | 2014-02-18 | Google Inc. | Selective indexing of content portions |
US8412690B2 (en) * | 2011-04-11 | 2013-04-02 | Sap Ag | In-memory processing for a data warehouse |
US20120259809A1 (en) * | 2011-04-11 | 2012-10-11 | Sap Ag | In-Memory Processing for a Data Warehouse |
US20120323971A1 (en) * | 2011-06-14 | 2012-12-20 | Sybase, Inc. | Optimizing data storage and access of an in-memory database |
US20130013890A1 (en) * | 2011-07-06 | 2013-01-10 | International Business Machines Corporation | Database system |
US9155320B2 (en) * | 2011-07-06 | 2015-10-13 | International Business Machines Corporation | Prefix-based leaf node storage for database system |
US9149054B2 (en) * | 2011-07-06 | 2015-10-06 | International Business Machines Corporation | Prefix-based leaf node storage for database system |
US20130013602A1 (en) * | 2011-07-06 | 2013-01-10 | International Business Machines Corporation | Database system |
US9043750B2 (en) * | 2012-10-09 | 2015-05-26 | Sap Se | Automated generation of two-tier mobile applications |
US20140101635A1 (en) * | 2012-10-09 | 2014-04-10 | Martin Hoffmann | Automated generation of two-tier mobile applications |
US20140279836A1 (en) * | 2013-03-13 | 2014-09-18 | Sap Ag | Configurable Rule for Monitoring Data of In Memory Database |
US9646040B2 (en) * | 2013-03-13 | 2017-05-09 | Sap Se | Configurable rule for monitoring data of in memory database |
US9952975B2 (en) | 2013-04-30 | 2018-04-24 | Hewlett Packard Enterprise Development Lp | Memory network to route memory traffic and I/O traffic |
WO2014178854A1 (en) * | 2013-04-30 | 2014-11-06 | Hewlett-Packard Development Company, L.P. | Memory network to route memory traffic and i/o traffic |
US20140337593A1 (en) * | 2013-05-10 | 2014-11-13 | Hugh W. Holbrook | System and method for reading and writing data with a shared memory hash table |
US9495114B2 (en) * | 2013-05-10 | 2016-11-15 | Arista Networks, Inc. | System and method for reading and writing data with a shared memory hash table |
EP2802110A1 (en) * | 2013-05-10 | 2014-11-12 | Arista Networks, Inc. | System and method for reading and writing data with a shared memory hash table |
US20140344796A1 (en) * | 2013-05-20 | 2014-11-20 | General Electric Company | Utility meter with utility-configurable sealed data |
US8886671B1 (en) | 2013-08-14 | 2014-11-11 | Advent Software, Inc. | Multi-tenant in-memory database (MUTED) system and method |
US9734221B2 (en) | 2013-09-12 | 2017-08-15 | Sap Se | In memory database warehouse |
US20150074053A1 (en) * | 2013-09-12 | 2015-03-12 | Sap Ag | Cross System Analytics for In Memory Data Warehouse |
US9734230B2 (en) * | 2013-09-12 | 2017-08-15 | Sap Se | Cross system analytics for in memory data warehouse |
US20150220596A1 (en) * | 2014-01-31 | 2015-08-06 | International Business Machines Corporation | Dynamically adjust duplicate skipping method for increased performance |
US20150220595A1 (en) * | 2014-01-31 | 2015-08-06 | International Business Machines Corporation | Dynamically adjust duplicate skipping method for increased performance |
US9892158B2 (en) * | 2014-01-31 | 2018-02-13 | International Business Machines Corporation | Dynamically adjust duplicate skipping method for increased performance |
US9928274B2 (en) * | 2014-01-31 | 2018-03-27 | International Business Machines Corporation | Dynamically adjust duplicate skipping method for increased performance |
US20150324278A1 (en) * | 2014-03-03 | 2015-11-12 | Empire Technology Development Llc | Data sort using memory-intensive exosort |
US9858179B2 (en) * | 2014-03-03 | 2018-01-02 | Empire Technology Development Llc | Data sort using memory-intensive exosort |
EP3136245A4 (en) * | 2014-04-23 | 2017-10-18 | Hitachi, Ltd. | Computer |
US10430287B2 (en) | 2014-04-23 | 2019-10-01 | Hitachi, Ltd. | Computer |
US20210004360A1 (en) * | 2015-07-30 | 2021-01-07 | Workday, Inc. | Indexing structured data with security information |
US11860861B2 (en) | 2015-09-04 | 2024-01-02 | Arista Networks, Inc. | Growing dynamic shared memory hash table |
US11068469B2 (en) | 2015-09-04 | 2021-07-20 | Arista Networks, Inc. | System and method of a dynamic shared memory hash table with notifications |
WO2017063049A1 (en) * | 2015-10-15 | 2017-04-20 | Big Ip Pty Ltd | A system, method, computer program and data signal for conducting an electronic search of a database |
WO2017063048A1 (en) * | 2015-10-15 | 2017-04-20 | Big Ip Pty Ltd | A system, method, computer program and data signal for the provision of a database of information for lead generating purposes |
US10997164B2 (en) * | 2015-11-23 | 2021-05-04 | Sap Se | Unified table delta dictionary lazy materialization |
US20170147633A1 (en) * | 2015-11-23 | 2017-05-25 | Sap Se | Unified table delta dictionary lazy materialization |
WO2017095387A1 (en) * | 2015-11-30 | 2017-06-08 | Hewlett-Packard Enterprise Development LP | Multiple simultaneous value object |
US10152527B1 (en) * | 2015-12-28 | 2018-12-11 | EMC IP Holding Company LLC | Increment resynchronization in hash-based replication |
US10394797B2 (en) | 2016-03-10 | 2019-08-27 | TmaxData Co., Ltd. | Method and computing apparatus for managing main memory database |
US11182365B2 (en) * | 2016-03-21 | 2021-11-23 | Mellanox Technologies Tlv Ltd. | Systems and methods for distributed storage of data across multiple hash tables |
US10558636B2 (en) * | 2016-04-27 | 2020-02-11 | Sap Se | Index page with latch-free access |
US20170316042A1 (en) * | 2016-04-27 | 2017-11-02 | Sap Se | Index page with latch-free access |
US10268723B2 (en) | 2016-06-20 | 2019-04-23 | TmaxData Co., Ltd. | Method and apparatus for executing query and computer readable medium therefor |
US10275491B2 (en) | 2016-06-20 | 2019-04-30 | TmaxData Co., Ltd. | Method and apparatus for executing query and computer readable medium therefor |
US20180167460A1 (en) * | 2016-12-09 | 2018-06-14 | Google Inc. | High-throughput algorithm for multiversion concurrency control with globally synchronized time |
US20210185126A1 (en) * | 2016-12-09 | 2021-06-17 | Google Llc | High-Throughput Algorithm For Multiversion Concurrency Control With Globally Synchronized Time |
US11601501B2 (en) * | 2016-12-09 | 2023-03-07 | Google Llc | High-throughput algorithm for multiversion concurrency control with globally synchronized time |
US10951706B2 (en) * | 2016-12-09 | 2021-03-16 | Google Llc | High-throughput algorithm for multiversion concurrency control with globally synchronized time |
US10592251B2 (en) | 2017-04-18 | 2020-03-17 | International Business Machines Corporation | Register restoration using transactional memory register snapshots |
US10963261B2 (en) | 2017-04-18 | 2021-03-30 | International Business Machines Corporation | Sharing snapshots across save requests |
US10740108B2 (en) | 2017-04-18 | 2020-08-11 | International Business Machines Corporation | Management of store queue based on restoration operation |
US10782979B2 (en) | 2017-04-18 | 2020-09-22 | International Business Machines Corporation | Restoring saved architected registers and suppressing verification of registers to be restored |
US10838733B2 (en) | 2017-04-18 | 2020-11-17 | International Business Machines Corporation | Register context restoration based on rename register recovery |
US10649785B2 (en) | 2017-04-18 | 2020-05-12 | International Business Machines Corporation | Tracking changes to memory via check and recovery |
US10489382B2 (en) * | 2017-04-18 | 2019-11-26 | International Business Machines Corporation | Register restoration invalidation based on a context switch |
US10732981B2 (en) | 2017-04-18 | 2020-08-04 | International Business Machines Corporation | Management of store queue based on restoration operation |
US10540184B2 (en) | 2017-04-18 | 2020-01-21 | International Business Machines Corporation | Coalescing store instructions for restoration |
US10545766B2 (en) | 2017-04-18 | 2020-01-28 | International Business Machines Corporation | Register restoration using transactional memory register snapshots |
US11010192B2 (en) | 2017-04-18 | 2021-05-18 | International Business Machines Corporation | Register restoration using recovery buffers |
US10572265B2 (en) | 2017-04-18 | 2020-02-25 | International Business Machines Corporation | Selecting register restoration or register reloading |
US10564977B2 (en) | 2017-04-18 | 2020-02-18 | International Business Machines Corporation | Selective register allocation |
US11061684B2 (en) | 2017-04-18 | 2021-07-13 | International Business Machines Corporation | Architecturally paired spill/reload multiple instructions for suppressing a snapshot latest value determination |
US10552164B2 (en) | 2017-04-18 | 2020-02-04 | International Business Machines Corporation | Sharing snapshots between restoration and recovery |
US11216809B2 (en) | 2018-01-17 | 2022-01-04 | Tzero Ip, Llc | Multi-approval system using M of N keys to restore a customer wallet |
WO2019143849A1 (en) * | 2018-01-17 | 2019-07-25 | Medici Ventures, Inc. | Multi-approval system using m of n keys to restore a customer wallet |
US11392940B2 (en) | 2018-01-17 | 2022-07-19 | Tzero Ip, Llc | Multi-approval system using M of N keys to perform an action at a customer device |
US11429959B2 (en) | 2018-01-17 | 2022-08-30 | Tzero Ip, Llc | Multi-approval system using M of N keys to generate a transaction address |
US11531985B2 (en) | 2018-01-17 | 2022-12-20 | Tzero Ip, Llc | Multi-approval system using M of N keys to generate a sweeping transaction at a customer device |
US10990585B2 (en) * | 2018-05-10 | 2021-04-27 | Sap Se | Transaction-specific selective uncommitted read for database transactions |
US11288251B2 (en) * | 2018-05-25 | 2022-03-29 | Microsoft Technology Licensing, Llc | Supporting concurrent updates to a database page |
US11663207B2 (en) * | 2018-09-24 | 2023-05-30 | Salesforce, Inc. | Translation of tenant identifiers |
CN111143143A (en) * | 2019-12-26 | 2020-05-12 | 北京神州绿盟信息安全科技股份有限公司 | Performance test method and device |
CN111259048A (en) * | 2020-01-08 | 2020-06-09 | 人民法院信息技术服务中心 | Data transmission method and system based on memory database and multiple data channels |
US20220078236A1 (en) * | 2020-09-10 | 2022-03-10 | EMC IP Holding Company LLC | Multipart upload for distributed file systems |
US11671492B2 (en) * | 2020-09-10 | 2023-06-06 | EMC IP Holding Company LLC | Multipart upload for distributed file systems |
CN113177031A (en) * | 2021-04-21 | 2021-07-27 | 北京人大金仓信息技术股份有限公司 | Processing method and device for database shared cache, electronic equipment and medium |
Also Published As
Publication number | Publication date |
---|---|
US6457021B1 (en) | 2002-09-24 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
US6457021B1 (en) | | In-memory database system |
US7031987B2 (en) | | Integrating tablespaces with different block sizes |
US5430869A (en) | | System and method for restructuring a B-Tree |
US6321234B1 (en) | | Database server system with improved methods for logging transactions |
US6314417B1 (en) | | Processing multiple database transactions in the same process to reduce process overhead and redundant retrieval from database servers |
US6125360A (en) | | Incremental maintenance of materialized views containing one-to-N lossless joins |
US6243718B1 (en) | | Building indexes on columns containing large objects |
US6134543A (en) | | Incremental maintenance of materialized views containing one-to-one lossless joins |
US7243088B2 (en) | | Database management system with efficient version control |
US5999943A (en) | | Lob locators |
US6738790B1 (en) | | Approach for accessing large objects |
EP1040433B1 (en) | | A fine-grained consistency mechanism for optimistic concurrency control using lock groups |
US4914569A (en) | | Method for concurrent record access, insertion, deletion and alteration using an index tree |
US7716182B2 (en) | | Version-controlled cached data store |
US8032488B2 (en) | | System using virtual replicated tables in a cluster database management system |
US6631366B1 (en) | | Database system providing methodology for optimizing latching/copying costs in index scans on data-only locked tables |
US6647386B2 (en) | | Method, system, and program for reverse index scanning |
CN111316255B (en) | | Data storage system and method for providing a data storage system |
US5956705A (en) | | Reverse-byte indexing |
CN111373389B (en) | | Data storage system and method for providing a data storage system |
US8510269B2 (en) | | Uninterrupted database index reorganization/movement |
EP0100821A2 (en) | | Method and apparatus for managing a database |
AU2002303899A1 (en) | | Integrating tablespaces with different block sizes |
Schijning | | The ADABAS Buffer Pool Manager |
Legal Events
Code | Title | Description |
---|---|---|
AS | Assignment | Owner name: MICROSOFT CORPORATION, WASHINGTON. Free format text: ASSIGNMENT OF ASSIGNORS INTEREST; ASSIGNORS: BERKOWITZ, BRIAN T.; SIMHADRI, SREENIVAS; CHRISTOFFERSON, PETER A.; AND OTHERS; REEL/FRAME: 009581/0399; SIGNING DATES FROM 19981027 TO 19981029 |
STCF | Information on status: patent grant | Free format text: PATENTED CASE |
FEPP | Fee payment procedure | Free format text: PAYOR NUMBER ASSIGNED (ORIGINAL EVENT CODE: ASPN); ENTITY STATUS OF PATENT OWNER: LARGE ENTITY |
FPAY | Fee payment | Year of fee payment: 4 |
FPAY | Fee payment | Year of fee payment: 8 |
FPAY | Fee payment | Year of fee payment: 12 |
AS | Assignment | Owner name: MICROSOFT TECHNOLOGY LICENSING, LLC, WASHINGTON. Free format text: ASSIGNMENT OF ASSIGNORS INTEREST; ASSIGNOR: MICROSOFT CORPORATION; REEL/FRAME: 034541/0001. Effective date: 20141014 |