Handle System Scalability
[ Reproduced (in part) from the Technical Manual, Section 1, Overview. ]

Scalability was a critical design criterion for the Handle System. The problem can be divided into storage and performance. That is, is there a limit to the number of identifiers (handles) that can be added? And does performance degrade, or do some functions simply break, as the number of identifiers grows, such that at some point the system becomes unusable? Specific details are given below, but it is important to keep two higher-level issues in mind. First, it is important here, as in many other places, to distinguish between Handle System design and any given implementation. Scalability in design may or may not work out as expected in a given implementation, but if the design is fundamentally scalable, specific implementation problems can be corrected as they are encountered. Second, use of the Handle System through some other service, e.g., an http proxy, may well introduce other scalability issues which the basic Handle System design does not and cannot address.

Storage
The Handle System has been designed at a very basic level as a distributed system; that is, it will run across as many computers as are required to provide the desired functionality. Figure 1 illustrates two possible configurations.

Figure 1 - Example Handle Site Configurations
Identifiers are held in and resolved by handle servers, and handle servers are grouped into one or more handle sites within each handle service. There are no design limits on the total number of handle services that constitute the Handle System, on the number of sites that make up each service, or on the number of servers that make up each site. Replication by site, within a service, does not require that each site contain the same number of servers; that is, while each site will have the same replicated set of identifiers, each site may allocate that set of identifiers across a different number of servers. Thus, a growing number of identifiers within a site can be accommodated by adding servers, either on the same or on additional computers; additional sites can be added to a service at any time; and additional services can be created. Every service must be registered with the Global Handle Registry, but that service can also have as many sites with as many servers as needed. The result is that the number of identifiers that can be accommodated in the current system is limited only by the number of computers available.

Performance
Constant performance across increasing numbers of identifiers is addressed by hashing, replication, and caching.

Hashing, a technique well known to database designers, is used in the Handle System to evenly allocate any number of identifiers across any number of servers within a site, and allows a single computation to determine on which server within a set of servers a given identifier is located, regardless of the number of identifiers or the number of servers. Each server within a site is responsible for a subset of the identifiers managed by that site. Given a specific identifier and knowledge of the service responsible for that identifier, a handle client selects a site within that service and can perform a single computation on the identifier to determine which server within the site contains the identifier. The result of the computation becomes a pointer into a hash table, which is unique to each handle site and which can be thought of as a map of the given site, mapping which identifiers belong to which servers. The computation is independent of the number of servers and identifiers, and it will not take a client any longer to locate and query the correct server for an identifier within a service that contains billions of identifiers and hundreds of servers than for a service that contains only millions of identifiers and only a few servers.

The connection between a given identifier and the responsible handle service is determined by prefix. Prefix records are maintained by the Global Handle Registry as handles, and these handles are hashed across the Global Handle Registry sites in the same way that all other identifiers are hashed across their respective service sites. The only hierarchy in Handle System services is the two-level distinction between a single global and all locals, which means that the worst-case resolution would be that a client with no built-in or cached knowledge would have to consult Global and one local.
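The lookup described above can be illustrated with a minimal sketch. It is not the Handle System's actual algorithm, which is defined in the Handle System RFCs. The use of MD5, the case folding, the modulo step, and every name in it (`server_index`, `locate_server`, `SERVICES`, the prefix and the host names) are assumptions made for illustration only. The dictionary stands in for the prefix-to-service knowledge a client would obtain from the Global Handle Registry.

```python
import hashlib

# Hypothetical prefix-to-service map. Each service is a list of sites, and
# each site is a list of servers. Both sites replicate the same identifiers
# but spread them over a different number of servers.
SERVICES = {
    "10.1000": [
        ["a1.example.org", "a2.example.org", "a3.example.org"],  # site A
        ["b1.example.org"],                                      # site B
    ],
}


def server_index(handle: str, num_servers: int) -> int:
    """Map a handle to a server slot with one hash computation.

    The cost depends only on the length of the handle, not on how many
    identifiers or servers the site holds. MD5, upper-casing, and modulo
    are assumptions of this sketch.
    """
    if num_servers < 1:
        raise ValueError("a site needs at least one server")
    digest = hashlib.md5(handle.upper().encode("utf-8")).digest()
    return int.from_bytes(digest[:4], "big") % num_servers


def locate_server(handle: str, services: dict, site: int = 0) -> str:
    """Two-level lookup: the prefix selects the service, the hash selects the server."""
    prefix, _, _ = handle.partition("/")
    sites = services[prefix]   # stands in for the Global Handle Registry step
    servers = sites[site]      # the client picks one replicated site
    return servers[server_index(handle, len(servers))]
```

Calling `locate_server("10.1000/xyz", SERVICES, site=1)` returns the single server of site B, while the same handle at site 0 lands on one of the three servers of site A. This mirrors the point that each site may partition the same replicated set of identifiers across a different number of servers.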
Another aspect of Handle System scalability is replication. The individual handle services within the Handle System each consist of one or more handle service sites, where each site replicates the complete individual handle service, at least for the purposes of handle resolution. Thus, increased demand on a given handle service can be met with additional sites, and increased demand on a given site can be met with additional servers. This also opens up the option, so far not implemented by any existing clients, of optimizing resolution performance by selecting the "best" server from a group of replicated servers. Handle clients may optimize performance across parallel service sites and, given a choice of multiple sites, will largely ignore sites which are slow or completely unresponsive, whether because of server problems or because of network problems. Any given handle service can thus be made more robust, in terms of both performance and reliability, through the addition of servers and collections of servers.

Caching may also be used to improve performance and reduce the likelihood of bottlenecks in the Handle System, as is the case in many distributed systems. The Handle System data model and protocol design include space for cache time-outs, and handle caching servers have been developed and are in use.

For more information on Handle System scalability, see the Handle System RFCs referenced in the Interface Specification.

Updated 16 May 2006
Send inquiries to hdladmin@cnri.reston.va.us