Stanford University Network Identity System: Scope and Requirements

RL "Bob" Morgan, DCCS
DRAFT, 96/03/01

---

(1.0) Introduction
  (1.1) Background and motivation
  (1.2) Naming issues
(2.0) Requirements

(1.0) Introduction

This proposal presents the requirements for a "Stanford Network Identification System". This system provides primary identifiers for people and other entities for use in campus-wide distributed computing systems. These identifiers are called "SUNet IDs". The use of SUNet IDs in authentication (e.g., Kerberos) and authorization functions provides a basis for secure distributed computing. The use of SUNet IDs across many applications (including UNIX login, e-mail, WWW, file-system (AFS), dialin, DCE, and others) provides a common reference for users of these applications. The SUNet ID system must be tightly linked to other campus name and ID systems, including Kerberos, the directory, Reference Data, University ID, etc.

(1.1) Background and motivation

As the use of distributed computing services at Stanford has grown in popularity and importance, both the range of applications and the diversity of users have increased. The list of supported applications includes:

    Kerberos authentication
    UNIX login
    E-mail
    Mailing lists
    AFS file service
    WWW home pages
    On-line directory
    Usenet news

Almost every member of the Stanford community now takes advantage of at least some of these services. This includes users across the spectrum from sophisticated to naive.

This evolution leads to two contrasting requirements for Stanford's distributed computing services:

  (a) integration: users, especially the less sophisticated, expect the various services to fit together seamlessly, without their being exposed to the mechanics of how the services are provided; and

  (b) independence: the use of a service should be separable, as much as possible, from the use of other services, so that users aren't obliged to learn about services they don't care about (e.g., having to learn UNIX just to get e-mail), and so that resource allocations can be made in a fine-grained way.

Unfortunately these requirements are likely to be in conflict. One facet of making them more compatible is to provide users with a consistent view of their participation in the various services. This consists of a standard "namespace" to identify users and other entities such as groups and departments. A single name (or small set of names) that identifies a user across multiple services and applications acts as a "key" to bind the services together from the user's point of view.

Another trend is the increasing diversity of the entities participating in distributed applications. It is common to want to find the e-mail address of a department, the WWW home page of a project, the newsgroup of an academic class. Consistent naming of these more abstract entities is particularly important.

(1.1.1) Loosely-affiliated users

(1.1.2) Multiple security domains

The availability of strong security mechanisms to protect distributed computing communications is becoming increasingly important. Stanford uses one primary distributed computing security system today, Kerberos version 4, but it is clear that others are on the way. DCE, which uses Kerberos version 5, is being implemented now, and public-key-based systems are under active consideration. It is likely that Stanford will have to support several different systems simultaneously.

A basic feature of any security system is the naming of the security objects that interact using the system. Preferred naming schemes vary among the systems. An important function of a comprehensive SUNet ID system is to provide linkages among these different secure name domains. It is in the community's interest to promote consistency among the domains so that it is easy to relate, for example, a person's Kerberos 4 principal, their Kerberos 5 principal, and their public-key certificate. The use of names in security domains places important constraints on such issues as the form of those names, re-use, names per domain, procedures for establishment, etc.

(1.1.3) UnivID, ID card, security domain linkage.

(1.1.4) Ease-of-use and security of ID/acct establishment

(1.2) Naming issues

There are a number of general issues that arise in the design of a naming system. This section describes and discusses these issues in a relatively formal way as background to the requirements presented in section 2, many of which involve these issues.

First, we define a "naming system" as a set of mappings between character strings ("names") and named entities ("subjects"). Given an input string, the system can determine the corresponding set of subjects (possibly the empty set). The set of all possible strings is the system "name space". The set of subjects is the "domain". Naming systems of interest for this discussion also provide an inverse mapping; that is, given a subject the set of names associated with that subject can be determined. In this discussion we consider only systems where both names and subjects are finite sets. In addition, the naming systems of interest are dynamic, meaning that mappings can change over time.

(I1) Cardinality of name to subject mapping

The issue is whether a name can map to zero, one, or many subjects.

In a system in which each name maps to exactly one subject, a name may be thought of as "identifying" a subject; such a name is called an "identifier". A serial number is a commonly-used form of identifier. In a system in which a name can map to many subjects, a name may be considered to be an "attribute" but not an identifier. The distinction between identifier and attribute is important because many applications implicitly assume that a name is used as an identifier. For example, a file name maps to a single file; an account name maps to a single account.

If a name is in the system but maps to no subject, it may be thought of as a "reserved" name, or as mapping to a special null subject.

(I2) Cardinality of subject to name mapping

The issue is whether a subject can map to zero, one, or many names. Note that for a given system this issue is independent of issue (I1).

(I3) Structured / flat name space

In a structured name space a name is made up of more than one part, each part possibly having its own rules for construction. In addition there may be rules for composing the parts to form a composite name; in particular special separator characters may be used between parts when the composite name is written as a character string.

A flat name space has no required structure. This does not preclude conventions for interpreting some of its names as structures, however.

(I4) Partitioned naming system

A naming system is partitioned if there are rules such that a particular subset of names may only map to a particular subset of subjects.

(I5) Natural / artificial names

A name space is natural if a subject's name is based on some other attribute of the subject. It is artificial otherwise.

(I6) Authentic / arbitrary names

A name space is authentic if a subject's name is required to be related to some other attribute of the subject. Otherwise it is arbitrary.

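The definitions introduced so far (naming system, inverse mapping, identifier, attribute, reserved name) can be illustrated with a minimal Python sketch. It is not part of the proposal; the class name, method names, and subject labels such as "person:1" are hypothetical and chosen only for illustration.

```python
class NamingSystem:
    """A dynamic set of (name, subject) mappings with an inverse lookup."""

    def __init__(self):
        self._subjects_by_name = {}   # name -> set of subjects (I1)
        self._names_by_subject = {}   # subject -> set of names (I2)

    def bind(self, name, subject):
        """Add one mapping; both directions are kept in step."""
        self._subjects_by_name.setdefault(name, set()).add(subject)
        self._names_by_subject.setdefault(subject, set()).add(name)

    def reserve(self, name):
        """Put a name in the system without mapping it to any subject."""
        self._subjects_by_name.setdefault(name, set())

    def subjects(self, name):
        """Forward mapping: the (possibly empty) set of subjects for a name."""
        return set(self._subjects_by_name.get(name, ()))

    def names(self, subject):
        """Inverse mapping: the set of names associated with a subject."""
        return set(self._names_by_subject.get(subject, ()))

    def is_identifier(self, name):
        """True if the name currently maps to exactly one subject."""
        return len(self.subjects(name)) == 1

    def is_reserved(self, name):
        """True if the name is in the system but maps to no subject."""
        return name in self._subjects_by_name and not self._subjects_by_name[name]


ns = NamingSystem()
ns.bind("morgan", "person:1")      # "morgan" acts as an identifier
ns.bind("bob.morgan", "person:1")  # a second name for the same subject
ns.bind("staff", "person:1")       # "staff" maps to many subjects,
ns.bind("staff", "person:2")       # so it is an attribute, not an identifier
ns.reserve("root")                 # reserved: present, but no subject
```

In this sketch the cardinality choices of (I1) and (I2) are left open; a system that requires identifiers would reject a second bind of the same name to a different subject.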
(I7) Permanent / changeable names

A naming system has permanent names if there is a rule such that when a name maps to a particular subject at one time, it continues to map to that subject at all later times.

(I8) Use-once / re-usable names

A naming system has re-usable names if it is possible for a name that maps to one subject to later map to another subject. Note that this issue is only significant in a system in which a name can map to only one subject, and names are not permanent.

(I9) Automatic / manual entry

In a manual-entry system, members of the set of mappings are established by explicit human interaction with the system. This implies direct human control of names. In an automatic system, members of the set of mappings are established based on external events. This implies that names must be generated (or chosen) by an algorithm.

(1.3) General service model and scope

The core function of the SUNet ID system is the maintenance of mappings between sets of strings, called "identifiers", and real-world "things", called "subjects". These strings are "identifiers" in the sense described above, meaning that a string maps to at most one subject. A subject may have several identifiers, just as in the real world it may have several different names. An "entry" in the SUNet ID system represents these relationships, on the one hand to the singular subject it represents, and on the other to the set of strings that identify it.

Alternative identifier forms are available to support the needs of different applications. One key identifier form, the account identifier, is based on current practice identifying users in the Leland AFS/Kerberos database.

It is expected that every computer-using member of the Stanford community will have a SUNet ID. The "community" for this purpose is a very extended one, including anyone who uses any Stanford computer system in an authenticated way.

Other subjects, such as departments, other campus organizations, academic classes, mailing lists, and host computers also are identified by SUNet IDs.

The SUNet ID system only provides identifiers. Possession of a SUNet ID by itself does not authorize its subject to use any Stanford computing facility or service. Access to facilities and services is controlled by their owners. Basing access mechanisms on SUNet IDs allows for services to take advantage of the campus authentication infrastructure.

The SUNet ID system provides not only for the storage of IDs but also for their entry and maintenance. This requires both a user interface and a lower-level protocol interface. It also must have well-defined interactions with related software systems, including Reference Data, the Directory, application systems that use SUNet IDs, etc.

(2.0) Requirements

(R1) Support identification of all applicable subjects, campus-wide

SUNet IDs enable the identification and authentication of subjects in the information space. They should be available for use in identifying subjects across many applications without obstacles. SUNet IDs also should be available to identify people and other subjects across all campus organizations.

(R2) Flat space

It would be possible to design a hierarchical ID space (e.g., "staff/morgan", "student/morgan"); this would obviously be more extensible than a flat space. However, some of the intended applications of these IDs, such as UNIX account IDs and "user@stanford.edu" e-mail addressing, are inherently flat, so a flat space is required. This doesn't preclude the development of hierarchical spaces for other campus-wide applications, however (e.g., the DCE CDS name space).

(R3) Cross-application focus

As a global space, SUNet IDs are intended to identify subjects that have meaning across many applications. Application-specific identifiers, while not fundamentally prohibited, are not encouraged, as they will tend to fill up the ID space.
There may need to be guidelines for what sorts of subjects are appropriate to be identified by SUNet IDs.

(R4) Uniqueness of identifiers

Each identifier in the SUNet ID system must be unique; that is, an ID must be associated with exactly one real-world subject. This is necessary for it to function as a primary identifier across many systems.

There are important applications for spaces where names are not unique, e.g., the space of real personal names. The SUNet ID system does not deal directly with these names but must interact with systems that do, principally the directory.

(R5) One SUNet ID system entity per real-world subject

The SUNet ID system internally models real-world subjects as "entities". As a primary identification system it is very desirable for there to be just one SUNet ID entity per real-world subject (that entity might be associated with several identifiers, however). This will promote consistency for users and improve security management.

However, there may be some instances where a subject requires more than one entity; this capability should be supported. In these cases identifying one entity as "the real one" and others as "aliases" is appropriate; alternatively, the different entities may be different "roles" for the subject. In any case the linkage of such entities via their common subject should be explicit.

(R6) "Natural" identifiers

As the SUNet ID system provides the primary identifiers for a subject in the computing environment, these identifiers are intimately associated with the subject which they identify. If an ID is artificially constructed or hard to remember, it will be unpleasant for its owner and inconvenient for other users. Instinctive, friendly, human-oriented IDs provide a much improved user experience, at the cost of some conflict over desirable ID strings.

This requirement implies that a person should be able to promote their preferred name as their ID instead of (or in addition to) their formal name; e.g. "Bill.Jones" vs. "William.A.Jones". This requirement tends to conflict with, and should override, the authenticity requirement in (R7).

The account ID space is more constrained (see (R10) and (R11)); thus account IDs are less likely to be able to be "familiar" for all subjects. Other, less-constrained forms need to be provided for this purpose.

In the case where a subject's real name changes (e.g., a person after marriage) this requirement suggests that its IDs should change (as desired). This conflicts with (and overrides) the permanence requirement in (R8).

(R7) Authenticity of identifiers

It is in the interest of the institution for identifiers to be as similar as possible to the real names of the subjects they identify. This will promote ease of use and reduce the incidence of misrepresentation.

As noted in (R6), authenticity and familiarity may sometimes be in conflict. It may be desirable to require some proof of the common use of a nickname before allowing its use as an ID. This may conflict with the requirement for user autonomy in (R14), however.

As noted, the account ID space is constrained in length, so many account IDs will very likely not be similar to the real names of their subjects. In ID spaces which are less constrained, it is desirable to require some congruence between the ID and the real name of the subject.

(R8) Permanence of identifiers

In current practice the account ID (aka Kerberos principal) is used as the primary identifier of a subject across many services. As such it is extremely inconvenient (at least), for both users and system administrators, to replace one account ID with another once it is in use. Account IDs should be permanent for at least the duration of a subject's association with Stanford. Name changes have to be accommodated, however, as described in (R6).

To the extent that other ID forms are used in other widely-dispersed, hard-to-change contexts, they should also be permanent.

(R9) Re-use of identifiers

An important role of the SUNet ID system is to provide identifiers for use in security mechanisms (currently the Kerberos system, primarily). Re-use of an ID, that is, use of an ID by one subject after it has been used by another, is a violation of security if that ID is still in use in its old association in some system. In addition, while an ID is conceptually only active while the subject is associated with Stanford, determining when "association" has ended may be difficult in many cases. For these reasons re-use of IDs that are used in security associations should be prohibited. This will include IDs used in public-key certificates in the near future.

This requirement conflicts with the familiarity requirement (R6), as it makes some desirable ID strings unavailable. As much as possible the no re-use requirement should take precedence.

(R10) Support UNIX account names

Traditionally UNIX account names have been no more than 8 characters. Modern UNIX-like operating systems seem to be relaxing this constraint, but for general-purpose use this limit must be respected. The SUNet ID system should support an "account ID" that is no more than 8 characters in the general case.

Also by convention, UNIX account names use a limited character set (lower-case letters, digits, dash, period); this restriction should also be respected.

(R11) Support Kerberos version 4 security domain

Elements in the Kerberos version 4 security domain are "security principals" (or just "principals"). It is a requirement that there be at most one principal per real-world subject, and therefore at most one principal per SUNet ID system entity.

In Kerberos 4, principal names, although technically unconstrained, are by convention of the form "entity.instance". People usually use principals with a null instance, though non-null instances (e.g., ".root") are used by convention in some applications.
SUNet ID system account IDs, and any other IDs used as Kerberos principals, must be compatible with this usage. The system should make explicit the linkage between any multiple instances.

(R12) Available to loosely-affiliated persons

Any "regular" member of the Stanford community (e.g., anyone in the HR or Registrar active databases) is eligible for a SUNet ID system entry; it is expected that almost everyone will have one. Entries must also be available to people with a wide variety of other relationships to Stanford: visiting scholars, contractors, special students, guests, etc. This will make the authentication mechanisms based on SUNet IDs available to these users, independent of their eligibility for or use of any particular SUNet ID-based service.

(R13) Appropriate use, responsibility, and sponsorship

Use of SUNet IDs for authentication is the basis of distributed application security. It is necessary to specify policies about appropriate use of IDs, and responsibilities for ensuring appropriate use.

One suggested policy is that for each SUNet ID system entry, there should be a "regular" Stanford person (i.e., current, full-time faculty, staff, or student) who is ultimately responsible for its appropriate use. This will ground appropriate use of SUNet IDs in the more traditional standards that apply to regular members of the Stanford community. The responsible person is a "sponsor" of the SUNet ID for the other subject (group, virtual entity, or loosely-affiliated person).

Policies must be specified concerning who can be a sponsor, and what sort of entities (and with what characteristics) can be sponsored. It also is desirable to represent that a group (such as an academic department) is a sponsor.

(R14) User autonomy

Procedures and policies for creation and management of SUNet IDs, in particular sponsorship policies, should be designed to maximize the ability of users to get desired results from the system without involvement of support staff.

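As a rough illustration of the sponsorship policy suggested in (R13), the following Python sketch finds the "regular" person responsible for an entry. It assumes a single level of sponsorship with a person as sponsor; the proposal leaves sponsorship chains and group sponsors as open policy questions, and the names Entry, regular, sponsor, and responsible_person are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Entry:
    entry_id: str
    # Assumption: True for current, full-time faculty, staff, or students.
    regular: bool = False
    # Assumption: at most one sponsor, itself an Entry for a person.
    sponsor: Optional["Entry"] = None


def responsible_person(entry):
    """Return the regular person responsible for this entry, or None."""
    if entry.regular:
        return entry                      # regular members answer for themselves
    if entry.sponsor is not None and entry.sponsor.regular:
        return entry.sponsor              # sponsored by a regular member
    return None                           # no responsible person under this policy


staff = Entry("morgan", regular=True)
guest = Entry("guest1", sponsor=staff)    # a loosely-affiliated person
orphan = Entry("visitor")                 # no sponsor recorded
```

An entry for which responsible_person returns None is one that the suggested policy would not permit.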
(R15) Multiple security domains

In addition to the support of Kerberos 4 requirements (R11), the system should be extensible to support other security domains, such as DCE/Kerberos version 5, and X.509 version 3 certificates.

(R16) Consistent with other Stanford ID systems

The SUNet ID system is one of several systems for identifying and distributing information about Stanford objects. Other important systems include:

    University ID
    Stanford Card (aka ID Card)
    Reference Data
    Directory service
    NetDB

As much as possible these systems should share a common data model regarding their subjects: i.e., it should be easy to find the records associated with a particular real-world subject across all these systems. They should also avoid separate maintenance of overlapping or conflicting data. Data should be able to flow easily among these systems as needed.

(R17) Identifier consistency across differences in case and punctuation

Users cannot be expected to appreciate obscure distinctions between identifiers. This includes case and punctuation differences (e.g., "PatLee", "patlee", "pat.lee", "pat_lee"). Comparison rules to test for uniqueness should be designed to prevent different subjects from using IDs that differ only in these ways.

The system may need to permit exceptions to these rules, however, to support the possible existence of subjects whose real names differ only in these ways.

(R18) Expiration

For security purposes, it is necessary that the system maintain information about whether a SUNet ID system entry is active. This should correspond to the status of the real-world subject at Stanford. An inactive entry normally would not be valid for use in other contexts in the system, e.g., as a sponsor. The entry and its IDs are not deleted from the system, however, since its IDs may need to be protected from re-use, and its subject may return to active status.

Entries and IDs will generally be created by a person's direct action, but will become inactive based on external events: a person's leaving Stanford, a validity end-date having been reached, etc. The system must track these events and set the status of entries and IDs appropriately.

Policies must be established about handling of inactive entries in other contexts, e.g., the status of a sponsored entry if a sponsoring entry becomes inactive. The system should attempt to alert affected parties ahead of time about these situations.

(R19) Support requirements of dependent application systems

The SUNet ID system exists to provide application systems with identifiers.

Protocol interface. Create, modify, read. Must use strong security. Notification of changes. Bulk export. Import? Specific bullets about Kerb4, ref data, Leland, majordomo, AFS?

(R20) Preserve "desirable" names

The SUNet ID system promotes the use of natural names for identifiers, and it discourages re-using identifiers. Over time, new users will find that desirable identifiers are less and less available to them. To minimize this effect, it is desirable to discourage the use of more-attractive identifiers by some subjects. For example, identifiers for loosely-affiliated persons might be required to include a digit; these users might also be permitted to have only an account ID, not any other forms.

(R21) User interface

Authenticated. Web-based. Easy to use. Also command-line/script-oriented.

(R22) User capabilities (authorization)

SUNet ID system internal, not representing authorization for client applications. User X can do Y. User X can create N of entity type Y. Members of group X can do Y.
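
As an illustration of how several of the requirements above might fit together, the following Python sketch combines the account ID syntax of (R10), the comparison rule of (R17), the no re-use rule of (R9), and the active/inactive status of (R18). It is a sketch under stated assumptions, not a design: the comparison rule (fold case, drop every non-alphanumeric character) is one possible reading of (R17), no exceptions mechanism is modeled, and all function, class, and subject names are hypothetical.

```python
import re

# (R10): at most 8 characters from lower-case letters, digits, dash, period.
ACCOUNT_ID_RE = re.compile(r"[a-z0-9.-]{1,8}")


def is_valid_account_id(candidate):
    """Check a string against the account ID syntax described in (R10)."""
    return ACCOUNT_ID_RE.fullmatch(candidate) is not None


def comparison_key(identifier):
    """Assumed (R17) rule: fold case and drop punctuation before comparing."""
    return "".join(ch for ch in identifier.lower() if ch.isalnum())


class IdRegistry:
    """Tracks which subject owns each ID; ownership is never released (R9)."""

    def __init__(self):
        self._owner_by_key = {}           # comparison key -> subject
        self._inactive_subjects = set()   # (R18): inactive, but not deleted

    def assign(self, identifier, subject):
        key = comparison_key(identifier)
        if not key:
            raise ValueError("identifier has no letters or digits")
        owner = self._owner_by_key.setdefault(key, subject)
        if owner != subject:
            raise ValueError(f"{identifier!r} conflicts with an ID already assigned")

    def deactivate(self, subject):
        # The subject's IDs stay in the registry so they cannot be re-used.
        self._inactive_subjects.add(subject)

    def is_active(self, identifier):
        owner = self._owner_by_key.get(comparison_key(identifier))
        return owner is not None and owner not in self._inactive_subjects


registry = IdRegistry()
registry.assign("PatLee", "subject-1")
registry.assign("pat.lee", "subject-1")   # same subject, so no conflict
registry.deactivate("subject-1")
# registry.assign("pat_lee", "subject-2") would now raise ValueError:
# the ID remains protected from re-use even though its owner is inactive.
```

Keeping the owner table append-only is the simplest way to honor (R9) in a sketch; it also makes the tension with (R20) visible, since every assigned ID permanently removes its near-variants from the pool.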