Stanford University Network Identity System: Design

RL "Bob" Morgan, DCCS
DRAFT, 96/03/01

(1) Introduction
(2) Model and Definitions
(3) Identifiers
(4) Entities
(5) Usage Policies
(6) User interface
(7) Operations
(8) Examples and Scenarios

(1.0) Introduction

This document presents a design for a Stanford University Network Identity System, abbreviated as "SIDS" in this document. The scope and requirements for this system are described in a separate document.

(2.0) Model and Definitions

SIDS is a system for the maintenance of identifiers for entities, and support for the use of these identifiers in other application systems. "SIDS" refers to the entity/ID database, its user and protocol interfaces, policies, etc.

A "real-world subject" (or "subject") is the actual person, group, or other "thing" identified by an identifier.

A "SIDS entity" (or "entity") is a unique element (internally identified by a non-user-visible key) in the SIDS database. An entity is a representation of a subject. Normally there is only one entity per subject.

A "SIDS identifier" (or "ID") is a unique human-readable string that identifies a SIDS entity. It is possible (and common) for more than one ID to be associated with an entity. In common use this is called a "SUNet ID".

A "client application" (or "application") is a software system that uses SIDS identifiers to identify its users and other resources.

(3.0) SIDS identifiers

A SIDS entity is associated with a non-empty set of identifiers. The entity (and therefore the subject it represents) can be identified using any of its IDs. ID formats and policies differ among the different kinds of entities, and among classes of IDs intended for particular applications.

(3.1) General identifier (GID)

A base class of identifier, "general identifier", establishes rules common to all IDs. Other ID classes (e.g., "account ID") refine the general ID class with additional constraints. Further rules may apply to particular entities.

(3.1.1) General ID syntax

A general ID is at least three and no more than two hundred fifty-five characters in length. A GID consists of characters from the set of printable 7-bit ASCII characters.

(3.1.2) General ID uniqueness and normalized form

SIDS IDs are unique; i.e., an ID identifies exactly one SIDS entity. Uniqueness comparisons use a "normalized form" of the ID string. An ID string is mapped to its normalized form by removing all characters that are not either letters or digits, and by translating all its upper-case letters to lower case. Thus, e.g., "Pat.Lee", "_pat_lee_", "Pat Lee", and "PATLEE" all map to the same normalized form, "patlee". It is permissible for multiple strings that map to the same normalized string to exist as separate SIDS IDs, but these IDs must all map to the same entity.

(3.1.3) Reserved IDs

A set of character strings is maintained that may not be used as IDs. This includes special UNIX account names such as "root" and "daemon", as well as other names that might lead to confusion. Also, any strings considered inappropriate for IDs (e.g. obscene words) will not be permitted.

(3.2) Kerberos v4 ID

"Kerberos v4 ID" (K4ID) is a class of General ID. A K4ID may be used as a principal in a Kerberos version 4 database. A SIDS entity may have no more than one K4ID.

A K4ID has two parts. The first part is the "base" part; the second is the "instance" part. Together they form a "principal". The principal is written with the parts separated by a period: "base.instance". If the instance is null, the principal is written as "base." or "base".

The base and instance parts consist of characters from the set [ a-z 0-9 - ], i.e., lower-case letters, digits, and dash. The first and last character of base and instance parts must be a letter or a digit.

(3.3) Account ID

"Account ID" (AID) is a class of Kerberos ID. An AID may be used as an account name on a UNIX system. An AID has no less than three and no more than eight characters.
The dash character is not permitted. It must contain at least one lower-case letter [ a-z ]. An AID always has a null instance.

(3.4) Restricted account ID

"Restricted account ID" (RAID) is a class of account ID. Some entities may be required to use a RAID for their account ID to preserve the general AID space for others. A RAID must be at least 4 characters long. The last character of a RAID must be a digit [ 0-9 ].

(3.5) Kerberos 4 service ID

"Kerberos 4 service ID" (K4SID) is a class of Kerberos ID. A K4SID is used to identify the Kerberos version 4 principal of a Kerberized service. There are two K4SID forms: host-specific and host-independent.

A host-specific K4SID has the format "serviceid.instance" where the serviceid identifies the type of service (e.g., "rcmd", "pop") and instance is the Internet Domain Name Service name (just the left-most component) of the host providing the service.

A host-independent K4SID is a simple base part, with a null instance.

(3.6) Email ID

"Email ID" (EID) is a class of general ID. It uses a character set restricted to those characters that are commonly used in the mailbox (i.e., before the "@") part of an Internet (RFC 822) email address.

An EID consists of characters from the set [ A-Z a-z 0-9 - . ]; i.e., upper- and lower-case letters, digits, dash, and period. It is recommended that the period character be used to separate parts of an ID based on a multi-part name: e.g., "Pat.Lee", "Pat.G.Lee", "Pat.G.Lee.Jr", "Computer.Science.Department".

(3.7) Person ID

"Person ID" (PID) is a class of Email ID. A PID is intended to be as similar as possible to a real-world name of the person it identifies.

If a PID contains a non-alphanumeric character (i.e., dash or period), then its minimum length is the same as the minimum length of a general ID. If it is all alphanumeric, then its minimum length is nine characters. This serves to separate the PID space and the AID space.
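The syntax rules stated so far (normalized form in 3.1.2, account IDs in 3.3 and 3.4, Email IDs in 3.6, and the PID length rule above) can be sketched in a few lines of Python. This is only an illustrative sketch, not part of the SIDS design: all function names are hypothetical, and it assumes that "printable 7-bit ASCII" means the range 0x20 (space) through 0x7E and that "letters or digits" means ASCII letters and digits.

```python
import re
import string

# Assumption: "letters or digits" means the ASCII letters and digits only.
ASCII_ALNUM = set(string.ascii_letters + string.digits)


def normalize(id_string: str) -> str:
    """Normalized form (3.1.2): drop non-alphanumerics, fold to lower case."""
    return "".join(ch.lower() for ch in id_string if ch in ASCII_ALNUM)


def is_general_id(s: str) -> bool:
    """General ID syntax (3.1.1): 3 to 255 printable 7-bit ASCII characters."""
    # Assumption: "printable" covers 0x20 (space) through 0x7E.
    return 3 <= len(s) <= 255 and re.fullmatch(r"[\x20-\x7e]+", s) is not None


def is_account_id(s: str) -> bool:
    """Account ID (3.3): 3 to 8 of [a-z0-9], no dash, at least one letter."""
    return re.fullmatch(r"[a-z0-9]{3,8}", s) is not None and any(
        ch in string.ascii_lowercase for ch in s
    )


def is_restricted_account_id(s: str) -> bool:
    """Restricted account ID (3.4): an AID of 4+ characters ending in a digit."""
    return is_account_id(s) and len(s) >= 4 and s[-1] in string.digits


def is_email_id(s: str) -> bool:
    """Email ID (3.6): a general ID made of letters, digits, dash, period."""
    return is_general_id(s) and re.fullmatch(r"[A-Za-z0-9.-]+", s) is not None


def meets_person_id_length(s: str) -> bool:
    """PID length rule (3.7): all-alphanumeric PIDs need 9+ characters."""
    if not is_email_id(s):
        return False
    if all(ch in ASCII_ALNUM for ch in s):
        return len(s) >= 9
    return True  # the general ID minimum was already checked by is_email_id
```

Under these assumptions, normalize("Pat.Lee") and normalize("PATLEE") both give "patlee", and the string "patlee" passes the account ID check but fails the PID length check, which is the separation of the two ID spaces that the rule above is meant to achieve. Reserved IDs (3.1.3) and the name-matching rule for PIDs are not covered by this sketch.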
The trailing characters of a PID must match the real last name of the person who is the subject of the ID. For this comparison both the person's PID and their real last name are mapped into normalized form before the comparison is done. For example, "Pat.Lee", "p.lee", "xxx-lee", and "plee" would be acceptable PIDs for a person named "Pat Lee".

If a person's real last name is multi-part (e.g., "Pat Lee-Lopez") then a PID is acceptable if its trailing characters match any part of the name (again using the normalized form). For example "Pat.Lee" or "Pat.Lopez" would be acceptable PIDs for the person named "Pat Lee-Lopez". This also allows, e.g., "pat.lee" or "pat.lee.jr" to be acceptable PIDs for the person named "Pat Lee, Jr".

Also allow initial characters to be last name, e.g. "lee.pat"?

A PID may also have trailing digits [ 0-9 ], e.g., "Pat.Lee.3". Any trailing digits are removed before the comparison with the person's real last name is done.

(3.8) Restricted person ID

"Restricted person ID" (RPID) is a class of Person ID. Some person entities may be required to use an RPID to preserve the general PID space for others. The last character of an RPID must be a digit.

(3.9) Department ID

"Department ID" (DID) is a class of Email ID. A DID is intended to be as similar as possible to a real-world name of the department it identifies.

Format?

(3.10) Academic class ID

"Academic class ID" (ACID) is a class of Email ID. An ACID is intended to represent the name of an academic class.

Composed of Dept ID + number?

(3.11) Host ID

"Host ID" (HID) is a class of General ID. A HID is used to identify a network-attached computer. It is equivalent to the name of the host in the Internet Domain Name System.

A HID is a sequence of domains, written separated by periods: "domain1.domain2.domain3". The characters of a domain are from the set [ A-Z a-z 0-9 - ], i.e., upper- and lower-case letters, digits, and dash. The first and last character of a domain must be a letter or a digit.
A HID is the fully-qualified form of the DNS name, e.g. "Leland.Stanford.EDU".

(4.0) SIDS entities

The following kinds of real-world subjects can be identified by SIDS entities:

    real persons
    personal roles (e.g., a root Kerberos principal, "lee.root")
    virtual persons (e.g., "testuser")
    host computers
    Kerberos-authenticated services (e.g. "rcmd.elaine23")
    generic processes (e.g., "backup")
    organizational roles (e.g., "English.Dept.Chair")
    groups, including:
        official campus organizations (e.g., academic departments)
        academic classes
        other campus organizations (e.g., student organizations)
        ad-hoc groups (e.g., a project team)
        mailing lists

This list is expected to grow. The SIDS system accommodates new kinds of entities over time.

SIDS entities are organized into a class structure. A general class defines attributes common to all entities. Other classes add attributes and constraints.

(4.1) General SIDS entity attributes

All entities may have the attributes below. Some entity types may have additional attributes.

(4.1.1) Entity key

Each entity is uniquely identified by a key generated specifically for use by the SIDS database. In general this key is not visible outside of SIDS.

(4.1.2) ID

An entity must have at least one ID, and may have many. IDs are generally limited to those classes that are appropriate for the type of entity. At most one ID can be a Kerberos 4 ID.

(4.1.3) Subject

If the real-world subject of an entity is identified in another authoritative Stanford database (e.g., Reference Data), the subject's key from that foreign database is recorded here, as well as an identifier for the foreign database itself. This will facilitate tracking common subjects across these databases.

This attribute will be null, or be an internal pointer, if the subject of this entity is not tracked in any other Stanford database. For some entities (e.g. person) this implies maintaining data about the subject in SIDS that would otherwise be maintained by the external system.

(4.1.4) Sponsorship record

An entity may have zero or more sponsorship records. A sponsor is another SIDS entity; the sponsor is identified by its SIDS entity key. Each "act of sponsorship" identifies the sponsoring and sponsored entities, the beginning and ending dates for which this sponsorship is valid, and any other information associated with the sponsorship (e.g., restrictions on use).

(4.1.5) Owner

An entity must have at least one owner and may have many, identified by entity key. The owner of an entity has the ability to modify information associated with that entity (e.g., its IDs). In many cases the sponsor of an entity will also be its owner.

(4.1.6) Proxy

An entity may have zero or more proxies, identified by entity key. A proxy can "act as" the entity for some purpose. One use is to allow a person to sponsor a SIDS entity by authority of a group such as a department. Normally the "acting" entity would be a person entity.

(4.1.7) Housekeeping

A set of attributes describes various administrative aspects of the entity.

    Created-by, Last-modified-by: The entities that created/modified the entity.
    Create-date, Last-modify-date: Date and time of creation and modification.
    Active: A flag indicating whether the entity is an active part of the system.

(4.2) Entity classes

(4.2.1) Person entity

"Person entity" is a class of general entity. A SIDS person entity is associated with a real-world person, its subject.

A person entity must have exactly one account ID. A person entity may have several Person IDs, up to a limit set by policy. It is not required to have a Person ID.

One ID, AID or PID, is designated as the "preferred" ID. An application that wants to use a single SUNet ID of an entity for some purpose can use the preferred ID.

A pre-sponsored person (see section 5.1) is the owner of his or her own person entity.

A person is also associated with a set of capabilities that define their interaction with SIDS.
Primarily, these capabilities describe what types of entities the person can create and sponsor, and the attributes of those entities. (More on these later ??)

(4.2.1.1) Sponsored person

There are some additional considerations that apply to a person entity that is explicitly sponsored (i.e., not pre-sponsored).

By default, a sponsored person's account ID must be a restricted account ID, and their person ID, if any, must be a restricted person ID. Unrestricted IDs may be allowed by request of the sponsor or action of the SIDS administration.

Either the subject or the sponsor or both can be the owner of a sponsored person entity.

For most people (including almost all pre-sponsored people), personal information such as real name, affiliation, etc., is maintained in an external authoritative database (e.g., Reference Data). For others, this information is maintained in SIDS. The attributes should be consistent with those maintained in the external databases.

Actually, external database for all people is coming?

(4.2.2) Personal role entity

"Personal role" is a class of general entity. A personal role entity identifies a role played by a particular person, usually for security purposes. This is different from an institutional role that might be associated with any person.

One use for this entity type is to identify personal Kerberos principals with non-null instances (e.g., "patlee.root"). When used for this purpose the base part of the Kerberos ID must be the same as the account ID of the person entity whose role this is.

A personal role entity must be sponsored, normally by the person who is the subject of the account ID it is based on. The owner of the entity is the entity of the person whose role it is.

(4.2.3) Kerberos 4 service

"Kerberos 4 service" is a class of general entity. It represents a software service that uses Kerberos 4 authentication. A Kerberos 4 service entity must be sponsored.
It is either of the host-specific or the host-independent form, and has a Kerberos 4 service ID of the corresponding form. If it is of the host-specific form, the host part of the ID must match the leftmost component of an existing SIDS host entity ID.

(4.2.4) Organizational role

"Organizational role" is a class of general entity. An organizational role entity represents a role in an organization which itself is identified by a SIDS entity; e.g., the chair of an academic department. Generally only roles that are common across many organizations are appropriate for this representation.

An attribute is maintained for this entity that links it to the entity of the organization involved.

(4.2.5) Casual-use entity

"Casual-use" is a class of organizational role entity. A casual-use entity represents an identity that can be used by people in situations where a proper person entity is too cumbersome or otherwise inappropriate. Examples include short-term temporary employees, students in a one-day class, and visitors. It should be noted that these IDs are subject to the same appropriate-use rules as other IDs.

A casual-use entity must have exactly one account ID, and may have one or more person IDs. A casual-use entity's account ID must be a restricted account ID, and its person ID, if any, must be a restricted person ID.

(4.2.6) Host entity

Host entities will not normally be created or modified in SIDS. Instead, they will be imported from their normal maintenance system, NetDB. They will not normally have sponsor or owner attributes.

(4.2.7) Group entity

"Group" is a class of general entity. It is used to identify groups ranging from formal organizations such as academic departments to ad-hoc groups. It can also identify shared-use entities for software processes (e.g., "backup").

A group entity has group-specific attributes, as defined below.

(4.2.7.1) Member

A group entity has a "member" attribute, which identifies zero or more entities that are members of the group.
Any entity, including another group, may be a member of a group.

(4.2.8) Organization

"Organization" is a class of group entity. A set of organizations is identified by Stanford policy as "pre-sponsored"; i.e., an organization in this set is eligible to be represented by a SIDS entity without requiring sponsorship by a person. Normally this would include all official departments and other organizations about which information is maintained in some other authoritative database.

An organization may have up to one account ID. It must have at least one Email ID (or a subclass of Email ID appropriate to the organization) and may have several.

Common organization types may have a standard template of entities to which they are related. For example, an academic department "XYZ" might have:

    a role: xyz.dept.chair
    a role: xyz.dept.sponsorship
    a group: xyz.dept.faculty

(4.2.9) Department

"Department" is a class of organization entity.

(4.2.10) Academic class

"Academic class" is a class of organization entity.

(4.2.11) Mailing list entity

"Mailing list" is a class of group entity. It is used to represent groups that exist only for the purpose of being an addressable mailing list. Note that any other group entity can also function as a mailing list.

A mailing list must be sponsored. It is notable because some entries on a mailing list will be simple e-mail addresses, not SIDS entities. These entries must be maintained outside of the core SIDS system.

A mailing list has at least one Email ID. It does not have any other form of ID.

(5.0) Usage Policies

(5.1) Sponsorship

The creation of a SIDS entity is associated with a sponsorship record. This record identifies the sponsor, which is an existing SIDS entity identified by its entity key, the sponsored entity, and beginning and ending dates for the sponsorship. An entity may have one or more sponsorship records.
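The sponsorship record described above (and in section 4.1.4) can be pictured as a small data structure. The following Python sketch is illustrative only: the class, field, and function names are hypothetical, integer entity keys are an assumption, and treating the ending date as inclusive is also an assumption, since this document does not say how the boundary dates are handled.

```python
from dataclasses import dataclass
from datetime import date
from typing import Iterable


@dataclass(frozen=True)
class SponsorshipRecord:
    """One "act of sponsorship" between two SIDS entities."""

    sponsor_key: int    # entity key of the sponsoring entity (type assumed)
    sponsored_key: int  # entity key of the sponsored entity (type assumed)
    begin: date         # first day on which the sponsorship is valid
    end: date           # last day on which it is valid (inclusive: assumption)
    notes: str = ""     # other information, e.g. restrictions on use

    def is_valid_on(self, day: date) -> bool:
        """True if `day` falls within this record's begin/end dates."""
        return self.begin <= day <= self.end


def has_valid_sponsorship(records: Iterable[SponsorshipRecord], day: date) -> bool:
    """True if at least one of an entity's records is valid on `day`."""
    return any(record.is_valid_on(day) for record in records)
```

An entity with several records, for example one that has expired and one that is current, would be reported as having a valid sponsorship for any day covered by at least one of them.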
Some external authoritative data sources identify real-world subjects for which these sources "pre-sponsor" SIDS entities. Data sources include the Registrar (for qualifying students) and Human Resources (for qualifying faculty and staff). Policies are required to determine which sources may provide pre-sponsorship, and which of their subjects qualify for pre-sponsorship. All other subjects require explicit sponsorship within SIDS before they can be represented by a SIDS entity.

(5.2) Active/Inactive status

A SIDS entity may have one or more sponsorship records. If any of its records indicates active sponsorship (as determined by the record's begin/end dates), the entity is active in SIDS; otherwise it is inactive.

When a SIDS entity changes from active to inactive it is retained in the SIDS database. Its change in status is communicated to systems that make use of SUNet IDs (e.g., the Leland system), so they can take appropriate action.

A sponsored person entity, with one or more IDs, can be created in SIDS by someone who is not eligible to sponsor it, e.g., by the sponsored person. Such an entity is inactive until it is properly sponsored. If it is not sponsored within a period of time (e.g., 2 weeks), it and its IDs are removed from the system.

Grace period? Is that the responsibility of the client app or SIDS? Have to be able to do kerb auth until the last app deletes the user, no?

Need policy for what end dates are allowed. Also whether/how often an end date can be extended.

Entity inactivity implies ID inactivity, but IDs can become inactive on their own, e.g. by a person deciding they don't want it any more.

(5.3) Re-use

When an ID is first allocated to an entity, there is a period of time during which it becomes "established" as being in use to identify that entity. This has to do with being used by client applications, and by users.

Once a SIDS account ID is established as identifying an entity, it is never used to identify any other entity.

Person IDs, once established, remain unavailable for re-use for a period of time (default 2 years) after the entity becomes inactive. After this time they may be re-used, though this should be discouraged.

(5.4) Proper use

The user of a SIDS entity is responsible for its proper use. The owner of a SIDS entity is expected to:

    ensure that its account is only used by authorized people;
    ensure that its account has a reasonably secure password.

(5.5) Conflict resolution

Conflicts (e.g., Professor John A. Doe wishes to have the ID john.doe but student John C. Doe has already established this ID) shall be resolved by political precedence, then by seniority, then by appeal. Changes that violate the non-re-use policy will be permitted only under special circumstances.

Political precedence ranks faculty, then continuing staff, then students and non-continuing staff, and then all other affiliates. Seniority should be self-evident. The decision and appeals path is SIDS operations management, ITSS management, Provost, and then President/Faculty Senate. Note that conflicts within a single department may be deferred to the decision structure within that unit.

(6.0) User interface

Many UIs possible, via protocol interface. Web-based UI preferred. Terminal-compatible if possible.

Must be authenticated. Univid/PIN for creation, kerb for maintenance. Support proper kerb via either regular protocol (Mosaic) or callback (MacLeland agent).

(7.0) Operations

(7.1) Entity / ID creation

Allow entities to be created conditionally.

(7.2) Protocol interface

Must be authenticated. Authenticating client is service supplying front-end, not end-user directly (i.e., three-tier design). See separate document.

(7.3) Linkages to other systems

(7.3.1) Reference data

SIDS data is potentially all reference data. Alternatively, SIDS should eventually be a part of the proposed "person registry", or more accurately "entity registry" database.

(7.3.2) Directory

A SIDS entity normally is linked to a directory entry, which normally will have many more attributes that describe it. SIDS person entity maps to directory person object (?). Other entities map to ?.

A particular multi-valued attribute is used to contain SIDS ID strings in the directory. This will support applications that want to look up entries specifically by SUNet ID, and those that want to find the set of SUNet IDs corresponding to a particular subject. SIDS ID strings will also be part of the set of Common Names for a subject's directory entry. A separate attribute may be required to contain the subject's account ID.

Changes in SIDS are reflected in the directory as soon as possible. Overnight batch updates are acceptable but not optimal.

It may be necessary to support privacy flags to control the visibility of SUNet IDs in the directory.

(7.4) Kerberos

(7.5) Leland

(7.6) DCE

(8.0) Examples and Scenarios

Person:

    real name                 account ID    person IDs
    Doe, John A               johndoe       John.Doe, J-Man.Doe, JMan.Doe
    McGillicuddy, Robert M    rmm           Robert.McGillicuddy, Bob.McGillicuddy

Group:

    Computer Science Dept     cs            Computer.Science.Department, Computer.Science, Comp.Sci
    Comp Sci 356              cs356         Computer.Science.356, CS.356
    www-people list           www-people
    Backup Service            backup

Service:

    Elaine23 Services         rcmd.elaine23
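As a worked check of the person examples above, the last-name rule of section 3.7 can be sketched in Python. This is a hedged illustration, not the SIDS implementation: the function names are hypothetical, and splitting a multi-part last name on spaces, commas, and dashes is an assumption made to cover the "Lee-Lopez" and "Lee, Jr" cases mentioned in that section.

```python
import re
import string

# Assumption: "letters or digits" means the ASCII letters and digits only.
ASCII_ALNUM = set(string.ascii_letters + string.digits)


def normalize(text: str) -> str:
    """Normalized form (3.1.2): drop non-alphanumerics, fold to lower case."""
    return "".join(ch.lower() for ch in text if ch in ASCII_ALNUM)


def pid_matches_last_name(pid: str, last_name: str) -> bool:
    """True if the PID's trailing characters match some part of the last name."""
    # Trailing digits are removed before the comparison (3.7).
    stem = normalize(pid).rstrip(string.digits)
    # Assumption: name parts are separated by spaces, commas, or dashes.
    parts = [normalize(part) for part in re.split(r"[\s,-]+", last_name)]
    return any(part and stem.endswith(part) for part in parts)


# Person IDs from the examples above, keyed by last name.
EXAMPLES = {
    "Doe": ["John.Doe", "J-Man.Doe", "JMan.Doe"],
    "McGillicuddy": ["Robert.McGillicuddy", "Bob.McGillicuddy"],
}

for last_name, pids in EXAMPLES.items():
    for pid in pids:
        print(pid, pid_matches_last_name(pid, last_name))
```

Every person ID in the examples satisfies the check under these assumptions, as do "Pat.Lee.3" for "Lee" and "Pat.Lopez" for "Lee-Lopez". The open question in section 3.7 about allowing the last name at the start of a PID (e.g., "lee.pat") is not addressed by this sketch.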