Stanford University Network Identity System:  Scope and Requirements
RL "Bob" Morgan, DCCS
DRAFT, 96/03/01
---

(1.0)  Introduction
(1.1)  Background and motivation
(1.2)  Naming issues
(1.3)  General service model and scope
(2.0)  Requirements


(1.0)  Introduction

This proposal presents the requirements for a "Stanford Network 
Identification System".  This system provides primary identifiers for 
people and other entities for use in campus-wide distributed computing 
systems.  These identifiers are called "SUNet IDs".

The use of SUNet IDs in authentication (e.g., Kerberos) and 
authorization functions provides a basis for secure distributed 
computing.  The use of SUNet IDs across many applications (including 
UNIX login, e-mail, WWW, file-system (AFS), dialin, DCE, and others) 
provides a common reference for users of these applications.

The SUNet ID system must be tightly linked to other campus name and ID 
systems, including Kerberos, the directory, Reference Data, University 
ID, etc.

(1.1)  Background and motivation

As the use of distributed computing services at Stanford has grown in 
popularity and importance, both the range of applications and the 
diversity of users have increased.  The list of supported applications 
includes:

	Kerberos authentication
	UNIX login
	E-mail
	Mailing lists
	AFS file service
	WWW home pages
	On-line directory
	Usenet news

Almost every member of the Stanford community now takes advantage of at 
least some of these services.  This includes users across the spectrum 
from sophisticated to naive.

This evolution leads to two contrasting requirements for Stanford's 
distributed computing services:

(a) integration: users, especially the less sophisticated, expect the 
various services to fit together seamlessly, without their being exposed 
to the mechanics of how the services are provided; and
	
(b) independence: the use of a service should be separable, as much as 
possible, from the use of other services, so that users aren't obliged 
to learn about services they don't care about (e.g., having to learn 
UNIX just to get e-mail), and so that resource allocations can be made 
in a fine-grained way.

Unfortunately these requirements are likely to be in conflict.  One 
facet of making them more compatible is to provide users with a 
consistent view of their participation in the various services.  This 
consists of a standard "namespace" to identify users and other entities 
such as groups and departments.  A single name (or small set of names) 
that identifies a user across multiple services and applications acts as 
a "key" to bind the services together from the user's point of view.

Another trend is the increasing diversity of the entities participating 
in distributed applications.  It is common to want to find the e-mail 
address of a department, the WWW home page of a project, or the 
newsgroup of an academic class.  Consistent naming of these more 
abstract entities is particularly important.

(1.1.1)  Loosely-affiliated users

(1.1.2)  Multiple security domains

The availability of strong security mechanisms to protect distributed 
computing communications is becoming increasingly important.  Stanford 
uses one primary distributed computing security system today, 
Kerberos version 4, but it is clear that others are on the way.  DCE, 
which uses Kerberos version 5, is being implemented now, and 
public-key-based systems are under active consideration.  It is likely 
that Stanford will have to support several different systems 
simultaneously.

A basic feature of any security system is the naming of the security 
objects that interact using the system.  Preferred naming schemes vary 
among the systems.  An important function of a comprehensive SUNet ID 
system is to provide linkages among these different secure name 
domains.  It is in the community's interest to promote consistency 
among the domains so that it is easy to relate, for example, a person's 
Kerberos 4 principal, their Kerberos 5 principal, and their public-key 
certificate.

The use of names in security domains places important constraints on 
such issues as the form of those names, re-use, names per domain, 
procedures for establishment, etc.

(1.1.3)  UnivID, ID card, security domain linkage.

(1.1.4)  Ease-of-use and security of ID/acct establishment


(1.2)  Naming issues

There are a number of general issues that arise in the design of a 
naming system.  This section describes and discusses these issues in a 
relatively formal way as background to the requirements presented in 
section 2, many of which involve these issues.

First, we define a "naming system" as a set of mappings between 
character strings ("names") and named entities ("subjects").  Given an 
input string, the system can determine the corresponding set of 
subjects (possibly the empty set).  The set of all possible strings is 
the system's "name space".  The set of subjects is the "domain".  Naming 
systems of interest for this discussion also provide an inverse 
mapping; that is, given a subject the set of names associated with that 
subject can be determined.

In this discussion we consider only systems where both names and 
subjects are finite sets.  In addition, the naming systems of interest 
are dynamic, meaning that mappings can change over time.
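
The definitions above can be illustrated with a minimal sketch.  It is 
not a proposed design; the class and method names (NamingSystem, bind, 
unbind, subjects, names) are assumptions made for this example only.  
It shows a dynamic set of mappings with both the forward (name to 
subjects) and inverse (subject to names) lookups.

```python
from collections import defaultdict


class NamingSystem:
    """Toy naming system: a dynamic set of (name, subject) mappings."""

    def __init__(self):
        self._subjects_by_name = defaultdict(set)   # name -> subjects
        self._names_by_subject = defaultdict(set)   # subject -> names

    def bind(self, name, subject):
        """Add a mapping between a name and a subject."""
        self._subjects_by_name[name].add(subject)
        self._names_by_subject[subject].add(name)

    def unbind(self, name, subject):
        """Remove a mapping; mappings can change over time."""
        self._subjects_by_name[name].discard(subject)
        self._names_by_subject[subject].discard(name)

    def subjects(self, name):
        """Forward mapping: the (possibly empty) set of subjects."""
        return set(self._subjects_by_name.get(name, ()))

    def names(self, subject):
        """Inverse mapping: the set of names for a subject."""
        return set(self._names_by_subject.get(subject, ()))


# Hypothetical sample data, for illustration only.
ns = NamingSystem()
ns.bind("morgan", "person-1")
ns.bind("bob.morgan", "person-1")
```

This sketch places no limit on how many subjects a name may map to, or 
how many names a subject may have; the issues below describe the rules 
a particular system may impose on top of such a structure.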

(I1)  Cardinality of name to subject mapping

The issue is whether a name can map to zero, one, or many subjects.

In a system in which each name maps to exactly one subject, a name may 
be thought of as "identifying" a subject; such a name is called an 
"identifier".  A serial number is a commonly-used form of identifier.  

In a system in which a name can map to many subjects, a name may be 
considered to be an "attribute" but not an identifier.

The distinction between identifier and attribute is important because 
many applications implicitly assume that a name is used as an 
identifier.  For example, a file name maps to a single file; an account 
name maps to a single account.

If a name is in the system but maps to no subject, it may be thought of 
as a "reserved" name, or as mapping to a special null subject.

(I2)  Cardinality of subject to name mapping

The issue is whether a subject can map to zero, one, or many names.  
Note that for a given system this issue is independent of issue (I1).

(I3)  Structured / flat name space

In a structured name space a name is made up of more than one part, each 
part possibly having its own rules for construction.  In addition there 
may be rules for composing the parts to form a composite name; in 
particular special separator characters may be used between parts when 
the composite name is written as a character string.  

A flat name space has no required structure.  This does not preclude 
conventions for interpreting some of its names as structures, however.

(I4)  Partitioned naming system

A naming system is partitioned if there are rules such that a particular 
subset of names may only map to a particular subset of subjects.

(I5)  Natural / artificial names

A name space is natural if a subject's name is based on some other 
attribute of the subject.  It is artificial otherwise.

(I6)  Authentic / arbitrary names

A name space is authentic if a subject's name is required to be related 
to some other attribute of the subject.  Otherwise it is arbitrary.

(I7)  Permanent / changeable names

A naming system has permanent names if there is a rule such that when a 
name maps to a particular subject at one time, it continues to map to 
that subject at all later times.

(I8)  Use-once / re-usable names

A naming system has re-usable names if it is possible for a name that 
maps to one subject to later map to another subject.  Note that this 
issue is only significant in a system in which a name can map to only 
one subject, and names are not permanent.

(I9)  Automatic / manual entry

In a manual-entry system, members of the set of mappings are established 
by explicit human interaction with the system.  This implies direct 
human control of names.

In an automatic system, members of the set of mappings are established 
based on external events.  This implies that names must be generated (or 
chosen) by an algorithm.

(1.3)  General service model and scope

The core function of the SUNet ID system is the maintenance of mappings 
between sets of strings, called "identifiers", and real-world "things", 
called "subjects".  These strings are "identifiers" in the sense 
described above, meaning that a string maps to at most one subject.  A 
subject may have several identifiers, just as in the real world it may 
have several different names.  An "entry" in the SUNet ID system 
represents these relationships, on the one hand to the singular subject 
it represents, and on the other to the set of strings that identify it.

Alternative identifier forms are available to support the needs of 
different applications.  One key identifier form, the account 
identifier, is based on current practice identifying users in the Leland 
AFS/Kerberos database.

It is expected that every computer-using member of the Stanford 
community will have a SUNet ID.  The "community" for this purpose is a 
very extended one, including anyone who uses any Stanford computer 
system in an authenticated way.  Other subjects, such as departments, 
other campus organizations, academic classes, mailing lists, and host 
computers also are identified by SUNet IDs.

The SUNet ID system only provides identifiers.  Possession of a SUNet ID 
by itself does not authorize its subject to use any Stanford computing 
facility or service.  Access to facilities and services is controlled by 
their owners.  Basing access mechanisms on SUNet IDs allows for services 
to take advantage of the campus authentication infrastructure.

The SUNet ID system provides not only for the storage of IDs but also 
for their entry and maintenance.  This requires both a user interface 
and a lower-level protocol interface.  It also must have well-defined 
interactions with related software systems, including Reference Data, 
the Directory, application systems that use SUNet IDs, etc.


(2.0)  Requirements

(R1)  Support identification of all applicable subjects, campus-wide

SUNet IDs enable the identification and authentication of subjects in 
the information space.  They should be available for use in identifying 
subjects across many applications without obstacles.  SUNet IDs also 
should be available to identify people and other subjects across all 
campus organizations.

(R2)  Flat space

It would be possible to design a hierarchical ID space (e.g., 
"staff/morgan", "student/morgan"); this would obviously be more extensible 
than a flat space.  However, some of the intended applications of these 
IDs, such as UNIX account IDs and "user@stanford.edu" e-mail 
addressing, are inherently flat, so a flat space is required.  This 
doesn't preclude the development of hierarchical spaces for other 
campus-wide applications, however (e.g., the DCE CDS name space).

(R3)  Cross-application focus

As a global space, SUNet IDs are intended to identify subjects that have 
meaning across many applications.  Application-specific identifiers, while 
not fundamentally prohibited, are not encouraged, as they will tend to fill 
up the ID space.  There may need to be guidelines for what sorts of 
subjects are appropriate to be identified by SUNet IDs.

(R4)  Uniqueness of identifiers

Each identifier in the SUNet ID system must be unique; that is, an ID 
must be associated with exactly one real-world subject.  This is 
necessary for it to function as a primary identifier across many 
systems.

There are important applications for spaces where names are not unique, 
e.g., the space of real personal names.  The SUNet ID system does not 
deal directly with these names but must interact with systems that do, 
principally the directory.

(R5)  One SUNet ID system entity per real-world subject

The SUNet ID system internally models real-world subjects as "entities".  
As a primary identification system it is very desirable for there to be 
just one SUNet ID entity per real-world subject (that entity might be 
associated with several identifiers, however).  This will promote 
consistency for users and improve security management.  

However, there may be some instances where a subject requires more than 
one entity; this capability should be supported.  In these cases 
identifying one entity as "the real one" and others as "aliases" is 
appropriate; alternatively, the different entities may be different 
"roles" for the subject.  In any case the linkage of such entities via 
their common subject should be explicit.

(R6)  "Natural" identifiers

As the SUNet ID system provides the primary identifiers for a subject in 
the computing environment, these identifiers are intimately associated 
with the subject which they identify.  If an ID is artificially 
constructed or hard to remember, it will be unpleasant for its owner and 
inconvenient for other users.  Instinctive, friendly, human-oriented IDs 
provide a much improved user experience, at the cost of some conflict 
over desirable ID strings.

This requirement implies that a person should be able to promote their 
preferred name as their ID instead of (or in addition to) their formal 
name; e.g.  "Bill.Jones" vs.  "William.A.Jones".  This requirement tends 
to conflict with, and should override, the authenticity requirement in 
(R7).

The account ID space is more constrained (see (R10) and (R11)); thus 
account IDs are less likely to be able to be "familiar" for all 
subjects.  Other, less-constrained forms need to be provided for this 
purpose.

In the case where a subject's real name changes (e.g., a person after 
marriage) this requirement suggests that its IDs should change (as 
desired).  This conflicts with (and overrides) the permanence 
requirement in (R8).

(R7)  Authenticity of identifiers

It is in the interest of the institution for identifiers to be as 
similar as possible to the real names of the subjects they identify.  
This will promote ease of use and reduce the incidence of 
misrepresentation.

As noted in (R6), authenticity and familiarity may sometimes be in 
conflict.  It may be desirable to require some proof of the common use 
of a nickname before allowing its use as an ID.  This may conflict with 
the requirement for user autonomy in (R14), however.

As noted, the account ID space is constrained in length, so many account 
IDs will very likely not be similar to the real names of their subjects.  
In ID spaces which are less constrained, it is desirable to require some 
congruence between the ID and the real name of the subject.

(R8)  Permanence of identifiers

In current practice the account ID (aka Kerberos principal) is used as 
the primary identifier of a subject across many services.  As such it is 
extremely inconvenient (at least), for both users and system 
administrators, to replace one account ID with another once it is 
in use.  Account IDs should be permanent for at least the 
duration of a subject's association with Stanford.  Name changes have to 
be accommodated, however, as described in (R6).

To the extent that other ID forms are used in other widely-dispersed, 
hard-to-change contexts, they should also be permanent.

(R9)  Re-use of identifiers

An important role of the SUNet ID system is to provide identifiers for 
use in security mechanisms (currently the Kerberos system, primarily).  
Re-use of an ID, that is, use of an ID by one subject after it has been 
used by another, is a violation of security if that ID is still in use 
in its old association in some system.  In addition, while an ID is 
conceptually only active while the subject is associated with Stanford, 
determining when "association" has ended may be difficult in many cases.  
For these reasons re-use of IDs that are used in security associations 
should be prohibited.  This will include IDs used in public-key 
certificates in the near future.

This requirement conflicts with the familiarity requirement (R6), as it 
makes some desirable ID strings unavailable.  As much as possible the no 
re-use requirement should take precedence.

(R10)  Support UNIX account names

Traditionally UNIX account names have been no more than 8 characters.  
Modern UNIX-like operating systems seem to be relaxing this constraint, 
but for general-purpose use this limit must be respected.  The SUNet ID 
system should support an "account ID" that is no more than 8 characters 
in the general case.  Also by convention, UNIX account names use a 
limited character set (lower-case letters, digits, dash, period); this 
restriction should also be respected.
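
As an illustration only, the constraints in this requirement could be 
checked as sketched below.  The function name is_valid_account_id is 
an assumption for this example; the length limit and character set are 
taken directly from the text above.  Any further rules (for example, on 
the first character) are not stated here and are not modeled.

```python
import re

# At most 8 characters, drawn from lower-case letters, digits,
# dash, and period, as described in (R10).
_ACCOUNT_ID = re.compile(r"[a-z0-9.-]{1,8}")


def is_valid_account_id(candidate: str) -> bool:
    """Return True if the candidate fits the (R10) constraints."""
    return _ACCOUNT_ID.fullmatch(candidate) is not None
```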

(R11)  Support Kerberos version 4 security domain

Elements in the Kerberos version 4 security domain are "security 
principals" (or just "principals").  It is a requirement that there be 
at most one principal per real-world subject, and therefore at most one 
principal per SUNet ID system entity.

In Kerberos 4, principal names, although technically unconstrained, are 
by convention of the form "entity.instance".  People usually use 
principals with a null instance, though non-null instances (e.g., 
".root") are used by convention in some applications.  SUNet ID system 
account IDs, and any other IDs used as Kerberos principals, must be 
compatible with this usage.  The system should make explicit the linkage 
between any multiple instances.
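
A minimal sketch of the "entity.instance" convention follows.  The 
function names split_principal and instances_by_entity are 
hypothetical, and the sketch assumes the first period is the separator 
and that a principal with no period has a null instance.  Under that 
assumption an ID that itself contains a period would be read as having 
an instance, which is one reason compatibility with this usage needs 
attention.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple


def split_principal(principal: str) -> Tuple[str, str]:
    """Split "entity.instance" at the first period.

    The instance is "" (null) when no period is present.
    """
    entity, _, instance = principal.partition(".")
    return entity, instance


def instances_by_entity(principals: Iterable[str]) -> Dict[str, List[str]]:
    """Group principals by entity part so linked instances are explicit."""
    linked = defaultdict(list)
    for principal in principals:
        entity, instance = split_principal(principal)
        linked[entity].append(instance)
    return {entity: sorted(insts) for entity, insts in linked.items()}
```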

(R12)  Available to loosely-affiliated persons

Any "regular" member of the Stanford community (e.g., anyone in the HR 
or Registrar active databases) is eligible for a SUNet ID system entry; 
it is expected that almost everyone will have one.  Entries must also be 
available to people with a wide variety of other relationships to 
Stanford: visiting scholars, contractors, special students, guests, etc.  
This will make the authentication mechanisms based on SUNet IDs 
available to these users, independent of their eligibility for or use of 
any particular SUNet ID-based service.

(R13)  Appropriate use, responsibility, and sponsorship

Use of SUNet IDs for authentication is the basis of distributed 
application security.  It is necessary to specify policies about 
appropriate use of IDs, and responsibilities for ensuring appropriate 
use.  

One suggested policy is that for each SUNet ID system entry, there 
should be a "regular" Stanford person (i.e., current, full-time faculty, 
staff, or student) who is ultimately responsible for its appropriate 
use.  This will ground appropriate use of SUNet IDs in the more 
traditional standards that apply to regular members of the Stanford 
community.

The responsible person is a "sponsor" of the SUNet ID for the other 
subject (group, virtual entity, or loosely-affiliated person).  Policies 
must be specified concerning who can be a sponsor, and what sort of 
entities (and with what characteristics) can be sponsored.  It also is 
desirable to represent that a group (such as an academic department) is 
a sponsor.

(R14)  User autonomy

Procedures and policies for creation and management of SUNet IDs, in 
particular sponsorship policies, should be designed to maximize the 
ability of users to get desired results from the system without 
involvement of support staff.

(R15)  Multiple security domains

In addition to the support of Kerberos 4 requirements (R11), the system 
should be extensible to support other security domains, such as 
DCE/Kerberos version 5, and X.509 version 3 certificates.

(R16)  Consistent with other Stanford ID systems

The SUNet ID system is one of several systems for identifying and 
distributing information about Stanford objects.  Other important 
systems include:

  University ID
  Stanford Card (aka ID Card)
  Reference Data
  Directory service
  NetDB

As much as possible these systems should share a common data model 
regarding their subjects: i.e., it should be easy to find the records 
associated with a particular real-world subject across all these systems.  
They should also avoid separate maintenance of overlapping or 
conflicting data.  Data should be able to flow easily among these 
systems as needed.

(R17)  Identifier consistency across differences in case and punctuation

Users cannot be expected to appreciate obscure distinctions between 
identifiers.  This includes case and punctuation differences (e.g., 
"PatLee", "patlee", "pat.lee", "pat_lee").  Comparison rules to test for 
uniqueness should be designed to prevent different subjects from using 
IDs that differ only in these ways.  The system may need to permit 
exceptions to these rules, however, to support the possible existence of 
subjects whose real names differ only in these ways.
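
One possible comparison rule is sketched below, as an assumption rather 
than a decided policy: fold case and drop every non-alphanumeric 
character before comparing.  The names comparison_key and conflicts are 
hypothetical.  The exceptions mentioned above are not modeled.

```python
from typing import Iterable


def comparison_key(identifier: str) -> str:
    """Fold case and drop punctuation so near-identical IDs compare equal."""
    return "".join(ch for ch in identifier.lower() if ch.isalnum())


def conflicts(candidate: str, existing: Iterable[str]) -> bool:
    """Return True if the candidate differs from an existing ID only in
    case or punctuation (or not at all)."""
    key = comparison_key(candidate)
    return any(comparison_key(other) == key for other in existing)
```

With this rule, "PatLee", "patlee", "pat.lee", and "pat_lee" all share 
the comparison key "patlee", so only one of them could be assigned.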

(R18)  Expiration

For security purposes, it is necessary that the system maintain 
information about whether a SUNet ID system entry is active.  This 
should correspond to the status of the real-world subject at Stanford.  
An inactive entry normally would not be valid for use in other contexts 
in the system, e.g., as a sponsor.  The entry and its IDs are not 
deleted from the system, however, since its IDs may need to be protected 
from re-use, and its subject may return to active status.

Entries and IDs will generally be created by a person's direct action, 
but will become inactive based on external events: a person's leaving 
Stanford, a validity end-date having been reached, etc.  The system must 
track these events and set the status of entries and IDs appropriately.

Policies must be established about handling of inactive entries in other 
contexts, e.g., the status of a sponsored entry if a sponsoring entry 
becomes inactive.  The system should attempt to alert affected parties 
ahead of time about these situations.
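
The behavior described in this requirement might be sketched as 
follows.  All names here (Entry, Registry, valid_until, assign, expire, 
reactivate) are assumptions for illustration.  Only the validity 
end-date event is modeled; sponsorship and advance alerts are omitted.  
The point of the sketch is that an entry is marked inactive rather than 
deleted, so its IDs stay protected from re-use and the subject can 
later return to active status.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import Dict, List, Optional, Set


@dataclass(eq=False)  # compare entries by identity, not by field values
class Entry:
    subject: str
    valid_until: Optional[date] = None   # hypothetical validity end-date
    active: bool = True
    ids: Set[str] = field(default_factory=set)


class Registry:
    def __init__(self) -> None:
        self._owner_by_id: Dict[str, Entry] = {}

    def assign(self, entry: Entry, identifier: str) -> None:
        """Bind an ID to an entry; an ID never moves to another entry."""
        owner = self._owner_by_id.get(identifier)
        if owner is not None and owner is not entry:
            raise ValueError(f"{identifier!r} already belongs to another entry")
        self._owner_by_id[identifier] = entry
        entry.ids.add(identifier)

    def expire(self, today: date) -> List[Entry]:
        """Mark entries inactive once their end-date has passed.

        Entries and their IDs are kept, not deleted.
        """
        newly_inactive = []
        for entry in set(self._owner_by_id.values()):
            if (entry.active and entry.valid_until is not None
                    and entry.valid_until < today):
                entry.active = False
                newly_inactive.append(entry)
        return newly_inactive

    def reactivate(self, entry: Entry,
                   valid_until: Optional[date] = None) -> None:
        """Return a subject to active status, keeping its existing IDs."""
        entry.valid_until = valid_until
        entry.active = True
```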

(R19)  Support requirements of dependent application systems

The SUNet ID system exists to provide application systems with 
identifiers.  

Protocol interface.  Create, modify, read.  Must use strong security.

Notification of changes.

Bulk export.  Import?

Specific bullets about Kerb4, ref data, Leland, majordomo, AFS?

(R20)  Preserve "desirable" names

The SUNet ID system promotes the use of natural names for identifiers, 
and it discourages re-using identifiers.  Over time, new users will find 
that desirable identifiers are less and less available to them.  To 
minimize this effect, it is desirable to discourage the use of 
more-attractive identifiers by some subjects.  For example, identifiers 
for loosely-affiliated persons might be required to include a digit; 
these users might also be permitted to have only an account ID, not any 
other forms.

(R21)  User interface

Authenticated.
Web-based.
Easy to use.
Also command-line/script-oriented.

(R22)  User capabilities (authorization)

SUNet ID system internal, not representing authorization for client 
applications.

User X can do Y.
User X can create N of entity type Y.
Members of group X can do Y.
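
The three capability forms listed above could be represented roughly as 
sketched below.  This is only one possible reading of these notes; the 
class Capabilities, its field names, and the sample action names used 
with it are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, Set, Tuple


@dataclass
class Capabilities:
    # "User X can do Y": (user, action) pairs
    user_actions: Set[Tuple[str, str]] = field(default_factory=set)
    # "Members of group X can do Y": (group, action) pairs
    group_actions: Set[Tuple[str, str]] = field(default_factory=set)
    # "User X can create N of entity type Y": (user, type) -> N
    create_quota: Dict[Tuple[str, str], int] = field(default_factory=dict)
    # group -> set of member users
    members: Dict[str, Set[str]] = field(default_factory=dict)

    def can_do(self, user: str, action: str) -> bool:
        """True if granted to the user directly or through a group."""
        if (user, action) in self.user_actions:
            return True
        return any(
            granted == action and user in self.members.get(group, set())
            for group, granted in self.group_actions
        )

    def can_create(self, user: str, entity_type: str,
                   already_created: int) -> bool:
        """True if the user is still under the quota for this type."""
        return already_created < self.create_quota.get((user, entity_type), 0)
```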