Name resolution, default schema, implicit schema

This document describes name resolution behavior of schema objects, types and XML schema collections, and how it interacts with default schema, compile plan implicit schema.

I. Context/Background

The following shows the general syntax for referring to an object/type/XML-Schema-collection in SQL Server:

[server.] [database.] [schema.] entity

where:

  • entity is the name of the object/type/XML Schema collection.
  • schema is the schema containing the entity. The schema qualifier lets you refer to an entity in a schema other than default or implicit schema. Only schema entities can be qualified with schema. Schema entities are schema scoped entities including types, XML schema collections, and objects such as tables, views, procedures, functions, DML triggers, constraints, rules, defaults, etc. Non-schema objects, such as DDL triggers, event notifications, cannot be qualified with schema because they are not schema scoped.
  • database is the database context of the entity. This identifier is only valid for schema scoped objects but not types or XML collections.
  • server applies only when you are referencing to an object on linked/remote server. This is the name of the linked server containing the object. The server qualifier lets you refer to an object in a server other than your local server instance. Not all SQL statements allow you to access objects on linked/remote servers. If server part is specified, the object is resolved by delegating to the specified linked/remote server.

You can include spaces around the periods separating the parts of the reference to the entity, or omit database or schema part between the periods. If names of any part include special characters, you can use quoted identifier (BOL Identifier section).

In SQL 2000, types are owned by users and they are not schema scoped, i.e. their names are unique in the containing database. The only way to reference a given type is to use single part (entity) name.

1.2. System Objects

In SQL2005, System Objects are shipped with SQL Server in resource database. (See resource database CSS Material) These include views, scalar- and table-functions, stored-procedures, and extended-stored-procedures. Except Information-Schema (ANSI) views that belong to system schema “INFORMATION_SCHEMA”, all system objects belong to a new, reserved system schema “sys”.(Note: If a user database has a schema called “sys” it cannot be upgraded.)

These system objects can be visible (or resolvable) in every database, and compiled and/or executed in a database as if it were created in that database.The following are feature-definitions of the different types of System Objects.

  • System-Stored-Procedure – a System Object that has a name beginning with “sp_” and is either a stored-procedure or an extended-stored-procedure.
  • System-Extended-Stored-Procedure – a System Object that has a name beginning with “xp_” and is either an extended-stored-procedure or a stored-procedure. Unless schema ‘sys’ is qualified with or implied, it is visible only in “master” database (and thus always executes only in master).
  • System View – a System Object that is a view. It can be a backward compatible view, a catalog view, a dynamic management view (DMV) or an Information-Schema view. Unless schema context is implied or it is a backward compatible view, explicit schema ‘sys’ or ‘INFORMATION_SCHEMA’ must be qualified with.
  • System Function – a System Object that is a function (scalar or table) that “exists” (or is visible) in every database. It is recommended that the name begins with “fn_”.

Most of these system objects are migrated from SQL2000, and they were contained in schema “dbo” or “system_function_schema” (unbind able and hidden, only contains system functions) of master database. Back then, SQL Server name resolution and execution context have following special behaviors to handle system objects:

1.Resolve “anydb.dbo.sp_” to “master.dbo.sp_”, but execute in “anydb” context.

When you run either of the following two batches:

USE my_database; EXECdbo.sp_my_procedure; go
EXEC my_database.dbo.sp_my_procedure; go

SQL Server 2000 searches master.dbo before my_database.dbo, therefore finding the system procedure sp_my_procedure if it exists. This means that a fully-qualified 3-part(database, schema and object) name may not literally resolve to what it says it should, rather a system object from master database could be resolved instead and evaluated in the context of the specified database. In other words, user procedures (if resolved) are always compiled and executed in the database where they exist, but system procedures are compiled and executed in every database. Clearly, we are hijacking user’s namespace in this scenario, i.e. if a user created procedure, which is contained in schema “dbo”, has the same name as system procedure, the user will not be able to execute this procedure.

Important Note: In above example, SQL 2000 considers dbo.sp_my_procedure in master as system procedure only if it is marked as MS-shipped. If it is not marked as MS-shipped, then the user procedure in my_database will be resolved before it. However, if the user procedure in master was resolved indeed, then it would be compiled and executed in master database context, except SQL 65system tables contained in it, which are compiled in my_database context. This behavior was for backward compatible.It is still preserved in SQL2005for TSQL procedures (not CLR procedures) but will be removed in some future releases.

2.System table-valued functions must be invoked with special “::” prefix, system scalar-valued functions must be invoked without any schema.

In SQL 2000, system functions were contained in hidden schema “system_function_schema”, which cannot be referenced. User defined functions must be qualified with schema. As a consequence, this rule makes system functions unambiguously invoked. In SQL2005, the “::” syntax is being deprecated, rather it is recommended to qualify with schema “sys”. Compatibility will be maintained only for the System Functions shipped in SQL2000 by keeping an internal table to map the old syntax to the replacement system function.

3.Information-Schema views must be referenced using schema “INFORMATION_SCHEMA’, and they exist in master database but are compiled in every database.

4.All other MS-shipped “system” objects in master database, including extended procedures, user defined functions, system tables, have to be invoked by specifying master database, and thus always invoked in master database context.

1.3 Important Terminologies

[User’s Default Schema] In SQL 2000, there is an implicit relationship between users and schemas. Users (actually all database principals) by default are owners of a schema that is of the same name as the user (database principal). This implicit relationship is explicitly persisted upon upgrade from SQL 2000. SQL2005 enables an association of a user with a default schema that is not of the same name as the user, or multiple users to the same default schema. If a user’s default schema is not specified while creating the user, dbo is the default schema.

[Execute-As Context]A procedure/function can be executed with the privileges of its owner or with the privileges of the current user (caller), or with the privileges of arbitrary user, depending on the procedure/function definition.

Standalone Execute-As/SETUSER can be executed in current session or inside procedure/function.
[The Current Login/User]The session login/user is either the logged-in login/user or the login/user associated with the remote linked/remote-server session. Without invoke any Execute-As context, either thru procedure/function or standalone, the current login/user is the session login/user.

On database context switch, the user registered in the new database that is mapped to current login is the new current user. If no user is registered for the login, guest user is used.

On entering an Execute-As context (except execute as caller), the specified login/user becomes thecurrent login/user.On exiting an Execute-As context, the current login/user reverts to the previous current login/user, that is, the current login/user that triggered the Execute-As context switch.

[Active Default Schema] Active default schema is schema of the SQL module if entity referenced by a statement (except DDL, dynamic SQL, a.k.a. EXECUTE statements, or intrinsic such as object_id()) inside a procedure, function or view. Otherwise, active default schema is current user’s default schema.

II. Name Resolution Algorithm for Types and XML Schema Collections

2. 1. Algorithm

As implied by general syntax for referencing entities, types or XML schema collections cannot be referenced across database. If schema name is unspecified, SQL server uses “sys first” algorithm as outlined below.

  1. First, Look in the “sys” schema if not DDL access, then
  2. Look in the active default schema, then
  3. Look in “dbo” schema.

If schema name is specified, SQL server simply looks in the designated schema.

Note:

  • Database collation is used to match schema, type and XML schema collection names.
  • DDL access includes CREATE/DROP/ALTER on types or XML schema collections.

2.2. Backward Compatibility

During upgrade, system types are put in “sys” schema, user defined types are put in “dbo” schema. In SQL 2000, user defined types can only be created thru sp_addtype. This procedure is modified in SQL2005 to always put newly created types in “dbo” schema. Since types can only be referenced using single part name in SQL 2000, above algorithm clearly satisfies backward compatibility requirement.

Note that new type DDL can be used to create types in schemas other than “dbo” and “sys”. It is recommended to qualify with schema when referencing user defined types in applications, if active default schema cannot be dependent on.

XML schema collections are newly introduced in SQL2005, so there is no backward compatibility issue with it.

III. Name Resolution Algorithm for Objects

3.1 Algorithm

In a nutshell, objects name resolution follows the same rule as types and XML schema collections. However, SQL Server name resolution and execution context used to have special behaviors in SQL2000 as outlined in Section I, not to mention the fact that system objects are now migrated to resource database, and user-schema separation feature is introduced. Clearly name resolution algorithm needs to be more complex, with the following goals(more details later):

  • Avoiding luring attack
  • Minimizing user’s name space hijacking
  • Backward compatibility
  • Unifying algorithm with types and XML schema collections
  • Acceptable performance

For DDL access, including CREATE/DROP/ALTER object, the algorithm is the same as types and XML collections. Following flow chart covers other scenarios for 1-part or 2-part names.

If server name is specified, the object reference is resolved by the specified remote/linked server.

If database name is specified, the object is resolved in the specified database context. (For system objects which are visible in every database, it means compile/execution database context.) It is worth to mention the case when schema name is omitted though. Under this circumstance, the active default schema is defined as following:

  • For object references outside SQL modules, or DDL, dynamic SQL inside SQL modules, current user’s default schema in the specified database is used.
  • Otherwise, the module schema owner’s default schema in the specified database is used.

3.2Avoiding Luring Attack

Consider following scenario system admin uses an unqualified name reference to a system procedure from an arbitrary database/schema context.

use database1
go
exec sp_addlogin 'somelogin', 'somepassword'
go

We recommend users to qualify with schema “sys” when referencing system procedures in SQL2005.However, existing applications may well rely on SQL 2000 name resolution special behavior to bind system procedure (with sp_ prefix) tighter than user procedures.If we did not force a “sys first” strategy, the unqualified name reference shown above would bind to dbo.sp_addlogin in database1. A “malicious dbo” could then get a system admin to execute arbitrary code.

3.2 Minimizing User’s Name Space Hijacking

As illustrated in the algorithm logic, system objects are resolved first to avoid luring attack. Unfortunately, it is hijacking user’s namespace, i.e.if Microsoft SQL Server ships system stored procedure sp_foo in SQL2005 or future releases, user stored procedure with same name, which is contained in dbo schema, can no longer be used.

The strategy is for system objects to use prefix naming standards, and requiring users to qualify with schema otherwise, such as catalog views, Information-Schema views and dynamic management views. End users are discouraged to use MS-reserved name prefixes “sp_/fn_/xp_”.

We may introduce SQL-path or similar feature in next release to improve this, as well as address user’s desire to choose different name resolution search order in general.

3.3Backward Compatibility

As we can tell from the algorithm, the special behavior of SQL 2000 name resolution with respect to system objects is preserved.There is subtle difference between SQL2005 and SQL 2000 with respect to collation used to match names though.

In SQL2005, system object names are matched using collation of current context database. This means if you are using case sensitive collation database, system object name must be exactly matched. E.g. “exec SP_help” will fail to resolve as system procedure sp_help in database with Latin1_General_BIN collation.

The reason is that system objects logically exist in every database, ideally they should be treated same way as user objects, which are always resolved using context database collation.

This behavior change from SQL 2000 is summarized in the following table. This makes difference on SQL Server instances where user databasesuse different collation with server. (System databases use server collation.)

System object collation difference between SQL 2000 and SQL2005

SQL 2000 / SQL2005
Schemas / Database collation / Database collation
Stored procedures, functions / Server collation
(shipped in master-db and contained in schema dbo) / Database collation
(shipped in resource-db contained in schema sys)
Functions / Keyword collation, i.e. Latin1_General_CI_AS_KS_WS
(shipped in master-db and contained in schema system_function_schema) / Database collation
(shipped in resource-db contained in schema sys)
Information-Schema views / Server collation
(shipped in master-db and contained in schema INFORMATION_SCHEMA) / Database collation
(shipped in resource-db contained in schema INFORMATION_SCHEMA)
Tables / Server collation
(shipped in master-db and contained in schema dbo) / Database collation
(changed to back-compatible views, shipped in resource-db and contained in schema sys)
Tables / Database collation
(shipped in all databases and contained in schema dbo) / Database collation
(changed to back-compatible views, shipped in resource-db and contained in schema sys)

IV. Implicit schema (Compile Plan)

A compile plan may refer to non-qualified schema object. It needs to record that fact an “implicit” schema is used (Name resolution detect such scenario and default schema is recorded in the compile plan, which is used as part of the plan cache lookup key).Batches with unqualified object names result in non-reuse of query plans. For example, in "SELECT * FROM MyTable", MyTable may legitimately resolve to Alice.MyTable if Alice issues this query, and she owns a table with that name. Similarly, MyTable may resolve to Bob.MyTable. In such cases, SQL Server does not reuse query plans. If, however, Alice issues "SELECT * FROM dbo.MyTable", there is no ambiguity because the object is uniquely identified, and query plan reuse can happen. (See the uid column in sys.syscacheobjects. It indicates the user ID for the connection in which the plan was generated. Only query plans with the same user ID are candidates for reuse. When uid = -2, it means that the query does not depend on implicit name resolution, and can be shared among different user IDs.)