Distributed queries – (Part-2)
Author: Basit A. Farooq
Linked server for another SQL Server
To create a linked server object for another SQL Server, using SQL Server Management Studio:
· In Object Explorer, expand Server Objects.
· Right-click Linked Servers and choose New Linked Server.
· Enter the name of the linked server.
· Select SQL Server.
· Click Security.
· If necessary, add mappings.
· Specify how the connection should be handled for logins not explicitly mapped. You can choose from:
– Not be made.
– Be made without using a security context.
– Be made using the login’s current security context (self-mapping).
– Be made using this security context (specify a remote login and password).
· Click Server Options to define additional configuration options. You can set:
– Collation Compatible — set true if both servers use the same collation (character set and sort order).
– Data access — set true to enable the linked server to be accessed in a distributed query.
– Rpc — Enables remote procedure calls to be received from the linked server.
– Rpc Out — Enables remote procedure calls to be sent to the linked server.
– Remote collation — set true, the collation for the remote server is used. Set false, the collation for the local server is used.
– Collation name — specifies a collation to use on the remote server. Set this only if the remote server supports the same collations as SQL Server and uses only a single collation on the data source.
– Connection timeout — sets the connection timeout. To use the timeout specified in sp_configure, set this to 0.
– Query timeout — sets the query timeout. To use the timeout specified in sp_configure, set this to 0.
· Click OK.
Linked server for a non-SQL datasources
To configure a linked server mapped to a non-SQL Server data source:
· In Object Explorer, expand Server Objects.
· Right-click Linked Servers and choose New Linked Server.
· Enter the name of the linked server.
· Select “Other data source.”
· From the Provider list, select the provider.
· Enter the product name.
· Enter the data source name.
· If necessary, enter the provider string.
· If necessary, enter the location.
· If necessary, enter the catalog.
· Click Security.
· Configure mappings.
· Click Server Options.
· Configure server options as described in “Linked server for another SQL Server.”
· Click OK.
Viewing linked servers
You can view information about the linked servers and logins by querying the following catalog views:
· sys.servers — Information about linked servers.
· sys.linked_logins — Information and linked login mappings.
· sys.remote_logins — Incoming login mappings.
Deleting linked servers
To delete a linked server object:
· Expand Server Objects, Linked Servers.
· Right-click the server you want to delete and choose Delete.
· Click OK.
· Click Yes to confirm that you also want to delete the remote logins for that server.
Transact-SQL statements for distributed queries
The queries you can execute against a remote source depend on the capabilities of the OLE DB provider and the server configuration. Issues include whether or not:
• The remote sever is a linked server.
• The remote server supports a four-part name.
A four-part name is one with the syntax:
linked_server.database.schema.object
Linked servers
You can reference data sources that support four-part names using standard Transact- SQL statements, such as SELECT, INSERT, UPDATE, and DELETE. You also use a four-part name to call a stored procedure on the remote server. SELECT queries on linked servers can’t include the ORDER BY clause.
If a linked server doesn’t support a four-part name, you must access it by using a pass through query. When working with a linked server, you typically use OPENQUERY to execute pass through queries. Other rowset functions are also supported, but connect to the remote server as if it were a non-linked server.
Illegal commands
You can’t run the following commands remotely on a linked server:
• Any CREATE
• Any ALTER
• Any DROP
The OPENQUERY function
You can execute a query on a linked server that doesn’t support four-name access with OPENQUERY. You reference OPENQUERY in the FROM clause of a SELECT statement or as the target table of an INSERT, UPDATE, or DELETE statement. OPENQUERY has the following syntax:
OPENQUERY(linked_server, ‘query’)
For example, to issue a query on the linked server named RemSrv, you execute:
SELECT * FROM OPENQUERY(RemSrv,
‘SELECT FirstName, LastName FROM Sales.Customers’)
Distributed transactions
A distributed transaction is a transaction that runs across two or more servers. The servers involved are referred to as resource managers. There is also a transaction manager that coordinates between the resource managers. The Microsoft Distributed Transaction Coordinator (MS DTC) installs with SQL Server as a transaction manager.
In SQL Server, you run operations with remote linked servers as distributed transactions. This means that SQL Server either completes the transaction on all servers or rolls back the transaction from all servers. The SQL Server runs a distributed transaction when:
• A local transaction issues a distributed query.
• A local transaction executes a remote stored procedure, and the REMOTE_PROC_TRANSACTIONS option is set on.
• A BEGIN DISTRIBUTED TRANSACTION is issued.
• A distributed transaction is initiated through OLE DB methods or ODBC functions.
The MS DTC coordinates distributed transactions between two or more resource managers (data sources). You can use MS DTC in SQL-Server-only environments or in mixed environments with other transaction monitors that are compatible with the UNIX XA transaction processing standard. MS DTC provides tight consistency among data sources—at any given time, updates to all copies of data are consistent. This requires a high-speed network and can reduce the availability of databases.
SQL Server uses a two-phase commit process to avoid problems with distributed transactions, such as communication failures. A two-phase commit is implemented as:
• Prepare phase —prepared command is sent to all resource managers involved in the transaction. All buffers holding log images write to disk, and any other actions required are completed. Each resource manager returns its success or failure status to the transaction manager.
• Commit phase — the transaction manager sends commit commands to each resource manager only if each resource manager was successful in the prepare phase. At that point, each resource manager commits and then sends its status back to the transaction manager. When all resource managers have successfully committed, this information is returned to the application that initiated the transaction. If any resource manager fails, a rollback command is sent to all resource managers.
Because of their vulnerability, distributed transactions are typically used only when data must remain synchronized at all times.
Unlinked servers
Linking to a remote server is not your only option for supporting distributed queries. SQL Server rowset functions support non-linked server operations. Non-linked server operations are based on an ad hoc connection to a remote server. This establishes a temporary connection to complete an operation and closes the connection when the operation has completed. Non-linked server operations are supported by OPENROWSET and OPENDATASOURCE rowset functions.
Remote ad hoc query support is disabled by default on computers running SQL Server 2014. You can use sp_configure or the SQL Server Surface Area Configuration utility to enable remote ad hoc queries.
The OPENDATASOURCE function
You can initiate a query for data retrieval or manipulation with OPENDATASOURCE. You will often use it to make a temporary connection with a heterogeneous (non-SQL Server) data source, and when you need to make only intermittent connections.
The OPENDATASOURCE function is called as the first part of a four-part name, effectively acting as the database server for the purpose of the query. The purpose of the OPENDATASOURCE function is to return the source data object as the query data source. It has the following syntax:
OPENDATASOURCE(provider_name, init_string)
The provider_name is the programmatic identifier for the OLE DB provider. The init_string is a string of keyword value pairs (keyword=value; keyword=value). The supported keywords are described in the following table.
The OPENROWSET function
You can also query a remote server using the OPENROWSET function. One advantage of OPENROWSET is that it can return a data set based on a query, rather than a complete table or view, thus reducing network communication requirements. OPENROWSET has the following syntax:
OPENROWSET ‘provider_name’, {‘datasource’;
‘user_id’; ‘password’,} | ‘provider_string’}
, {[catalog.][schema.]object | ‘query’ }
The provider_name, datasource, user_id, password, and provider_string parameters are the same as those you have already seen. You use the catalog.schema.object syntax to execute a stored procedure on the server. The function supports bulk processing options for processing LOB type data.
You use OPENROWSET as the data source in a query. You can use it in the FROM clause of a SELECT statement as if it were a table, view, or table-valued function. There are two basic options for using OPENROWSET:
• Have the function return a complete table or view from the remote server. This can result in significant traffic requirements for very large tables.
• Pass through a query to the remote server to reduce the result size, or to return a custom result, such as one based on a join.
You can combine OPENROWSET with other data sources in your query. For example, you can run queries that join local tables with remote tables. This is one way to create distributed views, especially when heterogeneous servers are involved.