TempDB files don’t exist/are not reachable anymore, what to do?

TempDB is one of four(well, actually five with ‘mssqlsystemresource’ but that’s not the point of this blog post) system databases of SQL Server that holds temporary user and system objects for various operations, such as temp tables or sorting operations.

It is a crucial item on the startup checklist for SQL Database engine service. At every restart, tempdb is recreated and starts empty to serve its role. Any problem that you may face during this creation operation will cause the database service fail to start.

The Scenario

Let’s imagine a scenario where there are 2 tempdb data files defined, one of them is on E drive and the other is on F drive. What happens if we lose F drive? It won’t crash the instance but you will start seeing alerts in SQL Agent logs such as

“The operating system returned the error ‘2(The system cannot find the file specified.)’ while attempting ‘GetVolumePathName’ on ‘F:\temp2.ndf’ at ‘storagedmv.cpp'(443).”

Clearly, SQL Server is trying to access the defined tempdb data file but it doesn’t exist anymore, so it fails.

The first idea that comes to mind is to remove those file(s) and we should be okay. But you can’t just remove them, because SQL Server can’t remove a file if it is not empty.

USE [tempdb]
GO
ALTER DATABASE [tempdb]  REMOVE FILE [temp2]
GO

Msg 5042, Level 16, State 1, Line 3
The file ‘temp2’ cannot be removed because it is not empty.

Then how about emptying the file first and then remove it? No, it will not work also because we can’t access the file anymore.

USE [tempdb]
GO
DBCC SHRINKFILE (N'temp2', EMPTYFILE)
GO

Msg 823, Level 24, State 2, Line 3
The operating system returned error 21(The device is not ready.) to SQL Server during a read at offset 0x000001f3402000 in file ‘F:\temp2.ndf’. Additional messages in the SQL Server error log and operating system error log may provide more detail. This is a severe system-level error condition that threatens database integrity and must be corrected immediately. Complete a full database consistency check (DBCC CHECKDB). This error can be caused by many factors; for more information, see SQL Server Books Online.

The fix

We have to tell SQL Server that we lost these files and during next startup, it should create them on another available disk. How do we do that? By giving a new path for tempdb database files using ALTER DATABASE command.

USE [tempdb]
GO
ALTER DATABASE [tempdb] MODIFY FILE 
(name = temp2, filename = 'E:\temp2.ndf')--New available path
GO

The file “temp2” has been modified in the system catalog. The new path will be used the next time the database is started.

As can be seen in the result output, SQL updated its system catalog and will try to create this data file on given network path.

We can also verify this change by querying master_files dmv

SELECT * FROM sys.master_files WHERE database_id=2

As long as there is enough space on that drive, SQL should start without any problem and create tempdb files as configured.

If there is not enough space, then it will create tempdb files based on model database configuration. This will also be logged in error log file and sql agent log as

Message
MODIFY FILE encountered operating system error 112(There is not enough space on the disk.) while attempting to expand the physical file 'E:\temp2.ndf'.

SQL Server without tempdb?

One day you may want / need to start SQL Server without tempdb, maybe because tempdb drive is corrupt or not reachable and SQL doesn’t start successfully. Then you can start database engine service in a minimal configuration mode with the -f command-line parameter, either from SQL Configuration Manager tool by setting Service Properties or use the command-line

SQLServr.exe -f

Alternatively, you can use Trace Flag 3608 which only starts master database

SQLServr.exe -T3608

After bringing the service up, you can connect to it using SQLCMD command line tool and change the paths of tempdb files to a suitable folder. Now we can stop the database service, restart it and tempdb will be automatically created on new path without any issue.

SQL Server XML Conversion Error: “Conversion of one or more characters from XML to target collation impossible”

A quick and short post this time regarding a sql server error being thrown when you try to convert xml characters. XML column has an unicode character that can not be converted into ASCII, thus our engine is not happy about it.

Let’s say you have a “Table1” which stores a “XMLColumn” and you’d like to query on that column with LIKE operator. Hence, you convert it to varchar data type first.

SELECT XMLColumn FROM Table1
WHERE Convert(varchar(max), XMLColumn) IS LIKE %ARollingStone%

Then, you’ll get below error:

Msg 6355, Level 16, State 1, Line 2
Conversion of one or more characters from XML to target collation impossible

To fix this, you need to simply change varchar to nvarchar, that’s it.

SELECT XMLColumn FROM Table1
WHERE Convert(nvarchar(max), XMLColumn) IS LIKE %ARollingStone%

Query Store stopped working after Windows Update on Database Host

Firstly, This is not going to be a long, detailed blog post. I would like to share an issue with Query Store that I experienced today after Windows Updates on SQL Server hosts and a quick fix for that.

Average Wait time is too high on SQL Server

I was doing sanity check after windows updates and noticed severe wait times on a server, caused by a specific database. This very database had a problematic select query which confuses optimizer to choose a wrong execution plan. Query Store helped me to address the issue in the past, I forced the right plan and move on.

I immediately went after the same query and realized it is the one that causes high wait times and lock issues. I checked Query Store and everything looked fine. My forced plan was still there and according to query store statistics, average execution time was 10 miliseconds for problematic query. However, I can see that query takes 10-15 seconds by checking system dmvs. Obviously query was not running with right execution plan. So how can that be and what’s wrong here?

Query Store stopped collecting data

It took some time for me to realize Query Store statistics stop around 7 AM in the morning, coinciding Windows Update time. For some reason, Query Store stopped working and collecting data. That’s why, it was not reflecting what is going on currently and I was blind.

I checked Disk usage and Query Store Space, both were fine. Then I tried to unforce and force the same plan, it didn’t help either. I couldn’t restart the instance or the database so I needed something else to provide quick fix.

FIX – Purge Query Store Data!

I correctly assumed that purging query data would trigger a mechanism internally to restart Query Store and decided to go for it. Voilà! Right execution plan is picked up right away by Query Optimizer, all locks are released and database performance went back to normal. I am not sure why query store stopped working after server reboots for Windows updates. To be able to identify the relation and what was the underlying reason, one needs a proper deep-dive investigation.

You can purge query data by right clicking on database, going into “Properties” and then jump on “Query Store” tab. There you will see a button at the bottom as also shown in below image.

Purge Query Data!

Backup script fails due to lack of network share permission

Most of people uses Ola Hallengren’s Maintenance Solution to administer their databases.  It’s easy to setup, easy to maintain. It also has dozens of parameters that you can specify and adjust maintenance plan according to your business needs.If you never heard it, you must take a look now!

We have encountered a generic problem with backup script, it was complaining about network share that we take our database backups.

We use same script on each instance, so script is good.

File share is accessible from that database host, network access is good.

We gave read/write permissions for SQL Agent service account for backup folder.

Still we get this error, so what is the problem?

The directory “\\fileshare\Network Backup Folder” does not exist. [SQLSTATE 42000] (Error 50000)

SQL Agent service account is not enough!

Both SQL Server (database engine) and Agent service accounts should have access on backup folder. Indeed after giving permissions for SQL Server service account and rebooting hosts to clear windows policy cache( if you forget to reboot or logoff/logon you will be asking WHY question a lot) it started to take backups.

Summary: If you see SQL Server can not reach a network share directory for backup or another reason, first check that both service accounts have access to that folder, then either reboot the host or log off and log on with that accounts.