PostgreSQL 9.4+ Configuration Setting Formulas

I’ve been adjusting postgresql.conf settings to get the best performance out of two very different systems: a Raspberry Pi 2 (with 1G of RAM, which can’t be increased) and a development server (with 64G of RAM), both running PostgreSQL 9.4. Here are the formulas I use to keep my PostgreSQL environments happy. I’m not a seasoned PostgreSQL expert, so these values are a work in progress.

Some of these values were obtained by reverse-engineering http://pgtune.leopard.in.ua/.

Retrieve your RAM in MB with the command free --mega. The value you want is at the intersection of the Mem row and the total column.

In my case, it shows 64432M, or about 64G of RAM.

This Excel document is my latest attempt at calculating PostgreSQL memory settings:
PG setting calculations

Roughly, I use these formulas:

shared_buffers = {0.2 * RAM_IN_MB}MB # mine was "shared_buffers = 13107MB"
effective_cache_size = {0.65 * RAM_IN_MB}MB # mine was "effective_cache_size = 42598MB"
work_mem = {0.0064 * RAM_IN_MB}MB
maintenance_work_mem = {0.128 * RAM_IN_MB, but cap at 2048 (2G)}MB
checkpoint_segments = 64 # checkpoint after every 64 WAL segments; 64 * 16MB = 1024MB = 1GB
autovacuum_naptime = 1min # 10min if performing heavy writes, otherwise 1min
default_statistics_target = 500
max_locks_per_transaction = 1024
track_functions = all
track_activity_query_size = 16000 # so the query column in pg_stat_activity shows up to 16k characters, not the default of 1k
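
To sanity-check the results after editing postgresql.conf and reloading, here’s a quick query (my addition) against the standard pg_settings view that shows the live values and their units:

-- Compare the running values against the formulas above.
select name, setting, unit, source
from pg_settings
where name in ('shared_buffers', 'effective_cache_size', 'work_mem',
    'maintenance_work_mem', 'checkpoint_segments', 'autovacuum_naptime',
    'default_statistics_target', 'max_locks_per_transaction',
    'track_functions', 'track_activity_query_size');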

Creating a PostgreSQL BDR Sandbox on Ubuntu 14

Background:
After about two months of work trying to get Postgres-XC to work, I gave up. There was no step-by-step guide to creating a demo setup, and the documentation stunk. I found one site with a step-by-step guide, but it wasn’t detailed enough for me. Now, I will submit that I am not an expert with Postgres or Ubuntu (I’d consider myself a beginner), but I am pretty decent with SQL Server (MCITP) and a few other databases, so there’s that.

I discovered Postgres BDR (https://wiki.postgresql.org/wiki/BDR / https://wiki.postgresql.org/wiki/BDR_Quick_Start), and so far, I’m cautiously optimistic. Their quick-start guide is pretty good, and I’ve copied some portions of it here. This guide is for people like me, who have limited experience with Postgres and Ubuntu. If you find some issues with how I did something, feel free to leave a comment.

If you leave a comment, and it gets deleted, please do not be offended, and post it again. I get a huge amount of spam comments, and I may miss a real one now and then. Let me know if I am steering someone away from doing things correctly.

These steps will:

  • DELETE ANY EXISTING POSTGRES DATA.
  • Set up a Postgres 9.4 database on a single server as two nodes on Ubuntu 14.

Notes:

Replace {DevEnvName} with your development environment name. For example:
From: ./pg_ctl -l /var/lib/postgresql/9.4/bdr/{DevEnvName}_node01_port_5600/postgresql.log -D /var/lib/postgresql/9.4/bdr/{DevEnvName}_node01_port_5600 -o "-p 5600" -w stop -m fast
To: ./pg_ctl -l /var/lib/postgresql/9.4/bdr/devops01_node01_port_5600/postgresql.log -D /var/lib/postgresql/9.4/bdr/devops01_node01_port_5600 -o "-p 5600" -w stop -m fast

Replace {DevDBName} with your development database name. For example:
From: createdb {DevDBName} --port 5600 --username postgres
To: createdb QADB --port 5600 --username postgres

Remove Any Existing Version of Postgres.

Using htop, kill any postgres sessions that you have running. You’re gonna rip it out anyway, so it doesn’t matter.

These commands will show you some errors if you don’t have postgres already installed. You can skip them if you wish. I installed BDR about 10 times before I got it set up, which is why I included them here for your viewing pleasure.

sudo su - root
cd /

apt-get --purge remove postgres*
apt-get -f autoremove
cd /home/root
rm .pgadmin3
rm pg_hba.conf
rm postgresql.conf
cd /root
rm 2ndquadrant_bdr -r
rm bdr -r
rm bdr-extension -r
rm bdr-pg -r
cd /var
rm bdr -r
rm bdr-extension -r
rm bdr-pg -r
cd /var/log
rm postgres* -r
cd /var/lib
rm postgres* -r

Download and Install PostgreSQL.

This step downloads and installs the latest stable version of PostgreSQL as of this writing, 9.4. Postgres BDR is a kind of add-on to Postgres. You will install the add-on soon.

  • sudo su - root
  • apt-get install postgresql-9.4 postgresql-client-common postgresql-client-9.4 postgresql-contrib-9.4 postgresql-common

Download and Compile BDR Code.

I didn’t know at first what the curl command does: it downloads 2ndQuadrant’s bdr_quickstart.sh script and pipes it straight into bash, which then downloads and compiles the BDR sources. It works, so there you go.

  • exit # out of root permissions
  • sudo su - postgres
  • pwd # shows /var/lib/postgresql instead of /home/postgres; interesting
  • curl -s "http://git.postgresql.org/gitweb/?p=2ndquadrant_bdr.git;a=blob_plain;f=scripts/bdr_quickstart.sh;hb=bdr-plugin/next" | bash
  • If you receive an error after issuing the curl command above, issue it a second time.

After about 10 minutes, and a lot of text, you’ll see this:

BDR compiled and installed.

Sources at /var/lib/postgresql/2ndquadrant_bdr/bdr-src
Installed to /var/lib/postgresql/2ndquadrant_bdr/bdr

Now add it to your PATH:
export PATH=/var/lib/postgresql/2ndquadrant_bdr/bdr/bin:$PATH
and carry on with the quickstart at https://wiki.postgresql.org/wiki/BDR_User_Guide

Enter this command (the same as the export command above, written using $HOME):

  • export PATH=$HOME/2ndquadrant_bdr/bdr/bin:$PATH

Create BDR-enabled PostgreSQL Instances.

  • initdb -D /var/lib/postgresql/9.4/bdr/{DevEnvName}_node01_port_5600 -A trust -U postgres
  • initdb -D /var/lib/postgresql/9.4/bdr/{DevEnvName}_node02_port_5601 -A trust -U postgres

Stop all Postgres Clusters Gently.

Let’s stop them if they are running.

  • pg_ctl -l /var/lib/postgresql/9.4/bdr/{DevEnvName}_node01_port_5600/postgresql.log -D /var/lib/postgresql/9.4/bdr/{DevEnvName}_node01_port_5600 -o "-p 5600" -w stop
  • pg_ctl -l /var/lib/postgresql/9.4/bdr/{DevEnvName}_node02_port_5601/postgresql.log -D /var/lib/postgresql/9.4/bdr/{DevEnvName}_node02_port_5601 -o "-p 5601" -w stop

Enable the BDR Extension.

I’m going to repeat myself a little on this step. I hope it helps.

  • cd /var/lib/postgresql/9.4/bdr/{DevEnvName}_node01_port_5600
  • nano postgresql.conf
  • Scroll down a little to the POSTGRESQL.CONF changes and implement them
  • Scroll down a little to the POSTGRESQL.CONF additions and add them
  • Uncomment the lines for node 1, port 5600

  • cd /var/lib/postgresql/9.4/bdr/{DevEnvName}_node02_port_5601
  • nano postgresql.conf
  • Scroll down a little to the POSTGRESQL.CONF changes and implement them
  • Scroll down a little to the POSTGRESQL.CONF additions and add them
  • Uncomment the lines for node 2, port 5601

  • Note: the bdr.connections identifier can only contain letters and numbers. Underscores, tildes, periods, dashes, and any other punctuation are not allowed. A helpful error message will complain if you try.

# Make this change in these files:
# /var/lib/postgresql/9.4/bdr/{DevEnvName}_node01_port_5600/postgresql.conf
# /var/lib/postgresql/9.4/bdr/{DevEnvName}_node02_port_5601/postgresql.conf
listen_addresses = '*'
# Add these lines to the bottom of these files:
# /var/lib/postgresql/9.4/bdr/{DevEnvName}_node01_port_5600/postgresql.conf
# /var/lib/postgresql/9.4/bdr/{DevEnvName}_node02_port_5601/postgresql.conf
# Generic settings required for BDR
#----------------------------------

# Allow two other peer nodes, plus one for init_replica
max_replication_slots = 3

# Two peer nodes, plus two slots for pg_basebackup
max_wal_senders = 4 

# Record data for logical replication
wal_level = 'logical'
track_commit_timestamp = on

# Load BDR
shared_preload_libraries = 'bdr'

# Make sure there are enough background worker slots for BDR to run
max_worker_processes = 10

# These aren't required, but are useful for diagnosing problems
#log_error_verbosity = verbose
#log_min_messages = debug1
#log_line_prefix = 'd=%d p=%p a=%a%q '

# Useful options for playing with conflicts
#bdr.default_apply_delay=2000   # milliseconds
#bdr.log_conflicts_to_table=on

# (BEGIN) BDR connection settings for node 1, port 5600
#bdr.connections = '{DevEnvName}node02port5601'
#bdr.{DevEnvName}node02port5601_dsn = 'dbname={DevDBName} user=postgres port=5601'
# (END) BDR connection settings for node 1, port 5600

# (BEGIN) BDR connection settings for node 2, port 5601
#bdr.connections = '{DevEnvName}node01port5600'
#bdr.{DevEnvName}node01port5600_dsn = 'dbname={DevDBName} user=postgres port=5600' 
#bdr.{DevEnvName}node01port5600_init_replica = on
#bdr.{DevEnvName}node01port5600_replica_local_dsn = 'dbname={DevDBName} user=postgres port=5601'
# (END) BDR connection settings for node 2, port 5601

Allow Replication in pg_hba.conf.

# Add these lines to:
# /var/lib/postgresql/9.4/bdr/{DevEnvName}_node01_port_5600/pg_hba.conf
# /var/lib/postgresql/9.4/bdr/{DevEnvName}_node02_port_5601/pg_hba.conf
host all all all trust
host replication all all trust
local replication postgres trust
local replication all trust

Start the Cluster.

  • pg_ctl -l /var/lib/postgresql/9.4/bdr/{DevEnvName}_node01_port_5600/postgresql.log -D /var/lib/postgresql/9.4/bdr/{DevEnvName}_node01_port_5600 -o "-p 5600" -w restart
  • pg_ctl -l /var/lib/postgresql/9.4/bdr/{DevEnvName}_node02_port_5601/postgresql.log -D /var/lib/postgresql/9.4/bdr/{DevEnvName}_node02_port_5601 -o "-p 5601" -w restart

Verify that the cluster is running.

  • cd /var/lib/postgresql/9.4/bdr/{DevEnvName}_node01_port_5600
  • tail postgresql.log --lines 30
  • cd /var/lib/postgresql/9.4/bdr/{DevEnvName}_node02_port_5601
  • tail postgresql.log --lines 30

Note: it will likely complain that the “{DevDBName}” database does not exist: FATAL: database "{DevDBName}" does not exist. This is okay, and will be remedied within a few steps. I put this step here because the next step creates the databases, and if the cluster isn’t running, going on to the next step will only complicate things.

However, if you see other errors in postgresql.log, the next step will have a few suggestions to try if the cluster isn’t running.

Fix any issues if the cluster isn’t running.

  • I’ve found that I make the most typos when adding the Generic settings required for BDR (specified above) in the two postgresql.conf files.

  • /var/lib/postgresql/9.4/bdr/{DevEnvName}_node01_port_5600/postgresql.conf
  • /var/lib/postgresql/9.4/bdr/{DevEnvName}_node02_port_5601/postgresql.conf

  • Using htop, stop any postgres servers that are still running.

  • After making your corrections, restart the Postgres databases.

  • pg_ctl -l /var/lib/postgresql/9.4/bdr/{DevEnvName}_node01_port_5600/postgresql.log -D /var/lib/postgresql/9.4/bdr/{DevEnvName}_node01_port_5600 -o "-p 5600" -w restart
  • pg_ctl -l /var/lib/postgresql/9.4/bdr/{DevEnvName}_node02_port_5601/postgresql.log -D /var/lib/postgresql/9.4/bdr/{DevEnvName}_node02_port_5601 -o "-p 5601" -w restart

Create the Databases.

  • createdb {DevDBName} --port 5600 --username postgres
  • createdb {DevDBName} --port 5601 --username postgres

Test the Database on Port 5600.

  • psql --port 5600 --username postgres --dbname {DevDBName}
  • (prompt should be “{DevDBName}=#”)
  • create table test1 (id integer);
  • drop table if exists test1; -- make sure drop and create work ok
  • create table test1 (id integer);
  • insert into test1 (id) values (1);
  • select * from test1;
  • (should retrieve “id / 1”);

Disconnect PGAdmin from Servers

If you are running PGAdmin, be sure to disconnect from the databases on ports 5600 and 5601; otherwise you won’t be able to restart Postgres from the Ubuntu command line. I don’t think you can forcefully disconnect users by running pg_ctl.

Restart Postgres.

I like to make sure I can still restart Postgres.

  • /etc/init.d/postgresql stop # may give you an error that it can’t find .../9.4/main, this is ok
  • pg_ctl -l /var/lib/postgresql/9.4/bdr/{DevEnvName}_node01_port_5600/postgresql.log -D /var/lib/postgresql/9.4/bdr/{DevEnvName}_node01_port_5600 -o "-p 5600" -w restart
  • pg_ctl -l /var/lib/postgresql/9.4/bdr/{DevEnvName}_node02_port_5601/postgresql.log -D /var/lib/postgresql/9.4/bdr/{DevEnvName}_node02_port_5601 -o "-p 5601" -w restart

Verify the Cluster is Running. Yes, Again.

  • cd /var/lib/postgresql/9.4/bdr/{DevEnvName}_node01_port_5600
  • tail postgresql.log --lines 30
  • cd /var/lib/postgresql/9.4/bdr/{DevEnvName}_node02_port_5601
  • tail postgresql.log --lines 30

If you don’t see clear messages that complain about databases missing or replication issues, you should be okay. In other words, if the last few lines start with NOTICE, LOG, or CONTEXT, you’re okay. If they start with ERROR, check your postgresql.conf and pg_hba.conf files.

Create a Table on the Database on Port 5600.

  • psql --port 5600 --username postgres --dbname {DevDBName}
  • (prompt should be “{DevDBName}=#”)
  • select * from test1;
  • (should return “id / 1”);
  • drop table if exists test1; -- should drop the table
  • create table test2 (id integer);
  • insert into test2 (id) values (2);
  • select * from test2;
  • (should return “id / 2”);

Verify the Table Has Been Replicated to the Database on Port 5601.

  • psql --port 5601 --username postgres --dbname {DevDBName}
  • (prompt should be “{DevDBName}=#”)
  • select * from test2;
  • (should return “id / 2”);

Create Roles

BDR cannot replicate create user commands, so you must run these commands on each node.

  • psql --port 5600 --username postgres
  • (prompt should be “postgres=#”)
  • create user {DevDBName} superuser;

  • psql --port 5601 --username postgres
  • (prompt should be “postgres=#”)
  • create user {DevDBName} superuser;
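
As a quick check (my addition, not part of the original steps), connect with psql on each port and confirm the role exists and is a superuser:

select rolname, rolsuper from pg_roles where rolname = '{DevDBName}';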

Create a New Schema in the {DevDBName} Database.

  • psql --port 5600 --username {DevDBName} --dbname {DevDBName}
  • (prompt should be “{DevDBName}=#”)
  • create schema {DevDBName};
  • set search_path = {DevDBName}, public;
  • drop table if exists test1;
  • drop table if exists test2;
  • drop table if exists test3;
  • create table test3 (id integer);
  • insert into test3 (id) values (3);
  • select * from test3;
  • (should return “id / 3”);

Verify the Table Has Been Replicated to the Database on Port 5601.

  • psql --port 5601 --username {DevDBName} --dbname {DevDBName}
  • (prompt should be “{DevDBName}=#”)
  • select * from test3;
  • (should return “id / 3”);
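
For a replication-level check as well, BDR tracks node state in a catalog table. This is a hedged extra: I’m assuming your BDR build exposes bdr.bdr_nodes (the quick-start builds I tried did):

-- Run on either node; in the builds I’ve seen, node_status 'r' means ready.
select * from bdr.bdr_nodes;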

Create Connections via PGAdmin.

Connection #1:

  • Name: {Server_Name_Or_IP}:5600 (u={DevDBName})
  • Host: {Server_Name_Or_IP}
  • Port: 5600
  • Username: {DevDBName}

Connection #2:

  • Name: {Server_Name_Or_IP}:5601 (u={DevDBName})
  • Host: {Server_Name_Or_IP}
  • Port: 5601
  • Username: {DevDBName}

You’re Done, Sort Of!

I don’t know how to get Postgres BDR to launch automatically when the machine boots. I will update this post when I learn how.

Until then, here’s how to start it manually after a reboot:

  • sudo su - postgres
  • export PATH=$HOME/2ndquadrant_bdr/bdr/bin:$PATH
  • pg_ctl -l /var/lib/postgresql/9.4/bdr/{DevEnvName}_node01_port_5600/postgresql.log -D /var/lib/postgresql/9.4/bdr/{DevEnvName}_node01_port_5600 -o "-p 5600" -w restart
  • pg_ctl -l /var/lib/postgresql/9.4/bdr/{DevEnvName}_node02_port_5601/postgresql.log -D /var/lib/postgresql/9.4/bdr/{DevEnvName}_node02_port_5601 -o "-p 5601" -w restart

Once I gain more experience with Postgres BDR, I hope to create another guide to help you explore it further. My hat’s off to 2ndquadrant.com for releasing it!

DB2: Retrieve Non-primary Indexes

I’ve been getting more into DB2 (and Postgres) at my office, so I’ll be posting about them from time to time.

For starters, this DB2 query returns all table names, index names, index columns and included columns for a schema. The column lists are CSV, and specify ascending/descending order where applicable. Hope it helps lots of DB2 admins!

with CTEListNonPrimaryIndexes as (
        SELECT i.tabschema, i.tabname, i.indname
        FROM syscat.indexes AS i
        where
                i.tabschema='{DB2_SCHEMA}' and i.indschema=i.tabschema
                and i.uniquerule <> 'P' -- skip primary keys
        ORDER BY i.tabschema, i.tabname, i.indname
        --fetch first 1000 rows only
)
, CTEIncludeColumns as (
        select
        substr(xmlserialize(xmlagg(xmltext(concat(', ', IC.colname)) order by IC.colseq) as varchar(32000)), length(', ')+1) as csv_columns
        ,C.tabschema
        ,C.tabname
        ,IC.indname
        from syscat.columns as C
        inner join syscat.indexcoluse as IC on C.tabschema=IC.indschema and C.colname=IC.colname
        inner join CTEListNonPrimaryIndexes on C.tabschema=CTEListNonPrimaryIndexes.tabschema and C.tabname=CTEListNonPrimaryIndexes.tabname
                and IC.indname=CTEListNonPrimaryIndexes.indname --and IC.colorder='I'
        where IC.colorder='I'
        group by C.tabschema, C.tabname, IC.indname
)
select CTEListNonPrimaryIndexes.tabschema as "{SCHEMA}"
, CTEListNonPrimaryIndexes.tabname as "{TABLE_NAME}"
, CTEListNonPrimaryIndexes.indname as "{INDEX_NAME}"
, varchar_format(I.LASTUSED, 'MM/DD/YYYY') as "{LAST_USED}"
, (select substr(xmlserialize(xmlagg(xmltext(concat(', ', IC.colname||' '
        ||case when IC.colorder='A' then 'asc' else 'desc' end
        )) order by IC.colseq) as varchar(32000)), length(', ')+1)
        from syscat.columns as C
        inner join syscat.indexcoluse as IC on C.tabschema=IC.indschema and C.colname=IC.colname
        where C.tabschema=CTEListNonPrimaryIndexes.tabschema and C.tabname=CTEListNonPrimaryIndexes.tabname and IC.indname=I.indname
        and IC.colorder in ('A', 'D') -- ascending or descending, exclude include columns. include columns are listed separately.
        ) as "{INDEX_COLUMNS_CSV}"
, case when CTEIncludeColumns.csv_columns is not null then 'include ('||CTEIncludeColumns.csv_columns||')' end as "{INCLUDE_COLUMNS_CSV}"
from CTEListNonPrimaryIndexes
inner join syscat.indexes as I on CTEListNonPrimaryIndexes.tabschema=I.tabschema and CTEListNonPrimaryIndexes.tabname=I.tabname and CTEListNonPrimaryIndexes.tabschema=I.indschema
        and CTEListNonPrimaryIndexes.indname=I.indname
left outer join CTEIncludeColumns on CTEListNonPrimaryIndexes.tabschema=CTEIncludeColumns.tabschema
        and CTEListNonPrimaryIndexes.tabname=CTEIncludeColumns.tabname
        and CTEListNonPrimaryIndexes.indname=CTEIncludeColumns.indname
--where I.LASTUSED >= '2014-08-04' -- retrieve all indexes used after 8/4/2014
--where I.LASTUSED >= current_timestamp - 30 days -- retrieve all indexes used in the last 30 days
order by CTEListNonPrimaryIndexes.tabschema, CTEListNonPrimaryIndexes.tabname, CTEListNonPrimaryIndexes.indname
;

Generate Create Index Code

Have you tried to google for code to retrieve all indexes in a SQL Server database? I’m sure you found a lot of solutions, some better than others.

I’m rewriting the example code for the templating system. The example code copies the AdventureWorks database into a new database, among other things. One of the steps copies index definitions.

Copying all index definitions in AdventureWorks proved to be pretty difficult, much more difficult than I originally thought.

AdventureWorks has a huge variety of index data: clustered, nonclustered, column lists, computed columns, included columns, and xml indexes. Also, the QUOTED_IDENTIFIER setting needs to be set correctly for some types of indexes: it must be ON for indexes on computed columns and for xml indexes (which I didn’t know).

Anyway, even if you don’t use the templating system, I hope people find this code useful.
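
If you want to see the QUOTED_IDENTIFIER behavior for yourself, here is a minimal sketch (my addition; the table is hypothetical). Creating an index on a computed column fails with Msg 1934 unless QUOTED_IDENTIFIER is ON:

set quoted_identifier on;
go
create table dbo.qi_demo (
        id int not null
        ,doubled as (id * 2) -- computed column
);
go
create index ix_qi_demo_doubled on dbo.qi_demo (doubled); -- succeeds
go
set quoted_identifier off;
go
create index ix_qi_demo_doubled2 on dbo.qi_demo (doubled); -- fails with Msg 1934
go

With that out of the way, here’s the full query: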

;with cte_index_column_names as (
SELECT
SCHEMA_NAME(tbl.schema_id) as [schema],
tbl.name as table_name, 
i.name AS index_name,
case when xi.xml_index_type_description is null then i.type_desc else xi.xml_index_type_description end as type_desc,
i.is_primary_key,
case when i.is_unique=1 then 'unique ' else '' end as [unique],
xi2.name as parent_xml_index,
xi.secondary_type_desc as secondary_xml_type_desc,
xi2.xml_index_type_description,
(
        ltrim(stuff((
                        SELECT
                        ', '
                        +quotename(clmns.name)
                        +' '
                        +CASE WHEN sub_ic.is_descending_key = 1 THEN 'desc' ELSE 'asc' END
                from sys.tables as sub_tbl
                inner join sys.indexes as sub_i on sub_i.index_id>0 and sub_i.is_hypothetical=0 and sub_i.[object_id]=sub_tbl.[object_id]
                inner join sys.index_columns as sub_ic on (sub_ic.column_id > 0 and (sub_ic.key_ordinal > 0 or sub_ic.partition_ordinal = 0)) AND (sub_ic.index_id=CAST(sub_i.index_id AS int) AND sub_ic.[object_id]=sub_i.[object_id])
                inner join sys.columns as clmns on clmns.[object_id]=sub_ic.[object_id] and clmns.column_id=sub_ic.column_id
                where sub_i.[object_id]=i.[object_id] and sub_i.index_id=i.index_id
                        and sub_ic.is_included_column=0
                order by sub_ic.key_ordinal
                FOR XML PATH('')
        ), 1, len(', '), ''))
) as csv_index_columns_with_order
,(
        ltrim(stuff((
                        SELECT
                        ', '
                        +quotename(clmns.name)
                from sys.tables as sub_tbl
                inner join sys.indexes as sub_i on sub_i.index_id>0 and sub_i.is_hypothetical=0 and sub_i.[object_id]=sub_tbl.[object_id]
                inner join sys.index_columns as sub_ic on (sub_ic.column_id > 0 and (sub_ic.key_ordinal > 0 or sub_ic.partition_ordinal = 0)) AND (sub_ic.index_id=CAST(sub_i.index_id AS int) AND sub_ic.[object_id]=sub_i.[object_id])
                inner join sys.columns as clmns on clmns.[object_id]=sub_ic.[object_id] and clmns.column_id=sub_ic.column_id
                where sub_i.[object_id]=i.[object_id] and sub_i.index_id=i.index_id
                        and sub_ic.is_included_column=0
                order by sub_ic.key_ordinal
                FOR XML PATH('')
        ), 1, len(', '), ''))
) as csv_index_columns_without_order
,(
        ltrim(stuff((
                        SELECT
                        ', '
                        +quotename(clmns.name)
                from sys.tables as sub_tbl
                inner join sys.indexes as sub_i on sub_i.index_id>0 and sub_i.is_hypothetical=0 and sub_i.[object_id]=sub_tbl.[object_id]
                inner join sys.index_columns as sub_ic on (sub_ic.column_id > 0 and (sub_ic.key_ordinal > 0 or sub_ic.partition_ordinal = 0 or sub_ic.is_included_column != 0)) AND (sub_ic.index_id=CAST(sub_i.index_id AS int) AND sub_ic.[object_id]=sub_i.[object_id])
                inner join sys.columns as clmns on clmns.[object_id]=sub_ic.[object_id] and clmns.column_id=sub_ic.column_id
                where sub_i.[object_id]=i.[object_id] and sub_i.index_id=i.index_id
                        AND sub_ic.is_included_column = 1
                order by sub_ic.key_ordinal
                FOR XML PATH('')
        ), 1, len(', '), ''))
) as csv_include_columns
,(
                select cast(count(*) as bit)
                from sys.columns as sub_c
                inner join sys.tables as sub_t on sub_c.[object_id]=sub_t.[object_id]
                where
                        sub_t.[object_id]=tbl.[object_id]
                        and sub_c.is_computed=1
) as table_has_computed_columns
,i.filter_definition
,indexedpaths.name AS indexed_xml_path_name
FROM sys.tables AS tbl
INNER JOIN sys.indexes AS i ON (i.index_id > 0 and i.is_hypothetical = 0) AND (i.[object_id]=tbl.[object_id])
LEFT OUTER JOIN sys.stats AS s ON s.stats_id = i.index_id AND s.[object_id] = i.[object_id]
LEFT OUTER JOIN sys.key_constraints AS k ON k.parent_object_id = i.[object_id] AND k.unique_index_id = i.index_id
LEFT OUTER JOIN sys.xml_indexes AS xi ON xi.[object_id] = i.[object_id] AND xi.index_id = i.index_id
LEFT OUTER JOIN sys.xml_indexes AS xi2 ON xi2.[object_id] = xi.[object_id] AND xi2.index_id = xi.using_xml_index_id
LEFT OUTER JOIN sys.spatial_indexes AS spi ON i.[object_id] = spi.[object_id] and i.index_id = spi.index_id
LEFT OUTER JOIN sys.spatial_index_tessellations as si ON i.[object_id] = si.[object_id] and i.index_id = si.index_id
LEFT OUTER JOIN sys.data_spaces AS dsi ON dsi.data_space_id = i.data_space_id
LEFT OUTER JOIN sys.tables AS t ON t.[object_id] = i.[object_id]
LEFT OUTER JOIN sys.data_spaces AS dstbl ON dstbl.data_space_id = t.Filestream_data_space_id and i.index_id < 2
LEFT OUTER JOIN sys.filetable_system_defined_objects AS filetableobj ON i.[object_id] = filetableobj.[object_id]
LEFT OUTER JOIN sys.selective_xml_index_paths AS indexedpaths ON xi.[object_id] = indexedpaths.[object_id] AND xi.using_xml_index_id = indexedpaths.index_id AND xi.path_id = indexedpaths.path_id
)
select
        quotename([schema])+'.'+quotename(table_name) as [{TABLE_NAME}]
        ,[schema]+'.'+table_name as [{UNQUOTED_TABLE_NAME}]
        ,quotename(index_name) as [{INDEX_NAME}]
        ,index_name as [{UNQUOTED_INDEX_NAME}]
        ,case
                when table_has_computed_columns=1 or type_desc in ('PRIMARY_XML', 'SECONDARY_XML') then 'set quoted_identifier on' else 'set quoted_identifier off'
        end
        as [{QUOTED_IDENTIFIER_COMMAND}]
        ,case
                when is_primary_key=1 then 'alter table '+quotename([schema])+'.'+quotename(table_name)+' add constraint '+quotename(index_name)+' primary key '+type_desc
                when type_desc='PRIMARY_XML' then 'create primary xml index '+quotename(index_name)+' on '+quotename([schema])+'.'+quotename(table_name)
                when type_desc='SECONDARY_XML' then 'create xml index '+quotename(index_name)+' on '+quotename([schema])+'.'+quotename(table_name)
                else 'create '+[unique]+type_desc collate database_default +' index '+quotename(index_name)+' on '+quotename([schema])+'.'+quotename(table_name)
        end as [{COMMAND_PREFIX}]
        ,case
                when type_desc in ('PRIMARY_XML', 'SECONDARY_XML') then csv_index_columns_without_order
                else csv_index_columns_with_order
        end as [{CSV_INDEX_COLUMNS}]
        ,case
                when type_desc='SECONDARY_XML' then 'using xml index '+quotename(parent_xml_index)+' for '+secondary_xml_type_desc collate database_default
        end as [{SECONDARY_XML_INDEX_USING}]
        ,case when csv_include_columns is not null then 'include ('+csv_include_columns+')' end as [{INCLUDE_CLAUSE}]
        ,case when filter_definition is not null then 'where '+filter_definition end as [{FILTER_DEFINITION}]
FROM
   cte_index_column_names
ORDER BY
   table_name
        ,case when lower(type_desc)='clustered' then 1 else 2 end -- create the clustered index first before nonclustered indexes
        ,case when lower(type_desc)='primary_xml' then 1 else 2 end -- create the primary xml index first before secondary xml indexes
   ,index_name

For each row returned above, replace the value from each column into the template SQL code below:

print 'Creating {UNQUOTED_INDEX_NAME} on {UNQUOTED_TABLE_NAME}'
go

{QUOTED_IDENTIFIER_COMMAND}
go

{COMMAND_PREFIX}
({CSV_INDEX_COLUMNS})
{SECONDARY_XML_INDEX_USING}
{INCLUDE_CLAUSE}
{FILTER_DEFINITION}
go

For example, the first index it returns is PK_Address_AddressID, so the CREATE INDEX command it generates is:

print 'Creating PK_Address_AddressID on Person.Address'
go

set quoted_identifier off
go

alter table [Person].[Address] add constraint [PK_Address_AddressID] primary key CLUSTERED
([AddressID] asc)
go

Enabling Read Committed Snapshot Isolation

Some of my production databases have had issues with blocking. To alleviate this, I enabled Read Committed Snapshot Isolation (RCSI). I thought it would simply be a matter of issuing an alter database command. Even on a development server it was harder than that, because I had to restart the SQL Server service several times before I could set the databases to single-user mode. Production was worse: my client and monitoring applications were CONSTANTLY checking and querying my production server. While the instance and database don’t NEED to be in single-user mode, enabling RCSI takes FAR less time if these two conditions are met.

The first time took me about an hour of stumbling through it, fraught with frustration. Now, it takes me about 20 minutes comfortably. Just for clarity, these steps require downtime; the databases on that instance will not be accessible for most of that time. I’ve now enabled RCSI on several servers, and I wanted to make it easier for you, so here are my notes.

Kendra Little has an excellent blog post on RCSI: http://www.brentozar.com/archive/2013/01/implementing-snapshot-or-read-committed-snapshot-isolation-in-sql-server-a-guide/

Kimberly Tripp created this video about isolation levels, and includes RCSI: http://technet.microsoft.com/en-US/sqlserver/gg545007.aspx

If you have multiple installs (instances) of SQL Server, make REAL sure you know which instance the databases you want to enable RCSI are on!

RCSI increases pressure on tempdb, so I recommend having at least 4 tempdb data files.

  1. Launch SSMS, and determine how many data files you have for tempdb by running this query.
    select * from sys.master_files as M
    inner join sys.databases as D
    on M.database_id=D.database_id
    where D.name='tempdb'
    and M.type_desc='ROWS'

    I recommend at least four data files (mainly because Brent Ozar recommends four to eight). Here’s how to add more data files to tempdb.

  2. In SSMS, navigate to the tempdb properties.

  3. Navigate to the Files page / Add button, and complete all the fields specifying the logical name, size, growth characteristics, and file name for each new tempdb data file. Click OK to create the files. Note: I’ve found a good starting point is to set the size of the new files to the same size as the existing ones. SQL Server uses a mechanism to balance out the size of database files (including tempdb), so I caution you not to make the sizes too different. Also, the larger you set the file sizes, the longer SQL Server will take to create them. (If you prefer to do this with TSQL, see the sketch at the end of this section.)

    Now that you have prepared tempdb, here’s how to enable RCSI.

  4. In SSMS, connect to the instance with databases you want to enable RCSI on.
  5. Run a query to show the status of the databases and RCSI. Keep this query window open. You’ll use it later.
    --- BEGIN TSQL COMMANDS #1
    --- Keep this query window open.  You'll use it later.
    use master
    go
    select name, user_access_desc, state_desc
    , is_read_committed_snapshot_on
    from sys.databases
    --- END TSQL COMMANDS #1
  6. Launch SQL Server Configuration Manager.
    The instance name will be in parentheses. If you have only one instance, its name will be (MSSQLSERVER). This is called the default instance. If it has a different name, it’s called a named instance.
  7. On the left, click SQL Server services.
  8. On the right, note all services that are running. You will stop, then re-start, all SQL Server services except one. To be honest, I stop as many services as I can only to reduce the number of variables in case I run into a problem, not because I’ve performed detailed analysis to see which services I could leave running.
  9. Stop all services (including SQL Server Agent) except the SQL Server service. To stop a service:
    Right-click the service / click Stop.

  10. Right-click the SQL Server service (not the SQL Server Agent service) on the instance your databases are on / click Properties.

  11. From the Properties window, prepare to set the SQL Server service to start in single-user mode (“-m”).
    For SQL Server 2008:
    Advanced tab / select Startup Parameters / append “;-m” (without the quotes) / Enter / Ok / Ok

    For SQL Server 2012 and later:
    Startup Parameters tab / set Specify a startup parameter to “-m” (without the quotes) / Add button / Ok / Ok

  12. Restart the SQL Server service.

  13. Return to the SSMS window you opened earlier (TSQL COMMANDS #1), and execute those commands until you retrieve results. You may need to try several times: restarting the SQL Server service disconnects all connections, and SSMS detects the disconnection and attempts to re-connect, which should make this SSMS window the only connection allowed. If you are unable to connect to the instance after more than 5 attempts, it may be because the single connection is already taken. If that’s the case, restart the SQL Server service again, and try running the query at least 5 more times.
  14. Issue these TSQL commands (TSQL COMMANDS #2). They may take several minutes to run. In one case, I had about 500G of data between two databases, and these commands took about 2 minutes to finish.

    ---- BEGIN TSQL COMMANDS #2
    ---- Replace DATABASE_1 and DATABASE_2 with the databases
    ---- you wish to enable RCSI on.
    alter database DATABASE_1 set single_user with rollback immediate
    go
    alter database DATABASE_1 set read_committed_snapshot on
    go
    alter database DATABASE_1 set multi_user
    go
    alter database DATABASE_2 set single_user with rollback immediate
    go
    alter database DATABASE_2 set read_committed_snapshot on
    go
    alter database DATABASE_2 set multi_user
    go
    ---- END TSQL COMMANDS #2

  15. Now, run TSQL COMMANDS #1 (NOT TSQL COMMANDS #2!) again to confirm your databases have RCSI enabled by checking that is_read_committed_snapshot_on=1.
    In my case, I enabled RCSI on only two databases.

    At this point, I figure you can perform the rest of the steps on your own.

  16. Prepare to set SQL Server service back to multi-user mode by removing the “-m” parameter.
  17. Restart the SQL Server service.
  18. Confirm the instance and databases are still online by running TSQL COMMANDS #1 (NOT TSQL COMMANDS #2!).
  19. Start the other SQL Server services you stopped earlier.
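
Here is the TSQL sketch mentioned in step 3 (my addition): the equivalent of adding tempdb data files through the SSMS GUI. The logical names, path, and sizes below are assumptions; match them to your existing tempdb files.

alter database tempdb add file (
        name = 'tempdev2' -- logical name (an assumption)
        ,filename = 'E:\SQLData\tempdb2.ndf' -- path (an assumption)
        ,size = 1024MB
        ,filegrowth = 256MB
)
go
alter database tempdb add file (
        name = 'tempdev3'
        ,filename = 'E:\SQLData\tempdb3.ndf'
        ,size = 1024MB
        ,filegrowth = 256MB
)
go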

List Primary Keys With CSV Column Names

Part of a project I worked on required that I delete then re-create all primary keys in a database. (Long story short, the project was to change datatypes on some existing primary key columns.) Not trivial. I googled around, and found a few very nice queries that got me pretty far down the road, specifically ones using the for xml path('') clause.

After several days, to my increasing frustration, several of my primary keys (out of about 3000) weren’t being created identically to the originals. If memory serves, if a table had both a clustered primary key and a unique key, the generated column list included both the primary key columns and the unique key columns.

Several days and handfuls of hair later, here’s what I have. If it saves you time, give me an endorphin rush by adding a comment!

select
        quotename(schema_name(TBL_OUTSIDE.[schema_id])) as [schema]
        ,quotename(TBL_OUTSIDE.name) as [table_name]
        ,quotename(I_OUTSIDE.name) as [pk_name]
        ,I_OUTSIDE.type_desc as [clustered_or_nonclustered]
        ,ltrim(stuff((
                select
                                        ', '
                                        +quotename(col_name(K_1.parent_object_id, IC_1.column_id))
                                        +' '
                                        +case when IC_1.is_descending_key=1 then 'desc' else 'asc' end
                from sys.tables AS TBL_1
                inner join sys.indexes AS I_1 ON I_1.index_id > 0 and I_1.is_hypothetical = 0 AND I_1.[object_id]=TBL_1.[object_id]
                inner join sys.index_columns AS IC_1 ON
                        IC_1.column_id > 0
                        and (IC_1.key_ordinal > 0 or IC_1.partition_ordinal = 0 or IC_1.is_included_column <> 0)
                        and IC_1.index_id=cast(I_1.index_id AS int)
                        and IC_1.[object_id]=I_1.[object_id]
                inner join sys.columns AS CLMNS_1 ON CLMNS_1.[object_id] = IC_1.[object_id] and CLMNS_1.column_id = IC_1.column_id
                left outer join sys.key_constraints AS K_1 ON K_1.parent_object_id = I_1.[object_id] and K_1.unique_index_id = I_1.index_id
                where
                        schema_name(TBL_OUTSIDE.[schema_id])=schema_name(TBL_1.[schema_id])
                        and I_OUTSIDE.name=I_1.name
                        and TBL_OUTSIDE.name=TBL_1.name
                order by IC_1.key_ordinal
                for xml path('')
        ), 1, len(', '), '')) as [csv_columns]
--        ,I_OUTSIDE.*
from sys.tables AS TBL_OUTSIDE
inner join sys.indexes AS I_OUTSIDE ON I_OUTSIDE.index_id > 0 and I_OUTSIDE.is_hypothetical = 0 AND I_OUTSIDE.[object_id]=TBL_OUTSIDE.[object_id]
where TBL_OUTSIDE.type_desc='USER_TABLE'
and TBL_OUTSIDE.name not in ('dtproperties','sysdiagrams')  -- list true user tables only
and I_OUTSIDE.is_primary_key=1
order by
        schema_name(TBL_OUTSIDE.[schema_id])
        ,TBL_OUTSIDE.name
        ,I_OUTSIDE.name

Showing When SQL Server Backups and Restores Will Finish

Note: Since writing this entry, I’ve learned more details about tracking backup and restore progress. I’ll keep updating the query as I learn more.

It’s nice to know how long something will take. “When will the restore (or backup) be finished?” is a reasonable question for your customer or manager to ask, so you want to give them an answer. And then add 15 minutes, so you can run a quick restore headeronly or inspect some tables for critical data. And then add another 15 minutes, so you can look good when you finish early. Oh, snap! My secret’s out!

If you’re using SQL Server 2005, 2008, 2008 R2, 2012 or later, you’re in luck. The sys.dm_exec_requests DMV has a column, estimated_completion_time, that estimates how many milliseconds remain before your backup or restore completes. I don’t know how SQL Server calculates the estimate, but it’s nice that it’s there! This query shows the estimated completion time to the minute. At the bottom of this post, there’s a screenshot of a portion of the query’s output. (And no, the query does not add an extra 30 minutes!)

Start your backup or restore process first, then run the query before it finishes.
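
If you need something to watch, here is a hedged example to kick off in a separate session first (my addition; the database name and backup path are assumptions):

backup database AdventureWorks
to disk = 'E:\Backups\AdventureWorks.bak'
with stats = 5 -- also prints progress to the Messages tab every 5 percent
go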

Note that during a restore, the first step is to zero (i.e., initialize) the destination files. I know of no way to track the zeroing process. So, all I can show is “Zeroing destination files.”

Disclaimer: I’ve only tested the query against 2008 and later instances. I haven’t restored a 2005 database in a while.

select top 1
S.session_id
,R.command
,case
        when lower(R.command) like 'restore%' then 'Restoring'
        when lower(R.command) like 'backup%' then 'Backing Up'
        else 'Unknown'
end as [backingup_or_restoring]
,case
        when R.percent_complete>0 and R.percent_complete<100 then
                case
                        when lower(R.command) like 'restore%' then 'Restoring'
                        when lower(R.command) like 'backup%' then 'Backing Up'
                        else 'Unknown'
                end
        when R.percent_complete=100 then 'Running upgrade steps'
        else 'Zeroing destination files'
end as [state]
,R.percent_complete
--,getdate() as [current_time]
,case
        when R.percent_complete>0 then cast(cast(dateadd(millisecond, R.estimated_completion_time, getdate()) as date) as varchar(12))
        +' '+right('0'+cast(datepart(hour, dateadd(millisecond, R.estimated_completion_time, getdate())) as varchar(2)), 2)
        +':'+right('0'+cast(datepart(minute, dateadd(millisecond, R.estimated_completion_time, getdate())) as varchar(2)), 2)
end
as completion_datetime_to_minute
,case when (R.estimated_completion_time / (1000*60*60)) % 24>0 then
cast((R.estimated_completion_time / (1000*60*60)) % 24 as varchar(10))+' hr ' else '' end
+case when (R.estimated_completion_time / (1000*60*60)) % 24>0 or R.estimated_completion_time / (1000*60) % 60>0 then
cast(R.estimated_completion_time / (1000*60) % 60 as varchar(2))+' min ' else '' end
+case when R.estimated_completion_time <= 10 *60*1000 then cast(R.estimated_completion_time / 1000 % 60 as varchar(2))+' sec' else ''
end as completion_time_flex
,case when (R.estimated_completion_time / (1000*60*60)) % 24>0 then
cast((R.estimated_completion_time / (1000*60*60)) % 24 as varchar(10))+' hr ' else '' end
+case when (R.estimated_completion_time / (1000*60*60)) % 24>0 or R.estimated_completion_time / (1000*60) % 60>0 then
cast(R.estimated_completion_time / (1000*60) % 60 as varchar(2))+' min ' else '' end
+cast(R.estimated_completion_time / 1000 % 60 as varchar(2))+' sec'
as completion_time_hms
,cast(R.percent_complete as integer) as percent_complete_int
,cast(R.percent_complete as numeric(20, 1)) as percent_complete_one_decimal_place
,S.[program_name]
,D.name as database_name
,BS.server_name
,BMF.physical_device_name as backup_filename
,BS.backup_size as backup_file_size_bytes
,T.[text] as sql_command
from sys.dm_exec_sessions as S
inner join sys.dm_exec_requests as R on S.session_id=R.session_id
cross apply sys.dm_exec_sql_text(R.[sql_handle]) as T
inner join sys.databases as D on R.database_id=D.database_id
left outer join msdb.dbo.backupset as BS on D.name=BS.database_name
left outer join msdb.dbo.backupmediafamily as BMF on BS.media_set_id=BMF.media_set_id
where (lower(R.command) like '%restore %' or lower(R.command) like '%backup %')
order by BS.backup_start_date desc

[Screenshot: a portion of the query’s output]

Setting up a FileTable with full text search in SQL Server 2012

With FileTables, you can manipulate (add, delete, change) files and directories with DML commands (insert, delete, update), or via the filesystem. IOW, you can add, delete, and update files using TSQL, or a Windows Explorer window, or programmatically! Coolio!

With SQL Server’s Full Text Search, you can search through text using TSQL commands.

With some iFilters (which I will explain), SQL Server will automagically parse your Word, Excel, Powerpoint (2003 and later), and PDF documents into a fulltext index.

Combining them means that if your application needs to store and search documents, these features reduce the hoops you have to jump through to set up and maintain searchable documents.

In this post, I will describe how you can set up a FileTable and perform fulltext searches using SQL Server 2012. After some false starts, reading through blogs, and watching videos, here’s what I have learned, along with some extras.

I will show you how to set up a FileTable so you can manipulate (add, delete, change) files and directories from the filesystem or from TSQL.

Next, I will show you how to install some free utilities (from Microsoft and Adobe) called iFilters which extract text from Office (2003 and later) and PDF documents.

Finally, I will show you some TSQL statements you can use to search through those documents.

FileTable is a feature in SQL Server 2012. It is built on top of the FileStream feature, which first became available in SQL Server 2008.

Here We Go

Enable Filestream

Launch SQL Server Configuration Manager / SQL Server Services
/ right click the SQL Server instance you want to enable FileTable and Filestream on
/ Properties / Filestream tab
/ Enable Filestream for TSQL access
/ Enable Filestream for I/O access
/ Enter a Windows share name, like “mssqlserver”
/ OK

Right-click the SQL Server instance you want to enable FileTable and Filestream on
/ Restart

Note: if you use Windows Explorer to navigate to the server’s root (“\\servername”), you will see a share called “mssqlserver”.

Next, enable filestream via SSMS.

/*
Enable filestream for TSQL access.
"exec sp_configure 'filestream_access_level', 0" disables filestream.
"exec sp_configure 'filestream_access_level', 1" enables filestream for TSQL access
"exec sp_configure 'filestream_access_level', 2" enables filestream for TSQL and file i/o access.
This is an instance-wide setting.
*/
exec sp_configure 'filestream_access_level', 2
go
reconfigure
go


Verify that filestream is enabled.

exec sp_configure 'filestream_access_level'
go


Are enabling filestream using SQL Server Configuration Manager and enabling filestream using sp_configure redundant?
I don’t think they are redundant. After some testing, it looks like you have to enable filestream via SQL Server Configuration Manager first, then issue the sp_configure command second.
So, to completely disable filestream, you’ll need to disable it via SQL Server Configuration Manager first, then issue the sp_configure command second. Note that if you disable it either way, then restart the SQL Server service, you’ll likely lose all filestream and filetable data AND FILETABLE FILES.

Create FileTable Directory For Database

Launch Windows explorer
Create a directory called e:\sqlfiletables
Create a share called sqlfiletables:
Right-click e:\sqlfiletables / properties
/ Sharing tab / Advanced Sharing button
/ Enable Share this folder
/ OK / Close

Use Windows Explorer to navigate to the server’s root (“\\servername”); you will see a share called “sqlfiletables”.

Optional: Drop and Re-Create Test Database

/*
This is just clean-up code.
You don't need to drop and re-create the database just to set up filetables.
*/
use master
go
begin try
        alter database FileTableExampleDB set single_user with rollback immediate
end try
begin catch
end catch
begin try
        drop database FileTableExampleDB
end try
begin catch
end catch
go

create database FileTableExampleDB
go

Enable Filestream For This Database

/*
(Note: "CRUD" is an acronym for "Create / Read / Update / Delete".
I define it here because I'll use it several times in this TSQL comment.)
"non_transacted_access=off": enable CRUD access to filetables via TSQL statements only.
Disables access using the filesystem. This is essentially the same as filestream.
"non_transacted_access=readonly": enable CRUD access to filetables via TSQL.
Enables read-only access using the filesystem.
"non_transacted_access=full": enable CRUD access to filetables via TSQL
and the filesystem.
*/
alter database FileTableExampleDB
        set filestream (
                non_transacted_access = full
                ,directory_name = 'sqlfiletables'
        )
go

Verify the above alter database command worked.
In SSMS, right-click the FileTableExampleDB db / properties / options / locate the Filestream section.
Note that FILESTREAM directory name is “sqlfiletables” and that Non-Transacted access is “full”.
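
You can also verify it with TSQL (my addition; sys.database_filestream_options is a standard catalog view):

select db_name(database_id) as database_name
        ,non_transacted_access_desc
        ,directory_name
from sys.database_filestream_options
where database_id = db_id('FileTableExampleDB')
go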

Add A Filegroup For Filestream Data

alter database FileTableExampleDB
        add filegroup FileTableExampleDB_FileGroup contains filestream
go

Verify the filegroup was created.
In SSMS, right-click the FileTableExampleDB db / properties / filegroups.
Note that FILESTREAM has an entry called “FileTableExampleDB_FileGroup” with 0 files. We will change this “0 files” to “1 file” soon.

Add a file to the new filegroup.

alter database FileTableExampleDB
        add file (
                name='FileTableExampleDB_File'
                ,filename='E:\sqlfiletables\FileTableExampleDB_Folder'
        ) to filegroup FileTableExampleDB_FileGroup
go

Verify that the filetable directory exists.
In Windows Explorer, navigate to e:\sqlfiletables. You’ll see that there’s a new folder called “FileTableExampleDB_Folder”, which contains a directory ($FSLOG) and a file (filestream.hdr).
The “alter database / add file / to filegroup” command above created that folder.

Verify that the file is referenced in the database’s filegroup using SSMS.
In SSMS, right-click the FileTableExampleDB db / properties / filegroups.
Note that the FILESTREAM entry “FileTableExampleDB_FileGroup” changed; it used to be 0, now it’s 1.

Verify that the file is referenced in the database’s filegroup using TSQL.
Note that you’ll see the database’s files listed as rows, including the data (ROWS), log, and filestream files.

use FileTableExampleDB
go
select * from sys.database_files
go


Create A Table For FileTable Data

use FileTableExampleDB
go
create table FileTableExample as filetable
        with (
                filetable_directory='FileTableExample_directory'
                ,filetable_collate_filename=database_default
        )
go

In SSMS, navigate to the FileTableExampleDB db / Tables / FileTables.
Note the new FileTableExample table.

In Windows Explorer, navigate to E:\sqlfiletables\FileTableExampleDB_Folder.
Note that you’ll see a new directory named with a GUID.

In Windows Explorer, navigate to \\{server-name}\mssqlserver\sqlfiletables\FileTableExample_directory
Example: \\localdesktop\mssqlserver\sqlfiletables\FileTableExample_directory
You will not be able to navigate to this directory from the drive letter, only from the share name.
Create a new file, say, called “goober.txt”, and type something into the file.

Query the filetable.
I do not recommend querying the file_stream column. The file_stream column contains the file’s content, and may take a while to return data, especially if you have large files.

select stream_id, name, path_locator, parent_path_locator, file_type, cached_file_size, is_directory from FileTableExample

The query shows a record with the name “goober.txt”.

Try copying some files and directories, and querying FileTableExample again.
Note the is_directory and parent_path_locator columns.

select stream_id, name, path_locator, parent_path_locator, file_type, cached_file_size, is_directory from FileTableExample
go


Delete a file.

delete from FileTableExample where name='goober.txt'
go
select stream_id, name, path_locator, parent_path_locator, file_type, cached_file_size, is_directory, is_offline, is_readonly, is_system, is_temporary
from FileTableExample

Note: setting a file to read-only does not prevent us from deleting the record from SQL or the file from the filesystem.
Also, deleting the record does not move the file to the recycle bin.

In Windows Explorer, navigate to
\\localdesktop\mssqlserver\sqlfiletables\FileTableExample_directory. goober.txt will disappear.
(You may need to refresh the directory)
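
As an aside, FileTable exposes the file attributes as updatable columns, so you can flag a file read-only from TSQL. A small sketch (my addition; the filename is hypothetical):

update FileTableExample
set is_readonly = 1
where name = 'some-document.pdf'
go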

I think copying files to the \\localdesktop\mssqlserver\sqlfiletables\FileTableExample_directory\ issues an exclusive lock on the FileTable until the files are copied and the filetable is synced.

Set Up The Ability To Search Through Document Text

Launch SQL Server’s installation utility.


Install the SQL Server Fulltext Search feature (“full-text and semantic extractions for search”).

By default, the fulltext filter service is set to start manually, and you can’t set it to start automatically during the install. You’ll have to change that after the installer finishes.

After the SQL Server installer finishes, set the fulltext filter service to log on as “Local system”.
Launch SQL Server Configuration Manager / SQL Server Services
/ Right-click SQL Full-text Filter Daemon Launcher / Properties / Log On tab
/ Enable Built-in account / set to “Local System”
/ Click Apply / Click Start (or Restart) button / OK

Set the fulltext filter service to start automatically:
Launch SQL Server Configuration Manager / SQL Server Services
/ Right-click SQL Full-text Filter Daemon Launcher / Properties / Service tab
/ Set Start Mode to Automatic / OK

Download and install MS Filter Pack 2.0:
http://www.microsoft.com/en-us/download/details.aspx?id=17062
It doesn’t provide any options you can configure.

Download and install MS Filter Pack 2.0 Service Pack 1:
http://support.microsoft.com/kb/2460041
It doesn’t provide any options you can configure.

Download Adobe Acrobat iFilter:
x64: http://www.adobe.com/support/downloads/detail.jsp?ftpID=4025
(My only option was to use the 64-bit download. I couldn’t find one for 32-bit systems, and I have a 64-bit system.)

Extract the installer from the zip file, and start the installer.

Take note of where you installed it.
I installed it to “E:\Program Files\Adobe\Adobe PDF iFilter 9 for 64-bit platforms”.

Set the Path to where you installed it, and append the “bin” directory:
In Windows / Control panel / System and security / System / Advanced system settings / System properties dialog / Advanced tab / Environment variables / System variables (NOT User variables) / Variable = Path / Edit button
In the Variable value, append the path where you installed the Adobe iFilter:
“;E:\Program Files\Adobe\Adobe PDF iFilter 9 for 64-bit platforms\bin”

Click OK until you close all of the dialog boxes (3 times, I think).

Run a command prompt, type “path”, and make sure the above directory is at the end of the path.

Restart the SQL Server service:
Right-click the SQL Server instance you want to enable FileTable and Filestream on / Restart

In case you don’t think restarting the SQL Server service is necessary:
I saw that .pdf documents were listed in sys.fulltext_document_types, but no PDF documents appeared when I queried the fulltext index for words that were in the PDF documents. After I restarted the SQL Server service, they did.

Issue the following commands to configure full-text services for office and pdf documents.

exec sp_fulltext_service @action='load_os_resources', @value=1; -- update os resources
go
exec sp_fulltext_service 'verify_signature', 0 -- don't verify signatures
go
exec sp_fulltext_service 'update_languages'; -- update language list
go
exec sp_fulltext_service 'restart_all_fdhosts'; -- restart daemon
go
reconfigure with override
go

Show which PDF and Office document types can be indexed.

select * from sys.fulltext_document_types where
        document_type like '.do%'
        or document_type like '.xls%'
        or document_type like '.pp%'
        or document_type='.pdf'
order by document_type
go

PDF documents should be in the list. In my environment, about 95% of all our documents are PDF, so this was very important to me.

At some point around this step, if you’re like me, you’ll try to open a text file or a bitmap file, and you’ll get a message something like this: “The request is not supported.”

According to Microsoft, this error occurs because some applications (like notepad.exe and paint.exe) use memory-mapped files, a win32 feature that is not supported by filetables:
http://msdn.microsoft.com/en-us/library/ff929144.aspx

To be perfectly honest, I do not understand exactly what this means.

Some posts say you can open files if you map the filetable share as a network drive. I have been unsuccessful in doing so. The only workaround I’ve found is to copy or move the file you want to open to a different directory, say, the desktop, then open it from there. Once you’re finished, copy the file back to the filetable directory.

Optional: Drop And Re-Create A Fulltext Catalog

This will let you easily search for text within files using TSQL: Word, PDF, Excel, Powerpoint, etc.

If you have not created a fulltext catalog against the database, you’ll need to create one.

use FileTableExampleDB
go
begin try
        drop fulltext catalog MyFulltextCatalog
end try
begin catch
end catch
begin try
        create fulltext catalog MyFulltextCatalog as default
end try
begin catch
end catch
go

In SSMS, navigate to FileTableExampleDB db / Storage / Full Text Catalogs
You’ll see MyFulltextCatalog.
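
Or check it with TSQL (my addition; sys.fulltext_catalogs is a standard view):

select name, is_default from sys.fulltext_catalogs
go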

In SSMS, navigate to FileTableExampleDB db / Tables / FileTables / FileTableExample / Indexes.
Locate the primary key index name (“PK__”) and enter it after the “key index” clause below.
(Click the primary key index name once to highlight / ctrl-c.)
In my case, it was called PK__FileTabl__5A5B77D543540781.
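
If you’d rather not click through SSMS, this query (my addition) retrieves the primary key index name directly:

select i.name as pk_index_name
from sys.indexes as i
where i.[object_id] = object_id('dbo.FileTableExample')
        and i.is_primary_key = 1
go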

Create the fulltext index, and specify the key index.

create fulltext index on FileTableExample (
        name language 1033
        ,file_stream type column file_type language 1033
) key index PK__FileTabl__5A5B77D543540781 -- must be the name of the primary key!
on MyFullTextCatalog
with
        change_tracking auto -- update the index as the base table changes, IOW, update the index as you add, delete, and modify documents
        ,stoplist=system
go
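
To check whether the initial population has finished, you can poll the catalog property (my addition; fulltextcatalogproperty is a built-in function, and 0 means idle):

select fulltextcatalogproperty('MyFulltextCatalog', 'PopulateStatus') as populate_status
go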

Now you can search against text in files.
Note that “GetFileNamespacePath” is case-sensitive.

select
        stream_id
        ,name
        ,file_stream.GetFileNamespacePath(0, 1) as relative_file_location
        ,file_stream.GetFileNamespacePath(1, 1) as absolute_file_location
        ,file_type
        ,cached_file_size
        ,is_directory
--      ,*
from FileTableExample
where
        contains(file_stream, 'test')
        or contains(file_stream, 'capitol')
        or contains(file_stream, 'hello')
--      or contains(file_stream, 'near(Municipal, Courts)') -- document contains the word "municipal" near the word "courts"
go

Setting “change_tracking auto” should be okay when only a small number of documents are changed or added. However, in a production environment, you may want to control when fulltext indexing starts, and when it is scheduled.

To prevent SQL from automatically updating the index when you add new documents:

alter fulltext index on FileTableExample set change_tracking manual
go

Trigger a fulltext index update.

use FileTableExampleDB
go
alter fulltext index on FileTableExample start update population
go

Trigger a complete rebuild of the fulltext index.

use FileTableExampleDB
go
alter fulltext index on FileTableExample start full population
go