Thursday, January 28, 2010

Random Sentence Generator

I’ve always been fascinated with language and linguistics… specifically grammar and sentence structure. I was one of those nerds in high school who actually enjoyed diagramming sentences.

Sentence Diagram

For kicks, about 15 years ago, I wrote a self-contained program in Visual FoxPro (VFP) to generate random sentences, but I lost track of it… it’s probably on an old computer gathering dust in the garage.

But a year or so ago, I Googled “random sentence generator” and came upon this website, which describes an old assignment in a Computer Science class at Stanford University.

The assignment was to generate random sentences based on a text file consisting of a grammar. This grammar is a template describing the various combinations of words to form the sentences.

For example, here is a very simple grammar file (Don’t pay any attention to the T-SQL /*…*/ comment delimiters… they’re only there so that websites that syndicate this blog render the text below in a consistent manner):

/*
{<start> <pronoun> <verb>;}
{<pronoun> I;you;he;she;it;we;they;}
{<verb> ran;played;drank <liquid>;}
{<liquid> water;whiskey;Pepsi;}
*/
This grammar file consists of 4 elements, whose definitions are between curly brackets ({}). Each element is defined by a list of possible expansions, each ending with a semicolon (;). These expansions can be individual words or phrases, or they can be (or contain) other elements, which will need to be further expanded.

Beginning with the <start> element, you generate a random sentence, continuously expanding the elements until there are no more to expand, as illustrated by the following steps:

/*
Start with <start>: <start>
Expand <start>: <pronoun> <verb>
Expand <pronoun>: they <verb>
Expand <verb>: they drank <liquid>
Expand <liquid>: they drank whiskey
*/
This grammar file above, though very very simple, is capable of generating 35 different sentences.

There are various grammar files available at this location. Many of them were written for fun by students over the years. If you wish to download any of them, just right-click and choose Save Target As… from the context menu.

For example, there’s a grammar file for generating a random Star Trek episode treatment, like so:

While ferrying a Drysilic ambassador to the nearest J.C. Penny outlet, Wesley Crusher is suddenly swallowed into a time vortex connecting to the year 325. In a flash of insight, someone decides that the best solution is by using Guinan’s hat as a weapon. And thus, the Enterprise continues undaunted in its mission.

And there’s one for generating a random James Bond film synopsis (I inserted a [sic] next to the misspellings… honestly, you’d think Stanford students would spell better!):

A demented Libyan terrorist plots to lower England’s standard of living by wrecking the global economy. Bond tracks his nemesis to a palacial [sic] estate in France and they play cards (cheating shamelessly). Within minutes, Bond meets a stunning CIA agent, whom he thinks is cute but too young. Afterwards, Bond is siezed [sic] by commandos and fed to pirannahs [sic]. Incredibly, he escapes by driving a speedboat over a waterfall. Finally, with only 007 seconds to spare, Bond causes SMERSH headquarters to self-destruct.

And there’s one for generating insults:

You cantankerous bucket of defective lizard scabs.

May a gruesome and sweaty group of South American killer bees gleefully vomit in the toxic dump you call home.

Anyway, once I discovered this website a year ago, I wrote a brand new Random Sentence Generator in VFP, using this grammar file template concept. And I had a little fun generating random insults and random Trek/Bond synopses and other things.

But I didn’t stop there. I put together a grammar file to generate sentences of my own. Over many weeks, I kept adding to it, making it very extensive, filled with dependent clauses, modal auxiliary verbs, conjunctive adverbs, and a host of other complex sentence constructions. I even amended it to handle past and future tense and plurals, which meant I had to handle irregular formations like sit/sat and man/men.

I can pretty much guarantee that you will never get the same sentence generated twice from my grammar file. I haven’t done the math (I’m not really sure how to), but it must be capable of generating literally quadrillions upon quadrillions of different sentences, from the very simple…

Michael sobbed.

…to the moderately complex…

Last week my daughter was upset to discover that all the gophers wanted to gracefully gulp down 42 gallons of armadillo soup.

…to the very complex…

This group of supermodels will become very sumptuous, moderately rich, and positively sleazy next Tuesday, during the time that my sister-in-law’s daughter will look decidedly tough and allegedly creepy, and next Monday, even though a stormtrooper and that surprisingly murderous nurse will haul peanuts across a terribly horrendous ocean, very heavyhearted kangaroos will move out of the pale blue forest.

So, now cut to present day… I recently wrote the same Random Sentence Generator in T-SQL, and it’s available here on my SkyDrive (along with my grammar file and a few of the various others mentioned above) to anyone who wants to fool around with it (in SQL2005 or later).

In order to use it, you must first create a table called, appropriately, Grammar:

use TempDB  /* Change to whatever database you wish */
go
if object_id('Grammar','U') is not null drop table Grammar
go
create table Grammar
(
Descript nvarchar(20)
,Element varchar(40)
,Expansion varchar(200)
)
go
create clustered index Ix_Grammar on Grammar (Descript,Element)
go
The table will house any number of grammar file definitions that you wish to load in. I’ve written a stored procedure called LoadGrammarFile to do this. The code below loads the various grammar files mentioned above (the Nonsense file is the one that I put together):

exec LoadGrammarFile 'StarTrek' , 'c:\grammar\trek.txt'
exec LoadGrammarFile 'Bond' , 'c:\grammar\bond.txt'
exec LoadGrammarFile 'Insult' , 'c:\grammar\insult.txt'
exec LoadGrammarFile 'Nonsense' , 'c:\grammar\nonsense.txt'
So, for example, the last line above loads a grammar with the description of Nonsense from the file c:\grammar\nonsense.txt.

So now that we have our table loaded with our various grammars, we can generate sentences to our heart’s content. The stored procedure called GenerateSentence will do just that, based on the grammar we desire:

exec GenerateSentence 'Nonsense'
/*
Sentence
---------------------------------
Ronald tried to escape from Peru.
*/
The GenerateSentence procedure also accepts a couple of optional arguments. For example, if you know your grammar file well, you can indicate that you want the sentence generation to start with a different element from the <start> element, like so:

exec GenerateSentence 'Nonsense', @Init='<confuciusoption>'
/*
Sentence
-----------------------------------------------------------
Confucius say, "Pessimism is like cabbage, only more sexy."
*/


exec GenerateSentence 'Nonsense', @Init='<bookmovieoption>'
/*
Sentence
-------------------------------------------------------------------------------------
Larry Coolidge's wife checked out a film entitled "Cancer And The Kumquats in Tokyo".
*/


exec GenerateSentence 'Nonsense', @Init='<liquidobject>'
/*
Sentence
-----------------------------
A couple of tomato milkshakes
*/

You can also see how the sentence is generated step-by-step by passing @Debug=1, and that will PRINT each step in the Text Messages window in SSMS:

exec GenerateSentence 'Nonsense', @Init='<liquidobject>', @Debug=1
/*
<liquidobject>
<quantity> <liquidcontainer>s of <liquid>
<digitmorethan0><digit> <liquidcontainer>s of <liquid>
6<digit> <liquidcontainer>s of <liquid>
62 <liquidcontainer>s of <liquid>
62 bowls of <liquid>
62 bowls of <fruit> tea
62 bowls of cantaloupe tea

Sentence
--------------------------
62 bowls of cantaloupe tea
*/
Finally, you can also specify how many sentences to generate by passing a @Quantity value (only a maximum of 100 sentences will be generated):

exec GenerateSentence 'Nonsense', @Init='<liquidobject>', @Quantity=10
/*
Sentence
---------------------------------
448 glasses of Clorox
An almond milkshake
16 buckets of NyQuil
Some blueberry-scented perfume
A few vats of Tabasco sauce
3 gallons of snot
A couple of bowls of apricot soda
Some vomit
846 tanks of antelope soup
7 vials of artichoke oil
*/
The code for the LoadGrammarFile and GenerateSentence procedures is below. Note that they make use of many of the string UDFs that I discussed in my last blog entry.

So have some fun and download it and play with it… Make it part of your daily routine… a “thought for the day” to get you going in the morning.

If you generate any wildly funny or especially profound sentences, I’d love to hear them.

/*----------------------------------------------------------------------------------*/

if object_id('LoadGrammarFile','P') is not null drop procedure LoadGrammarFile
go
create procedure LoadGrammarFile
@GrammarDescript
nvarchar(40)
,@FileName nvarchar(max)
as
begin

set nocount on

declare @SqlCommand nvarchar(max)
,@GrammarDef varchar(max)
,@ElementDef varchar(max)
,@ExpansionList varchar(max)
,@Element varchar(40)
,@Expansion varchar(max)
,@i int
,@j int
/*
Use Dynamic SQL to load the contents of the file into a string
*/
set @SqlCommand=
N'set @GrammarDef=(select *
from openrowset(bulk '''
+@FileName+N''', single_clob) x)'
exec sp_executesql
@SqlCommand
,N'@GrammarDef varchar(max) output'
,@GrammarDef output

/*
Take care of odd stuff seen in various Stanford grammar files
*/
set @GrammarDef=replace(@GrammarDef,'> s ','>s ') /* Plurals */
set @GrammarDef=replace(@GrammarDef,'> ''s ','>''s ') /* Possessives */
set @GrammarDef=replace(@GrammarDef,'> "''s" ','>''s ') /* Possessives */

/*
Clear out any previous contents for the
desired Grammar description from the table
*/
delete Grammar where Descript=@GrammarDescript

/*
Parse through the grammar definition until done
*/
set @i=0
while 1=1
begin
set @i=@i+1
/*
Find the next Element Definition between {} delimiters
and convert any CR or LF or TAB characters to spaces
*/
set @ElementDef=dbo.StrExtract(@GrammarDef,'{','}',@i,0)
set @ElementDef=ltrim(rtrim(dbo.ChrTran(@ElementDef
,char(13)+char(10)+char(9)
,' ')))

if @ElementDef='' break /* We're done! */

/*
Get the name of the Element within the Definition
and get the list of Expansions for that Element
(in other words, the rest of the Definition contents)
*/
set @Element=dbo.StrExtract(@ElementDef,'<','>',1,4)
set @ExpansionList=';'+ltrim(substring(@ElementDef
,len(@Element)+1
,len(@ElementDef)))
/*
Parse through the List of Expansions until done
*/
set @j=0
while 1=1
begin
set @j=@j+1
/*
Find the next Expansion between semicolon delimiters
and include those delimiters in the result.
Note: We do this because a blank Expansion between semicolons
is perfectly valid
*/
set @Expansion=dbo.strExtract(@ExpansionList,';',';',@j,4)
if @Expansion='' break /* We're done! */
/*
Now get rid of those semicolon delimiters
*/
set @Expansion=ltrim(rtrim(replace(@Expansion,';','')))
/*
There may be some Expansions where we actually do want a semicolon
and we represent that with a double-colon (::), so change
those to actual semicolons
*/
set @Expansion=replace(@Expansion,'::',';')
/*
Now we can finally insert the Expansion into our table
*/
insert Grammar values (@GrammarDescript,@Element,@Expansion)
end
end
end
go

/*----------------------------------------------------------------------------------*/

if object_id('GenerateSentence','P') is not null drop procedure GenerateSentence
go
create procedure GenerateSentence
@GrammarDescript
nvarchar(40)
,@Quantity int = 1
,@Init varchar(40) = '<start>'
,@Debug bit = 0
as
begin

set nocount on

declare @Counter int
,@Sentence varchar(max)
,@Element varchar(500)
,@Expansion varchar(500)
,@Fragment varchar(500)
,@FragText varchar(500)
,@FirstChar char(1)
,@i int
,@p1 int
,@p2 int
,@p3 int

declare @SentenceBucket table (Sentence varchar(max))

set @Counter=0
while @Counter<case when @Quantity>100 then 100 else @Quantity end
begin

set @Counter=@Counter+1

/*
Infinitely loop in creating sentences until a valid one comes along
*/
while 1=1
begin

/*
Initialize
*/
set @Sentence=@Init
if @Debug=1 print ltrim(@Sentence)

/*
Perform substitutions of all Elements until done
*/
while 1=1
begin
set @Element=dbo.StrExtract(@Sentence,'<','>',1,4)
if @Element='' break /* We're done! */

/*
Get random value for the Element
*/
select top 1 @Expansion=Expansion
from Grammar
where Descript=@GrammarDescript and Element=@Element
order by newid()

/*
And put it into the sentence
*/
set @Sentence=stuff(@Sentence
,charindex(@Element,@Sentence)
,len(@Element)
,@Expansion)

if @Debug=1 print ltrim(@Sentence)
end

/*
If the sentence is valid then we're done!
Note: Sentences are valid 99.9% of the time
*/
if charindex('***',@Sentence)=0 break

/*
Otherwise, loop around again
*/
if @Debug=1 print 'Invalid sentence... Restarting...'
end

/*
Temporarily surround punctuation with spaces
*/
set @Sentence=replace(@Sentence,'"',' " ')
set @Sentence=replace(@Sentence,',',' , ')
set @Sentence=replace(@Sentence,';',' ; ')
set @Sentence=replace(@Sentence,'.',' . ')
set @Sentence=replace(@Sentence,'-',' - ')

/*
Handle plurals, verb tenses, and adjectives/adverbs
Examples: [man|men] --> man
[man|men]s --> men
[sit|sat] --> sit
[sit|sat]ed --> sat
[happy|happily] --> happy
[happy|happily]ly --> happily
*/
while 1=1
begin
set @p1=charindex('[',@Sentence)
if @p1=0 break /* We're done! */
set @p2=charindex(']',@Sentence,@p1)
set @p3=charindex(' ',@Sentence+' ',@p2)
set @Fragment=substring(@Sentence,@p1,@p3-@p1)
set @FragText=case
when @p3-@p2=1
then dbo.StrExtract(@Fragment,'[','|',1,0)
else dbo.StrExtract(@Fragment,'|',']',1,0)
end
set @Sentence=stuff(@Sentence
,charindex(@Fragment,@Sentence)
,len(@Fragment)
,@FragText+' ')
if @Debug=1 print ltrim(@Sentence)
end

/*
Handle PROPERIZE(: :) directive
*/
set @Fragment=dbo.StrExtract(@Sentence,'PROPERIZE(:',':)',1,4)
if @Fragment<>''
begin
set @FragText=substring(@Fragment,12,len(@Fragment)-13)
set @Sentence=stuff(@Sentence
,charindex(@Fragment,@Sentence)
,len(@Fragment)
,dbo.Properize(@FragText,0))
if @Debug=1 print ltrim(@Sentence)
end

/*
Handle "a" and "an" (i.e. "a elephant" -> "an elephant")
Note: Add a space at the beginning of the sentence in case
there's a potential word "A" at the beginning
*/
set @Sentence=' '+@Sentence
set @i=0
while 1=1
begin
set @i=@i+1
set @p1=dbo.At(' a ',@Sentence,@i)
if @p1=0 break
set @FirstChar=left(ltrim(substring(@Sentence,@p1+2,len(@Sentence))),1)
if @FirstChar in ('a','e','i','o','u')
set @Sentence=stuff(@Sentence,@p1+1,2,'an ')
if @Debug=1 print ltrim(@Sentence)
end

/*
Get rid of surrounding space around double quotes
in preparation for next step
*/
set @Sentence=replace(@Sentence,' " ','"')

/*
Get rid of all leading/trailing spaces
And capitalize the first letter of the sentence
*/
set @Sentence=ltrim(rtrim(@Sentence))
set @Sentence=upper(left(@Sentence,1))+substring(@Sentence,2,len(@Sentence))

/*
Capitalize any word that comes after the first
of a pair of double quotes
*/
set @i=-1
while 1=1
begin
set @i=@i+2 /* Every other double quote */
set @p1=dbo.At('"',@Sentence,@i)
if @p1=0 break
set @FragText=ltrim(substring(@Sentence,@p1+1,len(@Sentence)))
set @Sentence=left(@Sentence,@p1)
+upper(left(@FragText,1))
+substring(@FragText,2,len(@FragText))
end

/*
Fix all the close quotes
Get rid of any double-spaces
And clean up any spaces before punctuation
*/
while charindex(', "',@Sentence)>0
set @Sentence=replace(@Sentence,', "',',"')
while charindex('. "',@Sentence)>0
set @Sentence=replace(@Sentence,'. "','."')
while charindex(' ',@Sentence)>0
set @Sentence=replace(@Sentence,' ',' ')
while charindex(' ,',@Sentence)>0
set @Sentence=replace(@Sentence,' ,',',')
while charindex(' ;',@Sentence)>0
set @Sentence=replace(@Sentence,' ;',';')
while charindex(' .',@Sentence)>0
set @Sentence=replace(@Sentence,' .','.')
while charindex(' - ',@Sentence)>0
set @Sentence=replace(@Sentence,' - ','-')
while charindex(' ".',@Sentence)>0
set @Sentence=replace(@Sentence,' ".','".')

/*
Just in case, again get rid of leading/trailing spaces
*/
set @Sentence=ltrim(rtrim(@Sentence))

/*
And save it in our temporary file
*/
insert @SentenceBucket values (@Sentence)

end

/*
Send back our sentence(s)
*/
select Sentence from @SentenceBucket

end
go

Tuesday, January 26, 2010

Handy String Functions

Strings!Visual FoxPro (VFP) is very good at handling strings, and there are many useful string functions built into its language. In making the move to T-SQL, I have often wished some of those handy string functions were available.

But since they are not, I went ahead and created User-Defined Functions (UDFs) to emulate those VFP string functions. Perhaps you will find them to be useful also.

In this blog entry, I’ll introduce these UDFs (providing links to the VFP Books Online documentation that they're based upon) and give some examples of how to use them. The actual code to create these UDFs will be at the end of this article.

Stay tuned for my next blog entry, in which I’ll put together a zany project that makes use of some of these UDFs.

In order for many of these functions to do their work, they need the actual true length of a string. The LEN() function in T-SQL has the unfortunate feature of excluding trailing blanks in calculating string length. So the first function that I created is what I call a TRUELEN() function, which returns the full length of a string regardless of any trailing blanks.

select TrueLength=dbo.TrueLen('abcde     ')
/*
TrueLength
-----------
10
*/
The AT() Function is just like T-SQL’s CHARINDEX() function in that it returns the position of a search string within another string expression. The difference is that its third argument indicates the occurrence rather than a start position. Here are a couple of examples:

select FirstLetterE=dbo.At('e','Here is a sentence',1)
,ThirdLetterE=dbo.At('e','Here is a sentence',3)
,FifthLetterE=dbo.At('e','Here is a sentence',5)
/*
FirstLetterE ThirdLetterE FifthLetterE
------------ ------------ ------------
2 12 18
*/

select SecondTheOccurrence=dbo.At('the','The end of the story',2)
/*
SecondTheOccurrence
-------------------
12
*/
The RAT() Function is just like AT(), except that it finds the position of the rightmost occurrence of a search string within a string expression. This can come in handy, for example, if you have a fully-qualified filename containing lots of backslashes and you want to find the position at which the true filename starts:

select LastLetterE=dbo.RAt('e','Here is a sentence',1)
,SecondToLastLetterE=dbo.RAt('e','Here is a sentence',2)
,FifthToLastLetterE=dbo.RAt('e','Here is a sentence',5)
/*
LastLetterE SecondToLastLetterE FifthToLastLetterE
----------- ------------------- ------------------
18 15 2
*/

declare @FullyQualifiedName varchar(50)
set @FullyQualifiedName='C:\Windows\System32\Config\System.Log'
select PathNameOnly=left(@FullyQualifiedName
,dbo.RAt('\',@FullyQualifiedName,1))
,FileNameOnly=substring(@FullyQualifiedName
,dbo.RAt('\',@FullyQualifiedName,1)+1
,len(@FullyQualifiedName))
/*
PathNameOnly FileNameOnly
--------------------------- ------------
C:\Windows\System32\Config\ System.Log
*/
The OCCURS() Function will tell you the number of times a string expression occurs in another string expression:

select LetterEOccurrences=dbo.Occurs('e','Here is a sentence')
,WordTheOccurrences=dbo.Occurs('the','The end of the story')
/*
LetterEOccurrences WordTheOccurrences
------------------ ------------------
5 2
*/
The CHRTRAN() Function is kind of similar to T-SQL’s REPLACE() function, except it will do multiple individual character translations. Its first argument is a string that you want to perform the translations upon. The second argument is a string of characters that you want to individually translate. And the third argument is a string of characters that are the replacements/translations of the characters in the second argument. Here are some examples to illustrate:

select Translate1=dbo.ChrTran('abcdefghi','aei','XYZ')
,Translate2=dbo.ChrTran('123456789','316','***')
/*
Translate1 Translate2
---------- ----------
XbcdYfghZ *2*45*789
*/
The length of the third argument does not have to match the length of the second argument. Note what happens when I pass an empty string as the third argument… the characters in the second argument are removed (or just replaced with a zero-length character):

select VowelsRemoved=dbo.ChrTran('abcdefghi','aei','')
/*
VowelsRemoved
-------------
bcdfgh
*/
The above illustrates a handy way to remove multiple characters from a string. But here’s a clever way to use CHRTRAN() to remove all the characters from a string except certain ones. This example removes all non-numeric characters from the string:

declare @PhoneNumber varchar(30)
set @PhoneNumber='(650) 555-1212'
select NumericDigitsOnly=dbo.ChrTran(@PhoneNumber
,dbo.ChrTran(@PhoneNumber,'0123456789','')
,'')
/*
NumericDigitsOnly
-----------------
6505551212
*/
Note that the inner CHRTRAN() removed all the numeric characters from the string, and then the outer CHRTRAN() made use of that result to remove those characters from the original string. Cute, huh?

The STREXTRACT() Function extracts data from a string between two delimiters. I believe this was originally introduced to the VFP language to aid in shredding XML. The first argument is the string to search. The second and third arguments are the beginning and ending delimiters. The fourth argument specifies at which occurrence of the beginning delimiter you want to start the extraction.

And finally a fifth argument is a flag in which you can specify additive options… a 1 indicates a case-insensitive search (I do not make use of this value in my UDF definition and will let the collation of the database determine whether the search is case-insensitive or not), a 2 indicates that the end delimiter is not required to be found in order to do the extraction, and a 4 indicates that you wish to include the delimiters in the returned expression. Note that I said that these options are additive… for example, you can pass the value 6, which is the sum of 2+4, so both of those options will be honored.

Here are some examples to illustrate:

declare @XMLString varchar(max)
set @XMLString='
<stooge><id>1</id><name>Moe</name></stooge>
<stooge><id>2</id><name>Larry</name></stooge>
<stooge><id>3</id><name>Curly</name></stooge>'
select SecondStooge=dbo.StrExtract(@XMLString,'<name>','</name>',2,0)
,SecondStoogeWithDelimiters=dbo.StrExtract(@XMLString,'<name>','</name>',2,4)
/*
SecondStooge SecondStoogeWithDelimiters
------------ --------------------------
Larry <name>Larry</name>
*/


declare @WordList varchar(max)
set @WordList='one;two;three;four;five'
select FourthWord=dbo.StrExtract(';'+@WordList,';',';',4,0)
,FifthWord=dbo.StrExtract(';'+@WordList,';',';',5,0) /* Oops: No End Delimiter */
,FifthWordEndDelimNotReqd=dbo.StrExtract(';'+@WordList,';',';',5,2)
,FifthWordEndDelimNotReqdIncludeDelims=dbo.StrExtract(';'+@WordList,';',';',5,2+4)
/*
FourthWord FifthWord FifthWordEndDelimNotReqd FifthWordEndDelimNotReqdIncludeDelims
---------- --------- ------------------------ -------------------------------------
four five ,five
*/
VFP offers a PROPER() Function, which will capitalize a string “as appropriate” for proper names. So, for example, it will take the string BRAD SCHULZ and will return Brad Schulz. That’s fine, but the unfortunate thing is that PROPER() is very limiting… it will set the whole string to lower case and will capitalize any letters that follow a space (or are at the beginning of the string)… and that’s it. It will not handle letters that come after hyphens or parentheses or quotation marks or any other special characters correctly. So, if you pass it ITZIK BEN-GAN (T-SQL “GURU”), it would end up returning Itzik Ben-gan (t-sql “guru”)… only two letters would get capitalized… the ‘I’ and the ‘B’.

So I came up with a function that I call PROPERIZE(). It will correctly capitalize any letters that follow various special characters (like hyphen, ampersand, parenthesis, etc). It also takes a second argument… if it is equal to 1, then the string will be converted to lower case first before the capitalization takes place… if it is equal to 0, then the string is capitalized “as is”. The following examples illustrate this:

select Example1=dbo.Properize('itzik ben-gan (T-SQL "guru")',1)
,Example2=dbo.Properize('itzik ben-gan (T-SQL "guru")',0)
/*
Example1 Example2
---------------------------- ----------------------------
Itzik Ben-Gan (T-Sql "Guru") Itzik Ben-Gan (T-SQL "Guru")
*/

select Example3=dbo.Properize('ITZIK BEN-GAN (T-SQL "GURU")',1)
,Example4=dbo.Properize('ITZIK BEN-GAN (T-SQL "GURU")',0)
/*
Example3 Example4
---------------------------- ----------------------------
Itzik Ben-Gan (T-Sql "Guru") ITZIK BEN-GAN (T-SQL "GURU")
*/
Finally, we have the PADL() and PADR() and PADC() Functions, which pad a string to a specified length with a specific character on the left or right sides, or both:

select PadLeft=dbo.PadL('Title',15,'*')
,PadRight=dbo.PadR('Title',15,'*')
,PadCenter=dbo.PadC('Title',15,'*')
/*
PadLeft PadRight PadCenter
--------------- --------------- ---------------
**********Title Title********** *****Title*****
*/
The code to create all these functions is below. Many of these functions are dependent upon each other, so you should run the code to create them all at once. Instead of doing a copy/paste of the code below, you can go here to download the code.

Just a reminder… Be sure to tune in again next time, when I put together a wacky project that makes use of these string functions. Until then…

use TempDB  /* Change to whatever database you wish */
go

/*----------------------------------------------------------------------------*/

if object_id('TrueLen') is not null drop function TrueLen
go
create function TrueLen
(
@Expr nvarchar(max)
)
returns int
as

begin
declare @TrueLen int
set @TrueLen=len(@Expr+'*')-1 /* or datalength(@Expr)/2 */
return @TrueLen
end
go

/*----------------------------------------------------------------------------*/

if object_id('At') is not null drop function At
go
create function At
(
@Expr1 nvarchar(max)
,@Expr2 nvarchar(max)
,@Occurrence int = 1
)
returns int
as
begin
declare @Position int
if @Expr1 is not null and
@Expr2
is not null and
@Occurrence
is not null
begin
declare @Counter int
select @Position=0, @Counter=0
while @Counter<@Occurrence
begin
set @Counter=@Counter+1
set @Position=charindex(@Expr1,@Expr2,@Position+1)
if @Position=0 or @Position is null break
end
end
return @Position
end
go

/*----------------------------------------------------------------------------*/

if object_id('RAt') is not null drop function RAt
go
create function RAt
(
@Expr1 nvarchar(max)
,@Expr2 nvarchar(max)
,@Occurrence int = 1
)
returns int
as
begin
declare @Position int
if @Expr1 is not null and
@Expr2
is not null and
@Occurrence
is not null
begin
declare @Expr1Len int, @Expr2Len int, @AtPos int
select @Expr1Len=dbo.TrueLen(@Expr1)
,@Expr2Len=dbo.TrueLen(@Expr2)
,@AtPos=dbo.At(reverse(@Expr1),reverse(@Expr2),@Occurrence)
set @Position=@Expr2Len-(@Expr1Len+@AtPos-1)+1
end
return @Position
end
go

/*----------------------------------------------------------------------------*/

if object_id('Occurs') is not null drop function Occurs
go
create function Occurs
(
@Expr1 nvarchar(max)
,@Expr2 nvarchar(max)
)
returns int
as
begin
declare @Result int
if @Expr1 is not null and
@Expr2
is not null
begin
declare @Expr1Len int
,@Expr2Len int
,@NewExpr2 nvarchar(max)
,@NewExpr2Len int
select @Expr1Len=dbo.TrueLen(@Expr1)
,@Expr2Len=dbo.TrueLen(@Expr2)
,@NewExpr2=replace(@Expr2,@Expr1,'')
set @NewExpr2Len=dbo.TrueLen(@NewExpr2)
set @Result=case
when @Expr1Len=0
then 0
else (@Expr2Len-@NewExpr2Len)/@Expr1Len
end

end
return @Result
end
go

/*----------------------------------------------------------------------------*/

if object_id('ChrTran') is not null drop function ChrTran
go
create function ChrTran
(
@Expr nvarchar(max)
,@SearchChars nvarchar(max)
,@ReplaceChars nvarchar(max)
)
returns nvarchar(max)
as
begin
declare @Result nvarchar(max)
if @Expr is not null and
@SearchChars
is not null and
@ReplaceChars
is not null
begin
declare @Counter int
select @Result=@Expr, @Counter=0
while @Counter<dbo.TrueLen(@SearchChars)
begin
set @Counter=@Counter+1
set @Result=replace(@Result
,substring(@SearchChars,@Counter,1)
,substring(@ReplaceChars,@Counter,1))
end
end
return @Result
end
go

/*----------------------------------------------------------------------------*/

if object_id('StrExtract') is not null drop function StrExtract
go
create function StrExtract
(
@Expr nvarchar(max)
,@BeginDelim nvarchar(max)
,@EndDelim nvarchar(max)
,@Occurrence int
,@Flags int /* 2=EndDelim not required, 4=Include Delims in result */
)
returns nvarchar(max)
as
begin
declare @Result nvarchar(max)
if @Expr is not null and
@BeginDelim is not null and
@EndDelim
is not null and
@Occurrence
is not null and
@Flags is not null
begin
declare @Exprlen int
,@BeginDelimLen int
,@EndDelimLen int
,@BeginDelimPos int
,@EndDelimPos int
select @Result=''
,@Exprlen=dbo.TrueLen(@Expr)
,@BeginDelimLen=dbo.TrueLen(@BeginDelim)
,@EndDelimLen=dbo.TrueLen(@EndDelim)
,@BeginDelimPos=dbo.At(@BeginDelim
,@Expr
,@Occurrence)
if @BeginDelimPos>0
begin
set @EndDelimPos=charindex(@EndDelim
,@Expr
,@BeginDelimPos+@BeginDelimLen)
if @EndDelimPos=0 and @Flags&2=2
set @EndDelimPos=@Exprlen+1
if @EndDelimPos>0
set @Result=case
when @Flags&4=4 /* Include Delimiters in Result */
then substring(@Expr
,@BeginDelimPos
,@EndDelimPos-@BeginDelimPos+@EndDelimLen)
else substring(@Expr
,@BeginDelimPos+@BeginDelimLen
,@EndDelimPos-@BeginDelimPos-@BeginDelimLen)
end
end
end
return @Result
end
go

/*----------------------------------------------------------------------------*/

if object_id('Properize') is not null drop function Properize
go
create function Properize
(
@Expr nvarchar(max)
,@SetToLowerCaseFirst bit = 0
)
returns nvarchar(max)
as
begin
declare @Result nvarchar(max)
if @Expr is not null
begin
declare @Position int
,@Capitalize bit
,@Char nchar(1)
select @Result=case
when @SetToLowerCaseFirst=1
then lower(@Expr)
else @Expr
end
,@Position=0
,@Capitalize=1
while @Position<len(@Result)
begin
select @Position=@Position+1
,@Char=substring(@Result,@Position,1)
if @Capitalize=1
select @Capitalize=0
,@Result=stuff(@Result
,@Position
,1
,upper(@Char))
if charindex(@Char,' #%&*()-_=+[]{}":./')>0 set @Capitalize=1
end
end
return @Result
end
go

/*----------------------------------------------------------------------------*/

if object_id('PadL') is not null drop function PadL
go
create function PadL
(
@Expr nvarchar(max)
,@Size int
,@PadChar char(1)
)
returns nvarchar(max)
as
begin
declare @Result nvarchar(max)
if @Expr is not null and
@Size is not null and
@PadChar is not null
begin
set @Result=right(replicate(@PadChar,@Size)+@Expr,@Size)
end
return @Result
end
go

/*----------------------------------------------------------------------------*/

if object_id('PadR') is not null drop function PadR
go
create function PadR
(
@Expr nvarchar(max)
,@Size int
,@PadChar char(1)
)
returns nvarchar(max)
as
begin
declare @Result nvarchar(max)
if @Expr is not null and
@Size is not null and
@PadChar is not null
begin
set @Result=left(@Expr+replicate(@PadChar,@Size),@Size)
end
return @Result
end
go

/*----------------------------------------------------------------------------*/

if object_id('PadC') is not null drop function PadC
go
create function PadC
(
@Expr nvarchar(max)
,@Size int
,@PadChar char(1)
)
returns nvarchar(max)
as
begin
declare @Result nvarchar(max)
if @Expr is not null and
@Size is not null and
@PadChar is not null
begin
declare @Exprlen int
,@LeftSize int
set @Exprlen=dbo.TrueLen(@Expr)
set @LeftSize=case when @Size<@Exprlen then 0 else (@Size-@Exprlen)/2 end
set @Result=replicate(@PadChar,@LeftSize)+dbo.PadR(@Expr,@Size-@LeftSize,@PadChar)
end
return @Result
end
go